[2024-06-10 09:26:36,846][32177] Saving configuration to /workspace/metta/train_dir/p2.metta.6/config.json...
[2024-06-10 09:26:36,862][32177] Rollout worker 0 uses device cpu
[2024-06-10 09:26:36,863][32177] Rollout worker 1 uses device cpu
[2024-06-10 09:26:36,863][32177] Rollout worker 2 uses device cpu
[2024-06-10 09:26:36,863][32177] Rollout worker 3 uses device cpu
[2024-06-10 09:26:36,864][32177] Rollout worker 4 uses device cpu
[2024-06-10 09:26:36,864][32177] Rollout worker 5 uses device cpu
[2024-06-10 09:26:36,864][32177] Rollout worker 6 uses device cpu
[2024-06-10 09:26:36,864][32177] Rollout worker 7 uses device cpu
[2024-06-10 09:26:36,865][32177] Rollout worker 8 uses device cpu
[2024-06-10 09:26:36,865][32177] Rollout worker 9 uses device cpu
[2024-06-10 09:26:36,865][32177] Rollout worker 10 uses device cpu
[2024-06-10 09:26:36,865][32177] Rollout worker 11 uses device cpu
[2024-06-10 09:26:36,865][32177] Rollout worker 12 uses device cpu
[2024-06-10 09:26:36,866][32177] Rollout worker 13 uses device cpu
[2024-06-10 09:26:36,866][32177] Rollout worker 14 uses device cpu
[2024-06-10 09:26:36,866][32177] Rollout worker 15 uses device cpu
[2024-06-10 09:26:36,866][32177] Rollout worker 16 uses device cpu
[2024-06-10 09:26:36,867][32177] Rollout worker 17 uses device cpu
[2024-06-10 09:26:36,867][32177] Rollout worker 18 uses device cpu
[2024-06-10 09:26:36,867][32177] Rollout worker 19 uses device cpu
[2024-06-10 09:26:36,867][32177] Rollout worker 20 uses device cpu
[2024-06-10 09:26:36,868][32177] Rollout worker 21 uses device cpu
[2024-06-10 09:26:36,868][32177] Rollout worker 22 uses device cpu
[2024-06-10 09:26:36,868][32177] Rollout worker 23 uses device cpu
[2024-06-10 09:26:36,869][32177] Rollout worker 24 uses device cpu
[2024-06-10 09:26:36,869][32177] Rollout worker 25 uses device cpu
[2024-06-10 09:26:36,869][32177] Rollout worker 26 uses device cpu
[2024-06-10 09:26:36,869][32177] Rollout worker 27 uses device cpu
[2024-06-10 09:26:36,869][32177] Rollout worker 28 uses device cpu
[2024-06-10 09:26:36,869][32177] Rollout worker 29 uses device cpu
[2024-06-10 09:26:36,869][32177] Rollout worker 30 uses device cpu
[2024-06-10 09:26:36,870][32177] Rollout worker 31 uses device cpu
[2024-06-10 09:26:37,382][32177] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-06-10 09:26:37,382][32177] InferenceWorker_p0-w0: min num requests: 10
[2024-06-10 09:26:37,443][32177] Starting all processes...
[2024-06-10 09:26:37,443][32177] Starting process learner_proc0
[2024-06-10 09:26:37,708][32177] Starting all processes...
[2024-06-10 09:26:37,711][32177] Starting process inference_proc0-0
[2024-06-10 09:26:37,712][32177] Starting process rollout_proc0
[2024-06-10 09:26:37,712][32177] Starting process rollout_proc1
[2024-06-10 09:26:37,714][32177] Starting process rollout_proc2
[2024-06-10 09:26:37,716][32177] Starting process rollout_proc3
[2024-06-10 09:26:37,716][32177] Starting process rollout_proc4
[2024-06-10 09:26:37,717][32177] Starting process rollout_proc5
[2024-06-10 09:26:37,719][32177] Starting process rollout_proc6
[2024-06-10 09:26:37,719][32177] Starting process rollout_proc7
[2024-06-10 09:26:37,720][32177] Starting process rollout_proc8
[2024-06-10 09:26:37,720][32177] Starting process rollout_proc9
[2024-06-10 09:26:37,720][32177] Starting process rollout_proc10
[2024-06-10 09:26:37,720][32177] Starting process rollout_proc11
[2024-06-10 09:26:37,721][32177] Starting process rollout_proc12
[2024-06-10 09:26:37,722][32177] Starting process rollout_proc13
[2024-06-10 09:26:37,722][32177] Starting process rollout_proc14
[2024-06-10 09:26:37,722][32177] Starting process rollout_proc15
[2024-06-10 09:26:37,724][32177] Starting process rollout_proc16
[2024-06-10 09:26:37,724][32177] Starting process rollout_proc17
[2024-06-10 09:26:37,724][32177] Starting process rollout_proc18
[2024-06-10 09:26:37,725][32177] Starting process rollout_proc19
[2024-06-10 09:26:37,725][32177] Starting process rollout_proc20
[2024-06-10 09:26:37,725][32177] Starting process rollout_proc21
[2024-06-10 09:26:37,726][32177] Starting process rollout_proc22
[2024-06-10 09:26:37,727][32177] Starting process rollout_proc23
[2024-06-10 09:26:37,727][32177] Starting process rollout_proc24
[2024-06-10 09:26:37,729][32177] Starting process rollout_proc25
[2024-06-10 09:26:37,730][32177] Starting process rollout_proc26
[2024-06-10 09:26:37,732][32177] Starting process rollout_proc27
[2024-06-10 09:26:37,732][32177] Starting process rollout_proc28
[2024-06-10 09:26:37,736][32177] Starting process rollout_proc29
[2024-06-10 09:26:37,736][32177] Starting process rollout_proc30
[2024-06-10 09:26:37,738][32177] Starting process rollout_proc31
[2024-06-10 09:26:39,508][32435] Worker 20 uses CPU cores [20]
[2024-06-10 09:26:39,704][32427] Worker 11 uses CPU cores [11]
[2024-06-10 09:26:39,774][32420] Worker 5 uses CPU cores [5]
[2024-06-10 09:26:39,828][32423] Worker 8 uses CPU cores [8]
[2024-06-10 09:26:39,852][32424] Worker 9 uses CPU cores [9]
[2024-06-10 09:26:39,866][32416] Worker 3 uses CPU cores [3]
[2024-06-10 09:26:39,884][32443] Worker 27 uses CPU cores [27]
[2024-06-10 09:26:39,887][32425] Worker 10 uses CPU cores [10]
[2024-06-10 09:26:39,900][32445] Worker 31 uses CPU cores [31]
[2024-06-10 09:26:39,911][32422] Worker 6 uses CPU cores [6]
[2024-06-10 09:26:39,935][32429] Worker 12 uses CPU cores [12]
[2024-06-10 09:26:39,940][32436] Worker 21 uses CPU cores [21]
[2024-06-10 09:26:39,952][32446] Worker 30 uses CPU cores [30]
[2024-06-10 09:26:39,955][32418] Worker 2 uses CPU cores [2]
[2024-06-10 09:26:40,008][32421] Worker 7 uses CPU cores [7]
[2024-06-10 09:26:40,012][32438] Worker 24 uses CPU cores [24]
[2024-06-10 09:26:40,031][32430] Worker 15 uses CPU cores [15]
[2024-06-10 09:26:40,053][32394] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-06-10 09:26:40,053][32394] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2024-06-10 09:26:40,059][32414] Worker 0 uses CPU cores [0]
[2024-06-10 09:26:40,060][32415] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-06-10 09:26:40,060][32415] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2024-06-10 09:26:40,062][32394] Num visible devices: 1
[2024-06-10 09:26:40,077][32415] Num visible devices: 1
[2024-06-10 09:26:40,104][32428] Worker 14 uses CPU cores [14]
[2024-06-10 09:26:40,104][32394] Setting fixed seed 0
[2024-06-10 09:26:40,105][32394] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-06-10 09:26:40,105][32394] Initializing actor-critic model on device cuda:0
[2024-06-10 09:26:40,125][32426] Worker 13 uses CPU cores [13]
[2024-06-10 09:26:40,138][32419] Worker 4 uses CPU cores [4]
[2024-06-10 09:26:40,145][32439] Worker 23 uses CPU cores [23]
[2024-06-10 09:26:40,146][32417] Worker 1 uses CPU cores [1]
[2024-06-10 09:26:40,153][32444] Worker 29 uses CPU cores [29]
[2024-06-10 09:26:40,175][32434] Worker 18 uses CPU cores [18]
[2024-06-10 09:26:40,175][32437] Worker 22 uses CPU cores [22]
[2024-06-10 09:26:40,188][32433] Worker 19 uses CPU cores [19]
[2024-06-10 09:26:40,191][32441] Worker 25 uses CPU cores [25]
[2024-06-10 09:26:40,219][32440] Worker 26 uses CPU cores [26]
[2024-06-10 09:26:40,235][32432] Worker 17 uses CPU cores [17]
[2024-06-10 09:26:40,256][32442] Worker 28 uses CPU cores [28]
[2024-06-10 09:26:40,271][32431] Worker 16 uses CPU cores [16]
[2024-06-10 09:26:40,851][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,851][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,851][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,851][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,851][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,852][32394] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:40,855][32394] RunningMeanStd input shape: (1,)
[2024-06-10 09:26:40,856][32394] RunningMeanStd input shape: (1,)
[2024-06-10 09:26:40,856][32394] RunningMeanStd input shape: (1,)
[2024-06-10 09:26:40,856][32394] RunningMeanStd input shape: (1,)
[2024-06-10 09:26:40,895][32394] RunningMeanStd input shape: (1,)
[2024-06-10 09:26:40,900][32394] Created Actor Critic model with architecture:
[2024-06-10 09:26:40,900][32394] SampleFactoryAgentWrapper(
  (obs_normalizer): ObservationNormalizer()
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (agent): MettaAgent(
    (_encoder): MultiFeatureSetEncoder(
      (feature_set_encoders): ModuleDict(
        (grid_obs): FeatureSetEncoder(
          (_normalizer): FeatureListNormalizer(
            (_norms_dict): ModuleDict(
              (agent): RunningMeanStdInPlace()
              (altar): RunningMeanStdInPlace()
              (converter): RunningMeanStdInPlace()
              (generator): RunningMeanStdInPlace()
              (wall): RunningMeanStdInPlace()
              (agent:dir): RunningMeanStdInPlace()
              (agent:energy): RunningMeanStdInPlace()
              (agent:frozen): RunningMeanStdInPlace()
              (agent:hp): RunningMeanStdInPlace()
              (agent:id): RunningMeanStdInPlace()
              (agent:inv_r1): RunningMeanStdInPlace()
              (agent:inv_r2): RunningMeanStdInPlace()
              (agent:inv_r3): RunningMeanStdInPlace()
              (agent:shield): RunningMeanStdInPlace()
              (altar:hp): RunningMeanStdInPlace()
              (altar:state): RunningMeanStdInPlace()
              (converter:hp): RunningMeanStdInPlace()
              (converter:state): RunningMeanStdInPlace()
              (generator:amount): RunningMeanStdInPlace()
              (generator:hp): RunningMeanStdInPlace()
              (generator:state): RunningMeanStdInPlace()
              (wall:hp): RunningMeanStdInPlace()
            )
          )
          (embedding_net): Sequential(
            (0): Linear(in_features=125, out_features=512, bias=True)
            (1): ELU(alpha=1.0)
            (2): Linear(in_features=512, out_features=512, bias=True)
            (3): ELU(alpha=1.0)
            (4): Linear(in_features=512, out_features=512, bias=True)
            (5): ELU(alpha=1.0)
            (6): Linear(in_features=512, out_features=512, bias=True)
            (7): ELU(alpha=1.0)
          )
        )
        (global_vars): FeatureSetEncoder(
          (_normalizer): FeatureListNormalizer(
            (_norms_dict): ModuleDict(
              (_steps): RunningMeanStdInPlace()
            )
          )
          (embedding_net): Sequential(
            (0): Linear(in_features=5, out_features=8, bias=True)
            (1): ELU(alpha=1.0)
            (2): Linear(in_features=8, out_features=8, bias=True)
            (3): ELU(alpha=1.0)
          )
        )
        (last_action): FeatureSetEncoder(
          (_normalizer): FeatureListNormalizer(
            (_norms_dict): ModuleDict(
              (last_action_id): RunningMeanStdInPlace()
              (last_action_val): RunningMeanStdInPlace()
            )
          )
          (embedding_net): Sequential(
            (0): Linear(in_features=5, out_features=8, bias=True)
            (1): ELU(alpha=1.0)
            (2): Linear(in_features=8, out_features=8, bias=True)
            (3): ELU(alpha=1.0)
          )
        )
        (last_reward): FeatureSetEncoder(
          (_normalizer): FeatureListNormalizer(
            (_norms_dict): ModuleDict(
              (last_reward): RunningMeanStdInPlace()
            )
          )
          (embedding_net): Sequential(
            (0): Linear(in_features=5, out_features=8, bias=True)
            (1): ELU(alpha=1.0)
            (2): Linear(in_features=8, out_features=8, bias=True)
            (3): ELU(alpha=1.0)
          )
        )
      )
      (merged_encoder): Sequential(
        (0): Linear(in_features=536, out_features=512, bias=True)
        (1): ELU(alpha=1.0)
        (2): Linear(in_features=512, out_features=512, bias=True)
        (3): ELU(alpha=1.0)
        (4): Linear(in_features=512, out_features=512, bias=True)
        (5): ELU(alpha=1.0)
      )
    )
    (_core): ModelCoreRNN(
      (core): GRU(512, 512)
    )
    (_decoder): Decoder(
      (mlp): Identity()
    )
    (_critic_linear): Linear(in_features=512, out_features=1, bias=True)
    (_action_parameterization): ActionParameterizationDefault(
      (distribution_linear): Linear(in_features=512, out_features=16, bias=True)
    )
  )
)
[2024-06-10 09:26:40,967][32394] Using optimizer
[2024-06-10 09:26:41,151][32394] No checkpoints found
[2024-06-10 09:26:41,151][32394] Did not load from checkpoint, starting from scratch!
[2024-06-10 09:26:41,151][32394] Initialized policy 0 weights for model version 0
[2024-06-10 09:26:41,152][32394] LearnerWorker_p0 finished initialization!
[2024-06-10 09:26:41,153][32394] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-06-10 09:26:41,917][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,917][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,917][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,917][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,917][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,917][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,917][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,917][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,917][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,917][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,917][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,917][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,917][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,917][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,918][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,918][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,918][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,918][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,918][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,918][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,918][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,918][32415] RunningMeanStd input shape: (11, 11)
[2024-06-10 09:26:41,921][32415] RunningMeanStd input shape: (1,)
[2024-06-10 09:26:41,921][32415] RunningMeanStd input shape: (1,)
[2024-06-10 09:26:41,921][32415] RunningMeanStd input shape: (1,)
[2024-06-10 09:26:41,921][32415] RunningMeanStd input shape: (1,)
[2024-06-10 09:26:41,961][32415] RunningMeanStd input shape: (1,)
[2024-06-10 09:26:41,984][32177] Inference worker 0-0 is ready!
[2024-06-10 09:26:41,984][32177] All inference workers are ready! Signal rollout workers to start!
[2024-06-10 09:26:44,450][32434] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,451][32435] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,461][32439] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,464][32432] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,467][32431] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,472][32440] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,472][32445] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,474][32441] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,486][32433] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,487][32446] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,489][32436] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,490][32444] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,492][32437] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,516][32442] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,518][32443] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,540][32427] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,545][32420] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,548][32416] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,558][32426] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,558][32424] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,559][32414] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,560][32419] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,561][32418] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,568][32428] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,568][32430] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,573][32425] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,573][32423] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,574][32429] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,575][32422] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,576][32421] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,589][32417] Decorrelating experience for 0 frames...
[2024-06-10 09:26:44,592][32177] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-06-10 09:26:44,630][32438] Decorrelating experience for 0 frames...
[2024-06-10 09:26:45,868][32434] Decorrelating experience for 256 frames...
[2024-06-10 09:26:45,874][32435] Decorrelating experience for 256 frames...
[2024-06-10 09:26:45,903][32432] Decorrelating experience for 256 frames...
[2024-06-10 09:26:45,916][32439] Decorrelating experience for 256 frames...
[2024-06-10 09:26:45,921][32431] Decorrelating experience for 256 frames...
[2024-06-10 09:26:45,928][32445] Decorrelating experience for 256 frames...
[2024-06-10 09:26:45,933][32441] Decorrelating experience for 256 frames...
[2024-06-10 09:26:45,936][32440] Decorrelating experience for 256 frames...
[2024-06-10 09:26:45,955][32433] Decorrelating experience for 256 frames...
[2024-06-10 09:26:45,958][32436] Decorrelating experience for 256 frames...
[2024-06-10 09:26:45,965][32444] Decorrelating experience for 256 frames...
[2024-06-10 09:26:45,965][32437] Decorrelating experience for 256 frames...
[2024-06-10 09:26:45,970][32446] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,003][32442] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,005][32427] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,026][32443] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,028][32416] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,028][32420] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,035][32414] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,047][32419] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,050][32418] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,050][32426] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,051][32424] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,056][32425] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,067][32423] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,067][32430] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,067][32421] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,070][32428] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,070][32429] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,075][32422] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,090][32417] Decorrelating experience for 256 frames...
[2024-06-10 09:26:46,114][32438] Decorrelating experience for 256 frames...
[2024-06-10 09:26:49,592][32177] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 9580.3. Samples: 47900. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-06-10 09:26:52,360][32434] Worker 18, sleep for 84.375 sec to decorrelate experience collection
[2024-06-10 09:26:52,360][32432] Worker 17, sleep for 79.688 sec to decorrelate experience collection
[2024-06-10 09:26:52,372][32433] Worker 19, sleep for 89.062 sec to decorrelate experience collection
[2024-06-10 09:26:52,372][32435] Worker 20, sleep for 93.750 sec to decorrelate experience collection
[2024-06-10 09:26:52,372][32436] Worker 21, sleep for 98.438 sec to decorrelate experience collection
[2024-06-10 09:26:52,372][32439] Worker 23, sleep for 107.812 sec to decorrelate experience collection
[2024-06-10 09:26:52,375][32440] Worker 26, sleep for 121.875 sec to decorrelate experience collection
[2024-06-10 09:26:52,384][32431] Worker 16, sleep for 75.000 sec to decorrelate experience collection
[2024-06-10 09:26:52,385][32444] Worker 29, sleep for 135.938 sec to decorrelate experience collection
[2024-06-10 09:26:52,395][32445] Worker 31, sleep for 145.312 sec to decorrelate experience collection
[2024-06-10 09:26:52,395][32441] Worker 25, sleep for 117.188 sec to decorrelate experience collection
[2024-06-10 09:26:52,395][32446] Worker 30, sleep for 140.625 sec to decorrelate experience collection
[2024-06-10 09:26:52,396][32437] Worker 22, sleep for 103.125 sec to decorrelate experience collection
[2024-06-10 09:26:52,404][32428] Worker 14, sleep for 65.625 sec to decorrelate experience collection
[2024-06-10 09:26:52,418][32442] Worker 28, sleep for 131.250 sec to decorrelate experience collection
[2024-06-10 09:26:52,429][32424] Worker 9, sleep for 42.188 sec to decorrelate experience collection
[2024-06-10 09:26:52,438][32418] Worker 2, sleep for 9.375 sec to decorrelate experience collection
[2024-06-10 09:26:52,438][32425] Worker 10, sleep for 46.875 sec to decorrelate experience collection
[2024-06-10 09:26:52,438][32427] Worker 11, sleep for 51.562 sec to decorrelate experience collection
[2024-06-10 09:26:52,439][32443] Worker 27, sleep for 126.562 sec to decorrelate experience collection
[2024-06-10 09:26:52,439][32419] Worker 4, sleep for 18.750 sec to decorrelate experience collection
[2024-06-10 09:26:52,444][32430] Worker 15, sleep for 70.312 sec to decorrelate experience collection
[2024-06-10 09:26:52,446][32426] Worker 13, sleep for 60.938 sec to decorrelate experience collection
[2024-06-10 09:26:52,451][32429] Worker 12, sleep for 56.250 sec to decorrelate experience collection
[2024-06-10 09:26:52,451][32423] Worker 8, sleep for 37.500 sec to decorrelate experience collection
[2024-06-10 09:26:52,462][32416] Worker 3, sleep for 14.062 sec to decorrelate experience collection
[2024-06-10 09:26:52,475][32438] Worker 24, sleep for 112.500 sec to decorrelate experience collection
[2024-06-10 09:26:52,487][32422] Worker 6, sleep for 28.125 sec to decorrelate experience collection
[2024-06-10 09:26:52,492][32394] Signal inference workers to stop experience collection...
[2024-06-10 09:26:52,515][32417] Worker 1, sleep for 4.688 sec to decorrelate experience collection
[2024-06-10 09:26:52,541][32415] InferenceWorker_p0-w0: stopping experience collection
[2024-06-10 09:26:52,544][32421] Worker 7, sleep for 32.812 sec to decorrelate experience collection
[2024-06-10 09:26:52,544][32420] Worker 5, sleep for 23.438 sec to decorrelate experience collection
[2024-06-10 09:26:52,991][32394] Signal inference workers to resume experience collection...
[2024-06-10 09:26:52,991][32415] InferenceWorker_p0-w0: resuming experience collection
[2024-06-10 09:26:54,074][32415] Updated weights for policy 0, policy_version 10 (0.0011)
[2024-06-10 09:26:54,592][32177] Fps is (10 sec: 16383.9, 60 sec: 16383.9, 300 sec: 16383.9). Total num frames: 163840. Throughput: 0: 32861.8. Samples: 328620. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2024-06-10 09:26:57,226][32417] Worker 1 awakens!
[2024-06-10 09:26:57,378][32177] Heartbeat connected on Batcher_0
[2024-06-10 09:26:57,380][32177] Heartbeat connected on LearnerWorker_p0
[2024-06-10 09:26:57,385][32177] Heartbeat connected on RolloutWorker_w0
[2024-06-10 09:26:57,387][32177] Heartbeat connected on RolloutWorker_w1
[2024-06-10 09:26:57,433][32177] Heartbeat connected on InferenceWorker_p0-w0
[2024-06-10 09:26:59,592][32177] Fps is (10 sec: 16383.9, 60 sec: 10922.7, 300 sec: 10922.7). Total num frames: 163840. Throughput: 0: 22137.4. Samples: 332060. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2024-06-10 09:27:01,860][32418] Worker 2 awakens!
[2024-06-10 09:27:01,866][32177] Heartbeat connected on RolloutWorker_w2
[2024-06-10 09:27:04,592][32177] Fps is (10 sec: 1638.4, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 180224. Throughput: 0: 17375.0. Samples: 347500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 10.0)
[2024-06-10 09:27:06,595][32416] Worker 3 awakens!
[2024-06-10 09:27:06,612][32177] Heartbeat connected on RolloutWorker_w3
[2024-06-10 09:27:09,592][32177] Fps is (10 sec: 3276.8, 60 sec: 7864.3, 300 sec: 7864.3). Total num frames: 196608. Throughput: 0: 14892.8. Samples: 372320. Policy #0 lag: (min: 0.0, avg: 1.2, max: 11.0)
[2024-06-10 09:27:11,223][32419] Worker 4 awakens!
[2024-06-10 09:27:11,228][32177] Heartbeat connected on RolloutWorker_w4
[2024-06-10 09:27:14,592][32177] Fps is (10 sec: 6553.8, 60 sec: 8192.1, 300 sec: 8192.1). Total num frames: 245760. Throughput: 0: 13168.1. Samples: 395040. Policy #0 lag: (min: 0.0, avg: 2.0, max: 14.0)
[2024-06-10 09:27:14,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:27:14,593][32394] Saving new best policy, reward=0.000!
[2024-06-10 09:27:16,004][32420] Worker 5 awakens!
[2024-06-10 09:27:16,010][32177] Heartbeat connected on RolloutWorker_w5
[2024-06-10 09:27:19,151][32415] Updated weights for policy 0, policy_version 20 (0.0015)
[2024-06-10 09:27:19,592][32177] Fps is (10 sec: 13107.5, 60 sec: 9362.4, 300 sec: 9362.4). Total num frames: 327680. Throughput: 0: 13623.5. Samples: 476820. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
[2024-06-10 09:27:19,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:27:20,712][32422] Worker 6 awakens!
[2024-06-10 09:27:20,716][32177] Heartbeat connected on RolloutWorker_w6
[2024-06-10 09:27:24,592][32177] Fps is (10 sec: 18022.3, 60 sec: 10649.7, 300 sec: 10649.7). Total num frames: 425984. Throughput: 0: 14728.6. Samples: 589140. Policy #0 lag: (min: 0.0, avg: 2.6, max: 5.0)
[2024-06-10 09:27:24,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:27:25,456][32421] Worker 7 awakens!
[2024-06-10 09:27:25,465][32177] Heartbeat connected on RolloutWorker_w7
[2024-06-10 09:27:26,679][32415] Updated weights for policy 0, policy_version 30 (0.0011)
[2024-06-10 09:27:29,592][32177] Fps is (10 sec: 21299.0, 60 sec: 12015.0, 300 sec: 12015.0). Total num frames: 540672. Throughput: 0: 14568.5. Samples: 655580. Policy #0 lag: (min: 0.0, avg: 1.9, max: 6.0)
[2024-06-10 09:27:29,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:27:30,051][32423] Worker 8 awakens!
[2024-06-10 09:27:30,055][32177] Heartbeat connected on RolloutWorker_w8
[2024-06-10 09:27:33,791][32415] Updated weights for policy 0, policy_version 40 (0.0011)
[2024-06-10 09:27:34,592][32177] Fps is (10 sec: 24575.8, 60 sec: 13434.9, 300 sec: 13434.9). Total num frames: 671744. Throughput: 0: 16694.2. Samples: 799140. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0)
[2024-06-10 09:27:34,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:27:34,717][32424] Worker 9 awakens!
[2024-06-10 09:27:34,724][32177] Heartbeat connected on RolloutWorker_w9
[2024-06-10 09:27:39,413][32425] Worker 10 awakens!
[2024-06-10 09:27:39,418][32177] Heartbeat connected on RolloutWorker_w10
[2024-06-10 09:27:39,592][32177] Fps is (10 sec: 26214.3, 60 sec: 14596.7, 300 sec: 14596.7). Total num frames: 802816. Throughput: 0: 14324.1. Samples: 973200. Policy #0 lag: (min: 0.0, avg: 3.0, max: 7.0)
[2024-06-10 09:27:39,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:27:39,885][32415] Updated weights for policy 0, policy_version 50 (0.0011)
[2024-06-10 09:27:44,100][32427] Worker 11 awakens!
[2024-06-10 09:27:44,107][32177] Heartbeat connected on RolloutWorker_w11
[2024-06-10 09:27:44,574][32415] Updated weights for policy 0, policy_version 60 (0.0013)
[2024-06-10 09:27:44,592][32177] Fps is (10 sec: 31129.5, 60 sec: 16384.0, 300 sec: 16384.0). Total num frames: 983040. Throughput: 0: 16312.0. Samples: 1066100. Policy #0 lag: (min: 0.0, avg: 3.4, max: 8.0)
[2024-06-10 09:27:44,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:27:48,800][32429] Worker 12 awakens!
[2024-06-10 09:27:48,805][32177] Heartbeat connected on RolloutWorker_w12
[2024-06-10 09:27:49,386][32415] Updated weights for policy 0, policy_version 70 (0.0012)
[2024-06-10 09:27:49,592][32177] Fps is (10 sec: 34406.4, 60 sec: 19114.7, 300 sec: 17644.3). Total num frames: 1146880. Throughput: 0: 20525.8. Samples: 1271160. Policy #0 lag: (min: 0.0, avg: 4.0, max: 9.0)
[2024-06-10 09:27:49,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:27:53,404][32426] Worker 13 awakens!
[2024-06-10 09:27:53,412][32177] Heartbeat connected on RolloutWorker_w13
[2024-06-10 09:27:53,877][32415] Updated weights for policy 0, policy_version 80 (0.0017)
[2024-06-10 09:27:54,592][32177] Fps is (10 sec: 32767.9, 60 sec: 19114.7, 300 sec: 18724.6). Total num frames: 1310720. Throughput: 0: 24745.4. Samples: 1485860. Policy #0 lag: (min: 0.0, avg: 5.1, max: 9.0)
[2024-06-10 09:27:54,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:27:58,129][32428] Worker 14 awakens!
[2024-06-10 09:27:58,135][32177] Heartbeat connected on RolloutWorker_w14
[2024-06-10 09:27:58,456][32415] Updated weights for policy 0, policy_version 90 (0.0019)
[2024-06-10 09:27:59,592][32177] Fps is (10 sec: 36044.9, 60 sec: 22391.5, 300 sec: 20097.8). Total num frames: 1507328. Throughput: 0: 26627.9. Samples: 1593300. Policy #0 lag: (min: 0.0, avg: 4.6, max: 10.0)
[2024-06-10 09:27:59,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:28:02,856][32430] Worker 15 awakens!
[2024-06-10 09:28:02,861][32177] Heartbeat connected on RolloutWorker_w15
[2024-06-10 09:28:02,971][32415] Updated weights for policy 0, policy_version 100 (0.0018)
[2024-06-10 09:28:04,592][32177] Fps is (10 sec: 37683.4, 60 sec: 25122.2, 300 sec: 21094.4). Total num frames: 1687552. Throughput: 0: 29826.1. Samples: 1819000. Policy #0 lag: (min: 0.0, avg: 4.9, max: 11.0)
[2024-06-10 09:28:04,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:28:07,358][32415] Updated weights for policy 0, policy_version 110 (0.0028)
[2024-06-10 09:28:07,484][32431] Worker 16 awakens!
[2024-06-10 09:28:07,493][32177] Heartbeat connected on RolloutWorker_w16
[2024-06-10 09:28:09,592][32177] Fps is (10 sec: 37682.8, 60 sec: 28125.9, 300 sec: 22166.6). Total num frames: 1884160. Throughput: 0: 32063.9. Samples: 2032020. Policy #0 lag: (min: 0.0, avg: 37.5, max: 110.0)
[2024-06-10 09:28:09,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:28:12,014][32415] Updated weights for policy 0, policy_version 120 (0.0020)
[2024-06-10 09:28:12,148][32432] Worker 17 awakens!
[2024-06-10 09:28:12,158][32177] Heartbeat connected on RolloutWorker_w17
[2024-06-10 09:28:14,592][32177] Fps is (10 sec: 39321.8, 60 sec: 30583.4, 300 sec: 23119.7). Total num frames: 2080768. Throughput: 0: 33109.4. Samples: 2145500. Policy #0 lag: (min: 0.0, avg: 42.1, max: 121.0)
[2024-06-10 09:28:14,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:28:16,196][32415] Updated weights for policy 0, policy_version 130 (0.0030)
[2024-06-10 09:28:16,834][32434] Worker 18 awakens!
[2024-06-10 09:28:16,843][32177] Heartbeat connected on RolloutWorker_w18
[2024-06-10 09:28:19,592][32177] Fps is (10 sec: 39322.2, 60 sec: 32494.9, 300 sec: 23972.4). Total num frames: 2277376. Throughput: 0: 34958.3. Samples: 2372260. Policy #0 lag: (min: 0.0, avg: 7.2, max: 13.0)
[2024-06-10 09:28:19,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:28:20,481][32415] Updated weights for policy 0, policy_version 140 (0.0029)
[2024-06-10 09:28:21,532][32433] Worker 19 awakens!
[2024-06-10 09:28:21,543][32177] Heartbeat connected on RolloutWorker_w19
[2024-06-10 09:28:24,119][32415] Updated weights for policy 0, policy_version 150 (0.0023)
[2024-06-10 09:28:24,592][32177] Fps is (10 sec: 37682.7, 60 sec: 33860.2, 300 sec: 24576.0). Total num frames: 2457600. Throughput: 0: 36282.6. Samples: 2605920. Policy #0 lag: (min: 0.0, avg: 7.1, max: 14.0)
[2024-06-10 09:28:24,592][32177] Avg episode reward: [(0, '0.000')]
[2024-06-10 09:28:26,222][32435] Worker 20 awakens!
[2024-06-10 09:28:26,233][32177] Heartbeat connected on RolloutWorker_w20 [2024-06-10 09:28:28,322][32415] Updated weights for policy 0, policy_version 160 (0.0034) [2024-06-10 09:28:29,592][32177] Fps is (10 sec: 39321.5, 60 sec: 35498.7, 300 sec: 25434.3). Total num frames: 2670592. Throughput: 0: 37012.9. Samples: 2731680. Policy #0 lag: (min: 0.0, avg: 7.4, max: 15.0) [2024-06-10 09:28:29,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:28:30,910][32436] Worker 21 awakens! [2024-06-10 09:28:30,921][32177] Heartbeat connected on RolloutWorker_w21 [2024-06-10 09:28:32,617][32415] Updated weights for policy 0, policy_version 170 (0.0033) [2024-06-10 09:28:34,592][32177] Fps is (10 sec: 42597.8, 60 sec: 36863.8, 300 sec: 26214.4). Total num frames: 2883584. Throughput: 0: 37851.8. Samples: 2974500. Policy #0 lag: (min: 1.0, avg: 7.5, max: 15.0) [2024-06-10 09:28:34,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:28:34,732][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000000177_2899968.pth... [2024-06-10 09:28:35,543][32437] Worker 22 awakens! [2024-06-10 09:28:35,554][32177] Heartbeat connected on RolloutWorker_w22 [2024-06-10 09:28:36,734][32415] Updated weights for policy 0, policy_version 180 (0.0025) [2024-06-10 09:28:39,592][32177] Fps is (10 sec: 39321.1, 60 sec: 37683.2, 300 sec: 26641.8). Total num frames: 3063808. Throughput: 0: 38616.4. Samples: 3223600. Policy #0 lag: (min: 0.0, avg: 5.5, max: 14.0) [2024-06-10 09:28:39,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:28:40,204][32439] Worker 23 awakens! [2024-06-10 09:28:40,214][32177] Heartbeat connected on RolloutWorker_w23 [2024-06-10 09:28:40,225][32415] Updated weights for policy 0, policy_version 190 (0.0027) [2024-06-10 09:28:44,346][32415] Updated weights for policy 0, policy_version 200 (0.0028) [2024-06-10 09:28:44,592][32177] Fps is (10 sec: 40961.3, 60 sec: 38502.5, 300 sec: 27443.2). Total num frames: 3293184. 
Throughput: 0: 39154.3. Samples: 3355240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 16.0) [2024-06-10 09:28:44,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:28:45,072][32438] Worker 24 awakens! [2024-06-10 09:28:45,082][32177] Heartbeat connected on RolloutWorker_w24 [2024-06-10 09:28:48,140][32415] Updated weights for policy 0, policy_version 210 (0.0025) [2024-06-10 09:28:49,592][32177] Fps is (10 sec: 44237.8, 60 sec: 39321.7, 300 sec: 28049.5). Total num frames: 3506176. Throughput: 0: 39921.0. Samples: 3615440. Policy #0 lag: (min: 0.0, avg: 7.9, max: 16.0) [2024-06-10 09:28:49,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:28:49,682][32441] Worker 25 awakens! [2024-06-10 09:28:49,695][32177] Heartbeat connected on RolloutWorker_w25 [2024-06-10 09:28:51,280][32415] Updated weights for policy 0, policy_version 220 (0.0025) [2024-06-10 09:28:54,348][32440] Worker 26 awakens! [2024-06-10 09:28:54,359][32177] Heartbeat connected on RolloutWorker_w26 [2024-06-10 09:28:54,592][32177] Fps is (10 sec: 42598.6, 60 sec: 40140.9, 300 sec: 28609.0). Total num frames: 3719168. Throughput: 0: 40956.6. Samples: 3875060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 17.0) [2024-06-10 09:28:54,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:28:55,646][32415] Updated weights for policy 0, policy_version 230 (0.0023) [2024-06-10 09:28:58,673][32415] Updated weights for policy 0, policy_version 240 (0.0026) [2024-06-10 09:28:59,102][32443] Worker 27 awakens! [2024-06-10 09:28:59,116][32177] Heartbeat connected on RolloutWorker_w27 [2024-06-10 09:28:59,592][32177] Fps is (10 sec: 44236.0, 60 sec: 40686.9, 300 sec: 29248.5). Total num frames: 3948544. Throughput: 0: 41311.5. Samples: 4004520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 18.0) [2024-06-10 09:28:59,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:29:02,923][32415] Updated weights for policy 0, policy_version 250 (0.0032) [2024-06-10 09:29:03,768][32442] Worker 28 awakens! [2024-06-10 09:29:03,781][32177] Heartbeat connected on RolloutWorker_w28 [2024-06-10 09:29:04,592][32177] Fps is (10 sec: 45874.9, 60 sec: 41506.2, 300 sec: 29842.3). Total num frames: 4177920. Throughput: 0: 42132.9. Samples: 4268240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-06-10 09:29:04,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:29:06,525][32415] Updated weights for policy 0, policy_version 260 (0.0023) [2024-06-10 09:29:08,384][32444] Worker 29 awakens! [2024-06-10 09:29:08,397][32177] Heartbeat connected on RolloutWorker_w29 [2024-06-10 09:29:09,592][32177] Fps is (10 sec: 44236.4, 60 sec: 41779.2, 300 sec: 30282.2). Total num frames: 4390912. Throughput: 0: 43057.3. Samples: 4543500. Policy #0 lag: (min: 0.0, avg: 7.7, max: 17.0) [2024-06-10 09:29:09,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:29:10,157][32415] Updated weights for policy 0, policy_version 270 (0.0037) [2024-06-10 09:29:13,120][32446] Worker 30 awakens! [2024-06-10 09:29:13,133][32177] Heartbeat connected on RolloutWorker_w30 [2024-06-10 09:29:14,337][32415] Updated weights for policy 0, policy_version 280 (0.0027) [2024-06-10 09:29:14,592][32177] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 30692.7). Total num frames: 4603904. Throughput: 0: 43303.0. Samples: 4680320. Policy #0 lag: (min: 0.0, avg: 26.9, max: 279.0) [2024-06-10 09:29:14,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:29:17,221][32415] Updated weights for policy 0, policy_version 290 (0.0029) [2024-06-10 09:29:17,804][32445] Worker 31 awakens! 
[2024-06-10 09:29:17,817][32177] Heartbeat connected on RolloutWorker_w31 [2024-06-10 09:29:19,592][32177] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 31288.2). Total num frames: 4849664. Throughput: 0: 43960.2. Samples: 4952700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-10 09:29:19,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:29:20,972][32415] Updated weights for policy 0, policy_version 300 (0.0022) [2024-06-10 09:29:24,331][32415] Updated weights for policy 0, policy_version 310 (0.0035) [2024-06-10 09:29:24,592][32177] Fps is (10 sec: 49152.3, 60 sec: 43963.8, 300 sec: 31846.4). Total num frames: 5095424. Throughput: 0: 44660.5. Samples: 5233320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-10 09:29:24,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:29:27,749][32415] Updated weights for policy 0, policy_version 320 (0.0031) [2024-06-10 09:29:29,596][32177] Fps is (10 sec: 44217.5, 60 sec: 43687.5, 300 sec: 32072.1). Total num frames: 5292032. Throughput: 0: 44901.3. Samples: 5376000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 09:29:29,597][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:29:31,283][32394] Signal inference workers to stop experience collection... (50 times) [2024-06-10 09:29:31,335][32415] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-06-10 09:29:31,348][32394] Signal inference workers to resume experience collection... (50 times) [2024-06-10 09:29:31,350][32415] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-06-10 09:29:31,481][32415] Updated weights for policy 0, policy_version 330 (0.0043) [2024-06-10 09:29:34,596][32177] Fps is (10 sec: 44217.0, 60 sec: 44233.6, 300 sec: 32574.4). Total num frames: 5537792. Throughput: 0: 45198.0. Samples: 5649560. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 09:29:34,597][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:29:35,028][32415] Updated weights for policy 0, policy_version 340 (0.0035) [2024-06-10 09:29:38,413][32415] Updated weights for policy 0, policy_version 350 (0.0020) [2024-06-10 09:29:39,592][32177] Fps is (10 sec: 49173.5, 60 sec: 45329.1, 300 sec: 33048.9). Total num frames: 5783552. Throughput: 0: 45475.4. Samples: 5921460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-10 09:29:39,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:29:42,394][32415] Updated weights for policy 0, policy_version 360 (0.0027) [2024-06-10 09:29:44,592][32177] Fps is (10 sec: 44257.0, 60 sec: 44782.9, 300 sec: 33223.1). Total num frames: 5980160. Throughput: 0: 45856.5. Samples: 6068060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 09:29:44,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:29:45,357][32415] Updated weights for policy 0, policy_version 370 (0.0024) [2024-06-10 09:29:49,254][32415] Updated weights for policy 0, policy_version 380 (0.0037) [2024-06-10 09:29:49,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45602.0, 300 sec: 33742.2). Total num frames: 6242304. Throughput: 0: 46230.6. Samples: 6348620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 09:29:49,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:29:52,603][32415] Updated weights for policy 0, policy_version 390 (0.0034) [2024-06-10 09:29:54,592][32177] Fps is (10 sec: 50790.7, 60 sec: 46148.3, 300 sec: 34147.7). Total num frames: 6488064. Throughput: 0: 46110.0. Samples: 6618440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-10 09:29:54,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:29:56,195][32415] Updated weights for policy 0, policy_version 400 (0.0027) [2024-06-10 09:29:59,592][32177] Fps is (10 sec: 42598.8, 60 sec: 45329.2, 300 sec: 34196.4). Total num frames: 6668288. 
Throughput: 0: 46194.5. Samples: 6759060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 09:29:59,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:29:59,885][32415] Updated weights for policy 0, policy_version 410 (0.0038) [2024-06-10 09:30:03,570][32415] Updated weights for policy 0, policy_version 420 (0.0022) [2024-06-10 09:30:04,592][32177] Fps is (10 sec: 42598.1, 60 sec: 45602.1, 300 sec: 34570.3). Total num frames: 6914048. Throughput: 0: 46312.0. Samples: 7036740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 09:30:04,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:30:06,918][32415] Updated weights for policy 0, policy_version 430 (0.0033) [2024-06-10 09:30:09,592][32177] Fps is (10 sec: 49151.2, 60 sec: 46148.3, 300 sec: 34925.9). Total num frames: 7159808. Throughput: 0: 46088.9. Samples: 7307320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 09:30:09,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:30:10,547][32415] Updated weights for policy 0, policy_version 440 (0.0024) [2024-06-10 09:30:14,002][32415] Updated weights for policy 0, policy_version 450 (0.0039) [2024-06-10 09:30:14,596][32177] Fps is (10 sec: 49131.2, 60 sec: 46691.2, 300 sec: 35263.9). Total num frames: 7405568. Throughput: 0: 46124.2. Samples: 7451580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-10 09:30:14,597][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:30:17,358][32415] Updated weights for policy 0, policy_version 460 (0.0032) [2024-06-10 09:30:19,592][32177] Fps is (10 sec: 45875.6, 60 sec: 46148.3, 300 sec: 35435.2). Total num frames: 7618560. Throughput: 0: 46260.7. Samples: 7731080. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-10 09:30:19,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:30:20,942][32415] Updated weights for policy 0, policy_version 470 (0.0034) [2024-06-10 09:30:24,310][32415] Updated weights for policy 0, policy_version 480 (0.0032) [2024-06-10 09:30:24,592][32177] Fps is (10 sec: 45894.2, 60 sec: 46148.2, 300 sec: 35746.9). Total num frames: 7864320. Throughput: 0: 46254.1. Samples: 8002900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 09:30:24,602][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:30:28,190][32415] Updated weights for policy 0, policy_version 490 (0.0027) [2024-06-10 09:30:29,592][32177] Fps is (10 sec: 47513.9, 60 sec: 46697.9, 300 sec: 35972.0). Total num frames: 8093696. Throughput: 0: 46308.1. Samples: 8151920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 09:30:29,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:30:31,561][32415] Updated weights for policy 0, policy_version 500 (0.0034) [2024-06-10 09:30:34,592][32177] Fps is (10 sec: 44236.8, 60 sec: 46151.7, 300 sec: 36116.0). Total num frames: 8306688. Throughput: 0: 46151.0. Samples: 8425420. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-10 09:30:34,601][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:30:34,606][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000000507_8306688.pth... [2024-06-10 09:30:35,174][32415] Updated weights for policy 0, policy_version 510 (0.0024) [2024-06-10 09:30:38,683][32415] Updated weights for policy 0, policy_version 520 (0.0033) [2024-06-10 09:30:39,592][32177] Fps is (10 sec: 45874.6, 60 sec: 46148.2, 300 sec: 36393.4). Total num frames: 8552448. Throughput: 0: 46253.2. Samples: 8699840. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-10 09:30:39,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:30:42,398][32415] Updated weights for policy 0, policy_version 530 (0.0032) [2024-06-10 09:30:44,592][32177] Fps is (10 sec: 49152.4, 60 sec: 46967.4, 300 sec: 36659.2). Total num frames: 8798208. Throughput: 0: 46384.8. Samples: 8846380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 09:30:44,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:30:45,578][32415] Updated weights for policy 0, policy_version 540 (0.0030) [2024-06-10 09:30:49,299][32415] Updated weights for policy 0, policy_version 550 (0.0040) [2024-06-10 09:30:49,592][32177] Fps is (10 sec: 45875.8, 60 sec: 46148.3, 300 sec: 36780.4). Total num frames: 9011200. Throughput: 0: 46382.8. Samples: 9123960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 09:30:49,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:30:50,547][32394] Signal inference workers to stop experience collection... (100 times) [2024-06-10 09:30:50,548][32394] Signal inference workers to resume experience collection... (100 times) [2024-06-10 09:30:50,558][32415] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-06-10 09:30:50,572][32415] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-06-10 09:30:52,563][32415] Updated weights for policy 0, policy_version 560 (0.0026) [2024-06-10 09:30:54,592][32177] Fps is (10 sec: 42597.9, 60 sec: 45602.0, 300 sec: 36896.8). Total num frames: 9224192. Throughput: 0: 46385.7. Samples: 9394680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 09:30:54,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:30:56,541][32415] Updated weights for policy 0, policy_version 570 (0.0037) [2024-06-10 09:30:59,592][32177] Fps is (10 sec: 45874.1, 60 sec: 46694.2, 300 sec: 37137.1). Total num frames: 9469952. Throughput: 0: 46368.7. Samples: 9537980. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 09:30:59,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:30:59,776][32415] Updated weights for policy 0, policy_version 580 (0.0039) [2024-06-10 09:31:03,793][32415] Updated weights for policy 0, policy_version 590 (0.0034) [2024-06-10 09:31:04,592][32177] Fps is (10 sec: 49152.5, 60 sec: 46694.4, 300 sec: 37368.1). Total num frames: 9715712. Throughput: 0: 46331.5. Samples: 9816000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-10 09:31:04,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:31:06,862][32415] Updated weights for policy 0, policy_version 600 (0.0033) [2024-06-10 09:31:09,592][32177] Fps is (10 sec: 45875.5, 60 sec: 46148.2, 300 sec: 37466.8). Total num frames: 9928704. Throughput: 0: 46404.0. Samples: 10091080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 09:31:09,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:31:10,653][32415] Updated weights for policy 0, policy_version 610 (0.0029) [2024-06-10 09:31:13,987][32415] Updated weights for policy 0, policy_version 620 (0.0045) [2024-06-10 09:31:14,596][32177] Fps is (10 sec: 47493.7, 60 sec: 46421.3, 300 sec: 37743.3). Total num frames: 10190848. Throughput: 0: 46111.6. Samples: 10227140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-10 09:31:14,596][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:31:17,665][32415] Updated weights for policy 0, policy_version 630 (0.0030) [2024-06-10 09:31:19,592][32177] Fps is (10 sec: 49152.4, 60 sec: 46694.4, 300 sec: 37891.7). Total num frames: 10420224. Throughput: 0: 46321.9. Samples: 10509900. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-10 09:31:19,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:31:20,713][32415] Updated weights for policy 0, policy_version 640 (0.0042) [2024-06-10 09:31:24,592][32177] Fps is (10 sec: 44255.5, 60 sec: 46148.3, 300 sec: 37975.8). Total num frames: 10633216. 
Throughput: 0: 46594.3. Samples: 10796580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 09:31:24,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:31:24,722][32415] Updated weights for policy 0, policy_version 650 (0.0032) [2024-06-10 09:31:27,935][32415] Updated weights for policy 0, policy_version 660 (0.0039) [2024-06-10 09:31:29,592][32177] Fps is (10 sec: 44236.8, 60 sec: 46148.2, 300 sec: 38114.4). Total num frames: 10862592. Throughput: 0: 46067.6. Samples: 10919420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 09:31:29,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:31:31,873][32415] Updated weights for policy 0, policy_version 670 (0.0047) [2024-06-10 09:31:34,592][32177] Fps is (10 sec: 47513.6, 60 sec: 46694.5, 300 sec: 38304.7). Total num frames: 11108352. Throughput: 0: 46371.0. Samples: 11210660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 09:31:34,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:31:34,879][32415] Updated weights for policy 0, policy_version 680 (0.0041) [2024-06-10 09:31:39,071][32415] Updated weights for policy 0, policy_version 690 (0.0035) [2024-06-10 09:31:39,592][32177] Fps is (10 sec: 45875.6, 60 sec: 46148.4, 300 sec: 38377.5). Total num frames: 11321344. Throughput: 0: 46490.0. Samples: 11486720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-10 09:31:39,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:31:42,143][32415] Updated weights for policy 0, policy_version 700 (0.0034) [2024-06-10 09:31:44,592][32177] Fps is (10 sec: 45874.9, 60 sec: 46148.2, 300 sec: 39210.5). Total num frames: 11567104. Throughput: 0: 46306.3. Samples: 11621760. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 09:31:44,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:31:45,880][32415] Updated weights for policy 0, policy_version 710 (0.0033) [2024-06-10 09:31:49,046][32415] Updated weights for policy 0, policy_version 720 (0.0033) [2024-06-10 09:31:49,592][32177] Fps is (10 sec: 49152.1, 60 sec: 46694.4, 300 sec: 39488.3). Total num frames: 11812864. Throughput: 0: 46171.2. Samples: 11893700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-10 09:31:49,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:31:53,066][32415] Updated weights for policy 0, policy_version 730 (0.0044) [2024-06-10 09:31:54,592][32177] Fps is (10 sec: 45875.8, 60 sec: 46694.6, 300 sec: 40210.2). Total num frames: 12025856. Throughput: 0: 46224.2. Samples: 12171160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 09:31:54,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:31:56,190][32415] Updated weights for policy 0, policy_version 740 (0.0036) [2024-06-10 09:31:59,592][32177] Fps is (10 sec: 42598.4, 60 sec: 46148.5, 300 sec: 40876.7). Total num frames: 12238848. Throughput: 0: 46276.9. Samples: 12309400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 09:31:59,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:32:00,297][32415] Updated weights for policy 0, policy_version 750 (0.0033) [2024-06-10 09:32:03,011][32415] Updated weights for policy 0, policy_version 760 (0.0031) [2024-06-10 09:32:04,592][32177] Fps is (10 sec: 47512.6, 60 sec: 46421.3, 300 sec: 41709.8). Total num frames: 12500992. Throughput: 0: 46080.3. Samples: 12583520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 09:32:04,593][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:32:07,219][32415] Updated weights for policy 0, policy_version 770 (0.0022) [2024-06-10 09:32:09,592][32177] Fps is (10 sec: 49152.0, 60 sec: 46694.5, 300 sec: 42320.7). Total num frames: 12730368. 
Throughput: 0: 46033.4. Samples: 12868080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 09:32:09,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:32:10,306][32415] Updated weights for policy 0, policy_version 780 (0.0026) [2024-06-10 09:32:14,352][32415] Updated weights for policy 0, policy_version 790 (0.0037) [2024-06-10 09:32:14,592][32177] Fps is (10 sec: 44237.7, 60 sec: 45878.5, 300 sec: 42765.0). Total num frames: 12943360. Throughput: 0: 46461.4. Samples: 13010180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 09:32:14,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:32:17,472][32415] Updated weights for policy 0, policy_version 800 (0.0035) [2024-06-10 09:32:19,592][32177] Fps is (10 sec: 44236.2, 60 sec: 45875.2, 300 sec: 43209.3). Total num frames: 13172736. Throughput: 0: 45970.6. Samples: 13279340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 09:32:19,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:32:21,347][32415] Updated weights for policy 0, policy_version 810 (0.0036) [2024-06-10 09:32:24,489][32415] Updated weights for policy 0, policy_version 820 (0.0036) [2024-06-10 09:32:24,592][32177] Fps is (10 sec: 49151.9, 60 sec: 46694.5, 300 sec: 43709.2). Total num frames: 13434880. Throughput: 0: 46144.0. Samples: 13563200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 09:32:24,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:32:28,456][32394] Signal inference workers to stop experience collection... (150 times) [2024-06-10 09:32:28,466][32415] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-06-10 09:32:28,514][32394] Signal inference workers to resume experience collection... 
(150 times) [2024-06-10 09:32:28,514][32415] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-06-10 09:32:28,516][32415] Updated weights for policy 0, policy_version 830 (0.0034) [2024-06-10 09:32:29,592][32177] Fps is (10 sec: 47513.5, 60 sec: 46421.3, 300 sec: 43986.9). Total num frames: 13647872. Throughput: 0: 46436.0. Samples: 13711380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-10 09:32:29,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:32:31,398][32415] Updated weights for policy 0, policy_version 840 (0.0036) [2024-06-10 09:32:34,592][32177] Fps is (10 sec: 44236.5, 60 sec: 46148.3, 300 sec: 44320.1). Total num frames: 13877248. Throughput: 0: 46470.6. Samples: 13984880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-10 09:32:34,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:32:34,660][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000000848_13893632.pth... [2024-06-10 09:32:34,714][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000000177_2899968.pth [2024-06-10 09:32:35,409][32415] Updated weights for policy 0, policy_version 850 (0.0033) [2024-06-10 09:32:38,467][32415] Updated weights for policy 0, policy_version 860 (0.0034) [2024-06-10 09:32:39,592][32177] Fps is (10 sec: 47513.6, 60 sec: 46694.3, 300 sec: 44542.3). Total num frames: 14123008. Throughput: 0: 46432.3. Samples: 14260620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 09:32:39,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:32:42,436][32415] Updated weights for policy 0, policy_version 870 (0.0028) [2024-06-10 09:32:44,592][32177] Fps is (10 sec: 47513.1, 60 sec: 46421.3, 300 sec: 44764.4). Total num frames: 14352384. Throughput: 0: 46601.6. Samples: 14406480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 09:32:44,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:32:45,638][32415] Updated weights for policy 0, policy_version 880 (0.0037) [2024-06-10 09:32:49,592][32177] Fps is (10 sec: 44237.2, 60 sec: 45875.2, 300 sec: 44931.1). Total num frames: 14565376. Throughput: 0: 46670.4. Samples: 14683680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-10 09:32:49,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:32:49,631][32415] Updated weights for policy 0, policy_version 890 (0.0036) [2024-06-10 09:32:52,680][32415] Updated weights for policy 0, policy_version 900 (0.0041) [2024-06-10 09:32:54,592][32177] Fps is (10 sec: 44236.8, 60 sec: 46148.1, 300 sec: 45042.1). Total num frames: 14794752. Throughput: 0: 46516.3. Samples: 14961320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 09:32:54,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:32:56,638][32415] Updated weights for policy 0, policy_version 910 (0.0029) [2024-06-10 09:32:59,592][32177] Fps is (10 sec: 47513.8, 60 sec: 46694.4, 300 sec: 45264.3). Total num frames: 15040512. Throughput: 0: 46327.1. Samples: 15094900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-10 09:32:59,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:32:59,801][32415] Updated weights for policy 0, policy_version 920 (0.0029) [2024-06-10 09:33:03,579][32415] Updated weights for policy 0, policy_version 930 (0.0025) [2024-06-10 09:33:04,592][32177] Fps is (10 sec: 49151.8, 60 sec: 46421.3, 300 sec: 45430.9). Total num frames: 15286272. Throughput: 0: 46521.2. Samples: 15372800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-10 09:33:04,593][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:33:07,017][32415] Updated weights for policy 0, policy_version 940 (0.0024) [2024-06-10 09:33:09,592][32177] Fps is (10 sec: 45873.8, 60 sec: 46148.0, 300 sec: 45486.4). Total num frames: 15499264. 
Throughput: 0: 46374.4. Samples: 15650060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-10 09:33:09,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:33:10,708][32415] Updated weights for policy 0, policy_version 950 (0.0030) [2024-06-10 09:33:14,164][32415] Updated weights for policy 0, policy_version 960 (0.0035) [2024-06-10 09:33:14,592][32177] Fps is (10 sec: 45875.5, 60 sec: 46694.3, 300 sec: 45653.0). Total num frames: 15745024. Throughput: 0: 46121.8. Samples: 15786860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-10 09:33:14,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:33:18,154][32415] Updated weights for policy 0, policy_version 970 (0.0029) [2024-06-10 09:33:19,592][32177] Fps is (10 sec: 47514.9, 60 sec: 46694.5, 300 sec: 45819.7). Total num frames: 15974400. Throughput: 0: 46264.9. Samples: 16066800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 09:33:19,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:33:21,252][32415] Updated weights for policy 0, policy_version 980 (0.0026) [2024-06-10 09:33:24,592][32177] Fps is (10 sec: 44236.9, 60 sec: 45875.1, 300 sec: 45819.7). Total num frames: 16187392. Throughput: 0: 46360.0. Samples: 16346820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 09:33:24,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:33:25,046][32415] Updated weights for policy 0, policy_version 990 (0.0032) [2024-06-10 09:33:28,318][32415] Updated weights for policy 0, policy_version 1000 (0.0035) [2024-06-10 09:33:29,592][32177] Fps is (10 sec: 45874.6, 60 sec: 46421.3, 300 sec: 45930.8). Total num frames: 16433152. Throughput: 0: 46006.2. Samples: 16476760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 09:33:29,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:33:32,084][32415] Updated weights for policy 0, policy_version 1010 (0.0027) [2024-06-10 09:33:34,592][32177] Fps is (10 sec: 47514.1, 60 sec: 46421.4, 300 sec: 46097.4). Total num frames: 16662528. Throughput: 0: 46152.9. Samples: 16760560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 09:33:34,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:33:35,483][32415] Updated weights for policy 0, policy_version 1020 (0.0027) [2024-06-10 09:33:39,000][32415] Updated weights for policy 0, policy_version 1030 (0.0032) [2024-06-10 09:33:39,591][32177] Fps is (10 sec: 45876.2, 60 sec: 46148.4, 300 sec: 46097.4). Total num frames: 16891904. Throughput: 0: 46126.9. Samples: 17037020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-10 09:33:39,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:33:42,429][32415] Updated weights for policy 0, policy_version 1040 (0.0025) [2024-06-10 09:33:44,592][32177] Fps is (10 sec: 45874.2, 60 sec: 46148.2, 300 sec: 46152.9). Total num frames: 17121280. Throughput: 0: 46429.1. Samples: 17184220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 09:33:44,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:33:46,254][32415] Updated weights for policy 0, policy_version 1050 (0.0031) [2024-06-10 09:33:49,592][32177] Fps is (10 sec: 44236.6, 60 sec: 46148.3, 300 sec: 46152.9). Total num frames: 17334272. Throughput: 0: 46150.9. Samples: 17449580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-10 09:33:49,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:33:49,786][32415] Updated weights for policy 0, policy_version 1060 (0.0028) [2024-06-10 09:33:53,319][32415] Updated weights for policy 0, policy_version 1070 (0.0033) [2024-06-10 09:33:54,592][32177] Fps is (10 sec: 45876.1, 60 sec: 46421.4, 300 sec: 46208.5). 
Total num frames: 17580032. Throughput: 0: 46324.3. Samples: 17734640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 09:33:54,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:33:56,856][32415] Updated weights for policy 0, policy_version 1080 (0.0033) [2024-06-10 09:33:59,592][32177] Fps is (10 sec: 47512.8, 60 sec: 46148.1, 300 sec: 46208.4). Total num frames: 17809408. Throughput: 0: 46398.6. Samples: 17874800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 09:33:59,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:34:00,345][32415] Updated weights for policy 0, policy_version 1090 (0.0025) [2024-06-10 09:34:03,811][32415] Updated weights for policy 0, policy_version 1100 (0.0035) [2024-06-10 09:34:04,592][32177] Fps is (10 sec: 45874.7, 60 sec: 45875.3, 300 sec: 46264.0). Total num frames: 18038784. Throughput: 0: 46352.8. Samples: 18152680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-10 09:34:04,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:34:07,690][32415] Updated weights for policy 0, policy_version 1110 (0.0036) [2024-06-10 09:34:09,592][32177] Fps is (10 sec: 49152.3, 60 sec: 46694.6, 300 sec: 46430.6). Total num frames: 18300928. Throughput: 0: 46267.1. Samples: 18428840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-10 09:34:09,593][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:34:11,113][32415] Updated weights for policy 0, policy_version 1120 (0.0029) [2024-06-10 09:34:14,576][32415] Updated weights for policy 0, policy_version 1130 (0.0035) [2024-06-10 09:34:14,596][32177] Fps is (10 sec: 47493.7, 60 sec: 46145.0, 300 sec: 46318.9). Total num frames: 18513920. Throughput: 0: 46553.9. Samples: 18571880. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 09:34:14,596][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:34:18,418][32415] Updated weights for policy 0, policy_version 1140 (0.0026) [2024-06-10 09:34:19,592][32177] Fps is (10 sec: 45874.8, 60 sec: 46421.2, 300 sec: 46319.5). Total num frames: 18759680. Throughput: 0: 46561.1. Samples: 18855820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 09:34:19,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:34:21,778][32415] Updated weights for policy 0, policy_version 1150 (0.0022) [2024-06-10 09:34:22,316][32394] Signal inference workers to stop experience collection... (200 times) [2024-06-10 09:34:22,325][32415] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-06-10 09:34:22,431][32394] Signal inference workers to resume experience collection... (200 times) [2024-06-10 09:34:22,431][32415] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-06-10 09:34:24,592][32177] Fps is (10 sec: 45894.2, 60 sec: 46421.3, 300 sec: 46375.7). Total num frames: 18972672. Throughput: 0: 46251.8. Samples: 19118360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 09:34:24,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:34:25,630][32415] Updated weights for policy 0, policy_version 1160 (0.0034) [2024-06-10 09:34:28,731][32415] Updated weights for policy 0, policy_version 1170 (0.0038) [2024-06-10 09:34:29,592][32177] Fps is (10 sec: 44237.1, 60 sec: 46148.3, 300 sec: 46320.2). Total num frames: 19202048. Throughput: 0: 46185.4. Samples: 19262560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 09:34:29,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:34:32,931][32415] Updated weights for policy 0, policy_version 1180 (0.0042) [2024-06-10 09:34:34,596][32177] Fps is (10 sec: 45856.1, 60 sec: 46144.9, 300 sec: 46263.3). Total num frames: 19431424. Throughput: 0: 46343.1. Samples: 19535220. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 09:34:34,597][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:34:34,619][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000001186_19431424.pth... [2024-06-10 09:34:34,661][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000000507_8306688.pth [2024-06-10 09:34:36,119][32415] Updated weights for policy 0, policy_version 1190 (0.0031) [2024-06-10 09:34:39,592][32177] Fps is (10 sec: 42598.4, 60 sec: 45602.0, 300 sec: 46264.0). Total num frames: 19628032. Throughput: 0: 46030.5. Samples: 19806020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-10 09:34:39,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:34:40,183][32415] Updated weights for policy 0, policy_version 1200 (0.0037) [2024-06-10 09:34:43,457][32415] Updated weights for policy 0, policy_version 1210 (0.0032) [2024-06-10 09:34:44,592][32177] Fps is (10 sec: 47533.7, 60 sec: 46421.4, 300 sec: 46319.5). Total num frames: 19906560. Throughput: 0: 45945.4. Samples: 19942340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 09:34:44,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:34:47,297][32415] Updated weights for policy 0, policy_version 1220 (0.0030) [2024-06-10 09:34:49,592][32177] Fps is (10 sec: 47514.5, 60 sec: 46148.3, 300 sec: 46152.9). Total num frames: 20103168. Throughput: 0: 45865.9. Samples: 20216640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-10 09:34:49,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:34:50,545][32415] Updated weights for policy 0, policy_version 1230 (0.0033) [2024-06-10 09:34:54,378][32415] Updated weights for policy 0, policy_version 1240 (0.0031) [2024-06-10 09:34:54,592][32177] Fps is (10 sec: 40960.6, 60 sec: 45602.2, 300 sec: 46264.0). Total num frames: 20316160. Throughput: 0: 45944.6. Samples: 20496340. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-10 09:34:54,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:34:57,386][32415] Updated weights for policy 0, policy_version 1250 (0.0029) [2024-06-10 09:34:59,592][32177] Fps is (10 sec: 47513.3, 60 sec: 46148.4, 300 sec: 46319.5). Total num frames: 20578304. Throughput: 0: 45759.9. Samples: 20630880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 09:34:59,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:35:01,510][32415] Updated weights for policy 0, policy_version 1260 (0.0029) [2024-06-10 09:35:04,596][32177] Fps is (10 sec: 47492.5, 60 sec: 45871.9, 300 sec: 46207.8). Total num frames: 20791296. Throughput: 0: 45574.8. Samples: 20906880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 09:35:04,597][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:35:04,777][32415] Updated weights for policy 0, policy_version 1270 (0.0031) [2024-06-10 09:35:08,629][32415] Updated weights for policy 0, policy_version 1280 (0.0027) [2024-06-10 09:35:09,593][32177] Fps is (10 sec: 45867.8, 60 sec: 45601.0, 300 sec: 46208.9). Total num frames: 21037056. Throughput: 0: 45968.3. Samples: 21187000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-10 09:35:09,594][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:35:11,787][32415] Updated weights for policy 0, policy_version 1290 (0.0035) [2024-06-10 09:35:14,592][32177] Fps is (10 sec: 47533.9, 60 sec: 45878.4, 300 sec: 46264.0). Total num frames: 21266432. Throughput: 0: 45813.3. Samples: 21324160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 09:35:14,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:35:15,669][32415] Updated weights for policy 0, policy_version 1300 (0.0029) [2024-06-10 09:35:18,394][32415] Updated weights for policy 0, policy_version 1310 (0.0027) [2024-06-10 09:35:19,592][32177] Fps is (10 sec: 47521.4, 60 sec: 45875.4, 300 sec: 46264.0). 
Total num frames: 21512192. Throughput: 0: 45974.7. Samples: 21603880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 09:35:19,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:35:22,471][32415] Updated weights for policy 0, policy_version 1320 (0.0035) [2024-06-10 09:35:24,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 46208.4). Total num frames: 21725184. Throughput: 0: 46194.3. Samples: 21884760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 09:35:24,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:35:25,576][32415] Updated weights for policy 0, policy_version 1330 (0.0030) [2024-06-10 09:35:29,592][32177] Fps is (10 sec: 42597.9, 60 sec: 45602.2, 300 sec: 46208.4). Total num frames: 21938176. Throughput: 0: 46048.9. Samples: 22014540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-10 09:35:29,600][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:35:29,662][32415] Updated weights for policy 0, policy_version 1340 (0.0025) [2024-06-10 09:35:32,629][32394] Signal inference workers to stop experience collection... (250 times) [2024-06-10 09:35:32,637][32394] Signal inference workers to resume experience collection... (250 times) [2024-06-10 09:35:32,660][32415] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-06-10 09:35:32,660][32415] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-06-10 09:35:32,773][32415] Updated weights for policy 0, policy_version 1350 (0.0036) [2024-06-10 09:35:34,596][32177] Fps is (10 sec: 50768.7, 60 sec: 46694.4, 300 sec: 46374.4). Total num frames: 22233088. Throughput: 0: 46363.9. Samples: 22303220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 09:35:34,597][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:35:36,941][32415] Updated weights for policy 0, policy_version 1360 (0.0027) [2024-06-10 09:35:39,592][32177] Fps is (10 sec: 49151.6, 60 sec: 46694.4, 300 sec: 46208.4). 
Total num frames: 22429696. Throughput: 0: 46272.2. Samples: 22578600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 09:35:39,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:35:39,828][32415] Updated weights for policy 0, policy_version 1370 (0.0040) [2024-06-10 09:35:43,895][32415] Updated weights for policy 0, policy_version 1380 (0.0034) [2024-06-10 09:35:44,592][32177] Fps is (10 sec: 39338.2, 60 sec: 45329.0, 300 sec: 46152.9). Total num frames: 22626304. Throughput: 0: 46190.5. Samples: 22709460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-10 09:35:44,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:35:46,707][32415] Updated weights for policy 0, policy_version 1390 (0.0044) [2024-06-10 09:35:49,592][32177] Fps is (10 sec: 47514.2, 60 sec: 46694.3, 300 sec: 46375.1). Total num frames: 22904832. Throughput: 0: 46235.6. Samples: 22987280. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-10 09:35:49,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:35:51,156][32415] Updated weights for policy 0, policy_version 1400 (0.0036) [2024-06-10 09:35:54,121][32415] Updated weights for policy 0, policy_version 1410 (0.0026) [2024-06-10 09:35:54,592][32177] Fps is (10 sec: 49152.1, 60 sec: 46694.3, 300 sec: 46264.0). Total num frames: 23117824. Throughput: 0: 46162.4. Samples: 23264240. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-10 09:35:54,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:35:58,040][32415] Updated weights for policy 0, policy_version 1420 (0.0032) [2024-06-10 09:35:59,592][32177] Fps is (10 sec: 40960.3, 60 sec: 45602.2, 300 sec: 46097.4). Total num frames: 23314432. Throughput: 0: 46187.7. Samples: 23402600. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-10 09:35:59,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:36:01,307][32415] Updated weights for policy 0, policy_version 1430 (0.0037) [2024-06-10 09:36:04,592][32177] Fps is (10 sec: 45875.3, 60 sec: 46424.6, 300 sec: 46264.0). Total num frames: 23576576. Throughput: 0: 45976.3. Samples: 23672820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 09:36:04,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:36:05,179][32415] Updated weights for policy 0, policy_version 1440 (0.0027) [2024-06-10 09:36:08,385][32415] Updated weights for policy 0, policy_version 1450 (0.0021) [2024-06-10 09:36:09,592][32177] Fps is (10 sec: 50790.4, 60 sec: 46422.6, 300 sec: 46209.1). Total num frames: 23822336. Throughput: 0: 46006.8. Samples: 23955060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 09:36:09,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:36:12,598][32415] Updated weights for policy 0, policy_version 1460 (0.0022) [2024-06-10 09:36:14,592][32177] Fps is (10 sec: 45875.9, 60 sec: 46148.4, 300 sec: 46152.9). Total num frames: 24035328. Throughput: 0: 46213.0. Samples: 24094120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-10 09:36:14,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:36:15,445][32415] Updated weights for policy 0, policy_version 1470 (0.0024) [2024-06-10 09:36:19,404][32415] Updated weights for policy 0, policy_version 1480 (0.0032) [2024-06-10 09:36:19,592][32177] Fps is (10 sec: 42598.3, 60 sec: 45602.1, 300 sec: 46152.9). Total num frames: 24248320. Throughput: 0: 45882.2. Samples: 24367720. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-10 09:36:19,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:36:22,899][32415] Updated weights for policy 0, policy_version 1490 (0.0032) [2024-06-10 09:36:24,592][32177] Fps is (10 sec: 49151.4, 60 sec: 46694.4, 300 sec: 46319.5). 
Total num frames: 24526848. Throughput: 0: 45636.5. Samples: 24632240. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-10 09:36:24,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:36:26,314][32415] Updated weights for policy 0, policy_version 1500 (0.0022) [2024-06-10 09:36:27,710][32394] Signal inference workers to stop experience collection... (300 times) [2024-06-10 09:36:27,710][32394] Signal inference workers to resume experience collection... (300 times) [2024-06-10 09:36:27,760][32415] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-06-10 09:36:27,761][32415] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-06-10 09:36:29,592][32177] Fps is (10 sec: 45875.0, 60 sec: 46148.3, 300 sec: 46097.4). Total num frames: 24707072. Throughput: 0: 46199.2. Samples: 24788420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-10 09:36:29,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:36:30,073][32415] Updated weights for policy 0, policy_version 1510 (0.0032) [2024-06-10 09:36:33,575][32415] Updated weights for policy 0, policy_version 1520 (0.0042) [2024-06-10 09:36:34,592][32177] Fps is (10 sec: 39321.5, 60 sec: 44786.1, 300 sec: 46097.3). Total num frames: 24920064. Throughput: 0: 45978.1. Samples: 25056300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 09:36:34,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:36:34,607][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000001521_24920064.pth... [2024-06-10 09:36:34,652][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000000848_13893632.pth [2024-06-10 09:36:37,007][32415] Updated weights for policy 0, policy_version 1530 (0.0037) [2024-06-10 09:36:39,592][32177] Fps is (10 sec: 49151.7, 60 sec: 46148.3, 300 sec: 46208.4). Total num frames: 25198592. Throughput: 0: 45796.9. Samples: 25325100. 
Policy #0 lag: (min: 0.0, avg: 13.2, max: 25.0) [2024-06-10 09:36:39,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:36:40,805][32415] Updated weights for policy 0, policy_version 1540 (0.0031) [2024-06-10 09:36:43,952][32415] Updated weights for policy 0, policy_version 1550 (0.0034) [2024-06-10 09:36:44,592][32177] Fps is (10 sec: 49152.7, 60 sec: 46421.5, 300 sec: 46097.4). Total num frames: 25411584. Throughput: 0: 45970.2. Samples: 25471260. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-10 09:36:44,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:36:47,826][32415] Updated weights for policy 0, policy_version 1560 (0.0033) [2024-06-10 09:36:49,592][32177] Fps is (10 sec: 42598.9, 60 sec: 45329.1, 300 sec: 46097.4). Total num frames: 25624576. Throughput: 0: 46120.1. Samples: 25748220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 09:36:49,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:36:51,299][32415] Updated weights for policy 0, policy_version 1570 (0.0034) [2024-06-10 09:36:54,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45875.3, 300 sec: 46208.4). Total num frames: 25870336. Throughput: 0: 45929.3. Samples: 26021880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 09:36:54,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:36:54,788][32415] Updated weights for policy 0, policy_version 1580 (0.0040) [2024-06-10 09:36:58,318][32415] Updated weights for policy 0, policy_version 1590 (0.0025) [2024-06-10 09:36:59,592][32177] Fps is (10 sec: 50790.6, 60 sec: 46967.5, 300 sec: 46208.5). Total num frames: 26132480. Throughput: 0: 45944.9. Samples: 26161640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 09:36:59,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:37:01,878][32415] Updated weights for policy 0, policy_version 1600 (0.0037) [2024-06-10 09:37:04,592][32177] Fps is (10 sec: 45874.9, 60 sec: 45875.3, 300 sec: 46097.3). 
Total num frames: 26329088. Throughput: 0: 46063.5. Samples: 26440580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-10 09:37:04,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:37:05,392][32415] Updated weights for policy 0, policy_version 1610 (0.0030) [2024-06-10 09:37:09,266][32415] Updated weights for policy 0, policy_version 1620 (0.0028) [2024-06-10 09:37:09,594][32177] Fps is (10 sec: 40949.2, 60 sec: 45327.1, 300 sec: 46097.0). Total num frames: 26542080. Throughput: 0: 46273.9. Samples: 26714680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-10 09:37:09,595][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:37:12,483][32415] Updated weights for policy 0, policy_version 1630 (0.0031) [2024-06-10 09:37:14,596][32177] Fps is (10 sec: 47493.2, 60 sec: 46144.9, 300 sec: 46207.8). Total num frames: 26804224. Throughput: 0: 45848.1. Samples: 26851780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 09:37:14,597][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:37:16,409][32415] Updated weights for policy 0, policy_version 1640 (0.0048) [2024-06-10 09:37:19,592][32177] Fps is (10 sec: 47525.9, 60 sec: 46148.3, 300 sec: 46041.8). Total num frames: 27017216. Throughput: 0: 46049.0. Samples: 27128500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 09:37:19,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:37:19,694][32415] Updated weights for policy 0, policy_version 1650 (0.0033) [2024-06-10 09:37:23,699][32415] Updated weights for policy 0, policy_version 1660 (0.0030) [2024-06-10 09:37:24,596][32177] Fps is (10 sec: 42598.6, 60 sec: 45052.9, 300 sec: 46041.2). Total num frames: 27230208. Throughput: 0: 46123.7. Samples: 27400860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 09:37:24,596][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:37:24,618][32394] Saving new best policy, reward=0.001! 
[2024-06-10 09:37:26,907][32415] Updated weights for policy 0, policy_version 1670 (0.0032) [2024-06-10 09:37:29,592][32177] Fps is (10 sec: 45874.8, 60 sec: 46148.2, 300 sec: 46097.4). Total num frames: 27475968. Throughput: 0: 45825.2. Samples: 27533400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-10 09:37:29,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:37:30,772][32415] Updated weights for policy 0, policy_version 1680 (0.0024) [2024-06-10 09:37:33,938][32415] Updated weights for policy 0, policy_version 1690 (0.0042) [2024-06-10 09:37:34,592][32177] Fps is (10 sec: 49173.1, 60 sec: 46694.5, 300 sec: 46097.4). Total num frames: 27721728. Throughput: 0: 45931.1. Samples: 27815120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 09:37:34,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:37:37,871][32415] Updated weights for policy 0, policy_version 1700 (0.0034) [2024-06-10 09:37:39,592][32177] Fps is (10 sec: 45875.7, 60 sec: 45602.2, 300 sec: 46041.8). Total num frames: 27934720. Throughput: 0: 45953.4. Samples: 28089780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-10 09:37:39,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:37:40,909][32415] Updated weights for policy 0, policy_version 1710 (0.0026) [2024-06-10 09:37:44,592][32177] Fps is (10 sec: 42598.5, 60 sec: 45602.2, 300 sec: 46041.8). Total num frames: 28147712. Throughput: 0: 45807.1. Samples: 28222960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 09:37:44,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:37:44,814][32415] Updated weights for policy 0, policy_version 1720 (0.0032) [2024-06-10 09:37:48,041][32415] Updated weights for policy 0, policy_version 1730 (0.0040) [2024-06-10 09:37:49,592][32177] Fps is (10 sec: 47512.0, 60 sec: 46421.1, 300 sec: 46152.9). Total num frames: 28409856. Throughput: 0: 45778.4. Samples: 28500620. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 09:37:49,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:37:51,928][32415] Updated weights for policy 0, policy_version 1740 (0.0048) [2024-06-10 09:37:54,592][32177] Fps is (10 sec: 49151.4, 60 sec: 46148.2, 300 sec: 46097.3). Total num frames: 28639232. Throughput: 0: 46065.2. Samples: 28787500. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-10 09:37:54,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:37:55,226][32415] Updated weights for policy 0, policy_version 1750 (0.0031) [2024-06-10 09:37:56,592][32394] Signal inference workers to stop experience collection... (350 times) [2024-06-10 09:37:56,592][32394] Signal inference workers to resume experience collection... (350 times) [2024-06-10 09:37:56,627][32415] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-06-10 09:37:56,627][32415] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-06-10 09:37:59,157][32415] Updated weights for policy 0, policy_version 1760 (0.0032) [2024-06-10 09:37:59,592][32177] Fps is (10 sec: 44237.6, 60 sec: 45328.9, 300 sec: 45986.3). Total num frames: 28852224. Throughput: 0: 45915.9. Samples: 28917800. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-10 09:37:59,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:38:02,167][32415] Updated weights for policy 0, policy_version 1770 (0.0026) [2024-06-10 09:38:04,592][32177] Fps is (10 sec: 47513.1, 60 sec: 46421.2, 300 sec: 46152.9). Total num frames: 29114368. Throughput: 0: 46070.9. Samples: 29201700. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 09:38:04,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:38:05,885][32415] Updated weights for policy 0, policy_version 1780 (0.0033) [2024-06-10 09:38:09,082][32415] Updated weights for policy 0, policy_version 1790 (0.0028) [2024-06-10 09:38:09,596][32177] Fps is (10 sec: 49131.2, 60 sec: 46693.0, 300 sec: 46096.7). Total num frames: 29343744. Throughput: 0: 46249.3. Samples: 29482080. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-10 09:38:09,597][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:38:13,118][32415] Updated weights for policy 0, policy_version 1800 (0.0033) [2024-06-10 09:38:14,596][32177] Fps is (10 sec: 42580.8, 60 sec: 45602.1, 300 sec: 45985.6). Total num frames: 29540352. Throughput: 0: 46318.3. Samples: 29617920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-10 09:38:14,597][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:38:16,386][32415] Updated weights for policy 0, policy_version 1810 (0.0030) [2024-06-10 09:38:19,592][32177] Fps is (10 sec: 44255.7, 60 sec: 46148.2, 300 sec: 46097.4). Total num frames: 29786112. Throughput: 0: 46086.1. Samples: 29889000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 09:38:19,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:38:20,330][32415] Updated weights for policy 0, policy_version 1820 (0.0039) [2024-06-10 09:38:23,469][32415] Updated weights for policy 0, policy_version 1830 (0.0038) [2024-06-10 09:38:24,592][32177] Fps is (10 sec: 49173.3, 60 sec: 46697.7, 300 sec: 46097.4). Total num frames: 30031872. Throughput: 0: 46092.0. Samples: 30163920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-10 09:38:24,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:38:27,511][32415] Updated weights for policy 0, policy_version 1840 (0.0027) [2024-06-10 09:38:29,592][32177] Fps is (10 sec: 45875.6, 60 sec: 46148.3, 300 sec: 46041.8). 
Total num frames: 30244864. Throughput: 0: 46287.1. Samples: 30305880. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-10 09:38:29,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:38:30,741][32415] Updated weights for policy 0, policy_version 1850 (0.0035) [2024-06-10 09:38:34,342][32415] Updated weights for policy 0, policy_version 1860 (0.0038) [2024-06-10 09:38:34,592][32177] Fps is (10 sec: 44236.2, 60 sec: 45875.1, 300 sec: 46041.8). Total num frames: 30474240. Throughput: 0: 46250.4. Samples: 30581880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-10 09:38:34,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:38:34,598][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000001860_30474240.pth... [2024-06-10 09:38:34,651][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000001186_19431424.pth [2024-06-10 09:38:37,785][32415] Updated weights for policy 0, policy_version 1870 (0.0023) [2024-06-10 09:38:39,592][32177] Fps is (10 sec: 49152.1, 60 sec: 46694.4, 300 sec: 46152.9). Total num frames: 30736384. Throughput: 0: 46033.9. Samples: 30859020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-10 09:38:39,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:38:41,484][32415] Updated weights for policy 0, policy_version 1880 (0.0026) [2024-06-10 09:38:44,592][32177] Fps is (10 sec: 45876.0, 60 sec: 46421.3, 300 sec: 46097.4). Total num frames: 30932992. Throughput: 0: 46211.3. Samples: 30997300. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-10 09:38:44,592][32177] Avg episode reward: [(0, '0.000')] [2024-06-10 09:38:44,948][32415] Updated weights for policy 0, policy_version 1890 (0.0035) [2024-06-10 09:38:48,657][32415] Updated weights for policy 0, policy_version 1900 (0.0041) [2024-06-10 09:38:49,592][32177] Fps is (10 sec: 40959.9, 60 sec: 45602.4, 300 sec: 45986.3). Total num frames: 31145984. Throughput: 0: 45972.2. 
Samples: 31270440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 09:38:49,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:38:52,175][32415] Updated weights for policy 0, policy_version 1910 (0.0030) [2024-06-10 09:38:54,592][32177] Fps is (10 sec: 45875.0, 60 sec: 45875.3, 300 sec: 46041.8). Total num frames: 31391744. Throughput: 0: 45862.2. Samples: 31545680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-10 09:38:54,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:38:55,739][32415] Updated weights for policy 0, policy_version 1920 (0.0033) [2024-06-10 09:38:59,167][32415] Updated weights for policy 0, policy_version 1930 (0.0033) [2024-06-10 09:38:59,592][32177] Fps is (10 sec: 49151.5, 60 sec: 46421.3, 300 sec: 46097.4). Total num frames: 31637504. Throughput: 0: 46107.0. Samples: 31692540. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-10 09:38:59,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:39:03,028][32415] Updated weights for policy 0, policy_version 1940 (0.0030) [2024-06-10 09:39:04,592][32177] Fps is (10 sec: 44236.4, 60 sec: 45329.1, 300 sec: 45875.2). Total num frames: 31834112. Throughput: 0: 46011.1. Samples: 31959500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 09:39:04,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:39:06,412][32415] Updated weights for policy 0, policy_version 1950 (0.0026) [2024-06-10 09:39:09,592][32177] Fps is (10 sec: 42598.8, 60 sec: 45332.4, 300 sec: 45931.4). Total num frames: 32063488. Throughput: 0: 45896.9. Samples: 32229280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-10 09:39:09,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:39:09,865][32415] Updated weights for policy 0, policy_version 1960 (0.0026) [2024-06-10 09:39:13,316][32415] Updated weights for policy 0, policy_version 1970 (0.0026) [2024-06-10 09:39:14,592][32177] Fps is (10 sec: 49151.7, 60 sec: 46424.6, 300 sec: 45986.3). 
Total num frames: 32325632. Throughput: 0: 46058.5. Samples: 32378520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 09:39:14,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:39:17,376][32415] Updated weights for policy 0, policy_version 1980 (0.0037) [2024-06-10 09:39:19,591][32177] Fps is (10 sec: 44237.1, 60 sec: 45329.2, 300 sec: 45875.2). Total num frames: 32505856. Throughput: 0: 45815.7. Samples: 32643580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-10 09:39:19,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:39:20,890][32415] Updated weights for policy 0, policy_version 1990 (0.0033) [2024-06-10 09:39:24,567][32415] Updated weights for policy 0, policy_version 2000 (0.0037) [2024-06-10 09:39:24,591][32177] Fps is (10 sec: 44237.7, 60 sec: 45602.2, 300 sec: 45986.3). Total num frames: 32768000. Throughput: 0: 45850.7. Samples: 32922300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 09:39:24,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:39:27,935][32415] Updated weights for policy 0, policy_version 2010 (0.0029) [2024-06-10 09:39:29,592][32177] Fps is (10 sec: 49151.7, 60 sec: 45875.2, 300 sec: 45987.0). Total num frames: 32997376. Throughput: 0: 45769.7. Samples: 33056940. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-10 09:39:29,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:39:31,659][32415] Updated weights for policy 0, policy_version 2020 (0.0035) [2024-06-10 09:39:34,592][32177] Fps is (10 sec: 45874.5, 60 sec: 45875.2, 300 sec: 46097.4). Total num frames: 33226752. Throughput: 0: 45862.1. Samples: 33334240. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 09:39:34,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:39:35,239][32415] Updated weights for policy 0, policy_version 2030 (0.0042) [2024-06-10 09:39:38,801][32415] Updated weights for policy 0, policy_version 2040 (0.0035) [2024-06-10 09:39:39,592][32177] Fps is (10 sec: 44236.0, 60 sec: 45055.9, 300 sec: 45875.2). Total num frames: 33439744. Throughput: 0: 45684.7. Samples: 33601500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 09:39:39,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:39:41,405][32394] Signal inference workers to stop experience collection... (400 times) [2024-06-10 09:39:41,406][32394] Signal inference workers to resume experience collection... (400 times) [2024-06-10 09:39:41,445][32415] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-06-10 09:39:41,445][32415] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-06-10 09:39:42,319][32415] Updated weights for policy 0, policy_version 2050 (0.0029) [2024-06-10 09:39:44,596][32177] Fps is (10 sec: 45855.8, 60 sec: 45871.9, 300 sec: 46041.1). Total num frames: 33685504. Throughput: 0: 45429.1. Samples: 33737040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-10 09:39:44,597][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:39:46,186][32415] Updated weights for policy 0, policy_version 2060 (0.0034) [2024-06-10 09:39:49,514][32415] Updated weights for policy 0, policy_version 2070 (0.0035) [2024-06-10 09:39:49,592][32177] Fps is (10 sec: 47514.4, 60 sec: 46148.3, 300 sec: 46097.4). Total num frames: 33914880. Throughput: 0: 45638.3. Samples: 34013220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-10 09:39:49,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:39:49,592][32394] Saving new best policy, reward=0.002! 
[2024-06-10 09:39:53,352][32415] Updated weights for policy 0, policy_version 2080 (0.0031) [2024-06-10 09:39:54,592][32177] Fps is (10 sec: 42615.8, 60 sec: 45328.9, 300 sec: 45875.2). Total num frames: 34111488. Throughput: 0: 45773.1. Samples: 34289080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-10 09:39:54,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:39:56,788][32415] Updated weights for policy 0, policy_version 2090 (0.0037) [2024-06-10 09:39:59,592][32177] Fps is (10 sec: 44236.2, 60 sec: 45329.1, 300 sec: 45986.9). Total num frames: 34357248. Throughput: 0: 45304.9. Samples: 34417240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 09:39:59,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:40:00,331][32415] Updated weights for policy 0, policy_version 2100 (0.0037) [2024-06-10 09:40:04,150][32415] Updated weights for policy 0, policy_version 2110 (0.0026) [2024-06-10 09:40:04,592][32177] Fps is (10 sec: 47514.5, 60 sec: 45875.2, 300 sec: 45931.0). Total num frames: 34586624. Throughput: 0: 45674.6. Samples: 34698940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-10 09:40:04,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:40:07,593][32415] Updated weights for policy 0, policy_version 2120 (0.0047) [2024-06-10 09:40:09,592][32177] Fps is (10 sec: 45874.8, 60 sec: 45875.0, 300 sec: 45930.7). Total num frames: 34816000. Throughput: 0: 45551.3. Samples: 34972120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 09:40:09,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:40:11,221][32415] Updated weights for policy 0, policy_version 2130 (0.0027) [2024-06-10 09:40:14,592][32177] Fps is (10 sec: 45875.4, 60 sec: 45329.2, 300 sec: 45875.2). Total num frames: 35045376. Throughput: 0: 45509.8. Samples: 35104880. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 09:40:14,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:40:14,607][32415] Updated weights for policy 0, policy_version 2140 (0.0032) [2024-06-10 09:40:18,187][32415] Updated weights for policy 0, policy_version 2150 (0.0034) [2024-06-10 09:40:19,592][32177] Fps is (10 sec: 47513.8, 60 sec: 46421.2, 300 sec: 45986.3). Total num frames: 35291136. Throughput: 0: 45543.9. Samples: 35383720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-10 09:40:19,592][32177] Avg episode reward: [(0, '0.001')] [2024-06-10 09:40:21,721][32415] Updated weights for policy 0, policy_version 2160 (0.0036) [2024-06-10 09:40:24,592][32177] Fps is (10 sec: 44236.1, 60 sec: 45328.9, 300 sec: 45930.7). Total num frames: 35487744. Throughput: 0: 45728.0. Samples: 35659260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-10 09:40:24,595][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:40:25,319][32415] Updated weights for policy 0, policy_version 2170 (0.0032) [2024-06-10 09:40:28,743][32415] Updated weights for policy 0, policy_version 2180 (0.0024) [2024-06-10 09:40:29,592][32177] Fps is (10 sec: 44237.4, 60 sec: 45602.1, 300 sec: 45764.8). Total num frames: 35733504. Throughput: 0: 45703.9. Samples: 35793520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 09:40:29,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:40:32,843][32415] Updated weights for policy 0, policy_version 2190 (0.0034) [2024-06-10 09:40:34,591][32177] Fps is (10 sec: 49153.2, 60 sec: 45875.3, 300 sec: 45930.8). Total num frames: 35979264. Throughput: 0: 45665.8. Samples: 36068180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 09:40:34,592][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:40:34,600][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000002196_35979264.pth... 
[2024-06-10 09:40:34,644][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000001521_24920064.pth [2024-06-10 09:40:35,882][32415] Updated weights for policy 0, policy_version 2200 (0.0031) [2024-06-10 09:40:39,592][32177] Fps is (10 sec: 44236.3, 60 sec: 45602.2, 300 sec: 45930.7). Total num frames: 36175872. Throughput: 0: 45729.4. Samples: 36346900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 09:40:39,592][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:40:39,763][32415] Updated weights for policy 0, policy_version 2210 (0.0026) [2024-06-10 09:40:43,262][32415] Updated weights for policy 0, policy_version 2220 (0.0034) [2024-06-10 09:40:44,592][32177] Fps is (10 sec: 42597.9, 60 sec: 45332.3, 300 sec: 45764.1). Total num frames: 36405248. Throughput: 0: 45693.4. Samples: 36473440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 09:40:44,592][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:40:44,609][32394] Saving new best policy, reward=0.003! [2024-06-10 09:40:46,988][32415] Updated weights for policy 0, policy_version 2230 (0.0033) [2024-06-10 09:40:49,592][32177] Fps is (10 sec: 47513.7, 60 sec: 45602.1, 300 sec: 45875.2). Total num frames: 36651008. Throughput: 0: 45592.8. Samples: 36750620. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-10 09:40:49,592][32177] Avg episode reward: [(0, '0.004')] [2024-06-10 09:40:50,392][32415] Updated weights for policy 0, policy_version 2240 (0.0035) [2024-06-10 09:40:54,164][32415] Updated weights for policy 0, policy_version 2250 (0.0036) [2024-06-10 09:40:54,592][32177] Fps is (10 sec: 47513.1, 60 sec: 46148.3, 300 sec: 45986.2). Total num frames: 36880384. Throughput: 0: 45653.8. Samples: 37026540. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 09:40:54,593][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:40:57,386][32415] Updated weights for policy 0, policy_version 2260 (0.0033) [2024-06-10 09:40:59,592][32177] Fps is (10 sec: 45873.4, 60 sec: 45874.9, 300 sec: 45875.1). Total num frames: 37109760. Throughput: 0: 45883.0. Samples: 37169640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 09:40:59,598][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:41:01,388][32415] Updated weights for policy 0, policy_version 2270 (0.0031) [2024-06-10 09:41:04,583][32415] Updated weights for policy 0, policy_version 2280 (0.0034) [2024-06-10 09:41:04,592][32177] Fps is (10 sec: 47514.6, 60 sec: 46148.3, 300 sec: 45875.2). Total num frames: 37355520. Throughput: 0: 45646.0. Samples: 37437780. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-10 09:41:04,592][32177] Avg episode reward: [(0, '0.004')] [2024-06-10 09:41:08,497][32415] Updated weights for policy 0, policy_version 2290 (0.0025) [2024-06-10 09:41:09,592][32177] Fps is (10 sec: 45876.6, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 37568512. Throughput: 0: 45610.6. Samples: 37711740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 09:41:09,592][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:41:10,631][32394] Signal inference workers to stop experience collection... (450 times) [2024-06-10 09:41:10,680][32415] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-06-10 09:41:10,743][32394] Signal inference workers to resume experience collection... (450 times) [2024-06-10 09:41:10,743][32415] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-06-10 09:41:11,808][32415] Updated weights for policy 0, policy_version 2300 (0.0030) [2024-06-10 09:41:14,592][32177] Fps is (10 sec: 40959.5, 60 sec: 45329.0, 300 sec: 45819.7). Total num frames: 37765120. Throughput: 0: 45606.6. Samples: 37845820. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 09:41:14,592][32177] Avg episode reward: [(0, '0.005')] [2024-06-10 09:41:14,725][32394] Saving new best policy, reward=0.005! [2024-06-10 09:41:15,817][32415] Updated weights for policy 0, policy_version 2310 (0.0032) [2024-06-10 09:41:18,937][32415] Updated weights for policy 0, policy_version 2320 (0.0029) [2024-06-10 09:41:19,592][32177] Fps is (10 sec: 45876.3, 60 sec: 45602.3, 300 sec: 45764.1). Total num frames: 38027264. Throughput: 0: 45708.9. Samples: 38125080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 09:41:19,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:41:22,689][32415] Updated weights for policy 0, policy_version 2330 (0.0029) [2024-06-10 09:41:24,592][32177] Fps is (10 sec: 50790.8, 60 sec: 46421.5, 300 sec: 45986.3). Total num frames: 38273024. Throughput: 0: 45711.3. Samples: 38403900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 09:41:24,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:41:25,887][32415] Updated weights for policy 0, policy_version 2340 (0.0027) [2024-06-10 09:41:29,592][32177] Fps is (10 sec: 44235.8, 60 sec: 45602.0, 300 sec: 45930.7). Total num frames: 38469632. Throughput: 0: 45865.2. Samples: 38537380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 09:41:29,592][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:41:30,015][32415] Updated weights for policy 0, policy_version 2350 (0.0028) [2024-06-10 09:41:33,081][32415] Updated weights for policy 0, policy_version 2360 (0.0025) [2024-06-10 09:41:34,592][32177] Fps is (10 sec: 44236.3, 60 sec: 45602.0, 300 sec: 45819.7). Total num frames: 38715392. Throughput: 0: 45782.2. Samples: 38810820. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 19.0) [2024-06-10 09:41:34,592][32177] Avg episode reward: [(0, '0.004')] [2024-06-10 09:41:37,197][32415] Updated weights for policy 0, policy_version 2370 (0.0027) [2024-06-10 09:41:39,592][32177] Fps is (10 sec: 47514.6, 60 sec: 46148.4, 300 sec: 45875.2). Total num frames: 38944768. Throughput: 0: 45625.1. Samples: 39079660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 09:41:39,592][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:41:40,443][32415] Updated weights for policy 0, policy_version 2380 (0.0022) [2024-06-10 09:41:44,303][32415] Updated weights for policy 0, policy_version 2390 (0.0019) [2024-06-10 09:41:44,592][32177] Fps is (10 sec: 44236.4, 60 sec: 45875.1, 300 sec: 45875.2). Total num frames: 39157760. Throughput: 0: 45607.9. Samples: 39221980. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-10 09:41:44,592][32177] Avg episode reward: [(0, '0.004')] [2024-06-10 09:41:47,520][32415] Updated weights for policy 0, policy_version 2400 (0.0030) [2024-06-10 09:41:49,592][32177] Fps is (10 sec: 44237.0, 60 sec: 45602.2, 300 sec: 45819.7). Total num frames: 39387136. Throughput: 0: 45637.3. Samples: 39491460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-10 09:41:49,592][32177] Avg episode reward: [(0, '0.005')] [2024-06-10 09:41:51,623][32415] Updated weights for policy 0, policy_version 2410 (0.0028) [2024-06-10 09:41:54,592][32177] Fps is (10 sec: 45875.7, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 39616512. Throughput: 0: 45885.9. Samples: 39776600. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-10 09:41:54,592][32177] Avg episode reward: [(0, '0.005')] [2024-06-10 09:41:54,755][32415] Updated weights for policy 0, policy_version 2420 (0.0031) [2024-06-10 09:41:58,628][32415] Updated weights for policy 0, policy_version 2430 (0.0035) [2024-06-10 09:41:59,592][32177] Fps is (10 sec: 45874.9, 60 sec: 45602.5, 300 sec: 45819.7). 
Total num frames: 39845888. Throughput: 0: 45849.8. Samples: 39909060. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-10 09:41:59,592][32177] Avg episode reward: [(0, '0.006')] [2024-06-10 09:41:59,599][32394] Saving new best policy, reward=0.006! [2024-06-10 09:42:01,677][32415] Updated weights for policy 0, policy_version 2440 (0.0033) [2024-06-10 09:42:04,592][32177] Fps is (10 sec: 44236.1, 60 sec: 45055.8, 300 sec: 45820.0). Total num frames: 40058880. Throughput: 0: 45755.3. Samples: 40184080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 09:42:04,592][32177] Avg episode reward: [(0, '0.004')] [2024-06-10 09:42:06,024][32415] Updated weights for policy 0, policy_version 2450 (0.0034) [2024-06-10 09:42:08,869][32415] Updated weights for policy 0, policy_version 2460 (0.0034) [2024-06-10 09:42:09,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45602.3, 300 sec: 45764.8). Total num frames: 40304640. Throughput: 0: 45491.1. Samples: 40451000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-10 09:42:09,592][32177] Avg episode reward: [(0, '0.004')] [2024-06-10 09:42:13,040][32415] Updated weights for policy 0, policy_version 2470 (0.0031) [2024-06-10 09:42:14,592][32177] Fps is (10 sec: 49152.7, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 40550400. Throughput: 0: 45731.7. Samples: 40595300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 09:42:14,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:42:16,314][32415] Updated weights for policy 0, policy_version 2480 (0.0033) [2024-06-10 09:42:19,592][32177] Fps is (10 sec: 44236.7, 60 sec: 45329.0, 300 sec: 45820.3). Total num frames: 40747008. Throughput: 0: 45641.0. Samples: 40864660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 09:42:19,592][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:42:20,321][32415] Updated weights for policy 0, policy_version 2490 (0.0037) [2024-06-10 09:42:23,487][32415] Updated weights for policy 0, policy_version 2500 (0.0034) [2024-06-10 09:42:24,592][32177] Fps is (10 sec: 45874.7, 60 sec: 45602.0, 300 sec: 45875.2). Total num frames: 41009152. Throughput: 0: 45627.4. Samples: 41132900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-10 09:42:24,592][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:42:27,564][32415] Updated weights for policy 0, policy_version 2510 (0.0028) [2024-06-10 09:42:29,592][32177] Fps is (10 sec: 47513.4, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 41222144. Throughput: 0: 45636.6. Samples: 41275620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-10 09:42:29,592][32177] Avg episode reward: [(0, '0.004')] [2024-06-10 09:42:30,809][32394] Signal inference workers to stop experience collection... (500 times) [2024-06-10 09:42:30,809][32394] Signal inference workers to resume experience collection... (500 times) [2024-06-10 09:42:30,816][32415] Updated weights for policy 0, policy_version 2520 (0.0039) [2024-06-10 09:42:30,862][32415] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-06-10 09:42:30,862][32415] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-06-10 09:42:34,596][32177] Fps is (10 sec: 42580.5, 60 sec: 45325.8, 300 sec: 45763.4). Total num frames: 41435136. Throughput: 0: 45657.3. Samples: 41546240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 09:42:34,597][32177] Avg episode reward: [(0, '0.004')] [2024-06-10 09:42:34,611][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000002529_41435136.pth... 
[2024-06-10 09:42:34,665][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000001860_30474240.pth [2024-06-10 09:42:34,807][32415] Updated weights for policy 0, policy_version 2530 (0.0030) [2024-06-10 09:42:37,842][32415] Updated weights for policy 0, policy_version 2540 (0.0034) [2024-06-10 09:42:39,592][32177] Fps is (10 sec: 44236.8, 60 sec: 45329.0, 300 sec: 45819.6). Total num frames: 41664512. Throughput: 0: 45317.8. Samples: 41815900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 09:42:39,592][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:42:42,009][32415] Updated weights for policy 0, policy_version 2550 (0.0032) [2024-06-10 09:42:44,592][32177] Fps is (10 sec: 47534.2, 60 sec: 45875.3, 300 sec: 45764.2). Total num frames: 41910272. Throughput: 0: 45457.3. Samples: 41954640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 09:42:44,593][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:42:45,290][32415] Updated weights for policy 0, policy_version 2560 (0.0031) [2024-06-10 09:42:49,158][32415] Updated weights for policy 0, policy_version 2570 (0.0037) [2024-06-10 09:42:49,592][32177] Fps is (10 sec: 47513.0, 60 sec: 45875.0, 300 sec: 45764.1). Total num frames: 42139648. Throughput: 0: 45603.6. Samples: 42236240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 09:42:49,592][32177] Avg episode reward: [(0, '0.002')] [2024-06-10 09:42:52,239][32415] Updated weights for policy 0, policy_version 2580 (0.0033) [2024-06-10 09:42:54,596][32177] Fps is (10 sec: 44217.8, 60 sec: 45598.9, 300 sec: 45763.5). Total num frames: 42352640. Throughput: 0: 45577.4. Samples: 42502180. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 09:42:54,597][32177] Avg episode reward: [(0, '0.004')] [2024-06-10 09:42:56,181][32415] Updated weights for policy 0, policy_version 2590 (0.0033) [2024-06-10 09:42:59,391][32415] Updated weights for policy 0, policy_version 2600 (0.0031) [2024-06-10 09:42:59,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45875.1, 300 sec: 45708.6). Total num frames: 42598400. Throughput: 0: 45550.1. Samples: 42645060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 09:42:59,592][32177] Avg episode reward: [(0, '0.004')] [2024-06-10 09:43:03,295][32415] Updated weights for policy 0, policy_version 2610 (0.0036) [2024-06-10 09:43:04,592][32177] Fps is (10 sec: 45894.5, 60 sec: 45875.3, 300 sec: 45653.7). Total num frames: 42811392. Throughput: 0: 45709.7. Samples: 42921600. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-10 09:43:04,592][32177] Avg episode reward: [(0, '0.007')] [2024-06-10 09:43:06,500][32415] Updated weights for policy 0, policy_version 2620 (0.0026) [2024-06-10 09:43:09,592][32177] Fps is (10 sec: 42597.9, 60 sec: 45328.8, 300 sec: 45709.2). Total num frames: 43024384. Throughput: 0: 45750.1. Samples: 43191660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-10 09:43:09,593][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:43:10,540][32415] Updated weights for policy 0, policy_version 2630 (0.0039) [2024-06-10 09:43:14,093][32415] Updated weights for policy 0, policy_version 2640 (0.0031) [2024-06-10 09:43:14,592][32177] Fps is (10 sec: 47514.3, 60 sec: 45602.2, 300 sec: 45764.1). Total num frames: 43286528. Throughput: 0: 45551.6. Samples: 43325440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-10 09:43:14,592][32177] Avg episode reward: [(0, '0.005')] [2024-06-10 09:43:17,604][32415] Updated weights for policy 0, policy_version 2650 (0.0030) [2024-06-10 09:43:19,592][32177] Fps is (10 sec: 47514.3, 60 sec: 45875.1, 300 sec: 45653.0). 
Total num frames: 43499520. Throughput: 0: 45818.5. Samples: 43607880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-10 09:43:19,592][32177] Avg episode reward: [(0, '0.007')] [2024-06-10 09:43:21,027][32415] Updated weights for policy 0, policy_version 2660 (0.0041) [2024-06-10 09:43:24,592][32177] Fps is (10 sec: 44236.7, 60 sec: 45329.2, 300 sec: 45708.6). Total num frames: 43728896. Throughput: 0: 45830.7. Samples: 43878280. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-10 09:43:24,592][32177] Avg episode reward: [(0, '0.005')] [2024-06-10 09:43:24,970][32415] Updated weights for policy 0, policy_version 2670 (0.0025) [2024-06-10 09:43:28,150][32415] Updated weights for policy 0, policy_version 2680 (0.0038) [2024-06-10 09:43:29,592][32177] Fps is (10 sec: 45876.0, 60 sec: 45602.2, 300 sec: 45708.6). Total num frames: 43958272. Throughput: 0: 45741.9. Samples: 44013020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 09:43:29,592][32177] Avg episode reward: [(0, '0.004')] [2024-06-10 09:43:32,084][32415] Updated weights for policy 0, policy_version 2690 (0.0033) [2024-06-10 09:43:34,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45878.6, 300 sec: 45597.5). Total num frames: 44187648. Throughput: 0: 45667.3. Samples: 44291260. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-10 09:43:34,592][32177] Avg episode reward: [(0, '0.005')] [2024-06-10 09:43:35,270][32415] Updated weights for policy 0, policy_version 2700 (0.0032) [2024-06-10 09:43:39,271][32415] Updated weights for policy 0, policy_version 2710 (0.0036) [2024-06-10 09:43:39,592][32177] Fps is (10 sec: 44236.7, 60 sec: 45602.2, 300 sec: 45653.0). Total num frames: 44400640. Throughput: 0: 45833.3. Samples: 44564480. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 09:43:39,592][32177] Avg episode reward: [(0, '0.006')] [2024-06-10 09:43:42,654][32415] Updated weights for policy 0, policy_version 2720 (0.0029) [2024-06-10 09:43:44,592][32177] Fps is (10 sec: 45874.1, 60 sec: 45602.0, 300 sec: 45764.1). Total num frames: 44646400. Throughput: 0: 45520.8. Samples: 44693500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-10 09:43:44,593][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:43:46,488][32415] Updated weights for policy 0, policy_version 2730 (0.0023) [2024-06-10 09:43:49,121][32394] Signal inference workers to stop experience collection... (550 times) [2024-06-10 09:43:49,175][32415] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-06-10 09:43:49,189][32394] Signal inference workers to resume experience collection... (550 times) [2024-06-10 09:43:49,190][32415] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-06-10 09:43:49,475][32415] Updated weights for policy 0, policy_version 2740 (0.0028) [2024-06-10 09:43:49,592][32177] Fps is (10 sec: 49151.5, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 44892160. Throughput: 0: 45601.3. Samples: 44973660. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-10 09:43:49,595][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:43:53,414][32415] Updated weights for policy 0, policy_version 2750 (0.0023) [2024-06-10 09:43:54,592][32177] Fps is (10 sec: 42598.4, 60 sec: 45332.2, 300 sec: 45541.9). Total num frames: 45072384. Throughput: 0: 45726.7. Samples: 45249360. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-10 09:43:54,592][32177] Avg episode reward: [(0, '0.008')] [2024-06-10 09:43:54,614][32394] Saving new best policy, reward=0.008! [2024-06-10 09:43:56,786][32415] Updated weights for policy 0, policy_version 2760 (0.0033) [2024-06-10 09:43:59,596][32177] Fps is (10 sec: 42580.5, 60 sec: 45325.9, 300 sec: 45707.9). 
Total num frames: 45318144. Throughput: 0: 45528.1. Samples: 45374400. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-10 09:43:59,597][32177] Avg episode reward: [(0, '0.005')] [2024-06-10 09:44:00,777][32415] Updated weights for policy 0, policy_version 2770 (0.0030) [2024-06-10 09:44:03,886][32415] Updated weights for policy 0, policy_version 2780 (0.0038) [2024-06-10 09:44:04,593][32177] Fps is (10 sec: 47507.0, 60 sec: 45601.0, 300 sec: 45708.3). Total num frames: 45547520. Throughput: 0: 45444.7. Samples: 45652960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 09:44:04,594][32177] Avg episode reward: [(0, '0.006')] [2024-06-10 09:44:07,890][32415] Updated weights for policy 0, policy_version 2790 (0.0036) [2024-06-10 09:44:09,592][32177] Fps is (10 sec: 45894.1, 60 sec: 45875.3, 300 sec: 45597.5). Total num frames: 45776896. Throughput: 0: 45518.0. Samples: 45926600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 09:44:09,592][32177] Avg episode reward: [(0, '0.005')] [2024-06-10 09:44:11,533][32415] Updated weights for policy 0, policy_version 2800 (0.0032) [2024-06-10 09:44:14,592][32177] Fps is (10 sec: 45882.0, 60 sec: 45328.9, 300 sec: 45764.1). Total num frames: 46006272. Throughput: 0: 45511.4. Samples: 46061040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 09:44:14,592][32177] Avg episode reward: [(0, '0.006')] [2024-06-10 09:44:15,345][32415] Updated weights for policy 0, policy_version 2810 (0.0032) [2024-06-10 09:44:18,519][32415] Updated weights for policy 0, policy_version 2820 (0.0027) [2024-06-10 09:44:19,592][32177] Fps is (10 sec: 44237.2, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 46219264. Throughput: 0: 45249.7. Samples: 46327500. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 09:44:19,592][32177] Avg episode reward: [(0, '0.008')] [2024-06-10 09:44:22,454][32415] Updated weights for policy 0, policy_version 2830 (0.0029) [2024-06-10 09:44:24,592][32177] Fps is (10 sec: 42599.1, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 46432256. Throughput: 0: 45324.0. Samples: 46604060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-10 09:44:24,592][32177] Avg episode reward: [(0, '0.005')] [2024-06-10 09:44:25,709][32415] Updated weights for policy 0, policy_version 2840 (0.0038) [2024-06-10 09:44:29,592][32177] Fps is (10 sec: 44236.8, 60 sec: 45055.9, 300 sec: 45542.0). Total num frames: 46661632. Throughput: 0: 45365.9. Samples: 46734960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 09:44:29,592][32177] Avg episode reward: [(0, '0.008')] [2024-06-10 09:44:30,035][32415] Updated weights for policy 0, policy_version 2850 (0.0031) [2024-06-10 09:44:32,962][32415] Updated weights for policy 0, policy_version 2860 (0.0036) [2024-06-10 09:44:34,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45056.0, 300 sec: 45597.5). Total num frames: 46891008. Throughput: 0: 44924.1. Samples: 46995240. Policy #0 lag: (min: 0.0, avg: 13.5, max: 25.0) [2024-06-10 09:44:34,592][32177] Avg episode reward: [(0, '0.006')] [2024-06-10 09:44:34,652][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000002863_46907392.pth... [2024-06-10 09:44:34,702][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000002196_35979264.pth [2024-06-10 09:44:37,416][32415] Updated weights for policy 0, policy_version 2870 (0.0021) [2024-06-10 09:44:39,592][32177] Fps is (10 sec: 47513.6, 60 sec: 45602.1, 300 sec: 45598.2). Total num frames: 47136768. Throughput: 0: 45053.5. Samples: 47276760. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 09:44:39,592][32177] Avg episode reward: [(0, '0.008')] [2024-06-10 09:44:40,370][32415] Updated weights for policy 0, policy_version 2880 (0.0029) [2024-06-10 09:44:44,574][32415] Updated weights for policy 0, policy_version 2890 (0.0035) [2024-06-10 09:44:44,592][32177] Fps is (10 sec: 45874.9, 60 sec: 45056.1, 300 sec: 45542.0). Total num frames: 47349760. Throughput: 0: 45379.4. Samples: 47416280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-10 09:44:44,592][32177] Avg episode reward: [(0, '0.003')] [2024-06-10 09:44:47,401][32415] Updated weights for policy 0, policy_version 2900 (0.0028) [2024-06-10 09:44:49,592][32177] Fps is (10 sec: 44236.8, 60 sec: 44783.0, 300 sec: 45653.1). Total num frames: 47579136. Throughput: 0: 45076.2. Samples: 47681320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-10 09:44:49,592][32177] Avg episode reward: [(0, '0.008')] [2024-06-10 09:44:52,119][32415] Updated weights for policy 0, policy_version 2910 (0.0045) [2024-06-10 09:44:54,588][32415] Updated weights for policy 0, policy_version 2920 (0.0039) [2024-06-10 09:44:54,592][32177] Fps is (10 sec: 49152.3, 60 sec: 46148.5, 300 sec: 45708.6). Total num frames: 47841280. Throughput: 0: 44946.9. Samples: 47949200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-10 09:44:54,592][32177] Avg episode reward: [(0, '0.007')] [2024-06-10 09:44:58,906][32394] Signal inference workers to stop experience collection... (600 times) [2024-06-10 09:44:58,911][32394] Signal inference workers to resume experience collection... 
(600 times) [2024-06-10 09:44:58,929][32415] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-06-10 09:44:58,929][32415] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-06-10 09:44:59,055][32415] Updated weights for policy 0, policy_version 2930 (0.0035) [2024-06-10 09:44:59,592][32177] Fps is (10 sec: 44236.7, 60 sec: 45059.2, 300 sec: 45542.0). Total num frames: 48021504. Throughput: 0: 45060.9. Samples: 48088780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 09:44:59,592][32177] Avg episode reward: [(0, '0.007')] [2024-06-10 09:45:02,079][32415] Updated weights for policy 0, policy_version 2940 (0.0041) [2024-06-10 09:45:04,596][32177] Fps is (10 sec: 40942.3, 60 sec: 45054.0, 300 sec: 45541.3). Total num frames: 48250880. Throughput: 0: 45207.3. Samples: 48362020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 09:45:04,597][32177] Avg episode reward: [(0, '0.006')] [2024-06-10 09:45:06,541][32415] Updated weights for policy 0, policy_version 2950 (0.0035) [2024-06-10 09:45:09,557][32415] Updated weights for policy 0, policy_version 2960 (0.0030) [2024-06-10 09:45:09,592][32177] Fps is (10 sec: 47514.1, 60 sec: 45329.2, 300 sec: 45597.5). Total num frames: 48496640. Throughput: 0: 44972.9. Samples: 48627840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-10 09:45:09,592][32177] Avg episode reward: [(0, '0.006')] [2024-06-10 09:45:13,669][32415] Updated weights for policy 0, policy_version 2970 (0.0036) [2024-06-10 09:45:14,592][32177] Fps is (10 sec: 42616.6, 60 sec: 44509.9, 300 sec: 45375.4). Total num frames: 48676864. Throughput: 0: 45084.5. Samples: 48763760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 09:45:14,592][32177] Avg episode reward: [(0, '0.007')] [2024-06-10 09:45:16,591][32415] Updated weights for policy 0, policy_version 2980 (0.0042) [2024-06-10 09:45:19,592][32177] Fps is (10 sec: 42598.2, 60 sec: 45056.0, 300 sec: 45542.0). 
Total num frames: 48922624. Throughput: 0: 45269.3. Samples: 49032360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-10 09:45:19,592][32177] Avg episode reward: [(0, '0.009')] [2024-06-10 09:45:21,144][32415] Updated weights for policy 0, policy_version 2990 (0.0042) [2024-06-10 09:45:23,822][32415] Updated weights for policy 0, policy_version 3000 (0.0042) [2024-06-10 09:45:24,592][32177] Fps is (10 sec: 49151.5, 60 sec: 45602.0, 300 sec: 45541.9). Total num frames: 49168384. Throughput: 0: 44885.3. Samples: 49296600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 09:45:24,592][32177] Avg episode reward: [(0, '0.004')] [2024-06-10 09:45:28,148][32415] Updated weights for policy 0, policy_version 3010 (0.0026) [2024-06-10 09:45:29,592][32177] Fps is (10 sec: 44233.6, 60 sec: 45055.5, 300 sec: 45375.2). Total num frames: 49364992. Throughput: 0: 45012.6. Samples: 49441880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 09:45:29,593][32177] Avg episode reward: [(0, '0.008')] [2024-06-10 09:45:31,114][32415] Updated weights for policy 0, policy_version 3020 (0.0033) [2024-06-10 09:45:34,592][32177] Fps is (10 sec: 44236.7, 60 sec: 45328.9, 300 sec: 45542.0). Total num frames: 49610752. Throughput: 0: 45181.7. Samples: 49714500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 09:45:34,592][32177] Avg episode reward: [(0, '0.009')] [2024-06-10 09:45:35,516][32415] Updated weights for policy 0, policy_version 3030 (0.0034) [2024-06-10 09:45:38,629][32415] Updated weights for policy 0, policy_version 3040 (0.0036) [2024-06-10 09:45:39,592][32177] Fps is (10 sec: 47517.0, 60 sec: 45056.0, 300 sec: 45542.0). Total num frames: 49840128. Throughput: 0: 45174.1. Samples: 49982040. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 09:45:39,592][32177] Avg episode reward: [(0, '0.006')] [2024-06-10 09:45:42,927][32415] Updated weights for policy 0, policy_version 3050 (0.0032) [2024-06-10 09:45:44,592][32177] Fps is (10 sec: 44236.9, 60 sec: 45055.9, 300 sec: 45430.9). Total num frames: 50053120. Throughput: 0: 45056.4. Samples: 50116320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-10 09:45:44,592][32177] Avg episode reward: [(0, '0.004')] [2024-06-10 09:45:45,600][32415] Updated weights for policy 0, policy_version 3060 (0.0020) [2024-06-10 09:45:49,592][32177] Fps is (10 sec: 42598.7, 60 sec: 44783.0, 300 sec: 45375.4). Total num frames: 50266112. Throughput: 0: 45078.1. Samples: 50390340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 09:45:49,592][32177] Avg episode reward: [(0, '0.006')] [2024-06-10 09:45:50,233][32415] Updated weights for policy 0, policy_version 3070 (0.0033) [2024-06-10 09:45:53,093][32415] Updated weights for policy 0, policy_version 3080 (0.0037) [2024-06-10 09:45:54,596][32177] Fps is (10 sec: 45856.1, 60 sec: 44506.6, 300 sec: 45430.3). Total num frames: 50511872. Throughput: 0: 45045.4. Samples: 50655080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-10 09:45:54,605][32177] Avg episode reward: [(0, '0.010')] [2024-06-10 09:45:54,619][32394] Saving new best policy, reward=0.010! [2024-06-10 09:45:57,193][32415] Updated weights for policy 0, policy_version 3090 (0.0031) [2024-06-10 09:45:59,592][32177] Fps is (10 sec: 45875.1, 60 sec: 45056.1, 300 sec: 45319.8). Total num frames: 50724864. Throughput: 0: 45118.3. Samples: 50794080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 09:45:59,592][32177] Avg episode reward: [(0, '0.005')] [2024-06-10 09:46:00,314][32415] Updated weights for policy 0, policy_version 3100 (0.0021) [2024-06-10 09:46:04,592][32177] Fps is (10 sec: 42616.4, 60 sec: 44786.1, 300 sec: 45319.8). Total num frames: 50937856. 
Throughput: 0: 45185.3. Samples: 51065700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 09:46:04,592][32177] Avg episode reward: [(0, '0.006')] [2024-06-10 09:46:04,853][32415] Updated weights for policy 0, policy_version 3110 (0.0031) [2024-06-10 09:46:07,687][32415] Updated weights for policy 0, policy_version 3120 (0.0032) [2024-06-10 09:46:09,592][32177] Fps is (10 sec: 47513.1, 60 sec: 45055.9, 300 sec: 45542.0). Total num frames: 51200000. Throughput: 0: 45229.8. Samples: 51331940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 09:46:09,592][32177] Avg episode reward: [(0, '0.005')] [2024-06-10 09:46:11,831][32415] Updated weights for policy 0, policy_version 3130 (0.0041) [2024-06-10 09:46:14,596][32177] Fps is (10 sec: 49131.2, 60 sec: 45871.9, 300 sec: 45430.2). Total num frames: 51429376. Throughput: 0: 45113.3. Samples: 51472140. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-10 09:46:14,597][32177] Avg episode reward: [(0, '0.012')] [2024-06-10 09:46:14,667][32394] Saving new best policy, reward=0.012! [2024-06-10 09:46:14,681][32415] Updated weights for policy 0, policy_version 3140 (0.0030) [2024-06-10 09:46:19,427][32415] Updated weights for policy 0, policy_version 3150 (0.0032) [2024-06-10 09:46:19,592][32177] Fps is (10 sec: 40960.5, 60 sec: 44783.0, 300 sec: 45208.7). Total num frames: 51609600. Throughput: 0: 44888.2. Samples: 51734460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 09:46:19,592][32177] Avg episode reward: [(0, '0.006')] [2024-06-10 09:46:22,060][32415] Updated weights for policy 0, policy_version 3160 (0.0021) [2024-06-10 09:46:24,592][32177] Fps is (10 sec: 44255.5, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 51871744. Throughput: 0: 44947.9. Samples: 52004700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-10 09:46:24,592][32177] Avg episode reward: [(0, '0.006')] [2024-06-10 09:46:26,409][32415] Updated weights for policy 0, policy_version 3170 (0.0029) [2024-06-10 09:46:29,469][32415] Updated weights for policy 0, policy_version 3180 (0.0027) [2024-06-10 09:46:29,592][32177] Fps is (10 sec: 49151.0, 60 sec: 45602.6, 300 sec: 45375.3). Total num frames: 52101120. Throughput: 0: 45052.0. Samples: 52143660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 09:46:29,592][32177] Avg episode reward: [(0, '0.010')] [2024-06-10 09:46:33,641][32415] Updated weights for policy 0, policy_version 3190 (0.0034) [2024-06-10 09:46:34,592][32177] Fps is (10 sec: 44237.3, 60 sec: 45056.1, 300 sec: 45319.8). Total num frames: 52314112. Throughput: 0: 45082.7. Samples: 52419060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 09:46:34,592][32177] Avg episode reward: [(0, '0.010')] [2024-06-10 09:46:34,600][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000003193_52314112.pth... [2024-06-10 09:46:34,652][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000002529_41435136.pth [2024-06-10 09:46:35,985][32394] Signal inference workers to stop experience collection... (650 times) [2024-06-10 09:46:35,992][32394] Signal inference workers to resume experience collection... (650 times) [2024-06-10 09:46:36,003][32415] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-06-10 09:46:36,004][32415] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-06-10 09:46:36,893][32415] Updated weights for policy 0, policy_version 3200 (0.0040) [2024-06-10 09:46:39,592][32177] Fps is (10 sec: 42599.3, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 52527104. Throughput: 0: 45125.7. Samples: 52685540. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-10 09:46:39,592][32177] Avg episode reward: [(0, '0.012')] [2024-06-10 09:46:41,063][32415] Updated weights for policy 0, policy_version 3210 (0.0031) [2024-06-10 09:46:43,840][32415] Updated weights for policy 0, policy_version 3220 (0.0042) [2024-06-10 09:46:44,592][32177] Fps is (10 sec: 47512.6, 60 sec: 45602.1, 300 sec: 45430.8). Total num frames: 52789248. Throughput: 0: 45196.2. Samples: 52827920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 09:46:44,592][32177] Avg episode reward: [(0, '0.012')] [2024-06-10 09:46:48,454][32415] Updated weights for policy 0, policy_version 3230 (0.0028) [2024-06-10 09:46:49,592][32177] Fps is (10 sec: 44236.0, 60 sec: 45055.9, 300 sec: 45264.3). Total num frames: 52969472. Throughput: 0: 45121.3. Samples: 53096160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 09:46:49,592][32177] Avg episode reward: [(0, '0.009')] [2024-06-10 09:46:51,295][32415] Updated weights for policy 0, policy_version 3240 (0.0031) [2024-06-10 09:46:54,592][32177] Fps is (10 sec: 40960.1, 60 sec: 44786.0, 300 sec: 45264.2). Total num frames: 53198848. Throughput: 0: 45103.0. Samples: 53361580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 09:46:54,592][32177] Avg episode reward: [(0, '0.008')] [2024-06-10 09:46:55,491][32415] Updated weights for policy 0, policy_version 3250 (0.0038) [2024-06-10 09:46:58,592][32415] Updated weights for policy 0, policy_version 3260 (0.0028) [2024-06-10 09:46:59,592][32177] Fps is (10 sec: 47513.6, 60 sec: 45328.9, 300 sec: 45375.4). Total num frames: 53444608. Throughput: 0: 44931.3. Samples: 53493860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 09:46:59,592][32177] Avg episode reward: [(0, '0.009')] [2024-06-10 09:47:02,917][32415] Updated weights for policy 0, policy_version 3270 (0.0027) [2024-06-10 09:47:04,592][32177] Fps is (10 sec: 45875.4, 60 sec: 45329.1, 300 sec: 45264.2). 
Total num frames: 53657600. Throughput: 0: 45159.4. Samples: 53766640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-10 09:47:04,604][32177] Avg episode reward: [(0, '0.015')] [2024-06-10 09:47:04,617][32394] Saving new best policy, reward=0.015! [2024-06-10 09:47:05,885][32415] Updated weights for policy 0, policy_version 3280 (0.0035) [2024-06-10 09:47:09,592][32177] Fps is (10 sec: 42598.5, 60 sec: 44509.8, 300 sec: 45153.2). Total num frames: 53870592. Throughput: 0: 45226.6. Samples: 54039900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 09:47:09,592][32177] Avg episode reward: [(0, '0.011')] [2024-06-10 09:47:10,094][32415] Updated weights for policy 0, policy_version 3290 (0.0023) [2024-06-10 09:47:13,172][32415] Updated weights for policy 0, policy_version 3300 (0.0028) [2024-06-10 09:47:14,592][32177] Fps is (10 sec: 47514.4, 60 sec: 45059.3, 300 sec: 45375.4). Total num frames: 54132736. Throughput: 0: 45099.3. Samples: 54173120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-10 09:47:14,592][32177] Avg episode reward: [(0, '0.012')] [2024-06-10 09:47:17,390][32415] Updated weights for policy 0, policy_version 3310 (0.0029) [2024-06-10 09:47:19,592][32177] Fps is (10 sec: 45875.1, 60 sec: 45328.9, 300 sec: 45153.2). Total num frames: 54329344. Throughput: 0: 45028.7. Samples: 54445360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 09:47:19,592][32177] Avg episode reward: [(0, '0.009')] [2024-06-10 09:47:20,454][32415] Updated weights for policy 0, policy_version 3320 (0.0028) [2024-06-10 09:47:24,405][32415] Updated weights for policy 0, policy_version 3330 (0.0032) [2024-06-10 09:47:24,592][32177] Fps is (10 sec: 42597.9, 60 sec: 44783.0, 300 sec: 45208.7). Total num frames: 54558720. Throughput: 0: 45163.9. Samples: 54717920. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-10 09:47:24,592][32177] Avg episode reward: [(0, '0.015')] [2024-06-10 09:47:27,710][32415] Updated weights for policy 0, policy_version 3340 (0.0029) [2024-06-10 09:47:29,592][32177] Fps is (10 sec: 47514.4, 60 sec: 45056.1, 300 sec: 45320.5). Total num frames: 54804480. Throughput: 0: 45072.2. Samples: 54856160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 09:47:29,592][32177] Avg episode reward: [(0, '0.010')] [2024-06-10 09:47:31,498][32415] Updated weights for policy 0, policy_version 3350 (0.0039) [2024-06-10 09:47:34,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45056.0, 300 sec: 45264.3). Total num frames: 55017472. Throughput: 0: 45148.1. Samples: 55127820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 09:47:34,592][32177] Avg episode reward: [(0, '0.010')] [2024-06-10 09:47:34,783][32415] Updated weights for policy 0, policy_version 3360 (0.0032) [2024-06-10 09:47:38,900][32415] Updated weights for policy 0, policy_version 3370 (0.0029) [2024-06-10 09:47:39,592][32177] Fps is (10 sec: 42596.1, 60 sec: 45055.6, 300 sec: 45153.1). Total num frames: 55230464. Throughput: 0: 45436.5. Samples: 55406240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 09:47:39,593][32177] Avg episode reward: [(0, '0.019')] [2024-06-10 09:47:39,663][32394] Saving new best policy, reward=0.019! [2024-06-10 09:47:42,082][32415] Updated weights for policy 0, policy_version 3380 (0.0025) [2024-06-10 09:47:44,592][32177] Fps is (10 sec: 45875.0, 60 sec: 44783.0, 300 sec: 45208.7). Total num frames: 55476224. Throughput: 0: 45407.6. Samples: 55537200. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-10 09:47:44,592][32177] Avg episode reward: [(0, '0.015')] [2024-06-10 09:47:46,399][32415] Updated weights for policy 0, policy_version 3390 (0.0024) [2024-06-10 09:47:49,445][32415] Updated weights for policy 0, policy_version 3400 (0.0029) [2024-06-10 09:47:49,592][32177] Fps is (10 sec: 49154.7, 60 sec: 45875.3, 300 sec: 45320.5). Total num frames: 55721984. Throughput: 0: 45409.4. Samples: 55810060. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-10 09:47:49,592][32177] Avg episode reward: [(0, '0.015')] [2024-06-10 09:47:53,475][32415] Updated weights for policy 0, policy_version 3410 (0.0040) [2024-06-10 09:47:53,922][32394] Signal inference workers to stop experience collection... (700 times) [2024-06-10 09:47:53,948][32415] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-06-10 09:47:53,982][32394] Signal inference workers to resume experience collection... (700 times) [2024-06-10 09:47:53,983][32415] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-06-10 09:47:54,592][32177] Fps is (10 sec: 45875.4, 60 sec: 45602.2, 300 sec: 45208.7). Total num frames: 55934976. Throughput: 0: 45241.0. Samples: 56075740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 09:47:54,592][32177] Avg episode reward: [(0, '0.019')] [2024-06-10 09:47:56,445][32415] Updated weights for policy 0, policy_version 3420 (0.0025) [2024-06-10 09:47:59,592][32177] Fps is (10 sec: 42597.9, 60 sec: 45056.0, 300 sec: 45208.7). Total num frames: 56147968. Throughput: 0: 45373.2. Samples: 56214920. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 09:47:59,592][32177] Avg episode reward: [(0, '0.008')] [2024-06-10 09:48:00,767][32415] Updated weights for policy 0, policy_version 3430 (0.0035) [2024-06-10 09:48:03,656][32415] Updated weights for policy 0, policy_version 3440 (0.0030) [2024-06-10 09:48:04,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45602.2, 300 sec: 45319.8). Total num frames: 56393728. Throughput: 0: 45368.6. Samples: 56486940. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-10 09:48:04,596][32177] Avg episode reward: [(0, '0.011')] [2024-06-10 09:48:07,952][32415] Updated weights for policy 0, policy_version 3450 (0.0039) [2024-06-10 09:48:09,592][32177] Fps is (10 sec: 47513.7, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 56623104. Throughput: 0: 45447.5. Samples: 56763060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-10 09:48:09,592][32177] Avg episode reward: [(0, '0.015')] [2024-06-10 09:48:11,030][32415] Updated weights for policy 0, policy_version 3460 (0.0039) [2024-06-10 09:48:14,592][32177] Fps is (10 sec: 42598.4, 60 sec: 44782.8, 300 sec: 45153.2). Total num frames: 56819712. Throughput: 0: 45163.5. Samples: 56888520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 09:48:14,592][32177] Avg episode reward: [(0, '0.012')] [2024-06-10 09:48:15,317][32415] Updated weights for policy 0, policy_version 3470 (0.0037) [2024-06-10 09:48:18,247][32415] Updated weights for policy 0, policy_version 3480 (0.0028) [2024-06-10 09:48:19,592][32177] Fps is (10 sec: 44237.4, 60 sec: 45602.3, 300 sec: 45208.7). Total num frames: 57065472. Throughput: 0: 45239.6. Samples: 57163600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 09:48:19,592][32177] Avg episode reward: [(0, '0.013')] [2024-06-10 09:48:22,389][32415] Updated weights for policy 0, policy_version 3490 (0.0024) [2024-06-10 09:48:24,592][32177] Fps is (10 sec: 45874.7, 60 sec: 45329.0, 300 sec: 45153.2). 
Total num frames: 57278464. Throughput: 0: 45043.9. Samples: 57433200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-10 09:48:24,592][32177] Avg episode reward: [(0, '0.017')] [2024-06-10 09:48:25,679][32415] Updated weights for policy 0, policy_version 3500 (0.0041) [2024-06-10 09:48:29,592][32177] Fps is (10 sec: 40959.4, 60 sec: 44509.8, 300 sec: 45042.1). Total num frames: 57475072. Throughput: 0: 45120.4. Samples: 57567620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 09:48:29,592][32177] Avg episode reward: [(0, '0.017')] [2024-06-10 09:48:29,906][32415] Updated weights for policy 0, policy_version 3510 (0.0032) [2024-06-10 09:48:32,874][32415] Updated weights for policy 0, policy_version 3520 (0.0039) [2024-06-10 09:48:34,592][32177] Fps is (10 sec: 47514.3, 60 sec: 45602.2, 300 sec: 45264.3). Total num frames: 57753600. Throughput: 0: 45093.8. Samples: 57839280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 09:48:34,592][32177] Avg episode reward: [(0, '0.012')] [2024-06-10 09:48:34,610][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000003525_57753600.pth... [2024-06-10 09:48:34,662][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000002863_46907392.pth [2024-06-10 09:48:37,076][32415] Updated weights for policy 0, policy_version 3530 (0.0034) [2024-06-10 09:48:39,592][32177] Fps is (10 sec: 47514.4, 60 sec: 45329.5, 300 sec: 45097.7). Total num frames: 57950208. Throughput: 0: 45148.5. Samples: 58107420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-10 09:48:39,592][32177] Avg episode reward: [(0, '0.024')] [2024-06-10 09:48:39,604][32394] Saving new best policy, reward=0.024! 
[2024-06-10 09:48:40,224][32415] Updated weights for policy 0, policy_version 3540 (0.0041) [2024-06-10 09:48:44,209][32415] Updated weights for policy 0, policy_version 3550 (0.0026) [2024-06-10 09:48:44,592][32177] Fps is (10 sec: 40959.8, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 58163200. Throughput: 0: 44961.8. Samples: 58238200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-10 09:48:44,592][32177] Avg episode reward: [(0, '0.011')] [2024-06-10 09:48:47,350][32415] Updated weights for policy 0, policy_version 3560 (0.0028) [2024-06-10 09:48:49,596][32177] Fps is (10 sec: 47493.4, 60 sec: 45052.8, 300 sec: 45263.7). Total num frames: 58425344. Throughput: 0: 44947.9. Samples: 58509780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 09:48:49,596][32177] Avg episode reward: [(0, '0.017')] [2024-06-10 09:48:51,532][32415] Updated weights for policy 0, policy_version 3570 (0.0025) [2024-06-10 09:48:54,430][32415] Updated weights for policy 0, policy_version 3580 (0.0040) [2024-06-10 09:48:54,592][32177] Fps is (10 sec: 49151.8, 60 sec: 45329.0, 300 sec: 45209.4). Total num frames: 58654720. Throughput: 0: 45048.0. Samples: 58790220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 09:48:54,595][32177] Avg episode reward: [(0, '0.010')] [2024-06-10 09:48:58,823][32415] Updated weights for policy 0, policy_version 3590 (0.0041) [2024-06-10 09:48:59,592][32177] Fps is (10 sec: 44255.5, 60 sec: 45329.2, 300 sec: 45153.4). Total num frames: 58867712. Throughput: 0: 45258.7. Samples: 58925160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-10 09:48:59,592][32177] Avg episode reward: [(0, '0.017')] [2024-06-10 09:49:01,780][32415] Updated weights for policy 0, policy_version 3600 (0.0040) [2024-06-10 09:49:04,596][32177] Fps is (10 sec: 44218.1, 60 sec: 45052.8, 300 sec: 45152.6). Total num frames: 59097088. Throughput: 0: 45297.4. Samples: 59202180. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 09:49:04,597][32177] Avg episode reward: [(0, '0.019')] [2024-06-10 09:49:05,776][32415] Updated weights for policy 0, policy_version 3610 (0.0031) [2024-06-10 09:49:09,026][32415] Updated weights for policy 0, policy_version 3620 (0.0036) [2024-06-10 09:49:09,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45056.1, 300 sec: 45153.2). Total num frames: 59326464. Throughput: 0: 45232.6. Samples: 59468660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 09:49:09,592][32177] Avg episode reward: [(0, '0.018')] [2024-06-10 09:49:13,080][32415] Updated weights for policy 0, policy_version 3630 (0.0041) [2024-06-10 09:49:13,628][32394] Signal inference workers to stop experience collection... (750 times) [2024-06-10 09:49:13,629][32394] Signal inference workers to resume experience collection... (750 times) [2024-06-10 09:49:13,671][32415] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-06-10 09:49:13,671][32415] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-06-10 09:49:14,592][32177] Fps is (10 sec: 45894.5, 60 sec: 45602.1, 300 sec: 45208.7). Total num frames: 59555840. Throughput: 0: 45406.6. Samples: 59610920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 09:49:14,592][32177] Avg episode reward: [(0, '0.015')] [2024-06-10 09:49:16,260][32415] Updated weights for policy 0, policy_version 3640 (0.0054) [2024-06-10 09:49:19,592][32177] Fps is (10 sec: 42594.7, 60 sec: 44782.3, 300 sec: 45153.1). Total num frames: 59752448. Throughput: 0: 45220.9. Samples: 59874260. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-10 09:49:19,593][32177] Avg episode reward: [(0, '0.025')] [2024-06-10 09:49:19,594][32394] Saving new best policy, reward=0.025! 
[2024-06-10 09:49:20,305][32415] Updated weights for policy 0, policy_version 3650 (0.0046) [2024-06-10 09:49:23,702][32415] Updated weights for policy 0, policy_version 3660 (0.0029) [2024-06-10 09:49:24,592][32177] Fps is (10 sec: 45875.7, 60 sec: 45602.2, 300 sec: 45264.3). Total num frames: 60014592. Throughput: 0: 45372.4. Samples: 60149180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-10 09:49:24,592][32177] Avg episode reward: [(0, '0.015')] [2024-06-10 09:49:27,587][32415] Updated weights for policy 0, policy_version 3670 (0.0029) [2024-06-10 09:49:29,593][32177] Fps is (10 sec: 47509.6, 60 sec: 45874.0, 300 sec: 45208.5). Total num frames: 60227584. Throughput: 0: 45697.0. Samples: 60294640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-10 09:49:29,594][32177] Avg episode reward: [(0, '0.024')] [2024-06-10 09:49:30,621][32415] Updated weights for policy 0, policy_version 3680 (0.0030) [2024-06-10 09:49:34,564][32415] Updated weights for policy 0, policy_version 3690 (0.0033) [2024-06-10 09:49:34,592][32177] Fps is (10 sec: 44236.5, 60 sec: 45055.9, 300 sec: 45153.2). Total num frames: 60456960. Throughput: 0: 45596.2. Samples: 60561420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 09:49:34,593][32177] Avg episode reward: [(0, '0.015')] [2024-06-10 09:49:38,108][32415] Updated weights for policy 0, policy_version 3700 (0.0030) [2024-06-10 09:49:39,596][32177] Fps is (10 sec: 45862.8, 60 sec: 45598.8, 300 sec: 45208.1). Total num frames: 60686336. Throughput: 0: 45109.9. Samples: 60820360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 09:49:39,597][32177] Avg episode reward: [(0, '0.019')] [2024-06-10 09:49:42,022][32415] Updated weights for policy 0, policy_version 3710 (0.0040) [2024-06-10 09:49:44,592][32177] Fps is (10 sec: 42598.9, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 60882944. Throughput: 0: 45264.4. Samples: 60962060. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-10 09:49:44,592][32177] Avg episode reward: [(0, '0.024')] [2024-06-10 09:49:45,409][32415] Updated weights for policy 0, policy_version 3720 (0.0033) [2024-06-10 09:49:49,494][32415] Updated weights for policy 0, policy_version 3730 (0.0032) [2024-06-10 09:49:49,592][32177] Fps is (10 sec: 42617.2, 60 sec: 44786.1, 300 sec: 44986.6). Total num frames: 61112320. Throughput: 0: 45155.0. Samples: 61233960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-10 09:49:49,592][32177] Avg episode reward: [(0, '0.022')] [2024-06-10 09:49:52,538][32415] Updated weights for policy 0, policy_version 3740 (0.0034) [2024-06-10 09:49:54,592][32177] Fps is (10 sec: 49151.5, 60 sec: 45329.1, 300 sec: 45264.3). Total num frames: 61374464. Throughput: 0: 45111.0. Samples: 61498660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 09:49:54,592][32177] Avg episode reward: [(0, '0.018')] [2024-06-10 09:49:56,740][32415] Updated weights for policy 0, policy_version 3750 (0.0031) [2024-06-10 09:49:59,592][32177] Fps is (10 sec: 45875.1, 60 sec: 45056.0, 300 sec: 45153.9). Total num frames: 61571072. Throughput: 0: 45117.0. Samples: 61641180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 09:49:59,592][32177] Avg episode reward: [(0, '0.026')] [2024-06-10 09:49:59,789][32415] Updated weights for policy 0, policy_version 3760 (0.0031) [2024-06-10 09:50:04,122][32415] Updated weights for policy 0, policy_version 3770 (0.0032) [2024-06-10 09:50:04,592][32177] Fps is (10 sec: 40960.0, 60 sec: 44786.1, 300 sec: 45042.1). Total num frames: 61784064. Throughput: 0: 45413.2. Samples: 61917820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-10 09:50:04,592][32177] Avg episode reward: [(0, '0.020')] [2024-06-10 09:50:07,012][32415] Updated weights for policy 0, policy_version 3780 (0.0031) [2024-06-10 09:50:09,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45056.0, 300 sec: 45264.3). 
Total num frames: 62029824. Throughput: 0: 45026.7. Samples: 62175380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-10 09:50:09,592][32177] Avg episode reward: [(0, '0.023')] [2024-06-10 09:50:11,358][32415] Updated weights for policy 0, policy_version 3790 (0.0034) [2024-06-10 09:50:14,362][32415] Updated weights for policy 0, policy_version 3800 (0.0032) [2024-06-10 09:50:14,592][32177] Fps is (10 sec: 47513.4, 60 sec: 45056.0, 300 sec: 45208.7). Total num frames: 62259200. Throughput: 0: 44824.7. Samples: 62311680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-10 09:50:14,592][32177] Avg episode reward: [(0, '0.026')] [2024-06-10 09:50:14,752][32394] Saving new best policy, reward=0.026! [2024-06-10 09:50:18,269][32415] Updated weights for policy 0, policy_version 3810 (0.0033) [2024-06-10 09:50:19,592][32177] Fps is (10 sec: 45875.5, 60 sec: 45602.8, 300 sec: 45153.2). Total num frames: 62488576. Throughput: 0: 45091.7. Samples: 62590540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 09:50:19,592][32177] Avg episode reward: [(0, '0.022')] [2024-06-10 09:50:21,611][32415] Updated weights for policy 0, policy_version 3820 (0.0041) [2024-06-10 09:50:22,264][32394] Signal inference workers to stop experience collection... (800 times) [2024-06-10 09:50:22,290][32415] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-06-10 09:50:22,330][32394] Signal inference workers to resume experience collection... (800 times) [2024-06-10 09:50:22,331][32415] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-06-10 09:50:24,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44509.8, 300 sec: 45153.3). Total num frames: 62685184. Throughput: 0: 45278.9. Samples: 62857720. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-10 09:50:24,592][32177] Avg episode reward: [(0, '0.026')] [2024-06-10 09:50:25,714][32415] Updated weights for policy 0, policy_version 3830 (0.0028) [2024-06-10 09:50:29,017][32415] Updated weights for policy 0, policy_version 3840 (0.0041) [2024-06-10 09:50:29,592][32177] Fps is (10 sec: 45874.3, 60 sec: 45330.3, 300 sec: 45208.7). Total num frames: 62947328. Throughput: 0: 45228.8. Samples: 62997360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-10 09:50:29,592][32177] Avg episode reward: [(0, '0.026')] [2024-06-10 09:50:33,034][32415] Updated weights for policy 0, policy_version 3850 (0.0040) [2024-06-10 09:50:34,592][32177] Fps is (10 sec: 44237.3, 60 sec: 44509.9, 300 sec: 45042.1). Total num frames: 63127552. Throughput: 0: 45021.7. Samples: 63259940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 09:50:34,592][32177] Avg episode reward: [(0, '0.027')] [2024-06-10 09:50:34,739][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000003854_63143936.pth... [2024-06-10 09:50:34,781][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000003193_52314112.pth [2024-06-10 09:50:34,785][32394] Saving new best policy, reward=0.027! [2024-06-10 09:50:36,116][32415] Updated weights for policy 0, policy_version 3860 (0.0034) [2024-06-10 09:50:39,592][32177] Fps is (10 sec: 42598.6, 60 sec: 44786.2, 300 sec: 45153.2). Total num frames: 63373312. Throughput: 0: 45081.8. Samples: 63527340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-10 09:50:39,592][32177] Avg episode reward: [(0, '0.031')] [2024-06-10 09:50:39,596][32394] Saving new best policy, reward=0.031! 
[2024-06-10 09:50:40,436][32415] Updated weights for policy 0, policy_version 3870 (0.0032) [2024-06-10 09:50:43,549][32415] Updated weights for policy 0, policy_version 3880 (0.0026) [2024-06-10 09:50:44,592][32177] Fps is (10 sec: 49152.3, 60 sec: 45602.1, 300 sec: 45264.3). Total num frames: 63619072. Throughput: 0: 44850.3. Samples: 63659440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-10 09:50:44,592][32177] Avg episode reward: [(0, '0.023')] [2024-06-10 09:50:47,436][32415] Updated weights for policy 0, policy_version 3890 (0.0031) [2024-06-10 09:50:49,596][32177] Fps is (10 sec: 44218.0, 60 sec: 45052.8, 300 sec: 45097.7). Total num frames: 63815680. Throughput: 0: 44854.5. Samples: 63936460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-10 09:50:49,597][32177] Avg episode reward: [(0, '0.022')] [2024-06-10 09:50:50,596][32415] Updated weights for policy 0, policy_version 3900 (0.0032) [2024-06-10 09:50:54,592][32177] Fps is (10 sec: 42598.1, 60 sec: 44509.9, 300 sec: 45153.2). Total num frames: 64045056. Throughput: 0: 45181.7. Samples: 64208560. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-10 09:50:54,592][32177] Avg episode reward: [(0, '0.037')] [2024-06-10 09:50:54,606][32394] Saving new best policy, reward=0.037! [2024-06-10 09:50:54,815][32415] Updated weights for policy 0, policy_version 3910 (0.0039) [2024-06-10 09:50:58,107][32415] Updated weights for policy 0, policy_version 3920 (0.0026) [2024-06-10 09:50:59,592][32177] Fps is (10 sec: 47533.9, 60 sec: 45329.0, 300 sec: 45264.3). Total num frames: 64290816. Throughput: 0: 45087.6. Samples: 64340620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 09:50:59,592][32177] Avg episode reward: [(0, '0.027')] [2024-06-10 09:51:01,980][32415] Updated weights for policy 0, policy_version 3930 (0.0032) [2024-06-10 09:51:04,592][32177] Fps is (10 sec: 45875.6, 60 sec: 45329.2, 300 sec: 45097.7). Total num frames: 64503808. Throughput: 0: 44904.9. 
Samples: 64611260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-10 09:51:04,592][32177] Avg episode reward: [(0, '0.021')] [2024-06-10 09:51:05,227][32415] Updated weights for policy 0, policy_version 3940 (0.0027) [2024-06-10 09:51:09,434][32415] Updated weights for policy 0, policy_version 3950 (0.0021) [2024-06-10 09:51:09,592][32177] Fps is (10 sec: 42598.7, 60 sec: 44782.9, 300 sec: 45042.8). Total num frames: 64716800. Throughput: 0: 44866.4. Samples: 64876700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 09:51:09,592][32177] Avg episode reward: [(0, '0.028')] [2024-06-10 09:51:12,696][32415] Updated weights for policy 0, policy_version 3960 (0.0035) [2024-06-10 09:51:14,592][32177] Fps is (10 sec: 45874.7, 60 sec: 45056.0, 300 sec: 45264.3). Total num frames: 64962560. Throughput: 0: 44801.8. Samples: 65013440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-10 09:51:14,592][32177] Avg episode reward: [(0, '0.026')] [2024-06-10 09:51:16,598][32415] Updated weights for policy 0, policy_version 3970 (0.0045) [2024-06-10 09:51:19,592][32177] Fps is (10 sec: 45875.4, 60 sec: 44782.9, 300 sec: 45097.7). Total num frames: 65175552. Throughput: 0: 44852.6. Samples: 65278300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 09:51:19,592][32177] Avg episode reward: [(0, '0.024')] [2024-06-10 09:51:19,874][32415] Updated weights for policy 0, policy_version 3980 (0.0040) [2024-06-10 09:51:24,194][32415] Updated weights for policy 0, policy_version 3990 (0.0032) [2024-06-10 09:51:24,592][32177] Fps is (10 sec: 42598.9, 60 sec: 45056.2, 300 sec: 45042.2). Total num frames: 65388544. Throughput: 0: 44926.3. Samples: 65549020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-10 09:51:24,592][32177] Avg episode reward: [(0, '0.021')] [2024-06-10 09:51:27,216][32415] Updated weights for policy 0, policy_version 4000 (0.0028) [2024-06-10 09:51:29,592][32177] Fps is (10 sec: 44236.3, 60 sec: 44509.9, 300 sec: 45097.6). 
Total num frames: 65617920. Throughput: 0: 44800.8. Samples: 65675480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 09:51:29,592][32177] Avg episode reward: [(0, '0.033')] [2024-06-10 09:51:30,892][32394] Signal inference workers to stop experience collection... (850 times) [2024-06-10 09:51:30,893][32394] Signal inference workers to resume experience collection... (850 times) [2024-06-10 09:51:30,917][32415] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-06-10 09:51:30,917][32415] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-06-10 09:51:31,301][32415] Updated weights for policy 0, policy_version 4010 (0.0043) [2024-06-10 09:51:34,470][32415] Updated weights for policy 0, policy_version 4020 (0.0041) [2024-06-10 09:51:34,591][32177] Fps is (10 sec: 47514.0, 60 sec: 45602.3, 300 sec: 45208.7). Total num frames: 65863680. Throughput: 0: 44660.0. Samples: 65945960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 09:51:34,592][32177] Avg episode reward: [(0, '0.033')] [2024-06-10 09:51:38,839][32415] Updated weights for policy 0, policy_version 4030 (0.0033) [2024-06-10 09:51:39,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44509.8, 300 sec: 44931.1). Total num frames: 66043904. Throughput: 0: 44500.4. Samples: 66211080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-10 09:51:39,592][32177] Avg episode reward: [(0, '0.026')] [2024-06-10 09:51:41,936][32415] Updated weights for policy 0, policy_version 4040 (0.0033) [2024-06-10 09:51:44,591][32177] Fps is (10 sec: 40959.9, 60 sec: 44236.9, 300 sec: 45097.7). Total num frames: 66273280. Throughput: 0: 44509.9. Samples: 66343560. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 09:51:44,592][32177] Avg episode reward: [(0, '0.033')] [2024-06-10 09:51:45,995][32415] Updated weights for policy 0, policy_version 4050 (0.0031) [2024-06-10 09:51:49,191][32415] Updated weights for policy 0, policy_version 4060 (0.0033) [2024-06-10 09:51:49,592][32177] Fps is (10 sec: 49152.0, 60 sec: 45332.3, 300 sec: 45208.7). Total num frames: 66535424. Throughput: 0: 44658.1. Samples: 66620880. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-10 09:51:49,592][32177] Avg episode reward: [(0, '0.024')] [2024-06-10 09:51:53,233][32415] Updated weights for policy 0, policy_version 4070 (0.0045) [2024-06-10 09:51:54,592][32177] Fps is (10 sec: 44236.0, 60 sec: 44509.8, 300 sec: 44986.6). Total num frames: 66715648. Throughput: 0: 44890.1. Samples: 66896760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 09:51:54,592][32177] Avg episode reward: [(0, '0.036')] [2024-06-10 09:51:56,222][32415] Updated weights for policy 0, policy_version 4080 (0.0021) [2024-06-10 09:51:59,592][32177] Fps is (10 sec: 42598.9, 60 sec: 44509.9, 300 sec: 45097.7). Total num frames: 66961408. Throughput: 0: 44663.2. Samples: 67023280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 09:51:59,592][32177] Avg episode reward: [(0, '0.040')] [2024-06-10 09:51:59,601][32394] Saving new best policy, reward=0.040! [2024-06-10 09:52:00,220][32415] Updated weights for policy 0, policy_version 4090 (0.0045) [2024-06-10 09:52:03,424][32415] Updated weights for policy 0, policy_version 4100 (0.0028) [2024-06-10 09:52:04,592][32177] Fps is (10 sec: 49152.1, 60 sec: 45055.9, 300 sec: 45208.7). Total num frames: 67207168. Throughput: 0: 44828.3. Samples: 67295580. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 09:52:04,592][32177] Avg episode reward: [(0, '0.031')] [2024-06-10 09:52:07,832][32415] Updated weights for policy 0, policy_version 4110 (0.0043) [2024-06-10 09:52:09,591][32177] Fps is (10 sec: 45875.5, 60 sec: 45056.0, 300 sec: 45042.1). Total num frames: 67420160. Throughput: 0: 44830.7. Samples: 67566400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-10 09:52:09,592][32177] Avg episode reward: [(0, '0.033')] [2024-06-10 09:52:10,975][32415] Updated weights for policy 0, policy_version 4120 (0.0027) [2024-06-10 09:52:14,593][32177] Fps is (10 sec: 42592.1, 60 sec: 44508.8, 300 sec: 45097.4). Total num frames: 67633152. Throughput: 0: 44973.2. Samples: 67699340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-10 09:52:14,594][32177] Avg episode reward: [(0, '0.027')] [2024-06-10 09:52:15,181][32415] Updated weights for policy 0, policy_version 4130 (0.0040) [2024-06-10 09:52:18,238][32415] Updated weights for policy 0, policy_version 4140 (0.0038) [2024-06-10 09:52:19,591][32177] Fps is (10 sec: 45875.3, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 67878912. Throughput: 0: 45012.0. Samples: 67971500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-10 09:52:19,592][32177] Avg episode reward: [(0, '0.034')] [2024-06-10 09:52:22,339][32415] Updated weights for policy 0, policy_version 4150 (0.0032) [2024-06-10 09:52:24,592][32177] Fps is (10 sec: 44243.8, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 68075520. Throughput: 0: 45204.1. Samples: 68245260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-10 09:52:24,592][32177] Avg episode reward: [(0, '0.031')] [2024-06-10 09:52:25,307][32415] Updated weights for policy 0, policy_version 4160 (0.0036) [2024-06-10 09:52:29,503][32415] Updated weights for policy 0, policy_version 4170 (0.0027) [2024-06-10 09:52:29,592][32177] Fps is (10 sec: 44236.6, 60 sec: 45056.1, 300 sec: 45097.7). 
Total num frames: 68321280. Throughput: 0: 45175.5. Samples: 68376460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 09:52:29,592][32177] Avg episode reward: [(0, '0.052')] [2024-06-10 09:52:29,593][32394] Saving new best policy, reward=0.052! [2024-06-10 09:52:32,638][32415] Updated weights for policy 0, policy_version 4180 (0.0034) [2024-06-10 09:52:34,592][32177] Fps is (10 sec: 47513.6, 60 sec: 44782.9, 300 sec: 45153.3). Total num frames: 68550656. Throughput: 0: 44867.7. Samples: 68639920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 09:52:34,593][32177] Avg episode reward: [(0, '0.038')] [2024-06-10 09:52:34,600][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000004184_68550656.pth... [2024-06-10 09:52:34,662][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000003525_57753600.pth [2024-06-10 09:52:36,915][32415] Updated weights for policy 0, policy_version 4190 (0.0036) [2024-06-10 09:52:39,592][32177] Fps is (10 sec: 45875.1, 60 sec: 45602.2, 300 sec: 45097.7). Total num frames: 68780032. Throughput: 0: 44896.6. Samples: 68917100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 09:52:39,592][32177] Avg episode reward: [(0, '0.042')] [2024-06-10 09:52:40,071][32415] Updated weights for policy 0, policy_version 4200 (0.0028) [2024-06-10 09:52:44,206][32415] Updated weights for policy 0, policy_version 4210 (0.0032) [2024-06-10 09:52:44,592][32177] Fps is (10 sec: 44237.1, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 68993024. Throughput: 0: 45059.2. Samples: 69050940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 09:52:44,592][32177] Avg episode reward: [(0, '0.035')] [2024-06-10 09:52:45,912][32394] Signal inference workers to stop experience collection... 
(900 times) [2024-06-10 09:52:45,952][32415] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-06-10 09:52:45,959][32394] Signal inference workers to resume experience collection... (900 times) [2024-06-10 09:52:45,965][32415] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-06-10 09:52:47,109][32415] Updated weights for policy 0, policy_version 4220 (0.0029) [2024-06-10 09:52:49,591][32177] Fps is (10 sec: 42598.8, 60 sec: 44510.0, 300 sec: 44986.6). Total num frames: 69206016. Throughput: 0: 44942.9. Samples: 69318000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-10 09:52:49,592][32177] Avg episode reward: [(0, '0.042')] [2024-06-10 09:52:51,590][32415] Updated weights for policy 0, policy_version 4230 (0.0031) [2024-06-10 09:52:54,591][32177] Fps is (10 sec: 45875.4, 60 sec: 45602.3, 300 sec: 45097.7). Total num frames: 69451776. Throughput: 0: 44905.4. Samples: 69587140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-10 09:52:54,592][32177] Avg episode reward: [(0, '0.049')] [2024-06-10 09:52:54,606][32415] Updated weights for policy 0, policy_version 4240 (0.0024) [2024-06-10 09:52:58,826][32415] Updated weights for policy 0, policy_version 4250 (0.0028) [2024-06-10 09:52:59,591][32177] Fps is (10 sec: 44236.8, 60 sec: 44783.0, 300 sec: 44931.1). Total num frames: 69648384. Throughput: 0: 44963.9. Samples: 69722640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-10 09:52:59,592][32177] Avg episode reward: [(0, '0.038')] [2024-06-10 09:53:01,961][32415] Updated weights for policy 0, policy_version 4260 (0.0030) [2024-06-10 09:53:04,592][32177] Fps is (10 sec: 42598.1, 60 sec: 44510.0, 300 sec: 44931.1). Total num frames: 69877760. Throughput: 0: 44812.8. Samples: 69988080. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-10 09:53:04,592][32177] Avg episode reward: [(0, '0.043')] [2024-06-10 09:53:05,995][32415] Updated weights for policy 0, policy_version 4270 (0.0025) [2024-06-10 09:53:09,039][32415] Updated weights for policy 0, policy_version 4280 (0.0033) [2024-06-10 09:53:09,596][32177] Fps is (10 sec: 47492.6, 60 sec: 45052.7, 300 sec: 45097.0). Total num frames: 70123520. Throughput: 0: 44730.3. Samples: 70258320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-10 09:53:09,605][32177] Avg episode reward: [(0, '0.037')] [2024-06-10 09:53:13,556][32415] Updated weights for policy 0, policy_version 4290 (0.0032) [2024-06-10 09:53:14,592][32177] Fps is (10 sec: 44236.9, 60 sec: 44784.1, 300 sec: 44931.0). Total num frames: 70320128. Throughput: 0: 44998.2. Samples: 70401380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 09:53:14,592][32177] Avg episode reward: [(0, '0.038')] [2024-06-10 09:53:16,585][32415] Updated weights for policy 0, policy_version 4300 (0.0036) [2024-06-10 09:53:19,592][32177] Fps is (10 sec: 40978.0, 60 sec: 44236.8, 300 sec: 44931.1). Total num frames: 70533120. Throughput: 0: 44911.6. Samples: 70660940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-10 09:53:19,593][32177] Avg episode reward: [(0, '0.037')] [2024-06-10 09:53:20,846][32415] Updated weights for policy 0, policy_version 4310 (0.0036) [2024-06-10 09:53:24,061][32415] Updated weights for policy 0, policy_version 4320 (0.0033) [2024-06-10 09:53:24,591][32177] Fps is (10 sec: 49152.2, 60 sec: 45602.2, 300 sec: 45208.8). Total num frames: 70811648. Throughput: 0: 44710.3. Samples: 70929060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 09:53:24,592][32177] Avg episode reward: [(0, '0.030')] [2024-06-10 09:53:28,203][32415] Updated weights for policy 0, policy_version 4330 (0.0035) [2024-06-10 09:53:29,592][32177] Fps is (10 sec: 45874.9, 60 sec: 44509.8, 300 sec: 44875.5). 
Total num frames: 70991872. Throughput: 0: 44741.3. Samples: 71064300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-10 09:53:29,592][32177] Avg episode reward: [(0, '0.039')] [2024-06-10 09:53:31,315][32415] Updated weights for policy 0, policy_version 4340 (0.0035) [2024-06-10 09:53:34,591][32177] Fps is (10 sec: 39321.6, 60 sec: 44236.9, 300 sec: 44931.0). Total num frames: 71204864. Throughput: 0: 44710.2. Samples: 71329960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 09:53:34,592][32177] Avg episode reward: [(0, '0.043')] [2024-06-10 09:53:35,494][32415] Updated weights for policy 0, policy_version 4350 (0.0025) [2024-06-10 09:53:38,602][32415] Updated weights for policy 0, policy_version 4360 (0.0030) [2024-06-10 09:53:39,591][32177] Fps is (10 sec: 45875.7, 60 sec: 44509.9, 300 sec: 45042.1). Total num frames: 71450624. Throughput: 0: 44728.0. Samples: 71599900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 09:53:39,592][32177] Avg episode reward: [(0, '0.037')] [2024-06-10 09:53:42,969][32415] Updated weights for policy 0, policy_version 4370 (0.0038) [2024-06-10 09:53:44,592][32177] Fps is (10 sec: 49151.9, 60 sec: 45056.0, 300 sec: 44987.2). Total num frames: 71696384. Throughput: 0: 44897.3. Samples: 71743020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 09:53:44,592][32177] Avg episode reward: [(0, '0.044')] [2024-06-10 09:53:45,760][32415] Updated weights for policy 0, policy_version 4380 (0.0033) [2024-06-10 09:53:49,592][32177] Fps is (10 sec: 44236.4, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 71892992. Throughput: 0: 44841.3. Samples: 72005940. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-10 09:53:49,592][32177] Avg episode reward: [(0, '0.041')] [2024-06-10 09:53:50,242][32415] Updated weights for policy 0, policy_version 4390 (0.0036) [2024-06-10 09:53:53,224][32415] Updated weights for policy 0, policy_version 4400 (0.0043) [2024-06-10 09:53:54,591][32177] Fps is (10 sec: 44236.9, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 72138752. Throughput: 0: 44774.6. Samples: 72272980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 09:53:54,592][32177] Avg episode reward: [(0, '0.047')] [2024-06-10 09:53:57,478][32415] Updated weights for policy 0, policy_version 4410 (0.0039) [2024-06-10 09:53:59,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45055.9, 300 sec: 44931.7). Total num frames: 72351744. Throughput: 0: 44540.4. Samples: 72405700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 09:53:59,592][32177] Avg episode reward: [(0, '0.053')] [2024-06-10 09:54:00,759][32415] Updated weights for policy 0, policy_version 4420 (0.0038) [2024-06-10 09:54:04,592][32177] Fps is (10 sec: 40959.9, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 72548352. Throughput: 0: 44683.5. Samples: 72671700. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-10 09:54:04,592][32177] Avg episode reward: [(0, '0.060')] [2024-06-10 09:54:04,605][32394] Saving new best policy, reward=0.060! [2024-06-10 09:54:05,133][32415] Updated weights for policy 0, policy_version 4430 (0.0039) [2024-06-10 09:54:06,325][32394] Signal inference workers to stop experience collection... (950 times) [2024-06-10 09:54:06,325][32394] Signal inference workers to resume experience collection... 
(950 times) [2024-06-10 09:54:06,361][32415] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-06-10 09:54:06,361][32415] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-06-10 09:54:07,898][32415] Updated weights for policy 0, policy_version 4440 (0.0032) [2024-06-10 09:54:09,591][32177] Fps is (10 sec: 45875.6, 60 sec: 44786.2, 300 sec: 44931.1). Total num frames: 72810496. Throughput: 0: 44732.0. Samples: 72942000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-10 09:54:09,592][32177] Avg episode reward: [(0, '0.055')] [2024-06-10 09:54:12,174][32415] Updated weights for policy 0, policy_version 4450 (0.0035) [2024-06-10 09:54:14,592][32177] Fps is (10 sec: 47513.6, 60 sec: 45056.0, 300 sec: 44986.7). Total num frames: 73023488. Throughput: 0: 44864.5. Samples: 73083200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-10 09:54:14,592][32177] Avg episode reward: [(0, '0.049')] [2024-06-10 09:54:15,241][32415] Updated weights for policy 0, policy_version 4460 (0.0031) [2024-06-10 09:54:19,539][32415] Updated weights for policy 0, policy_version 4470 (0.0040) [2024-06-10 09:54:19,595][32177] Fps is (10 sec: 42582.0, 60 sec: 45053.1, 300 sec: 44819.4). Total num frames: 73236480. Throughput: 0: 44840.2. Samples: 73347940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-10 09:54:19,596][32177] Avg episode reward: [(0, '0.043')] [2024-06-10 09:54:22,596][32415] Updated weights for policy 0, policy_version 4480 (0.0033) [2024-06-10 09:54:24,592][32177] Fps is (10 sec: 45875.1, 60 sec: 44509.8, 300 sec: 44931.3). Total num frames: 73482240. Throughput: 0: 44744.8. Samples: 73613420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 09:54:24,592][32177] Avg episode reward: [(0, '0.051')] [2024-06-10 09:54:27,084][32415] Updated weights for policy 0, policy_version 4490 (0.0029) [2024-06-10 09:54:29,596][32177] Fps is (10 sec: 45872.6, 60 sec: 45052.8, 300 sec: 44874.9). 
Total num frames: 73695232. Throughput: 0: 44677.9. Samples: 73753720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 09:54:29,597][32177] Avg episode reward: [(0, '0.055')] [2024-06-10 09:54:30,103][32415] Updated weights for policy 0, policy_version 4500 (0.0037) [2024-06-10 09:54:34,557][32415] Updated weights for policy 0, policy_version 4510 (0.0035) [2024-06-10 09:54:34,592][32177] Fps is (10 sec: 40960.1, 60 sec: 44782.9, 300 sec: 44765.1). Total num frames: 73891840. Throughput: 0: 44717.4. Samples: 74018220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-10 09:54:34,592][32177] Avg episode reward: [(0, '0.054')] [2024-06-10 09:54:34,610][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000004510_73891840.pth... [2024-06-10 09:54:34,663][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000003854_63143936.pth [2024-06-10 09:54:37,150][32415] Updated weights for policy 0, policy_version 4520 (0.0036) [2024-06-10 09:54:39,592][32177] Fps is (10 sec: 44255.7, 60 sec: 44782.8, 300 sec: 44931.0). Total num frames: 74137600. Throughput: 0: 44836.7. Samples: 74290640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 09:54:39,592][32177] Avg episode reward: [(0, '0.053')] [2024-06-10 09:54:41,561][32415] Updated weights for policy 0, policy_version 4530 (0.0044) [2024-06-10 09:54:44,595][32177] Fps is (10 sec: 47495.8, 60 sec: 44507.1, 300 sec: 44930.5). Total num frames: 74366976. Throughput: 0: 45088.7. Samples: 74434860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 09:54:44,596][32177] Avg episode reward: [(0, '0.056')] [2024-06-10 09:54:44,693][32415] Updated weights for policy 0, policy_version 4540 (0.0040) [2024-06-10 09:54:48,874][32415] Updated weights for policy 0, policy_version 4550 (0.0027) [2024-06-10 09:54:49,593][32177] Fps is (10 sec: 42591.5, 60 sec: 44508.6, 300 sec: 44708.6). Total num frames: 74563584. Throughput: 0: 44893.8. 
Samples: 74692000. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-10 09:54:49,594][32177] Avg episode reward: [(0, '0.050')] [2024-06-10 09:54:52,406][32415] Updated weights for policy 0, policy_version 4560 (0.0032) [2024-06-10 09:54:54,591][32177] Fps is (10 sec: 44253.6, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 74809344. Throughput: 0: 44652.4. Samples: 74951360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 09:54:54,592][32177] Avg episode reward: [(0, '0.051')] [2024-06-10 09:54:56,287][32415] Updated weights for policy 0, policy_version 4570 (0.0032) [2024-06-10 09:54:59,395][32415] Updated weights for policy 0, policy_version 4580 (0.0029) [2024-06-10 09:54:59,591][32177] Fps is (10 sec: 47522.1, 60 sec: 44783.0, 300 sec: 44931.1). Total num frames: 75038720. Throughput: 0: 44656.5. Samples: 75092740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 09:54:59,592][32177] Avg episode reward: [(0, '0.064')] [2024-06-10 09:54:59,592][32394] Saving new best policy, reward=0.064! [2024-06-10 09:55:03,735][32415] Updated weights for policy 0, policy_version 4590 (0.0030) [2024-06-10 09:55:04,592][32177] Fps is (10 sec: 44236.6, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 75251712. Throughput: 0: 44911.8. Samples: 75368800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 09:55:04,592][32177] Avg episode reward: [(0, '0.050')] [2024-06-10 09:55:06,590][32415] Updated weights for policy 0, policy_version 4600 (0.0035) [2024-06-10 09:55:09,594][32177] Fps is (10 sec: 45863.5, 60 sec: 44781.0, 300 sec: 44875.1). Total num frames: 75497472. Throughput: 0: 44910.0. Samples: 75634480. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-06-10 09:55:09,594][32177] Avg episode reward: [(0, '0.062')] [2024-06-10 09:55:10,769][32415] Updated weights for policy 0, policy_version 4610 (0.0038) [2024-06-10 09:55:13,945][32415] Updated weights for policy 0, policy_version 4620 (0.0030) [2024-06-10 09:55:14,594][32177] Fps is (10 sec: 45863.1, 60 sec: 44781.0, 300 sec: 44819.6). Total num frames: 75710464. Throughput: 0: 45066.2. Samples: 75781620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 09:55:14,595][32177] Avg episode reward: [(0, '0.064')] [2024-06-10 09:55:17,880][32415] Updated weights for policy 0, policy_version 4630 (0.0040) [2024-06-10 09:55:19,592][32177] Fps is (10 sec: 44247.7, 60 sec: 45058.8, 300 sec: 44931.1). Total num frames: 75939840. Throughput: 0: 45171.1. Samples: 76050920. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-10 09:55:19,592][32177] Avg episode reward: [(0, '0.064')] [2024-06-10 09:55:21,435][32415] Updated weights for policy 0, policy_version 4640 (0.0039) [2024-06-10 09:55:24,592][32177] Fps is (10 sec: 45886.3, 60 sec: 44782.8, 300 sec: 44820.0). Total num frames: 76169216. Throughput: 0: 44928.8. Samples: 76312440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-10 09:55:24,596][32177] Avg episode reward: [(0, '0.065')] [2024-06-10 09:55:25,269][32415] Updated weights for policy 0, policy_version 4650 (0.0035) [2024-06-10 09:55:28,575][32415] Updated weights for policy 0, policy_version 4660 (0.0035) [2024-06-10 09:55:29,592][32177] Fps is (10 sec: 45874.9, 60 sec: 45059.2, 300 sec: 44986.6). Total num frames: 76398592. Throughput: 0: 44833.4. Samples: 76452200. Policy #0 lag: (min: 2.0, avg: 11.7, max: 24.0) [2024-06-10 09:55:29,592][32177] Avg episode reward: [(0, '0.056')] [2024-06-10 09:55:32,718][32415] Updated weights for policy 0, policy_version 4670 (0.0025) [2024-06-10 09:55:34,054][32394] Signal inference workers to stop experience collection... 
(1000 times) [2024-06-10 09:55:34,055][32394] Signal inference workers to resume experience collection... (1000 times) [2024-06-10 09:55:34,065][32415] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-06-10 09:55:34,065][32415] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-06-10 09:55:34,592][32177] Fps is (10 sec: 42599.3, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 76595200. Throughput: 0: 45185.7. Samples: 76725280. Policy #0 lag: (min: 2.0, avg: 11.7, max: 24.0) [2024-06-10 09:55:34,592][32177] Avg episode reward: [(0, '0.064')] [2024-06-10 09:55:35,604][32415] Updated weights for policy 0, policy_version 4680 (0.0026) [2024-06-10 09:55:39,592][32177] Fps is (10 sec: 42598.8, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 76824576. Throughput: 0: 45284.9. Samples: 76989180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 09:55:39,592][32177] Avg episode reward: [(0, '0.061')] [2024-06-10 09:55:39,902][32415] Updated weights for policy 0, policy_version 4690 (0.0029) [2024-06-10 09:55:43,284][32415] Updated weights for policy 0, policy_version 4700 (0.0022) [2024-06-10 09:55:44,592][32177] Fps is (10 sec: 49148.1, 60 sec: 45331.3, 300 sec: 44987.1). Total num frames: 77086720. Throughput: 0: 45144.0. Samples: 77124260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 09:55:44,593][32177] Avg episode reward: [(0, '0.073')] [2024-06-10 09:55:44,605][32394] Saving new best policy, reward=0.073! [2024-06-10 09:55:47,050][32415] Updated weights for policy 0, policy_version 4710 (0.0031) [2024-06-10 09:55:49,592][32177] Fps is (10 sec: 44236.6, 60 sec: 45057.3, 300 sec: 44820.0). Total num frames: 77266944. Throughput: 0: 45013.3. Samples: 77394400. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 09:55:49,592][32177] Avg episode reward: [(0, '0.065')] [2024-06-10 09:55:50,880][32415] Updated weights for policy 0, policy_version 4720 (0.0028) [2024-06-10 09:55:54,593][32415] Updated weights for policy 0, policy_version 4730 (0.0031) [2024-06-10 09:55:54,596][32177] Fps is (10 sec: 40945.2, 60 sec: 44779.6, 300 sec: 44763.8). Total num frames: 77496320. Throughput: 0: 44905.2. Samples: 77655300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 09:55:54,597][32177] Avg episode reward: [(0, '0.074')] [2024-06-10 09:55:54,610][32394] Saving new best policy, reward=0.074! [2024-06-10 09:55:57,921][32415] Updated weights for policy 0, policy_version 4740 (0.0041) [2024-06-10 09:55:59,593][32177] Fps is (10 sec: 47506.4, 60 sec: 45054.8, 300 sec: 44875.3). Total num frames: 77742080. Throughput: 0: 44667.7. Samples: 77791620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-10 09:55:59,594][32177] Avg episode reward: [(0, '0.080')] [2024-06-10 09:55:59,594][32394] Saving new best policy, reward=0.080! [2024-06-10 09:56:02,050][32415] Updated weights for policy 0, policy_version 4750 (0.0029) [2024-06-10 09:56:04,594][32177] Fps is (10 sec: 42608.9, 60 sec: 44508.4, 300 sec: 44764.1). Total num frames: 77922304. Throughput: 0: 44570.1. Samples: 78056660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-10 09:56:04,594][32177] Avg episode reward: [(0, '0.060')] [2024-06-10 09:56:05,104][32415] Updated weights for policy 0, policy_version 4760 (0.0022) [2024-06-10 09:56:09,076][32415] Updated weights for policy 0, policy_version 4770 (0.0029) [2024-06-10 09:56:09,591][32177] Fps is (10 sec: 42605.1, 60 sec: 44511.8, 300 sec: 44764.4). Total num frames: 78168064. Throughput: 0: 44853.6. Samples: 78330840. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-10 09:56:09,592][32177] Avg episode reward: [(0, '0.058')] [2024-06-10 09:56:12,583][32415] Updated weights for policy 0, policy_version 4780 (0.0027) [2024-06-10 09:56:14,591][32177] Fps is (10 sec: 49161.8, 60 sec: 45058.0, 300 sec: 44875.5). Total num frames: 78413824. Throughput: 0: 44601.4. Samples: 78459260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 09:56:14,592][32177] Avg episode reward: [(0, '0.061')] [2024-06-10 09:56:16,102][32415] Updated weights for policy 0, policy_version 4790 (0.0028) [2024-06-10 09:56:19,591][32177] Fps is (10 sec: 42598.5, 60 sec: 44236.9, 300 sec: 44764.4). Total num frames: 78594048. Throughput: 0: 44554.3. Samples: 78730220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-10 09:56:19,592][32177] Avg episode reward: [(0, '0.084')] [2024-06-10 09:56:19,617][32394] Saving new best policy, reward=0.084! [2024-06-10 09:56:20,099][32415] Updated weights for policy 0, policy_version 4800 (0.0036) [2024-06-10 09:56:23,569][32415] Updated weights for policy 0, policy_version 4810 (0.0028) [2024-06-10 09:56:24,592][32177] Fps is (10 sec: 40959.9, 60 sec: 44237.0, 300 sec: 44764.4). Total num frames: 78823424. Throughput: 0: 44626.7. Samples: 78997380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-10 09:56:24,592][32177] Avg episode reward: [(0, '0.072')] [2024-06-10 09:56:27,526][32415] Updated weights for policy 0, policy_version 4820 (0.0032) [2024-06-10 09:56:29,594][32177] Fps is (10 sec: 50777.8, 60 sec: 45054.2, 300 sec: 44875.1). Total num frames: 79101952. Throughput: 0: 44624.6. Samples: 79132440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 09:56:29,594][32177] Avg episode reward: [(0, '0.076')] [2024-06-10 09:56:31,124][32415] Updated weights for policy 0, policy_version 4830 (0.0026) [2024-06-10 09:56:34,592][32177] Fps is (10 sec: 45875.1, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 79282176. 
Throughput: 0: 44553.8. Samples: 79399320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 09:56:34,592][32177] Avg episode reward: [(0, '0.071')] [2024-06-10 09:56:34,652][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000004840_79298560.pth... [2024-06-10 09:56:34,662][32415] Updated weights for policy 0, policy_version 4840 (0.0031) [2024-06-10 09:56:34,696][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000004184_68550656.pth [2024-06-10 09:56:38,120][32415] Updated weights for policy 0, policy_version 4850 (0.0039) [2024-06-10 09:56:39,593][32177] Fps is (10 sec: 40963.9, 60 sec: 44781.8, 300 sec: 44875.3). Total num frames: 79511552. Throughput: 0: 44707.8. Samples: 79667020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-10 09:56:39,593][32177] Avg episode reward: [(0, '0.076')] [2024-06-10 09:56:42,102][32415] Updated weights for policy 0, policy_version 4860 (0.0025) [2024-06-10 09:56:43,284][32394] Signal inference workers to stop experience collection... (1050 times) [2024-06-10 09:56:43,285][32394] Signal inference workers to resume experience collection... (1050 times) [2024-06-10 09:56:43,328][32415] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-06-10 09:56:43,328][32415] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-06-10 09:56:44,596][32177] Fps is (10 sec: 47492.8, 60 sec: 44507.2, 300 sec: 44819.3). Total num frames: 79757312. Throughput: 0: 44528.3. Samples: 79795520. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-10 09:56:44,597][32177] Avg episode reward: [(0, '0.069')] [2024-06-10 09:56:45,262][32415] Updated weights for policy 0, policy_version 4870 (0.0030) [2024-06-10 09:56:49,591][32177] Fps is (10 sec: 42604.9, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 79937536. Throughput: 0: 44610.0. Samples: 80064020. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 09:56:49,592][32177] Avg episode reward: [(0, '0.086')] [2024-06-10 09:56:49,703][32394] Saving new best policy, reward=0.086! [2024-06-10 09:56:49,711][32415] Updated weights for policy 0, policy_version 4880 (0.0038) [2024-06-10 09:56:52,816][32415] Updated weights for policy 0, policy_version 4890 (0.0038) [2024-06-10 09:56:54,592][32177] Fps is (10 sec: 40977.7, 60 sec: 44513.1, 300 sec: 44764.4). Total num frames: 80166912. Throughput: 0: 44448.8. Samples: 80331040. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-10 09:56:54,592][32177] Avg episode reward: [(0, '0.079')] [2024-06-10 09:56:56,835][32415] Updated weights for policy 0, policy_version 4900 (0.0026) [2024-06-10 09:56:59,592][32177] Fps is (10 sec: 47513.2, 60 sec: 44511.0, 300 sec: 44764.4). Total num frames: 80412672. Throughput: 0: 44471.5. Samples: 80460480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 09:56:59,592][32177] Avg episode reward: [(0, '0.075')] [2024-06-10 09:57:00,432][32415] Updated weights for policy 0, policy_version 4910 (0.0034) [2024-06-10 09:57:04,180][32415] Updated weights for policy 0, policy_version 4920 (0.0037) [2024-06-10 09:57:04,591][32177] Fps is (10 sec: 47514.2, 60 sec: 45330.6, 300 sec: 44820.0). Total num frames: 80642048. Throughput: 0: 44665.8. Samples: 80740180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 09:57:04,592][32177] Avg episode reward: [(0, '0.092')] [2024-06-10 09:57:04,776][32394] Saving new best policy, reward=0.092! [2024-06-10 09:57:07,406][32415] Updated weights for policy 0, policy_version 4930 (0.0036) [2024-06-10 09:57:09,591][32177] Fps is (10 sec: 40960.3, 60 sec: 44236.8, 300 sec: 44709.1). Total num frames: 80822272. Throughput: 0: 44598.7. Samples: 81004320. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 09:57:09,592][32177] Avg episode reward: [(0, '0.070')] [2024-06-10 09:57:11,471][32415] Updated weights for policy 0, policy_version 4940 (0.0023) [2024-06-10 09:57:14,438][32415] Updated weights for policy 0, policy_version 4950 (0.0030) [2024-06-10 09:57:14,592][32177] Fps is (10 sec: 45874.6, 60 sec: 44782.8, 300 sec: 44819.9). Total num frames: 81100800. Throughput: 0: 44477.4. Samples: 81133820. Policy #0 lag: (min: 0.0, avg: 13.1, max: 23.0) [2024-06-10 09:57:14,592][32177] Avg episode reward: [(0, '0.068')] [2024-06-10 09:57:18,922][32415] Updated weights for policy 0, policy_version 4960 (0.0039) [2024-06-10 09:57:19,592][32177] Fps is (10 sec: 49151.8, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 81313792. Throughput: 0: 44648.5. Samples: 81408500. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-06-10 09:57:19,592][32177] Avg episode reward: [(0, '0.093')] [2024-06-10 09:57:19,704][32394] Saving new best policy, reward=0.093! [2024-06-10 09:57:21,999][32415] Updated weights for policy 0, policy_version 4970 (0.0038) [2024-06-10 09:57:24,593][32177] Fps is (10 sec: 40955.9, 60 sec: 44782.1, 300 sec: 44708.7). Total num frames: 81510400. Throughput: 0: 44634.6. Samples: 81675560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-10 09:57:24,593][32177] Avg episode reward: [(0, '0.089')] [2024-06-10 09:57:25,996][32415] Updated weights for policy 0, policy_version 4980 (0.0030) [2024-06-10 09:57:29,460][32415] Updated weights for policy 0, policy_version 4990 (0.0038) [2024-06-10 09:57:29,592][32177] Fps is (10 sec: 44236.5, 60 sec: 44238.5, 300 sec: 44764.4). Total num frames: 81756160. Throughput: 0: 44653.6. Samples: 81804740. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-10 09:57:29,592][32177] Avg episode reward: [(0, '0.073')] [2024-06-10 09:57:33,315][32415] Updated weights for policy 0, policy_version 5000 (0.0032) [2024-06-10 09:57:34,592][32177] Fps is (10 sec: 49157.4, 60 sec: 45329.1, 300 sec: 44820.0). Total num frames: 82001920. Throughput: 0: 44944.4. Samples: 82086520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 09:57:34,592][32177] Avg episode reward: [(0, '0.065')] [2024-06-10 09:57:36,478][32415] Updated weights for policy 0, policy_version 5010 (0.0033) [2024-06-10 09:57:39,591][32177] Fps is (10 sec: 39322.1, 60 sec: 43964.8, 300 sec: 44597.8). Total num frames: 82149376. Throughput: 0: 44873.9. Samples: 82350360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-10 09:57:39,592][32177] Avg episode reward: [(0, '0.095')] [2024-06-10 09:57:39,706][32394] Saving new best policy, reward=0.095! [2024-06-10 09:57:40,912][32415] Updated weights for policy 0, policy_version 5020 (0.0031) [2024-06-10 09:57:44,268][32415] Updated weights for policy 0, policy_version 5030 (0.0032) [2024-06-10 09:57:44,592][32177] Fps is (10 sec: 40959.2, 60 sec: 44239.9, 300 sec: 44764.4). Total num frames: 82411520. Throughput: 0: 44610.1. Samples: 82467940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 09:57:44,592][32177] Avg episode reward: [(0, '0.075')] [2024-06-10 09:57:48,297][32415] Updated weights for policy 0, policy_version 5040 (0.0029) [2024-06-10 09:57:49,591][32177] Fps is (10 sec: 50790.4, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 82657280. Throughput: 0: 44643.1. Samples: 82749120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 09:57:49,592][32177] Avg episode reward: [(0, '0.087')] [2024-06-10 09:57:51,823][32415] Updated weights for policy 0, policy_version 5050 (0.0028) [2024-06-10 09:57:54,556][32394] Signal inference workers to stop experience collection... 
(1100 times) [2024-06-10 09:57:54,557][32394] Signal inference workers to resume experience collection... (1100 times) [2024-06-10 09:57:54,591][32177] Fps is (10 sec: 42599.4, 60 sec: 44510.0, 300 sec: 44708.9). Total num frames: 82837504. Throughput: 0: 44934.7. Samples: 83026380. Policy #0 lag: (min: 0.0, avg: 13.4, max: 28.0) [2024-06-10 09:57:54,592][32177] Avg episode reward: [(0, '0.083')] [2024-06-10 09:57:54,600][32415] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-06-10 09:57:54,600][32415] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-06-10 09:57:55,311][32415] Updated weights for policy 0, policy_version 5060 (0.0031) [2024-06-10 09:57:58,747][32415] Updated weights for policy 0, policy_version 5070 (0.0026) [2024-06-10 09:57:59,592][32177] Fps is (10 sec: 42598.1, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 83083264. Throughput: 0: 44714.3. Samples: 83145960. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-10 09:57:59,592][32177] Avg episode reward: [(0, '0.093')] [2024-06-10 09:58:02,686][32415] Updated weights for policy 0, policy_version 5080 (0.0027) [2024-06-10 09:58:04,592][32177] Fps is (10 sec: 52428.5, 60 sec: 45329.0, 300 sec: 44876.2). Total num frames: 83361792. Throughput: 0: 44782.7. Samples: 83423720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-10 09:58:04,592][32177] Avg episode reward: [(0, '0.089')] [2024-06-10 09:58:05,695][32415] Updated weights for policy 0, policy_version 5090 (0.0028) [2024-06-10 09:58:09,596][32177] Fps is (10 sec: 44217.7, 60 sec: 45052.7, 300 sec: 44763.8). Total num frames: 83525632. Throughput: 0: 45160.7. Samples: 83707940. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-10 09:58:09,596][32177] Avg episode reward: [(0, '0.091')] [2024-06-10 09:58:09,841][32415] Updated weights for policy 0, policy_version 5100 (0.0047) [2024-06-10 09:58:13,147][32415] Updated weights for policy 0, policy_version 5110 (0.0024) [2024-06-10 09:58:14,592][32177] Fps is (10 sec: 39321.4, 60 sec: 44236.8, 300 sec: 44819.9). Total num frames: 83755008. Throughput: 0: 44858.2. Samples: 83823360. Policy #0 lag: (min: 0.0, avg: 12.4, max: 26.0) [2024-06-10 09:58:14,592][32177] Avg episode reward: [(0, '0.090')] [2024-06-10 09:58:17,171][32415] Updated weights for policy 0, policy_version 5120 (0.0027) [2024-06-10 09:58:19,591][32177] Fps is (10 sec: 49173.7, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 84017152. Throughput: 0: 44749.8. Samples: 84100260. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-10 09:58:19,592][32177] Avg episode reward: [(0, '0.094')] [2024-06-10 09:58:20,781][32415] Updated weights for policy 0, policy_version 5130 (0.0039) [2024-06-10 09:58:24,346][32415] Updated weights for policy 0, policy_version 5140 (0.0028) [2024-06-10 09:58:24,595][32177] Fps is (10 sec: 47495.3, 60 sec: 45326.9, 300 sec: 44874.9). Total num frames: 84230144. Throughput: 0: 44982.3. Samples: 84374740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-10 09:58:24,596][32177] Avg episode reward: [(0, '0.094')] [2024-06-10 09:58:27,760][32415] Updated weights for policy 0, policy_version 5150 (0.0024) [2024-06-10 09:58:29,592][32177] Fps is (10 sec: 40959.7, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 84426752. Throughput: 0: 45181.1. Samples: 84501080. Policy #0 lag: (min: 0.0, avg: 13.3, max: 23.0) [2024-06-10 09:58:29,592][32177] Avg episode reward: [(0, '0.096')] [2024-06-10 09:58:31,839][32415] Updated weights for policy 0, policy_version 5160 (0.0030) [2024-06-10 09:58:34,592][32177] Fps is (10 sec: 45893.2, 60 sec: 44782.9, 300 sec: 44875.5). 
Total num frames: 84688896. Throughput: 0: 44892.0. Samples: 84769260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 18.0) [2024-06-10 09:58:34,592][32177] Avg episode reward: [(0, '0.085')] [2024-06-10 09:58:34,619][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000005169_84688896.pth... [2024-06-10 09:58:34,697][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000004510_73891840.pth [2024-06-10 09:58:34,839][32415] Updated weights for policy 0, policy_version 5170 (0.0040) [2024-06-10 09:58:39,275][32415] Updated weights for policy 0, policy_version 5180 (0.0027) [2024-06-10 09:58:39,592][32177] Fps is (10 sec: 45873.8, 60 sec: 45601.9, 300 sec: 44708.8). Total num frames: 84885504. Throughput: 0: 44898.3. Samples: 85046820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-10 09:58:39,592][32177] Avg episode reward: [(0, '0.096')] [2024-06-10 09:58:39,594][32394] Saving new best policy, reward=0.096! [2024-06-10 09:58:42,023][32415] Updated weights for policy 0, policy_version 5190 (0.0027) [2024-06-10 09:58:44,593][32177] Fps is (10 sec: 39317.5, 60 sec: 44509.2, 300 sec: 44708.7). Total num frames: 85082112. Throughput: 0: 44961.7. Samples: 85169280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-10 09:58:44,593][32177] Avg episode reward: [(0, '0.105')] [2024-06-10 09:58:44,615][32394] Saving new best policy, reward=0.105! [2024-06-10 09:58:46,481][32415] Updated weights for policy 0, policy_version 5200 (0.0038) [2024-06-10 09:58:49,368][32415] Updated weights for policy 0, policy_version 5210 (0.0024) [2024-06-10 09:58:49,592][32177] Fps is (10 sec: 47514.8, 60 sec: 45055.9, 300 sec: 44819.9). Total num frames: 85360640. Throughput: 0: 44749.7. Samples: 85437460. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-10 09:58:49,592][32177] Avg episode reward: [(0, '0.091')] [2024-06-10 09:58:52,142][32394] Signal inference workers to stop experience collection... 
(1150 times) [2024-06-10 09:58:52,143][32394] Signal inference workers to resume experience collection... (1150 times) [2024-06-10 09:58:52,156][32415] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-06-10 09:58:52,157][32415] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-06-10 09:58:53,923][32415] Updated weights for policy 0, policy_version 5220 (0.0027) [2024-06-10 09:58:54,591][32177] Fps is (10 sec: 49157.4, 60 sec: 45602.1, 300 sec: 44820.0). Total num frames: 85573632. Throughput: 0: 44669.7. Samples: 85717880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-10 09:58:54,592][32177] Avg episode reward: [(0, '0.097')] [2024-06-10 09:58:56,419][32415] Updated weights for policy 0, policy_version 5230 (0.0046) [2024-06-10 09:58:59,592][32177] Fps is (10 sec: 39321.8, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 85753856. Throughput: 0: 44976.5. Samples: 85847300. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-10 09:58:59,592][32177] Avg episode reward: [(0, '0.109')] [2024-06-10 09:58:59,592][32394] Saving new best policy, reward=0.109! [2024-06-10 09:59:01,266][32415] Updated weights for policy 0, policy_version 5240 (0.0024) [2024-06-10 09:59:03,365][32415] Updated weights for policy 0, policy_version 5250 (0.0026) [2024-06-10 09:59:04,592][32177] Fps is (10 sec: 44236.6, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 86016000. Throughput: 0: 44771.1. Samples: 86114960. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-10 09:59:04,592][32177] Avg episode reward: [(0, '0.090')] [2024-06-10 09:59:08,645][32415] Updated weights for policy 0, policy_version 5260 (0.0036) [2024-06-10 09:59:09,592][32177] Fps is (10 sec: 50790.5, 60 sec: 45605.5, 300 sec: 44875.5). Total num frames: 86261760. Throughput: 0: 44765.3. Samples: 86389000. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 09:59:09,592][32177] Avg episode reward: [(0, '0.099')] [2024-06-10 09:59:11,051][32415] Updated weights for policy 0, policy_version 5270 (0.0029) [2024-06-10 09:59:14,592][32177] Fps is (10 sec: 40959.9, 60 sec: 44509.9, 300 sec: 44709.5). Total num frames: 86425600. Throughput: 0: 44839.6. Samples: 86518860. Policy #0 lag: (min: 0.0, avg: 13.5, max: 23.0) [2024-06-10 09:59:14,592][32177] Avg episode reward: [(0, '0.101')] [2024-06-10 09:59:15,913][32415] Updated weights for policy 0, policy_version 5280 (0.0039) [2024-06-10 09:59:18,687][32415] Updated weights for policy 0, policy_version 5290 (0.0040) [2024-06-10 09:59:19,592][32177] Fps is (10 sec: 40959.5, 60 sec: 44236.7, 300 sec: 44708.9). Total num frames: 86671360. Throughput: 0: 44696.4. Samples: 86780600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 09:59:19,592][32177] Avg episode reward: [(0, '0.100')] [2024-06-10 09:59:23,269][32415] Updated weights for policy 0, policy_version 5300 (0.0037) [2024-06-10 09:59:24,592][32177] Fps is (10 sec: 50790.4, 60 sec: 45058.9, 300 sec: 44876.2). Total num frames: 86933504. Throughput: 0: 44551.0. Samples: 87051600. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-10 09:59:24,592][32177] Avg episode reward: [(0, '0.103')] [2024-06-10 09:59:25,688][32415] Updated weights for policy 0, policy_version 5310 (0.0025) [2024-06-10 09:59:29,591][32177] Fps is (10 sec: 44237.5, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 87113728. Throughput: 0: 44928.2. Samples: 87191000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 09:59:29,592][32177] Avg episode reward: [(0, '0.117')] [2024-06-10 09:59:29,699][32394] Saving new best policy, reward=0.117! 
[2024-06-10 09:59:30,278][32415] Updated weights for policy 0, policy_version 5320 (0.0029) [2024-06-10 09:59:32,579][32415] Updated weights for policy 0, policy_version 5330 (0.0039) [2024-06-10 09:59:34,592][32177] Fps is (10 sec: 40960.0, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 87343104. Throughput: 0: 44704.9. Samples: 87449180. Policy #0 lag: (min: 0.0, avg: 13.9, max: 23.0) [2024-06-10 09:59:34,592][32177] Avg episode reward: [(0, '0.101')] [2024-06-10 09:59:37,814][32415] Updated weights for policy 0, policy_version 5340 (0.0035) [2024-06-10 09:59:39,592][32177] Fps is (10 sec: 50789.8, 60 sec: 45602.3, 300 sec: 44931.6). Total num frames: 87621632. Throughput: 0: 44568.8. Samples: 87723480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-10 09:59:39,592][32177] Avg episode reward: [(0, '0.097')] [2024-06-10 09:59:40,294][32415] Updated weights for policy 0, policy_version 5350 (0.0036) [2024-06-10 09:59:44,591][32177] Fps is (10 sec: 42598.7, 60 sec: 44783.8, 300 sec: 44764.7). Total num frames: 87769088. Throughput: 0: 44790.7. Samples: 87862880. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-06-10 09:59:44,592][32177] Avg episode reward: [(0, '0.097')] [2024-06-10 09:59:45,173][32415] Updated weights for policy 0, policy_version 5360 (0.0022) [2024-06-10 09:59:45,754][32394] Signal inference workers to stop experience collection... (1200 times) [2024-06-10 09:59:45,801][32415] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-06-10 09:59:45,808][32394] Signal inference workers to resume experience collection... (1200 times) [2024-06-10 09:59:45,819][32415] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-06-10 09:59:48,038][32415] Updated weights for policy 0, policy_version 5370 (0.0035) [2024-06-10 09:59:49,592][32177] Fps is (10 sec: 39320.4, 60 sec: 44236.6, 300 sec: 44764.4). Total num frames: 88014848. Throughput: 0: 44794.7. Samples: 88130740. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 09:59:49,593][32177] Avg episode reward: [(0, '0.095')] [2024-06-10 09:59:52,399][32415] Updated weights for policy 0, policy_version 5380 (0.0035) [2024-06-10 09:59:54,596][32177] Fps is (10 sec: 52405.2, 60 sec: 45325.7, 300 sec: 44930.4). Total num frames: 88293376. Throughput: 0: 44639.6. Samples: 88397980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 09:59:54,597][32177] Avg episode reward: [(0, '0.107')] [2024-06-10 09:59:54,918][32415] Updated weights for policy 0, policy_version 5390 (0.0035) [2024-06-10 09:59:59,430][32415] Updated weights for policy 0, policy_version 5400 (0.0036) [2024-06-10 09:59:59,596][32177] Fps is (10 sec: 45856.7, 60 sec: 45325.7, 300 sec: 44819.3). Total num frames: 88473600. Throughput: 0: 45211.2. Samples: 88553560. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-10 09:59:59,597][32177] Avg episode reward: [(0, '0.108')] [2024-06-10 10:00:02,712][32415] Updated weights for policy 0, policy_version 5410 (0.0039) [2024-06-10 10:00:04,596][32177] Fps is (10 sec: 37683.5, 60 sec: 44233.6, 300 sec: 44653.1). Total num frames: 88670208. Throughput: 0: 45045.9. Samples: 88807860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-10 10:00:04,597][32177] Avg episode reward: [(0, '0.097')] [2024-06-10 10:00:06,723][32415] Updated weights for policy 0, policy_version 5420 (0.0028) [2024-06-10 10:00:09,592][32177] Fps is (10 sec: 47533.9, 60 sec: 44782.8, 300 sec: 44875.9). Total num frames: 88948736. Throughput: 0: 44852.8. Samples: 89069980. Policy #0 lag: (min: 0.0, avg: 6.8, max: 17.0) [2024-06-10 10:00:09,592][32177] Avg episode reward: [(0, '0.112')] [2024-06-10 10:00:10,173][32415] Updated weights for policy 0, policy_version 5430 (0.0040) [2024-06-10 10:00:14,179][32415] Updated weights for policy 0, policy_version 5440 (0.0027) [2024-06-10 10:00:14,592][32177] Fps is (10 sec: 49173.2, 60 sec: 45602.1, 300 sec: 44820.0). 
Total num frames: 89161728. Throughput: 0: 45102.5. Samples: 89220620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 10:00:14,592][32177] Avg episode reward: [(0, '0.099')] [2024-06-10 10:00:17,117][32415] Updated weights for policy 0, policy_version 5450 (0.0032) [2024-06-10 10:00:19,592][32177] Fps is (10 sec: 39321.8, 60 sec: 44509.9, 300 sec: 44653.4). Total num frames: 89341952. Throughput: 0: 45354.2. Samples: 89490120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-10 10:00:19,592][32177] Avg episode reward: [(0, '0.106')] [2024-06-10 10:00:21,249][32415] Updated weights for policy 0, policy_version 5460 (0.0038) [2024-06-10 10:00:24,143][32415] Updated weights for policy 0, policy_version 5470 (0.0041) [2024-06-10 10:00:24,592][32177] Fps is (10 sec: 47513.7, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 89636864. Throughput: 0: 44988.9. Samples: 89747980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-10 10:00:24,592][32177] Avg episode reward: [(0, '0.109')] [2024-06-10 10:00:28,826][32415] Updated weights for policy 0, policy_version 5480 (0.0031) [2024-06-10 10:00:29,591][32177] Fps is (10 sec: 49152.6, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 89833472. Throughput: 0: 45332.9. Samples: 89902860. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-10 10:00:29,592][32177] Avg episode reward: [(0, '0.112')] [2024-06-10 10:00:31,468][32415] Updated weights for policy 0, policy_version 5490 (0.0032) [2024-06-10 10:00:34,592][32177] Fps is (10 sec: 37683.1, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 90013696. Throughput: 0: 45138.5. Samples: 90161960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-10 10:00:34,592][32177] Avg episode reward: [(0, '0.129')] [2024-06-10 10:00:34,605][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000005494_90013696.pth... 
[2024-06-10 10:00:34,661][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000004840_79298560.pth [2024-06-10 10:00:34,671][32394] Saving new best policy, reward=0.129! [2024-06-10 10:00:35,977][32415] Updated weights for policy 0, policy_version 5500 (0.0038) [2024-06-10 10:00:38,930][32415] Updated weights for policy 0, policy_version 5510 (0.0030) [2024-06-10 10:00:39,596][32177] Fps is (10 sec: 45854.9, 60 sec: 44506.7, 300 sec: 44763.9). Total num frames: 90292224. Throughput: 0: 44914.3. Samples: 90419120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 10:00:39,597][32177] Avg episode reward: [(0, '0.121')] [2024-06-10 10:00:43,438][32415] Updated weights for policy 0, policy_version 5520 (0.0029) [2024-06-10 10:00:43,942][32394] Signal inference workers to stop experience collection... (1250 times) [2024-06-10 10:00:43,942][32394] Signal inference workers to resume experience collection... (1250 times) [2024-06-10 10:00:43,982][32415] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-06-10 10:00:43,982][32415] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-06-10 10:00:44,592][32177] Fps is (10 sec: 50790.8, 60 sec: 45875.2, 300 sec: 44931.0). Total num frames: 90521600. Throughput: 0: 44735.0. Samples: 90566440. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) [2024-06-10 10:00:44,592][32177] Avg episode reward: [(0, '0.095')] [2024-06-10 10:00:45,969][32415] Updated weights for policy 0, policy_version 5530 (0.0032) [2024-06-10 10:00:49,592][32177] Fps is (10 sec: 39338.5, 60 sec: 44510.1, 300 sec: 44709.5). Total num frames: 90685440. Throughput: 0: 45147.4. Samples: 90839300. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-10 10:00:49,592][32177] Avg episode reward: [(0, '0.114')] [2024-06-10 10:00:50,702][32415] Updated weights for policy 0, policy_version 5540 (0.0030) [2024-06-10 10:00:53,197][32415] Updated weights for policy 0, policy_version 5550 (0.0030) [2024-06-10 10:00:54,592][32177] Fps is (10 sec: 42597.5, 60 sec: 44239.9, 300 sec: 44764.6). Total num frames: 90947584. Throughput: 0: 45074.6. Samples: 91098340. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-10 10:00:54,601][32177] Avg episode reward: [(0, '0.118')] [2024-06-10 10:00:57,917][32415] Updated weights for policy 0, policy_version 5560 (0.0041) [2024-06-10 10:00:59,592][32177] Fps is (10 sec: 50790.8, 60 sec: 45332.4, 300 sec: 44986.9). Total num frames: 91193344. Throughput: 0: 44933.4. Samples: 91242620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-10 10:00:59,592][32177] Avg episode reward: [(0, '0.107')] [2024-06-10 10:01:00,939][32415] Updated weights for policy 0, policy_version 5570 (0.0027) [2024-06-10 10:01:04,592][32177] Fps is (10 sec: 42599.1, 60 sec: 45059.3, 300 sec: 44764.4). Total num frames: 91373568. Throughput: 0: 44855.6. Samples: 91508620. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-10 10:01:04,592][32177] Avg episode reward: [(0, '0.116')] [2024-06-10 10:01:05,185][32415] Updated weights for policy 0, policy_version 5580 (0.0026) [2024-06-10 10:01:08,114][32415] Updated weights for policy 0, policy_version 5590 (0.0040) [2024-06-10 10:01:09,591][32177] Fps is (10 sec: 40960.2, 60 sec: 44236.9, 300 sec: 44708.9). Total num frames: 91602944. Throughput: 0: 44913.4. Samples: 91769080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-10 10:01:09,592][32177] Avg episode reward: [(0, '0.122')] [2024-06-10 10:01:12,478][32415] Updated weights for policy 0, policy_version 5600 (0.0021) [2024-06-10 10:01:14,592][32177] Fps is (10 sec: 49151.7, 60 sec: 45056.0, 300 sec: 44986.6). 
Total num frames: 91865088. Throughput: 0: 44694.9. Samples: 91914140. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-10 10:01:14,592][32177] Avg episode reward: [(0, '0.126')] [2024-06-10 10:01:15,282][32415] Updated weights for policy 0, policy_version 5610 (0.0032) [2024-06-10 10:01:19,592][32177] Fps is (10 sec: 44236.4, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 92045312. Throughput: 0: 44995.1. Samples: 92186740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-10 10:01:19,592][32177] Avg episode reward: [(0, '0.122')] [2024-06-10 10:01:20,039][32415] Updated weights for policy 0, policy_version 5620 (0.0036) [2024-06-10 10:01:22,857][32415] Updated weights for policy 0, policy_version 5630 (0.0029) [2024-06-10 10:01:24,592][32177] Fps is (10 sec: 42598.8, 60 sec: 44236.8, 300 sec: 44709.2). Total num frames: 92291072. Throughput: 0: 45168.8. Samples: 92451520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-10 10:01:24,592][32177] Avg episode reward: [(0, '0.108')] [2024-06-10 10:01:27,159][32415] Updated weights for policy 0, policy_version 5640 (0.0035) [2024-06-10 10:01:29,592][32177] Fps is (10 sec: 49151.9, 60 sec: 45055.9, 300 sec: 44931.0). Total num frames: 92536832. Throughput: 0: 44804.4. Samples: 92582640. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-10 10:01:29,592][32177] Avg episode reward: [(0, '0.122')] [2024-06-10 10:01:30,269][32415] Updated weights for policy 0, policy_version 5650 (0.0027) [2024-06-10 10:01:34,592][32177] Fps is (10 sec: 42598.0, 60 sec: 45056.0, 300 sec: 44764.6). Total num frames: 92717056. Throughput: 0: 44747.5. Samples: 92852940. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-10 10:01:34,592][32177] Avg episode reward: [(0, '0.118')] [2024-06-10 10:01:34,595][32415] Updated weights for policy 0, policy_version 5660 (0.0024) [2024-06-10 10:01:37,719][32415] Updated weights for policy 0, policy_version 5670 (0.0028) [2024-06-10 10:01:39,592][32177] Fps is (10 sec: 39320.5, 60 sec: 43966.7, 300 sec: 44654.0). Total num frames: 92930048. Throughput: 0: 44749.2. Samples: 93112060. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-10 10:01:39,601][32177] Avg episode reward: [(0, '0.124')] [2024-06-10 10:01:41,326][32394] Signal inference workers to stop experience collection... (1300 times) [2024-06-10 10:01:41,326][32394] Signal inference workers to resume experience collection... (1300 times) [2024-06-10 10:01:41,363][32415] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-06-10 10:01:41,363][32415] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-06-10 10:01:41,762][32415] Updated weights for policy 0, policy_version 5680 (0.0029) [2024-06-10 10:01:44,592][32177] Fps is (10 sec: 49152.4, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 93208576. Throughput: 0: 44493.3. Samples: 93244820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-10 10:01:44,592][32177] Avg episode reward: [(0, '0.118')] [2024-06-10 10:01:44,946][32415] Updated weights for policy 0, policy_version 5690 (0.0028) [2024-06-10 10:01:49,193][32415] Updated weights for policy 0, policy_version 5700 (0.0028) [2024-06-10 10:01:49,592][32177] Fps is (10 sec: 47514.8, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 93405184. Throughput: 0: 44836.4. Samples: 93526260. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-10 10:01:49,592][32177] Avg episode reward: [(0, '0.109')] [2024-06-10 10:01:52,083][32415] Updated weights for policy 0, policy_version 5710 (0.0028) [2024-06-10 10:01:54,592][32177] Fps is (10 sec: 40959.5, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 93618176. Throughput: 0: 44962.9. Samples: 93792420. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-10 10:01:54,592][32177] Avg episode reward: [(0, '0.130')] [2024-06-10 10:01:56,446][32415] Updated weights for policy 0, policy_version 5720 (0.0030) [2024-06-10 10:01:59,592][32177] Fps is (10 sec: 47514.1, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 93880320. Throughput: 0: 44472.6. Samples: 93915400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-10 10:01:59,592][32177] Avg episode reward: [(0, '0.119')] [2024-06-10 10:01:59,593][32415] Updated weights for policy 0, policy_version 5730 (0.0029) [2024-06-10 10:02:03,889][32415] Updated weights for policy 0, policy_version 5740 (0.0027) [2024-06-10 10:02:04,592][32177] Fps is (10 sec: 47514.2, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 94093312. Throughput: 0: 44606.2. Samples: 94194020. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-10 10:02:04,592][32177] Avg episode reward: [(0, '0.121')] [2024-06-10 10:02:06,847][32415] Updated weights for policy 0, policy_version 5750 (0.0049) [2024-06-10 10:02:09,592][32177] Fps is (10 sec: 40959.8, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 94289920. Throughput: 0: 44569.8. Samples: 94457160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 10:02:09,592][32177] Avg episode reward: [(0, '0.110')] [2024-06-10 10:02:11,379][32415] Updated weights for policy 0, policy_version 5760 (0.0028) [2024-06-10 10:02:14,274][32415] Updated weights for policy 0, policy_version 5770 (0.0029) [2024-06-10 10:02:14,592][32177] Fps is (10 sec: 45875.2, 60 sec: 44783.0, 300 sec: 44875.5). 
Total num frames: 94552064. Throughput: 0: 44539.2. Samples: 94586900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-10 10:02:14,592][32177] Avg episode reward: [(0, '0.136')] [2024-06-10 10:02:14,596][32394] Saving new best policy, reward=0.136! [2024-06-10 10:02:18,506][32415] Updated weights for policy 0, policy_version 5780 (0.0034) [2024-06-10 10:02:19,593][32177] Fps is (10 sec: 49144.4, 60 sec: 45601.0, 300 sec: 44986.5). Total num frames: 94781440. Throughput: 0: 44787.4. Samples: 94868440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-10 10:02:19,594][32177] Avg episode reward: [(0, '0.123')] [2024-06-10 10:02:21,846][32415] Updated weights for policy 0, policy_version 5790 (0.0035) [2024-06-10 10:02:24,592][32177] Fps is (10 sec: 39321.6, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 94945280. Throughput: 0: 44840.3. Samples: 95129860. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-06-10 10:02:24,592][32177] Avg episode reward: [(0, '0.129')] [2024-06-10 10:02:25,610][32415] Updated weights for policy 0, policy_version 5800 (0.0031) [2024-06-10 10:02:28,884][32415] Updated weights for policy 0, policy_version 5810 (0.0031) [2024-06-10 10:02:29,592][32177] Fps is (10 sec: 40965.9, 60 sec: 44236.7, 300 sec: 44708.9). Total num frames: 95191040. Throughput: 0: 44716.8. Samples: 95257080. Policy #0 lag: (min: 1.0, avg: 9.2, max: 24.0) [2024-06-10 10:02:29,592][32177] Avg episode reward: [(0, '0.123')] [2024-06-10 10:02:33,119][32415] Updated weights for policy 0, policy_version 5820 (0.0032) [2024-06-10 10:02:34,592][32177] Fps is (10 sec: 50789.8, 60 sec: 45602.1, 300 sec: 45097.6). Total num frames: 95453184. Throughput: 0: 44578.6. Samples: 95532300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-10 10:02:34,592][32177] Avg episode reward: [(0, '0.127')] [2024-06-10 10:02:34,599][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000005826_95453184.pth... 
[2024-06-10 10:02:34,645][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000005169_84688896.pth [2024-06-10 10:02:36,023][32415] Updated weights for policy 0, policy_version 5830 (0.0038) [2024-06-10 10:02:39,592][32177] Fps is (10 sec: 44236.5, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 95633408. Throughput: 0: 44780.4. Samples: 95807540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 10:02:39,592][32177] Avg episode reward: [(0, '0.137')] [2024-06-10 10:02:39,600][32394] Saving new best policy, reward=0.137! [2024-06-10 10:02:40,417][32415] Updated weights for policy 0, policy_version 5840 (0.0046) [2024-06-10 10:02:43,330][32415] Updated weights for policy 0, policy_version 5850 (0.0028) [2024-06-10 10:02:44,596][32177] Fps is (10 sec: 40942.8, 60 sec: 44233.6, 300 sec: 44763.8). Total num frames: 95862784. Throughput: 0: 44804.6. Samples: 95931800. Policy #0 lag: (min: 0.0, avg: 13.4, max: 24.0) [2024-06-10 10:02:44,596][32177] Avg episode reward: [(0, '0.131')] [2024-06-10 10:02:47,723][32415] Updated weights for policy 0, policy_version 5860 (0.0027) [2024-06-10 10:02:48,011][32394] Signal inference workers to stop experience collection... (1350 times) [2024-06-10 10:02:48,060][32394] Signal inference workers to resume experience collection... (1350 times) [2024-06-10 10:02:48,072][32415] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-06-10 10:02:48,072][32415] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-06-10 10:02:49,592][32177] Fps is (10 sec: 50791.1, 60 sec: 45602.2, 300 sec: 45097.6). Total num frames: 96141312. Throughput: 0: 44837.7. Samples: 96211720. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-10 10:02:49,592][32177] Avg episode reward: [(0, '0.118')] [2024-06-10 10:02:50,883][32415] Updated weights for policy 0, policy_version 5870 (0.0030) [2024-06-10 10:02:54,591][32177] Fps is (10 sec: 42617.1, 60 sec: 44510.0, 300 sec: 44764.4). Total num frames: 96288768. Throughput: 0: 45059.2. Samples: 96484820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 10:02:54,592][32177] Avg episode reward: [(0, '0.134')] [2024-06-10 10:02:54,950][32415] Updated weights for policy 0, policy_version 5880 (0.0041) [2024-06-10 10:02:57,895][32415] Updated weights for policy 0, policy_version 5890 (0.0023) [2024-06-10 10:02:59,592][32177] Fps is (10 sec: 37683.0, 60 sec: 43963.6, 300 sec: 44597.8). Total num frames: 96518144. Throughput: 0: 44837.2. Samples: 96604580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 10:02:59,592][32177] Avg episode reward: [(0, '0.129')] [2024-06-10 10:03:02,623][32415] Updated weights for policy 0, policy_version 5900 (0.0032) [2024-06-10 10:03:04,592][32177] Fps is (10 sec: 50789.8, 60 sec: 45056.0, 300 sec: 44987.2). Total num frames: 96796672. Throughput: 0: 44577.1. Samples: 96874340. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-10 10:03:04,592][32177] Avg episode reward: [(0, '0.128')] [2024-06-10 10:03:05,093][32415] Updated weights for policy 0, policy_version 5910 (0.0021) [2024-06-10 10:03:09,592][32177] Fps is (10 sec: 44237.3, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 96960512. Throughput: 0: 44794.2. Samples: 97145600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-10 10:03:09,592][32177] Avg episode reward: [(0, '0.130')] [2024-06-10 10:03:09,935][32415] Updated weights for policy 0, policy_version 5920 (0.0046) [2024-06-10 10:03:12,321][32415] Updated weights for policy 0, policy_version 5930 (0.0030) [2024-06-10 10:03:14,592][32177] Fps is (10 sec: 40960.1, 60 sec: 44236.8, 300 sec: 44708.9). 
Total num frames: 97206272. Throughput: 0: 44730.8. Samples: 97269960. Policy #0 lag: (min: 0.0, avg: 13.2, max: 24.0) [2024-06-10 10:03:14,592][32177] Avg episode reward: [(0, '0.121')] [2024-06-10 10:03:17,253][32415] Updated weights for policy 0, policy_version 5940 (0.0028) [2024-06-10 10:03:19,592][32177] Fps is (10 sec: 50789.1, 60 sec: 44783.9, 300 sec: 44876.1). Total num frames: 97468416. Throughput: 0: 44705.2. Samples: 97544040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 10:03:19,592][32177] Avg episode reward: [(0, '0.145')] [2024-06-10 10:03:19,593][32394] Saving new best policy, reward=0.145! [2024-06-10 10:03:19,801][32415] Updated weights for policy 0, policy_version 5950 (0.0036) [2024-06-10 10:03:24,336][32415] Updated weights for policy 0, policy_version 5960 (0.0035) [2024-06-10 10:03:24,592][32177] Fps is (10 sec: 45874.9, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 97665024. Throughput: 0: 44697.4. Samples: 97818920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 10:03:24,592][32177] Avg episode reward: [(0, '0.134')] [2024-06-10 10:03:26,746][32415] Updated weights for policy 0, policy_version 5970 (0.0042) [2024-06-10 10:03:29,593][32177] Fps is (10 sec: 40956.3, 60 sec: 44782.2, 300 sec: 44708.7). Total num frames: 97878016. Throughput: 0: 44628.0. Samples: 97939920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-10 10:03:29,593][32177] Avg episode reward: [(0, '0.126')] [2024-06-10 10:03:31,918][32415] Updated weights for policy 0, policy_version 5980 (0.0036) [2024-06-10 10:03:34,424][32415] Updated weights for policy 0, policy_version 5990 (0.0033) [2024-06-10 10:03:34,592][32177] Fps is (10 sec: 47513.6, 60 sec: 44782.9, 300 sec: 44931.1). Total num frames: 98140160. Throughput: 0: 44451.1. Samples: 98212020. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-10 10:03:34,592][32177] Avg episode reward: [(0, '0.124')] [2024-06-10 10:03:39,181][32415] Updated weights for policy 0, policy_version 6000 (0.0027) [2024-06-10 10:03:39,592][32177] Fps is (10 sec: 44241.5, 60 sec: 44783.0, 300 sec: 44875.6). Total num frames: 98320384. Throughput: 0: 44579.8. Samples: 98490920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-10 10:03:39,592][32177] Avg episode reward: [(0, '0.141')] [2024-06-10 10:03:41,424][32415] Updated weights for policy 0, policy_version 6010 (0.0030) [2024-06-10 10:03:44,592][32177] Fps is (10 sec: 40956.8, 60 sec: 44785.5, 300 sec: 44708.8). Total num frames: 98549760. Throughput: 0: 44700.6. Samples: 98616140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-10 10:03:44,593][32177] Avg episode reward: [(0, '0.140')] [2024-06-10 10:03:46,171][32394] Signal inference workers to stop experience collection... (1400 times) [2024-06-10 10:03:46,171][32394] Signal inference workers to resume experience collection... (1400 times) [2024-06-10 10:03:46,197][32415] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-06-10 10:03:46,197][32415] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-06-10 10:03:46,309][32415] Updated weights for policy 0, policy_version 6020 (0.0030) [2024-06-10 10:03:48,838][32415] Updated weights for policy 0, policy_version 6030 (0.0027) [2024-06-10 10:03:49,596][32177] Fps is (10 sec: 47493.4, 60 sec: 44233.6, 300 sec: 44819.3). Total num frames: 98795520. Throughput: 0: 44601.5. Samples: 98881600. Policy #0 lag: (min: 1.0, avg: 13.3, max: 22.0) [2024-06-10 10:03:49,597][32177] Avg episode reward: [(0, '0.133')] [2024-06-10 10:03:53,800][32415] Updated weights for policy 0, policy_version 6040 (0.0029) [2024-06-10 10:03:54,592][32177] Fps is (10 sec: 44240.4, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 98992128. Throughput: 0: 44588.4. Samples: 99152080. 
Policy #0 lag: (min: 1.0, avg: 13.3, max: 22.0) [2024-06-10 10:03:54,592][32177] Avg episode reward: [(0, '0.116')] [2024-06-10 10:03:56,173][32415] Updated weights for policy 0, policy_version 6050 (0.0035) [2024-06-10 10:03:59,591][32177] Fps is (10 sec: 40978.1, 60 sec: 44783.1, 300 sec: 44708.9). Total num frames: 99205120. Throughput: 0: 44664.5. Samples: 99279860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-10 10:03:59,592][32177] Avg episode reward: [(0, '0.133')] [2024-06-10 10:04:01,337][32415] Updated weights for policy 0, policy_version 6060 (0.0030) [2024-06-10 10:04:03,631][32415] Updated weights for policy 0, policy_version 6070 (0.0038) [2024-06-10 10:04:04,596][32177] Fps is (10 sec: 45855.5, 60 sec: 44233.6, 300 sec: 44708.2). Total num frames: 99450880. Throughput: 0: 44466.2. Samples: 99545200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 10:04:04,597][32177] Avg episode reward: [(0, '0.152')] [2024-06-10 10:04:04,609][32394] Saving new best policy, reward=0.152! [2024-06-10 10:04:08,421][32415] Updated weights for policy 0, policy_version 6080 (0.0042) [2024-06-10 10:04:09,592][32177] Fps is (10 sec: 47513.4, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 99680256. Throughput: 0: 44560.1. Samples: 99824120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 10:04:09,592][32177] Avg episode reward: [(0, '0.132')] [2024-06-10 10:04:10,941][32415] Updated weights for policy 0, policy_version 6090 (0.0031) [2024-06-10 10:04:14,592][32177] Fps is (10 sec: 42617.0, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 99876864. Throughput: 0: 44849.2. Samples: 99958080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-10 10:04:14,592][32177] Avg episode reward: [(0, '0.146')] [2024-06-10 10:04:15,708][32415] Updated weights for policy 0, policy_version 6100 (0.0027) [2024-06-10 10:04:18,227][32415] Updated weights for policy 0, policy_version 6110 (0.0028) [2024-06-10 10:04:19,592][32177] Fps is (10 sec: 44236.3, 60 sec: 44236.9, 300 sec: 44708.9). Total num frames: 100122624. Throughput: 0: 44750.2. Samples: 100225780. Policy #0 lag: (min: 0.0, avg: 12.9, max: 20.0) [2024-06-10 10:04:19,592][32177] Avg episode reward: [(0, '0.133')] [2024-06-10 10:04:22,993][32415] Updated weights for policy 0, policy_version 6120 (0.0031) [2024-06-10 10:04:24,596][32177] Fps is (10 sec: 47492.9, 60 sec: 44779.8, 300 sec: 44874.8). Total num frames: 100352000. Throughput: 0: 44682.9. Samples: 100501840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-10 10:04:24,596][32177] Avg episode reward: [(0, '0.138')] [2024-06-10 10:04:25,331][32415] Updated weights for policy 0, policy_version 6130 (0.0030) [2024-06-10 10:04:29,596][32177] Fps is (10 sec: 42580.2, 60 sec: 44507.5, 300 sec: 44763.8). Total num frames: 100548608. Throughput: 0: 44771.2. Samples: 100631000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-10 10:04:29,597][32177] Avg episode reward: [(0, '0.131')] [2024-06-10 10:04:30,228][32415] Updated weights for policy 0, policy_version 6140 (0.0048) [2024-06-10 10:04:32,785][32415] Updated weights for policy 0, policy_version 6150 (0.0025) [2024-06-10 10:04:34,592][32177] Fps is (10 sec: 42616.8, 60 sec: 43963.8, 300 sec: 44597.8). Total num frames: 100777984. Throughput: 0: 44734.5. Samples: 100894460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-10 10:04:34,592][32177] Avg episode reward: [(0, '0.143')] [2024-06-10 10:04:34,684][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000006152_100794368.pth... 
[2024-06-10 10:04:34,729][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000005494_90013696.pth [2024-06-10 10:04:37,469][32415] Updated weights for policy 0, policy_version 6160 (0.0023) [2024-06-10 10:04:39,592][32177] Fps is (10 sec: 50812.0, 60 sec: 45602.1, 300 sec: 45042.1). Total num frames: 101056512. Throughput: 0: 44826.6. Samples: 101169280. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-10 10:04:39,592][32177] Avg episode reward: [(0, '0.130')] [2024-06-10 10:04:40,224][32415] Updated weights for policy 0, policy_version 6170 (0.0031) [2024-06-10 10:04:44,592][32177] Fps is (10 sec: 44236.4, 60 sec: 44510.4, 300 sec: 44764.5). Total num frames: 101220352. Throughput: 0: 45194.9. Samples: 101313640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 10:04:44,592][32177] Avg episode reward: [(0, '0.138')] [2024-06-10 10:04:44,784][32415] Updated weights for policy 0, policy_version 6180 (0.0030) [2024-06-10 10:04:47,346][32415] Updated weights for policy 0, policy_version 6190 (0.0043) [2024-06-10 10:04:49,593][32177] Fps is (10 sec: 40955.7, 60 sec: 44512.2, 300 sec: 44653.8). Total num frames: 101466112. Throughput: 0: 45035.7. Samples: 101571660. Policy #0 lag: (min: 0.0, avg: 13.8, max: 22.0) [2024-06-10 10:04:49,593][32177] Avg episode reward: [(0, '0.132')] [2024-06-10 10:04:52,161][32415] Updated weights for policy 0, policy_version 6200 (0.0026) [2024-06-10 10:04:53,459][32394] Signal inference workers to stop experience collection... (1450 times) [2024-06-10 10:04:53,510][32415] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-06-10 10:04:53,516][32394] Signal inference workers to resume experience collection... (1450 times) [2024-06-10 10:04:53,525][32415] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-06-10 10:04:54,596][32177] Fps is (10 sec: 49131.0, 60 sec: 45325.8, 300 sec: 44875.5). Total num frames: 101711872. Throughput: 0: 44808.1. 
Samples: 101840680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-10 10:04:54,597][32177] Avg episode reward: [(0, '0.144')] [2024-06-10 10:04:55,002][32415] Updated weights for policy 0, policy_version 6210 (0.0029) [2024-06-10 10:04:59,555][32415] Updated weights for policy 0, policy_version 6220 (0.0028) [2024-06-10 10:04:59,592][32177] Fps is (10 sec: 44241.5, 60 sec: 45055.9, 300 sec: 44876.1). Total num frames: 101908480. Throughput: 0: 44783.5. Samples: 101973340. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-10 10:04:59,592][32177] Avg episode reward: [(0, '0.131')] [2024-06-10 10:05:02,747][32415] Updated weights for policy 0, policy_version 6230 (0.0038) [2024-06-10 10:05:04,592][32177] Fps is (10 sec: 40977.3, 60 sec: 44513.0, 300 sec: 44653.3). Total num frames: 102121472. Throughput: 0: 44690.6. Samples: 102236860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 10:05:04,593][32177] Avg episode reward: [(0, '0.140')] [2024-06-10 10:05:06,806][32415] Updated weights for policy 0, policy_version 6240 (0.0037) [2024-06-10 10:05:09,591][32177] Fps is (10 sec: 45876.0, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 102367232. Throughput: 0: 44533.7. Samples: 102505660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 10:05:09,592][32177] Avg episode reward: [(0, '0.132')] [2024-06-10 10:05:09,807][32415] Updated weights for policy 0, policy_version 6250 (0.0036) [2024-06-10 10:05:13,850][32415] Updated weights for policy 0, policy_version 6260 (0.0030) [2024-06-10 10:05:14,592][32177] Fps is (10 sec: 47514.2, 60 sec: 45329.0, 300 sec: 44931.0). Total num frames: 102596608. Throughput: 0: 44977.2. Samples: 102654780. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-10 10:05:14,601][32177] Avg episode reward: [(0, '0.131')] [2024-06-10 10:05:17,026][32415] Updated weights for policy 0, policy_version 6270 (0.0033) [2024-06-10 10:05:19,592][32177] Fps is (10 sec: 40959.3, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 102776832. Throughput: 0: 44950.6. Samples: 102917240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-10 10:05:19,601][32177] Avg episode reward: [(0, '0.142')] [2024-06-10 10:05:21,408][32415] Updated weights for policy 0, policy_version 6280 (0.0031) [2024-06-10 10:05:24,308][32415] Updated weights for policy 0, policy_version 6290 (0.0047) [2024-06-10 10:05:24,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45059.3, 300 sec: 44819.9). Total num frames: 103055360. Throughput: 0: 44703.2. Samples: 103180920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-10 10:05:24,592][32177] Avg episode reward: [(0, '0.137')] [2024-06-10 10:05:29,046][32415] Updated weights for policy 0, policy_version 6300 (0.0037) [2024-06-10 10:05:29,592][32177] Fps is (10 sec: 49152.5, 60 sec: 45332.4, 300 sec: 44931.0). Total num frames: 103268352. Throughput: 0: 44660.1. Samples: 103323340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-10 10:05:29,592][32177] Avg episode reward: [(0, '0.136')] [2024-06-10 10:05:31,829][32415] Updated weights for policy 0, policy_version 6310 (0.0027) [2024-06-10 10:05:34,592][32177] Fps is (10 sec: 40959.8, 60 sec: 44782.9, 300 sec: 44654.0). Total num frames: 103464960. Throughput: 0: 44977.5. Samples: 103595600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 10:05:34,604][32177] Avg episode reward: [(0, '0.153')] [2024-06-10 10:05:34,617][32394] Saving new best policy, reward=0.153! 
[2024-06-10 10:05:36,129][32415] Updated weights for policy 0, policy_version 6320 (0.0043) [2024-06-10 10:05:38,905][32415] Updated weights for policy 0, policy_version 6330 (0.0024) [2024-06-10 10:05:39,593][32177] Fps is (10 sec: 45868.7, 60 sec: 44508.9, 300 sec: 44764.2). Total num frames: 103727104. Throughput: 0: 44744.8. Samples: 103854060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 10:05:39,593][32177] Avg episode reward: [(0, '0.143')] [2024-06-10 10:05:43,413][32415] Updated weights for policy 0, policy_version 6340 (0.0032) [2024-06-10 10:05:44,592][32177] Fps is (10 sec: 47512.0, 60 sec: 45328.9, 300 sec: 44931.0). Total num frames: 103940096. Throughput: 0: 45065.9. Samples: 104001320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-10 10:05:44,593][32177] Avg episode reward: [(0, '0.148')] [2024-06-10 10:05:46,368][32415] Updated weights for policy 0, policy_version 6350 (0.0040) [2024-06-10 10:05:49,592][32177] Fps is (10 sec: 40965.6, 60 sec: 44510.7, 300 sec: 44708.9). Total num frames: 104136704. Throughput: 0: 45088.1. Samples: 104265820. Policy #0 lag: (min: 0.0, avg: 13.2, max: 22.0) [2024-06-10 10:05:49,592][32177] Avg episode reward: [(0, '0.143')] [2024-06-10 10:05:50,857][32415] Updated weights for policy 0, policy_version 6360 (0.0033) [2024-06-10 10:05:53,588][32415] Updated weights for policy 0, policy_version 6370 (0.0027) [2024-06-10 10:05:54,592][32177] Fps is (10 sec: 44238.1, 60 sec: 44513.0, 300 sec: 44708.9). Total num frames: 104382464. Throughput: 0: 44885.1. Samples: 104525500. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-10 10:05:54,592][32177] Avg episode reward: [(0, '0.145')] [2024-06-10 10:05:58,076][32415] Updated weights for policy 0, policy_version 6380 (0.0045) [2024-06-10 10:05:59,592][32177] Fps is (10 sec: 49151.5, 60 sec: 45329.0, 300 sec: 44931.0). Total num frames: 104628224. Throughput: 0: 44837.7. Samples: 104672480. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 18.0) [2024-06-10 10:05:59,592][32177] Avg episode reward: [(0, '0.152')] [2024-06-10 10:06:00,986][32415] Updated weights for policy 0, policy_version 6390 (0.0042) [2024-06-10 10:06:04,592][32177] Fps is (10 sec: 40960.2, 60 sec: 44510.0, 300 sec: 44708.9). Total num frames: 104792064. Throughput: 0: 44884.9. Samples: 104937060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 10:06:04,592][32177] Avg episode reward: [(0, '0.146')] [2024-06-10 10:06:05,285][32415] Updated weights for policy 0, policy_version 6400 (0.0032) [2024-06-10 10:06:08,055][32415] Updated weights for policy 0, policy_version 6410 (0.0038) [2024-06-10 10:06:09,596][32177] Fps is (10 sec: 42580.3, 60 sec: 44779.6, 300 sec: 44708.2). Total num frames: 105054208. Throughput: 0: 45052.5. Samples: 105208480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 10:06:09,597][32177] Avg episode reward: [(0, '0.151')] [2024-06-10 10:06:10,276][32394] Signal inference workers to stop experience collection... (1500 times) [2024-06-10 10:06:10,290][32415] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-06-10 10:06:10,340][32394] Signal inference workers to resume experience collection... (1500 times) [2024-06-10 10:06:10,340][32415] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-06-10 10:06:12,630][32415] Updated weights for policy 0, policy_version 6420 (0.0031) [2024-06-10 10:06:14,592][32177] Fps is (10 sec: 49151.1, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 105283584. Throughput: 0: 45051.3. Samples: 105350660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 10:06:14,592][32177] Avg episode reward: [(0, '0.142')] [2024-06-10 10:06:15,568][32415] Updated weights for policy 0, policy_version 6430 (0.0033) [2024-06-10 10:06:19,592][32177] Fps is (10 sec: 42617.1, 60 sec: 45056.1, 300 sec: 44708.9). Total num frames: 105480192. Throughput: 0: 44787.6. Samples: 105611040. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-10 10:06:19,592][32177] Avg episode reward: [(0, '0.150')] [2024-06-10 10:06:20,278][32415] Updated weights for policy 0, policy_version 6440 (0.0030) [2024-06-10 10:06:22,668][32415] Updated weights for policy 0, policy_version 6450 (0.0027) [2024-06-10 10:06:24,592][32177] Fps is (10 sec: 44237.7, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 105725952. Throughput: 0: 44865.8. Samples: 105872960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 10:06:24,592][32177] Avg episode reward: [(0, '0.157')] [2024-06-10 10:06:24,603][32394] Saving new best policy, reward=0.157! [2024-06-10 10:06:27,387][32415] Updated weights for policy 0, policy_version 6460 (0.0037) [2024-06-10 10:06:29,592][32177] Fps is (10 sec: 49152.1, 60 sec: 45056.0, 300 sec: 44931.1). Total num frames: 105971712. Throughput: 0: 44745.7. Samples: 106014860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-10 10:06:29,592][32177] Avg episode reward: [(0, '0.158')] [2024-06-10 10:06:30,106][32415] Updated weights for policy 0, policy_version 6470 (0.0020) [2024-06-10 10:06:34,592][32177] Fps is (10 sec: 44236.6, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 106168320. Throughput: 0: 44864.0. Samples: 106284700. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-10 10:06:34,592][32177] Avg episode reward: [(0, '0.146')] [2024-06-10 10:06:34,593][32415] Updated weights for policy 0, policy_version 6480 (0.0040) [2024-06-10 10:06:34,600][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000006480_106168320.pth... [2024-06-10 10:06:34,658][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000005826_95453184.pth [2024-06-10 10:06:37,233][32415] Updated weights for policy 0, policy_version 6490 (0.0025) [2024-06-10 10:06:39,592][32177] Fps is (10 sec: 42598.4, 60 sec: 44510.9, 300 sec: 44708.9). Total num frames: 106397696. Throughput: 0: 45005.0. 
Samples: 106550720. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-10 10:06:39,592][32177] Avg episode reward: [(0, '0.139')] [2024-06-10 10:06:42,141][32415] Updated weights for policy 0, policy_version 6500 (0.0050) [2024-06-10 10:06:44,539][32415] Updated weights for policy 0, policy_version 6510 (0.0022) [2024-06-10 10:06:44,592][32177] Fps is (10 sec: 49151.7, 60 sec: 45329.3, 300 sec: 44931.0). Total num frames: 106659840. Throughput: 0: 44786.3. Samples: 106687860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-10 10:06:44,592][32177] Avg episode reward: [(0, '0.164')] [2024-06-10 10:06:44,605][32394] Saving new best policy, reward=0.164! [2024-06-10 10:06:49,573][32415] Updated weights for policy 0, policy_version 6520 (0.0035) [2024-06-10 10:06:49,592][32177] Fps is (10 sec: 42598.3, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 106823680. Throughput: 0: 44751.6. Samples: 106950880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 10:06:49,592][32177] Avg episode reward: [(0, '0.145')] [2024-06-10 10:06:52,171][32415] Updated weights for policy 0, policy_version 6530 (0.0029) [2024-06-10 10:06:54,592][32177] Fps is (10 sec: 39321.4, 60 sec: 44509.8, 300 sec: 44653.3). Total num frames: 107053056. Throughput: 0: 44544.2. Samples: 107212780. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-10 10:06:54,592][32177] Avg episode reward: [(0, '0.132')] [2024-06-10 10:06:56,607][32415] Updated weights for policy 0, policy_version 6540 (0.0028) [2024-06-10 10:06:59,452][32415] Updated weights for policy 0, policy_version 6550 (0.0034) [2024-06-10 10:06:59,592][32177] Fps is (10 sec: 49152.1, 60 sec: 44783.1, 300 sec: 44820.0). Total num frames: 107315200. Throughput: 0: 44450.9. Samples: 107350940. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 10:06:59,592][32177] Avg episode reward: [(0, '0.149')] [2024-06-10 10:07:04,002][32415] Updated weights for policy 0, policy_version 6560 (0.0021) [2024-06-10 10:07:04,594][32177] Fps is (10 sec: 44226.9, 60 sec: 45054.2, 300 sec: 44764.1). Total num frames: 107495424. Throughput: 0: 44694.5. Samples: 107622400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 10:07:04,595][32177] Avg episode reward: [(0, '0.143')] [2024-06-10 10:07:06,734][32415] Updated weights for policy 0, policy_version 6570 (0.0031) [2024-06-10 10:07:09,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44786.2, 300 sec: 44708.9). Total num frames: 107741184. Throughput: 0: 44821.3. Samples: 107889920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-10 10:07:09,592][32177] Avg episode reward: [(0, '0.144')] [2024-06-10 10:07:11,513][32415] Updated weights for policy 0, policy_version 6580 (0.0030) [2024-06-10 10:07:12,513][32394] Signal inference workers to stop experience collection... (1550 times) [2024-06-10 10:07:12,562][32415] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-06-10 10:07:12,619][32394] Signal inference workers to resume experience collection... (1550 times) [2024-06-10 10:07:12,619][32415] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-06-10 10:07:14,102][32415] Updated weights for policy 0, policy_version 6590 (0.0042) [2024-06-10 10:07:14,593][32177] Fps is (10 sec: 49157.7, 60 sec: 45055.2, 300 sec: 44764.5). Total num frames: 107986944. Throughput: 0: 44597.9. Samples: 108021820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-10 10:07:14,593][32177] Avg episode reward: [(0, '0.161')] [2024-06-10 10:07:18,568][32415] Updated weights for policy 0, policy_version 6600 (0.0033) [2024-06-10 10:07:19,592][32177] Fps is (10 sec: 40960.2, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 108150784. Throughput: 0: 44428.5. Samples: 108283980. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 10:07:19,592][32177] Avg episode reward: [(0, '0.153')] [2024-06-10 10:07:21,561][32415] Updated weights for policy 0, policy_version 6610 (0.0029) [2024-06-10 10:07:24,592][32177] Fps is (10 sec: 40964.4, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 108396544. Throughput: 0: 44475.0. Samples: 108552100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 10:07:24,592][32177] Avg episode reward: [(0, '0.157')] [2024-06-10 10:07:25,665][32415] Updated weights for policy 0, policy_version 6620 (0.0044) [2024-06-10 10:07:28,790][32415] Updated weights for policy 0, policy_version 6630 (0.0033) [2024-06-10 10:07:29,592][32177] Fps is (10 sec: 49152.0, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 108642304. Throughput: 0: 44545.4. Samples: 108692400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 10:07:29,600][32177] Avg episode reward: [(0, '0.129')] [2024-06-10 10:07:33,342][32415] Updated weights for policy 0, policy_version 6640 (0.0037) [2024-06-10 10:07:34,594][32177] Fps is (10 sec: 44227.3, 60 sec: 44508.2, 300 sec: 44764.1). Total num frames: 108838912. Throughput: 0: 44689.7. Samples: 108962020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 10:07:34,594][32177] Avg episode reward: [(0, '0.144')] [2024-06-10 10:07:36,137][32415] Updated weights for policy 0, policy_version 6650 (0.0030) [2024-06-10 10:07:39,596][32177] Fps is (10 sec: 42579.8, 60 sec: 44506.6, 300 sec: 44764.4). Total num frames: 109068288. Throughput: 0: 44608.3. Samples: 109220340. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-10 10:07:39,597][32177] Avg episode reward: [(0, '0.161')] [2024-06-10 10:07:41,103][32415] Updated weights for policy 0, policy_version 6660 (0.0041) [2024-06-10 10:07:43,395][32415] Updated weights for policy 0, policy_version 6670 (0.0038) [2024-06-10 10:07:44,592][32177] Fps is (10 sec: 47524.6, 60 sec: 44236.9, 300 sec: 44653.4). 
Total num frames: 109314048. Throughput: 0: 44717.8. Samples: 109363240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-10 10:07:44,592][32177] Avg episode reward: [(0, '0.154')] [2024-06-10 10:07:48,244][32415] Updated weights for policy 0, policy_version 6680 (0.0031) [2024-06-10 10:07:49,591][32177] Fps is (10 sec: 40978.1, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 109477888. Throughput: 0: 44444.2. Samples: 109622280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-10 10:07:49,592][32177] Avg episode reward: [(0, '0.177')] [2024-06-10 10:07:49,608][32394] Saving new best policy, reward=0.177! [2024-06-10 10:07:51,058][32415] Updated weights for policy 0, policy_version 6690 (0.0026) [2024-06-10 10:07:54,592][32177] Fps is (10 sec: 42597.8, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 109740032. Throughput: 0: 44338.1. Samples: 109885140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 10:07:54,592][32177] Avg episode reward: [(0, '0.150')] [2024-06-10 10:07:55,323][32415] Updated weights for policy 0, policy_version 6700 (0.0037) [2024-06-10 10:07:58,301][32415] Updated weights for policy 0, policy_version 6710 (0.0035) [2024-06-10 10:07:59,592][32177] Fps is (10 sec: 50789.9, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 109985792. Throughput: 0: 44541.6. Samples: 110026140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 10:07:59,592][32177] Avg episode reward: [(0, '0.159')] [2024-06-10 10:08:02,710][32415] Updated weights for policy 0, policy_version 6720 (0.0042) [2024-06-10 10:08:04,592][32177] Fps is (10 sec: 44237.0, 60 sec: 44784.6, 300 sec: 44819.9). Total num frames: 110182400. Throughput: 0: 44704.8. Samples: 110295700. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-10 10:08:04,592][32177] Avg episode reward: [(0, '0.162')] [2024-06-10 10:08:05,696][32415] Updated weights for policy 0, policy_version 6730 (0.0023) [2024-06-10 10:08:09,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 110411776. Throughput: 0: 44606.3. Samples: 110559380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 10:08:09,592][32177] Avg episode reward: [(0, '0.179')] [2024-06-10 10:08:09,593][32394] Saving new best policy, reward=0.179! [2024-06-10 10:08:10,318][32415] Updated weights for policy 0, policy_version 6740 (0.0033) [2024-06-10 10:08:12,980][32415] Updated weights for policy 0, policy_version 6750 (0.0032) [2024-06-10 10:08:14,592][32177] Fps is (10 sec: 45875.2, 60 sec: 44237.6, 300 sec: 44653.4). Total num frames: 110641152. Throughput: 0: 44612.8. Samples: 110699980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-10 10:08:14,592][32177] Avg episode reward: [(0, '0.154')] [2024-06-10 10:08:17,453][32415] Updated weights for policy 0, policy_version 6760 (0.0034) [2024-06-10 10:08:19,592][32177] Fps is (10 sec: 42598.4, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 110837760. Throughput: 0: 44492.4. Samples: 110964080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-10 10:08:19,592][32177] Avg episode reward: [(0, '0.166')] [2024-06-10 10:08:19,701][32394] Signal inference workers to stop experience collection... (1600 times) [2024-06-10 10:08:19,731][32415] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-06-10 10:08:19,765][32394] Signal inference workers to resume experience collection... 
(1600 times) [2024-06-10 10:08:19,765][32415] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-06-10 10:08:20,484][32415] Updated weights for policy 0, policy_version 6770 (0.0026) [2024-06-10 10:08:24,401][32415] Updated weights for policy 0, policy_version 6780 (0.0037) [2024-06-10 10:08:24,592][32177] Fps is (10 sec: 44236.8, 60 sec: 44783.0, 300 sec: 44764.6). Total num frames: 111083520. Throughput: 0: 44583.8. Samples: 111226420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 10:08:24,592][32177] Avg episode reward: [(0, '0.166')] [2024-06-10 10:08:27,702][32415] Updated weights for policy 0, policy_version 6790 (0.0029) [2024-06-10 10:08:29,592][32177] Fps is (10 sec: 49152.3, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 111329280. Throughput: 0: 44527.5. Samples: 111366980. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-10 10:08:29,592][32177] Avg episode reward: [(0, '0.148')] [2024-06-10 10:08:31,819][32415] Updated weights for policy 0, policy_version 6800 (0.0033) [2024-06-10 10:08:34,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45057.6, 300 sec: 44820.0). Total num frames: 111542272. Throughput: 0: 44936.7. Samples: 111644440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-10 10:08:34,592][32177] Avg episode reward: [(0, '0.149')] [2024-06-10 10:08:34,604][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000006808_111542272.pth... [2024-06-10 10:08:34,670][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000006152_100794368.pth [2024-06-10 10:08:35,057][32415] Updated weights for policy 0, policy_version 6810 (0.0031) [2024-06-10 10:08:39,144][32415] Updated weights for policy 0, policy_version 6820 (0.0037) [2024-06-10 10:08:39,592][32177] Fps is (10 sec: 40959.7, 60 sec: 44513.0, 300 sec: 44709.0). Total num frames: 111738880. Throughput: 0: 44905.8. Samples: 111905900. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 10:08:39,592][32177] Avg episode reward: [(0, '0.160')] [2024-06-10 10:08:42,296][32415] Updated weights for policy 0, policy_version 6830 (0.0047) [2024-06-10 10:08:44,592][32177] Fps is (10 sec: 44237.4, 60 sec: 44509.9, 300 sec: 44709.5). Total num frames: 111984640. Throughput: 0: 44638.3. Samples: 112034860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 10:08:44,592][32177] Avg episode reward: [(0, '0.162')] [2024-06-10 10:08:46,234][32415] Updated weights for policy 0, policy_version 6840 (0.0028) [2024-06-10 10:08:49,596][32177] Fps is (10 sec: 45855.6, 60 sec: 45325.7, 300 sec: 44763.8). Total num frames: 112197632. Throughput: 0: 44697.1. Samples: 112307260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 10:08:49,597][32177] Avg episode reward: [(0, '0.156')] [2024-06-10 10:08:49,843][32415] Updated weights for policy 0, policy_version 6850 (0.0020) [2024-06-10 10:08:53,478][32415] Updated weights for policy 0, policy_version 6860 (0.0027) [2024-06-10 10:08:54,592][32177] Fps is (10 sec: 40959.9, 60 sec: 44236.9, 300 sec: 44708.9). Total num frames: 112394240. Throughput: 0: 44829.0. Samples: 112576680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-10 10:08:54,592][32177] Avg episode reward: [(0, '0.150')] [2024-06-10 10:08:57,044][32415] Updated weights for policy 0, policy_version 6870 (0.0040) [2024-06-10 10:08:59,596][32177] Fps is (10 sec: 45875.2, 60 sec: 44506.7, 300 sec: 44764.4). Total num frames: 112656384. Throughput: 0: 44610.0. Samples: 112707620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-10 10:08:59,596][32177] Avg episode reward: [(0, '0.157')] [2024-06-10 10:09:01,157][32415] Updated weights for policy 0, policy_version 6880 (0.0028) [2024-06-10 10:09:04,305][32415] Updated weights for policy 0, policy_version 6890 (0.0041) [2024-06-10 10:09:04,592][32177] Fps is (10 sec: 50789.6, 60 sec: 45329.0, 300 sec: 44819.9). 
Total num frames: 112902144. Throughput: 0: 44895.5. Samples: 112984380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-10 10:09:04,592][32177] Avg episode reward: [(0, '0.152')] [2024-06-10 10:09:08,546][32415] Updated weights for policy 0, policy_version 6900 (0.0029) [2024-06-10 10:09:09,592][32177] Fps is (10 sec: 42616.2, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 113082368. Throughput: 0: 45127.5. Samples: 113257160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-10 10:09:09,592][32177] Avg episode reward: [(0, '0.166')] [2024-06-10 10:09:11,636][32415] Updated weights for policy 0, policy_version 6910 (0.0033) [2024-06-10 10:09:14,594][32177] Fps is (10 sec: 42588.3, 60 sec: 44781.1, 300 sec: 44764.0). Total num frames: 113328128. Throughput: 0: 44690.8. Samples: 113378180. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-10 10:09:14,595][32177] Avg episode reward: [(0, '0.169')] [2024-06-10 10:09:15,429][32415] Updated weights for policy 0, policy_version 6920 (0.0031) [2024-06-10 10:09:19,080][32415] Updated weights for policy 0, policy_version 6930 (0.0036) [2024-06-10 10:09:19,592][32177] Fps is (10 sec: 49152.8, 60 sec: 45602.2, 300 sec: 44820.6). Total num frames: 113573888. Throughput: 0: 44676.1. Samples: 113654860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-10 10:09:19,592][32177] Avg episode reward: [(0, '0.160')] [2024-06-10 10:09:22,824][32415] Updated weights for policy 0, policy_version 6940 (0.0027) [2024-06-10 10:09:24,594][32177] Fps is (10 sec: 42597.3, 60 sec: 44507.9, 300 sec: 44764.7). Total num frames: 113754112. Throughput: 0: 45037.7. Samples: 113932720. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-10 10:09:24,595][32177] Avg episode reward: [(0, '0.159')] [2024-06-10 10:09:26,135][32415] Updated weights for policy 0, policy_version 6950 (0.0025) [2024-06-10 10:09:29,592][32177] Fps is (10 sec: 42597.9, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 113999872. 
Throughput: 0: 44887.0. Samples: 114054780. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-10 10:09:29,592][32177] Avg episode reward: [(0, '0.161')] [2024-06-10 10:09:30,622][32415] Updated weights for policy 0, policy_version 6960 (0.0025) [2024-06-10 10:09:31,647][32394] Signal inference workers to stop experience collection... (1650 times) [2024-06-10 10:09:31,647][32394] Signal inference workers to resume experience collection... (1650 times) [2024-06-10 10:09:31,668][32415] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-06-10 10:09:31,668][32415] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-06-10 10:09:33,367][32415] Updated weights for policy 0, policy_version 6970 (0.0030) [2024-06-10 10:09:34,592][32177] Fps is (10 sec: 49165.0, 60 sec: 45056.0, 300 sec: 44708.9). Total num frames: 114245632. Throughput: 0: 44910.4. Samples: 114328040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 10:09:34,601][32177] Avg episode reward: [(0, '0.156')] [2024-06-10 10:09:37,814][32415] Updated weights for policy 0, policy_version 6980 (0.0036) [2024-06-10 10:09:39,592][32177] Fps is (10 sec: 44236.6, 60 sec: 45055.9, 300 sec: 44819.9). Total num frames: 114442240. Throughput: 0: 44865.6. Samples: 114595640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 10:09:39,592][32177] Avg episode reward: [(0, '0.160')] [2024-06-10 10:09:40,878][32415] Updated weights for policy 0, policy_version 6990 (0.0035) [2024-06-10 10:09:44,592][32177] Fps is (10 sec: 40960.7, 60 sec: 44509.9, 300 sec: 44709.1). Total num frames: 114655232. Throughput: 0: 44772.4. Samples: 114722180. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 10:09:44,592][32177] Avg episode reward: [(0, '0.158')] [2024-06-10 10:09:44,994][32415] Updated weights for policy 0, policy_version 7000 (0.0046) [2024-06-10 10:09:48,417][32415] Updated weights for policy 0, policy_version 7010 (0.0033) [2024-06-10 10:09:49,592][32177] Fps is (10 sec: 47513.6, 60 sec: 45332.2, 300 sec: 44765.1). Total num frames: 114917376. Throughput: 0: 44671.1. Samples: 114994580. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-10 10:09:49,593][32177] Avg episode reward: [(0, '0.175')] [2024-06-10 10:09:52,283][32415] Updated weights for policy 0, policy_version 7020 (0.0027) [2024-06-10 10:09:54,592][32177] Fps is (10 sec: 45874.5, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 115113984. Throughput: 0: 44656.9. Samples: 115266720. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-10 10:09:54,592][32177] Avg episode reward: [(0, '0.172')] [2024-06-10 10:09:55,587][32415] Updated weights for policy 0, policy_version 7030 (0.0034) [2024-06-10 10:09:59,592][32177] Fps is (10 sec: 40959.3, 60 sec: 44512.8, 300 sec: 44764.4). Total num frames: 115326976. Throughput: 0: 44866.2. Samples: 115397060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 10:09:59,601][32177] Avg episode reward: [(0, '0.174')] [2024-06-10 10:10:00,129][32415] Updated weights for policy 0, policy_version 7040 (0.0040) [2024-06-10 10:10:02,715][32415] Updated weights for policy 0, policy_version 7050 (0.0029) [2024-06-10 10:10:04,592][32177] Fps is (10 sec: 45875.5, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 115572736. Throughput: 0: 44692.4. Samples: 115666020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 10:10:04,592][32177] Avg episode reward: [(0, '0.169')] [2024-06-10 10:10:07,359][32415] Updated weights for policy 0, policy_version 7060 (0.0037) [2024-06-10 10:10:09,592][32177] Fps is (10 sec: 45876.8, 60 sec: 45056.1, 300 sec: 44708.9). 
Total num frames: 115785728. Throughput: 0: 44508.1. Samples: 115935460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-10 10:10:09,592][32177] Avg episode reward: [(0, '0.153')] [2024-06-10 10:10:10,284][32415] Updated weights for policy 0, policy_version 7070 (0.0040) [2024-06-10 10:10:14,592][32177] Fps is (10 sec: 40959.9, 60 sec: 44238.6, 300 sec: 44764.4). Total num frames: 115982336. Throughput: 0: 44598.3. Samples: 116061700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-10 10:10:14,592][32177] Avg episode reward: [(0, '0.159')] [2024-06-10 10:10:14,884][32415] Updated weights for policy 0, policy_version 7080 (0.0041) [2024-06-10 10:10:17,951][32415] Updated weights for policy 0, policy_version 7090 (0.0039) [2024-06-10 10:10:19,592][32177] Fps is (10 sec: 45875.1, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 116244480. Throughput: 0: 44669.9. Samples: 116338180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 10:10:19,592][32177] Avg episode reward: [(0, '0.170')] [2024-06-10 10:10:22,259][32415] Updated weights for policy 0, policy_version 7100 (0.0044) [2024-06-10 10:10:24,592][32177] Fps is (10 sec: 47514.2, 60 sec: 45058.1, 300 sec: 44708.9). Total num frames: 116457472. Throughput: 0: 44778.4. Samples: 116610660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 10:10:24,592][32177] Avg episode reward: [(0, '0.164')] [2024-06-10 10:10:25,005][32415] Updated weights for policy 0, policy_version 7110 (0.0037) [2024-06-10 10:10:29,456][32415] Updated weights for policy 0, policy_version 7120 (0.0029) [2024-06-10 10:10:29,592][32177] Fps is (10 sec: 40960.2, 60 sec: 44236.9, 300 sec: 44708.9). Total num frames: 116654080. Throughput: 0: 44886.2. Samples: 116742060. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 10:10:29,592][32177] Avg episode reward: [(0, '0.180')] [2024-06-10 10:10:32,059][32415] Updated weights for policy 0, policy_version 7130 (0.0032) [2024-06-10 10:10:33,590][32394] Signal inference workers to stop experience collection... (1700 times) [2024-06-10 10:10:33,590][32394] Signal inference workers to resume experience collection... (1700 times) [2024-06-10 10:10:33,642][32415] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-06-10 10:10:33,642][32415] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-06-10 10:10:34,594][32177] Fps is (10 sec: 45863.3, 60 sec: 44508.1, 300 sec: 44708.7). Total num frames: 116916224. Throughput: 0: 44771.9. Samples: 117009420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 10:10:34,595][32177] Avg episode reward: [(0, '0.170')] [2024-06-10 10:10:34,606][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000007136_116916224.pth... [2024-06-10 10:10:34,659][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000006480_106168320.pth [2024-06-10 10:10:36,582][32415] Updated weights for policy 0, policy_version 7140 (0.0036) [2024-06-10 10:10:39,458][32415] Updated weights for policy 0, policy_version 7150 (0.0032) [2024-06-10 10:10:39,592][32177] Fps is (10 sec: 49151.2, 60 sec: 45056.0, 300 sec: 44764.5). Total num frames: 117145600. Throughput: 0: 44763.1. Samples: 117281060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 10:10:39,592][32177] Avg episode reward: [(0, '0.174')] [2024-06-10 10:10:43,909][32415] Updated weights for policy 0, policy_version 7160 (0.0034) [2024-06-10 10:10:44,592][32177] Fps is (10 sec: 42609.3, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 117342208. Throughput: 0: 44859.0. Samples: 117415700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-10 10:10:44,592][32177] Avg episode reward: [(0, '0.176')] [2024-06-10 10:10:47,013][32415] Updated weights for policy 0, policy_version 7170 (0.0034) [2024-06-10 10:10:49,592][32177] Fps is (10 sec: 42597.7, 60 sec: 44236.7, 300 sec: 44708.8). Total num frames: 117571584. Throughput: 0: 44698.9. Samples: 117677480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-10 10:10:49,592][32177] Avg episode reward: [(0, '0.162')] [2024-06-10 10:10:51,362][32415] Updated weights for policy 0, policy_version 7180 (0.0034) [2024-06-10 10:10:53,953][32415] Updated weights for policy 0, policy_version 7190 (0.0021) [2024-06-10 10:10:54,592][32177] Fps is (10 sec: 47513.1, 60 sec: 45056.0, 300 sec: 44708.9). Total num frames: 117817344. Throughput: 0: 44803.9. Samples: 117951640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-10 10:10:54,592][32177] Avg episode reward: [(0, '0.159')] [2024-06-10 10:10:58,404][32415] Updated weights for policy 0, policy_version 7200 (0.0034) [2024-06-10 10:10:59,592][32177] Fps is (10 sec: 45876.3, 60 sec: 45056.2, 300 sec: 44875.5). Total num frames: 118030336. Throughput: 0: 45182.7. Samples: 118094920. Policy #0 lag: (min: 2.0, avg: 12.3, max: 23.0) [2024-06-10 10:10:59,592][32177] Avg episode reward: [(0, '0.161')] [2024-06-10 10:11:01,207][32415] Updated weights for policy 0, policy_version 7210 (0.0031) [2024-06-10 10:11:04,592][32177] Fps is (10 sec: 42598.9, 60 sec: 44509.9, 300 sec: 44709.5). Total num frames: 118243328. Throughput: 0: 44816.0. Samples: 118354900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-10 10:11:04,592][32177] Avg episode reward: [(0, '0.168')] [2024-06-10 10:11:05,897][32415] Updated weights for policy 0, policy_version 7220 (0.0042) [2024-06-10 10:11:08,792][32415] Updated weights for policy 0, policy_version 7230 (0.0037) [2024-06-10 10:11:09,592][32177] Fps is (10 sec: 45875.8, 60 sec: 45056.0, 300 sec: 44764.5). 
Total num frames: 118489088. Throughput: 0: 44700.0. Samples: 118622160. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-10 10:11:09,592][32177] Avg episode reward: [(0, '0.159')] [2024-06-10 10:11:13,137][32415] Updated weights for policy 0, policy_version 7240 (0.0033) [2024-06-10 10:11:14,592][32177] Fps is (10 sec: 44236.3, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 118685696. Throughput: 0: 44955.4. Samples: 118765060. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-10 10:11:14,592][32177] Avg episode reward: [(0, '0.164')] [2024-06-10 10:11:15,992][32415] Updated weights for policy 0, policy_version 7250 (0.0042) [2024-06-10 10:11:19,592][32177] Fps is (10 sec: 40959.8, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 118898688. Throughput: 0: 44841.2. Samples: 119027160. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-10 10:11:19,592][32177] Avg episode reward: [(0, '0.167')] [2024-06-10 10:11:20,453][32415] Updated weights for policy 0, policy_version 7260 (0.0026) [2024-06-10 10:11:23,106][32415] Updated weights for policy 0, policy_version 7270 (0.0028) [2024-06-10 10:11:24,591][32177] Fps is (10 sec: 47514.2, 60 sec: 45056.0, 300 sec: 44708.9). Total num frames: 119160832. Throughput: 0: 44943.3. Samples: 119303500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-10 10:11:24,592][32177] Avg episode reward: [(0, '0.166')] [2024-06-10 10:11:27,950][32415] Updated weights for policy 0, policy_version 7280 (0.0043) [2024-06-10 10:11:29,592][32177] Fps is (10 sec: 49152.0, 60 sec: 45602.1, 300 sec: 44820.0). Total num frames: 119390208. Throughput: 0: 45145.3. Samples: 119447240. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-10 10:11:29,592][32177] Avg episode reward: [(0, '0.160')] [2024-06-10 10:11:30,278][32415] Updated weights for policy 0, policy_version 7290 (0.0046) [2024-06-10 10:11:34,592][32177] Fps is (10 sec: 40959.6, 60 sec: 44238.7, 300 sec: 44653.3). Total num frames: 119570432. 
Throughput: 0: 45147.8. Samples: 119709120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 10:11:34,592][32177] Avg episode reward: [(0, '0.160')] [2024-06-10 10:11:35,018][32415] Updated weights for policy 0, policy_version 7300 (0.0040) [2024-06-10 10:11:38,034][32415] Updated weights for policy 0, policy_version 7310 (0.0023) [2024-06-10 10:11:39,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45056.1, 300 sec: 44708.9). Total num frames: 119848960. Throughput: 0: 44861.4. Samples: 119970400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 10:11:39,592][32177] Avg episode reward: [(0, '0.166')] [2024-06-10 10:11:42,362][32415] Updated weights for policy 0, policy_version 7320 (0.0035) [2024-06-10 10:11:44,592][32177] Fps is (10 sec: 47512.2, 60 sec: 45055.7, 300 sec: 44819.9). Total num frames: 120045568. Throughput: 0: 44818.4. Samples: 120111760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-10 10:11:44,592][32177] Avg episode reward: [(0, '0.178')] [2024-06-10 10:11:45,042][32415] Updated weights for policy 0, policy_version 7330 (0.0026) [2024-06-10 10:11:49,522][32415] Updated weights for policy 0, policy_version 7340 (0.0035) [2024-06-10 10:11:49,592][32177] Fps is (10 sec: 40959.3, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 120258560. Throughput: 0: 45139.4. Samples: 120386180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 10:11:49,592][32177] Avg episode reward: [(0, '0.165')] [2024-06-10 10:11:52,190][32415] Updated weights for policy 0, policy_version 7350 (0.0036) [2024-06-10 10:11:54,592][32177] Fps is (10 sec: 47515.2, 60 sec: 45056.1, 300 sec: 44764.4). Total num frames: 120520704. Throughput: 0: 45068.8. Samples: 120650260. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 10:11:54,592][32177] Avg episode reward: [(0, '0.171')] [2024-06-10 10:11:57,053][32415] Updated weights for policy 0, policy_version 7360 (0.0033) [2024-06-10 10:11:58,208][32394] Signal inference workers to stop experience collection... (1750 times) [2024-06-10 10:11:58,254][32415] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-06-10 10:11:58,265][32394] Signal inference workers to resume experience collection... (1750 times) [2024-06-10 10:11:58,275][32415] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-06-10 10:11:59,592][32177] Fps is (10 sec: 47513.6, 60 sec: 45055.9, 300 sec: 44875.8). Total num frames: 120733696. Throughput: 0: 45017.7. Samples: 120790860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-10 10:11:59,592][32177] Avg episode reward: [(0, '0.179')] [2024-06-10 10:11:59,606][32415] Updated weights for policy 0, policy_version 7370 (0.0030) [2024-06-10 10:12:04,326][32415] Updated weights for policy 0, policy_version 7380 (0.0026) [2024-06-10 10:12:04,592][32177] Fps is (10 sec: 39321.3, 60 sec: 44509.8, 300 sec: 44653.3). Total num frames: 120913920. Throughput: 0: 45059.9. Samples: 121054860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 10:12:04,592][32177] Avg episode reward: [(0, '0.175')] [2024-06-10 10:12:07,181][32415] Updated weights for policy 0, policy_version 7390 (0.0048) [2024-06-10 10:12:09,592][32177] Fps is (10 sec: 44237.4, 60 sec: 44782.9, 300 sec: 44709.1). Total num frames: 121176064. Throughput: 0: 44728.8. Samples: 121316300. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-10 10:12:09,592][32177] Avg episode reward: [(0, '0.164')] [2024-06-10 10:12:11,703][32415] Updated weights for policy 0, policy_version 7400 (0.0022) [2024-06-10 10:12:14,378][32415] Updated weights for policy 0, policy_version 7410 (0.0030) [2024-06-10 10:12:14,592][32177] Fps is (10 sec: 50790.0, 60 sec: 45602.1, 300 sec: 44986.5). Total num frames: 121421824. Throughput: 0: 44607.4. Samples: 121454580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-10 10:12:14,592][32177] Avg episode reward: [(0, '0.172')] [2024-06-10 10:12:18,880][32415] Updated weights for policy 0, policy_version 7420 (0.0036) [2024-06-10 10:12:19,592][32177] Fps is (10 sec: 42598.2, 60 sec: 45055.9, 300 sec: 44764.4). Total num frames: 121602048. Throughput: 0: 44939.5. Samples: 121731400. Policy #0 lag: (min: 0.0, avg: 13.1, max: 23.0) [2024-06-10 10:12:19,592][32177] Avg episode reward: [(0, '0.154')] [2024-06-10 10:12:21,508][32415] Updated weights for policy 0, policy_version 7430 (0.0039) [2024-06-10 10:12:24,592][32177] Fps is (10 sec: 40960.5, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 121831424. Throughput: 0: 44905.7. Samples: 121991160. Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-06-10 10:12:24,592][32177] Avg episode reward: [(0, '0.172')] [2024-06-10 10:12:26,407][32415] Updated weights for policy 0, policy_version 7440 (0.0034) [2024-06-10 10:12:29,072][32415] Updated weights for policy 0, policy_version 7450 (0.0032) [2024-06-10 10:12:29,592][32177] Fps is (10 sec: 49151.6, 60 sec: 45055.9, 300 sec: 44931.4). Total num frames: 122093568. Throughput: 0: 44851.7. Samples: 122130080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-10 10:12:29,592][32177] Avg episode reward: [(0, '0.171')] [2024-06-10 10:12:33,806][32415] Updated weights for policy 0, policy_version 7460 (0.0032) [2024-06-10 10:12:34,592][32177] Fps is (10 sec: 42597.8, 60 sec: 44782.8, 300 sec: 44709.5). 
Total num frames: 122257408. Throughput: 0: 44574.2. Samples: 122392020. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-10 10:12:34,592][32177] Avg episode reward: [(0, '0.178')] [2024-06-10 10:12:34,637][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000007463_122273792.pth... [2024-06-10 10:12:34,718][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000006808_111542272.pth [2024-06-10 10:12:36,394][32415] Updated weights for policy 0, policy_version 7470 (0.0030) [2024-06-10 10:12:39,596][32177] Fps is (10 sec: 40942.8, 60 sec: 44233.6, 300 sec: 44708.2). Total num frames: 122503168. Throughput: 0: 44634.3. Samples: 122659000. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-10 10:12:39,597][32177] Avg episode reward: [(0, '0.182')] [2024-06-10 10:12:39,597][32394] Saving new best policy, reward=0.182! [2024-06-10 10:12:40,961][32415] Updated weights for policy 0, policy_version 7480 (0.0036) [2024-06-10 10:12:43,675][32415] Updated weights for policy 0, policy_version 7490 (0.0039) [2024-06-10 10:12:44,592][32177] Fps is (10 sec: 50790.2, 60 sec: 45329.2, 300 sec: 45042.1). Total num frames: 122765312. Throughput: 0: 44569.3. Samples: 122796480. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-10 10:12:44,592][32177] Avg episode reward: [(0, '0.158')] [2024-06-10 10:12:47,980][32415] Updated weights for policy 0, policy_version 7500 (0.0030) [2024-06-10 10:12:49,592][32177] Fps is (10 sec: 44255.9, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 122945536. Throughput: 0: 44656.0. Samples: 123064380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-10 10:12:49,592][32177] Avg episode reward: [(0, '0.183')] [2024-06-10 10:12:50,750][32415] Updated weights for policy 0, policy_version 7510 (0.0040) [2024-06-10 10:12:54,592][32177] Fps is (10 sec: 39322.1, 60 sec: 43963.7, 300 sec: 44653.3). Total num frames: 123158528. Throughput: 0: 44867.9. Samples: 123335360. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 10:12:54,592][32177] Avg episode reward: [(0, '0.181')] [2024-06-10 10:12:55,337][32415] Updated weights for policy 0, policy_version 7520 (0.0026) [2024-06-10 10:12:58,133][32415] Updated weights for policy 0, policy_version 7530 (0.0036) [2024-06-10 10:12:59,592][32177] Fps is (10 sec: 47513.2, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 123420672. Throughput: 0: 44813.8. Samples: 123471200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 10:12:59,592][32177] Avg episode reward: [(0, '0.178')] [2024-06-10 10:13:02,686][32415] Updated weights for policy 0, policy_version 7540 (0.0033) [2024-06-10 10:13:04,592][32177] Fps is (10 sec: 47513.8, 60 sec: 45329.1, 300 sec: 44820.0). Total num frames: 123633664. Throughput: 0: 44632.9. Samples: 123739880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 10:13:04,592][32177] Avg episode reward: [(0, '0.164')] [2024-06-10 10:13:05,681][32415] Updated weights for policy 0, policy_version 7550 (0.0028) [2024-06-10 10:13:09,592][32177] Fps is (10 sec: 40960.6, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 123830272. Throughput: 0: 44828.5. Samples: 124008440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-10 10:13:09,592][32177] Avg episode reward: [(0, '0.178')] [2024-06-10 10:13:09,862][32415] Updated weights for policy 0, policy_version 7560 (0.0031) [2024-06-10 10:13:12,759][32415] Updated weights for policy 0, policy_version 7570 (0.0025) [2024-06-10 10:13:14,592][32177] Fps is (10 sec: 47513.7, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 124108800. Throughput: 0: 44837.9. Samples: 124147780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-10 10:13:14,592][32177] Avg episode reward: [(0, '0.172')] [2024-06-10 10:13:17,104][32415] Updated weights for policy 0, policy_version 7580 (0.0037) [2024-06-10 10:13:19,377][32394] Signal inference workers to stop experience collection... 
(1800 times) [2024-06-10 10:13:19,435][32415] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-06-10 10:13:19,440][32394] Signal inference workers to resume experience collection... (1800 times) [2024-06-10 10:13:19,444][32415] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-06-10 10:13:19,592][32177] Fps is (10 sec: 49151.7, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 124321792. Throughput: 0: 44985.9. Samples: 124416380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 10:13:19,592][32177] Avg episode reward: [(0, '0.174')] [2024-06-10 10:13:20,005][32415] Updated weights for policy 0, policy_version 7590 (0.0031) [2024-06-10 10:13:24,255][32415] Updated weights for policy 0, policy_version 7600 (0.0041) [2024-06-10 10:13:24,592][32177] Fps is (10 sec: 40959.5, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 124518400. Throughput: 0: 44961.6. Samples: 124682080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 10:13:24,592][32177] Avg episode reward: [(0, '0.170')] [2024-06-10 10:13:27,269][32415] Updated weights for policy 0, policy_version 7610 (0.0037) [2024-06-10 10:13:29,596][32177] Fps is (10 sec: 44217.2, 60 sec: 44506.7, 300 sec: 44819.3). Total num frames: 124764160. Throughput: 0: 44852.6. Samples: 124815040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 10:13:29,597][32177] Avg episode reward: [(0, '0.185')] [2024-06-10 10:13:29,598][32394] Saving new best policy, reward=0.185! [2024-06-10 10:13:31,630][32415] Updated weights for policy 0, policy_version 7620 (0.0036) [2024-06-10 10:13:34,592][32177] Fps is (10 sec: 45876.0, 60 sec: 45329.2, 300 sec: 44875.5). Total num frames: 124977152. Throughput: 0: 45007.2. Samples: 125089700. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 10:13:34,592][32177] Avg episode reward: [(0, '0.176')] [2024-06-10 10:13:34,741][32415] Updated weights for policy 0, policy_version 7630 (0.0039) [2024-06-10 10:13:38,769][32415] Updated weights for policy 0, policy_version 7640 (0.0030) [2024-06-10 10:13:39,592][32177] Fps is (10 sec: 42617.1, 60 sec: 44786.1, 300 sec: 44764.4). Total num frames: 125190144. Throughput: 0: 45002.2. Samples: 125360460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 10:13:39,592][32177] Avg episode reward: [(0, '0.160')] [2024-06-10 10:13:42,030][32415] Updated weights for policy 0, policy_version 7650 (0.0032) [2024-06-10 10:13:44,592][32177] Fps is (10 sec: 45875.0, 60 sec: 44510.0, 300 sec: 44876.2). Total num frames: 125435904. Throughput: 0: 44889.9. Samples: 125491240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 10:13:44,592][32177] Avg episode reward: [(0, '0.175')] [2024-06-10 10:13:46,323][32415] Updated weights for policy 0, policy_version 7660 (0.0043) [2024-06-10 10:13:49,185][32415] Updated weights for policy 0, policy_version 7670 (0.0035) [2024-06-10 10:13:49,592][32177] Fps is (10 sec: 49152.3, 60 sec: 45602.2, 300 sec: 45042.1). Total num frames: 125681664. Throughput: 0: 45048.9. Samples: 125767080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 10:13:49,592][32177] Avg episode reward: [(0, '0.184')] [2024-06-10 10:13:53,611][32415] Updated weights for policy 0, policy_version 7680 (0.0037) [2024-06-10 10:13:54,596][32177] Fps is (10 sec: 44217.5, 60 sec: 45325.8, 300 sec: 44820.0). Total num frames: 125878272. Throughput: 0: 44929.4. Samples: 126030460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 10:13:54,597][32177] Avg episode reward: [(0, '0.170')] [2024-06-10 10:13:56,653][32415] Updated weights for policy 0, policy_version 7690 (0.0040) [2024-06-10 10:13:59,592][32177] Fps is (10 sec: 40960.2, 60 sec: 44510.0, 300 sec: 44708.9). 
Total num frames: 126091264. Throughput: 0: 44641.4. Samples: 126156640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 10:13:59,592][32177] Avg episode reward: [(0, '0.194')] [2024-06-10 10:13:59,596][32394] Saving new best policy, reward=0.194! [2024-06-10 10:14:00,751][32415] Updated weights for policy 0, policy_version 7700 (0.0030) [2024-06-10 10:14:04,133][32415] Updated weights for policy 0, policy_version 7710 (0.0025) [2024-06-10 10:14:04,592][32177] Fps is (10 sec: 45894.9, 60 sec: 45056.0, 300 sec: 44931.1). Total num frames: 126337024. Throughput: 0: 44756.4. Samples: 126430420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 10:14:04,592][32177] Avg episode reward: [(0, '0.184')] [2024-06-10 10:14:08,255][32415] Updated weights for policy 0, policy_version 7720 (0.0031) [2024-06-10 10:14:09,592][32177] Fps is (10 sec: 44236.4, 60 sec: 45055.9, 300 sec: 44764.8). Total num frames: 126533632. Throughput: 0: 44931.2. Samples: 126703980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 10:14:09,592][32177] Avg episode reward: [(0, '0.176')] [2024-06-10 10:14:11,320][32415] Updated weights for policy 0, policy_version 7730 (0.0024) [2024-06-10 10:14:14,592][32177] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 126763008. Throughput: 0: 44686.2. Samples: 126825720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 10:14:14,592][32177] Avg episode reward: [(0, '0.185')] [2024-06-10 10:14:15,544][32415] Updated weights for policy 0, policy_version 7740 (0.0034) [2024-06-10 10:14:18,426][32415] Updated weights for policy 0, policy_version 7750 (0.0023) [2024-06-10 10:14:19,592][32177] Fps is (10 sec: 49151.4, 60 sec: 45055.9, 300 sec: 44987.0). Total num frames: 127025152. Throughput: 0: 44886.4. Samples: 127109600. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 10:14:19,593][32177] Avg episode reward: [(0, '0.182')] [2024-06-10 10:14:23,059][32415] Updated weights for policy 0, policy_version 7760 (0.0038) [2024-06-10 10:14:24,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 127221760. Throughput: 0: 44784.5. Samples: 127375760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-10 10:14:24,592][32177] Avg episode reward: [(0, '0.173')] [2024-06-10 10:14:25,336][32394] Signal inference workers to stop experience collection... (1850 times) [2024-06-10 10:14:25,337][32394] Signal inference workers to resume experience collection... (1850 times) [2024-06-10 10:14:25,346][32415] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-06-10 10:14:25,358][32415] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-06-10 10:14:25,850][32415] Updated weights for policy 0, policy_version 7770 (0.0032) [2024-06-10 10:14:29,594][32177] Fps is (10 sec: 40951.5, 60 sec: 44511.5, 300 sec: 44708.6). Total num frames: 127434752. Throughput: 0: 44694.2. Samples: 127502580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-10 10:14:29,594][32177] Avg episode reward: [(0, '0.177')] [2024-06-10 10:14:30,166][32415] Updated weights for policy 0, policy_version 7780 (0.0031) [2024-06-10 10:14:33,358][32415] Updated weights for policy 0, policy_version 7790 (0.0025) [2024-06-10 10:14:34,592][32177] Fps is (10 sec: 47513.4, 60 sec: 45329.0, 300 sec: 44931.1). Total num frames: 127696896. Throughput: 0: 44604.4. Samples: 127774280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-10 10:14:34,592][32177] Avg episode reward: [(0, '0.179')] [2024-06-10 10:14:34,696][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000007795_127713280.pth... 
[2024-06-10 10:14:34,752][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000007136_116916224.pth [2024-06-10 10:14:37,622][32415] Updated weights for policy 0, policy_version 7800 (0.0036) [2024-06-10 10:14:39,592][32177] Fps is (10 sec: 44246.9, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 127877120. Throughput: 0: 44767.5. Samples: 128044800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-10 10:14:39,592][32177] Avg episode reward: [(0, '0.178')] [2024-06-10 10:14:40,628][32415] Updated weights for policy 0, policy_version 7810 (0.0040) [2024-06-10 10:14:44,591][32177] Fps is (10 sec: 40960.4, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 128106496. Throughput: 0: 44760.5. Samples: 128170860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-10 10:14:44,592][32177] Avg episode reward: [(0, '0.173')] [2024-06-10 10:14:44,615][32415] Updated weights for policy 0, policy_version 7820 (0.0030) [2024-06-10 10:14:47,787][32415] Updated weights for policy 0, policy_version 7830 (0.0027) [2024-06-10 10:14:49,592][32177] Fps is (10 sec: 49151.8, 60 sec: 44782.9, 300 sec: 44931.1). Total num frames: 128368640. Throughput: 0: 44829.4. Samples: 128447740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-10 10:14:49,592][32177] Avg episode reward: [(0, '0.182')] [2024-06-10 10:14:52,017][32415] Updated weights for policy 0, policy_version 7840 (0.0028) [2024-06-10 10:14:54,592][32177] Fps is (10 sec: 45874.8, 60 sec: 44786.2, 300 sec: 44875.6). Total num frames: 128565248. Throughput: 0: 44814.3. Samples: 128720620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-10 10:14:54,592][32177] Avg episode reward: [(0, '0.191')] [2024-06-10 10:14:55,204][32415] Updated weights for policy 0, policy_version 7850 (0.0028) [2024-06-10 10:14:59,592][32177] Fps is (10 sec: 39321.7, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 128761856. Throughput: 0: 44946.7. Samples: 128848320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 10:14:59,592][32177] Avg episode reward: [(0, '0.182')] [2024-06-10 10:14:59,626][32415] Updated weights for policy 0, policy_version 7860 (0.0045) [2024-06-10 10:15:02,614][32415] Updated weights for policy 0, policy_version 7870 (0.0039) [2024-06-10 10:15:04,596][32177] Fps is (10 sec: 45855.5, 60 sec: 44779.8, 300 sec: 44874.8). Total num frames: 129024000. Throughput: 0: 44569.7. Samples: 129115420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 10:15:04,596][32177] Avg episode reward: [(0, '0.174')] [2024-06-10 10:15:06,736][32415] Updated weights for policy 0, policy_version 7880 (0.0042) [2024-06-10 10:15:09,592][32177] Fps is (10 sec: 49151.2, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 129253376. Throughput: 0: 44713.2. Samples: 129387860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 10:15:09,592][32177] Avg episode reward: [(0, '0.174')] [2024-06-10 10:15:09,873][32415] Updated weights for policy 0, policy_version 7890 (0.0036) [2024-06-10 10:15:13,860][32415] Updated weights for policy 0, policy_version 7900 (0.0033) [2024-06-10 10:15:14,592][32177] Fps is (10 sec: 42616.4, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 129449984. Throughput: 0: 45013.8. Samples: 129528100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-10 10:15:14,592][32177] Avg episode reward: [(0, '0.193')] [2024-06-10 10:15:17,087][32415] Updated weights for policy 0, policy_version 7910 (0.0027) [2024-06-10 10:15:19,592][32177] Fps is (10 sec: 42597.8, 60 sec: 44236.7, 300 sec: 44819.9). Total num frames: 129679360. Throughput: 0: 44884.6. Samples: 129794100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-10 10:15:19,592][32177] Avg episode reward: [(0, '0.170')] [2024-06-10 10:15:21,358][32415] Updated weights for policy 0, policy_version 7920 (0.0042) [2024-06-10 10:15:24,254][32415] Updated weights for policy 0, policy_version 7930 (0.0034) [2024-06-10 10:15:24,592][32177] Fps is (10 sec: 47512.8, 60 sec: 45055.8, 300 sec: 44986.5). Total num frames: 129925120. Throughput: 0: 44970.0. Samples: 130068460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 10:15:24,592][32177] Avg episode reward: [(0, '0.188')] [2024-06-10 10:15:28,804][32415] Updated weights for policy 0, policy_version 7940 (0.0027) [2024-06-10 10:15:29,592][32177] Fps is (10 sec: 45876.5, 60 sec: 45057.7, 300 sec: 44820.3). Total num frames: 130138112. Throughput: 0: 45043.9. Samples: 130197840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-10 10:15:29,592][32177] Avg episode reward: [(0, '0.168')] [2024-06-10 10:15:31,763][32415] Updated weights for policy 0, policy_version 7950 (0.0040) [2024-06-10 10:15:34,592][32177] Fps is (10 sec: 42597.9, 60 sec: 44236.6, 300 sec: 44764.4). Total num frames: 130351104. Throughput: 0: 44844.6. Samples: 130465760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 10:15:34,592][32177] Avg episode reward: [(0, '0.191')] [2024-06-10 10:15:35,911][32415] Updated weights for policy 0, policy_version 7960 (0.0027) [2024-06-10 10:15:38,826][32415] Updated weights for policy 0, policy_version 7970 (0.0049) [2024-06-10 10:15:39,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 130596864. Throughput: 0: 44771.6. Samples: 130735340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 10:15:39,592][32177] Avg episode reward: [(0, '0.159')] [2024-06-10 10:15:42,910][32415] Updated weights for policy 0, policy_version 7980 (0.0040) [2024-06-10 10:15:44,592][32177] Fps is (10 sec: 44238.0, 60 sec: 44782.8, 300 sec: 44820.0). 
Total num frames: 130793472. Throughput: 0: 45084.4. Samples: 130877120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-10 10:15:44,592][32177] Avg episode reward: [(0, '0.170')] [2024-06-10 10:15:44,876][32394] Signal inference workers to stop experience collection... (1900 times) [2024-06-10 10:15:44,877][32394] Signal inference workers to resume experience collection... (1900 times) [2024-06-10 10:15:44,914][32415] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-06-10 10:15:44,915][32415] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-06-10 10:15:46,203][32415] Updated weights for policy 0, policy_version 7990 (0.0033) [2024-06-10 10:15:49,592][32177] Fps is (10 sec: 44236.8, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 131039232. Throughput: 0: 44971.9. Samples: 131138960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-10 10:15:49,592][32177] Avg episode reward: [(0, '0.177')] [2024-06-10 10:15:50,676][32415] Updated weights for policy 0, policy_version 8000 (0.0035) [2024-06-10 10:15:53,668][32415] Updated weights for policy 0, policy_version 8010 (0.0025) [2024-06-10 10:15:54,592][32177] Fps is (10 sec: 49151.4, 60 sec: 45328.9, 300 sec: 44931.0). Total num frames: 131284992. Throughput: 0: 44897.7. Samples: 131408260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-10 10:15:54,593][32177] Avg episode reward: [(0, '0.180')] [2024-06-10 10:15:58,186][32415] Updated weights for policy 0, policy_version 8020 (0.0030) [2024-06-10 10:15:59,592][32177] Fps is (10 sec: 45874.3, 60 sec: 45602.0, 300 sec: 44931.0). Total num frames: 131497984. Throughput: 0: 44777.2. Samples: 131543080. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 10:15:59,592][32177] Avg episode reward: [(0, '0.186')] [2024-06-10 10:16:00,954][32415] Updated weights for policy 0, policy_version 8030 (0.0044) [2024-06-10 10:16:04,596][32177] Fps is (10 sec: 40943.1, 60 sec: 44509.8, 300 sec: 44763.8). Total num frames: 131694592. Throughput: 0: 44760.0. Samples: 131808480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 10:16:04,597][32177] Avg episode reward: [(0, '0.197')] [2024-06-10 10:16:04,601][32394] Saving new best policy, reward=0.197! [2024-06-10 10:16:05,281][32415] Updated weights for policy 0, policy_version 8040 (0.0032) [2024-06-10 10:16:08,038][32415] Updated weights for policy 0, policy_version 8050 (0.0032) [2024-06-10 10:16:09,592][32177] Fps is (10 sec: 45876.1, 60 sec: 45056.1, 300 sec: 44986.6). Total num frames: 131956736. Throughput: 0: 44754.0. Samples: 132082380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-10 10:16:09,592][32177] Avg episode reward: [(0, '0.188')] [2024-06-10 10:16:12,679][32415] Updated weights for policy 0, policy_version 8060 (0.0024) [2024-06-10 10:16:14,591][32177] Fps is (10 sec: 45895.4, 60 sec: 45056.1, 300 sec: 44931.0). Total num frames: 132153344. Throughput: 0: 45157.4. Samples: 132229920. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-10 10:16:14,592][32177] Avg episode reward: [(0, '0.176')] [2024-06-10 10:16:15,360][32415] Updated weights for policy 0, policy_version 8070 (0.0039) [2024-06-10 10:16:19,592][32177] Fps is (10 sec: 39320.2, 60 sec: 44509.8, 300 sec: 44708.8). Total num frames: 132349952. Throughput: 0: 44928.9. Samples: 132487560. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-10 10:16:19,592][32177] Avg episode reward: [(0, '0.191')] [2024-06-10 10:16:19,936][32415] Updated weights for policy 0, policy_version 8080 (0.0037) [2024-06-10 10:16:23,006][32415] Updated weights for policy 0, policy_version 8090 (0.0035) [2024-06-10 10:16:24,592][32177] Fps is (10 sec: 45874.1, 60 sec: 44783.0, 300 sec: 44819.9). Total num frames: 132612096. Throughput: 0: 44853.6. Samples: 132753760. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-10 10:16:24,592][32177] Avg episode reward: [(0, '0.187')] [2024-06-10 10:16:27,271][32415] Updated weights for policy 0, policy_version 8100 (0.0026) [2024-06-10 10:16:29,592][32177] Fps is (10 sec: 47515.0, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 132825088. Throughput: 0: 44808.0. Samples: 132893480. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-10 10:16:29,592][32177] Avg episode reward: [(0, '0.179')] [2024-06-10 10:16:30,399][32415] Updated weights for policy 0, policy_version 8110 (0.0035) [2024-06-10 10:16:34,386][32415] Updated weights for policy 0, policy_version 8120 (0.0022) [2024-06-10 10:16:34,592][32177] Fps is (10 sec: 42598.6, 60 sec: 44783.1, 300 sec: 44708.9). Total num frames: 133038080. Throughput: 0: 44839.0. Samples: 133156720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 10:16:34,593][32177] Avg episode reward: [(0, '0.182')] [2024-06-10 10:16:34,609][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000008120_133038080.pth... [2024-06-10 10:16:34,665][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000007463_122273792.pth [2024-06-10 10:16:37,441][32415] Updated weights for policy 0, policy_version 8130 (0.0034) [2024-06-10 10:16:39,592][32177] Fps is (10 sec: 47513.9, 60 sec: 45056.0, 300 sec: 44931.1). Total num frames: 133300224. Throughput: 0: 44806.0. Samples: 133424520. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 23.0) [2024-06-10 10:16:39,592][32177] Avg episode reward: [(0, '0.181')] [2024-06-10 10:16:42,027][32415] Updated weights for policy 0, policy_version 8140 (0.0032) [2024-06-10 10:16:44,526][32415] Updated weights for policy 0, policy_version 8150 (0.0037) [2024-06-10 10:16:44,592][32177] Fps is (10 sec: 49151.9, 60 sec: 45602.1, 300 sec: 44986.6). Total num frames: 133529600. Throughput: 0: 45103.6. Samples: 133572740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 10:16:44,592][32177] Avg episode reward: [(0, '0.179')] [2024-06-10 10:16:48,993][32415] Updated weights for policy 0, policy_version 8160 (0.0026) [2024-06-10 10:16:49,592][32177] Fps is (10 sec: 40959.7, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 133709824. Throughput: 0: 45002.9. Samples: 133833420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 10:16:49,592][32177] Avg episode reward: [(0, '0.188')] [2024-06-10 10:16:51,952][32415] Updated weights for policy 0, policy_version 8170 (0.0042) [2024-06-10 10:16:54,592][32177] Fps is (10 sec: 42598.7, 60 sec: 44510.0, 300 sec: 44820.0). Total num frames: 133955584. Throughput: 0: 44731.5. Samples: 134095300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-10 10:16:54,592][32177] Avg episode reward: [(0, '0.191')] [2024-06-10 10:16:56,398][32415] Updated weights for policy 0, policy_version 8180 (0.0043) [2024-06-10 10:16:59,396][32394] Signal inference workers to stop experience collection... (1950 times) [2024-06-10 10:16:59,396][32394] Signal inference workers to resume experience collection... 
(1950 times) [2024-06-10 10:16:59,412][32415] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-06-10 10:16:59,412][32415] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-06-10 10:16:59,531][32415] Updated weights for policy 0, policy_version 8190 (0.0028) [2024-06-10 10:16:59,592][32177] Fps is (10 sec: 47513.9, 60 sec: 44783.1, 300 sec: 44986.6). Total num frames: 134184960. Throughput: 0: 44631.0. Samples: 134238320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-10 10:16:59,592][32177] Avg episode reward: [(0, '0.184')] [2024-06-10 10:17:03,945][32415] Updated weights for policy 0, policy_version 8200 (0.0033) [2024-06-10 10:17:04,596][32177] Fps is (10 sec: 40942.6, 60 sec: 44509.9, 300 sec: 44708.2). Total num frames: 134365184. Throughput: 0: 44831.1. Samples: 134505140. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-10 10:17:04,596][32177] Avg episode reward: [(0, '0.173')] [2024-06-10 10:17:06,710][32415] Updated weights for policy 0, policy_version 8210 (0.0035) [2024-06-10 10:17:09,595][32177] Fps is (10 sec: 44219.6, 60 sec: 44507.0, 300 sec: 44763.9). Total num frames: 134627328. Throughput: 0: 44883.4. Samples: 134773680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-10 10:17:09,596][32177] Avg episode reward: [(0, '0.192')] [2024-06-10 10:17:10,904][32415] Updated weights for policy 0, policy_version 8220 (0.0056) [2024-06-10 10:17:13,990][32415] Updated weights for policy 0, policy_version 8230 (0.0034) [2024-06-10 10:17:14,592][32177] Fps is (10 sec: 50812.1, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 134873088. Throughput: 0: 45018.2. Samples: 134919300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-10 10:17:14,592][32177] Avg episode reward: [(0, '0.194')] [2024-06-10 10:17:17,954][32415] Updated weights for policy 0, policy_version 8240 (0.0040) [2024-06-10 10:17:19,592][32177] Fps is (10 sec: 42614.4, 60 sec: 45056.1, 300 sec: 44819.9). 
Total num frames: 135053312. Throughput: 0: 44901.7. Samples: 135177300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 10:17:19,592][32177] Avg episode reward: [(0, '0.186')] [2024-06-10 10:17:21,407][32415] Updated weights for policy 0, policy_version 8250 (0.0029) [2024-06-10 10:17:24,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 135299072. Throughput: 0: 44740.3. Samples: 135437840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 10:17:24,592][32177] Avg episode reward: [(0, '0.180')] [2024-06-10 10:17:25,655][32415] Updated weights for policy 0, policy_version 8260 (0.0031) [2024-06-10 10:17:28,738][32415] Updated weights for policy 0, policy_version 8270 (0.0027) [2024-06-10 10:17:29,592][32177] Fps is (10 sec: 47514.5, 60 sec: 45056.1, 300 sec: 44986.6). Total num frames: 135528448. Throughput: 0: 44564.6. Samples: 135578140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 10:17:29,592][32177] Avg episode reward: [(0, '0.191')] [2024-06-10 10:17:33,095][32415] Updated weights for policy 0, policy_version 8280 (0.0027) [2024-06-10 10:17:34,592][32177] Fps is (10 sec: 42598.1, 60 sec: 44782.9, 300 sec: 44820.6). Total num frames: 135725056. Throughput: 0: 44794.6. Samples: 135849180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-10 10:17:34,592][32177] Avg episode reward: [(0, '0.185')] [2024-06-10 10:17:36,249][32415] Updated weights for policy 0, policy_version 8290 (0.0040) [2024-06-10 10:17:39,592][32177] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 135954432. Throughput: 0: 44799.7. Samples: 136111280. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-10 10:17:39,592][32177] Avg episode reward: [(0, '0.194')] [2024-06-10 10:17:40,094][32415] Updated weights for policy 0, policy_version 8300 (0.0032) [2024-06-10 10:17:43,355][32415] Updated weights for policy 0, policy_version 8310 (0.0033) [2024-06-10 10:17:44,592][32177] Fps is (10 sec: 49152.7, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 136216576. Throughput: 0: 44757.8. Samples: 136252420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 10:17:44,592][32177] Avg episode reward: [(0, '0.177')] [2024-06-10 10:17:47,355][32415] Updated weights for policy 0, policy_version 8320 (0.0035) [2024-06-10 10:17:49,595][32177] Fps is (10 sec: 44223.6, 60 sec: 44780.8, 300 sec: 44875.1). Total num frames: 136396800. Throughput: 0: 44843.1. Samples: 136523020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 10:17:49,595][32177] Avg episode reward: [(0, '0.197')] [2024-06-10 10:17:50,688][32415] Updated weights for policy 0, policy_version 8330 (0.0034) [2024-06-10 10:17:54,592][32177] Fps is (10 sec: 40959.8, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 136626176. Throughput: 0: 44761.6. Samples: 136787780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 10:17:54,592][32177] Avg episode reward: [(0, '0.186')] [2024-06-10 10:17:54,880][32415] Updated weights for policy 0, policy_version 8340 (0.0039) [2024-06-10 10:17:57,977][32415] Updated weights for policy 0, policy_version 8350 (0.0032) [2024-06-10 10:17:59,592][32177] Fps is (10 sec: 49166.0, 60 sec: 45055.9, 300 sec: 44931.0). Total num frames: 136888320. Throughput: 0: 44530.6. Samples: 136923180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 10:17:59,592][32177] Avg episode reward: [(0, '0.202')] [2024-06-10 10:17:59,593][32394] Saving new best policy, reward=0.202! 
[2024-06-10 10:18:02,127][32415] Updated weights for policy 0, policy_version 8360 (0.0023) [2024-06-10 10:18:04,592][32177] Fps is (10 sec: 45875.6, 60 sec: 45332.3, 300 sec: 44931.0). Total num frames: 137084928. Throughput: 0: 44844.6. Samples: 137195300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 10:18:04,592][32177] Avg episode reward: [(0, '0.195')] [2024-06-10 10:18:05,529][32415] Updated weights for policy 0, policy_version 8370 (0.0027) [2024-06-10 10:18:09,059][32415] Updated weights for policy 0, policy_version 8380 (0.0034) [2024-06-10 10:18:09,596][32177] Fps is (10 sec: 40942.5, 60 sec: 44509.5, 300 sec: 44708.2). Total num frames: 137297920. Throughput: 0: 45052.2. Samples: 137465380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 10:18:09,597][32177] Avg episode reward: [(0, '0.186')] [2024-06-10 10:18:12,615][32415] Updated weights for policy 0, policy_version 8390 (0.0035) [2024-06-10 10:18:14,596][32177] Fps is (10 sec: 45855.3, 60 sec: 44506.7, 300 sec: 44819.3). Total num frames: 137543680. Throughput: 0: 44899.6. Samples: 137598820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 10:18:14,597][32177] Avg episode reward: [(0, '0.192')] [2024-06-10 10:18:16,332][32415] Updated weights for policy 0, policy_version 8400 (0.0046) [2024-06-10 10:18:19,596][32177] Fps is (10 sec: 44237.0, 60 sec: 44779.8, 300 sec: 44819.3). Total num frames: 137740288. Throughput: 0: 44766.1. Samples: 137863840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-10 10:18:19,596][32177] Avg episode reward: [(0, '0.201')] [2024-06-10 10:18:20,125][32415] Updated weights for policy 0, policy_version 8410 (0.0032) [2024-06-10 10:18:23,981][32415] Updated weights for policy 0, policy_version 8420 (0.0039) [2024-06-10 10:18:24,596][32177] Fps is (10 sec: 44237.0, 60 sec: 44779.8, 300 sec: 44820.0). Total num frames: 137986048. Throughput: 0: 44809.9. Samples: 138127920. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 10:18:24,605][32177] Avg episode reward: [(0, '0.199')] [2024-06-10 10:18:27,063][32394] Signal inference workers to stop experience collection... (2000 times) [2024-06-10 10:18:27,106][32415] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-06-10 10:18:27,114][32394] Signal inference workers to resume experience collection... (2000 times) [2024-06-10 10:18:27,124][32415] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-06-10 10:18:27,247][32415] Updated weights for policy 0, policy_version 8430 (0.0025) [2024-06-10 10:18:29,592][32177] Fps is (10 sec: 47533.6, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 138215424. Throughput: 0: 44759.0. Samples: 138266580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 10:18:29,592][32177] Avg episode reward: [(0, '0.185')] [2024-06-10 10:18:31,486][32415] Updated weights for policy 0, policy_version 8440 (0.0027) [2024-06-10 10:18:34,592][32177] Fps is (10 sec: 44255.6, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 138428416. Throughput: 0: 44708.6. Samples: 138534780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 10:18:34,592][32177] Avg episode reward: [(0, '0.194')] [2024-06-10 10:18:34,599][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000008449_138428416.pth... [2024-06-10 10:18:34,677][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000007795_127713280.pth [2024-06-10 10:18:34,827][32415] Updated weights for policy 0, policy_version 8450 (0.0029) [2024-06-10 10:18:38,404][32415] Updated weights for policy 0, policy_version 8460 (0.0026) [2024-06-10 10:18:39,592][32177] Fps is (10 sec: 42598.5, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 138641408. Throughput: 0: 44961.3. Samples: 138811040. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-10 10:18:39,592][32177] Avg episode reward: [(0, '0.196')] [2024-06-10 10:18:41,979][32415] Updated weights for policy 0, policy_version 8470 (0.0027) [2024-06-10 10:18:44,592][32177] Fps is (10 sec: 45873.5, 60 sec: 44509.6, 300 sec: 44764.4). Total num frames: 138887168. Throughput: 0: 44847.7. Samples: 138941340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-10 10:18:44,592][32177] Avg episode reward: [(0, '0.206')] [2024-06-10 10:18:44,712][32394] Saving new best policy, reward=0.206! [2024-06-10 10:18:45,479][32415] Updated weights for policy 0, policy_version 8480 (0.0040) [2024-06-10 10:18:49,378][32415] Updated weights for policy 0, policy_version 8490 (0.0029) [2024-06-10 10:18:49,592][32177] Fps is (10 sec: 45875.5, 60 sec: 45058.2, 300 sec: 44820.6). Total num frames: 139100160. Throughput: 0: 44770.6. Samples: 139209980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-10 10:18:49,592][32177] Avg episode reward: [(0, '0.190')] [2024-06-10 10:18:52,967][32415] Updated weights for policy 0, policy_version 8500 (0.0035) [2024-06-10 10:18:54,592][32177] Fps is (10 sec: 42599.6, 60 sec: 44782.9, 300 sec: 44819.9). Total num frames: 139313152. Throughput: 0: 44689.5. Samples: 139476220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-10 10:18:54,592][32177] Avg episode reward: [(0, '0.196')] [2024-06-10 10:18:56,553][32415] Updated weights for policy 0, policy_version 8510 (0.0038) [2024-06-10 10:18:59,592][32177] Fps is (10 sec: 45874.6, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 139558912. Throughput: 0: 44781.9. Samples: 139613820. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 10:18:59,593][32177] Avg episode reward: [(0, '0.194')] [2024-06-10 10:19:00,559][32415] Updated weights for policy 0, policy_version 8520 (0.0029) [2024-06-10 10:19:04,083][32415] Updated weights for policy 0, policy_version 8530 (0.0024) [2024-06-10 10:19:04,592][32177] Fps is (10 sec: 47514.5, 60 sec: 45056.0, 300 sec: 44931.1). Total num frames: 139788288. Throughput: 0: 44983.9. Samples: 139887920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 10:19:04,592][32177] Avg episode reward: [(0, '0.181')] [2024-06-10 10:19:07,663][32415] Updated weights for policy 0, policy_version 8540 (0.0027) [2024-06-10 10:19:09,592][32177] Fps is (10 sec: 44237.3, 60 sec: 45059.2, 300 sec: 44875.5). Total num frames: 140001280. Throughput: 0: 44938.0. Samples: 140149940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 10:19:09,592][32177] Avg episode reward: [(0, '0.199')] [2024-06-10 10:19:11,150][32415] Updated weights for policy 0, policy_version 8550 (0.0031) [2024-06-10 10:19:14,596][32177] Fps is (10 sec: 44217.6, 60 sec: 44783.0, 300 sec: 44763.8). Total num frames: 140230656. Throughput: 0: 44847.4. Samples: 140284900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 10:19:14,597][32177] Avg episode reward: [(0, '0.206')] [2024-06-10 10:19:15,298][32415] Updated weights for policy 0, policy_version 8560 (0.0023) [2024-06-10 10:19:18,481][32415] Updated weights for policy 0, policy_version 8570 (0.0029) [2024-06-10 10:19:19,592][32177] Fps is (10 sec: 45875.6, 60 sec: 45332.4, 300 sec: 44875.5). Total num frames: 140460032. Throughput: 0: 44894.8. Samples: 140555040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 10:19:19,592][32177] Avg episode reward: [(0, '0.201')] [2024-06-10 10:19:22,291][32415] Updated weights for policy 0, policy_version 8580 (0.0034) [2024-06-10 10:19:24,592][32177] Fps is (10 sec: 42616.7, 60 sec: 44513.1, 300 sec: 44820.3). 
Total num frames: 140656640. Throughput: 0: 44756.5. Samples: 140825080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 10:19:24,592][32177] Avg episode reward: [(0, '0.172')] [2024-06-10 10:19:25,769][32415] Updated weights for policy 0, policy_version 8590 (0.0035) [2024-06-10 10:19:29,592][32177] Fps is (10 sec: 42598.4, 60 sec: 44510.0, 300 sec: 44708.9). Total num frames: 140886016. Throughput: 0: 44778.7. Samples: 140956360. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-10 10:19:29,592][32177] Avg episode reward: [(0, '0.192')] [2024-06-10 10:19:29,806][32415] Updated weights for policy 0, policy_version 8600 (0.0041) [2024-06-10 10:19:33,086][32415] Updated weights for policy 0, policy_version 8610 (0.0029) [2024-06-10 10:19:34,592][32177] Fps is (10 sec: 47513.3, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 141131776. Throughput: 0: 44927.5. Samples: 141231720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-10 10:19:34,592][32177] Avg episode reward: [(0, '0.206')] [2024-06-10 10:19:36,736][32415] Updated weights for policy 0, policy_version 8620 (0.0033) [2024-06-10 10:19:39,592][32177] Fps is (10 sec: 44236.9, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 141328384. Throughput: 0: 45108.6. Samples: 141506100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 10:19:39,592][32177] Avg episode reward: [(0, '0.214')] [2024-06-10 10:19:39,645][32394] Signal inference workers to stop experience collection... (2050 times) [2024-06-10 10:19:39,695][32415] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-06-10 10:19:39,703][32394] Signal inference workers to resume experience collection... (2050 times) [2024-06-10 10:19:39,704][32394] Saving new best policy, reward=0.214! 
[2024-06-10 10:19:39,708][32415] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-06-10 10:19:40,202][32415] Updated weights for policy 0, policy_version 8630 (0.0024) [2024-06-10 10:19:44,209][32415] Updated weights for policy 0, policy_version 8640 (0.0034) [2024-06-10 10:19:44,592][32177] Fps is (10 sec: 44237.0, 60 sec: 44783.2, 300 sec: 44764.4). Total num frames: 141574144. Throughput: 0: 44893.5. Samples: 141634020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-10 10:19:44,592][32177] Avg episode reward: [(0, '0.200')] [2024-06-10 10:19:47,474][32415] Updated weights for policy 0, policy_version 8650 (0.0021) [2024-06-10 10:19:49,592][32177] Fps is (10 sec: 47513.6, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 141803520. Throughput: 0: 44828.0. Samples: 141905180. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-10 10:19:49,592][32177] Avg episode reward: [(0, '0.199')] [2024-06-10 10:19:51,599][32415] Updated weights for policy 0, policy_version 8660 (0.0036) [2024-06-10 10:19:54,592][32177] Fps is (10 sec: 44236.4, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 142016512. Throughput: 0: 45024.4. Samples: 142176040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-10 10:19:54,592][32177] Avg episode reward: [(0, '0.172')] [2024-06-10 10:19:54,975][32415] Updated weights for policy 0, policy_version 8670 (0.0026) [2024-06-10 10:19:58,591][32415] Updated weights for policy 0, policy_version 8680 (0.0033) [2024-06-10 10:19:59,596][32177] Fps is (10 sec: 44217.4, 60 sec: 44779.8, 300 sec: 44819.9). Total num frames: 142245888. Throughput: 0: 44840.8. Samples: 142302740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 10:19:59,597][32177] Avg episode reward: [(0, '0.208')] [2024-06-10 10:20:02,424][32415] Updated weights for policy 0, policy_version 8690 (0.0027) [2024-06-10 10:20:04,592][32177] Fps is (10 sec: 47513.7, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 142491648. 
Throughput: 0: 44902.1. Samples: 142575640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-10 10:20:04,592][32177] Avg episode reward: [(0, '0.204')] [2024-06-10 10:20:05,569][32415] Updated weights for policy 0, policy_version 8700 (0.0031) [2024-06-10 10:20:09,428][32415] Updated weights for policy 0, policy_version 8710 (0.0032) [2024-06-10 10:20:09,592][32177] Fps is (10 sec: 45894.8, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 142704640. Throughput: 0: 45011.5. Samples: 142850600. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-10 10:20:09,592][32177] Avg episode reward: [(0, '0.195')] [2024-06-10 10:20:13,186][32415] Updated weights for policy 0, policy_version 8720 (0.0028) [2024-06-10 10:20:14,592][32177] Fps is (10 sec: 44237.2, 60 sec: 45059.2, 300 sec: 44931.1). Total num frames: 142934016. Throughput: 0: 44891.5. Samples: 142976480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 10:20:14,599][32177] Avg episode reward: [(0, '0.207')] [2024-06-10 10:20:16,751][32415] Updated weights for policy 0, policy_version 8730 (0.0031) [2024-06-10 10:20:19,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 143163392. Throughput: 0: 44796.0. Samples: 143247540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 10:20:19,601][32177] Avg episode reward: [(0, '0.197')] [2024-06-10 10:20:20,893][32415] Updated weights for policy 0, policy_version 8740 (0.0038) [2024-06-10 10:20:24,119][32415] Updated weights for policy 0, policy_version 8750 (0.0039) [2024-06-10 10:20:24,592][32177] Fps is (10 sec: 42598.1, 60 sec: 45056.0, 300 sec: 44819.9). Total num frames: 143360000. Throughput: 0: 44699.0. Samples: 143517560. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 10:20:24,592][32177] Avg episode reward: [(0, '0.192')] [2024-06-10 10:20:27,915][32415] Updated weights for policy 0, policy_version 8760 (0.0036) [2024-06-10 10:20:29,592][32177] Fps is (10 sec: 42596.6, 60 sec: 45055.6, 300 sec: 44875.5). Total num frames: 143589376. Throughput: 0: 44770.2. Samples: 143648700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-10 10:20:29,593][32177] Avg episode reward: [(0, '0.215')] [2024-06-10 10:20:31,562][32415] Updated weights for policy 0, policy_version 8770 (0.0027) [2024-06-10 10:20:34,592][32177] Fps is (10 sec: 45875.6, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 143818752. Throughput: 0: 44796.4. Samples: 143921020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-10 10:20:34,592][32177] Avg episode reward: [(0, '0.205')] [2024-06-10 10:20:34,671][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000008779_143835136.pth... [2024-06-10 10:20:34,734][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000008120_133038080.pth [2024-06-10 10:20:34,936][32415] Updated weights for policy 0, policy_version 8780 (0.0028) [2024-06-10 10:20:38,664][32415] Updated weights for policy 0, policy_version 8790 (0.0034) [2024-06-10 10:20:39,595][32177] Fps is (10 sec: 44225.9, 60 sec: 45053.8, 300 sec: 44875.1). Total num frames: 144031744. Throughput: 0: 44685.6. Samples: 144187020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-10 10:20:39,595][32177] Avg episode reward: [(0, '0.196')] [2024-06-10 10:20:42,494][32415] Updated weights for policy 0, policy_version 8800 (0.0028) [2024-06-10 10:20:44,592][32177] Fps is (10 sec: 42597.6, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 144244736. Throughput: 0: 44790.8. Samples: 144318140. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 10:20:44,592][32177] Avg episode reward: [(0, '0.206')] [2024-06-10 10:20:46,197][32415] Updated weights for policy 0, policy_version 8810 (0.0032) [2024-06-10 10:20:49,592][32177] Fps is (10 sec: 45888.9, 60 sec: 44782.9, 300 sec: 44764.5). Total num frames: 144490496. Throughput: 0: 44755.2. Samples: 144589620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 10:20:49,592][32177] Avg episode reward: [(0, '0.214')] [2024-06-10 10:20:49,999][32415] Updated weights for policy 0, policy_version 8820 (0.0028) [2024-06-10 10:20:53,712][32394] Signal inference workers to stop experience collection... (2100 times) [2024-06-10 10:20:53,758][32394] Signal inference workers to resume experience collection... (2100 times) [2024-06-10 10:20:53,759][32415] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-06-10 10:20:53,762][32415] Updated weights for policy 0, policy_version 8830 (0.0033) [2024-06-10 10:20:53,780][32415] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-06-10 10:20:54,592][32177] Fps is (10 sec: 47514.4, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 144719872. Throughput: 0: 44689.4. Samples: 144861620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 10:20:54,592][32177] Avg episode reward: [(0, '0.188')] [2024-06-10 10:20:57,022][32415] Updated weights for policy 0, policy_version 8840 (0.0042) [2024-06-10 10:20:59,592][32177] Fps is (10 sec: 42596.8, 60 sec: 44512.8, 300 sec: 44820.6). Total num frames: 144916480. Throughput: 0: 44862.3. Samples: 144995300. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 10:20:59,593][32177] Avg episode reward: [(0, '0.211')] [2024-06-10 10:21:00,938][32415] Updated weights for policy 0, policy_version 8850 (0.0033) [2024-06-10 10:21:04,269][32415] Updated weights for policy 0, policy_version 8860 (0.0030) [2024-06-10 10:21:04,592][32177] Fps is (10 sec: 45874.8, 60 sec: 44783.0, 300 sec: 44819.9). Total num frames: 145178624. Throughput: 0: 44744.9. Samples: 145261060. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-10 10:21:04,592][32177] Avg episode reward: [(0, '0.207')] [2024-06-10 10:21:08,245][32415] Updated weights for policy 0, policy_version 8870 (0.0028) [2024-06-10 10:21:09,591][32177] Fps is (10 sec: 47515.6, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 145391616. Throughput: 0: 44815.2. Samples: 145534240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-10 10:21:09,592][32177] Avg episode reward: [(0, '0.205')] [2024-06-10 10:21:11,514][32415] Updated weights for policy 0, policy_version 8880 (0.0053) [2024-06-10 10:21:14,592][32177] Fps is (10 sec: 40959.8, 60 sec: 44236.7, 300 sec: 44875.5). Total num frames: 145588224. Throughput: 0: 44887.0. Samples: 145668600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 10:21:14,592][32177] Avg episode reward: [(0, '0.206')] [2024-06-10 10:21:15,415][32415] Updated weights for policy 0, policy_version 8890 (0.0032) [2024-06-10 10:21:19,039][32415] Updated weights for policy 0, policy_version 8900 (0.0043) [2024-06-10 10:21:19,591][32177] Fps is (10 sec: 45875.3, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 145850368. Throughput: 0: 44776.1. Samples: 145935940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 10:21:19,592][32177] Avg episode reward: [(0, '0.210')] [2024-06-10 10:21:22,933][32415] Updated weights for policy 0, policy_version 8910 (0.0040) [2024-06-10 10:21:24,592][32177] Fps is (10 sec: 47513.6, 60 sec: 45055.9, 300 sec: 44875.5). 
Total num frames: 146063360. Throughput: 0: 44949.9. Samples: 146209640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 10:21:24,592][32177] Avg episode reward: [(0, '0.203')] [2024-06-10 10:21:26,146][32415] Updated weights for policy 0, policy_version 8920 (0.0039) [2024-06-10 10:21:29,592][32177] Fps is (10 sec: 40959.7, 60 sec: 44510.2, 300 sec: 44820.0). Total num frames: 146259968. Throughput: 0: 44950.8. Samples: 146340920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-10 10:21:29,592][32177] Avg episode reward: [(0, '0.215')] [2024-06-10 10:21:30,054][32415] Updated weights for policy 0, policy_version 8930 (0.0029) [2024-06-10 10:21:33,428][32415] Updated weights for policy 0, policy_version 8940 (0.0031) [2024-06-10 10:21:34,594][32177] Fps is (10 sec: 45866.3, 60 sec: 45054.4, 300 sec: 44819.6). Total num frames: 146522112. Throughput: 0: 44942.3. Samples: 146612120. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-10 10:21:34,594][32177] Avg episode reward: [(0, '0.211')] [2024-06-10 10:21:37,461][32415] Updated weights for policy 0, policy_version 8950 (0.0035) [2024-06-10 10:21:39,592][32177] Fps is (10 sec: 47513.6, 60 sec: 45058.2, 300 sec: 44764.4). Total num frames: 146735104. Throughput: 0: 44814.2. Samples: 146878260. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-10 10:21:39,592][32177] Avg episode reward: [(0, '0.200')] [2024-06-10 10:21:40,636][32415] Updated weights for policy 0, policy_version 8960 (0.0047) [2024-06-10 10:21:44,592][32177] Fps is (10 sec: 40968.4, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 146931712. Throughput: 0: 44817.2. Samples: 147012060. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-10 10:21:44,592][32177] Avg episode reward: [(0, '0.213')] [2024-06-10 10:21:44,964][32415] Updated weights for policy 0, policy_version 8970 (0.0035) [2024-06-10 10:21:48,248][32415] Updated weights for policy 0, policy_version 8980 (0.0043) [2024-06-10 10:21:49,596][32177] Fps is (10 sec: 44218.0, 60 sec: 44779.7, 300 sec: 44819.3). Total num frames: 147177472. Throughput: 0: 44927.4. Samples: 147282980. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-10 10:21:49,596][32177] Avg episode reward: [(0, '0.200')] [2024-06-10 10:21:52,253][32415] Updated weights for policy 0, policy_version 8990 (0.0035) [2024-06-10 10:21:54,592][32177] Fps is (10 sec: 45875.4, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 147390464. Throughput: 0: 44728.8. Samples: 147547040. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-10 10:21:54,592][32177] Avg episode reward: [(0, '0.208')] [2024-06-10 10:21:55,539][32415] Updated weights for policy 0, policy_version 9000 (0.0041) [2024-06-10 10:21:59,592][32177] Fps is (10 sec: 42616.7, 60 sec: 44783.2, 300 sec: 44876.2). Total num frames: 147603456. Throughput: 0: 44774.8. Samples: 147683460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 20.0) [2024-06-10 10:21:59,592][32177] Avg episode reward: [(0, '0.199')] [2024-06-10 10:21:59,667][32415] Updated weights for policy 0, policy_version 9010 (0.0022) [2024-06-10 10:22:02,835][32415] Updated weights for policy 0, policy_version 9020 (0.0040) [2024-06-10 10:22:04,592][32177] Fps is (10 sec: 44236.0, 60 sec: 44236.7, 300 sec: 44765.0). Total num frames: 147832832. Throughput: 0: 44604.6. Samples: 147943160. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-06-10 10:22:04,592][32177] Avg episode reward: [(0, '0.206')] [2024-06-10 10:22:07,298][32415] Updated weights for policy 0, policy_version 9030 (0.0034) [2024-06-10 10:22:09,592][32177] Fps is (10 sec: 45875.1, 60 sec: 44509.8, 300 sec: 44708.9). 
Total num frames: 148062208. Throughput: 0: 44418.8. Samples: 148208480. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-06-10 10:22:09,592][32177] Avg episode reward: [(0, '0.204')] [2024-06-10 10:22:10,130][32415] Updated weights for policy 0, policy_version 9040 (0.0034) [2024-06-10 10:22:14,506][32415] Updated weights for policy 0, policy_version 9050 (0.0023) [2024-06-10 10:22:14,592][32177] Fps is (10 sec: 44236.8, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 148275200. Throughput: 0: 44549.2. Samples: 148345640. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-10 10:22:14,592][32177] Avg episode reward: [(0, '0.203')] [2024-06-10 10:22:17,490][32415] Updated weights for policy 0, policy_version 9060 (0.0035) [2024-06-10 10:22:19,592][32177] Fps is (10 sec: 44236.7, 60 sec: 44236.7, 300 sec: 44764.4). Total num frames: 148504576. Throughput: 0: 44327.4. Samples: 148606760. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-10 10:22:19,592][32177] Avg episode reward: [(0, '0.204')] [2024-06-10 10:22:21,173][32394] Signal inference workers to stop experience collection... (2150 times) [2024-06-10 10:22:21,174][32394] Signal inference workers to resume experience collection... (2150 times) [2024-06-10 10:22:21,204][32415] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-06-10 10:22:21,204][32415] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-06-10 10:22:21,794][32415] Updated weights for policy 0, policy_version 9070 (0.0030) [2024-06-10 10:22:24,592][32177] Fps is (10 sec: 45876.0, 60 sec: 44510.0, 300 sec: 44764.4). Total num frames: 148733952. Throughput: 0: 44565.3. Samples: 148883700. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-10 10:22:24,592][32177] Avg episode reward: [(0, '0.205')] [2024-06-10 10:22:24,792][32415] Updated weights for policy 0, policy_version 9080 (0.0027) [2024-06-10 10:22:29,206][32415] Updated weights for policy 0, policy_version 9090 (0.0036) [2024-06-10 10:22:29,592][32177] Fps is (10 sec: 44236.2, 60 sec: 44782.8, 300 sec: 44820.0). Total num frames: 148946944. Throughput: 0: 44586.1. Samples: 149018440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-10 10:22:29,592][32177] Avg episode reward: [(0, '0.207')] [2024-06-10 10:22:32,338][32415] Updated weights for policy 0, policy_version 9100 (0.0023) [2024-06-10 10:22:34,592][32177] Fps is (10 sec: 44236.5, 60 sec: 44238.3, 300 sec: 44819.9). Total num frames: 149176320. Throughput: 0: 44404.1. Samples: 149280980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-10 10:22:34,592][32177] Avg episode reward: [(0, '0.222')] [2024-06-10 10:22:34,605][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000009105_149176320.pth... [2024-06-10 10:22:34,665][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000008449_138428416.pth [2024-06-10 10:22:34,675][32394] Saving new best policy, reward=0.222! [2024-06-10 10:22:36,713][32415] Updated weights for policy 0, policy_version 9110 (0.0033) [2024-06-10 10:22:39,378][32415] Updated weights for policy 0, policy_version 9120 (0.0026) [2024-06-10 10:22:39,592][32177] Fps is (10 sec: 49152.3, 60 sec: 45055.9, 300 sec: 44820.0). Total num frames: 149438464. Throughput: 0: 44557.7. Samples: 149552140. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-10 10:22:39,592][32177] Avg episode reward: [(0, '0.207')] [2024-06-10 10:22:43,940][32415] Updated weights for policy 0, policy_version 9130 (0.0025) [2024-06-10 10:22:44,592][32177] Fps is (10 sec: 44237.3, 60 sec: 44783.0, 300 sec: 44820.4). Total num frames: 149618688. Throughput: 0: 44509.8. 
Samples: 149686400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-10 10:22:44,592][32177] Avg episode reward: [(0, '0.201')] [2024-06-10 10:22:46,887][32415] Updated weights for policy 0, policy_version 9140 (0.0033) [2024-06-10 10:22:49,596][32177] Fps is (10 sec: 39304.8, 60 sec: 44236.7, 300 sec: 44763.8). Total num frames: 149831680. Throughput: 0: 44666.5. Samples: 149953340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-10 10:22:49,597][32177] Avg episode reward: [(0, '0.217')] [2024-06-10 10:22:51,203][32415] Updated weights for policy 0, policy_version 9150 (0.0031) [2024-06-10 10:22:54,071][32415] Updated weights for policy 0, policy_version 9160 (0.0037) [2024-06-10 10:22:54,592][32177] Fps is (10 sec: 47513.5, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 150093824. Throughput: 0: 44807.1. Samples: 150224800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-10 10:22:54,592][32177] Avg episode reward: [(0, '0.220')] [2024-06-10 10:22:58,656][32415] Updated weights for policy 0, policy_version 9170 (0.0034) [2024-06-10 10:22:59,592][32177] Fps is (10 sec: 47533.4, 60 sec: 45055.8, 300 sec: 44819.9). Total num frames: 150306816. Throughput: 0: 44916.0. Samples: 150366860. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-10 10:22:59,592][32177] Avg episode reward: [(0, '0.207')] [2024-06-10 10:23:01,408][32415] Updated weights for policy 0, policy_version 9180 (0.0039) [2024-06-10 10:23:04,592][32177] Fps is (10 sec: 39321.5, 60 sec: 44236.9, 300 sec: 44709.5). Total num frames: 150487040. Throughput: 0: 44824.5. Samples: 150623860. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-10 10:23:04,592][32177] Avg episode reward: [(0, '0.212')] [2024-06-10 10:23:05,987][32415] Updated weights for policy 0, policy_version 9190 (0.0027) [2024-06-10 10:23:08,523][32415] Updated weights for policy 0, policy_version 9200 (0.0038) [2024-06-10 10:23:09,592][32177] Fps is (10 sec: 45875.5, 60 sec: 45055.9, 300 sec: 44820.6). Total num frames: 150765568. Throughput: 0: 44542.6. Samples: 150888120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-10 10:23:09,592][32177] Avg episode reward: [(0, '0.208')] [2024-06-10 10:23:13,210][32415] Updated weights for policy 0, policy_version 9210 (0.0042) [2024-06-10 10:23:14,596][32177] Fps is (10 sec: 49130.3, 60 sec: 45052.8, 300 sec: 44875.5). Total num frames: 150978560. Throughput: 0: 44761.1. Samples: 151032880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 10:23:14,597][32177] Avg episode reward: [(0, '0.206')] [2024-06-10 10:23:15,858][32415] Updated weights for policy 0, policy_version 9220 (0.0040) [2024-06-10 10:23:19,592][32177] Fps is (10 sec: 39321.8, 60 sec: 44236.8, 300 sec: 44654.0). Total num frames: 151158784. Throughput: 0: 44801.8. Samples: 151297060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 10:23:19,592][32177] Avg episode reward: [(0, '0.219')] [2024-06-10 10:23:20,466][32415] Updated weights for policy 0, policy_version 9230 (0.0032) [2024-06-10 10:23:23,124][32415] Updated weights for policy 0, policy_version 9240 (0.0041) [2024-06-10 10:23:24,592][32177] Fps is (10 sec: 45895.5, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 151437312. Throughput: 0: 44901.0. Samples: 151572680. Policy #0 lag: (min: 2.0, avg: 9.8, max: 20.0) [2024-06-10 10:23:24,592][32177] Avg episode reward: [(0, '0.217')] [2024-06-10 10:23:27,711][32415] Updated weights for policy 0, policy_version 9250 (0.0026) [2024-06-10 10:23:29,592][32177] Fps is (10 sec: 49151.8, 60 sec: 45056.0, 300 sec: 44820.0). 
Total num frames: 151650304. Throughput: 0: 45142.1. Samples: 151717800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-10 10:23:29,592][32177] Avg episode reward: [(0, '0.210')] [2024-06-10 10:23:30,586][32415] Updated weights for policy 0, policy_version 9260 (0.0033) [2024-06-10 10:23:34,593][32177] Fps is (10 sec: 37677.7, 60 sec: 43962.7, 300 sec: 44653.1). Total num frames: 151814144. Throughput: 0: 44832.2. Samples: 151970660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-10 10:23:34,593][32177] Avg episode reward: [(0, '0.223')] [2024-06-10 10:23:35,209][32415] Updated weights for policy 0, policy_version 9270 (0.0033) [2024-06-10 10:23:35,759][32394] Signal inference workers to stop experience collection... (2200 times) [2024-06-10 10:23:35,796][32415] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-06-10 10:23:35,815][32394] Signal inference workers to resume experience collection... (2200 times) [2024-06-10 10:23:35,817][32415] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-06-10 10:23:37,767][32415] Updated weights for policy 0, policy_version 9280 (0.0034) [2024-06-10 10:23:39,592][32177] Fps is (10 sec: 44236.5, 60 sec: 44236.7, 300 sec: 44764.5). Total num frames: 152092672. Throughput: 0: 44669.1. Samples: 152234920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 10:23:39,592][32177] Avg episode reward: [(0, '0.212')] [2024-06-10 10:23:42,585][32415] Updated weights for policy 0, policy_version 9290 (0.0022) [2024-06-10 10:23:44,596][32177] Fps is (10 sec: 50775.8, 60 sec: 45052.7, 300 sec: 44819.3). Total num frames: 152322048. Throughput: 0: 44771.9. Samples: 152381780. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 10:23:44,596][32177] Avg episode reward: [(0, '0.220')] [2024-06-10 10:23:45,099][32415] Updated weights for policy 0, policy_version 9300 (0.0036) [2024-06-10 10:23:49,592][32177] Fps is (10 sec: 42598.6, 60 sec: 44786.1, 300 sec: 44764.4). Total num frames: 152518656. Throughput: 0: 45006.1. Samples: 152649140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 10:23:49,593][32177] Avg episode reward: [(0, '0.208')] [2024-06-10 10:23:49,630][32415] Updated weights for policy 0, policy_version 9310 (0.0038) [2024-06-10 10:23:52,449][32415] Updated weights for policy 0, policy_version 9320 (0.0028) [2024-06-10 10:23:54,592][32177] Fps is (10 sec: 44255.3, 60 sec: 44509.7, 300 sec: 44764.4). Total num frames: 152764416. Throughput: 0: 45017.7. Samples: 152913920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-10 10:23:54,592][32177] Avg episode reward: [(0, '0.211')] [2024-06-10 10:23:57,130][32415] Updated weights for policy 0, policy_version 9330 (0.0029) [2024-06-10 10:23:59,592][32177] Fps is (10 sec: 49152.7, 60 sec: 45056.2, 300 sec: 44820.0). Total num frames: 153010176. Throughput: 0: 44921.3. Samples: 153054140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 10:23:59,592][32177] Avg episode reward: [(0, '0.228')] [2024-06-10 10:23:59,599][32394] Saving new best policy, reward=0.228! [2024-06-10 10:23:59,602][32415] Updated weights for policy 0, policy_version 9340 (0.0026) [2024-06-10 10:24:04,432][32415] Updated weights for policy 0, policy_version 9350 (0.0028) [2024-06-10 10:24:04,592][32177] Fps is (10 sec: 42598.2, 60 sec: 45055.8, 300 sec: 44708.9). Total num frames: 153190400. Throughput: 0: 44893.2. Samples: 153317260. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 10:24:04,592][32177] Avg episode reward: [(0, '0.212')] [2024-06-10 10:24:06,975][32415] Updated weights for policy 0, policy_version 9360 (0.0032) [2024-06-10 10:24:09,592][32177] Fps is (10 sec: 42598.1, 60 sec: 44509.9, 300 sec: 44765.1). Total num frames: 153436160. Throughput: 0: 44540.4. Samples: 153577000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 10:24:09,592][32177] Avg episode reward: [(0, '0.225')] [2024-06-10 10:24:11,883][32415] Updated weights for policy 0, policy_version 9370 (0.0036) [2024-06-10 10:24:14,575][32415] Updated weights for policy 0, policy_version 9380 (0.0037) [2024-06-10 10:24:14,592][32177] Fps is (10 sec: 49152.9, 60 sec: 45059.3, 300 sec: 44820.0). Total num frames: 153681920. Throughput: 0: 44480.1. Samples: 153719400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 10:24:14,592][32177] Avg episode reward: [(0, '0.223')] [2024-06-10 10:24:19,003][32415] Updated weights for policy 0, policy_version 9390 (0.0039) [2024-06-10 10:24:19,592][32177] Fps is (10 sec: 42598.3, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 153862144. Throughput: 0: 44871.6. Samples: 153989820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 10:24:19,592][32177] Avg episode reward: [(0, '0.216')] [2024-06-10 10:24:21,832][32415] Updated weights for policy 0, policy_version 9400 (0.0031) [2024-06-10 10:24:24,592][32177] Fps is (10 sec: 40959.4, 60 sec: 44236.7, 300 sec: 44764.4). Total num frames: 154091520. Throughput: 0: 44904.9. Samples: 154255640. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-10 10:24:24,592][32177] Avg episode reward: [(0, '0.221')] [2024-06-10 10:24:26,432][32415] Updated weights for policy 0, policy_version 9410 (0.0022) [2024-06-10 10:24:29,281][32415] Updated weights for policy 0, policy_version 9420 (0.0024) [2024-06-10 10:24:29,592][32177] Fps is (10 sec: 49151.9, 60 sec: 45056.0, 300 sec: 44820.0). 
Total num frames: 154353664. Throughput: 0: 44497.5. Samples: 154383980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 10:24:29,601][32177] Avg episode reward: [(0, '0.227')] [2024-06-10 10:24:33,803][32415] Updated weights for policy 0, policy_version 9430 (0.0028) [2024-06-10 10:24:34,592][32177] Fps is (10 sec: 45875.6, 60 sec: 45603.1, 300 sec: 44819.9). Total num frames: 154550272. Throughput: 0: 44734.7. Samples: 154662200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 10:24:34,592][32177] Avg episode reward: [(0, '0.222')] [2024-06-10 10:24:34,605][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000009433_154550272.pth... [2024-06-10 10:24:34,660][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000008779_143835136.pth [2024-06-10 10:24:36,425][32415] Updated weights for policy 0, policy_version 9440 (0.0035) [2024-06-10 10:24:39,592][32177] Fps is (10 sec: 40956.7, 60 sec: 44509.3, 300 sec: 44708.8). Total num frames: 154763264. Throughput: 0: 44611.3. Samples: 154921460. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-10 10:24:39,593][32177] Avg episode reward: [(0, '0.213')] [2024-06-10 10:24:41,168][32415] Updated weights for policy 0, policy_version 9450 (0.0033) [2024-06-10 10:24:43,031][32394] Signal inference workers to stop experience collection... (2250 times) [2024-06-10 10:24:43,032][32394] Signal inference workers to resume experience collection... (2250 times) [2024-06-10 10:24:43,046][32415] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-06-10 10:24:43,076][32415] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-06-10 10:24:43,716][32415] Updated weights for policy 0, policy_version 9460 (0.0027) [2024-06-10 10:24:44,592][32177] Fps is (10 sec: 45875.1, 60 sec: 44786.1, 300 sec: 44764.4). Total num frames: 155009024. Throughput: 0: 44540.3. Samples: 155058460. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-10 10:24:44,592][32177] Avg episode reward: [(0, '0.218')] [2024-06-10 10:24:48,336][32415] Updated weights for policy 0, policy_version 9470 (0.0022) [2024-06-10 10:24:49,592][32177] Fps is (10 sec: 45877.7, 60 sec: 45055.8, 300 sec: 44764.4). Total num frames: 155222016. Throughput: 0: 44825.7. Samples: 155334420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 10:24:49,593][32177] Avg episode reward: [(0, '0.214')] [2024-06-10 10:24:51,128][32415] Updated weights for policy 0, policy_version 9480 (0.0032) [2024-06-10 10:24:54,592][32177] Fps is (10 sec: 39322.3, 60 sec: 43963.9, 300 sec: 44598.5). Total num frames: 155402240. Throughput: 0: 45031.2. Samples: 155603400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 10:24:54,592][32177] Avg episode reward: [(0, '0.218')] [2024-06-10 10:24:55,582][32415] Updated weights for policy 0, policy_version 9490 (0.0030) [2024-06-10 10:24:58,635][32415] Updated weights for policy 0, policy_version 9500 (0.0027) [2024-06-10 10:24:59,592][32177] Fps is (10 sec: 47514.1, 60 sec: 44782.7, 300 sec: 44764.4). Total num frames: 155697152. Throughput: 0: 44704.3. Samples: 155731100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 10:24:59,592][32177] Avg episode reward: [(0, '0.217')] [2024-06-10 10:25:02,937][32415] Updated weights for policy 0, policy_version 9510 (0.0051) [2024-06-10 10:25:04,592][32177] Fps is (10 sec: 49151.9, 60 sec: 45056.2, 300 sec: 44708.9). Total num frames: 155893760. Throughput: 0: 44659.2. Samples: 155999480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-10 10:25:04,592][32177] Avg episode reward: [(0, '0.233')] [2024-06-10 10:25:04,670][32394] Saving new best policy, reward=0.233! [2024-06-10 10:25:06,189][32415] Updated weights for policy 0, policy_version 9520 (0.0040) [2024-06-10 10:25:09,592][32177] Fps is (10 sec: 40960.5, 60 sec: 44509.8, 300 sec: 44653.3). Total num frames: 156106752. 
Throughput: 0: 44803.6. Samples: 156271800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-10 10:25:09,592][32177] Avg episode reward: [(0, '0.213')] [2024-06-10 10:25:10,134][32415] Updated weights for policy 0, policy_version 9530 (0.0034) [2024-06-10 10:25:13,185][32415] Updated weights for policy 0, policy_version 9540 (0.0021) [2024-06-10 10:25:14,592][32177] Fps is (10 sec: 45874.4, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 156352512. Throughput: 0: 44795.0. Samples: 156399760. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-10 10:25:14,592][32177] Avg episode reward: [(0, '0.214')] [2024-06-10 10:25:17,275][32415] Updated weights for policy 0, policy_version 9550 (0.0039) [2024-06-10 10:25:19,592][32177] Fps is (10 sec: 49151.8, 60 sec: 45602.1, 300 sec: 44875.5). Total num frames: 156598272. Throughput: 0: 44691.5. Samples: 156673320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-10 10:25:19,592][32177] Avg episode reward: [(0, '0.216')] [2024-06-10 10:25:20,503][32415] Updated weights for policy 0, policy_version 9560 (0.0036) [2024-06-10 10:25:24,592][32177] Fps is (10 sec: 40960.4, 60 sec: 44510.0, 300 sec: 44653.4). Total num frames: 156762112. Throughput: 0: 45196.4. Samples: 156955260. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-10 10:25:24,592][32177] Avg episode reward: [(0, '0.219')] [2024-06-10 10:25:24,780][32415] Updated weights for policy 0, policy_version 9570 (0.0027) [2024-06-10 10:25:27,996][32415] Updated weights for policy 0, policy_version 9580 (0.0033) [2024-06-10 10:25:29,592][32177] Fps is (10 sec: 42597.5, 60 sec: 44509.7, 300 sec: 44764.4). Total num frames: 157024256. Throughput: 0: 44814.9. Samples: 157075140. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-10 10:25:29,601][32177] Avg episode reward: [(0, '0.212')] [2024-06-10 10:25:32,210][32415] Updated weights for policy 0, policy_version 9590 (0.0032) [2024-06-10 10:25:34,592][32177] Fps is (10 sec: 49152.0, 60 sec: 45056.0, 300 sec: 44820.4). Total num frames: 157253632. Throughput: 0: 44564.3. Samples: 157339800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 10:25:34,601][32177] Avg episode reward: [(0, '0.218')] [2024-06-10 10:25:35,261][32415] Updated weights for policy 0, policy_version 9600 (0.0026) [2024-06-10 10:25:39,247][32415] Updated weights for policy 0, policy_version 9610 (0.0043) [2024-06-10 10:25:39,592][32177] Fps is (10 sec: 44238.2, 60 sec: 45056.6, 300 sec: 44820.0). Total num frames: 157466624. Throughput: 0: 44811.9. Samples: 157619940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 10:25:39,592][32177] Avg episode reward: [(0, '0.218')] [2024-06-10 10:25:42,760][32415] Updated weights for policy 0, policy_version 9620 (0.0030) [2024-06-10 10:25:44,592][32177] Fps is (10 sec: 42598.1, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 157679616. Throughput: 0: 44749.4. Samples: 157744820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 10:25:44,601][32177] Avg episode reward: [(0, '0.217')] [2024-06-10 10:25:46,509][32415] Updated weights for policy 0, policy_version 9630 (0.0031) [2024-06-10 10:25:49,592][32177] Fps is (10 sec: 45875.5, 60 sec: 45056.3, 300 sec: 44764.4). Total num frames: 157925376. Throughput: 0: 44857.8. Samples: 158018080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-10 10:25:49,592][32177] Avg episode reward: [(0, '0.213')] [2024-06-10 10:25:49,610][32415] Updated weights for policy 0, policy_version 9640 (0.0023) [2024-06-10 10:25:53,850][32415] Updated weights for policy 0, policy_version 9650 (0.0037) [2024-06-10 10:25:54,596][32177] Fps is (10 sec: 47493.7, 60 sec: 45871.9, 300 sec: 44874.9). 
Total num frames: 158154752. Throughput: 0: 44959.3. Samples: 158295160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-10 10:25:54,605][32177] Avg episode reward: [(0, '0.222')] [2024-06-10 10:25:56,626][32415] Updated weights for policy 0, policy_version 9660 (0.0040) [2024-06-10 10:25:59,596][32177] Fps is (10 sec: 42579.9, 60 sec: 44233.8, 300 sec: 44652.7). Total num frames: 158351360. Throughput: 0: 45024.7. Samples: 158426060. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-10 10:25:59,597][32177] Avg episode reward: [(0, '0.212')] [2024-06-10 10:26:01,350][32415] Updated weights for policy 0, policy_version 9670 (0.0043) [2024-06-10 10:26:04,111][32415] Updated weights for policy 0, policy_version 9680 (0.0026) [2024-06-10 10:26:04,596][32177] Fps is (10 sec: 44236.8, 60 sec: 45052.7, 300 sec: 44763.8). Total num frames: 158597120. Throughput: 0: 44781.6. Samples: 158688680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 10:26:04,596][32177] Avg episode reward: [(0, '0.214')] [2024-06-10 10:26:08,333][32394] Signal inference workers to stop experience collection... (2300 times) [2024-06-10 10:26:08,334][32394] Signal inference workers to resume experience collection... (2300 times) [2024-06-10 10:26:08,381][32415] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-06-10 10:26:08,381][32415] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-06-10 10:26:08,469][32415] Updated weights for policy 0, policy_version 9690 (0.0023) [2024-06-10 10:26:09,592][32177] Fps is (10 sec: 45894.8, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 158810112. Throughput: 0: 44620.4. Samples: 158963180. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 10:26:09,592][32177] Avg episode reward: [(0, '0.217')] [2024-06-10 10:26:11,401][32415] Updated weights for policy 0, policy_version 9700 (0.0045) [2024-06-10 10:26:14,596][32177] Fps is (10 sec: 40959.9, 60 sec: 44233.7, 300 sec: 44597.1). Total num frames: 159006720. Throughput: 0: 45051.5. Samples: 159102640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-10 10:26:14,597][32177] Avg episode reward: [(0, '0.219')] [2024-06-10 10:26:15,887][32415] Updated weights for policy 0, policy_version 9710 (0.0032) [2024-06-10 10:26:18,358][32415] Updated weights for policy 0, policy_version 9720 (0.0033) [2024-06-10 10:26:19,591][32177] Fps is (10 sec: 45875.8, 60 sec: 44510.0, 300 sec: 44764.5). Total num frames: 159268864. Throughput: 0: 45015.7. Samples: 159365500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-10 10:26:19,592][32177] Avg episode reward: [(0, '0.227')] [2024-06-10 10:26:23,023][32415] Updated weights for policy 0, policy_version 9730 (0.0023) [2024-06-10 10:26:24,596][32177] Fps is (10 sec: 49152.3, 60 sec: 45598.9, 300 sec: 44874.8). Total num frames: 159498240. Throughput: 0: 44732.2. Samples: 159633080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 10:26:24,597][32177] Avg episode reward: [(0, '0.213')] [2024-06-10 10:26:25,914][32415] Updated weights for policy 0, policy_version 9740 (0.0027) [2024-06-10 10:26:29,594][32177] Fps is (10 sec: 40951.0, 60 sec: 44235.5, 300 sec: 44597.8). Total num frames: 159678464. Throughput: 0: 45046.9. Samples: 159772020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 10:26:29,594][32177] Avg episode reward: [(0, '0.212')] [2024-06-10 10:26:30,559][32415] Updated weights for policy 0, policy_version 9750 (0.0026) [2024-06-10 10:26:33,478][32415] Updated weights for policy 0, policy_version 9760 (0.0026) [2024-06-10 10:26:34,592][32177] Fps is (10 sec: 42616.5, 60 sec: 44509.9, 300 sec: 44708.9). 
Total num frames: 159924224. Throughput: 0: 44808.4. Samples: 160034460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 10:26:34,592][32177] Avg episode reward: [(0, '0.217')] [2024-06-10 10:26:34,614][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000009761_159924224.pth... [2024-06-10 10:26:34,672][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000009105_149176320.pth [2024-06-10 10:26:37,815][32415] Updated weights for policy 0, policy_version 9770 (0.0038) [2024-06-10 10:26:39,592][32177] Fps is (10 sec: 50801.2, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 160186368. Throughput: 0: 44568.3. Samples: 160300540. Policy #0 lag: (min: 1.0, avg: 7.6, max: 19.0) [2024-06-10 10:26:39,592][32177] Avg episode reward: [(0, '0.216')] [2024-06-10 10:26:40,575][32415] Updated weights for policy 0, policy_version 9780 (0.0037) [2024-06-10 10:26:44,592][32177] Fps is (10 sec: 40960.4, 60 sec: 44236.9, 300 sec: 44598.4). Total num frames: 160333824. Throughput: 0: 44873.2. Samples: 160445160. Policy #0 lag: (min: 1.0, avg: 7.6, max: 19.0) [2024-06-10 10:26:44,592][32177] Avg episode reward: [(0, '0.219')] [2024-06-10 10:26:45,315][32415] Updated weights for policy 0, policy_version 9790 (0.0029) [2024-06-10 10:26:47,830][32415] Updated weights for policy 0, policy_version 9800 (0.0033) [2024-06-10 10:26:49,592][32177] Fps is (10 sec: 40959.5, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 160595968. Throughput: 0: 44833.1. Samples: 160705980. Policy #0 lag: (min: 2.0, avg: 8.5, max: 23.0) [2024-06-10 10:26:49,592][32177] Avg episode reward: [(0, '0.211')] [2024-06-10 10:26:52,413][32415] Updated weights for policy 0, policy_version 9810 (0.0027) [2024-06-10 10:26:54,592][32177] Fps is (10 sec: 50790.4, 60 sec: 44786.2, 300 sec: 44875.5). Total num frames: 160841728. Throughput: 0: 44663.6. Samples: 160973040. 
Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-10 10:26:54,592][32177] Avg episode reward: [(0, '0.227')] [2024-06-10 10:26:55,244][32415] Updated weights for policy 0, policy_version 9820 (0.0036) [2024-06-10 10:26:59,592][32177] Fps is (10 sec: 42599.0, 60 sec: 44513.1, 300 sec: 44708.9). Total num frames: 161021952. Throughput: 0: 44745.7. Samples: 161116000. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-10 10:26:59,592][32177] Avg episode reward: [(0, '0.229')] [2024-06-10 10:26:59,848][32415] Updated weights for policy 0, policy_version 9830 (0.0031) [2024-06-10 10:27:02,773][32415] Updated weights for policy 0, policy_version 9840 (0.0026) [2024-06-10 10:27:04,592][32177] Fps is (10 sec: 42597.8, 60 sec: 44513.0, 300 sec: 44764.4). Total num frames: 161267712. Throughput: 0: 44706.5. Samples: 161377300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-10 10:27:04,592][32177] Avg episode reward: [(0, '0.209')] [2024-06-10 10:27:06,946][32415] Updated weights for policy 0, policy_version 9850 (0.0032) [2024-06-10 10:27:07,999][32394] Signal inference workers to stop experience collection... (2350 times) [2024-06-10 10:27:08,021][32415] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-06-10 10:27:08,055][32394] Signal inference workers to resume experience collection... (2350 times) [2024-06-10 10:27:08,057][32415] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-06-10 10:27:09,591][32177] Fps is (10 sec: 49152.3, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 161513472. Throughput: 0: 44782.2. Samples: 161648080. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-10 10:27:09,592][32177] Avg episode reward: [(0, '0.208')] [2024-06-10 10:27:09,921][32415] Updated weights for policy 0, policy_version 9860 (0.0036) [2024-06-10 10:27:14,503][32415] Updated weights for policy 0, policy_version 9870 (0.0042) [2024-06-10 10:27:14,592][32177] Fps is (10 sec: 44237.4, 60 sec: 45059.3, 300 sec: 44764.4). Total num frames: 161710080. Throughput: 0: 44875.9. Samples: 161791340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-10 10:27:14,592][32177] Avg episode reward: [(0, '0.217')] [2024-06-10 10:27:17,774][32415] Updated weights for policy 0, policy_version 9880 (0.0022) [2024-06-10 10:27:19,592][32177] Fps is (10 sec: 42597.2, 60 sec: 44509.7, 300 sec: 44764.4). Total num frames: 161939456. Throughput: 0: 45039.9. Samples: 162061260. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-10 10:27:19,592][32177] Avg episode reward: [(0, '0.209')] [2024-06-10 10:27:21,629][32415] Updated weights for policy 0, policy_version 9890 (0.0028) [2024-06-10 10:27:24,592][32177] Fps is (10 sec: 45875.2, 60 sec: 44513.1, 300 sec: 44820.0). Total num frames: 162168832. Throughput: 0: 44934.2. Samples: 162322580. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-10 10:27:24,592][32177] Avg episode reward: [(0, '0.232')] [2024-06-10 10:27:24,962][32415] Updated weights for policy 0, policy_version 9900 (0.0022) [2024-06-10 10:27:28,963][32415] Updated weights for policy 0, policy_version 9910 (0.0026) [2024-06-10 10:27:29,592][32177] Fps is (10 sec: 47514.0, 60 sec: 45603.7, 300 sec: 44875.5). Total num frames: 162414592. Throughput: 0: 44818.5. Samples: 162462000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-10 10:27:29,592][32177] Avg episode reward: [(0, '0.220')] [2024-06-10 10:27:31,915][32415] Updated weights for policy 0, policy_version 9920 (0.0028) [2024-06-10 10:27:34,595][32177] Fps is (10 sec: 44222.0, 60 sec: 44780.5, 300 sec: 44652.9). 
Total num frames: 162611200. Throughput: 0: 45050.1. Samples: 162733380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-10 10:27:34,595][32177] Avg episode reward: [(0, '0.218')] [2024-06-10 10:27:36,185][32415] Updated weights for policy 0, policy_version 9930 (0.0042) [2024-06-10 10:27:39,239][32415] Updated weights for policy 0, policy_version 9940 (0.0025) [2024-06-10 10:27:39,592][32177] Fps is (10 sec: 44237.4, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 162856960. Throughput: 0: 45183.1. Samples: 163006280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-10 10:27:39,592][32177] Avg episode reward: [(0, '0.224')] [2024-06-10 10:27:43,319][32415] Updated weights for policy 0, policy_version 9950 (0.0032) [2024-06-10 10:27:44,592][32177] Fps is (10 sec: 50806.4, 60 sec: 46421.2, 300 sec: 45042.8). Total num frames: 163119104. Throughput: 0: 45152.2. Samples: 163147860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 10:27:44,592][32177] Avg episode reward: [(0, '0.234')] [2024-06-10 10:27:46,690][32415] Updated weights for policy 0, policy_version 9960 (0.0041) [2024-06-10 10:27:49,592][32177] Fps is (10 sec: 42597.7, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 163282944. Throughput: 0: 45379.1. Samples: 163419360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 10:27:49,592][32177] Avg episode reward: [(0, '0.211')] [2024-06-10 10:27:50,679][32415] Updated weights for policy 0, policy_version 9970 (0.0031) [2024-06-10 10:27:53,913][32415] Updated weights for policy 0, policy_version 9980 (0.0036) [2024-06-10 10:27:54,592][32177] Fps is (10 sec: 40960.6, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 163528704. Throughput: 0: 45115.4. Samples: 163678280. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-10 10:27:54,592][32177] Avg episode reward: [(0, '0.231')] [2024-06-10 10:27:57,879][32415] Updated weights for policy 0, policy_version 9990 (0.0036) [2024-06-10 10:27:59,592][32177] Fps is (10 sec: 49151.7, 60 sec: 45875.0, 300 sec: 45042.1). Total num frames: 163774464. Throughput: 0: 45009.1. Samples: 163816760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 10:27:59,592][32177] Avg episode reward: [(0, '0.216')] [2024-06-10 10:28:00,978][32415] Updated weights for policy 0, policy_version 10000 (0.0037) [2024-06-10 10:28:04,592][32177] Fps is (10 sec: 42598.0, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 163954688. Throughput: 0: 45005.4. Samples: 164086500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 10:28:04,592][32177] Avg episode reward: [(0, '0.227')] [2024-06-10 10:28:05,116][32415] Updated weights for policy 0, policy_version 10010 (0.0035) [2024-06-10 10:28:08,377][32415] Updated weights for policy 0, policy_version 10020 (0.0036) [2024-06-10 10:28:09,592][32177] Fps is (10 sec: 40959.7, 60 sec: 44509.6, 300 sec: 44765.0). Total num frames: 164184064. Throughput: 0: 45199.7. Samples: 164356580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-10 10:28:09,592][32177] Avg episode reward: [(0, '0.222')] [2024-06-10 10:28:12,350][32415] Updated weights for policy 0, policy_version 10030 (0.0029) [2024-06-10 10:28:13,346][32394] Signal inference workers to stop experience collection... (2400 times) [2024-06-10 10:28:13,374][32415] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-06-10 10:28:13,450][32394] Signal inference workers to resume experience collection... (2400 times) [2024-06-10 10:28:13,451][32415] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-06-10 10:28:14,592][32177] Fps is (10 sec: 50790.4, 60 sec: 45875.1, 300 sec: 45097.6). Total num frames: 164462592. Throughput: 0: 45163.1. 
Samples: 164494340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-10 10:28:14,592][32177] Avg episode reward: [(0, '0.229')] [2024-06-10 10:28:15,969][32415] Updated weights for policy 0, policy_version 10040 (0.0027) [2024-06-10 10:28:19,592][32177] Fps is (10 sec: 44238.2, 60 sec: 44783.1, 300 sec: 44708.9). Total num frames: 164626432. Throughput: 0: 45002.5. Samples: 164758340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-10 10:28:19,592][32177] Avg episode reward: [(0, '0.226')] [2024-06-10 10:28:19,782][32415] Updated weights for policy 0, policy_version 10050 (0.0028) [2024-06-10 10:28:23,248][32415] Updated weights for policy 0, policy_version 10060 (0.0034) [2024-06-10 10:28:24,592][32177] Fps is (10 sec: 39322.0, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 164855808. Throughput: 0: 44760.4. Samples: 165020500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 10:28:24,592][32177] Avg episode reward: [(0, '0.231')] [2024-06-10 10:28:27,195][32415] Updated weights for policy 0, policy_version 10070 (0.0037) [2024-06-10 10:28:29,592][32177] Fps is (10 sec: 50789.9, 60 sec: 45329.1, 300 sec: 45153.4). Total num frames: 165134336. Throughput: 0: 44665.4. Samples: 165157800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 10:28:29,592][32177] Avg episode reward: [(0, '0.222')] [2024-06-10 10:28:30,294][32415] Updated weights for policy 0, policy_version 10080 (0.0029) [2024-06-10 10:28:34,383][32415] Updated weights for policy 0, policy_version 10090 (0.0033) [2024-06-10 10:28:34,592][32177] Fps is (10 sec: 47512.9, 60 sec: 45331.5, 300 sec: 44875.5). Total num frames: 165330944. Throughput: 0: 44648.9. Samples: 165428560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-10 10:28:34,593][32177] Avg episode reward: [(0, '0.233')] [2024-06-10 10:28:34,605][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000010091_165330944.pth... 
[2024-06-10 10:28:34,657][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000009433_154550272.pth [2024-06-10 10:28:37,646][32415] Updated weights for policy 0, policy_version 10100 (0.0030) [2024-06-10 10:28:39,592][32177] Fps is (10 sec: 40959.7, 60 sec: 44782.8, 300 sec: 44820.6). Total num frames: 165543936. Throughput: 0: 44839.9. Samples: 165696080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 10:28:39,592][32177] Avg episode reward: [(0, '0.220')] [2024-06-10 10:28:41,570][32415] Updated weights for policy 0, policy_version 10110 (0.0033) [2024-06-10 10:28:44,592][32177] Fps is (10 sec: 45875.4, 60 sec: 44509.9, 300 sec: 44986.6). Total num frames: 165789696. Throughput: 0: 44730.8. Samples: 165829640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 10:28:44,592][32177] Avg episode reward: [(0, '0.224')] [2024-06-10 10:28:45,107][32415] Updated weights for policy 0, policy_version 10120 (0.0046) [2024-06-10 10:28:49,420][32415] Updated weights for policy 0, policy_version 10130 (0.0033) [2024-06-10 10:28:49,592][32177] Fps is (10 sec: 44237.0, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 165986304. Throughput: 0: 44540.9. Samples: 166090840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 10:28:49,592][32177] Avg episode reward: [(0, '0.226')] [2024-06-10 10:28:52,412][32415] Updated weights for policy 0, policy_version 10140 (0.0032) [2024-06-10 10:28:54,592][32177] Fps is (10 sec: 39322.1, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 166182912. Throughput: 0: 44561.2. Samples: 166361820. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 10:28:54,592][32177] Avg episode reward: [(0, '0.224')] [2024-06-10 10:28:56,563][32415] Updated weights for policy 0, policy_version 10150 (0.0031) [2024-06-10 10:28:59,379][32415] Updated weights for policy 0, policy_version 10160 (0.0030) [2024-06-10 10:28:59,592][32177] Fps is (10 sec: 47514.2, 60 sec: 44783.1, 300 sec: 44986.6). Total num frames: 166461440. Throughput: 0: 44477.5. Samples: 166495820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-10 10:28:59,592][32177] Avg episode reward: [(0, '0.229')] [2024-06-10 10:29:03,713][32415] Updated weights for policy 0, policy_version 10170 (0.0033) [2024-06-10 10:29:04,592][32177] Fps is (10 sec: 45875.3, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 166641664. Throughput: 0: 44632.9. Samples: 166766820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-10 10:29:04,592][32177] Avg episode reward: [(0, '0.216')] [2024-06-10 10:29:06,572][32415] Updated weights for policy 0, policy_version 10180 (0.0046) [2024-06-10 10:29:09,592][32177] Fps is (10 sec: 40959.9, 60 sec: 44783.2, 300 sec: 44708.9). Total num frames: 166871040. Throughput: 0: 44705.4. Samples: 167032240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-10 10:29:09,592][32177] Avg episode reward: [(0, '0.221')] [2024-06-10 10:29:11,082][32415] Updated weights for policy 0, policy_version 10190 (0.0037) [2024-06-10 10:29:14,437][32415] Updated weights for policy 0, policy_version 10200 (0.0041) [2024-06-10 10:29:14,592][32177] Fps is (10 sec: 47512.9, 60 sec: 44236.8, 300 sec: 44931.0). Total num frames: 167116800. Throughput: 0: 44665.3. Samples: 167167740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-10 10:29:14,592][32177] Avg episode reward: [(0, '0.230')] [2024-06-10 10:29:18,629][32415] Updated weights for policy 0, policy_version 10210 (0.0027) [2024-06-10 10:29:19,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45056.0, 300 sec: 44875.5). 
Total num frames: 167329792. Throughput: 0: 44538.8. Samples: 167432800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-10 10:29:19,592][32177] Avg episode reward: [(0, '0.221')] [2024-06-10 10:29:21,822][32415] Updated weights for policy 0, policy_version 10220 (0.0025) [2024-06-10 10:29:24,592][32177] Fps is (10 sec: 40960.6, 60 sec: 44509.9, 300 sec: 44653.4). Total num frames: 167526400. Throughput: 0: 44565.0. Samples: 167701500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-10 10:29:24,592][32177] Avg episode reward: [(0, '0.228')] [2024-06-10 10:29:25,842][32415] Updated weights for policy 0, policy_version 10230 (0.0039) [2024-06-10 10:29:28,894][32415] Updated weights for policy 0, policy_version 10240 (0.0024) [2024-06-10 10:29:29,592][32177] Fps is (10 sec: 44236.6, 60 sec: 43963.8, 300 sec: 44820.0). Total num frames: 167772160. Throughput: 0: 44480.5. Samples: 167831260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-10 10:29:29,592][32177] Avg episode reward: [(0, '0.231')] [2024-06-10 10:29:33,296][32394] Signal inference workers to stop experience collection... (2450 times) [2024-06-10 10:29:33,348][32394] Signal inference workers to resume experience collection... (2450 times) [2024-06-10 10:29:33,352][32415] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-06-10 10:29:33,355][32415] Updated weights for policy 0, policy_version 10250 (0.0037) [2024-06-10 10:29:33,372][32415] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-06-10 10:29:34,592][32177] Fps is (10 sec: 49151.1, 60 sec: 44782.9, 300 sec: 44931.1). Total num frames: 168017920. Throughput: 0: 44920.4. Samples: 168112260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-10 10:29:34,592][32177] Avg episode reward: [(0, '0.235')] [2024-06-10 10:29:34,607][32394] Saving new best policy, reward=0.235! 
[2024-06-10 10:29:36,127][32415] Updated weights for policy 0, policy_version 10260 (0.0026) [2024-06-10 10:29:39,592][32177] Fps is (10 sec: 42597.8, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 168198144. Throughput: 0: 44635.4. Samples: 168370420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 10:29:39,592][32177] Avg episode reward: [(0, '0.220')] [2024-06-10 10:29:40,523][32415] Updated weights for policy 0, policy_version 10270 (0.0027) [2024-06-10 10:29:43,943][32415] Updated weights for policy 0, policy_version 10280 (0.0025) [2024-06-10 10:29:44,592][32177] Fps is (10 sec: 40960.8, 60 sec: 43963.8, 300 sec: 44764.5). Total num frames: 168427520. Throughput: 0: 44588.0. Samples: 168502280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 10:29:44,592][32177] Avg episode reward: [(0, '0.215')] [2024-06-10 10:29:47,958][32415] Updated weights for policy 0, policy_version 10290 (0.0029) [2024-06-10 10:29:49,592][32177] Fps is (10 sec: 49151.6, 60 sec: 45055.9, 300 sec: 45042.1). Total num frames: 168689664. Throughput: 0: 44479.7. Samples: 168768420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 10:29:49,592][32177] Avg episode reward: [(0, '0.218')] [2024-06-10 10:29:51,401][32415] Updated weights for policy 0, policy_version 10300 (0.0032) [2024-06-10 10:29:54,592][32177] Fps is (10 sec: 44235.2, 60 sec: 44782.7, 300 sec: 44653.3). Total num frames: 168869888. Throughput: 0: 44675.6. Samples: 169042660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 10:29:54,592][32177] Avg episode reward: [(0, '0.233')] [2024-06-10 10:29:55,192][32415] Updated weights for policy 0, policy_version 10310 (0.0027) [2024-06-10 10:29:58,372][32415] Updated weights for policy 0, policy_version 10320 (0.0026) [2024-06-10 10:29:59,592][32177] Fps is (10 sec: 40960.7, 60 sec: 43963.7, 300 sec: 44764.4). Total num frames: 169099264. Throughput: 0: 44552.9. Samples: 169172620. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 10:29:59,592][32177] Avg episode reward: [(0, '0.223')] [2024-06-10 10:30:02,703][32415] Updated weights for policy 0, policy_version 10330 (0.0027) [2024-06-10 10:30:04,592][32177] Fps is (10 sec: 49153.0, 60 sec: 45329.0, 300 sec: 44931.0). Total num frames: 169361408. Throughput: 0: 44782.5. Samples: 169448020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 10:30:04,592][32177] Avg episode reward: [(0, '0.233')] [2024-06-10 10:30:06,027][32415] Updated weights for policy 0, policy_version 10340 (0.0030) [2024-06-10 10:30:09,596][32177] Fps is (10 sec: 44217.9, 60 sec: 44506.6, 300 sec: 44708.2). Total num frames: 169541632. Throughput: 0: 44710.8. Samples: 169713680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 10:30:09,597][32177] Avg episode reward: [(0, '0.230')] [2024-06-10 10:30:10,026][32415] Updated weights for policy 0, policy_version 10350 (0.0032) [2024-06-10 10:30:13,444][32415] Updated weights for policy 0, policy_version 10360 (0.0039) [2024-06-10 10:30:14,592][32177] Fps is (10 sec: 40960.5, 60 sec: 44236.9, 300 sec: 44653.4). Total num frames: 169771008. Throughput: 0: 44594.7. Samples: 169838020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 10:30:14,592][32177] Avg episode reward: [(0, '0.230')] [2024-06-10 10:30:17,607][32415] Updated weights for policy 0, policy_version 10370 (0.0030) [2024-06-10 10:30:19,592][32177] Fps is (10 sec: 47534.1, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 170016768. Throughput: 0: 44403.7. Samples: 170110420. Policy #0 lag: (min: 2.0, avg: 11.9, max: 23.0) [2024-06-10 10:30:19,592][32177] Avg episode reward: [(0, '0.231')] [2024-06-10 10:30:20,737][32415] Updated weights for policy 0, policy_version 10380 (0.0033) [2024-06-10 10:30:24,592][32177] Fps is (10 sec: 44236.7, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 170213376. Throughput: 0: 44695.2. Samples: 170381700. 
Policy #0 lag: (min: 2.0, avg: 11.9, max: 23.0) [2024-06-10 10:30:24,592][32177] Avg episode reward: [(0, '0.242')] [2024-06-10 10:30:24,641][32394] Saving new best policy, reward=0.242! [2024-06-10 10:30:24,647][32415] Updated weights for policy 0, policy_version 10390 (0.0028) [2024-06-10 10:30:28,049][32415] Updated weights for policy 0, policy_version 10400 (0.0036) [2024-06-10 10:30:29,592][32177] Fps is (10 sec: 42597.8, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 170442752. Throughput: 0: 44605.6. Samples: 170509540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 10:30:29,592][32177] Avg episode reward: [(0, '0.226')] [2024-06-10 10:30:31,860][32415] Updated weights for policy 0, policy_version 10410 (0.0039) [2024-06-10 10:30:34,592][32177] Fps is (10 sec: 47513.3, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 170688512. Throughput: 0: 44689.9. Samples: 170779460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 10:30:34,592][32177] Avg episode reward: [(0, '0.229')] [2024-06-10 10:30:34,618][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000010418_170688512.pth... [2024-06-10 10:30:34,679][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000009761_159924224.pth [2024-06-10 10:30:35,327][32415] Updated weights for policy 0, policy_version 10420 (0.0036) [2024-06-10 10:30:38,998][32394] Signal inference workers to stop experience collection... (2500 times) [2024-06-10 10:30:38,999][32394] Signal inference workers to resume experience collection... (2500 times) [2024-06-10 10:30:39,024][32415] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-06-10 10:30:39,024][32415] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-06-10 10:30:39,131][32415] Updated weights for policy 0, policy_version 10430 (0.0030) [2024-06-10 10:30:39,592][32177] Fps is (10 sec: 44237.7, 60 sec: 44783.1, 300 sec: 44764.4). 
Total num frames: 170885120. Throughput: 0: 44642.6. Samples: 171051560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-10 10:30:39,592][32177] Avg episode reward: [(0, '0.227')] [2024-06-10 10:30:42,677][32415] Updated weights for policy 0, policy_version 10440 (0.0032) [2024-06-10 10:30:44,592][32177] Fps is (10 sec: 42598.0, 60 sec: 44782.8, 300 sec: 44708.9). Total num frames: 171114496. Throughput: 0: 44538.6. Samples: 171176860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:30:44,592][32177] Avg episode reward: [(0, '0.226')] [2024-06-10 10:30:46,880][32415] Updated weights for policy 0, policy_version 10450 (0.0027) [2024-06-10 10:30:49,592][32177] Fps is (10 sec: 45875.1, 60 sec: 44237.0, 300 sec: 44709.5). Total num frames: 171343872. Throughput: 0: 44301.0. Samples: 171441560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:30:49,592][32177] Avg episode reward: [(0, '0.238')] [2024-06-10 10:30:50,154][32415] Updated weights for policy 0, policy_version 10460 (0.0032) [2024-06-10 10:30:53,951][32415] Updated weights for policy 0, policy_version 10470 (0.0031) [2024-06-10 10:30:54,592][32177] Fps is (10 sec: 45876.1, 60 sec: 45056.3, 300 sec: 44820.6). Total num frames: 171573248. Throughput: 0: 44465.6. Samples: 171714440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 10:30:54,592][32177] Avg episode reward: [(0, '0.229')] [2024-06-10 10:30:57,270][32415] Updated weights for policy 0, policy_version 10480 (0.0020) [2024-06-10 10:30:59,592][32177] Fps is (10 sec: 44236.7, 60 sec: 44783.0, 300 sec: 44709.5). Total num frames: 171786240. Throughput: 0: 44787.5. Samples: 171853460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 10:30:59,592][32177] Avg episode reward: [(0, '0.234')] [2024-06-10 10:31:01,142][32415] Updated weights for policy 0, policy_version 10490 (0.0021) [2024-06-10 10:31:04,592][32177] Fps is (10 sec: 44236.3, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 172015616. 
Throughput: 0: 44722.6. Samples: 172122940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 10:31:04,592][32177] Avg episode reward: [(0, '0.230')] [2024-06-10 10:31:04,654][32415] Updated weights for policy 0, policy_version 10500 (0.0032) [2024-06-10 10:31:08,602][32415] Updated weights for policy 0, policy_version 10510 (0.0032) [2024-06-10 10:31:09,592][32177] Fps is (10 sec: 45874.8, 60 sec: 45059.2, 300 sec: 44876.1). Total num frames: 172244992. Throughput: 0: 44763.0. Samples: 172396040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 10:31:09,592][32177] Avg episode reward: [(0, '0.235')] [2024-06-10 10:31:11,972][32415] Updated weights for policy 0, policy_version 10520 (0.0049) [2024-06-10 10:31:14,592][32177] Fps is (10 sec: 44237.3, 60 sec: 44783.0, 300 sec: 44708.9). Total num frames: 172457984. Throughput: 0: 44731.7. Samples: 172522460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 10:31:14,592][32177] Avg episode reward: [(0, '0.240')] [2024-06-10 10:31:16,061][32415] Updated weights for policy 0, policy_version 10530 (0.0034) [2024-06-10 10:31:19,251][32415] Updated weights for policy 0, policy_version 10540 (0.0032) [2024-06-10 10:31:19,596][32177] Fps is (10 sec: 44218.1, 60 sec: 44506.7, 300 sec: 44708.9). Total num frames: 172687360. Throughput: 0: 44675.8. Samples: 172790060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-10 10:31:19,597][32177] Avg episode reward: [(0, '0.231')] [2024-06-10 10:31:23,111][32415] Updated weights for policy 0, policy_version 10550 (0.0036) [2024-06-10 10:31:24,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45056.0, 300 sec: 44875.8). Total num frames: 172916736. Throughput: 0: 44630.7. Samples: 173059940. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-10 10:31:24,592][32177] Avg episode reward: [(0, '0.220')] [2024-06-10 10:31:26,222][32415] Updated weights for policy 0, policy_version 10560 (0.0028) [2024-06-10 10:31:29,592][32177] Fps is (10 sec: 44254.5, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 173129728. Throughput: 0: 44965.6. Samples: 173200320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:31:29,593][32177] Avg episode reward: [(0, '0.229')] [2024-06-10 10:31:30,246][32415] Updated weights for policy 0, policy_version 10570 (0.0038) [2024-06-10 10:31:33,888][32415] Updated weights for policy 0, policy_version 10580 (0.0035) [2024-06-10 10:31:34,592][32177] Fps is (10 sec: 44236.6, 60 sec: 44509.9, 300 sec: 44653.3). Total num frames: 173359104. Throughput: 0: 44885.3. Samples: 173461400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 10:31:34,592][32177] Avg episode reward: [(0, '0.234')] [2024-06-10 10:31:37,663][32415] Updated weights for policy 0, policy_version 10590 (0.0025) [2024-06-10 10:31:39,591][32177] Fps is (10 sec: 44238.6, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 173572096. Throughput: 0: 44798.3. Samples: 173730360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 10:31:39,592][32177] Avg episode reward: [(0, '0.230')] [2024-06-10 10:31:41,355][32415] Updated weights for policy 0, policy_version 10600 (0.0029) [2024-06-10 10:31:44,592][32177] Fps is (10 sec: 44237.0, 60 sec: 44783.1, 300 sec: 44764.4). Total num frames: 173801472. Throughput: 0: 44718.7. Samples: 173865800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 10:31:44,592][32177] Avg episode reward: [(0, '0.240')] [2024-06-10 10:31:45,294][32415] Updated weights for policy 0, policy_version 10610 (0.0026) [2024-06-10 10:31:48,777][32415] Updated weights for policy 0, policy_version 10620 (0.0032) [2024-06-10 10:31:49,592][32177] Fps is (10 sec: 44236.5, 60 sec: 44509.9, 300 sec: 44653.3). 
Total num frames: 174014464. Throughput: 0: 44545.4. Samples: 174127480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 10:31:49,592][32177] Avg episode reward: [(0, '0.224')] [2024-06-10 10:31:52,527][32415] Updated weights for policy 0, policy_version 10630 (0.0040) [2024-06-10 10:31:54,592][32177] Fps is (10 sec: 45875.0, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 174260224. Throughput: 0: 44496.5. Samples: 174398380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 10:31:54,592][32177] Avg episode reward: [(0, '0.229')] [2024-06-10 10:31:55,775][32415] Updated weights for policy 0, policy_version 10640 (0.0025) [2024-06-10 10:31:59,592][32177] Fps is (10 sec: 45875.3, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 174473216. Throughput: 0: 44929.3. Samples: 174544280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 10:31:59,592][32177] Avg episode reward: [(0, '0.239')] [2024-06-10 10:31:59,626][32415] Updated weights for policy 0, policy_version 10650 (0.0027) [2024-06-10 10:32:02,972][32415] Updated weights for policy 0, policy_version 10660 (0.0035) [2024-06-10 10:32:04,592][32177] Fps is (10 sec: 42597.6, 60 sec: 44509.8, 300 sec: 44653.3). Total num frames: 174686208. Throughput: 0: 44706.8. Samples: 174801680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-10 10:32:04,592][32177] Avg episode reward: [(0, '0.224')] [2024-06-10 10:32:06,828][32415] Updated weights for policy 0, policy_version 10670 (0.0031) [2024-06-10 10:32:09,592][32177] Fps is (10 sec: 45874.3, 60 sec: 44782.9, 300 sec: 44819.9). Total num frames: 174931968. Throughput: 0: 44690.9. Samples: 175071040. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-10 10:32:09,592][32177] Avg episode reward: [(0, '0.239')] [2024-06-10 10:32:10,595][32415] Updated weights for policy 0, policy_version 10680 (0.0042) [2024-06-10 10:32:14,592][32177] Fps is (10 sec: 44237.8, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 175128576. 
Throughput: 0: 44660.4. Samples: 175210020. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-10 10:32:14,592][32177] Avg episode reward: [(0, '0.233')] [2024-06-10 10:32:14,614][32415] Updated weights for policy 0, policy_version 10690 (0.0025) [2024-06-10 10:32:17,039][32394] Signal inference workers to stop experience collection... (2550 times) [2024-06-10 10:32:17,040][32394] Signal inference workers to resume experience collection... (2550 times) [2024-06-10 10:32:17,051][32415] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-06-10 10:32:17,052][32415] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-06-10 10:32:17,841][32415] Updated weights for policy 0, policy_version 10700 (0.0029) [2024-06-10 10:32:19,592][32177] Fps is (10 sec: 40960.5, 60 sec: 44240.0, 300 sec: 44653.3). Total num frames: 175341568. Throughput: 0: 44585.3. Samples: 175467740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-10 10:32:19,592][32177] Avg episode reward: [(0, '0.225')] [2024-06-10 10:32:21,857][32415] Updated weights for policy 0, policy_version 10710 (0.0026) [2024-06-10 10:32:24,592][32177] Fps is (10 sec: 47513.4, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 175603712. Throughput: 0: 44694.1. Samples: 175741600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-10 10:32:24,592][32177] Avg episode reward: [(0, '0.242')] [2024-06-10 10:32:24,922][32415] Updated weights for policy 0, policy_version 10720 (0.0029) [2024-06-10 10:32:29,411][32415] Updated weights for policy 0, policy_version 10730 (0.0037) [2024-06-10 10:32:29,592][32177] Fps is (10 sec: 47513.5, 60 sec: 44783.1, 300 sec: 44764.9). Total num frames: 175816704. Throughput: 0: 44756.8. Samples: 175879860. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-10 10:32:29,592][32177] Avg episode reward: [(0, '0.234')] [2024-06-10 10:32:32,406][32415] Updated weights for policy 0, policy_version 10740 (0.0047) [2024-06-10 10:32:34,592][32177] Fps is (10 sec: 40960.1, 60 sec: 44236.8, 300 sec: 44597.8). Total num frames: 176013312. Throughput: 0: 44875.6. Samples: 176146880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-10 10:32:34,592][32177] Avg episode reward: [(0, '0.236')] [2024-06-10 10:32:34,670][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000010744_176029696.pth... [2024-06-10 10:32:34,721][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000010091_165330944.pth [2024-06-10 10:32:36,521][32415] Updated weights for policy 0, policy_version 10750 (0.0043) [2024-06-10 10:32:39,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45055.9, 300 sec: 44597.8). Total num frames: 176275456. Throughput: 0: 44512.4. Samples: 176401440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-10 10:32:39,592][32177] Avg episode reward: [(0, '0.233')] [2024-06-10 10:32:39,929][32415] Updated weights for policy 0, policy_version 10760 (0.0032) [2024-06-10 10:32:44,158][32415] Updated weights for policy 0, policy_version 10770 (0.0032) [2024-06-10 10:32:44,592][32177] Fps is (10 sec: 47513.1, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 176488448. Throughput: 0: 44348.8. Samples: 176539980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 10:32:44,592][32177] Avg episode reward: [(0, '0.222')] [2024-06-10 10:32:47,057][32415] Updated weights for policy 0, policy_version 10780 (0.0035) [2024-06-10 10:32:49,592][32177] Fps is (10 sec: 42597.5, 60 sec: 44782.7, 300 sec: 44653.3). Total num frames: 176701440. Throughput: 0: 44695.5. Samples: 176812980. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 10:32:49,592][32177] Avg episode reward: [(0, '0.228')] [2024-06-10 10:32:51,180][32415] Updated weights for policy 0, policy_version 10790 (0.0043) [2024-06-10 10:32:54,053][32415] Updated weights for policy 0, policy_version 10800 (0.0028) [2024-06-10 10:32:54,592][32177] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 44653.4). Total num frames: 176947200. Throughput: 0: 44566.7. Samples: 177076540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 10:32:54,592][32177] Avg episode reward: [(0, '0.226')] [2024-06-10 10:32:58,752][32415] Updated weights for policy 0, policy_version 10810 (0.0027) [2024-06-10 10:32:59,592][32177] Fps is (10 sec: 45876.5, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 177160192. Throughput: 0: 44500.0. Samples: 177212520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 10:32:59,592][32177] Avg episode reward: [(0, '0.233')] [2024-06-10 10:33:01,414][32415] Updated weights for policy 0, policy_version 10820 (0.0036) [2024-06-10 10:33:04,592][32177] Fps is (10 sec: 42598.9, 60 sec: 44783.1, 300 sec: 44708.9). Total num frames: 177373184. Throughput: 0: 44896.5. Samples: 177488080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 10:33:04,592][32177] Avg episode reward: [(0, '0.223')] [2024-06-10 10:33:05,891][32415] Updated weights for policy 0, policy_version 10830 (0.0021) [2024-06-10 10:33:09,054][32415] Updated weights for policy 0, policy_version 10840 (0.0034) [2024-06-10 10:33:09,592][32177] Fps is (10 sec: 44236.6, 60 sec: 44510.0, 300 sec: 44542.3). Total num frames: 177602560. Throughput: 0: 44473.8. Samples: 177742920. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-10 10:33:09,592][32177] Avg episode reward: [(0, '0.232')] [2024-06-10 10:33:13,633][32415] Updated weights for policy 0, policy_version 10850 (0.0029) [2024-06-10 10:33:14,592][32177] Fps is (10 sec: 44236.9, 60 sec: 44782.9, 300 sec: 44708.9). 
Total num frames: 177815552. Throughput: 0: 44441.4. Samples: 177879720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-10 10:33:14,592][32177] Avg episode reward: [(0, '0.232')] [2024-06-10 10:33:16,503][32415] Updated weights for policy 0, policy_version 10860 (0.0036) [2024-06-10 10:33:19,591][32177] Fps is (10 sec: 42598.9, 60 sec: 44783.0, 300 sec: 44653.4). Total num frames: 178028544. Throughput: 0: 44586.3. Samples: 178153260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-10 10:33:19,592][32177] Avg episode reward: [(0, '0.244')] [2024-06-10 10:33:19,668][32394] Saving new best policy, reward=0.244! [2024-06-10 10:33:20,713][32415] Updated weights for policy 0, policy_version 10870 (0.0029) [2024-06-10 10:33:23,539][32415] Updated weights for policy 0, policy_version 10880 (0.0031) [2024-06-10 10:33:24,592][32177] Fps is (10 sec: 44236.7, 60 sec: 44236.8, 300 sec: 44486.7). Total num frames: 178257920. Throughput: 0: 44793.4. Samples: 178417140. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-10 10:33:24,592][32177] Avg episode reward: [(0, '0.246')] [2024-06-10 10:33:24,599][32394] Saving new best policy, reward=0.246! [2024-06-10 10:33:28,215][32415] Updated weights for policy 0, policy_version 10890 (0.0018) [2024-06-10 10:33:29,592][32177] Fps is (10 sec: 47513.3, 60 sec: 44783.0, 300 sec: 44653.4). Total num frames: 178503680. Throughput: 0: 44925.5. Samples: 178561620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-10 10:33:29,592][32177] Avg episode reward: [(0, '0.227')] [2024-06-10 10:33:30,730][32415] Updated weights for policy 0, policy_version 10900 (0.0031) [2024-06-10 10:33:34,592][32177] Fps is (10 sec: 45874.7, 60 sec: 45055.9, 300 sec: 44653.3). Total num frames: 178716672. Throughput: 0: 44694.8. Samples: 178824240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-10 10:33:34,592][32177] Avg episode reward: [(0, '0.243')] [2024-06-10 10:33:35,643][32415] Updated weights for policy 0, policy_version 10910 (0.0037) [2024-06-10 10:33:38,080][32394] Signal inference workers to stop experience collection... (2600 times) [2024-06-10 10:33:38,119][32415] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-06-10 10:33:38,124][32394] Signal inference workers to resume experience collection... (2600 times) [2024-06-10 10:33:38,132][32415] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-06-10 10:33:38,289][32415] Updated weights for policy 0, policy_version 10920 (0.0038) [2024-06-10 10:33:39,591][32177] Fps is (10 sec: 42598.6, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 178929664. Throughput: 0: 44627.7. Samples: 179084780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-10 10:33:39,592][32177] Avg episode reward: [(0, '0.224')] [2024-06-10 10:33:42,810][32415] Updated weights for policy 0, policy_version 10930 (0.0026) [2024-06-10 10:33:44,592][32177] Fps is (10 sec: 45875.3, 60 sec: 44783.0, 300 sec: 44708.9). Total num frames: 179175424. Throughput: 0: 44605.7. Samples: 179219780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 10:33:44,592][32177] Avg episode reward: [(0, '0.233')] [2024-06-10 10:33:45,843][32415] Updated weights for policy 0, policy_version 10940 (0.0035) [2024-06-10 10:33:49,592][32177] Fps is (10 sec: 44235.6, 60 sec: 44509.9, 300 sec: 44708.8). Total num frames: 179372032. Throughput: 0: 44506.9. Samples: 179490900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 10:33:49,592][32177] Avg episode reward: [(0, '0.226')] [2024-06-10 10:33:50,075][32415] Updated weights for policy 0, policy_version 10950 (0.0034) [2024-06-10 10:33:52,896][32415] Updated weights for policy 0, policy_version 10960 (0.0036) [2024-06-10 10:33:54,592][32177] Fps is (10 sec: 42598.8, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 179601408. Throughput: 0: 45036.9. Samples: 179769580. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-10 10:33:54,592][32177] Avg episode reward: [(0, '0.222')] [2024-06-10 10:33:57,463][32415] Updated weights for policy 0, policy_version 10970 (0.0043) [2024-06-10 10:33:59,592][32177] Fps is (10 sec: 49153.1, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 179863552. Throughput: 0: 45010.2. Samples: 179905180. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-06-10 10:33:59,592][32177] Avg episode reward: [(0, '0.247')] [2024-06-10 10:34:00,351][32415] Updated weights for policy 0, policy_version 10980 (0.0043) [2024-06-10 10:34:04,592][32177] Fps is (10 sec: 44236.8, 60 sec: 44509.9, 300 sec: 44653.3). Total num frames: 180043776. Throughput: 0: 44747.9. Samples: 180166920. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-06-10 10:34:04,592][32177] Avg episode reward: [(0, '0.231')] [2024-06-10 10:34:04,663][32415] Updated weights for policy 0, policy_version 10990 (0.0036) [2024-06-10 10:34:08,131][32415] Updated weights for policy 0, policy_version 11000 (0.0039) [2024-06-10 10:34:09,592][32177] Fps is (10 sec: 40959.4, 60 sec: 44509.8, 300 sec: 44597.8). Total num frames: 180273152. Throughput: 0: 44862.1. Samples: 180435940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-10 10:34:09,592][32177] Avg episode reward: [(0, '0.237')] [2024-06-10 10:34:11,814][32415] Updated weights for policy 0, policy_version 11010 (0.0031) [2024-06-10 10:34:14,596][32177] Fps is (10 sec: 49130.3, 60 sec: 45325.7, 300 sec: 44763.8). 
Total num frames: 180535296. Throughput: 0: 44633.4. Samples: 180570320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-10 10:34:14,605][32177] Avg episode reward: [(0, '0.228')] [2024-06-10 10:34:15,133][32415] Updated weights for policy 0, policy_version 11020 (0.0035) [2024-06-10 10:34:19,171][32415] Updated weights for policy 0, policy_version 11030 (0.0037) [2024-06-10 10:34:19,596][32177] Fps is (10 sec: 47493.6, 60 sec: 45325.7, 300 sec: 44819.3). Total num frames: 180748288. Throughput: 0: 44988.7. Samples: 180848920. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-10 10:34:19,596][32177] Avg episode reward: [(0, '0.226')] [2024-06-10 10:34:22,168][32415] Updated weights for policy 0, policy_version 11040 (0.0028) [2024-06-10 10:34:24,592][32177] Fps is (10 sec: 42616.7, 60 sec: 45055.9, 300 sec: 44708.9). Total num frames: 180961280. Throughput: 0: 45061.2. Samples: 181112540. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-10 10:34:24,592][32177] Avg episode reward: [(0, '0.238')] [2024-06-10 10:34:26,682][32415] Updated weights for policy 0, policy_version 11050 (0.0031) [2024-06-10 10:34:29,529][32415] Updated weights for policy 0, policy_version 11060 (0.0038) [2024-06-10 10:34:29,592][32177] Fps is (10 sec: 45894.2, 60 sec: 45055.8, 300 sec: 44708.9). Total num frames: 181207040. Throughput: 0: 45043.9. Samples: 181246760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-10 10:34:29,596][32177] Avg episode reward: [(0, '0.235')] [2024-06-10 10:34:33,826][32415] Updated weights for policy 0, policy_version 11070 (0.0035) [2024-06-10 10:34:34,592][32177] Fps is (10 sec: 45875.6, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 181420032. Throughput: 0: 44969.5. Samples: 181514520. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-10 10:34:34,592][32177] Avg episode reward: [(0, '0.242')] [2024-06-10 10:34:34,610][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000011073_181420032.pth... [2024-06-10 10:34:34,667][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000010418_170688512.pth [2024-06-10 10:34:36,912][32415] Updated weights for policy 0, policy_version 11080 (0.0031) [2024-06-10 10:34:39,592][32177] Fps is (10 sec: 40960.9, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 181616640. Throughput: 0: 44752.5. Samples: 181783440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-10 10:34:39,592][32177] Avg episode reward: [(0, '0.235')] [2024-06-10 10:34:41,065][32415] Updated weights for policy 0, policy_version 11090 (0.0028) [2024-06-10 10:34:43,897][32415] Updated weights for policy 0, policy_version 11100 (0.0039) [2024-06-10 10:34:44,596][32177] Fps is (10 sec: 44217.6, 60 sec: 44779.8, 300 sec: 44652.7). Total num frames: 181862400. Throughput: 0: 44646.3. Samples: 181914460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-10 10:34:44,596][32177] Avg episode reward: [(0, '0.224')] [2024-06-10 10:34:48,209][32415] Updated weights for policy 0, policy_version 11110 (0.0028) [2024-06-10 10:34:49,592][32177] Fps is (10 sec: 49151.8, 60 sec: 45602.3, 300 sec: 44875.5). Total num frames: 182108160. Throughput: 0: 45053.3. Samples: 182194320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-10 10:34:49,592][32177] Avg episode reward: [(0, '0.244')] [2024-06-10 10:34:50,823][32415] Updated weights for policy 0, policy_version 11120 (0.0022) [2024-06-10 10:34:53,541][32394] Signal inference workers to stop experience collection... (2650 times) [2024-06-10 10:34:53,593][32394] Signal inference workers to resume experience collection... 
(2650 times) [2024-06-10 10:34:53,594][32415] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-06-10 10:34:53,606][32415] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-06-10 10:34:54,592][32177] Fps is (10 sec: 42615.9, 60 sec: 44782.8, 300 sec: 44708.9). Total num frames: 182288384. Throughput: 0: 45125.3. Samples: 182466580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-10 10:34:54,592][32177] Avg episode reward: [(0, '0.230')] [2024-06-10 10:34:55,627][32415] Updated weights for policy 0, policy_version 11130 (0.0032) [2024-06-10 10:34:58,421][32415] Updated weights for policy 0, policy_version 11140 (0.0032) [2024-06-10 10:34:59,596][32177] Fps is (10 sec: 40942.4, 60 sec: 44233.6, 300 sec: 44597.2). Total num frames: 182517760. Throughput: 0: 44987.2. Samples: 182594740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-10 10:34:59,596][32177] Avg episode reward: [(0, '0.243')] [2024-06-10 10:35:02,689][32415] Updated weights for policy 0, policy_version 11150 (0.0044) [2024-06-10 10:35:04,592][32177] Fps is (10 sec: 47514.2, 60 sec: 45329.0, 300 sec: 44820.6). Total num frames: 182763520. Throughput: 0: 44655.3. Samples: 182858220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 10:35:04,592][32177] Avg episode reward: [(0, '0.244')] [2024-06-10 10:35:05,910][32415] Updated weights for policy 0, policy_version 11160 (0.0041) [2024-06-10 10:35:09,592][32177] Fps is (10 sec: 45895.1, 60 sec: 45056.1, 300 sec: 44764.4). Total num frames: 182976512. Throughput: 0: 44789.0. Samples: 183128040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 10:35:09,592][32177] Avg episode reward: [(0, '0.242')] [2024-06-10 10:35:10,132][32415] Updated weights for policy 0, policy_version 11170 (0.0035) [2024-06-10 10:35:13,184][32415] Updated weights for policy 0, policy_version 11180 (0.0030) [2024-06-10 10:35:14,592][32177] Fps is (10 sec: 42598.3, 60 sec: 44240.0, 300 sec: 44653.3). 
Total num frames: 183189504. Throughput: 0: 44757.8. Samples: 183260860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-10 10:35:14,592][32177] Avg episode reward: [(0, '0.230')] [2024-06-10 10:35:17,449][32415] Updated weights for policy 0, policy_version 11190 (0.0034) [2024-06-10 10:35:19,592][32177] Fps is (10 sec: 47513.6, 60 sec: 45059.3, 300 sec: 44875.5). Total num frames: 183451648. Throughput: 0: 44853.8. Samples: 183532940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-10 10:35:19,592][32177] Avg episode reward: [(0, '0.235')] [2024-06-10 10:35:20,496][32415] Updated weights for policy 0, policy_version 11200 (0.0026) [2024-06-10 10:35:24,592][32177] Fps is (10 sec: 45875.8, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 183648256. Throughput: 0: 44861.3. Samples: 183802200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-10 10:35:24,592][32177] Avg episode reward: [(0, '0.238')] [2024-06-10 10:35:24,682][32415] Updated weights for policy 0, policy_version 11210 (0.0035) [2024-06-10 10:35:27,787][32415] Updated weights for policy 0, policy_version 11220 (0.0033) [2024-06-10 10:35:29,596][32177] Fps is (10 sec: 39304.6, 60 sec: 43960.7, 300 sec: 44597.2). Total num frames: 183844864. Throughput: 0: 44821.3. Samples: 183931420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 10:35:29,596][32177] Avg episode reward: [(0, '0.242')] [2024-06-10 10:35:32,006][32415] Updated weights for policy 0, policy_version 11230 (0.0041) [2024-06-10 10:35:34,592][32177] Fps is (10 sec: 44235.3, 60 sec: 44509.6, 300 sec: 44764.4). Total num frames: 184090624. Throughput: 0: 44318.4. Samples: 184188660. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 10:35:34,592][32177] Avg episode reward: [(0, '0.232')] [2024-06-10 10:35:35,498][32415] Updated weights for policy 0, policy_version 11240 (0.0031) [2024-06-10 10:35:39,424][32415] Updated weights for policy 0, policy_version 11250 (0.0031) [2024-06-10 10:35:39,592][32177] Fps is (10 sec: 47532.5, 60 sec: 45055.7, 300 sec: 44764.4). Total num frames: 184320000. Throughput: 0: 44320.8. Samples: 184461020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-10 10:35:39,593][32177] Avg episode reward: [(0, '0.253')] [2024-06-10 10:35:39,593][32394] Saving new best policy, reward=0.253! [2024-06-10 10:35:42,734][32415] Updated weights for policy 0, policy_version 11260 (0.0032) [2024-06-10 10:35:44,592][32177] Fps is (10 sec: 42599.3, 60 sec: 44239.9, 300 sec: 44653.3). Total num frames: 184516608. Throughput: 0: 44424.6. Samples: 184593660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-10 10:35:44,592][32177] Avg episode reward: [(0, '0.236')] [2024-06-10 10:35:46,803][32415] Updated weights for policy 0, policy_version 11270 (0.0049) [2024-06-10 10:35:49,592][32177] Fps is (10 sec: 45876.3, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 184778752. Throughput: 0: 44555.1. Samples: 184863200. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-10 10:35:49,592][32177] Avg episode reward: [(0, '0.239')] [2024-06-10 10:35:50,254][32415] Updated weights for policy 0, policy_version 11280 (0.0028) [2024-06-10 10:35:54,185][32415] Updated weights for policy 0, policy_version 11290 (0.0027) [2024-06-10 10:35:54,596][32177] Fps is (10 sec: 47493.4, 60 sec: 45052.9, 300 sec: 44763.8). Total num frames: 184991744. Throughput: 0: 44557.9. Samples: 185133340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 10:35:54,597][32177] Avg episode reward: [(0, '0.238')] [2024-06-10 10:35:57,302][32415] Updated weights for policy 0, policy_version 11300 (0.0033) [2024-06-10 10:35:59,592][32177] Fps is (10 sec: 39322.0, 60 sec: 44240.0, 300 sec: 44597.8). Total num frames: 185171968. Throughput: 0: 44565.5. Samples: 185266300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 10:35:59,592][32177] Avg episode reward: [(0, '0.237')] [2024-06-10 10:36:01,323][32415] Updated weights for policy 0, policy_version 11310 (0.0029) [2024-06-10 10:36:04,592][32177] Fps is (10 sec: 44256.2, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 185434112. Throughput: 0: 44466.2. Samples: 185533920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 10:36:04,592][32177] Avg episode reward: [(0, '0.234')] [2024-06-10 10:36:05,028][32415] Updated weights for policy 0, policy_version 11320 (0.0037) [2024-06-10 10:36:07,658][32394] Signal inference workers to stop experience collection... (2700 times) [2024-06-10 10:36:07,682][32415] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-06-10 10:36:07,768][32394] Signal inference workers to resume experience collection... (2700 times) [2024-06-10 10:36:07,768][32415] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-06-10 10:36:08,856][32415] Updated weights for policy 0, policy_version 11330 (0.0025) [2024-06-10 10:36:09,592][32177] Fps is (10 sec: 50788.5, 60 sec: 45055.7, 300 sec: 44819.9). Total num frames: 185679872. Throughput: 0: 44328.0. Samples: 185796980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 10:36:09,593][32177] Avg episode reward: [(0, '0.237')] [2024-06-10 10:36:12,182][32415] Updated weights for policy 0, policy_version 11340 (0.0027) [2024-06-10 10:36:14,592][32177] Fps is (10 sec: 40959.9, 60 sec: 44236.9, 300 sec: 44598.5). Total num frames: 185843712. Throughput: 0: 44474.9. 
Samples: 185932600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-10 10:36:14,592][32177] Avg episode reward: [(0, '0.239')] [2024-06-10 10:36:16,235][32415] Updated weights for policy 0, policy_version 11350 (0.0026) [2024-06-10 10:36:19,592][32177] Fps is (10 sec: 42599.9, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 186105856. Throughput: 0: 44901.2. Samples: 186209200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-10 10:36:19,592][32177] Avg episode reward: [(0, '0.250')] [2024-06-10 10:36:19,673][32415] Updated weights for policy 0, policy_version 11360 (0.0033) [2024-06-10 10:36:23,441][32415] Updated weights for policy 0, policy_version 11370 (0.0049) [2024-06-10 10:36:24,592][32177] Fps is (10 sec: 49151.8, 60 sec: 44782.9, 300 sec: 44764.5). Total num frames: 186335232. Throughput: 0: 44605.6. Samples: 186468260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 10:36:24,598][32177] Avg episode reward: [(0, '0.249')] [2024-06-10 10:36:26,870][32415] Updated weights for policy 0, policy_version 11380 (0.0036) [2024-06-10 10:36:29,592][32177] Fps is (10 sec: 42597.9, 60 sec: 44786.1, 300 sec: 44653.3). Total num frames: 186531840. Throughput: 0: 44632.9. Samples: 186602140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 10:36:29,592][32177] Avg episode reward: [(0, '0.254')] [2024-06-10 10:36:29,593][32394] Saving new best policy, reward=0.254! [2024-06-10 10:36:30,697][32415] Updated weights for policy 0, policy_version 11390 (0.0045) [2024-06-10 10:36:34,543][32415] Updated weights for policy 0, policy_version 11400 (0.0023) [2024-06-10 10:36:34,592][32177] Fps is (10 sec: 44236.6, 60 sec: 44783.1, 300 sec: 44764.4). Total num frames: 186777600. Throughput: 0: 44729.8. Samples: 186876040. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-10 10:36:34,592][32177] Avg episode reward: [(0, '0.245')] [2024-06-10 10:36:34,614][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000011400_186777600.pth... [2024-06-10 10:36:34,677][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000010744_176029696.pth [2024-06-10 10:36:38,392][32415] Updated weights for policy 0, policy_version 11410 (0.0036) [2024-06-10 10:36:39,592][32177] Fps is (10 sec: 45874.3, 60 sec: 44509.9, 300 sec: 44708.8). Total num frames: 186990592. Throughput: 0: 44467.1. Samples: 187134180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-10 10:36:39,593][32177] Avg episode reward: [(0, '0.237')] [2024-06-10 10:36:42,021][32415] Updated weights for policy 0, policy_version 11420 (0.0032) [2024-06-10 10:36:44,592][32177] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 187203584. Throughput: 0: 44686.1. Samples: 187277180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-10 10:36:44,592][32177] Avg episode reward: [(0, '0.241')] [2024-06-10 10:36:45,549][32415] Updated weights for policy 0, policy_version 11430 (0.0035) [2024-06-10 10:36:49,086][32415] Updated weights for policy 0, policy_version 11440 (0.0034) [2024-06-10 10:36:49,592][32177] Fps is (10 sec: 45876.2, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 187449344. Throughput: 0: 44775.9. Samples: 187548840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-10 10:36:49,592][32177] Avg episode reward: [(0, '0.249')] [2024-06-10 10:36:52,778][32415] Updated weights for policy 0, policy_version 11450 (0.0029) [2024-06-10 10:36:54,592][32177] Fps is (10 sec: 47513.5, 60 sec: 44786.1, 300 sec: 44764.4). Total num frames: 187678720. Throughput: 0: 44812.7. Samples: 187813540. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-10 10:36:54,592][32177] Avg episode reward: [(0, '0.249')] [2024-06-10 10:36:56,389][32415] Updated weights for policy 0, policy_version 11460 (0.0032) [2024-06-10 10:36:59,592][32177] Fps is (10 sec: 42598.3, 60 sec: 45055.9, 300 sec: 44708.9). Total num frames: 187875328. Throughput: 0: 44711.9. Samples: 187944640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 10:36:59,592][32177] Avg episode reward: [(0, '0.234')] [2024-06-10 10:37:00,209][32415] Updated weights for policy 0, policy_version 11470 (0.0038) [2024-06-10 10:37:03,915][32415] Updated weights for policy 0, policy_version 11480 (0.0027) [2024-06-10 10:37:04,592][32177] Fps is (10 sec: 42598.8, 60 sec: 44509.8, 300 sec: 44653.4). Total num frames: 188104704. Throughput: 0: 44620.4. Samples: 188217120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 10:37:04,600][32177] Avg episode reward: [(0, '0.235')] [2024-06-10 10:37:07,832][32415] Updated weights for policy 0, policy_version 11490 (0.0041) [2024-06-10 10:37:09,592][32177] Fps is (10 sec: 45874.1, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 188334080. Throughput: 0: 44783.2. Samples: 188483520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-10 10:37:09,593][32177] Avg episode reward: [(0, '0.228')] [2024-06-10 10:37:11,108][32415] Updated weights for policy 0, policy_version 11500 (0.0028) [2024-06-10 10:37:14,596][32177] Fps is (10 sec: 44217.8, 60 sec: 45052.7, 300 sec: 44763.8). Total num frames: 188547072. Throughput: 0: 44872.7. Samples: 188621600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-10 10:37:14,597][32177] Avg episode reward: [(0, '0.243')] [2024-06-10 10:37:15,009][32415] Updated weights for policy 0, policy_version 11510 (0.0028) [2024-06-10 10:37:18,241][32415] Updated weights for policy 0, policy_version 11520 (0.0036) [2024-06-10 10:37:19,592][32177] Fps is (10 sec: 44238.2, 60 sec: 44509.8, 300 sec: 44653.3). 
Total num frames: 188776448. Throughput: 0: 44791.6. Samples: 188891660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-10 10:37:19,592][32177] Avg episode reward: [(0, '0.250')] [2024-06-10 10:37:22,120][32415] Updated weights for policy 0, policy_version 11530 (0.0033) [2024-06-10 10:37:23,109][32394] Signal inference workers to stop experience collection... (2750 times) [2024-06-10 10:37:23,109][32394] Signal inference workers to resume experience collection... (2750 times) [2024-06-10 10:37:23,139][32415] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-06-10 10:37:23,139][32415] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-06-10 10:37:24,592][32177] Fps is (10 sec: 44255.7, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 188989440. Throughput: 0: 44951.4. Samples: 189156980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 10:37:24,592][32177] Avg episode reward: [(0, '0.240')] [2024-06-10 10:37:25,758][32415] Updated weights for policy 0, policy_version 11540 (0.0040) [2024-06-10 10:37:29,596][32177] Fps is (10 sec: 44217.9, 60 sec: 44779.8, 300 sec: 44763.8). Total num frames: 189218816. Throughput: 0: 44674.0. Samples: 189287700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 10:37:29,597][32177] Avg episode reward: [(0, '0.240')] [2024-06-10 10:37:29,761][32415] Updated weights for policy 0, policy_version 11550 (0.0037) [2024-06-10 10:37:33,205][32415] Updated weights for policy 0, policy_version 11560 (0.0029) [2024-06-10 10:37:34,592][32177] Fps is (10 sec: 45874.5, 60 sec: 44509.8, 300 sec: 44653.3). Total num frames: 189448192. Throughput: 0: 44550.6. Samples: 189553620. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-10 10:37:34,593][32177] Avg episode reward: [(0, '0.249')] [2024-06-10 10:37:37,054][32415] Updated weights for policy 0, policy_version 11570 (0.0032) [2024-06-10 10:37:39,592][32177] Fps is (10 sec: 44255.6, 60 sec: 44510.1, 300 sec: 44653.3). Total num frames: 189661184. Throughput: 0: 44773.4. Samples: 189828340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-10 10:37:39,592][32177] Avg episode reward: [(0, '0.236')] [2024-06-10 10:37:40,259][32415] Updated weights for policy 0, policy_version 11580 (0.0032) [2024-06-10 10:37:44,148][32415] Updated weights for policy 0, policy_version 11590 (0.0022) [2024-06-10 10:37:44,592][32177] Fps is (10 sec: 44237.3, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 189890560. Throughput: 0: 44928.9. Samples: 189966440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-10 10:37:44,592][32177] Avg episode reward: [(0, '0.239')] [2024-06-10 10:37:47,499][32415] Updated weights for policy 0, policy_version 11600 (0.0033) [2024-06-10 10:37:49,592][32177] Fps is (10 sec: 45875.4, 60 sec: 44509.9, 300 sec: 44653.4). Total num frames: 190119936. Throughput: 0: 44754.2. Samples: 190231060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-10 10:37:49,592][32177] Avg episode reward: [(0, '0.242')] [2024-06-10 10:37:51,398][32415] Updated weights for policy 0, policy_version 11610 (0.0037) [2024-06-10 10:37:54,592][32177] Fps is (10 sec: 45874.4, 60 sec: 44509.8, 300 sec: 44708.8). Total num frames: 190349312. Throughput: 0: 44894.3. Samples: 190503760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-10 10:37:54,593][32177] Avg episode reward: [(0, '0.226')] [2024-06-10 10:37:55,055][32415] Updated weights for policy 0, policy_version 11620 (0.0033) [2024-06-10 10:37:59,002][32415] Updated weights for policy 0, policy_version 11630 (0.0044) [2024-06-10 10:37:59,596][32177] Fps is (10 sec: 45855.6, 60 sec: 45052.9, 300 sec: 44763.8). 
Total num frames: 190578688. Throughput: 0: 44741.8. Samples: 190634980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-10 10:37:59,597][32177] Avg episode reward: [(0, '0.245')] [2024-06-10 10:38:02,188][32415] Updated weights for policy 0, policy_version 11640 (0.0036) [2024-06-10 10:38:04,592][32177] Fps is (10 sec: 45876.5, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 190808064. Throughput: 0: 44811.2. Samples: 190908160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-10 10:38:04,592][32177] Avg episode reward: [(0, '0.240')] [2024-06-10 10:38:06,136][32415] Updated weights for policy 0, policy_version 11650 (0.0031) [2024-06-10 10:38:09,223][32415] Updated weights for policy 0, policy_version 11660 (0.0033) [2024-06-10 10:38:09,592][32177] Fps is (10 sec: 45893.2, 60 sec: 45056.0, 300 sec: 44819.9). Total num frames: 191037440. Throughput: 0: 44899.2. Samples: 191177460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 10:38:09,593][32177] Avg episode reward: [(0, '0.242')] [2024-06-10 10:38:13,533][32415] Updated weights for policy 0, policy_version 11670 (0.0029) [2024-06-10 10:38:14,592][32177] Fps is (10 sec: 44236.8, 60 sec: 45059.3, 300 sec: 44819.9). Total num frames: 191250432. Throughput: 0: 45091.0. Samples: 191316600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 10:38:14,592][32177] Avg episode reward: [(0, '0.242')] [2024-06-10 10:38:16,629][32415] Updated weights for policy 0, policy_version 11680 (0.0033) [2024-06-10 10:38:19,592][32177] Fps is (10 sec: 44238.7, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 191479808. Throughput: 0: 45103.8. Samples: 191583280. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 10:38:19,592][32177] Avg episode reward: [(0, '0.248')] [2024-06-10 10:38:20,660][32415] Updated weights for policy 0, policy_version 11690 (0.0034) [2024-06-10 10:38:24,089][32415] Updated weights for policy 0, policy_version 11700 (0.0031) [2024-06-10 10:38:24,592][32177] Fps is (10 sec: 45874.7, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 191709184. Throughput: 0: 45004.8. Samples: 191853560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 10:38:24,592][32177] Avg episode reward: [(0, '0.255')] [2024-06-10 10:38:24,601][32394] Saving new best policy, reward=0.255! [2024-06-10 10:38:28,071][32415] Updated weights for policy 0, policy_version 11710 (0.0036) [2024-06-10 10:38:29,592][32177] Fps is (10 sec: 44236.4, 60 sec: 45059.2, 300 sec: 44764.4). Total num frames: 191922176. Throughput: 0: 44866.3. Samples: 191985420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 10:38:29,592][32177] Avg episode reward: [(0, '0.241')] [2024-06-10 10:38:31,223][32415] Updated weights for policy 0, policy_version 11720 (0.0030) [2024-06-10 10:38:34,592][32177] Fps is (10 sec: 44236.9, 60 sec: 45056.1, 300 sec: 44819.9). Total num frames: 192151552. Throughput: 0: 44975.5. Samples: 192254960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 10:38:34,592][32177] Avg episode reward: [(0, '0.231')] [2024-06-10 10:38:34,608][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000011728_192151552.pth... [2024-06-10 10:38:34,669][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000011073_181420032.pth [2024-06-10 10:38:35,066][32415] Updated weights for policy 0, policy_version 11730 (0.0028) [2024-06-10 10:38:37,661][32394] Signal inference workers to stop experience collection... (2800 times) [2024-06-10 10:38:37,661][32394] Signal inference workers to resume experience collection... 
(2800 times) [2024-06-10 10:38:37,692][32415] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-06-10 10:38:37,692][32415] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-06-10 10:38:38,207][32415] Updated weights for policy 0, policy_version 11740 (0.0031) [2024-06-10 10:38:39,592][32177] Fps is (10 sec: 44236.8, 60 sec: 45056.0, 300 sec: 44708.9). Total num frames: 192364544. Throughput: 0: 45013.6. Samples: 192529360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 10:38:39,592][32177] Avg episode reward: [(0, '0.247')] [2024-06-10 10:38:42,550][32415] Updated weights for policy 0, policy_version 11750 (0.0025) [2024-06-10 10:38:44,592][32177] Fps is (10 sec: 45872.6, 60 sec: 45328.6, 300 sec: 44875.4). Total num frames: 192610304. Throughput: 0: 45119.6. Samples: 192665200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 10:38:44,593][32177] Avg episode reward: [(0, '0.251')] [2024-06-10 10:38:45,771][32415] Updated weights for policy 0, policy_version 11760 (0.0040) [2024-06-10 10:38:49,559][32415] Updated weights for policy 0, policy_version 11770 (0.0034) [2024-06-10 10:38:49,596][32177] Fps is (10 sec: 47493.3, 60 sec: 45325.8, 300 sec: 44874.8). Total num frames: 192839680. Throughput: 0: 45001.0. Samples: 192933400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 10:38:49,597][32177] Avg episode reward: [(0, '0.253')] [2024-06-10 10:38:53,505][32415] Updated weights for policy 0, policy_version 11780 (0.0034) [2024-06-10 10:38:54,591][32177] Fps is (10 sec: 44240.1, 60 sec: 45056.3, 300 sec: 44708.9). Total num frames: 193052672. Throughput: 0: 44846.7. Samples: 193195540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-10 10:38:54,592][32177] Avg episode reward: [(0, '0.243')] [2024-06-10 10:38:57,196][32415] Updated weights for policy 0, policy_version 11790 (0.0043) [2024-06-10 10:38:59,592][32177] Fps is (10 sec: 40977.4, 60 sec: 44513.0, 300 sec: 44764.4). 
Total num frames: 193249280. Throughput: 0: 44767.0. Samples: 193331120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-10 10:38:59,592][32177] Avg episode reward: [(0, '0.243')] [2024-06-10 10:39:00,544][32415] Updated weights for policy 0, policy_version 11800 (0.0032) [2024-06-10 10:39:04,383][32415] Updated weights for policy 0, policy_version 11810 (0.0035) [2024-06-10 10:39:04,592][32177] Fps is (10 sec: 44235.7, 60 sec: 44782.8, 300 sec: 44820.0). Total num frames: 193495040. Throughput: 0: 44839.8. Samples: 193601080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-10 10:39:04,592][32177] Avg episode reward: [(0, '0.242')] [2024-06-10 10:39:07,511][32415] Updated weights for policy 0, policy_version 11820 (0.0034) [2024-06-10 10:39:09,592][32177] Fps is (10 sec: 49151.4, 60 sec: 45056.1, 300 sec: 44765.1). Total num frames: 193740800. Throughput: 0: 44891.9. Samples: 193873700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-10 10:39:09,592][32177] Avg episode reward: [(0, '0.234')] [2024-06-10 10:39:11,822][32415] Updated weights for policy 0, policy_version 11830 (0.0034) [2024-06-10 10:39:14,596][32177] Fps is (10 sec: 44218.5, 60 sec: 44779.7, 300 sec: 44708.9). Total num frames: 193937408. Throughput: 0: 44992.2. Samples: 194010260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 10:39:14,597][32177] Avg episode reward: [(0, '0.254')] [2024-06-10 10:39:15,063][32415] Updated weights for policy 0, policy_version 11840 (0.0025) [2024-06-10 10:39:18,849][32415] Updated weights for policy 0, policy_version 11850 (0.0031) [2024-06-10 10:39:19,596][32177] Fps is (10 sec: 40942.9, 60 sec: 44506.6, 300 sec: 44708.2). Total num frames: 194150400. Throughput: 0: 44816.2. Samples: 194271880. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 10:39:19,597][32177] Avg episode reward: [(0, '0.250')] [2024-06-10 10:39:22,717][32415] Updated weights for policy 0, policy_version 11860 (0.0035) [2024-06-10 10:39:24,592][32177] Fps is (10 sec: 47533.1, 60 sec: 45055.9, 300 sec: 44764.4). Total num frames: 194412544. Throughput: 0: 44569.2. Samples: 194534980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 10:39:24,592][32177] Avg episode reward: [(0, '0.235')] [2024-06-10 10:39:26,218][32415] Updated weights for policy 0, policy_version 11870 (0.0035) [2024-06-10 10:39:29,592][32177] Fps is (10 sec: 45895.3, 60 sec: 44783.0, 300 sec: 44708.9). Total num frames: 194609152. Throughput: 0: 44765.1. Samples: 194679600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 10:39:29,592][32177] Avg episode reward: [(0, '0.250')] [2024-06-10 10:39:29,849][32415] Updated weights for policy 0, policy_version 11880 (0.0025) [2024-06-10 10:39:33,591][32415] Updated weights for policy 0, policy_version 11890 (0.0032) [2024-06-10 10:39:34,592][32177] Fps is (10 sec: 40959.4, 60 sec: 44509.7, 300 sec: 44764.4). Total num frames: 194822144. Throughput: 0: 44753.3. Samples: 194947120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-10 10:39:34,593][32177] Avg episode reward: [(0, '0.243')] [2024-06-10 10:39:36,876][32415] Updated weights for policy 0, policy_version 11900 (0.0037) [2024-06-10 10:39:39,592][32177] Fps is (10 sec: 47513.2, 60 sec: 45329.0, 300 sec: 44820.6). Total num frames: 195084288. Throughput: 0: 44963.8. Samples: 195218920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-10 10:39:39,592][32177] Avg episode reward: [(0, '0.231')] [2024-06-10 10:39:40,547][32415] Updated weights for policy 0, policy_version 11910 (0.0031) [2024-06-10 10:39:44,538][32415] Updated weights for policy 0, policy_version 11920 (0.0035) [2024-06-10 10:39:44,592][32177] Fps is (10 sec: 47515.1, 60 sec: 44783.4, 300 sec: 44708.9). 
Total num frames: 195297280. Throughput: 0: 45085.8. Samples: 195359980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 10:39:44,592][32177] Avg episode reward: [(0, '0.248')] [2024-06-10 10:39:47,540][32415] Updated weights for policy 0, policy_version 11930 (0.0024) [2024-06-10 10:39:49,592][32177] Fps is (10 sec: 39322.0, 60 sec: 43966.9, 300 sec: 44708.9). Total num frames: 195477504. Throughput: 0: 44801.5. Samples: 195617140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 10:39:49,592][32177] Avg episode reward: [(0, '0.250')] [2024-06-10 10:39:51,844][32394] Signal inference workers to stop experience collection... (2850 times) [2024-06-10 10:39:51,844][32394] Signal inference workers to resume experience collection... (2850 times) [2024-06-10 10:39:51,857][32415] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-06-10 10:39:51,857][32415] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-06-10 10:39:51,981][32415] Updated weights for policy 0, policy_version 11940 (0.0029) [2024-06-10 10:39:54,592][32177] Fps is (10 sec: 45875.1, 60 sec: 45055.9, 300 sec: 44876.1). Total num frames: 195756032. Throughput: 0: 44578.8. Samples: 195879740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 10:39:54,592][32177] Avg episode reward: [(0, '0.241')] [2024-06-10 10:39:55,077][32415] Updated weights for policy 0, policy_version 11950 (0.0036) [2024-06-10 10:39:59,382][32415] Updated weights for policy 0, policy_version 11960 (0.0034) [2024-06-10 10:39:59,592][32177] Fps is (10 sec: 49150.1, 60 sec: 45328.9, 300 sec: 44764.4). Total num frames: 195969024. Throughput: 0: 44884.4. Samples: 196029880. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 10:39:59,593][32177] Avg episode reward: [(0, '0.248')] [2024-06-10 10:40:02,606][32415] Updated weights for policy 0, policy_version 11970 (0.0024) [2024-06-10 10:40:04,592][32177] Fps is (10 sec: 40959.4, 60 sec: 44509.9, 300 sec: 44708.8). Total num frames: 196165632. Throughput: 0: 44975.7. Samples: 196295600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 10:40:04,592][32177] Avg episode reward: [(0, '0.251')] [2024-06-10 10:40:06,384][32415] Updated weights for policy 0, policy_version 11980 (0.0039) [2024-06-10 10:40:09,596][32177] Fps is (10 sec: 45856.8, 60 sec: 44779.8, 300 sec: 44874.8). Total num frames: 196427776. Throughput: 0: 45027.4. Samples: 196561400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-10 10:40:09,597][32177] Avg episode reward: [(0, '0.242')] [2024-06-10 10:40:09,739][32415] Updated weights for policy 0, policy_version 11990 (0.0035) [2024-06-10 10:40:13,968][32415] Updated weights for policy 0, policy_version 12000 (0.0052) [2024-06-10 10:40:14,592][32177] Fps is (10 sec: 49152.9, 60 sec: 45332.3, 300 sec: 44764.4). Total num frames: 196657152. Throughput: 0: 44975.6. Samples: 196703500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-10 10:40:14,592][32177] Avg episode reward: [(0, '0.244')] [2024-06-10 10:40:16,727][32415] Updated weights for policy 0, policy_version 12010 (0.0031) [2024-06-10 10:40:19,592][32177] Fps is (10 sec: 42616.2, 60 sec: 45059.1, 300 sec: 44764.4). Total num frames: 196853760. Throughput: 0: 44881.0. Samples: 196966760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-10 10:40:19,592][32177] Avg episode reward: [(0, '0.247')] [2024-06-10 10:40:21,091][32415] Updated weights for policy 0, policy_version 12020 (0.0033) [2024-06-10 10:40:24,444][32415] Updated weights for policy 0, policy_version 12030 (0.0043) [2024-06-10 10:40:24,592][32177] Fps is (10 sec: 44236.0, 60 sec: 44783.0, 300 sec: 44931.7). 
Total num frames: 197099520. Throughput: 0: 44655.5. Samples: 197228420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-10 10:40:24,592][32177] Avg episode reward: [(0, '0.246')] [2024-06-10 10:40:28,730][32415] Updated weights for policy 0, policy_version 12040 (0.0041) [2024-06-10 10:40:29,592][32177] Fps is (10 sec: 47512.8, 60 sec: 45328.8, 300 sec: 44875.5). Total num frames: 197328896. Throughput: 0: 44763.2. Samples: 197374340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-10 10:40:29,593][32177] Avg episode reward: [(0, '0.235')] [2024-06-10 10:40:31,866][32415] Updated weights for policy 0, policy_version 12050 (0.0026) [2024-06-10 10:40:34,592][32177] Fps is (10 sec: 42598.5, 60 sec: 45056.2, 300 sec: 44764.4). Total num frames: 197525504. Throughput: 0: 45076.3. Samples: 197645580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-10 10:40:34,592][32177] Avg episode reward: [(0, '0.248')] [2024-06-10 10:40:34,599][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000012056_197525504.pth... [2024-06-10 10:40:34,657][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000011400_186777600.pth [2024-06-10 10:40:35,870][32415] Updated weights for policy 0, policy_version 12060 (0.0032) [2024-06-10 10:40:38,967][32415] Updated weights for policy 0, policy_version 12070 (0.0031) [2024-06-10 10:40:39,592][32177] Fps is (10 sec: 42598.5, 60 sec: 44509.7, 300 sec: 44875.5). Total num frames: 197754880. Throughput: 0: 44950.4. Samples: 197902520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-10 10:40:39,593][32177] Avg episode reward: [(0, '0.239')] [2024-06-10 10:40:43,204][32415] Updated weights for policy 0, policy_version 12080 (0.0040) [2024-06-10 10:40:44,592][32177] Fps is (10 sec: 47513.7, 60 sec: 45055.9, 300 sec: 44820.0). Total num frames: 198000640. Throughput: 0: 44798.9. Samples: 198045820. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-10 10:40:44,592][32177] Avg episode reward: [(0, '0.250')] [2024-06-10 10:40:46,575][32415] Updated weights for policy 0, policy_version 12090 (0.0027) [2024-06-10 10:40:49,592][32177] Fps is (10 sec: 42599.6, 60 sec: 45055.9, 300 sec: 44709.5). Total num frames: 198180864. Throughput: 0: 44804.1. Samples: 198311780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-10 10:40:49,592][32177] Avg episode reward: [(0, '0.249')] [2024-06-10 10:40:50,464][32415] Updated weights for policy 0, policy_version 12100 (0.0034) [2024-06-10 10:40:53,990][32415] Updated weights for policy 0, policy_version 12110 (0.0031) [2024-06-10 10:40:54,592][32177] Fps is (10 sec: 42598.9, 60 sec: 44509.9, 300 sec: 44931.0). Total num frames: 198426624. Throughput: 0: 44663.4. Samples: 198571060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-10 10:40:54,592][32177] Avg episode reward: [(0, '0.248')] [2024-06-10 10:40:57,818][32415] Updated weights for policy 0, policy_version 12120 (0.0030) [2024-06-10 10:40:59,594][32177] Fps is (10 sec: 49141.7, 60 sec: 45054.7, 300 sec: 44875.2). Total num frames: 198672384. Throughput: 0: 44585.0. Samples: 198709920. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-10 10:40:59,594][32177] Avg episode reward: [(0, '0.250')] [2024-06-10 10:41:01,060][32415] Updated weights for policy 0, policy_version 12130 (0.0026) [2024-06-10 10:41:03,831][32394] Signal inference workers to stop experience collection... (2900 times) [2024-06-10 10:41:03,832][32394] Signal inference workers to resume experience collection... (2900 times) [2024-06-10 10:41:03,848][32415] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-06-10 10:41:03,880][32415] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-06-10 10:41:04,592][32177] Fps is (10 sec: 44237.0, 60 sec: 45056.2, 300 sec: 44708.9). Total num frames: 198868992. Throughput: 0: 44953.6. 
Samples: 198989660. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-10 10:41:04,592][32177] Avg episode reward: [(0, '0.243')] [2024-06-10 10:41:04,858][32415] Updated weights for policy 0, policy_version 12140 (0.0031) [2024-06-10 10:41:08,531][32415] Updated weights for policy 0, policy_version 12150 (0.0023) [2024-06-10 10:41:09,592][32177] Fps is (10 sec: 40969.0, 60 sec: 44240.1, 300 sec: 44875.5). Total num frames: 199081984. Throughput: 0: 44939.3. Samples: 199250680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 10:41:09,592][32177] Avg episode reward: [(0, '0.238')] [2024-06-10 10:41:12,335][32415] Updated weights for policy 0, policy_version 12160 (0.0042) [2024-06-10 10:41:14,592][32177] Fps is (10 sec: 47513.0, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 199344128. Throughput: 0: 44790.0. Samples: 199389880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-10 10:41:14,596][32177] Avg episode reward: [(0, '0.242')] [2024-06-10 10:41:15,735][32415] Updated weights for policy 0, policy_version 12170 (0.0028) [2024-06-10 10:41:19,592][32177] Fps is (10 sec: 45875.1, 60 sec: 44783.1, 300 sec: 44764.4). Total num frames: 199540736. Throughput: 0: 44731.3. Samples: 199658480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-10 10:41:19,592][32177] Avg episode reward: [(0, '0.256')] [2024-06-10 10:41:19,668][32415] Updated weights for policy 0, policy_version 12180 (0.0027) [2024-06-10 10:41:23,128][32415] Updated weights for policy 0, policy_version 12190 (0.0034) [2024-06-10 10:41:24,592][32177] Fps is (10 sec: 40960.3, 60 sec: 44236.9, 300 sec: 44820.0). Total num frames: 199753728. Throughput: 0: 45073.7. Samples: 199930820. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 10:41:24,592][32177] Avg episode reward: [(0, '0.250')] [2024-06-10 10:41:26,771][32415] Updated weights for policy 0, policy_version 12200 (0.0032) [2024-06-10 10:41:29,592][32177] Fps is (10 sec: 47513.0, 60 sec: 44783.1, 300 sec: 44875.5). Total num frames: 200015872. Throughput: 0: 44767.1. Samples: 200060340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 10:41:29,592][32177] Avg episode reward: [(0, '0.250')] [2024-06-10 10:41:30,136][32415] Updated weights for policy 0, policy_version 12210 (0.0024) [2024-06-10 10:41:34,003][32415] Updated weights for policy 0, policy_version 12220 (0.0021) [2024-06-10 10:41:34,592][32177] Fps is (10 sec: 47513.2, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 200228864. Throughput: 0: 44949.3. Samples: 200334500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-10 10:41:34,593][32177] Avg episode reward: [(0, '0.250')] [2024-06-10 10:41:37,302][32415] Updated weights for policy 0, policy_version 12230 (0.0025) [2024-06-10 10:41:39,592][32177] Fps is (10 sec: 40960.6, 60 sec: 44510.1, 300 sec: 44820.0). Total num frames: 200425472. Throughput: 0: 45104.0. Samples: 200600740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-10 10:41:39,592][32177] Avg episode reward: [(0, '0.254')] [2024-06-10 10:41:41,447][32415] Updated weights for policy 0, policy_version 12240 (0.0038) [2024-06-10 10:41:44,592][32177] Fps is (10 sec: 44236.9, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 200671232. Throughput: 0: 44942.5. Samples: 200732240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-10 10:41:44,592][32177] Avg episode reward: [(0, '0.241')] [2024-06-10 10:41:44,759][32415] Updated weights for policy 0, policy_version 12250 (0.0043) [2024-06-10 10:41:48,726][32415] Updated weights for policy 0, policy_version 12260 (0.0031) [2024-06-10 10:41:49,592][32177] Fps is (10 sec: 45873.6, 60 sec: 45055.8, 300 sec: 44764.4). 
Total num frames: 200884224. Throughput: 0: 44675.6. Samples: 201000080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-10 10:41:49,593][32177] Avg episode reward: [(0, '0.240')] [2024-06-10 10:41:52,332][32415] Updated weights for policy 0, policy_version 12270 (0.0036) [2024-06-10 10:41:54,592][32177] Fps is (10 sec: 42598.4, 60 sec: 44509.8, 300 sec: 44820.0). Total num frames: 201097216. Throughput: 0: 44949.2. Samples: 201273400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-10 10:41:54,592][32177] Avg episode reward: [(0, '0.239')] [2024-06-10 10:41:56,129][32415] Updated weights for policy 0, policy_version 12280 (0.0034) [2024-06-10 10:41:59,290][32415] Updated weights for policy 0, policy_version 12290 (0.0038) [2024-06-10 10:41:59,592][32177] Fps is (10 sec: 47514.6, 60 sec: 44784.5, 300 sec: 44931.0). Total num frames: 201359360. Throughput: 0: 44764.4. Samples: 201404280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-10 10:41:59,592][32177] Avg episode reward: [(0, '0.238')] [2024-06-10 10:42:03,483][32415] Updated weights for policy 0, policy_version 12300 (0.0033) [2024-06-10 10:42:04,592][32177] Fps is (10 sec: 47513.1, 60 sec: 45055.8, 300 sec: 44875.5). Total num frames: 201572352. Throughput: 0: 44841.6. Samples: 201676360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 10:42:04,592][32177] Avg episode reward: [(0, '0.261')] [2024-06-10 10:42:04,639][32394] Saving new best policy, reward=0.261! [2024-06-10 10:42:06,342][32415] Updated weights for policy 0, policy_version 12310 (0.0037) [2024-06-10 10:42:09,592][32177] Fps is (10 sec: 39322.0, 60 sec: 44509.8, 300 sec: 44765.1). Total num frames: 201752576. Throughput: 0: 44731.6. Samples: 201943740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 10:42:09,592][32177] Avg episode reward: [(0, '0.263')] [2024-06-10 10:42:09,644][32394] Saving new best policy, reward=0.263! 
[2024-06-10 10:42:10,835][32415] Updated weights for policy 0, policy_version 12320 (0.0026) [2024-06-10 10:42:14,286][32415] Updated weights for policy 0, policy_version 12330 (0.0036) [2024-06-10 10:42:14,592][32177] Fps is (10 sec: 44237.2, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 202014720. Throughput: 0: 44671.1. Samples: 202070540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:42:14,601][32177] Avg episode reward: [(0, '0.251')] [2024-06-10 10:42:18,078][32415] Updated weights for policy 0, policy_version 12340 (0.0032) [2024-06-10 10:42:19,592][32177] Fps is (10 sec: 47513.2, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 202227712. Throughput: 0: 44495.6. Samples: 202336800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:42:19,592][32177] Avg episode reward: [(0, '0.245')] [2024-06-10 10:42:21,785][32415] Updated weights for policy 0, policy_version 12350 (0.0040) [2024-06-10 10:42:24,592][32177] Fps is (10 sec: 42597.5, 60 sec: 44782.7, 300 sec: 44820.6). Total num frames: 202440704. Throughput: 0: 44699.6. Samples: 202612240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 10:42:24,593][32177] Avg episode reward: [(0, '0.243')] [2024-06-10 10:42:25,526][32415] Updated weights for policy 0, policy_version 12360 (0.0024) [2024-06-10 10:42:28,780][32415] Updated weights for policy 0, policy_version 12370 (0.0030) [2024-06-10 10:42:29,592][32177] Fps is (10 sec: 44236.5, 60 sec: 44236.8, 300 sec: 44820.0). Total num frames: 202670080. Throughput: 0: 44655.0. Samples: 202741720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 10:42:29,596][32177] Avg episode reward: [(0, '0.248')] [2024-06-10 10:42:33,086][32415] Updated weights for policy 0, policy_version 12380 (0.0039) [2024-06-10 10:42:33,442][32394] Signal inference workers to stop experience collection... 
(2950 times) [2024-06-10 10:42:33,490][32415] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-06-10 10:42:33,497][32394] Signal inference workers to resume experience collection... (2950 times) [2024-06-10 10:42:33,510][32415] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-06-10 10:42:34,592][32177] Fps is (10 sec: 47514.6, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 202915840. Throughput: 0: 44862.0. Samples: 203018860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 10:42:34,592][32177] Avg episode reward: [(0, '0.239')] [2024-06-10 10:42:34,637][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000012386_202932224.pth... [2024-06-10 10:42:34,682][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000011728_192151552.pth [2024-06-10 10:42:35,791][32415] Updated weights for policy 0, policy_version 12390 (0.0031) [2024-06-10 10:42:39,592][32177] Fps is (10 sec: 42599.2, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 203096064. Throughput: 0: 44669.0. Samples: 203283500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 10:42:39,592][32177] Avg episode reward: [(0, '0.244')] [2024-06-10 10:42:40,218][32415] Updated weights for policy 0, policy_version 12400 (0.0027) [2024-06-10 10:42:43,515][32415] Updated weights for policy 0, policy_version 12410 (0.0035) [2024-06-10 10:42:44,596][32177] Fps is (10 sec: 42580.5, 60 sec: 44506.7, 300 sec: 44819.3). Total num frames: 203341824. Throughput: 0: 44605.2. Samples: 203411700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-10 10:42:44,597][32177] Avg episode reward: [(0, '0.260')] [2024-06-10 10:42:47,529][32415] Updated weights for policy 0, policy_version 12420 (0.0033) [2024-06-10 10:42:49,592][32177] Fps is (10 sec: 49151.1, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 203587584. Throughput: 0: 44462.7. Samples: 203677180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-10 10:42:49,592][32177] Avg episode reward: [(0, '0.250')] [2024-06-10 10:42:50,966][32415] Updated weights for policy 0, policy_version 12430 (0.0034) [2024-06-10 10:42:54,592][32177] Fps is (10 sec: 44256.0, 60 sec: 44783.0, 300 sec: 44765.1). Total num frames: 203784192. Throughput: 0: 44754.3. Samples: 203957680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-10 10:42:54,592][32177] Avg episode reward: [(0, '0.246')] [2024-06-10 10:42:54,927][32415] Updated weights for policy 0, policy_version 12440 (0.0037) [2024-06-10 10:42:57,987][32415] Updated weights for policy 0, policy_version 12450 (0.0023) [2024-06-10 10:42:59,592][32177] Fps is (10 sec: 44237.0, 60 sec: 44509.9, 300 sec: 44819.9). Total num frames: 204029952. Throughput: 0: 44732.0. Samples: 204083480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-10 10:42:59,592][32177] Avg episode reward: [(0, '0.238')] [2024-06-10 10:43:02,358][32415] Updated weights for policy 0, policy_version 12460 (0.0030) [2024-06-10 10:43:04,592][32177] Fps is (10 sec: 47513.6, 60 sec: 44783.1, 300 sec: 44820.0). Total num frames: 204259328. Throughput: 0: 44897.5. Samples: 204357180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-10 10:43:04,592][32177] Avg episode reward: [(0, '0.254')] [2024-06-10 10:43:05,282][32415] Updated weights for policy 0, policy_version 12470 (0.0032) [2024-06-10 10:43:09,397][32415] Updated weights for policy 0, policy_version 12480 (0.0040) [2024-06-10 10:43:09,592][32177] Fps is (10 sec: 44237.5, 60 sec: 45329.1, 300 sec: 44820.0). Total num frames: 204472320. Throughput: 0: 44717.3. Samples: 204624500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 10:43:09,592][32177] Avg episode reward: [(0, '0.245')] [2024-06-10 10:43:12,857][32415] Updated weights for policy 0, policy_version 12490 (0.0039) [2024-06-10 10:43:14,592][32177] Fps is (10 sec: 42598.0, 60 sec: 44509.9, 300 sec: 44764.4). 
Total num frames: 204685312. Throughput: 0: 44666.3. Samples: 204751700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 10:43:14,592][32177] Avg episode reward: [(0, '0.254')] [2024-06-10 10:43:16,687][32415] Updated weights for policy 0, policy_version 12500 (0.0029) [2024-06-10 10:43:19,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 204931072. Throughput: 0: 44584.6. Samples: 205025160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 10:43:19,592][32177] Avg episode reward: [(0, '0.252')] [2024-06-10 10:43:20,190][32415] Updated weights for policy 0, policy_version 12510 (0.0040) [2024-06-10 10:43:24,241][32415] Updated weights for policy 0, policy_version 12520 (0.0038) [2024-06-10 10:43:24,592][32177] Fps is (10 sec: 45875.5, 60 sec: 45056.3, 300 sec: 44820.0). Total num frames: 205144064. Throughput: 0: 44899.1. Samples: 205303960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 10:43:24,592][32177] Avg episode reward: [(0, '0.260')] [2024-06-10 10:43:27,195][32415] Updated weights for policy 0, policy_version 12530 (0.0040) [2024-06-10 10:43:29,592][32177] Fps is (10 sec: 40959.0, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 205340672. Throughput: 0: 44915.2. Samples: 205432700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-10 10:43:29,592][32177] Avg episode reward: [(0, '0.244')] [2024-06-10 10:43:31,672][32415] Updated weights for policy 0, policy_version 12540 (0.0035) [2024-06-10 10:43:34,446][32415] Updated weights for policy 0, policy_version 12550 (0.0032) [2024-06-10 10:43:34,592][32177] Fps is (10 sec: 47512.9, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 205619200. Throughput: 0: 44943.1. Samples: 205699620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-10 10:43:34,592][32177] Avg episode reward: [(0, '0.248')] [2024-06-10 10:43:38,736][32415] Updated weights for policy 0, policy_version 12560 (0.0033) [2024-06-10 10:43:39,592][32177] Fps is (10 sec: 47513.6, 60 sec: 45328.9, 300 sec: 44764.5). Total num frames: 205815808. Throughput: 0: 44622.4. Samples: 205965700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 10:43:39,592][32177] Avg episode reward: [(0, '0.257')] [2024-06-10 10:43:41,935][32415] Updated weights for policy 0, policy_version 12570 (0.0038) [2024-06-10 10:43:44,591][32177] Fps is (10 sec: 37684.1, 60 sec: 44240.1, 300 sec: 44598.5). Total num frames: 205996032. Throughput: 0: 44666.0. Samples: 206093440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 10:43:44,592][32177] Avg episode reward: [(0, '0.249')] [2024-06-10 10:43:46,093][32415] Updated weights for policy 0, policy_version 12580 (0.0025) [2024-06-10 10:43:49,518][32415] Updated weights for policy 0, policy_version 12590 (0.0037) [2024-06-10 10:43:49,592][32177] Fps is (10 sec: 45876.1, 60 sec: 44783.0, 300 sec: 44819.9). Total num frames: 206274560. Throughput: 0: 44627.5. Samples: 206365420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 10:43:49,592][32177] Avg episode reward: [(0, '0.257')] [2024-06-10 10:43:53,558][32415] Updated weights for policy 0, policy_version 12600 (0.0030) [2024-06-10 10:43:54,592][32177] Fps is (10 sec: 49151.0, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 206487552. Throughput: 0: 44772.3. Samples: 206639260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 10:43:54,592][32177] Avg episode reward: [(0, '0.258')] [2024-06-10 10:43:56,592][32415] Updated weights for policy 0, policy_version 12610 (0.0025) [2024-06-10 10:43:59,596][32177] Fps is (10 sec: 44217.9, 60 sec: 44779.8, 300 sec: 44819.3). Total num frames: 206716928. Throughput: 0: 44878.0. Samples: 206771400. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 10:43:59,596][32177] Avg episode reward: [(0, '0.251')] [2024-06-10 10:44:01,173][32415] Updated weights for policy 0, policy_version 12620 (0.0031) [2024-06-10 10:44:01,344][32394] Signal inference workers to stop experience collection... (3000 times) [2024-06-10 10:44:01,344][32394] Signal inference workers to resume experience collection... (3000 times) [2024-06-10 10:44:01,396][32415] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-06-10 10:44:01,396][32415] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-06-10 10:44:03,622][32415] Updated weights for policy 0, policy_version 12630 (0.0036) [2024-06-10 10:44:04,592][32177] Fps is (10 sec: 44237.4, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 206929920. Throughput: 0: 44769.8. Samples: 207039800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 10:44:04,592][32177] Avg episode reward: [(0, '0.252')] [2024-06-10 10:44:08,155][32415] Updated weights for policy 0, policy_version 12640 (0.0035) [2024-06-10 10:44:09,592][32177] Fps is (10 sec: 45894.4, 60 sec: 45055.9, 300 sec: 44876.1). Total num frames: 207175680. Throughput: 0: 44558.5. Samples: 207309100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 10:44:09,592][32177] Avg episode reward: [(0, '0.248')] [2024-06-10 10:44:11,260][32415] Updated weights for policy 0, policy_version 12650 (0.0040) [2024-06-10 10:44:14,592][32177] Fps is (10 sec: 42598.3, 60 sec: 44509.9, 300 sec: 44765.1). Total num frames: 207355904. Throughput: 0: 44660.7. Samples: 207442420. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 10:44:14,592][32177] Avg episode reward: [(0, '0.262')] [2024-06-10 10:44:15,164][32415] Updated weights for policy 0, policy_version 12660 (0.0035) [2024-06-10 10:44:18,630][32415] Updated weights for policy 0, policy_version 12670 (0.0031) [2024-06-10 10:44:19,592][32177] Fps is (10 sec: 44236.9, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 207618048. Throughput: 0: 44839.2. Samples: 207717380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 10:44:19,592][32177] Avg episode reward: [(0, '0.256')] [2024-06-10 10:44:22,671][32415] Updated weights for policy 0, policy_version 12680 (0.0035) [2024-06-10 10:44:24,592][32177] Fps is (10 sec: 49151.9, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 207847424. Throughput: 0: 44962.4. Samples: 207989000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 10:44:24,592][32177] Avg episode reward: [(0, '0.248')] [2024-06-10 10:44:25,610][32415] Updated weights for policy 0, policy_version 12690 (0.0030) [2024-06-10 10:44:29,592][32177] Fps is (10 sec: 42598.0, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 208044032. Throughput: 0: 45015.7. Samples: 208119160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 10:44:29,592][32177] Avg episode reward: [(0, '0.256')] [2024-06-10 10:44:30,158][32415] Updated weights for policy 0, policy_version 12700 (0.0022) [2024-06-10 10:44:32,994][32415] Updated weights for policy 0, policy_version 12710 (0.0046) [2024-06-10 10:44:34,592][32177] Fps is (10 sec: 44236.5, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 208289792. Throughput: 0: 44852.8. Samples: 208383800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 10:44:34,592][32177] Avg episode reward: [(0, '0.253')] [2024-06-10 10:44:34,602][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000012713_208289792.pth... 
[2024-06-10 10:44:34,675][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000012056_197525504.pth [2024-06-10 10:44:37,281][32415] Updated weights for policy 0, policy_version 12720 (0.0033) [2024-06-10 10:44:39,593][32177] Fps is (10 sec: 47505.7, 60 sec: 45054.8, 300 sec: 44819.7). Total num frames: 208519168. Throughput: 0: 44825.8. Samples: 208656500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 10:44:39,594][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 10:44:39,596][32394] Saving new best policy, reward=0.265! [2024-06-10 10:44:40,703][32415] Updated weights for policy 0, policy_version 12730 (0.0037) [2024-06-10 10:44:44,364][32415] Updated weights for policy 0, policy_version 12740 (0.0038) [2024-06-10 10:44:44,592][32177] Fps is (10 sec: 44237.2, 60 sec: 45602.1, 300 sec: 44931.0). Total num frames: 208732160. Throughput: 0: 44872.7. Samples: 208790480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 10:44:44,592][32177] Avg episode reward: [(0, '0.247')] [2024-06-10 10:44:47,851][32415] Updated weights for policy 0, policy_version 12750 (0.0037) [2024-06-10 10:44:49,592][32177] Fps is (10 sec: 42605.8, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 208945152. Throughput: 0: 44911.4. Samples: 209060820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 10:44:49,592][32177] Avg episode reward: [(0, '0.253')] [2024-06-10 10:44:51,790][32415] Updated weights for policy 0, policy_version 12760 (0.0033) [2024-06-10 10:44:54,592][32177] Fps is (10 sec: 45873.6, 60 sec: 45055.8, 300 sec: 44820.0). Total num frames: 209190912. Throughput: 0: 45041.1. Samples: 209335960. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-10 10:44:54,593][32177] Avg episode reward: [(0, '0.240')] [2024-06-10 10:44:55,005][32415] Updated weights for policy 0, policy_version 12770 (0.0028) [2024-06-10 10:44:59,372][32415] Updated weights for policy 0, policy_version 12780 (0.0034) [2024-06-10 10:44:59,592][32177] Fps is (10 sec: 45875.1, 60 sec: 44786.0, 300 sec: 44875.5). Total num frames: 209403904. Throughput: 0: 45091.4. Samples: 209471540. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-10 10:44:59,592][32177] Avg episode reward: [(0, '0.261')] [2024-06-10 10:45:02,413][32415] Updated weights for policy 0, policy_version 12790 (0.0039) [2024-06-10 10:45:04,592][32177] Fps is (10 sec: 42599.3, 60 sec: 44782.8, 300 sec: 44709.5). Total num frames: 209616896. Throughput: 0: 44870.6. Samples: 209736560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 10:45:04,592][32177] Avg episode reward: [(0, '0.252')] [2024-06-10 10:45:06,392][32415] Updated weights for policy 0, policy_version 12800 (0.0036) [2024-06-10 10:45:09,592][32177] Fps is (10 sec: 44237.5, 60 sec: 44510.0, 300 sec: 44708.9). Total num frames: 209846272. Throughput: 0: 44836.5. Samples: 210006640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 10:45:09,592][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 10:45:09,951][32415] Updated weights for policy 0, policy_version 12810 (0.0031) [2024-06-10 10:45:13,490][32415] Updated weights for policy 0, policy_version 12820 (0.0030) [2024-06-10 10:45:14,594][32177] Fps is (10 sec: 45864.6, 60 sec: 45327.2, 300 sec: 44819.6). Total num frames: 210075648. Throughput: 0: 45088.9. Samples: 210148260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-10 10:45:14,595][32177] Avg episode reward: [(0, '0.252')] [2024-06-10 10:45:16,984][32415] Updated weights for policy 0, policy_version 12830 (0.0034) [2024-06-10 10:45:19,592][32177] Fps is (10 sec: 44236.9, 60 sec: 44510.0, 300 sec: 44708.9). 
Total num frames: 210288640. Throughput: 0: 45198.3. Samples: 210417720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-10 10:45:19,592][32177] Avg episode reward: [(0, '0.251')] [2024-06-10 10:45:20,860][32415] Updated weights for policy 0, policy_version 12840 (0.0034) [2024-06-10 10:45:24,090][32415] Updated weights for policy 0, policy_version 12850 (0.0032) [2024-06-10 10:45:24,592][32177] Fps is (10 sec: 47525.3, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 210550784. Throughput: 0: 45070.8. Samples: 210684600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 10:45:24,592][32177] Avg episode reward: [(0, '0.251')] [2024-06-10 10:45:28,294][32415] Updated weights for policy 0, policy_version 12860 (0.0032) [2024-06-10 10:45:29,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45056.2, 300 sec: 44820.0). Total num frames: 210747392. Throughput: 0: 45250.2. Samples: 210826740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 10:45:29,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 10:45:29,741][32394] Saving new best policy, reward=0.268! [2024-06-10 10:45:31,590][32415] Updated weights for policy 0, policy_version 12870 (0.0029) [2024-06-10 10:45:34,038][32394] Signal inference workers to stop experience collection... (3050 times) [2024-06-10 10:45:34,039][32394] Signal inference workers to resume experience collection... (3050 times) [2024-06-10 10:45:34,051][32415] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-06-10 10:45:34,072][32415] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-06-10 10:45:34,592][32177] Fps is (10 sec: 42598.1, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 210976768. Throughput: 0: 45068.1. Samples: 211088880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-10 10:45:34,592][32177] Avg episode reward: [(0, '0.251')] [2024-06-10 10:45:35,397][32415] Updated weights for policy 0, policy_version 12880 (0.0037) [2024-06-10 10:45:38,990][32415] Updated weights for policy 0, policy_version 12890 (0.0029) [2024-06-10 10:45:39,596][32177] Fps is (10 sec: 47493.0, 60 sec: 45054.2, 300 sec: 44819.3). Total num frames: 211222528. Throughput: 0: 44860.9. Samples: 211354880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-10 10:45:39,597][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 10:45:42,610][32415] Updated weights for policy 0, policy_version 12900 (0.0024) [2024-06-10 10:45:44,592][32177] Fps is (10 sec: 47513.1, 60 sec: 45328.9, 300 sec: 44986.6). Total num frames: 211451904. Throughput: 0: 45029.3. Samples: 211497860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 10:45:44,592][32177] Avg episode reward: [(0, '0.256')] [2024-06-10 10:45:46,057][32415] Updated weights for policy 0, policy_version 12910 (0.0043) [2024-06-10 10:45:49,592][32177] Fps is (10 sec: 44255.0, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 211664896. Throughput: 0: 45102.6. Samples: 211766180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 10:45:49,592][32177] Avg episode reward: [(0, '0.253')] [2024-06-10 10:45:50,021][32415] Updated weights for policy 0, policy_version 12920 (0.0030) [2024-06-10 10:45:53,459][32415] Updated weights for policy 0, policy_version 12930 (0.0031) [2024-06-10 10:45:54,592][32177] Fps is (10 sec: 44236.4, 60 sec: 45056.1, 300 sec: 44820.2). Total num frames: 211894272. Throughput: 0: 45183.3. Samples: 212039900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 10:45:54,596][32177] Avg episode reward: [(0, '0.259')] [2024-06-10 10:45:57,026][32415] Updated weights for policy 0, policy_version 12940 (0.0031) [2024-06-10 10:45:59,592][32177] Fps is (10 sec: 44236.3, 60 sec: 45055.9, 300 sec: 44875.4). 
Total num frames: 212107264. Throughput: 0: 44925.7. Samples: 212169820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 10:45:59,593][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 10:46:00,642][32415] Updated weights for policy 0, policy_version 12950 (0.0035) [2024-06-10 10:46:04,313][32415] Updated weights for policy 0, policy_version 12960 (0.0029) [2024-06-10 10:46:04,592][32177] Fps is (10 sec: 44237.4, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 212336640. Throughput: 0: 44981.2. Samples: 212441880. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-10 10:46:04,592][32177] Avg episode reward: [(0, '0.254')] [2024-06-10 10:46:08,138][32415] Updated weights for policy 0, policy_version 12970 (0.0032) [2024-06-10 10:46:09,592][32177] Fps is (10 sec: 44237.5, 60 sec: 45055.9, 300 sec: 44764.4). Total num frames: 212549632. Throughput: 0: 44867.4. Samples: 212703640. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-10 10:46:09,592][32177] Avg episode reward: [(0, '0.259')] [2024-06-10 10:46:11,851][32415] Updated weights for policy 0, policy_version 12980 (0.0043) [2024-06-10 10:46:14,592][32177] Fps is (10 sec: 42598.4, 60 sec: 44784.7, 300 sec: 44819.9). Total num frames: 212762624. Throughput: 0: 44641.6. Samples: 212835620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 10:46:14,592][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 10:46:15,177][32415] Updated weights for policy 0, policy_version 12990 (0.0037) [2024-06-10 10:46:19,074][32415] Updated weights for policy 0, policy_version 13000 (0.0028) [2024-06-10 10:46:19,596][32177] Fps is (10 sec: 45856.0, 60 sec: 45325.8, 300 sec: 44930.4). Total num frames: 213008384. Throughput: 0: 44955.7. Samples: 213112080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 10:46:19,597][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 10:46:22,662][32415] Updated weights for policy 0, policy_version 13010 (0.0034) [2024-06-10 10:46:24,592][32177] Fps is (10 sec: 45874.3, 60 sec: 44509.6, 300 sec: 44764.4). Total num frames: 213221376. Throughput: 0: 45083.1. Samples: 213383440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 10:46:24,593][32177] Avg episode reward: [(0, '0.255')] [2024-06-10 10:46:26,055][32415] Updated weights for policy 0, policy_version 13020 (0.0030) [2024-06-10 10:46:29,596][32177] Fps is (10 sec: 44236.7, 60 sec: 45052.7, 300 sec: 44819.3). Total num frames: 213450752. Throughput: 0: 44859.4. Samples: 213516720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 10:46:29,597][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 10:46:29,955][32415] Updated weights for policy 0, policy_version 13030 (0.0036) [2024-06-10 10:46:33,451][32415] Updated weights for policy 0, policy_version 13040 (0.0030) [2024-06-10 10:46:34,592][32177] Fps is (10 sec: 45875.7, 60 sec: 45055.9, 300 sec: 44931.0). Total num frames: 213680128. Throughput: 0: 44891.5. Samples: 213786300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 10:46:34,592][32177] Avg episode reward: [(0, '0.261')] [2024-06-10 10:46:34,636][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000013043_213696512.pth... [2024-06-10 10:46:34,704][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000012386_202932224.pth [2024-06-10 10:46:37,270][32415] Updated weights for policy 0, policy_version 13050 (0.0031) [2024-06-10 10:46:39,592][32177] Fps is (10 sec: 42616.2, 60 sec: 44239.9, 300 sec: 44764.4). Total num frames: 213876736. Throughput: 0: 44740.5. Samples: 214053220. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 10:46:39,596][32177] Avg episode reward: [(0, '0.258')] [2024-06-10 10:46:40,882][32415] Updated weights for policy 0, policy_version 13060 (0.0033) [2024-06-10 10:46:44,400][32415] Updated weights for policy 0, policy_version 13070 (0.0027) [2024-06-10 10:46:44,592][32177] Fps is (10 sec: 45874.7, 60 sec: 44782.8, 300 sec: 44931.0). Total num frames: 214138880. Throughput: 0: 44825.8. Samples: 214186980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 10:46:44,593][32177] Avg episode reward: [(0, '0.259')] [2024-06-10 10:46:48,169][32415] Updated weights for policy 0, policy_version 13080 (0.0024) [2024-06-10 10:46:49,592][32177] Fps is (10 sec: 49152.7, 60 sec: 45056.1, 300 sec: 44986.6). Total num frames: 214368256. Throughput: 0: 44832.1. Samples: 214459320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 10:46:49,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 10:46:51,996][32415] Updated weights for policy 0, policy_version 13090 (0.0037) [2024-06-10 10:46:54,592][32177] Fps is (10 sec: 42599.4, 60 sec: 44510.0, 300 sec: 44764.4). Total num frames: 214564864. Throughput: 0: 45008.9. Samples: 214729040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:46:54,592][32177] Avg episode reward: [(0, '0.260')] [2024-06-10 10:46:55,199][32415] Updated weights for policy 0, policy_version 13100 (0.0041) [2024-06-10 10:46:56,203][32394] Signal inference workers to stop experience collection... (3100 times) [2024-06-10 10:46:56,204][32394] Signal inference workers to resume experience collection... 
(3100 times) [2024-06-10 10:46:56,250][32415] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-06-10 10:46:56,250][32415] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-06-10 10:46:59,105][32415] Updated weights for policy 0, policy_version 13110 (0.0037) [2024-06-10 10:46:59,592][32177] Fps is (10 sec: 44236.3, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 214810624. Throughput: 0: 44988.9. Samples: 214860120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:46:59,592][32177] Avg episode reward: [(0, '0.260')] [2024-06-10 10:47:02,788][32415] Updated weights for policy 0, policy_version 13120 (0.0046) [2024-06-10 10:47:04,596][32177] Fps is (10 sec: 47493.5, 60 sec: 45052.8, 300 sec: 45041.5). Total num frames: 215040000. Throughput: 0: 44800.9. Samples: 215128120. Policy #0 lag: (min: 0.0, avg: 8.0, max: 18.0) [2024-06-10 10:47:04,596][32177] Avg episode reward: [(0, '0.252')] [2024-06-10 10:47:06,457][32415] Updated weights for policy 0, policy_version 13130 (0.0021) [2024-06-10 10:47:09,592][32177] Fps is (10 sec: 40960.7, 60 sec: 44510.0, 300 sec: 44764.4). Total num frames: 215220224. Throughput: 0: 44937.2. Samples: 215405600. Policy #0 lag: (min: 0.0, avg: 8.0, max: 18.0) [2024-06-10 10:47:09,592][32177] Avg episode reward: [(0, '0.254')] [2024-06-10 10:47:10,270][32415] Updated weights for policy 0, policy_version 13140 (0.0039) [2024-06-10 10:47:13,786][32415] Updated weights for policy 0, policy_version 13150 (0.0028) [2024-06-10 10:47:14,592][32177] Fps is (10 sec: 44255.2, 60 sec: 45329.0, 300 sec: 44931.0). Total num frames: 215482368. Throughput: 0: 44891.3. Samples: 215536640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 10:47:14,592][32177] Avg episode reward: [(0, '0.256')] [2024-06-10 10:47:17,563][32415] Updated weights for policy 0, policy_version 13160 (0.0027) [2024-06-10 10:47:19,592][32177] Fps is (10 sec: 47513.4, 60 sec: 44786.2, 300 sec: 44931.1). 
Total num frames: 215695360. Throughput: 0: 44843.3. Samples: 215804240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 10:47:19,592][32177] Avg episode reward: [(0, '0.259')] [2024-06-10 10:47:20,788][32415] Updated weights for policy 0, policy_version 13170 (0.0040) [2024-06-10 10:47:24,592][32177] Fps is (10 sec: 42598.8, 60 sec: 44783.1, 300 sec: 44875.5). Total num frames: 215908352. Throughput: 0: 44982.8. Samples: 216077440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-10 10:47:24,592][32177] Avg episode reward: [(0, '0.262')] [2024-06-10 10:47:24,821][32415] Updated weights for policy 0, policy_version 13180 (0.0032) [2024-06-10 10:47:28,138][32415] Updated weights for policy 0, policy_version 13190 (0.0038) [2024-06-10 10:47:29,596][32177] Fps is (10 sec: 45855.7, 60 sec: 45056.1, 300 sec: 44874.9). Total num frames: 216154112. Throughput: 0: 44949.8. Samples: 216209900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-10 10:47:29,597][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 10:47:32,167][32415] Updated weights for policy 0, policy_version 13200 (0.0028) [2024-06-10 10:47:34,592][32177] Fps is (10 sec: 45875.5, 60 sec: 44783.1, 300 sec: 44986.6). Total num frames: 216367104. Throughput: 0: 44766.7. Samples: 216473820. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-10 10:47:34,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 10:47:34,625][32394] Saving new best policy, reward=0.269! [2024-06-10 10:47:35,678][32415] Updated weights for policy 0, policy_version 13210 (0.0026) [2024-06-10 10:47:39,309][32415] Updated weights for policy 0, policy_version 13220 (0.0024) [2024-06-10 10:47:39,592][32177] Fps is (10 sec: 45893.3, 60 sec: 45602.0, 300 sec: 44987.2). Total num frames: 216612864. Throughput: 0: 45003.3. Samples: 216754200. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-10 10:47:39,593][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 10:47:42,641][32415] Updated weights for policy 0, policy_version 13230 (0.0043) [2024-06-10 10:47:44,592][32177] Fps is (10 sec: 45875.4, 60 sec: 44783.2, 300 sec: 44875.5). Total num frames: 216825856. Throughput: 0: 45070.8. Samples: 216888300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-10 10:47:44,592][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 10:47:46,675][32415] Updated weights for policy 0, policy_version 13240 (0.0031) [2024-06-10 10:47:49,592][32177] Fps is (10 sec: 45876.7, 60 sec: 45056.0, 300 sec: 45042.1). Total num frames: 217071616. Throughput: 0: 45245.7. Samples: 217163980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-10 10:47:49,592][32177] Avg episode reward: [(0, '0.262')] [2024-06-10 10:47:49,671][32415] Updated weights for policy 0, policy_version 13250 (0.0040) [2024-06-10 10:47:53,878][32415] Updated weights for policy 0, policy_version 13260 (0.0031) [2024-06-10 10:47:54,592][32177] Fps is (10 sec: 45874.9, 60 sec: 45329.1, 300 sec: 44931.1). Total num frames: 217284608. Throughput: 0: 44837.7. Samples: 217423300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-10 10:47:54,592][32177] Avg episode reward: [(0, '0.267')] [2024-06-10 10:47:57,371][32415] Updated weights for policy 0, policy_version 13270 (0.0029) [2024-06-10 10:47:59,592][32177] Fps is (10 sec: 42597.9, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 217497600. Throughput: 0: 44859.6. Samples: 217555320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-10 10:47:59,592][32177] Avg episode reward: [(0, '0.249')] [2024-06-10 10:48:01,324][32415] Updated weights for policy 0, policy_version 13280 (0.0032) [2024-06-10 10:48:04,592][32177] Fps is (10 sec: 42598.0, 60 sec: 44513.0, 300 sec: 44875.5). Total num frames: 217710592. Throughput: 0: 44919.4. Samples: 217825620. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-10 10:48:04,592][32177] Avg episode reward: [(0, '0.257')] [2024-06-10 10:48:05,030][32415] Updated weights for policy 0, policy_version 13290 (0.0037) [2024-06-10 10:48:08,651][32415] Updated weights for policy 0, policy_version 13300 (0.0033) [2024-06-10 10:48:09,592][32177] Fps is (10 sec: 44237.3, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 217939968. Throughput: 0: 44783.6. Samples: 218092700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-10 10:48:09,592][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 10:48:12,322][32415] Updated weights for policy 0, policy_version 13310 (0.0028) [2024-06-10 10:48:13,050][32394] Signal inference workers to stop experience collection... (3150 times) [2024-06-10 10:48:13,050][32394] Signal inference workers to resume experience collection... (3150 times) [2024-06-10 10:48:13,090][32415] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-06-10 10:48:13,090][32415] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-06-10 10:48:14,592][32177] Fps is (10 sec: 44236.8, 60 sec: 44509.9, 300 sec: 44819.9). Total num frames: 218152960. Throughput: 0: 44853.0. Samples: 218228100. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-10 10:48:14,592][32177] Avg episode reward: [(0, '0.256')] [2024-06-10 10:48:16,139][32415] Updated weights for policy 0, policy_version 13320 (0.0029) [2024-06-10 10:48:19,592][32177] Fps is (10 sec: 42598.3, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 218365952. Throughput: 0: 45014.6. Samples: 218499480. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-10 10:48:19,592][32177] Avg episode reward: [(0, '0.262')] [2024-06-10 10:48:19,924][32415] Updated weights for policy 0, policy_version 13330 (0.0032) [2024-06-10 10:48:23,485][32415] Updated weights for policy 0, policy_version 13340 (0.0040) [2024-06-10 10:48:24,596][32177] Fps is (10 sec: 45855.9, 60 sec: 45052.8, 300 sec: 44985.9). Total num frames: 218611712. Throughput: 0: 44492.0. Samples: 218756520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-10 10:48:24,597][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 10:48:27,262][32415] Updated weights for policy 0, policy_version 13350 (0.0035) [2024-06-10 10:48:29,594][32177] Fps is (10 sec: 44225.5, 60 sec: 44238.1, 300 sec: 44708.5). Total num frames: 218808320. Throughput: 0: 44609.9. Samples: 218895860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-10 10:48:29,595][32177] Avg episode reward: [(0, '0.261')] [2024-06-10 10:48:30,717][32415] Updated weights for policy 0, policy_version 13360 (0.0034) [2024-06-10 10:48:34,221][32415] Updated weights for policy 0, policy_version 13370 (0.0036) [2024-06-10 10:48:34,592][32177] Fps is (10 sec: 45895.2, 60 sec: 45056.0, 300 sec: 44931.1). Total num frames: 219070464. Throughput: 0: 44504.0. Samples: 219166660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-10 10:48:34,592][32177] Avg episode reward: [(0, '0.253')] [2024-06-10 10:48:34,612][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000013371_219070464.pth... [2024-06-10 10:48:34,661][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000012713_208289792.pth [2024-06-10 10:48:37,986][32415] Updated weights for policy 0, policy_version 13380 (0.0029) [2024-06-10 10:48:39,592][32177] Fps is (10 sec: 47526.0, 60 sec: 44510.1, 300 sec: 45042.1). Total num frames: 219283456. Throughput: 0: 44743.2. Samples: 219436740. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-10 10:48:39,592][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 10:48:41,382][32415] Updated weights for policy 0, policy_version 13390 (0.0027) [2024-06-10 10:48:44,592][32177] Fps is (10 sec: 40959.6, 60 sec: 44236.7, 300 sec: 44764.4). Total num frames: 219480064. Throughput: 0: 44691.1. Samples: 219566420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 10:48:44,592][32177] Avg episode reward: [(0, '0.246')] [2024-06-10 10:48:45,257][32415] Updated weights for policy 0, policy_version 13400 (0.0034) [2024-06-10 10:48:49,042][32415] Updated weights for policy 0, policy_version 13410 (0.0030) [2024-06-10 10:48:49,592][32177] Fps is (10 sec: 45875.1, 60 sec: 44509.9, 300 sec: 44931.1). Total num frames: 219742208. Throughput: 0: 44597.0. Samples: 219832480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 10:48:49,592][32177] Avg episode reward: [(0, '0.260')] [2024-06-10 10:48:52,846][32415] Updated weights for policy 0, policy_version 13420 (0.0032) [2024-06-10 10:48:54,592][32177] Fps is (10 sec: 45874.6, 60 sec: 44236.7, 300 sec: 44820.6). Total num frames: 219938816. Throughput: 0: 44711.3. Samples: 220104720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 10:48:54,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 10:48:56,299][32415] Updated weights for policy 0, policy_version 13430 (0.0026) [2024-06-10 10:48:59,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 220168192. Throughput: 0: 44672.5. Samples: 220238360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 10:48:59,592][32177] Avg episode reward: [(0, '0.261')] [2024-06-10 10:49:00,209][32415] Updated weights for policy 0, policy_version 13440 (0.0026) [2024-06-10 10:49:03,238][32415] Updated weights for policy 0, policy_version 13450 (0.0027) [2024-06-10 10:49:04,592][32177] Fps is (10 sec: 45876.2, 60 sec: 44783.0, 300 sec: 44820.0). 
Total num frames: 220397568. Throughput: 0: 44619.6. Samples: 220507360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-10 10:49:04,592][32177] Avg episode reward: [(0, '0.256')] [2024-06-10 10:49:07,364][32415] Updated weights for policy 0, policy_version 13460 (0.0031) [2024-06-10 10:49:09,596][32177] Fps is (10 sec: 44217.8, 60 sec: 44506.7, 300 sec: 44930.4). Total num frames: 220610560. Throughput: 0: 44839.1. Samples: 220774280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-10 10:49:09,597][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 10:49:10,549][32415] Updated weights for policy 0, policy_version 13470 (0.0035) [2024-06-10 10:49:14,591][32177] Fps is (10 sec: 44237.0, 60 sec: 44783.1, 300 sec: 44820.0). Total num frames: 220839936. Throughput: 0: 44819.1. Samples: 220912600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 10:49:14,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 10:49:14,682][32415] Updated weights for policy 0, policy_version 13480 (0.0039) [2024-06-10 10:49:18,114][32415] Updated weights for policy 0, policy_version 13490 (0.0024) [2024-06-10 10:49:19,592][32177] Fps is (10 sec: 45894.4, 60 sec: 45055.9, 300 sec: 44819.9). Total num frames: 221069312. Throughput: 0: 44803.8. Samples: 221182840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 10:49:19,593][32177] Avg episode reward: [(0, '0.262')] [2024-06-10 10:49:22,070][32415] Updated weights for policy 0, policy_version 13500 (0.0038) [2024-06-10 10:49:24,592][32177] Fps is (10 sec: 47512.8, 60 sec: 45059.2, 300 sec: 44986.6). Total num frames: 221315072. Throughput: 0: 44747.9. Samples: 221450400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-10 10:49:24,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 10:49:24,599][32394] Saving new best policy, reward=0.270! 
[2024-06-10 10:49:25,378][32415] Updated weights for policy 0, policy_version 13510 (0.0029) [2024-06-10 10:49:29,362][32415] Updated weights for policy 0, policy_version 13520 (0.0046) [2024-06-10 10:49:29,592][32177] Fps is (10 sec: 44237.5, 60 sec: 45057.9, 300 sec: 44820.0). Total num frames: 221511680. Throughput: 0: 44795.7. Samples: 221582220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-10 10:49:29,592][32177] Avg episode reward: [(0, '0.250')] [2024-06-10 10:49:31,364][32394] Signal inference workers to stop experience collection... (3200 times) [2024-06-10 10:49:31,364][32394] Signal inference workers to resume experience collection... (3200 times) [2024-06-10 10:49:31,386][32415] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-06-10 10:49:31,386][32415] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-06-10 10:49:32,494][32415] Updated weights for policy 0, policy_version 13530 (0.0030) [2024-06-10 10:49:34,592][32177] Fps is (10 sec: 44237.4, 60 sec: 44783.0, 300 sec: 44875.8). Total num frames: 221757440. Throughput: 0: 44971.1. Samples: 221856180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:49:34,592][32177] Avg episode reward: [(0, '0.261')] [2024-06-10 10:49:36,473][32415] Updated weights for policy 0, policy_version 13540 (0.0034) [2024-06-10 10:49:39,592][32177] Fps is (10 sec: 47513.4, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 221986816. Throughput: 0: 44868.6. Samples: 222123800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:49:39,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 10:49:39,815][32415] Updated weights for policy 0, policy_version 13550 (0.0037) [2024-06-10 10:49:44,176][32415] Updated weights for policy 0, policy_version 13560 (0.0039) [2024-06-10 10:49:44,592][32177] Fps is (10 sec: 42597.5, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 222183424. Throughput: 0: 44707.0. Samples: 222250180. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-10 10:49:44,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 10:49:47,467][32415] Updated weights for policy 0, policy_version 13570 (0.0040) [2024-06-10 10:49:49,592][32177] Fps is (10 sec: 42598.4, 60 sec: 44509.8, 300 sec: 44820.0). Total num frames: 222412800. Throughput: 0: 44680.9. Samples: 222518000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-10 10:49:49,592][32177] Avg episode reward: [(0, '0.257')] [2024-06-10 10:49:51,687][32415] Updated weights for policy 0, policy_version 13580 (0.0026) [2024-06-10 10:49:54,596][32177] Fps is (10 sec: 45856.7, 60 sec: 45053.0, 300 sec: 44874.9). Total num frames: 222642176. Throughput: 0: 44685.9. Samples: 222785140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 10:49:54,596][32177] Avg episode reward: [(0, '0.253')] [2024-06-10 10:49:54,693][32415] Updated weights for policy 0, policy_version 13590 (0.0037) [2024-06-10 10:49:58,960][32415] Updated weights for policy 0, policy_version 13600 (0.0035) [2024-06-10 10:49:59,596][32177] Fps is (10 sec: 44217.8, 60 sec: 44779.7, 300 sec: 44874.9). Total num frames: 222855168. Throughput: 0: 44521.4. Samples: 222916260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 10:49:59,596][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 10:49:59,597][32394] Saving new best policy, reward=0.275! [2024-06-10 10:50:01,878][32415] Updated weights for policy 0, policy_version 13610 (0.0029) [2024-06-10 10:50:04,592][32177] Fps is (10 sec: 44255.5, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 223084544. Throughput: 0: 44631.3. Samples: 223191240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 10:50:04,592][32177] Avg episode reward: [(0, '0.256')] [2024-06-10 10:50:06,000][32415] Updated weights for policy 0, policy_version 13620 (0.0039) [2024-06-10 10:50:09,080][32415] Updated weights for policy 0, policy_version 13630 (0.0042) [2024-06-10 10:50:09,592][32177] Fps is (10 sec: 47534.0, 60 sec: 45332.3, 300 sec: 44931.4). Total num frames: 223330304. Throughput: 0: 44617.8. Samples: 223458200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 10:50:09,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 10:50:13,361][32415] Updated weights for policy 0, policy_version 13640 (0.0040) [2024-06-10 10:50:14,592][32177] Fps is (10 sec: 44236.7, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 223526912. Throughput: 0: 44800.4. Samples: 223598240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 10:50:14,592][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 10:50:16,487][32415] Updated weights for policy 0, policy_version 13650 (0.0030) [2024-06-10 10:50:19,592][32177] Fps is (10 sec: 42597.6, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 223756288. Throughput: 0: 44617.1. Samples: 223863960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-10 10:50:19,592][32177] Avg episode reward: [(0, '0.258')] [2024-06-10 10:50:20,958][32415] Updated weights for policy 0, policy_version 13660 (0.0027) [2024-06-10 10:50:23,874][32415] Updated weights for policy 0, policy_version 13670 (0.0026) [2024-06-10 10:50:24,594][32177] Fps is (10 sec: 45864.6, 60 sec: 44508.2, 300 sec: 44875.1). Total num frames: 223985664. Throughput: 0: 44525.3. Samples: 224127540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-10 10:50:24,595][32177] Avg episode reward: [(0, '0.263')] [2024-06-10 10:50:28,119][32415] Updated weights for policy 0, policy_version 13680 (0.0034) [2024-06-10 10:50:29,592][32177] Fps is (10 sec: 44237.9, 60 sec: 44783.0, 300 sec: 44820.0). 
Total num frames: 224198656. Throughput: 0: 44907.3. Samples: 224271000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-10 10:50:29,592][32177] Avg episode reward: [(0, '0.249')] [2024-06-10 10:50:30,881][32415] Updated weights for policy 0, policy_version 13690 (0.0039) [2024-06-10 10:50:34,592][32177] Fps is (10 sec: 42608.4, 60 sec: 44236.8, 300 sec: 44709.5). Total num frames: 224411648. Throughput: 0: 44960.5. Samples: 224541220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-10 10:50:34,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 10:50:34,682][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000013698_224428032.pth... [2024-06-10 10:50:34,737][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000013043_213696512.pth [2024-06-10 10:50:35,100][32415] Updated weights for policy 0, policy_version 13700 (0.0041) [2024-06-10 10:50:38,391][32415] Updated weights for policy 0, policy_version 13710 (0.0028) [2024-06-10 10:50:39,592][32177] Fps is (10 sec: 47513.0, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 224673792. Throughput: 0: 44879.7. Samples: 224804540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-10 10:50:39,592][32177] Avg episode reward: [(0, '0.255')] [2024-06-10 10:50:42,794][32415] Updated weights for policy 0, policy_version 13720 (0.0041) [2024-06-10 10:50:44,592][32177] Fps is (10 sec: 47513.0, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 224886784. Throughput: 0: 45060.2. Samples: 224943780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-10 10:50:44,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 10:50:45,913][32415] Updated weights for policy 0, policy_version 13730 (0.0031) [2024-06-10 10:50:49,596][32177] Fps is (10 sec: 40943.4, 60 sec: 44506.8, 300 sec: 44708.3). Total num frames: 225083392. Throughput: 0: 44795.4. Samples: 225207220. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-10 10:50:49,596][32177] Avg episode reward: [(0, '0.260')] [2024-06-10 10:50:50,328][32394] Signal inference workers to stop experience collection... (3250 times) [2024-06-10 10:50:50,328][32394] Signal inference workers to resume experience collection... (3250 times) [2024-06-10 10:50:50,331][32415] Updated weights for policy 0, policy_version 13740 (0.0026) [2024-06-10 10:50:50,344][32415] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-06-10 10:50:50,344][32415] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-06-10 10:50:53,052][32415] Updated weights for policy 0, policy_version 13750 (0.0030) [2024-06-10 10:50:54,592][32177] Fps is (10 sec: 44236.7, 60 sec: 44786.0, 300 sec: 44820.0). Total num frames: 225329152. Throughput: 0: 44839.9. Samples: 225476000. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-10 10:50:54,592][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 10:50:57,398][32415] Updated weights for policy 0, policy_version 13760 (0.0047) [2024-06-10 10:50:59,592][32177] Fps is (10 sec: 47533.0, 60 sec: 45059.2, 300 sec: 44820.0). Total num frames: 225558528. Throughput: 0: 44923.9. Samples: 225619820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 10:50:59,592][32177] Avg episode reward: [(0, '0.256')] [2024-06-10 10:51:00,061][32415] Updated weights for policy 0, policy_version 13770 (0.0028) [2024-06-10 10:51:04,424][32415] Updated weights for policy 0, policy_version 13780 (0.0029) [2024-06-10 10:51:04,592][32177] Fps is (10 sec: 44237.4, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 225771520. Throughput: 0: 44844.2. Samples: 225881940. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 10:51:04,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 10:51:07,387][32415] Updated weights for policy 0, policy_version 13790 (0.0032) [2024-06-10 10:51:09,592][32177] Fps is (10 sec: 44235.9, 60 sec: 44509.7, 300 sec: 44875.5). Total num frames: 226000896. Throughput: 0: 44952.7. Samples: 226150320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-10 10:51:09,592][32177] Avg episode reward: [(0, '0.262')] [2024-06-10 10:51:11,826][32415] Updated weights for policy 0, policy_version 13800 (0.0031) [2024-06-10 10:51:14,592][32177] Fps is (10 sec: 47513.1, 60 sec: 45329.0, 300 sec: 44876.1). Total num frames: 226246656. Throughput: 0: 44690.1. Samples: 226282060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-10 10:51:14,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 10:51:14,875][32415] Updated weights for policy 0, policy_version 13810 (0.0031) [2024-06-10 10:51:19,241][32415] Updated weights for policy 0, policy_version 13820 (0.0030) [2024-06-10 10:51:19,592][32177] Fps is (10 sec: 42599.3, 60 sec: 44510.0, 300 sec: 44764.5). Total num frames: 226426880. Throughput: 0: 44858.2. Samples: 226559840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 10:51:19,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 10:51:22,128][32415] Updated weights for policy 0, policy_version 13830 (0.0033) [2024-06-10 10:51:24,592][32177] Fps is (10 sec: 42598.3, 60 sec: 44784.6, 300 sec: 44820.6). Total num frames: 226672640. Throughput: 0: 44838.6. Samples: 226822280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 10:51:24,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 10:51:24,610][32394] Saving new best policy, reward=0.277! 
[2024-06-10 10:51:26,595][32415] Updated weights for policy 0, policy_version 13840 (0.0031) [2024-06-10 10:51:29,174][32415] Updated weights for policy 0, policy_version 13850 (0.0026) [2024-06-10 10:51:29,592][32177] Fps is (10 sec: 50790.2, 60 sec: 45602.0, 300 sec: 44931.1). Total num frames: 226934784. Throughput: 0: 44913.4. Samples: 226964880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 10:51:29,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 10:51:34,028][32415] Updated weights for policy 0, policy_version 13860 (0.0026) [2024-06-10 10:51:34,592][32177] Fps is (10 sec: 45875.7, 60 sec: 45329.0, 300 sec: 44931.1). Total num frames: 227131392. Throughput: 0: 45021.9. Samples: 227233020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 10:51:34,592][32177] Avg episode reward: [(0, '0.267')] [2024-06-10 10:51:36,676][32415] Updated weights for policy 0, policy_version 13870 (0.0035) [2024-06-10 10:51:39,592][32177] Fps is (10 sec: 40959.7, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 227344384. Throughput: 0: 44809.8. Samples: 227492440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:51:39,592][32177] Avg episode reward: [(0, '0.259')] [2024-06-10 10:51:41,318][32415] Updated weights for policy 0, policy_version 13880 (0.0046) [2024-06-10 10:51:44,258][32415] Updated weights for policy 0, policy_version 13890 (0.0049) [2024-06-10 10:51:44,592][32177] Fps is (10 sec: 44236.1, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 227573760. Throughput: 0: 44451.5. Samples: 227620140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:51:44,593][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 10:51:48,767][32415] Updated weights for policy 0, policy_version 13900 (0.0032) [2024-06-10 10:51:49,592][32177] Fps is (10 sec: 45875.7, 60 sec: 45332.1, 300 sec: 44875.5). Total num frames: 227803136. Throughput: 0: 44819.5. Samples: 227898820. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-10 10:51:49,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 10:51:51,665][32415] Updated weights for policy 0, policy_version 13910 (0.0028) [2024-06-10 10:51:54,592][32177] Fps is (10 sec: 42598.5, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 227999744. Throughput: 0: 44645.9. Samples: 228159380. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-10 10:51:54,592][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 10:51:55,991][32415] Updated weights for policy 0, policy_version 13920 (0.0029) [2024-06-10 10:51:56,023][32394] Signal inference workers to stop experience collection... (3300 times) [2024-06-10 10:51:56,028][32394] Signal inference workers to resume experience collection... (3300 times) [2024-06-10 10:51:56,062][32415] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-06-10 10:51:56,063][32415] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-06-10 10:51:58,759][32415] Updated weights for policy 0, policy_version 13930 (0.0026) [2024-06-10 10:51:59,592][32177] Fps is (10 sec: 44237.0, 60 sec: 44783.0, 300 sec: 44765.1). Total num frames: 228245504. Throughput: 0: 44737.4. Samples: 228295240. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-10 10:51:59,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 10:52:03,474][32415] Updated weights for policy 0, policy_version 13940 (0.0026) [2024-06-10 10:52:04,592][32177] Fps is (10 sec: 45874.6, 60 sec: 44782.7, 300 sec: 44875.5). Total num frames: 228458496. Throughput: 0: 44663.8. Samples: 228569720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-10 10:52:04,592][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 10:52:06,273][32415] Updated weights for policy 0, policy_version 13950 (0.0038) [2024-06-10 10:52:09,592][32177] Fps is (10 sec: 40959.9, 60 sec: 44237.0, 300 sec: 44653.4). Total num frames: 228655104. Throughput: 0: 44557.0. 
Samples: 228827340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 10:52:09,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 10:52:10,819][32415] Updated weights for policy 0, policy_version 13960 (0.0028) [2024-06-10 10:52:13,716][32415] Updated weights for policy 0, policy_version 13970 (0.0036) [2024-06-10 10:52:14,592][32177] Fps is (10 sec: 44238.0, 60 sec: 44236.9, 300 sec: 44764.4). Total num frames: 228900864. Throughput: 0: 44350.3. Samples: 228960640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 10:52:14,592][32177] Avg episode reward: [(0, '0.254')] [2024-06-10 10:52:17,918][32415] Updated weights for policy 0, policy_version 13980 (0.0033) [2024-06-10 10:52:19,592][32177] Fps is (10 sec: 49152.2, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 229146624. Throughput: 0: 44420.9. Samples: 229231960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-10 10:52:19,592][32177] Avg episode reward: [(0, '0.267')] [2024-06-10 10:52:21,063][32415] Updated weights for policy 0, policy_version 13990 (0.0030) [2024-06-10 10:52:24,592][32177] Fps is (10 sec: 42597.3, 60 sec: 44236.7, 300 sec: 44654.0). Total num frames: 229326848. Throughput: 0: 44784.4. Samples: 229507740. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-10 10:52:24,593][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 10:52:25,310][32415] Updated weights for policy 0, policy_version 14000 (0.0029) [2024-06-10 10:52:28,121][32415] Updated weights for policy 0, policy_version 14010 (0.0032) [2024-06-10 10:52:29,592][32177] Fps is (10 sec: 44236.8, 60 sec: 44236.9, 300 sec: 44820.0). Total num frames: 229588992. Throughput: 0: 44817.9. Samples: 229636940. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:52:29,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 10:52:32,491][32415] Updated weights for policy 0, policy_version 14020 (0.0036) [2024-06-10 10:52:34,592][32177] Fps is (10 sec: 47514.3, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 229801984. Throughput: 0: 44765.3. Samples: 229913260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:52:34,592][32177] Avg episode reward: [(0, '0.256')] [2024-06-10 10:52:34,619][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000014027_229818368.pth... [2024-06-10 10:52:34,689][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000013371_219070464.pth [2024-06-10 10:52:35,585][32415] Updated weights for policy 0, policy_version 14030 (0.0033) [2024-06-10 10:52:39,592][32177] Fps is (10 sec: 42598.5, 60 sec: 44510.0, 300 sec: 44708.9). Total num frames: 230014976. Throughput: 0: 45038.4. Samples: 230186100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:52:39,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 10:52:39,608][32415] Updated weights for policy 0, policy_version 14040 (0.0030) [2024-06-10 10:52:42,729][32415] Updated weights for policy 0, policy_version 14050 (0.0028) [2024-06-10 10:52:44,591][32177] Fps is (10 sec: 45876.1, 60 sec: 44783.1, 300 sec: 44708.9). Total num frames: 230260736. Throughput: 0: 44912.5. Samples: 230316300. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-10 10:52:44,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 10:52:46,784][32415] Updated weights for policy 0, policy_version 14060 (0.0035) [2024-06-10 10:52:49,592][32177] Fps is (10 sec: 47513.1, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 230490112. Throughput: 0: 44881.5. Samples: 230589380. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-10 10:52:49,592][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 10:52:50,199][32415] Updated weights for policy 0, policy_version 14070 (0.0027) [2024-06-10 10:52:54,388][32415] Updated weights for policy 0, policy_version 14080 (0.0039) [2024-06-10 10:52:54,591][32177] Fps is (10 sec: 42598.3, 60 sec: 44783.1, 300 sec: 44708.9). Total num frames: 230686720. Throughput: 0: 45152.5. Samples: 230859200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 10:52:54,592][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 10:52:57,503][32415] Updated weights for policy 0, policy_version 14090 (0.0037) [2024-06-10 10:52:59,592][32177] Fps is (10 sec: 44236.5, 60 sec: 44782.8, 300 sec: 44820.0). Total num frames: 230932480. Throughput: 0: 45091.4. Samples: 230989760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 10:52:59,592][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 10:53:01,501][32415] Updated weights for policy 0, policy_version 14100 (0.0039) [2024-06-10 10:53:04,592][32177] Fps is (10 sec: 49151.9, 60 sec: 45329.3, 300 sec: 44875.5). Total num frames: 231178240. Throughput: 0: 44975.1. Samples: 231255840. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-10 10:53:04,592][32177] Avg episode reward: [(0, '0.261')] [2024-06-10 10:53:04,601][32415] Updated weights for policy 0, policy_version 14110 (0.0039) [2024-06-10 10:53:09,031][32415] Updated weights for policy 0, policy_version 14120 (0.0031) [2024-06-10 10:53:09,592][32177] Fps is (10 sec: 44237.1, 60 sec: 45329.0, 300 sec: 44820.0). Total num frames: 231374848. Throughput: 0: 44905.9. Samples: 231528500. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-10 10:53:09,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 10:53:12,227][32415] Updated weights for policy 0, policy_version 14130 (0.0032) [2024-06-10 10:53:14,592][32177] Fps is (10 sec: 42596.6, 60 sec: 45055.7, 300 sec: 44875.4). 
Total num frames: 231604224. Throughput: 0: 44928.1. Samples: 231658720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 10:53:14,593][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 10:53:16,133][32415] Updated weights for policy 0, policy_version 14140 (0.0040) [2024-06-10 10:53:16,987][32394] Signal inference workers to stop experience collection... (3350 times) [2024-06-10 10:53:17,020][32415] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-06-10 10:53:17,036][32394] Signal inference workers to resume experience collection... (3350 times) [2024-06-10 10:53:17,041][32415] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-06-10 10:53:19,516][32415] Updated weights for policy 0, policy_version 14150 (0.0027) [2024-06-10 10:53:19,592][32177] Fps is (10 sec: 45874.5, 60 sec: 44782.8, 300 sec: 44820.6). Total num frames: 231833600. Throughput: 0: 44705.2. Samples: 231925000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 10:53:19,593][32177] Avg episode reward: [(0, '0.260')] [2024-06-10 10:53:23,455][32415] Updated weights for policy 0, policy_version 14160 (0.0028) [2024-06-10 10:53:24,592][32177] Fps is (10 sec: 44237.4, 60 sec: 45329.1, 300 sec: 44875.9). Total num frames: 232046592. Throughput: 0: 44743.7. Samples: 232199580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 10:53:24,592][32177] Avg episode reward: [(0, '0.257')] [2024-06-10 10:53:26,904][32415] Updated weights for policy 0, policy_version 14170 (0.0041) [2024-06-10 10:53:29,596][32177] Fps is (10 sec: 42581.1, 60 sec: 44506.7, 300 sec: 44708.2). Total num frames: 232259584. Throughput: 0: 44836.5. Samples: 232334140. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 10:53:29,596][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 10:53:30,745][32415] Updated weights for policy 0, policy_version 14180 (0.0034) [2024-06-10 10:53:34,084][32415] Updated weights for policy 0, policy_version 14190 (0.0039) [2024-06-10 10:53:34,592][32177] Fps is (10 sec: 44236.6, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 232488960. Throughput: 0: 44674.0. Samples: 232599720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-10 10:53:34,593][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 10:53:38,285][32415] Updated weights for policy 0, policy_version 14200 (0.0026) [2024-06-10 10:53:39,596][32177] Fps is (10 sec: 47513.6, 60 sec: 45325.8, 300 sec: 44930.4). Total num frames: 232734720. Throughput: 0: 44678.3. Samples: 232869920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-10 10:53:39,596][32177] Avg episode reward: [(0, '0.258')] [2024-06-10 10:53:41,606][32415] Updated weights for policy 0, policy_version 14210 (0.0032) [2024-06-10 10:53:44,596][32177] Fps is (10 sec: 44217.9, 60 sec: 44506.4, 300 sec: 44708.2). Total num frames: 232931328. Throughput: 0: 44593.4. Samples: 232996660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 10:53:44,597][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 10:53:45,555][32415] Updated weights for policy 0, policy_version 14220 (0.0024) [2024-06-10 10:53:48,845][32415] Updated weights for policy 0, policy_version 14230 (0.0033) [2024-06-10 10:53:49,596][32177] Fps is (10 sec: 44236.6, 60 sec: 44779.8, 300 sec: 44874.9). Total num frames: 233177088. Throughput: 0: 44696.1. Samples: 233267360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 10:53:49,597][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 10:53:52,878][32415] Updated weights for policy 0, policy_version 14240 (0.0027) [2024-06-10 10:53:54,592][32177] Fps is (10 sec: 45896.0, 60 sec: 45055.9, 300 sec: 44820.0). 
Total num frames: 233390080. Throughput: 0: 44697.4. Samples: 233539880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 10:53:54,592][32177] Avg episode reward: [(0, '0.260')] [2024-06-10 10:53:56,224][32415] Updated weights for policy 0, policy_version 14250 (0.0031) [2024-06-10 10:53:59,592][32177] Fps is (10 sec: 42616.6, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 233603072. Throughput: 0: 44727.9. Samples: 233671460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 10:53:59,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 10:54:00,006][32415] Updated weights for policy 0, policy_version 14260 (0.0027) [2024-06-10 10:54:03,935][32415] Updated weights for policy 0, policy_version 14270 (0.0036) [2024-06-10 10:54:04,592][32177] Fps is (10 sec: 45875.2, 60 sec: 44509.8, 300 sec: 44876.2). Total num frames: 233848832. Throughput: 0: 44802.9. Samples: 233941120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 10:54:04,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 10:54:07,491][32415] Updated weights for policy 0, policy_version 14280 (0.0028) [2024-06-10 10:54:09,592][32177] Fps is (10 sec: 45875.0, 60 sec: 44782.9, 300 sec: 44819.9). Total num frames: 234061824. Throughput: 0: 44536.1. Samples: 234203700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 10:54:09,592][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 10:54:11,043][32415] Updated weights for policy 0, policy_version 14290 (0.0023) [2024-06-10 10:54:14,592][32177] Fps is (10 sec: 40959.8, 60 sec: 44237.0, 300 sec: 44708.9). Total num frames: 234258432. Throughput: 0: 44541.5. Samples: 234338320. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 10:54:14,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 10:54:14,921][32415] Updated weights for policy 0, policy_version 14300 (0.0029) [2024-06-10 10:54:18,270][32415] Updated weights for policy 0, policy_version 14310 (0.0033) [2024-06-10 10:54:19,596][32177] Fps is (10 sec: 44219.3, 60 sec: 44507.0, 300 sec: 44708.3). Total num frames: 234504192. Throughput: 0: 44645.6. Samples: 234608940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-10 10:54:19,596][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 10:54:22,095][32415] Updated weights for policy 0, policy_version 14320 (0.0029) [2024-06-10 10:54:24,592][32177] Fps is (10 sec: 45874.7, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 234717184. Throughput: 0: 44576.1. Samples: 234875660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-10 10:54:24,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 10:54:25,510][32415] Updated weights for policy 0, policy_version 14330 (0.0023) [2024-06-10 10:54:29,321][32415] Updated weights for policy 0, policy_version 14340 (0.0036) [2024-06-10 10:54:29,592][32177] Fps is (10 sec: 44254.7, 60 sec: 44786.1, 300 sec: 44708.9). Total num frames: 234946560. Throughput: 0: 44881.4. Samples: 235016120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 10:54:29,592][32177] Avg episode reward: [(0, '0.262')] [2024-06-10 10:54:33,169][32415] Updated weights for policy 0, policy_version 14350 (0.0026) [2024-06-10 10:54:34,596][32177] Fps is (10 sec: 45856.0, 60 sec: 44779.9, 300 sec: 44708.2). Total num frames: 235175936. Throughput: 0: 44859.1. Samples: 235286020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 10:54:34,597][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 10:54:34,659][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000014355_235192320.pth... 
[2024-06-10 10:54:34,722][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000013698_224428032.pth [2024-06-10 10:54:36,838][32415] Updated weights for policy 0, policy_version 14360 (0.0046) [2024-06-10 10:54:39,592][32177] Fps is (10 sec: 44236.8, 60 sec: 44240.0, 300 sec: 44764.4). Total num frames: 235388928. Throughput: 0: 44716.9. Samples: 235552140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 10:54:39,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 10:54:40,163][32415] Updated weights for policy 0, policy_version 14370 (0.0022) [2024-06-10 10:54:44,231][32415] Updated weights for policy 0, policy_version 14380 (0.0036) [2024-06-10 10:54:44,592][32177] Fps is (10 sec: 42616.8, 60 sec: 44513.2, 300 sec: 44708.9). Total num frames: 235601920. Throughput: 0: 44812.9. Samples: 235688040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 10:54:44,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 10:54:44,835][32394] Signal inference workers to stop experience collection... (3400 times) [2024-06-10 10:54:44,835][32394] Signal inference workers to resume experience collection... (3400 times) [2024-06-10 10:54:44,878][32415] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-06-10 10:54:44,878][32415] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-06-10 10:54:47,540][32415] Updated weights for policy 0, policy_version 14390 (0.0034) [2024-06-10 10:54:49,592][32177] Fps is (10 sec: 45874.7, 60 sec: 44513.0, 300 sec: 44765.0). Total num frames: 235847680. Throughput: 0: 44692.8. Samples: 235952300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-10 10:54:49,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 10:54:51,427][32415] Updated weights for policy 0, policy_version 14400 (0.0031) [2024-06-10 10:54:54,592][32177] Fps is (10 sec: 47513.9, 60 sec: 44782.9, 300 sec: 44820.6). Total num frames: 236077056. Throughput: 0: 44910.3. 
Samples: 236224660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-10 10:54:54,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 10:54:54,926][32415] Updated weights for policy 0, policy_version 14410 (0.0032) [2024-06-10 10:54:58,466][32415] Updated weights for policy 0, policy_version 14420 (0.0047) [2024-06-10 10:54:59,596][32177] Fps is (10 sec: 44218.2, 60 sec: 44779.7, 300 sec: 44763.8). Total num frames: 236290048. Throughput: 0: 44883.8. Samples: 236358280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 10:54:59,597][32177] Avg episode reward: [(0, '0.267')] [2024-06-10 10:55:02,134][32415] Updated weights for policy 0, policy_version 14430 (0.0034) [2024-06-10 10:55:04,596][32177] Fps is (10 sec: 44217.7, 60 sec: 44506.7, 300 sec: 44708.2). Total num frames: 236519424. Throughput: 0: 44838.9. Samples: 236626700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 10:55:04,597][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 10:55:06,123][32415] Updated weights for policy 0, policy_version 14440 (0.0030) [2024-06-10 10:55:09,592][32177] Fps is (10 sec: 44255.7, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 236732416. Throughput: 0: 44800.1. Samples: 236891660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-10 10:55:09,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 10:55:09,778][32415] Updated weights for policy 0, policy_version 14450 (0.0021) [2024-06-10 10:55:13,574][32415] Updated weights for policy 0, policy_version 14460 (0.0037) [2024-06-10 10:55:14,594][32177] Fps is (10 sec: 45885.9, 60 sec: 45327.6, 300 sec: 44819.7). Total num frames: 236978176. Throughput: 0: 44642.5. Samples: 237025120. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-10 10:55:14,594][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 10:55:16,785][32415] Updated weights for policy 0, policy_version 14470 (0.0032) [2024-06-10 10:55:19,592][32177] Fps is (10 sec: 44237.2, 60 sec: 44512.9, 300 sec: 44709.2). Total num frames: 237174784. Throughput: 0: 44666.6. Samples: 237295820. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-10 10:55:19,592][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 10:55:20,527][32415] Updated weights for policy 0, policy_version 14480 (0.0034) [2024-06-10 10:55:24,180][32415] Updated weights for policy 0, policy_version 14490 (0.0040) [2024-06-10 10:55:24,592][32177] Fps is (10 sec: 44244.4, 60 sec: 45055.9, 300 sec: 44819.9). Total num frames: 237420544. Throughput: 0: 44726.8. Samples: 237564860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-10 10:55:24,593][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 10:55:27,591][32415] Updated weights for policy 0, policy_version 14500 (0.0031) [2024-06-10 10:55:29,593][32177] Fps is (10 sec: 47507.0, 60 sec: 45055.0, 300 sec: 44875.3). Total num frames: 237649920. Throughput: 0: 44637.4. Samples: 237696780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 10:55:29,594][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 10:55:31,574][32415] Updated weights for policy 0, policy_version 14510 (0.0028) [2024-06-10 10:55:34,592][32177] Fps is (10 sec: 44236.0, 60 sec: 44785.8, 300 sec: 44708.8). Total num frames: 237862912. Throughput: 0: 44895.7. Samples: 237972620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 10:55:34,593][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 10:55:35,254][32415] Updated weights for policy 0, policy_version 14520 (0.0032) [2024-06-10 10:55:39,207][32415] Updated weights for policy 0, policy_version 14530 (0.0040) [2024-06-10 10:55:39,592][32177] Fps is (10 sec: 42604.3, 60 sec: 44783.0, 300 sec: 44708.9). 
Total num frames: 238075904. Throughput: 0: 44850.7. Samples: 238242940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 10:55:39,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 10:55:42,490][32415] Updated weights for policy 0, policy_version 14540 (0.0028) [2024-06-10 10:55:44,596][32177] Fps is (10 sec: 45857.4, 60 sec: 45325.8, 300 sec: 44875.5). Total num frames: 238321664. Throughput: 0: 44841.8. Samples: 238376160. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-10 10:55:44,597][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 10:55:46,248][32415] Updated weights for policy 0, policy_version 14550 (0.0035) [2024-06-10 10:55:49,466][32415] Updated weights for policy 0, policy_version 14560 (0.0025) [2024-06-10 10:55:49,592][32177] Fps is (10 sec: 47512.6, 60 sec: 45055.9, 300 sec: 44820.0). Total num frames: 238551040. Throughput: 0: 44899.7. Samples: 238647000. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-10 10:55:49,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 10:55:53,479][32415] Updated weights for policy 0, policy_version 14570 (0.0037) [2024-06-10 10:55:54,592][32177] Fps is (10 sec: 44255.1, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 238764032. Throughput: 0: 44871.4. Samples: 238910880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-10 10:55:54,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 10:55:56,747][32415] Updated weights for policy 0, policy_version 14580 (0.0024) [2024-06-10 10:55:59,592][32177] Fps is (10 sec: 42598.6, 60 sec: 44786.1, 300 sec: 44764.4). Total num frames: 238977024. Throughput: 0: 44858.3. Samples: 239043660. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-10 10:55:59,592][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 10:56:01,208][32415] Updated weights for policy 0, policy_version 14590 (0.0033) [2024-06-10 10:56:04,467][32415] Updated weights for policy 0, policy_version 14600 (0.0027) [2024-06-10 10:56:04,592][32177] Fps is (10 sec: 44237.7, 60 sec: 44786.2, 300 sec: 44764.5). Total num frames: 239206400. Throughput: 0: 44749.3. Samples: 239309540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-10 10:56:04,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 10:56:08,397][32415] Updated weights for policy 0, policy_version 14610 (0.0035) [2024-06-10 10:56:09,072][32394] Signal inference workers to stop experience collection... (3450 times) [2024-06-10 10:56:09,073][32394] Signal inference workers to resume experience collection... (3450 times) [2024-06-10 10:56:09,095][32415] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-06-10 10:56:09,095][32415] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-06-10 10:56:09,592][32177] Fps is (10 sec: 45874.4, 60 sec: 45055.8, 300 sec: 44708.8). Total num frames: 239435776. Throughput: 0: 44887.5. Samples: 239584800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-10 10:56:09,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 10:56:11,680][32415] Updated weights for policy 0, policy_version 14620 (0.0026) [2024-06-10 10:56:14,592][32177] Fps is (10 sec: 45874.7, 60 sec: 44784.3, 300 sec: 44875.5). Total num frames: 239665152. Throughput: 0: 44903.9. Samples: 239717400. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-10 10:56:14,592][32177] Avg episode reward: [(0, '0.267')] [2024-06-10 10:56:15,485][32415] Updated weights for policy 0, policy_version 14630 (0.0035) [2024-06-10 10:56:19,034][32415] Updated weights for policy 0, policy_version 14640 (0.0029) [2024-06-10 10:56:19,592][32177] Fps is (10 sec: 44236.7, 60 sec: 45055.7, 300 sec: 44764.4). Total num frames: 239878144. Throughput: 0: 44751.2. Samples: 239986420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-10 10:56:19,592][32177] Avg episode reward: [(0, '0.260')] [2024-06-10 10:56:22,849][32415] Updated weights for policy 0, policy_version 14650 (0.0040) [2024-06-10 10:56:24,592][32177] Fps is (10 sec: 42599.0, 60 sec: 44510.1, 300 sec: 44597.8). Total num frames: 240091136. Throughput: 0: 44704.5. Samples: 240254640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 10:56:24,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 10:56:26,179][32415] Updated weights for policy 0, policy_version 14660 (0.0036) [2024-06-10 10:56:29,592][32177] Fps is (10 sec: 44237.7, 60 sec: 44510.8, 300 sec: 44708.9). Total num frames: 240320512. Throughput: 0: 44629.0. Samples: 240384280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 10:56:29,592][32177] Avg episode reward: [(0, '0.260')] [2024-06-10 10:56:30,438][32415] Updated weights for policy 0, policy_version 14670 (0.0034) [2024-06-10 10:56:33,769][32415] Updated weights for policy 0, policy_version 14680 (0.0026) [2024-06-10 10:56:34,596][32177] Fps is (10 sec: 47492.5, 60 sec: 45053.1, 300 sec: 44819.3). Total num frames: 240566272. Throughput: 0: 44605.2. Samples: 240654420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 10:56:34,597][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 10:56:34,610][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000014683_240566272.pth... 
[2024-06-10 10:56:34,674][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000014027_229818368.pth [2024-06-10 10:56:37,768][32415] Updated weights for policy 0, policy_version 14690 (0.0038) [2024-06-10 10:56:39,592][32177] Fps is (10 sec: 44237.4, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 240762880. Throughput: 0: 44582.4. Samples: 240917080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 10:56:39,592][32177] Avg episode reward: [(0, '0.260')] [2024-06-10 10:56:40,877][32415] Updated weights for policy 0, policy_version 14700 (0.0028) [2024-06-10 10:56:44,592][32177] Fps is (10 sec: 42617.3, 60 sec: 44513.1, 300 sec: 44708.9). Total num frames: 240992256. Throughput: 0: 44601.1. Samples: 241050700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-10 10:56:44,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 10:56:44,664][32394] Saving new best policy, reward=0.279! [2024-06-10 10:56:44,668][32415] Updated weights for policy 0, policy_version 14710 (0.0035) [2024-06-10 10:56:48,521][32415] Updated weights for policy 0, policy_version 14720 (0.0031) [2024-06-10 10:56:49,591][32177] Fps is (10 sec: 45875.8, 60 sec: 44510.1, 300 sec: 44820.0). Total num frames: 241221632. Throughput: 0: 44684.5. Samples: 241320340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-10 10:56:49,592][32177] Avg episode reward: [(0, '0.263')] [2024-06-10 10:56:52,126][32415] Updated weights for policy 0, policy_version 14730 (0.0026) [2024-06-10 10:56:54,596][32177] Fps is (10 sec: 44217.5, 60 sec: 44506.8, 300 sec: 44708.2). Total num frames: 241434624. Throughput: 0: 44702.7. Samples: 241596600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-10 10:56:54,596][32177] Avg episode reward: [(0, '0.263')] [2024-06-10 10:56:55,545][32415] Updated weights for policy 0, policy_version 14740 (0.0034) [2024-06-10 10:56:59,596][32177] Fps is (10 sec: 40941.9, 60 sec: 44233.7, 300 sec: 44652.7). 
Total num frames: 241631232. Throughput: 0: 44710.0. Samples: 241729540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-10 10:56:59,596][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 10:56:59,761][32415] Updated weights for policy 0, policy_version 14750 (0.0041) [2024-06-10 10:57:02,832][32415] Updated weights for policy 0, policy_version 14760 (0.0041) [2024-06-10 10:57:04,592][32177] Fps is (10 sec: 45894.2, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 241893376. Throughput: 0: 44601.1. Samples: 241993460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-10 10:57:04,593][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 10:57:07,068][32415] Updated weights for policy 0, policy_version 14770 (0.0032) [2024-06-10 10:57:09,596][32177] Fps is (10 sec: 49151.9, 60 sec: 44779.9, 300 sec: 44819.3). Total num frames: 242122752. Throughput: 0: 44657.4. Samples: 242264420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 10:57:09,597][32177] Avg episode reward: [(0, '0.267')] [2024-06-10 10:57:10,344][32415] Updated weights for policy 0, policy_version 14780 (0.0027) [2024-06-10 10:57:14,117][32415] Updated weights for policy 0, policy_version 14790 (0.0037) [2024-06-10 10:57:14,592][32177] Fps is (10 sec: 45875.0, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 242352128. Throughput: 0: 44733.3. Samples: 242397280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 10:57:14,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 10:57:17,926][32415] Updated weights for policy 0, policy_version 14800 (0.0038) [2024-06-10 10:57:19,592][32177] Fps is (10 sec: 44255.8, 60 sec: 44783.2, 300 sec: 44875.5). Total num frames: 242565120. Throughput: 0: 44854.6. Samples: 242672680. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 10:57:19,592][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 10:57:21,310][32415] Updated weights for policy 0, policy_version 14810 (0.0033) [2024-06-10 10:57:24,592][32177] Fps is (10 sec: 42598.6, 60 sec: 44782.8, 300 sec: 44708.9). Total num frames: 242778112. Throughput: 0: 44992.3. Samples: 242941740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 10:57:24,592][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 10:57:24,973][32415] Updated weights for policy 0, policy_version 14820 (0.0041) [2024-06-10 10:57:28,809][32415] Updated weights for policy 0, policy_version 14830 (0.0029) [2024-06-10 10:57:29,592][32177] Fps is (10 sec: 45875.0, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 243023872. Throughput: 0: 45019.9. Samples: 243076600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-10 10:57:29,596][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 10:57:32,126][32415] Updated weights for policy 0, policy_version 14840 (0.0033) [2024-06-10 10:57:34,592][32177] Fps is (10 sec: 45875.8, 60 sec: 44513.1, 300 sec: 44819.9). Total num frames: 243236864. Throughput: 0: 45003.8. Samples: 243345520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-10 10:57:34,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 10:57:35,907][32415] Updated weights for policy 0, policy_version 14850 (0.0043) [2024-06-10 10:57:39,439][32415] Updated weights for policy 0, policy_version 14860 (0.0022) [2024-06-10 10:57:39,592][32177] Fps is (10 sec: 44235.9, 60 sec: 45055.8, 300 sec: 44764.4). Total num frames: 243466240. Throughput: 0: 44870.3. Samples: 243615580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:57:39,593][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 10:57:43,062][32415] Updated weights for policy 0, policy_version 14870 (0.0041) [2024-06-10 10:57:44,592][32177] Fps is (10 sec: 45874.9, 60 sec: 45055.9, 300 sec: 44764.4). 
Total num frames: 243695616. Throughput: 0: 45019.3. Samples: 243755220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:57:44,592][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 10:57:45,068][32394] Signal inference workers to stop experience collection... (3500 times) [2024-06-10 10:57:45,088][32415] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-06-10 10:57:45,172][32394] Signal inference workers to resume experience collection... (3500 times) [2024-06-10 10:57:45,173][32415] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-06-10 10:57:46,895][32415] Updated weights for policy 0, policy_version 14880 (0.0031) [2024-06-10 10:57:49,592][32177] Fps is (10 sec: 45875.9, 60 sec: 45055.8, 300 sec: 44875.5). Total num frames: 243924992. Throughput: 0: 45010.7. Samples: 244018940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 10:57:49,593][32177] Avg episode reward: [(0, '0.263')] [2024-06-10 10:57:50,108][32415] Updated weights for policy 0, policy_version 14890 (0.0039) [2024-06-10 10:57:54,100][32415] Updated weights for policy 0, policy_version 14900 (0.0046) [2024-06-10 10:57:54,592][32177] Fps is (10 sec: 44236.8, 60 sec: 45059.2, 300 sec: 44764.4). Total num frames: 244137984. Throughput: 0: 45056.7. Samples: 244291780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-10 10:57:54,592][32177] Avg episode reward: [(0, '0.263')] [2024-06-10 10:57:57,712][32415] Updated weights for policy 0, policy_version 14910 (0.0033) [2024-06-10 10:57:59,592][32177] Fps is (10 sec: 42599.0, 60 sec: 45332.3, 300 sec: 44653.3). Total num frames: 244350976. Throughput: 0: 45055.3. Samples: 244424760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-10 10:57:59,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 10:58:01,078][32415] Updated weights for policy 0, policy_version 14920 (0.0036) [2024-06-10 10:58:04,592][32177] Fps is (10 sec: 44236.7, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 244580352. Throughput: 0: 44905.7. Samples: 244693440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 10:58:04,592][32177] Avg episode reward: [(0, '0.263')] [2024-06-10 10:58:04,822][32415] Updated weights for policy 0, policy_version 14930 (0.0031) [2024-06-10 10:58:08,409][32415] Updated weights for policy 0, policy_version 14940 (0.0031) [2024-06-10 10:58:09,592][32177] Fps is (10 sec: 44236.9, 60 sec: 44513.1, 300 sec: 44708.9). Total num frames: 244793344. Throughput: 0: 44983.8. Samples: 244966000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 10:58:09,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 10:58:12,154][32415] Updated weights for policy 0, policy_version 14950 (0.0022) [2024-06-10 10:58:14,592][32177] Fps is (10 sec: 45875.6, 60 sec: 44783.1, 300 sec: 44764.4). Total num frames: 245039104. Throughput: 0: 44872.9. Samples: 245095880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 10:58:14,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 10:58:15,918][32415] Updated weights for policy 0, policy_version 14960 (0.0049) [2024-06-10 10:58:19,409][32415] Updated weights for policy 0, policy_version 14970 (0.0033) [2024-06-10 10:58:19,592][32177] Fps is (10 sec: 47513.3, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 245268480. Throughput: 0: 44960.0. Samples: 245368720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 10:58:19,592][32177] Avg episode reward: [(0, '0.267')] [2024-06-10 10:58:23,073][32415] Updated weights for policy 0, policy_version 14980 (0.0031) [2024-06-10 10:58:24,592][32177] Fps is (10 sec: 42598.7, 60 sec: 44783.1, 300 sec: 44765.1). 
Total num frames: 245465088. Throughput: 0: 44907.4. Samples: 245636400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 10:58:24,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 10:58:26,907][32415] Updated weights for policy 0, policy_version 14990 (0.0038) [2024-06-10 10:58:29,591][32177] Fps is (10 sec: 42598.9, 60 sec: 44510.0, 300 sec: 44764.5). Total num frames: 245694464. Throughput: 0: 44621.6. Samples: 245763180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 10:58:29,592][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 10:58:30,152][32415] Updated weights for policy 0, policy_version 15000 (0.0034) [2024-06-10 10:58:33,960][32415] Updated weights for policy 0, policy_version 15010 (0.0031) [2024-06-10 10:58:34,592][32177] Fps is (10 sec: 45874.5, 60 sec: 44782.9, 300 sec: 44709.5). Total num frames: 245923840. Throughput: 0: 44843.1. Samples: 246036880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 10:58:34,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 10:58:34,606][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000015010_245923840.pth... [2024-06-10 10:58:34,676][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000014355_235192320.pth [2024-06-10 10:58:37,770][32415] Updated weights for policy 0, policy_version 15020 (0.0023) [2024-06-10 10:58:39,592][32177] Fps is (10 sec: 45874.5, 60 sec: 44783.1, 300 sec: 44820.6). Total num frames: 246153216. Throughput: 0: 44967.2. Samples: 246315300. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-10 10:58:39,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 10:58:41,356][32415] Updated weights for policy 0, policy_version 15030 (0.0031) [2024-06-10 10:58:44,592][32177] Fps is (10 sec: 45875.8, 60 sec: 44783.0, 300 sec: 44765.1). Total num frames: 246382592. Throughput: 0: 44756.9. Samples: 246438820. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-10 10:58:44,592][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 10:58:45,101][32415] Updated weights for policy 0, policy_version 15040 (0.0032) [2024-06-10 10:58:48,549][32415] Updated weights for policy 0, policy_version 15050 (0.0030) [2024-06-10 10:58:49,592][32177] Fps is (10 sec: 44235.8, 60 sec: 44509.7, 300 sec: 44764.4). Total num frames: 246595584. Throughput: 0: 44742.1. Samples: 246706840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-10 10:58:49,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 10:58:52,158][32415] Updated weights for policy 0, policy_version 15060 (0.0030) [2024-06-10 10:58:54,596][32177] Fps is (10 sec: 44217.6, 60 sec: 44779.8, 300 sec: 44819.3). Total num frames: 246824960. Throughput: 0: 44713.4. Samples: 246978300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-10 10:58:54,597][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 10:58:55,998][32415] Updated weights for policy 0, policy_version 15070 (0.0041) [2024-06-10 10:58:59,363][32415] Updated weights for policy 0, policy_version 15080 (0.0031) [2024-06-10 10:58:59,592][32177] Fps is (10 sec: 47514.1, 60 sec: 45328.9, 300 sec: 44819.9). Total num frames: 247070720. Throughput: 0: 44858.1. Samples: 247114500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 10:58:59,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 10:59:03,157][32415] Updated weights for policy 0, policy_version 15090 (0.0030) [2024-06-10 10:59:04,592][32177] Fps is (10 sec: 45894.3, 60 sec: 45056.0, 300 sec: 44819.9). Total num frames: 247283712. Throughput: 0: 44626.5. Samples: 247376920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 10:59:04,592][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 10:59:07,002][32415] Updated weights for policy 0, policy_version 15100 (0.0039) [2024-06-10 10:59:08,708][32394] Signal inference workers to stop experience collection... 
(3550 times) [2024-06-10 10:59:08,760][32394] Signal inference workers to resume experience collection... (3550 times) [2024-06-10 10:59:08,761][32415] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-06-10 10:59:08,779][32415] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-06-10 10:59:09,592][32177] Fps is (10 sec: 42599.1, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 247496704. Throughput: 0: 44776.0. Samples: 247651320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 10:59:09,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 10:59:10,393][32415] Updated weights for policy 0, policy_version 15110 (0.0038) [2024-06-10 10:59:14,115][32415] Updated weights for policy 0, policy_version 15120 (0.0027) [2024-06-10 10:59:14,592][32177] Fps is (10 sec: 44237.5, 60 sec: 44783.0, 300 sec: 44820.6). Total num frames: 247726080. Throughput: 0: 44966.1. Samples: 247786660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 10:59:14,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 10:59:17,627][32415] Updated weights for policy 0, policy_version 15130 (0.0027) [2024-06-10 10:59:19,592][32177] Fps is (10 sec: 42597.9, 60 sec: 44236.7, 300 sec: 44764.4). Total num frames: 247922688. Throughput: 0: 44861.8. Samples: 248055660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 10:59:19,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 10:59:21,342][32415] Updated weights for policy 0, policy_version 15140 (0.0025) [2024-06-10 10:59:24,592][32177] Fps is (10 sec: 45874.1, 60 sec: 45328.9, 300 sec: 44875.5). Total num frames: 248184832. Throughput: 0: 44770.0. Samples: 248329960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 10:59:24,593][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 10:59:24,945][32415] Updated weights for policy 0, policy_version 15150 (0.0030) [2024-06-10 10:59:28,793][32415] Updated weights for policy 0, policy_version 15160 (0.0036) [2024-06-10 10:59:29,596][32177] Fps is (10 sec: 47493.8, 60 sec: 45052.7, 300 sec: 44820.0). Total num frames: 248397824. Throughput: 0: 44976.1. Samples: 248462940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 10:59:29,597][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 10:59:32,062][32415] Updated weights for policy 0, policy_version 15170 (0.0030) [2024-06-10 10:59:34,592][32177] Fps is (10 sec: 44236.9, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 248627200. Throughput: 0: 44891.6. Samples: 248726960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 10:59:34,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 10:59:36,315][32415] Updated weights for policy 0, policy_version 15180 (0.0042) [2024-06-10 10:59:39,592][32177] Fps is (10 sec: 45894.7, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 248856576. Throughput: 0: 44626.0. Samples: 248986280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 10:59:39,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 10:59:39,706][32415] Updated weights for policy 0, policy_version 15190 (0.0030) [2024-06-10 10:59:43,490][32415] Updated weights for policy 0, policy_version 15200 (0.0035) [2024-06-10 10:59:44,594][32177] Fps is (10 sec: 42588.9, 60 sec: 44508.1, 300 sec: 44764.1). Total num frames: 249053184. Throughput: 0: 44706.2. Samples: 249126380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 10:59:44,595][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 10:59:46,938][32415] Updated weights for policy 0, policy_version 15210 (0.0043) [2024-06-10 10:59:49,592][32177] Fps is (10 sec: 42597.9, 60 sec: 44783.0, 300 sec: 44764.4). 
Total num frames: 249282560. Throughput: 0: 44919.1. Samples: 249398280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 10:59:49,596][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 10:59:51,044][32415] Updated weights for policy 0, policy_version 15220 (0.0025) [2024-06-10 10:59:54,194][32415] Updated weights for policy 0, policy_version 15230 (0.0028) [2024-06-10 10:59:54,592][32177] Fps is (10 sec: 47525.0, 60 sec: 45059.2, 300 sec: 44876.1). Total num frames: 249528320. Throughput: 0: 44647.5. Samples: 249660460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 10:59:54,592][32177] Avg episode reward: [(0, '0.257')] [2024-06-10 10:59:58,177][32415] Updated weights for policy 0, policy_version 15240 (0.0034) [2024-06-10 10:59:59,592][32177] Fps is (10 sec: 42599.2, 60 sec: 43963.9, 300 sec: 44709.5). Total num frames: 249708544. Throughput: 0: 44679.6. Samples: 249797240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 10:59:59,592][32177] Avg episode reward: [(0, '0.267')] [2024-06-10 11:00:01,693][32415] Updated weights for policy 0, policy_version 15250 (0.0030) [2024-06-10 11:00:04,592][32177] Fps is (10 sec: 44236.3, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 249970688. Throughput: 0: 44682.2. Samples: 250066360. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-10 11:00:04,592][32177] Avg episode reward: [(0, '0.267')] [2024-06-10 11:00:05,832][32415] Updated weights for policy 0, policy_version 15260 (0.0030) [2024-06-10 11:00:09,095][32415] Updated weights for policy 0, policy_version 15270 (0.0035) [2024-06-10 11:00:09,592][32177] Fps is (10 sec: 49151.5, 60 sec: 45056.0, 300 sec: 44820.3). Total num frames: 250200064. Throughput: 0: 44550.8. Samples: 250334740. 
Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-10 11:00:09,592][32177] Avg episode reward: [(0, '0.261')] [2024-06-10 11:00:13,007][32415] Updated weights for policy 0, policy_version 15280 (0.0031) [2024-06-10 11:00:14,592][32177] Fps is (10 sec: 40959.1, 60 sec: 44236.5, 300 sec: 44764.4). Total num frames: 250380288. Throughput: 0: 44715.9. Samples: 250474980. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-10 11:00:14,596][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 11:00:16,247][32415] Updated weights for policy 0, policy_version 15290 (0.0024) [2024-06-10 11:00:19,117][32394] Signal inference workers to stop experience collection... (3600 times) [2024-06-10 11:00:19,118][32394] Signal inference workers to resume experience collection... (3600 times) [2024-06-10 11:00:19,160][32415] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-06-10 11:00:19,161][32415] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-06-10 11:00:19,592][32177] Fps is (10 sec: 44237.1, 60 sec: 45329.2, 300 sec: 44820.0). Total num frames: 250642432. Throughput: 0: 44783.3. Samples: 250742200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-10 11:00:19,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:00:19,592][32394] Saving new best policy, reward=0.281! [2024-06-10 11:00:20,069][32415] Updated weights for policy 0, policy_version 15300 (0.0032) [2024-06-10 11:00:23,488][32415] Updated weights for policy 0, policy_version 15310 (0.0030) [2024-06-10 11:00:24,592][32177] Fps is (10 sec: 47514.3, 60 sec: 44509.9, 300 sec: 44764.6). Total num frames: 250855424. Throughput: 0: 44701.6. Samples: 250997860. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-10 11:00:24,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 11:00:27,709][32415] Updated weights for policy 0, policy_version 15320 (0.0038) [2024-06-10 11:00:29,592][32177] Fps is (10 sec: 40959.3, 60 sec: 44239.9, 300 sec: 44708.9). Total num frames: 251052032. Throughput: 0: 44648.5. Samples: 251135460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-10 11:00:29,592][32177] Avg episode reward: [(0, '0.259')] [2024-06-10 11:00:31,032][32415] Updated weights for policy 0, policy_version 15330 (0.0025) [2024-06-10 11:00:34,592][32177] Fps is (10 sec: 44237.9, 60 sec: 44510.0, 300 sec: 44820.0). Total num frames: 251297792. Throughput: 0: 44605.5. Samples: 251405520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-10 11:00:34,592][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 11:00:34,602][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000015338_251297792.pth... [2024-06-10 11:00:34,667][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000014683_240566272.pth [2024-06-10 11:00:35,066][32415] Updated weights for policy 0, policy_version 15340 (0.0032) [2024-06-10 11:00:38,487][32415] Updated weights for policy 0, policy_version 15350 (0.0023) [2024-06-10 11:00:39,592][32177] Fps is (10 sec: 47514.3, 60 sec: 44509.9, 300 sec: 44765.1). Total num frames: 251527168. Throughput: 0: 44821.4. Samples: 251677420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-10 11:00:39,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:00:42,386][32415] Updated weights for policy 0, policy_version 15360 (0.0035) [2024-06-10 11:00:44,592][32177] Fps is (10 sec: 44235.7, 60 sec: 44784.6, 300 sec: 44708.9). Total num frames: 251740160. Throughput: 0: 44805.1. Samples: 251813480. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-10 11:00:44,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:00:45,590][32415] Updated weights for policy 0, policy_version 15370 (0.0027) [2024-06-10 11:00:49,409][32415] Updated weights for policy 0, policy_version 15380 (0.0034) [2024-06-10 11:00:49,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 251985920. Throughput: 0: 44802.4. Samples: 252082460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 11:00:49,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 11:00:53,047][32415] Updated weights for policy 0, policy_version 15390 (0.0042) [2024-06-10 11:00:54,592][32177] Fps is (10 sec: 45876.0, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 252198912. Throughput: 0: 44868.0. Samples: 252353800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 11:00:54,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:00:56,889][32415] Updated weights for policy 0, policy_version 15400 (0.0026) [2024-06-10 11:00:59,596][32177] Fps is (10 sec: 42579.9, 60 sec: 45052.7, 300 sec: 44763.8). Total num frames: 252411904. Throughput: 0: 44581.9. Samples: 252481340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 11:00:59,597][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:01:00,328][32415] Updated weights for policy 0, policy_version 15410 (0.0034) [2024-06-10 11:01:04,245][32415] Updated weights for policy 0, policy_version 15420 (0.0034) [2024-06-10 11:01:04,592][32177] Fps is (10 sec: 44236.5, 60 sec: 44509.9, 300 sec: 44764.5). Total num frames: 252641280. Throughput: 0: 44649.6. Samples: 252751440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-10 11:01:04,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 11:01:07,838][32415] Updated weights for policy 0, policy_version 15430 (0.0026) [2024-06-10 11:01:09,592][32177] Fps is (10 sec: 47534.3, 60 sec: 44783.0, 300 sec: 44820.0). 
Total num frames: 252887040. Throughput: 0: 44970.0. Samples: 253021500. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-10 11:01:09,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:01:11,267][32415] Updated weights for policy 0, policy_version 15440 (0.0032) [2024-06-10 11:01:14,592][32177] Fps is (10 sec: 44236.8, 60 sec: 45056.2, 300 sec: 44764.5). Total num frames: 253083648. Throughput: 0: 45038.7. Samples: 253162200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-10 11:01:14,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:01:15,097][32415] Updated weights for policy 0, policy_version 15450 (0.0030) [2024-06-10 11:01:18,755][32415] Updated weights for policy 0, policy_version 15460 (0.0038) [2024-06-10 11:01:19,592][32177] Fps is (10 sec: 42597.2, 60 sec: 44509.6, 300 sec: 44819.9). Total num frames: 253313024. Throughput: 0: 44928.1. Samples: 253427300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-10 11:01:19,593][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:01:22,325][32415] Updated weights for policy 0, policy_version 15470 (0.0045) [2024-06-10 11:01:24,592][32177] Fps is (10 sec: 47514.2, 60 sec: 45056.2, 300 sec: 44875.5). Total num frames: 253558784. Throughput: 0: 44779.1. Samples: 253692480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-10 11:01:24,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:01:26,272][32415] Updated weights for policy 0, policy_version 15480 (0.0036) [2024-06-10 11:01:29,000][32394] Signal inference workers to stop experience collection... (3650 times) [2024-06-10 11:01:29,000][32394] Signal inference workers to resume experience collection... 
(3650 times) [2024-06-10 11:01:29,034][32415] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-06-10 11:01:29,034][32415] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-06-10 11:01:29,494][32415] Updated weights for policy 0, policy_version 15490 (0.0032) [2024-06-10 11:01:29,592][32177] Fps is (10 sec: 47514.7, 60 sec: 45602.2, 300 sec: 44820.6). Total num frames: 253788160. Throughput: 0: 44869.5. Samples: 253832600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-10 11:01:29,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:01:33,613][32415] Updated weights for policy 0, policy_version 15500 (0.0027) [2024-06-10 11:01:34,593][32177] Fps is (10 sec: 44228.3, 60 sec: 45054.6, 300 sec: 44875.2). Total num frames: 254001152. Throughput: 0: 44809.7. Samples: 254098980. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-10 11:01:34,594][32177] Avg episode reward: [(0, '0.267')] [2024-06-10 11:01:37,266][32415] Updated weights for policy 0, policy_version 15510 (0.0033) [2024-06-10 11:01:39,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45329.0, 300 sec: 44931.0). Total num frames: 254246912. Throughput: 0: 44666.2. Samples: 254363780. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-10 11:01:39,592][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 11:01:40,695][32415] Updated weights for policy 0, policy_version 15520 (0.0031) [2024-06-10 11:01:44,382][32415] Updated weights for policy 0, policy_version 15530 (0.0043) [2024-06-10 11:01:44,592][32177] Fps is (10 sec: 44243.9, 60 sec: 45055.9, 300 sec: 44819.9). Total num frames: 254443520. Throughput: 0: 44955.6. Samples: 254504160. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-10 11:01:44,593][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 11:01:48,415][32415] Updated weights for policy 0, policy_version 15540 (0.0030) [2024-06-10 11:01:49,592][32177] Fps is (10 sec: 40960.1, 60 sec: 44509.8, 300 sec: 44820.6). 
Total num frames: 254656512. Throughput: 0: 44870.3. Samples: 254770600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 11:01:49,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:01:52,107][32415] Updated weights for policy 0, policy_version 15550 (0.0033) [2024-06-10 11:01:54,596][32177] Fps is (10 sec: 45857.2, 60 sec: 45052.9, 300 sec: 44986.6). Total num frames: 254902272. Throughput: 0: 44580.2. Samples: 255027800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 11:01:54,596][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:01:55,938][32415] Updated weights for policy 0, policy_version 15560 (0.0032) [2024-06-10 11:01:59,191][32415] Updated weights for policy 0, policy_version 15570 (0.0024) [2024-06-10 11:01:59,592][32177] Fps is (10 sec: 45874.7, 60 sec: 45059.2, 300 sec: 44820.0). Total num frames: 255115264. Throughput: 0: 44668.4. Samples: 255172280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-10 11:01:59,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:02:03,092][32415] Updated weights for policy 0, policy_version 15580 (0.0031) [2024-06-10 11:02:04,592][32177] Fps is (10 sec: 40976.7, 60 sec: 44509.8, 300 sec: 44709.5). Total num frames: 255311872. Throughput: 0: 44825.0. Samples: 255444420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-10 11:02:04,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:02:06,434][32415] Updated weights for policy 0, policy_version 15590 (0.0028) [2024-06-10 11:02:09,592][32177] Fps is (10 sec: 44237.1, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 255557632. Throughput: 0: 44836.8. Samples: 255710140. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-10 11:02:09,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:02:10,124][32415] Updated weights for policy 0, policy_version 15600 (0.0038) [2024-06-10 11:02:13,642][32415] Updated weights for policy 0, policy_version 15610 (0.0037) [2024-06-10 11:02:14,592][32177] Fps is (10 sec: 49152.1, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 255803392. Throughput: 0: 44811.0. Samples: 255849100. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-10 11:02:14,592][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 11:02:17,444][32415] Updated weights for policy 0, policy_version 15620 (0.0039) [2024-06-10 11:02:19,592][32177] Fps is (10 sec: 45874.9, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 256016384. Throughput: 0: 44957.8. Samples: 256122000. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-10 11:02:19,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 11:02:20,981][32415] Updated weights for policy 0, policy_version 15630 (0.0024) [2024-06-10 11:02:24,596][32177] Fps is (10 sec: 42580.6, 60 sec: 44506.6, 300 sec: 44763.8). Total num frames: 256229376. Throughput: 0: 44967.7. Samples: 256387520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 11:02:24,605][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 11:02:24,854][32415] Updated weights for policy 0, policy_version 15640 (0.0032) [2024-06-10 11:02:28,027][32415] Updated weights for policy 0, policy_version 15650 (0.0033) [2024-06-10 11:02:29,592][32177] Fps is (10 sec: 44236.6, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 256458752. Throughput: 0: 44839.7. Samples: 256521940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 11:02:29,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:02:32,097][32415] Updated weights for policy 0, policy_version 15660 (0.0021) [2024-06-10 11:02:34,592][32177] Fps is (10 sec: 44255.6, 60 sec: 44511.2, 300 sec: 44764.4). 
Total num frames: 256671744. Throughput: 0: 44991.9. Samples: 256795240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-10 11:02:34,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:02:34,683][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000015667_256688128.pth... [2024-06-10 11:02:34,732][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000015010_245923840.pth [2024-06-10 11:02:35,395][32415] Updated weights for policy 0, policy_version 15670 (0.0035) [2024-06-10 11:02:39,027][32415] Updated weights for policy 0, policy_version 15680 (0.0029) [2024-06-10 11:02:39,592][32177] Fps is (10 sec: 44234.7, 60 sec: 44236.4, 300 sec: 44764.3). Total num frames: 256901120. Throughput: 0: 45247.6. Samples: 257063780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-10 11:02:39,593][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:02:42,631][32415] Updated weights for policy 0, policy_version 15690 (0.0036) [2024-06-10 11:02:44,592][32177] Fps is (10 sec: 47514.2, 60 sec: 45056.3, 300 sec: 44820.0). Total num frames: 257146880. Throughput: 0: 45176.2. Samples: 257205200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 11:02:44,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:02:44,841][32394] Signal inference workers to stop experience collection... (3700 times) [2024-06-10 11:02:44,895][32394] Signal inference workers to resume experience collection... (3700 times) [2024-06-10 11:02:44,900][32415] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-06-10 11:02:44,909][32415] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-06-10 11:02:46,187][32415] Updated weights for policy 0, policy_version 15700 (0.0031) [2024-06-10 11:02:49,592][32177] Fps is (10 sec: 45877.5, 60 sec: 45055.9, 300 sec: 44820.0). Total num frames: 257359872. Throughput: 0: 44956.9. Samples: 257467480. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 11:02:49,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:02:49,911][32415] Updated weights for policy 0, policy_version 15710 (0.0025) [2024-06-10 11:02:54,064][32415] Updated weights for policy 0, policy_version 15720 (0.0028) [2024-06-10 11:02:54,592][32177] Fps is (10 sec: 40959.4, 60 sec: 44239.9, 300 sec: 44764.4). Total num frames: 257556480. Throughput: 0: 45072.4. Samples: 257738400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 11:02:54,592][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 11:02:57,093][32415] Updated weights for policy 0, policy_version 15730 (0.0038) [2024-06-10 11:02:59,592][32177] Fps is (10 sec: 45876.0, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 257818624. Throughput: 0: 44833.0. Samples: 257866580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-10 11:02:59,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:03:01,227][32415] Updated weights for policy 0, policy_version 15740 (0.0036) [2024-06-10 11:03:04,515][32415] Updated weights for policy 0, policy_version 15750 (0.0039) [2024-06-10 11:03:04,592][32177] Fps is (10 sec: 49151.9, 60 sec: 45602.2, 300 sec: 44931.0). Total num frames: 258048000. Throughput: 0: 44811.6. Samples: 258138520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-10 11:03:04,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 11:03:08,715][32415] Updated weights for policy 0, policy_version 15760 (0.0039) [2024-06-10 11:03:09,596][32177] Fps is (10 sec: 42580.5, 60 sec: 44779.8, 300 sec: 44763.8). Total num frames: 258244608. Throughput: 0: 44887.3. Samples: 258407440. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-10 11:03:09,596][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:03:11,850][32415] Updated weights for policy 0, policy_version 15770 (0.0035) [2024-06-10 11:03:14,592][32177] Fps is (10 sec: 44237.3, 60 sec: 44783.1, 300 sec: 44820.0). 
Total num frames: 258490368. Throughput: 0: 44858.9. Samples: 258540580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-10 11:03:14,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:03:16,205][32415] Updated weights for policy 0, policy_version 15780 (0.0031) [2024-06-10 11:03:19,282][32415] Updated weights for policy 0, policy_version 15790 (0.0028) [2024-06-10 11:03:19,592][32177] Fps is (10 sec: 45894.5, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 258703360. Throughput: 0: 44643.2. Samples: 258804180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-10 11:03:19,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:03:23,206][32415] Updated weights for policy 0, policy_version 15800 (0.0031) [2024-06-10 11:03:24,592][32177] Fps is (10 sec: 42597.7, 60 sec: 44786.1, 300 sec: 44819.9). Total num frames: 258916352. Throughput: 0: 44780.1. Samples: 259078860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-10 11:03:24,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:03:26,507][32415] Updated weights for policy 0, policy_version 15810 (0.0027) [2024-06-10 11:03:29,592][32177] Fps is (10 sec: 45875.4, 60 sec: 45056.2, 300 sec: 44875.5). Total num frames: 259162112. Throughput: 0: 44541.8. Samples: 259209580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 11:03:29,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:03:30,253][32415] Updated weights for policy 0, policy_version 15820 (0.0031) [2024-06-10 11:03:34,037][32415] Updated weights for policy 0, policy_version 15830 (0.0027) [2024-06-10 11:03:34,592][32177] Fps is (10 sec: 45874.6, 60 sec: 45055.9, 300 sec: 44819.9). Total num frames: 259375104. Throughput: 0: 44786.5. Samples: 259482880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 11:03:34,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:03:37,733][32415] Updated weights for policy 0, policy_version 15840 (0.0044) [2024-06-10 11:03:39,592][32177] Fps is (10 sec: 42598.5, 60 sec: 44783.5, 300 sec: 44764.4). Total num frames: 259588096. Throughput: 0: 44754.8. Samples: 259752360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 11:03:39,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:03:41,108][32415] Updated weights for policy 0, policy_version 15850 (0.0036) [2024-06-10 11:03:44,592][32177] Fps is (10 sec: 45876.5, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 259833856. Throughput: 0: 44795.6. Samples: 259882380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 11:03:44,592][32177] Avg episode reward: [(0, '0.267')] [2024-06-10 11:03:45,033][32415] Updated weights for policy 0, policy_version 15860 (0.0031) [2024-06-10 11:03:48,581][32415] Updated weights for policy 0, policy_version 15870 (0.0032) [2024-06-10 11:03:49,592][32177] Fps is (10 sec: 47512.7, 60 sec: 45056.0, 300 sec: 44876.1). Total num frames: 260063232. Throughput: 0: 44592.9. Samples: 260145200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 11:03:49,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 11:03:51,992][32415] Updated weights for policy 0, policy_version 15880 (0.0034) [2024-06-10 11:03:54,592][32177] Fps is (10 sec: 44236.1, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 260276224. Throughput: 0: 44723.6. Samples: 260419820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 11:03:54,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:03:56,082][32415] Updated weights for policy 0, policy_version 15890 (0.0029) [2024-06-10 11:03:59,426][32415] Updated weights for policy 0, policy_version 15900 (0.0024) [2024-06-10 11:03:59,592][32177] Fps is (10 sec: 44237.1, 60 sec: 44782.9, 300 sec: 44820.0). 
Total num frames: 260505600. Throughput: 0: 44666.6. Samples: 260550580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 11:03:59,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:04:03,202][32415] Updated weights for policy 0, policy_version 15910 (0.0039) [2024-06-10 11:04:04,592][32177] Fps is (10 sec: 44236.9, 60 sec: 44509.9, 300 sec: 44819.9). Total num frames: 260718592. Throughput: 0: 44877.7. Samples: 260823680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 11:04:04,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:04:07,051][32415] Updated weights for policy 0, policy_version 15920 (0.0032) [2024-06-10 11:04:09,592][32177] Fps is (10 sec: 42597.9, 60 sec: 44785.9, 300 sec: 44764.4). Total num frames: 260931584. Throughput: 0: 44869.8. Samples: 261098000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 11:04:09,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:04:09,797][32394] Signal inference workers to stop experience collection... (3750 times) [2024-06-10 11:04:09,797][32394] Signal inference workers to resume experience collection... (3750 times) [2024-06-10 11:04:09,817][32415] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-06-10 11:04:09,817][32415] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-06-10 11:04:10,246][32415] Updated weights for policy 0, policy_version 15930 (0.0034) [2024-06-10 11:04:14,131][32415] Updated weights for policy 0, policy_version 15940 (0.0032) [2024-06-10 11:04:14,592][32177] Fps is (10 sec: 44237.3, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 261160960. Throughput: 0: 44821.3. Samples: 261226540. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 11:04:14,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:04:17,731][32415] Updated weights for policy 0, policy_version 15950 (0.0034) [2024-06-10 11:04:19,592][32177] Fps is (10 sec: 45875.6, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 261390336. Throughput: 0: 44621.5. Samples: 261490840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 11:04:19,592][32177] Avg episode reward: [(0, '0.263')] [2024-06-10 11:04:21,517][32415] Updated weights for policy 0, policy_version 15960 (0.0037) [2024-06-10 11:04:24,592][32177] Fps is (10 sec: 45874.4, 60 sec: 45056.0, 300 sec: 44820.6). Total num frames: 261619712. Throughput: 0: 44679.8. Samples: 261762960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 11:04:24,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:04:24,962][32415] Updated weights for policy 0, policy_version 15970 (0.0020) [2024-06-10 11:04:28,948][32415] Updated weights for policy 0, policy_version 15980 (0.0027) [2024-06-10 11:04:29,592][32177] Fps is (10 sec: 44237.2, 60 sec: 44509.8, 300 sec: 44764.5). Total num frames: 261832704. Throughput: 0: 44917.3. Samples: 261903660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-10 11:04:29,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:04:32,512][32415] Updated weights for policy 0, policy_version 15990 (0.0033) [2024-06-10 11:04:34,592][32177] Fps is (10 sec: 44236.9, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 262062080. Throughput: 0: 45003.1. Samples: 262170340. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-10 11:04:34,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 11:04:34,706][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000015996_262078464.pth... 
[2024-06-10 11:04:34,761][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000015338_251297792.pth [2024-06-10 11:04:35,861][32415] Updated weights for policy 0, policy_version 16000 (0.0027) [2024-06-10 11:04:39,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45056.0, 300 sec: 44875.9). Total num frames: 262291456. Throughput: 0: 44954.0. Samples: 262442740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:04:39,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:04:39,618][32415] Updated weights for policy 0, policy_version 16010 (0.0036) [2024-06-10 11:04:43,217][32415] Updated weights for policy 0, policy_version 16020 (0.0029) [2024-06-10 11:04:44,592][32177] Fps is (10 sec: 42598.4, 60 sec: 44236.7, 300 sec: 44764.4). Total num frames: 262488064. Throughput: 0: 45039.5. Samples: 262577360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:04:44,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:04:47,092][32415] Updated weights for policy 0, policy_version 16030 (0.0023) [2024-06-10 11:04:49,592][32177] Fps is (10 sec: 44235.2, 60 sec: 44509.7, 300 sec: 44764.4). Total num frames: 262733824. Throughput: 0: 44827.3. Samples: 262840920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 11:04:49,593][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:04:50,658][32415] Updated weights for policy 0, policy_version 16040 (0.0029) [2024-06-10 11:04:54,357][32415] Updated weights for policy 0, policy_version 16050 (0.0034) [2024-06-10 11:04:54,592][32177] Fps is (10 sec: 47513.3, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 262963200. Throughput: 0: 44737.3. Samples: 263111180. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 11:04:54,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:04:57,805][32415] Updated weights for policy 0, policy_version 16060 (0.0031) [2024-06-10 11:04:59,592][32177] Fps is (10 sec: 44237.0, 60 sec: 44509.7, 300 sec: 44764.4). Total num frames: 263176192. Throughput: 0: 44814.4. Samples: 263243200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 11:04:59,592][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 11:05:01,652][32415] Updated weights for policy 0, policy_version 16070 (0.0034) [2024-06-10 11:05:04,592][32177] Fps is (10 sec: 44237.4, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 263405568. Throughput: 0: 44971.1. Samples: 263514540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:05:04,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:05:05,202][32415] Updated weights for policy 0, policy_version 16080 (0.0031) [2024-06-10 11:05:09,090][32415] Updated weights for policy 0, policy_version 16090 (0.0026) [2024-06-10 11:05:09,592][32177] Fps is (10 sec: 49152.0, 60 sec: 45602.0, 300 sec: 45042.1). Total num frames: 263667712. Throughput: 0: 44767.4. Samples: 263777500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:05:09,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 11:05:12,767][32415] Updated weights for policy 0, policy_version 16100 (0.0029) [2024-06-10 11:05:14,592][32177] Fps is (10 sec: 44235.8, 60 sec: 44782.7, 300 sec: 44764.4). Total num frames: 263847936. Throughput: 0: 44610.8. Samples: 263911160. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-10 11:05:14,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:05:16,436][32415] Updated weights for policy 0, policy_version 16110 (0.0025) [2024-06-10 11:05:19,592][32177] Fps is (10 sec: 39322.1, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 264060928. Throughput: 0: 44440.0. Samples: 264170140. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-10 11:05:19,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:05:20,240][32415] Updated weights for policy 0, policy_version 16120 (0.0036) [2024-06-10 11:05:24,039][32415] Updated weights for policy 0, policy_version 16130 (0.0042) [2024-06-10 11:05:24,592][32177] Fps is (10 sec: 44237.8, 60 sec: 44510.0, 300 sec: 44875.5). Total num frames: 264290304. Throughput: 0: 44287.0. Samples: 264435660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 11:05:24,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:05:26,553][32394] Signal inference workers to stop experience collection... (3800 times) [2024-06-10 11:05:26,553][32394] Signal inference workers to resume experience collection... (3800 times) [2024-06-10 11:05:26,594][32415] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-06-10 11:05:26,594][32415] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-06-10 11:05:27,317][32415] Updated weights for policy 0, policy_version 16140 (0.0043) [2024-06-10 11:05:29,592][32177] Fps is (10 sec: 44237.5, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 264503296. Throughput: 0: 44367.2. Samples: 264573880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 11:05:29,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:05:31,228][32415] Updated weights for policy 0, policy_version 16150 (0.0041) [2024-06-10 11:05:34,592][32177] Fps is (10 sec: 45875.3, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 264749056. Throughput: 0: 44628.3. Samples: 264849180. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 11:05:34,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:05:34,654][32415] Updated weights for policy 0, policy_version 16160 (0.0038) [2024-06-10 11:05:38,576][32415] Updated weights for policy 0, policy_version 16170 (0.0025) [2024-06-10 11:05:39,592][32177] Fps is (10 sec: 47512.1, 60 sec: 44782.7, 300 sec: 44875.5). Total num frames: 264978432. Throughput: 0: 44353.7. Samples: 265107100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-10 11:05:39,593][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:05:42,335][32415] Updated weights for policy 0, policy_version 16180 (0.0029) [2024-06-10 11:05:44,594][32177] Fps is (10 sec: 42586.5, 60 sec: 44780.9, 300 sec: 44708.5). Total num frames: 265175040. Throughput: 0: 44396.2. Samples: 265241140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-10 11:05:44,595][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 11:05:45,931][32415] Updated weights for policy 0, policy_version 16190 (0.0032) [2024-06-10 11:05:49,592][32177] Fps is (10 sec: 42599.5, 60 sec: 44510.1, 300 sec: 44764.4). Total num frames: 265404416. Throughput: 0: 44335.1. Samples: 265509620. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-10 11:05:49,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:05:49,829][32415] Updated weights for policy 0, policy_version 16200 (0.0032) [2024-06-10 11:05:53,455][32415] Updated weights for policy 0, policy_version 16210 (0.0040) [2024-06-10 11:05:54,592][32177] Fps is (10 sec: 45887.5, 60 sec: 44509.9, 300 sec: 44820.6). Total num frames: 265633792. Throughput: 0: 44331.7. Samples: 265772420. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-10 11:05:54,595][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:05:56,869][32415] Updated weights for policy 0, policy_version 16220 (0.0034) [2024-06-10 11:05:59,592][32177] Fps is (10 sec: 42598.4, 60 sec: 44237.0, 300 sec: 44708.9). 
Total num frames: 265830400. Throughput: 0: 44439.3. Samples: 265910920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-10 11:05:59,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:06:00,590][32415] Updated weights for policy 0, policy_version 16230 (0.0027) [2024-06-10 11:06:04,244][32415] Updated weights for policy 0, policy_version 16240 (0.0029) [2024-06-10 11:06:04,592][32177] Fps is (10 sec: 44237.2, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 266076160. Throughput: 0: 44648.1. Samples: 266179300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-10 11:06:04,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 11:06:08,078][32415] Updated weights for policy 0, policy_version 16250 (0.0025) [2024-06-10 11:06:09,592][32177] Fps is (10 sec: 45874.5, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 266289152. Throughput: 0: 44576.3. Samples: 266441600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-10 11:06:09,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:06:11,892][32415] Updated weights for policy 0, policy_version 16260 (0.0027) [2024-06-10 11:06:14,592][32177] Fps is (10 sec: 42597.1, 60 sec: 44236.7, 300 sec: 44708.9). Total num frames: 266502144. Throughput: 0: 44523.2. Samples: 266577440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 11:06:14,593][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:06:15,495][32415] Updated weights for policy 0, policy_version 16270 (0.0027) [2024-06-10 11:06:19,296][32415] Updated weights for policy 0, policy_version 16280 (0.0027) [2024-06-10 11:06:19,592][32177] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 44708.8). Total num frames: 266747904. Throughput: 0: 44375.4. Samples: 266846080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 11:06:19,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:06:22,819][32415] Updated weights for policy 0, policy_version 16290 (0.0029) [2024-06-10 11:06:24,592][32177] Fps is (10 sec: 45876.6, 60 sec: 44509.9, 300 sec: 44653.3). Total num frames: 266960896. Throughput: 0: 44670.0. Samples: 267117240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 11:06:24,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:06:26,254][32415] Updated weights for policy 0, policy_version 16300 (0.0032) [2024-06-10 11:06:29,592][32177] Fps is (10 sec: 44238.0, 60 sec: 44783.0, 300 sec: 44709.2). Total num frames: 267190272. Throughput: 0: 44748.6. Samples: 267254700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 11:06:29,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:06:29,837][32415] Updated weights for policy 0, policy_version 16310 (0.0049) [2024-06-10 11:06:33,584][32415] Updated weights for policy 0, policy_version 16320 (0.0029) [2024-06-10 11:06:34,596][32177] Fps is (10 sec: 45855.5, 60 sec: 44506.7, 300 sec: 44652.7). Total num frames: 267419648. Throughput: 0: 44671.3. Samples: 267520020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 11:06:34,597][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 11:06:34,602][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000016322_267419648.pth... [2024-06-10 11:06:34,660][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000015667_256688128.pth [2024-06-10 11:06:37,249][32415] Updated weights for policy 0, policy_version 16330 (0.0039) [2024-06-10 11:06:39,592][32177] Fps is (10 sec: 44235.9, 60 sec: 44236.9, 300 sec: 44708.9). Total num frames: 267632640. Throughput: 0: 44788.9. Samples: 267787920. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 11:06:39,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:06:40,999][32415] Updated weights for policy 0, policy_version 16340 (0.0035) [2024-06-10 11:06:44,592][32177] Fps is (10 sec: 44254.7, 60 sec: 44784.8, 300 sec: 44764.4). Total num frames: 267862016. Throughput: 0: 44740.6. Samples: 267924260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 11:06:44,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:06:44,727][32415] Updated weights for policy 0, policy_version 16350 (0.0031) [2024-06-10 11:06:48,500][32415] Updated weights for policy 0, policy_version 16360 (0.0021) [2024-06-10 11:06:49,592][32177] Fps is (10 sec: 44237.1, 60 sec: 44509.8, 300 sec: 44654.0). Total num frames: 268075008. Throughput: 0: 44698.2. Samples: 268190720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 11:06:49,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 11:06:51,947][32415] Updated weights for policy 0, policy_version 16370 (0.0032) [2024-06-10 11:06:54,592][32177] Fps is (10 sec: 44238.3, 60 sec: 44510.0, 300 sec: 44708.9). Total num frames: 268304384. Throughput: 0: 44837.6. Samples: 268459280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 11:06:54,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:06:55,875][32415] Updated weights for policy 0, policy_version 16380 (0.0033) [2024-06-10 11:06:57,617][32394] Signal inference workers to stop experience collection... (3850 times) [2024-06-10 11:06:57,617][32394] Signal inference workers to resume experience collection... 
(3850 times) [2024-06-10 11:06:57,660][32415] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-06-10 11:06:57,660][32415] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-06-10 11:06:59,099][32415] Updated weights for policy 0, policy_version 16390 (0.0034) [2024-06-10 11:06:59,592][32177] Fps is (10 sec: 47513.5, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 268550144. Throughput: 0: 44945.2. Samples: 268599960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 11:06:59,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:07:02,926][32415] Updated weights for policy 0, policy_version 16400 (0.0033) [2024-06-10 11:07:04,592][32177] Fps is (10 sec: 44236.6, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 268746752. Throughput: 0: 44829.1. Samples: 268863380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 11:07:04,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:07:06,614][32415] Updated weights for policy 0, policy_version 16410 (0.0030) [2024-06-10 11:07:09,592][32177] Fps is (10 sec: 44237.2, 60 sec: 45056.2, 300 sec: 44708.9). Total num frames: 268992512. Throughput: 0: 44712.1. Samples: 269129280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 11:07:09,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:07:10,328][32415] Updated weights for policy 0, policy_version 16420 (0.0025) [2024-06-10 11:07:13,916][32415] Updated weights for policy 0, policy_version 16430 (0.0038) [2024-06-10 11:07:14,592][32177] Fps is (10 sec: 47513.9, 60 sec: 45329.4, 300 sec: 44764.4). Total num frames: 269221888. Throughput: 0: 44824.4. Samples: 269271800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 11:07:14,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:07:17,739][32415] Updated weights for policy 0, policy_version 16440 (0.0032) [2024-06-10 11:07:19,592][32177] Fps is (10 sec: 42598.5, 60 sec: 44510.0, 300 sec: 44709.5). 
Total num frames: 269418496. Throughput: 0: 44748.8. Samples: 269533520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 11:07:19,592][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 11:07:21,022][32415] Updated weights for policy 0, policy_version 16450 (0.0032) [2024-06-10 11:07:24,592][32177] Fps is (10 sec: 44236.7, 60 sec: 45056.1, 300 sec: 44764.5). Total num frames: 269664256. Throughput: 0: 44852.6. Samples: 269806280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 11:07:24,592][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 11:07:24,729][32415] Updated weights for policy 0, policy_version 16460 (0.0024) [2024-06-10 11:07:28,253][32415] Updated weights for policy 0, policy_version 16470 (0.0044) [2024-06-10 11:07:29,592][32177] Fps is (10 sec: 47512.6, 60 sec: 45055.8, 300 sec: 44819.9). Total num frames: 269893632. Throughput: 0: 45087.7. Samples: 269953200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 11:07:29,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:07:31,798][32415] Updated weights for policy 0, policy_version 16480 (0.0036) [2024-06-10 11:07:34,592][32177] Fps is (10 sec: 40959.6, 60 sec: 44240.0, 300 sec: 44653.4). Total num frames: 270073856. Throughput: 0: 44955.6. Samples: 270213720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 11:07:34,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:07:35,669][32415] Updated weights for policy 0, policy_version 16490 (0.0029) [2024-06-10 11:07:39,434][32415] Updated weights for policy 0, policy_version 16500 (0.0031) [2024-06-10 11:07:39,592][32177] Fps is (10 sec: 44237.7, 60 sec: 45056.1, 300 sec: 44708.9). Total num frames: 270336000. Throughput: 0: 44865.8. Samples: 270478240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 11:07:39,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:07:39,625][32394] Saving new best policy, reward=0.283! 
[2024-06-10 11:07:43,114][32415] Updated weights for policy 0, policy_version 16510 (0.0042) [2024-06-10 11:07:44,592][32177] Fps is (10 sec: 47513.6, 60 sec: 44783.1, 300 sec: 44708.9). Total num frames: 270548992. Throughput: 0: 44805.8. Samples: 270616220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 11:07:44,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:07:46,906][32415] Updated weights for policy 0, policy_version 16520 (0.0026) [2024-06-10 11:07:49,592][32177] Fps is (10 sec: 42598.5, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 270761984. Throughput: 0: 44876.5. Samples: 270882820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:07:49,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:07:50,314][32415] Updated weights for policy 0, policy_version 16530 (0.0032) [2024-06-10 11:07:53,934][32415] Updated weights for policy 0, policy_version 16540 (0.0023) [2024-06-10 11:07:54,592][32177] Fps is (10 sec: 45874.5, 60 sec: 45055.8, 300 sec: 44708.8). Total num frames: 271007744. Throughput: 0: 44932.2. Samples: 271151240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:07:54,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:07:54,604][32394] Saving new best policy, reward=0.289! [2024-06-10 11:07:57,122][32394] Signal inference workers to stop experience collection... (3900 times) [2024-06-10 11:07:57,160][32415] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-06-10 11:07:57,181][32394] Signal inference workers to resume experience collection... (3900 times) [2024-06-10 11:07:57,187][32415] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-06-10 11:07:57,490][32415] Updated weights for policy 0, policy_version 16550 (0.0028) [2024-06-10 11:07:59,592][32177] Fps is (10 sec: 47512.8, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 271237120. Throughput: 0: 44929.6. Samples: 271293640. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 11:07:59,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:08:01,157][32415] Updated weights for policy 0, policy_version 16560 (0.0036) [2024-06-10 11:08:04,592][32177] Fps is (10 sec: 44238.0, 60 sec: 45056.0, 300 sec: 44765.1). Total num frames: 271450112. Throughput: 0: 45089.8. Samples: 271562560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 11:08:04,592][32177] Avg episode reward: [(0, '0.262')] [2024-06-10 11:08:04,767][32415] Updated weights for policy 0, policy_version 16570 (0.0030) [2024-06-10 11:08:08,756][32415] Updated weights for policy 0, policy_version 16580 (0.0029) [2024-06-10 11:08:09,592][32177] Fps is (10 sec: 44237.3, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 271679488. Throughput: 0: 44846.6. Samples: 271824380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 11:08:09,592][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 11:08:12,128][32415] Updated weights for policy 0, policy_version 16590 (0.0031) [2024-06-10 11:08:14,592][32177] Fps is (10 sec: 45874.3, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 271908864. Throughput: 0: 44559.6. Samples: 271958380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-10 11:08:14,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:08:15,781][32415] Updated weights for policy 0, policy_version 16600 (0.0035) [2024-06-10 11:08:19,411][32415] Updated weights for policy 0, policy_version 16610 (0.0032) [2024-06-10 11:08:19,592][32177] Fps is (10 sec: 45875.4, 60 sec: 45329.1, 300 sec: 44820.0). Total num frames: 272138240. Throughput: 0: 45049.9. Samples: 272240960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-10 11:08:19,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:08:22,876][32415] Updated weights for policy 0, policy_version 16620 (0.0040) [2024-06-10 11:08:24,592][32177] Fps is (10 sec: 44236.5, 60 sec: 44782.8, 300 sec: 44708.8). 
Total num frames: 272351232. Throughput: 0: 45066.9. Samples: 272506260. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-10 11:08:24,593][32177] Avg episode reward: [(0, '0.264')] [2024-06-10 11:08:26,795][32415] Updated weights for policy 0, policy_version 16630 (0.0035) [2024-06-10 11:08:29,592][32177] Fps is (10 sec: 44236.1, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 272580608. Throughput: 0: 44882.6. Samples: 272635940. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-10 11:08:29,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:08:30,424][32415] Updated weights for policy 0, policy_version 16640 (0.0028) [2024-06-10 11:08:34,147][32415] Updated weights for policy 0, policy_version 16650 (0.0031) [2024-06-10 11:08:34,592][32177] Fps is (10 sec: 45876.0, 60 sec: 45602.1, 300 sec: 44819.9). Total num frames: 272809984. Throughput: 0: 44910.1. Samples: 272903780. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-10 11:08:34,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:08:34,601][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000016651_272809984.pth... [2024-06-10 11:08:34,661][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000015996_262078464.pth [2024-06-10 11:08:38,098][32415] Updated weights for policy 0, policy_version 16660 (0.0029) [2024-06-10 11:08:39,592][32177] Fps is (10 sec: 40959.8, 60 sec: 44236.6, 300 sec: 44597.8). Total num frames: 272990208. Throughput: 0: 44776.9. Samples: 273166200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-10 11:08:39,594][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:08:41,615][32415] Updated weights for policy 0, policy_version 16670 (0.0030) [2024-06-10 11:08:44,592][32177] Fps is (10 sec: 44236.7, 60 sec: 45056.0, 300 sec: 44708.9). Total num frames: 273252352. Throughput: 0: 44442.7. Samples: 273293560. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-10 11:08:44,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:08:45,395][32415] Updated weights for policy 0, policy_version 16680 (0.0027) [2024-06-10 11:08:49,016][32415] Updated weights for policy 0, policy_version 16690 (0.0024) [2024-06-10 11:08:49,592][32177] Fps is (10 sec: 49152.6, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 273481728. Throughput: 0: 44679.5. Samples: 273573140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 11:08:49,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:08:52,376][32415] Updated weights for policy 0, policy_version 16700 (0.0042) [2024-06-10 11:08:54,592][32177] Fps is (10 sec: 42598.4, 60 sec: 44510.0, 300 sec: 44653.3). Total num frames: 273678336. Throughput: 0: 44831.9. Samples: 273841820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 11:08:54,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:08:56,069][32415] Updated weights for policy 0, policy_version 16710 (0.0035) [2024-06-10 11:08:59,592][32177] Fps is (10 sec: 42598.9, 60 sec: 44510.0, 300 sec: 44708.9). Total num frames: 273907712. Throughput: 0: 44794.4. Samples: 273974120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 11:08:59,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:08:59,898][32415] Updated weights for policy 0, policy_version 16720 (0.0039) [2024-06-10 11:09:03,380][32415] Updated weights for policy 0, policy_version 16730 (0.0027) [2024-06-10 11:09:04,592][32177] Fps is (10 sec: 47513.9, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 274153472. Throughput: 0: 44449.7. Samples: 274241200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 11:09:04,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:09:07,590][32415] Updated weights for policy 0, policy_version 16740 (0.0030) [2024-06-10 11:09:09,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44653.3). 
Total num frames: 274333696. Throughput: 0: 44702.9. Samples: 274517880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 11:09:09,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:09:10,727][32415] Updated weights for policy 0, policy_version 16750 (0.0034) [2024-06-10 11:09:11,113][32394] Signal inference workers to stop experience collection... (3950 times) [2024-06-10 11:09:11,147][32415] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-06-10 11:09:11,232][32394] Signal inference workers to resume experience collection... (3950 times) [2024-06-10 11:09:11,233][32415] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-06-10 11:09:14,592][32177] Fps is (10 sec: 42597.0, 60 sec: 44509.7, 300 sec: 44708.8). Total num frames: 274579456. Throughput: 0: 44615.8. Samples: 274643660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 11:09:14,593][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:09:14,688][32415] Updated weights for policy 0, policy_version 16760 (0.0038) [2024-06-10 11:09:17,972][32415] Updated weights for policy 0, policy_version 16770 (0.0032) [2024-06-10 11:09:19,592][32177] Fps is (10 sec: 49150.6, 60 sec: 44782.7, 300 sec: 44764.4). Total num frames: 274825216. Throughput: 0: 44675.3. Samples: 274914180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 11:09:19,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:09:21,686][32415] Updated weights for policy 0, policy_version 16780 (0.0036) [2024-06-10 11:09:24,592][32177] Fps is (10 sec: 45876.5, 60 sec: 44783.1, 300 sec: 44764.4). Total num frames: 275038208. Throughput: 0: 44991.7. Samples: 275190820. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 11:09:24,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:09:25,410][32415] Updated weights for policy 0, policy_version 16790 (0.0044) [2024-06-10 11:09:29,357][32415] Updated weights for policy 0, policy_version 16800 (0.0033) [2024-06-10 11:09:29,592][32177] Fps is (10 sec: 42599.6, 60 sec: 44510.0, 300 sec: 44708.9). Total num frames: 275251200. Throughput: 0: 44921.4. Samples: 275315020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 11:09:29,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:09:32,612][32415] Updated weights for policy 0, policy_version 16810 (0.0030) [2024-06-10 11:09:34,592][32177] Fps is (10 sec: 44237.3, 60 sec: 44510.0, 300 sec: 44708.9). Total num frames: 275480576. Throughput: 0: 44569.9. Samples: 275578780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 11:09:34,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:09:37,181][32415] Updated weights for policy 0, policy_version 16820 (0.0036) [2024-06-10 11:09:39,592][32177] Fps is (10 sec: 45874.1, 60 sec: 45329.0, 300 sec: 44819.9). Total num frames: 275709952. Throughput: 0: 44831.0. Samples: 275859220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 11:09:39,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 11:09:39,951][32415] Updated weights for policy 0, policy_version 16830 (0.0052) [2024-06-10 11:09:44,374][32415] Updated weights for policy 0, policy_version 16840 (0.0040) [2024-06-10 11:09:44,592][32177] Fps is (10 sec: 42597.7, 60 sec: 44236.8, 300 sec: 44653.4). Total num frames: 275906560. Throughput: 0: 44768.7. Samples: 275988720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 11:09:44,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:09:47,315][32415] Updated weights for policy 0, policy_version 16850 (0.0025) [2024-06-10 11:09:49,592][32177] Fps is (10 sec: 44238.1, 60 sec: 44510.0, 300 sec: 44708.9). 
Total num frames: 276152320. Throughput: 0: 44783.2. Samples: 276256440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 11:09:49,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:09:51,766][32415] Updated weights for policy 0, policy_version 16860 (0.0024) [2024-06-10 11:09:54,592][32177] Fps is (10 sec: 47513.8, 60 sec: 45056.0, 300 sec: 44764.5). Total num frames: 276381696. Throughput: 0: 44690.6. Samples: 276528960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 11:09:54,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:09:54,663][32415] Updated weights for policy 0, policy_version 16870 (0.0039) [2024-06-10 11:09:59,155][32415] Updated weights for policy 0, policy_version 16880 (0.0030) [2024-06-10 11:09:59,592][32177] Fps is (10 sec: 42598.3, 60 sec: 44509.8, 300 sec: 44653.4). Total num frames: 276578304. Throughput: 0: 44799.0. Samples: 276659600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-10 11:09:59,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:10:01,839][32415] Updated weights for policy 0, policy_version 16890 (0.0038) [2024-06-10 11:10:04,591][32177] Fps is (10 sec: 44237.5, 60 sec: 44510.0, 300 sec: 44597.9). Total num frames: 276824064. Throughput: 0: 44708.8. Samples: 276926060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-10 11:10:04,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:10:06,460][32415] Updated weights for policy 0, policy_version 16900 (0.0021) [2024-06-10 11:10:09,063][32415] Updated weights for policy 0, policy_version 16910 (0.0027) [2024-06-10 11:10:09,592][32177] Fps is (10 sec: 49151.2, 60 sec: 45602.0, 300 sec: 44820.0). Total num frames: 277069824. Throughput: 0: 44575.5. Samples: 277196720. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-10 11:10:09,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:10:13,531][32415] Updated weights for policy 0, policy_version 16920 (0.0032) [2024-06-10 11:10:14,592][32177] Fps is (10 sec: 44235.8, 60 sec: 44783.1, 300 sec: 44764.4). Total num frames: 277266432. Throughput: 0: 44974.5. Samples: 277338880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 11:10:14,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:10:16,352][32415] Updated weights for policy 0, policy_version 16930 (0.0037) [2024-06-10 11:10:19,592][32177] Fps is (10 sec: 42598.9, 60 sec: 44510.1, 300 sec: 44764.4). Total num frames: 277495808. Throughput: 0: 45133.7. Samples: 277609800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 11:10:19,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:10:20,756][32415] Updated weights for policy 0, policy_version 16940 (0.0032) [2024-06-10 11:10:23,779][32415] Updated weights for policy 0, policy_version 16950 (0.0037) [2024-06-10 11:10:24,592][32177] Fps is (10 sec: 49152.2, 60 sec: 45329.0, 300 sec: 44931.0). Total num frames: 277757952. Throughput: 0: 44717.0. Samples: 277871480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-10 11:10:24,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:10:28,135][32415] Updated weights for policy 0, policy_version 16960 (0.0031) [2024-06-10 11:10:29,592][32177] Fps is (10 sec: 42598.7, 60 sec: 44509.9, 300 sec: 44653.4). Total num frames: 277921792. Throughput: 0: 44999.3. Samples: 278013680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-10 11:10:29,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:10:31,050][32415] Updated weights for policy 0, policy_version 16970 (0.0025) [2024-06-10 11:10:34,592][32177] Fps is (10 sec: 40959.9, 60 sec: 44782.8, 300 sec: 44708.9). Total num frames: 278167552. Throughput: 0: 44802.5. Samples: 278272560. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-10 11:10:34,592][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 11:10:34,737][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000016979_278183936.pth... [2024-06-10 11:10:34,787][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000016322_267419648.pth [2024-06-10 11:10:35,468][32415] Updated weights for policy 0, policy_version 16980 (0.0044) [2024-06-10 11:10:37,515][32394] Signal inference workers to stop experience collection... (4000 times) [2024-06-10 11:10:37,515][32394] Signal inference workers to resume experience collection... (4000 times) [2024-06-10 11:10:37,531][32415] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-06-10 11:10:37,556][32415] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-06-10 11:10:38,166][32415] Updated weights for policy 0, policy_version 16990 (0.0026) [2024-06-10 11:10:39,592][32177] Fps is (10 sec: 49151.0, 60 sec: 45056.1, 300 sec: 44875.9). Total num frames: 278413312. Throughput: 0: 44695.9. Samples: 278540280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 11:10:39,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:10:42,559][32415] Updated weights for policy 0, policy_version 17000 (0.0042) [2024-06-10 11:10:44,592][32177] Fps is (10 sec: 44237.1, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 278609920. Throughput: 0: 44992.4. Samples: 278684260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 11:10:44,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:10:45,621][32415] Updated weights for policy 0, policy_version 17010 (0.0038) [2024-06-10 11:10:49,592][32177] Fps is (10 sec: 42598.5, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 278839296. Throughput: 0: 44973.1. Samples: 278949860. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 11:10:49,593][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:10:50,106][32415] Updated weights for policy 0, policy_version 17020 (0.0034) [2024-06-10 11:10:53,054][32415] Updated weights for policy 0, policy_version 17030 (0.0020) [2024-06-10 11:10:54,592][32177] Fps is (10 sec: 45875.4, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 279068672. Throughput: 0: 44870.4. Samples: 279215880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 11:10:54,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:10:57,068][32415] Updated weights for policy 0, policy_version 17040 (0.0032) [2024-06-10 11:10:59,592][32177] Fps is (10 sec: 44236.9, 60 sec: 45055.9, 300 sec: 44764.4). Total num frames: 279281664. Throughput: 0: 44797.8. Samples: 279354780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-10 11:10:59,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:11:00,247][32415] Updated weights for policy 0, policy_version 17050 (0.0029) [2024-06-10 11:11:04,314][32415] Updated weights for policy 0, policy_version 17060 (0.0022) [2024-06-10 11:11:04,592][32177] Fps is (10 sec: 44236.5, 60 sec: 44782.8, 300 sec: 44820.0). Total num frames: 279511040. Throughput: 0: 44695.5. Samples: 279621100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-10 11:11:04,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:11:07,669][32415] Updated weights for policy 0, policy_version 17070 (0.0024) [2024-06-10 11:11:09,592][32177] Fps is (10 sec: 47513.1, 60 sec: 44782.9, 300 sec: 44931.1). Total num frames: 279756800. Throughput: 0: 44891.0. Samples: 279891580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-10 11:11:09,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:11:11,730][32415] Updated weights for policy 0, policy_version 17080 (0.0036) [2024-06-10 11:11:14,592][32177] Fps is (10 sec: 47513.0, 60 sec: 45329.0, 300 sec: 44875.5). 
Total num frames: 279986176. Throughput: 0: 44881.5. Samples: 280033360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-10 11:11:14,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:11:14,851][32415] Updated weights for policy 0, policy_version 17090 (0.0025) [2024-06-10 11:11:19,375][32415] Updated weights for policy 0, policy_version 17100 (0.0030) [2024-06-10 11:11:19,592][32177] Fps is (10 sec: 40960.5, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 280166400. Throughput: 0: 45120.0. Samples: 280302960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-10 11:11:19,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:11:22,366][32415] Updated weights for policy 0, policy_version 17110 (0.0033) [2024-06-10 11:11:24,592][32177] Fps is (10 sec: 42597.9, 60 sec: 44236.6, 300 sec: 44819.9). Total num frames: 280412160. Throughput: 0: 44877.6. Samples: 280559780. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-10 11:11:24,593][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:11:26,644][32415] Updated weights for policy 0, policy_version 17120 (0.0034) [2024-06-10 11:11:29,523][32415] Updated weights for policy 0, policy_version 17130 (0.0023) [2024-06-10 11:11:29,592][32177] Fps is (10 sec: 49151.7, 60 sec: 45602.0, 300 sec: 44876.1). Total num frames: 280657920. Throughput: 0: 44855.9. Samples: 280702780. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-10 11:11:29,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:11:33,692][32415] Updated weights for policy 0, policy_version 17140 (0.0031) [2024-06-10 11:11:34,592][32177] Fps is (10 sec: 42599.7, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 280838144. Throughput: 0: 44855.2. Samples: 280968340. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-10 11:11:34,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:11:36,926][32415] Updated weights for policy 0, policy_version 17150 (0.0029) [2024-06-10 11:11:39,592][32177] Fps is (10 sec: 42598.1, 60 sec: 44509.8, 300 sec: 44820.0). Total num frames: 281083904. Throughput: 0: 45061.6. Samples: 281243660. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-10 11:11:39,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:11:40,638][32415] Updated weights for policy 0, policy_version 17160 (0.0030) [2024-06-10 11:11:44,070][32415] Updated weights for policy 0, policy_version 17170 (0.0033) [2024-06-10 11:11:44,592][32177] Fps is (10 sec: 49152.1, 60 sec: 45329.1, 300 sec: 44931.1). Total num frames: 281329664. Throughput: 0: 44948.1. Samples: 281377440. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-10 11:11:44,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:11:48,179][32415] Updated weights for policy 0, policy_version 17180 (0.0036) [2024-06-10 11:11:49,592][32177] Fps is (10 sec: 42599.1, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 281509888. Throughput: 0: 45077.4. Samples: 281649580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-10 11:11:49,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:11:51,507][32415] Updated weights for policy 0, policy_version 17190 (0.0031) [2024-06-10 11:11:54,592][32177] Fps is (10 sec: 44235.8, 60 sec: 45055.8, 300 sec: 44819.9). Total num frames: 281772032. Throughput: 0: 44968.4. Samples: 281915160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-10 11:11:54,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:11:55,674][32415] Updated weights for policy 0, policy_version 17200 (0.0032) [2024-06-10 11:11:58,442][32394] Signal inference workers to stop experience collection... 
(4050 times) [2024-06-10 11:11:58,442][32394] Signal inference workers to resume experience collection... (4050 times) [2024-06-10 11:11:58,456][32415] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-06-10 11:11:58,456][32415] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-06-10 11:11:58,703][32415] Updated weights for policy 0, policy_version 17210 (0.0031) [2024-06-10 11:11:59,592][32177] Fps is (10 sec: 49152.2, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 282001408. Throughput: 0: 44843.8. Samples: 282051320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-10 11:11:59,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:12:02,859][32415] Updated weights for policy 0, policy_version 17220 (0.0029) [2024-06-10 11:12:04,592][32177] Fps is (10 sec: 42599.1, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 282198016. Throughput: 0: 44664.9. Samples: 282312880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 11:12:04,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:12:06,002][32415] Updated weights for policy 0, policy_version 17230 (0.0029) [2024-06-10 11:12:09,592][32177] Fps is (10 sec: 42597.6, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 282427392. Throughput: 0: 44944.1. Samples: 282582260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 11:12:09,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:12:10,085][32415] Updated weights for policy 0, policy_version 17240 (0.0029) [2024-06-10 11:12:13,471][32415] Updated weights for policy 0, policy_version 17250 (0.0027) [2024-06-10 11:12:14,592][32177] Fps is (10 sec: 47512.6, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 282673152. Throughput: 0: 44842.1. Samples: 282720680. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 11:12:14,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:12:17,392][32415] Updated weights for policy 0, policy_version 17260 (0.0042) [2024-06-10 11:12:19,592][32177] Fps is (10 sec: 45875.7, 60 sec: 45329.1, 300 sec: 44819.9). Total num frames: 282886144. Throughput: 0: 44911.0. Samples: 282989340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 11:12:19,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:12:21,014][32415] Updated weights for policy 0, policy_version 17270 (0.0026) [2024-06-10 11:12:24,592][32177] Fps is (10 sec: 40960.3, 60 sec: 44510.0, 300 sec: 44708.9). Total num frames: 283082752. Throughput: 0: 44611.1. Samples: 283251160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 11:12:24,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:12:24,967][32415] Updated weights for policy 0, policy_version 17280 (0.0034) [2024-06-10 11:12:28,198][32415] Updated weights for policy 0, policy_version 17290 (0.0026) [2024-06-10 11:12:29,592][32177] Fps is (10 sec: 45875.3, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 283344896. Throughput: 0: 44675.0. Samples: 283387820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-10 11:12:29,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:12:32,295][32415] Updated weights for policy 0, policy_version 17300 (0.0036) [2024-06-10 11:12:34,592][32177] Fps is (10 sec: 45876.0, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 283541504. Throughput: 0: 44655.1. Samples: 283659060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-10 11:12:34,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 11:12:34,612][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000017306_283541504.pth... 
[2024-06-10 11:12:34,663][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000016651_272809984.pth [2024-06-10 11:12:35,370][32415] Updated weights for policy 0, policy_version 17310 (0.0023) [2024-06-10 11:12:39,259][32415] Updated weights for policy 0, policy_version 17320 (0.0038) [2024-06-10 11:12:39,592][32177] Fps is (10 sec: 42597.9, 60 sec: 44783.0, 300 sec: 44819.9). Total num frames: 283770880. Throughput: 0: 44811.6. Samples: 283931680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-10 11:12:39,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 11:12:42,701][32415] Updated weights for policy 0, policy_version 17330 (0.0034) [2024-06-10 11:12:44,592][32177] Fps is (10 sec: 45874.8, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 284000256. Throughput: 0: 44812.3. Samples: 284067880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-10 11:12:44,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:12:46,823][32415] Updated weights for policy 0, policy_version 17340 (0.0032) [2024-06-10 11:12:49,592][32177] Fps is (10 sec: 44236.9, 60 sec: 45055.9, 300 sec: 44764.4). Total num frames: 284213248. Throughput: 0: 44893.7. Samples: 284333100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-10 11:12:49,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:12:50,174][32415] Updated weights for policy 0, policy_version 17350 (0.0025) [2024-06-10 11:12:54,233][32415] Updated weights for policy 0, policy_version 17360 (0.0029) [2024-06-10 11:12:54,596][32177] Fps is (10 sec: 44218.7, 60 sec: 44506.9, 300 sec: 44763.8). Total num frames: 284442624. Throughput: 0: 44853.8. Samples: 284600860. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 11:12:54,596][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:12:57,617][32415] Updated weights for policy 0, policy_version 17370 (0.0037) [2024-06-10 11:12:59,592][32177] Fps is (10 sec: 47514.5, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 284688384. Throughput: 0: 44796.3. Samples: 284736500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 11:12:59,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:13:01,274][32415] Updated weights for policy 0, policy_version 17380 (0.0035) [2024-06-10 11:13:04,592][32177] Fps is (10 sec: 44254.9, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 284884992. Throughput: 0: 44672.8. Samples: 284999620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 11:13:04,593][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:13:04,810][32415] Updated weights for policy 0, policy_version 17390 (0.0034) [2024-06-10 11:13:08,560][32415] Updated weights for policy 0, policy_version 17400 (0.0032) [2024-06-10 11:13:09,592][32177] Fps is (10 sec: 40960.1, 60 sec: 44510.0, 300 sec: 44708.9). Total num frames: 285097984. Throughput: 0: 44895.8. Samples: 285271460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 11:13:09,592][32177] Avg episode reward: [(0, '0.265')] [2024-06-10 11:13:09,924][32394] Signal inference workers to stop experience collection... (4100 times) [2024-06-10 11:13:09,967][32415] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-06-10 11:13:09,990][32394] Signal inference workers to resume experience collection... (4100 times) [2024-06-10 11:13:09,991][32415] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-06-10 11:13:12,179][32415] Updated weights for policy 0, policy_version 17410 (0.0032) [2024-06-10 11:13:14,592][32177] Fps is (10 sec: 45875.1, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 285343744. Throughput: 0: 44815.5. 
Samples: 285404520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 11:13:14,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:13:16,145][32415] Updated weights for policy 0, policy_version 17420 (0.0025) [2024-06-10 11:13:19,494][32415] Updated weights for policy 0, policy_version 17430 (0.0033) [2024-06-10 11:13:19,592][32177] Fps is (10 sec: 47512.1, 60 sec: 44782.8, 300 sec: 44820.0). Total num frames: 285573120. Throughput: 0: 44718.4. Samples: 285671400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 11:13:19,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:13:23,542][32415] Updated weights for policy 0, policy_version 17440 (0.0036) [2024-06-10 11:13:24,592][32177] Fps is (10 sec: 44237.7, 60 sec: 45056.2, 300 sec: 44764.4). Total num frames: 285786112. Throughput: 0: 44644.2. Samples: 285940660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 11:13:24,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:13:26,839][32415] Updated weights for policy 0, policy_version 17450 (0.0028) [2024-06-10 11:13:29,591][32177] Fps is (10 sec: 45877.0, 60 sec: 44783.1, 300 sec: 44820.0). Total num frames: 286031872. Throughput: 0: 44556.7. Samples: 286072920. Policy #0 lag: (min: 2.0, avg: 12.1, max: 23.0) [2024-06-10 11:13:29,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:13:30,569][32415] Updated weights for policy 0, policy_version 17460 (0.0029) [2024-06-10 11:13:34,045][32415] Updated weights for policy 0, policy_version 17470 (0.0034) [2024-06-10 11:13:34,592][32177] Fps is (10 sec: 47512.5, 60 sec: 45328.9, 300 sec: 44986.6). Total num frames: 286261248. Throughput: 0: 44854.6. Samples: 286351560. 
Policy #0 lag: (min: 2.0, avg: 12.1, max: 23.0) [2024-06-10 11:13:34,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:13:37,793][32415] Updated weights for policy 0, policy_version 17480 (0.0026) [2024-06-10 11:13:39,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44783.1, 300 sec: 44764.4). Total num frames: 286457856. Throughput: 0: 44693.1. Samples: 286611860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 11:13:39,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 11:13:41,409][32415] Updated weights for policy 0, policy_version 17490 (0.0036) [2024-06-10 11:13:44,592][32177] Fps is (10 sec: 42599.2, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 286687232. Throughput: 0: 44559.1. Samples: 286741660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 11:13:44,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 11:13:44,993][32415] Updated weights for policy 0, policy_version 17500 (0.0032) [2024-06-10 11:13:48,678][32415] Updated weights for policy 0, policy_version 17510 (0.0041) [2024-06-10 11:13:49,592][32177] Fps is (10 sec: 47512.8, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 286932992. Throughput: 0: 44948.9. Samples: 287022320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 11:13:49,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:13:52,409][32415] Updated weights for policy 0, policy_version 17520 (0.0030) [2024-06-10 11:13:54,592][32177] Fps is (10 sec: 42598.6, 60 sec: 44513.0, 300 sec: 44764.4). Total num frames: 287113216. Throughput: 0: 44836.9. Samples: 287289120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 11:13:54,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:13:56,084][32415] Updated weights for policy 0, policy_version 17530 (0.0032) [2024-06-10 11:13:59,314][32415] Updated weights for policy 0, policy_version 17540 (0.0030) [2024-06-10 11:13:59,592][32177] Fps is (10 sec: 44236.7, 60 sec: 44782.8, 300 sec: 44819.9). 
Total num frames: 287375360. Throughput: 0: 44799.1. Samples: 287420480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 11:13:59,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 11:14:03,105][32415] Updated weights for policy 0, policy_version 17550 (0.0032) [2024-06-10 11:14:04,592][32177] Fps is (10 sec: 47511.8, 60 sec: 45055.8, 300 sec: 44931.0). Total num frames: 287588352. Throughput: 0: 44955.9. Samples: 287694420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 11:14:04,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 11:14:06,387][32415] Updated weights for policy 0, policy_version 17560 (0.0041) [2024-06-10 11:14:09,097][32394] Signal inference workers to stop experience collection... (4150 times) [2024-06-10 11:14:09,098][32394] Signal inference workers to resume experience collection... (4150 times) [2024-06-10 11:14:09,114][32415] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-06-10 11:14:09,114][32415] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-06-10 11:14:09,592][32177] Fps is (10 sec: 42597.5, 60 sec: 45055.7, 300 sec: 44820.0). Total num frames: 287801344. Throughput: 0: 45079.2. Samples: 287969240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-10 11:14:09,593][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:14:10,722][32415] Updated weights for policy 0, policy_version 17570 (0.0021) [2024-06-10 11:14:13,751][32415] Updated weights for policy 0, policy_version 17580 (0.0035) [2024-06-10 11:14:14,592][32177] Fps is (10 sec: 44238.5, 60 sec: 44783.1, 300 sec: 44764.5). Total num frames: 288030720. Throughput: 0: 44992.8. Samples: 288097600. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-10 11:14:14,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:14:18,182][32415] Updated weights for policy 0, policy_version 17590 (0.0032) [2024-06-10 11:14:19,592][32177] Fps is (10 sec: 45876.1, 60 sec: 44783.0, 300 sec: 44819.9). Total num frames: 288260096. Throughput: 0: 44865.0. Samples: 288370480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-10 11:14:19,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:14:21,158][32415] Updated weights for policy 0, policy_version 17600 (0.0027) [2024-06-10 11:14:24,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 288456704. Throughput: 0: 45163.0. Samples: 288644200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-10 11:14:24,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:14:25,279][32415] Updated weights for policy 0, policy_version 17610 (0.0038) [2024-06-10 11:14:28,154][32415] Updated weights for policy 0, policy_version 17620 (0.0021) [2024-06-10 11:14:29,591][32177] Fps is (10 sec: 44237.9, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 288702464. Throughput: 0: 45029.0. Samples: 288767960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-10 11:14:29,592][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 11:14:32,761][32415] Updated weights for policy 0, policy_version 17630 (0.0039) [2024-06-10 11:14:34,592][32177] Fps is (10 sec: 47512.6, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 288931840. Throughput: 0: 45018.1. Samples: 289048140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 11:14:34,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:14:34,800][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000017637_288964608.pth... 
[2024-06-10 11:14:34,852][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000016979_278183936.pth [2024-06-10 11:14:35,392][32415] Updated weights for policy 0, policy_version 17640 (0.0033) [2024-06-10 11:14:39,591][32177] Fps is (10 sec: 44236.8, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 289144832. Throughput: 0: 45091.2. Samples: 289318220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 11:14:39,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:14:40,042][32415] Updated weights for policy 0, policy_version 17650 (0.0028) [2024-06-10 11:14:42,531][32415] Updated weights for policy 0, policy_version 17660 (0.0035) [2024-06-10 11:14:44,592][32177] Fps is (10 sec: 44237.8, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 289374208. Throughput: 0: 45017.0. Samples: 289446240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 11:14:44,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:14:47,477][32415] Updated weights for policy 0, policy_version 17670 (0.0029) [2024-06-10 11:14:49,596][32177] Fps is (10 sec: 49130.2, 60 sec: 45052.8, 300 sec: 44930.4). Total num frames: 289636352. Throughput: 0: 44797.8. Samples: 289710500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 11:14:49,597][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:14:50,015][32415] Updated weights for policy 0, policy_version 17680 (0.0048) [2024-06-10 11:14:54,592][32177] Fps is (10 sec: 44236.7, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 289816576. Throughput: 0: 45067.0. Samples: 289997240. 
Policy #0 lag: (min: 2.0, avg: 8.6, max: 22.0) [2024-06-10 11:14:54,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 11:14:54,627][32415] Updated weights for policy 0, policy_version 17690 (0.0031) [2024-06-10 11:14:57,150][32415] Updated weights for policy 0, policy_version 17700 (0.0027) [2024-06-10 11:14:59,596][32177] Fps is (10 sec: 40960.1, 60 sec: 44506.8, 300 sec: 44819.3). Total num frames: 290045952. Throughput: 0: 44764.1. Samples: 290112180. Policy #0 lag: (min: 2.0, avg: 8.6, max: 22.0) [2024-06-10 11:14:59,597][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:15:01,923][32415] Updated weights for policy 0, policy_version 17710 (0.0032) [2024-06-10 11:15:04,529][32415] Updated weights for policy 0, policy_version 17720 (0.0020) [2024-06-10 11:15:04,596][32177] Fps is (10 sec: 50767.9, 60 sec: 45599.0, 300 sec: 44930.4). Total num frames: 290324480. Throughput: 0: 44977.9. Samples: 290394680. Policy #0 lag: (min: 2.0, avg: 8.6, max: 22.0) [2024-06-10 11:15:04,597][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:15:09,326][32415] Updated weights for policy 0, policy_version 17730 (0.0046) [2024-06-10 11:15:09,592][32177] Fps is (10 sec: 45894.1, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 290504704. Throughput: 0: 44978.0. Samples: 290668220. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-10 11:15:09,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 11:15:11,626][32415] Updated weights for policy 0, policy_version 17740 (0.0034) [2024-06-10 11:15:14,592][32177] Fps is (10 sec: 39339.0, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 290717696. Throughput: 0: 44929.2. Samples: 290789780. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-10 11:15:14,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:15:16,772][32415] Updated weights for policy 0, policy_version 17750 (0.0032) [2024-06-10 11:15:17,224][32394] Signal inference workers to stop experience collection... 
(4200 times) [2024-06-10 11:15:17,224][32394] Signal inference workers to resume experience collection... (4200 times) [2024-06-10 11:15:17,263][32415] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-06-10 11:15:17,263][32415] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-06-10 11:15:19,126][32415] Updated weights for policy 0, policy_version 17760 (0.0036) [2024-06-10 11:15:19,592][32177] Fps is (10 sec: 47514.6, 60 sec: 45329.2, 300 sec: 44820.0). Total num frames: 290979840. Throughput: 0: 44602.5. Samples: 291055240. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-10 11:15:19,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:15:24,221][32415] Updated weights for policy 0, policy_version 17770 (0.0030) [2024-06-10 11:15:24,592][32177] Fps is (10 sec: 44236.7, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 291160064. Throughput: 0: 44775.4. Samples: 291333120. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-10 11:15:24,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:15:26,420][32415] Updated weights for policy 0, policy_version 17780 (0.0039) [2024-06-10 11:15:29,596][32177] Fps is (10 sec: 42579.9, 60 sec: 45052.7, 300 sec: 44874.9). Total num frames: 291405824. Throughput: 0: 44618.4. Samples: 291454260. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-10 11:15:29,597][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:15:31,492][32415] Updated weights for policy 0, policy_version 17790 (0.0040) [2024-06-10 11:15:33,927][32415] Updated weights for policy 0, policy_version 17800 (0.0027) [2024-06-10 11:15:34,593][32177] Fps is (10 sec: 49145.4, 60 sec: 45328.2, 300 sec: 44875.3). Total num frames: 291651584. Throughput: 0: 44831.8. Samples: 291727800. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-10 11:15:34,593][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:15:38,958][32415] Updated weights for policy 0, policy_version 17810 (0.0027) [2024-06-10 11:15:39,592][32177] Fps is (10 sec: 44255.6, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 291848192. Throughput: 0: 44507.1. Samples: 292000060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-10 11:15:39,592][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 11:15:40,921][32415] Updated weights for policy 0, policy_version 17820 (0.0019) [2024-06-10 11:15:44,592][32177] Fps is (10 sec: 39326.5, 60 sec: 44509.7, 300 sec: 44764.4). Total num frames: 292044800. Throughput: 0: 44833.0. Samples: 292129480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-10 11:15:44,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:15:46,435][32415] Updated weights for policy 0, policy_version 17830 (0.0041) [2024-06-10 11:15:48,658][32415] Updated weights for policy 0, policy_version 17840 (0.0033) [2024-06-10 11:15:49,592][32177] Fps is (10 sec: 45875.6, 60 sec: 44513.1, 300 sec: 44875.5). Total num frames: 292306944. Throughput: 0: 44318.6. Samples: 292388820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 11:15:49,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:15:53,978][32415] Updated weights for policy 0, policy_version 17850 (0.0036) [2024-06-10 11:15:54,592][32177] Fps is (10 sec: 45875.6, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 292503552. Throughput: 0: 44339.7. Samples: 292663500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 11:15:54,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:15:55,897][32415] Updated weights for policy 0, policy_version 17860 (0.0020) [2024-06-10 11:15:59,592][32177] Fps is (10 sec: 40959.9, 60 sec: 44513.1, 300 sec: 44764.4). Total num frames: 292716544. Throughput: 0: 44457.4. Samples: 292790360. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-10 11:15:59,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:16:01,262][32415] Updated weights for policy 0, policy_version 17870 (0.0027) [2024-06-10 11:16:03,454][32415] Updated weights for policy 0, policy_version 17880 (0.0033) [2024-06-10 11:16:04,592][32177] Fps is (10 sec: 45875.0, 60 sec: 43966.9, 300 sec: 44764.4). Total num frames: 292962304. Throughput: 0: 44412.8. Samples: 293053820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-10 11:16:04,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:16:08,704][32415] Updated weights for policy 0, policy_version 17890 (0.0042) [2024-06-10 11:16:09,592][32177] Fps is (10 sec: 47513.8, 60 sec: 44783.1, 300 sec: 44764.5). Total num frames: 293191680. Throughput: 0: 44393.4. Samples: 293330820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-10 11:16:09,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:16:09,963][32394] Signal inference workers to stop experience collection... (4250 times) [2024-06-10 11:16:09,963][32394] Signal inference workers to resume experience collection... (4250 times) [2024-06-10 11:16:10,002][32415] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-06-10 11:16:10,003][32415] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-06-10 11:16:10,620][32415] Updated weights for policy 0, policy_version 17900 (0.0024) [2024-06-10 11:16:14,592][32177] Fps is (10 sec: 40960.2, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 293371904. Throughput: 0: 44643.3. Samples: 293463020. 
Policy #0 lag: (min: 0.0, avg: 13.2, max: 20.0) [2024-06-10 11:16:14,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:16:15,941][32415] Updated weights for policy 0, policy_version 17910 (0.0036) [2024-06-10 11:16:17,999][32415] Updated weights for policy 0, policy_version 17920 (0.0029) [2024-06-10 11:16:19,592][32177] Fps is (10 sec: 42597.5, 60 sec: 43963.6, 300 sec: 44764.4). Total num frames: 293617664. Throughput: 0: 44198.1. Samples: 293716660. Policy #0 lag: (min: 0.0, avg: 13.2, max: 20.0) [2024-06-10 11:16:19,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:16:23,320][32415] Updated weights for policy 0, policy_version 17930 (0.0033) [2024-06-10 11:16:24,592][32177] Fps is (10 sec: 47513.6, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 293847040. Throughput: 0: 44400.9. Samples: 293998100. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-10 11:16:24,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:16:25,357][32415] Updated weights for policy 0, policy_version 17940 (0.0030) [2024-06-10 11:16:29,592][32177] Fps is (10 sec: 40960.1, 60 sec: 43693.7, 300 sec: 44708.9). Total num frames: 294027264. Throughput: 0: 44364.5. Samples: 294125880. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-10 11:16:29,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:16:30,630][32415] Updated weights for policy 0, policy_version 17950 (0.0028) [2024-06-10 11:16:32,746][32415] Updated weights for policy 0, policy_version 17960 (0.0025) [2024-06-10 11:16:34,592][32177] Fps is (10 sec: 44235.5, 60 sec: 43964.5, 300 sec: 44764.4). Total num frames: 294289408. Throughput: 0: 44440.5. Samples: 294388660. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-10 11:16:34,593][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:16:34,611][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000017963_294305792.pth... 
[2024-06-10 11:16:34,672][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000017306_283541504.pth [2024-06-10 11:16:38,207][32415] Updated weights for policy 0, policy_version 17970 (0.0040) [2024-06-10 11:16:39,596][32177] Fps is (10 sec: 52406.7, 60 sec: 45052.8, 300 sec: 44819.3). Total num frames: 294551552. Throughput: 0: 44370.0. Samples: 294660340. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-10 11:16:39,597][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:16:39,967][32415] Updated weights for policy 0, policy_version 17980 (0.0025) [2024-06-10 11:16:44,592][32177] Fps is (10 sec: 39322.6, 60 sec: 43963.8, 300 sec: 44653.3). Total num frames: 294682624. Throughput: 0: 44507.4. Samples: 294793200. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-10 11:16:44,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:16:45,420][32415] Updated weights for policy 0, policy_version 17990 (0.0028) [2024-06-10 11:16:47,338][32415] Updated weights for policy 0, policy_version 18000 (0.0018) [2024-06-10 11:16:49,596][32177] Fps is (10 sec: 40959.8, 60 sec: 44233.6, 300 sec: 44708.3). Total num frames: 294961152. Throughput: 0: 44394.5. Samples: 295051760. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-10 11:16:49,597][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:16:52,558][32415] Updated weights for policy 0, policy_version 18010 (0.0038) [2024-06-10 11:16:54,592][32177] Fps is (10 sec: 52428.9, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 295206912. Throughput: 0: 44397.6. Samples: 295328720. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-10 11:16:54,592][32177] Avg episode reward: [(0, '0.267')] [2024-06-10 11:16:54,994][32415] Updated weights for policy 0, policy_version 18020 (0.0027) [2024-06-10 11:16:59,592][32177] Fps is (10 sec: 39338.8, 60 sec: 43963.7, 300 sec: 44597.8). Total num frames: 295354368. Throughput: 0: 44448.5. Samples: 295463200. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-10 11:16:59,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:16:59,974][32415] Updated weights for policy 0, policy_version 18030 (0.0027) [2024-06-10 11:17:01,897][32394] Signal inference workers to stop experience collection... (4300 times) [2024-06-10 11:17:01,943][32415] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-06-10 11:17:02,009][32394] Signal inference workers to resume experience collection... (4300 times) [2024-06-10 11:17:02,009][32415] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-06-10 11:17:02,143][32415] Updated weights for policy 0, policy_version 18040 (0.0030) [2024-06-10 11:17:04,595][32177] Fps is (10 sec: 40944.8, 60 sec: 44234.1, 300 sec: 44708.3). Total num frames: 295616512. Throughput: 0: 44718.6. Samples: 295729160. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-10 11:17:04,596][32177] Avg episode reward: [(0, '0.266')] [2024-06-10 11:17:07,292][32415] Updated weights for policy 0, policy_version 18050 (0.0037) [2024-06-10 11:17:09,149][32415] Updated weights for policy 0, policy_version 18060 (0.0033) [2024-06-10 11:17:09,592][32177] Fps is (10 sec: 54066.9, 60 sec: 45055.9, 300 sec: 44820.0). Total num frames: 295895040. Throughput: 0: 44402.7. Samples: 295996220. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-10 11:17:09,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:17:14,592][32177] Fps is (10 sec: 42614.3, 60 sec: 44509.9, 300 sec: 44597.8). Total num frames: 296042496. Throughput: 0: 44768.1. Samples: 296140440. 
Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-10 11:17:14,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 11:17:14,630][32415] Updated weights for policy 0, policy_version 18070 (0.0028) [2024-06-10 11:17:16,796][32415] Updated weights for policy 0, policy_version 18080 (0.0031) [2024-06-10 11:17:19,592][32177] Fps is (10 sec: 39321.1, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 296288256. Throughput: 0: 44829.5. Samples: 296405980. Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-10 11:17:19,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 11:17:21,613][32415] Updated weights for policy 0, policy_version 18090 (0.0024) [2024-06-10 11:17:24,287][32415] Updated weights for policy 0, policy_version 18100 (0.0032) [2024-06-10 11:17:24,592][32177] Fps is (10 sec: 52429.2, 60 sec: 45329.1, 300 sec: 44820.0). Total num frames: 296566784. Throughput: 0: 44767.4. Samples: 296674680. Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-10 11:17:24,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:17:29,110][32415] Updated weights for policy 0, policy_version 18110 (0.0031) [2024-06-10 11:17:29,592][32177] Fps is (10 sec: 44236.9, 60 sec: 45056.0, 300 sec: 44708.9). Total num frames: 296730624. Throughput: 0: 44955.1. Samples: 296816180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 11:17:29,596][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 11:17:31,508][32415] Updated weights for policy 0, policy_version 18120 (0.0030) [2024-06-10 11:17:34,592][32177] Fps is (10 sec: 37683.2, 60 sec: 44237.1, 300 sec: 44653.4). Total num frames: 296943616. Throughput: 0: 44947.0. Samples: 297074180. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 11:17:34,592][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 11:17:36,458][32415] Updated weights for policy 0, policy_version 18130 (0.0035) [2024-06-10 11:17:38,778][32415] Updated weights for policy 0, policy_version 18140 (0.0047) [2024-06-10 11:17:39,592][32177] Fps is (10 sec: 50791.1, 60 sec: 44786.2, 300 sec: 44875.5). Total num frames: 297238528. Throughput: 0: 44736.1. Samples: 297341840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 11:17:39,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:17:43,675][32415] Updated weights for policy 0, policy_version 18150 (0.0036) [2024-06-10 11:17:44,592][32177] Fps is (10 sec: 49152.2, 60 sec: 45875.3, 300 sec: 44820.0). Total num frames: 297435136. Throughput: 0: 44973.8. Samples: 297487020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-10 11:17:44,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:17:46,237][32415] Updated weights for policy 0, policy_version 18160 (0.0023) [2024-06-10 11:17:49,592][32177] Fps is (10 sec: 37683.3, 60 sec: 44240.0, 300 sec: 44654.0). Total num frames: 297615360. Throughput: 0: 44909.6. Samples: 297749920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-10 11:17:49,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:17:50,659][32415] Updated weights for policy 0, policy_version 18170 (0.0025) [2024-06-10 11:17:53,579][32415] Updated weights for policy 0, policy_version 18180 (0.0025) [2024-06-10 11:17:54,592][32177] Fps is (10 sec: 47513.5, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 297910272. Throughput: 0: 45014.7. Samples: 298021880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-10 11:17:54,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:17:58,011][32394] Signal inference workers to stop experience collection... 
(4350 times) [2024-06-10 11:17:58,011][32394] Signal inference workers to resume experience collection... (4350 times) [2024-06-10 11:17:58,055][32415] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-06-10 11:17:58,055][32415] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-06-10 11:17:58,151][32415] Updated weights for policy 0, policy_version 18190 (0.0044) [2024-06-10 11:17:59,592][32177] Fps is (10 sec: 49151.2, 60 sec: 45875.1, 300 sec: 44820.0). Total num frames: 298106880. Throughput: 0: 45071.5. Samples: 298168660. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-10 11:17:59,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:18:00,802][32415] Updated weights for policy 0, policy_version 18200 (0.0034) [2024-06-10 11:18:04,592][32177] Fps is (10 sec: 37683.2, 60 sec: 44512.7, 300 sec: 44708.9). Total num frames: 298287104. Throughput: 0: 44949.5. Samples: 298428700. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-10 11:18:04,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:18:05,541][32415] Updated weights for policy 0, policy_version 18210 (0.0027) [2024-06-10 11:18:08,256][32415] Updated weights for policy 0, policy_version 18220 (0.0038) [2024-06-10 11:18:09,592][32177] Fps is (10 sec: 45874.4, 60 sec: 44509.7, 300 sec: 44819.9). Total num frames: 298565632. Throughput: 0: 44795.2. Samples: 298690480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-10 11:18:09,593][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:18:12,680][32415] Updated weights for policy 0, policy_version 18230 (0.0045) [2024-06-10 11:18:14,592][32177] Fps is (10 sec: 49151.0, 60 sec: 45602.0, 300 sec: 44764.4). Total num frames: 298778624. Throughput: 0: 44968.8. Samples: 298839780. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-10 11:18:14,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:18:15,927][32415] Updated weights for policy 0, policy_version 18240 (0.0032) [2024-06-10 11:18:19,592][32177] Fps is (10 sec: 40960.5, 60 sec: 44782.9, 300 sec: 44708.8). Total num frames: 298975232. Throughput: 0: 45037.6. Samples: 299100880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 11:18:19,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:18:19,839][32415] Updated weights for policy 0, policy_version 18250 (0.0023) [2024-06-10 11:18:23,058][32415] Updated weights for policy 0, policy_version 18260 (0.0031) [2024-06-10 11:18:24,592][32177] Fps is (10 sec: 45876.1, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 299237376. Throughput: 0: 44872.4. Samples: 299361100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 11:18:24,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:18:27,461][32415] Updated weights for policy 0, policy_version 18270 (0.0032) [2024-06-10 11:18:29,592][32177] Fps is (10 sec: 45876.1, 60 sec: 45056.1, 300 sec: 44653.4). Total num frames: 299433984. Throughput: 0: 44960.4. Samples: 299510240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 11:18:29,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:18:30,265][32415] Updated weights for policy 0, policy_version 18280 (0.0043) [2024-06-10 11:18:34,592][32177] Fps is (10 sec: 42597.6, 60 sec: 45328.9, 300 sec: 44764.4). Total num frames: 299663360. Throughput: 0: 44931.8. Samples: 299771860. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-10 11:18:34,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:18:34,600][32415] Updated weights for policy 0, policy_version 18290 (0.0032) [2024-06-10 11:18:34,605][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000018290_299663360.pth... 
[2024-06-10 11:18:34,658][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000017637_288964608.pth [2024-06-10 11:18:37,954][32415] Updated weights for policy 0, policy_version 18300 (0.0033) [2024-06-10 11:18:39,592][32177] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 299892736. Throughput: 0: 44665.8. Samples: 300031840. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-10 11:18:39,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 11:18:41,938][32415] Updated weights for policy 0, policy_version 18310 (0.0021) [2024-06-10 11:18:44,592][32177] Fps is (10 sec: 45874.9, 60 sec: 44782.7, 300 sec: 44708.9). Total num frames: 300122112. Throughput: 0: 44577.7. Samples: 300174660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:18:44,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:18:45,115][32415] Updated weights for policy 0, policy_version 18320 (0.0029) [2024-06-10 11:18:49,175][32415] Updated weights for policy 0, policy_version 18330 (0.0027) [2024-06-10 11:18:49,592][32177] Fps is (10 sec: 44236.4, 60 sec: 45329.0, 300 sec: 44819.9). Total num frames: 300335104. Throughput: 0: 44829.3. Samples: 300446020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:18:49,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:18:52,551][32415] Updated weights for policy 0, policy_version 18340 (0.0037) [2024-06-10 11:18:54,592][32177] Fps is (10 sec: 44237.0, 60 sec: 44236.6, 300 sec: 44708.9). Total num frames: 300564480. Throughput: 0: 44918.3. Samples: 300711800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:18:54,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:18:56,378][32415] Updated weights for policy 0, policy_version 18350 (0.0036) [2024-06-10 11:18:59,592][32177] Fps is (10 sec: 44236.9, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 300777472. Throughput: 0: 44687.7. Samples: 300850720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-10 11:18:59,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:18:59,837][32415] Updated weights for policy 0, policy_version 18360 (0.0041) [2024-06-10 11:19:03,492][32415] Updated weights for policy 0, policy_version 18370 (0.0032) [2024-06-10 11:19:04,596][32177] Fps is (10 sec: 44218.4, 60 sec: 45325.8, 300 sec: 44763.8). Total num frames: 301006848. Throughput: 0: 44989.1. Samples: 301125580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-10 11:19:04,597][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:19:06,975][32415] Updated weights for policy 0, policy_version 18380 (0.0041) [2024-06-10 11:19:09,592][32177] Fps is (10 sec: 44236.8, 60 sec: 44237.0, 300 sec: 44708.9). Total num frames: 301219840. Throughput: 0: 45137.7. Samples: 301392300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-10 11:19:09,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:19:10,850][32415] Updated weights for policy 0, policy_version 18390 (0.0027) [2024-06-10 11:19:12,634][32394] Signal inference workers to stop experience collection... (4400 times) [2024-06-10 11:19:12,670][32415] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-06-10 11:19:12,677][32394] Signal inference workers to resume experience collection... (4400 times) [2024-06-10 11:19:12,690][32415] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-06-10 11:19:14,219][32415] Updated weights for policy 0, policy_version 18400 (0.0029) [2024-06-10 11:19:14,592][32177] Fps is (10 sec: 45894.4, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 301465600. Throughput: 0: 44819.8. Samples: 301527140. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-10 11:19:14,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:19:18,375][32415] Updated weights for policy 0, policy_version 18410 (0.0034) [2024-06-10 11:19:19,592][32177] Fps is (10 sec: 45875.7, 60 sec: 45056.2, 300 sec: 44820.0). Total num frames: 301678592. Throughput: 0: 45060.2. Samples: 301799560. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-10 11:19:19,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:19:21,580][32415] Updated weights for policy 0, policy_version 18420 (0.0040) [2024-06-10 11:19:24,592][32177] Fps is (10 sec: 42598.0, 60 sec: 44236.6, 300 sec: 44708.8). Total num frames: 301891584. Throughput: 0: 45216.1. Samples: 302066580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 11:19:24,593][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:19:25,748][32415] Updated weights for policy 0, policy_version 18430 (0.0030) [2024-06-10 11:19:29,104][32415] Updated weights for policy 0, policy_version 18440 (0.0032) [2024-06-10 11:19:29,593][32177] Fps is (10 sec: 49142.7, 60 sec: 45600.7, 300 sec: 44875.3). Total num frames: 302170112. Throughput: 0: 44837.1. Samples: 302192400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 11:19:29,594][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:19:33,357][32415] Updated weights for policy 0, policy_version 18450 (0.0028) [2024-06-10 11:19:34,592][32177] Fps is (10 sec: 47515.2, 60 sec: 45056.2, 300 sec: 44820.0). Total num frames: 302366720. Throughput: 0: 44965.0. Samples: 302469440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 11:19:34,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:19:36,355][32415] Updated weights for policy 0, policy_version 18460 (0.0038) [2024-06-10 11:19:39,592][32177] Fps is (10 sec: 39328.7, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 302563328. Throughput: 0: 44905.9. Samples: 302732560. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 11:19:39,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:19:40,339][32415] Updated weights for policy 0, policy_version 18470 (0.0027) [2024-06-10 11:19:43,632][32415] Updated weights for policy 0, policy_version 18480 (0.0041) [2024-06-10 11:19:44,592][32177] Fps is (10 sec: 44236.8, 60 sec: 44783.1, 300 sec: 44654.0). Total num frames: 302809088. Throughput: 0: 44737.9. Samples: 302863920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 11:19:44,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:19:47,455][32415] Updated weights for policy 0, policy_version 18490 (0.0029) [2024-06-10 11:19:49,592][32177] Fps is (10 sec: 45875.4, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 303022080. Throughput: 0: 44622.1. Samples: 303133380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 11:19:49,596][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:19:50,893][32415] Updated weights for policy 0, policy_version 18500 (0.0040) [2024-06-10 11:19:54,592][32177] Fps is (10 sec: 44236.2, 60 sec: 44783.0, 300 sec: 44765.1). Total num frames: 303251456. Throughput: 0: 44718.6. Samples: 303404640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 11:19:54,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:19:54,731][32415] Updated weights for policy 0, policy_version 18510 (0.0026) [2024-06-10 11:19:58,322][32415] Updated weights for policy 0, policy_version 18520 (0.0033) [2024-06-10 11:19:59,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45056.1, 300 sec: 44598.5). Total num frames: 303480832. Throughput: 0: 44702.4. Samples: 303538740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 11:19:59,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 11:20:02,123][32415] Updated weights for policy 0, policy_version 18530 (0.0037) [2024-06-10 11:20:04,592][32177] Fps is (10 sec: 45874.9, 60 sec: 45059.1, 300 sec: 44764.4). 
Total num frames: 303710208. Throughput: 0: 44768.7. Samples: 303814160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 11:20:04,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:20:05,698][32415] Updated weights for policy 0, policy_version 18540 (0.0025) [2024-06-10 11:20:09,338][32415] Updated weights for policy 0, policy_version 18550 (0.0031) [2024-06-10 11:20:09,592][32177] Fps is (10 sec: 44236.6, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 303923200. Throughput: 0: 44684.7. Samples: 304077380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 11:20:09,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:20:13,349][32415] Updated weights for policy 0, policy_version 18560 (0.0026) [2024-06-10 11:20:14,592][32177] Fps is (10 sec: 44236.8, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 304152576. Throughput: 0: 44900.3. Samples: 304212840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 11:20:14,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:20:16,512][32415] Updated weights for policy 0, policy_version 18570 (0.0034) [2024-06-10 11:20:19,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 304381952. Throughput: 0: 44692.8. Samples: 304480620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-10 11:20:19,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:20:20,466][32415] Updated weights for policy 0, policy_version 18580 (0.0030) [2024-06-10 11:20:23,984][32415] Updated weights for policy 0, policy_version 18590 (0.0052) [2024-06-10 11:20:24,592][32177] Fps is (10 sec: 44237.4, 60 sec: 45056.2, 300 sec: 44709.5). Total num frames: 304594944. Throughput: 0: 44729.3. Samples: 304745380. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-10 11:20:24,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:20:27,523][32415] Updated weights for policy 0, policy_version 18600 (0.0026) [2024-06-10 11:20:29,596][32177] Fps is (10 sec: 42580.0, 60 sec: 43961.9, 300 sec: 44597.4). Total num frames: 304807936. Throughput: 0: 44805.9. Samples: 304880380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-10 11:20:29,596][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:20:31,328][32415] Updated weights for policy 0, policy_version 18610 (0.0035) [2024-06-10 11:20:34,592][32177] Fps is (10 sec: 45874.0, 60 sec: 44782.7, 300 sec: 44764.4). Total num frames: 305053696. Throughput: 0: 44880.6. Samples: 305153020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-10 11:20:34,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:20:34,608][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000018619_305053696.pth... [2024-06-10 11:20:34,663][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000017963_294305792.pth [2024-06-10 11:20:35,064][32415] Updated weights for policy 0, policy_version 18620 (0.0034) [2024-06-10 11:20:37,619][32394] Signal inference workers to stop experience collection... (4450 times) [2024-06-10 11:20:37,653][32415] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-06-10 11:20:37,665][32394] Signal inference workers to resume experience collection... (4450 times) [2024-06-10 11:20:37,675][32415] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-06-10 11:20:38,371][32415] Updated weights for policy 0, policy_version 18630 (0.0041) [2024-06-10 11:20:39,592][32177] Fps is (10 sec: 47533.7, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 305283072. Throughput: 0: 44781.8. Samples: 305419820. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-10 11:20:39,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:20:42,886][32415] Updated weights for policy 0, policy_version 18640 (0.0026) [2024-06-10 11:20:44,592][32177] Fps is (10 sec: 44237.5, 60 sec: 44782.8, 300 sec: 44708.9). Total num frames: 305496064. Throughput: 0: 44747.4. Samples: 305552380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-10 11:20:44,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:20:45,563][32415] Updated weights for policy 0, policy_version 18650 (0.0021) [2024-06-10 11:20:49,592][32177] Fps is (10 sec: 42598.5, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 305709056. Throughput: 0: 44660.1. Samples: 305823860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-10 11:20:49,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:20:49,987][32415] Updated weights for policy 0, policy_version 18660 (0.0031) [2024-06-10 11:20:53,144][32415] Updated weights for policy 0, policy_version 18670 (0.0039) [2024-06-10 11:20:54,592][32177] Fps is (10 sec: 44237.2, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 305938432. Throughput: 0: 44742.6. Samples: 306090800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 11:20:54,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:20:56,961][32415] Updated weights for policy 0, policy_version 18680 (0.0027) [2024-06-10 11:20:59,596][32177] Fps is (10 sec: 44218.0, 60 sec: 44506.6, 300 sec: 44708.2). Total num frames: 306151424. Throughput: 0: 44694.6. Samples: 306224280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 11:20:59,597][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:21:00,443][32415] Updated weights for policy 0, policy_version 18690 (0.0030) [2024-06-10 11:21:04,523][32415] Updated weights for policy 0, policy_version 18700 (0.0030) [2024-06-10 11:21:04,592][32177] Fps is (10 sec: 44236.4, 60 sec: 44509.9, 300 sec: 44708.9). 
Total num frames: 306380800. Throughput: 0: 44618.1. Samples: 306488440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 11:21:04,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:21:04,597][32394] Saving new best policy, reward=0.291! [2024-06-10 11:21:07,519][32415] Updated weights for policy 0, policy_version 18710 (0.0030) [2024-06-10 11:21:09,592][32177] Fps is (10 sec: 44255.9, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 306593792. Throughput: 0: 44762.7. Samples: 306759700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 11:21:09,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:21:11,927][32415] Updated weights for policy 0, policy_version 18720 (0.0033) [2024-06-10 11:21:14,592][32177] Fps is (10 sec: 45875.6, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 306839552. Throughput: 0: 44898.0. Samples: 306900600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 11:21:14,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:21:14,904][32415] Updated weights for policy 0, policy_version 18730 (0.0031) [2024-06-10 11:21:18,950][32415] Updated weights for policy 0, policy_version 18740 (0.0044) [2024-06-10 11:21:19,592][32177] Fps is (10 sec: 44236.1, 60 sec: 44236.7, 300 sec: 44708.9). Total num frames: 307036160. Throughput: 0: 44708.1. Samples: 307164880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 11:21:19,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:21:22,349][32415] Updated weights for policy 0, policy_version 18750 (0.0039) [2024-06-10 11:21:24,592][32177] Fps is (10 sec: 44237.0, 60 sec: 44783.0, 300 sec: 44931.1). Total num frames: 307281920. Throughput: 0: 44919.2. Samples: 307441180. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-10 11:21:24,593][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:21:25,948][32415] Updated weights for policy 0, policy_version 18760 (0.0049) [2024-06-10 11:21:29,592][32177] Fps is (10 sec: 47514.5, 60 sec: 45059.3, 300 sec: 44820.0). Total num frames: 307511296. Throughput: 0: 45095.3. Samples: 307581660. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-10 11:21:29,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:21:29,622][32415] Updated weights for policy 0, policy_version 18770 (0.0025) [2024-06-10 11:21:33,678][32415] Updated weights for policy 0, policy_version 18780 (0.0033) [2024-06-10 11:21:34,592][32177] Fps is (10 sec: 44235.9, 60 sec: 44509.9, 300 sec: 44654.0). Total num frames: 307724288. Throughput: 0: 44830.1. Samples: 307841220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 11:21:34,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:21:36,617][32415] Updated weights for policy 0, policy_version 18790 (0.0030) [2024-06-10 11:21:39,592][32177] Fps is (10 sec: 45875.1, 60 sec: 44783.0, 300 sec: 45042.1). Total num frames: 307970048. Throughput: 0: 44890.7. Samples: 308110880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 11:21:39,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:21:41,407][32415] Updated weights for policy 0, policy_version 18800 (0.0035) [2024-06-10 11:21:43,852][32415] Updated weights for policy 0, policy_version 18810 (0.0036) [2024-06-10 11:21:44,592][32177] Fps is (10 sec: 47514.2, 60 sec: 45056.0, 300 sec: 44876.1). Total num frames: 308199424. Throughput: 0: 45066.4. Samples: 308252080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 11:21:44,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:21:46,581][32394] Signal inference workers to stop experience collection... 
(4500 times) [2024-06-10 11:21:46,582][32394] Signal inference workers to resume experience collection... (4500 times) [2024-06-10 11:21:46,619][32415] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-06-10 11:21:46,619][32415] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-06-10 11:21:48,325][32415] Updated weights for policy 0, policy_version 18820 (0.0034) [2024-06-10 11:21:49,592][32177] Fps is (10 sec: 40959.9, 60 sec: 44509.9, 300 sec: 44653.4). Total num frames: 308379648. Throughput: 0: 45169.0. Samples: 308521040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-10 11:21:49,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:21:51,483][32415] Updated weights for policy 0, policy_version 18830 (0.0033) [2024-06-10 11:21:54,592][32177] Fps is (10 sec: 44236.9, 60 sec: 45056.0, 300 sec: 45042.1). Total num frames: 308641792. Throughput: 0: 44960.8. Samples: 308782940. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-10 11:21:54,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:21:55,368][32415] Updated weights for policy 0, policy_version 18840 (0.0046) [2024-06-10 11:21:59,062][32415] Updated weights for policy 0, policy_version 18850 (0.0030) [2024-06-10 11:21:59,592][32177] Fps is (10 sec: 47513.3, 60 sec: 45059.2, 300 sec: 44876.1). Total num frames: 308854784. Throughput: 0: 44688.0. Samples: 308911560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-10 11:21:59,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:22:03,027][32415] Updated weights for policy 0, policy_version 18860 (0.0043) [2024-06-10 11:22:04,592][32177] Fps is (10 sec: 44236.5, 60 sec: 45056.0, 300 sec: 44708.9). Total num frames: 309084160. Throughput: 0: 44927.6. Samples: 309186620. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 11:22:04,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:22:06,034][32415] Updated weights for policy 0, policy_version 18870 (0.0027) [2024-06-10 11:22:09,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 309313536. Throughput: 0: 44656.4. Samples: 309450720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 11:22:09,593][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:22:10,683][32415] Updated weights for policy 0, policy_version 18880 (0.0032) [2024-06-10 11:22:13,259][32415] Updated weights for policy 0, policy_version 18890 (0.0034) [2024-06-10 11:22:14,592][32177] Fps is (10 sec: 45875.9, 60 sec: 45056.0, 300 sec: 44931.1). Total num frames: 309542912. Throughput: 0: 44686.6. Samples: 309592560. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-10 11:22:14,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:22:17,820][32415] Updated weights for policy 0, policy_version 18900 (0.0025) [2024-06-10 11:22:19,592][32177] Fps is (10 sec: 42598.9, 60 sec: 45056.1, 300 sec: 44653.3). Total num frames: 309739520. Throughput: 0: 44832.7. Samples: 309858680. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-10 11:22:19,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:22:20,806][32415] Updated weights for policy 0, policy_version 18910 (0.0045) [2024-06-10 11:22:24,592][32177] Fps is (10 sec: 42597.4, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 309968896. Throughput: 0: 44767.3. Samples: 310125420. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-10 11:22:24,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:22:24,781][32415] Updated weights for policy 0, policy_version 18920 (0.0033) [2024-06-10 11:22:28,205][32415] Updated weights for policy 0, policy_version 18930 (0.0036) [2024-06-10 11:22:29,592][32177] Fps is (10 sec: 45875.4, 60 sec: 44782.9, 300 sec: 44931.0). 
Total num frames: 310198272. Throughput: 0: 44702.8. Samples: 310263700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 11:22:29,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:22:32,379][32415] Updated weights for policy 0, policy_version 18940 (0.0029) [2024-06-10 11:22:34,596][32177] Fps is (10 sec: 42581.0, 60 sec: 44506.8, 300 sec: 44597.1). Total num frames: 310394880. Throughput: 0: 44780.6. Samples: 310536360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 11:22:34,597][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:22:34,781][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000018947_310427648.pth... [2024-06-10 11:22:34,835][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000018290_299663360.pth [2024-06-10 11:22:35,313][32415] Updated weights for policy 0, policy_version 18950 (0.0038) [2024-06-10 11:22:39,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 310624256. Throughput: 0: 44881.0. Samples: 310802580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 11:22:39,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:22:39,927][32415] Updated weights for policy 0, policy_version 18960 (0.0027) [2024-06-10 11:22:42,651][32415] Updated weights for policy 0, policy_version 18970 (0.0038) [2024-06-10 11:22:44,592][32177] Fps is (10 sec: 49172.5, 60 sec: 44782.9, 300 sec: 44986.5). Total num frames: 310886400. Throughput: 0: 45080.4. Samples: 310940180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 11:22:44,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:22:46,877][32415] Updated weights for policy 0, policy_version 18980 (0.0033) [2024-06-10 11:22:49,592][32177] Fps is (10 sec: 47512.9, 60 sec: 45329.0, 300 sec: 44708.9). Total num frames: 311099392. Throughput: 0: 44870.7. Samples: 311205800. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 11:22:49,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:22:49,873][32415] Updated weights for policy 0, policy_version 18990 (0.0044) [2024-06-10 11:22:50,849][32394] Signal inference workers to stop experience collection... (4550 times) [2024-06-10 11:22:50,849][32394] Signal inference workers to resume experience collection... (4550 times) [2024-06-10 11:22:50,864][32415] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-06-10 11:22:50,864][32415] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-06-10 11:22:54,131][32415] Updated weights for policy 0, policy_version 19000 (0.0040) [2024-06-10 11:22:54,592][32177] Fps is (10 sec: 42599.1, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 311312384. Throughput: 0: 44965.9. Samples: 311474180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 11:22:54,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:22:57,615][32415] Updated weights for policy 0, policy_version 19010 (0.0032) [2024-06-10 11:22:59,592][32177] Fps is (10 sec: 45875.8, 60 sec: 45056.1, 300 sec: 44986.6). Total num frames: 311558144. Throughput: 0: 44740.0. Samples: 311605860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 11:22:59,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:23:01,599][32415] Updated weights for policy 0, policy_version 19020 (0.0032) [2024-06-10 11:23:04,592][32177] Fps is (10 sec: 45875.0, 60 sec: 44783.0, 300 sec: 44764.5). Total num frames: 311771136. Throughput: 0: 44858.2. Samples: 311877300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 11:23:04,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:23:04,702][32415] Updated weights for policy 0, policy_version 19030 (0.0032) [2024-06-10 11:23:08,710][32415] Updated weights for policy 0, policy_version 19040 (0.0025) [2024-06-10 11:23:09,592][32177] Fps is (10 sec: 42598.5, 60 sec: 44510.0, 300 sec: 44764.5). Total num frames: 311984128. Throughput: 0: 44952.2. Samples: 312148260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-10 11:23:09,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:23:12,042][32415] Updated weights for policy 0, policy_version 19050 (0.0031) [2024-06-10 11:23:14,592][32177] Fps is (10 sec: 47513.1, 60 sec: 45055.9, 300 sec: 44986.6). Total num frames: 312246272. Throughput: 0: 44761.6. Samples: 312277980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-10 11:23:14,592][32177] Avg episode reward: [(0, '0.270')] [2024-06-10 11:23:15,594][32415] Updated weights for policy 0, policy_version 19060 (0.0027) [2024-06-10 11:23:19,344][32415] Updated weights for policy 0, policy_version 19070 (0.0032) [2024-06-10 11:23:19,596][32177] Fps is (10 sec: 45855.3, 60 sec: 45052.7, 300 sec: 44763.8). Total num frames: 312442880. Throughput: 0: 44754.7. Samples: 312550320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-10 11:23:19,596][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:23:22,938][32415] Updated weights for policy 0, policy_version 19080 (0.0040) [2024-06-10 11:23:24,598][32177] Fps is (10 sec: 40935.9, 60 sec: 44778.6, 300 sec: 44819.0). Total num frames: 312655872. Throughput: 0: 44789.1. Samples: 312818360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 11:23:24,598][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:23:27,049][32415] Updated weights for policy 0, policy_version 19090 (0.0034) [2024-06-10 11:23:29,592][32177] Fps is (10 sec: 47533.8, 60 sec: 45329.0, 300 sec: 44931.1). 
Total num frames: 312918016. Throughput: 0: 44800.1. Samples: 312956180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 11:23:29,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:23:30,320][32415] Updated weights for policy 0, policy_version 19100 (0.0032) [2024-06-10 11:23:34,109][32415] Updated weights for policy 0, policy_version 19110 (0.0039) [2024-06-10 11:23:34,592][32177] Fps is (10 sec: 45902.8, 60 sec: 45332.3, 300 sec: 44820.0). Total num frames: 313114624. Throughput: 0: 44806.3. Samples: 313222080. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-10 11:23:34,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:23:37,324][32415] Updated weights for policy 0, policy_version 19120 (0.0043) [2024-06-10 11:23:39,592][32177] Fps is (10 sec: 40959.8, 60 sec: 45055.9, 300 sec: 44764.4). Total num frames: 313327616. Throughput: 0: 44727.4. Samples: 313486920. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-10 11:23:39,594][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:23:41,393][32415] Updated weights for policy 0, policy_version 19130 (0.0029) [2024-06-10 11:23:44,550][32415] Updated weights for policy 0, policy_version 19140 (0.0034) [2024-06-10 11:23:44,592][32177] Fps is (10 sec: 47513.5, 60 sec: 45056.1, 300 sec: 44931.0). Total num frames: 313589760. Throughput: 0: 44938.6. Samples: 313628100. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-10 11:23:44,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:23:48,575][32415] Updated weights for policy 0, policy_version 19150 (0.0025) [2024-06-10 11:23:49,592][32177] Fps is (10 sec: 45875.8, 60 sec: 44783.1, 300 sec: 44820.0). Total num frames: 313786368. Throughput: 0: 44919.6. Samples: 313898680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 11:23:49,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:23:51,795][32415] Updated weights for policy 0, policy_version 19160 (0.0028) [2024-06-10 11:23:54,592][32177] Fps is (10 sec: 42597.8, 60 sec: 45055.8, 300 sec: 44875.5). Total num frames: 314015744. Throughput: 0: 44873.1. Samples: 314167560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 11:23:54,592][32177] Avg episode reward: [(0, '0.292')] [2024-06-10 11:23:54,613][32394] Saving new best policy, reward=0.292! [2024-06-10 11:23:56,146][32415] Updated weights for policy 0, policy_version 19170 (0.0042) [2024-06-10 11:23:59,002][32415] Updated weights for policy 0, policy_version 19180 (0.0027) [2024-06-10 11:23:59,592][32177] Fps is (10 sec: 47512.5, 60 sec: 45055.8, 300 sec: 44931.7). Total num frames: 314261504. Throughput: 0: 45008.8. Samples: 314303380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 11:23:59,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:24:03,209][32415] Updated weights for policy 0, policy_version 19190 (0.0027) [2024-06-10 11:24:04,592][32177] Fps is (10 sec: 44236.2, 60 sec: 44782.7, 300 sec: 44875.5). Total num frames: 314458112. Throughput: 0: 44960.4. Samples: 314573360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 11:24:04,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:24:06,143][32415] Updated weights for policy 0, policy_version 19200 (0.0026) [2024-06-10 11:24:09,592][32177] Fps is (10 sec: 40961.0, 60 sec: 44783.0, 300 sec: 44764.5). Total num frames: 314671104. Throughput: 0: 44982.1. Samples: 314842280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 11:24:09,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:24:10,751][32415] Updated weights for policy 0, policy_version 19210 (0.0038) [2024-06-10 11:24:13,640][32394] Signal inference workers to stop experience collection... 
(4600 times) [2024-06-10 11:24:13,640][32394] Signal inference workers to resume experience collection... (4600 times) [2024-06-10 11:24:13,684][32415] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-06-10 11:24:13,684][32415] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-06-10 11:24:13,779][32415] Updated weights for policy 0, policy_version 19220 (0.0028) [2024-06-10 11:24:14,592][32177] Fps is (10 sec: 44238.0, 60 sec: 44236.9, 300 sec: 44819.9). Total num frames: 314900480. Throughput: 0: 44883.1. Samples: 314975920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 11:24:14,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:24:18,104][32415] Updated weights for policy 0, policy_version 19230 (0.0033) [2024-06-10 11:24:19,592][32177] Fps is (10 sec: 47513.1, 60 sec: 45059.2, 300 sec: 44931.1). Total num frames: 315146240. Throughput: 0: 45108.9. Samples: 315251980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 11:24:19,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:24:20,847][32415] Updated weights for policy 0, policy_version 19240 (0.0039) [2024-06-10 11:24:24,592][32177] Fps is (10 sec: 44236.4, 60 sec: 44787.3, 300 sec: 44653.6). Total num frames: 315342848. Throughput: 0: 45076.0. Samples: 315515340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 11:24:24,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:24:25,356][32415] Updated weights for policy 0, policy_version 19250 (0.0036) [2024-06-10 11:24:28,586][32415] Updated weights for policy 0, policy_version 19260 (0.0041) [2024-06-10 11:24:29,592][32177] Fps is (10 sec: 42598.6, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 315572224. Throughput: 0: 44972.5. Samples: 315651860. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-10 11:24:29,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:24:32,343][32415] Updated weights for policy 0, policy_version 19270 (0.0023) [2024-06-10 11:24:34,592][32177] Fps is (10 sec: 47513.2, 60 sec: 45055.9, 300 sec: 44931.0). Total num frames: 315817984. Throughput: 0: 44993.1. Samples: 315923380. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-10 11:24:34,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:24:34,743][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000019277_315834368.pth... [2024-06-10 11:24:34,795][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000018619_305053696.pth [2024-06-10 11:24:36,225][32415] Updated weights for policy 0, policy_version 19280 (0.0040) [2024-06-10 11:24:39,592][32177] Fps is (10 sec: 45874.6, 60 sec: 45056.0, 300 sec: 44819.9). Total num frames: 316030976. Throughput: 0: 44822.3. Samples: 316184560. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-10 11:24:39,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:24:39,913][32415] Updated weights for policy 0, policy_version 19290 (0.0032) [2024-06-10 11:24:43,195][32415] Updated weights for policy 0, policy_version 19300 (0.0031) [2024-06-10 11:24:44,592][32177] Fps is (10 sec: 44237.3, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 316260352. Throughput: 0: 44759.2. Samples: 316317540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 11:24:44,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:24:47,391][32415] Updated weights for policy 0, policy_version 19310 (0.0027) [2024-06-10 11:24:49,592][32177] Fps is (10 sec: 44237.6, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 316473344. Throughput: 0: 44906.1. Samples: 316594120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 11:24:49,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:24:50,239][32415] Updated weights for policy 0, policy_version 19320 (0.0025) [2024-06-10 11:24:54,526][32415] Updated weights for policy 0, policy_version 19330 (0.0038) [2024-06-10 11:24:54,592][32177] Fps is (10 sec: 44237.1, 60 sec: 44783.1, 300 sec: 44820.0). Total num frames: 316702720. Throughput: 0: 44886.2. Samples: 316862160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 11:24:54,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:24:57,624][32415] Updated weights for policy 0, policy_version 19340 (0.0027) [2024-06-10 11:24:59,592][32177] Fps is (10 sec: 45874.2, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 316932096. Throughput: 0: 44742.1. Samples: 316989320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 11:24:59,593][32177] Avg episode reward: [(0, '0.294')] [2024-06-10 11:24:59,593][32394] Saving new best policy, reward=0.294! [2024-06-10 11:25:01,513][32415] Updated weights for policy 0, policy_version 19350 (0.0036) [2024-06-10 11:25:04,596][32177] Fps is (10 sec: 47493.0, 60 sec: 45326.0, 300 sec: 44930.4). Total num frames: 317177856. Throughput: 0: 44845.0. Samples: 317270200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 11:25:04,597][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:25:05,036][32415] Updated weights for policy 0, policy_version 19360 (0.0051) [2024-06-10 11:25:08,780][32415] Updated weights for policy 0, policy_version 19370 (0.0038) [2024-06-10 11:25:09,595][32177] Fps is (10 sec: 44222.3, 60 sec: 45053.4, 300 sec: 44819.5). Total num frames: 317374464. Throughput: 0: 44898.0. Samples: 317535900. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:25:09,596][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:25:11,989][32415] Updated weights for policy 0, policy_version 19380 (0.0027) [2024-06-10 11:25:14,592][32177] Fps is (10 sec: 42616.5, 60 sec: 45055.9, 300 sec: 44819.9). Total num frames: 317603840. Throughput: 0: 44697.2. Samples: 317663240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:25:14,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:25:16,122][32415] Updated weights for policy 0, policy_version 19390 (0.0043) [2024-06-10 11:25:19,177][32415] Updated weights for policy 0, policy_version 19400 (0.0026) [2024-06-10 11:25:19,592][32177] Fps is (10 sec: 49167.0, 60 sec: 45328.8, 300 sec: 44986.5). Total num frames: 317865984. Throughput: 0: 44854.9. Samples: 317941860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:25:19,593][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:25:23,565][32415] Updated weights for policy 0, policy_version 19410 (0.0039) [2024-06-10 11:25:24,592][32177] Fps is (10 sec: 44237.0, 60 sec: 45056.0, 300 sec: 44876.1). Total num frames: 318046208. Throughput: 0: 45083.1. Samples: 318213300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 11:25:24,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:25:26,705][32415] Updated weights for policy 0, policy_version 19420 (0.0032) [2024-06-10 11:25:29,592][32177] Fps is (10 sec: 40961.0, 60 sec: 45055.9, 300 sec: 44820.0). Total num frames: 318275584. Throughput: 0: 44917.3. Samples: 318338820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 11:25:29,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:25:30,849][32415] Updated weights for policy 0, policy_version 19430 (0.0029) [2024-06-10 11:25:34,020][32415] Updated weights for policy 0, policy_version 19440 (0.0035) [2024-06-10 11:25:34,592][32177] Fps is (10 sec: 47514.0, 60 sec: 45056.2, 300 sec: 44875.5). 
Total num frames: 318521344. Throughput: 0: 44843.5. Samples: 318612080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 11:25:34,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:25:36,287][32394] Signal inference workers to stop experience collection... (4650 times) [2024-06-10 11:25:36,320][32415] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-06-10 11:25:36,340][32394] Signal inference workers to resume experience collection... (4650 times) [2024-06-10 11:25:36,341][32415] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-06-10 11:25:38,461][32415] Updated weights for policy 0, policy_version 19450 (0.0032) [2024-06-10 11:25:39,592][32177] Fps is (10 sec: 44237.1, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 318717952. Throughput: 0: 44850.2. Samples: 318880420. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-10 11:25:39,592][32177] Avg episode reward: [(0, '0.273')] [2024-06-10 11:25:41,284][32415] Updated weights for policy 0, policy_version 19460 (0.0029) [2024-06-10 11:25:44,592][32177] Fps is (10 sec: 40959.4, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 318930944. Throughput: 0: 44856.0. Samples: 319007840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-10 11:25:44,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:25:45,458][32415] Updated weights for policy 0, policy_version 19470 (0.0032) [2024-06-10 11:25:48,379][32415] Updated weights for policy 0, policy_version 19480 (0.0027) [2024-06-10 11:25:49,592][32177] Fps is (10 sec: 47512.8, 60 sec: 45328.8, 300 sec: 44931.0). Total num frames: 319193088. Throughput: 0: 44787.2. Samples: 319285440. 
Policy #0 lag: (min: 2.0, avg: 12.5, max: 24.0) [2024-06-10 11:25:49,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:25:52,694][32415] Updated weights for policy 0, policy_version 19490 (0.0044) [2024-06-10 11:25:54,592][32177] Fps is (10 sec: 45876.1, 60 sec: 44783.0, 300 sec: 44876.2). Total num frames: 319389696. Throughput: 0: 44998.1. Samples: 319560660. Policy #0 lag: (min: 2.0, avg: 12.5, max: 24.0) [2024-06-10 11:25:54,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:25:55,863][32415] Updated weights for policy 0, policy_version 19500 (0.0036) [2024-06-10 11:25:59,592][32177] Fps is (10 sec: 42599.2, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 319619072. Throughput: 0: 45050.3. Samples: 319690500. Policy #0 lag: (min: 2.0, avg: 12.5, max: 24.0) [2024-06-10 11:25:59,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:26:00,064][32415] Updated weights for policy 0, policy_version 19510 (0.0029) [2024-06-10 11:26:03,207][32415] Updated weights for policy 0, policy_version 19520 (0.0035) [2024-06-10 11:26:04,592][32177] Fps is (10 sec: 47512.6, 60 sec: 44786.1, 300 sec: 44986.6). Total num frames: 319864832. Throughput: 0: 44698.9. Samples: 319953300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 11:26:04,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:26:07,707][32415] Updated weights for policy 0, policy_version 19530 (0.0060) [2024-06-10 11:26:09,592][32177] Fps is (10 sec: 44236.4, 60 sec: 44785.4, 300 sec: 44819.9). Total num frames: 320061440. Throughput: 0: 44804.8. Samples: 320229520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 11:26:09,592][32177] Avg episode reward: [(0, '0.269')] [2024-06-10 11:26:10,594][32415] Updated weights for policy 0, policy_version 19540 (0.0030) [2024-06-10 11:26:14,592][32177] Fps is (10 sec: 42598.8, 60 sec: 44783.0, 300 sec: 44931.1). Total num frames: 320290816. Throughput: 0: 44929.0. Samples: 320360620. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 11:26:14,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:26:14,728][32415] Updated weights for policy 0, policy_version 19550 (0.0037) [2024-06-10 11:26:17,743][32415] Updated weights for policy 0, policy_version 19560 (0.0031) [2024-06-10 11:26:19,592][32177] Fps is (10 sec: 45875.7, 60 sec: 44237.1, 300 sec: 44875.5). Total num frames: 320520192. Throughput: 0: 44656.4. Samples: 320621620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 11:26:19,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:26:22,287][32415] Updated weights for policy 0, policy_version 19570 (0.0026) [2024-06-10 11:26:24,596][32177] Fps is (10 sec: 47493.5, 60 sec: 45325.9, 300 sec: 44930.4). Total num frames: 320765952. Throughput: 0: 44794.0. Samples: 320896340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 11:26:24,596][32177] Avg episode reward: [(0, '0.292')] [2024-06-10 11:26:25,458][32415] Updated weights for policy 0, policy_version 19580 (0.0040) [2024-06-10 11:26:29,592][32177] Fps is (10 sec: 42598.6, 60 sec: 44510.0, 300 sec: 44820.0). Total num frames: 320946176. Throughput: 0: 45008.2. Samples: 321033200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-10 11:26:29,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:26:29,642][32415] Updated weights for policy 0, policy_version 19590 (0.0040) [2024-06-10 11:26:32,702][32415] Updated weights for policy 0, policy_version 19600 (0.0031) [2024-06-10 11:26:34,592][32177] Fps is (10 sec: 40977.6, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 321175552. Throughput: 0: 44658.5. Samples: 321295060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-10 11:26:34,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:26:34,768][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000019605_321208320.pth... 
[2024-06-10 11:26:34,820][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000018947_310427648.pth [2024-06-10 11:26:37,007][32415] Updated weights for policy 0, policy_version 19610 (0.0033) [2024-06-10 11:26:39,592][32177] Fps is (10 sec: 45874.9, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 321404928. Throughput: 0: 44583.5. Samples: 321566920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-10 11:26:39,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:26:39,919][32415] Updated weights for policy 0, policy_version 19620 (0.0030) [2024-06-10 11:26:44,204][32415] Updated weights for policy 0, policy_version 19630 (0.0029) [2024-06-10 11:26:44,592][32177] Fps is (10 sec: 44236.6, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 321617920. Throughput: 0: 44704.5. Samples: 321702200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-10 11:26:44,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:26:46,880][32394] Signal inference workers to stop experience collection... (4700 times) [2024-06-10 11:26:46,931][32415] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-06-10 11:26:46,937][32394] Signal inference workers to resume experience collection... (4700 times) [2024-06-10 11:26:46,945][32415] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-06-10 11:26:47,075][32415] Updated weights for policy 0, policy_version 19640 (0.0040) [2024-06-10 11:26:49,592][32177] Fps is (10 sec: 45874.3, 60 sec: 44509.9, 300 sec: 44819.9). Total num frames: 321863680. Throughput: 0: 44767.5. Samples: 321967840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-10 11:26:49,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:26:51,523][32415] Updated weights for policy 0, policy_version 19650 (0.0025) [2024-06-10 11:26:54,592][32177] Fps is (10 sec: 47513.8, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 322093056. Throughput: 0: 44701.0. 
Samples: 322241060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-10 11:26:54,592][32177] Avg episode reward: [(0, '0.294')] [2024-06-10 11:26:54,692][32415] Updated weights for policy 0, policy_version 19660 (0.0036) [2024-06-10 11:26:58,804][32415] Updated weights for policy 0, policy_version 19670 (0.0043) [2024-06-10 11:26:59,596][32177] Fps is (10 sec: 44219.0, 60 sec: 44779.8, 300 sec: 44819.3). Total num frames: 322306048. Throughput: 0: 44912.7. Samples: 322381880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:26:59,597][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:27:02,009][32415] Updated weights for policy 0, policy_version 19680 (0.0048) [2024-06-10 11:27:04,592][32177] Fps is (10 sec: 45874.6, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 322551808. Throughput: 0: 44955.9. Samples: 322644640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:27:04,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:27:06,175][32415] Updated weights for policy 0, policy_version 19690 (0.0035) [2024-06-10 11:27:09,226][32415] Updated weights for policy 0, policy_version 19700 (0.0040) [2024-06-10 11:27:09,592][32177] Fps is (10 sec: 45895.0, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 322764800. Throughput: 0: 44803.9. Samples: 322912320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:27:09,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:27:13,264][32415] Updated weights for policy 0, policy_version 19710 (0.0023) [2024-06-10 11:27:14,592][32177] Fps is (10 sec: 42599.1, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 322977792. Throughput: 0: 44768.0. Samples: 323047760. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 11:27:14,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:27:16,468][32415] Updated weights for policy 0, policy_version 19720 (0.0041) [2024-06-10 11:27:19,592][32177] Fps is (10 sec: 45875.0, 60 sec: 45056.0, 300 sec: 44931.1). Total num frames: 323223552. Throughput: 0: 44994.2. Samples: 323319800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 11:27:19,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:27:20,803][32415] Updated weights for policy 0, policy_version 19730 (0.0037) [2024-06-10 11:27:23,992][32415] Updated weights for policy 0, policy_version 19740 (0.0027) [2024-06-10 11:27:24,592][32177] Fps is (10 sec: 45875.2, 60 sec: 44513.1, 300 sec: 44875.5). Total num frames: 323436544. Throughput: 0: 44893.9. Samples: 323587140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 11:27:24,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:27:27,838][32415] Updated weights for policy 0, policy_version 19750 (0.0045) [2024-06-10 11:27:29,592][32177] Fps is (10 sec: 40960.2, 60 sec: 44782.9, 300 sec: 44876.2). Total num frames: 323633152. Throughput: 0: 44894.3. Samples: 323722440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 11:27:29,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:27:31,184][32415] Updated weights for policy 0, policy_version 19760 (0.0039) [2024-06-10 11:27:34,592][32177] Fps is (10 sec: 44235.6, 60 sec: 45055.8, 300 sec: 44931.0). Total num frames: 323878912. Throughput: 0: 44914.7. Samples: 323989000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 11:27:34,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:27:35,259][32415] Updated weights for policy 0, policy_version 19770 (0.0034) [2024-06-10 11:27:38,503][32415] Updated weights for policy 0, policy_version 19780 (0.0034) [2024-06-10 11:27:39,592][32177] Fps is (10 sec: 47512.8, 60 sec: 45055.9, 300 sec: 44820.0). 
Total num frames: 324108288. Throughput: 0: 44918.5. Samples: 324262400. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-10 11:27:39,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:27:42,392][32415] Updated weights for policy 0, policy_version 19790 (0.0027) [2024-06-10 11:27:44,592][32177] Fps is (10 sec: 45875.6, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 324337664. Throughput: 0: 44838.3. Samples: 324399420. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-10 11:27:44,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:27:45,639][32415] Updated weights for policy 0, policy_version 19800 (0.0030) [2024-06-10 11:27:49,592][32177] Fps is (10 sec: 44237.6, 60 sec: 44783.1, 300 sec: 44875.5). Total num frames: 324550656. Throughput: 0: 45053.9. Samples: 324672060. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-10 11:27:49,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:27:49,693][32415] Updated weights for policy 0, policy_version 19810 (0.0039) [2024-06-10 11:27:52,808][32415] Updated weights for policy 0, policy_version 19820 (0.0031) [2024-06-10 11:27:54,592][32177] Fps is (10 sec: 45875.5, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 324796416. Throughput: 0: 45263.9. Samples: 324949200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 11:27:54,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:27:56,846][32415] Updated weights for policy 0, policy_version 19830 (0.0021) [2024-06-10 11:27:59,592][32177] Fps is (10 sec: 47513.0, 60 sec: 45332.2, 300 sec: 44931.0). Total num frames: 325025792. Throughput: 0: 45133.2. Samples: 325078760. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 11:27:59,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:28:00,250][32415] Updated weights for policy 0, policy_version 19840 (0.0027) [2024-06-10 11:28:04,295][32415] Updated weights for policy 0, policy_version 19850 (0.0036) [2024-06-10 11:28:04,592][32177] Fps is (10 sec: 44236.4, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 325238784. Throughput: 0: 44912.7. Samples: 325340880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 11:28:04,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:28:07,872][32415] Updated weights for policy 0, policy_version 19860 (0.0037) [2024-06-10 11:28:09,592][32177] Fps is (10 sec: 45875.0, 60 sec: 45328.9, 300 sec: 44875.5). Total num frames: 325484544. Throughput: 0: 45067.4. Samples: 325615180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 11:28:09,596][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:28:11,437][32415] Updated weights for policy 0, policy_version 19870 (0.0037) [2024-06-10 11:28:14,592][32177] Fps is (10 sec: 44236.9, 60 sec: 45055.9, 300 sec: 44876.1). Total num frames: 325681152. Throughput: 0: 45198.5. Samples: 325756380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 11:28:14,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:28:14,924][32394] Signal inference workers to stop experience collection... (4750 times) [2024-06-10 11:28:14,924][32394] Signal inference workers to resume experience collection... 
(4750 times) [2024-06-10 11:28:14,928][32415] Updated weights for policy 0, policy_version 19880 (0.0025) [2024-06-10 11:28:14,941][32415] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-06-10 11:28:14,941][32415] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-06-10 11:28:18,843][32415] Updated weights for policy 0, policy_version 19890 (0.0031) [2024-06-10 11:28:19,592][32177] Fps is (10 sec: 42598.5, 60 sec: 44782.8, 300 sec: 44931.9). Total num frames: 325910528. Throughput: 0: 45080.5. Samples: 326017620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 11:28:19,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:28:21,895][32415] Updated weights for policy 0, policy_version 19900 (0.0026) [2024-06-10 11:28:24,592][32177] Fps is (10 sec: 45875.8, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 326139904. Throughput: 0: 45099.2. Samples: 326291860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:28:24,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:28:26,046][32415] Updated weights for policy 0, policy_version 19910 (0.0031) [2024-06-10 11:28:29,294][32415] Updated weights for policy 0, policy_version 19920 (0.0033) [2024-06-10 11:28:29,592][32177] Fps is (10 sec: 45875.9, 60 sec: 45602.1, 300 sec: 44931.0). Total num frames: 326369280. Throughput: 0: 45089.5. Samples: 326428440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:28:29,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:28:33,422][32415] Updated weights for policy 0, policy_version 19930 (0.0045) [2024-06-10 11:28:34,596][32177] Fps is (10 sec: 45855.4, 60 sec: 45326.0, 300 sec: 44985.9). Total num frames: 326598656. Throughput: 0: 45045.0. Samples: 326699280. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 11:28:34,597][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:28:34,618][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000019934_326598656.pth... [2024-06-10 11:28:34,682][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000019277_315834368.pth [2024-06-10 11:28:36,800][32415] Updated weights for policy 0, policy_version 19940 (0.0032) [2024-06-10 11:28:39,592][32177] Fps is (10 sec: 47513.0, 60 sec: 45602.2, 300 sec: 44931.0). Total num frames: 326844416. Throughput: 0: 44785.3. Samples: 326964540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 11:28:39,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:28:40,697][32415] Updated weights for policy 0, policy_version 19950 (0.0046) [2024-06-10 11:28:44,034][32415] Updated weights for policy 0, policy_version 19960 (0.0039) [2024-06-10 11:28:44,592][32177] Fps is (10 sec: 44256.0, 60 sec: 45056.1, 300 sec: 44931.0). Total num frames: 327041024. Throughput: 0: 44944.1. Samples: 327101240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 11:28:44,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:28:48,090][32415] Updated weights for policy 0, policy_version 19970 (0.0026) [2024-06-10 11:28:49,592][32177] Fps is (10 sec: 42598.8, 60 sec: 45329.0, 300 sec: 44931.1). Total num frames: 327270400. Throughput: 0: 45304.6. Samples: 327379580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 11:28:49,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:28:51,121][32415] Updated weights for policy 0, policy_version 19980 (0.0035) [2024-06-10 11:28:54,596][32177] Fps is (10 sec: 44217.8, 60 sec: 44779.8, 300 sec: 44819.3). Total num frames: 327483392. Throughput: 0: 44927.0. Samples: 327637080. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 11:28:54,597][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:28:55,141][32415] Updated weights for policy 0, policy_version 19990 (0.0030) [2024-06-10 11:28:58,739][32415] Updated weights for policy 0, policy_version 20000 (0.0027) [2024-06-10 11:28:59,591][32177] Fps is (10 sec: 42598.8, 60 sec: 44510.0, 300 sec: 44875.6). Total num frames: 327696384. Throughput: 0: 44866.4. Samples: 327775360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 11:28:59,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:29:02,623][32415] Updated weights for policy 0, policy_version 20010 (0.0029) [2024-06-10 11:29:04,592][32177] Fps is (10 sec: 44255.2, 60 sec: 44783.0, 300 sec: 44931.0). Total num frames: 327925760. Throughput: 0: 44889.3. Samples: 328037640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 11:29:04,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:29:06,117][32415] Updated weights for policy 0, policy_version 20020 (0.0031) [2024-06-10 11:29:09,592][32177] Fps is (10 sec: 45873.9, 60 sec: 44509.8, 300 sec: 44931.0). Total num frames: 328155136. Throughput: 0: 44869.6. Samples: 328311000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 11:29:09,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:29:09,907][32415] Updated weights for policy 0, policy_version 20030 (0.0035) [2024-06-10 11:29:13,346][32415] Updated weights for policy 0, policy_version 20040 (0.0027) [2024-06-10 11:29:14,592][32177] Fps is (10 sec: 49152.0, 60 sec: 45602.1, 300 sec: 44986.6). Total num frames: 328417280. Throughput: 0: 44941.6. Samples: 328450820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 11:29:14,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:29:17,223][32415] Updated weights for policy 0, policy_version 20050 (0.0028) [2024-06-10 11:29:19,592][32177] Fps is (10 sec: 44237.2, 60 sec: 44783.0, 300 sec: 44931.0). 
Total num frames: 328597504. Throughput: 0: 44861.6. Samples: 328717860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 11:29:19,596][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:29:20,364][32415] Updated weights for policy 0, policy_version 20060 (0.0041) [2024-06-10 11:29:21,224][32394] Signal inference workers to stop experience collection... (4800 times) [2024-06-10 11:29:21,225][32394] Signal inference workers to resume experience collection... (4800 times) [2024-06-10 11:29:21,236][32415] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-06-10 11:29:21,237][32415] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-06-10 11:29:24,302][32415] Updated weights for policy 0, policy_version 20070 (0.0027) [2024-06-10 11:29:24,592][32177] Fps is (10 sec: 40960.6, 60 sec: 44783.0, 300 sec: 44931.0). Total num frames: 328826880. Throughput: 0: 44931.3. Samples: 328986440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 11:29:24,592][32177] Avg episode reward: [(0, '0.297')] [2024-06-10 11:29:24,642][32394] Saving new best policy, reward=0.297! [2024-06-10 11:29:27,856][32415] Updated weights for policy 0, policy_version 20080 (0.0045) [2024-06-10 11:29:29,592][32177] Fps is (10 sec: 47514.3, 60 sec: 45056.0, 300 sec: 44931.1). Total num frames: 329072640. Throughput: 0: 44765.8. Samples: 329115700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 11:29:29,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:29:31,754][32415] Updated weights for policy 0, policy_version 20090 (0.0038) [2024-06-10 11:29:34,592][32177] Fps is (10 sec: 44236.7, 60 sec: 44513.1, 300 sec: 44875.5). Total num frames: 329269248. Throughput: 0: 44694.7. Samples: 329390840. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 11:29:34,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:29:35,300][32415] Updated weights for policy 0, policy_version 20100 (0.0023) [2024-06-10 11:29:39,377][32415] Updated weights for policy 0, policy_version 20110 (0.0043) [2024-06-10 11:29:39,592][32177] Fps is (10 sec: 42596.9, 60 sec: 44236.7, 300 sec: 44875.5). Total num frames: 329498624. Throughput: 0: 45037.8. Samples: 329663600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 11:29:39,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:29:42,492][32415] Updated weights for policy 0, policy_version 20120 (0.0033) [2024-06-10 11:29:44,592][32177] Fps is (10 sec: 47512.9, 60 sec: 45055.9, 300 sec: 44986.6). Total num frames: 329744384. Throughput: 0: 44896.2. Samples: 329795700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-10 11:29:44,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:29:46,523][32415] Updated weights for policy 0, policy_version 20130 (0.0038) [2024-06-10 11:29:49,596][32177] Fps is (10 sec: 45856.9, 60 sec: 44779.7, 300 sec: 44930.4). Total num frames: 329957376. Throughput: 0: 44992.7. Samples: 330062500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-10 11:29:49,597][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:29:49,723][32415] Updated weights for policy 0, policy_version 20140 (0.0026) [2024-06-10 11:29:53,594][32415] Updated weights for policy 0, policy_version 20150 (0.0034) [2024-06-10 11:29:54,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44786.0, 300 sec: 44875.5). Total num frames: 330170368. Throughput: 0: 44911.1. Samples: 330332000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-10 11:29:54,593][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:29:57,014][32415] Updated weights for policy 0, policy_version 20160 (0.0024) [2024-06-10 11:29:59,592][32177] Fps is (10 sec: 44255.6, 60 sec: 45055.9, 300 sec: 44820.6). 
Total num frames: 330399744. Throughput: 0: 44624.1. Samples: 330458900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 11:29:59,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:30:00,940][32415] Updated weights for policy 0, policy_version 20170 (0.0036) [2024-06-10 11:30:04,536][32415] Updated weights for policy 0, policy_version 20180 (0.0045) [2024-06-10 11:30:04,592][32177] Fps is (10 sec: 45875.4, 60 sec: 45056.0, 300 sec: 44931.5). Total num frames: 330629120. Throughput: 0: 44796.0. Samples: 330733680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 11:30:04,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:30:08,737][32415] Updated weights for policy 0, policy_version 20190 (0.0031) [2024-06-10 11:30:09,591][32177] Fps is (10 sec: 42599.0, 60 sec: 44510.1, 300 sec: 44820.0). Total num frames: 330825728. Throughput: 0: 44830.3. Samples: 331003800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 11:30:09,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:30:11,658][32415] Updated weights for policy 0, policy_version 20200 (0.0031) [2024-06-10 11:30:14,592][32177] Fps is (10 sec: 44237.3, 60 sec: 44236.9, 300 sec: 44764.5). Total num frames: 331071488. Throughput: 0: 44669.3. Samples: 331125820. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-10 11:30:14,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:30:16,117][32415] Updated weights for policy 0, policy_version 20210 (0.0027) [2024-06-10 11:30:19,247][32415] Updated weights for policy 0, policy_version 20220 (0.0031) [2024-06-10 11:30:19,592][32177] Fps is (10 sec: 49151.5, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 331317248. Throughput: 0: 44649.7. Samples: 331400080. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-10 11:30:19,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:30:23,211][32415] Updated weights for policy 0, policy_version 20230 (0.0030) [2024-06-10 11:30:24,592][32177] Fps is (10 sec: 42598.0, 60 sec: 44509.7, 300 sec: 44820.0). Total num frames: 331497472. Throughput: 0: 44497.5. Samples: 331665980. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-10 11:30:24,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:30:26,355][32415] Updated weights for policy 0, policy_version 20240 (0.0030) [2024-06-10 11:30:29,596][32177] Fps is (10 sec: 42580.0, 60 sec: 44506.6, 300 sec: 44819.3). Total num frames: 331743232. Throughput: 0: 44433.2. Samples: 331795380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 11:30:29,597][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:30:30,381][32415] Updated weights for policy 0, policy_version 20250 (0.0040) [2024-06-10 11:30:33,760][32394] Signal inference workers to stop experience collection... (4850 times) [2024-06-10 11:30:33,779][32415] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-06-10 11:30:33,812][32394] Signal inference workers to resume experience collection... (4850 times) [2024-06-10 11:30:33,812][32415] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-06-10 11:30:33,814][32415] Updated weights for policy 0, policy_version 20260 (0.0032) [2024-06-10 11:30:34,594][32177] Fps is (10 sec: 49141.5, 60 sec: 45327.3, 300 sec: 44986.2). Total num frames: 331988992. Throughput: 0: 44644.2. Samples: 332071400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 11:30:34,594][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:30:34,617][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000020263_331988992.pth... 
[2024-06-10 11:30:34,673][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000019605_321208320.pth [2024-06-10 11:30:37,877][32415] Updated weights for policy 0, policy_version 20270 (0.0040) [2024-06-10 11:30:39,592][32177] Fps is (10 sec: 42617.0, 60 sec: 44510.1, 300 sec: 44875.5). Total num frames: 332169216. Throughput: 0: 44776.7. Samples: 332346940. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-10 11:30:39,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:30:40,889][32415] Updated weights for policy 0, policy_version 20280 (0.0031) [2024-06-10 11:30:44,596][32177] Fps is (10 sec: 40951.6, 60 sec: 44233.7, 300 sec: 44763.8). Total num frames: 332398592. Throughput: 0: 44756.2. Samples: 332473120. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-10 11:30:44,597][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:30:45,172][32415] Updated weights for policy 0, policy_version 20290 (0.0031) [2024-06-10 11:30:48,286][32415] Updated weights for policy 0, policy_version 20300 (0.0028) [2024-06-10 11:30:49,592][32177] Fps is (10 sec: 49151.1, 60 sec: 45059.1, 300 sec: 44986.5). Total num frames: 332660736. Throughput: 0: 44556.9. Samples: 332738740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-10 11:30:49,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:30:52,529][32415] Updated weights for policy 0, policy_version 20310 (0.0031) [2024-06-10 11:30:54,592][32177] Fps is (10 sec: 44255.2, 60 sec: 44509.9, 300 sec: 44819.9). Total num frames: 332840960. Throughput: 0: 44726.8. Samples: 333016520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 11:30:54,592][32177] Avg episode reward: [(0, '0.268')] [2024-06-10 11:30:55,589][32415] Updated weights for policy 0, policy_version 20320 (0.0040) [2024-06-10 11:30:59,592][32177] Fps is (10 sec: 40960.3, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 333070336. Throughput: 0: 44827.1. Samples: 333143040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 11:30:59,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:30:59,638][32415] Updated weights for policy 0, policy_version 20330 (0.0049) [2024-06-10 11:31:02,954][32415] Updated weights for policy 0, policy_version 20340 (0.0041) [2024-06-10 11:31:04,592][32177] Fps is (10 sec: 49152.3, 60 sec: 45056.0, 300 sec: 44986.6). Total num frames: 333332480. Throughput: 0: 44659.9. Samples: 333409780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 11:31:04,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:31:07,069][32415] Updated weights for policy 0, policy_version 20350 (0.0027) [2024-06-10 11:31:09,592][32177] Fps is (10 sec: 45875.6, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 333529088. Throughput: 0: 44942.8. Samples: 333688400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 11:31:09,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:31:10,053][32415] Updated weights for policy 0, policy_version 20360 (0.0034) [2024-06-10 11:31:14,494][32415] Updated weights for policy 0, policy_version 20370 (0.0030) [2024-06-10 11:31:14,592][32177] Fps is (10 sec: 40960.5, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 333742080. Throughput: 0: 44898.1. Samples: 333815600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 11:31:14,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:31:17,533][32415] Updated weights for policy 0, policy_version 20380 (0.0022) [2024-06-10 11:31:19,592][32177] Fps is (10 sec: 47513.7, 60 sec: 44783.0, 300 sec: 44876.2). Total num frames: 334004224. Throughput: 0: 44625.4. Samples: 334079440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 11:31:19,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:31:21,852][32415] Updated weights for policy 0, policy_version 20390 (0.0036) [2024-06-10 11:31:24,592][32177] Fps is (10 sec: 45874.6, 60 sec: 45056.0, 300 sec: 44931.0). 
Total num frames: 334200832. Throughput: 0: 44781.2. Samples: 334362100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 11:31:24,593][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:31:24,811][32415] Updated weights for policy 0, policy_version 20400 (0.0037) [2024-06-10 11:31:28,904][32415] Updated weights for policy 0, policy_version 20410 (0.0037) [2024-06-10 11:31:29,592][32177] Fps is (10 sec: 40959.5, 60 sec: 44513.0, 300 sec: 44875.5). Total num frames: 334413824. Throughput: 0: 44757.5. Samples: 334487020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 11:31:29,592][32177] Avg episode reward: [(0, '0.292')] [2024-06-10 11:31:31,996][32415] Updated weights for policy 0, policy_version 20420 (0.0033) [2024-06-10 11:31:34,592][32177] Fps is (10 sec: 45875.5, 60 sec: 44511.5, 300 sec: 44931.0). Total num frames: 334659584. Throughput: 0: 44929.9. Samples: 334760580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 11:31:34,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:31:36,321][32415] Updated weights for policy 0, policy_version 20430 (0.0032) [2024-06-10 11:31:39,253][32415] Updated weights for policy 0, policy_version 20440 (0.0035) [2024-06-10 11:31:39,592][32177] Fps is (10 sec: 49151.6, 60 sec: 45602.0, 300 sec: 45042.1). Total num frames: 334905344. Throughput: 0: 44817.8. Samples: 335033320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 11:31:39,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:31:43,539][32415] Updated weights for policy 0, policy_version 20450 (0.0046) [2024-06-10 11:31:44,592][32177] Fps is (10 sec: 42598.8, 60 sec: 44786.2, 300 sec: 44820.0). Total num frames: 335085568. Throughput: 0: 44957.5. Samples: 335166120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 11:31:44,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:31:46,610][32394] Signal inference workers to stop experience collection... 
(4900 times) [2024-06-10 11:31:46,662][32415] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-06-10 11:31:46,668][32394] Signal inference workers to resume experience collection... (4900 times) [2024-06-10 11:31:46,677][32415] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-06-10 11:31:46,806][32415] Updated weights for policy 0, policy_version 20460 (0.0028) [2024-06-10 11:31:49,592][32177] Fps is (10 sec: 40960.0, 60 sec: 44236.8, 300 sec: 44819.9). Total num frames: 335314944. Throughput: 0: 44818.6. Samples: 335426620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 11:31:49,593][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:31:51,358][32415] Updated weights for policy 0, policy_version 20470 (0.0026) [2024-06-10 11:31:54,035][32415] Updated weights for policy 0, policy_version 20480 (0.0033) [2024-06-10 11:31:54,592][32177] Fps is (10 sec: 47513.1, 60 sec: 45329.2, 300 sec: 44931.7). Total num frames: 335560704. Throughput: 0: 44582.6. Samples: 335694620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 11:31:54,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:31:58,679][32415] Updated weights for policy 0, policy_version 20490 (0.0033) [2024-06-10 11:31:59,592][32177] Fps is (10 sec: 44237.7, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 335757312. Throughput: 0: 44895.6. Samples: 335835900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 11:31:59,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:32:01,241][32415] Updated weights for policy 0, policy_version 20500 (0.0030) [2024-06-10 11:32:04,592][32177] Fps is (10 sec: 44236.4, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 336003072. Throughput: 0: 44888.3. Samples: 336099420. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 11:32:04,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:32:05,784][32415] Updated weights for policy 0, policy_version 20510 (0.0029) [2024-06-10 11:32:08,448][32415] Updated weights for policy 0, policy_version 20520 (0.0034) [2024-06-10 11:32:09,592][32177] Fps is (10 sec: 49151.9, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 336248832. Throughput: 0: 44787.2. Samples: 336377520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 11:32:09,592][32177] Avg episode reward: [(0, '0.294')] [2024-06-10 11:32:13,057][32415] Updated weights for policy 0, policy_version 20530 (0.0033) [2024-06-10 11:32:14,592][32177] Fps is (10 sec: 44236.9, 60 sec: 45055.9, 300 sec: 44819.9). Total num frames: 336445440. Throughput: 0: 45071.1. Samples: 336515220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 11:32:14,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:32:15,948][32415] Updated weights for policy 0, policy_version 20540 (0.0050) [2024-06-10 11:32:19,592][32177] Fps is (10 sec: 40959.6, 60 sec: 44236.7, 300 sec: 44819.9). Total num frames: 336658432. Throughput: 0: 44914.6. Samples: 336781740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-10 11:32:19,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:32:20,315][32415] Updated weights for policy 0, policy_version 20550 (0.0023) [2024-06-10 11:32:23,454][32415] Updated weights for policy 0, policy_version 20560 (0.0031) [2024-06-10 11:32:24,592][32177] Fps is (10 sec: 47513.8, 60 sec: 45329.1, 300 sec: 45042.1). Total num frames: 336920576. Throughput: 0: 44736.1. Samples: 337046440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-10 11:32:24,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:32:27,606][32415] Updated weights for policy 0, policy_version 20570 (0.0030) [2024-06-10 11:32:29,591][32177] Fps is (10 sec: 45876.0, 60 sec: 45056.1, 300 sec: 44875.5). 
Total num frames: 337117184. Throughput: 0: 45046.3. Samples: 337193200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-10 11:32:29,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:32:30,583][32415] Updated weights for policy 0, policy_version 20580 (0.0022) [2024-06-10 11:32:34,592][32177] Fps is (10 sec: 39321.1, 60 sec: 44236.7, 300 sec: 44764.4). Total num frames: 337313792. Throughput: 0: 45005.3. Samples: 337451860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 11:32:34,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:32:34,656][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000020589_337330176.pth... [2024-06-10 11:32:34,711][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000019934_326598656.pth [2024-06-10 11:32:34,954][32415] Updated weights for policy 0, policy_version 20590 (0.0032) [2024-06-10 11:32:37,736][32415] Updated weights for policy 0, policy_version 20600 (0.0031) [2024-06-10 11:32:39,592][32177] Fps is (10 sec: 47512.5, 60 sec: 44783.0, 300 sec: 44931.0). Total num frames: 337592320. Throughput: 0: 45123.5. Samples: 337725180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 11:32:39,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:32:42,116][32415] Updated weights for policy 0, policy_version 20610 (0.0040) [2024-06-10 11:32:44,591][32177] Fps is (10 sec: 47515.0, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 337788928. Throughput: 0: 45034.3. Samples: 337862440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 11:32:44,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:32:45,235][32415] Updated weights for policy 0, policy_version 20620 (0.0026) [2024-06-10 11:32:49,592][32177] Fps is (10 sec: 39321.4, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 337985536. Throughput: 0: 45055.1. Samples: 338126900. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 11:32:49,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:32:49,792][32415] Updated weights for policy 0, policy_version 20630 (0.0029) [2024-06-10 11:32:52,642][32415] Updated weights for policy 0, policy_version 20640 (0.0038) [2024-06-10 11:32:54,596][32177] Fps is (10 sec: 47492.7, 60 sec: 45052.8, 300 sec: 44874.9). Total num frames: 338264064. Throughput: 0: 44773.5. Samples: 338392520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 11:32:54,597][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:32:56,778][32415] Updated weights for policy 0, policy_version 20650 (0.0029) [2024-06-10 11:32:59,592][32177] Fps is (10 sec: 47514.7, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 338460672. Throughput: 0: 45068.2. Samples: 338543280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 11:32:59,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:32:59,782][32415] Updated weights for policy 0, policy_version 20660 (0.0030) [2024-06-10 11:33:04,003][32415] Updated weights for policy 0, policy_version 20670 (0.0044) [2024-06-10 11:33:04,592][32177] Fps is (10 sec: 40977.7, 60 sec: 44510.0, 300 sec: 44708.9). Total num frames: 338673664. Throughput: 0: 45090.3. Samples: 338810800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-10 11:33:04,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:33:06,842][32415] Updated weights for policy 0, policy_version 20680 (0.0045) [2024-06-10 11:33:09,592][32177] Fps is (10 sec: 47512.4, 60 sec: 44782.8, 300 sec: 44931.0). Total num frames: 338935808. Throughput: 0: 45074.1. Samples: 339074780. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-10 11:33:09,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:33:11,448][32415] Updated weights for policy 0, policy_version 20690 (0.0044) [2024-06-10 11:33:12,436][32394] Signal inference workers to stop experience collection... 
(4950 times) [2024-06-10 11:33:12,436][32394] Signal inference workers to resume experience collection... (4950 times) [2024-06-10 11:33:12,467][32415] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-06-10 11:33:12,467][32415] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-06-10 11:33:14,488][32415] Updated weights for policy 0, policy_version 20700 (0.0026) [2024-06-10 11:33:14,592][32177] Fps is (10 sec: 47513.5, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 339148800. Throughput: 0: 44849.2. Samples: 339211420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-10 11:33:14,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:33:18,894][32415] Updated weights for policy 0, policy_version 20710 (0.0023) [2024-06-10 11:33:19,592][32177] Fps is (10 sec: 42599.2, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 339361792. Throughput: 0: 45023.7. Samples: 339477920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-10 11:33:19,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:33:21,884][32415] Updated weights for policy 0, policy_version 20720 (0.0043) [2024-06-10 11:33:24,592][32177] Fps is (10 sec: 45875.1, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 339607552. Throughput: 0: 44835.2. Samples: 339742760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-10 11:33:24,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:33:25,850][32415] Updated weights for policy 0, policy_version 20730 (0.0030) [2024-06-10 11:33:29,249][32415] Updated weights for policy 0, policy_version 20740 (0.0029) [2024-06-10 11:33:29,592][32177] Fps is (10 sec: 47514.0, 60 sec: 45329.0, 300 sec: 44876.2). Total num frames: 339836928. Throughput: 0: 44959.5. Samples: 339885620. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-10 11:33:29,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:33:33,351][32415] Updated weights for policy 0, policy_version 20750 (0.0032) [2024-06-10 11:33:34,592][32177] Fps is (10 sec: 44236.3, 60 sec: 45602.2, 300 sec: 44764.4). Total num frames: 340049920. Throughput: 0: 45101.8. Samples: 340156480. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-10 11:33:34,592][32177] Avg episode reward: [(0, '0.295')] [2024-06-10 11:33:36,386][32415] Updated weights for policy 0, policy_version 20760 (0.0027) [2024-06-10 11:33:39,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44510.0, 300 sec: 44820.0). Total num frames: 340262912. Throughput: 0: 45012.3. Samples: 340417880. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-10 11:33:39,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:33:40,674][32415] Updated weights for policy 0, policy_version 20770 (0.0033) [2024-06-10 11:33:43,783][32415] Updated weights for policy 0, policy_version 20780 (0.0034) [2024-06-10 11:33:44,592][32177] Fps is (10 sec: 44237.2, 60 sec: 45055.9, 300 sec: 44820.0). Total num frames: 340492288. Throughput: 0: 44549.2. Samples: 340548000. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-10 11:33:44,592][32177] Avg episode reward: [(0, '0.294')] [2024-06-10 11:33:48,052][32415] Updated weights for policy 0, policy_version 20790 (0.0039) [2024-06-10 11:33:49,592][32177] Fps is (10 sec: 45875.0, 60 sec: 45602.2, 300 sec: 44876.1). Total num frames: 340721664. Throughput: 0: 44702.6. Samples: 340822420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-10 11:33:49,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:33:50,965][32415] Updated weights for policy 0, policy_version 20800 (0.0031) [2024-06-10 11:33:54,592][32177] Fps is (10 sec: 42597.7, 60 sec: 44239.8, 300 sec: 44819.9). Total num frames: 340918272. Throughput: 0: 44763.1. Samples: 341089120. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-10 11:33:54,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:33:55,407][32415] Updated weights for policy 0, policy_version 20810 (0.0027) [2024-06-10 11:33:58,520][32415] Updated weights for policy 0, policy_version 20820 (0.0043) [2024-06-10 11:33:59,592][32177] Fps is (10 sec: 47513.7, 60 sec: 45602.1, 300 sec: 44986.6). Total num frames: 341196800. Throughput: 0: 44669.8. Samples: 341221560. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-10 11:33:59,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:34:02,423][32415] Updated weights for policy 0, policy_version 20830 (0.0026) [2024-06-10 11:34:04,592][32177] Fps is (10 sec: 47514.8, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 341393408. Throughput: 0: 44920.1. Samples: 341499320. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-10 11:34:04,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:34:05,545][32415] Updated weights for policy 0, policy_version 20840 (0.0024) [2024-06-10 11:34:09,596][32177] Fps is (10 sec: 39304.5, 60 sec: 44233.7, 300 sec: 44652.7). Total num frames: 341590016. Throughput: 0: 44943.7. Samples: 341765420. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-10 11:34:09,597][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:34:09,870][32415] Updated weights for policy 0, policy_version 20850 (0.0026) [2024-06-10 11:34:12,864][32415] Updated weights for policy 0, policy_version 20860 (0.0024) [2024-06-10 11:34:14,592][32177] Fps is (10 sec: 45874.9, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 341852160. Throughput: 0: 44748.4. Samples: 341899300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 11:34:14,592][32177] Avg episode reward: [(0, '0.296')] [2024-06-10 11:34:17,093][32415] Updated weights for policy 0, policy_version 20870 (0.0029) [2024-06-10 11:34:19,592][32177] Fps is (10 sec: 47533.8, 60 sec: 45055.9, 300 sec: 44875.5). 
Total num frames: 342065152. Throughput: 0: 44643.6. Samples: 342165440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 11:34:19,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:34:20,247][32415] Updated weights for policy 0, policy_version 20880 (0.0025) [2024-06-10 11:34:24,564][32415] Updated weights for policy 0, policy_version 20890 (0.0024) [2024-06-10 11:34:24,592][32177] Fps is (10 sec: 40960.1, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 342261760. Throughput: 0: 44904.9. Samples: 342438600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 11:34:24,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:34:27,845][32415] Updated weights for policy 0, policy_version 20900 (0.0035) [2024-06-10 11:34:29,592][32177] Fps is (10 sec: 45875.3, 60 sec: 44782.8, 300 sec: 44931.0). Total num frames: 342523904. Throughput: 0: 44956.0. Samples: 342571020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:34:29,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:34:31,531][32394] Signal inference workers to stop experience collection... (5000 times) [2024-06-10 11:34:31,532][32394] Signal inference workers to resume experience collection... (5000 times) [2024-06-10 11:34:31,561][32415] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-06-10 11:34:31,561][32415] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-06-10 11:34:31,662][32415] Updated weights for policy 0, policy_version 20910 (0.0040) [2024-06-10 11:34:34,592][32177] Fps is (10 sec: 45874.9, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 342720512. Throughput: 0: 44785.7. Samples: 342837780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:34:34,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:34:34,609][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000020918_342720512.pth... 
[2024-06-10 11:34:34,681][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000020263_331988992.pth [2024-06-10 11:34:34,981][32415] Updated weights for policy 0, policy_version 20920 (0.0028) [2024-06-10 11:34:38,990][32415] Updated weights for policy 0, policy_version 20930 (0.0030) [2024-06-10 11:34:39,592][32177] Fps is (10 sec: 42598.6, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 342949888. Throughput: 0: 44886.4. Samples: 343109000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:34:39,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:34:42,122][32415] Updated weights for policy 0, policy_version 20940 (0.0037) [2024-06-10 11:34:44,592][32177] Fps is (10 sec: 47513.2, 60 sec: 45055.9, 300 sec: 44876.1). Total num frames: 343195648. Throughput: 0: 44847.8. Samples: 343239720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 11:34:44,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:34:46,447][32415] Updated weights for policy 0, policy_version 20950 (0.0035) [2024-06-10 11:34:49,321][32415] Updated weights for policy 0, policy_version 20960 (0.0022) [2024-06-10 11:34:49,592][32177] Fps is (10 sec: 45874.5, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 343408640. Throughput: 0: 44727.8. Samples: 343512080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 11:34:49,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:34:53,571][32415] Updated weights for policy 0, policy_version 20970 (0.0038) [2024-06-10 11:34:54,592][32177] Fps is (10 sec: 42598.5, 60 sec: 45056.0, 300 sec: 44819.9). Total num frames: 343621632. Throughput: 0: 44927.8. Samples: 343786980. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 11:34:54,594][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:34:56,947][32415] Updated weights for policy 0, policy_version 20980 (0.0031) [2024-06-10 11:34:59,592][32177] Fps is (10 sec: 44237.1, 60 sec: 44236.7, 300 sec: 44820.0). Total num frames: 343851008. Throughput: 0: 44941.7. Samples: 343921680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 11:34:59,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:35:00,649][32415] Updated weights for policy 0, policy_version 20990 (0.0030) [2024-06-10 11:35:04,049][32415] Updated weights for policy 0, policy_version 21000 (0.0031) [2024-06-10 11:35:04,592][32177] Fps is (10 sec: 45876.1, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 344080384. Throughput: 0: 44838.8. Samples: 344183180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 11:35:04,592][32177] Avg episode reward: [(0, '0.293')] [2024-06-10 11:35:08,095][32415] Updated weights for policy 0, policy_version 21010 (0.0020) [2024-06-10 11:35:09,592][32177] Fps is (10 sec: 45876.0, 60 sec: 45332.4, 300 sec: 44875.5). Total num frames: 344309760. Throughput: 0: 44719.2. Samples: 344450960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 11:35:09,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:35:11,222][32415] Updated weights for policy 0, policy_version 21020 (0.0032) [2024-06-10 11:35:14,592][32177] Fps is (10 sec: 42597.9, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 344506368. Throughput: 0: 44741.3. Samples: 344584380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 11:35:14,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:35:15,581][32415] Updated weights for policy 0, policy_version 21030 (0.0029) [2024-06-10 11:35:18,660][32415] Updated weights for policy 0, policy_version 21040 (0.0043) [2024-06-10 11:35:19,592][32177] Fps is (10 sec: 44236.3, 60 sec: 44783.0, 300 sec: 44931.0). 
Total num frames: 344752128. Throughput: 0: 44709.8. Samples: 344849720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 11:35:19,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:35:22,598][32415] Updated weights for policy 0, policy_version 21050 (0.0033) [2024-06-10 11:35:24,592][32177] Fps is (10 sec: 47513.3, 60 sec: 45329.0, 300 sec: 44876.1). Total num frames: 344981504. Throughput: 0: 44823.4. Samples: 345126060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 11:35:24,592][32177] Avg episode reward: [(0, '0.272')] [2024-06-10 11:35:26,166][32415] Updated weights for policy 0, policy_version 21060 (0.0025) [2024-06-10 11:35:29,592][32177] Fps is (10 sec: 42598.6, 60 sec: 44236.8, 300 sec: 44709.2). Total num frames: 345178112. Throughput: 0: 44908.1. Samples: 345260580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-10 11:35:29,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:35:29,899][32415] Updated weights for policy 0, policy_version 21070 (0.0024) [2024-06-10 11:35:33,253][32415] Updated weights for policy 0, policy_version 21080 (0.0038) [2024-06-10 11:35:34,592][32177] Fps is (10 sec: 42599.1, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 345407488. Throughput: 0: 44722.0. Samples: 345524560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-10 11:35:34,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:35:37,343][32415] Updated weights for policy 0, policy_version 21090 (0.0036) [2024-06-10 11:35:39,592][32177] Fps is (10 sec: 47513.7, 60 sec: 45056.0, 300 sec: 44931.7). Total num frames: 345653248. Throughput: 0: 44540.6. Samples: 345791300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-10 11:35:39,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:35:40,302][32415] Updated weights for policy 0, policy_version 21100 (0.0029) [2024-06-10 11:35:44,592][32177] Fps is (10 sec: 44236.7, 60 sec: 44236.9, 300 sec: 44708.9). 
Total num frames: 345849856. Throughput: 0: 44545.0. Samples: 345926200. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-10 11:35:44,592][32177] Avg episode reward: [(0, '0.293')] [2024-06-10 11:35:44,702][32415] Updated weights for policy 0, policy_version 21110 (0.0037) [2024-06-10 11:35:47,443][32415] Updated weights for policy 0, policy_version 21120 (0.0029) [2024-06-10 11:35:49,592][32177] Fps is (10 sec: 44236.0, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 346095616. Throughput: 0: 44786.0. Samples: 346198560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-10 11:35:49,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:35:51,877][32415] Updated weights for policy 0, policy_version 21130 (0.0038) [2024-06-10 11:35:53,293][32394] Signal inference workers to stop experience collection... (5050 times) [2024-06-10 11:35:53,328][32415] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-06-10 11:35:53,354][32394] Signal inference workers to resume experience collection... (5050 times) [2024-06-10 11:35:53,355][32415] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-06-10 11:35:54,592][32177] Fps is (10 sec: 49151.2, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 346341376. Throughput: 0: 44894.5. Samples: 346471220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-10 11:35:54,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:35:54,982][32415] Updated weights for policy 0, policy_version 21140 (0.0021) [2024-06-10 11:35:59,026][32415] Updated weights for policy 0, policy_version 21150 (0.0044) [2024-06-10 11:35:59,596][32177] Fps is (10 sec: 42580.9, 60 sec: 44506.8, 300 sec: 44708.2). Total num frames: 346521600. Throughput: 0: 44870.9. Samples: 346603760. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-10 11:35:59,597][32177] Avg episode reward: [(0, '0.292')] [2024-06-10 11:36:02,675][32415] Updated weights for policy 0, policy_version 21160 (0.0031) [2024-06-10 11:36:04,592][32177] Fps is (10 sec: 40960.9, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 346750976. Throughput: 0: 44899.7. Samples: 346870200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-10 11:36:04,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:36:06,409][32415] Updated weights for policy 0, policy_version 21170 (0.0034) [2024-06-10 11:36:09,592][32177] Fps is (10 sec: 47534.1, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 346996736. Throughput: 0: 44545.9. Samples: 347130620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 11:36:09,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:36:09,727][32415] Updated weights for policy 0, policy_version 21180 (0.0026) [2024-06-10 11:36:13,871][32415] Updated weights for policy 0, policy_version 21190 (0.0044) [2024-06-10 11:36:14,596][32177] Fps is (10 sec: 45855.4, 60 sec: 45052.9, 300 sec: 44763.8). Total num frames: 347209728. Throughput: 0: 44574.5. Samples: 347266620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 11:36:14,596][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:36:16,869][32415] Updated weights for policy 0, policy_version 21200 (0.0034) [2024-06-10 11:36:19,592][32177] Fps is (10 sec: 42597.8, 60 sec: 44509.8, 300 sec: 44820.0). Total num frames: 347422720. Throughput: 0: 44831.0. Samples: 347541960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 11:36:19,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:36:21,173][32415] Updated weights for policy 0, policy_version 21210 (0.0033) [2024-06-10 11:36:24,154][32415] Updated weights for policy 0, policy_version 21220 (0.0031) [2024-06-10 11:36:24,592][32177] Fps is (10 sec: 45894.3, 60 sec: 44783.0, 300 sec: 44931.0). 
Total num frames: 347668480. Throughput: 0: 44745.2. Samples: 347804840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-10 11:36:24,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:36:28,571][32415] Updated weights for policy 0, policy_version 21230 (0.0040) [2024-06-10 11:36:29,592][32177] Fps is (10 sec: 45875.9, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 347881472. Throughput: 0: 44782.7. Samples: 347941420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-10 11:36:29,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:36:31,652][32415] Updated weights for policy 0, policy_version 21240 (0.0031) [2024-06-10 11:36:34,592][32177] Fps is (10 sec: 42598.0, 60 sec: 44782.8, 300 sec: 44708.9). Total num frames: 348094464. Throughput: 0: 44740.4. Samples: 348211880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-10 11:36:34,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:36:34,598][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000021246_348094464.pth... [2024-06-10 11:36:34,655][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000020589_337330176.pth [2024-06-10 11:36:35,510][32415] Updated weights for policy 0, policy_version 21250 (0.0036) [2024-06-10 11:36:39,138][32415] Updated weights for policy 0, policy_version 21260 (0.0027) [2024-06-10 11:36:39,592][32177] Fps is (10 sec: 44236.4, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 348323840. Throughput: 0: 44519.2. Samples: 348474580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 11:36:39,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:36:42,823][32415] Updated weights for policy 0, policy_version 21270 (0.0025) [2024-06-10 11:36:44,592][32177] Fps is (10 sec: 45875.4, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 348553216. Throughput: 0: 44748.1. Samples: 348617240. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 11:36:44,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:36:46,350][32415] Updated weights for policy 0, policy_version 21280 (0.0026) [2024-06-10 11:36:49,592][32177] Fps is (10 sec: 45874.6, 60 sec: 44782.9, 300 sec: 44819.9). Total num frames: 348782592. Throughput: 0: 44824.6. Samples: 348887320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 11:36:49,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:36:49,964][32415] Updated weights for policy 0, policy_version 21290 (0.0035) [2024-06-10 11:36:53,560][32415] Updated weights for policy 0, policy_version 21300 (0.0023) [2024-06-10 11:36:54,592][32177] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 348995584. Throughput: 0: 44975.8. Samples: 349154540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 11:36:54,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:36:57,435][32415] Updated weights for policy 0, policy_version 21310 (0.0035) [2024-06-10 11:36:59,592][32177] Fps is (10 sec: 45876.3, 60 sec: 45332.3, 300 sec: 44875.5). Total num frames: 349241344. Throughput: 0: 44901.2. Samples: 349286980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 11:36:59,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:37:01,126][32415] Updated weights for policy 0, policy_version 21320 (0.0037) [2024-06-10 11:37:04,592][32177] Fps is (10 sec: 45876.0, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 349454336. Throughput: 0: 44847.3. Samples: 349560080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 11:37:04,592][32177] Avg episode reward: [(0, '0.296')] [2024-06-10 11:37:04,626][32415] Updated weights for policy 0, policy_version 21330 (0.0035) [2024-06-10 11:37:08,725][32415] Updated weights for policy 0, policy_version 21340 (0.0020) [2024-06-10 11:37:09,592][32177] Fps is (10 sec: 40959.5, 60 sec: 44236.7, 300 sec: 44764.4). 
Total num frames: 349650944. Throughput: 0: 44821.8. Samples: 349821820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 11:37:09,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:37:10,788][32394] Signal inference workers to stop experience collection... (5100 times) [2024-06-10 11:37:10,788][32394] Signal inference workers to resume experience collection... (5100 times) [2024-06-10 11:37:10,807][32415] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-06-10 11:37:10,807][32415] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-06-10 11:37:11,927][32415] Updated weights for policy 0, policy_version 21350 (0.0036) [2024-06-10 11:37:14,592][32177] Fps is (10 sec: 44237.0, 60 sec: 44786.2, 300 sec: 44875.5). Total num frames: 349896704. Throughput: 0: 44580.5. Samples: 349947540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 11:37:14,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:37:15,924][32415] Updated weights for policy 0, policy_version 21360 (0.0029) [2024-06-10 11:37:19,133][32415] Updated weights for policy 0, policy_version 21370 (0.0033) [2024-06-10 11:37:19,592][32177] Fps is (10 sec: 47513.4, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 350126080. Throughput: 0: 44849.4. Samples: 350230100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 11:37:19,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:37:23,090][32415] Updated weights for policy 0, policy_version 21380 (0.0035) [2024-06-10 11:37:24,592][32177] Fps is (10 sec: 44235.9, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 350339072. Throughput: 0: 44997.7. Samples: 350499480. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-10 11:37:24,601][32177] Avg episode reward: [(0, '0.292')] [2024-06-10 11:37:26,462][32415] Updated weights for policy 0, policy_version 21390 (0.0045) [2024-06-10 11:37:29,592][32177] Fps is (10 sec: 45875.8, 60 sec: 45056.0, 300 sec: 44986.6). Total num frames: 350584832. Throughput: 0: 44735.7. Samples: 350630340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-10 11:37:29,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:37:30,661][32415] Updated weights for policy 0, policy_version 21400 (0.0024) [2024-06-10 11:37:33,799][32415] Updated weights for policy 0, policy_version 21410 (0.0027) [2024-06-10 11:37:34,592][32177] Fps is (10 sec: 45875.8, 60 sec: 45056.1, 300 sec: 44764.4). Total num frames: 350797824. Throughput: 0: 44761.1. Samples: 350901560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-10 11:37:34,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:37:37,823][32415] Updated weights for policy 0, policy_version 21420 (0.0042) [2024-06-10 11:37:39,592][32177] Fps is (10 sec: 40959.3, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 350994432. Throughput: 0: 45136.9. Samples: 351185700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-10 11:37:39,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:37:40,797][32415] Updated weights for policy 0, policy_version 21430 (0.0036) [2024-06-10 11:37:44,592][32177] Fps is (10 sec: 45874.8, 60 sec: 45056.0, 300 sec: 44986.6). Total num frames: 351256576. Throughput: 0: 44923.4. Samples: 351308540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-10 11:37:44,593][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:37:45,095][32415] Updated weights for policy 0, policy_version 21440 (0.0034) [2024-06-10 11:37:48,116][32415] Updated weights for policy 0, policy_version 21450 (0.0026) [2024-06-10 11:37:49,592][32177] Fps is (10 sec: 49153.0, 60 sec: 45056.2, 300 sec: 44820.6). 
Total num frames: 351485952. Throughput: 0: 44807.1. Samples: 351576400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-10 11:37:49,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:37:52,539][32415] Updated weights for policy 0, policy_version 21460 (0.0038) [2024-06-10 11:37:54,592][32177] Fps is (10 sec: 42598.8, 60 sec: 44783.1, 300 sec: 44820.0). Total num frames: 351682560. Throughput: 0: 45082.3. Samples: 351850520. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-10 11:37:54,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:37:55,563][32415] Updated weights for policy 0, policy_version 21470 (0.0036) [2024-06-10 11:37:59,592][32177] Fps is (10 sec: 42597.8, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 351911936. Throughput: 0: 45149.6. Samples: 351979280. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-10 11:37:59,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:37:59,713][32415] Updated weights for policy 0, policy_version 21480 (0.0035) [2024-06-10 11:38:03,096][32415] Updated weights for policy 0, policy_version 21490 (0.0040) [2024-06-10 11:38:04,592][32177] Fps is (10 sec: 45875.0, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 352141312. Throughput: 0: 44771.2. Samples: 352244800. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-10 11:38:04,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:38:06,684][32415] Updated weights for policy 0, policy_version 21500 (0.0034) [2024-06-10 11:38:09,592][32177] Fps is (10 sec: 45875.4, 60 sec: 45329.1, 300 sec: 44819.9). Total num frames: 352370688. Throughput: 0: 44888.1. Samples: 352519440. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-10 11:38:09,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:38:10,410][32415] Updated weights for policy 0, policy_version 21510 (0.0040) [2024-06-10 11:38:14,190][32415] Updated weights for policy 0, policy_version 21520 (0.0029) [2024-06-10 11:38:14,592][32177] Fps is (10 sec: 44237.0, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 352583680. Throughput: 0: 44917.3. Samples: 352651620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-10 11:38:14,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:38:17,427][32415] Updated weights for policy 0, policy_version 21530 (0.0039) [2024-06-10 11:38:19,476][32394] Signal inference workers to stop experience collection... (5150 times) [2024-06-10 11:38:19,526][32415] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-06-10 11:38:19,532][32394] Signal inference workers to resume experience collection... (5150 times) [2024-06-10 11:38:19,542][32415] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-06-10 11:38:19,592][32177] Fps is (10 sec: 44237.1, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 352813056. Throughput: 0: 44932.0. Samples: 352923500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-10 11:38:19,592][32177] Avg episode reward: [(0, '0.292')] [2024-06-10 11:38:21,571][32415] Updated weights for policy 0, policy_version 21540 (0.0028) [2024-06-10 11:38:24,594][32177] Fps is (10 sec: 47503.8, 60 sec: 45327.6, 300 sec: 44819.6). Total num frames: 353058816. Throughput: 0: 44590.1. Samples: 353192340. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-10 11:38:24,594][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:38:24,648][32415] Updated weights for policy 0, policy_version 21550 (0.0032) [2024-06-10 11:38:28,511][32415] Updated weights for policy 0, policy_version 21560 (0.0034) [2024-06-10 11:38:29,592][32177] Fps is (10 sec: 44236.4, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 353255424. Throughput: 0: 44796.9. Samples: 353324400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-10 11:38:29,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:38:32,150][32415] Updated weights for policy 0, policy_version 21570 (0.0026) [2024-06-10 11:38:34,592][32177] Fps is (10 sec: 45883.1, 60 sec: 45328.8, 300 sec: 44931.0). Total num frames: 353517568. Throughput: 0: 44862.8. Samples: 353595240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-10 11:38:34,593][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:38:34,598][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000021577_353517568.pth... [2024-06-10 11:38:34,655][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000020918_342720512.pth [2024-06-10 11:38:35,847][32415] Updated weights for policy 0, policy_version 21580 (0.0026) [2024-06-10 11:38:39,596][32177] Fps is (10 sec: 45855.8, 60 sec: 45325.9, 300 sec: 44819.3). Total num frames: 353714176. Throughput: 0: 44677.0. Samples: 353861180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 11:38:39,597][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:38:39,714][32415] Updated weights for policy 0, policy_version 21590 (0.0041) [2024-06-10 11:38:43,012][32415] Updated weights for policy 0, policy_version 21600 (0.0029) [2024-06-10 11:38:44,592][32177] Fps is (10 sec: 39322.6, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 353910784. Throughput: 0: 44832.5. Samples: 353996740. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 11:38:44,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:38:46,593][32415] Updated weights for policy 0, policy_version 21610 (0.0020) [2024-06-10 11:38:49,592][32177] Fps is (10 sec: 45895.3, 60 sec: 44783.0, 300 sec: 44931.1). Total num frames: 354172928. Throughput: 0: 45160.5. Samples: 354277020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 11:38:49,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:38:50,359][32415] Updated weights for policy 0, policy_version 21620 (0.0032) [2024-06-10 11:38:53,656][32415] Updated weights for policy 0, policy_version 21630 (0.0036) [2024-06-10 11:38:54,592][32177] Fps is (10 sec: 47513.3, 60 sec: 45055.9, 300 sec: 44708.9). Total num frames: 354385920. Throughput: 0: 44934.6. Samples: 354541500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-10 11:38:54,592][32177] Avg episode reward: [(0, '0.274')] [2024-06-10 11:38:57,471][32415] Updated weights for policy 0, policy_version 21640 (0.0036) [2024-06-10 11:38:59,592][32177] Fps is (10 sec: 42598.1, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 354598912. Throughput: 0: 45037.3. Samples: 354678300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-10 11:38:59,592][32177] Avg episode reward: [(0, '0.292')] [2024-06-10 11:39:01,314][32415] Updated weights for policy 0, policy_version 21650 (0.0032) [2024-06-10 11:39:04,592][32177] Fps is (10 sec: 47514.3, 60 sec: 45329.1, 300 sec: 44987.2). Total num frames: 354861056. Throughput: 0: 44881.3. Samples: 354943160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-10 11:39:04,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:39:05,213][32415] Updated weights for policy 0, policy_version 21660 (0.0033) [2024-06-10 11:39:08,715][32415] Updated weights for policy 0, policy_version 21670 (0.0034) [2024-06-10 11:39:09,592][32177] Fps is (10 sec: 45874.7, 60 sec: 44782.9, 300 sec: 44764.4). 
Total num frames: 355057664. Throughput: 0: 44959.7. Samples: 355215440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:39:09,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:39:12,385][32415] Updated weights for policy 0, policy_version 21680 (0.0042) [2024-06-10 11:39:14,592][32177] Fps is (10 sec: 42598.6, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 355287040. Throughput: 0: 45100.6. Samples: 355353920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:39:14,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:39:15,957][32415] Updated weights for policy 0, policy_version 21690 (0.0032) [2024-06-10 11:39:19,494][32415] Updated weights for policy 0, policy_version 21700 (0.0035) [2024-06-10 11:39:19,592][32177] Fps is (10 sec: 47514.0, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 355532800. Throughput: 0: 45055.0. Samples: 355622700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:39:19,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:39:22,899][32415] Updated weights for policy 0, policy_version 21710 (0.0024) [2024-06-10 11:39:24,596][32177] Fps is (10 sec: 44217.2, 60 sec: 44508.1, 300 sec: 44763.8). Total num frames: 355729408. Throughput: 0: 45203.5. Samples: 355895340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 11:39:24,597][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:39:27,181][32415] Updated weights for policy 0, policy_version 21720 (0.0038) [2024-06-10 11:39:29,592][32177] Fps is (10 sec: 44236.9, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 355975168. Throughput: 0: 45120.5. Samples: 356027160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 11:39:29,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:39:30,602][32415] Updated weights for policy 0, policy_version 21730 (0.0037) [2024-06-10 11:39:31,480][32394] Signal inference workers to stop experience collection... 
(5200 times) [2024-06-10 11:39:31,480][32394] Signal inference workers to resume experience collection... (5200 times) [2024-06-10 11:39:31,494][32415] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-06-10 11:39:31,494][32415] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-06-10 11:39:34,596][32177] Fps is (10 sec: 44237.5, 60 sec: 44233.9, 300 sec: 44819.3). Total num frames: 356171776. Throughput: 0: 44778.4. Samples: 356292240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 11:39:34,596][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:39:34,827][32415] Updated weights for policy 0, policy_version 21740 (0.0031) [2024-06-10 11:39:37,840][32415] Updated weights for policy 0, policy_version 21750 (0.0036) [2024-06-10 11:39:39,592][32177] Fps is (10 sec: 44237.1, 60 sec: 45059.3, 300 sec: 44820.0). Total num frames: 356417536. Throughput: 0: 45018.9. Samples: 356567340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-10 11:39:39,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:39:41,937][32415] Updated weights for policy 0, policy_version 21760 (0.0034) [2024-06-10 11:39:44,592][32177] Fps is (10 sec: 47533.7, 60 sec: 45602.2, 300 sec: 44875.5). Total num frames: 356646912. Throughput: 0: 44915.1. Samples: 356699480. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-10 11:39:44,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:39:45,363][32415] Updated weights for policy 0, policy_version 21770 (0.0041) [2024-06-10 11:39:49,068][32415] Updated weights for policy 0, policy_version 21780 (0.0022) [2024-06-10 11:39:49,592][32177] Fps is (10 sec: 44235.7, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 356859904. Throughput: 0: 45001.6. Samples: 356968240. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-10 11:39:49,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:39:52,935][32415] Updated weights for policy 0, policy_version 21790 (0.0037) [2024-06-10 11:39:54,592][32177] Fps is (10 sec: 45874.9, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 357105664. Throughput: 0: 44935.2. Samples: 357237520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 11:39:54,592][32177] Avg episode reward: [(0, '0.297')] [2024-06-10 11:39:56,667][32415] Updated weights for policy 0, policy_version 21800 (0.0036) [2024-06-10 11:39:59,592][32177] Fps is (10 sec: 44237.8, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 357302272. Throughput: 0: 44874.2. Samples: 357373260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 11:39:59,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:40:00,052][32415] Updated weights for policy 0, policy_version 21810 (0.0031) [2024-06-10 11:40:03,898][32415] Updated weights for policy 0, policy_version 21820 (0.0033) [2024-06-10 11:40:04,592][32177] Fps is (10 sec: 42598.3, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 357531648. Throughput: 0: 44851.1. Samples: 357641000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 11:40:04,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:40:07,273][32415] Updated weights for policy 0, policy_version 21830 (0.0028) [2024-06-10 11:40:09,592][32177] Fps is (10 sec: 45874.8, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 357761024. Throughput: 0: 44776.8. Samples: 357910100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 11:40:09,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:40:11,013][32415] Updated weights for policy 0, policy_version 21840 (0.0038) [2024-06-10 11:40:14,413][32415] Updated weights for policy 0, policy_version 21850 (0.0047) [2024-06-10 11:40:14,592][32177] Fps is (10 sec: 45875.2, 60 sec: 45055.9, 300 sec: 44875.5). 
Total num frames: 357990400. Throughput: 0: 44875.0. Samples: 358046540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 11:40:14,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:40:18,555][32415] Updated weights for policy 0, policy_version 21860 (0.0025) [2024-06-10 11:40:19,592][32177] Fps is (10 sec: 45875.5, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 358219776. Throughput: 0: 45058.9. Samples: 358319700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 11:40:19,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:40:21,862][32415] Updated weights for policy 0, policy_version 21870 (0.0029) [2024-06-10 11:40:24,593][32177] Fps is (10 sec: 45869.0, 60 sec: 45331.3, 300 sec: 44986.4). Total num frames: 358449152. Throughput: 0: 44951.8. Samples: 358590240. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-10 11:40:24,594][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:40:25,650][32415] Updated weights for policy 0, policy_version 21880 (0.0025) [2024-06-10 11:40:28,826][32415] Updated weights for policy 0, policy_version 21890 (0.0034) [2024-06-10 11:40:29,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45056.0, 300 sec: 44986.6). Total num frames: 358678528. Throughput: 0: 44964.5. Samples: 358722880. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-10 11:40:29,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:40:32,874][32415] Updated weights for policy 0, policy_version 21900 (0.0034) [2024-06-10 11:40:34,592][32177] Fps is (10 sec: 44243.1, 60 sec: 45332.3, 300 sec: 44875.5). Total num frames: 358891520. Throughput: 0: 45050.8. Samples: 358995520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-10 11:40:34,592][32177] Avg episode reward: [(0, '0.276')] [2024-06-10 11:40:34,605][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000021905_358891520.pth... 
[2024-06-10 11:40:34,654][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000021246_348094464.pth [2024-06-10 11:40:36,383][32415] Updated weights for policy 0, policy_version 21910 (0.0025) [2024-06-10 11:40:39,592][32177] Fps is (10 sec: 42597.4, 60 sec: 44782.7, 300 sec: 44931.0). Total num frames: 359104512. Throughput: 0: 45007.0. Samples: 359262840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:40:39,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:40:40,165][32415] Updated weights for policy 0, policy_version 21920 (0.0027) [2024-06-10 11:40:43,691][32415] Updated weights for policy 0, policy_version 21930 (0.0028) [2024-06-10 11:40:44,592][32177] Fps is (10 sec: 44237.0, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 359333888. Throughput: 0: 44952.0. Samples: 359396100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:40:44,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:40:47,422][32415] Updated weights for policy 0, policy_version 21940 (0.0033) [2024-06-10 11:40:49,592][32177] Fps is (10 sec: 45875.6, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 359563264. Throughput: 0: 45084.0. Samples: 359669780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:40:49,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:40:51,069][32415] Updated weights for policy 0, policy_version 21950 (0.0036) [2024-06-10 11:40:54,592][32177] Fps is (10 sec: 42597.9, 60 sec: 44236.8, 300 sec: 44876.1). Total num frames: 359759872. Throughput: 0: 45009.3. Samples: 359935520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-10 11:40:54,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:40:54,931][32415] Updated weights for policy 0, policy_version 21960 (0.0032) [2024-06-10 11:40:55,086][32394] Signal inference workers to stop experience collection... 
(5250 times) [2024-06-10 11:40:55,086][32394] Signal inference workers to resume experience collection... (5250 times) [2024-06-10 11:40:55,134][32415] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-06-10 11:40:55,135][32415] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-06-10 11:40:58,192][32415] Updated weights for policy 0, policy_version 21970 (0.0042) [2024-06-10 11:40:59,597][32177] Fps is (10 sec: 44212.1, 60 sec: 45051.7, 300 sec: 44930.2). Total num frames: 360005632. Throughput: 0: 44973.5. Samples: 360070600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-10 11:40:59,598][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:41:01,915][32415] Updated weights for policy 0, policy_version 21980 (0.0036) [2024-06-10 11:41:04,592][32177] Fps is (10 sec: 49151.7, 60 sec: 45329.0, 300 sec: 44931.0). Total num frames: 360251392. Throughput: 0: 44815.8. Samples: 360336420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-10 11:41:04,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:41:05,669][32415] Updated weights for policy 0, policy_version 21990 (0.0031) [2024-06-10 11:41:09,072][32415] Updated weights for policy 0, policy_version 22000 (0.0032) [2024-06-10 11:41:09,592][32177] Fps is (10 sec: 44261.4, 60 sec: 44782.9, 300 sec: 44876.1). Total num frames: 360448000. Throughput: 0: 44718.2. Samples: 360602500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 11:41:09,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:41:13,081][32415] Updated weights for policy 0, policy_version 22010 (0.0035) [2024-06-10 11:41:14,592][32177] Fps is (10 sec: 42599.0, 60 sec: 44783.0, 300 sec: 44931.1). Total num frames: 360677376. Throughput: 0: 44729.3. Samples: 360735700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 11:41:14,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:41:16,325][32415] Updated weights for policy 0, policy_version 22020 (0.0024) [2024-06-10 11:41:19,592][32177] Fps is (10 sec: 44236.7, 60 sec: 44509.7, 300 sec: 44820.0). Total num frames: 360890368. Throughput: 0: 44748.7. Samples: 361009220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 11:41:19,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:41:20,167][32415] Updated weights for policy 0, policy_version 22030 (0.0022) [2024-06-10 11:41:23,921][32415] Updated weights for policy 0, policy_version 22040 (0.0030) [2024-06-10 11:41:24,592][32177] Fps is (10 sec: 45874.3, 60 sec: 44783.9, 300 sec: 44931.0). Total num frames: 361136128. Throughput: 0: 44753.3. Samples: 361276740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-10 11:41:24,592][32177] Avg episode reward: [(0, '0.293')] [2024-06-10 11:41:27,231][32415] Updated weights for policy 0, policy_version 22050 (0.0030) [2024-06-10 11:41:29,592][32177] Fps is (10 sec: 44237.6, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 361332736. Throughput: 0: 44841.7. Samples: 361413980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-10 11:41:29,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:41:31,323][32415] Updated weights for policy 0, policy_version 22060 (0.0041) [2024-06-10 11:41:34,592][32177] Fps is (10 sec: 42598.5, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 361562112. Throughput: 0: 44656.4. Samples: 361679320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-10 11:41:34,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:41:34,822][32415] Updated weights for policy 0, policy_version 22070 (0.0046) [2024-06-10 11:41:38,509][32415] Updated weights for policy 0, policy_version 22080 (0.0042) [2024-06-10 11:41:39,592][32177] Fps is (10 sec: 47513.6, 60 sec: 45056.1, 300 sec: 44931.1). 
Total num frames: 361807872. Throughput: 0: 44739.6. Samples: 361948800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:41:39,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:41:42,287][32415] Updated weights for policy 0, policy_version 22090 (0.0042) [2024-06-10 11:41:44,592][32177] Fps is (10 sec: 45875.5, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 362020864. Throughput: 0: 44902.0. Samples: 362090940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:41:44,592][32177] Avg episode reward: [(0, '0.299')] [2024-06-10 11:41:44,615][32394] Saving new best policy, reward=0.299! [2024-06-10 11:41:45,834][32415] Updated weights for policy 0, policy_version 22100 (0.0032) [2024-06-10 11:41:49,468][32415] Updated weights for policy 0, policy_version 22110 (0.0048) [2024-06-10 11:41:49,592][32177] Fps is (10 sec: 44236.8, 60 sec: 44783.0, 300 sec: 44931.1). Total num frames: 362250240. Throughput: 0: 44892.6. Samples: 362356580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:41:49,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:41:53,039][32415] Updated weights for policy 0, policy_version 22120 (0.0032) [2024-06-10 11:41:54,592][32177] Fps is (10 sec: 44236.9, 60 sec: 45056.0, 300 sec: 44819.9). Total num frames: 362463232. Throughput: 0: 44778.3. Samples: 362617520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-10 11:41:54,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:41:56,532][32415] Updated weights for policy 0, policy_version 22130 (0.0040) [2024-06-10 11:41:59,592][32177] Fps is (10 sec: 42598.3, 60 sec: 44514.1, 300 sec: 44820.0). Total num frames: 362676224. Throughput: 0: 44862.6. Samples: 362754520. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-10 11:41:59,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:42:00,465][32415] Updated weights for policy 0, policy_version 22140 (0.0031) [2024-06-10 11:42:03,834][32415] Updated weights for policy 0, policy_version 22150 (0.0030) [2024-06-10 11:42:04,592][32177] Fps is (10 sec: 45875.0, 60 sec: 44509.9, 300 sec: 44986.6). Total num frames: 362921984. Throughput: 0: 44971.6. Samples: 363032940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-10 11:42:04,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:42:07,546][32415] Updated weights for policy 0, policy_version 22160 (0.0028) [2024-06-10 11:42:09,592][32177] Fps is (10 sec: 47513.1, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 363151360. Throughput: 0: 44912.5. Samples: 363297800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 11:42:09,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:42:11,202][32415] Updated weights for policy 0, policy_version 22170 (0.0030) [2024-06-10 11:42:14,592][32177] Fps is (10 sec: 45875.7, 60 sec: 45056.0, 300 sec: 44931.1). Total num frames: 363380736. Throughput: 0: 45009.3. Samples: 363439400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 11:42:14,592][32177] Avg episode reward: [(0, '0.295')] [2024-06-10 11:42:14,618][32415] Updated weights for policy 0, policy_version 22180 (0.0026) [2024-06-10 11:42:18,696][32415] Updated weights for policy 0, policy_version 22190 (0.0044) [2024-06-10 11:42:19,592][32177] Fps is (10 sec: 45876.0, 60 sec: 45329.2, 300 sec: 44986.6). Total num frames: 363610112. Throughput: 0: 44892.6. Samples: 363699480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 11:42:19,592][32177] Avg episode reward: [(0, '0.295')] [2024-06-10 11:42:22,201][32415] Updated weights for policy 0, policy_version 22200 (0.0039) [2024-06-10 11:42:24,592][32177] Fps is (10 sec: 42598.6, 60 sec: 44510.0, 300 sec: 44820.0). 
Total num frames: 363806720. Throughput: 0: 44932.5. Samples: 363970760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-10 11:42:24,592][32177] Avg episode reward: [(0, '0.277')] [2024-06-10 11:42:25,039][32394] Signal inference workers to stop experience collection... (5300 times) [2024-06-10 11:42:25,040][32394] Signal inference workers to resume experience collection... (5300 times) [2024-06-10 11:42:25,068][32415] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-06-10 11:42:25,068][32415] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-06-10 11:42:25,882][32415] Updated weights for policy 0, policy_version 22210 (0.0026) [2024-06-10 11:42:29,592][32177] Fps is (10 sec: 42598.5, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 364036096. Throughput: 0: 44678.8. Samples: 364101480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-10 11:42:29,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:42:29,629][32415] Updated weights for policy 0, policy_version 22220 (0.0042) [2024-06-10 11:42:33,189][32415] Updated weights for policy 0, policy_version 22230 (0.0025) [2024-06-10 11:42:34,592][32177] Fps is (10 sec: 44235.7, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 364249088. Throughput: 0: 44711.8. Samples: 364368620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-10 11:42:34,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:42:34,650][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000022233_364265472.pth... [2024-06-10 11:42:34,706][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000021577_353517568.pth [2024-06-10 11:42:36,940][32415] Updated weights for policy 0, policy_version 22240 (0.0034) [2024-06-10 11:42:39,592][32177] Fps is (10 sec: 45874.9, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 364494848. Throughput: 0: 44967.2. Samples: 364641040. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-10 11:42:39,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:42:40,504][32415] Updated weights for policy 0, policy_version 22250 (0.0023) [2024-06-10 11:42:43,963][32415] Updated weights for policy 0, policy_version 22260 (0.0040) [2024-06-10 11:42:44,591][32177] Fps is (10 sec: 47515.1, 60 sec: 45056.2, 300 sec: 44875.5). Total num frames: 364724224. Throughput: 0: 45092.6. Samples: 364783680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-10 11:42:44,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:42:47,572][32415] Updated weights for policy 0, policy_version 22270 (0.0032) [2024-06-10 11:42:49,592][32177] Fps is (10 sec: 42597.5, 60 sec: 44509.7, 300 sec: 44875.5). Total num frames: 364920832. Throughput: 0: 44877.2. Samples: 365052420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-10 11:42:49,592][32177] Avg episode reward: [(0, '0.292')] [2024-06-10 11:42:51,397][32415] Updated weights for policy 0, policy_version 22280 (0.0040) [2024-06-10 11:42:54,596][32177] Fps is (10 sec: 44217.0, 60 sec: 45052.8, 300 sec: 44930.4). Total num frames: 365166592. Throughput: 0: 44926.9. Samples: 365319700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-10 11:42:54,597][32177] Avg episode reward: [(0, '0.293')] [2024-06-10 11:42:55,160][32415] Updated weights for policy 0, policy_version 22290 (0.0026) [2024-06-10 11:42:58,583][32415] Updated weights for policy 0, policy_version 22300 (0.0036) [2024-06-10 11:42:59,592][32177] Fps is (10 sec: 49152.3, 60 sec: 45602.0, 300 sec: 44986.6). Total num frames: 365412352. Throughput: 0: 44793.2. Samples: 365455100. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-10 11:42:59,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:43:02,159][32415] Updated weights for policy 0, policy_version 22310 (0.0025) [2024-06-10 11:43:04,592][32177] Fps is (10 sec: 44256.3, 60 sec: 44783.1, 300 sec: 44875.5). 
Total num frames: 365608960. Throughput: 0: 45049.8. Samples: 365726720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-10 11:43:04,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:43:05,831][32415] Updated weights for policy 0, policy_version 22320 (0.0033) [2024-06-10 11:43:09,488][32415] Updated weights for policy 0, policy_version 22330 (0.0028) [2024-06-10 11:43:09,592][32177] Fps is (10 sec: 44237.1, 60 sec: 45056.0, 300 sec: 44986.6). Total num frames: 365854720. Throughput: 0: 45023.4. Samples: 365996820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-10 11:43:09,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:43:13,013][32415] Updated weights for policy 0, policy_version 22340 (0.0034) [2024-06-10 11:43:14,592][32177] Fps is (10 sec: 45874.5, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 366067712. Throughput: 0: 45203.8. Samples: 366135660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-10 11:43:14,592][32177] Avg episode reward: [(0, '0.294')] [2024-06-10 11:43:16,722][32415] Updated weights for policy 0, policy_version 22350 (0.0033) [2024-06-10 11:43:19,592][32177] Fps is (10 sec: 42598.7, 60 sec: 44509.8, 300 sec: 44820.3). Total num frames: 366280704. Throughput: 0: 45160.6. Samples: 366400840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-10 11:43:19,592][32177] Avg episode reward: [(0, '0.293')] [2024-06-10 11:43:20,285][32415] Updated weights for policy 0, policy_version 22360 (0.0040) [2024-06-10 11:43:24,306][32415] Updated weights for policy 0, policy_version 22370 (0.0027) [2024-06-10 11:43:24,592][32177] Fps is (10 sec: 45874.1, 60 sec: 45328.8, 300 sec: 44986.5). Total num frames: 366526464. Throughput: 0: 45097.0. Samples: 366670420. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 11:43:24,593][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:43:27,888][32415] Updated weights for policy 0, policy_version 22380 (0.0043) [2024-06-10 11:43:29,592][32177] Fps is (10 sec: 44236.7, 60 sec: 44782.9, 300 sec: 44764.5). Total num frames: 366723072. Throughput: 0: 44879.4. Samples: 366803260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 11:43:29,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:43:31,325][32415] Updated weights for policy 0, policy_version 22390 (0.0039) [2024-06-10 11:43:34,592][32177] Fps is (10 sec: 44236.6, 60 sec: 45328.9, 300 sec: 44931.6). Total num frames: 366968832. Throughput: 0: 44982.9. Samples: 367076660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 11:43:34,593][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:43:35,128][32415] Updated weights for policy 0, policy_version 22400 (0.0032) [2024-06-10 11:43:38,691][32415] Updated weights for policy 0, policy_version 22410 (0.0029) [2024-06-10 11:43:39,592][32177] Fps is (10 sec: 47512.9, 60 sec: 45055.9, 300 sec: 45042.1). Total num frames: 367198208. Throughput: 0: 45045.5. Samples: 367346560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-10 11:43:39,596][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:43:42,173][32415] Updated weights for policy 0, policy_version 22420 (0.0027) [2024-06-10 11:43:44,592][32177] Fps is (10 sec: 45876.4, 60 sec: 45055.8, 300 sec: 44931.0). Total num frames: 367427584. Throughput: 0: 44912.9. Samples: 367476180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-10 11:43:44,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:43:46,037][32415] Updated weights for policy 0, policy_version 22430 (0.0029) [2024-06-10 11:43:49,374][32415] Updated weights for policy 0, policy_version 22440 (0.0040) [2024-06-10 11:43:49,592][32177] Fps is (10 sec: 45875.9, 60 sec: 45602.3, 300 sec: 44986.6). 
Total num frames: 367656960. Throughput: 0: 44961.7. Samples: 367750000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-10 11:43:49,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:43:53,483][32415] Updated weights for policy 0, policy_version 22450 (0.0027) [2024-06-10 11:43:54,592][32177] Fps is (10 sec: 44237.5, 60 sec: 45059.3, 300 sec: 44986.6). Total num frames: 367869952. Throughput: 0: 44949.4. Samples: 368019540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 11:43:54,592][32177] Avg episode reward: [(0, '0.295')] [2024-06-10 11:43:56,798][32415] Updated weights for policy 0, policy_version 22460 (0.0036) [2024-06-10 11:43:59,592][32177] Fps is (10 sec: 42597.8, 60 sec: 44509.9, 300 sec: 44819.9). Total num frames: 368082944. Throughput: 0: 44819.5. Samples: 368152540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 11:43:59,593][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:44:00,745][32415] Updated weights for policy 0, policy_version 22470 (0.0037) [2024-06-10 11:44:04,247][32415] Updated weights for policy 0, policy_version 22480 (0.0025) [2024-06-10 11:44:04,592][32177] Fps is (10 sec: 45874.3, 60 sec: 45328.9, 300 sec: 44986.6). Total num frames: 368328704. Throughput: 0: 44945.2. Samples: 368423380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-10 11:44:04,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:44:05,683][32394] Signal inference workers to stop experience collection... (5350 times) [2024-06-10 11:44:05,684][32394] Signal inference workers to resume experience collection... 
(5350 times) [2024-06-10 11:44:05,713][32415] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-06-10 11:44:05,713][32415] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-06-10 11:44:07,997][32415] Updated weights for policy 0, policy_version 22490 (0.0031) [2024-06-10 11:44:09,592][32177] Fps is (10 sec: 44237.3, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 368525312. Throughput: 0: 44974.5. Samples: 368694260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 11:44:09,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:44:11,307][32415] Updated weights for policy 0, policy_version 22500 (0.0036) [2024-06-10 11:44:14,592][32177] Fps is (10 sec: 42597.9, 60 sec: 44782.8, 300 sec: 44819.9). Total num frames: 368754688. Throughput: 0: 44902.8. Samples: 368823900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 11:44:14,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:44:15,458][32415] Updated weights for policy 0, policy_version 22510 (0.0051) [2024-06-10 11:44:18,700][32415] Updated weights for policy 0, policy_version 22520 (0.0034) [2024-06-10 11:44:19,592][32177] Fps is (10 sec: 45875.4, 60 sec: 45056.0, 300 sec: 44931.7). Total num frames: 368984064. Throughput: 0: 44862.6. Samples: 369095460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-10 11:44:19,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:44:22,870][32415] Updated weights for policy 0, policy_version 22530 (0.0031) [2024-06-10 11:44:24,592][32177] Fps is (10 sec: 45875.6, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 369213440. Throughput: 0: 44641.3. Samples: 369355420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 11:44:24,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:44:26,201][32415] Updated weights for policy 0, policy_version 22540 (0.0030) [2024-06-10 11:44:29,592][32177] Fps is (10 sec: 44236.1, 60 sec: 45055.9, 300 sec: 44931.7). 
Total num frames: 369426432. Throughput: 0: 44916.4. Samples: 369497420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 11:44:29,594][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:44:30,181][32415] Updated weights for policy 0, policy_version 22550 (0.0036) [2024-06-10 11:44:33,518][32415] Updated weights for policy 0, policy_version 22560 (0.0027) [2024-06-10 11:44:34,592][32177] Fps is (10 sec: 44237.8, 60 sec: 44783.3, 300 sec: 44875.5). Total num frames: 369655808. Throughput: 0: 44852.1. Samples: 369768340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 11:44:34,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:44:34,673][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000022563_369672192.pth... [2024-06-10 11:44:34,730][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000021905_358891520.pth [2024-06-10 11:44:37,163][32415] Updated weights for policy 0, policy_version 22570 (0.0027) [2024-06-10 11:44:39,592][32177] Fps is (10 sec: 45875.8, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 369885184. Throughput: 0: 44845.7. Samples: 370037600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-10 11:44:39,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:44:40,485][32415] Updated weights for policy 0, policy_version 22580 (0.0033) [2024-06-10 11:44:44,592][32177] Fps is (10 sec: 44235.9, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 370098176. Throughput: 0: 44957.3. Samples: 370175620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-10 11:44:44,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:44:44,633][32415] Updated weights for policy 0, policy_version 22590 (0.0040) [2024-06-10 11:44:47,807][32415] Updated weights for policy 0, policy_version 22600 (0.0032) [2024-06-10 11:44:49,592][32177] Fps is (10 sec: 44237.2, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 370327552. Throughput: 0: 44798.0. 
Samples: 370439280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-10 11:44:49,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:44:51,584][32415] Updated weights for policy 0, policy_version 22610 (0.0027) [2024-06-10 11:44:54,592][32177] Fps is (10 sec: 47513.6, 60 sec: 45055.8, 300 sec: 44986.5). Total num frames: 370573312. Throughput: 0: 44946.1. Samples: 370716840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-10 11:44:54,601][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:44:55,299][32415] Updated weights for policy 0, policy_version 22620 (0.0035) [2024-06-10 11:44:59,042][32415] Updated weights for policy 0, policy_version 22630 (0.0036) [2024-06-10 11:44:59,592][32177] Fps is (10 sec: 45875.0, 60 sec: 45056.1, 300 sec: 44931.1). Total num frames: 370786304. Throughput: 0: 44955.0. Samples: 370846860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-10 11:44:59,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:45:02,610][32415] Updated weights for policy 0, policy_version 22640 (0.0032) [2024-06-10 11:45:04,592][32177] Fps is (10 sec: 44237.8, 60 sec: 44783.1, 300 sec: 44931.1). Total num frames: 371015680. Throughput: 0: 44940.9. Samples: 371117800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-10 11:45:04,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:45:06,279][32415] Updated weights for policy 0, policy_version 22650 (0.0032) [2024-06-10 11:45:09,592][32177] Fps is (10 sec: 45874.9, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 371245056. Throughput: 0: 45118.8. Samples: 371385760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 11:45:09,592][32177] Avg episode reward: [(0, '0.293')] [2024-06-10 11:45:09,841][32415] Updated weights for policy 0, policy_version 22660 (0.0033) [2024-06-10 11:45:13,868][32415] Updated weights for policy 0, policy_version 22670 (0.0050) [2024-06-10 11:45:14,592][32177] Fps is (10 sec: 44236.5, 60 sec: 45056.2, 300 sec: 44875.5). Total num frames: 371458048. Throughput: 0: 44941.9. Samples: 371519800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 11:45:14,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:45:17,353][32415] Updated weights for policy 0, policy_version 22680 (0.0025) [2024-06-10 11:45:19,592][32177] Fps is (10 sec: 44236.5, 60 sec: 45055.9, 300 sec: 44875.7). Total num frames: 371687424. Throughput: 0: 44718.5. Samples: 371780680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 11:45:19,593][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:45:20,891][32415] Updated weights for policy 0, policy_version 22690 (0.0027) [2024-06-10 11:45:24,592][32177] Fps is (10 sec: 44237.2, 60 sec: 44783.1, 300 sec: 44820.0). Total num frames: 371900416. Throughput: 0: 44783.6. Samples: 372052860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 11:45:24,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:45:24,690][32415] Updated weights for policy 0, policy_version 22700 (0.0026) [2024-06-10 11:45:28,103][32415] Updated weights for policy 0, policy_version 22710 (0.0027) [2024-06-10 11:45:29,592][32177] Fps is (10 sec: 45876.2, 60 sec: 45329.3, 300 sec: 44931.1). Total num frames: 372146176. Throughput: 0: 44818.9. Samples: 372192460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 11:45:29,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:45:31,627][32415] Updated weights for policy 0, policy_version 22720 (0.0039) [2024-06-10 11:45:34,592][32177] Fps is (10 sec: 45874.8, 60 sec: 45055.9, 300 sec: 44931.1). 
Total num frames: 372359168. Throughput: 0: 45119.9. Samples: 372469680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 11:45:34,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:45:35,468][32415] Updated weights for policy 0, policy_version 22730 (0.0029) [2024-06-10 11:45:38,936][32415] Updated weights for policy 0, policy_version 22740 (0.0036) [2024-06-10 11:45:39,592][32177] Fps is (10 sec: 44235.6, 60 sec: 45055.9, 300 sec: 44931.0). Total num frames: 372588544. Throughput: 0: 44866.7. Samples: 372735840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 11:45:39,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:45:42,616][32415] Updated weights for policy 0, policy_version 22750 (0.0041) [2024-06-10 11:45:44,596][32177] Fps is (10 sec: 45855.8, 60 sec: 45326.0, 300 sec: 44930.4). Total num frames: 372817920. Throughput: 0: 45003.7. Samples: 372872220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 11:45:44,597][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:45:45,682][32394] Signal inference workers to stop experience collection... (5400 times) [2024-06-10 11:45:45,683][32394] Signal inference workers to resume experience collection... (5400 times) [2024-06-10 11:45:45,728][32415] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-06-10 11:45:45,728][32415] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-06-10 11:45:46,269][32415] Updated weights for policy 0, policy_version 22760 (0.0035) [2024-06-10 11:45:49,592][32177] Fps is (10 sec: 44237.7, 60 sec: 45056.0, 300 sec: 44986.6). Total num frames: 373030912. Throughput: 0: 44969.8. Samples: 373141440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 11:45:49,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:45:49,903][32415] Updated weights for policy 0, policy_version 22770 (0.0029) [2024-06-10 11:45:53,629][32415] Updated weights for policy 0, policy_version 22780 (0.0030) [2024-06-10 11:45:54,592][32177] Fps is (10 sec: 45893.9, 60 sec: 45056.0, 300 sec: 44987.4). Total num frames: 373276672. Throughput: 0: 44929.2. Samples: 373407580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 11:45:54,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:45:57,616][32415] Updated weights for policy 0, policy_version 22790 (0.0033) [2024-06-10 11:45:59,592][32177] Fps is (10 sec: 45874.5, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 373489664. Throughput: 0: 44939.0. Samples: 373542060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 11:45:59,592][32177] Avg episode reward: [(0, '0.292')] [2024-06-10 11:46:01,002][32415] Updated weights for policy 0, policy_version 22800 (0.0025) [2024-06-10 11:46:04,596][32177] Fps is (10 sec: 42580.9, 60 sec: 44779.7, 300 sec: 44930.4). Total num frames: 373702656. Throughput: 0: 45230.9. Samples: 373816260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 11:46:04,597][32177] Avg episode reward: [(0, '0.292')] [2024-06-10 11:46:04,777][32415] Updated weights for policy 0, policy_version 22810 (0.0035) [2024-06-10 11:46:08,312][32415] Updated weights for policy 0, policy_version 22820 (0.0038) [2024-06-10 11:46:09,592][32177] Fps is (10 sec: 44237.3, 60 sec: 44783.0, 300 sec: 44931.0). Total num frames: 373932032. Throughput: 0: 45118.6. Samples: 374083200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 11:46:09,596][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:46:11,859][32415] Updated weights for policy 0, policy_version 22830 (0.0032) [2024-06-10 11:46:14,592][32177] Fps is (10 sec: 44255.4, 60 sec: 44782.9, 300 sec: 44931.0). 
Total num frames: 374145024. Throughput: 0: 44938.5. Samples: 374214700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 11:46:14,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:46:15,746][32415] Updated weights for policy 0, policy_version 22840 (0.0043) [2024-06-10 11:46:19,138][32415] Updated weights for policy 0, policy_version 22850 (0.0021) [2024-06-10 11:46:19,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45056.1, 300 sec: 44931.1). Total num frames: 374390784. Throughput: 0: 44745.4. Samples: 374483220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 11:46:19,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:46:23,084][32415] Updated weights for policy 0, policy_version 22860 (0.0028) [2024-06-10 11:46:24,592][32177] Fps is (10 sec: 44237.0, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 374587392. Throughput: 0: 44745.9. Samples: 374749400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-10 11:46:24,592][32177] Avg episode reward: [(0, '0.296')] [2024-06-10 11:46:26,766][32415] Updated weights for policy 0, policy_version 22870 (0.0030) [2024-06-10 11:46:29,592][32177] Fps is (10 sec: 40959.8, 60 sec: 44236.7, 300 sec: 44875.5). Total num frames: 374800384. Throughput: 0: 44740.6. Samples: 374885360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 11:46:29,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:46:30,387][32415] Updated weights for policy 0, policy_version 22880 (0.0035) [2024-06-10 11:46:33,942][32415] Updated weights for policy 0, policy_version 22890 (0.0030) [2024-06-10 11:46:34,592][32177] Fps is (10 sec: 47513.0, 60 sec: 45055.9, 300 sec: 44931.0). Total num frames: 375062528. Throughput: 0: 44843.8. Samples: 375159420. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 11:46:34,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:46:34,619][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000022892_375062528.pth... [2024-06-10 11:46:34,682][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000022233_364265472.pth [2024-06-10 11:46:37,741][32415] Updated weights for policy 0, policy_version 22900 (0.0024) [2024-06-10 11:46:39,592][32177] Fps is (10 sec: 47513.6, 60 sec: 44783.0, 300 sec: 44931.0). Total num frames: 375275520. Throughput: 0: 44878.4. Samples: 375427100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 11:46:39,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:46:41,250][32415] Updated weights for policy 0, policy_version 22910 (0.0044) [2024-06-10 11:46:44,592][32177] Fps is (10 sec: 42599.3, 60 sec: 44513.1, 300 sec: 44875.5). Total num frames: 375488512. Throughput: 0: 44796.6. Samples: 375557900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:46:44,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:46:44,895][32415] Updated weights for policy 0, policy_version 22920 (0.0034) [2024-06-10 11:46:48,264][32415] Updated weights for policy 0, policy_version 22930 (0.0036) [2024-06-10 11:46:49,592][32177] Fps is (10 sec: 44236.6, 60 sec: 44782.8, 300 sec: 44931.0). Total num frames: 375717888. Throughput: 0: 44717.5. Samples: 375828360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:46:49,592][32177] Avg episode reward: [(0, '0.292')] [2024-06-10 11:46:52,250][32415] Updated weights for policy 0, policy_version 22940 (0.0030) [2024-06-10 11:46:54,592][32177] Fps is (10 sec: 45874.5, 60 sec: 44509.9, 300 sec: 44986.6). Total num frames: 375947264. Throughput: 0: 44911.9. Samples: 376104240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 11:46:54,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:46:55,738][32415] Updated weights for policy 0, policy_version 22950 (0.0025) [2024-06-10 11:46:59,477][32415] Updated weights for policy 0, policy_version 22960 (0.0025) [2024-06-10 11:46:59,592][32177] Fps is (10 sec: 45875.6, 60 sec: 44783.0, 300 sec: 44931.1). Total num frames: 376176640. Throughput: 0: 44866.3. Samples: 376233680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-10 11:46:59,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:47:03,028][32415] Updated weights for policy 0, policy_version 22970 (0.0034) [2024-06-10 11:47:04,592][32177] Fps is (10 sec: 44237.3, 60 sec: 44786.1, 300 sec: 44875.5). Total num frames: 376389632. Throughput: 0: 45049.8. Samples: 376510460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-10 11:47:04,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:47:07,082][32415] Updated weights for policy 0, policy_version 22980 (0.0035) [2024-06-10 11:47:09,592][32177] Fps is (10 sec: 44236.4, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 376619008. Throughput: 0: 44995.9. Samples: 376774220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-10 11:47:09,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:47:10,271][32415] Updated weights for policy 0, policy_version 22990 (0.0037) [2024-06-10 11:47:14,592][32177] Fps is (10 sec: 42598.0, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 376815616. Throughput: 0: 44997.7. Samples: 376910260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 11:47:14,592][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:47:14,671][32394] Signal inference workers to stop experience collection... (5450 times) [2024-06-10 11:47:14,726][32394] Signal inference workers to resume experience collection... 
(5450 times) [2024-06-10 11:47:14,728][32415] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-06-10 11:47:14,730][32415] Updated weights for policy 0, policy_version 23000 (0.0025) [2024-06-10 11:47:14,738][32415] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-06-10 11:47:17,535][32415] Updated weights for policy 0, policy_version 23010 (0.0032) [2024-06-10 11:47:19,595][32177] Fps is (10 sec: 45860.1, 60 sec: 44780.4, 300 sec: 44986.1). Total num frames: 377077760. Throughput: 0: 44637.3. Samples: 377168240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 11:47:19,596][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:47:21,736][32415] Updated weights for policy 0, policy_version 23020 (0.0029) [2024-06-10 11:47:24,592][32177] Fps is (10 sec: 49152.1, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 377307136. Throughput: 0: 44832.4. Samples: 377444560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 11:47:24,593][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:47:24,746][32415] Updated weights for policy 0, policy_version 23030 (0.0025) [2024-06-10 11:47:28,844][32415] Updated weights for policy 0, policy_version 23040 (0.0038) [2024-06-10 11:47:29,592][32177] Fps is (10 sec: 45890.8, 60 sec: 45602.2, 300 sec: 45042.1). Total num frames: 377536512. Throughput: 0: 45092.4. Samples: 377587060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 11:47:29,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:47:31,923][32415] Updated weights for policy 0, policy_version 23050 (0.0029) [2024-06-10 11:47:34,596][32177] Fps is (10 sec: 44218.2, 60 sec: 44779.9, 300 sec: 44930.4). Total num frames: 377749504. Throughput: 0: 45030.5. Samples: 377854920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 11:47:34,596][32177] Avg episode reward: [(0, '0.271')] [2024-06-10 11:47:36,341][32415] Updated weights for policy 0, policy_version 23060 (0.0022) [2024-06-10 11:47:39,381][32415] Updated weights for policy 0, policy_version 23070 (0.0038) [2024-06-10 11:47:39,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 377995264. Throughput: 0: 44836.1. Samples: 378121860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 11:47:39,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:47:43,664][32415] Updated weights for policy 0, policy_version 23080 (0.0043) [2024-06-10 11:47:44,592][32177] Fps is (10 sec: 44255.4, 60 sec: 45055.9, 300 sec: 44986.6). Total num frames: 378191872. Throughput: 0: 44923.5. Samples: 378255240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 11:47:44,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:47:46,813][32415] Updated weights for policy 0, policy_version 23090 (0.0031) [2024-06-10 11:47:49,592][32177] Fps is (10 sec: 40960.0, 60 sec: 44783.0, 300 sec: 44876.2). Total num frames: 378404864. Throughput: 0: 44663.1. Samples: 378520300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-10 11:47:49,592][32177] Avg episode reward: [(0, '0.275')] [2024-06-10 11:47:50,962][32415] Updated weights for policy 0, policy_version 23100 (0.0033) [2024-06-10 11:47:53,916][32415] Updated weights for policy 0, policy_version 23110 (0.0025) [2024-06-10 11:47:54,592][32177] Fps is (10 sec: 45874.7, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 378650624. Throughput: 0: 44783.9. Samples: 378789500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-10 11:47:54,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:47:57,966][32415] Updated weights for policy 0, policy_version 23120 (0.0031) [2024-06-10 11:47:59,592][32177] Fps is (10 sec: 45875.3, 60 sec: 44783.0, 300 sec: 44931.0). 
Total num frames: 378863616. Throughput: 0: 45055.7. Samples: 378937760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-10 11:47:59,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:48:01,239][32415] Updated weights for policy 0, policy_version 23130 (0.0031) [2024-06-10 11:48:04,592][32177] Fps is (10 sec: 42599.0, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 379076608. Throughput: 0: 45051.3. Samples: 379195400. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-10 11:48:04,593][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:48:05,674][32415] Updated weights for policy 0, policy_version 23140 (0.0033) [2024-06-10 11:48:08,437][32415] Updated weights for policy 0, policy_version 23150 (0.0037) [2024-06-10 11:48:09,592][32177] Fps is (10 sec: 45875.1, 60 sec: 45056.1, 300 sec: 44931.1). Total num frames: 379322368. Throughput: 0: 44890.7. Samples: 379464640. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-10 11:48:09,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:48:12,957][32415] Updated weights for policy 0, policy_version 23160 (0.0027) [2024-06-10 11:48:14,591][32177] Fps is (10 sec: 45875.8, 60 sec: 45329.2, 300 sec: 44931.1). Total num frames: 379535360. Throughput: 0: 44804.5. Samples: 379603260. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-10 11:48:14,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:48:15,702][32415] Updated weights for policy 0, policy_version 23170 (0.0027) [2024-06-10 11:48:19,592][32177] Fps is (10 sec: 40959.6, 60 sec: 44239.2, 300 sec: 44764.5). Total num frames: 379731968. Throughput: 0: 44748.2. Samples: 379868400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-10 11:48:19,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:48:20,381][32415] Updated weights for policy 0, policy_version 23180 (0.0045) [2024-06-10 11:48:21,458][32394] Signal inference workers to stop experience collection... 
(5500 times) [2024-06-10 11:48:21,458][32394] Signal inference workers to resume experience collection... (5500 times) [2024-06-10 11:48:21,500][32415] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-06-10 11:48:21,500][32415] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-06-10 11:48:23,083][32415] Updated weights for policy 0, policy_version 23190 (0.0040) [2024-06-10 11:48:24,592][32177] Fps is (10 sec: 45874.6, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 379994112. Throughput: 0: 44701.3. Samples: 380133420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-10 11:48:24,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:48:27,369][32415] Updated weights for policy 0, policy_version 23200 (0.0028) [2024-06-10 11:48:29,592][32177] Fps is (10 sec: 47513.3, 60 sec: 44509.7, 300 sec: 44875.5). Total num frames: 380207104. Throughput: 0: 45031.9. Samples: 380281680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-10 11:48:29,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:48:30,440][32415] Updated weights for policy 0, policy_version 23210 (0.0034) [2024-06-10 11:48:34,592][32177] Fps is (10 sec: 40959.5, 60 sec: 44239.8, 300 sec: 44764.4). Total num frames: 380403712. Throughput: 0: 44906.0. Samples: 380541080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 11:48:34,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:48:34,622][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000023219_380420096.pth... [2024-06-10 11:48:34,680][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000022563_369672192.pth [2024-06-10 11:48:35,069][32415] Updated weights for policy 0, policy_version 23220 (0.0036) [2024-06-10 11:48:37,881][32415] Updated weights for policy 0, policy_version 23230 (0.0021) [2024-06-10 11:48:39,592][32177] Fps is (10 sec: 47514.1, 60 sec: 44782.9, 300 sec: 44931.0). 
Total num frames: 380682240. Throughput: 0: 44757.9. Samples: 380803600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 11:48:39,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:48:42,477][32415] Updated weights for policy 0, policy_version 23240 (0.0040) [2024-06-10 11:48:44,592][32177] Fps is (10 sec: 47514.6, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 380878848. Throughput: 0: 44677.8. Samples: 380948260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 11:48:44,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:48:44,970][32415] Updated weights for policy 0, policy_version 23250 (0.0032) [2024-06-10 11:48:49,502][32415] Updated weights for policy 0, policy_version 23260 (0.0039) [2024-06-10 11:48:49,592][32177] Fps is (10 sec: 40959.5, 60 sec: 44782.8, 300 sec: 44819.9). Total num frames: 381091840. Throughput: 0: 44905.2. Samples: 381216140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 11:48:49,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:48:52,159][32415] Updated weights for policy 0, policy_version 23270 (0.0029) [2024-06-10 11:48:54,592][32177] Fps is (10 sec: 47513.6, 60 sec: 45056.1, 300 sec: 44986.6). Total num frames: 381353984. Throughput: 0: 44929.8. Samples: 381486480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 11:48:54,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:48:56,581][32415] Updated weights for policy 0, policy_version 23280 (0.0038) [2024-06-10 11:48:59,524][32415] Updated weights for policy 0, policy_version 23290 (0.0043) [2024-06-10 11:48:59,592][32177] Fps is (10 sec: 49152.9, 60 sec: 45329.0, 300 sec: 44931.1). Total num frames: 381583360. Throughput: 0: 44981.7. Samples: 381627440. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 11:48:59,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:49:04,102][32415] Updated weights for policy 0, policy_version 23300 (0.0035) [2024-06-10 11:49:04,592][32177] Fps is (10 sec: 40959.7, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 381763584. Throughput: 0: 45090.2. Samples: 381897460. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-10 11:49:04,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:49:07,069][32415] Updated weights for policy 0, policy_version 23310 (0.0023) [2024-06-10 11:49:09,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44782.9, 300 sec: 44931.1). Total num frames: 382009344. Throughput: 0: 45054.2. Samples: 382160860. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-10 11:49:09,592][32177] Avg episode reward: [(0, '0.293')] [2024-06-10 11:49:11,330][32415] Updated weights for policy 0, policy_version 23320 (0.0020) [2024-06-10 11:49:14,071][32415] Updated weights for policy 0, policy_version 23330 (0.0037) [2024-06-10 11:49:14,592][32177] Fps is (10 sec: 49152.2, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 382255104. Throughput: 0: 44865.5. Samples: 382300620. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-10 11:49:14,592][32177] Avg episode reward: [(0, '0.294')] [2024-06-10 11:49:18,392][32415] Updated weights for policy 0, policy_version 23340 (0.0035) [2024-06-10 11:49:19,592][32177] Fps is (10 sec: 42598.9, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 382435328. Throughput: 0: 45043.4. Samples: 382568020. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-10 11:49:19,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:49:21,461][32415] Updated weights for policy 0, policy_version 23350 (0.0026) [2024-06-10 11:49:24,592][32177] Fps is (10 sec: 42598.6, 60 sec: 44783.0, 300 sec: 44931.1). Total num frames: 382681088. Throughput: 0: 45135.2. Samples: 382834680. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-10 11:49:24,592][32177] Avg episode reward: [(0, '0.294')] [2024-06-10 11:49:25,661][32415] Updated weights for policy 0, policy_version 23360 (0.0037) [2024-06-10 11:49:29,098][32415] Updated weights for policy 0, policy_version 23370 (0.0030) [2024-06-10 11:49:29,592][32177] Fps is (10 sec: 47513.4, 60 sec: 45056.1, 300 sec: 44931.0). Total num frames: 382910464. Throughput: 0: 44979.6. Samples: 382972340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-10 11:49:29,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:49:33,270][32415] Updated weights for policy 0, policy_version 23380 (0.0022) [2024-06-10 11:49:34,584][32394] Signal inference workers to stop experience collection... (5550 times) [2024-06-10 11:49:34,588][32394] Signal inference workers to resume experience collection... (5550 times) [2024-06-10 11:49:34,592][32177] Fps is (10 sec: 44236.8, 60 sec: 45329.2, 300 sec: 44875.5). Total num frames: 383123456. Throughput: 0: 45072.6. Samples: 383244400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-10 11:49:34,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:49:34,615][32415] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-06-10 11:49:34,620][32415] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-06-10 11:49:36,621][32415] Updated weights for policy 0, policy_version 23390 (0.0027) [2024-06-10 11:49:39,592][32177] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 383336448. Throughput: 0: 44800.0. Samples: 383502480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 11:49:39,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:49:40,596][32415] Updated weights for policy 0, policy_version 23400 (0.0022) [2024-06-10 11:49:43,599][32415] Updated weights for policy 0, policy_version 23410 (0.0028) [2024-06-10 11:49:44,592][32177] Fps is (10 sec: 47513.3, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 383598592. Throughput: 0: 44733.3. Samples: 383640440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 11:49:44,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:49:47,624][32415] Updated weights for policy 0, policy_version 23420 (0.0027) [2024-06-10 11:49:49,592][32177] Fps is (10 sec: 47513.0, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 383811584. Throughput: 0: 44892.4. Samples: 383917620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 11:49:49,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:49:50,724][32415] Updated weights for policy 0, policy_version 23430 (0.0031) [2024-06-10 11:49:54,592][32177] Fps is (10 sec: 42598.8, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 384024576. Throughput: 0: 44927.2. Samples: 384182580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-10 11:49:54,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:49:54,918][32415] Updated weights for policy 0, policy_version 23440 (0.0030) [2024-06-10 11:49:58,355][32415] Updated weights for policy 0, policy_version 23450 (0.0036) [2024-06-10 11:49:59,592][32177] Fps is (10 sec: 47514.1, 60 sec: 45056.0, 300 sec: 44986.6). Total num frames: 384286720. Throughput: 0: 44748.9. Samples: 384314320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-10 11:49:59,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:50:02,139][32415] Updated weights for policy 0, policy_version 23460 (0.0026) [2024-06-10 11:50:04,592][32177] Fps is (10 sec: 45874.8, 60 sec: 45329.1, 300 sec: 44875.5). 
Total num frames: 384483328. Throughput: 0: 44847.4. Samples: 384586160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-10 11:50:04,592][32177] Avg episode reward: [(0, '0.296')] [2024-06-10 11:50:05,729][32415] Updated weights for policy 0, policy_version 23470 (0.0042) [2024-06-10 11:50:09,523][32415] Updated weights for policy 0, policy_version 23480 (0.0034) [2024-06-10 11:50:09,592][32177] Fps is (10 sec: 40959.3, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 384696320. Throughput: 0: 44839.4. Samples: 384852460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 11:50:09,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:50:12,997][32415] Updated weights for policy 0, policy_version 23490 (0.0035) [2024-06-10 11:50:14,592][32177] Fps is (10 sec: 45875.5, 60 sec: 44783.0, 300 sec: 44931.1). Total num frames: 384942080. Throughput: 0: 44837.8. Samples: 384990040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 11:50:14,592][32177] Avg episode reward: [(0, '0.292')] [2024-06-10 11:50:16,757][32415] Updated weights for policy 0, policy_version 23500 (0.0033) [2024-06-10 11:50:19,592][32177] Fps is (10 sec: 44237.6, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 385138688. Throughput: 0: 44754.2. Samples: 385258340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 11:50:19,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:50:20,144][32415] Updated weights for policy 0, policy_version 23510 (0.0040) [2024-06-10 11:50:23,849][32415] Updated weights for policy 0, policy_version 23520 (0.0040) [2024-06-10 11:50:24,592][32177] Fps is (10 sec: 44236.3, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 385384448. Throughput: 0: 44987.0. Samples: 385526900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 11:50:24,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:50:27,715][32415] Updated weights for policy 0, policy_version 23530 (0.0028) [2024-06-10 11:50:29,592][32177] Fps is (10 sec: 47513.5, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 385613824. Throughput: 0: 44956.9. Samples: 385663500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 11:50:29,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:50:31,343][32415] Updated weights for policy 0, policy_version 23540 (0.0022) [2024-06-10 11:50:34,592][32177] Fps is (10 sec: 42598.9, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 385810432. Throughput: 0: 44575.2. Samples: 385923500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 11:50:34,592][32177] Avg episode reward: [(0, '0.282')] [2024-06-10 11:50:34,608][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000023549_385826816.pth... [2024-06-10 11:50:34,667][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000022892_375062528.pth [2024-06-10 11:50:35,039][32415] Updated weights for policy 0, policy_version 23550 (0.0039) [2024-06-10 11:50:38,827][32415] Updated weights for policy 0, policy_version 23560 (0.0033) [2024-06-10 11:50:39,592][32177] Fps is (10 sec: 42598.1, 60 sec: 45055.9, 300 sec: 44820.6). Total num frames: 386039808. Throughput: 0: 44657.7. Samples: 386192180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 11:50:39,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:50:42,183][32415] Updated weights for policy 0, policy_version 23570 (0.0028) [2024-06-10 11:50:44,592][32177] Fps is (10 sec: 45875.3, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 386269184. Throughput: 0: 44819.1. Samples: 386331180. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:50:44,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:50:45,850][32415] Updated weights for policy 0, policy_version 23580 (0.0031) [2024-06-10 11:50:49,396][32415] Updated weights for policy 0, policy_version 23590 (0.0030) [2024-06-10 11:50:49,592][32177] Fps is (10 sec: 47513.9, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 386514944. Throughput: 0: 44922.3. Samples: 386607660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:50:49,592][32177] Avg episode reward: [(0, '0.293')] [2024-06-10 11:50:52,899][32415] Updated weights for policy 0, policy_version 23600 (0.0045) [2024-06-10 11:50:54,592][32177] Fps is (10 sec: 47513.1, 60 sec: 45329.0, 300 sec: 44931.0). Total num frames: 386744320. Throughput: 0: 44960.1. Samples: 386875660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-10 11:50:54,592][32177] Avg episode reward: [(0, '0.296')] [2024-06-10 11:50:56,810][32415] Updated weights for policy 0, policy_version 23610 (0.0022) [2024-06-10 11:50:59,592][32177] Fps is (10 sec: 44236.3, 60 sec: 44509.8, 300 sec: 44931.7). Total num frames: 386957312. Throughput: 0: 44923.0. Samples: 387011580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 11:50:59,593][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:51:00,305][32415] Updated weights for policy 0, policy_version 23620 (0.0034) [2024-06-10 11:51:04,297][32415] Updated weights for policy 0, policy_version 23630 (0.0030) [2024-06-10 11:51:04,592][32177] Fps is (10 sec: 40959.9, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 387153920. Throughput: 0: 44825.2. Samples: 387275480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 11:51:04,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:51:07,930][32415] Updated weights for policy 0, policy_version 23640 (0.0033) [2024-06-10 11:51:09,592][32177] Fps is (10 sec: 44236.9, 60 sec: 45056.1, 300 sec: 44931.0). 
Total num frames: 387399680. Throughput: 0: 44791.1. Samples: 387542500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-10 11:51:09,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:51:11,650][32415] Updated weights for policy 0, policy_version 23650 (0.0031) [2024-06-10 11:51:14,592][32177] Fps is (10 sec: 45875.8, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 387612672. Throughput: 0: 44826.7. Samples: 387680700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 11:51:14,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:51:14,947][32415] Updated weights for policy 0, policy_version 23660 (0.0033) [2024-06-10 11:51:15,574][32394] Signal inference workers to stop experience collection... (5600 times) [2024-06-10 11:51:15,620][32415] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-06-10 11:51:15,629][32394] Signal inference workers to resume experience collection... (5600 times) [2024-06-10 11:51:15,637][32415] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-06-10 11:51:18,810][32415] Updated weights for policy 0, policy_version 23670 (0.0029) [2024-06-10 11:51:19,592][32177] Fps is (10 sec: 44236.3, 60 sec: 45055.9, 300 sec: 44931.0). Total num frames: 387842048. Throughput: 0: 45214.9. Samples: 387958180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 11:51:19,593][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:51:21,976][32415] Updated weights for policy 0, policy_version 23680 (0.0033) [2024-06-10 11:51:24,592][32177] Fps is (10 sec: 45874.7, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 388071424. Throughput: 0: 45044.0. Samples: 388219160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 11:51:24,592][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:51:26,262][32415] Updated weights for policy 0, policy_version 23690 (0.0046) [2024-06-10 11:51:29,440][32415] Updated weights for policy 0, policy_version 23700 (0.0027) [2024-06-10 11:51:29,592][32177] Fps is (10 sec: 45875.4, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 388300800. Throughput: 0: 44945.2. Samples: 388353720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-10 11:51:29,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:51:33,251][32415] Updated weights for policy 0, policy_version 23710 (0.0031) [2024-06-10 11:51:34,592][32177] Fps is (10 sec: 44237.0, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 388513792. Throughput: 0: 44664.4. Samples: 388617560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-10 11:51:34,592][32177] Avg episode reward: [(0, '0.296')] [2024-06-10 11:51:37,119][32415] Updated weights for policy 0, policy_version 23720 (0.0025) [2024-06-10 11:51:39,592][32177] Fps is (10 sec: 42599.1, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 388726784. Throughput: 0: 44804.6. Samples: 388891860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-10 11:51:39,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:51:40,490][32415] Updated weights for policy 0, policy_version 23730 (0.0033) [2024-06-10 11:51:44,095][32415] Updated weights for policy 0, policy_version 23740 (0.0030) [2024-06-10 11:51:44,592][32177] Fps is (10 sec: 44236.5, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 388956160. Throughput: 0: 44616.9. Samples: 389019340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 11:51:44,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:51:47,814][32415] Updated weights for policy 0, policy_version 23750 (0.0035) [2024-06-10 11:51:49,592][32177] Fps is (10 sec: 49151.1, 60 sec: 45055.9, 300 sec: 44986.6). 
Total num frames: 389218304. Throughput: 0: 45166.2. Samples: 389307960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 11:51:49,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:51:51,023][32415] Updated weights for policy 0, policy_version 23760 (0.0028) [2024-06-10 11:51:54,592][32177] Fps is (10 sec: 44237.4, 60 sec: 44236.9, 300 sec: 44820.0). Total num frames: 389398528. Throughput: 0: 45100.5. Samples: 389572020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 11:51:54,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:51:55,395][32415] Updated weights for policy 0, policy_version 23770 (0.0032) [2024-06-10 11:51:58,590][32415] Updated weights for policy 0, policy_version 23780 (0.0042) [2024-06-10 11:51:59,592][32177] Fps is (10 sec: 40959.9, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 389627904. Throughput: 0: 44884.7. Samples: 389700520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 11:51:59,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:52:02,369][32415] Updated weights for policy 0, policy_version 23790 (0.0020) [2024-06-10 11:52:04,592][32177] Fps is (10 sec: 47513.0, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 389873664. Throughput: 0: 44701.4. Samples: 389969740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-10 11:52:04,592][32177] Avg episode reward: [(0, '0.297')] [2024-06-10 11:52:06,116][32415] Updated weights for policy 0, policy_version 23800 (0.0042) [2024-06-10 11:52:09,592][32177] Fps is (10 sec: 45875.5, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 390086656. Throughput: 0: 45021.3. Samples: 390245120. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-10 11:52:09,592][32177] Avg episode reward: [(0, '0.293')] [2024-06-10 11:52:09,857][32415] Updated weights for policy 0, policy_version 23810 (0.0036) [2024-06-10 11:52:13,184][32415] Updated weights for policy 0, policy_version 23820 (0.0025) [2024-06-10 11:52:14,592][32177] Fps is (10 sec: 44236.8, 60 sec: 45055.9, 300 sec: 44876.0). Total num frames: 390316032. Throughput: 0: 44848.5. Samples: 390371900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-10 11:52:14,592][32177] Avg episode reward: [(0, '0.296')] [2024-06-10 11:52:17,439][32415] Updated weights for policy 0, policy_version 23830 (0.0039) [2024-06-10 11:52:19,592][32177] Fps is (10 sec: 47514.4, 60 sec: 45329.2, 300 sec: 44931.1). Total num frames: 390561792. Throughput: 0: 45284.1. Samples: 390655340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 11:52:19,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:52:20,154][32415] Updated weights for policy 0, policy_version 23840 (0.0026) [2024-06-10 11:52:24,454][32415] Updated weights for policy 0, policy_version 23850 (0.0030) [2024-06-10 11:52:24,592][32177] Fps is (10 sec: 44236.9, 60 sec: 44782.9, 300 sec: 44819.9). Total num frames: 390758400. Throughput: 0: 45129.7. Samples: 390922700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 11:52:24,592][32177] Avg episode reward: [(0, '0.281')] [2024-06-10 11:52:27,685][32415] Updated weights for policy 0, policy_version 23860 (0.0035) [2024-06-10 11:52:29,592][32177] Fps is (10 sec: 40958.9, 60 sec: 44509.8, 300 sec: 44820.6). Total num frames: 390971392. Throughput: 0: 45040.3. Samples: 391046160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 11:52:29,592][32177] Avg episode reward: [(0, '0.293')] [2024-06-10 11:52:31,787][32415] Updated weights for policy 0, policy_version 23870 (0.0027) [2024-06-10 11:52:34,592][32177] Fps is (10 sec: 47513.4, 60 sec: 45329.0, 300 sec: 44875.5). 
Total num frames: 391233536. Throughput: 0: 44570.2. Samples: 391313620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 11:52:34,593][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:52:34,600][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000023879_391233536.pth... [2024-06-10 11:52:34,649][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000023219_380420096.pth [2024-06-10 11:52:35,079][32415] Updated weights for policy 0, policy_version 23880 (0.0037) [2024-06-10 11:52:39,218][32415] Updated weights for policy 0, policy_version 23890 (0.0033) [2024-06-10 11:52:39,592][32177] Fps is (10 sec: 45876.3, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 391430144. Throughput: 0: 44953.3. Samples: 391594920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 11:52:39,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:52:42,099][32415] Updated weights for policy 0, policy_version 23900 (0.0028) [2024-06-10 11:52:44,594][32177] Fps is (10 sec: 42587.1, 60 sec: 45054.0, 300 sec: 44930.6). Total num frames: 391659520. Throughput: 0: 45021.0. Samples: 391726580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 11:52:44,595][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:52:46,250][32415] Updated weights for policy 0, policy_version 23910 (0.0034) [2024-06-10 11:52:47,683][32394] Signal inference workers to stop experience collection... (5650 times) [2024-06-10 11:52:47,709][32415] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-06-10 11:52:47,744][32394] Signal inference workers to resume experience collection... (5650 times) [2024-06-10 11:52:47,745][32415] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-06-10 11:52:49,113][32415] Updated weights for policy 0, policy_version 23920 (0.0036) [2024-06-10 11:52:49,592][32177] Fps is (10 sec: 49151.9, 60 sec: 45056.1, 300 sec: 44986.6). 
Total num frames: 391921664. Throughput: 0: 45098.8. Samples: 391999180. Policy #0 lag: (min: 2.0, avg: 12.5, max: 20.0) [2024-06-10 11:52:49,592][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:52:53,961][32415] Updated weights for policy 0, policy_version 23930 (0.0043) [2024-06-10 11:52:54,592][32177] Fps is (10 sec: 45888.1, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 392118272. Throughput: 0: 45005.0. Samples: 392270340. Policy #0 lag: (min: 2.0, avg: 12.5, max: 20.0) [2024-06-10 11:52:54,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:52:56,772][32415] Updated weights for policy 0, policy_version 23940 (0.0038) [2024-06-10 11:52:59,592][32177] Fps is (10 sec: 39321.7, 60 sec: 44783.1, 300 sec: 44875.5). Total num frames: 392314880. Throughput: 0: 44895.7. Samples: 392392200. Policy #0 lag: (min: 2.0, avg: 12.5, max: 20.0) [2024-06-10 11:52:59,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:53:01,203][32415] Updated weights for policy 0, policy_version 23950 (0.0038) [2024-06-10 11:53:03,787][32415] Updated weights for policy 0, policy_version 23960 (0.0032) [2024-06-10 11:53:04,596][32177] Fps is (10 sec: 45855.4, 60 sec: 45052.9, 300 sec: 44930.4). Total num frames: 392577024. Throughput: 0: 44623.7. Samples: 392663600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-10 11:53:04,597][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:53:08,272][32415] Updated weights for policy 0, policy_version 23970 (0.0036) [2024-06-10 11:53:09,592][32177] Fps is (10 sec: 50789.4, 60 sec: 45602.1, 300 sec: 45042.1). Total num frames: 392822784. Throughput: 0: 45003.9. Samples: 392947880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-10 11:53:09,592][32177] Avg episode reward: [(0, '0.294')] [2024-06-10 11:53:10,791][32415] Updated weights for policy 0, policy_version 23980 (0.0033) [2024-06-10 11:53:14,592][32177] Fps is (10 sec: 44255.3, 60 sec: 45056.0, 300 sec: 45042.1). 
Total num frames: 393019392. Throughput: 0: 45282.8. Samples: 393083880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-10 11:53:14,592][32177] Avg episode reward: [(0, '0.278')] [2024-06-10 11:53:15,664][32415] Updated weights for policy 0, policy_version 23990 (0.0030) [2024-06-10 11:53:18,030][32415] Updated weights for policy 0, policy_version 24000 (0.0033) [2024-06-10 11:53:19,592][32177] Fps is (10 sec: 40960.1, 60 sec: 44509.7, 300 sec: 44875.5). Total num frames: 393232384. Throughput: 0: 45079.6. Samples: 393342200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-10 11:53:19,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:53:22,973][32415] Updated weights for policy 0, policy_version 24010 (0.0030) [2024-06-10 11:53:24,592][32177] Fps is (10 sec: 47513.8, 60 sec: 45602.1, 300 sec: 45042.1). Total num frames: 393494528. Throughput: 0: 44877.2. Samples: 393614400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 11:53:24,592][32177] Avg episode reward: [(0, '0.292')] [2024-06-10 11:53:25,614][32415] Updated weights for policy 0, policy_version 24020 (0.0026) [2024-06-10 11:53:29,592][32177] Fps is (10 sec: 42598.3, 60 sec: 44783.0, 300 sec: 44931.0). Total num frames: 393658368. Throughput: 0: 45097.7. Samples: 393755860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 11:53:29,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:53:30,286][32415] Updated weights for policy 0, policy_version 24030 (0.0034) [2024-06-10 11:53:32,963][32415] Updated weights for policy 0, policy_version 24040 (0.0036) [2024-06-10 11:53:34,592][32177] Fps is (10 sec: 40959.7, 60 sec: 44509.9, 300 sec: 44819.9). Total num frames: 393904128. Throughput: 0: 44823.4. Samples: 394016240. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 11:53:34,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:53:37,513][32415] Updated weights for policy 0, policy_version 24050 (0.0024) [2024-06-10 11:53:39,592][32177] Fps is (10 sec: 52429.6, 60 sec: 45875.2, 300 sec: 45097.7). Total num frames: 394182656. Throughput: 0: 44885.8. Samples: 394290200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-10 11:53:39,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:53:39,909][32415] Updated weights for policy 0, policy_version 24060 (0.0025) [2024-06-10 11:53:44,592][32177] Fps is (10 sec: 44237.5, 60 sec: 44785.0, 300 sec: 44931.1). Total num frames: 394346496. Throughput: 0: 45481.3. Samples: 394438860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-10 11:53:44,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:53:44,623][32415] Updated weights for policy 0, policy_version 24070 (0.0044) [2024-06-10 11:53:46,290][32394] Signal inference workers to stop experience collection... (5700 times) [2024-06-10 11:53:46,319][32415] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-06-10 11:53:46,343][32394] Signal inference workers to resume experience collection... (5700 times) [2024-06-10 11:53:46,344][32415] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-06-10 11:53:47,274][32415] Updated weights for policy 0, policy_version 24080 (0.0025) [2024-06-10 11:53:49,592][32177] Fps is (10 sec: 40960.1, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 394592256. Throughput: 0: 45160.8. Samples: 394695640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-10 11:53:49,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:53:52,323][32415] Updated weights for policy 0, policy_version 24090 (0.0030) [2024-06-10 11:53:54,592][32177] Fps is (10 sec: 49150.8, 60 sec: 45328.9, 300 sec: 44931.0). Total num frames: 394838016. Throughput: 0: 44682.1. 
Samples: 394958580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 11:53:54,596][32177] Avg episode reward: [(0, '0.285')] [2024-06-10 11:53:54,705][32415] Updated weights for policy 0, policy_version 24100 (0.0030) [2024-06-10 11:53:59,482][32415] Updated weights for policy 0, policy_version 24110 (0.0031) [2024-06-10 11:53:59,596][32177] Fps is (10 sec: 42579.9, 60 sec: 45052.7, 300 sec: 44930.4). Total num frames: 395018240. Throughput: 0: 44657.6. Samples: 395093660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 11:53:59,597][32177] Avg episode reward: [(0, '0.279')] [2024-06-10 11:54:01,892][32415] Updated weights for policy 0, policy_version 24120 (0.0033) [2024-06-10 11:54:04,592][32177] Fps is (10 sec: 42598.8, 60 sec: 44786.0, 300 sec: 44931.0). Total num frames: 395264000. Throughput: 0: 45037.3. Samples: 395368880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 11:54:04,596][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:54:06,636][32415] Updated weights for policy 0, policy_version 24130 (0.0042) [2024-06-10 11:54:09,289][32415] Updated weights for policy 0, policy_version 24140 (0.0026) [2024-06-10 11:54:09,592][32177] Fps is (10 sec: 49173.4, 60 sec: 44783.1, 300 sec: 44931.1). Total num frames: 395509760. Throughput: 0: 44957.0. Samples: 395637460. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-10 11:54:09,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:54:14,263][32415] Updated weights for policy 0, policy_version 24150 (0.0037) [2024-06-10 11:54:14,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44509.8, 300 sec: 44931.0). Total num frames: 395689984. Throughput: 0: 45015.1. Samples: 395781540. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-10 11:54:14,592][32177] Avg episode reward: [(0, '0.284')] [2024-06-10 11:54:16,401][32415] Updated weights for policy 0, policy_version 24160 (0.0027) [2024-06-10 11:54:19,592][32177] Fps is (10 sec: 40959.8, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 395919360. Throughput: 0: 45016.6. Samples: 396041980. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-10 11:54:19,592][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:54:21,575][32415] Updated weights for policy 0, policy_version 24170 (0.0035) [2024-06-10 11:54:23,757][32415] Updated weights for policy 0, policy_version 24180 (0.0032) [2024-06-10 11:54:24,592][32177] Fps is (10 sec: 47514.2, 60 sec: 44509.9, 300 sec: 44931.0). Total num frames: 396165120. Throughput: 0: 44833.3. Samples: 396307700. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-10 11:54:24,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:54:28,607][32415] Updated weights for policy 0, policy_version 24190 (0.0031) [2024-06-10 11:54:29,592][32177] Fps is (10 sec: 47512.8, 60 sec: 45602.1, 300 sec: 44986.5). Total num frames: 396394496. Throughput: 0: 44678.5. Samples: 396449400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 11:54:29,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:54:31,014][32415] Updated weights for policy 0, policy_version 24200 (0.0025) [2024-06-10 11:54:34,592][32177] Fps is (10 sec: 42598.2, 60 sec: 44783.0, 300 sec: 44931.0). Total num frames: 396591104. Throughput: 0: 44926.5. Samples: 396717340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 11:54:34,593][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:54:34,606][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000024206_396591104.pth... 
[2024-06-10 11:54:34,657][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000023549_385826816.pth [2024-06-10 11:54:35,645][32415] Updated weights for policy 0, policy_version 24210 (0.0028) [2024-06-10 11:54:38,134][32415] Updated weights for policy 0, policy_version 24220 (0.0021) [2024-06-10 11:54:39,592][32177] Fps is (10 sec: 44237.6, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 396836864. Throughput: 0: 45111.8. Samples: 396988600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 11:54:39,592][32177] Avg episode reward: [(0, '0.293')] [2024-06-10 11:54:43,223][32415] Updated weights for policy 0, policy_version 24230 (0.0037) [2024-06-10 11:54:44,592][32177] Fps is (10 sec: 47514.0, 60 sec: 45329.0, 300 sec: 44931.0). Total num frames: 397066240. Throughput: 0: 45156.7. Samples: 397125520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-10 11:54:44,592][32177] Avg episode reward: [(0, '0.280')] [2024-06-10 11:54:45,481][32415] Updated weights for policy 0, policy_version 24240 (0.0028) [2024-06-10 11:54:49,592][32177] Fps is (10 sec: 42597.3, 60 sec: 44509.7, 300 sec: 44875.5). Total num frames: 397262848. Throughput: 0: 44832.3. Samples: 397386340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-10 11:54:49,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:54:50,624][32415] Updated weights for policy 0, policy_version 24250 (0.0033) [2024-06-10 11:54:52,993][32415] Updated weights for policy 0, policy_version 24260 (0.0027) [2024-06-10 11:54:54,592][32177] Fps is (10 sec: 44237.1, 60 sec: 44510.1, 300 sec: 44820.0). Total num frames: 397508608. Throughput: 0: 44871.5. Samples: 397656680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-10 11:54:54,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:54:57,352][32394] Signal inference workers to stop experience collection... 
(5750 times) [2024-06-10 11:54:57,353][32394] Signal inference workers to resume experience collection... (5750 times) [2024-06-10 11:54:57,385][32415] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-06-10 11:54:57,385][32415] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-06-10 11:54:57,786][32415] Updated weights for policy 0, policy_version 24270 (0.0037) [2024-06-10 11:54:59,592][32177] Fps is (10 sec: 47514.1, 60 sec: 45332.2, 300 sec: 44931.0). Total num frames: 397737984. Throughput: 0: 44992.5. Samples: 397806200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-10 11:54:59,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:55:00,204][32415] Updated weights for policy 0, policy_version 24280 (0.0037) [2024-06-10 11:55:04,592][32177] Fps is (10 sec: 44236.7, 60 sec: 44783.0, 300 sec: 44931.1). Total num frames: 397950976. Throughput: 0: 45045.8. Samples: 398069040. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-10 11:55:04,592][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:55:04,724][32415] Updated weights for policy 0, policy_version 24290 (0.0046) [2024-06-10 11:55:07,355][32415] Updated weights for policy 0, policy_version 24300 (0.0033) [2024-06-10 11:55:09,592][32177] Fps is (10 sec: 45875.6, 60 sec: 44782.8, 300 sec: 44931.0). Total num frames: 398196736. Throughput: 0: 45068.0. Samples: 398335760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-10 11:55:09,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:55:12,242][32415] Updated weights for policy 0, policy_version 24310 (0.0034) [2024-06-10 11:55:14,592][32177] Fps is (10 sec: 47513.3, 60 sec: 45602.2, 300 sec: 45042.1). Total num frames: 398426112. Throughput: 0: 44957.4. Samples: 398472480. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-10 11:55:14,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:55:14,821][32415] Updated weights for policy 0, policy_version 24320 (0.0023) [2024-06-10 11:55:19,592][32177] Fps is (10 sec: 40960.2, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 398606336. Throughput: 0: 44833.0. Samples: 398734820. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-10 11:55:19,592][32177] Avg episode reward: [(0, '0.296')] [2024-06-10 11:55:19,652][32415] Updated weights for policy 0, policy_version 24330 (0.0028) [2024-06-10 11:55:22,382][32415] Updated weights for policy 0, policy_version 24340 (0.0032) [2024-06-10 11:55:24,596][32177] Fps is (10 sec: 44218.2, 60 sec: 45052.8, 300 sec: 44930.4). Total num frames: 398868480. Throughput: 0: 44649.5. Samples: 398998020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-10 11:55:24,597][32177] Avg episode reward: [(0, '0.287')] [2024-06-10 11:55:26,794][32415] Updated weights for policy 0, policy_version 24350 (0.0032) [2024-06-10 11:55:29,592][32177] Fps is (10 sec: 49152.1, 60 sec: 45056.1, 300 sec: 45042.1). Total num frames: 399097856. Throughput: 0: 44869.4. Samples: 399144640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-10 11:55:29,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:55:29,721][32415] Updated weights for policy 0, policy_version 24360 (0.0035) [2024-06-10 11:55:33,790][32415] Updated weights for policy 0, policy_version 24370 (0.0031) [2024-06-10 11:55:34,592][32177] Fps is (10 sec: 42616.0, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 399294464. Throughput: 0: 45220.5. Samples: 399421260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 11:55:34,592][32177] Avg episode reward: [(0, '0.292')] [2024-06-10 11:55:37,001][32415] Updated weights for policy 0, policy_version 24380 (0.0031) [2024-06-10 11:55:39,592][32177] Fps is (10 sec: 44236.0, 60 sec: 45055.9, 300 sec: 44986.5). 
Total num frames: 399540224. Throughput: 0: 44942.5. Samples: 399679100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 11:55:39,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:55:41,252][32415] Updated weights for policy 0, policy_version 24390 (0.0028) [2024-06-10 11:55:44,367][32415] Updated weights for policy 0, policy_version 24400 (0.0036) [2024-06-10 11:55:44,592][32177] Fps is (10 sec: 49151.9, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 399785984. Throughput: 0: 44784.4. Samples: 399821500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-10 11:55:44,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:55:48,835][32415] Updated weights for policy 0, policy_version 24410 (0.0021) [2024-06-10 11:55:49,596][32177] Fps is (10 sec: 40943.2, 60 sec: 44779.9, 300 sec: 44763.8). Total num frames: 399949824. Throughput: 0: 44761.5. Samples: 400083500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 11:55:49,596][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:55:51,918][32415] Updated weights for policy 0, policy_version 24420 (0.0037) [2024-06-10 11:55:54,592][32177] Fps is (10 sec: 40960.2, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 400195584. Throughput: 0: 44736.8. Samples: 400348920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 11:55:54,592][32177] Avg episode reward: [(0, '0.286')] [2024-06-10 11:55:55,927][32415] Updated weights for policy 0, policy_version 24430 (0.0031) [2024-06-10 11:55:59,259][32415] Updated weights for policy 0, policy_version 24440 (0.0040) [2024-06-10 11:55:59,592][32177] Fps is (10 sec: 47533.6, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 400424960. Throughput: 0: 44901.8. Samples: 400493060. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-10 11:55:59,592][32177] Avg episode reward: [(0, '0.290')] [2024-06-10 11:56:03,001][32415] Updated weights for policy 0, policy_version 24450 (0.0040) [2024-06-10 11:56:04,073][32394] Signal inference workers to stop experience collection... (5800 times) [2024-06-10 11:56:04,132][32415] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-06-10 11:56:04,185][32394] Signal inference workers to resume experience collection... (5800 times) [2024-06-10 11:56:04,186][32415] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-06-10 11:56:04,592][32177] Fps is (10 sec: 45875.3, 60 sec: 45055.9, 300 sec: 44931.0). Total num frames: 400654336. Throughput: 0: 45027.9. Samples: 400761080. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-10 11:56:04,592][32177] Avg episode reward: [(0, '0.288')] [2024-06-10 11:56:06,246][32415] Updated weights for policy 0, policy_version 24460 (0.0033) [2024-06-10 11:56:09,596][32177] Fps is (10 sec: 45855.9, 60 sec: 44779.8, 300 sec: 44985.9). Total num frames: 400883712. Throughput: 0: 45219.6. Samples: 401032900. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-10 11:56:09,597][32177] Avg episode reward: [(0, '0.283')] [2024-06-10 11:56:10,320][32415] Updated weights for policy 0, policy_version 24470 (0.0032) [2024-06-10 11:56:13,779][32415] Updated weights for policy 0, policy_version 24480 (0.0035) [2024-06-10 11:56:14,592][32177] Fps is (10 sec: 47514.0, 60 sec: 45056.0, 300 sec: 45042.1). Total num frames: 401129472. Throughput: 0: 44886.2. Samples: 401164520. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-10 11:56:14,592][32177] Avg episode reward: [(0, '0.291')] [2024-06-10 11:56:17,848][32415] Updated weights for policy 0, policy_version 24490 (0.0029) [2024-06-10 11:56:19,592][32177] Fps is (10 sec: 44255.8, 60 sec: 45329.1, 300 sec: 44931.1). Total num frames: 401326080. Throughput: 0: 44629.0. 
Samples: 401429560. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-10 11:56:19,592][32177] Avg episode reward: [(0, '0.295')] [2024-06-10 11:56:21,209][32415] Updated weights for policy 0, policy_version 24500 (0.0040) [2024-06-10 11:59:23,791][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w22 [2024-06-10 11:59:23,793][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w6 [2024-06-10 11:59:23,794][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w7 [2024-06-10 11:59:23,794][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w2 [2024-06-10 11:59:23,794][32177] Heartbeat reconnected after 186 seconds from LearnerWorker_p0 [2024-06-10 11:59:23,794][32177] Fps is (10 sec: 1905.1, 60 sec: 10959.3, 300 sec: 27813.3). Total num frames: 401489920. Throughput: 0: 8408.5. Samples: 401564300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 11:59:23,795][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:59:23,807][32177] No heartbeat for components: Batcher_0 (186 seconds), InferenceWorker_p0-w0 (186 seconds), RolloutWorker_w0 (186 seconds), RolloutWorker_w1 (186 seconds), RolloutWorker_w3 (186 seconds), RolloutWorker_w4 (186 seconds), RolloutWorker_w5 (186 seconds), RolloutWorker_w8 (186 seconds), RolloutWorker_w9 (186 seconds), RolloutWorker_w10 (186 seconds), RolloutWorker_w11 (186 seconds), RolloutWorker_w12 (186 seconds), RolloutWorker_w13 (186 seconds), RolloutWorker_w14 (186 seconds), RolloutWorker_w15 (186 seconds), RolloutWorker_w16 (186 seconds), RolloutWorker_w17 (186 seconds), RolloutWorker_w18 (186 seconds), RolloutWorker_w19 (186 seconds), RolloutWorker_w20 (186 seconds), RolloutWorker_w21 (186 seconds), RolloutWorker_w23 (186 seconds), RolloutWorker_w24 (186 seconds), RolloutWorker_w25 (186 seconds), RolloutWorker_w26 (186 seconds), RolloutWorker_w27 (186 seconds), RolloutWorker_w28 (186 seconds), RolloutWorker_w29 (186 seconds), RolloutWorker_w30 (186 seconds), 
RolloutWorker_w31 (186 seconds) [2024-06-10 11:59:23,808][32177] Stopping training due to lack of heartbeats from , [2024-06-10 11:59:23,808][32436] Stopping RolloutWorker_w21... [2024-06-10 11:59:23,808][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w5 [2024-06-10 11:59:23,808][32436] Loop rollout_proc21_evt_loop terminating... [2024-06-10 11:59:23,808][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w10 [2024-06-10 11:59:23,809][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w13 [2024-06-10 11:59:23,809][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w11 [2024-06-10 11:59:23,809][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w1 [2024-06-10 11:59:23,809][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w4 [2024-06-10 11:59:23,810][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w15 [2024-06-10 11:59:23,810][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w0 [2024-06-10 11:59:23,810][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w25 [2024-06-10 11:59:23,810][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w8 [2024-06-10 11:59:23,810][32177] Fps is (10 sec: 889.4, 60 sec: 10213.0, 300 sec: 27654.8). Total num frames: 401489920. Throughput: 0: 7950.1. Samples: 401564300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 11:59:23,810][32177] Avg episode reward: [(0, '0.289')] [2024-06-10 11:59:23,810][32418] Stopping RolloutWorker_w2... [2024-06-10 11:59:23,811][32418] Loop rollout_proc2_evt_loop terminating... [2024-06-10 11:59:23,812][32421] Stopping RolloutWorker_w7... [2024-06-10 11:59:23,812][32421] Loop rollout_proc7_evt_loop terminating... [2024-06-10 11:59:23,813][32437] Stopping RolloutWorker_w22... [2024-06-10 11:59:23,813][32437] Loop rollout_proc22_evt_loop terminating... [2024-06-10 11:59:23,814][32420] Stopping RolloutWorker_w5... 
[2024-06-10 11:59:23,814][32417] Stopping RolloutWorker_w1... [2024-06-10 11:59:23,814][32426] Stopping RolloutWorker_w13... [2024-06-10 11:59:23,814][32420] Loop rollout_proc5_evt_loop terminating... [2024-06-10 11:59:23,814][32417] Loop rollout_proc1_evt_loop terminating... [2024-06-10 11:59:23,815][32426] Loop rollout_proc13_evt_loop terminating... [2024-06-10 11:59:23,814][32427] Stopping RolloutWorker_w11... [2024-06-10 11:59:23,814][32425] Stopping RolloutWorker_w10... [2024-06-10 11:59:23,815][32427] Loop rollout_proc11_evt_loop terminating... [2024-06-10 11:59:23,815][32425] Loop rollout_proc10_evt_loop terminating... [2024-06-10 11:59:23,815][32423] Stopping RolloutWorker_w8... [2024-06-10 11:59:23,815][32419] Stopping RolloutWorker_w4... [2024-06-10 11:59:23,815][32423] Loop rollout_proc8_evt_loop terminating... [2024-06-10 11:59:23,815][32435] Stopping RolloutWorker_w20... [2024-06-10 11:59:23,815][32419] Loop rollout_proc4_evt_loop terminating... [2024-06-10 11:59:23,815][32433] Stopping RolloutWorker_w19... [2024-06-10 11:59:23,815][32428] Stopping RolloutWorker_w14... [2024-06-10 11:59:23,815][32430] Stopping RolloutWorker_w15... [2024-06-10 11:59:23,815][32435] Loop rollout_proc20_evt_loop terminating... [2024-06-10 11:59:23,815][32441] Stopping RolloutWorker_w25... [2024-06-10 11:59:23,815][32432] Stopping RolloutWorker_w17... [2024-06-10 11:59:23,815][32433] Loop rollout_proc19_evt_loop terminating... [2024-06-10 11:59:23,815][32428] Loop rollout_proc14_evt_loop terminating... [2024-06-10 11:59:23,815][32430] Loop rollout_proc15_evt_loop terminating... [2024-06-10 11:59:23,815][32441] Loop rollout_proc25_evt_loop terminating... [2024-06-10 11:59:23,815][32432] Loop rollout_proc17_evt_loop terminating... [2024-06-10 11:59:23,815][32443] Stopping RolloutWorker_w27... [2024-06-10 11:59:23,815][32438] Stopping RolloutWorker_w24... [2024-06-10 11:59:23,815][32431] Stopping RolloutWorker_w16... 
[2024-06-10 11:59:23,816][32443] Loop rollout_proc27_evt_loop terminating... [2024-06-10 11:59:23,815][32440] Stopping RolloutWorker_w26... [2024-06-10 11:59:23,815][32444] Stopping RolloutWorker_w29... [2024-06-10 11:59:23,815][32414] Stopping RolloutWorker_w0... [2024-06-10 11:59:23,816][32438] Loop rollout_proc24_evt_loop terminating... [2024-06-10 11:59:23,816][32431] Loop rollout_proc16_evt_loop terminating... [2024-06-10 11:59:23,816][32446] Stopping RolloutWorker_w30... [2024-06-10 11:59:23,816][32444] Loop rollout_proc29_evt_loop terminating... [2024-06-10 11:59:23,816][32440] Loop rollout_proc26_evt_loop terminating... [2024-06-10 11:59:23,816][32414] Loop rollout_proc0_evt_loop terminating... [2024-06-10 11:59:23,816][32446] Loop rollout_proc30_evt_loop terminating... [2024-06-10 11:59:23,816][32442] Stopping RolloutWorker_w28... [2024-06-10 11:59:23,816][32442] Loop rollout_proc28_evt_loop terminating... [2024-06-10 11:59:23,816][32434] Stopping RolloutWorker_w18... [2024-06-10 11:59:23,816][32445] Stopping RolloutWorker_w31... [2024-06-10 11:59:23,816][32434] Loop rollout_proc18_evt_loop terminating... [2024-06-10 11:59:23,816][32445] Loop rollout_proc31_evt_loop terminating... [2024-06-10 11:59:23,822][32429] Stopping RolloutWorker_w12... [2024-06-10 11:59:23,823][32429] Loop rollout_proc12_evt_loop terminating... 
[2024-06-10 11:59:23,826][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w14 [2024-06-10 11:59:23,827][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w20 [2024-06-10 11:59:23,827][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w29 [2024-06-10 11:59:23,827][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w17 [2024-06-10 11:59:23,827][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w27 [2024-06-10 11:59:23,827][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w19 [2024-06-10 11:59:23,827][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w24 [2024-06-10 11:59:23,827][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w16 [2024-06-10 11:59:23,828][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w30 [2024-06-10 11:59:23,828][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w26 [2024-06-10 11:59:23,828][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w12 [2024-06-10 11:59:23,828][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w28 [2024-06-10 11:59:23,828][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w21 [2024-06-10 11:59:23,828][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w18 [2024-06-10 11:59:23,828][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w31 [2024-06-10 11:59:23,829][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w9 [2024-06-10 11:59:23,829][32177] Heartbeat reconnected after 186 seconds from RolloutWorker_w3 [2024-06-10 11:59:23,829][32177] Component RolloutWorker_w21 stopped! 
[2024-06-10 11:59:23,829][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w1', 'RolloutWorker_w2', 'RolloutWorker_w3', 'RolloutWorker_w4', 'RolloutWorker_w5', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,829][32177] Heartbeat reconnected after 186 seconds from InferenceWorker_p0-w0 [2024-06-10 11:59:23,830][32177] Component RolloutWorker_w2 stopped! [2024-06-10 11:59:23,830][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w1', 'RolloutWorker_w3', 'RolloutWorker_w4', 'RolloutWorker_w5', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,830][32177] Component RolloutWorker_w7 stopped! 
[2024-06-10 11:59:23,830][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w1', 'RolloutWorker_w3', 'RolloutWorker_w4', 'RolloutWorker_w5', 'RolloutWorker_w6', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,831][32177] Component RolloutWorker_w22 stopped! [2024-06-10 11:59:23,831][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w1', 'RolloutWorker_w3', 'RolloutWorker_w4', 'RolloutWorker_w5', 'RolloutWorker_w6', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,831][32177] Component RolloutWorker_w5 stopped! 
[2024-06-10 11:59:23,831][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w1', 'RolloutWorker_w3', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,831][32177] Component RolloutWorker_w1 stopped! [2024-06-10 11:59:23,831][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,831][32177] Component RolloutWorker_w13 stopped! 
[2024-06-10 11:59:23,832][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,832][32177] Component RolloutWorker_w11 stopped! [2024-06-10 11:59:23,832][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w12', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,832][32177] Component RolloutWorker_w10 stopped! [2024-06-10 11:59:23,833][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... 
[2024-06-10 11:59:23,833][32177] Component RolloutWorker_w8 stopped! [2024-06-10 11:59:23,833][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,833][32177] Component RolloutWorker_w4 stopped! [2024-06-10 11:59:23,833][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,836][32424] Stopping RolloutWorker_w9... [2024-06-10 11:59:23,838][32424] Loop rollout_proc9_evt_loop terminating... [2024-06-10 11:59:23,838][32394] Stopping Batcher_0... [2024-06-10 11:59:23,838][32394] Loop batcher_evt_loop terminating... [2024-06-10 11:59:23,837][32177] Component RolloutWorker_w20 stopped! 
[2024-06-10 11:59:23,840][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,840][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000024505_401489920.pth... [2024-06-10 11:59:23,841][32177] Component RolloutWorker_w15 stopped! [2024-06-10 11:59:23,841][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w14', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,841][32177] Component RolloutWorker_w14 stopped! [2024-06-10 11:59:23,841][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,841][32177] Component RolloutWorker_w19 stopped! 
[2024-06-10 11:59:23,841][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,842][32177] Component RolloutWorker_w25 stopped! [2024-06-10 11:59:23,842][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,842][32177] Component RolloutWorker_w17 stopped! [2024-06-10 11:59:23,842][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w16', 'RolloutWorker_w18', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,842][32177] Component RolloutWorker_w27 stopped! [2024-06-10 11:59:23,842][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w16', 'RolloutWorker_w18', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... 
[2024-06-10 11:59:23,843][32177] Component RolloutWorker_w24 stopped! [2024-06-10 11:59:23,843][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w16', 'RolloutWorker_w18', 'RolloutWorker_w23', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,843][32177] Component RolloutWorker_w16 stopped! [2024-06-10 11:59:23,843][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w18', 'RolloutWorker_w23', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,843][32177] Component RolloutWorker_w0 stopped! [2024-06-10 11:59:23,844][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w18', 'RolloutWorker_w23', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,844][32177] Component RolloutWorker_w26 stopped! [2024-06-10 11:59:23,844][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w18', 'RolloutWorker_w23', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,844][32177] Component RolloutWorker_w29 stopped! 
[2024-06-10 11:59:23,844][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w18', 'RolloutWorker_w23', 'RolloutWorker_w28', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,845][32177] Component RolloutWorker_w30 stopped! [2024-06-10 11:59:23,845][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w18', 'RolloutWorker_w23', 'RolloutWorker_w28', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,845][32177] Component RolloutWorker_w28 stopped! [2024-06-10 11:59:23,845][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w18', 'RolloutWorker_w23', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,845][32177] Component RolloutWorker_w18 stopped! [2024-06-10 11:59:23,846][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w23', 'RolloutWorker_w31'] to stop... [2024-06-10 11:59:23,846][32177] Component RolloutWorker_w31 stopped! [2024-06-10 11:59:23,846][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w12', 'RolloutWorker_w23'] to stop... [2024-06-10 11:59:23,846][32177] Component RolloutWorker_w12 stopped! [2024-06-10 11:59:23,846][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w9', 'RolloutWorker_w23'] to stop... [2024-06-10 11:59:23,847][32177] Component RolloutWorker_w9 stopped! 
[2024-06-10 11:59:23,847][32177] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w23'] to stop... [2024-06-10 11:59:23,847][32177] Heartbeat reconnected after 186 seconds from Batcher_0 [2024-06-10 11:59:23,847][32177] Component Batcher_0 stopped! [2024-06-10 11:59:23,848][32177] Waiting for ['LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w3', 'RolloutWorker_w6', 'RolloutWorker_w23'] to stop... [2024-06-10 11:59:23,934][32394] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000023879_391233536.pth [2024-06-10 11:59:23,942][32177] Component RolloutWorker_w23 stopped! [2024-06-10 11:59:23,943][32177] Waiting for ['LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w3', 'RolloutWorker_w6'] to stop... [2024-06-10 11:59:23,944][32439] Stopping RolloutWorker_w23... [2024-06-10 11:59:23,944][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000024505_401489920.pth... [2024-06-10 11:59:23,944][32439] Loop rollout_proc23_evt_loop terminating... [2024-06-10 11:59:23,956][32415] Weights refcount: 2 0 [2024-06-10 11:59:23,980][32416] Stopping RolloutWorker_w3... [2024-06-10 11:59:23,981][32416] Loop rollout_proc3_evt_loop terminating... [2024-06-10 11:59:23,984][32415] Stopping InferenceWorker_p0-w0... [2024-06-10 11:59:23,984][32177] Component RolloutWorker_w3 stopped! [2024-06-10 11:59:23,985][32177] Waiting for ['LearnerWorker_p0', 'InferenceWorker_p0-w0', 'RolloutWorker_w6'] to stop... [2024-06-10 11:59:23,985][32415] Loop inference_proc0-0_evt_loop terminating... [2024-06-10 11:59:23,985][32177] Component InferenceWorker_p0-w0 stopped! [2024-06-10 11:59:23,985][32177] Waiting for ['LearnerWorker_p0', 'RolloutWorker_w6'] to stop... [2024-06-10 11:59:23,988][32422] Stopping RolloutWorker_w6... [2024-06-10 11:59:23,989][32422] Loop rollout_proc6_evt_loop terminating... 
[2024-06-10 11:59:23,992][32177] Component RolloutWorker_w6 stopped! [2024-06-10 11:59:23,992][32177] Waiting for ['LearnerWorker_p0'] to stop... [2024-06-10 11:59:24,055][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000024505_401489920.pth... [2024-06-10 11:59:24,238][32394] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000024505_401489920.pth... [2024-06-10 11:59:24,366][32394] Stopping LearnerWorker_p0... [2024-06-10 11:59:24,367][32394] Loop learner_proc0_evt_loop terminating... [2024-06-10 11:59:24,366][32177] Component LearnerWorker_p0 stopped! [2024-06-10 11:59:24,367][32177] Waiting for process learner_proc0 to stop... [2024-06-10 11:59:25,909][32177] Waiting for process inference_proc0-0 to join... [2024-06-10 11:59:25,909][32177] Waiting for process rollout_proc0 to join... [2024-06-10 11:59:25,910][32177] Waiting for process rollout_proc1 to join... [2024-06-10 11:59:25,910][32177] Waiting for process rollout_proc2 to join... [2024-06-10 11:59:25,950][32177] Waiting for process rollout_proc3 to join... [2024-06-10 11:59:25,950][32177] Waiting for process rollout_proc4 to join... [2024-06-10 11:59:25,950][32177] Waiting for process rollout_proc5 to join... [2024-06-10 11:59:25,950][32177] Waiting for process rollout_proc6 to join... [2024-06-10 11:59:25,958][32177] Waiting for process rollout_proc7 to join... [2024-06-10 11:59:25,967][32177] Waiting for process rollout_proc8 to join... [2024-06-10 11:59:25,967][32177] Waiting for process rollout_proc9 to join... [2024-06-10 11:59:25,967][32177] Waiting for process rollout_proc10 to join... [2024-06-10 11:59:25,968][32177] Waiting for process rollout_proc11 to join... [2024-06-10 11:59:25,968][32177] Waiting for process rollout_proc12 to join... [2024-06-10 11:59:25,968][32177] Waiting for process rollout_proc13 to join... [2024-06-10 11:59:25,968][32177] Waiting for process rollout_proc14 to join... 
[2024-06-10 11:59:25,968][32177] Waiting for process rollout_proc15 to join... [2024-06-10 11:59:25,968][32177] Waiting for process rollout_proc16 to join... [2024-06-10 11:59:25,968][32177] Waiting for process rollout_proc17 to join... [2024-06-10 11:59:25,968][32177] Waiting for process rollout_proc18 to join... [2024-06-10 11:59:25,968][32177] Waiting for process rollout_proc19 to join... [2024-06-10 11:59:25,969][32177] Waiting for process rollout_proc20 to join... [2024-06-10 11:59:25,969][32177] Waiting for process rollout_proc21 to join... [2024-06-10 11:59:25,969][32177] Waiting for process rollout_proc22 to join... [2024-06-10 11:59:25,969][32177] Waiting for process rollout_proc23 to join... [2024-06-10 11:59:25,969][32177] Waiting for process rollout_proc24 to join... [2024-06-10 11:59:25,969][32177] Waiting for process rollout_proc25 to join... [2024-06-10 11:59:25,969][32177] Waiting for process rollout_proc26 to join... [2024-06-10 11:59:25,969][32177] Waiting for process rollout_proc27 to join... [2024-06-10 11:59:25,969][32177] Waiting for process rollout_proc28 to join... [2024-06-10 11:59:25,970][32177] Waiting for process rollout_proc29 to join... [2024-06-10 11:59:25,970][32177] Waiting for process rollout_proc30 to join... [2024-06-10 11:59:25,970][32177] Waiting for process rollout_proc31 to join... 
[2024-06-10 11:59:25,970][32177] Batcher 0 profile tree view: batching: 1168.7937, releasing_batches: 50.5313 [2024-06-10 11:59:25,970][32177] InferenceWorker_p0-w0 profile tree view: wait_policy: 0.0005 wait_policy_total: 118.4919 update_model: 104.8437 weight_update: 0.0034 one_step: 0.0326 handle_policy_step: 8813.1497 deserialize: 953.3885, stack: 30.9688, obs_to_device_normalize: 1907.0688, forward: 4701.2898, send_messages: 300.6705 prepare_outputs: 771.6320 to_cpu: 372.2793 [2024-06-10 11:59:25,970][32177] Learner 0 profile tree view: misc: 0.0939, prepare_batch: 327.4295 train: 2969.7579 epoch_init: 0.0856, minibatch_init: 0.0849, losses_postprocess: 9.9600, kl_divergence: 14.6689, after_optimizer: 1077.4544 calculate_losses: 1660.9255 losses_init: 0.0560, forward_head: 169.0042, bptt_initial: 1316.6657, tail: 22.7931, advantages_returns: 3.5714, losses: 69.0342 bptt: 73.6784 bptt_forward_core: 72.6842 update: 198.1883 clip: 19.2591 [2024-06-10 11:59:25,970][32177] RolloutWorker_w0 profile tree view: wait_for_trajectories: 0.5624, enqueue_policy_requests: 203.7042, env_step: 4086.5198, overhead: 131.8047, complete_rollouts: 0.6303 save_policy_outputs: 356.4279 split_output_tensors: 159.0076 [2024-06-10 11:59:25,970][32177] RolloutWorker_w31 profile tree view: wait_for_trajectories: 0.5246, enqueue_policy_requests: 216.8307, env_step: 4479.3893, overhead: 154.7423, complete_rollouts: 146.0848 save_policy_outputs: 373.7138 split_output_tensors: 156.9143 [2024-06-10 11:59:25,970][32177] Loop Runner_EvtLoop terminating... [2024-06-10 11:59:25,970][32177] Runner profile tree view: main_loop: 9168.5278 [2024-06-10 11:59:25,971][32177] Collected {0: 401489920}, FPS: 43790.0 [2024-06-10 11:59:35,644][35745] Saving configuration to /workspace/metta/train_dir/p2.metta.6/config.json... 
[2024-06-10 11:59:35,660][35745] Rollout worker 0 uses device cpu [2024-06-10 11:59:35,660][35745] Rollout worker 1 uses device cpu [2024-06-10 11:59:35,660][35745] Rollout worker 2 uses device cpu [2024-06-10 11:59:35,660][35745] Rollout worker 3 uses device cpu [2024-06-10 11:59:35,660][35745] Rollout worker 4 uses device cpu [2024-06-10 11:59:35,660][35745] Rollout worker 5 uses device cpu [2024-06-10 11:59:35,660][35745] Rollout worker 6 uses device cpu [2024-06-10 11:59:35,660][35745] Rollout worker 7 uses device cpu [2024-06-10 11:59:35,660][35745] Rollout worker 8 uses device cpu [2024-06-10 11:59:35,660][35745] Rollout worker 9 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 10 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 11 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 12 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 13 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 14 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 15 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 16 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 17 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 18 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 19 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 20 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 21 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 22 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 23 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 24 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 25 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 26 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 27 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 28 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 29 uses device cpu 
[2024-06-10 11:59:35,661][35745] Rollout worker 30 uses device cpu [2024-06-10 11:59:35,661][35745] Rollout worker 31 uses device cpu [2024-06-10 11:59:36,167][35745] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-10 11:59:36,167][35745] InferenceWorker_p0-w0: min num requests: 10 [2024-06-10 11:59:36,208][35745] Starting all processes... [2024-06-10 11:59:36,208][35745] Starting process learner_proc0 [2024-06-10 11:59:36,470][35745] Starting all processes... [2024-06-10 11:59:36,472][35745] Starting process inference_proc0-0 [2024-06-10 11:59:36,472][35745] Starting process rollout_proc0 [2024-06-10 11:59:36,472][35745] Starting process rollout_proc1 [2024-06-10 11:59:36,473][35745] Starting process rollout_proc2 [2024-06-10 11:59:36,473][35745] Starting process rollout_proc3 [2024-06-10 11:59:36,473][35745] Starting process rollout_proc4 [2024-06-10 11:59:36,473][35745] Starting process rollout_proc5 [2024-06-10 11:59:36,476][35745] Starting process rollout_proc6 [2024-06-10 11:59:36,477][35745] Starting process rollout_proc7 [2024-06-10 11:59:36,477][35745] Starting process rollout_proc8 [2024-06-10 11:59:36,478][35745] Starting process rollout_proc9 [2024-06-10 11:59:36,478][35745] Starting process rollout_proc10 [2024-06-10 11:59:36,478][35745] Starting process rollout_proc11 [2024-06-10 11:59:36,478][35745] Starting process rollout_proc12 [2024-06-10 11:59:36,478][35745] Starting process rollout_proc13 [2024-06-10 11:59:36,479][35745] Starting process rollout_proc14 [2024-06-10 11:59:36,479][35745] Starting process rollout_proc15 [2024-06-10 11:59:36,480][35745] Starting process rollout_proc16 [2024-06-10 11:59:36,480][35745] Starting process rollout_proc17 [2024-06-10 11:59:36,481][35745] Starting process rollout_proc18 [2024-06-10 11:59:36,482][35745] Starting process rollout_proc19 [2024-06-10 11:59:36,484][35745] Starting process rollout_proc20 [2024-06-10 11:59:36,484][35745] Starting process rollout_proc21 [2024-06-10 
11:59:36,486][35745] Starting process rollout_proc22 [2024-06-10 11:59:36,486][35745] Starting process rollout_proc23 [2024-06-10 11:59:36,490][35745] Starting process rollout_proc24 [2024-06-10 11:59:36,490][35745] Starting process rollout_proc25 [2024-06-10 11:59:36,490][35745] Starting process rollout_proc26 [2024-06-10 11:59:36,493][35745] Starting process rollout_proc27 [2024-06-10 11:59:36,493][35745] Starting process rollout_proc28 [2024-06-10 11:59:36,495][35745] Starting process rollout_proc29 [2024-06-10 11:59:36,497][35745] Starting process rollout_proc30 [2024-06-10 11:59:36,498][35745] Starting process rollout_proc31 [2024-06-10 11:59:38,308][35987] Worker 9 uses CPU cores [9] [2024-06-10 11:59:38,506][36000] Worker 22 uses CPU cores [22] [2024-06-10 11:59:38,518][35977] Worker 0 uses CPU cores [0] [2024-06-10 11:59:38,536][35980] Worker 3 uses CPU cores [3] [2024-06-10 11:59:38,548][35981] Worker 2 uses CPU cores [2] [2024-06-10 11:59:38,551][36007] Worker 29 uses CPU cores [29] [2024-06-10 11:59:38,577][35984] Worker 6 uses CPU cores [6] [2024-06-10 11:59:38,580][35979] Worker 1 uses CPU cores [1] [2024-06-10 11:59:38,630][35978] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-10 11:59:38,630][35978] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-06-10 11:59:38,640][35990] Worker 12 uses CPU cores [12] [2024-06-10 11:59:38,640][35978] Num visible devices: 1 [2024-06-10 11:59:38,663][35986] Worker 8 uses CPU cores [8] [2024-06-10 11:59:38,663][35993] Worker 15 uses CPU cores [15] [2024-06-10 11:59:38,668][35985] Worker 7 uses CPU cores [7] [2024-06-10 11:59:38,684][36006] Worker 27 uses CPU cores [27] [2024-06-10 11:59:38,764][35983] Worker 5 uses CPU cores [5] [2024-06-10 11:59:38,776][36002] Worker 26 uses CPU cores [26] [2024-06-10 11:59:38,792][35957] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-10 11:59:38,792][35957] Set environment var CUDA_VISIBLE_DEVICES 
to '0' (GPU indices [0]) for learning process 0 [2024-06-10 11:59:38,801][35957] Num visible devices: 1 [2024-06-10 11:59:38,808][35982] Worker 4 uses CPU cores [4] [2024-06-10 11:59:38,828][35957] Setting fixed seed 0 [2024-06-10 11:59:38,829][35957] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-10 11:59:38,830][35957] Initializing actor-critic model on device cuda:0 [2024-06-10 11:59:38,832][36004] Worker 23 uses CPU cores [23] [2024-06-10 11:59:38,847][36005] Worker 28 uses CPU cores [28] [2024-06-10 11:59:38,861][35988] Worker 10 uses CPU cores [10] [2024-06-10 11:59:38,875][36001] Worker 24 uses CPU cores [24] [2024-06-10 11:59:38,891][35992] Worker 14 uses CPU cores [14] [2024-06-10 11:59:38,908][35991] Worker 13 uses CPU cores [13] [2024-06-10 11:59:38,914][35999] Worker 21 uses CPU cores [21] [2024-06-10 11:59:38,916][35996] Worker 18 uses CPU cores [18] [2024-06-10 11:59:38,945][35998] Worker 20 uses CPU cores [20] [2024-06-10 11:59:38,950][35997] Worker 19 uses CPU cores [19] [2024-06-10 11:59:38,963][35994] Worker 17 uses CPU cores [17] [2024-06-10 11:59:38,971][36008] Worker 30 uses CPU cores [30] [2024-06-10 11:59:38,976][36009] Worker 31 uses CPU cores [31] [2024-06-10 11:59:38,976][36003] Worker 25 uses CPU cores [25] [2024-06-10 11:59:38,984][35989] Worker 11 uses CPU cores [11] [2024-06-10 11:59:38,986][35995] Worker 16 uses CPU cores [16] [2024-06-10 11:59:39,581][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,581][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,581][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,581][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,581][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] 
RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,582][35957] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:39,585][35957] RunningMeanStd input shape: (1,) [2024-06-10 11:59:39,586][35957] RunningMeanStd input shape: (1,) [2024-06-10 11:59:39,586][35957] RunningMeanStd input shape: (1,) [2024-06-10 11:59:39,586][35957] RunningMeanStd input shape: (1,) [2024-06-10 11:59:39,626][35957] RunningMeanStd input shape: (1,) [2024-06-10 11:59:39,631][35957] Created Actor Critic model with architecture: [2024-06-10 11:59:39,631][35957] SampleFactoryAgentWrapper( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (agent): MettaAgent( (_encoder): MultiFeatureSetEncoder( (feature_set_encoders): ModuleDict( (grid_obs): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (converter): RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): 
RunningMeanStdInPlace() (agent:hp): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv_r1): RunningMeanStdInPlace() (agent:inv_r2): RunningMeanStdInPlace() (agent:inv_r3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (altar:hp): RunningMeanStdInPlace() (altar:state): RunningMeanStdInPlace() (converter:hp): RunningMeanStdInPlace() (converter:state): RunningMeanStdInPlace() (generator:amount): RunningMeanStdInPlace() (generator:hp): RunningMeanStdInPlace() (generator:state): RunningMeanStdInPlace() (wall:hp): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (global_vars): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (_steps): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_action): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_reward): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_reward): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) ) (merged_encoder): Sequential( (0): Linear(in_features=536, 
out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (_core): ModelCoreRNN( (core): GRU(512, 512) ) (_decoder): Decoder( (mlp): Identity() ) (_critic_linear): Linear(in_features=512, out_features=1, bias=True) (_action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) ) [2024-06-10 11:59:39,703][35957] Using optimizer [2024-06-10 11:59:39,889][35957] Loading state from checkpoint /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000024505_401489920.pth... [2024-06-10 11:59:39,903][35957] Loading model from checkpoint [2024-06-10 11:59:39,905][35957] Loaded experiment state at self.train_step=24505, self.env_steps=401489920 [2024-06-10 11:59:39,905][35957] Initialized policy 0 weights for model version 24505 [2024-06-10 11:59:39,907][35957] LearnerWorker_p0 finished initialization! 
[2024-06-10 11:59:39,907][35957] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-10 11:59:40,631][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,631][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,631][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,631][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,631][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,632][35978] RunningMeanStd input shape: (11, 11) [2024-06-10 11:59:40,635][35978] RunningMeanStd input shape: (1,) [2024-06-10 11:59:40,636][35978] RunningMeanStd input shape: (1,) [2024-06-10 11:59:40,636][35978] RunningMeanStd input shape: (1,) [2024-06-10 11:59:40,636][35978] RunningMeanStd input shape: (1,) [2024-06-10 11:59:40,675][35978] RunningMeanStd input shape: (1,) [2024-06-10 11:59:40,697][35745] 
Inference worker 0-0 is ready! [2024-06-10 11:59:40,697][35745] All inference workers are ready! Signal rollout workers to start! [2024-06-10 11:59:43,179][36009] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,197][35998] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,197][35997] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,198][36003] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,199][35999] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,202][36001] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,202][36007] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,204][36000] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,205][35995] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,206][36008] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,207][36004] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,209][35994] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,210][35996] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,247][36005] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,270][35987] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,270][35993] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,271][35980] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,272][35983] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,274][35991] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,275][35985] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,279][35989] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,285][35979] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,287][35981] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,288][35984] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,292][35990] Decorrelating experience for 0 frames... 
[2024-06-10 11:59:43,292][35977] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,292][35986] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,294][35988] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,295][35982] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,300][36002] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,300][35992] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,355][36006] Decorrelating experience for 0 frames... [2024-06-10 11:59:43,402][35745] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 401489920. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-10 11:59:44,594][36009] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,666][35998] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,667][35997] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,674][36003] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,684][36001] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,686][35999] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,690][36008] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,691][35995] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,691][36007] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,699][36004] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,699][36000] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,706][35994] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,708][35996] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,751][35993] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,751][35987] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,762][35983] Decorrelating experience for 256 frames... 
[2024-06-10 11:59:44,764][35980] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,771][35985] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,773][35989] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,777][36005] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,782][35979] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,783][35991] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,791][35984] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,794][35981] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,795][35977] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,798][35982] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,800][35986] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,805][35988] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,807][35990] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,810][35992] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,834][36002] Decorrelating experience for 256 frames... [2024-06-10 11:59:44,871][36006] Decorrelating experience for 256 frames... [2024-06-10 11:59:48,402][35745] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 401489920. Throughput: 0: 8996.5. Samples: 44980. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-10 11:59:51,121][35987] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-06-10 11:59:51,131][35989] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-06-10 11:59:51,132][35994] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-06-10 11:59:51,132][35997] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-06-10 11:59:51,138][35988] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-06-10 11:59:51,140][35980] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-06-10 11:59:51,141][36004] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-06-10 11:59:51,141][36000] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-06-10 11:59:51,152][35998] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-06-10 11:59:51,152][35995] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-06-10 11:59:51,152][36003] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-06-10 11:59:51,153][36009] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-06-10 11:59:51,155][35986] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-06-10 11:59:51,159][35996] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-06-10 11:59:51,160][36001] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-06-10 11:59:51,160][36008] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-06-10 11:59:51,167][35999] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-06-10 11:59:51,171][35991] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-06-10 11:59:51,177][35990] Worker 12, sleep for 56.250 sec to decorrelate experience collection 
[2024-06-10 11:59:51,177][35981] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-06-10 11:59:51,182][35993] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-06-10 11:59:51,186][36007] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-06-10 11:59:51,186][35992] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-06-10 11:59:51,188][35979] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-06-10 11:59:51,202][36005] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-06-10 11:59:51,206][35985] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-06-10 11:59:51,209][36002] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-06-10 11:59:51,242][35957] Signal inference workers to stop experience collection... [2024-06-10 11:59:51,266][35978] InferenceWorker_p0-w0: stopping experience collection [2024-06-10 11:59:51,282][35982] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-06-10 11:59:51,283][36006] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-06-10 11:59:51,285][35983] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-06-10 11:59:51,794][35957] Signal inference workers to resume experience collection... [2024-06-10 11:59:51,795][35978] InferenceWorker_p0-w0: resuming experience collection [2024-06-10 11:59:51,827][35984] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-06-10 11:59:52,897][35978] Updated weights for policy 0, policy_version 24515 (0.0012) [2024-06-10 11:59:53,402][35745] Fps is (10 sec: 16383.9, 60 sec: 16383.9, 300 sec: 16383.9). Total num frames: 401653760. Throughput: 0: 32857.8. Samples: 328580. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-10 11:59:55,899][35979] Worker 1 awakens! 
[2024-06-10 11:59:56,164][35745] Heartbeat connected on Batcher_0 [2024-06-10 11:59:56,166][35745] Heartbeat connected on LearnerWorker_p0 [2024-06-10 11:59:56,173][35745] Heartbeat connected on RolloutWorker_w0 [2024-06-10 11:59:56,173][35745] Heartbeat connected on RolloutWorker_w1 [2024-06-10 11:59:56,222][35745] Heartbeat connected on InferenceWorker_p0-w0 [2024-06-10 11:59:58,402][35745] Fps is (10 sec: 16383.9, 60 sec: 10922.8, 300 sec: 10922.8). Total num frames: 401653760. Throughput: 0: 22127.0. Samples: 331900. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-10 12:00:00,599][35981] Worker 2 awakens! [2024-06-10 12:00:00,607][35745] Heartbeat connected on RolloutWorker_w2 [2024-06-10 12:00:03,402][35745] Fps is (10 sec: 1638.4, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 401670144. Throughput: 0: 17371.0. Samples: 347420. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-10 12:00:05,272][35980] Worker 3 awakens! [2024-06-10 12:00:05,281][35745] Heartbeat connected on RolloutWorker_w3 [2024-06-10 12:00:08,402][35745] Fps is (10 sec: 3276.7, 60 sec: 7864.3, 300 sec: 7864.3). Total num frames: 401686528. Throughput: 0: 14885.6. Samples: 372140. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-10 12:00:10,126][35982] Worker 4 awakens! [2024-06-10 12:00:10,134][35745] Heartbeat connected on RolloutWorker_w4 [2024-06-10 12:00:13,401][35745] Fps is (10 sec: 6553.7, 60 sec: 8192.1, 300 sec: 8192.1). Total num frames: 401735680. Throughput: 0: 13078.1. Samples: 392340. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-06-10 12:00:13,402][35745] Avg episode reward: [(0, '0.279')] [2024-06-10 12:00:14,822][35983] Worker 5 awakens! [2024-06-10 12:00:14,827][35745] Heartbeat connected on RolloutWorker_w5 [2024-06-10 12:00:18,148][35978] Updated weights for policy 0, policy_version 24525 (0.0016) [2024-06-10 12:00:18,401][35745] Fps is (10 sec: 13107.6, 60 sec: 9362.4, 300 sec: 9362.4). Total num frames: 401817600. Throughput: 0: 13506.4. 
Samples: 472720. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-06-10 12:00:18,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:00:20,052][35984] Worker 6 awakens! [2024-06-10 12:00:20,056][35745] Heartbeat connected on RolloutWorker_w6 [2024-06-10 12:00:23,402][35745] Fps is (10 sec: 18022.4, 60 sec: 10649.7, 300 sec: 10649.7). Total num frames: 401915904. Throughput: 0: 14444.6. Samples: 577780. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-06-10 12:00:23,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:00:24,118][35985] Worker 7 awakens! [2024-06-10 12:00:24,125][35745] Heartbeat connected on RolloutWorker_w7 [2024-06-10 12:00:25,818][35978] Updated weights for policy 0, policy_version 24535 (0.0011) [2024-06-10 12:00:28,402][35745] Fps is (10 sec: 19660.7, 60 sec: 11650.9, 300 sec: 11650.9). Total num frames: 402014208. Throughput: 0: 14389.4. Samples: 647520. Policy #0 lag: (min: 0.0, avg: 2.1, max: 6.0) [2024-06-10 12:00:28,402][35745] Avg episode reward: [(0, '0.284')] [2024-06-10 12:00:28,752][35986] Worker 8 awakens! [2024-06-10 12:00:28,757][35745] Heartbeat connected on RolloutWorker_w8 [2024-06-10 12:00:33,376][35987] Worker 9 awakens! [2024-06-10 12:00:33,382][35745] Heartbeat connected on RolloutWorker_w9 [2024-06-10 12:00:33,401][35745] Fps is (10 sec: 21299.3, 60 sec: 12779.6, 300 sec: 12779.6). Total num frames: 402128896. Throughput: 0: 16413.3. Samples: 783580. Policy #0 lag: (min: 0.0, avg: 2.1, max: 6.0) [2024-06-10 12:00:33,408][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:00:33,459][35978] Updated weights for policy 0, policy_version 24545 (0.0013) [2024-06-10 12:00:38,112][35988] Worker 10 awakens! [2024-06-10 12:00:38,116][35745] Heartbeat connected on RolloutWorker_w10 [2024-06-10 12:00:38,402][35745] Fps is (10 sec: 26214.3, 60 sec: 14298.8, 300 sec: 14298.8). Total num frames: 402276352. Throughput: 0: 13653.9. Samples: 943000. 
Policy #0 lag: (min: 0.0, avg: 2.1, max: 6.0) [2024-06-10 12:00:38,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:00:38,857][35978] Updated weights for policy 0, policy_version 24555 (0.0014) [2024-06-10 12:00:42,792][35989] Worker 11 awakens! [2024-06-10 12:00:42,799][35745] Heartbeat connected on RolloutWorker_w11 [2024-06-10 12:00:43,402][35745] Fps is (10 sec: 31129.3, 60 sec: 15837.9, 300 sec: 15837.9). Total num frames: 402440192. Throughput: 0: 15709.3. Samples: 1038820. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-06-10 12:00:43,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:00:44,274][35978] Updated weights for policy 0, policy_version 24565 (0.0014) [2024-06-10 12:00:47,527][35990] Worker 12 awakens! [2024-06-10 12:00:47,533][35745] Heartbeat connected on RolloutWorker_w12 [2024-06-10 12:00:48,402][35745] Fps is (10 sec: 32768.1, 60 sec: 18568.5, 300 sec: 17140.2). Total num frames: 402604032. Throughput: 0: 19688.5. Samples: 1233400. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-06-10 12:00:48,409][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:00:49,390][35978] Updated weights for policy 0, policy_version 24575 (0.0018) [2024-06-10 12:00:52,210][35991] Worker 13 awakens! [2024-06-10 12:00:52,216][35745] Heartbeat connected on RolloutWorker_w13 [2024-06-10 12:00:53,402][35745] Fps is (10 sec: 34406.2, 60 sec: 18841.6, 300 sec: 18490.5). Total num frames: 402784256. Throughput: 0: 23911.2. Samples: 1448140. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-06-10 12:00:53,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:00:53,766][35978] Updated weights for policy 0, policy_version 24585 (0.0016) [2024-06-10 12:00:56,910][35992] Worker 14 awakens! 
[2024-06-10 12:00:56,916][35745] Heartbeat connected on RolloutWorker_w14 [2024-06-10 12:00:58,098][35978] Updated weights for policy 0, policy_version 24595 (0.0021) [2024-06-10 12:00:58,402][35745] Fps is (10 sec: 36044.4, 60 sec: 21845.3, 300 sec: 19660.8). Total num frames: 402964480. Throughput: 0: 25892.8. Samples: 1557520. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0) [2024-06-10 12:00:58,402][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:01:01,592][35993] Worker 15 awakens! [2024-06-10 12:01:01,600][35745] Heartbeat connected on RolloutWorker_w15 [2024-06-10 12:01:02,217][35978] Updated weights for policy 0, policy_version 24605 (0.0018) [2024-06-10 12:01:03,402][35745] Fps is (10 sec: 36045.1, 60 sec: 24576.1, 300 sec: 20684.8). Total num frames: 403144704. Throughput: 0: 28933.7. Samples: 1774740. Policy #0 lag: (min: 0.0, avg: 5.1, max: 10.0) [2024-06-10 12:01:03,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:01:06,252][35995] Worker 16 awakens! [2024-06-10 12:01:06,261][35745] Heartbeat connected on RolloutWorker_w16 [2024-06-10 12:01:07,121][35978] Updated weights for policy 0, policy_version 24615 (0.0030) [2024-06-10 12:01:08,402][35745] Fps is (10 sec: 36044.9, 60 sec: 27306.7, 300 sec: 21588.4). Total num frames: 403324928. Throughput: 0: 31277.7. Samples: 1985280. Policy #0 lag: (min: 0.0, avg: 5.1, max: 10.0) [2024-06-10 12:01:08,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:01:10,878][35994] Worker 17 awakens! [2024-06-10 12:01:10,888][35745] Heartbeat connected on RolloutWorker_w17 [2024-06-10 12:01:11,578][35978] Updated weights for policy 0, policy_version 24625 (0.0028) [2024-06-10 12:01:13,402][35745] Fps is (10 sec: 36044.7, 60 sec: 29491.2, 300 sec: 22391.5). Total num frames: 403505152. Throughput: 0: 32291.0. Samples: 2100620. Policy #0 lag: (min: 0.0, avg: 5.1, max: 10.0) [2024-06-10 12:01:13,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:01:15,636][35996] Worker 18 awakens! 
[2024-06-10 12:01:15,646][35745] Heartbeat connected on RolloutWorker_w18 [2024-06-10 12:01:16,230][35978] Updated weights for policy 0, policy_version 24635 (0.0027) [2024-06-10 12:01:18,402][35745] Fps is (10 sec: 37683.6, 60 sec: 31402.6, 300 sec: 23282.6). Total num frames: 403701760. Throughput: 0: 34239.1. Samples: 2324340. Policy #0 lag: (min: 0.0, avg: 35.8, max: 128.0) [2024-06-10 12:01:18,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:01:20,296][35997] Worker 19 awakens! [2024-06-10 12:01:20,306][35745] Heartbeat connected on RolloutWorker_w19 [2024-06-10 12:01:20,471][35978] Updated weights for policy 0, policy_version 24645 (0.0031) [2024-06-10 12:01:23,402][35745] Fps is (10 sec: 39321.1, 60 sec: 33040.9, 300 sec: 24084.5). Total num frames: 403898368. Throughput: 0: 35876.3. Samples: 2557440. Policy #0 lag: (min: 0.0, avg: 35.8, max: 128.0) [2024-06-10 12:01:23,402][35745] Avg episode reward: [(0, '0.282')] [2024-06-10 12:01:24,833][35978] Updated weights for policy 0, policy_version 24655 (0.0021) [2024-06-10 12:01:25,002][35998] Worker 20 awakens! [2024-06-10 12:01:25,013][35745] Heartbeat connected on RolloutWorker_w20 [2024-06-10 12:01:28,402][35745] Fps is (10 sec: 39321.3, 60 sec: 34679.4, 300 sec: 24810.1). Total num frames: 404094976. Throughput: 0: 36468.0. Samples: 2679880. Policy #0 lag: (min: 0.0, avg: 35.8, max: 128.0) [2024-06-10 12:01:28,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:01:28,581][35978] Updated weights for policy 0, policy_version 24665 (0.0032) [2024-06-10 12:01:29,707][35999] Worker 21 awakens! [2024-06-10 12:01:29,719][35745] Heartbeat connected on RolloutWorker_w21 [2024-06-10 12:01:32,044][35978] Updated weights for policy 0, policy_version 24675 (0.0030) [2024-06-10 12:01:33,402][35745] Fps is (10 sec: 40960.3, 60 sec: 36317.8, 300 sec: 25618.6). Total num frames: 404307968. Throughput: 0: 37586.1. Samples: 2924780. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 16.0) [2024-06-10 12:01:33,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:01:33,412][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000024677_404307968.pth... [2024-06-10 12:01:33,462][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000024206_396591104.pth [2024-06-10 12:01:34,336][36000] Worker 22 awakens! [2024-06-10 12:01:34,349][35745] Heartbeat connected on RolloutWorker_w22 [2024-06-10 12:01:36,935][35978] Updated weights for policy 0, policy_version 24685 (0.0022) [2024-06-10 12:01:38,402][35745] Fps is (10 sec: 42598.1, 60 sec: 37410.1, 300 sec: 26356.9). Total num frames: 404520960. Throughput: 0: 38304.0. Samples: 3171820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 16.0) [2024-06-10 12:01:38,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:01:39,054][36004] Worker 23 awakens! [2024-06-10 12:01:39,066][35745] Heartbeat connected on RolloutWorker_w23 [2024-06-10 12:01:40,354][35978] Updated weights for policy 0, policy_version 24695 (0.0025) [2024-06-10 12:01:43,402][35745] Fps is (10 sec: 42598.2, 60 sec: 38229.3, 300 sec: 27033.6). Total num frames: 404733952. Throughput: 0: 38859.1. Samples: 3306180. Policy #0 lag: (min: 0.0, avg: 8.3, max: 16.0) [2024-06-10 12:01:43,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:01:43,712][36001] Worker 24 awakens! [2024-06-10 12:01:43,724][35745] Heartbeat connected on RolloutWorker_w24 [2024-06-10 12:01:43,797][35978] Updated weights for policy 0, policy_version 24705 (0.0034) [2024-06-10 12:01:47,900][35978] Updated weights for policy 0, policy_version 24715 (0.0024) [2024-06-10 12:01:48,402][35745] Fps is (10 sec: 40960.1, 60 sec: 38775.4, 300 sec: 27525.1). Total num frames: 404930560. Throughput: 0: 39824.0. Samples: 3566820. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 16.0) [2024-06-10 12:01:48,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:01:48,440][36003] Worker 25 awakens! [2024-06-10 12:01:48,453][35745] Heartbeat connected on RolloutWorker_w25 [2024-06-10 12:01:51,592][35978] Updated weights for policy 0, policy_version 24725 (0.0030) [2024-06-10 12:01:53,185][36002] Worker 26 awakens! [2024-06-10 12:01:53,197][35745] Heartbeat connected on RolloutWorker_w26 [2024-06-10 12:01:53,402][35745] Fps is (10 sec: 42598.1, 60 sec: 39594.6, 300 sec: 28230.9). Total num frames: 405159936. Throughput: 0: 40894.5. Samples: 3825540. Policy #0 lag: (min: 0.0, avg: 71.6, max: 211.0) [2024-06-10 12:01:53,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:01:55,306][35978] Updated weights for policy 0, policy_version 24735 (0.0031) [2024-06-10 12:01:57,944][36006] Worker 27 awakens! [2024-06-10 12:01:57,954][35745] Heartbeat connected on RolloutWorker_w27 [2024-06-10 12:01:58,402][35745] Fps is (10 sec: 45875.5, 60 sec: 40413.9, 300 sec: 28884.4). Total num frames: 405389312. Throughput: 0: 41096.9. Samples: 3949980. Policy #0 lag: (min: 0.0, avg: 71.6, max: 211.0) [2024-06-10 12:01:58,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:01:58,784][35978] Updated weights for policy 0, policy_version 24745 (0.0028) [2024-06-10 12:02:02,040][35978] Updated weights for policy 0, policy_version 24755 (0.0040) [2024-06-10 12:02:02,552][36005] Worker 28 awakens! [2024-06-10 12:02:02,565][35745] Heartbeat connected on RolloutWorker_w28 [2024-06-10 12:02:03,402][35745] Fps is (10 sec: 44237.8, 60 sec: 40960.0, 300 sec: 29374.2). Total num frames: 405602304. Throughput: 0: 42133.8. Samples: 4220360. Policy #0 lag: (min: 0.0, avg: 71.6, max: 211.0) [2024-06-10 12:02:03,402][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:02:06,533][35978] Updated weights for policy 0, policy_version 24765 (0.0034) [2024-06-10 12:02:07,223][36007] Worker 29 awakens! 
[2024-06-10 12:02:07,237][35745] Heartbeat connected on RolloutWorker_w29 [2024-06-10 12:02:08,402][35745] Fps is (10 sec: 44236.2, 60 sec: 41779.2, 300 sec: 29943.2). Total num frames: 405831680. Throughput: 0: 42885.4. Samples: 4487280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-10 12:02:08,402][35745] Avg episode reward: [(0, '0.284')] [2024-06-10 12:02:09,400][35978] Updated weights for policy 0, policy_version 24775 (0.0031) [2024-06-10 12:02:11,885][36008] Worker 30 awakens! [2024-06-10 12:02:11,900][35745] Heartbeat connected on RolloutWorker_w30 [2024-06-10 12:02:13,402][35745] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 30365.0). Total num frames: 406044672. Throughput: 0: 43318.1. Samples: 4629200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-10 12:02:13,402][35745] Avg episode reward: [(0, '0.280')] [2024-06-10 12:02:13,654][35978] Updated weights for policy 0, policy_version 24785 (0.0023) [2024-06-10 12:02:14,670][35957] Signal inference workers to stop experience collection... (50 times) [2024-06-10 12:02:14,689][35978] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-06-10 12:02:14,782][35957] Signal inference workers to resume experience collection... (50 times) [2024-06-10 12:02:14,782][35978] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-06-10 12:02:16,564][36009] Worker 31 awakens! [2024-06-10 12:02:16,578][35745] Heartbeat connected on RolloutWorker_w31 [2024-06-10 12:02:16,756][35978] Updated weights for policy 0, policy_version 24795 (0.0028) [2024-06-10 12:02:18,401][35745] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 30865.4). Total num frames: 406274048. Throughput: 0: 43851.7. Samples: 4898100. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-10 12:02:18,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:02:20,907][35978] Updated weights for policy 0, policy_version 24805 (0.0036) [2024-06-10 12:02:23,402][35745] Fps is (10 sec: 49152.6, 60 sec: 43963.9, 300 sec: 31539.2). Total num frames: 406536192. Throughput: 0: 44472.5. Samples: 5173080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 12:02:23,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:02:23,884][35978] Updated weights for policy 0, policy_version 24815 (0.0026) [2024-06-10 12:02:28,061][35978] Updated weights for policy 0, policy_version 24825 (0.0036) [2024-06-10 12:02:28,404][35745] Fps is (10 sec: 45864.4, 60 sec: 43962.1, 300 sec: 31774.6). Total num frames: 406732800. Throughput: 0: 44752.5. Samples: 5320140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 12:02:28,404][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:02:31,161][35978] Updated weights for policy 0, policy_version 24835 (0.0037) [2024-06-10 12:02:33,404][35745] Fps is (10 sec: 44228.2, 60 sec: 44508.5, 300 sec: 32285.8). Total num frames: 406978560. Throughput: 0: 45095.9. Samples: 5596220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 12:02:33,404][35745] Avg episode reward: [(0, '0.284')] [2024-06-10 12:02:35,245][35978] Updated weights for policy 0, policy_version 24845 (0.0030) [2024-06-10 12:02:38,103][35978] Updated weights for policy 0, policy_version 24855 (0.0048) [2024-06-10 12:02:38,403][35745] Fps is (10 sec: 49155.3, 60 sec: 45054.9, 300 sec: 32767.7). Total num frames: 407224320. Throughput: 0: 45362.6. Samples: 5866920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 12:02:38,404][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:02:42,114][35978] Updated weights for policy 0, policy_version 24865 (0.0040) [2024-06-10 12:02:43,402][35745] Fps is (10 sec: 45884.4, 60 sec: 45056.1, 300 sec: 33041.1). 
Total num frames: 407437312. Throughput: 0: 45645.4. Samples: 6004020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 12:02:43,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:02:45,009][35978] Updated weights for policy 0, policy_version 24875 (0.0031) [2024-06-10 12:02:48,404][35745] Fps is (10 sec: 44233.6, 60 sec: 45600.4, 300 sec: 33387.5). Total num frames: 407666688. Throughput: 0: 45826.5. Samples: 6282660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 12:02:48,404][35745] Avg episode reward: [(0, '0.283')] [2024-06-10 12:02:49,262][35978] Updated weights for policy 0, policy_version 24885 (0.0025) [2024-06-10 12:02:52,144][35978] Updated weights for policy 0, policy_version 24895 (0.0028) [2024-06-10 12:02:53,402][35745] Fps is (10 sec: 45874.6, 60 sec: 45602.2, 300 sec: 33716.6). Total num frames: 407896064. Throughput: 0: 46027.6. Samples: 6558520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 12:02:53,402][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:02:56,519][35978] Updated weights for policy 0, policy_version 24905 (0.0038) [2024-06-10 12:02:58,401][35745] Fps is (10 sec: 45886.3, 60 sec: 45602.2, 300 sec: 34028.4). Total num frames: 408125440. Throughput: 0: 45947.3. Samples: 6696820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 12:02:58,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:02:59,301][35978] Updated weights for policy 0, policy_version 24915 (0.0028) [2024-06-10 12:03:03,402][35745] Fps is (10 sec: 45875.5, 60 sec: 45875.2, 300 sec: 34324.5). Total num frames: 408354816. Throughput: 0: 46137.7. Samples: 6974300. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 12:03:03,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:03:03,697][35978] Updated weights for policy 0, policy_version 24925 (0.0031) [2024-06-10 12:03:06,500][35978] Updated weights for policy 0, policy_version 24935 (0.0027) [2024-06-10 12:03:08,402][35745] Fps is (10 sec: 45874.8, 60 sec: 45875.3, 300 sec: 34606.2). Total num frames: 408584192. Throughput: 0: 46208.4. Samples: 7252460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-10 12:03:08,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:03:10,482][35978] Updated weights for policy 0, policy_version 24945 (0.0031) [2024-06-10 12:03:13,401][35745] Fps is (10 sec: 49152.4, 60 sec: 46694.5, 300 sec: 35030.6). Total num frames: 408846336. Throughput: 0: 46120.2. Samples: 7395440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:03:13,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:03:13,495][35978] Updated weights for policy 0, policy_version 24955 (0.0028) [2024-06-10 12:03:17,315][35978] Updated weights for policy 0, policy_version 24965 (0.0040) [2024-06-10 12:03:18,401][35745] Fps is (10 sec: 45875.6, 60 sec: 46148.3, 300 sec: 35130.4). Total num frames: 409042944. Throughput: 0: 46191.9. Samples: 7674760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:03:18,402][35745] Avg episode reward: [(0, '0.283')] [2024-06-10 12:03:20,587][35978] Updated weights for policy 0, policy_version 24975 (0.0037) [2024-06-10 12:03:23,402][35745] Fps is (10 sec: 44236.7, 60 sec: 45875.2, 300 sec: 35449.1). Total num frames: 409288704. Throughput: 0: 46228.4. Samples: 7947120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:03:23,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:03:24,958][35978] Updated weights for policy 0, policy_version 24985 (0.0037) [2024-06-10 12:03:27,809][35978] Updated weights for policy 0, policy_version 24995 (0.0031) [2024-06-10 12:03:28,402][35745] Fps is (10 sec: 47513.5, 60 sec: 46423.1, 300 sec: 35680.8). Total num frames: 409518080. Throughput: 0: 46253.8. Samples: 8085440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:03:28,402][35745] Avg episode reward: [(0, '0.285')] [2024-06-10 12:03:32,144][35978] Updated weights for policy 0, policy_version 25005 (0.0036) [2024-06-10 12:03:33,402][35745] Fps is (10 sec: 45874.2, 60 sec: 46149.6, 300 sec: 35902.3). Total num frames: 409747456. Throughput: 0: 46134.2. Samples: 8358600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-10 12:03:33,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:03:33,419][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000025009_409747456.pth... [2024-06-10 12:03:33,480][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000024505_401489920.pth [2024-06-10 12:03:35,263][35978] Updated weights for policy 0, policy_version 25015 (0.0035) [2024-06-10 12:03:38,402][35745] Fps is (10 sec: 45871.3, 60 sec: 45875.8, 300 sec: 36114.4). Total num frames: 409976832. Throughput: 0: 46113.0. Samples: 8633640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-10 12:03:38,403][35745] Avg episode reward: [(0, '0.285')] [2024-06-10 12:03:39,079][35978] Updated weights for policy 0, policy_version 25025 (0.0037) [2024-06-10 12:03:42,274][35978] Updated weights for policy 0, policy_version 25035 (0.0022) [2024-06-10 12:03:43,402][35745] Fps is (10 sec: 45875.7, 60 sec: 46148.2, 300 sec: 36317.9). Total num frames: 410206208. Throughput: 0: 46111.0. Samples: 8771820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-10 12:03:43,405][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:03:45,881][35978] Updated weights for policy 0, policy_version 25045 (0.0039) [2024-06-10 12:03:48,402][35745] Fps is (10 sec: 47517.0, 60 sec: 46423.1, 300 sec: 36579.8). Total num frames: 410451968. Throughput: 0: 46227.9. Samples: 9054560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-10 12:03:48,408][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:03:49,114][35978] Updated weights for policy 0, policy_version 25055 (0.0031) [2024-06-10 12:03:52,998][35978] Updated weights for policy 0, policy_version 25065 (0.0032) [2024-06-10 12:03:53,402][35745] Fps is (10 sec: 45875.5, 60 sec: 46148.3, 300 sec: 36700.2). Total num frames: 410664960. Throughput: 0: 46147.6. Samples: 9329100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-10 12:03:53,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:03:54,414][35957] Signal inference workers to stop experience collection... (100 times) [2024-06-10 12:03:54,448][35978] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-06-10 12:03:54,470][35957] Signal inference workers to resume experience collection... (100 times) [2024-06-10 12:03:54,470][35978] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-06-10 12:03:56,508][35978] Updated weights for policy 0, policy_version 25075 (0.0039) [2024-06-10 12:03:58,402][35745] Fps is (10 sec: 45875.7, 60 sec: 46421.3, 300 sec: 36944.3). Total num frames: 410910720. Throughput: 0: 46101.3. Samples: 9470000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-10 12:03:58,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:04:00,196][35978] Updated weights for policy 0, policy_version 25085 (0.0041) [2024-06-10 12:04:03,401][35745] Fps is (10 sec: 47513.9, 60 sec: 46421.4, 300 sec: 37116.1). Total num frames: 411140096. Throughput: 0: 46048.0. Samples: 9746920. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-10 12:04:03,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:04:03,545][35978] Updated weights for policy 0, policy_version 25095 (0.0039) [2024-06-10 12:04:07,330][35978] Updated weights for policy 0, policy_version 25105 (0.0036) [2024-06-10 12:04:08,402][35745] Fps is (10 sec: 45875.1, 60 sec: 46421.3, 300 sec: 37281.4). Total num frames: 411369472. Throughput: 0: 46054.1. Samples: 10019560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 12:04:08,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:04:10,663][35978] Updated weights for policy 0, policy_version 25115 (0.0028) [2024-06-10 12:04:13,402][35745] Fps is (10 sec: 44236.6, 60 sec: 45602.1, 300 sec: 37379.8). Total num frames: 411582464. Throughput: 0: 45992.4. Samples: 10155100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 12:04:13,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:04:14,292][35978] Updated weights for policy 0, policy_version 25125 (0.0043) [2024-06-10 12:04:17,566][35978] Updated weights for policy 0, policy_version 25135 (0.0038) [2024-06-10 12:04:18,401][35745] Fps is (10 sec: 45875.7, 60 sec: 46421.3, 300 sec: 37593.9). Total num frames: 411828224. Throughput: 0: 46116.3. Samples: 10433820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-10 12:04:18,402][35745] Avg episode reward: [(0, '0.281')] [2024-06-10 12:04:21,500][35978] Updated weights for policy 0, policy_version 25145 (0.0037) [2024-06-10 12:04:23,402][35745] Fps is (10 sec: 49151.7, 60 sec: 46421.3, 300 sec: 37800.3). Total num frames: 412073984. Throughput: 0: 46287.0. Samples: 10716520. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-10 12:04:23,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:04:24,918][35978] Updated weights for policy 0, policy_version 25155 (0.0028) [2024-06-10 12:04:28,367][35978] Updated weights for policy 0, policy_version 25165 (0.0037) [2024-06-10 12:04:28,402][35745] Fps is (10 sec: 47513.1, 60 sec: 46421.3, 300 sec: 37941.9). Total num frames: 412303360. Throughput: 0: 46306.7. Samples: 10855620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-10 12:04:28,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:04:31,929][35978] Updated weights for policy 0, policy_version 25175 (0.0041) [2024-06-10 12:04:33,402][35745] Fps is (10 sec: 42598.6, 60 sec: 45875.3, 300 sec: 37965.7). Total num frames: 412499968. Throughput: 0: 46024.6. Samples: 11125660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-10 12:04:33,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:04:35,782][35978] Updated weights for policy 0, policy_version 25185 (0.0035) [2024-06-10 12:04:38,402][35745] Fps is (10 sec: 45875.0, 60 sec: 46421.9, 300 sec: 38210.8). Total num frames: 412762112. Throughput: 0: 46022.6. Samples: 11400120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 12:04:38,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:04:39,352][35978] Updated weights for policy 0, policy_version 25195 (0.0040) [2024-06-10 12:04:42,937][35978] Updated weights for policy 0, policy_version 25205 (0.0030) [2024-06-10 12:04:43,402][35745] Fps is (10 sec: 47513.3, 60 sec: 46148.3, 300 sec: 38932.8). Total num frames: 412975104. Throughput: 0: 45959.9. Samples: 11538200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 12:04:43,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:04:46,307][35978] Updated weights for policy 0, policy_version 25215 (0.0037) [2024-06-10 12:04:48,401][35745] Fps is (10 sec: 40960.5, 60 sec: 45329.2, 300 sec: 39043.9). 
Total num frames: 413171712. Throughput: 0: 45870.7. Samples: 11811100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 12:04:48,402][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:04:49,934][35978] Updated weights for policy 0, policy_version 25225 (0.0030) [2024-06-10 12:04:53,401][35745] Fps is (10 sec: 44237.3, 60 sec: 45875.2, 300 sec: 39877.0). Total num frames: 413417472. Throughput: 0: 45806.8. Samples: 12080860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-10 12:04:53,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:04:53,688][35978] Updated weights for policy 0, policy_version 25235 (0.0027) [2024-06-10 12:04:57,139][35978] Updated weights for policy 0, policy_version 25245 (0.0035) [2024-06-10 12:04:58,401][35745] Fps is (10 sec: 49152.1, 60 sec: 45875.3, 300 sec: 40654.6). Total num frames: 413663232. Throughput: 0: 45796.1. Samples: 12215920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 12:04:58,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:05:00,919][35978] Updated weights for policy 0, policy_version 25255 (0.0027) [2024-06-10 12:05:03,404][35745] Fps is (10 sec: 47502.1, 60 sec: 45873.4, 300 sec: 41376.2). Total num frames: 413892608. Throughput: 0: 45807.3. Samples: 12495260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 12:05:03,405][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:05:04,485][35978] Updated weights for policy 0, policy_version 25265 (0.0028) [2024-06-10 12:05:08,010][35978] Updated weights for policy 0, policy_version 25275 (0.0034) [2024-06-10 12:05:08,402][35745] Fps is (10 sec: 45875.0, 60 sec: 45875.3, 300 sec: 41987.5). Total num frames: 414121984. Throughput: 0: 45600.1. Samples: 12768520. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 12:05:08,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:05:11,360][35978] Updated weights for policy 0, policy_version 25285 (0.0028) [2024-06-10 12:05:13,402][35745] Fps is (10 sec: 45885.9, 60 sec: 46148.2, 300 sec: 42487.3). Total num frames: 414351360. Throughput: 0: 45607.6. Samples: 12907960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:05:13,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:05:15,125][35978] Updated weights for policy 0, policy_version 25295 (0.0029) [2024-06-10 12:05:18,403][35745] Fps is (10 sec: 45868.9, 60 sec: 45874.1, 300 sec: 42931.4). Total num frames: 414580736. Throughput: 0: 45820.9. Samples: 13187660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:05:18,404][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:05:18,427][35978] Updated weights for policy 0, policy_version 25305 (0.0025) [2024-06-10 12:05:21,775][35957] Signal inference workers to stop experience collection... (150 times) [2024-06-10 12:05:21,778][35957] Signal inference workers to resume experience collection... (150 times) [2024-06-10 12:05:21,800][35978] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-06-10 12:05:21,800][35978] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-06-10 12:05:22,054][35978] Updated weights for policy 0, policy_version 25315 (0.0034) [2024-06-10 12:05:23,402][35745] Fps is (10 sec: 42598.1, 60 sec: 45056.0, 300 sec: 43264.8). Total num frames: 414777344. Throughput: 0: 45676.9. Samples: 13455580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:05:23,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:05:25,617][35978] Updated weights for policy 0, policy_version 25325 (0.0043) [2024-06-10 12:05:28,402][35745] Fps is (10 sec: 44238.7, 60 sec: 45328.4, 300 sec: 43709.0). Total num frames: 415023104. Throughput: 0: 45465.4. Samples: 13584180. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:05:28,408][35745] Avg episode reward: [(0, '0.285')] [2024-06-10 12:05:29,358][35978] Updated weights for policy 0, policy_version 25335 (0.0027) [2024-06-10 12:05:32,869][35978] Updated weights for policy 0, policy_version 25345 (0.0030) [2024-06-10 12:05:33,401][35745] Fps is (10 sec: 49152.7, 60 sec: 46148.3, 300 sec: 44042.4). Total num frames: 415268864. Throughput: 0: 45712.0. Samples: 13868140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 12:05:33,402][35745] Avg episode reward: [(0, '0.284')] [2024-06-10 12:05:33,465][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000025347_415285248.pth... [2024-06-10 12:05:33,516][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000024677_404307968.pth [2024-06-10 12:05:36,554][35978] Updated weights for policy 0, policy_version 25355 (0.0030) [2024-06-10 12:05:38,401][35745] Fps is (10 sec: 45879.7, 60 sec: 45329.2, 300 sec: 44209.0). Total num frames: 415481856. Throughput: 0: 45985.3. Samples: 14150200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 12:05:38,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:05:40,034][35978] Updated weights for policy 0, policy_version 25365 (0.0025) [2024-06-10 12:05:43,402][35745] Fps is (10 sec: 44236.4, 60 sec: 45602.1, 300 sec: 44431.2). Total num frames: 415711232. Throughput: 0: 45851.4. Samples: 14279240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 12:05:43,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:05:43,649][35978] Updated weights for policy 0, policy_version 25375 (0.0026) [2024-06-10 12:05:46,946][35978] Updated weights for policy 0, policy_version 25385 (0.0024) [2024-06-10 12:05:48,402][35745] Fps is (10 sec: 47512.9, 60 sec: 46421.2, 300 sec: 44653.3). Total num frames: 415956992. Throughput: 0: 45764.1. Samples: 14554540. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-10 12:05:48,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:05:50,771][35978] Updated weights for policy 0, policy_version 25395 (0.0032) [2024-06-10 12:05:53,402][35745] Fps is (10 sec: 45875.3, 60 sec: 45875.1, 300 sec: 44764.4). Total num frames: 416169984. Throughput: 0: 45967.9. Samples: 14837080. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-10 12:05:53,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:05:54,066][35978] Updated weights for policy 0, policy_version 25405 (0.0027) [2024-06-10 12:05:57,792][35978] Updated weights for policy 0, policy_version 25415 (0.0044) [2024-06-10 12:05:58,402][35745] Fps is (10 sec: 44236.5, 60 sec: 45602.0, 300 sec: 44931.0). Total num frames: 416399360. Throughput: 0: 45760.3. Samples: 14967180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-10 12:05:58,402][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:06:01,322][35978] Updated weights for policy 0, policy_version 25425 (0.0023) [2024-06-10 12:06:03,405][35745] Fps is (10 sec: 47495.2, 60 sec: 45874.0, 300 sec: 45152.6). Total num frames: 416645120. Throughput: 0: 45650.7. Samples: 15242060. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-10 12:06:03,406][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:06:04,967][35978] Updated weights for policy 0, policy_version 25435 (0.0033) [2024-06-10 12:06:08,402][35745] Fps is (10 sec: 47514.3, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 416874496. Throughput: 0: 45895.6. Samples: 15520880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 12:06:08,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:06:08,771][35978] Updated weights for policy 0, policy_version 25445 (0.0036) [2024-06-10 12:06:12,235][35978] Updated weights for policy 0, policy_version 25455 (0.0027) [2024-06-10 12:06:13,401][35745] Fps is (10 sec: 42615.2, 60 sec: 45329.1, 300 sec: 45319.8). 
Total num frames: 417071104. Throughput: 0: 46061.9. Samples: 15656920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 12:06:13,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:06:15,753][35978] Updated weights for policy 0, policy_version 25465 (0.0027) [2024-06-10 12:06:18,401][35745] Fps is (10 sec: 44237.0, 60 sec: 45603.2, 300 sec: 45486.5). Total num frames: 417316864. Throughput: 0: 45874.2. Samples: 15932480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 12:06:18,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:06:19,751][35978] Updated weights for policy 0, policy_version 25475 (0.0036) [2024-06-10 12:06:22,619][35978] Updated weights for policy 0, policy_version 25485 (0.0031) [2024-06-10 12:06:23,402][35745] Fps is (10 sec: 49151.9, 60 sec: 46421.4, 300 sec: 45653.1). Total num frames: 417562624. Throughput: 0: 45598.2. Samples: 16202120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 12:06:23,402][35745] Avg episode reward: [(0, '0.283')] [2024-06-10 12:06:26,890][35978] Updated weights for policy 0, policy_version 25495 (0.0029) [2024-06-10 12:06:28,402][35745] Fps is (10 sec: 45874.8, 60 sec: 45875.9, 300 sec: 45653.1). Total num frames: 417775616. Throughput: 0: 45916.0. Samples: 16345460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 12:06:28,403][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:06:28,719][35957] Signal inference workers to stop experience collection... (200 times) [2024-06-10 12:06:28,719][35957] Signal inference workers to resume experience collection... (200 times) [2024-06-10 12:06:28,742][35978] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-06-10 12:06:28,742][35978] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-06-10 12:06:30,075][35978] Updated weights for policy 0, policy_version 25505 (0.0027) [2024-06-10 12:06:33,402][35745] Fps is (10 sec: 44236.5, 60 sec: 45602.1, 300 sec: 45708.6). 
Total num frames: 418004992. Throughput: 0: 45850.7. Samples: 16617820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 12:06:33,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:06:33,966][35978] Updated weights for policy 0, policy_version 25515 (0.0043) [2024-06-10 12:06:37,606][35978] Updated weights for policy 0, policy_version 25525 (0.0040) [2024-06-10 12:06:38,401][35745] Fps is (10 sec: 45875.6, 60 sec: 45875.2, 300 sec: 45764.2). Total num frames: 418234368. Throughput: 0: 45616.5. Samples: 16889820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 12:06:38,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:06:41,185][35978] Updated weights for policy 0, policy_version 25535 (0.0028) [2024-06-10 12:06:43,402][35745] Fps is (10 sec: 47513.1, 60 sec: 46148.2, 300 sec: 45930.7). Total num frames: 418480128. Throughput: 0: 45708.9. Samples: 17024080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 12:06:43,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:06:44,752][35978] Updated weights for policy 0, policy_version 25545 (0.0039) [2024-06-10 12:06:48,401][35745] Fps is (10 sec: 42598.6, 60 sec: 45056.1, 300 sec: 45764.2). Total num frames: 418660352. Throughput: 0: 45703.6. Samples: 17298540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 12:06:48,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:06:48,699][35978] Updated weights for policy 0, policy_version 25555 (0.0036) [2024-06-10 12:06:51,835][35978] Updated weights for policy 0, policy_version 25565 (0.0031) [2024-06-10 12:06:53,402][35745] Fps is (10 sec: 44237.4, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 418922496. Throughput: 0: 45516.0. Samples: 17569100. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 12:06:53,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:06:55,759][35978] Updated weights for policy 0, policy_version 25575 (0.0031) [2024-06-10 12:06:58,404][35745] Fps is (10 sec: 49140.1, 60 sec: 45873.5, 300 sec: 45930.4). Total num frames: 419151872. Throughput: 0: 45406.5. Samples: 17700320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-10 12:06:58,405][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:06:59,271][35978] Updated weights for policy 0, policy_version 25585 (0.0042) [2024-06-10 12:07:02,772][35978] Updated weights for policy 0, policy_version 25595 (0.0030) [2024-06-10 12:07:03,402][35745] Fps is (10 sec: 42598.7, 60 sec: 45059.0, 300 sec: 45819.7). Total num frames: 419348480. Throughput: 0: 45429.3. Samples: 17976800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-10 12:07:03,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:07:06,750][35978] Updated weights for policy 0, policy_version 25605 (0.0024) [2024-06-10 12:07:08,402][35745] Fps is (10 sec: 42608.3, 60 sec: 45056.0, 300 sec: 45875.2). Total num frames: 419577856. Throughput: 0: 45430.2. Samples: 18246480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-10 12:07:08,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:07:10,210][35978] Updated weights for policy 0, policy_version 25615 (0.0021) [2024-06-10 12:07:13,404][35745] Fps is (10 sec: 47502.3, 60 sec: 45873.4, 300 sec: 45930.4). Total num frames: 419823616. Throughput: 0: 45324.3. Samples: 18385160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 12:07:13,405][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:07:13,811][35978] Updated weights for policy 0, policy_version 25625 (0.0027) [2024-06-10 12:07:17,748][35978] Updated weights for policy 0, policy_version 25635 (0.0032) [2024-06-10 12:07:18,404][35745] Fps is (10 sec: 47502.5, 60 sec: 45600.3, 300 sec: 45819.3). 
Total num frames: 420052992. Throughput: 0: 45228.4. Samples: 18653200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 12:07:18,405][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:07:21,255][35978] Updated weights for policy 0, policy_version 25645 (0.0034) [2024-06-10 12:07:23,402][35745] Fps is (10 sec: 44246.8, 60 sec: 45055.9, 300 sec: 45875.5). Total num frames: 420265984. Throughput: 0: 45299.9. Samples: 18928320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 12:07:23,402][35745] Avg episode reward: [(0, '0.283')] [2024-06-10 12:07:24,715][35978] Updated weights for policy 0, policy_version 25655 (0.0027) [2024-06-10 12:07:28,402][35745] Fps is (10 sec: 42608.4, 60 sec: 45056.1, 300 sec: 45764.4). Total num frames: 420478976. Throughput: 0: 45298.0. Samples: 19062480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 12:07:28,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:07:28,757][35978] Updated weights for policy 0, policy_version 25665 (0.0038) [2024-06-10 12:07:32,112][35978] Updated weights for policy 0, policy_version 25675 (0.0039) [2024-06-10 12:07:33,402][35745] Fps is (10 sec: 47513.9, 60 sec: 45602.2, 300 sec: 45819.9). Total num frames: 420741120. Throughput: 0: 45303.4. Samples: 19337200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 12:07:33,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:07:33,424][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000025680_420741120.pth... [2024-06-10 12:07:33,490][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000025009_409747456.pth [2024-06-10 12:07:33,493][35957] Saving new best policy, reward=0.300! [2024-06-10 12:07:35,931][35978] Updated weights for policy 0, policy_version 25685 (0.0038) [2024-06-10 12:07:38,402][35745] Fps is (10 sec: 45875.1, 60 sec: 45056.0, 300 sec: 45764.1). Total num frames: 420937728. Throughput: 0: 45235.1. Samples: 19604680. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 12:07:38,402][35745] Avg episode reward: [(0, '0.281')] [2024-06-10 12:07:39,267][35978] Updated weights for policy 0, policy_version 25695 (0.0031) [2024-06-10 12:07:43,356][35978] Updated weights for policy 0, policy_version 25705 (0.0034) [2024-06-10 12:07:43,404][35745] Fps is (10 sec: 40950.5, 60 sec: 44508.3, 300 sec: 45708.6). Total num frames: 421150720. Throughput: 0: 45218.2. Samples: 19735140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 12:07:43,405][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:07:46,628][35978] Updated weights for policy 0, policy_version 25715 (0.0041) [2024-06-10 12:07:48,402][35745] Fps is (10 sec: 47513.6, 60 sec: 45875.1, 300 sec: 45819.7). Total num frames: 421412864. Throughput: 0: 45139.1. Samples: 20008060. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-10 12:07:48,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:07:50,454][35978] Updated weights for policy 0, policy_version 25725 (0.0032) [2024-06-10 12:07:53,402][35745] Fps is (10 sec: 47524.5, 60 sec: 45056.0, 300 sec: 45764.1). Total num frames: 421625856. Throughput: 0: 45238.6. Samples: 20282220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-10 12:07:53,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:07:53,624][35978] Updated weights for policy 0, policy_version 25735 (0.0035) [2024-06-10 12:07:55,738][35957] Signal inference workers to stop experience collection... (250 times) [2024-06-10 12:07:55,738][35957] Signal inference workers to resume experience collection... 
(250 times) [2024-06-10 12:07:55,773][35978] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-06-10 12:07:55,773][35978] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-06-10 12:07:57,924][35978] Updated weights for policy 0, policy_version 25745 (0.0035) [2024-06-10 12:07:58,402][35745] Fps is (10 sec: 40960.2, 60 sec: 44511.6, 300 sec: 45653.1). Total num frames: 421822464. Throughput: 0: 45045.9. Samples: 20412120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-10 12:07:58,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:08:01,028][35978] Updated weights for policy 0, policy_version 25755 (0.0026) [2024-06-10 12:08:03,402][35745] Fps is (10 sec: 49151.6, 60 sec: 46148.1, 300 sec: 45875.2). Total num frames: 422117376. Throughput: 0: 45207.1. Samples: 20687420. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-10 12:08:03,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:08:05,220][35978] Updated weights for policy 0, policy_version 25765 (0.0033) [2024-06-10 12:08:08,141][35978] Updated weights for policy 0, policy_version 25775 (0.0033) [2024-06-10 12:08:08,402][35745] Fps is (10 sec: 49151.6, 60 sec: 45602.1, 300 sec: 45653.0). Total num frames: 422313984. Throughput: 0: 45257.8. Samples: 20964920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 12:08:08,402][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:08:12,285][35978] Updated weights for policy 0, policy_version 25785 (0.0022) [2024-06-10 12:08:13,402][35745] Fps is (10 sec: 40960.1, 60 sec: 45057.7, 300 sec: 45708.6). Total num frames: 422526976. Throughput: 0: 45021.2. Samples: 21088440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 12:08:13,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:08:15,522][35978] Updated weights for policy 0, policy_version 25795 (0.0037) [2024-06-10 12:08:18,402][35745] Fps is (10 sec: 45875.5, 60 sec: 45330.8, 300 sec: 45708.6). 
Total num frames: 422772736. Throughput: 0: 45119.1. Samples: 21367560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 12:08:18,402][35745] Avg episode reward: [(0, '0.285')] [2024-06-10 12:08:19,233][35978] Updated weights for policy 0, policy_version 25805 (0.0033) [2024-06-10 12:08:22,755][35978] Updated weights for policy 0, policy_version 25815 (0.0031) [2024-06-10 12:08:23,402][35745] Fps is (10 sec: 45875.8, 60 sec: 45329.1, 300 sec: 45653.0). Total num frames: 422985728. Throughput: 0: 45148.0. Samples: 21636340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 12:08:23,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:08:26,484][35978] Updated weights for policy 0, policy_version 25825 (0.0036) [2024-06-10 12:08:28,402][35745] Fps is (10 sec: 42598.5, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 423198720. Throughput: 0: 45304.2. Samples: 21773720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 12:08:28,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:08:29,889][35978] Updated weights for policy 0, policy_version 25835 (0.0043) [2024-06-10 12:08:33,402][35745] Fps is (10 sec: 44236.5, 60 sec: 44782.9, 300 sec: 45597.6). Total num frames: 423428096. Throughput: 0: 45072.4. Samples: 22036320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 12:08:33,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:08:33,934][35978] Updated weights for policy 0, policy_version 25845 (0.0030) [2024-06-10 12:08:37,277][35978] Updated weights for policy 0, policy_version 25855 (0.0055) [2024-06-10 12:08:38,402][35745] Fps is (10 sec: 49151.5, 60 sec: 45875.2, 300 sec: 45708.6). Total num frames: 423690240. Throughput: 0: 45064.4. Samples: 22310120. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 12:08:38,402][35745] Avg episode reward: [(0, '0.274')] [2024-06-10 12:08:41,026][35978] Updated weights for policy 0, policy_version 25865 (0.0036) [2024-06-10 12:08:43,402][35745] Fps is (10 sec: 42598.3, 60 sec: 45057.7, 300 sec: 45430.9). Total num frames: 423854080. Throughput: 0: 45271.5. Samples: 22449340. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-06-10 12:08:43,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:08:44,262][35978] Updated weights for policy 0, policy_version 25875 (0.0034) [2024-06-10 12:08:48,025][35978] Updated weights for policy 0, policy_version 25885 (0.0028) [2024-06-10 12:08:48,401][35745] Fps is (10 sec: 40960.4, 60 sec: 44783.0, 300 sec: 45542.0). Total num frames: 424099840. Throughput: 0: 45110.8. Samples: 22717400. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-06-10 12:08:48,402][35745] Avg episode reward: [(0, '0.282')] [2024-06-10 12:08:51,704][35978] Updated weights for policy 0, policy_version 25895 (0.0024) [2024-06-10 12:08:53,406][35745] Fps is (10 sec: 49129.3, 60 sec: 45325.6, 300 sec: 45541.2). Total num frames: 424345600. Throughput: 0: 44841.6. Samples: 22983000. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-06-10 12:08:53,407][35745] Avg episode reward: [(0, '0.283')] [2024-06-10 12:08:55,292][35978] Updated weights for policy 0, policy_version 25905 (0.0035) [2024-06-10 12:08:58,402][35745] Fps is (10 sec: 42597.7, 60 sec: 45055.9, 300 sec: 45375.3). Total num frames: 424525824. Throughput: 0: 45107.1. Samples: 23118260. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-10 12:08:58,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:08:58,992][35978] Updated weights for policy 0, policy_version 25915 (0.0034) [2024-06-10 12:09:02,638][35978] Updated weights for policy 0, policy_version 25925 (0.0037) [2024-06-10 12:09:03,402][35745] Fps is (10 sec: 42618.5, 60 sec: 44236.9, 300 sec: 45430.9). 
Total num frames: 424771584. Throughput: 0: 44848.5. Samples: 23385740. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-10 12:09:03,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:09:05,558][35957] Signal inference workers to stop experience collection... (300 times) [2024-06-10 12:09:05,559][35957] Signal inference workers to resume experience collection... (300 times) [2024-06-10 12:09:05,573][35978] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-06-10 12:09:05,573][35978] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-06-10 12:09:06,294][35978] Updated weights for policy 0, policy_version 25935 (0.0030) [2024-06-10 12:09:08,401][35745] Fps is (10 sec: 50791.2, 60 sec: 45329.1, 300 sec: 45597.5). Total num frames: 425033728. Throughput: 0: 44942.7. Samples: 23658760. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-10 12:09:08,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:09:10,003][35978] Updated weights for policy 0, policy_version 25945 (0.0032) [2024-06-10 12:09:13,402][35745] Fps is (10 sec: 45875.0, 60 sec: 45056.1, 300 sec: 45430.9). Total num frames: 425230336. Throughput: 0: 44947.1. Samples: 23796340. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-10 12:09:13,402][35745] Avg episode reward: [(0, '0.283')] [2024-06-10 12:09:13,429][35978] Updated weights for policy 0, policy_version 25955 (0.0026) [2024-06-10 12:09:16,921][35978] Updated weights for policy 0, policy_version 25965 (0.0035) [2024-06-10 12:09:18,402][35745] Fps is (10 sec: 39321.1, 60 sec: 44236.7, 300 sec: 45264.3). Total num frames: 425426944. Throughput: 0: 45098.6. Samples: 24065760. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-10 12:09:18,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:09:20,637][35978] Updated weights for policy 0, policy_version 25975 (0.0034) [2024-06-10 12:09:23,402][35745] Fps is (10 sec: 47513.9, 60 sec: 45329.1, 300 sec: 45430.9). 
Total num frames: 425705472. Throughput: 0: 45013.0. Samples: 24335700. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-10 12:09:23,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:09:23,869][35978] Updated weights for policy 0, policy_version 25985 (0.0035) [2024-06-10 12:09:28,104][35978] Updated weights for policy 0, policy_version 25995 (0.0037) [2024-06-10 12:09:28,402][35745] Fps is (10 sec: 47513.9, 60 sec: 45056.0, 300 sec: 45430.9). Total num frames: 425902080. Throughput: 0: 45085.0. Samples: 24478160. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-10 12:09:28,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:09:31,437][35978] Updated weights for policy 0, policy_version 26005 (0.0035) [2024-06-10 12:09:33,402][35745] Fps is (10 sec: 40959.5, 60 sec: 44782.9, 300 sec: 45264.3). Total num frames: 426115072. Throughput: 0: 44967.4. Samples: 24740940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-10 12:09:33,403][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:09:33,416][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000026008_426115072.pth... [2024-06-10 12:09:33,473][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000025347_415285248.pth [2024-06-10 12:09:35,511][35978] Updated weights for policy 0, policy_version 26015 (0.0028) [2024-06-10 12:09:38,404][35745] Fps is (10 sec: 45864.4, 60 sec: 44508.2, 300 sec: 45375.0). Total num frames: 426360832. Throughput: 0: 44922.8. Samples: 25004420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-10 12:09:38,405][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:09:38,998][35978] Updated weights for policy 0, policy_version 26025 (0.0038) [2024-06-10 12:09:42,538][35978] Updated weights for policy 0, policy_version 26035 (0.0033) [2024-06-10 12:09:43,401][35745] Fps is (10 sec: 47514.1, 60 sec: 45602.2, 300 sec: 45486.4). Total num frames: 426590208. Throughput: 0: 45253.0. 
Samples: 25154640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-10 12:09:43,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:09:46,104][35978] Updated weights for policy 0, policy_version 26045 (0.0031) [2024-06-10 12:09:48,402][35745] Fps is (10 sec: 44247.0, 60 sec: 45055.9, 300 sec: 45375.3). Total num frames: 426803200. Throughput: 0: 45231.9. Samples: 25421180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-10 12:09:48,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:09:49,679][35978] Updated weights for policy 0, policy_version 26055 (0.0037) [2024-06-10 12:09:53,084][35978] Updated weights for policy 0, policy_version 26065 (0.0031) [2024-06-10 12:09:53,402][35745] Fps is (10 sec: 45875.0, 60 sec: 45059.5, 300 sec: 45375.3). Total num frames: 427048960. Throughput: 0: 45040.0. Samples: 25685560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:09:53,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:09:57,065][35978] Updated weights for policy 0, policy_version 26075 (0.0033) [2024-06-10 12:09:58,402][35745] Fps is (10 sec: 47513.7, 60 sec: 45875.3, 300 sec: 45375.7). Total num frames: 427278336. Throughput: 0: 45280.5. Samples: 25833960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:09:58,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:10:00,585][35978] Updated weights for policy 0, policy_version 26085 (0.0040) [2024-06-10 12:10:03,402][35745] Fps is (10 sec: 42597.7, 60 sec: 45055.9, 300 sec: 45264.2). Total num frames: 427474944. Throughput: 0: 45281.7. Samples: 26103440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:10:03,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:10:04,371][35978] Updated weights for policy 0, policy_version 26095 (0.0027) [2024-06-10 12:10:04,814][35957] Signal inference workers to stop experience collection... 
(350 times) [2024-06-10 12:10:04,855][35978] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-06-10 12:10:04,863][35957] Signal inference workers to resume experience collection... (350 times) [2024-06-10 12:10:04,876][35978] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-06-10 12:10:07,636][35978] Updated weights for policy 0, policy_version 26105 (0.0031) [2024-06-10 12:10:08,402][35745] Fps is (10 sec: 42598.3, 60 sec: 44509.8, 300 sec: 45264.3). Total num frames: 427704320. Throughput: 0: 45022.1. Samples: 26361700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 12:10:08,402][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:10:11,556][35978] Updated weights for policy 0, policy_version 26115 (0.0045) [2024-06-10 12:10:13,403][35745] Fps is (10 sec: 47506.3, 60 sec: 45327.8, 300 sec: 45319.8). Total num frames: 427950080. Throughput: 0: 45141.9. Samples: 26509620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 12:10:13,404][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:10:15,101][35978] Updated weights for policy 0, policy_version 26125 (0.0036) [2024-06-10 12:10:18,402][35745] Fps is (10 sec: 44237.1, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 428146688. Throughput: 0: 45369.4. Samples: 26782560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 12:10:18,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:10:18,875][35978] Updated weights for policy 0, policy_version 26135 (0.0022) [2024-06-10 12:10:22,139][35978] Updated weights for policy 0, policy_version 26145 (0.0033) [2024-06-10 12:10:23,404][35745] Fps is (10 sec: 42595.7, 60 sec: 44508.1, 300 sec: 45264.1). Total num frames: 428376064. Throughput: 0: 45177.8. Samples: 27037420. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-10 12:10:23,404][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:10:26,206][35978] Updated weights for policy 0, policy_version 26155 (0.0038) [2024-06-10 12:10:28,402][35745] Fps is (10 sec: 49151.2, 60 sec: 45602.0, 300 sec: 45319.8). Total num frames: 428638208. Throughput: 0: 45058.9. Samples: 27182300. Policy #0 lag: (min: 0.0, avg: 6.8, max: 20.0) [2024-06-10 12:10:28,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:10:29,643][35978] Updated weights for policy 0, policy_version 26165 (0.0034) [2024-06-10 12:10:33,402][35745] Fps is (10 sec: 45886.0, 60 sec: 45329.1, 300 sec: 45264.3). Total num frames: 428834816. Throughput: 0: 45205.4. Samples: 27455420. Policy #0 lag: (min: 0.0, avg: 6.8, max: 20.0) [2024-06-10 12:10:33,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:10:33,421][35978] Updated weights for policy 0, policy_version 26175 (0.0030) [2024-06-10 12:10:36,927][35978] Updated weights for policy 0, policy_version 26185 (0.0028) [2024-06-10 12:10:38,402][35745] Fps is (10 sec: 39322.3, 60 sec: 44511.6, 300 sec: 45153.2). Total num frames: 429031424. Throughput: 0: 45094.7. Samples: 27714820. Policy #0 lag: (min: 0.0, avg: 6.8, max: 20.0) [2024-06-10 12:10:38,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:10:40,921][35978] Updated weights for policy 0, policy_version 26195 (0.0025) [2024-06-10 12:10:43,401][35745] Fps is (10 sec: 45875.3, 60 sec: 45056.0, 300 sec: 45208.8). Total num frames: 429293568. Throughput: 0: 44806.7. Samples: 27850260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 12:10:43,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:10:44,274][35978] Updated weights for policy 0, policy_version 26205 (0.0039) [2024-06-10 12:10:48,134][35978] Updated weights for policy 0, policy_version 26215 (0.0039) [2024-06-10 12:10:48,402][35745] Fps is (10 sec: 49151.3, 60 sec: 45329.0, 300 sec: 45264.3). 
Total num frames: 429522944. Throughput: 0: 44888.9. Samples: 28123440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 12:10:48,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:10:51,363][35978] Updated weights for policy 0, policy_version 26225 (0.0028) [2024-06-10 12:10:53,402][35745] Fps is (10 sec: 40959.4, 60 sec: 44236.7, 300 sec: 45097.7). Total num frames: 429703168. Throughput: 0: 45097.7. Samples: 28391100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 12:10:53,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:10:55,315][35978] Updated weights for policy 0, policy_version 26235 (0.0038) [2024-06-10 12:10:58,401][35745] Fps is (10 sec: 44237.7, 60 sec: 44783.0, 300 sec: 45153.8). Total num frames: 429965312. Throughput: 0: 44678.2. Samples: 28520060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-10 12:10:58,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:10:58,739][35978] Updated weights for policy 0, policy_version 26245 (0.0031) [2024-06-10 12:11:02,803][35978] Updated weights for policy 0, policy_version 26255 (0.0028) [2024-06-10 12:11:03,402][35745] Fps is (10 sec: 49152.3, 60 sec: 45329.2, 300 sec: 45153.2). Total num frames: 430194688. Throughput: 0: 44814.2. Samples: 28799200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-10 12:11:03,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:11:06,252][35978] Updated weights for policy 0, policy_version 26265 (0.0026) [2024-06-10 12:11:08,402][35745] Fps is (10 sec: 40959.7, 60 sec: 44509.9, 300 sec: 45097.7). Total num frames: 430374912. Throughput: 0: 45142.8. Samples: 29068740. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-10 12:11:08,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:11:10,093][35978] Updated weights for policy 0, policy_version 26275 (0.0037) [2024-06-10 12:11:13,203][35978] Updated weights for policy 0, policy_version 26285 (0.0035) [2024-06-10 12:11:13,402][35745] Fps is (10 sec: 45875.3, 60 sec: 45057.3, 300 sec: 45208.7). Total num frames: 430653440. Throughput: 0: 44610.8. Samples: 29189780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-10 12:11:13,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:11:17,441][35978] Updated weights for policy 0, policy_version 26295 (0.0040) [2024-06-10 12:11:18,402][35745] Fps is (10 sec: 49152.0, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 430866432. Throughput: 0: 44800.0. Samples: 29471420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 12:11:18,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:11:20,304][35978] Updated weights for policy 0, policy_version 26305 (0.0035) [2024-06-10 12:11:23,401][35745] Fps is (10 sec: 39321.9, 60 sec: 44511.6, 300 sec: 44986.6). Total num frames: 431046656. Throughput: 0: 45029.4. Samples: 29741140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 12:11:23,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:11:24,596][35978] Updated weights for policy 0, policy_version 26315 (0.0033) [2024-06-10 12:11:25,237][35957] Signal inference workers to stop experience collection... (400 times) [2024-06-10 12:11:25,280][35978] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-06-10 12:11:25,286][35957] Signal inference workers to resume experience collection... 
(400 times) [2024-06-10 12:11:25,297][35978] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-06-10 12:11:27,785][35978] Updated weights for policy 0, policy_version 26325 (0.0041) [2024-06-10 12:11:28,402][35745] Fps is (10 sec: 44236.4, 60 sec: 44509.9, 300 sec: 45097.7). Total num frames: 431308800. Throughput: 0: 44783.4. Samples: 29865520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 12:11:28,402][35745] Avg episode reward: [(0, '0.285')] [2024-06-10 12:11:32,111][35978] Updated weights for policy 0, policy_version 26335 (0.0031) [2024-06-10 12:11:33,404][35745] Fps is (10 sec: 50777.8, 60 sec: 45327.2, 300 sec: 45152.8). Total num frames: 431554560. Throughput: 0: 44796.4. Samples: 30139380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-10 12:11:33,405][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:11:33,412][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000026340_431554560.pth... [2024-06-10 12:11:33,468][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000025680_420741120.pth [2024-06-10 12:11:35,290][35978] Updated weights for policy 0, policy_version 26345 (0.0028) [2024-06-10 12:11:38,404][35745] Fps is (10 sec: 40950.6, 60 sec: 44781.2, 300 sec: 44875.2). Total num frames: 431718400. Throughput: 0: 44864.9. Samples: 30410120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 12:11:38,405][35745] Avg episode reward: [(0, '0.285')] [2024-06-10 12:11:39,334][35978] Updated weights for policy 0, policy_version 26355 (0.0030) [2024-06-10 12:11:42,345][35978] Updated weights for policy 0, policy_version 26365 (0.0030) [2024-06-10 12:11:43,402][35745] Fps is (10 sec: 40969.6, 60 sec: 44509.8, 300 sec: 45097.6). Total num frames: 431964160. Throughput: 0: 44766.5. Samples: 30534560. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 12:11:43,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:11:46,623][35978] Updated weights for policy 0, policy_version 26375 (0.0029) [2024-06-10 12:11:48,402][35745] Fps is (10 sec: 50801.8, 60 sec: 45056.0, 300 sec: 45097.6). Total num frames: 432226304. Throughput: 0: 44603.0. Samples: 30806340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 12:11:48,402][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:11:49,372][35978] Updated weights for policy 0, policy_version 26385 (0.0030) [2024-06-10 12:11:53,402][35745] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 44986.9). Total num frames: 432422912. Throughput: 0: 44853.7. Samples: 31087160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-10 12:11:53,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:11:53,877][35978] Updated weights for policy 0, policy_version 26395 (0.0026) [2024-06-10 12:11:56,881][35978] Updated weights for policy 0, policy_version 26405 (0.0042) [2024-06-10 12:11:58,402][35745] Fps is (10 sec: 40960.1, 60 sec: 44509.7, 300 sec: 45042.1). Total num frames: 432635904. Throughput: 0: 44918.6. Samples: 31211120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-10 12:11:58,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:12:01,077][35978] Updated weights for policy 0, policy_version 26415 (0.0027) [2024-06-10 12:12:03,402][35745] Fps is (10 sec: 47513.8, 60 sec: 45056.0, 300 sec: 45153.2). Total num frames: 432898048. Throughput: 0: 44828.8. Samples: 31488720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-10 12:12:03,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:12:04,507][35978] Updated weights for policy 0, policy_version 26425 (0.0034) [2024-06-10 12:12:08,402][35745] Fps is (10 sec: 45875.3, 60 sec: 45329.0, 300 sec: 44986.9). Total num frames: 433094656. Throughput: 0: 44829.6. Samples: 31758480. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-10 12:12:08,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:12:08,435][35978] Updated weights for policy 0, policy_version 26435 (0.0034) [2024-06-10 12:12:11,675][35978] Updated weights for policy 0, policy_version 26445 (0.0032) [2024-06-10 12:12:13,402][35745] Fps is (10 sec: 40960.3, 60 sec: 44236.8, 300 sec: 44931.4). Total num frames: 433307648. Throughput: 0: 44805.4. Samples: 31881760. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-10 12:12:13,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:12:15,887][35978] Updated weights for policy 0, policy_version 26455 (0.0033) [2024-06-10 12:12:18,402][35745] Fps is (10 sec: 49151.9, 60 sec: 45329.0, 300 sec: 45153.2). Total num frames: 433586176. Throughput: 0: 44794.3. Samples: 32155020. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-10 12:12:18,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:12:18,584][35978] Updated weights for policy 0, policy_version 26465 (0.0024) [2024-06-10 12:12:23,373][35978] Updated weights for policy 0, policy_version 26475 (0.0043) [2024-06-10 12:12:23,402][35745] Fps is (10 sec: 45874.8, 60 sec: 45329.0, 300 sec: 45042.1). Total num frames: 433766400. Throughput: 0: 44817.4. Samples: 32426800. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-10 12:12:23,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:12:26,381][35978] Updated weights for policy 0, policy_version 26485 (0.0038) [2024-06-10 12:12:28,402][35745] Fps is (10 sec: 36045.1, 60 sec: 43963.8, 300 sec: 44764.4). Total num frames: 433946624. Throughput: 0: 44955.6. Samples: 32557560. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-10 12:12:28,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:12:30,376][35978] Updated weights for policy 0, policy_version 26495 (0.0030) [2024-06-10 12:12:31,440][35957] Signal inference workers to stop experience collection... 
(450 times) [2024-06-10 12:12:31,444][35957] Signal inference workers to resume experience collection... (450 times) [2024-06-10 12:12:31,454][35978] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-06-10 12:12:31,486][35978] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-06-10 12:12:33,402][35745] Fps is (10 sec: 47513.3, 60 sec: 44784.6, 300 sec: 45097.6). Total num frames: 434241536. Throughput: 0: 44818.2. Samples: 32823160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 12:12:33,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:12:33,948][35978] Updated weights for policy 0, policy_version 26505 (0.0042) [2024-06-10 12:12:37,836][35978] Updated weights for policy 0, policy_version 26515 (0.0039) [2024-06-10 12:12:38,402][35745] Fps is (10 sec: 49151.9, 60 sec: 45330.8, 300 sec: 45042.5). Total num frames: 434438144. Throughput: 0: 44663.2. Samples: 33097000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 12:12:38,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:12:41,294][35978] Updated weights for policy 0, policy_version 26525 (0.0036) [2024-06-10 12:12:43,402][35745] Fps is (10 sec: 39321.6, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 434634752. Throughput: 0: 44871.9. Samples: 33230360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 12:12:43,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:12:45,199][35978] Updated weights for policy 0, policy_version 26535 (0.0042) [2024-06-10 12:12:48,402][35745] Fps is (10 sec: 45875.2, 60 sec: 44509.9, 300 sec: 44986.6). Total num frames: 434896896. Throughput: 0: 44628.0. Samples: 33496980. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 12:12:48,403][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:12:48,621][35978] Updated weights for policy 0, policy_version 26545 (0.0037) [2024-06-10 12:12:52,486][35978] Updated weights for policy 0, policy_version 26555 (0.0029) [2024-06-10 12:12:53,401][35745] Fps is (10 sec: 47514.7, 60 sec: 44783.1, 300 sec: 45042.1). Total num frames: 435109888. Throughput: 0: 44731.7. Samples: 33771400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 12:12:53,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:12:55,714][35978] Updated weights for policy 0, policy_version 26565 (0.0035) [2024-06-10 12:12:58,402][35745] Fps is (10 sec: 40960.1, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 435306496. Throughput: 0: 44922.6. Samples: 33903280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 12:12:58,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:12:59,429][35978] Updated weights for policy 0, policy_version 26575 (0.0033) [2024-06-10 12:13:02,907][35978] Updated weights for policy 0, policy_version 26585 (0.0021) [2024-06-10 12:13:03,402][35745] Fps is (10 sec: 45874.8, 60 sec: 44509.9, 300 sec: 44931.0). Total num frames: 435568640. Throughput: 0: 44930.3. Samples: 34176880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 12:13:03,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:13:07,004][35978] Updated weights for policy 0, policy_version 26595 (0.0032) [2024-06-10 12:13:08,402][35745] Fps is (10 sec: 49152.2, 60 sec: 45056.1, 300 sec: 44986.6). Total num frames: 435798016. Throughput: 0: 44730.3. Samples: 34439660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 12:13:08,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:13:10,432][35978] Updated weights for policy 0, policy_version 26605 (0.0041) [2024-06-10 12:13:13,402][35745] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 44820.0). 
Total num frames: 435994624. Throughput: 0: 44884.4. Samples: 34577360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 12:13:13,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:13:14,514][35978] Updated weights for policy 0, policy_version 26615 (0.0034) [2024-06-10 12:13:17,737][35978] Updated weights for policy 0, policy_version 26625 (0.0046) [2024-06-10 12:13:18,401][35745] Fps is (10 sec: 44237.3, 60 sec: 44237.0, 300 sec: 44931.1). Total num frames: 436240384. Throughput: 0: 44857.6. Samples: 34841740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 12:13:18,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:13:21,714][35978] Updated weights for policy 0, policy_version 26635 (0.0033) [2024-06-10 12:13:23,401][35745] Fps is (10 sec: 49152.5, 60 sec: 45329.2, 300 sec: 45042.1). Total num frames: 436486144. Throughput: 0: 44855.2. Samples: 35115480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-10 12:13:23,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:13:24,751][35978] Updated weights for policy 0, policy_version 26645 (0.0037) [2024-06-10 12:13:28,402][35745] Fps is (10 sec: 44235.4, 60 sec: 45602.0, 300 sec: 44931.0). Total num frames: 436682752. Throughput: 0: 45040.4. Samples: 35257180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-10 12:13:28,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:13:28,747][35978] Updated weights for policy 0, policy_version 26655 (0.0029) [2024-06-10 12:13:32,107][35978] Updated weights for policy 0, policy_version 26665 (0.0033) [2024-06-10 12:13:33,402][35745] Fps is (10 sec: 40959.0, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 436895744. Throughput: 0: 44999.0. Samples: 35521940. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-10 12:13:33,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:13:33,424][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000026666_436895744.pth... [2024-06-10 12:13:33,492][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000026008_426115072.pth [2024-06-10 12:13:36,128][35978] Updated weights for policy 0, policy_version 26675 (0.0028) [2024-06-10 12:13:38,404][35745] Fps is (10 sec: 47503.0, 60 sec: 45327.3, 300 sec: 45097.3). Total num frames: 437157888. Throughput: 0: 44869.5. Samples: 35790640. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-10 12:13:38,405][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:13:39,745][35978] Updated weights for policy 0, policy_version 26685 (0.0028) [2024-06-10 12:13:43,404][35745] Fps is (10 sec: 44227.1, 60 sec: 45054.3, 300 sec: 44875.1). Total num frames: 437338112. Throughput: 0: 45070.5. Samples: 35931560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:13:43,405][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:13:43,818][35978] Updated weights for policy 0, policy_version 26695 (0.0036) [2024-06-10 12:13:46,917][35978] Updated weights for policy 0, policy_version 26705 (0.0024) [2024-06-10 12:13:48,402][35745] Fps is (10 sec: 40969.9, 60 sec: 44509.9, 300 sec: 44820.7). Total num frames: 437567488. Throughput: 0: 44922.7. Samples: 36198400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:13:48,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:13:51,053][35978] Updated weights for policy 0, policy_version 26715 (0.0033) [2024-06-10 12:13:53,402][35745] Fps is (10 sec: 49163.4, 60 sec: 45329.0, 300 sec: 45097.7). Total num frames: 437829632. Throughput: 0: 44971.0. Samples: 36463360. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:13:53,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:13:53,902][35978] Updated weights for policy 0, policy_version 26725 (0.0030) [2024-06-10 12:13:58,094][35978] Updated weights for policy 0, policy_version 26735 (0.0034) [2024-06-10 12:13:58,402][35745] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 438026240. Throughput: 0: 45136.9. Samples: 36608520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:13:58,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:14:01,380][35978] Updated weights for policy 0, policy_version 26745 (0.0023) [2024-06-10 12:14:03,401][35745] Fps is (10 sec: 39321.9, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 438222848. Throughput: 0: 45119.4. Samples: 36872120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-10 12:14:03,402][35745] Avg episode reward: [(0, '0.279')] [2024-06-10 12:14:03,846][35957] Signal inference workers to stop experience collection... (500 times) [2024-06-10 12:14:03,893][35957] Signal inference workers to resume experience collection... (500 times) [2024-06-10 12:14:03,895][35978] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-06-10 12:14:03,932][35978] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-06-10 12:14:05,573][35978] Updated weights for policy 0, policy_version 26755 (0.0030) [2024-06-10 12:14:08,405][35745] Fps is (10 sec: 45858.5, 60 sec: 44780.2, 300 sec: 44930.5). Total num frames: 438484992. Throughput: 0: 44896.7. Samples: 37136000. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-10 12:14:08,406][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:14:09,032][35978] Updated weights for policy 0, policy_version 26765 (0.0038) [2024-06-10 12:14:13,199][35978] Updated weights for policy 0, policy_version 26775 (0.0041) [2024-06-10 12:14:13,402][35745] Fps is (10 sec: 47512.9, 60 sec: 45055.9, 300 sec: 44986.6). Total num frames: 438697984. Throughput: 0: 44815.6. Samples: 37273880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-10 12:14:13,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:14:16,046][35978] Updated weights for policy 0, policy_version 26785 (0.0035) [2024-06-10 12:14:18,402][35745] Fps is (10 sec: 45891.2, 60 sec: 45055.8, 300 sec: 44875.5). Total num frames: 438943744. Throughput: 0: 44951.6. Samples: 37544760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-10 12:14:18,402][35745] Avg episode reward: [(0, '0.283')] [2024-06-10 12:14:20,161][35978] Updated weights for policy 0, policy_version 26795 (0.0027) [2024-06-10 12:14:23,159][35978] Updated weights for policy 0, policy_version 26805 (0.0026) [2024-06-10 12:14:23,402][35745] Fps is (10 sec: 47514.2, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 439173120. Throughput: 0: 44938.4. Samples: 37812760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-10 12:14:23,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:14:27,333][35978] Updated weights for policy 0, policy_version 26815 (0.0032) [2024-06-10 12:14:28,401][35745] Fps is (10 sec: 44237.7, 60 sec: 45056.2, 300 sec: 44986.6). Total num frames: 439386112. Throughput: 0: 44786.4. Samples: 37946840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-10 12:14:28,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:14:30,832][35978] Updated weights for policy 0, policy_version 26825 (0.0034) [2024-06-10 12:14:33,402][35745] Fps is (10 sec: 42598.1, 60 sec: 45056.1, 300 sec: 44875.9). 
Total num frames: 439599104. Throughput: 0: 44916.8. Samples: 38219660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-10 12:14:33,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:14:34,819][35978] Updated weights for policy 0, policy_version 26835 (0.0023) [2024-06-10 12:14:38,038][35978] Updated weights for policy 0, policy_version 26845 (0.0034) [2024-06-10 12:14:38,401][35745] Fps is (10 sec: 44236.9, 60 sec: 44511.7, 300 sec: 44875.5). Total num frames: 439828480. Throughput: 0: 44900.1. Samples: 38483860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 12:14:38,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:14:42,191][35978] Updated weights for policy 0, policy_version 26855 (0.0033) [2024-06-10 12:14:43,402][35745] Fps is (10 sec: 47513.0, 60 sec: 45603.8, 300 sec: 44986.6). Total num frames: 440074240. Throughput: 0: 44745.6. Samples: 38622080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 12:14:43,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:14:45,549][35978] Updated weights for policy 0, policy_version 26865 (0.0033) [2024-06-10 12:14:48,402][35745] Fps is (10 sec: 42598.2, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 440254464. Throughput: 0: 44824.4. Samples: 38889220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 12:14:48,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:14:48,523][35957] Saving new best policy, reward=0.302! [2024-06-10 12:14:49,427][35978] Updated weights for policy 0, policy_version 26875 (0.0047) [2024-06-10 12:14:52,729][35978] Updated weights for policy 0, policy_version 26885 (0.0027) [2024-06-10 12:14:53,402][35745] Fps is (10 sec: 42598.4, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 440500224. Throughput: 0: 44975.1. Samples: 39159720. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 12:14:53,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:14:56,741][35978] Updated weights for policy 0, policy_version 26895 (0.0039) [2024-06-10 12:14:58,401][35745] Fps is (10 sec: 47513.9, 60 sec: 45056.1, 300 sec: 44931.1). Total num frames: 440729600. Throughput: 0: 44852.6. Samples: 39292240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 12:14:58,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:15:00,093][35978] Updated weights for policy 0, policy_version 26905 (0.0031) [2024-06-10 12:15:03,402][35745] Fps is (10 sec: 42599.0, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 440926208. Throughput: 0: 44703.3. Samples: 39556400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 12:15:03,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:15:04,060][35978] Updated weights for policy 0, policy_version 26915 (0.0028) [2024-06-10 12:15:07,048][35978] Updated weights for policy 0, policy_version 26925 (0.0034) [2024-06-10 12:15:08,402][35745] Fps is (10 sec: 42598.2, 60 sec: 44512.6, 300 sec: 44764.7). Total num frames: 441155584. Throughput: 0: 44918.2. Samples: 39834080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-10 12:15:08,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:15:11,248][35978] Updated weights for policy 0, policy_version 26935 (0.0039) [2024-06-10 12:15:12,472][35957] Signal inference workers to stop experience collection... (550 times) [2024-06-10 12:15:12,472][35957] Signal inference workers to resume experience collection... (550 times) [2024-06-10 12:15:12,490][35978] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-06-10 12:15:12,519][35978] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-06-10 12:15:13,402][35745] Fps is (10 sec: 49152.2, 60 sec: 45329.2, 300 sec: 44986.6). Total num frames: 441417728. Throughput: 0: 44997.8. Samples: 39971740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 12:15:13,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:15:14,578][35978] Updated weights for policy 0, policy_version 26945 (0.0038) [2024-06-10 12:15:18,389][35978] Updated weights for policy 0, policy_version 26955 (0.0035) [2024-06-10 12:15:18,402][35745] Fps is (10 sec: 47513.2, 60 sec: 44783.0, 300 sec: 44931.4). Total num frames: 441630720. Throughput: 0: 44860.4. Samples: 40238380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 12:15:18,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:15:22,047][35978] Updated weights for policy 0, policy_version 26965 (0.0026) [2024-06-10 12:15:23,404][35745] Fps is (10 sec: 42588.3, 60 sec: 44508.1, 300 sec: 44764.1). Total num frames: 441843712. Throughput: 0: 45047.8. Samples: 40511120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 12:15:23,405][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:15:25,903][35978] Updated weights for policy 0, policy_version 26975 (0.0032) [2024-06-10 12:15:28,408][35745] Fps is (10 sec: 47484.7, 60 sec: 45324.4, 300 sec: 44985.6). Total num frames: 442105856. Throughput: 0: 44921.6. Samples: 40643820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 12:15:28,408][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:15:29,017][35978] Updated weights for policy 0, policy_version 26985 (0.0032) [2024-06-10 12:15:33,080][35978] Updated weights for policy 0, policy_version 26995 (0.0039) [2024-06-10 12:15:33,402][35745] Fps is (10 sec: 45885.4, 60 sec: 45056.0, 300 sec: 44986.6). Total num frames: 442302464. Throughput: 0: 44876.3. Samples: 40908660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 12:15:33,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:15:33,416][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000026996_442302464.pth... 
[2024-06-10 12:15:33,494][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000026340_431554560.pth [2024-06-10 12:15:36,013][35978] Updated weights for policy 0, policy_version 27005 (0.0031) [2024-06-10 12:15:38,402][35745] Fps is (10 sec: 42624.2, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 442531840. Throughput: 0: 44928.9. Samples: 41181520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 12:15:38,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:15:40,514][35978] Updated weights for policy 0, policy_version 27015 (0.0030) [2024-06-10 12:15:43,402][35745] Fps is (10 sec: 45875.3, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 442761216. Throughput: 0: 44929.2. Samples: 41314060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 12:15:43,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:15:43,694][35978] Updated weights for policy 0, policy_version 27025 (0.0045) [2024-06-10 12:15:47,694][35978] Updated weights for policy 0, policy_version 27035 (0.0023) [2024-06-10 12:15:48,402][35745] Fps is (10 sec: 42599.0, 60 sec: 45056.0, 300 sec: 44931.1). Total num frames: 442957824. Throughput: 0: 45144.0. Samples: 41587880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 12:15:48,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:15:51,556][35978] Updated weights for policy 0, policy_version 27045 (0.0028) [2024-06-10 12:15:53,402][35745] Fps is (10 sec: 42598.8, 60 sec: 44783.1, 300 sec: 44819.9). Total num frames: 443187200. Throughput: 0: 44758.7. Samples: 41848220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 12:15:53,402][35745] Avg episode reward: [(0, '0.285')] [2024-06-10 12:15:55,280][35978] Updated weights for policy 0, policy_version 27055 (0.0033) [2024-06-10 12:15:58,402][35745] Fps is (10 sec: 45875.1, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 443416576. Throughput: 0: 44737.8. Samples: 41984940. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 12:15:58,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:15:58,507][35978] Updated weights for policy 0, policy_version 27065 (0.0030) [2024-06-10 12:16:02,401][35978] Updated weights for policy 0, policy_version 27075 (0.0026) [2024-06-10 12:16:03,402][35745] Fps is (10 sec: 45874.9, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 443645952. Throughput: 0: 44852.0. Samples: 42256720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 12:16:03,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:16:05,544][35978] Updated weights for policy 0, policy_version 27085 (0.0047) [2024-06-10 12:16:08,402][35745] Fps is (10 sec: 44236.3, 60 sec: 45055.9, 300 sec: 44764.4). Total num frames: 443858944. Throughput: 0: 44728.4. Samples: 42523800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-10 12:16:08,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:16:09,676][35978] Updated weights for policy 0, policy_version 27095 (0.0036) [2024-06-10 12:16:13,115][35978] Updated weights for policy 0, policy_version 27105 (0.0032) [2024-06-10 12:16:13,402][35745] Fps is (10 sec: 44236.6, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 444088320. Throughput: 0: 44790.9. Samples: 42659140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-10 12:16:13,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:16:17,110][35978] Updated weights for policy 0, policy_version 27115 (0.0042) [2024-06-10 12:16:18,402][35745] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 444317696. Throughput: 0: 44844.0. Samples: 42926640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-10 12:16:18,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:16:20,681][35978] Updated weights for policy 0, policy_version 27125 (0.0027) [2024-06-10 12:16:23,402][35745] Fps is (10 sec: 44236.7, 60 sec: 44784.6, 300 sec: 44820.0). 
Total num frames: 444530688. Throughput: 0: 44584.5. Samples: 43187820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 12:16:23,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:16:24,135][35978] Updated weights for policy 0, policy_version 27135 (0.0026) [2024-06-10 12:16:27,711][35978] Updated weights for policy 0, policy_version 27145 (0.0028) [2024-06-10 12:16:28,402][35745] Fps is (10 sec: 42598.4, 60 sec: 43968.2, 300 sec: 44709.2). Total num frames: 444743680. Throughput: 0: 44581.8. Samples: 43320240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 12:16:28,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:16:29,064][35957] Signal inference workers to stop experience collection... (600 times) [2024-06-10 12:16:29,068][35957] Signal inference workers to resume experience collection... (600 times) [2024-06-10 12:16:29,096][35978] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-06-10 12:16:29,096][35978] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-06-10 12:16:31,447][35978] Updated weights for policy 0, policy_version 27155 (0.0037) [2024-06-10 12:16:33,402][35745] Fps is (10 sec: 47514.1, 60 sec: 45056.1, 300 sec: 45042.5). Total num frames: 445005824. Throughput: 0: 44612.9. Samples: 43595460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 12:16:33,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:16:34,701][35978] Updated weights for policy 0, policy_version 27165 (0.0036) [2024-06-10 12:16:38,402][35745] Fps is (10 sec: 45875.3, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 445202432. Throughput: 0: 44923.9. Samples: 43869800. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 12:16:38,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:16:38,779][35978] Updated weights for policy 0, policy_version 27175 (0.0035) [2024-06-10 12:16:42,242][35978] Updated weights for policy 0, policy_version 27185 (0.0032) [2024-06-10 12:16:43,402][35745] Fps is (10 sec: 40959.8, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 445415424. Throughput: 0: 44787.1. Samples: 44000360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-10 12:16:43,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:16:46,356][35978] Updated weights for policy 0, policy_version 27195 (0.0028) [2024-06-10 12:16:48,402][35745] Fps is (10 sec: 45874.8, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 445661184. Throughput: 0: 44632.8. Samples: 44265200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-10 12:16:48,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:16:49,951][35978] Updated weights for policy 0, policy_version 27205 (0.0032) [2024-06-10 12:16:53,401][35745] Fps is (10 sec: 45875.6, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 445874176. Throughput: 0: 44725.0. Samples: 44536420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-10 12:16:53,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:16:53,491][35978] Updated weights for policy 0, policy_version 27215 (0.0027) [2024-06-10 12:16:57,027][35978] Updated weights for policy 0, policy_version 27225 (0.0039) [2024-06-10 12:16:58,402][35745] Fps is (10 sec: 44237.4, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 446103552. Throughput: 0: 44490.7. Samples: 44661220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-10 12:16:58,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:17:00,916][35978] Updated weights for policy 0, policy_version 27235 (0.0035) [2024-06-10 12:17:03,402][35745] Fps is (10 sec: 47512.9, 60 sec: 45056.0, 300 sec: 44931.0). 
Total num frames: 446349312. Throughput: 0: 44683.5. Samples: 44937400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-10 12:17:03,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:17:03,969][35978] Updated weights for policy 0, policy_version 27245 (0.0041) [2024-06-10 12:17:08,108][35978] Updated weights for policy 0, policy_version 27255 (0.0036) [2024-06-10 12:17:08,402][35745] Fps is (10 sec: 44236.7, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 446545920. Throughput: 0: 44995.2. Samples: 45212600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-10 12:17:08,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:17:11,782][35978] Updated weights for policy 0, policy_version 27265 (0.0032) [2024-06-10 12:17:13,402][35745] Fps is (10 sec: 40960.1, 60 sec: 44509.9, 300 sec: 44653.3). Total num frames: 446758912. Throughput: 0: 44807.6. Samples: 45336580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-10 12:17:13,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:17:15,293][35978] Updated weights for policy 0, policy_version 27275 (0.0038) [2024-06-10 12:17:18,402][35745] Fps is (10 sec: 44236.5, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 446988288. Throughput: 0: 44835.0. Samples: 45613040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:17:18,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:17:19,209][35978] Updated weights for policy 0, policy_version 27285 (0.0036) [2024-06-10 12:17:22,668][35978] Updated weights for policy 0, policy_version 27295 (0.0025) [2024-06-10 12:17:23,402][35745] Fps is (10 sec: 45875.1, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 447217664. Throughput: 0: 44725.7. Samples: 45882460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:17:23,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:17:26,397][35978] Updated weights for policy 0, policy_version 27305 (0.0033) [2024-06-10 12:17:28,402][35745] Fps is (10 sec: 45875.2, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 447447040. Throughput: 0: 44705.3. Samples: 46012100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:17:28,402][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:17:30,443][35978] Updated weights for policy 0, policy_version 27315 (0.0039) [2024-06-10 12:17:33,402][35745] Fps is (10 sec: 45875.8, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 447676416. Throughput: 0: 44666.4. Samples: 46275180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:17:33,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:17:33,449][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000027325_447692800.pth... [2024-06-10 12:17:33,461][35978] Updated weights for policy 0, policy_version 27325 (0.0058) [2024-06-10 12:17:33,502][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000026666_436895744.pth [2024-06-10 12:17:37,685][35978] Updated weights for policy 0, policy_version 27335 (0.0035) [2024-06-10 12:17:38,402][35745] Fps is (10 sec: 45875.6, 60 sec: 45056.0, 300 sec: 44986.6). Total num frames: 447905792. Throughput: 0: 44753.3. Samples: 46550320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 12:17:38,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:17:41,108][35978] Updated weights for policy 0, policy_version 27345 (0.0033) [2024-06-10 12:17:43,402][35745] Fps is (10 sec: 42597.9, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 448102400. Throughput: 0: 44808.3. Samples: 46677600. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 12:17:43,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:17:44,946][35978] Updated weights for policy 0, policy_version 27355 (0.0046) [2024-06-10 12:17:48,402][35745] Fps is (10 sec: 42598.5, 60 sec: 44510.0, 300 sec: 44819.9). Total num frames: 448331776. Throughput: 0: 44617.4. Samples: 46945180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 12:17:48,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:17:48,634][35978] Updated weights for policy 0, policy_version 27365 (0.0042) [2024-06-10 12:17:52,257][35978] Updated weights for policy 0, policy_version 27375 (0.0045) [2024-06-10 12:17:53,402][35745] Fps is (10 sec: 45875.9, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 448561152. Throughput: 0: 44632.1. Samples: 47221040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 12:17:53,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:17:55,673][35978] Updated weights for policy 0, policy_version 27385 (0.0034) [2024-06-10 12:17:58,404][35745] Fps is (10 sec: 44226.2, 60 sec: 44508.1, 300 sec: 44764.1). Total num frames: 448774144. Throughput: 0: 44845.3. Samples: 47354720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-10 12:17:58,405][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:17:59,827][35978] Updated weights for policy 0, policy_version 27395 (0.0039) [2024-06-10 12:18:02,595][35978] Updated weights for policy 0, policy_version 27405 (0.0035) [2024-06-10 12:18:03,402][35745] Fps is (10 sec: 45875.2, 60 sec: 44510.0, 300 sec: 44820.0). Total num frames: 449019904. Throughput: 0: 44837.9. Samples: 47630740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-10 12:18:03,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:18:03,476][35957] Signal inference workers to stop experience collection... (650 times) [2024-06-10 12:18:03,477][35957] Signal inference workers to resume experience collection... 
(650 times) [2024-06-10 12:18:03,489][35978] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-06-10 12:18:03,489][35978] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-06-10 12:18:06,970][35978] Updated weights for policy 0, policy_version 27415 (0.0028) [2024-06-10 12:18:08,402][35745] Fps is (10 sec: 49163.6, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 449265664. Throughput: 0: 44693.4. Samples: 47893660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-10 12:18:08,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:18:10,319][35978] Updated weights for policy 0, policy_version 27425 (0.0039) [2024-06-10 12:18:13,402][35745] Fps is (10 sec: 42598.4, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 449445888. Throughput: 0: 44864.6. Samples: 48031000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 12:18:13,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:18:14,344][35978] Updated weights for policy 0, policy_version 27435 (0.0032) [2024-06-10 12:18:17,442][35978] Updated weights for policy 0, policy_version 27445 (0.0031) [2024-06-10 12:18:18,402][35745] Fps is (10 sec: 44236.6, 60 sec: 45329.1, 300 sec: 44819.9). Total num frames: 449708032. Throughput: 0: 45035.5. Samples: 48301780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 12:18:18,411][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:18:21,682][35978] Updated weights for policy 0, policy_version 27455 (0.0041) [2024-06-10 12:18:23,402][35745] Fps is (10 sec: 47513.2, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 449921024. Throughput: 0: 44827.5. Samples: 48567560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 12:18:23,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:18:24,686][35978] Updated weights for policy 0, policy_version 27465 (0.0037) [2024-06-10 12:18:28,402][35745] Fps is (10 sec: 40959.6, 60 sec: 44509.8, 300 sec: 44820.0). 
Total num frames: 450117632. Throughput: 0: 45105.7. Samples: 48707360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 12:18:28,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:18:28,775][35978] Updated weights for policy 0, policy_version 27475 (0.0031) [2024-06-10 12:18:31,815][35978] Updated weights for policy 0, policy_version 27485 (0.0034) [2024-06-10 12:18:33,405][35745] Fps is (10 sec: 45860.3, 60 sec: 45053.5, 300 sec: 44819.8). Total num frames: 450379776. Throughput: 0: 45046.0. Samples: 48972400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:18:33,405][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:18:36,378][35978] Updated weights for policy 0, policy_version 27495 (0.0032) [2024-06-10 12:18:38,401][35745] Fps is (10 sec: 47514.8, 60 sec: 44783.0, 300 sec: 44931.4). Total num frames: 450592768. Throughput: 0: 44839.2. Samples: 49238800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:18:38,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:18:39,361][35978] Updated weights for policy 0, policy_version 27505 (0.0054) [2024-06-10 12:18:43,402][35745] Fps is (10 sec: 40973.2, 60 sec: 44783.0, 300 sec: 44819.9). Total num frames: 450789376. Throughput: 0: 44922.3. Samples: 49376120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:18:43,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:18:43,474][35978] Updated weights for policy 0, policy_version 27515 (0.0039) [2024-06-10 12:18:46,859][35978] Updated weights for policy 0, policy_version 27525 (0.0035) [2024-06-10 12:18:48,402][35745] Fps is (10 sec: 45874.5, 60 sec: 45329.0, 300 sec: 44820.0). Total num frames: 451051520. Throughput: 0: 44878.6. Samples: 49650280. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:18:48,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:18:51,056][35978] Updated weights for policy 0, policy_version 27535 (0.0032) [2024-06-10 12:18:53,402][35745] Fps is (10 sec: 45874.9, 60 sec: 44782.8, 300 sec: 44819.9). Total num frames: 451248128. Throughput: 0: 44909.7. Samples: 49914600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 12:18:53,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:18:53,982][35978] Updated weights for policy 0, policy_version 27545 (0.0029) [2024-06-10 12:18:58,401][35745] Fps is (10 sec: 39322.0, 60 sec: 44511.7, 300 sec: 44820.0). Total num frames: 451444736. Throughput: 0: 44817.4. Samples: 50047780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 12:18:58,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:18:58,524][35978] Updated weights for policy 0, policy_version 27555 (0.0026) [2024-06-10 12:19:01,142][35978] Updated weights for policy 0, policy_version 27565 (0.0029) [2024-06-10 12:19:03,402][35745] Fps is (10 sec: 45876.0, 60 sec: 44782.9, 300 sec: 44820.5). Total num frames: 451706880. Throughput: 0: 44707.2. Samples: 50313600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 12:19:03,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:19:05,681][35978] Updated weights for policy 0, policy_version 27575 (0.0025) [2024-06-10 12:19:08,404][35745] Fps is (10 sec: 49140.2, 60 sec: 44508.1, 300 sec: 44875.2). Total num frames: 451936256. Throughput: 0: 44826.1. Samples: 50584840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-10 12:19:08,405][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:19:08,426][35978] Updated weights for policy 0, policy_version 27585 (0.0022) [2024-06-10 12:19:09,181][35957] Signal inference workers to stop experience collection... (700 times) [2024-06-10 12:19:09,181][35957] Signal inference workers to resume experience collection... 
(700 times) [2024-06-10 12:19:09,195][35978] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-06-10 12:19:09,195][35978] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-06-10 12:19:12,781][35978] Updated weights for policy 0, policy_version 27595 (0.0032) [2024-06-10 12:19:13,401][35745] Fps is (10 sec: 42598.7, 60 sec: 44783.0, 300 sec: 44708.9). Total num frames: 452132864. Throughput: 0: 44834.5. Samples: 50724900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-10 12:19:13,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:19:15,897][35978] Updated weights for policy 0, policy_version 27605 (0.0035) [2024-06-10 12:19:18,402][35745] Fps is (10 sec: 44247.4, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 452378624. Throughput: 0: 44731.3. Samples: 50985160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-10 12:19:18,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:19:20,463][35978] Updated weights for policy 0, policy_version 27615 (0.0028) [2024-06-10 12:19:23,130][35978] Updated weights for policy 0, policy_version 27625 (0.0038) [2024-06-10 12:19:23,402][35745] Fps is (10 sec: 49151.6, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 452624384. Throughput: 0: 44951.0. Samples: 51261600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-10 12:19:23,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:19:27,551][35978] Updated weights for policy 0, policy_version 27635 (0.0030) [2024-06-10 12:19:28,402][35745] Fps is (10 sec: 42597.8, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 452804608. Throughput: 0: 44768.9. Samples: 51390720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:19:28,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:19:30,275][35978] Updated weights for policy 0, policy_version 27645 (0.0033) [2024-06-10 12:19:33,402][35745] Fps is (10 sec: 44236.6, 60 sec: 44785.4, 300 sec: 44875.5). 
Total num frames: 453066752. Throughput: 0: 44642.7. Samples: 51659200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:19:33,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:19:33,414][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000027653_453066752.pth... [2024-06-10 12:19:33,479][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000026996_442302464.pth [2024-06-10 12:19:35,028][35978] Updated weights for policy 0, policy_version 27655 (0.0040) [2024-06-10 12:19:37,558][35978] Updated weights for policy 0, policy_version 27665 (0.0029) [2024-06-10 12:19:38,402][35745] Fps is (10 sec: 49152.1, 60 sec: 45055.9, 300 sec: 44820.0). Total num frames: 453296128. Throughput: 0: 44699.2. Samples: 51926060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:19:38,404][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:19:42,091][35978] Updated weights for policy 0, policy_version 27675 (0.0032) [2024-06-10 12:19:43,402][35745] Fps is (10 sec: 40960.2, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 453476352. Throughput: 0: 44790.7. Samples: 52063360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:19:43,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:19:45,133][35978] Updated weights for policy 0, policy_version 27685 (0.0035) [2024-06-10 12:19:48,402][35745] Fps is (10 sec: 40960.2, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 453705728. Throughput: 0: 44797.3. Samples: 52329480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 12:19:48,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:19:49,579][35978] Updated weights for policy 0, policy_version 27695 (0.0035) [2024-06-10 12:19:52,318][35978] Updated weights for policy 0, policy_version 27705 (0.0031) [2024-06-10 12:19:53,402][35745] Fps is (10 sec: 49152.0, 60 sec: 45329.2, 300 sec: 44875.5). Total num frames: 453967872. Throughput: 0: 44808.6. 
Samples: 52601120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 12:19:53,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:19:56,972][35978] Updated weights for policy 0, policy_version 27715 (0.0021) [2024-06-10 12:19:58,402][35745] Fps is (10 sec: 45875.3, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 454164480. Throughput: 0: 44687.4. Samples: 52735840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 12:19:58,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:19:59,733][35978] Updated weights for policy 0, policy_version 27725 (0.0047) [2024-06-10 12:20:03,402][35745] Fps is (10 sec: 40959.3, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 454377472. Throughput: 0: 44863.8. Samples: 53004040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-10 12:20:03,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:20:04,055][35978] Updated weights for policy 0, policy_version 27735 (0.0022) [2024-06-10 12:20:06,809][35978] Updated weights for policy 0, policy_version 27745 (0.0047) [2024-06-10 12:20:08,408][35745] Fps is (10 sec: 45846.1, 60 sec: 44780.0, 300 sec: 44763.5). Total num frames: 454623232. Throughput: 0: 44655.9. Samples: 53271400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:20:08,409][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:20:11,167][35978] Updated weights for policy 0, policy_version 27755 (0.0032) [2024-06-10 12:20:13,401][35745] Fps is (10 sec: 45876.0, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 454836224. Throughput: 0: 44951.3. Samples: 53413520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:20:13,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:20:14,257][35978] Updated weights for policy 0, policy_version 27765 (0.0029) [2024-06-10 12:20:18,402][35745] Fps is (10 sec: 42625.1, 60 sec: 44509.8, 300 sec: 44764.8). Total num frames: 455049216. Throughput: 0: 44768.8. Samples: 53673800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:20:18,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:20:18,994][35978] Updated weights for policy 0, policy_version 27775 (0.0030) [2024-06-10 12:20:19,544][35957] Signal inference workers to stop experience collection... (750 times) [2024-06-10 12:20:19,545][35957] Signal inference workers to resume experience collection... (750 times) [2024-06-10 12:20:19,556][35978] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-06-10 12:20:19,556][35978] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-06-10 12:20:21,642][35978] Updated weights for policy 0, policy_version 27785 (0.0039) [2024-06-10 12:20:23,402][35745] Fps is (10 sec: 47512.9, 60 sec: 44782.8, 300 sec: 44765.3). Total num frames: 455311360. Throughput: 0: 44812.9. Samples: 53942640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-10 12:20:23,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:20:26,117][35978] Updated weights for policy 0, policy_version 27795 (0.0028) [2024-06-10 12:20:28,402][35745] Fps is (10 sec: 47513.8, 60 sec: 45329.1, 300 sec: 44820.0). Total num frames: 455524352. Throughput: 0: 44970.6. Samples: 54087040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-10 12:20:28,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:20:29,129][35978] Updated weights for policy 0, policy_version 27805 (0.0031) [2024-06-10 12:20:33,404][35745] Fps is (10 sec: 39312.8, 60 sec: 43962.0, 300 sec: 44653.0). Total num frames: 455704576. Throughput: 0: 44676.4. Samples: 54340020. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-10 12:20:33,405][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:20:33,580][35978] Updated weights for policy 0, policy_version 27815 (0.0042) [2024-06-10 12:20:36,321][35978] Updated weights for policy 0, policy_version 27825 (0.0030) [2024-06-10 12:20:38,401][35745] Fps is (10 sec: 44237.6, 60 sec: 44510.0, 300 sec: 44764.5). Total num frames: 455966720. Throughput: 0: 44725.4. Samples: 54613760. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-10 12:20:38,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:20:41,233][35978] Updated weights for policy 0, policy_version 27835 (0.0031) [2024-06-10 12:20:43,401][35745] Fps is (10 sec: 49163.7, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 456196096. Throughput: 0: 44921.4. Samples: 54757300. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-10 12:20:43,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:20:43,508][35978] Updated weights for policy 0, policy_version 27845 (0.0031) [2024-06-10 12:20:48,402][35745] Fps is (10 sec: 39320.4, 60 sec: 44236.7, 300 sec: 44653.3). Total num frames: 456359936. Throughput: 0: 44690.2. Samples: 55015100. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-10 12:20:48,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:20:48,504][35978] Updated weights for policy 0, policy_version 27855 (0.0026) [2024-06-10 12:20:50,949][35978] Updated weights for policy 0, policy_version 27865 (0.0035) [2024-06-10 12:20:53,402][35745] Fps is (10 sec: 44236.5, 60 sec: 44509.8, 300 sec: 44820.0). Total num frames: 456638464. Throughput: 0: 44695.6. Samples: 55282420. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-10 12:20:53,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:20:55,537][35978] Updated weights for policy 0, policy_version 27875 (0.0021) [2024-06-10 12:20:58,402][35745] Fps is (10 sec: 49152.2, 60 sec: 44782.9, 300 sec: 44764.4). 
Total num frames: 456851456. Throughput: 0: 44803.0. Samples: 55429660. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-10 12:20:58,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:20:58,511][35978] Updated weights for policy 0, policy_version 27885 (0.0032) [2024-06-10 12:21:03,030][35978] Updated weights for policy 0, policy_version 27895 (0.0034) [2024-06-10 12:21:03,402][35745] Fps is (10 sec: 40959.6, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 457048064. Throughput: 0: 44805.3. Samples: 55690040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-10 12:21:03,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:21:05,602][35978] Updated weights for policy 0, policy_version 27905 (0.0041) [2024-06-10 12:21:08,402][35745] Fps is (10 sec: 47514.1, 60 sec: 45060.8, 300 sec: 44875.5). Total num frames: 457326592. Throughput: 0: 44682.8. Samples: 55953360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-10 12:21:08,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:21:10,485][35978] Updated weights for policy 0, policy_version 27915 (0.0024) [2024-06-10 12:21:12,977][35978] Updated weights for policy 0, policy_version 27925 (0.0031) [2024-06-10 12:21:13,401][35745] Fps is (10 sec: 49152.9, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 457539584. Throughput: 0: 44677.9. Samples: 56097540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-10 12:21:13,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:21:17,909][35978] Updated weights for policy 0, policy_version 27935 (0.0029) [2024-06-10 12:21:18,402][35745] Fps is (10 sec: 39321.6, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 457719808. Throughput: 0: 44884.1. Samples: 56359700. 
Policy #0 lag: (min: 1.0, avg: 7.4, max: 21.0) [2024-06-10 12:21:18,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:21:20,479][35978] Updated weights for policy 0, policy_version 27945 (0.0026) [2024-06-10 12:21:23,402][35745] Fps is (10 sec: 44236.3, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 457981952. Throughput: 0: 44744.7. Samples: 56627280. Policy #0 lag: (min: 1.0, avg: 7.4, max: 21.0) [2024-06-10 12:21:23,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:21:24,874][35978] Updated weights for policy 0, policy_version 27955 (0.0026) [2024-06-10 12:21:27,486][35978] Updated weights for policy 0, policy_version 27965 (0.0039) [2024-06-10 12:21:28,402][35745] Fps is (10 sec: 49151.8, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 458211328. Throughput: 0: 44852.8. Samples: 56775680. Policy #0 lag: (min: 1.0, avg: 7.4, max: 21.0) [2024-06-10 12:21:28,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:21:28,940][35957] Signal inference workers to stop experience collection... (800 times) [2024-06-10 12:21:28,941][35957] Signal inference workers to resume experience collection... (800 times) [2024-06-10 12:21:28,973][35978] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-06-10 12:21:28,973][35978] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-06-10 12:21:32,078][35978] Updated weights for policy 0, policy_version 27975 (0.0035) [2024-06-10 12:21:33,401][35745] Fps is (10 sec: 40960.5, 60 sec: 44784.7, 300 sec: 44708.9). Total num frames: 458391552. Throughput: 0: 45038.4. Samples: 57041820. Policy #0 lag: (min: 1.0, avg: 7.4, max: 21.0) [2024-06-10 12:21:33,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:21:33,547][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000027980_458424320.pth... 
[2024-06-10 12:21:33,607][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000027325_447692800.pth [2024-06-10 12:21:34,831][35978] Updated weights for policy 0, policy_version 27985 (0.0028) [2024-06-10 12:21:38,402][35745] Fps is (10 sec: 44236.9, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 458653696. Throughput: 0: 44865.3. Samples: 57301360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:21:38,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:21:38,407][35957] Saving new best policy, reward=0.304! [2024-06-10 12:21:39,631][35978] Updated weights for policy 0, policy_version 27995 (0.0039) [2024-06-10 12:21:41,942][35978] Updated weights for policy 0, policy_version 28005 (0.0026) [2024-06-10 12:21:43,402][35745] Fps is (10 sec: 49151.1, 60 sec: 44782.8, 300 sec: 44820.0). Total num frames: 458883072. Throughput: 0: 44623.1. Samples: 57437700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:21:43,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:21:46,954][35978] Updated weights for policy 0, policy_version 28015 (0.0034) [2024-06-10 12:21:48,402][35745] Fps is (10 sec: 45875.0, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 459112448. Throughput: 0: 45076.5. Samples: 57718480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:21:48,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:21:49,378][35978] Updated weights for policy 0, policy_version 28025 (0.0033) [2024-06-10 12:21:53,402][35745] Fps is (10 sec: 40960.0, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 459292672. Throughput: 0: 44996.8. Samples: 57978220. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:21:53,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:21:54,019][35978] Updated weights for policy 0, policy_version 28035 (0.0041) [2024-06-10 12:21:56,762][35978] Updated weights for policy 0, policy_version 28045 (0.0034) [2024-06-10 12:21:58,401][35745] Fps is (10 sec: 44237.3, 60 sec: 45056.1, 300 sec: 44764.4). Total num frames: 459554816. Throughput: 0: 44759.1. Samples: 58111700. Policy #0 lag: (min: 2.0, avg: 9.7, max: 24.0) [2024-06-10 12:21:58,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:22:01,040][35978] Updated weights for policy 0, policy_version 28055 (0.0042) [2024-06-10 12:22:03,402][35745] Fps is (10 sec: 49152.3, 60 sec: 45602.2, 300 sec: 44875.5). Total num frames: 459784192. Throughput: 0: 44950.6. Samples: 58382480. Policy #0 lag: (min: 2.0, avg: 9.7, max: 24.0) [2024-06-10 12:22:03,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:22:03,877][35978] Updated weights for policy 0, policy_version 28065 (0.0025) [2024-06-10 12:22:08,402][35745] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 44708.9). Total num frames: 459948032. Throughput: 0: 44967.2. Samples: 58650800. Policy #0 lag: (min: 2.0, avg: 9.7, max: 24.0) [2024-06-10 12:22:08,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:22:08,690][35978] Updated weights for policy 0, policy_version 28075 (0.0030) [2024-06-10 12:22:11,056][35978] Updated weights for policy 0, policy_version 28085 (0.0023) [2024-06-10 12:22:13,402][35745] Fps is (10 sec: 44236.2, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 460226560. Throughput: 0: 44443.0. Samples: 58775620. Policy #0 lag: (min: 2.0, avg: 9.7, max: 24.0) [2024-06-10 12:22:13,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:22:16,185][35978] Updated weights for policy 0, policy_version 28095 (0.0031) [2024-06-10 12:22:18,402][35745] Fps is (10 sec: 50790.4, 60 sec: 45602.1, 300 sec: 44875.5). 
Total num frames: 460455936. Throughput: 0: 44739.0. Samples: 59055080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-10 12:22:18,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:22:18,581][35978] Updated weights for policy 0, policy_version 28105 (0.0031) [2024-06-10 12:22:23,201][35978] Updated weights for policy 0, policy_version 28115 (0.0038) [2024-06-10 12:22:23,402][35745] Fps is (10 sec: 40960.5, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 460636160. Throughput: 0: 45057.8. Samples: 59328960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-10 12:22:23,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:22:25,935][35978] Updated weights for policy 0, policy_version 28125 (0.0037) [2024-06-10 12:22:28,402][35745] Fps is (10 sec: 42598.1, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 460881920. Throughput: 0: 44690.7. Samples: 59448780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-10 12:22:28,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:22:30,426][35978] Updated weights for policy 0, policy_version 28135 (0.0048) [2024-06-10 12:22:33,011][35978] Updated weights for policy 0, policy_version 28145 (0.0037) [2024-06-10 12:22:33,402][35745] Fps is (10 sec: 49151.9, 60 sec: 45602.0, 300 sec: 44820.0). Total num frames: 461127680. Throughput: 0: 44587.1. Samples: 59724900. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-10 12:22:33,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:22:33,845][35957] Signal inference workers to stop experience collection... (850 times) [2024-06-10 12:22:33,880][35978] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-06-10 12:22:33,901][35957] Signal inference workers to resume experience collection... 
(850 times) [2024-06-10 12:22:33,908][35978] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-06-10 12:22:38,142][35978] Updated weights for policy 0, policy_version 28155 (0.0024) [2024-06-10 12:22:38,402][35745] Fps is (10 sec: 40960.2, 60 sec: 43963.7, 300 sec: 44708.9). Total num frames: 461291520. Throughput: 0: 44773.4. Samples: 59993020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-10 12:22:38,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:22:40,591][35978] Updated weights for policy 0, policy_version 28165 (0.0029) [2024-06-10 12:22:43,401][35745] Fps is (10 sec: 42598.9, 60 sec: 44510.0, 300 sec: 44820.0). Total num frames: 461553664. Throughput: 0: 44492.9. Samples: 60113880. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-10 12:22:43,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:22:45,321][35978] Updated weights for policy 0, policy_version 28175 (0.0044) [2024-06-10 12:22:48,263][35978] Updated weights for policy 0, policy_version 28185 (0.0024) [2024-06-10 12:22:48,401][35745] Fps is (10 sec: 49152.3, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 461783040. Throughput: 0: 44520.5. Samples: 60385900. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-10 12:22:48,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:22:52,323][35978] Updated weights for policy 0, policy_version 28195 (0.0031) [2024-06-10 12:22:53,402][35745] Fps is (10 sec: 42595.8, 60 sec: 44782.6, 300 sec: 44764.7). Total num frames: 461979648. Throughput: 0: 44653.2. Samples: 60660220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-10 12:22:53,403][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:22:55,562][35978] Updated weights for policy 0, policy_version 28205 (0.0029) [2024-06-10 12:22:58,402][35745] Fps is (10 sec: 40959.6, 60 sec: 43963.6, 300 sec: 44653.3). Total num frames: 462192640. Throughput: 0: 44682.8. Samples: 60786340. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-10 12:22:58,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:22:59,966][35978] Updated weights for policy 0, policy_version 28215 (0.0029) [2024-06-10 12:23:02,729][35978] Updated weights for policy 0, policy_version 28225 (0.0035) [2024-06-10 12:23:03,402][35745] Fps is (10 sec: 45877.6, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 462438400. Throughput: 0: 44271.5. Samples: 61047300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-10 12:23:03,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:23:07,689][35978] Updated weights for policy 0, policy_version 28235 (0.0033) [2024-06-10 12:23:08,408][35745] Fps is (10 sec: 47483.4, 60 sec: 45324.2, 300 sec: 44819.0). Total num frames: 462667776. Throughput: 0: 44436.4. Samples: 61328880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-10 12:23:08,417][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:23:10,033][35978] Updated weights for policy 0, policy_version 28245 (0.0032) [2024-06-10 12:23:13,402][35745] Fps is (10 sec: 44236.7, 60 sec: 44236.9, 300 sec: 44653.3). Total num frames: 462880768. Throughput: 0: 44630.7. Samples: 61457160. Policy #0 lag: (min: 2.0, avg: 11.8, max: 24.0) [2024-06-10 12:23:13,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:23:14,739][35978] Updated weights for policy 0, policy_version 28255 (0.0037) [2024-06-10 12:23:17,332][35978] Updated weights for policy 0, policy_version 28265 (0.0022) [2024-06-10 12:23:18,402][35745] Fps is (10 sec: 45904.6, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 463126528. Throughput: 0: 44429.4. Samples: 61724220. Policy #0 lag: (min: 2.0, avg: 11.8, max: 24.0) [2024-06-10 12:23:18,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:23:21,870][35978] Updated weights for policy 0, policy_version 28275 (0.0035) [2024-06-10 12:23:23,402][35745] Fps is (10 sec: 45874.8, 60 sec: 45055.9, 300 sec: 44820.0). 
Total num frames: 463339520. Throughput: 0: 44639.4. Samples: 62001800. Policy #0 lag: (min: 2.0, avg: 11.8, max: 24.0) [2024-06-10 12:23:23,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:23:24,728][35978] Updated weights for policy 0, policy_version 28285 (0.0030) [2024-06-10 12:23:28,402][35745] Fps is (10 sec: 42597.7, 60 sec: 44509.8, 300 sec: 44653.8). Total num frames: 463552512. Throughput: 0: 44782.9. Samples: 62129120. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-10 12:23:28,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:23:29,130][35978] Updated weights for policy 0, policy_version 28295 (0.0022) [2024-06-10 12:23:31,834][35978] Updated weights for policy 0, policy_version 28305 (0.0029) [2024-06-10 12:23:33,402][35745] Fps is (10 sec: 45875.5, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 463798272. Throughput: 0: 44687.0. Samples: 62396820. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-10 12:23:33,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:23:33,413][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000028308_463798272.pth... [2024-06-10 12:23:33,473][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000027653_453066752.pth [2024-06-10 12:23:36,661][35978] Updated weights for policy 0, policy_version 28315 (0.0037) [2024-06-10 12:23:37,091][35957] Signal inference workers to stop experience collection... (900 times) [2024-06-10 12:23:37,093][35957] Signal inference workers to resume experience collection... (900 times) [2024-06-10 12:23:37,107][35978] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-06-10 12:23:37,123][35978] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-06-10 12:23:38,401][35745] Fps is (10 sec: 47514.6, 60 sec: 45602.2, 300 sec: 44875.5). Total num frames: 464027648. Throughput: 0: 44606.4. Samples: 62667480. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-10 12:23:38,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:23:38,856][35978] Updated weights for policy 0, policy_version 28325 (0.0029) [2024-06-10 12:23:43,402][35745] Fps is (10 sec: 42598.4, 60 sec: 44509.8, 300 sec: 44653.3). Total num frames: 464224256. Throughput: 0: 44982.2. Samples: 62810540. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-10 12:23:43,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:23:43,660][35978] Updated weights for policy 0, policy_version 28335 (0.0036) [2024-06-10 12:23:46,390][35978] Updated weights for policy 0, policy_version 28345 (0.0032) [2024-06-10 12:23:48,402][35745] Fps is (10 sec: 42598.4, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 464453632. Throughput: 0: 44949.0. Samples: 63070000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-10 12:23:48,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:23:51,199][35978] Updated weights for policy 0, policy_version 28355 (0.0039) [2024-06-10 12:23:53,402][35745] Fps is (10 sec: 47513.1, 60 sec: 45329.3, 300 sec: 44931.0). Total num frames: 464699392. Throughput: 0: 44570.2. Samples: 63334260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-10 12:23:53,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:23:53,893][35978] Updated weights for policy 0, policy_version 28365 (0.0042) [2024-06-10 12:23:58,402][35745] Fps is (10 sec: 42598.3, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 464879616. Throughput: 0: 44692.1. Samples: 63468300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-10 12:23:58,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:23:58,662][35978] Updated weights for policy 0, policy_version 28375 (0.0026) [2024-06-10 12:24:01,228][35978] Updated weights for policy 0, policy_version 28385 (0.0030) [2024-06-10 12:24:03,402][35745] Fps is (10 sec: 42599.0, 60 sec: 44782.9, 300 sec: 44709.2). 
Total num frames: 465125376. Throughput: 0: 44623.1. Samples: 63732260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-10 12:24:03,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:24:05,836][35978] Updated weights for policy 0, policy_version 28395 (0.0041) [2024-06-10 12:24:08,232][35978] Updated weights for policy 0, policy_version 28405 (0.0027) [2024-06-10 12:24:08,402][35745] Fps is (10 sec: 50789.9, 60 sec: 45333.8, 300 sec: 44931.0). Total num frames: 465387520. Throughput: 0: 44440.5. Samples: 64001620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 12:24:08,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:24:12,943][35978] Updated weights for policy 0, policy_version 28415 (0.0033) [2024-06-10 12:24:13,402][35745] Fps is (10 sec: 42598.4, 60 sec: 44509.9, 300 sec: 44653.3). Total num frames: 465551360. Throughput: 0: 44780.5. Samples: 64144240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 12:24:13,403][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:24:15,613][35978] Updated weights for policy 0, policy_version 28425 (0.0035) [2024-06-10 12:24:18,401][35745] Fps is (10 sec: 39322.1, 60 sec: 44236.8, 300 sec: 44597.8). Total num frames: 465780736. Throughput: 0: 44633.0. Samples: 64405300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 12:24:18,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:24:20,488][35978] Updated weights for policy 0, policy_version 28435 (0.0028) [2024-06-10 12:24:22,862][35978] Updated weights for policy 0, policy_version 28445 (0.0024) [2024-06-10 12:24:23,402][35745] Fps is (10 sec: 49151.6, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 466042880. Throughput: 0: 44489.1. Samples: 64669500. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 12:24:23,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:24:27,946][35978] Updated weights for policy 0, policy_version 28455 (0.0040) [2024-06-10 12:24:28,404][35745] Fps is (10 sec: 45864.2, 60 sec: 44781.3, 300 sec: 44653.0). Total num frames: 466239488. Throughput: 0: 44386.7. Samples: 64808040. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-10 12:24:28,405][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:24:30,326][35978] Updated weights for policy 0, policy_version 28465 (0.0037) [2024-06-10 12:24:33,085][35957] Signal inference workers to stop experience collection... (950 times) [2024-06-10 12:24:33,092][35957] Signal inference workers to resume experience collection... (950 times) [2024-06-10 12:24:33,099][35978] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-06-10 12:24:33,131][35978] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-06-10 12:24:33,402][35745] Fps is (10 sec: 40960.5, 60 sec: 44236.8, 300 sec: 44597.8). Total num frames: 466452480. Throughput: 0: 44621.3. Samples: 65077960. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-10 12:24:33,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:24:35,174][35978] Updated weights for policy 0, policy_version 28475 (0.0040) [2024-06-10 12:24:37,423][35978] Updated weights for policy 0, policy_version 28485 (0.0027) [2024-06-10 12:24:38,402][35745] Fps is (10 sec: 45886.1, 60 sec: 44509.8, 300 sec: 44820.0). Total num frames: 466698240. Throughput: 0: 44569.5. Samples: 65339880. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-10 12:24:38,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:24:42,243][35978] Updated weights for policy 0, policy_version 28495 (0.0026) [2024-06-10 12:24:43,404][35745] Fps is (10 sec: 45866.4, 60 sec: 44781.5, 300 sec: 44764.1). Total num frames: 466911232. Throughput: 0: 44745.6. Samples: 65481940. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 12:24:43,404][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:24:44,846][35978] Updated weights for policy 0, policy_version 28505 (0.0031) [2024-06-10 12:24:48,402][35745] Fps is (10 sec: 40960.0, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 467107840. Throughput: 0: 44855.2. Samples: 65750740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 12:24:48,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:24:49,788][35978] Updated weights for policy 0, policy_version 28515 (0.0026) [2024-06-10 12:24:52,234][35978] Updated weights for policy 0, policy_version 28525 (0.0030) [2024-06-10 12:24:53,401][35745] Fps is (10 sec: 45884.6, 60 sec: 44510.1, 300 sec: 44764.4). Total num frames: 467369984. Throughput: 0: 44486.8. Samples: 66003520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 12:24:53,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:24:57,278][35978] Updated weights for policy 0, policy_version 28535 (0.0036) [2024-06-10 12:24:58,401][35745] Fps is (10 sec: 47514.1, 60 sec: 45056.1, 300 sec: 44764.5). Total num frames: 467582976. Throughput: 0: 44494.8. Samples: 66146500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 12:24:58,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:24:59,724][35978] Updated weights for policy 0, policy_version 28545 (0.0051) [2024-06-10 12:25:03,402][35745] Fps is (10 sec: 40959.7, 60 sec: 44236.9, 300 sec: 44598.8). Total num frames: 467779584. Throughput: 0: 44760.0. Samples: 66419500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 12:25:03,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:25:04,510][35978] Updated weights for policy 0, policy_version 28555 (0.0038) [2024-06-10 12:25:06,712][35978] Updated weights for policy 0, policy_version 28565 (0.0029) [2024-06-10 12:25:08,402][35745] Fps is (10 sec: 44235.8, 60 sec: 43963.7, 300 sec: 44708.9). 
Total num frames: 468025344. Throughput: 0: 44813.8. Samples: 66686120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 12:25:08,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:25:11,721][35978] Updated weights for policy 0, policy_version 28575 (0.0032) [2024-06-10 12:25:13,402][35745] Fps is (10 sec: 49151.1, 60 sec: 45329.0, 300 sec: 44819.9). Total num frames: 468271104. Throughput: 0: 44778.2. Samples: 66822960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 12:25:13,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:25:13,906][35978] Updated weights for policy 0, policy_version 28585 (0.0043) [2024-06-10 12:25:18,404][35745] Fps is (10 sec: 44226.9, 60 sec: 44781.2, 300 sec: 44597.5). Total num frames: 468467712. Throughput: 0: 44818.2. Samples: 67094880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 12:25:18,405][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:25:19,084][35978] Updated weights for policy 0, policy_version 28595 (0.0042) [2024-06-10 12:25:21,435][35978] Updated weights for policy 0, policy_version 28605 (0.0036) [2024-06-10 12:25:23,408][35745] Fps is (10 sec: 40934.8, 60 sec: 43959.2, 300 sec: 44596.9). Total num frames: 468680704. Throughput: 0: 44838.6. Samples: 67357900. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-10 12:25:23,409][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:25:26,385][35978] Updated weights for policy 0, policy_version 28615 (0.0035) [2024-06-10 12:25:28,402][35745] Fps is (10 sec: 49163.2, 60 sec: 45330.8, 300 sec: 44931.4). Total num frames: 468959232. Throughput: 0: 44764.1. Samples: 67496240. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-10 12:25:28,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:25:28,702][35978] Updated weights for policy 0, policy_version 28625 (0.0025) [2024-06-10 12:25:33,402][35745] Fps is (10 sec: 45903.6, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 469139456. 
Throughput: 0: 44806.1. Samples: 67767020. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-10 12:25:33,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:25:33,458][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000028635_469155840.pth... [2024-06-10 12:25:33,463][35978] Updated weights for policy 0, policy_version 28635 (0.0030) [2024-06-10 12:25:33,519][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000027980_458424320.pth [2024-06-10 12:25:35,849][35978] Updated weights for policy 0, policy_version 28645 (0.0035) [2024-06-10 12:25:38,402][35745] Fps is (10 sec: 39321.5, 60 sec: 44236.7, 300 sec: 44597.8). Total num frames: 469352448. Throughput: 0: 45068.3. Samples: 68031600. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-10 12:25:38,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:25:41,088][35978] Updated weights for policy 0, policy_version 28655 (0.0030) [2024-06-10 12:25:41,787][35957] Signal inference workers to stop experience collection... (1000 times) [2024-06-10 12:25:41,788][35957] Signal inference workers to resume experience collection... (1000 times) [2024-06-10 12:25:41,817][35978] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-06-10 12:25:41,817][35978] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-06-10 12:25:43,255][35978] Updated weights for policy 0, policy_version 28665 (0.0034) [2024-06-10 12:25:43,402][35745] Fps is (10 sec: 50790.1, 60 sec: 45603.5, 300 sec: 45042.1). Total num frames: 469647360. Throughput: 0: 44815.7. Samples: 68163220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-10 12:25:43,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:25:48,402][35745] Fps is (10 sec: 44237.2, 60 sec: 44782.9, 300 sec: 44597.8). Total num frames: 469794816. Throughput: 0: 44613.3. Samples: 68427100. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-10 12:25:48,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:25:48,544][35978] Updated weights for policy 0, policy_version 28675 (0.0033) [2024-06-10 12:25:50,809][35978] Updated weights for policy 0, policy_version 28685 (0.0028) [2024-06-10 12:25:53,402][35745] Fps is (10 sec: 36045.3, 60 sec: 43963.6, 300 sec: 44597.8). Total num frames: 470007808. Throughput: 0: 44716.1. Samples: 68698340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-10 12:25:53,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:25:55,609][35978] Updated weights for policy 0, policy_version 28695 (0.0031) [2024-06-10 12:25:57,935][35978] Updated weights for policy 0, policy_version 28705 (0.0029) [2024-06-10 12:25:58,402][35745] Fps is (10 sec: 50789.5, 60 sec: 45328.9, 300 sec: 44931.0). Total num frames: 470302720. Throughput: 0: 44689.4. Samples: 68833980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-10 12:25:58,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:26:02,842][35978] Updated weights for policy 0, policy_version 28715 (0.0032) [2024-06-10 12:26:03,402][35745] Fps is (10 sec: 47513.8, 60 sec: 45056.0, 300 sec: 44597.8). Total num frames: 470482944. Throughput: 0: 44703.7. Samples: 69106440. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-10 12:26:03,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:26:05,461][35978] Updated weights for policy 0, policy_version 28725 (0.0030) [2024-06-10 12:26:08,402][35745] Fps is (10 sec: 37683.8, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 470679552. Throughput: 0: 44766.7. Samples: 69372120. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-10 12:26:08,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:26:10,398][35978] Updated weights for policy 0, policy_version 28735 (0.0033) [2024-06-10 12:26:12,611][35978] Updated weights for policy 0, policy_version 28745 (0.0024) [2024-06-10 12:26:13,402][35745] Fps is (10 sec: 47513.7, 60 sec: 44783.1, 300 sec: 44875.5). Total num frames: 470958080. Throughput: 0: 44574.3. Samples: 69502080. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-10 12:26:13,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:26:17,964][35978] Updated weights for policy 0, policy_version 28755 (0.0032) [2024-06-10 12:26:18,402][35745] Fps is (10 sec: 47513.0, 60 sec: 44784.6, 300 sec: 44653.3). Total num frames: 471154688. Throughput: 0: 44680.9. Samples: 69777660. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-10 12:26:18,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:26:20,321][35978] Updated weights for policy 0, policy_version 28765 (0.0020) [2024-06-10 12:26:23,402][35745] Fps is (10 sec: 39321.6, 60 sec: 44514.6, 300 sec: 44542.3). Total num frames: 471351296. Throughput: 0: 44609.9. Samples: 70039040. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-10 12:26:23,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:26:25,133][35978] Updated weights for policy 0, policy_version 28775 (0.0037) [2024-06-10 12:26:26,080][35957] Signal inference workers to stop experience collection... (1050 times) [2024-06-10 12:26:26,124][35978] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-06-10 12:26:26,130][35957] Signal inference workers to resume experience collection... 
(1050 times) [2024-06-10 12:26:26,141][35978] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-06-10 12:26:27,689][35978] Updated weights for policy 0, policy_version 28785 (0.0022) [2024-06-10 12:26:28,402][35745] Fps is (10 sec: 45874.8, 60 sec: 44236.7, 300 sec: 44819.9). Total num frames: 471613440. Throughput: 0: 44463.1. Samples: 70164060. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-10 12:26:28,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:26:32,313][35978] Updated weights for policy 0, policy_version 28795 (0.0036) [2024-06-10 12:26:33,402][35745] Fps is (10 sec: 47513.1, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 471826432. Throughput: 0: 44791.9. Samples: 70442740. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-10 12:26:33,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:26:34,966][35978] Updated weights for policy 0, policy_version 28805 (0.0024) [2024-06-10 12:26:38,402][35745] Fps is (10 sec: 40960.4, 60 sec: 44509.8, 300 sec: 44542.3). Total num frames: 472023040. Throughput: 0: 44718.2. Samples: 70710660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 12:26:38,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:26:39,810][35978] Updated weights for policy 0, policy_version 28815 (0.0024) [2024-06-10 12:26:42,234][35978] Updated weights for policy 0, policy_version 28825 (0.0028) [2024-06-10 12:26:43,402][35745] Fps is (10 sec: 45874.9, 60 sec: 43963.8, 300 sec: 44653.3). Total num frames: 472285184. Throughput: 0: 44374.2. Samples: 70830820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 12:26:43,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:26:47,283][35978] Updated weights for policy 0, policy_version 28835 (0.0035) [2024-06-10 12:26:48,402][35745] Fps is (10 sec: 47514.1, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 472498176. Throughput: 0: 44333.3. Samples: 71101440. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 12:26:48,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:26:49,945][35978] Updated weights for policy 0, policy_version 28845 (0.0041) [2024-06-10 12:26:53,402][35745] Fps is (10 sec: 39322.3, 60 sec: 44509.9, 300 sec: 44486.7). Total num frames: 472678400. Throughput: 0: 44327.6. Samples: 71366860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-10 12:26:53,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:26:54,646][35978] Updated weights for policy 0, policy_version 28855 (0.0028) [2024-06-10 12:26:57,302][35978] Updated weights for policy 0, policy_version 28865 (0.0023) [2024-06-10 12:26:58,404][35745] Fps is (10 sec: 44226.2, 60 sec: 43962.1, 300 sec: 44597.4). Total num frames: 472940544. Throughput: 0: 44200.3. Samples: 71491200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-10 12:26:58,405][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:27:01,681][35978] Updated weights for policy 0, policy_version 28875 (0.0038) [2024-06-10 12:27:03,402][35745] Fps is (10 sec: 49151.7, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 473169920. Throughput: 0: 44245.8. Samples: 71768720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-10 12:27:03,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:27:04,478][35978] Updated weights for policy 0, policy_version 28885 (0.0021) [2024-06-10 12:27:08,402][35745] Fps is (10 sec: 42608.6, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 473366528. Throughput: 0: 44498.6. Samples: 72041480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-10 12:27:08,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:27:08,951][35978] Updated weights for policy 0, policy_version 28895 (0.0025) [2024-06-10 12:27:11,782][35978] Updated weights for policy 0, policy_version 28905 (0.0028) [2024-06-10 12:27:13,402][35745] Fps is (10 sec: 42598.6, 60 sec: 43963.7, 300 sec: 44542.3). 
Total num frames: 473595904. Throughput: 0: 44648.2. Samples: 72173220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 19.0) [2024-06-10 12:27:13,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:27:16,470][35978] Updated weights for policy 0, policy_version 28915 (0.0028) [2024-06-10 12:27:18,402][35745] Fps is (10 sec: 47513.9, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 473841664. Throughput: 0: 44382.8. Samples: 72439960. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-10 12:27:18,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:27:19,418][35978] Updated weights for policy 0, policy_version 28925 (0.0029) [2024-06-10 12:27:23,408][35745] Fps is (10 sec: 44208.4, 60 sec: 44778.2, 300 sec: 44596.8). Total num frames: 474038272. Throughput: 0: 44512.4. Samples: 72714000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-10 12:27:23,409][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:27:23,564][35978] Updated weights for policy 0, policy_version 28935 (0.0037) [2024-06-10 12:27:26,842][35978] Updated weights for policy 0, policy_version 28945 (0.0032) [2024-06-10 12:27:28,402][35745] Fps is (10 sec: 42597.3, 60 sec: 44236.8, 300 sec: 44542.2). Total num frames: 474267648. Throughput: 0: 44788.4. Samples: 72846300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-10 12:27:28,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:27:30,703][35978] Updated weights for policy 0, policy_version 28955 (0.0036) [2024-06-10 12:27:30,705][35957] Signal inference workers to stop experience collection... (1100 times) [2024-06-10 12:27:30,705][35957] Signal inference workers to resume experience collection... (1100 times) [2024-06-10 12:27:30,735][35978] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-06-10 12:27:30,735][35978] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-06-10 12:27:33,402][35745] Fps is (10 sec: 47544.2, 60 sec: 44783.0, 300 sec: 44820.0). 
Total num frames: 474513408. Throughput: 0: 44847.2. Samples: 73119560. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-10 12:27:33,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:27:33,419][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000028962_474513408.pth... [2024-06-10 12:27:33,470][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000028308_463798272.pth [2024-06-10 12:27:33,982][35978] Updated weights for policy 0, policy_version 28965 (0.0044) [2024-06-10 12:27:38,154][35978] Updated weights for policy 0, policy_version 28975 (0.0029) [2024-06-10 12:27:38,402][35745] Fps is (10 sec: 45875.9, 60 sec: 45056.0, 300 sec: 44653.3). Total num frames: 474726400. Throughput: 0: 44747.0. Samples: 73380480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:27:38,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:27:41,363][35978] Updated weights for policy 0, policy_version 28985 (0.0036) [2024-06-10 12:27:43,402][35745] Fps is (10 sec: 42598.4, 60 sec: 44236.9, 300 sec: 44597.8). Total num frames: 474939392. Throughput: 0: 44866.9. Samples: 73510100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:27:43,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:27:45,586][35978] Updated weights for policy 0, policy_version 28995 (0.0050) [2024-06-10 12:27:48,402][35745] Fps is (10 sec: 45875.7, 60 sec: 44783.0, 300 sec: 44764.5). Total num frames: 475185152. Throughput: 0: 44717.0. Samples: 73780980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:27:48,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:27:49,109][35978] Updated weights for policy 0, policy_version 29005 (0.0039) [2024-06-10 12:27:52,961][35978] Updated weights for policy 0, policy_version 29015 (0.0026) [2024-06-10 12:27:53,402][35745] Fps is (10 sec: 47513.0, 60 sec: 45602.0, 300 sec: 44820.0). Total num frames: 475414528. Throughput: 0: 44721.3. 
Samples: 74053940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:27:53,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:27:56,131][35978] Updated weights for policy 0, policy_version 29025 (0.0022) [2024-06-10 12:27:58,402][35745] Fps is (10 sec: 40960.0, 60 sec: 44238.6, 300 sec: 44597.8). Total num frames: 475594752. Throughput: 0: 44704.0. Samples: 74184900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-10 12:27:58,402][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:27:59,849][35978] Updated weights for policy 0, policy_version 29035 (0.0033) [2024-06-10 12:28:03,402][35745] Fps is (10 sec: 44237.1, 60 sec: 44782.9, 300 sec: 44709.8). Total num frames: 475856896. Throughput: 0: 45037.7. Samples: 74466660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-10 12:28:03,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:28:03,435][35978] Updated weights for policy 0, policy_version 29045 (0.0025) [2024-06-10 12:28:07,217][35978] Updated weights for policy 0, policy_version 29055 (0.0032) [2024-06-10 12:28:08,404][35745] Fps is (10 sec: 47502.2, 60 sec: 45054.2, 300 sec: 44708.5). Total num frames: 476069888. Throughput: 0: 44665.8. Samples: 74723780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-10 12:28:08,405][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:28:10,918][35978] Updated weights for policy 0, policy_version 29065 (0.0032) [2024-06-10 12:28:13,404][35745] Fps is (10 sec: 42588.5, 60 sec: 44781.2, 300 sec: 44597.5). Total num frames: 476282880. Throughput: 0: 44615.2. Samples: 74854080. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-10 12:28:13,405][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:28:14,574][35978] Updated weights for policy 0, policy_version 29075 (0.0031) [2024-06-10 12:28:18,189][35978] Updated weights for policy 0, policy_version 29085 (0.0036) [2024-06-10 12:28:18,402][35745] Fps is (10 sec: 45886.2, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 476528640. Throughput: 0: 44686.7. Samples: 75130460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 12:28:18,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:28:22,064][35978] Updated weights for policy 0, policy_version 29095 (0.0035) [2024-06-10 12:28:23,402][35745] Fps is (10 sec: 44246.8, 60 sec: 44787.7, 300 sec: 44653.4). Total num frames: 476725248. Throughput: 0: 44893.8. Samples: 75400700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 12:28:23,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:28:25,318][35978] Updated weights for policy 0, policy_version 29105 (0.0032) [2024-06-10 12:28:28,402][35745] Fps is (10 sec: 42598.2, 60 sec: 44783.1, 300 sec: 44597.8). Total num frames: 476954624. Throughput: 0: 44835.1. Samples: 75527680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 12:28:28,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:28:29,442][35978] Updated weights for policy 0, policy_version 29115 (0.0035) [2024-06-10 12:28:32,608][35978] Updated weights for policy 0, policy_version 29125 (0.0034) [2024-06-10 12:28:33,402][35745] Fps is (10 sec: 47513.7, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 477200384. Throughput: 0: 44995.5. Samples: 75805780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-10 12:28:33,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:28:36,482][35978] Updated weights for policy 0, policy_version 29135 (0.0033) [2024-06-10 12:28:38,402][35745] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44597.8). 
Total num frames: 477380608. Throughput: 0: 44987.2. Samples: 76078360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 12:28:38,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:28:40,102][35978] Updated weights for policy 0, policy_version 29145 (0.0038) [2024-06-10 12:28:43,402][35745] Fps is (10 sec: 44236.9, 60 sec: 45055.9, 300 sec: 44708.9). Total num frames: 477642752. Throughput: 0: 44837.3. Samples: 76202580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 12:28:43,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:28:43,864][35978] Updated weights for policy 0, policy_version 29155 (0.0034) [2024-06-10 12:28:47,455][35978] Updated weights for policy 0, policy_version 29165 (0.0030) [2024-06-10 12:28:48,402][35745] Fps is (10 sec: 49152.4, 60 sec: 44782.9, 300 sec: 44653.4). Total num frames: 477872128. Throughput: 0: 44624.5. Samples: 76474760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 12:28:48,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:28:50,888][35978] Updated weights for policy 0, policy_version 29175 (0.0035) [2024-06-10 12:28:53,402][35745] Fps is (10 sec: 40959.8, 60 sec: 43963.7, 300 sec: 44653.3). Total num frames: 478052352. Throughput: 0: 45054.3. Samples: 76751120. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-10 12:28:53,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:28:54,118][35957] Signal inference workers to stop experience collection... (1150 times) [2024-06-10 12:28:54,118][35957] Signal inference workers to resume experience collection... 
(1150 times) [2024-06-10 12:28:54,159][35978] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-06-10 12:28:54,159][35978] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-06-10 12:28:54,573][35978] Updated weights for policy 0, policy_version 29185 (0.0027) [2024-06-10 12:28:58,402][35745] Fps is (10 sec: 44236.7, 60 sec: 45329.1, 300 sec: 44708.9). Total num frames: 478314496. Throughput: 0: 45021.9. Samples: 76879960. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-10 12:28:58,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:28:58,434][35978] Updated weights for policy 0, policy_version 29195 (0.0032) [2024-06-10 12:29:02,049][35978] Updated weights for policy 0, policy_version 29205 (0.0029) [2024-06-10 12:29:03,402][35745] Fps is (10 sec: 49151.8, 60 sec: 44782.9, 300 sec: 44597.8). Total num frames: 478543872. Throughput: 0: 44715.4. Samples: 77142660. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-10 12:29:03,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:29:05,851][35978] Updated weights for policy 0, policy_version 29215 (0.0032) [2024-06-10 12:29:08,402][35745] Fps is (10 sec: 42598.2, 60 sec: 44511.6, 300 sec: 44708.9). Total num frames: 478740480. Throughput: 0: 44784.0. Samples: 77415980. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-10 12:29:08,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:29:09,645][35978] Updated weights for policy 0, policy_version 29225 (0.0036) [2024-06-10 12:29:13,038][35978] Updated weights for policy 0, policy_version 29235 (0.0030) [2024-06-10 12:29:13,402][35745] Fps is (10 sec: 44237.3, 60 sec: 45057.8, 300 sec: 44764.4). Total num frames: 478986240. Throughput: 0: 44716.9. Samples: 77539940. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 12:29:13,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:29:16,894][35978] Updated weights for policy 0, policy_version 29245 (0.0036) [2024-06-10 12:29:18,402][35745] Fps is (10 sec: 45874.7, 60 sec: 44509.7, 300 sec: 44597.8). Total num frames: 479199232. Throughput: 0: 44501.2. Samples: 77808340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 12:29:18,403][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:29:20,186][35978] Updated weights for policy 0, policy_version 29255 (0.0042) [2024-06-10 12:29:23,402][35745] Fps is (10 sec: 40959.2, 60 sec: 44509.8, 300 sec: 44598.1). Total num frames: 479395840. Throughput: 0: 44567.0. Samples: 78083880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 12:29:23,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:29:24,245][35978] Updated weights for policy 0, policy_version 29265 (0.0041) [2024-06-10 12:29:27,649][35978] Updated weights for policy 0, policy_version 29275 (0.0025) [2024-06-10 12:29:28,402][35745] Fps is (10 sec: 45875.1, 60 sec: 45055.9, 300 sec: 44764.4). Total num frames: 479657984. Throughput: 0: 44661.2. Samples: 78212340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-10 12:29:28,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:29:31,704][35978] Updated weights for policy 0, policy_version 29285 (0.0039) [2024-06-10 12:29:33,402][35745] Fps is (10 sec: 47514.2, 60 sec: 44509.9, 300 sec: 44653.3). Total num frames: 479870976. Throughput: 0: 44427.9. Samples: 78474020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 12:29:33,403][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:29:33,592][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000029291_479903744.pth... 
[2024-06-10 12:29:33,646][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000028635_469155840.pth [2024-06-10 12:29:35,137][35978] Updated weights for policy 0, policy_version 29295 (0.0022) [2024-06-10 12:29:38,402][35745] Fps is (10 sec: 42599.0, 60 sec: 45056.0, 300 sec: 44653.6). Total num frames: 480083968. Throughput: 0: 44516.9. Samples: 78754380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 12:29:38,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:29:39,069][35978] Updated weights for policy 0, policy_version 29305 (0.0031) [2024-06-10 12:29:42,311][35978] Updated weights for policy 0, policy_version 29315 (0.0024) [2024-06-10 12:29:43,402][35745] Fps is (10 sec: 44236.3, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 480313344. Throughput: 0: 44442.5. Samples: 78879880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 12:29:43,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:29:46,571][35978] Updated weights for policy 0, policy_version 29325 (0.0034) [2024-06-10 12:29:48,402][35745] Fps is (10 sec: 47513.4, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 480559104. Throughput: 0: 44636.1. Samples: 79151280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 12:29:48,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:29:49,351][35978] Updated weights for policy 0, policy_version 29335 (0.0019) [2024-06-10 12:29:53,402][35745] Fps is (10 sec: 44237.4, 60 sec: 45056.0, 300 sec: 44653.3). Total num frames: 480755712. Throughput: 0: 44726.2. Samples: 79428660. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-10 12:29:53,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:29:53,786][35978] Updated weights for policy 0, policy_version 29345 (0.0035) [2024-06-10 12:29:54,671][35957] Signal inference workers to stop experience collection... 
(1200 times) [2024-06-10 12:29:54,724][35978] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-06-10 12:29:54,731][35957] Signal inference workers to resume experience collection... (1200 times) [2024-06-10 12:29:54,739][35978] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-06-10 12:29:56,884][35978] Updated weights for policy 0, policy_version 29355 (0.0037) [2024-06-10 12:29:58,402][35745] Fps is (10 sec: 42598.5, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 480985088. Throughput: 0: 44749.3. Samples: 79553660. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-10 12:29:58,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:30:01,120][35978] Updated weights for policy 0, policy_version 29365 (0.0033) [2024-06-10 12:30:03,402][35745] Fps is (10 sec: 45875.2, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 481214464. Throughput: 0: 44593.0. Samples: 79815020. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-10 12:30:03,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:30:04,800][35978] Updated weights for policy 0, policy_version 29375 (0.0044) [2024-06-10 12:30:08,402][35745] Fps is (10 sec: 42598.2, 60 sec: 44509.8, 300 sec: 44542.3). Total num frames: 481411072. Throughput: 0: 44398.8. Samples: 80081820. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-10 12:30:08,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:30:08,860][35978] Updated weights for policy 0, policy_version 29385 (0.0039) [2024-06-10 12:30:12,002][35978] Updated weights for policy 0, policy_version 29395 (0.0032) [2024-06-10 12:30:13,402][35745] Fps is (10 sec: 40959.7, 60 sec: 43963.7, 300 sec: 44598.1). Total num frames: 481624064. Throughput: 0: 44337.0. Samples: 80207500. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 12:30:13,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:30:16,089][35978] Updated weights for policy 0, policy_version 29405 (0.0032) [2024-06-10 12:30:18,402][35745] Fps is (10 sec: 49152.5, 60 sec: 45056.1, 300 sec: 44820.9). Total num frames: 481902592. Throughput: 0: 44657.4. Samples: 80483600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 12:30:18,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:30:19,353][35978] Updated weights for policy 0, policy_version 29415 (0.0032) [2024-06-10 12:30:23,356][35978] Updated weights for policy 0, policy_version 29425 (0.0029) [2024-06-10 12:30:23,402][35745] Fps is (10 sec: 47514.2, 60 sec: 45056.2, 300 sec: 44542.3). Total num frames: 482099200. Throughput: 0: 44572.9. Samples: 80760160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 12:30:23,402][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:30:26,606][35978] Updated weights for policy 0, policy_version 29435 (0.0032) [2024-06-10 12:30:28,402][35745] Fps is (10 sec: 39321.5, 60 sec: 43963.9, 300 sec: 44597.8). Total num frames: 482295808. Throughput: 0: 44554.4. Samples: 80884820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 12:30:28,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:30:30,558][35978] Updated weights for policy 0, policy_version 29445 (0.0037) [2024-06-10 12:30:33,402][35745] Fps is (10 sec: 45874.8, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 482557952. Throughput: 0: 44567.1. Samples: 81156800. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-10 12:30:33,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:30:33,732][35978] Updated weights for policy 0, policy_version 29455 (0.0031) [2024-06-10 12:30:37,893][35978] Updated weights for policy 0, policy_version 29465 (0.0037) [2024-06-10 12:30:38,402][35745] Fps is (10 sec: 47513.8, 60 sec: 44783.0, 300 sec: 44486.8). 
Total num frames: 482770944. Throughput: 0: 44322.7. Samples: 81423180. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-10 12:30:38,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:30:41,335][35978] Updated weights for policy 0, policy_version 29475 (0.0033) [2024-06-10 12:30:43,402][35745] Fps is (10 sec: 39321.5, 60 sec: 43963.8, 300 sec: 44597.8). Total num frames: 482951168. Throughput: 0: 44489.3. Samples: 81555680. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-10 12:30:43,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:30:45,224][35978] Updated weights for policy 0, policy_version 29485 (0.0027) [2024-06-10 12:30:48,394][35978] Updated weights for policy 0, policy_version 29495 (0.0033) [2024-06-10 12:30:48,402][35745] Fps is (10 sec: 47513.0, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 483246080. Throughput: 0: 44660.8. Samples: 81824760. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-10 12:30:48,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:30:52,451][35978] Updated weights for policy 0, policy_version 29505 (0.0040) [2024-06-10 12:30:53,402][35745] Fps is (10 sec: 50790.6, 60 sec: 45056.0, 300 sec: 44597.8). Total num frames: 483459072. Throughput: 0: 44830.3. Samples: 82099180. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-10 12:30:53,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:30:55,803][35978] Updated weights for policy 0, policy_version 29515 (0.0023) [2024-06-10 12:30:58,402][35745] Fps is (10 sec: 37683.6, 60 sec: 43963.8, 300 sec: 44542.3). Total num frames: 483622912. Throughput: 0: 44841.9. Samples: 82225380. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-10 12:30:58,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:30:58,907][35957] Signal inference workers to stop experience collection... (1250 times) [2024-06-10 12:30:58,907][35957] Signal inference workers to resume experience collection... 
(1250 times) [2024-06-10 12:30:58,920][35978] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-06-10 12:30:58,921][35978] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-06-10 12:31:00,015][35978] Updated weights for policy 0, policy_version 29525 (0.0031) [2024-06-10 12:31:03,055][35978] Updated weights for policy 0, policy_version 29535 (0.0041) [2024-06-10 12:31:03,402][35745] Fps is (10 sec: 44236.8, 60 sec: 44782.9, 300 sec: 44819.9). Total num frames: 483901440. Throughput: 0: 44597.3. Samples: 82490480. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-10 12:31:03,402][35745] Avg episode reward: [(0, '0.285')] [2024-06-10 12:31:07,440][35978] Updated weights for policy 0, policy_version 29545 (0.0036) [2024-06-10 12:31:08,402][35745] Fps is (10 sec: 52428.0, 60 sec: 45602.1, 300 sec: 44708.9). Total num frames: 484147200. Throughput: 0: 44477.2. Samples: 82761640. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-10 12:31:08,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:31:10,555][35978] Updated weights for policy 0, policy_version 29555 (0.0036) [2024-06-10 12:31:13,402][35745] Fps is (10 sec: 39321.5, 60 sec: 44509.9, 300 sec: 44542.3). Total num frames: 484294656. Throughput: 0: 44649.3. Samples: 82894040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-10 12:31:13,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:31:14,581][35978] Updated weights for policy 0, policy_version 29565 (0.0033) [2024-06-10 12:31:17,549][35978] Updated weights for policy 0, policy_version 29575 (0.0033) [2024-06-10 12:31:18,402][35745] Fps is (10 sec: 42598.6, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 484573184. Throughput: 0: 44464.0. Samples: 83157680. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-10 12:31:18,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:31:21,870][35978] Updated weights for policy 0, policy_version 29585 (0.0025) [2024-06-10 12:31:23,402][35745] Fps is (10 sec: 52428.7, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 484818944. Throughput: 0: 44814.1. Samples: 83439820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-10 12:31:23,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:31:24,699][35978] Updated weights for policy 0, policy_version 29595 (0.0041) [2024-06-10 12:31:28,401][35745] Fps is (10 sec: 39322.0, 60 sec: 44509.9, 300 sec: 44542.3). Total num frames: 484966400. Throughput: 0: 44892.6. Samples: 83575840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-10 12:31:28,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:31:29,202][35978] Updated weights for policy 0, policy_version 29605 (0.0027) [2024-06-10 12:31:32,295][35978] Updated weights for policy 0, policy_version 29615 (0.0025) [2024-06-10 12:31:33,402][35745] Fps is (10 sec: 40960.1, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 485228544. Throughput: 0: 44824.9. Samples: 83841880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-10 12:31:33,408][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:31:33,434][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000029616_485228544.pth... [2024-06-10 12:31:33,487][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000028962_474513408.pth [2024-06-10 12:31:36,677][35978] Updated weights for policy 0, policy_version 29625 (0.0032) [2024-06-10 12:31:38,401][35745] Fps is (10 sec: 52428.8, 60 sec: 45329.1, 300 sec: 44764.5). Total num frames: 485490688. Throughput: 0: 44590.3. Samples: 84105740. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-10 12:31:38,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:31:39,835][35978] Updated weights for policy 0, policy_version 29635 (0.0032) [2024-06-10 12:31:43,404][35745] Fps is (10 sec: 44226.8, 60 sec: 45327.4, 300 sec: 44653.0). Total num frames: 485670912. Throughput: 0: 44891.0. Samples: 84245580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-10 12:31:43,405][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:31:43,923][35978] Updated weights for policy 0, policy_version 29645 (0.0034) [2024-06-10 12:31:46,845][35978] Updated weights for policy 0, policy_version 29655 (0.0037) [2024-06-10 12:31:48,402][35745] Fps is (10 sec: 40959.7, 60 sec: 44236.8, 300 sec: 44820.0). Total num frames: 485900288. Throughput: 0: 44981.4. Samples: 84514640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-10 12:31:48,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:31:49,853][35957] Signal inference workers to stop experience collection... (1300 times) [2024-06-10 12:31:49,892][35978] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-06-10 12:31:49,908][35957] Signal inference workers to resume experience collection... (1300 times) [2024-06-10 12:31:49,908][35978] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-06-10 12:31:51,209][35978] Updated weights for policy 0, policy_version 29665 (0.0041) [2024-06-10 12:31:53,402][35745] Fps is (10 sec: 49163.5, 60 sec: 45056.0, 300 sec: 44820.3). Total num frames: 486162432. Throughput: 0: 44736.6. Samples: 84774780. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-10 12:31:53,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:31:54,041][35978] Updated weights for policy 0, policy_version 29675 (0.0037) [2024-06-10 12:31:58,402][35745] Fps is (10 sec: 42598.4, 60 sec: 45056.0, 300 sec: 44597.8). Total num frames: 486326272. Throughput: 0: 45107.2. 
Samples: 84923860. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-10 12:31:58,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 12:31:58,590][35957] Saving new best policy, reward=0.305! [2024-06-10 12:31:58,598][35978] Updated weights for policy 0, policy_version 29685 (0.0032) [2024-06-10 12:32:01,835][35978] Updated weights for policy 0, policy_version 29695 (0.0028) [2024-06-10 12:32:03,402][35745] Fps is (10 sec: 39321.2, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 486555648. Throughput: 0: 44947.1. Samples: 85180300. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-10 12:32:03,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:32:06,026][35978] Updated weights for policy 0, policy_version 29705 (0.0033) [2024-06-10 12:32:08,402][35745] Fps is (10 sec: 49151.6, 60 sec: 44509.9, 300 sec: 44819.9). Total num frames: 486817792. Throughput: 0: 44544.4. Samples: 85444320. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-10 12:32:08,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:32:09,447][35978] Updated weights for policy 0, policy_version 29715 (0.0030) [2024-06-10 12:32:13,402][35745] Fps is (10 sec: 44236.8, 60 sec: 45056.0, 300 sec: 44597.8). Total num frames: 486998016. Throughput: 0: 44806.1. Samples: 85592120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:32:13,403][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:32:13,415][35978] Updated weights for policy 0, policy_version 29725 (0.0034) [2024-06-10 12:32:16,453][35978] Updated weights for policy 0, policy_version 29735 (0.0036) [2024-06-10 12:32:18,402][35745] Fps is (10 sec: 40960.6, 60 sec: 44236.9, 300 sec: 44709.9). Total num frames: 487227392. Throughput: 0: 44770.3. Samples: 85856540. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:32:18,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:32:20,382][35978] Updated weights for policy 0, policy_version 29745 (0.0027) [2024-06-10 12:32:23,401][35745] Fps is (10 sec: 49152.7, 60 sec: 44510.0, 300 sec: 44820.0). Total num frames: 487489536. Throughput: 0: 44785.3. Samples: 86121080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:32:23,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:32:23,507][35978] Updated weights for policy 0, policy_version 29755 (0.0031) [2024-06-10 12:32:27,663][35978] Updated weights for policy 0, policy_version 29765 (0.0026) [2024-06-10 12:32:28,402][35745] Fps is (10 sec: 49152.0, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 487718912. Throughput: 0: 45017.5. Samples: 86271260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:32:28,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:32:31,088][35978] Updated weights for policy 0, policy_version 29775 (0.0029) [2024-06-10 12:32:33,405][35745] Fps is (10 sec: 39307.4, 60 sec: 44234.2, 300 sec: 44597.3). Total num frames: 487882752. Throughput: 0: 44770.7. Samples: 86529480. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-10 12:32:33,406][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:32:35,084][35978] Updated weights for policy 0, policy_version 29785 (0.0027) [2024-06-10 12:32:38,402][35745] Fps is (10 sec: 40959.9, 60 sec: 43963.7, 300 sec: 44708.9). Total num frames: 488128512. Throughput: 0: 44788.9. Samples: 86790280. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-10 12:32:38,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:32:38,783][35978] Updated weights for policy 0, policy_version 29795 (0.0035) [2024-06-10 12:32:42,505][35978] Updated weights for policy 0, policy_version 29805 (0.0033) [2024-06-10 12:32:43,404][35745] Fps is (10 sec: 49157.9, 60 sec: 45056.0, 300 sec: 44708.5). 
Total num frames: 488374272. Throughput: 0: 44742.6. Samples: 86937380. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-10 12:32:43,405][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:32:45,810][35978] Updated weights for policy 0, policy_version 29815 (0.0028) [2024-06-10 12:32:48,401][35745] Fps is (10 sec: 42598.6, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 488554496. Throughput: 0: 45005.9. Samples: 87205560. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-10 12:32:48,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:32:49,716][35978] Updated weights for policy 0, policy_version 29825 (0.0038) [2024-06-10 12:32:53,386][35978] Updated weights for policy 0, policy_version 29835 (0.0027) [2024-06-10 12:32:53,402][35745] Fps is (10 sec: 44247.2, 60 sec: 44236.8, 300 sec: 44820.0). Total num frames: 488816640. Throughput: 0: 44804.5. Samples: 87460520. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-10 12:32:53,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:32:56,939][35957] Signal inference workers to stop experience collection... (1350 times) [2024-06-10 12:32:56,948][35957] Signal inference workers to resume experience collection... (1350 times) [2024-06-10 12:32:56,964][35978] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-06-10 12:32:56,964][35978] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-06-10 12:32:57,238][35978] Updated weights for policy 0, policy_version 29845 (0.0037) [2024-06-10 12:32:58,402][35745] Fps is (10 sec: 49151.8, 60 sec: 45329.1, 300 sec: 44708.9). Total num frames: 489046016. Throughput: 0: 44707.7. Samples: 87603960. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-10 12:32:58,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:33:00,656][35978] Updated weights for policy 0, policy_version 29855 (0.0035) [2024-06-10 12:33:03,404][35745] Fps is (10 sec: 40950.4, 60 sec: 44508.2, 300 sec: 44597.8). Total num frames: 489226240. Throughput: 0: 44685.6. Samples: 87867500. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-10 12:33:03,405][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:33:04,333][35978] Updated weights for policy 0, policy_version 29865 (0.0032) [2024-06-10 12:33:08,362][35978] Updated weights for policy 0, policy_version 29875 (0.0025) [2024-06-10 12:33:08,402][35745] Fps is (10 sec: 42598.0, 60 sec: 44236.8, 300 sec: 44709.2). Total num frames: 489472000. Throughput: 0: 44719.4. Samples: 88133460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:33:08,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:33:11,861][35978] Updated weights for policy 0, policy_version 29885 (0.0035) [2024-06-10 12:33:13,401][35745] Fps is (10 sec: 50802.9, 60 sec: 45602.3, 300 sec: 44764.4). Total num frames: 489734144. Throughput: 0: 44489.8. Samples: 88273300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:33:13,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:33:15,347][35978] Updated weights for policy 0, policy_version 29895 (0.0028) [2024-06-10 12:33:18,402][35745] Fps is (10 sec: 42599.0, 60 sec: 44509.9, 300 sec: 44653.4). Total num frames: 489897984. Throughput: 0: 44779.1. Samples: 88544380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:33:18,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:33:18,973][35978] Updated weights for policy 0, policy_version 29905 (0.0034) [2024-06-10 12:33:22,487][35978] Updated weights for policy 0, policy_version 29915 (0.0023) [2024-06-10 12:33:23,402][35745] Fps is (10 sec: 40959.5, 60 sec: 44236.7, 300 sec: 44708.9). 
Total num frames: 490143744. Throughput: 0: 44896.0. Samples: 88810600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:33:23,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:33:26,379][35978] Updated weights for policy 0, policy_version 29925 (0.0033) [2024-06-10 12:33:28,402][35745] Fps is (10 sec: 49151.2, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 490389504. Throughput: 0: 44707.6. Samples: 88949120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:33:28,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:33:29,799][35978] Updated weights for policy 0, policy_version 29935 (0.0037) [2024-06-10 12:33:33,402][35745] Fps is (10 sec: 45874.8, 60 sec: 45331.7, 300 sec: 44819.9). Total num frames: 490602496. Throughput: 0: 44795.4. Samples: 89221360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:33:33,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:33:33,412][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000029944_490602496.pth... [2024-06-10 12:33:33,470][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000029291_479903744.pth [2024-06-10 12:33:33,626][35978] Updated weights for policy 0, policy_version 29945 (0.0028) [2024-06-10 12:33:37,414][35978] Updated weights for policy 0, policy_version 29955 (0.0030) [2024-06-10 12:33:38,402][35745] Fps is (10 sec: 40960.6, 60 sec: 44509.9, 300 sec: 44597.8). Total num frames: 490799104. Throughput: 0: 44930.3. Samples: 89482380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:33:38,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:33:40,803][35978] Updated weights for policy 0, policy_version 29965 (0.0037) [2024-06-10 12:33:43,402][35745] Fps is (10 sec: 47513.4, 60 sec: 45057.7, 300 sec: 44764.4). Total num frames: 491077632. Throughput: 0: 44744.7. Samples: 89617480. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 12:33:43,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:33:44,408][35978] Updated weights for policy 0, policy_version 29975 (0.0023) [2024-06-10 12:33:48,261][35978] Updated weights for policy 0, policy_version 29985 (0.0029) [2024-06-10 12:33:48,402][35745] Fps is (10 sec: 47513.3, 60 sec: 45329.0, 300 sec: 44820.0). Total num frames: 491274240. Throughput: 0: 45040.1. Samples: 89894200. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-10 12:33:48,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:33:51,907][35978] Updated weights for policy 0, policy_version 29995 (0.0029) [2024-06-10 12:33:53,402][35745] Fps is (10 sec: 39322.1, 60 sec: 44236.8, 300 sec: 44597.8). Total num frames: 491470848. Throughput: 0: 44844.1. Samples: 90151440. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-10 12:33:53,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:33:55,683][35978] Updated weights for policy 0, policy_version 30005 (0.0030) [2024-06-10 12:33:58,404][35745] Fps is (10 sec: 45864.6, 60 sec: 44781.2, 300 sec: 44708.5). Total num frames: 491732992. Throughput: 0: 44653.2. Samples: 90282800. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-10 12:33:58,405][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:33:59,217][35978] Updated weights for policy 0, policy_version 30015 (0.0027) [2024-06-10 12:34:02,874][35978] Updated weights for policy 0, policy_version 30025 (0.0034) [2024-06-10 12:34:03,402][35745] Fps is (10 sec: 49151.8, 60 sec: 45603.9, 300 sec: 44820.0). Total num frames: 491962368. Throughput: 0: 44789.2. Samples: 90559900. Policy #0 lag: (min: 0.0, avg: 7.6, max: 22.0) [2024-06-10 12:34:03,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:34:03,872][35957] Signal inference workers to stop experience collection... (1400 times) [2024-06-10 12:34:03,873][35957] Signal inference workers to resume experience collection... 
(1400 times) [2024-06-10 12:34:03,894][35978] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-06-10 12:34:03,894][35978] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-06-10 12:34:06,467][35978] Updated weights for policy 0, policy_version 30035 (0.0039) [2024-06-10 12:34:08,402][35745] Fps is (10 sec: 40969.3, 60 sec: 44509.9, 300 sec: 44597.8). Total num frames: 492142592. Throughput: 0: 44944.0. Samples: 90833080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 12:34:08,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:34:10,100][35978] Updated weights for policy 0, policy_version 30045 (0.0029) [2024-06-10 12:34:13,402][35745] Fps is (10 sec: 44237.0, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 492404736. Throughput: 0: 44604.5. Samples: 90956320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 12:34:13,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:34:13,419][35978] Updated weights for policy 0, policy_version 30055 (0.0037) [2024-06-10 12:34:17,502][35978] Updated weights for policy 0, policy_version 30065 (0.0043) [2024-06-10 12:34:18,402][35745] Fps is (10 sec: 49152.2, 60 sec: 45602.1, 300 sec: 44875.5). Total num frames: 492634112. Throughput: 0: 44829.9. Samples: 91238700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 12:34:18,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:34:20,895][35978] Updated weights for policy 0, policy_version 30075 (0.0042) [2024-06-10 12:34:23,402][35745] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 492797952. Throughput: 0: 44901.3. Samples: 91502940. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-10 12:34:23,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:34:24,791][35978] Updated weights for policy 0, policy_version 30085 (0.0038) [2024-06-10 12:34:28,320][35978] Updated weights for policy 0, policy_version 30095 (0.0039) [2024-06-10 12:34:28,408][35745] Fps is (10 sec: 44210.0, 60 sec: 44778.5, 300 sec: 44763.5). Total num frames: 493076480. Throughput: 0: 44554.1. Samples: 91622680. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-10 12:34:28,408][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:34:32,061][35978] Updated weights for policy 0, policy_version 30105 (0.0037) [2024-06-10 12:34:33,402][35745] Fps is (10 sec: 52428.4, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 493322240. Throughput: 0: 44644.4. Samples: 91903200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-10 12:34:33,404][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:34:35,588][35978] Updated weights for policy 0, policy_version 30115 (0.0029) [2024-06-10 12:34:38,402][35745] Fps is (10 sec: 40984.6, 60 sec: 44782.9, 300 sec: 44653.4). Total num frames: 493486080. Throughput: 0: 44999.5. Samples: 92176420. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-10 12:34:38,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:34:39,362][35978] Updated weights for policy 0, policy_version 30125 (0.0026) [2024-06-10 12:34:42,757][35978] Updated weights for policy 0, policy_version 30135 (0.0028) [2024-06-10 12:34:43,402][35745] Fps is (10 sec: 40960.1, 60 sec: 44236.9, 300 sec: 44653.3). Total num frames: 493731840. Throughput: 0: 44833.8. Samples: 92300220. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-10 12:34:43,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:34:46,983][35978] Updated weights for policy 0, policy_version 30145 (0.0025) [2024-06-10 12:34:48,402][35745] Fps is (10 sec: 49152.4, 60 sec: 45056.0, 300 sec: 44820.0). 
Total num frames: 493977600. Throughput: 0: 44770.3. Samples: 92574560. Policy #0 lag: (min: 3.0, avg: 12.7, max: 27.0) [2024-06-10 12:34:48,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:34:49,671][35957] Signal inference workers to stop experience collection... (1450 times) [2024-06-10 12:34:49,671][35957] Signal inference workers to resume experience collection... (1450 times) [2024-06-10 12:34:49,681][35978] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-06-10 12:34:49,690][35978] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-06-10 12:34:49,998][35978] Updated weights for policy 0, policy_version 30155 (0.0029) [2024-06-10 12:34:53,402][35745] Fps is (10 sec: 42598.7, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 494157824. Throughput: 0: 44755.6. Samples: 92847080. Policy #0 lag: (min: 3.0, avg: 12.7, max: 27.0) [2024-06-10 12:34:53,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:34:53,996][35978] Updated weights for policy 0, policy_version 30165 (0.0040) [2024-06-10 12:34:57,600][35978] Updated weights for policy 0, policy_version 30175 (0.0041) [2024-06-10 12:34:58,402][35745] Fps is (10 sec: 42598.5, 60 sec: 44511.6, 300 sec: 44708.9). Total num frames: 494403584. Throughput: 0: 44728.5. Samples: 92969100. Policy #0 lag: (min: 3.0, avg: 12.7, max: 27.0) [2024-06-10 12:34:58,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:35:01,247][35978] Updated weights for policy 0, policy_version 30185 (0.0029) [2024-06-10 12:35:03,402][35745] Fps is (10 sec: 49151.2, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 494649344. Throughput: 0: 44456.3. Samples: 93239240. 
Policy #0 lag: (min: 3.0, avg: 12.7, max: 27.0) [2024-06-10 12:35:03,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:35:05,232][35978] Updated weights for policy 0, policy_version 30195 (0.0036) [2024-06-10 12:35:08,402][35745] Fps is (10 sec: 44236.7, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 494845952. Throughput: 0: 44925.3. Samples: 93524580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-10 12:35:08,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:35:08,638][35978] Updated weights for policy 0, policy_version 30205 (0.0032) [2024-06-10 12:35:12,369][35978] Updated weights for policy 0, policy_version 30215 (0.0034) [2024-06-10 12:35:13,402][35745] Fps is (10 sec: 40960.5, 60 sec: 44236.8, 300 sec: 44597.8). Total num frames: 495058944. Throughput: 0: 44983.4. Samples: 93646660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-10 12:35:13,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:35:15,974][35978] Updated weights for policy 0, policy_version 30225 (0.0047) [2024-06-10 12:35:18,402][35745] Fps is (10 sec: 45874.8, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 495304704. Throughput: 0: 44649.3. Samples: 93912420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-10 12:35:18,414][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:35:19,479][35978] Updated weights for policy 0, policy_version 30235 (0.0024) [2024-06-10 12:35:23,187][35978] Updated weights for policy 0, policy_version 30245 (0.0041) [2024-06-10 12:35:23,402][35745] Fps is (10 sec: 47513.6, 60 sec: 45602.1, 300 sec: 44875.5). Total num frames: 495534080. Throughput: 0: 44626.7. Samples: 94184620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-10 12:35:23,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:35:27,210][35978] Updated weights for policy 0, policy_version 30255 (0.0028) [2024-06-10 12:35:28,402][35745] Fps is (10 sec: 44236.9, 60 sec: 44514.3, 300 sec: 44708.9). 
Total num frames: 495747072. Throughput: 0: 44843.1. Samples: 94318160. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-10 12:35:28,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:35:30,557][35978] Updated weights for policy 0, policy_version 30265 (0.0030) [2024-06-10 12:35:33,402][35745] Fps is (10 sec: 45874.8, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 495992832. Throughput: 0: 44587.4. Samples: 94581000. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-10 12:35:33,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:35:33,428][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000030273_495992832.pth... [2024-06-10 12:35:33,483][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000029616_485228544.pth [2024-06-10 12:35:34,722][35978] Updated weights for policy 0, policy_version 30275 (0.0041) [2024-06-10 12:35:37,926][35978] Updated weights for policy 0, policy_version 30285 (0.0050) [2024-06-10 12:35:38,402][35745] Fps is (10 sec: 47514.1, 60 sec: 45602.2, 300 sec: 44986.6). Total num frames: 496222208. Throughput: 0: 44777.8. Samples: 94862080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-10 12:35:38,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:35:41,999][35978] Updated weights for policy 0, policy_version 30295 (0.0030) [2024-06-10 12:35:43,402][35745] Fps is (10 sec: 40960.1, 60 sec: 44509.8, 300 sec: 44597.8). Total num frames: 496402432. Throughput: 0: 44980.3. Samples: 94993220. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-10 12:35:43,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:35:45,208][35978] Updated weights for policy 0, policy_version 30305 (0.0020) [2024-06-10 12:35:48,408][35745] Fps is (10 sec: 42571.1, 60 sec: 44505.1, 300 sec: 44707.9). Total num frames: 496648192. Throughput: 0: 44897.4. Samples: 95259900. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-10 12:35:48,408][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:35:49,141][35978] Updated weights for policy 0, policy_version 30315 (0.0039) [2024-06-10 12:35:52,295][35978] Updated weights for policy 0, policy_version 30325 (0.0052) [2024-06-10 12:35:53,401][35745] Fps is (10 sec: 49152.7, 60 sec: 45602.2, 300 sec: 44986.6). Total num frames: 496893952. Throughput: 0: 44533.8. Samples: 95528600. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-10 12:35:53,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:35:56,697][35978] Updated weights for policy 0, policy_version 30335 (0.0032) [2024-06-10 12:35:58,403][35745] Fps is (10 sec: 45895.8, 60 sec: 45054.6, 300 sec: 44764.1). Total num frames: 497106944. Throughput: 0: 44929.7. Samples: 95668580. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-10 12:35:58,404][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:35:59,393][35957] Signal inference workers to stop experience collection... (1500 times) [2024-06-10 12:35:59,395][35957] Signal inference workers to resume experience collection... (1500 times) [2024-06-10 12:35:59,403][35978] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-06-10 12:35:59,429][35978] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-06-10 12:35:59,545][35978] Updated weights for policy 0, policy_version 30345 (0.0035) [2024-06-10 12:36:03,402][35745] Fps is (10 sec: 42598.2, 60 sec: 44510.0, 300 sec: 44653.4). Total num frames: 497319936. Throughput: 0: 44954.3. Samples: 95935360. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-10 12:36:03,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:36:03,652][35978] Updated weights for policy 0, policy_version 30355 (0.0026) [2024-06-10 12:36:07,174][35978] Updated weights for policy 0, policy_version 30365 (0.0032) [2024-06-10 12:36:08,402][35745] Fps is (10 sec: 47522.8, 60 sec: 45602.2, 300 sec: 45042.1). Total num frames: 497582080. Throughput: 0: 44867.6. Samples: 96203660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 12:36:08,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:36:11,094][35978] Updated weights for policy 0, policy_version 30375 (0.0026) [2024-06-10 12:36:13,402][35745] Fps is (10 sec: 45874.4, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 497778688. Throughput: 0: 45170.6. Samples: 96350840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 12:36:13,408][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:36:14,642][35978] Updated weights for policy 0, policy_version 30385 (0.0039) [2024-06-10 12:36:18,331][35978] Updated weights for policy 0, policy_version 30395 (0.0027) [2024-06-10 12:36:18,401][35745] Fps is (10 sec: 40960.1, 60 sec: 44783.1, 300 sec: 44653.4). Total num frames: 497991680. Throughput: 0: 45050.4. Samples: 96608260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 12:36:18,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:36:21,627][35978] Updated weights for policy 0, policy_version 30405 (0.0025) [2024-06-10 12:36:23,402][35745] Fps is (10 sec: 47514.4, 60 sec: 45329.1, 300 sec: 45042.1). Total num frames: 498253824. Throughput: 0: 44725.7. Samples: 96874740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 12:36:23,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:36:25,693][35978] Updated weights for policy 0, policy_version 30415 (0.0043) [2024-06-10 12:36:28,402][35745] Fps is (10 sec: 45874.3, 60 sec: 45056.0, 300 sec: 44819.9). 
Total num frames: 498450432. Throughput: 0: 45003.5. Samples: 97018380. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-10 12:36:28,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:36:28,746][35978] Updated weights for policy 0, policy_version 30425 (0.0032) [2024-06-10 12:36:32,948][35978] Updated weights for policy 0, policy_version 30435 (0.0029) [2024-06-10 12:36:33,402][35745] Fps is (10 sec: 40959.5, 60 sec: 44509.9, 300 sec: 44653.3). Total num frames: 498663424. Throughput: 0: 45010.2. Samples: 97285080. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-10 12:36:33,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:36:36,298][35978] Updated weights for policy 0, policy_version 30445 (0.0035) [2024-06-10 12:36:38,402][35745] Fps is (10 sec: 47514.3, 60 sec: 45056.0, 300 sec: 44931.4). Total num frames: 498925568. Throughput: 0: 44909.7. Samples: 97549540. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-10 12:36:38,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:36:40,427][35978] Updated weights for policy 0, policy_version 30455 (0.0030) [2024-06-10 12:36:43,404][35745] Fps is (10 sec: 44227.1, 60 sec: 45054.3, 300 sec: 44764.1). Total num frames: 499105792. Throughput: 0: 44973.8. Samples: 97692420. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-10 12:36:43,404][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:36:43,764][35978] Updated weights for policy 0, policy_version 30465 (0.0036) [2024-06-10 12:36:47,551][35978] Updated weights for policy 0, policy_version 30475 (0.0032) [2024-06-10 12:36:48,402][35745] Fps is (10 sec: 40959.4, 60 sec: 44787.6, 300 sec: 44653.3). Total num frames: 499335168. Throughput: 0: 45069.2. Samples: 97963480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 12:36:48,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 12:36:48,403][35957] Saving new best policy, reward=0.306! 
[2024-06-10 12:36:50,683][35978] Updated weights for policy 0, policy_version 30485 (0.0042) [2024-06-10 12:36:53,402][35745] Fps is (10 sec: 47524.5, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 499580928. Throughput: 0: 44942.6. Samples: 98226080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 12:36:53,405][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:36:54,833][35978] Updated weights for policy 0, policy_version 30495 (0.0024) [2024-06-10 12:36:58,102][35978] Updated weights for policy 0, policy_version 30505 (0.0025) [2024-06-10 12:36:58,402][35745] Fps is (10 sec: 47513.9, 60 sec: 45057.4, 300 sec: 44931.0). Total num frames: 499810304. Throughput: 0: 44805.4. Samples: 98367080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 12:36:58,402][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:37:02,040][35978] Updated weights for policy 0, policy_version 30515 (0.0036) [2024-06-10 12:37:03,402][35745] Fps is (10 sec: 40960.2, 60 sec: 44509.9, 300 sec: 44653.4). Total num frames: 499990528. Throughput: 0: 45042.2. Samples: 98635160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 12:37:03,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:37:05,698][35978] Updated weights for policy 0, policy_version 30525 (0.0032) [2024-06-10 12:37:08,401][35745] Fps is (10 sec: 44237.4, 60 sec: 44509.9, 300 sec: 44931.1). Total num frames: 500252672. Throughput: 0: 44880.1. Samples: 98894340. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-10 12:37:08,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:37:09,502][35978] Updated weights for policy 0, policy_version 30535 (0.0027) [2024-06-10 12:37:13,402][35745] Fps is (10 sec: 44236.7, 60 sec: 44236.9, 300 sec: 44764.4). Total num frames: 500432896. Throughput: 0: 44793.9. Samples: 99034100. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-10 12:37:13,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:37:13,427][35978] Updated weights for policy 0, policy_version 30545 (0.0033) [2024-06-10 12:37:16,730][35978] Updated weights for policy 0, policy_version 30555 (0.0041) [2024-06-10 12:37:18,402][35745] Fps is (10 sec: 42598.2, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 500678656. Throughput: 0: 44642.8. Samples: 99294000. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-10 12:37:18,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:37:20,485][35978] Updated weights for policy 0, policy_version 30565 (0.0038) [2024-06-10 12:37:23,402][35745] Fps is (10 sec: 47513.0, 60 sec: 44236.7, 300 sec: 44708.9). Total num frames: 500908032. Throughput: 0: 44669.6. Samples: 99559680. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-10 12:37:23,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:37:24,473][35978] Updated weights for policy 0, policy_version 30575 (0.0031) [2024-06-10 12:37:27,555][35978] Updated weights for policy 0, policy_version 30585 (0.0037) [2024-06-10 12:37:28,403][35745] Fps is (10 sec: 45870.6, 60 sec: 44782.3, 300 sec: 44931.4). Total num frames: 501137408. Throughput: 0: 44470.7. Samples: 99693540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-10 12:37:28,403][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:37:31,512][35978] Updated weights for policy 0, policy_version 30595 (0.0027) [2024-06-10 12:37:33,126][35957] Signal inference workers to stop experience collection... (1550 times) [2024-06-10 12:37:33,178][35978] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-06-10 12:37:33,184][35957] Signal inference workers to resume experience collection... 
(1550 times) [2024-06-10 12:37:33,193][35978] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-06-10 12:37:33,402][35745] Fps is (10 sec: 44237.4, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 501350400. Throughput: 0: 44476.5. Samples: 99964920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-10 12:37:33,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:37:33,410][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000030600_501350400.pth... [2024-06-10 12:37:33,473][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000029944_490602496.pth [2024-06-10 12:37:35,196][35978] Updated weights for policy 0, policy_version 30605 (0.0039) [2024-06-10 12:37:38,402][35745] Fps is (10 sec: 42602.4, 60 sec: 43963.7, 300 sec: 44709.2). Total num frames: 501563392. Throughput: 0: 44526.2. Samples: 100229760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-10 12:37:38,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:37:38,914][35978] Updated weights for policy 0, policy_version 30615 (0.0033) [2024-06-10 12:37:42,634][35978] Updated weights for policy 0, policy_version 30625 (0.0022) [2024-06-10 12:37:43,402][35745] Fps is (10 sec: 45875.1, 60 sec: 45057.7, 300 sec: 44931.0). Total num frames: 501809152. Throughput: 0: 44400.5. Samples: 100365100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-10 12:37:43,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:37:46,106][35978] Updated weights for policy 0, policy_version 30635 (0.0031) [2024-06-10 12:37:48,402][35745] Fps is (10 sec: 45874.8, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 502022144. Throughput: 0: 44325.7. Samples: 100629820. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-10 12:37:48,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:37:49,742][35978] Updated weights for policy 0, policy_version 30645 (0.0034) [2024-06-10 12:37:53,402][35745] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 502235136. Throughput: 0: 44661.7. Samples: 100904120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-10 12:37:53,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:37:53,606][35978] Updated weights for policy 0, policy_version 30655 (0.0020) [2024-06-10 12:37:56,820][35978] Updated weights for policy 0, policy_version 30665 (0.0041) [2024-06-10 12:37:58,402][35745] Fps is (10 sec: 44237.1, 60 sec: 44236.8, 300 sec: 44875.9). Total num frames: 502464512. Throughput: 0: 44538.7. Samples: 101038340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-10 12:37:58,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:38:00,613][35978] Updated weights for policy 0, policy_version 30675 (0.0034) [2024-06-10 12:38:03,402][35745] Fps is (10 sec: 47513.6, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 502710272. Throughput: 0: 44766.2. Samples: 101308480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-10 12:38:03,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:38:04,250][35978] Updated weights for policy 0, policy_version 30685 (0.0039) [2024-06-10 12:38:07,939][35978] Updated weights for policy 0, policy_version 30695 (0.0042) [2024-06-10 12:38:08,402][35745] Fps is (10 sec: 44237.1, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 502906880. Throughput: 0: 44725.1. Samples: 101572300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-10 12:38:08,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:38:11,497][35978] Updated weights for policy 0, policy_version 30705 (0.0038) [2024-06-10 12:38:13,402][35745] Fps is (10 sec: 42598.4, 60 sec: 45056.0, 300 sec: 44875.5). 
Total num frames: 503136256. Throughput: 0: 44629.4. Samples: 101701820. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-10 12:38:13,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 12:38:15,331][35978] Updated weights for policy 0, policy_version 30715 (0.0027) [2024-06-10 12:38:18,402][35745] Fps is (10 sec: 45874.9, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 503365632. Throughput: 0: 44564.4. Samples: 101970320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-10 12:38:18,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:38:18,978][35978] Updated weights for policy 0, policy_version 30725 (0.0035) [2024-06-10 12:38:22,884][35978] Updated weights for policy 0, policy_version 30735 (0.0035) [2024-06-10 12:38:23,402][35745] Fps is (10 sec: 44236.2, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 503578624. Throughput: 0: 44755.9. Samples: 102243780. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-10 12:38:23,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:38:26,712][35978] Updated weights for policy 0, policy_version 30745 (0.0031) [2024-06-10 12:38:28,402][35745] Fps is (10 sec: 44237.0, 60 sec: 44510.6, 300 sec: 44764.4). Total num frames: 503808000. Throughput: 0: 44726.7. Samples: 102377800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:38:28,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:38:29,858][35978] Updated weights for policy 0, policy_version 30755 (0.0034) [2024-06-10 12:38:33,402][35745] Fps is (10 sec: 44237.3, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 504020992. Throughput: 0: 44888.9. Samples: 102649820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:38:33,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:38:33,768][35978] Updated weights for policy 0, policy_version 30765 (0.0028) [2024-06-10 12:38:34,971][35957] Signal inference workers to stop experience collection... 
(1600 times) [2024-06-10 12:38:34,971][35957] Signal inference workers to resume experience collection... (1600 times) [2024-06-10 12:38:35,002][35978] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-06-10 12:38:35,002][35978] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-06-10 12:38:37,075][35978] Updated weights for policy 0, policy_version 30775 (0.0021) [2024-06-10 12:38:38,402][35745] Fps is (10 sec: 44236.8, 60 sec: 44783.0, 300 sec: 44653.4). Total num frames: 504250368. Throughput: 0: 44604.9. Samples: 102911340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:38:38,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:38:40,949][35978] Updated weights for policy 0, policy_version 30785 (0.0033) [2024-06-10 12:38:43,402][35745] Fps is (10 sec: 44236.9, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 504463360. Throughput: 0: 44640.9. Samples: 103047180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-10 12:38:43,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:38:44,582][35978] Updated weights for policy 0, policy_version 30795 (0.0026) [2024-06-10 12:38:48,402][35745] Fps is (10 sec: 44236.7, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 504692736. Throughput: 0: 44514.3. Samples: 103311620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 25.0) [2024-06-10 12:38:48,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:38:48,545][35978] Updated weights for policy 0, policy_version 30805 (0.0047) [2024-06-10 12:38:52,078][35978] Updated weights for policy 0, policy_version 30815 (0.0044) [2024-06-10 12:38:53,402][35745] Fps is (10 sec: 45875.3, 60 sec: 44782.9, 300 sec: 44709.2). Total num frames: 504922112. Throughput: 0: 44696.4. Samples: 103583640. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 25.0) [2024-06-10 12:38:53,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:38:56,009][35978] Updated weights for policy 0, policy_version 30825 (0.0029) [2024-06-10 12:38:58,402][35745] Fps is (10 sec: 44236.8, 60 sec: 44509.9, 300 sec: 44653.3). Total num frames: 505135104. Throughput: 0: 44944.9. Samples: 103724340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 25.0) [2024-06-10 12:38:58,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:38:59,100][35978] Updated weights for policy 0, policy_version 30835 (0.0033) [2024-06-10 12:39:03,385][35978] Updated weights for policy 0, policy_version 30845 (0.0034) [2024-06-10 12:39:03,408][35745] Fps is (10 sec: 44208.9, 60 sec: 44232.2, 300 sec: 44819.0). Total num frames: 505364480. Throughput: 0: 44760.4. Samples: 103984820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 25.0) [2024-06-10 12:39:03,409][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:39:06,741][35978] Updated weights for policy 0, policy_version 30855 (0.0036) [2024-06-10 12:39:08,404][35745] Fps is (10 sec: 44227.4, 60 sec: 44508.2, 300 sec: 44653.0). Total num frames: 505577472. Throughput: 0: 44534.0. Samples: 104247900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 25.0) [2024-06-10 12:39:08,404][35745] Avg episode reward: [(0, '0.286')] [2024-06-10 12:39:10,704][35978] Updated weights for policy 0, policy_version 30865 (0.0035) [2024-06-10 12:39:13,402][35745] Fps is (10 sec: 44264.9, 60 sec: 44509.9, 300 sec: 44653.3). Total num frames: 505806848. Throughput: 0: 44611.1. Samples: 104385300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-10 12:39:13,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:39:14,355][35978] Updated weights for policy 0, policy_version 30875 (0.0036) [2024-06-10 12:39:17,931][35978] Updated weights for policy 0, policy_version 30885 (0.0026) [2024-06-10 12:39:18,402][35745] Fps is (10 sec: 45884.6, 60 sec: 44509.8, 300 sec: 44875.5). 
Total num frames: 506036224. Throughput: 0: 44382.6. Samples: 104647040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-10 12:39:18,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:39:21,589][35978] Updated weights for policy 0, policy_version 30895 (0.0034) [2024-06-10 12:39:23,402][35745] Fps is (10 sec: 45874.2, 60 sec: 44782.9, 300 sec: 44709.8). Total num frames: 506265600. Throughput: 0: 44610.5. Samples: 104918820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-10 12:39:23,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:39:25,425][35978] Updated weights for policy 0, policy_version 30905 (0.0039) [2024-06-10 12:39:28,402][35745] Fps is (10 sec: 45875.3, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 506494976. Throughput: 0: 44585.7. Samples: 105053540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-10 12:39:28,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:39:28,663][35978] Updated weights for policy 0, policy_version 30915 (0.0032) [2024-06-10 12:39:32,502][35978] Updated weights for policy 0, policy_version 30925 (0.0036) [2024-06-10 12:39:33,402][35745] Fps is (10 sec: 42598.1, 60 sec: 44509.7, 300 sec: 44764.4). Total num frames: 506691584. Throughput: 0: 44665.1. Samples: 105321560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:39:33,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 12:39:33,529][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000030927_506707968.pth... [2024-06-10 12:39:33,580][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000030273_495992832.pth [2024-06-10 12:39:35,874][35978] Updated weights for policy 0, policy_version 30935 (0.0030) [2024-06-10 12:39:38,402][35745] Fps is (10 sec: 44237.3, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 506937344. Throughput: 0: 44657.4. Samples: 105593220. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:39:38,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:39:39,863][35978] Updated weights for policy 0, policy_version 30945 (0.0035) [2024-06-10 12:39:43,401][35745] Fps is (10 sec: 45876.8, 60 sec: 44783.0, 300 sec: 44653.4). Total num frames: 507150336. Throughput: 0: 44549.4. Samples: 105729060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:39:43,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:39:43,444][35978] Updated weights for policy 0, policy_version 30955 (0.0030) [2024-06-10 12:39:47,091][35978] Updated weights for policy 0, policy_version 30965 (0.0035) [2024-06-10 12:39:48,402][35745] Fps is (10 sec: 44236.4, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 507379712. Throughput: 0: 44608.0. Samples: 105991900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:39:48,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:39:50,858][35978] Updated weights for policy 0, policy_version 30975 (0.0031) [2024-06-10 12:39:53,402][35745] Fps is (10 sec: 44236.4, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 507592704. Throughput: 0: 44668.4. Samples: 106257880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 12:39:53,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:39:54,749][35978] Updated weights for policy 0, policy_version 30985 (0.0042) [2024-06-10 12:39:57,905][35978] Updated weights for policy 0, policy_version 30995 (0.0022) [2024-06-10 12:39:58,401][35745] Fps is (10 sec: 44237.3, 60 sec: 44783.0, 300 sec: 44653.4). Total num frames: 507822080. Throughput: 0: 44628.0. Samples: 106393560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 12:39:58,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:40:01,753][35978] Updated weights for policy 0, policy_version 31005 (0.0041) [2024-06-10 12:40:03,401][35745] Fps is (10 sec: 44237.1, 60 sec: 44514.6, 300 sec: 44708.9). 
Total num frames: 508035072. Throughput: 0: 44766.4. Samples: 106661520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 12:40:03,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:40:05,426][35978] Updated weights for policy 0, policy_version 31015 (0.0041) [2024-06-10 12:40:07,284][35957] Signal inference workers to stop experience collection... (1650 times) [2024-06-10 12:40:07,285][35957] Signal inference workers to resume experience collection... (1650 times) [2024-06-10 12:40:07,321][35978] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-06-10 12:40:07,321][35978] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-06-10 12:40:08,402][35745] Fps is (10 sec: 45874.6, 60 sec: 45057.6, 300 sec: 44820.0). Total num frames: 508280832. Throughput: 0: 44660.1. Samples: 106928520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 12:40:08,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:40:08,802][35978] Updated weights for policy 0, policy_version 31025 (0.0039) [2024-06-10 12:40:12,755][35978] Updated weights for policy 0, policy_version 31035 (0.0051) [2024-06-10 12:40:13,402][35745] Fps is (10 sec: 45875.1, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 508493824. Throughput: 0: 44609.5. Samples: 107060960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 12:40:13,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 12:40:16,469][35978] Updated weights for policy 0, policy_version 31045 (0.0037) [2024-06-10 12:40:18,402][35745] Fps is (10 sec: 44237.3, 60 sec: 44783.0, 300 sec: 44708.9). Total num frames: 508723200. Throughput: 0: 44701.2. Samples: 107333100. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 12:40:18,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:40:20,082][35978] Updated weights for policy 0, policy_version 31055 (0.0028) [2024-06-10 12:40:23,402][35745] Fps is (10 sec: 45874.7, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 508952576. Throughput: 0: 44541.2. Samples: 107597580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 12:40:23,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 12:40:24,158][35978] Updated weights for policy 0, policy_version 31065 (0.0039) [2024-06-10 12:40:27,149][35978] Updated weights for policy 0, policy_version 31075 (0.0039) [2024-06-10 12:40:28,402][35745] Fps is (10 sec: 45875.0, 60 sec: 44783.0, 300 sec: 44708.9). Total num frames: 509181952. Throughput: 0: 44548.3. Samples: 107733740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 12:40:28,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:40:31,166][35978] Updated weights for policy 0, policy_version 31085 (0.0035) [2024-06-10 12:40:33,402][35745] Fps is (10 sec: 44236.5, 60 sec: 45056.1, 300 sec: 44653.3). Total num frames: 509394944. Throughput: 0: 44759.0. Samples: 108006060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 12:40:33,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:40:34,699][35978] Updated weights for policy 0, policy_version 31095 (0.0027) [2024-06-10 12:40:38,205][35978] Updated weights for policy 0, policy_version 31105 (0.0046) [2024-06-10 12:40:38,402][35745] Fps is (10 sec: 44236.2, 60 sec: 44782.8, 300 sec: 44820.0). Total num frames: 509624320. Throughput: 0: 44713.6. Samples: 108270000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 12:40:38,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 12:40:42,043][35978] Updated weights for policy 0, policy_version 31115 (0.0035) [2024-06-10 12:40:43,402][35745] Fps is (10 sec: 44237.4, 60 sec: 44782.9, 300 sec: 44709.8). 
Total num frames: 509837312. Throughput: 0: 44649.7. Samples: 108402800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 12:40:43,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:40:45,869][35978] Updated weights for policy 0, policy_version 31125 (0.0034) [2024-06-10 12:40:48,402][35745] Fps is (10 sec: 40960.7, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 510033920. Throughput: 0: 44700.0. Samples: 108673020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-10 12:40:48,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:40:49,485][35978] Updated weights for policy 0, policy_version 31135 (0.0041) [2024-06-10 12:40:53,244][35978] Updated weights for policy 0, policy_version 31145 (0.0032) [2024-06-10 12:40:53,402][35745] Fps is (10 sec: 44236.7, 60 sec: 44782.9, 300 sec: 44653.6). Total num frames: 510279680. Throughput: 0: 44667.6. Samples: 108938560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-10 12:40:53,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:40:56,563][35978] Updated weights for policy 0, policy_version 31155 (0.0027) [2024-06-10 12:40:58,402][35745] Fps is (10 sec: 47512.6, 60 sec: 44782.8, 300 sec: 44708.9). Total num frames: 510509056. Throughput: 0: 44876.2. Samples: 109080400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-10 12:40:58,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:41:00,101][35978] Updated weights for policy 0, policy_version 31165 (0.0038) [2024-06-10 12:41:03,402][35745] Fps is (10 sec: 44236.5, 60 sec: 44782.8, 300 sec: 44542.2). Total num frames: 510722048. Throughput: 0: 44701.2. Samples: 109344660. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-10 12:41:03,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:41:03,952][35978] Updated weights for policy 0, policy_version 31175 (0.0039) [2024-06-10 12:41:07,559][35978] Updated weights for policy 0, policy_version 31185 (0.0033) [2024-06-10 12:41:08,402][35745] Fps is (10 sec: 44236.9, 60 sec: 44509.8, 300 sec: 44653.3). Total num frames: 510951424. Throughput: 0: 44753.7. Samples: 109611500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-10 12:41:08,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:41:11,482][35978] Updated weights for policy 0, policy_version 31195 (0.0045) [2024-06-10 12:41:13,401][35745] Fps is (10 sec: 45875.8, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 511180800. Throughput: 0: 44724.1. Samples: 109746320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 12:41:13,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:41:15,119][35978] Updated weights for policy 0, policy_version 31205 (0.0027) [2024-06-10 12:41:18,402][35745] Fps is (10 sec: 44237.4, 60 sec: 44509.8, 300 sec: 44542.3). Total num frames: 511393792. Throughput: 0: 44680.6. Samples: 110016680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 12:41:18,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 12:41:18,667][35978] Updated weights for policy 0, policy_version 31215 (0.0022) [2024-06-10 12:41:22,419][35978] Updated weights for policy 0, policy_version 31225 (0.0037) [2024-06-10 12:41:23,402][35745] Fps is (10 sec: 44236.8, 60 sec: 44509.9, 300 sec: 44653.4). Total num frames: 511623168. Throughput: 0: 44871.7. Samples: 110289220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 12:41:23,402][35745] Avg episode reward: [(0, '0.313')] [2024-06-10 12:41:23,502][35957] Saving new best policy, reward=0.313! 
[2024-06-10 12:41:25,722][35978] Updated weights for policy 0, policy_version 31235 (0.0028) [2024-06-10 12:41:25,993][35957] Signal inference workers to stop experience collection... (1700 times) [2024-06-10 12:41:25,994][35957] Signal inference workers to resume experience collection... (1700 times) [2024-06-10 12:41:26,006][35978] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-06-10 12:41:26,006][35978] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-06-10 12:41:28,402][35745] Fps is (10 sec: 47513.8, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 511868928. Throughput: 0: 44842.2. Samples: 110420700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 12:41:28,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:41:29,409][35978] Updated weights for policy 0, policy_version 31245 (0.0033) [2024-06-10 12:41:33,285][35978] Updated weights for policy 0, policy_version 31255 (0.0037) [2024-06-10 12:41:33,402][35745] Fps is (10 sec: 45874.2, 60 sec: 44782.9, 300 sec: 44597.8). Total num frames: 512081920. Throughput: 0: 44770.5. Samples: 110687700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 12:41:33,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:41:33,417][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000031255_512081920.pth... [2024-06-10 12:41:33,496][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000030600_501350400.pth [2024-06-10 12:41:36,968][35978] Updated weights for policy 0, policy_version 31265 (0.0042) [2024-06-10 12:41:38,401][35745] Fps is (10 sec: 42598.6, 60 sec: 44510.0, 300 sec: 44709.2). Total num frames: 512294912. Throughput: 0: 44841.0. Samples: 110956400. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 12:41:38,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:41:40,775][35978] Updated weights for policy 0, policy_version 31275 (0.0022) [2024-06-10 12:41:43,408][35745] Fps is (10 sec: 45846.8, 60 sec: 45051.2, 300 sec: 44763.5). Total num frames: 512540672. Throughput: 0: 44656.5. Samples: 111090220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 12:41:43,409][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:41:44,441][35978] Updated weights for policy 0, policy_version 31285 (0.0047) [2024-06-10 12:41:47,932][35978] Updated weights for policy 0, policy_version 31295 (0.0034) [2024-06-10 12:41:48,402][35745] Fps is (10 sec: 47513.3, 60 sec: 45602.1, 300 sec: 44708.9). Total num frames: 512770048. Throughput: 0: 44931.7. Samples: 111366580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-10 12:41:48,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:41:51,731][35978] Updated weights for policy 0, policy_version 31305 (0.0030) [2024-06-10 12:41:53,401][35745] Fps is (10 sec: 42625.9, 60 sec: 44783.0, 300 sec: 44597.8). Total num frames: 512966656. Throughput: 0: 44872.7. Samples: 111630760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 12:41:53,402][35745] Avg episode reward: [(0, '0.289')] [2024-06-10 12:41:55,070][35978] Updated weights for policy 0, policy_version 31315 (0.0032) [2024-06-10 12:41:58,402][35745] Fps is (10 sec: 42598.1, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 513196032. Throughput: 0: 44753.2. Samples: 111760220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 12:41:58,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:41:58,974][35978] Updated weights for policy 0, policy_version 31325 (0.0033) [2024-06-10 12:42:02,626][35978] Updated weights for policy 0, policy_version 31335 (0.0040) [2024-06-10 12:42:03,401][35745] Fps is (10 sec: 45875.2, 60 sec: 45056.1, 300 sec: 44653.3). 
Total num frames: 513425408. Throughput: 0: 44706.3. Samples: 112028460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 12:42:03,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:42:06,464][35978] Updated weights for policy 0, policy_version 31345 (0.0036) [2024-06-10 12:42:08,402][35745] Fps is (10 sec: 44236.6, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 513638400. Throughput: 0: 44747.4. Samples: 112302860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-10 12:42:08,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:42:09,827][35978] Updated weights for policy 0, policy_version 31355 (0.0043) [2024-06-10 12:42:13,402][35745] Fps is (10 sec: 44236.0, 60 sec: 44782.8, 300 sec: 44708.9). Total num frames: 513867776. Throughput: 0: 44727.4. Samples: 112433440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:42:13,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:42:13,758][35978] Updated weights for policy 0, policy_version 31365 (0.0033) [2024-06-10 12:42:17,277][35978] Updated weights for policy 0, policy_version 31375 (0.0040) [2024-06-10 12:42:18,402][35745] Fps is (10 sec: 47513.7, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 514113536. Throughput: 0: 44859.6. Samples: 112706380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:42:18,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:42:20,956][35978] Updated weights for policy 0, policy_version 31385 (0.0035) [2024-06-10 12:42:23,402][35745] Fps is (10 sec: 44237.0, 60 sec: 44782.8, 300 sec: 44653.5). Total num frames: 514310144. Throughput: 0: 44932.3. Samples: 112978360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:42:23,402][35745] Avg episode reward: [(0, '0.288')] [2024-06-10 12:42:24,331][35978] Updated weights for policy 0, policy_version 31395 (0.0023) [2024-06-10 12:42:28,402][35745] Fps is (10 sec: 40960.1, 60 sec: 44236.7, 300 sec: 44653.3). Total num frames: 514523136. 
Throughput: 0: 44928.5. Samples: 113111720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:42:28,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:42:28,482][35978] Updated weights for policy 0, policy_version 31405 (0.0035) [2024-06-10 12:42:31,862][35978] Updated weights for policy 0, policy_version 31415 (0.0030) [2024-06-10 12:42:33,401][35745] Fps is (10 sec: 45875.8, 60 sec: 44783.1, 300 sec: 44764.4). Total num frames: 514768896. Throughput: 0: 44773.4. Samples: 113381380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 12:42:33,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:42:35,670][35978] Updated weights for policy 0, policy_version 31425 (0.0035) [2024-06-10 12:42:38,402][35745] Fps is (10 sec: 47514.1, 60 sec: 45056.0, 300 sec: 44708.9). Total num frames: 514998272. Throughput: 0: 44754.2. Samples: 113644700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 12:42:38,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:42:39,252][35978] Updated weights for policy 0, policy_version 31435 (0.0030) [2024-06-10 12:42:43,034][35978] Updated weights for policy 0, policy_version 31445 (0.0029) [2024-06-10 12:42:43,402][35745] Fps is (10 sec: 42598.3, 60 sec: 44241.5, 300 sec: 44653.4). Total num frames: 515194880. Throughput: 0: 44703.6. Samples: 113771880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 12:42:43,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:42:46,495][35978] Updated weights for policy 0, policy_version 31455 (0.0030) [2024-06-10 12:42:48,402][35745] Fps is (10 sec: 45874.9, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 515457024. Throughput: 0: 44969.7. Samples: 114052100. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 12:42:48,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:42:50,492][35978] Updated weights for policy 0, policy_version 31465 (0.0021) [2024-06-10 12:42:53,402][35745] Fps is (10 sec: 47513.0, 60 sec: 45055.9, 300 sec: 44764.4). Total num frames: 515670016. Throughput: 0: 44893.8. Samples: 114323080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-10 12:42:53,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 12:42:53,695][35978] Updated weights for policy 0, policy_version 31475 (0.0035) [2024-06-10 12:42:56,853][35957] Signal inference workers to stop experience collection... (1750 times) [2024-06-10 12:42:56,853][35957] Signal inference workers to resume experience collection... (1750 times) [2024-06-10 12:42:56,896][35978] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-06-10 12:42:56,897][35978] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-06-10 12:42:57,605][35978] Updated weights for policy 0, policy_version 31485 (0.0041) [2024-06-10 12:42:58,402][35745] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 515883008. Throughput: 0: 44950.3. Samples: 114456200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 12:42:58,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:43:01,166][35978] Updated weights for policy 0, policy_version 31495 (0.0032) [2024-06-10 12:43:03,402][35745] Fps is (10 sec: 44237.2, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 516112384. Throughput: 0: 44773.9. Samples: 114721200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 12:43:03,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:43:04,858][35978] Updated weights for policy 0, policy_version 31505 (0.0032) [2024-06-10 12:43:08,311][35978] Updated weights for policy 0, policy_version 31515 (0.0026) [2024-06-10 12:43:08,404][35745] Fps is (10 sec: 45864.6, 60 sec: 45054.3, 300 sec: 44764.1). Total num frames: 516341760. Throughput: 0: 44560.0. Samples: 114983660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 12:43:08,405][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:43:12,243][35978] Updated weights for policy 0, policy_version 31525 (0.0036) [2024-06-10 12:43:13,404][35745] Fps is (10 sec: 44226.5, 60 sec: 44781.3, 300 sec: 44708.5). Total num frames: 516554752. Throughput: 0: 44714.6. Samples: 115123980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 12:43:13,404][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:43:15,612][35978] Updated weights for policy 0, policy_version 31535 (0.0038) [2024-06-10 12:43:18,402][35745] Fps is (10 sec: 42608.1, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 516767744. Throughput: 0: 44590.5. Samples: 115387960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 12:43:18,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:43:19,638][35978] Updated weights for policy 0, policy_version 31545 (0.0036) [2024-06-10 12:43:22,964][35978] Updated weights for policy 0, policy_version 31555 (0.0042) [2024-06-10 12:43:23,402][35745] Fps is (10 sec: 45882.3, 60 sec: 45055.5, 300 sec: 44764.3). Total num frames: 517013504. Throughput: 0: 44885.4. Samples: 115664580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 12:43:23,403][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:43:26,977][35978] Updated weights for policy 0, policy_version 31565 (0.0034) [2024-06-10 12:43:28,402][35745] Fps is (10 sec: 45875.4, 60 sec: 45056.0, 300 sec: 44764.4). 
Total num frames: 517226496. Throughput: 0: 45056.4. Samples: 115799420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 12:43:28,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:43:30,423][35978] Updated weights for policy 0, policy_version 31575 (0.0029) [2024-06-10 12:43:33,402][35745] Fps is (10 sec: 42601.8, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 517439488. Throughput: 0: 44644.0. Samples: 116061080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 12:43:33,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:43:33,424][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000031582_517439488.pth... [2024-06-10 12:43:33,477][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000030927_506707968.pth [2024-06-10 12:43:34,025][35978] Updated weights for policy 0, policy_version 31585 (0.0032) [2024-06-10 12:43:37,522][35978] Updated weights for policy 0, policy_version 31595 (0.0028) [2024-06-10 12:43:38,402][35745] Fps is (10 sec: 44237.0, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 517668864. Throughput: 0: 44622.3. Samples: 116331080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 12:43:38,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:43:41,366][35978] Updated weights for policy 0, policy_version 31605 (0.0037) [2024-06-10 12:43:43,402][35745] Fps is (10 sec: 47513.1, 60 sec: 45329.0, 300 sec: 44819.9). Total num frames: 517914624. Throughput: 0: 44794.2. Samples: 116471940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 12:43:43,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:43:44,881][35978] Updated weights for policy 0, policy_version 31615 (0.0033) [2024-06-10 12:43:48,401][35745] Fps is (10 sec: 45875.6, 60 sec: 44510.0, 300 sec: 44764.4). Total num frames: 518127616. Throughput: 0: 44910.8. Samples: 116742180. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 12:43:48,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:43:48,527][35978] Updated weights for policy 0, policy_version 31625 (0.0038) [2024-06-10 12:43:52,478][35978] Updated weights for policy 0, policy_version 31635 (0.0041) [2024-06-10 12:43:53,402][35745] Fps is (10 sec: 45875.8, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 518373376. Throughput: 0: 45038.4. Samples: 117010280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 12:43:53,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:43:55,810][35978] Updated weights for policy 0, policy_version 31645 (0.0035) [2024-06-10 12:43:58,402][35745] Fps is (10 sec: 45874.0, 60 sec: 45055.9, 300 sec: 44820.9). Total num frames: 518586368. Throughput: 0: 44828.4. Samples: 117141160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-10 12:43:58,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:43:59,516][35978] Updated weights for policy 0, policy_version 31655 (0.0023) [2024-06-10 12:44:03,144][35978] Updated weights for policy 0, policy_version 31665 (0.0034) [2024-06-10 12:44:03,405][35745] Fps is (10 sec: 42584.9, 60 sec: 44780.6, 300 sec: 44819.8). Total num frames: 518799360. Throughput: 0: 44940.5. Samples: 117410420. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-10 12:44:03,405][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:44:05,153][35957] Signal inference workers to stop experience collection... (1800 times) [2024-06-10 12:44:05,154][35957] Signal inference workers to resume experience collection... 
(1800 times) [2024-06-10 12:44:05,200][35978] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-06-10 12:44:05,200][35978] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-06-10 12:44:06,639][35978] Updated weights for policy 0, policy_version 31675 (0.0036) [2024-06-10 12:44:08,402][35745] Fps is (10 sec: 44237.2, 60 sec: 44784.7, 300 sec: 44819.9). Total num frames: 519028736. Throughput: 0: 44810.5. Samples: 117681020. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-10 12:44:08,403][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:44:10,688][35978] Updated weights for policy 0, policy_version 31685 (0.0047) [2024-06-10 12:44:13,402][35745] Fps is (10 sec: 45889.6, 60 sec: 45057.8, 300 sec: 44820.0). Total num frames: 519258112. Throughput: 0: 44796.5. Samples: 117815260. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-10 12:44:13,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:44:14,249][35978] Updated weights for policy 0, policy_version 31695 (0.0036) [2024-06-10 12:44:17,784][35978] Updated weights for policy 0, policy_version 31705 (0.0038) [2024-06-10 12:44:18,402][35745] Fps is (10 sec: 44237.2, 60 sec: 45056.1, 300 sec: 44764.5). Total num frames: 519471104. Throughput: 0: 45139.1. Samples: 118092340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 12:44:18,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:44:21,415][35978] Updated weights for policy 0, policy_version 31715 (0.0033) [2024-06-10 12:44:23,402][35745] Fps is (10 sec: 44236.8, 60 sec: 44783.5, 300 sec: 44764.4). Total num frames: 519700480. Throughput: 0: 44907.1. Samples: 118351900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 12:44:23,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:44:24,762][35978] Updated weights for policy 0, policy_version 31725 (0.0035) [2024-06-10 12:44:28,402][35745] Fps is (10 sec: 45874.2, 60 sec: 45055.9, 300 sec: 44875.5). 
Total num frames: 519929856. Throughput: 0: 44955.5. Samples: 118494940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 12:44:28,403][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 12:44:28,613][35978] Updated weights for policy 0, policy_version 31735 (0.0050) [2024-06-10 12:44:32,357][35978] Updated weights for policy 0, policy_version 31745 (0.0038) [2024-06-10 12:44:33,402][35745] Fps is (10 sec: 44236.9, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 520142848. Throughput: 0: 44786.1. Samples: 118757560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 12:44:33,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:44:35,898][35978] Updated weights for policy 0, policy_version 31755 (0.0040) [2024-06-10 12:44:38,402][35745] Fps is (10 sec: 44237.7, 60 sec: 45056.0, 300 sec: 44819.9). Total num frames: 520372224. Throughput: 0: 44809.8. Samples: 119026720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 12:44:38,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:44:40,031][35978] Updated weights for policy 0, policy_version 31765 (0.0036) [2024-06-10 12:44:43,402][35745] Fps is (10 sec: 44236.8, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 520585216. Throughput: 0: 44956.6. Samples: 119164200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 12:44:43,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:44:43,443][35978] Updated weights for policy 0, policy_version 31775 (0.0051) [2024-06-10 12:44:47,063][35978] Updated weights for policy 0, policy_version 31785 (0.0027) [2024-06-10 12:44:48,402][35745] Fps is (10 sec: 44236.3, 60 sec: 44782.8, 300 sec: 44819.9). Total num frames: 520814592. Throughput: 0: 45002.2. Samples: 119435380. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 12:44:48,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:44:50,746][35978] Updated weights for policy 0, policy_version 31795 (0.0036) [2024-06-10 12:44:53,402][35745] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 521027584. Throughput: 0: 44914.3. Samples: 119702160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-10 12:44:53,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:44:54,090][35978] Updated weights for policy 0, policy_version 31805 (0.0042) [2024-06-10 12:44:58,078][35978] Updated weights for policy 0, policy_version 31815 (0.0037) [2024-06-10 12:44:58,402][35745] Fps is (10 sec: 44237.1, 60 sec: 44509.9, 300 sec: 44819.9). Total num frames: 521256960. Throughput: 0: 44917.3. Samples: 119836540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:44:58,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:45:01,808][35978] Updated weights for policy 0, policy_version 31825 (0.0037) [2024-06-10 12:45:03,402][35745] Fps is (10 sec: 47513.1, 60 sec: 45058.3, 300 sec: 44820.0). Total num frames: 521502720. Throughput: 0: 44837.6. Samples: 120110040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:45:03,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:45:05,254][35978] Updated weights for policy 0, policy_version 31835 (0.0027) [2024-06-10 12:45:08,402][35745] Fps is (10 sec: 45875.1, 60 sec: 44782.9, 300 sec: 44819.9). Total num frames: 521715712. Throughput: 0: 44962.2. Samples: 120375200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:45:08,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:45:09,105][35978] Updated weights for policy 0, policy_version 31845 (0.0038) [2024-06-10 12:45:12,466][35978] Updated weights for policy 0, policy_version 31855 (0.0031) [2024-06-10 12:45:13,402][35745] Fps is (10 sec: 45875.2, 60 sec: 45055.9, 300 sec: 44875.5). 
Total num frames: 521961472. Throughput: 0: 44857.4. Samples: 120513520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:45:13,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:45:16,057][35978] Updated weights for policy 0, policy_version 31865 (0.0034) [2024-06-10 12:45:18,402][35745] Fps is (10 sec: 45875.6, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 522174464. Throughput: 0: 45216.0. Samples: 120792280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:45:18,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:45:19,983][35978] Updated weights for policy 0, policy_version 31875 (0.0041) [2024-06-10 12:45:23,197][35978] Updated weights for policy 0, policy_version 31885 (0.0035) [2024-06-10 12:45:23,402][35745] Fps is (10 sec: 44237.4, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 522403840. Throughput: 0: 45072.9. Samples: 121055000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:45:23,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:45:27,242][35978] Updated weights for policy 0, policy_version 31895 (0.0032) [2024-06-10 12:45:28,402][35745] Fps is (10 sec: 44236.7, 60 sec: 44783.1, 300 sec: 44820.0). Total num frames: 522616832. Throughput: 0: 45100.0. Samples: 121193700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:45:28,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:45:30,877][35978] Updated weights for policy 0, policy_version 31905 (0.0028) [2024-06-10 12:45:33,408][35745] Fps is (10 sec: 44208.3, 60 sec: 45051.2, 300 sec: 44819.0). Total num frames: 522846208. Throughput: 0: 44912.4. Samples: 121456720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:45:33,409][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:45:33,422][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000031912_522846208.pth... 
[2024-06-10 12:45:33,489][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000031255_512081920.pth [2024-06-10 12:45:34,346][35978] Updated weights for policy 0, policy_version 31915 (0.0036) [2024-06-10 12:45:38,302][35978] Updated weights for policy 0, policy_version 31925 (0.0027) [2024-06-10 12:45:38,401][35745] Fps is (10 sec: 44237.0, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 523059200. Throughput: 0: 45079.1. Samples: 121730720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-10 12:45:38,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:45:41,496][35978] Updated weights for policy 0, policy_version 31935 (0.0043) [2024-06-10 12:45:43,402][35745] Fps is (10 sec: 45903.9, 60 sec: 45328.9, 300 sec: 44986.5). Total num frames: 523304960. Throughput: 0: 45198.5. Samples: 121870480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 12:45:43,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 12:45:45,217][35978] Updated weights for policy 0, policy_version 31945 (0.0034) [2024-06-10 12:45:46,538][35957] Signal inference workers to stop experience collection... (1850 times) [2024-06-10 12:45:46,539][35957] Signal inference workers to resume experience collection... (1850 times) [2024-06-10 12:45:46,569][35978] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-06-10 12:45:46,570][35978] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-06-10 12:45:48,402][35745] Fps is (10 sec: 45874.2, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 523517952. Throughput: 0: 44951.9. Samples: 122132880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 12:45:48,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:45:49,064][35978] Updated weights for policy 0, policy_version 31955 (0.0032) [2024-06-10 12:45:52,254][35978] Updated weights for policy 0, policy_version 31965 (0.0031) [2024-06-10 12:45:53,402][35745] Fps is (10 sec: 45875.4, 60 sec: 45602.0, 300 sec: 44931.0). Total num frames: 523763712. Throughput: 0: 45132.4. Samples: 122406160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 12:45:53,402][35745] Avg episode reward: [(0, '0.291')] [2024-06-10 12:45:56,412][35978] Updated weights for policy 0, policy_version 31975 (0.0041) [2024-06-10 12:45:58,402][35745] Fps is (10 sec: 45875.7, 60 sec: 45329.0, 300 sec: 44931.0). Total num frames: 523976704. Throughput: 0: 45041.8. Samples: 122540400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-10 12:45:58,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:45:59,849][35978] Updated weights for policy 0, policy_version 31985 (0.0032) [2024-06-10 12:46:03,402][35745] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 524189696. Throughput: 0: 44783.8. Samples: 122807560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 12:46:03,403][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 12:46:03,549][35978] Updated weights for policy 0, policy_version 31995 (0.0032) [2024-06-10 12:46:07,442][35978] Updated weights for policy 0, policy_version 32005 (0.0042) [2024-06-10 12:46:08,402][35745] Fps is (10 sec: 44236.8, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 524419072. Throughput: 0: 44799.5. Samples: 123070980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 12:46:08,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:46:10,947][35978] Updated weights for policy 0, policy_version 32015 (0.0029) [2024-06-10 12:46:13,402][35745] Fps is (10 sec: 45875.9, 60 sec: 44783.0, 300 sec: 44931.0). 
Total num frames: 524648448. Throughput: 0: 44856.4. Samples: 123212240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 12:46:13,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:46:14,534][35978] Updated weights for policy 0, policy_version 32025 (0.0026) [2024-06-10 12:46:18,401][35745] Fps is (10 sec: 42599.3, 60 sec: 44510.0, 300 sec: 44820.0). Total num frames: 524845056. Throughput: 0: 44875.0. Samples: 123475800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-10 12:46:18,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:46:18,518][35978] Updated weights for policy 0, policy_version 32035 (0.0034) [2024-06-10 12:46:21,598][35978] Updated weights for policy 0, policy_version 32045 (0.0027) [2024-06-10 12:46:23,402][35745] Fps is (10 sec: 44236.3, 60 sec: 44782.8, 300 sec: 44819.9). Total num frames: 525090816. Throughput: 0: 44638.5. Samples: 123739460. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-10 12:46:23,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 12:46:25,763][35978] Updated weights for policy 0, policy_version 32055 (0.0042) [2024-06-10 12:46:28,402][35745] Fps is (10 sec: 45874.4, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 525303808. Throughput: 0: 44523.7. Samples: 123874040. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-10 12:46:28,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 12:46:29,121][35978] Updated weights for policy 0, policy_version 32065 (0.0032) [2024-06-10 12:46:32,897][35978] Updated weights for policy 0, policy_version 32075 (0.0030) [2024-06-10 12:46:33,402][35745] Fps is (10 sec: 44236.9, 60 sec: 44787.7, 300 sec: 44875.5). Total num frames: 525533184. Throughput: 0: 44693.4. Samples: 124144080. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-10 12:46:33,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 12:46:36,492][35978] Updated weights for policy 0, policy_version 32085 (0.0023) [2024-06-10 12:46:38,401][35745] Fps is (10 sec: 45875.6, 60 sec: 45056.0, 300 sec: 44820.9). Total num frames: 525762560. Throughput: 0: 44658.4. Samples: 124415780. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-10 12:46:38,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 12:46:40,324][35978] Updated weights for policy 0, policy_version 32095 (0.0039) [2024-06-10 12:46:43,402][35745] Fps is (10 sec: 45874.8, 60 sec: 44782.9, 300 sec: 44819.9). Total num frames: 525991936. Throughput: 0: 44691.5. Samples: 124551520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:46:43,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:46:43,681][35978] Updated weights for policy 0, policy_version 32105 (0.0035) [2024-06-10 12:46:47,791][35978] Updated weights for policy 0, policy_version 32115 (0.0034) [2024-06-10 12:46:48,402][35745] Fps is (10 sec: 45874.3, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 526221312. Throughput: 0: 44748.9. Samples: 124821260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:46:48,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:46:50,882][35978] Updated weights for policy 0, policy_version 32125 (0.0024) [2024-06-10 12:46:53,402][35745] Fps is (10 sec: 44237.4, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 526434304. Throughput: 0: 44787.1. Samples: 125086400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:46:53,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 12:46:54,816][35978] Updated weights for policy 0, policy_version 32135 (0.0028) [2024-06-10 12:46:58,109][35978] Updated weights for policy 0, policy_version 32145 (0.0041) [2024-06-10 12:46:58,402][35745] Fps is (10 sec: 44237.1, 60 sec: 44782.9, 300 sec: 44875.5). 
Total num frames: 526663680. Throughput: 0: 44742.6. Samples: 125225660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:46:58,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:47:02,182][35978] Updated weights for policy 0, policy_version 32155 (0.0035) [2024-06-10 12:47:03,401][35745] Fps is (10 sec: 44237.4, 60 sec: 44783.1, 300 sec: 44875.5). Total num frames: 526876672. Throughput: 0: 44942.2. Samples: 125498200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:47:03,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:47:05,866][35978] Updated weights for policy 0, policy_version 32165 (0.0042) [2024-06-10 12:47:08,402][35745] Fps is (10 sec: 44236.7, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 527106048. Throughput: 0: 44868.0. Samples: 125758520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 12:47:08,407][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:47:09,680][35978] Updated weights for policy 0, policy_version 32175 (0.0036) [2024-06-10 12:47:12,887][35978] Updated weights for policy 0, policy_version 32185 (0.0028) [2024-06-10 12:47:13,402][35745] Fps is (10 sec: 44236.5, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 527319040. Throughput: 0: 44860.9. Samples: 125892780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 12:47:13,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:47:16,806][35957] Signal inference workers to stop experience collection... (1900 times) [2024-06-10 12:47:16,806][35957] Signal inference workers to resume experience collection... 
(1900 times) [2024-06-10 12:47:16,840][35978] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-06-10 12:47:16,840][35978] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-06-10 12:47:16,958][35978] Updated weights for policy 0, policy_version 32195 (0.0038) [2024-06-10 12:47:18,402][35745] Fps is (10 sec: 45875.4, 60 sec: 45328.9, 300 sec: 44931.0). Total num frames: 527564800. Throughput: 0: 44969.4. Samples: 126167700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 12:47:18,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:47:20,246][35978] Updated weights for policy 0, policy_version 32205 (0.0024) [2024-06-10 12:47:23,402][35745] Fps is (10 sec: 42597.0, 60 sec: 44236.7, 300 sec: 44819.9). Total num frames: 527745024. Throughput: 0: 44812.1. Samples: 126432340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-10 12:47:23,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:47:23,960][35978] Updated weights for policy 0, policy_version 32215 (0.0036) [2024-06-10 12:47:27,671][35978] Updated weights for policy 0, policy_version 32225 (0.0032) [2024-06-10 12:47:28,402][35745] Fps is (10 sec: 44237.1, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 528007168. Throughput: 0: 44813.5. Samples: 126568120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 12:47:28,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:47:31,379][35978] Updated weights for policy 0, policy_version 32235 (0.0028) [2024-06-10 12:47:33,402][35745] Fps is (10 sec: 49152.4, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 528236544. Throughput: 0: 44819.9. Samples: 126838160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 12:47:33,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:47:33,419][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000032241_528236544.pth... 
[2024-06-10 12:47:33,486][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000031582_517439488.pth [2024-06-10 12:47:35,135][35978] Updated weights for policy 0, policy_version 32245 (0.0027) [2024-06-10 12:47:38,402][35745] Fps is (10 sec: 44236.5, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 528449536. Throughput: 0: 45046.7. Samples: 127113500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 12:47:38,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:47:38,520][35978] Updated weights for policy 0, policy_version 32255 (0.0037) [2024-06-10 12:47:42,324][35978] Updated weights for policy 0, policy_version 32265 (0.0032) [2024-06-10 12:47:43,401][35745] Fps is (10 sec: 44238.0, 60 sec: 44783.1, 300 sec: 44820.0). Total num frames: 528678912. Throughput: 0: 44735.7. Samples: 127238760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 12:47:43,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 12:47:46,093][35978] Updated weights for policy 0, policy_version 32275 (0.0038) [2024-06-10 12:47:48,402][35745] Fps is (10 sec: 44236.6, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 528891904. Throughput: 0: 44735.4. Samples: 127511300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 12:47:48,402][35745] Avg episode reward: [(0, '0.287')] [2024-06-10 12:47:49,584][35978] Updated weights for policy 0, policy_version 32285 (0.0032) [2024-06-10 12:47:53,201][35978] Updated weights for policy 0, policy_version 32295 (0.0024) [2024-06-10 12:47:53,402][35745] Fps is (10 sec: 44236.4, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 529121280. Throughput: 0: 44966.3. Samples: 127782000. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 12:47:53,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:47:57,071][35978] Updated weights for policy 0, policy_version 32305 (0.0034) [2024-06-10 12:47:58,402][35745] Fps is (10 sec: 47512.8, 60 sec: 45055.9, 300 sec: 44931.0). Total num frames: 529367040. Throughput: 0: 44917.5. Samples: 127914080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 12:47:58,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 12:48:00,537][35978] Updated weights for policy 0, policy_version 32315 (0.0035) [2024-06-10 12:48:03,408][35745] Fps is (10 sec: 45846.9, 60 sec: 45051.3, 300 sec: 44874.9). Total num frames: 529580032. Throughput: 0: 44734.3. Samples: 128181020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 12:48:03,408][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:48:04,371][35978] Updated weights for policy 0, policy_version 32325 (0.0027) [2024-06-10 12:48:07,943][35978] Updated weights for policy 0, policy_version 32335 (0.0022) [2024-06-10 12:48:08,402][35745] Fps is (10 sec: 42598.8, 60 sec: 44782.9, 300 sec: 44875.8). Total num frames: 529793024. Throughput: 0: 44885.0. Samples: 128452160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 12:48:08,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:48:11,698][35978] Updated weights for policy 0, policy_version 32345 (0.0025) [2024-06-10 12:48:13,402][35745] Fps is (10 sec: 44263.7, 60 sec: 45055.9, 300 sec: 44931.0). Total num frames: 530022400. Throughput: 0: 44806.1. Samples: 128584400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 12:48:13,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:48:15,222][35978] Updated weights for policy 0, policy_version 32355 (0.0036) [2024-06-10 12:48:18,402][35745] Fps is (10 sec: 44237.3, 60 sec: 44509.9, 300 sec: 44820.1). Total num frames: 530235392. Throughput: 0: 44706.0. Samples: 128849920. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 12:48:18,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 12:48:19,002][35978] Updated weights for policy 0, policy_version 32365 (0.0031) [2024-06-10 12:48:22,151][35978] Updated weights for policy 0, policy_version 32375 (0.0033) [2024-06-10 12:48:23,402][35745] Fps is (10 sec: 45873.5, 60 sec: 45602.0, 300 sec: 44931.0). Total num frames: 530481152. Throughput: 0: 44764.4. Samples: 129127920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 12:48:23,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:48:25,992][35978] Updated weights for policy 0, policy_version 32385 (0.0028) [2024-06-10 12:48:28,402][35745] Fps is (10 sec: 45875.3, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 530694144. Throughput: 0: 45077.7. Samples: 129267260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 12:48:28,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:48:29,489][35978] Updated weights for policy 0, policy_version 32395 (0.0038) [2024-06-10 12:48:33,402][35745] Fps is (10 sec: 42600.7, 60 sec: 44510.0, 300 sec: 44875.5). Total num frames: 530907136. Throughput: 0: 44926.3. Samples: 129532980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 12:48:33,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:48:33,491][35978] Updated weights for policy 0, policy_version 32405 (0.0026) [2024-06-10 12:48:34,924][35957] Signal inference workers to stop experience collection... (1950 times) [2024-06-10 12:48:34,971][35978] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-06-10 12:48:34,979][35957] Signal inference workers to resume experience collection... 
(1950 times) [2024-06-10 12:48:34,989][35978] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-06-10 12:48:36,884][35978] Updated weights for policy 0, policy_version 32415 (0.0031) [2024-06-10 12:48:38,402][35745] Fps is (10 sec: 47513.1, 60 sec: 45329.0, 300 sec: 44931.0). Total num frames: 531169280. Throughput: 0: 44885.7. Samples: 129801860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 12:48:38,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:48:40,743][35978] Updated weights for policy 0, policy_version 32425 (0.0041) [2024-06-10 12:48:43,402][35745] Fps is (10 sec: 44235.8, 60 sec: 44509.7, 300 sec: 44819.9). Total num frames: 531349504. Throughput: 0: 45081.4. Samples: 129942740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 12:48:43,403][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:48:44,059][35978] Updated weights for policy 0, policy_version 32435 (0.0032) [2024-06-10 12:48:47,984][35978] Updated weights for policy 0, policy_version 32445 (0.0039) [2024-06-10 12:48:48,402][35745] Fps is (10 sec: 44236.7, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 531611648. Throughput: 0: 45072.8. Samples: 130209020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-10 12:48:48,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:48:51,064][35978] Updated weights for policy 0, policy_version 32455 (0.0033) [2024-06-10 12:48:53,402][35745] Fps is (10 sec: 47514.4, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 531824640. Throughput: 0: 44946.8. Samples: 130474760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 12:48:53,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:48:55,352][35978] Updated weights for policy 0, policy_version 32465 (0.0041) [2024-06-10 12:48:58,402][35745] Fps is (10 sec: 42598.5, 60 sec: 44510.0, 300 sec: 44876.0). Total num frames: 532037632. Throughput: 0: 45200.9. Samples: 130618440. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 12:48:58,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:48:58,583][35978] Updated weights for policy 0, policy_version 32475 (0.0044) [2024-06-10 12:49:02,578][35978] Updated weights for policy 0, policy_version 32485 (0.0031) [2024-06-10 12:49:03,402][35745] Fps is (10 sec: 42598.2, 60 sec: 44514.4, 300 sec: 44820.0). Total num frames: 532250624. Throughput: 0: 45105.3. Samples: 130879660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 12:49:03,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:49:06,105][35978] Updated weights for policy 0, policy_version 32495 (0.0038) [2024-06-10 12:49:08,402][35745] Fps is (10 sec: 45875.2, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 532496384. Throughput: 0: 44768.8. Samples: 131142500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 12:49:08,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:49:10,161][35978] Updated weights for policy 0, policy_version 32505 (0.0034) [2024-06-10 12:49:13,402][35745] Fps is (10 sec: 45875.1, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 532709376. Throughput: 0: 44899.0. Samples: 131287720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 12:49:13,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:49:13,418][35978] Updated weights for policy 0, policy_version 32515 (0.0037) [2024-06-10 12:49:17,299][35978] Updated weights for policy 0, policy_version 32525 (0.0025) [2024-06-10 12:49:18,402][35745] Fps is (10 sec: 42598.8, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 532922368. Throughput: 0: 44836.9. Samples: 131550640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 12:49:18,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 12:49:20,418][35978] Updated weights for policy 0, policy_version 32535 (0.0030) [2024-06-10 12:49:23,402][35745] Fps is (10 sec: 44237.3, 60 sec: 44510.3, 300 sec: 44820.0). 
Total num frames: 533151744. Throughput: 0: 44891.2. Samples: 131821960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 12:49:23,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 12:49:24,486][35978] Updated weights for policy 0, policy_version 32545 (0.0034) [2024-06-10 12:49:27,783][35978] Updated weights for policy 0, policy_version 32555 (0.0030) [2024-06-10 12:49:28,402][35745] Fps is (10 sec: 47513.0, 60 sec: 45055.9, 300 sec: 44931.0). Total num frames: 533397504. Throughput: 0: 44782.7. Samples: 131957960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-10 12:49:28,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:49:31,527][35978] Updated weights for policy 0, policy_version 32565 (0.0027) [2024-06-10 12:49:33,402][35745] Fps is (10 sec: 45874.9, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 533610496. Throughput: 0: 44817.0. Samples: 132225780. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-10 12:49:33,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:49:33,524][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000032570_533626880.pth... [2024-06-10 12:49:33,570][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000031912_522846208.pth [2024-06-10 12:49:35,270][35978] Updated weights for policy 0, policy_version 32575 (0.0026) [2024-06-10 12:49:38,401][35745] Fps is (10 sec: 44237.7, 60 sec: 44510.0, 300 sec: 44931.0). Total num frames: 533839872. Throughput: 0: 44971.2. Samples: 132498460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-10 12:49:38,402][35745] Avg episode reward: [(0, '0.290')] [2024-06-10 12:49:38,745][35978] Updated weights for policy 0, policy_version 32585 (0.0032) [2024-06-10 12:49:42,417][35978] Updated weights for policy 0, policy_version 32595 (0.0026) [2024-06-10 12:49:43,402][35745] Fps is (10 sec: 45875.3, 60 sec: 45329.2, 300 sec: 44931.0). Total num frames: 534069248. Throughput: 0: 44673.4. 
Samples: 132628740. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-10 12:49:43,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:49:46,291][35978] Updated weights for policy 0, policy_version 32605 (0.0039) [2024-06-10 12:49:48,404][35745] Fps is (10 sec: 44226.2, 60 sec: 44508.2, 300 sec: 44930.7). Total num frames: 534282240. Throughput: 0: 44945.3. Samples: 132902300. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-10 12:49:48,404][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:49:49,327][35957] Signal inference workers to stop experience collection... (2000 times) [2024-06-10 12:49:49,376][35978] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-06-10 12:49:49,383][35957] Signal inference workers to resume experience collection... (2000 times) [2024-06-10 12:49:49,387][35978] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-06-10 12:49:49,537][35978] Updated weights for policy 0, policy_version 32615 (0.0028) [2024-06-10 12:49:53,406][35745] Fps is (10 sec: 45853.5, 60 sec: 45052.4, 300 sec: 44985.9). Total num frames: 534528000. Throughput: 0: 45009.1. Samples: 133168120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-10 12:49:53,407][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:49:53,407][35978] Updated weights for policy 0, policy_version 32625 (0.0033) [2024-06-10 12:49:56,799][35978] Updated weights for policy 0, policy_version 32635 (0.0022) [2024-06-10 12:49:58,402][35745] Fps is (10 sec: 45885.1, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 534740992. Throughput: 0: 44824.8. Samples: 133304840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 12:49:58,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 12:50:00,415][35978] Updated weights for policy 0, policy_version 32645 (0.0038) [2024-06-10 12:50:03,404][35745] Fps is (10 sec: 42608.8, 60 sec: 45054.3, 300 sec: 44875.2). Total num frames: 534953984. 
Throughput: 0: 45190.5. Samples: 133584320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 12:50:03,405][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:50:04,312][35978] Updated weights for policy 0, policy_version 32655 (0.0043) [2024-06-10 12:50:08,075][35978] Updated weights for policy 0, policy_version 32665 (0.0027) [2024-06-10 12:50:08,402][35745] Fps is (10 sec: 44237.5, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 535183360. Throughput: 0: 44996.4. Samples: 133846800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 12:50:08,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:50:11,498][35978] Updated weights for policy 0, policy_version 32675 (0.0031) [2024-06-10 12:50:13,402][35745] Fps is (10 sec: 45886.0, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 535412736. Throughput: 0: 44994.4. Samples: 133982700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-10 12:50:13,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 12:50:15,827][35978] Updated weights for policy 0, policy_version 32685 (0.0026) [2024-06-10 12:50:18,404][35745] Fps is (10 sec: 45864.3, 60 sec: 45327.3, 300 sec: 44875.1). Total num frames: 535642112. Throughput: 0: 45006.6. Samples: 134251180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-10 12:50:18,405][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 12:50:18,667][35978] Updated weights for policy 0, policy_version 32695 (0.0023) [2024-06-10 12:50:22,846][35978] Updated weights for policy 0, policy_version 32705 (0.0041) [2024-06-10 12:50:23,402][35745] Fps is (10 sec: 42597.8, 60 sec: 44782.8, 300 sec: 44819.9). Total num frames: 535838720. Throughput: 0: 44741.1. Samples: 134511820. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-10 12:50:23,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:50:26,118][35978] Updated weights for policy 0, policy_version 32715 (0.0024) [2024-06-10 12:50:28,401][35745] Fps is (10 sec: 44247.6, 60 sec: 44783.1, 300 sec: 44876.5). Total num frames: 536084480. Throughput: 0: 44828.1. Samples: 134646000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-10 12:50:28,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:50:29,837][35978] Updated weights for policy 0, policy_version 32725 (0.0050) [2024-06-10 12:50:33,402][35745] Fps is (10 sec: 47513.8, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 536313856. Throughput: 0: 44844.9. Samples: 134920220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-10 12:50:33,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:50:33,634][35978] Updated weights for policy 0, policy_version 32735 (0.0032) [2024-06-10 12:50:37,532][35978] Updated weights for policy 0, policy_version 32745 (0.0041) [2024-06-10 12:50:38,402][35745] Fps is (10 sec: 42597.8, 60 sec: 44509.7, 300 sec: 44764.4). Total num frames: 536510464. Throughput: 0: 44779.3. Samples: 135182980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 12:50:38,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:50:40,891][35978] Updated weights for policy 0, policy_version 32755 (0.0041) [2024-06-10 12:50:43,402][35745] Fps is (10 sec: 42598.8, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 536739840. Throughput: 0: 44742.9. Samples: 135318260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 12:50:43,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:50:45,152][35978] Updated weights for policy 0, policy_version 32765 (0.0029) [2024-06-10 12:50:47,726][35957] Signal inference workers to stop experience collection... 
(2050 times) [2024-06-10 12:50:47,727][35957] Signal inference workers to resume experience collection... (2050 times) [2024-06-10 12:50:47,749][35978] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-06-10 12:50:47,749][35978] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-06-10 12:50:48,004][35978] Updated weights for policy 0, policy_version 32775 (0.0026) [2024-06-10 12:50:48,402][35745] Fps is (10 sec: 49152.1, 60 sec: 45330.8, 300 sec: 44875.5). Total num frames: 537001984. Throughput: 0: 44631.6. Samples: 135592640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 12:50:48,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 12:50:52,182][35978] Updated weights for policy 0, policy_version 32785 (0.0036) [2024-06-10 12:50:53,402][35745] Fps is (10 sec: 42598.3, 60 sec: 43967.2, 300 sec: 44708.9). Total num frames: 537165824. Throughput: 0: 44677.3. Samples: 135857280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 12:50:53,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 12:50:55,340][35978] Updated weights for policy 0, policy_version 32795 (0.0028) [2024-06-10 12:50:58,404][35745] Fps is (10 sec: 44226.6, 60 sec: 45054.4, 300 sec: 44930.7). Total num frames: 537444352. Throughput: 0: 44512.3. Samples: 135985860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 12:50:58,405][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:50:59,326][35978] Updated weights for policy 0, policy_version 32805 (0.0030) [2024-06-10 12:51:02,982][35978] Updated weights for policy 0, policy_version 32815 (0.0031) [2024-06-10 12:51:03,402][35745] Fps is (10 sec: 50790.5, 60 sec: 45330.8, 300 sec: 44931.0). Total num frames: 537673728. Throughput: 0: 44757.5. Samples: 136265160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:51:03,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:51:06,821][35978] Updated weights for policy 0, policy_version 32825 (0.0038) [2024-06-10 12:51:08,402][35745] Fps is (10 sec: 40969.4, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 537853952. Throughput: 0: 44909.4. Samples: 136532740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:51:08,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:51:10,090][35978] Updated weights for policy 0, policy_version 32835 (0.0028) [2024-06-10 12:51:13,402][35745] Fps is (10 sec: 40959.7, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 538083328. Throughput: 0: 44644.3. Samples: 136655000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:51:13,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 12:51:14,512][35978] Updated weights for policy 0, policy_version 32845 (0.0036) [2024-06-10 12:51:17,113][35978] Updated weights for policy 0, policy_version 32855 (0.0030) [2024-06-10 12:51:18,402][35745] Fps is (10 sec: 49152.1, 60 sec: 45057.8, 300 sec: 44931.0). Total num frames: 538345472. Throughput: 0: 44724.9. Samples: 136932840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 12:51:18,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:51:21,673][35978] Updated weights for policy 0, policy_version 32865 (0.0045) [2024-06-10 12:51:23,402][35745] Fps is (10 sec: 42598.4, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 538509312. Throughput: 0: 44910.2. Samples: 137203940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:51:23,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 12:51:24,648][35978] Updated weights for policy 0, policy_version 32875 (0.0039) [2024-06-10 12:51:28,402][35745] Fps is (10 sec: 40959.7, 60 sec: 44509.7, 300 sec: 44820.0). Total num frames: 538755072. Throughput: 0: 44642.5. Samples: 137327180. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:51:28,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 12:51:28,690][35978] Updated weights for policy 0, policy_version 32885 (0.0025) [2024-06-10 12:51:32,048][35978] Updated weights for policy 0, policy_version 32895 (0.0035) [2024-06-10 12:51:33,402][35745] Fps is (10 sec: 50790.2, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 539017216. Throughput: 0: 44610.2. Samples: 137600100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:51:33,406][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:51:33,418][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000032899_539017216.pth... [2024-06-10 12:51:33,477][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000032241_528236544.pth [2024-06-10 12:51:36,469][35978] Updated weights for policy 0, policy_version 32905 (0.0027) [2024-06-10 12:51:38,402][35745] Fps is (10 sec: 45875.3, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 539213824. Throughput: 0: 44893.3. Samples: 137877480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:51:38,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:51:39,314][35978] Updated weights for policy 0, policy_version 32915 (0.0035) [2024-06-10 12:51:43,402][35745] Fps is (10 sec: 39321.1, 60 sec: 44509.7, 300 sec: 44708.9). Total num frames: 539410432. Throughput: 0: 44641.7. Samples: 137994640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 12:51:43,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:51:43,554][35957] Signal inference workers to stop experience collection... (2100 times) [2024-06-10 12:51:43,559][35957] Signal inference workers to resume experience collection... 
(2100 times) [2024-06-10 12:51:43,596][35978] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-06-10 12:51:43,596][35978] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-06-10 12:51:43,706][35978] Updated weights for policy 0, policy_version 32925 (0.0027) [2024-06-10 12:51:46,552][35978] Updated weights for policy 0, policy_version 32935 (0.0035) [2024-06-10 12:51:48,402][35745] Fps is (10 sec: 47514.1, 60 sec: 44783.0, 300 sec: 44931.0). Total num frames: 539688960. Throughput: 0: 44382.7. Samples: 138262380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 12:51:48,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:51:51,427][35978] Updated weights for policy 0, policy_version 32945 (0.0031) [2024-06-10 12:51:53,402][35745] Fps is (10 sec: 45876.3, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 539869184. Throughput: 0: 44669.9. Samples: 138542880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 12:51:53,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:51:54,097][35978] Updated weights for policy 0, policy_version 32955 (0.0034) [2024-06-10 12:51:58,402][35745] Fps is (10 sec: 39321.1, 60 sec: 43965.4, 300 sec: 44764.4). Total num frames: 540082176. Throughput: 0: 44561.3. Samples: 138660260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 12:51:58,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 12:51:58,606][35978] Updated weights for policy 0, policy_version 32965 (0.0036) [2024-06-10 12:52:01,343][35978] Updated weights for policy 0, policy_version 32975 (0.0036) [2024-06-10 12:52:03,402][35745] Fps is (10 sec: 47513.6, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 540344320. Throughput: 0: 44481.4. Samples: 138934500. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 12:52:03,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:52:05,782][35978] Updated weights for policy 0, policy_version 32985 (0.0045) [2024-06-10 12:52:08,402][35745] Fps is (10 sec: 49152.5, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 540573696. Throughput: 0: 44633.4. Samples: 139212440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-10 12:52:08,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:52:08,734][35978] Updated weights for policy 0, policy_version 32995 (0.0033) [2024-06-10 12:52:13,100][35978] Updated weights for policy 0, policy_version 33005 (0.0043) [2024-06-10 12:52:13,402][35745] Fps is (10 sec: 40959.6, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 540753920. Throughput: 0: 44708.0. Samples: 139339040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-10 12:52:13,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:52:15,806][35978] Updated weights for policy 0, policy_version 33015 (0.0032) [2024-06-10 12:52:18,402][35745] Fps is (10 sec: 44236.4, 60 sec: 44509.8, 300 sec: 44986.6). Total num frames: 541016064. Throughput: 0: 44560.4. Samples: 139605320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-10 12:52:18,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:52:20,718][35978] Updated weights for policy 0, policy_version 33025 (0.0044) [2024-06-10 12:52:23,247][35978] Updated weights for policy 0, policy_version 33035 (0.0037) [2024-06-10 12:52:23,402][35745] Fps is (10 sec: 49151.8, 60 sec: 45602.1, 300 sec: 44875.5). Total num frames: 541245440. Throughput: 0: 44686.2. Samples: 139888360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-10 12:52:23,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 12:52:27,665][35978] Updated weights for policy 0, policy_version 33045 (0.0034) [2024-06-10 12:52:28,401][35745] Fps is (10 sec: 40960.7, 60 sec: 44510.0, 300 sec: 44708.9). 
Total num frames: 541425664. Throughput: 0: 45036.7. Samples: 140021280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-10 12:52:28,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:52:30,351][35978] Updated weights for policy 0, policy_version 33055 (0.0042) [2024-06-10 12:52:33,401][35745] Fps is (10 sec: 42599.1, 60 sec: 44236.9, 300 sec: 44820.0). Total num frames: 541671424. Throughput: 0: 44995.6. Samples: 140287180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-10 12:52:33,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:52:34,975][35978] Updated weights for policy 0, policy_version 33065 (0.0033) [2024-06-10 12:52:37,822][35978] Updated weights for policy 0, policy_version 33075 (0.0031) [2024-06-10 12:52:38,402][35745] Fps is (10 sec: 49151.5, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 541917184. Throughput: 0: 44732.4. Samples: 140555840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-10 12:52:38,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:52:42,591][35978] Updated weights for policy 0, policy_version 33085 (0.0021) [2024-06-10 12:52:43,402][35745] Fps is (10 sec: 44236.5, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 542113792. Throughput: 0: 45272.5. Samples: 140697520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-10 12:52:43,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:52:45,287][35978] Updated weights for policy 0, policy_version 33095 (0.0034) [2024-06-10 12:52:48,401][35745] Fps is (10 sec: 42598.8, 60 sec: 44236.8, 300 sec: 44820.0). Total num frames: 542343168. Throughput: 0: 44916.9. Samples: 140955760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-10 12:52:48,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:52:49,733][35978] Updated weights for policy 0, policy_version 33105 (0.0031) [2024-06-10 12:52:51,089][35957] Signal inference workers to stop experience collection... 
(2150 times) [2024-06-10 12:52:51,135][35978] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-06-10 12:52:51,140][35957] Signal inference workers to resume experience collection... (2150 times) [2024-06-10 12:52:51,151][35978] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-06-10 12:52:52,576][35978] Updated weights for policy 0, policy_version 33115 (0.0038) [2024-06-10 12:52:53,402][35745] Fps is (10 sec: 47513.4, 60 sec: 45329.0, 300 sec: 44820.0). Total num frames: 542588928. Throughput: 0: 44703.9. Samples: 141224120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-10 12:52:53,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:52:56,903][35978] Updated weights for policy 0, policy_version 33125 (0.0032) [2024-06-10 12:52:58,404][35745] Fps is (10 sec: 42588.2, 60 sec: 44781.3, 300 sec: 44709.5). Total num frames: 542769152. Throughput: 0: 44949.7. Samples: 141361880. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-10 12:52:58,405][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 12:52:59,563][35978] Updated weights for policy 0, policy_version 33135 (0.0024) [2024-06-10 12:53:03,402][35745] Fps is (10 sec: 42598.1, 60 sec: 44509.7, 300 sec: 44820.0). Total num frames: 543014912. Throughput: 0: 45038.6. Samples: 141632060. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-10 12:53:03,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:53:04,201][35978] Updated weights for policy 0, policy_version 33145 (0.0036) [2024-06-10 12:53:07,035][35978] Updated weights for policy 0, policy_version 33155 (0.0038) [2024-06-10 12:53:08,402][35745] Fps is (10 sec: 50802.1, 60 sec: 45056.0, 300 sec: 44931.1). Total num frames: 543277056. Throughput: 0: 44865.4. Samples: 141907300. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-10 12:53:08,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:53:11,721][35978] Updated weights for policy 0, policy_version 33165 (0.0034) [2024-06-10 12:53:13,401][35745] Fps is (10 sec: 45876.2, 60 sec: 45329.2, 300 sec: 44875.5). Total num frames: 543473664. Throughput: 0: 45030.2. Samples: 142047640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-10 12:53:13,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:53:14,510][35978] Updated weights for policy 0, policy_version 33175 (0.0037) [2024-06-10 12:53:18,402][35745] Fps is (10 sec: 39321.9, 60 sec: 44236.9, 300 sec: 44709.0). Total num frames: 543670272. Throughput: 0: 44969.3. Samples: 142310800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-10 12:53:18,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:53:18,782][35978] Updated weights for policy 0, policy_version 33185 (0.0048) [2024-06-10 12:53:21,723][35978] Updated weights for policy 0, policy_version 33195 (0.0037) [2024-06-10 12:53:23,402][35745] Fps is (10 sec: 47513.0, 60 sec: 45056.1, 300 sec: 44931.0). Total num frames: 543948800. Throughput: 0: 44938.2. Samples: 142578060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-10 12:53:23,402][35745] Avg episode reward: [(0, '0.293')] [2024-06-10 12:53:25,799][35978] Updated weights for policy 0, policy_version 33205 (0.0026) [2024-06-10 12:53:28,402][35745] Fps is (10 sec: 49151.9, 60 sec: 45602.1, 300 sec: 44931.0). Total num frames: 544161792. Throughput: 0: 44897.8. Samples: 142717920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-10 12:53:28,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:53:28,800][35978] Updated weights for policy 0, policy_version 33215 (0.0035) [2024-06-10 12:53:33,289][35978] Updated weights for policy 0, policy_version 33225 (0.0027) [2024-06-10 12:53:33,404][35745] Fps is (10 sec: 40950.5, 60 sec: 44781.1, 300 sec: 44708.5). 
Total num frames: 544358400. Throughput: 0: 45043.3. Samples: 142982820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 12:53:33,404][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:53:33,518][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000033226_544374784.pth... [2024-06-10 12:53:33,589][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000032570_533626880.pth [2024-06-10 12:53:36,466][35978] Updated weights for policy 0, policy_version 33235 (0.0026) [2024-06-10 12:53:38,402][35745] Fps is (10 sec: 44236.8, 60 sec: 44783.0, 300 sec: 44931.1). Total num frames: 544604160. Throughput: 0: 45035.2. Samples: 143250700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 12:53:38,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:53:40,747][35978] Updated weights for policy 0, policy_version 33245 (0.0049) [2024-06-10 12:53:43,402][35745] Fps is (10 sec: 47524.9, 60 sec: 45329.1, 300 sec: 44820.0). Total num frames: 544833536. Throughput: 0: 45181.9. Samples: 143394960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 12:53:43,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:53:43,494][35978] Updated weights for policy 0, policy_version 33255 (0.0036) [2024-06-10 12:53:48,067][35978] Updated weights for policy 0, policy_version 33265 (0.0027) [2024-06-10 12:53:48,401][35745] Fps is (10 sec: 40960.0, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 545013760. Throughput: 0: 45001.1. Samples: 143657100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-10 12:53:48,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:53:50,798][35978] Updated weights for policy 0, policy_version 33275 (0.0026) [2024-06-10 12:53:53,402][35745] Fps is (10 sec: 45874.7, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 545292288. Throughput: 0: 44754.2. Samples: 143921240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:53:53,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:53:55,024][35978] Updated weights for policy 0, policy_version 33285 (0.0037) [2024-06-10 12:53:57,998][35978] Updated weights for policy 0, policy_version 33295 (0.0030) [2024-06-10 12:53:58,401][35745] Fps is (10 sec: 49152.1, 60 sec: 45603.9, 300 sec: 44931.1). Total num frames: 545505280. Throughput: 0: 44866.6. Samples: 144066640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:53:58,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 12:54:02,323][35978] Updated weights for policy 0, policy_version 33305 (0.0049) [2024-06-10 12:54:03,402][35745] Fps is (10 sec: 40960.5, 60 sec: 44783.1, 300 sec: 44764.4). Total num frames: 545701888. Throughput: 0: 45040.4. Samples: 144337620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:54:03,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 12:54:05,670][35978] Updated weights for policy 0, policy_version 33315 (0.0031) [2024-06-10 12:54:08,402][35745] Fps is (10 sec: 44236.2, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 545947648. Throughput: 0: 44896.8. Samples: 144598420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:54:08,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:54:10,018][35978] Updated weights for policy 0, policy_version 33325 (0.0038) [2024-06-10 12:54:12,943][35978] Updated weights for policy 0, policy_version 33335 (0.0036) [2024-06-10 12:54:13,402][35745] Fps is (10 sec: 45874.8, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 546160640. Throughput: 0: 44782.2. Samples: 144733120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-10 12:54:13,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:54:17,216][35978] Updated weights for policy 0, policy_version 33345 (0.0027) [2024-06-10 12:54:18,002][35957] Signal inference workers to stop experience collection... 
(2200 times) [2024-06-10 12:54:18,003][35957] Signal inference workers to resume experience collection... (2200 times) [2024-06-10 12:54:18,024][35978] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-06-10 12:54:18,024][35978] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-06-10 12:54:18,402][35745] Fps is (10 sec: 42598.4, 60 sec: 45055.9, 300 sec: 44819.9). Total num frames: 546373632. Throughput: 0: 44972.0. Samples: 145006460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 12:54:18,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:54:20,105][35978] Updated weights for policy 0, policy_version 33355 (0.0024) [2024-06-10 12:54:23,402][35745] Fps is (10 sec: 44236.9, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 546603008. Throughput: 0: 44868.0. Samples: 145269760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 12:54:23,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:54:24,445][35978] Updated weights for policy 0, policy_version 33365 (0.0040) [2024-06-10 12:54:27,483][35978] Updated weights for policy 0, policy_version 33375 (0.0034) [2024-06-10 12:54:28,404][35745] Fps is (10 sec: 47503.0, 60 sec: 44781.2, 300 sec: 44875.2). Total num frames: 546848768. Throughput: 0: 44597.7. Samples: 145401960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 12:54:28,405][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:54:31,679][35978] Updated weights for policy 0, policy_version 33385 (0.0028) [2024-06-10 12:54:33,408][35745] Fps is (10 sec: 44208.7, 60 sec: 44779.9, 300 sec: 44763.4). Total num frames: 547045376. Throughput: 0: 44699.8. Samples: 145668880. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-10 12:54:33,408][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:54:34,971][35978] Updated weights for policy 0, policy_version 33395 (0.0038) [2024-06-10 12:54:38,402][35745] Fps is (10 sec: 42608.0, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 547274752. Throughput: 0: 44816.4. Samples: 145937980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 12:54:38,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:54:39,102][35978] Updated weights for policy 0, policy_version 33405 (0.0032) [2024-06-10 12:54:42,265][35978] Updated weights for policy 0, policy_version 33415 (0.0041) [2024-06-10 12:54:43,401][35745] Fps is (10 sec: 47544.3, 60 sec: 44783.0, 300 sec: 44875.9). Total num frames: 547520512. Throughput: 0: 44718.7. Samples: 146078980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 12:54:43,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 12:54:46,150][35978] Updated weights for policy 0, policy_version 33425 (0.0028) [2024-06-10 12:54:48,402][35745] Fps is (10 sec: 44236.8, 60 sec: 45055.9, 300 sec: 44709.6). Total num frames: 547717120. Throughput: 0: 44671.4. Samples: 146347840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 12:54:48,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:54:49,387][35978] Updated weights for policy 0, policy_version 33435 (0.0043) [2024-06-10 12:54:53,402][35745] Fps is (10 sec: 42597.5, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 547946496. Throughput: 0: 44832.4. Samples: 146615880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-10 12:54:53,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:54:53,565][35978] Updated weights for policy 0, policy_version 33445 (0.0038) [2024-06-10 12:54:56,692][35978] Updated weights for policy 0, policy_version 33455 (0.0041) [2024-06-10 12:54:58,402][35745] Fps is (10 sec: 47513.9, 60 sec: 44782.9, 300 sec: 44875.9). 
Total num frames: 548192256. Throughput: 0: 44914.3. Samples: 146754260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-10 12:54:58,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:55:00,616][35978] Updated weights for policy 0, policy_version 33465 (0.0024) [2024-06-10 12:55:03,401][35745] Fps is (10 sec: 44237.6, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 548388864. Throughput: 0: 44789.5. Samples: 147021980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-10 12:55:03,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:55:04,099][35978] Updated weights for policy 0, policy_version 33475 (0.0037) [2024-06-10 12:55:08,034][35978] Updated weights for policy 0, policy_version 33485 (0.0031) [2024-06-10 12:55:08,401][35745] Fps is (10 sec: 44237.3, 60 sec: 44783.1, 300 sec: 44820.0). Total num frames: 548634624. Throughput: 0: 44968.6. Samples: 147293340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-10 12:55:08,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:55:11,586][35978] Updated weights for policy 0, policy_version 33495 (0.0030) [2024-06-10 12:55:13,402][35745] Fps is (10 sec: 47513.3, 60 sec: 45056.0, 300 sec: 44820.3). Total num frames: 548864000. Throughput: 0: 45045.0. Samples: 147428880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-10 12:55:13,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:55:15,213][35978] Updated weights for policy 0, policy_version 33505 (0.0037) [2024-06-10 12:55:18,402][35745] Fps is (10 sec: 42598.0, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 549060608. Throughput: 0: 45147.3. Samples: 147700220. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-10 12:55:18,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:55:18,804][35978] Updated weights for policy 0, policy_version 33515 (0.0031) [2024-06-10 12:55:22,650][35978] Updated weights for policy 0, policy_version 33525 (0.0033) [2024-06-10 12:55:23,402][35745] Fps is (10 sec: 44236.3, 60 sec: 45055.9, 300 sec: 44819.9). Total num frames: 549306368. Throughput: 0: 44912.8. Samples: 147959060. Policy #0 lag: (min: 1.0, avg: 10.2, max: 24.0) [2024-06-10 12:55:23,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:55:26,031][35978] Updated weights for policy 0, policy_version 33535 (0.0034) [2024-06-10 12:55:28,404][35745] Fps is (10 sec: 47502.3, 60 sec: 44782.9, 300 sec: 44819.6). Total num frames: 549535744. Throughput: 0: 44838.0. Samples: 148096800. Policy #0 lag: (min: 1.0, avg: 10.2, max: 24.0) [2024-06-10 12:55:28,405][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:55:29,809][35978] Updated weights for policy 0, policy_version 33545 (0.0028) [2024-06-10 12:55:33,402][35745] Fps is (10 sec: 44237.4, 60 sec: 45060.8, 300 sec: 44875.5). Total num frames: 549748736. Throughput: 0: 44968.5. Samples: 148371420. Policy #0 lag: (min: 1.0, avg: 10.2, max: 24.0) [2024-06-10 12:55:33,404][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:55:33,420][35978] Updated weights for policy 0, policy_version 33555 (0.0032) [2024-06-10 12:55:33,535][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000033556_549781504.pth... [2024-06-10 12:55:33,581][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000032899_539017216.pth [2024-06-10 12:55:37,019][35978] Updated weights for policy 0, policy_version 33565 (0.0037) [2024-06-10 12:55:38,402][35745] Fps is (10 sec: 44246.8, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 549978112. Throughput: 0: 44920.0. Samples: 148637280. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 24.0) [2024-06-10 12:55:38,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:55:40,857][35957] Signal inference workers to stop experience collection... (2250 times) [2024-06-10 12:55:40,889][35978] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-06-10 12:55:40,918][35957] Signal inference workers to resume experience collection... (2250 times) [2024-06-10 12:55:40,922][35978] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-06-10 12:55:40,925][35978] Updated weights for policy 0, policy_version 33575 (0.0028) [2024-06-10 12:55:43,402][35745] Fps is (10 sec: 45875.1, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 550207488. Throughput: 0: 44794.7. Samples: 148770020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-10 12:55:43,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:55:44,283][35978] Updated weights for policy 0, policy_version 33585 (0.0031) [2024-06-10 12:55:48,102][35978] Updated weights for policy 0, policy_version 33595 (0.0044) [2024-06-10 12:55:48,402][35745] Fps is (10 sec: 45875.3, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 550436864. Throughput: 0: 44922.1. Samples: 149043480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-10 12:55:48,402][35745] Avg episode reward: [(0, '0.292')] [2024-06-10 12:55:51,562][35978] Updated weights for policy 0, policy_version 33605 (0.0032) [2024-06-10 12:55:53,402][35745] Fps is (10 sec: 42598.3, 60 sec: 44783.0, 300 sec: 44709.2). Total num frames: 550633472. Throughput: 0: 44811.9. Samples: 149309880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-10 12:55:53,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:55:55,205][35978] Updated weights for policy 0, policy_version 33615 (0.0035) [2024-06-10 12:55:58,402][35745] Fps is (10 sec: 44237.3, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 550879232. Throughput: 0: 44761.8. 
Samples: 149443160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-10 12:55:58,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:55:58,804][35978] Updated weights for policy 0, policy_version 33625 (0.0034) [2024-06-10 12:56:02,834][35978] Updated weights for policy 0, policy_version 33635 (0.0026) [2024-06-10 12:56:03,402][35745] Fps is (10 sec: 47513.5, 60 sec: 45329.0, 300 sec: 44931.0). Total num frames: 551108608. Throughput: 0: 44763.5. Samples: 149714580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 12:56:03,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 12:56:05,885][35978] Updated weights for policy 0, policy_version 33645 (0.0030) [2024-06-10 12:56:08,402][35745] Fps is (10 sec: 40959.6, 60 sec: 44236.7, 300 sec: 44764.4). Total num frames: 551288832. Throughput: 0: 44881.4. Samples: 149978720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 12:56:08,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:56:10,291][35978] Updated weights for policy 0, policy_version 33655 (0.0042) [2024-06-10 12:56:13,402][35745] Fps is (10 sec: 45875.2, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 551567360. Throughput: 0: 44761.9. Samples: 150110980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 12:56:13,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:56:13,403][35978] Updated weights for policy 0, policy_version 33665 (0.0037) [2024-06-10 12:56:17,380][35978] Updated weights for policy 0, policy_version 33675 (0.0032) [2024-06-10 12:56:18,402][35745] Fps is (10 sec: 49151.7, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 551780352. Throughput: 0: 44860.8. Samples: 150390160. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 12:56:18,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:56:20,809][35978] Updated weights for policy 0, policy_version 33685 (0.0044) [2024-06-10 12:56:23,404][35745] Fps is (10 sec: 40950.4, 60 sec: 44508.2, 300 sec: 44819.6). Total num frames: 551976960. Throughput: 0: 44872.8. Samples: 150656660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-10 12:56:23,405][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:56:24,531][35978] Updated weights for policy 0, policy_version 33695 (0.0027) [2024-06-10 12:56:27,910][35978] Updated weights for policy 0, policy_version 33705 (0.0031) [2024-06-10 12:56:28,402][35745] Fps is (10 sec: 44237.3, 60 sec: 44784.7, 300 sec: 44764.4). Total num frames: 552222720. Throughput: 0: 44919.1. Samples: 150791380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-10 12:56:28,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:56:31,942][35978] Updated weights for policy 0, policy_version 33715 (0.0035) [2024-06-10 12:56:33,404][35745] Fps is (10 sec: 49152.2, 60 sec: 45327.3, 300 sec: 44930.7). Total num frames: 552468480. Throughput: 0: 44915.1. Samples: 151064760. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-10 12:56:33,405][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:56:35,099][35978] Updated weights for policy 0, policy_version 33725 (0.0028) [2024-06-10 12:56:38,402][35745] Fps is (10 sec: 42597.6, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 552648704. Throughput: 0: 44962.9. Samples: 151333220. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-10 12:56:38,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:56:39,419][35978] Updated weights for policy 0, policy_version 33735 (0.0024) [2024-06-10 12:56:42,561][35978] Updated weights for policy 0, policy_version 33745 (0.0024) [2024-06-10 12:56:43,402][35745] Fps is (10 sec: 44247.3, 60 sec: 45056.0, 300 sec: 44820.0). 
Total num frames: 552910848. Throughput: 0: 44796.4. Samples: 151459000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-10 12:56:43,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:56:46,617][35978] Updated weights for policy 0, policy_version 33755 (0.0040) [2024-06-10 12:56:48,401][35745] Fps is (10 sec: 47514.7, 60 sec: 44783.0, 300 sec: 44931.0). Total num frames: 553123840. Throughput: 0: 44851.6. Samples: 151732900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-10 12:56:48,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:56:49,651][35978] Updated weights for policy 0, policy_version 33765 (0.0036) [2024-06-10 12:56:53,402][35745] Fps is (10 sec: 40959.7, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 553320448. Throughput: 0: 45076.5. Samples: 152007160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-10 12:56:53,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 12:56:53,786][35957] Signal inference workers to stop experience collection... (2300 times) [2024-06-10 12:56:53,820][35978] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-06-10 12:56:53,830][35957] Signal inference workers to resume experience collection... (2300 times) [2024-06-10 12:56:53,836][35978] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-06-10 12:56:53,968][35978] Updated weights for policy 0, policy_version 33775 (0.0034) [2024-06-10 12:56:57,026][35978] Updated weights for policy 0, policy_version 33785 (0.0029) [2024-06-10 12:56:58,402][35745] Fps is (10 sec: 45875.0, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 553582592. Throughput: 0: 44980.5. Samples: 152135100. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-10 12:56:58,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:57:01,360][35978] Updated weights for policy 0, policy_version 33795 (0.0022) [2024-06-10 12:57:03,402][35745] Fps is (10 sec: 49152.0, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 553811968. Throughput: 0: 44879.2. Samples: 152409720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-10 12:57:03,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:57:04,102][35978] Updated weights for policy 0, policy_version 33805 (0.0031) [2024-06-10 12:57:08,402][35745] Fps is (10 sec: 40959.8, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 553992192. Throughput: 0: 44896.1. Samples: 152676880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 12:57:08,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:57:08,799][35978] Updated weights for policy 0, policy_version 33815 (0.0030) [2024-06-10 12:57:11,631][35978] Updated weights for policy 0, policy_version 33825 (0.0026) [2024-06-10 12:57:13,402][35745] Fps is (10 sec: 42598.5, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 554237952. Throughput: 0: 44750.6. Samples: 152805160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 12:57:13,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 12:57:15,964][35978] Updated weights for policy 0, policy_version 33835 (0.0029) [2024-06-10 12:57:18,404][35745] Fps is (10 sec: 49140.7, 60 sec: 45054.3, 300 sec: 44875.2). Total num frames: 554483712. Throughput: 0: 44675.6. Samples: 153075160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 12:57:18,405][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:57:18,782][35978] Updated weights for policy 0, policy_version 33845 (0.0038) [2024-06-10 12:57:23,052][35978] Updated weights for policy 0, policy_version 33855 (0.0026) [2024-06-10 12:57:23,402][35745] Fps is (10 sec: 44236.8, 60 sec: 45057.8, 300 sec: 44931.0). 
Total num frames: 554680320. Throughput: 0: 45006.4. Samples: 153358500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 12:57:23,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:57:26,233][35978] Updated weights for policy 0, policy_version 33865 (0.0021) [2024-06-10 12:57:28,402][35745] Fps is (10 sec: 44247.4, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 554926080. Throughput: 0: 45056.0. Samples: 153486520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-10 12:57:28,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:57:30,322][35978] Updated weights for policy 0, policy_version 33875 (0.0039) [2024-06-10 12:57:33,241][35978] Updated weights for policy 0, policy_version 33885 (0.0030) [2024-06-10 12:57:33,402][35745] Fps is (10 sec: 49152.1, 60 sec: 45057.8, 300 sec: 44931.0). Total num frames: 555171840. Throughput: 0: 44992.0. Samples: 153757540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 12:57:33,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:57:33,421][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000033885_555171840.pth... [2024-06-10 12:57:33,472][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000033226_544374784.pth [2024-06-10 12:57:37,823][35978] Updated weights for policy 0, policy_version 33895 (0.0039) [2024-06-10 12:57:38,402][35745] Fps is (10 sec: 42598.3, 60 sec: 45056.2, 300 sec: 44875.5). Total num frames: 555352064. Throughput: 0: 44976.1. Samples: 154031080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 12:57:38,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:57:40,925][35978] Updated weights for policy 0, policy_version 33905 (0.0042) [2024-06-10 12:57:43,402][35745] Fps is (10 sec: 42598.0, 60 sec: 44782.8, 300 sec: 44931.0). Total num frames: 555597824. Throughput: 0: 45057.2. Samples: 154162680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 12:57:43,402][35745] Avg episode reward: [(0, '0.294')] [2024-06-10 12:57:45,082][35978] Updated weights for policy 0, policy_version 33915 (0.0033) [2024-06-10 12:57:48,048][35978] Updated weights for policy 0, policy_version 33925 (0.0033) [2024-06-10 12:57:48,404][35745] Fps is (10 sec: 47502.3, 60 sec: 45054.2, 300 sec: 44875.1). Total num frames: 555827200. Throughput: 0: 44888.4. Samples: 154429800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-10 12:57:48,405][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:57:52,586][35978] Updated weights for policy 0, policy_version 33935 (0.0030) [2024-06-10 12:57:53,402][35745] Fps is (10 sec: 45875.0, 60 sec: 45602.1, 300 sec: 45042.4). Total num frames: 556056576. Throughput: 0: 45009.2. Samples: 154702300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 12:57:53,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:57:55,625][35978] Updated weights for policy 0, policy_version 33945 (0.0033) [2024-06-10 12:57:58,402][35745] Fps is (10 sec: 42608.3, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 556253184. Throughput: 0: 45042.7. Samples: 154832080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 12:57:58,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 12:57:59,794][35978] Updated weights for policy 0, policy_version 33955 (0.0036) [2024-06-10 12:58:02,804][35978] Updated weights for policy 0, policy_version 33965 (0.0029) [2024-06-10 12:58:03,402][35745] Fps is (10 sec: 42599.1, 60 sec: 44509.9, 300 sec: 44764.4). Total num frames: 556482560. Throughput: 0: 44842.4. Samples: 155092960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 12:58:03,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:58:06,961][35978] Updated weights for policy 0, policy_version 33975 (0.0036) [2024-06-10 12:58:08,326][35957] Signal inference workers to stop experience collection... 
(2350 times) [2024-06-10 12:58:08,326][35957] Signal inference workers to resume experience collection... (2350 times) [2024-06-10 12:58:08,351][35978] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-06-10 12:58:08,351][35978] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-06-10 12:58:08,402][35745] Fps is (10 sec: 47513.9, 60 sec: 45602.2, 300 sec: 44931.0). Total num frames: 556728320. Throughput: 0: 44800.5. Samples: 155374520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 12:58:08,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 12:58:10,432][35978] Updated weights for policy 0, policy_version 33985 (0.0031) [2024-06-10 12:58:13,402][35745] Fps is (10 sec: 42598.4, 60 sec: 44509.9, 300 sec: 44875.5). Total num frames: 556908544. Throughput: 0: 44823.1. Samples: 155503560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 12:58:13,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:58:14,418][35978] Updated weights for policy 0, policy_version 33995 (0.0030) [2024-06-10 12:58:17,386][35978] Updated weights for policy 0, policy_version 34005 (0.0033) [2024-06-10 12:58:18,402][35745] Fps is (10 sec: 44236.5, 60 sec: 44784.7, 300 sec: 44820.0). Total num frames: 557170688. Throughput: 0: 44811.1. Samples: 155774040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 12:58:18,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 12:58:22,014][35978] Updated weights for policy 0, policy_version 34015 (0.0026) [2024-06-10 12:58:23,402][35745] Fps is (10 sec: 49151.7, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 557400064. Throughput: 0: 44615.9. Samples: 156038800. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 12:58:23,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:58:25,075][35978] Updated weights for policy 0, policy_version 34025 (0.0026) [2024-06-10 12:58:28,402][35745] Fps is (10 sec: 42598.1, 60 sec: 44509.8, 300 sec: 44875.8). Total num frames: 557596672. Throughput: 0: 44806.2. Samples: 156178960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 12:58:28,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:58:29,258][35978] Updated weights for policy 0, policy_version 34035 (0.0037) [2024-06-10 12:58:32,077][35978] Updated weights for policy 0, policy_version 34045 (0.0043) [2024-06-10 12:58:33,402][35745] Fps is (10 sec: 44236.6, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 557842432. Throughput: 0: 44815.2. Samples: 156446380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 12:58:33,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:58:36,294][35978] Updated weights for policy 0, policy_version 34055 (0.0044) [2024-06-10 12:58:38,404][35745] Fps is (10 sec: 49141.1, 60 sec: 45600.3, 300 sec: 44930.7). Total num frames: 558088192. Throughput: 0: 44700.0. Samples: 156713900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-10 12:58:38,405][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 12:58:39,421][35978] Updated weights for policy 0, policy_version 34065 (0.0029) [2024-06-10 12:58:43,408][35745] Fps is (10 sec: 42571.9, 60 sec: 44505.3, 300 sec: 44930.1). Total num frames: 558268416. Throughput: 0: 44868.0. Samples: 156851420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-10 12:58:43,409][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 12:58:43,546][35978] Updated weights for policy 0, policy_version 34075 (0.0037) [2024-06-10 12:58:46,492][35978] Updated weights for policy 0, policy_version 34085 (0.0031) [2024-06-10 12:58:48,401][35745] Fps is (10 sec: 44247.4, 60 sec: 45057.8, 300 sec: 44875.5). 
Total num frames: 558530560. Throughput: 0: 45030.7. Samples: 157119340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-10 12:58:48,402][35745] Avg episode reward: [(0, '0.296')] [2024-06-10 12:58:51,060][35978] Updated weights for policy 0, policy_version 34095 (0.0044) [2024-06-10 12:58:53,402][35745] Fps is (10 sec: 47542.9, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 558743552. Throughput: 0: 44676.7. Samples: 157384980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-10 12:58:53,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 12:58:54,236][35978] Updated weights for policy 0, policy_version 34105 (0.0040) [2024-06-10 12:58:58,402][35745] Fps is (10 sec: 39321.4, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 558923776. Throughput: 0: 44862.2. Samples: 157522360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-10 12:58:58,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 12:58:58,634][35978] Updated weights for policy 0, policy_version 34115 (0.0032) [2024-06-10 12:59:01,537][35978] Updated weights for policy 0, policy_version 34125 (0.0039) [2024-06-10 12:59:03,402][35745] Fps is (10 sec: 44237.5, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 559185920. Throughput: 0: 44670.7. Samples: 157784220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 12:59:03,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 12:59:05,542][35978] Updated weights for policy 0, policy_version 34135 (0.0026) [2024-06-10 12:59:08,402][35745] Fps is (10 sec: 49151.8, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 559415296. Throughput: 0: 44789.3. Samples: 158054320. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 12:59:08,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:59:08,508][35978] Updated weights for policy 0, policy_version 34145 (0.0039) [2024-06-10 12:59:12,618][35978] Updated weights for policy 0, policy_version 34155 (0.0025) [2024-06-10 12:59:13,402][35745] Fps is (10 sec: 42598.3, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 559611904. Throughput: 0: 44711.7. Samples: 158190980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 12:59:13,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 12:59:15,784][35978] Updated weights for policy 0, policy_version 34165 (0.0026) [2024-06-10 12:59:17,060][35957] Signal inference workers to stop experience collection... (2400 times) [2024-06-10 12:59:17,102][35978] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-06-10 12:59:17,119][35957] Signal inference workers to resume experience collection... (2400 times) [2024-06-10 12:59:17,122][35978] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-06-10 12:59:18,402][35745] Fps is (10 sec: 44236.4, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 559857664. Throughput: 0: 44884.0. Samples: 158466160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 12:59:18,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 12:59:20,391][35978] Updated weights for policy 0, policy_version 34175 (0.0037) [2024-06-10 12:59:23,402][35745] Fps is (10 sec: 45875.2, 60 sec: 44509.9, 300 sec: 44820.3). Total num frames: 560070656. Throughput: 0: 44663.2. Samples: 158723640. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-10 12:59:23,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 12:59:23,505][35978] Updated weights for policy 0, policy_version 34185 (0.0032) [2024-06-10 12:59:27,864][35978] Updated weights for policy 0, policy_version 34195 (0.0026) [2024-06-10 12:59:28,401][35745] Fps is (10 sec: 42599.3, 60 sec: 44783.1, 300 sec: 44876.5). Total num frames: 560283648. Throughput: 0: 44614.3. Samples: 158858780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-10 12:59:28,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 12:59:30,766][35978] Updated weights for policy 0, policy_version 34205 (0.0032) [2024-06-10 12:59:33,401][35745] Fps is (10 sec: 42598.7, 60 sec: 44236.9, 300 sec: 44820.0). Total num frames: 560496640. Throughput: 0: 44544.4. Samples: 159123840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-10 12:59:33,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:59:33,444][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000034211_560513024.pth... [2024-06-10 12:59:33,499][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000033556_549781504.pth [2024-06-10 12:59:35,174][35978] Updated weights for policy 0, policy_version 34215 (0.0033) [2024-06-10 12:59:38,079][35978] Updated weights for policy 0, policy_version 34225 (0.0032) [2024-06-10 12:59:38,402][35745] Fps is (10 sec: 45874.4, 60 sec: 44238.5, 300 sec: 44819.9). Total num frames: 560742400. Throughput: 0: 44639.2. Samples: 159393740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-10 12:59:38,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 12:59:42,314][35978] Updated weights for policy 0, policy_version 34235 (0.0033) [2024-06-10 12:59:43,402][35745] Fps is (10 sec: 45874.7, 60 sec: 44787.6, 300 sec: 44875.5). Total num frames: 560955392. Throughput: 0: 44628.8. Samples: 159530660. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-10 12:59:43,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 12:59:45,267][35978] Updated weights for policy 0, policy_version 34245 (0.0031) [2024-06-10 12:59:48,402][35745] Fps is (10 sec: 42598.2, 60 sec: 43963.6, 300 sec: 44820.0). Total num frames: 561168384. Throughput: 0: 44908.7. Samples: 159805120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-10 12:59:48,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 12:59:49,449][35978] Updated weights for policy 0, policy_version 34255 (0.0032) [2024-06-10 12:59:52,597][35978] Updated weights for policy 0, policy_version 34265 (0.0035) [2024-06-10 12:59:53,402][35745] Fps is (10 sec: 45875.4, 60 sec: 44510.0, 300 sec: 44820.0). Total num frames: 561414144. Throughput: 0: 44656.0. Samples: 160063840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-10 12:59:53,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 12:59:56,954][35978] Updated weights for policy 0, policy_version 34275 (0.0026) [2024-06-10 12:59:58,408][35745] Fps is (10 sec: 45846.3, 60 sec: 45051.2, 300 sec: 44874.5). Total num frames: 561627136. Throughput: 0: 44809.2. Samples: 160207680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-10 12:59:58,409][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 13:00:00,041][35978] Updated weights for policy 0, policy_version 34285 (0.0019) [2024-06-10 13:00:03,402][35745] Fps is (10 sec: 42598.1, 60 sec: 44236.7, 300 sec: 44764.4). Total num frames: 561840128. Throughput: 0: 44416.0. Samples: 160464880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-10 13:00:03,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:00:04,190][35978] Updated weights for policy 0, policy_version 34295 (0.0024) [2024-06-10 13:00:07,080][35978] Updated weights for policy 0, policy_version 34305 (0.0025) [2024-06-10 13:00:08,402][35745] Fps is (10 sec: 47544.2, 60 sec: 44783.0, 300 sec: 44875.5). 
Total num frames: 562102272. Throughput: 0: 44826.7. Samples: 160740840. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-10 13:00:08,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 13:00:11,421][35978] Updated weights for policy 0, policy_version 34315 (0.0039) [2024-06-10 13:00:13,402][35745] Fps is (10 sec: 45875.1, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 562298880. Throughput: 0: 44866.0. Samples: 160877760. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-10 13:00:13,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:00:14,395][35978] Updated weights for policy 0, policy_version 34325 (0.0020) [2024-06-10 13:00:18,402][35745] Fps is (10 sec: 42598.4, 60 sec: 44510.0, 300 sec: 44820.0). Total num frames: 562528256. Throughput: 0: 44907.1. Samples: 161144660. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-10 13:00:18,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:00:18,440][35978] Updated weights for policy 0, policy_version 34335 (0.0035) [2024-06-10 13:00:21,553][35978] Updated weights for policy 0, policy_version 34345 (0.0031) [2024-06-10 13:00:23,402][35745] Fps is (10 sec: 45874.6, 60 sec: 44782.8, 300 sec: 44820.3). Total num frames: 562757632. Throughput: 0: 44955.8. Samples: 161416760. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-10 13:00:23,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:00:25,692][35957] Signal inference workers to stop experience collection... (2450 times) [2024-06-10 13:00:25,692][35957] Signal inference workers to resume experience collection... 
(2450 times) [2024-06-10 13:00:25,718][35978] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-06-10 13:00:25,718][35978] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-06-10 13:00:25,850][35978] Updated weights for policy 0, policy_version 34355 (0.0030) [2024-06-10 13:00:28,401][35745] Fps is (10 sec: 44237.0, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 562970624. Throughput: 0: 44861.4. Samples: 161549420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-10 13:00:28,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 13:00:29,218][35978] Updated weights for policy 0, policy_version 34365 (0.0023) [2024-06-10 13:00:33,179][35978] Updated weights for policy 0, policy_version 34375 (0.0044) [2024-06-10 13:00:33,404][35745] Fps is (10 sec: 44227.4, 60 sec: 45054.2, 300 sec: 44819.6). Total num frames: 563200000. Throughput: 0: 44780.0. Samples: 161820320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-10 13:00:33,405][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:00:36,589][35978] Updated weights for policy 0, policy_version 34385 (0.0030) [2024-06-10 13:00:38,401][35745] Fps is (10 sec: 44237.0, 60 sec: 44510.0, 300 sec: 44764.4). Total num frames: 563412992. Throughput: 0: 44863.6. Samples: 162082700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-10 13:00:38,402][35745] Avg episode reward: [(0, '0.300')] [2024-06-10 13:00:40,209][35978] Updated weights for policy 0, policy_version 34395 (0.0022) [2024-06-10 13:00:43,401][35745] Fps is (10 sec: 45886.2, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 563658752. Throughput: 0: 44906.0. Samples: 162228160. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-10 13:00:43,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:00:43,852][35978] Updated weights for policy 0, policy_version 34405 (0.0033) [2024-06-10 13:00:47,369][35978] Updated weights for policy 0, policy_version 34415 (0.0030) [2024-06-10 13:00:48,402][35745] Fps is (10 sec: 45874.6, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 563871744. Throughput: 0: 45128.0. Samples: 162495640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-10 13:00:48,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 13:00:50,935][35978] Updated weights for policy 0, policy_version 34425 (0.0043) [2024-06-10 13:00:53,402][35745] Fps is (10 sec: 44236.1, 60 sec: 44782.8, 300 sec: 44819.9). Total num frames: 564101120. Throughput: 0: 44880.8. Samples: 162760480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 13:00:53,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 13:00:54,820][35978] Updated weights for policy 0, policy_version 34435 (0.0035) [2024-06-10 13:00:58,347][35978] Updated weights for policy 0, policy_version 34445 (0.0027) [2024-06-10 13:00:58,402][35745] Fps is (10 sec: 47513.8, 60 sec: 45333.9, 300 sec: 44875.5). Total num frames: 564346880. Throughput: 0: 44880.5. Samples: 162897380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 13:00:58,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:01:02,081][35978] Updated weights for policy 0, policy_version 34455 (0.0031) [2024-06-10 13:01:03,402][35745] Fps is (10 sec: 42599.1, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 564527104. Throughput: 0: 44886.2. Samples: 163164540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 13:01:03,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 13:01:05,680][35978] Updated weights for policy 0, policy_version 34465 (0.0031) [2024-06-10 13:01:08,402][35745] Fps is (10 sec: 42598.2, 60 sec: 44509.8, 300 sec: 44764.4). 
Total num frames: 564772864. Throughput: 0: 44970.0. Samples: 163440400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-10 13:01:08,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:01:09,100][35978] Updated weights for policy 0, policy_version 34475 (0.0029) [2024-06-10 13:01:12,983][35978] Updated weights for policy 0, policy_version 34485 (0.0026) [2024-06-10 13:01:13,402][35745] Fps is (10 sec: 49151.4, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 565018624. Throughput: 0: 45041.6. Samples: 163576300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-10 13:01:13,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 13:01:16,526][35978] Updated weights for policy 0, policy_version 34495 (0.0050) [2024-06-10 13:01:18,402][35745] Fps is (10 sec: 42598.7, 60 sec: 44509.9, 300 sec: 44820.3). Total num frames: 565198848. Throughput: 0: 44772.6. Samples: 163834980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-10 13:01:18,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:01:20,250][35978] Updated weights for policy 0, policy_version 34505 (0.0031) [2024-06-10 13:01:23,402][35745] Fps is (10 sec: 42598.3, 60 sec: 44783.0, 300 sec: 44819.9). Total num frames: 565444608. Throughput: 0: 45006.5. Samples: 164108000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-10 13:01:23,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 13:01:23,422][35957] Signal inference workers to stop experience collection... (2500 times) [2024-06-10 13:01:23,423][35957] Signal inference workers to resume experience collection... 
(2500 times) [2024-06-10 13:01:23,466][35978] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-06-10 13:01:23,466][35978] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-06-10 13:01:23,738][35978] Updated weights for policy 0, policy_version 34515 (0.0035) [2024-06-10 13:01:27,594][35978] Updated weights for policy 0, policy_version 34525 (0.0032) [2024-06-10 13:01:28,402][35745] Fps is (10 sec: 49150.7, 60 sec: 45328.8, 300 sec: 44820.3). Total num frames: 565690368. Throughput: 0: 44732.6. Samples: 164241140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-10 13:01:28,403][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 13:01:31,250][35978] Updated weights for policy 0, policy_version 34535 (0.0037) [2024-06-10 13:01:33,402][35745] Fps is (10 sec: 42598.5, 60 sec: 44511.5, 300 sec: 44820.0). Total num frames: 565870592. Throughput: 0: 44704.4. Samples: 164507340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-10 13:01:33,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:01:33,407][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000034538_565870592.pth... [2024-06-10 13:01:33,464][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000033885_555171840.pth [2024-06-10 13:01:35,122][35978] Updated weights for policy 0, policy_version 34545 (0.0034) [2024-06-10 13:01:38,401][35745] Fps is (10 sec: 44238.3, 60 sec: 45329.1, 300 sec: 44820.0). Total num frames: 566132736. Throughput: 0: 44758.0. Samples: 164774580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-10 13:01:38,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:01:38,427][35978] Updated weights for policy 0, policy_version 34555 (0.0037) [2024-06-10 13:01:42,337][35978] Updated weights for policy 0, policy_version 34565 (0.0032) [2024-06-10 13:01:43,402][35745] Fps is (10 sec: 49151.9, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 566362112. 
Throughput: 0: 44720.8. Samples: 164909820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-10 13:01:43,402][35745] Avg episode reward: [(0, '0.295')] [2024-06-10 13:01:45,707][35978] Updated weights for policy 0, policy_version 34575 (0.0029) [2024-06-10 13:01:48,408][35745] Fps is (10 sec: 39296.8, 60 sec: 44232.3, 300 sec: 44763.5). Total num frames: 566525952. Throughput: 0: 44689.4. Samples: 165175840. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-10 13:01:48,408][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:01:49,754][35978] Updated weights for policy 0, policy_version 34585 (0.0030) [2024-06-10 13:01:52,894][35978] Updated weights for policy 0, policy_version 34595 (0.0029) [2024-06-10 13:01:53,404][35745] Fps is (10 sec: 44226.9, 60 sec: 45054.3, 300 sec: 44819.6). Total num frames: 566804480. Throughput: 0: 44352.4. Samples: 165436360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-10 13:01:53,405][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 13:01:57,130][35978] Updated weights for policy 0, policy_version 34605 (0.0028) [2024-06-10 13:01:58,402][35745] Fps is (10 sec: 50821.3, 60 sec: 44782.8, 300 sec: 44819.9). Total num frames: 567033856. Throughput: 0: 44622.6. Samples: 165584320. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-10 13:01:58,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 13:02:00,417][35978] Updated weights for policy 0, policy_version 34615 (0.0027) [2024-06-10 13:02:03,402][35745] Fps is (10 sec: 40969.1, 60 sec: 44782.8, 300 sec: 44819.9). Total num frames: 567214080. Throughput: 0: 44843.0. Samples: 165852920. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-10 13:02:03,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 13:02:04,371][35978] Updated weights for policy 0, policy_version 34625 (0.0027) [2024-06-10 13:02:07,453][35978] Updated weights for policy 0, policy_version 34635 (0.0033) [2024-06-10 13:02:08,402][35745] Fps is (10 sec: 42599.1, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 567459840. Throughput: 0: 44637.0. Samples: 166116660. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-10 13:02:08,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:02:11,771][35978] Updated weights for policy 0, policy_version 34645 (0.0038) [2024-06-10 13:02:13,402][35745] Fps is (10 sec: 45876.0, 60 sec: 44236.9, 300 sec: 44709.2). Total num frames: 567672832. Throughput: 0: 44857.2. Samples: 166259700. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-10 13:02:13,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:02:14,786][35978] Updated weights for policy 0, policy_version 34655 (0.0038) [2024-06-10 13:02:18,402][35745] Fps is (10 sec: 44236.8, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 567902208. Throughput: 0: 44876.6. Samples: 166526780. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-10 13:02:18,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:02:19,034][35978] Updated weights for policy 0, policy_version 34665 (0.0041) [2024-06-10 13:02:22,324][35978] Updated weights for policy 0, policy_version 34675 (0.0027) [2024-06-10 13:02:23,402][35745] Fps is (10 sec: 45874.7, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 568131584. Throughput: 0: 44750.9. Samples: 166788380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-10 13:02:23,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 13:02:26,099][35957] Signal inference workers to stop experience collection... (2550 times) [2024-06-10 13:02:26,099][35957] Signal inference workers to resume experience collection... 
(2550 times) [2024-06-10 13:02:26,144][35978] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-06-10 13:02:26,144][35978] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-06-10 13:02:26,387][35978] Updated weights for policy 0, policy_version 34685 (0.0027) [2024-06-10 13:02:28,402][35745] Fps is (10 sec: 45874.4, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 568360960. Throughput: 0: 44817.8. Samples: 166926620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-10 13:02:28,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:02:29,655][35978] Updated weights for policy 0, policy_version 34695 (0.0025) [2024-06-10 13:02:33,402][35745] Fps is (10 sec: 44237.3, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 568573952. Throughput: 0: 44914.2. Samples: 167196700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-10 13:02:33,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 13:02:33,592][35978] Updated weights for policy 0, policy_version 34705 (0.0034) [2024-06-10 13:02:36,641][35978] Updated weights for policy 0, policy_version 34715 (0.0041) [2024-06-10 13:02:38,402][35745] Fps is (10 sec: 42598.7, 60 sec: 44236.7, 300 sec: 44708.9). Total num frames: 568786944. Throughput: 0: 45033.0. Samples: 167462740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-10 13:02:38,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:02:41,081][35978] Updated weights for policy 0, policy_version 34725 (0.0040) [2024-06-10 13:02:43,402][35745] Fps is (10 sec: 47512.8, 60 sec: 44782.9, 300 sec: 44820.3). Total num frames: 569049088. Throughput: 0: 44743.6. Samples: 167597780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-10 13:02:43,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:02:43,905][35978] Updated weights for policy 0, policy_version 34735 (0.0032) [2024-06-10 13:02:48,404][35745] Fps is (10 sec: 45864.7, 60 sec: 45332.0, 300 sec: 44708.5). 
Total num frames: 569245696. Throughput: 0: 44795.6. Samples: 167868820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-10 13:02:48,405][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:02:48,601][35978] Updated weights for policy 0, policy_version 34745 (0.0034) [2024-06-10 13:02:51,412][35978] Updated weights for policy 0, policy_version 34755 (0.0038) [2024-06-10 13:02:53,401][35745] Fps is (10 sec: 40960.8, 60 sec: 44238.6, 300 sec: 44764.4). Total num frames: 569458688. Throughput: 0: 44802.2. Samples: 168132760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-10 13:02:53,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:02:55,965][35978] Updated weights for policy 0, policy_version 34765 (0.0027) [2024-06-10 13:02:58,404][35745] Fps is (10 sec: 49152.0, 60 sec: 45054.3, 300 sec: 44930.7). Total num frames: 569737216. Throughput: 0: 44740.3. Samples: 168273120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-10 13:02:58,405][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:02:58,547][35978] Updated weights for policy 0, policy_version 34775 (0.0026) [2024-06-10 13:03:03,055][35978] Updated weights for policy 0, policy_version 34785 (0.0022) [2024-06-10 13:03:03,402][35745] Fps is (10 sec: 47512.9, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 569933824. Throughput: 0: 44809.7. Samples: 168543220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 13:03:03,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:03:05,686][35978] Updated weights for policy 0, policy_version 34795 (0.0040) [2024-06-10 13:03:08,402][35745] Fps is (10 sec: 39330.8, 60 sec: 44509.8, 300 sec: 44820.0). Total num frames: 570130432. Throughput: 0: 44917.0. Samples: 168809640. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 13:03:08,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:03:10,282][35978] Updated weights for policy 0, policy_version 34805 (0.0040) [2024-06-10 13:03:13,374][35978] Updated weights for policy 0, policy_version 34815 (0.0033) [2024-06-10 13:03:13,402][35745] Fps is (10 sec: 47513.9, 60 sec: 45602.1, 300 sec: 44875.5). Total num frames: 570408960. Throughput: 0: 44794.8. Samples: 168942380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 13:03:13,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 13:03:17,801][35978] Updated weights for policy 0, policy_version 34825 (0.0022) [2024-06-10 13:03:18,402][35745] Fps is (10 sec: 47513.7, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 570605568. Throughput: 0: 44899.5. Samples: 169217180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 13:03:18,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:03:20,474][35978] Updated weights for policy 0, policy_version 34835 (0.0036) [2024-06-10 13:03:23,402][35745] Fps is (10 sec: 37683.3, 60 sec: 44236.9, 300 sec: 44708.9). Total num frames: 570785792. Throughput: 0: 44977.0. Samples: 169486700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-10 13:03:23,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:03:25,103][35978] Updated weights for policy 0, policy_version 34845 (0.0024) [2024-06-10 13:03:27,418][35978] Updated weights for policy 0, policy_version 34855 (0.0038) [2024-06-10 13:03:28,403][35745] Fps is (10 sec: 45870.1, 60 sec: 45055.3, 300 sec: 44819.8). Total num frames: 571064320. Throughput: 0: 44958.2. Samples: 169620940. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-10 13:03:28,403][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:03:32,531][35978] Updated weights for policy 0, policy_version 34865 (0.0032) [2024-06-10 13:03:33,402][35745] Fps is (10 sec: 49150.9, 60 sec: 45055.8, 300 sec: 44709.2). 
Total num frames: 571277312. Throughput: 0: 45001.7. Samples: 169893800. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-10 13:03:33,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 13:03:33,539][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000034870_571310080.pth... [2024-06-10 13:03:33,615][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000034211_560513024.pth [2024-06-10 13:03:35,009][35978] Updated weights for policy 0, policy_version 34875 (0.0033) [2024-06-10 13:03:38,401][35745] Fps is (10 sec: 42603.3, 60 sec: 45056.1, 300 sec: 44820.9). Total num frames: 571490304. Throughput: 0: 45106.2. Samples: 170162540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-10 13:03:38,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:03:39,908][35978] Updated weights for policy 0, policy_version 34885 (0.0031) [2024-06-10 13:03:40,783][35957] Signal inference workers to stop experience collection... (2600 times) [2024-06-10 13:03:40,784][35957] Signal inference workers to resume experience collection... (2600 times) [2024-06-10 13:03:40,805][35978] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-06-10 13:03:40,805][35978] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-06-10 13:03:42,834][35978] Updated weights for policy 0, policy_version 34895 (0.0030) [2024-06-10 13:03:43,402][35745] Fps is (10 sec: 44237.5, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 571719680. Throughput: 0: 44799.2. Samples: 170288980. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-10 13:03:43,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 13:03:47,030][35978] Updated weights for policy 0, policy_version 34905 (0.0037) [2024-06-10 13:03:48,402][35745] Fps is (10 sec: 47513.3, 60 sec: 45330.8, 300 sec: 44820.0). Total num frames: 571965440. Throughput: 0: 44911.2. Samples: 170564220. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-10 13:03:48,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:03:49,997][35978] Updated weights for policy 0, policy_version 34915 (0.0038) [2024-06-10 13:03:53,402][35745] Fps is (10 sec: 45875.6, 60 sec: 45329.0, 300 sec: 44931.0). Total num frames: 572178432. Throughput: 0: 44933.8. Samples: 170831660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-10 13:03:53,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:03:54,334][35978] Updated weights for policy 0, policy_version 34925 (0.0042) [2024-06-10 13:03:57,082][35978] Updated weights for policy 0, policy_version 34935 (0.0028) [2024-06-10 13:03:58,404][35745] Fps is (10 sec: 42588.4, 60 sec: 44236.8, 300 sec: 44764.1). Total num frames: 572391424. Throughput: 0: 44748.4. Samples: 170956160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-10 13:03:58,405][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:04:01,698][35978] Updated weights for policy 0, policy_version 34945 (0.0031) [2024-06-10 13:04:03,402][35745] Fps is (10 sec: 47513.0, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 572653568. Throughput: 0: 44740.8. Samples: 171230520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-10 13:04:03,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:04:04,893][35978] Updated weights for policy 0, policy_version 34955 (0.0025) [2024-06-10 13:04:08,402][35745] Fps is (10 sec: 44247.3, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 572833792. Throughput: 0: 44646.2. Samples: 171495780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-10 13:04:08,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 13:04:09,182][35978] Updated weights for policy 0, policy_version 34965 (0.0037) [2024-06-10 13:04:12,553][35978] Updated weights for policy 0, policy_version 34975 (0.0031) [2024-06-10 13:04:13,402][35745] Fps is (10 sec: 39322.1, 60 sec: 43963.8, 300 sec: 44708.9). 
Total num frames: 573046784. Throughput: 0: 44538.0. Samples: 171625100. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-10 13:04:13,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:04:16,154][35978] Updated weights for policy 0, policy_version 34985 (0.0037) [2024-06-10 13:04:18,404][35745] Fps is (10 sec: 49140.3, 60 sec: 45327.3, 300 sec: 44930.7). Total num frames: 573325312. Throughput: 0: 44534.3. Samples: 171897940. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-10 13:04:18,405][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:04:19,506][35978] Updated weights for policy 0, policy_version 34995 (0.0023) [2024-06-10 13:04:23,265][35978] Updated weights for policy 0, policy_version 35005 (0.0046) [2024-06-10 13:04:23,402][35745] Fps is (10 sec: 47513.1, 60 sec: 45602.1, 300 sec: 44875.5). Total num frames: 573521920. Throughput: 0: 44617.6. Samples: 172170340. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-10 13:04:23,402][35745] Avg episode reward: [(0, '0.314')] [2024-06-10 13:04:23,409][35957] Saving new best policy, reward=0.314! [2024-06-10 13:04:26,459][35978] Updated weights for policy 0, policy_version 35015 (0.0035) [2024-06-10 13:04:28,402][35745] Fps is (10 sec: 39330.4, 60 sec: 44237.5, 300 sec: 44819.9). Total num frames: 573718528. Throughput: 0: 44763.0. Samples: 172303320. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-10 13:04:28,408][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:04:30,758][35978] Updated weights for policy 0, policy_version 35025 (0.0033) [2024-06-10 13:04:33,402][35745] Fps is (10 sec: 47513.4, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 573997056. Throughput: 0: 44777.2. Samples: 172579200. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-10 13:04:33,404][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:04:34,051][35978] Updated weights for policy 0, policy_version 35035 (0.0039) [2024-06-10 13:04:37,951][35978] Updated weights for policy 0, policy_version 35045 (0.0031) [2024-06-10 13:04:38,402][35745] Fps is (10 sec: 47514.0, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 574193664. Throughput: 0: 44875.9. Samples: 172851080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 13:04:38,402][35745] Avg episode reward: [(0, '0.298')] [2024-06-10 13:04:41,631][35978] Updated weights for policy 0, policy_version 35055 (0.0023) [2024-06-10 13:04:43,402][35745] Fps is (10 sec: 39322.1, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 574390272. Throughput: 0: 45009.0. Samples: 172981460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 13:04:43,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:04:45,122][35978] Updated weights for policy 0, policy_version 35065 (0.0034) [2024-06-10 13:04:48,402][35745] Fps is (10 sec: 44236.4, 60 sec: 44509.8, 300 sec: 44819.9). Total num frames: 574636032. Throughput: 0: 44888.0. Samples: 173250480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 13:04:48,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:04:48,663][35978] Updated weights for policy 0, policy_version 35075 (0.0047) [2024-06-10 13:04:50,285][35957] Signal inference workers to stop experience collection... (2650 times) [2024-06-10 13:04:50,289][35957] Signal inference workers to resume experience collection... 
(2650 times) [2024-06-10 13:04:50,319][35978] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-06-10 13:04:50,320][35978] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-06-10 13:04:52,266][35978] Updated weights for policy 0, policy_version 35085 (0.0035) [2024-06-10 13:04:53,402][35745] Fps is (10 sec: 47513.5, 60 sec: 44782.9, 300 sec: 44876.5). Total num frames: 574865408. Throughput: 0: 45125.3. Samples: 173526420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 13:04:53,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:04:55,611][35978] Updated weights for policy 0, policy_version 35095 (0.0035) [2024-06-10 13:04:58,402][35745] Fps is (10 sec: 44237.3, 60 sec: 44784.7, 300 sec: 44875.5). Total num frames: 575078400. Throughput: 0: 45160.8. Samples: 173657340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 13:04:58,410][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:04:59,713][35978] Updated weights for policy 0, policy_version 35105 (0.0026) [2024-06-10 13:05:03,234][35978] Updated weights for policy 0, policy_version 35115 (0.0030) [2024-06-10 13:05:03,402][35745] Fps is (10 sec: 45874.9, 60 sec: 44509.9, 300 sec: 44819.9). Total num frames: 575324160. Throughput: 0: 45132.9. Samples: 173928820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 13:05:03,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:05:07,116][35978] Updated weights for policy 0, policy_version 35125 (0.0041) [2024-06-10 13:05:08,401][35745] Fps is (10 sec: 47513.9, 60 sec: 45329.1, 300 sec: 44931.1). Total num frames: 575553536. Throughput: 0: 44884.6. Samples: 174190140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 13:05:08,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:05:11,241][35978] Updated weights for policy 0, policy_version 35135 (0.0035) [2024-06-10 13:05:13,402][35745] Fps is (10 sec: 42598.9, 60 sec: 45056.0, 300 sec: 44820.0). 
Total num frames: 575750144. Throughput: 0: 45017.5. Samples: 174329100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 13:05:13,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:05:14,209][35978] Updated weights for policy 0, policy_version 35145 (0.0028) [2024-06-10 13:05:18,128][35978] Updated weights for policy 0, policy_version 35155 (0.0046) [2024-06-10 13:05:18,402][35745] Fps is (10 sec: 42598.4, 60 sec: 44238.6, 300 sec: 44820.0). Total num frames: 575979520. Throughput: 0: 44956.2. Samples: 174602220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 13:05:18,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:05:21,408][35978] Updated weights for policy 0, policy_version 35165 (0.0050) [2024-06-10 13:05:23,402][35745] Fps is (10 sec: 47513.1, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 576225280. Throughput: 0: 44947.9. Samples: 174873740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-10 13:05:23,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:05:25,064][35978] Updated weights for policy 0, policy_version 35175 (0.0032) [2024-06-10 13:05:28,401][35745] Fps is (10 sec: 45875.2, 60 sec: 45329.2, 300 sec: 44875.9). Total num frames: 576438272. Throughput: 0: 45067.6. Samples: 175009500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-10 13:05:28,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:05:28,865][35978] Updated weights for policy 0, policy_version 35185 (0.0027) [2024-06-10 13:05:32,531][35978] Updated weights for policy 0, policy_version 35195 (0.0034) [2024-06-10 13:05:33,402][35745] Fps is (10 sec: 40959.8, 60 sec: 43963.7, 300 sec: 44819.9). Total num frames: 576634880. Throughput: 0: 44924.0. Samples: 175272060. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-10 13:05:33,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:05:33,459][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000035196_576651264.pth... [2024-06-10 13:05:33,521][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000034538_565870592.pth [2024-06-10 13:05:36,149][35978] Updated weights for policy 0, policy_version 35205 (0.0041) [2024-06-10 13:05:38,402][35745] Fps is (10 sec: 45874.7, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 576897024. Throughput: 0: 44812.9. Samples: 175543000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-10 13:05:38,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:05:40,176][35978] Updated weights for policy 0, policy_version 35215 (0.0029) [2024-06-10 13:05:43,402][35745] Fps is (10 sec: 47514.2, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 577110016. Throughput: 0: 45002.7. Samples: 175682460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 13:05:43,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:05:43,706][35978] Updated weights for policy 0, policy_version 35225 (0.0031) [2024-06-10 13:05:47,230][35978] Updated weights for policy 0, policy_version 35235 (0.0036) [2024-06-10 13:05:48,405][35745] Fps is (10 sec: 42582.3, 60 sec: 44780.2, 300 sec: 44819.4). Total num frames: 577323008. Throughput: 0: 44791.8. Samples: 175944620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 13:05:48,406][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:05:50,708][35978] Updated weights for policy 0, policy_version 35245 (0.0031) [2024-06-10 13:05:53,404][35745] Fps is (10 sec: 45864.5, 60 sec: 45054.3, 300 sec: 44819.6). Total num frames: 577568768. Throughput: 0: 45106.0. Samples: 176220020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 13:05:53,404][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:05:54,134][35978] Updated weights for policy 0, policy_version 35255 (0.0027) [2024-06-10 13:05:57,900][35978] Updated weights for policy 0, policy_version 35265 (0.0027) [2024-06-10 13:05:58,402][35745] Fps is (10 sec: 45892.7, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 577781760. Throughput: 0: 44985.3. Samples: 176353440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 13:05:58,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:06:01,642][35978] Updated weights for policy 0, policy_version 35275 (0.0030) [2024-06-10 13:06:03,402][35745] Fps is (10 sec: 42608.6, 60 sec: 44510.0, 300 sec: 44820.0). Total num frames: 577994752. Throughput: 0: 44812.0. Samples: 176618760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-10 13:06:03,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:06:05,138][35978] Updated weights for policy 0, policy_version 35285 (0.0030) [2024-06-10 13:06:08,402][35745] Fps is (10 sec: 45874.6, 60 sec: 44782.8, 300 sec: 44820.0). Total num frames: 578240512. Throughput: 0: 44827.9. Samples: 176891000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-10 13:06:08,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:06:09,103][35978] Updated weights for policy 0, policy_version 35295 (0.0034) [2024-06-10 13:06:12,494][35978] Updated weights for policy 0, policy_version 35305 (0.0040) [2024-06-10 13:06:13,402][35745] Fps is (10 sec: 45874.8, 60 sec: 45055.9, 300 sec: 44931.0). Total num frames: 578453504. Throughput: 0: 44890.6. Samples: 177029580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-10 13:06:13,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 13:06:13,920][35957] Signal inference workers to stop experience collection... 
(2700 times) [2024-06-10 13:06:13,922][35957] Signal inference workers to resume experience collection... (2700 times) [2024-06-10 13:06:13,933][35978] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-06-10 13:06:13,942][35978] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-06-10 13:06:16,391][35978] Updated weights for policy 0, policy_version 35315 (0.0026) [2024-06-10 13:06:18,402][35745] Fps is (10 sec: 42599.1, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 578666496. Throughput: 0: 44929.1. Samples: 177293860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-10 13:06:18,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 13:06:19,554][35978] Updated weights for policy 0, policy_version 35325 (0.0025) [2024-06-10 13:06:23,402][35745] Fps is (10 sec: 44237.0, 60 sec: 44509.9, 300 sec: 44764.5). Total num frames: 578895872. Throughput: 0: 44957.4. Samples: 177566080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-10 13:06:23,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:06:23,577][35978] Updated weights for policy 0, policy_version 35335 (0.0042) [2024-06-10 13:06:26,902][35978] Updated weights for policy 0, policy_version 35345 (0.0024) [2024-06-10 13:06:28,402][35745] Fps is (10 sec: 47513.2, 60 sec: 45055.9, 300 sec: 44986.6). Total num frames: 579141632. Throughput: 0: 45037.7. Samples: 177709160. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-10 13:06:28,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 13:06:30,904][35978] Updated weights for policy 0, policy_version 35355 (0.0035) [2024-06-10 13:06:33,402][35745] Fps is (10 sec: 45872.5, 60 sec: 45328.7, 300 sec: 44819.9). Total num frames: 579354624. Throughput: 0: 45057.0. Samples: 177972040. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-10 13:06:33,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:06:34,083][35978] Updated weights for policy 0, policy_version 35365 (0.0035) [2024-06-10 13:06:38,303][35978] Updated weights for policy 0, policy_version 35375 (0.0045) [2024-06-10 13:06:38,402][35745] Fps is (10 sec: 44236.6, 60 sec: 44782.9, 300 sec: 44820.0). Total num frames: 579584000. Throughput: 0: 44896.0. Samples: 178240240. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-10 13:06:38,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:06:41,659][35978] Updated weights for policy 0, policy_version 35385 (0.0025) [2024-06-10 13:06:43,402][35745] Fps is (10 sec: 47516.0, 60 sec: 45329.0, 300 sec: 45098.6). Total num frames: 579829760. Throughput: 0: 44781.7. Samples: 178368620. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-10 13:06:43,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:06:45,651][35978] Updated weights for policy 0, policy_version 35395 (0.0034) [2024-06-10 13:06:48,402][35745] Fps is (10 sec: 45875.1, 60 sec: 45331.9, 300 sec: 44875.8). Total num frames: 580042752. Throughput: 0: 45124.7. Samples: 178649380. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-10 13:06:48,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:06:48,626][35978] Updated weights for policy 0, policy_version 35405 (0.0027) [2024-06-10 13:06:52,872][35978] Updated weights for policy 0, policy_version 35415 (0.0037) [2024-06-10 13:06:53,402][35745] Fps is (10 sec: 42598.7, 60 sec: 44784.7, 300 sec: 44820.0). Total num frames: 580255744. Throughput: 0: 44893.9. Samples: 178911220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-10 13:06:53,402][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 13:06:56,305][35978] Updated weights for policy 0, policy_version 35425 (0.0035) [2024-06-10 13:06:58,404][35745] Fps is (10 sec: 44227.0, 60 sec: 45054.3, 300 sec: 44986.2). 
Total num frames: 580485120. Throughput: 0: 44802.2. Samples: 179045780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-10 13:06:58,405][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:07:00,074][35978] Updated weights for policy 0, policy_version 35435 (0.0020) [2024-06-10 13:07:03,356][35978] Updated weights for policy 0, policy_version 35445 (0.0026) [2024-06-10 13:07:03,402][35745] Fps is (10 sec: 47513.3, 60 sec: 45602.1, 300 sec: 44986.6). Total num frames: 580730880. Throughput: 0: 44927.9. Samples: 179315620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-10 13:07:03,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:07:07,599][35978] Updated weights for policy 0, policy_version 35455 (0.0038) [2024-06-10 13:07:08,408][35745] Fps is (10 sec: 42581.3, 60 sec: 44505.3, 300 sec: 44874.5). Total num frames: 580911104. Throughput: 0: 44772.3. Samples: 179581120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-10 13:07:08,409][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:07:10,644][35978] Updated weights for policy 0, policy_version 35465 (0.0040) [2024-06-10 13:07:13,402][35745] Fps is (10 sec: 40959.9, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 581140480. Throughput: 0: 44456.4. Samples: 179709700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-10 13:07:13,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 13:07:14,751][35978] Updated weights for policy 0, policy_version 35475 (0.0030) [2024-06-10 13:07:17,785][35978] Updated weights for policy 0, policy_version 35485 (0.0038) [2024-06-10 13:07:18,404][35745] Fps is (10 sec: 47534.4, 60 sec: 45327.5, 300 sec: 44930.7). Total num frames: 581386240. Throughput: 0: 44986.6. Samples: 179996500. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 13:07:18,404][35745] Avg episode reward: [(0, '0.302')] [2024-06-10 13:07:21,992][35978] Updated weights for policy 0, policy_version 35495 (0.0028) [2024-06-10 13:07:23,402][35745] Fps is (10 sec: 45875.3, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 581599232. Throughput: 0: 44992.9. Samples: 180264920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 13:07:23,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:07:25,251][35978] Updated weights for policy 0, policy_version 35505 (0.0031) [2024-06-10 13:07:28,402][35745] Fps is (10 sec: 40967.5, 60 sec: 44236.7, 300 sec: 44819.9). Total num frames: 581795840. Throughput: 0: 45011.5. Samples: 180394140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 13:07:28,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 13:07:29,296][35978] Updated weights for policy 0, policy_version 35515 (0.0029) [2024-06-10 13:07:32,515][35978] Updated weights for policy 0, policy_version 35525 (0.0042) [2024-06-10 13:07:33,402][35745] Fps is (10 sec: 45875.7, 60 sec: 45056.5, 300 sec: 44986.6). Total num frames: 582057984. Throughput: 0: 44704.6. Samples: 180661080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-10 13:07:33,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:07:33,413][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000035526_582057984.pth... [2024-06-10 13:07:33,468][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000034870_571310080.pth [2024-06-10 13:07:36,663][35957] Signal inference workers to stop experience collection... (2750 times) [2024-06-10 13:07:36,716][35978] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-06-10 13:07:36,718][35957] Signal inference workers to resume experience collection... 
(2750 times) [2024-06-10 13:07:36,733][35978] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-06-10 13:07:36,858][35978] Updated weights for policy 0, policy_version 35535 (0.0027) [2024-06-10 13:07:38,403][35745] Fps is (10 sec: 45868.7, 60 sec: 44508.8, 300 sec: 44764.2). Total num frames: 582254592. Throughput: 0: 44933.6. Samples: 180933300. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-10 13:07:38,404][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 13:07:39,685][35978] Updated weights for policy 0, policy_version 35545 (0.0030) [2024-06-10 13:07:43,402][35745] Fps is (10 sec: 42597.6, 60 sec: 44236.7, 300 sec: 44875.8). Total num frames: 582483968. Throughput: 0: 44860.4. Samples: 181064400. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-10 13:07:43,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 13:07:44,070][35978] Updated weights for policy 0, policy_version 35555 (0.0043) [2024-06-10 13:07:47,315][35978] Updated weights for policy 0, policy_version 35565 (0.0028) [2024-06-10 13:07:48,402][35745] Fps is (10 sec: 45881.8, 60 sec: 44509.9, 300 sec: 44931.0). Total num frames: 582713344. Throughput: 0: 44846.2. Samples: 181333700. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-10 13:07:48,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:07:51,204][35978] Updated weights for policy 0, policy_version 35575 (0.0034) [2024-06-10 13:07:53,402][35745] Fps is (10 sec: 45875.6, 60 sec: 44782.9, 300 sec: 44764.8). Total num frames: 582942720. Throughput: 0: 45000.1. Samples: 181605840. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-10 13:07:53,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 13:07:54,720][35978] Updated weights for policy 0, policy_version 35585 (0.0031) [2024-06-10 13:07:58,402][35745] Fps is (10 sec: 44237.3, 60 sec: 44511.6, 300 sec: 44820.0). Total num frames: 583155712. Throughput: 0: 45021.0. Samples: 181735640. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-10 13:07:58,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:07:58,674][35978] Updated weights for policy 0, policy_version 35595 (0.0035) [2024-06-10 13:08:02,155][35978] Updated weights for policy 0, policy_version 35605 (0.0035) [2024-06-10 13:08:03,401][35745] Fps is (10 sec: 45875.7, 60 sec: 44509.9, 300 sec: 44986.6). Total num frames: 583401472. Throughput: 0: 44527.3. Samples: 182000140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 13:08:03,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:08:06,212][35978] Updated weights for policy 0, policy_version 35615 (0.0028) [2024-06-10 13:08:08,401][35745] Fps is (10 sec: 47513.9, 60 sec: 45333.9, 300 sec: 44820.0). Total num frames: 583630848. Throughput: 0: 44660.6. Samples: 182274640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 13:08:08,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:08:09,408][35978] Updated weights for policy 0, policy_version 35625 (0.0039) [2024-06-10 13:08:13,335][35978] Updated weights for policy 0, policy_version 35635 (0.0025) [2024-06-10 13:08:13,402][35745] Fps is (10 sec: 44236.0, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 583843840. Throughput: 0: 44834.7. Samples: 182411700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 13:08:13,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:08:16,466][35978] Updated weights for policy 0, policy_version 35645 (0.0034) [2024-06-10 13:08:18,403][35745] Fps is (10 sec: 42593.6, 60 sec: 44510.5, 300 sec: 44986.4). Total num frames: 584056832. Throughput: 0: 44850.0. Samples: 182679380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-10 13:08:18,403][35745] Avg episode reward: [(0, '0.315')] [2024-06-10 13:08:18,454][35957] Saving new best policy, reward=0.315! 
[2024-06-10 13:08:20,451][35978] Updated weights for policy 0, policy_version 35655 (0.0032) [2024-06-10 13:08:23,402][35745] Fps is (10 sec: 45875.8, 60 sec: 45056.1, 300 sec: 44875.7). Total num frames: 584302592. Throughput: 0: 44752.6. Samples: 182947100. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-10 13:08:23,402][35745] Avg episode reward: [(0, '0.314')] [2024-06-10 13:08:23,803][35978] Updated weights for policy 0, policy_version 35665 (0.0032) [2024-06-10 13:08:27,925][35978] Updated weights for policy 0, policy_version 35675 (0.0022) [2024-06-10 13:08:28,401][35745] Fps is (10 sec: 44242.0, 60 sec: 45056.2, 300 sec: 44820.0). Total num frames: 584499200. Throughput: 0: 44823.3. Samples: 183081440. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-10 13:08:28,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 13:08:31,420][35978] Updated weights for policy 0, policy_version 35685 (0.0031) [2024-06-10 13:08:33,408][35745] Fps is (10 sec: 42571.2, 60 sec: 44505.1, 300 sec: 44874.5). Total num frames: 584728576. Throughput: 0: 44852.0. Samples: 183352320. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-10 13:08:33,409][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:08:35,313][35978] Updated weights for policy 0, policy_version 35695 (0.0025) [2024-06-10 13:08:38,402][35745] Fps is (10 sec: 47513.0, 60 sec: 45330.2, 300 sec: 44931.0). Total num frames: 584974336. Throughput: 0: 44706.2. Samples: 183617620. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-10 13:08:38,402][35745] Avg episode reward: [(0, '0.299')] [2024-06-10 13:08:38,427][35978] Updated weights for policy 0, policy_version 35705 (0.0024) [2024-06-10 13:08:42,377][35978] Updated weights for policy 0, policy_version 35715 (0.0024) [2024-06-10 13:08:43,402][35745] Fps is (10 sec: 45904.4, 60 sec: 45056.1, 300 sec: 44820.0). Total num frames: 585187328. Throughput: 0: 44964.9. Samples: 183759060. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-10 13:08:43,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:08:45,413][35978] Updated weights for policy 0, policy_version 35725 (0.0033) [2024-06-10 13:08:48,402][35745] Fps is (10 sec: 42598.3, 60 sec: 44783.0, 300 sec: 44819.9). Total num frames: 585400320. Throughput: 0: 45221.7. Samples: 184035120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 13:08:48,411][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:08:49,655][35978] Updated weights for policy 0, policy_version 35735 (0.0046) [2024-06-10 13:08:51,852][35957] Signal inference workers to stop experience collection... (2800 times) [2024-06-10 13:08:51,853][35957] Signal inference workers to resume experience collection... (2800 times) [2024-06-10 13:08:51,879][35978] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-06-10 13:08:51,879][35978] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-06-10 13:08:52,845][35978] Updated weights for policy 0, policy_version 35745 (0.0025) [2024-06-10 13:08:53,402][35745] Fps is (10 sec: 47513.4, 60 sec: 45329.1, 300 sec: 44986.9). Total num frames: 585662464. Throughput: 0: 44973.2. Samples: 184298440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 13:08:53,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:08:57,238][35978] Updated weights for policy 0, policy_version 35755 (0.0038) [2024-06-10 13:08:58,402][35745] Fps is (10 sec: 47513.8, 60 sec: 45329.1, 300 sec: 44820.0). Total num frames: 585875456. Throughput: 0: 45067.7. Samples: 184439740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 13:08:58,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:09:00,207][35978] Updated weights for policy 0, policy_version 35765 (0.0032) [2024-06-10 13:09:03,402][35745] Fps is (10 sec: 42598.7, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 586088448. Throughput: 0: 44892.6. 
Samples: 184699500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 13:09:03,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:09:04,396][35978] Updated weights for policy 0, policy_version 35775 (0.0039) [2024-06-10 13:09:07,282][35978] Updated weights for policy 0, policy_version 35785 (0.0028) [2024-06-10 13:09:08,402][35745] Fps is (10 sec: 42598.4, 60 sec: 44509.8, 300 sec: 44931.0). Total num frames: 586301440. Throughput: 0: 45040.4. Samples: 184973920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-10 13:09:08,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:09:11,469][35978] Updated weights for policy 0, policy_version 35795 (0.0042) [2024-06-10 13:09:13,402][35745] Fps is (10 sec: 47513.6, 60 sec: 45329.2, 300 sec: 44875.9). Total num frames: 586563584. Throughput: 0: 45087.0. Samples: 185110360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 13:09:13,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:09:14,602][35978] Updated weights for policy 0, policy_version 35805 (0.0028) [2024-06-10 13:09:18,402][35745] Fps is (10 sec: 47513.9, 60 sec: 45329.9, 300 sec: 44931.1). Total num frames: 586776576. Throughput: 0: 45197.1. Samples: 185385900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 13:09:18,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 13:09:18,622][35978] Updated weights for policy 0, policy_version 35815 (0.0027) [2024-06-10 13:09:22,103][35978] Updated weights for policy 0, policy_version 35825 (0.0034) [2024-06-10 13:09:23,404][35745] Fps is (10 sec: 44226.2, 60 sec: 45054.2, 300 sec: 45041.8). Total num frames: 587005952. Throughput: 0: 45207.0. Samples: 185652040. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 13:09:23,405][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:09:26,064][35978] Updated weights for policy 0, policy_version 35835 (0.0029) [2024-06-10 13:09:28,402][35745] Fps is (10 sec: 44236.7, 60 sec: 45329.0, 300 sec: 44820.0). Total num frames: 587218944. Throughput: 0: 45016.0. Samples: 185784780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-10 13:09:28,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 13:09:29,480][35978] Updated weights for policy 0, policy_version 35845 (0.0046) [2024-06-10 13:09:33,175][35978] Updated weights for policy 0, policy_version 35855 (0.0041) [2024-06-10 13:09:33,402][35745] Fps is (10 sec: 44247.1, 60 sec: 45333.9, 300 sec: 44931.0). Total num frames: 587448320. Throughput: 0: 44951.6. Samples: 186057940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 13:09:33,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:09:33,497][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000035856_587464704.pth... [2024-06-10 13:09:33,551][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000035196_576651264.pth [2024-06-10 13:09:36,560][35978] Updated weights for policy 0, policy_version 35865 (0.0028) [2024-06-10 13:09:38,402][35745] Fps is (10 sec: 44236.6, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 587661312. Throughput: 0: 45003.6. Samples: 186323600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 13:09:38,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:09:40,311][35978] Updated weights for policy 0, policy_version 35875 (0.0035) [2024-06-10 13:09:43,402][35745] Fps is (10 sec: 42598.8, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 587874304. Throughput: 0: 44814.7. Samples: 186456400. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 13:09:43,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:09:44,213][35978] Updated weights for policy 0, policy_version 35885 (0.0023) [2024-06-10 13:09:47,744][35978] Updated weights for policy 0, policy_version 35895 (0.0041) [2024-06-10 13:09:48,402][35745] Fps is (10 sec: 44236.6, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 588103680. Throughput: 0: 45041.7. Samples: 186726380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 13:09:48,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:09:51,770][35978] Updated weights for policy 0, policy_version 35905 (0.0047) [2024-06-10 13:09:53,402][35745] Fps is (10 sec: 45872.6, 60 sec: 44509.5, 300 sec: 44931.0). Total num frames: 588333056. Throughput: 0: 44869.3. Samples: 186993060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-10 13:09:53,403][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:09:55,384][35978] Updated weights for policy 0, policy_version 35915 (0.0035) [2024-06-10 13:09:58,402][35745] Fps is (10 sec: 42598.9, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 588529664. Throughput: 0: 44686.7. Samples: 187121260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 13:09:58,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:09:58,931][35978] Updated weights for policy 0, policy_version 35925 (0.0031) [2024-06-10 13:10:00,829][35957] Signal inference workers to stop experience collection... (2850 times) [2024-06-10 13:10:00,830][35957] Signal inference workers to resume experience collection... 
(2850 times) [2024-06-10 13:10:00,878][35978] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-06-10 13:10:00,879][35978] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-06-10 13:10:02,628][35978] Updated weights for policy 0, policy_version 35935 (0.0031) [2024-06-10 13:10:03,402][35745] Fps is (10 sec: 45877.4, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 588791808. Throughput: 0: 44606.1. Samples: 187393180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 13:10:03,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:10:06,678][35978] Updated weights for policy 0, policy_version 35945 (0.0030) [2024-06-10 13:10:08,402][35745] Fps is (10 sec: 45874.8, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 588988416. Throughput: 0: 44788.1. Samples: 187667400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 13:10:08,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:10:09,734][35978] Updated weights for policy 0, policy_version 35955 (0.0025) [2024-06-10 13:10:13,402][35745] Fps is (10 sec: 42598.2, 60 sec: 44236.7, 300 sec: 44875.5). Total num frames: 589217792. Throughput: 0: 44688.8. Samples: 187795780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 13:10:13,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:10:13,938][35978] Updated weights for policy 0, policy_version 35965 (0.0037) [2024-06-10 13:10:17,315][35978] Updated weights for policy 0, policy_version 35975 (0.0030) [2024-06-10 13:10:18,401][35745] Fps is (10 sec: 49152.5, 60 sec: 45056.0, 300 sec: 44931.1). Total num frames: 589479936. Throughput: 0: 44706.3. Samples: 188069720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-10 13:10:18,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:10:20,910][35978] Updated weights for policy 0, policy_version 35985 (0.0039) [2024-06-10 13:10:23,402][35745] Fps is (10 sec: 45874.8, 60 sec: 44511.5, 300 sec: 44875.5). 
Total num frames: 589676544. Throughput: 0: 44755.9. Samples: 188337620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-10 13:10:23,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:10:24,333][35978] Updated weights for policy 0, policy_version 35995 (0.0037) [2024-06-10 13:10:28,247][35978] Updated weights for policy 0, policy_version 36005 (0.0029) [2024-06-10 13:10:28,402][35745] Fps is (10 sec: 42598.1, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 589905920. Throughput: 0: 44711.5. Samples: 188468420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-10 13:10:28,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:10:31,599][35978] Updated weights for policy 0, policy_version 36015 (0.0028) [2024-06-10 13:10:33,402][35745] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 590135296. Throughput: 0: 44646.1. Samples: 188735460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-10 13:10:33,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:10:35,775][35978] Updated weights for policy 0, policy_version 36025 (0.0021) [2024-06-10 13:10:38,402][35745] Fps is (10 sec: 45875.1, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 590364672. Throughput: 0: 44942.7. Samples: 189015460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-10 13:10:38,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:10:38,844][35978] Updated weights for policy 0, policy_version 36035 (0.0038) [2024-06-10 13:10:42,882][35978] Updated weights for policy 0, policy_version 36045 (0.0028) [2024-06-10 13:10:43,402][35745] Fps is (10 sec: 44237.4, 60 sec: 45056.0, 300 sec: 44931.6). Total num frames: 590577664. Throughput: 0: 45000.4. Samples: 189146280. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 13:10:43,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:10:46,222][35978] Updated weights for policy 0, policy_version 36055 (0.0042) [2024-06-10 13:10:48,402][35745] Fps is (10 sec: 44236.4, 60 sec: 45055.9, 300 sec: 44875.8). Total num frames: 590807040. Throughput: 0: 44874.1. Samples: 189412520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 13:10:48,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:10:50,121][35978] Updated weights for policy 0, policy_version 36065 (0.0035) [2024-06-10 13:10:53,402][35745] Fps is (10 sec: 45875.3, 60 sec: 45056.4, 300 sec: 44931.0). Total num frames: 591036416. Throughput: 0: 44840.9. Samples: 189685240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 13:10:53,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:10:53,436][35978] Updated weights for policy 0, policy_version 36075 (0.0048) [2024-06-10 13:10:57,456][35978] Updated weights for policy 0, policy_version 36085 (0.0038) [2024-06-10 13:10:58,402][35745] Fps is (10 sec: 42598.6, 60 sec: 45055.9, 300 sec: 44875.5). Total num frames: 591233024. Throughput: 0: 44922.2. Samples: 189817280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 13:10:58,402][35745] Avg episode reward: [(0, '0.314')] [2024-06-10 13:11:00,572][35978] Updated weights for policy 0, policy_version 36095 (0.0022) [2024-06-10 13:11:03,401][35745] Fps is (10 sec: 40960.3, 60 sec: 44236.9, 300 sec: 44764.5). Total num frames: 591446016. Throughput: 0: 44782.7. Samples: 190084940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 13:11:03,402][35745] Avg episode reward: [(0, '0.313')] [2024-06-10 13:11:05,059][35978] Updated weights for policy 0, policy_version 36105 (0.0040) [2024-06-10 13:11:07,922][35978] Updated weights for policy 0, policy_version 36115 (0.0034) [2024-06-10 13:11:08,402][35745] Fps is (10 sec: 49152.3, 60 sec: 45602.1, 300 sec: 44986.6). 
Total num frames: 591724544. Throughput: 0: 44670.8. Samples: 190347800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-10 13:11:08,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:11:12,398][35978] Updated weights for policy 0, policy_version 36125 (0.0035) [2024-06-10 13:11:13,402][35745] Fps is (10 sec: 47513.0, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 591921152. Throughput: 0: 44925.3. Samples: 190490060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-10 13:11:13,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:11:15,666][35978] Updated weights for policy 0, policy_version 36135 (0.0031) [2024-06-10 13:11:18,401][35745] Fps is (10 sec: 39322.0, 60 sec: 43963.7, 300 sec: 44820.0). Total num frames: 592117760. Throughput: 0: 44582.0. Samples: 190741640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-10 13:11:18,402][35745] Avg episode reward: [(0, '0.314')] [2024-06-10 13:11:19,015][35957] Signal inference workers to stop experience collection... (2900 times) [2024-06-10 13:11:19,064][35957] Signal inference workers to resume experience collection... (2900 times) [2024-06-10 13:11:19,065][35978] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-06-10 13:11:19,087][35978] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-06-10 13:11:19,850][35978] Updated weights for policy 0, policy_version 36145 (0.0039) [2024-06-10 13:11:22,838][35978] Updated weights for policy 0, policy_version 36155 (0.0028) [2024-06-10 13:11:23,402][35745] Fps is (10 sec: 45875.2, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 592379904. Throughput: 0: 44352.0. Samples: 191011300. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-10 13:11:23,404][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:11:26,987][35978] Updated weights for policy 0, policy_version 36165 (0.0028) [2024-06-10 13:11:28,402][35745] Fps is (10 sec: 45874.8, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 592576512. Throughput: 0: 44679.1. Samples: 191156840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-10 13:11:28,406][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 13:11:29,947][35978] Updated weights for policy 0, policy_version 36175 (0.0025) [2024-06-10 13:11:33,402][35745] Fps is (10 sec: 40960.3, 60 sec: 44236.9, 300 sec: 44764.4). Total num frames: 592789504. Throughput: 0: 44705.1. Samples: 191424240. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-10 13:11:33,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:11:33,450][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000036182_592805888.pth... [2024-06-10 13:11:33,519][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000035526_582057984.pth [2024-06-10 13:11:34,560][35978] Updated weights for policy 0, policy_version 36185 (0.0023) [2024-06-10 13:11:37,400][35978] Updated weights for policy 0, policy_version 36195 (0.0030) [2024-06-10 13:11:38,402][35745] Fps is (10 sec: 49152.3, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 593068032. Throughput: 0: 44410.7. Samples: 191683720. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-10 13:11:38,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:11:41,620][35978] Updated weights for policy 0, policy_version 36205 (0.0035) [2024-06-10 13:11:43,402][35745] Fps is (10 sec: 44236.4, 60 sec: 44236.8, 300 sec: 44708.9). Total num frames: 593231872. Throughput: 0: 44654.7. Samples: 191826740. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-10 13:11:43,406][35745] Avg episode reward: [(0, '0.313')] [2024-06-10 13:11:44,940][35978] Updated weights for policy 0, policy_version 36215 (0.0027) [2024-06-10 13:11:48,402][35745] Fps is (10 sec: 39321.3, 60 sec: 44236.9, 300 sec: 44764.4). Total num frames: 593461248. Throughput: 0: 44733.7. Samples: 192097960. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-10 13:11:48,402][35745] Avg episode reward: [(0, '0.297')] [2024-06-10 13:11:48,993][35978] Updated weights for policy 0, policy_version 36225 (0.0038) [2024-06-10 13:11:52,108][35978] Updated weights for policy 0, policy_version 36235 (0.0044) [2024-06-10 13:11:53,402][35745] Fps is (10 sec: 49152.3, 60 sec: 44782.9, 300 sec: 44875.9). Total num frames: 593723392. Throughput: 0: 44820.0. Samples: 192364700. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-10 13:11:53,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:11:56,207][35978] Updated weights for policy 0, policy_version 36245 (0.0032) [2024-06-10 13:11:58,404][35745] Fps is (10 sec: 47502.7, 60 sec: 45054.3, 300 sec: 44764.1). Total num frames: 593936384. Throughput: 0: 44743.5. Samples: 192503620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 13:11:58,405][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:11:59,160][35978] Updated weights for policy 0, policy_version 36255 (0.0022) [2024-06-10 13:12:03,402][35745] Fps is (10 sec: 40959.9, 60 sec: 44782.9, 300 sec: 44820.9). Total num frames: 594132992. Throughput: 0: 45113.2. Samples: 192771740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 13:12:03,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:12:03,844][35978] Updated weights for policy 0, policy_version 36265 (0.0037) [2024-06-10 13:12:06,563][35978] Updated weights for policy 0, policy_version 36275 (0.0034) [2024-06-10 13:12:08,402][35745] Fps is (10 sec: 44247.2, 60 sec: 44236.8, 300 sec: 44875.5). 
Total num frames: 594378752. Throughput: 0: 45061.4. Samples: 193039060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 13:12:08,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:12:10,941][35978] Updated weights for policy 0, policy_version 36285 (0.0038) [2024-06-10 13:12:13,402][35745] Fps is (10 sec: 45875.5, 60 sec: 44509.9, 300 sec: 44764.7). Total num frames: 594591744. Throughput: 0: 44757.0. Samples: 193170900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 13:12:13,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:12:14,172][35978] Updated weights for policy 0, policy_version 36295 (0.0032) [2024-06-10 13:12:18,375][35978] Updated weights for policy 0, policy_version 36305 (0.0030) [2024-06-10 13:12:18,407][35745] Fps is (10 sec: 44211.3, 60 sec: 45051.6, 300 sec: 44819.1). Total num frames: 594821120. Throughput: 0: 44606.3. Samples: 193431780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 13:12:18,408][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:12:21,381][35978] Updated weights for policy 0, policy_version 36315 (0.0022) [2024-06-10 13:12:23,402][35745] Fps is (10 sec: 44235.8, 60 sec: 44236.7, 300 sec: 44875.5). Total num frames: 595034112. Throughput: 0: 44828.2. Samples: 193701000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 13:12:23,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:12:25,707][35978] Updated weights for policy 0, policy_version 36325 (0.0026) [2024-06-10 13:12:28,402][35745] Fps is (10 sec: 47541.3, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 595296256. Throughput: 0: 44804.5. Samples: 193842940. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 13:12:28,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:12:28,535][35978] Updated weights for policy 0, policy_version 36335 (0.0040) [2024-06-10 13:12:33,205][35978] Updated weights for policy 0, policy_version 36345 (0.0023) [2024-06-10 13:12:33,401][35745] Fps is (10 sec: 44237.9, 60 sec: 44783.0, 300 sec: 44820.2). Total num frames: 595476480. Throughput: 0: 44773.9. Samples: 194112780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 13:12:33,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:12:35,163][35957] Signal inference workers to stop experience collection... (2950 times) [2024-06-10 13:12:35,163][35957] Signal inference workers to resume experience collection... (2950 times) [2024-06-10 13:12:35,183][35978] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-06-10 13:12:35,188][35978] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-06-10 13:12:35,760][35978] Updated weights for policy 0, policy_version 36355 (0.0021) [2024-06-10 13:12:38,403][35745] Fps is (10 sec: 40954.6, 60 sec: 43962.8, 300 sec: 44819.8). Total num frames: 595705856. Throughput: 0: 44725.8. Samples: 194377420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-10 13:12:38,404][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:12:40,388][35978] Updated weights for policy 0, policy_version 36365 (0.0033) [2024-06-10 13:12:43,313][35978] Updated weights for policy 0, policy_version 36375 (0.0033) [2024-06-10 13:12:43,402][35745] Fps is (10 sec: 49151.1, 60 sec: 45602.1, 300 sec: 44931.0). Total num frames: 595968000. Throughput: 0: 44613.3. Samples: 194511120. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 13:12:43,408][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:12:47,618][35978] Updated weights for policy 0, policy_version 36385 (0.0045) [2024-06-10 13:12:48,404][35745] Fps is (10 sec: 45869.9, 60 sec: 45054.2, 300 sec: 44819.6). Total num frames: 596164608. Throughput: 0: 44832.7. Samples: 194789320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 13:12:48,405][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:12:50,677][35978] Updated weights for policy 0, policy_version 36395 (0.0024) [2024-06-10 13:12:53,402][35745] Fps is (10 sec: 42598.5, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 596393984. Throughput: 0: 44651.9. Samples: 195048400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 13:12:53,402][35745] Avg episode reward: [(0, '0.303')] [2024-06-10 13:12:55,056][35978] Updated weights for policy 0, policy_version 36405 (0.0031) [2024-06-10 13:12:57,669][35978] Updated weights for policy 0, policy_version 36415 (0.0032) [2024-06-10 13:12:58,402][35745] Fps is (10 sec: 50802.5, 60 sec: 45603.9, 300 sec: 44986.6). Total num frames: 596672512. Throughput: 0: 44958.6. Samples: 195194040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 13:12:58,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:13:02,447][35978] Updated weights for policy 0, policy_version 36425 (0.0042) [2024-06-10 13:13:03,402][35745] Fps is (10 sec: 44236.9, 60 sec: 45056.0, 300 sec: 44764.4). Total num frames: 596836352. Throughput: 0: 45282.2. Samples: 195469220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-10 13:13:03,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:13:04,864][35978] Updated weights for policy 0, policy_version 36435 (0.0036) [2024-06-10 13:13:08,402][35745] Fps is (10 sec: 37683.2, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 597049344. Throughput: 0: 44975.7. Samples: 195724900. 
Policy #0 lag: (min: 2.0, avg: 12.2, max: 23.0) [2024-06-10 13:13:08,403][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:13:09,605][35978] Updated weights for policy 0, policy_version 36445 (0.0037) [2024-06-10 13:13:12,387][35978] Updated weights for policy 0, policy_version 36455 (0.0027) [2024-06-10 13:13:13,402][35745] Fps is (10 sec: 49152.2, 60 sec: 45602.1, 300 sec: 44986.7). Total num frames: 597327872. Throughput: 0: 44916.8. Samples: 195864200. Policy #0 lag: (min: 2.0, avg: 12.2, max: 23.0) [2024-06-10 13:13:13,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:13:16,997][35978] Updated weights for policy 0, policy_version 36465 (0.0028) [2024-06-10 13:13:18,404][35745] Fps is (10 sec: 47502.6, 60 sec: 45058.6, 300 sec: 44819.6). Total num frames: 597524480. Throughput: 0: 44992.2. Samples: 196137540. Policy #0 lag: (min: 2.0, avg: 12.2, max: 23.0) [2024-06-10 13:13:18,405][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:13:19,698][35978] Updated weights for policy 0, policy_version 36475 (0.0021) [2024-06-10 13:13:23,402][35745] Fps is (10 sec: 39321.4, 60 sec: 44783.0, 300 sec: 44819.9). Total num frames: 597721088. Throughput: 0: 45059.9. Samples: 196405060. Policy #0 lag: (min: 2.0, avg: 12.2, max: 23.0) [2024-06-10 13:13:23,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:13:24,297][35978] Updated weights for policy 0, policy_version 36485 (0.0041) [2024-06-10 13:13:26,714][35978] Updated weights for policy 0, policy_version 36495 (0.0027) [2024-06-10 13:13:28,402][35745] Fps is (10 sec: 45886.1, 60 sec: 44782.9, 300 sec: 44932.0). Total num frames: 597983232. Throughput: 0: 45065.5. Samples: 196539060. Policy #0 lag: (min: 2.0, avg: 12.2, max: 23.0) [2024-06-10 13:13:28,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:13:31,539][35978] Updated weights for policy 0, policy_version 36505 (0.0039) [2024-06-10 13:13:33,402][35745] Fps is (10 sec: 49152.5, 60 sec: 45602.1, 300 sec: 44875.5). 
Total num frames: 598212608. Throughput: 0: 45059.3. Samples: 196816880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-10 13:13:33,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:13:33,478][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000036513_598228992.pth... [2024-06-10 13:13:33,530][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000035856_587464704.pth [2024-06-10 13:13:33,954][35978] Updated weights for policy 0, policy_version 36515 (0.0034) [2024-06-10 13:13:38,402][35745] Fps is (10 sec: 42598.4, 60 sec: 45057.0, 300 sec: 44820.0). Total num frames: 598409216. Throughput: 0: 45293.4. Samples: 197086600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-10 13:13:38,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:13:38,549][35978] Updated weights for policy 0, policy_version 36525 (0.0031) [2024-06-10 13:13:40,974][35957] Signal inference workers to stop experience collection... (3000 times) [2024-06-10 13:13:41,011][35978] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-06-10 13:13:41,024][35957] Signal inference workers to resume experience collection... (3000 times) [2024-06-10 13:13:41,029][35978] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-06-10 13:13:41,406][35978] Updated weights for policy 0, policy_version 36535 (0.0028) [2024-06-10 13:13:43,402][35745] Fps is (10 sec: 45875.3, 60 sec: 45056.1, 300 sec: 44986.6). Total num frames: 598671360. Throughput: 0: 45078.3. Samples: 197222560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-10 13:13:43,402][35745] Avg episode reward: [(0, '0.313')] [2024-06-10 13:13:46,067][35978] Updated weights for policy 0, policy_version 36545 (0.0036) [2024-06-10 13:13:48,402][35745] Fps is (10 sec: 47513.6, 60 sec: 45330.9, 300 sec: 44820.0). Total num frames: 598884352. Throughput: 0: 44911.6. Samples: 197490240. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-10 13:13:48,402][35745] Avg episode reward: [(0, '0.321')] [2024-06-10 13:13:48,402][35957] Saving new best policy, reward=0.321! [2024-06-10 13:13:49,052][35978] Updated weights for policy 0, policy_version 36555 (0.0040) [2024-06-10 13:13:53,401][35745] Fps is (10 sec: 39321.9, 60 sec: 44510.0, 300 sec: 44708.9). Total num frames: 599064576. Throughput: 0: 45465.9. Samples: 197770860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 13:13:53,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:13:53,458][35978] Updated weights for policy 0, policy_version 36565 (0.0045) [2024-06-10 13:13:55,983][35978] Updated weights for policy 0, policy_version 36575 (0.0034) [2024-06-10 13:13:58,402][35745] Fps is (10 sec: 44236.0, 60 sec: 44236.7, 300 sec: 44875.5). Total num frames: 599326720. Throughput: 0: 45030.5. Samples: 197890580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 13:13:58,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:14:00,652][35978] Updated weights for policy 0, policy_version 36585 (0.0034) [2024-06-10 13:14:03,205][35978] Updated weights for policy 0, policy_version 36595 (0.0042) [2024-06-10 13:14:03,408][35745] Fps is (10 sec: 50757.5, 60 sec: 45597.4, 300 sec: 44985.6). Total num frames: 599572480. Throughput: 0: 45087.1. Samples: 198166640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 13:14:03,409][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:14:07,652][35978] Updated weights for policy 0, policy_version 36605 (0.0034) [2024-06-10 13:14:08,401][35745] Fps is (10 sec: 42599.5, 60 sec: 45056.1, 300 sec: 44708.9). Total num frames: 599752704. Throughput: 0: 45063.7. Samples: 198432920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 13:14:08,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:14:10,834][35978] Updated weights for policy 0, policy_version 36615 (0.0031) [2024-06-10 13:14:13,402][35745] Fps is (10 sec: 42625.6, 60 sec: 44509.9, 300 sec: 44820.0). Total num frames: 599998464. Throughput: 0: 45044.4. Samples: 198566060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-10 13:14:13,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:14:15,315][35978] Updated weights for policy 0, policy_version 36625 (0.0032) [2024-06-10 13:14:18,151][35978] Updated weights for policy 0, policy_version 36635 (0.0031) [2024-06-10 13:14:18,402][35745] Fps is (10 sec: 49151.3, 60 sec: 45330.8, 300 sec: 44875.9). Total num frames: 600244224. Throughput: 0: 44880.4. Samples: 198836500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 13:14:18,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:14:22,686][35978] Updated weights for policy 0, policy_version 36645 (0.0038) [2024-06-10 13:14:23,402][35745] Fps is (10 sec: 45874.5, 60 sec: 45602.1, 300 sec: 44875.5). Total num frames: 600457216. Throughput: 0: 45170.1. Samples: 199119260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 13:14:23,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:14:25,221][35978] Updated weights for policy 0, policy_version 36655 (0.0047) [2024-06-10 13:14:28,402][35745] Fps is (10 sec: 40960.0, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 600653824. Throughput: 0: 44744.3. Samples: 199236060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 13:14:28,402][35745] Avg episode reward: [(0, '0.314')] [2024-06-10 13:14:29,640][35978] Updated weights for policy 0, policy_version 36665 (0.0028) [2024-06-10 13:14:32,363][35978] Updated weights for policy 0, policy_version 36675 (0.0038) [2024-06-10 13:14:33,402][35745] Fps is (10 sec: 45875.9, 60 sec: 45056.0, 300 sec: 44931.0). 
Total num frames: 600915968. Throughput: 0: 44996.4. Samples: 199515080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 13:14:33,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:14:36,674][35978] Updated weights for policy 0, policy_version 36685 (0.0021) [2024-06-10 13:14:38,402][35745] Fps is (10 sec: 47514.0, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 601128960. Throughput: 0: 44918.1. Samples: 199792180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-10 13:14:38,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:14:39,752][35978] Updated weights for policy 0, policy_version 36695 (0.0031) [2024-06-10 13:14:43,402][35745] Fps is (10 sec: 42597.7, 60 sec: 44509.7, 300 sec: 44875.5). Total num frames: 601341952. Throughput: 0: 44980.9. Samples: 199914720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 13:14:43,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:14:44,285][35978] Updated weights for policy 0, policy_version 36705 (0.0034) [2024-06-10 13:14:46,974][35978] Updated weights for policy 0, policy_version 36715 (0.0036) [2024-06-10 13:14:48,404][35745] Fps is (10 sec: 49140.2, 60 sec: 45600.3, 300 sec: 45041.8). Total num frames: 601620480. Throughput: 0: 44940.9. Samples: 200188800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 13:14:48,405][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:14:51,789][35978] Updated weights for policy 0, policy_version 36725 (0.0031) [2024-06-10 13:14:53,402][35745] Fps is (10 sec: 47513.6, 60 sec: 45875.0, 300 sec: 45042.1). Total num frames: 601817088. Throughput: 0: 45045.1. Samples: 200459960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 13:14:53,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:14:54,148][35978] Updated weights for policy 0, policy_version 36735 (0.0028) [2024-06-10 13:14:58,402][35745] Fps is (10 sec: 37691.5, 60 sec: 44509.9, 300 sec: 44764.4). 
Total num frames: 601997312. Throughput: 0: 45050.0. Samples: 200593320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-10 13:14:58,402][35745] Avg episode reward: [(0, '0.314')] [2024-06-10 13:14:58,925][35978] Updated weights for policy 0, policy_version 36745 (0.0028) [2024-06-10 13:15:00,201][35957] Signal inference workers to stop experience collection... (3050 times) [2024-06-10 13:15:00,202][35957] Signal inference workers to resume experience collection... (3050 times) [2024-06-10 13:15:00,221][35978] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-06-10 13:15:00,221][35978] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-06-10 13:15:01,455][35978] Updated weights for policy 0, policy_version 36755 (0.0027) [2024-06-10 13:15:03,404][35745] Fps is (10 sec: 45865.2, 60 sec: 45059.0, 300 sec: 45041.8). Total num frames: 602275840. Throughput: 0: 44953.7. Samples: 200859520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 13:15:03,405][35745] Avg episode reward: [(0, '0.315')] [2024-06-10 13:15:05,891][35978] Updated weights for policy 0, policy_version 36765 (0.0050) [2024-06-10 13:15:08,404][35745] Fps is (10 sec: 49141.4, 60 sec: 45600.3, 300 sec: 44986.2). Total num frames: 602488832. Throughput: 0: 44860.0. Samples: 201138060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 13:15:08,405][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:15:08,864][35978] Updated weights for policy 0, policy_version 36775 (0.0027) [2024-06-10 13:15:13,380][35978] Updated weights for policy 0, policy_version 36785 (0.0030) [2024-06-10 13:15:13,402][35745] Fps is (10 sec: 40969.1, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 602685440. Throughput: 0: 45095.9. Samples: 201265380. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 13:15:13,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:15:16,408][35978] Updated weights for policy 0, policy_version 36795 (0.0032) [2024-06-10 13:15:18,402][35745] Fps is (10 sec: 45885.8, 60 sec: 45056.0, 300 sec: 44986.6). Total num frames: 602947584. Throughput: 0: 44920.4. Samples: 201536500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 13:15:18,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:15:20,902][35978] Updated weights for policy 0, policy_version 36805 (0.0037) [2024-06-10 13:15:23,278][35978] Updated weights for policy 0, policy_version 36815 (0.0034) [2024-06-10 13:15:23,402][35745] Fps is (10 sec: 49152.2, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 603176960. Throughput: 0: 44746.6. Samples: 201805780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 13:15:23,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:15:27,895][35978] Updated weights for policy 0, policy_version 36825 (0.0036) [2024-06-10 13:15:28,402][35745] Fps is (10 sec: 40959.9, 60 sec: 45056.0, 300 sec: 44820.0). Total num frames: 603357184. Throughput: 0: 45191.7. Samples: 201948340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-10 13:15:28,402][35745] Avg episode reward: [(0, '0.316')] [2024-06-10 13:15:30,510][35978] Updated weights for policy 0, policy_version 36835 (0.0030) [2024-06-10 13:15:33,402][35745] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 603602944. Throughput: 0: 44914.3. Samples: 202209840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-10 13:15:33,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:15:33,416][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000036841_603602944.pth... 
[2024-06-10 13:15:33,473][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000036182_592805888.pth [2024-06-10 13:15:34,988][35978] Updated weights for policy 0, policy_version 36845 (0.0029) [2024-06-10 13:15:37,759][35978] Updated weights for policy 0, policy_version 36855 (0.0032) [2024-06-10 13:15:38,402][35745] Fps is (10 sec: 47514.0, 60 sec: 45056.0, 300 sec: 44931.0). Total num frames: 603832320. Throughput: 0: 44908.2. Samples: 202480820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-10 13:15:38,402][35745] Avg episode reward: [(0, '0.315')] [2024-06-10 13:15:42,507][35978] Updated weights for policy 0, policy_version 36865 (0.0023) [2024-06-10 13:15:43,402][35745] Fps is (10 sec: 44237.3, 60 sec: 45056.1, 300 sec: 44875.5). Total num frames: 604045312. Throughput: 0: 45221.5. Samples: 202628280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-10 13:15:43,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:15:45,322][35978] Updated weights for policy 0, policy_version 36875 (0.0036) [2024-06-10 13:15:48,402][35745] Fps is (10 sec: 42598.2, 60 sec: 43965.5, 300 sec: 44820.0). Total num frames: 604258304. Throughput: 0: 45104.1. Samples: 202889100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-10 13:15:48,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:15:50,024][35978] Updated weights for policy 0, policy_version 36885 (0.0021) [2024-06-10 13:15:52,540][35978] Updated weights for policy 0, policy_version 36895 (0.0029) [2024-06-10 13:15:53,402][35745] Fps is (10 sec: 49152.1, 60 sec: 45329.2, 300 sec: 45097.7). Total num frames: 604536832. Throughput: 0: 44702.8. Samples: 203149580. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 13:15:53,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:15:57,096][35978] Updated weights for policy 0, policy_version 36905 (0.0046) [2024-06-10 13:15:58,402][35745] Fps is (10 sec: 45874.6, 60 sec: 45329.1, 300 sec: 44986.5). Total num frames: 604717056. Throughput: 0: 45146.7. Samples: 203296980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 13:15:58,402][35745] Avg episode reward: [(0, '0.307')] [2024-06-10 13:16:00,102][35978] Updated weights for policy 0, policy_version 36915 (0.0029) [2024-06-10 13:16:03,401][35745] Fps is (10 sec: 37683.4, 60 sec: 43965.5, 300 sec: 44708.9). Total num frames: 604913664. Throughput: 0: 44831.6. Samples: 203553920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 13:16:03,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:16:04,505][35978] Updated weights for policy 0, policy_version 36925 (0.0038) [2024-06-10 13:16:07,162][35978] Updated weights for policy 0, policy_version 36935 (0.0026) [2024-06-10 13:16:08,161][35957] Signal inference workers to stop experience collection... (3100 times) [2024-06-10 13:16:08,168][35957] Signal inference workers to resume experience collection... (3100 times) [2024-06-10 13:16:08,194][35978] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-06-10 13:16:08,194][35978] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-06-10 13:16:08,402][35745] Fps is (10 sec: 47514.3, 60 sec: 45057.8, 300 sec: 44986.6). Total num frames: 605192192. Throughput: 0: 44952.1. Samples: 203828620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 13:16:08,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:16:11,918][35978] Updated weights for policy 0, policy_version 36945 (0.0041) [2024-06-10 13:16:13,402][35745] Fps is (10 sec: 47513.1, 60 sec: 45056.1, 300 sec: 44986.6). Total num frames: 605388800. Throughput: 0: 44941.4. 
Samples: 203970700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-10 13:16:13,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:16:14,715][35978] Updated weights for policy 0, policy_version 36955 (0.0036) [2024-06-10 13:16:18,402][35745] Fps is (10 sec: 39321.6, 60 sec: 43963.8, 300 sec: 44764.4). Total num frames: 605585408. Throughput: 0: 44873.5. Samples: 204229140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 13:16:18,402][35745] Avg episode reward: [(0, '0.314')] [2024-06-10 13:16:19,238][35978] Updated weights for policy 0, policy_version 36965 (0.0024) [2024-06-10 13:16:21,795][35978] Updated weights for policy 0, policy_version 36975 (0.0034) [2024-06-10 13:16:23,402][35745] Fps is (10 sec: 49151.6, 60 sec: 45056.0, 300 sec: 45097.6). Total num frames: 605880320. Throughput: 0: 44824.3. Samples: 204497920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 13:16:23,402][35745] Avg episode reward: [(0, '0.313')] [2024-06-10 13:16:26,340][35978] Updated weights for policy 0, policy_version 36985 (0.0034) [2024-06-10 13:16:28,401][35745] Fps is (10 sec: 47513.8, 60 sec: 45056.1, 300 sec: 44986.6). Total num frames: 606060544. Throughput: 0: 44799.2. Samples: 204644240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 13:16:28,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:16:29,156][35978] Updated weights for policy 0, policy_version 36995 (0.0038) [2024-06-10 13:16:33,402][35745] Fps is (10 sec: 37683.7, 60 sec: 44236.9, 300 sec: 44708.9). Total num frames: 606257152. Throughput: 0: 44766.7. Samples: 204903600. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 13:16:33,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:16:34,016][35978] Updated weights for policy 0, policy_version 37005 (0.0036) [2024-06-10 13:16:36,577][35978] Updated weights for policy 0, policy_version 37015 (0.0033) [2024-06-10 13:16:38,402][35745] Fps is (10 sec: 47513.0, 60 sec: 45055.9, 300 sec: 45097.7). Total num frames: 606535680. Throughput: 0: 44829.7. Samples: 205166920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-10 13:16:38,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:16:41,123][35978] Updated weights for policy 0, policy_version 37025 (0.0032) [2024-06-10 13:16:43,402][35745] Fps is (10 sec: 49151.7, 60 sec: 45056.0, 300 sec: 45042.1). Total num frames: 606748672. Throughput: 0: 44918.3. Samples: 205318300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-10 13:16:43,402][35745] Avg episode reward: [(0, '0.305')] [2024-06-10 13:16:43,725][35978] Updated weights for policy 0, policy_version 37035 (0.0035) [2024-06-10 13:16:48,402][35745] Fps is (10 sec: 39321.3, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 606928896. Throughput: 0: 45100.7. Samples: 205583460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-10 13:16:48,402][35745] Avg episode reward: [(0, '0.316')] [2024-06-10 13:16:48,495][35978] Updated weights for policy 0, policy_version 37045 (0.0038) [2024-06-10 13:16:50,964][35978] Updated weights for policy 0, policy_version 37055 (0.0037) [2024-06-10 13:16:53,402][35745] Fps is (10 sec: 45875.5, 60 sec: 44509.9, 300 sec: 44986.9). Total num frames: 607207424. Throughput: 0: 44811.1. Samples: 205845120. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-10 13:16:53,402][35745] Avg episode reward: [(0, '0.317')] [2024-06-10 13:16:55,681][35978] Updated weights for policy 0, policy_version 37065 (0.0027) [2024-06-10 13:16:58,248][35978] Updated weights for policy 0, policy_version 37075 (0.0042) [2024-06-10 13:16:58,402][35745] Fps is (10 sec: 50790.8, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 607436800. Throughput: 0: 44673.8. Samples: 205981020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-10 13:16:58,402][35745] Avg episode reward: [(0, '0.315')] [2024-06-10 13:17:03,335][35978] Updated weights for policy 0, policy_version 37085 (0.0025) [2024-06-10 13:17:03,402][35745] Fps is (10 sec: 39321.3, 60 sec: 44782.8, 300 sec: 44820.0). Total num frames: 607600640. Throughput: 0: 45033.7. Samples: 206255660. Policy #0 lag: (min: 0.0, avg: 7.7, max: 22.0) [2024-06-10 13:17:03,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:17:05,796][35978] Updated weights for policy 0, policy_version 37095 (0.0040) [2024-06-10 13:17:08,402][35745] Fps is (10 sec: 42598.7, 60 sec: 44509.9, 300 sec: 44986.6). Total num frames: 607862784. Throughput: 0: 44933.5. Samples: 206519920. Policy #0 lag: (min: 0.0, avg: 7.7, max: 22.0) [2024-06-10 13:17:08,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:17:10,531][35978] Updated weights for policy 0, policy_version 37105 (0.0026) [2024-06-10 13:17:12,919][35978] Updated weights for policy 0, policy_version 37115 (0.0028) [2024-06-10 13:17:13,401][35745] Fps is (10 sec: 50791.1, 60 sec: 45329.1, 300 sec: 45043.0). Total num frames: 608108544. Throughput: 0: 44905.8. Samples: 206665000. Policy #0 lag: (min: 0.0, avg: 7.7, max: 22.0) [2024-06-10 13:17:13,402][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:17:14,554][35957] Signal inference workers to stop experience collection... (3150 times) [2024-06-10 13:17:14,555][35957] Signal inference workers to resume experience collection... 
(3150 times) [2024-06-10 13:17:14,569][35978] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-06-10 13:17:14,569][35978] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-06-10 13:17:17,684][35978] Updated weights for policy 0, policy_version 37125 (0.0025) [2024-06-10 13:17:18,402][35745] Fps is (10 sec: 42598.2, 60 sec: 45056.0, 300 sec: 44931.1). Total num frames: 608288768. Throughput: 0: 45001.3. Samples: 206928660. Policy #0 lag: (min: 0.0, avg: 7.7, max: 22.0) [2024-06-10 13:17:18,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:17:19,897][35978] Updated weights for policy 0, policy_version 37135 (0.0037) [2024-06-10 13:17:23,402][35745] Fps is (10 sec: 40959.8, 60 sec: 43963.8, 300 sec: 44820.0). Total num frames: 608518144. Throughput: 0: 45093.4. Samples: 207196120. Policy #0 lag: (min: 0.0, avg: 7.7, max: 22.0) [2024-06-10 13:17:23,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:17:24,973][35978] Updated weights for policy 0, policy_version 37145 (0.0027) [2024-06-10 13:17:27,551][35978] Updated weights for policy 0, policy_version 37155 (0.0023) [2024-06-10 13:17:28,402][35745] Fps is (10 sec: 50790.2, 60 sec: 45602.0, 300 sec: 45153.2). Total num frames: 608796672. Throughput: 0: 44639.1. Samples: 207327060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-10 13:17:28,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:17:32,322][35978] Updated weights for policy 0, policy_version 37165 (0.0050) [2024-06-10 13:17:33,402][35745] Fps is (10 sec: 44236.5, 60 sec: 45055.9, 300 sec: 44931.2). Total num frames: 608960512. Throughput: 0: 44700.1. Samples: 207594960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-10 13:17:33,402][35745] Avg episode reward: [(0, '0.315')] [2024-06-10 13:17:33,446][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000037169_608976896.pth... 
[2024-06-10 13:17:33,503][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000036513_598228992.pth [2024-06-10 13:17:34,848][35978] Updated weights for policy 0, policy_version 37175 (0.0036) [2024-06-10 13:17:38,402][35745] Fps is (10 sec: 39321.4, 60 sec: 44236.7, 300 sec: 44820.0). Total num frames: 609189888. Throughput: 0: 44865.2. Samples: 207864060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-10 13:17:38,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:17:39,721][35978] Updated weights for policy 0, policy_version 37185 (0.0041) [2024-06-10 13:17:42,072][35978] Updated weights for policy 0, policy_version 37195 (0.0032) [2024-06-10 13:17:43,402][35745] Fps is (10 sec: 49152.3, 60 sec: 45056.0, 300 sec: 45042.5). Total num frames: 609452032. Throughput: 0: 44919.2. Samples: 208002380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-10 13:17:43,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:17:46,732][35978] Updated weights for policy 0, policy_version 37205 (0.0025) [2024-06-10 13:17:48,402][35745] Fps is (10 sec: 45875.4, 60 sec: 45329.1, 300 sec: 44931.0). Total num frames: 609648640. Throughput: 0: 44733.3. Samples: 208268660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-10 13:17:48,402][35745] Avg episode reward: [(0, '0.306')] [2024-06-10 13:17:49,415][35978] Updated weights for policy 0, policy_version 37215 (0.0025) [2024-06-10 13:17:53,402][35745] Fps is (10 sec: 39321.2, 60 sec: 43963.7, 300 sec: 44653.3). Total num frames: 609845248. Throughput: 0: 44847.5. Samples: 208538060. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-10 13:17:53,402][35745] Avg episode reward: [(0, '0.314')] [2024-06-10 13:17:54,288][35978] Updated weights for policy 0, policy_version 37225 (0.0038) [2024-06-10 13:17:56,823][35978] Updated weights for policy 0, policy_version 37235 (0.0024) [2024-06-10 13:17:58,402][35745] Fps is (10 sec: 49151.9, 60 sec: 45056.0, 300 sec: 45097.7). Total num frames: 610140160. Throughput: 0: 44630.5. Samples: 208673380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-10 13:17:58,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:18:01,706][35978] Updated weights for policy 0, policy_version 37245 (0.0035) [2024-06-10 13:18:03,401][35745] Fps is (10 sec: 47514.1, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 610320384. Throughput: 0: 44747.6. Samples: 208942300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-10 13:18:03,402][35745] Avg episode reward: [(0, '0.314')] [2024-06-10 13:18:04,070][35978] Updated weights for policy 0, policy_version 37255 (0.0036) [2024-06-10 13:18:08,402][35745] Fps is (10 sec: 39321.6, 60 sec: 44509.8, 300 sec: 44764.4). Total num frames: 610533376. Throughput: 0: 44721.2. Samples: 209208580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-10 13:18:08,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:18:08,734][35978] Updated weights for policy 0, policy_version 37265 (0.0027) [2024-06-10 13:18:11,166][35978] Updated weights for policy 0, policy_version 37275 (0.0030) [2024-06-10 13:18:13,404][35745] Fps is (10 sec: 47500.1, 60 sec: 44780.8, 300 sec: 44986.5). Total num frames: 610795520. Throughput: 0: 44721.7. Samples: 209339660. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-10 13:18:13,405][35745] Avg episode reward: [(0, '0.308')] [2024-06-10 13:18:15,887][35978] Updated weights for policy 0, policy_version 37285 (0.0036) [2024-06-10 13:18:18,402][35745] Fps is (10 sec: 49152.2, 60 sec: 45602.1, 300 sec: 45097.7). 
Total num frames: 611024896. Throughput: 0: 44927.5. Samples: 209616700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 13:18:18,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:18:18,519][35978] Updated weights for policy 0, policy_version 37295 (0.0026) [2024-06-10 13:18:19,515][35957] Signal inference workers to stop experience collection... (3200 times) [2024-06-10 13:18:19,516][35957] Signal inference workers to resume experience collection... (3200 times) [2024-06-10 13:18:19,568][35978] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-06-10 13:18:19,568][35978] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-06-10 13:18:23,258][35978] Updated weights for policy 0, policy_version 37305 (0.0032) [2024-06-10 13:18:23,402][35745] Fps is (10 sec: 40971.2, 60 sec: 44782.9, 300 sec: 44819.9). Total num frames: 611205120. Throughput: 0: 44753.4. Samples: 209877960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 13:18:23,414][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:18:26,175][35978] Updated weights for policy 0, policy_version 37315 (0.0032) [2024-06-10 13:18:28,402][35745] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 611450880. Throughput: 0: 44663.5. Samples: 210012240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 13:18:28,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:18:30,415][35978] Updated weights for policy 0, policy_version 37325 (0.0031) [2024-06-10 13:18:33,402][35745] Fps is (10 sec: 47513.7, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 611680256. Throughput: 0: 44838.3. Samples: 210286380. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 13:18:33,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:18:33,461][35978] Updated weights for policy 0, policy_version 37335 (0.0039) [2024-06-10 13:18:37,845][35978] Updated weights for policy 0, policy_version 37345 (0.0033) [2024-06-10 13:18:38,402][35745] Fps is (10 sec: 42598.5, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 611876864. Throughput: 0: 44860.0. Samples: 210556760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-10 13:18:38,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:18:40,552][35978] Updated weights for policy 0, policy_version 37355 (0.0038) [2024-06-10 13:18:43,402][35745] Fps is (10 sec: 44236.8, 60 sec: 44509.8, 300 sec: 44875.5). Total num frames: 612122624. Throughput: 0: 44690.3. Samples: 210684440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 13:18:43,402][35745] Avg episode reward: [(0, '0.316')] [2024-06-10 13:18:44,905][35978] Updated weights for policy 0, policy_version 37365 (0.0036) [2024-06-10 13:18:48,103][35978] Updated weights for policy 0, policy_version 37375 (0.0038) [2024-06-10 13:18:48,401][35745] Fps is (10 sec: 49152.6, 60 sec: 45329.2, 300 sec: 45097.6). Total num frames: 612368384. Throughput: 0: 44846.7. Samples: 210960400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 13:18:48,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:18:52,319][35978] Updated weights for policy 0, policy_version 37385 (0.0027) [2024-06-10 13:18:53,402][35745] Fps is (10 sec: 44236.8, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 612564992. Throughput: 0: 44830.7. Samples: 211225960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 13:18:53,402][35745] Avg episode reward: [(0, '0.314')] [2024-06-10 13:18:55,404][35978] Updated weights for policy 0, policy_version 37395 (0.0026) [2024-06-10 13:18:58,404][35745] Fps is (10 sec: 42588.0, 60 sec: 44235.1, 300 sec: 44820.6). 
Total num frames: 612794368. Throughput: 0: 44858.2. Samples: 211358260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-10 13:18:58,405][35745] Avg episode reward: [(0, '0.315')] [2024-06-10 13:18:59,491][35978] Updated weights for policy 0, policy_version 37405 (0.0040) [2024-06-10 13:19:02,786][35978] Updated weights for policy 0, policy_version 37415 (0.0036) [2024-06-10 13:19:03,402][35745] Fps is (10 sec: 49152.4, 60 sec: 45602.1, 300 sec: 45097.6). Total num frames: 613056512. Throughput: 0: 44916.5. Samples: 211637940. Policy #0 lag: (min: 1.0, avg: 10.9, max: 19.0) [2024-06-10 13:19:03,402][35745] Avg episode reward: [(0, '0.301')] [2024-06-10 13:19:06,888][35978] Updated weights for policy 0, policy_version 37425 (0.0043) [2024-06-10 13:19:08,402][35745] Fps is (10 sec: 42608.2, 60 sec: 44783.0, 300 sec: 44819.9). Total num frames: 613220352. Throughput: 0: 45078.2. Samples: 211906480. Policy #0 lag: (min: 1.0, avg: 10.9, max: 19.0) [2024-06-10 13:19:08,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:19:09,921][35978] Updated weights for policy 0, policy_version 37435 (0.0034) [2024-06-10 13:19:13,402][35745] Fps is (10 sec: 40959.5, 60 sec: 44511.9, 300 sec: 44820.0). Total num frames: 613466112. Throughput: 0: 44923.1. Samples: 212033780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 19.0) [2024-06-10 13:19:13,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:19:14,173][35978] Updated weights for policy 0, policy_version 37445 (0.0024) [2024-06-10 13:19:17,341][35978] Updated weights for policy 0, policy_version 37455 (0.0042) [2024-06-10 13:19:18,402][35745] Fps is (10 sec: 49151.6, 60 sec: 44782.9, 300 sec: 44931.0). Total num frames: 613711872. Throughput: 0: 44855.0. Samples: 212304860. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 19.0) [2024-06-10 13:19:18,402][35745] Avg episode reward: [(0, '0.313')] [2024-06-10 13:19:21,199][35978] Updated weights for policy 0, policy_version 37465 (0.0028) [2024-06-10 13:19:23,402][35745] Fps is (10 sec: 44236.3, 60 sec: 45055.9, 300 sec: 44931.0). Total num frames: 613908480. Throughput: 0: 44902.5. Samples: 212577380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 19.0) [2024-06-10 13:19:23,402][35745] Avg episode reward: [(0, '0.310')] [2024-06-10 13:19:24,741][35978] Updated weights for policy 0, policy_version 37475 (0.0026) [2024-06-10 13:19:28,402][35745] Fps is (10 sec: 42599.2, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 614137856. Throughput: 0: 44863.6. Samples: 212703300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 13:19:28,402][35745] Avg episode reward: [(0, '0.316')] [2024-06-10 13:19:28,559][35978] Updated weights for policy 0, policy_version 37485 (0.0029) [2024-06-10 13:19:31,617][35957] Signal inference workers to stop experience collection... (3250 times) [2024-06-10 13:19:31,659][35978] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-06-10 13:19:31,666][35957] Signal inference workers to resume experience collection... (3250 times) [2024-06-10 13:19:31,675][35978] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-06-10 13:19:32,171][35978] Updated weights for policy 0, policy_version 37495 (0.0033) [2024-06-10 13:19:33,404][35745] Fps is (10 sec: 49141.3, 60 sec: 45327.3, 300 sec: 44986.2). Total num frames: 614400000. Throughput: 0: 45009.6. Samples: 212985940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 13:19:33,405][35745] Avg episode reward: [(0, '0.316')] [2024-06-10 13:19:33,421][35957] Saving /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000037500_614400000.pth... 
[2024-06-10 13:19:33,470][35957] Removing /workspace/metta/train_dir/p2.metta.6/checkpoint_p0/checkpoint_000036841_603602944.pth [2024-06-10 13:19:35,799][35978] Updated weights for policy 0, policy_version 37505 (0.0031) [2024-06-10 13:19:38,401][35745] Fps is (10 sec: 42598.5, 60 sec: 44783.0, 300 sec: 44820.0). Total num frames: 614563840. Throughput: 0: 45137.9. Samples: 213257160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 13:19:38,402][35745] Avg episode reward: [(0, '0.316')] [2024-06-10 13:19:39,255][35978] Updated weights for policy 0, policy_version 37515 (0.0046) [2024-06-10 13:19:43,168][35978] Updated weights for policy 0, policy_version 37525 (0.0032) [2024-06-10 13:19:43,402][35745] Fps is (10 sec: 42608.0, 60 sec: 45055.9, 300 sec: 44764.8). Total num frames: 614825984. Throughput: 0: 44963.2. Samples: 213381500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 13:19:43,402][35745] Avg episode reward: [(0, '0.304')] [2024-06-10 13:19:46,647][35978] Updated weights for policy 0, policy_version 37535 (0.0032) [2024-06-10 13:19:48,402][35745] Fps is (10 sec: 50789.3, 60 sec: 45055.8, 300 sec: 44931.0). Total num frames: 615071744. Throughput: 0: 44923.8. Samples: 213659520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-10 13:19:48,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:19:50,159][35978] Updated weights for policy 0, policy_version 37545 (0.0032) [2024-06-10 13:19:53,402][35745] Fps is (10 sec: 42598.6, 60 sec: 44782.9, 300 sec: 44931.1). Total num frames: 615251968. Throughput: 0: 45056.0. Samples: 213934000. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-10 13:19:53,402][35745] Avg episode reward: [(0, '0.316')] [2024-06-10 13:19:53,850][35978] Updated weights for policy 0, policy_version 37555 (0.0034) [2024-06-10 13:19:57,523][35978] Updated weights for policy 0, policy_version 37565 (0.0034) [2024-06-10 13:19:58,402][35745] Fps is (10 sec: 40960.6, 60 sec: 44784.7, 300 sec: 44764.8). Total num frames: 615481344. Throughput: 0: 44974.7. Samples: 214057640. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-10 13:19:58,402][35745] Avg episode reward: [(0, '0.309')] [2024-06-10 13:20:01,387][35978] Updated weights for policy 0, policy_version 37575 (0.0031) [2024-06-10 13:20:03,402][35745] Fps is (10 sec: 47513.6, 60 sec: 44509.8, 300 sec: 44875.8). Total num frames: 615727104. Throughput: 0: 45020.5. Samples: 214330780. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-10 13:20:03,402][35745] Avg episode reward: [(0, '0.312')] [2024-06-10 13:20:04,589][35978] Updated weights for policy 0, policy_version 37585 (0.0026) [2024-06-10 13:20:08,402][35745] Fps is (10 sec: 45875.3, 60 sec: 45329.1, 300 sec: 44931.1). Total num frames: 615940096. Throughput: 0: 45250.4. Samples: 214613640. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-10 13:20:08,402][35745] Avg episode reward: [(0, '0.315')] [2024-06-10 13:20:08,406][35978] Updated weights for policy 0, policy_version 37595 (0.0046) [2024-06-10 13:20:12,286][35978] Updated weights for policy 0, policy_version 37605 (0.0033) [2024-06-10 13:20:13,402][35745] Fps is (10 sec: 42598.4, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 616153088. Throughput: 0: 45159.9. Samples: 214735500. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-10 13:20:13,402][35745] Avg episode reward: [(0, '0.315')] [2024-06-10 13:20:15,698][35978] Updated weights for policy 0, policy_version 37615 (0.0035) [2024-06-10 13:20:18,402][35745] Fps is (10 sec: 44236.6, 60 sec: 44510.0, 300 sec: 44764.4). 
Total num frames: 616382464. Throughput: 0: 44625.9. Samples: 214994000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 13:20:18,402][35745] Avg episode reward: [(0, '0.314')] [2024-06-10 13:20:19,478][35978] Updated weights for policy 0, policy_version 37625 (0.0038) [2024-06-10 13:20:22,386][35957] Signal inference workers to stop experience collection... (3300 times) [2024-06-10 13:20:22,387][35957] Signal inference workers to resume experience collection... (3300 times) [2024-06-10 13:20:22,397][35978] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-06-10 13:20:22,398][35978] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-06-10 13:20:22,963][35978] Updated weights for policy 0, policy_version 37635 (0.0030) [2024-06-10 13:20:23,402][35745] Fps is (10 sec: 47513.9, 60 sec: 45329.2, 300 sec: 44986.6). Total num frames: 616628224. Throughput: 0: 44940.8. Samples: 215279500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 13:20:23,402][35745] Avg episode reward: [(0, '0.318')] [2024-06-10 13:20:26,620][35978] Updated weights for policy 0, policy_version 37645 (0.0039) [2024-06-10 13:20:28,402][35745] Fps is (10 sec: 45875.5, 60 sec: 45056.0, 300 sec: 44875.5). Total num frames: 616841216. Throughput: 0: 45151.7. Samples: 215413320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 13:20:28,402][35745] Avg episode reward: [(0, '0.311')] [2024-06-10 13:20:30,406][35978] Updated weights for policy 0, policy_version 37655 (0.0022) [2024-06-10 13:20:33,402][35745] Fps is (10 sec: 44236.4, 60 sec: 44511.6, 300 sec: 44875.5). Total num frames: 617070592. Throughput: 0: 44772.9. Samples: 215674300. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 13:20:33,402][35745] Avg episode reward: [(0, '0.313')] [2024-06-10 13:20:34,122][35978] Updated weights for policy 0, policy_version 37665 (0.0027) [2024-06-10 13:20:37,677][35978] Updated weights for policy 0, policy_version 37675 (0.0023) [2024-06-10 13:20:38,401][35745] Fps is (10 sec: 49152.1, 60 sec: 46148.3, 300 sec: 45042.1). Total num frames: 617332736. Throughput: 0: 44905.9. Samples: 215954760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-10 13:20:38,402][35745] Avg episode reward: [(0, '0.316')] [2024-06-10 13:20:41,393][35978] Updated weights for policy 0, policy_version 37685 (0.0026)