[2024-07-02 11:04:38,457][36761] Saving configuration to ./train_dir/sample_factory/p2.sf.1/config.json... [2024-07-02 11:04:38,489][36761] Rollout worker 0 uses device cpu [2024-07-02 11:04:38,489][36761] Rollout worker 1 uses device cpu [2024-07-02 11:04:38,489][36761] Rollout worker 2 uses device cpu [2024-07-02 11:04:38,489][36761] Rollout worker 3 uses device cpu [2024-07-02 11:04:38,489][36761] Rollout worker 4 uses device cpu [2024-07-02 11:04:38,490][36761] Rollout worker 5 uses device cpu [2024-07-02 11:04:38,490][36761] Rollout worker 6 uses device cpu [2024-07-02 11:04:38,490][36761] Rollout worker 7 uses device cpu [2024-07-02 11:04:38,490][36761] Rollout worker 8 uses device cpu [2024-07-02 11:04:38,490][36761] Rollout worker 9 uses device cpu [2024-07-02 11:04:38,490][36761] Rollout worker 10 uses device cpu [2024-07-02 11:04:38,490][36761] Rollout worker 11 uses device cpu [2024-07-02 11:04:38,490][36761] Rollout worker 12 uses device cpu [2024-07-02 11:04:38,490][36761] Rollout worker 13 uses device cpu [2024-07-02 11:04:38,490][36761] Rollout worker 14 uses device cpu [2024-07-02 11:04:38,490][36761] Rollout worker 15 uses device cpu [2024-07-02 11:04:38,490][36761] Rollout worker 16 uses device cpu [2024-07-02 11:04:38,491][36761] Rollout worker 17 uses device cpu [2024-07-02 11:04:38,491][36761] Rollout worker 18 uses device cpu [2024-07-02 11:04:38,491][36761] Rollout worker 19 uses device cpu [2024-07-02 11:04:38,491][36761] Rollout worker 20 uses device cpu [2024-07-02 11:04:38,491][36761] Rollout worker 21 uses device cpu [2024-07-02 11:04:38,491][36761] Rollout worker 22 uses device cpu [2024-07-02 11:04:38,491][36761] Rollout worker 23 uses device cpu [2024-07-02 11:04:38,491][36761] Rollout worker 24 uses device cpu [2024-07-02 11:04:38,491][36761] Rollout worker 25 uses device cpu [2024-07-02 11:04:38,492][36761] Rollout worker 26 uses device cpu [2024-07-02 11:04:38,492][36761] Rollout worker 27 uses device cpu [2024-07-02 11:04:38,492][36761] Rollout worker 28 uses device cpu [2024-07-02 11:04:38,492][36761] Rollout worker 29 uses device cpu [2024-07-02 11:04:38,492][36761] Rollout worker 30 uses device cpu [2024-07-02 11:04:38,492][36761] Rollout worker 31 uses device cpu [2024-07-02 11:04:39,077][36761] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-07-02 11:04:39,077][36761] InferenceWorker_p0-w0: min num requests: 10 [2024-07-02 11:04:39,121][36761] Starting all processes... [2024-07-02 11:04:39,121][36761] Starting process learner_proc0 [2024-07-02 11:04:39,403][36761] Starting all processes... [2024-07-02 11:04:39,406][36761] Starting process inference_proc0-0 [2024-07-02 11:04:39,406][36761] Starting process rollout_proc0 [2024-07-02 11:04:39,407][36761] Starting process rollout_proc1 [2024-07-02 11:04:39,408][36761] Starting process rollout_proc2 [2024-07-02 11:04:39,408][36761] Starting process rollout_proc3 [2024-07-02 11:04:39,408][36761] Starting process rollout_proc4 [2024-07-02 11:04:39,409][36761] Starting process rollout_proc5 [2024-07-02 11:04:39,410][36761] Starting process rollout_proc6 [2024-07-02 11:04:39,410][36761] Starting process rollout_proc7 [2024-07-02 11:04:39,411][36761] Starting process rollout_proc8 [2024-07-02 11:04:39,411][36761] Starting process rollout_proc9 [2024-07-02 11:04:39,411][36761] Starting process rollout_proc10 [2024-07-02 11:04:39,412][36761] Starting process rollout_proc11 [2024-07-02 11:04:39,413][36761] Starting process rollout_proc12 [2024-07-02 11:04:39,415][36761] Starting process rollout_proc13 [2024-07-02 11:04:39,415][36761] Starting process rollout_proc14 [2024-07-02 11:04:39,416][36761] Starting process rollout_proc15 [2024-07-02 11:04:39,416][36761] Starting process rollout_proc16 [2024-07-02 11:04:39,417][36761] Starting process rollout_proc17 [2024-07-02 11:04:39,417][36761] Starting process rollout_proc18 [2024-07-02 11:04:39,417][36761] Starting process rollout_proc19 [2024-07-02 11:04:39,417][36761] Starting process rollout_proc20 [2024-07-02 11:04:39,420][36761] Starting process rollout_proc21 [2024-07-02 11:04:39,420][36761] Starting process rollout_proc22 [2024-07-02 11:04:39,422][36761] Starting process rollout_proc23 [2024-07-02 11:04:39,423][36761] Starting process rollout_proc24 [2024-07-02 11:04:39,425][36761] Starting process rollout_proc25 [2024-07-02 11:04:39,427][36761] Starting process rollout_proc26 [2024-07-02 11:04:39,428][36761] Starting process rollout_proc27 [2024-07-02 11:04:39,429][36761] Starting process rollout_proc28 [2024-07-02 11:04:39,432][36761] Starting process rollout_proc29 [2024-07-02 11:04:39,433][36761] Starting process rollout_proc30 [2024-07-02 11:04:39,433][36761] Starting process rollout_proc31 [2024-07-02 11:04:41,468][37009] Worker 9 uses CPU cores [9] [2024-07-02 11:04:41,552][37008] Worker 7 uses CPU cores [7] [2024-07-02 11:04:41,552][37022] Worker 22 uses CPU cores [22] [2024-07-02 11:04:41,588][36999] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-07-02 11:04:41,588][36999] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-07-02 11:04:41,591][37026] Worker 28 uses CPU cores [28] [2024-07-02 11:04:41,592][37012] Worker 11 uses CPU cores [11] [2024-07-02 11:04:41,596][37025] Worker 25 uses CPU cores [25] [2024-07-02 11:04:41,596][37029] Worker 31 uses CPU cores [31] [2024-07-02 11:04:41,597][36999] Num visible devices: 1 [2024-07-02 11:04:41,612][37010] Worker 10 uses CPU cores [10] [2024-07-02 11:04:41,648][37023] Worker 23 uses CPU cores [23] [2024-07-02 11:04:41,655][37006] Worker 5 uses CPU cores [5] [2024-07-02 11:04:41,656][37024] Worker 24 uses CPU cores [24] [2024-07-02 11:04:41,671][37030] Worker 29 uses CPU cores [29] [2024-07-02 11:04:41,752][37013] Worker 15 uses CPU cores [15] [2024-07-02 11:04:41,785][37007] Worker 8 uses CPU cores [8] [2024-07-02 11:04:41,787][37000] Worker 0 uses CPU cores [0] [2024-07-02 11:04:41,787][37017] Worker 17 uses CPU cores [17] [2024-07-02 11:04:41,799][37031] Worker 30 uses CPU cores [30] [2024-07-02 11:04:41,805][37011] Worker 13 uses CPU cores [13] [2024-07-02 11:04:41,808][37016] Worker 18 uses CPU cores [18] [2024-07-02 11:04:41,812][37002] Worker 2 uses CPU cores [2] [2024-07-02 11:04:41,813][36979] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-07-02 11:04:41,813][36979] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-07-02 11:04:41,815][37005] Worker 6 uses CPU cores [6] [2024-07-02 11:04:41,817][37001] Worker 1 uses CPU cores [1] [2024-07-02 11:04:41,822][36979] Num visible devices: 1 [2024-07-02 11:04:41,848][36979] Setting fixed seed 0 [2024-07-02 11:04:41,849][37018] Worker 16 uses CPU cores [16] [2024-07-02 11:04:41,856][37027] Worker 27 uses CPU cores [27] [2024-07-02 11:04:41,861][36979] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-07-02 11:04:41,861][36979] Initializing actor-critic model on device cuda:0 [2024-07-02 11:04:41,880][37021] Worker 21 uses CPU cores [21] [2024-07-02 11:04:41,908][37004] Worker 4 uses CPU cores [4] [2024-07-02 11:04:41,911][37028] Worker 26 uses CPU cores [26] [2024-07-02 11:04:41,913][37014] Worker 12 uses CPU cores [12] [2024-07-02 11:04:41,924][37020] Worker 20 uses CPU cores [20] [2024-07-02 11:04:41,964][37019] Worker 19 uses CPU cores [19] [2024-07-02 11:04:42,007][37015] Worker 14 uses CPU cores [14] [2024-07-02 11:04:42,124][37003] Worker 3 uses CPU cores [3] [2024-07-02 11:04:42,607][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,607][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,607][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,607][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,607][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,607][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,607][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,607][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,607][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,607][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,607][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,607][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,607][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,608][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,608][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,608][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,608][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,608][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,608][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,608][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,608][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,608][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,608][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,611][36979] RunningMeanStd input shape: (1,) [2024-07-02 11:04:42,611][36979] RunningMeanStd input shape: (1,) [2024-07-02 11:04:42,611][36979] RunningMeanStd input shape: (1,) [2024-07-02 11:04:42,612][36979] RunningMeanStd input shape: (1,) [2024-07-02 11:04:42,612][36979] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:42,647][36979] RunningMeanStd input shape: (1,) [2024-07-02 11:04:42,655][36979] Created Actor Critic model with architecture: [2024-07-02 11:04:42,655][36979] SampleFactoryAgentWrapper( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (agent): MettaAgent( (_encoder): MultiFeatureSetEncoder( (feature_set_encoders): ModuleDict( (grid_obs): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (clock): RunningMeanStdInPlace() (converter): RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:hp): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv_r1): RunningMeanStdInPlace() (agent:inv_r2): RunningMeanStdInPlace() (agent:inv_r3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (altar:hp): RunningMeanStdInPlace() (altar:state): RunningMeanStdInPlace() (converter:hp): RunningMeanStdInPlace() (converter:state): RunningMeanStdInPlace() (generator:amount): RunningMeanStdInPlace() (generator:hp): RunningMeanStdInPlace() (generator:state): RunningMeanStdInPlace() (wall:hp): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (global_vars): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (_steps): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_action): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_reward): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_reward): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (kinship): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (kinship): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) ) (merged_encoder): Sequential( (0): Linear(in_features=544, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (_decoder): Decoder( (mlp): Identity() ) (_critic_linear): Linear(in_features=512, out_features=1, bias=True) ) (_core): ModelCoreRNN( (core): GRU(512, 512) ) (_action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) [2024-07-02 11:04:42,719][36979] Using optimizer [2024-07-02 11:04:42,899][36979] No checkpoints found [2024-07-02 11:04:42,899][36979] Did not load from checkpoint, starting from scratch! [2024-07-02 11:04:42,899][36979] Initialized policy 0 weights for model version 0 [2024-07-02 11:04:42,901][36979] LearnerWorker_p0 finished initialization! [2024-07-02 11:04:42,901][36979] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-07-02 11:04:43,621][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,621][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,621][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,622][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,623][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,626][36999] RunningMeanStd input shape: (1,) [2024-07-02 11:04:43,626][36999] RunningMeanStd input shape: (1,) [2024-07-02 11:04:43,626][36999] RunningMeanStd input shape: (1,) [2024-07-02 11:04:43,626][36999] RunningMeanStd input shape: (1,) [2024-07-02 11:04:43,626][36999] RunningMeanStd input shape: (11, 11) [2024-07-02 11:04:43,663][36999] RunningMeanStd input shape: (1,) [2024-07-02 11:04:43,688][36761] Inference worker 0-0 is ready! [2024-07-02 11:04:43,688][36761] All inference workers are ready! Signal rollout workers to start! [2024-07-02 11:04:46,095][36761] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-07-02 11:04:46,447][37025] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,449][37022] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,450][37029] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,459][37018] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,459][37030] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,462][37016] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,464][37023] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,466][37020] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,467][37024] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,474][37019] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,475][37021] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,482][37026] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,483][37031] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,495][37008] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,497][37001] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,497][37011] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,503][37012] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,504][37003] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,505][37000] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,506][37009] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,506][37017] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,506][37028] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,506][37006] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,507][37013] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,509][37015] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,509][37007] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,510][37002] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,511][37010] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,511][37014] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,512][37004] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,519][37005] Decorrelating experience for 0 frames... [2024-07-02 11:04:46,541][37027] Decorrelating experience for 0 frames... [2024-07-02 11:04:47,604][37025] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,617][37018] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,618][37029] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,619][37022] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,628][37016] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,633][37020] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,633][37024] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,638][37030] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,641][37023] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,644][37019] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,653][37021] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,667][37031] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,675][37026] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,679][37008] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,691][37001] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,691][37011] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,693][37017] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,699][37012] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,699][37003] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,699][37028] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,704][37009] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,710][37013] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,711][37006] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,712][37000] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,715][37002] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,719][37015] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,720][37010] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,722][37007] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,724][37014] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,727][37004] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,731][37005] Decorrelating experience for 256 frames... [2024-07-02 11:04:47,767][37027] Decorrelating experience for 256 frames... [2024-07-02 11:04:51,095][36761] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 6576.1. Samples: 32880. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-07-02 11:04:54,754][37022] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-07-02 11:04:54,892][37002] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-07-02 11:04:55,488][37006] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-07-02 11:04:55,509][37003] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-07-02 11:04:55,524][37008] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-07-02 11:04:55,548][37024] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-07-02 11:04:55,559][37009] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-07-02 11:04:55,560][37025] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-07-02 11:04:55,560][37030] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-07-02 11:04:55,570][37021] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-07-02 11:04:55,570][37029] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-07-02 11:04:55,570][37023] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-07-02 11:04:55,586][37010] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-07-02 11:04:55,586][37026] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-07-02 11:04:55,587][37016] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-07-02 11:04:55,642][36979] Signal inference workers to stop experience collection... [2024-07-02 11:04:55,665][36999] InferenceWorker_p0-w0: stopping experience collection [2024-07-02 11:04:56,095][36761] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 32152.5. Samples: 321520. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-07-02 11:04:56,096][36761] Avg episode reward: [(0, '0.000')] [2024-07-02 11:04:56,317][36979] Signal inference workers to resume experience collection... [2024-07-02 11:04:56,317][36999] InferenceWorker_p0-w0: resuming experience collection [2024-07-02 11:04:56,362][37027] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-07-02 11:04:56,607][37012] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-07-02 11:04:56,689][37013] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-07-02 11:04:56,689][37014] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-07-02 11:04:56,699][37007] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-07-02 11:04:56,701][37011] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-07-02 11:04:56,726][37015] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-07-02 11:04:56,816][37001] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-07-02 11:04:56,843][37005] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-07-02 11:04:56,869][37004] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-07-02 11:04:56,886][37020] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-07-02 11:04:56,891][37019] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-07-02 11:04:56,937][37028] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-07-02 11:04:56,942][37017] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-07-02 11:04:56,942][37031] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-07-02 11:04:56,987][37018] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-07-02 11:04:57,507][36999] Updated weights for policy 0, policy_version 10 (0.0017) [2024-07-02 11:04:59,073][36761] Heartbeat connected on Batcher_0 [2024-07-02 11:04:59,075][36761] Heartbeat connected on LearnerWorker_p0 [2024-07-02 11:04:59,080][36761] Heartbeat connected on RolloutWorker_w0 [2024-07-02 11:04:59,136][36761] Heartbeat connected on InferenceWorker_p0-w0 [2024-07-02 11:05:01,095][36761] Fps is (10 sec: 16383.9, 60 sec: 10922.7, 300 sec: 10922.7). Total num frames: 163840. Throughput: 0: 21965.3. Samples: 329480. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-07-02 11:05:01,096][36761] Avg episode reward: [(0, '0.000')] [2024-07-02 11:05:01,097][36979] Saving new best policy, reward=0.000! [2024-07-02 11:05:01,527][37001] Worker 1 awakens! [2024-07-02 11:05:01,533][36761] Heartbeat connected on RolloutWorker_w1 [2024-07-02 11:05:04,314][37002] Worker 2 awakens! [2024-07-02 11:05:04,319][36761] Heartbeat connected on RolloutWorker_w2 [2024-07-02 11:05:06,095][36761] Fps is (10 sec: 16383.8, 60 sec: 8192.0, 300 sec: 8192.0). Total num frames: 163840. Throughput: 0: 17052.0. Samples: 341040. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-07-02 11:05:06,096][36761] Avg episode reward: [(0, '0.000')] [2024-07-02 11:05:09,571][37003] Worker 3 awakens! [2024-07-02 11:05:09,584][36761] Heartbeat connected on RolloutWorker_w3 [2024-07-02 11:05:11,095][36761] Fps is (10 sec: 3276.8, 60 sec: 7864.3, 300 sec: 7864.3). Total num frames: 196608. Throughput: 0: 14508.7. Samples: 362720. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-07-02 11:05:11,096][36761] Avg episode reward: [(0, '0.001')] [2024-07-02 11:05:11,105][36979] Saving new best policy, reward=0.001! [2024-07-02 11:05:15,695][37004] Worker 4 awakens! [2024-07-02 11:05:15,703][36761] Heartbeat connected on RolloutWorker_w4 [2024-07-02 11:05:16,095][36761] Fps is (10 sec: 6553.6, 60 sec: 7645.9, 300 sec: 7645.9). Total num frames: 229376. Throughput: 0: 12552.7. Samples: 376580. Policy #0 lag: (min: 0.0, avg: 4.3, max: 13.0) [2024-07-02 11:05:16,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:05:18,998][37006] Worker 5 awakens! [2024-07-02 11:05:19,005][36761] Heartbeat connected on RolloutWorker_w5 [2024-07-02 11:05:21,095][36761] Fps is (10 sec: 8192.1, 60 sec: 7958.0, 300 sec: 7958.0). Total num frames: 278528. Throughput: 0: 12462.3. Samples: 436180. Policy #0 lag: (min: 0.0, avg: 2.2, max: 15.0) [2024-07-02 11:05:21,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:05:23,863][36999] Updated weights for policy 0, policy_version 20 (0.0014) [2024-07-02 11:05:25,068][37005] Worker 6 awakens! [2024-07-02 11:05:25,075][36761] Heartbeat connected on RolloutWorker_w6 [2024-07-02 11:05:26,095][36761] Fps is (10 sec: 11468.8, 60 sec: 8601.6, 300 sec: 8601.6). Total num frames: 344064. Throughput: 0: 13008.5. Samples: 520340. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2024-07-02 11:05:26,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:05:28,436][37008] Worker 7 awakens! [2024-07-02 11:05:28,443][36761] Heartbeat connected on RolloutWorker_w7 [2024-07-02 11:05:31,095][36761] Fps is (10 sec: 16383.9, 60 sec: 9830.4, 300 sec: 9830.4). Total num frames: 442368. Throughput: 0: 12700.0. Samples: 571500. Policy #0 lag: (min: 0.0, avg: 2.5, max: 5.0) [2024-07-02 11:05:31,096][36761] Avg episode reward: [(0, '0.001')] [2024-07-02 11:05:32,854][36999] Updated weights for policy 0, policy_version 30 (0.0013) [2024-07-02 11:05:34,290][37007] Worker 8 awakens! [2024-07-02 11:05:34,296][36761] Heartbeat connected on RolloutWorker_w8 [2024-07-02 11:05:36,095][36761] Fps is (10 sec: 19660.6, 60 sec: 10813.4, 300 sec: 10813.4). Total num frames: 540672. Throughput: 0: 14565.3. Samples: 688320. Policy #0 lag: (min: 0.0, avg: 2.9, max: 6.0) [2024-07-02 11:05:36,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:05:37,847][37009] Worker 9 awakens! [2024-07-02 11:05:37,853][36761] Heartbeat connected on RolloutWorker_w9 [2024-07-02 11:05:41,095][36761] Fps is (10 sec: 19660.9, 60 sec: 11617.8, 300 sec: 11617.8). Total num frames: 638976. Throughput: 0: 11025.7. Samples: 817680. Policy #0 lag: (min: 0.0, avg: 3.3, max: 7.0) [2024-07-02 11:05:41,096][36761] Avg episode reward: [(0, '0.001')] [2024-07-02 11:05:41,297][36999] Updated weights for policy 0, policy_version 40 (0.0012) [2024-07-02 11:05:42,562][37010] Worker 10 awakens! [2024-07-02 11:05:42,576][36761] Heartbeat connected on RolloutWorker_w10 [2024-07-02 11:05:46,095][36761] Fps is (10 sec: 22937.5, 60 sec: 12834.1, 300 sec: 12834.1). Total num frames: 770048. Throughput: 0: 12393.3. Samples: 887180. Policy #0 lag: (min: 0.0, avg: 3.1, max: 8.0) [2024-07-02 11:05:46,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:05:46,963][36999] Updated weights for policy 0, policy_version 50 (0.0014) [2024-07-02 11:05:48,268][37012] Worker 11 awakens! [2024-07-02 11:05:48,277][36761] Heartbeat connected on RolloutWorker_w11 [2024-07-02 11:05:51,095][36761] Fps is (10 sec: 29491.1, 60 sec: 15564.8, 300 sec: 14367.5). Total num frames: 933888. Throughput: 0: 15777.3. Samples: 1051020. Policy #0 lag: (min: 0.0, avg: 15.4, max: 51.0) [2024-07-02 11:05:51,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:05:51,096][36979] Saving new best policy, reward=0.003! [2024-07-02 11:05:53,037][37014] Worker 12 awakens! [2024-07-02 11:05:53,045][36761] Heartbeat connected on RolloutWorker_w12 [2024-07-02 11:05:53,632][36999] Updated weights for policy 0, policy_version 60 (0.0014) [2024-07-02 11:05:56,095][36761] Fps is (10 sec: 29491.2, 60 sec: 17749.3, 300 sec: 15213.7). Total num frames: 1064960. Throughput: 0: 19272.5. Samples: 1229980. Policy #0 lag: (min: 0.0, avg: 17.7, max: 58.0) [2024-07-02 11:05:56,096][36761] Avg episode reward: [(0, '0.000')] [2024-07-02 11:05:57,739][37011] Worker 13 awakens! [2024-07-02 11:05:57,747][36761] Heartbeat connected on RolloutWorker_w13 [2024-07-02 11:05:58,829][36999] Updated weights for policy 0, policy_version 70 (0.0014) [2024-07-02 11:06:01,096][36761] Fps is (10 sec: 29490.9, 60 sec: 17749.3, 300 sec: 16384.0). Total num frames: 1228800. Throughput: 0: 20963.0. Samples: 1319920. Policy #0 lag: (min: 0.0, avg: 4.6, max: 10.0) [2024-07-02 11:06:01,096][36761] Avg episode reward: [(0, '0.001')] [2024-07-02 11:06:02,448][37015] Worker 14 awakens! [2024-07-02 11:06:02,454][36761] Heartbeat connected on RolloutWorker_w14 [2024-07-02 11:06:03,789][36999] Updated weights for policy 0, policy_version 80 (0.0031) [2024-07-02 11:06:06,095][36761] Fps is (10 sec: 31129.6, 60 sec: 20206.9, 300 sec: 17203.2). Total num frames: 1376256. Throughput: 0: 23806.2. Samples: 1507460. Policy #0 lag: (min: 0.0, avg: 6.7, max: 10.0) [2024-07-02 11:06:06,096][36761] Avg episode reward: [(0, '0.001')] [2024-07-02 11:06:07,104][37013] Worker 15 awakens! [2024-07-02 11:06:07,111][36761] Heartbeat connected on RolloutWorker_w15 [2024-07-02 11:06:08,829][36999] Updated weights for policy 0, policy_version 90 (0.0030) [2024-07-02 11:06:11,095][36761] Fps is (10 sec: 29491.7, 60 sec: 22118.4, 300 sec: 17926.0). Total num frames: 1523712. Throughput: 0: 26141.3. Samples: 1696700. Policy #0 lag: (min: 0.0, avg: 6.9, max: 10.0) [2024-07-02 11:06:11,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:06:12,084][37018] Worker 16 awakens! [2024-07-02 11:06:12,094][36761] Heartbeat connected on RolloutWorker_w16 [2024-07-02 11:06:14,141][36999] Updated weights for policy 0, policy_version 100 (0.0035) [2024-07-02 11:06:16,095][36761] Fps is (10 sec: 31129.6, 60 sec: 24302.9, 300 sec: 18750.6). Total num frames: 1687552. Throughput: 0: 27048.9. Samples: 1788700. Policy #0 lag: (min: 0.0, avg: 7.5, max: 13.0) [2024-07-02 11:06:16,096][36761] Avg episode reward: [(0, '0.001')] [2024-07-02 11:06:16,732][37017] Worker 17 awakens! [2024-07-02 11:06:16,743][36761] Heartbeat connected on RolloutWorker_w17 [2024-07-02 11:06:19,525][36999] Updated weights for policy 0, policy_version 110 (0.0027) [2024-07-02 11:06:19,998][37016] Worker 18 awakens! [2024-07-02 11:06:20,007][36761] Heartbeat connected on RolloutWorker_w18 [2024-07-02 11:06:21,095][36761] Fps is (10 sec: 34406.0, 60 sec: 26487.4, 300 sec: 19660.8). Total num frames: 1867776. Throughput: 0: 28695.9. Samples: 1979640. Policy #0 lag: (min: 1.0, avg: 7.4, max: 12.0) [2024-07-02 11:06:21,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:06:24,401][36999] Updated weights for policy 0, policy_version 120 (0.0034) [2024-07-02 11:06:26,053][37019] Worker 19 awakens! [2024-07-02 11:06:26,064][36761] Heartbeat connected on RolloutWorker_w19 [2024-07-02 11:06:26,095][36761] Fps is (10 sec: 36045.0, 60 sec: 28398.9, 300 sec: 20480.0). Total num frames: 2048000. Throughput: 0: 30265.3. Samples: 2179620. Policy #0 lag: (min: 0.0, avg: 5.3, max: 12.0) [2024-07-02 11:06:26,096][36761] Avg episode reward: [(0, '0.000')] [2024-07-02 11:06:29,256][36999] Updated weights for policy 0, policy_version 130 (0.0026) [2024-07-02 11:06:30,736][37020] Worker 20 awakens! [2024-07-02 11:06:30,747][36761] Heartbeat connected on RolloutWorker_w20 [2024-07-02 11:06:31,095][36761] Fps is (10 sec: 32768.4, 60 sec: 29218.2, 300 sec: 20909.1). Total num frames: 2195456. Throughput: 0: 31274.7. Samples: 2294540. Policy #0 lag: (min: 0.0, avg: 6.0, max: 13.0) [2024-07-02 11:06:31,096][36761] Avg episode reward: [(0, '0.001')] [2024-07-02 11:06:34,108][37021] Worker 21 awakens! [2024-07-02 11:06:34,121][36761] Heartbeat connected on RolloutWorker_w21 [2024-07-02 11:06:34,271][36999] Updated weights for policy 0, policy_version 140 (0.0025) [2024-07-02 11:06:36,096][36761] Fps is (10 sec: 31126.5, 60 sec: 30309.9, 300 sec: 21448.0). Total num frames: 2359296. Throughput: 0: 32265.1. Samples: 2502980. Policy #0 lag: (min: 0.0, avg: 6.7, max: 15.0) [2024-07-02 11:06:36,097][36761] Avg episode reward: [(0, '0.001')] [2024-07-02 11:06:36,115][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000000145_2375680.pth... [2024-07-02 11:06:37,753][36999] Updated weights for policy 0, policy_version 150 (0.0023) [2024-07-02 11:06:37,980][37022] Worker 22 awakens! [2024-07-02 11:06:37,991][36761] Heartbeat connected on RolloutWorker_w22 [2024-07-02 11:06:41,095][36761] Fps is (10 sec: 32767.9, 60 sec: 31402.7, 300 sec: 21940.3). Total num frames: 2523136. Throughput: 0: 33197.4. Samples: 2723860. Policy #0 lag: (min: 0.0, avg: 36.2, max: 152.0) [2024-07-02 11:06:41,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:06:43,109][36999] Updated weights for policy 0, policy_version 160 (0.0033) [2024-07-02 11:06:43,424][37023] Worker 23 awakens! [2024-07-02 11:06:43,434][36761] Heartbeat connected on RolloutWorker_w23 [2024-07-02 11:06:46,095][36761] Fps is (10 sec: 39325.7, 60 sec: 33041.1, 300 sec: 22937.6). Total num frames: 2752512. Throughput: 0: 33425.0. Samples: 2824040. Policy #0 lag: (min: 1.0, avg: 9.7, max: 18.0) [2024-07-02 11:06:46,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:06:47,506][36999] Updated weights for policy 0, policy_version 170 (0.0031) [2024-07-02 11:06:48,148][37024] Worker 24 awakens! [2024-07-02 11:06:48,160][36761] Heartbeat connected on RolloutWorker_w24 [2024-07-02 11:06:50,876][36999] Updated weights for policy 0, policy_version 180 (0.0040) [2024-07-02 11:06:51,095][36761] Fps is (10 sec: 44236.2, 60 sec: 33860.2, 300 sec: 23724.0). Total num frames: 2965504. Throughput: 0: 34413.3. Samples: 3056060. Policy #0 lag: (min: 0.0, avg: 5.8, max: 16.0) [2024-07-02 11:06:51,096][36761] Avg episode reward: [(0, '0.001')] [2024-07-02 11:06:52,748][37025] Worker 25 awakens! [2024-07-02 11:06:52,760][36761] Heartbeat connected on RolloutWorker_w25 [2024-07-02 11:06:56,095][36761] Fps is (10 sec: 34406.2, 60 sec: 33860.3, 300 sec: 23819.8). Total num frames: 3096576. Throughput: 0: 35304.4. Samples: 3285400. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-07-02 11:06:56,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:06:56,274][36999] Updated weights for policy 0, policy_version 190 (0.0026) [2024-07-02 11:06:58,912][37028] Worker 26 awakens! [2024-07-02 11:06:58,926][36761] Heartbeat connected on RolloutWorker_w26 [2024-07-02 11:07:00,254][36999] Updated weights for policy 0, policy_version 200 (0.0025) [2024-07-02 11:07:01,095][36761] Fps is (10 sec: 32768.2, 60 sec: 34406.4, 300 sec: 24393.9). Total num frames: 3293184. Throughput: 0: 35654.7. Samples: 3393160. Policy #0 lag: (min: 0.0, avg: 20.5, max: 199.0) [2024-07-02 11:07:01,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:07:02,948][37027] Worker 27 awakens! [2024-07-02 11:07:02,961][36761] Heartbeat connected on RolloutWorker_w27 [2024-07-02 11:07:04,063][36999] Updated weights for policy 0, policy_version 210 (0.0038) [2024-07-02 11:07:06,096][36761] Fps is (10 sec: 42597.7, 60 sec: 35771.7, 300 sec: 25161.1). Total num frames: 3522560. Throughput: 0: 36625.7. Samples: 3627800. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-07-02 11:07:06,096][36761] Avg episode reward: [(0, '0.001')] [2024-07-02 11:07:06,871][37026] Worker 28 awakens! [2024-07-02 11:07:06,886][36761] Heartbeat connected on RolloutWorker_w28 [2024-07-02 11:07:08,695][36999] Updated weights for policy 0, policy_version 220 (0.0025) [2024-07-02 11:07:11,095][36761] Fps is (10 sec: 37683.3, 60 sec: 35771.7, 300 sec: 25310.4). Total num frames: 3670016. Throughput: 0: 37522.2. Samples: 3868120. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-07-02 11:07:11,096][36761] Avg episode reward: [(0, '0.001')] [2024-07-02 11:07:11,596][37030] Worker 29 awakens! [2024-07-02 11:07:11,611][36761] Heartbeat connected on RolloutWorker_w29 [2024-07-02 11:07:11,920][36999] Updated weights for policy 0, policy_version 230 (0.0027) [2024-07-02 11:07:16,096][36761] Fps is (10 sec: 36044.8, 60 sec: 36590.9, 300 sec: 25886.7). Total num frames: 3883008. Throughput: 0: 37484.3. Samples: 3981340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-07-02 11:07:16,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:07:16,996][36999] Updated weights for policy 0, policy_version 240 (0.0034) [2024-07-02 11:07:17,666][37031] Worker 30 awakens! [2024-07-02 11:07:17,679][36761] Heartbeat connected on RolloutWorker_w30 [2024-07-02 11:07:20,034][36999] Updated weights for policy 0, policy_version 250 (0.0034) [2024-07-02 11:07:20,980][37029] Worker 31 awakens! [2024-07-02 11:07:20,996][36761] Heartbeat connected on RolloutWorker_w31 [2024-07-02 11:07:21,095][36761] Fps is (10 sec: 45875.7, 60 sec: 37683.3, 300 sec: 26637.2). Total num frames: 4128768. Throughput: 0: 38279.6. Samples: 4225520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-07-02 11:07:21,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:07:25,062][36999] Updated weights for policy 0, policy_version 260 (0.0043) [2024-07-02 11:07:25,611][36979] Signal inference workers to stop experience collection... (50 times) [2024-07-02 11:07:25,654][36999] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-07-02 11:07:25,659][36979] Signal inference workers to resume experience collection... (50 times) [2024-07-02 11:07:25,666][36999] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-07-02 11:07:26,095][36761] Fps is (10 sec: 42599.6, 60 sec: 37683.3, 300 sec: 26931.2). Total num frames: 4308992. Throughput: 0: 38981.4. Samples: 4478020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-07-02 11:07:26,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:07:26,109][36979] Saving new best policy, reward=0.006! [2024-07-02 11:07:28,365][36999] Updated weights for policy 0, policy_version 270 (0.0039) [2024-07-02 11:07:31,095][36761] Fps is (10 sec: 39321.1, 60 sec: 38775.4, 300 sec: 27406.0). Total num frames: 4521984. Throughput: 0: 39314.6. Samples: 4593200. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-07-02 11:07:31,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:07:32,923][36999] Updated weights for policy 0, policy_version 280 (0.0028) [2024-07-02 11:07:36,095][36761] Fps is (10 sec: 42598.1, 60 sec: 39595.4, 300 sec: 27852.8). Total num frames: 4734976. Throughput: 0: 39722.8. Samples: 4843580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 11:07:36,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:07:36,122][36999] Updated weights for policy 0, policy_version 290 (0.0039) [2024-07-02 11:07:40,912][36999] Updated weights for policy 0, policy_version 300 (0.0034) [2024-07-02 11:07:41,095][36761] Fps is (10 sec: 39322.3, 60 sec: 39867.8, 300 sec: 28086.9). Total num frames: 4915200. Throughput: 0: 40138.7. Samples: 5091640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-07-02 11:07:41,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:07:44,066][36999] Updated weights for policy 0, policy_version 310 (0.0037) [2024-07-02 11:07:46,095][36761] Fps is (10 sec: 39321.2, 60 sec: 39594.6, 300 sec: 28489.9). Total num frames: 5128192. Throughput: 0: 40362.2. Samples: 5209460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 11:07:46,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:07:48,882][36999] Updated weights for policy 0, policy_version 320 (0.0033) [2024-07-02 11:07:51,095][36761] Fps is (10 sec: 42597.8, 60 sec: 39594.7, 300 sec: 28871.3). Total num frames: 5341184. Throughput: 0: 40845.4. Samples: 5465840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 11:07:51,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:07:52,173][36999] Updated weights for policy 0, policy_version 330 (0.0041) [2024-07-02 11:07:56,095][36761] Fps is (10 sec: 40960.7, 60 sec: 40687.0, 300 sec: 29146.3). Total num frames: 5537792. Throughput: 0: 40886.4. Samples: 5708000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-07-02 11:07:56,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:07:56,613][36999] Updated weights for policy 0, policy_version 340 (0.0045) [2024-07-02 11:07:59,725][36999] Updated weights for policy 0, policy_version 350 (0.0040) [2024-07-02 11:08:01,095][36761] Fps is (10 sec: 44237.6, 60 sec: 41506.3, 300 sec: 29659.3). Total num frames: 5783552. Throughput: 0: 41206.5. Samples: 5835620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-07-02 11:08:01,096][36761] Avg episode reward: [(0, '0.001')] [2024-07-02 11:08:04,695][36999] Updated weights for policy 0, policy_version 360 (0.0034) [2024-07-02 11:08:06,095][36761] Fps is (10 sec: 42598.2, 60 sec: 40687.1, 300 sec: 29818.9). Total num frames: 5963776. Throughput: 0: 41395.1. Samples: 6088300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 11:08:06,095][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:08:07,999][36999] Updated weights for policy 0, policy_version 370 (0.0044) [2024-07-02 11:08:11,095][36761] Fps is (10 sec: 39321.0, 60 sec: 41779.2, 300 sec: 30130.6). Total num frames: 6176768. Throughput: 0: 41099.9. Samples: 6327520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 11:08:11,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:08:12,548][36999] Updated weights for policy 0, policy_version 380 (0.0034) [2024-07-02 11:08:16,073][36999] Updated weights for policy 0, policy_version 390 (0.0042) [2024-07-02 11:08:16,095][36761] Fps is (10 sec: 42598.2, 60 sec: 41779.3, 300 sec: 30427.4). Total num frames: 6389760. Throughput: 0: 41380.1. Samples: 6455300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-07-02 11:08:16,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:08:20,678][36999] Updated weights for policy 0, policy_version 400 (0.0031) [2024-07-02 11:08:21,095][36761] Fps is (10 sec: 37682.9, 60 sec: 40413.7, 300 sec: 30481.8). Total num frames: 6553600. Throughput: 0: 41310.5. Samples: 6702560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 11:08:21,101][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:08:24,233][36999] Updated weights for policy 0, policy_version 410 (0.0035) [2024-07-02 11:08:26,095][36761] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 30980.7). Total num frames: 6815744. Throughput: 0: 41092.4. Samples: 6940800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 11:08:26,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:08:28,734][36999] Updated weights for policy 0, policy_version 420 (0.0044) [2024-07-02 11:08:31,096][36761] Fps is (10 sec: 45875.2, 60 sec: 41506.1, 300 sec: 31166.0). Total num frames: 7012352. Throughput: 0: 41295.0. Samples: 7067740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-07-02 11:08:31,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:08:31,798][36999] Updated weights for policy 0, policy_version 430 (0.0034) [2024-07-02 11:08:36,095][36761] Fps is (10 sec: 34406.5, 60 sec: 40413.9, 300 sec: 31129.6). Total num frames: 7159808. Throughput: 0: 41027.3. Samples: 7312060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 11:08:36,095][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:08:36,130][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000000438_7176192.pth... [2024-07-02 11:08:36,552][36999] Updated weights for policy 0, policy_version 440 (0.0036) [2024-07-02 11:08:40,011][36999] Updated weights for policy 0, policy_version 450 (0.0033) [2024-07-02 11:08:41,095][36761] Fps is (10 sec: 39321.8, 60 sec: 41506.0, 300 sec: 31513.0). Total num frames: 7405568. Throughput: 0: 41193.6. Samples: 7561720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:08:41,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:08:44,222][36999] Updated weights for policy 0, policy_version 460 (0.0033) [2024-07-02 11:08:46,100][36761] Fps is (10 sec: 44215.9, 60 sec: 41229.9, 300 sec: 31675.1). Total num frames: 7602176. Throughput: 0: 41126.3. Samples: 7686500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 11:08:46,101][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:08:47,809][36999] Updated weights for policy 0, policy_version 470 (0.0034) [2024-07-02 11:08:51,095][36761] Fps is (10 sec: 39322.3, 60 sec: 40960.1, 300 sec: 31831.8). Total num frames: 7798784. Throughput: 0: 41093.3. Samples: 7937500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 11:08:51,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:08:52,412][36999] Updated weights for policy 0, policy_version 480 (0.0031) [2024-07-02 11:08:55,863][36999] Updated weights for policy 0, policy_version 490 (0.0039) [2024-07-02 11:08:56,095][36761] Fps is (10 sec: 42618.5, 60 sec: 41506.1, 300 sec: 32112.7). Total num frames: 8028160. Throughput: 0: 41203.2. Samples: 8181660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-07-02 11:08:56,096][36761] Avg episode reward: [(0, '0.001')] [2024-07-02 11:09:00,167][36999] Updated weights for policy 0, policy_version 500 (0.0040) [2024-07-02 11:09:01,095][36761] Fps is (10 sec: 44236.8, 60 sec: 40960.0, 300 sec: 32318.3). Total num frames: 8241152. Throughput: 0: 41100.5. Samples: 8304820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 11:09:01,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:09:03,823][36999] Updated weights for policy 0, policy_version 510 (0.0031) [2024-07-02 11:09:06,100][36761] Fps is (10 sec: 39303.4, 60 sec: 40956.8, 300 sec: 32389.3). Total num frames: 8421376. Throughput: 0: 41157.7. Samples: 8554840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-07-02 11:09:06,101][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:09:08,165][36999] Updated weights for policy 0, policy_version 520 (0.0039) [2024-07-02 11:09:11,095][36761] Fps is (10 sec: 39321.5, 60 sec: 40960.1, 300 sec: 32582.5). Total num frames: 8634368. Throughput: 0: 41423.5. Samples: 8804860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 11:09:11,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:09:11,209][36979] Saving new best policy, reward=0.007! [2024-07-02 11:09:11,264][36979] Signal inference workers to stop experience collection... (100 times) [2024-07-02 11:09:11,264][36979] Signal inference workers to resume experience collection... (100 times) [2024-07-02 11:09:11,282][36999] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-07-02 11:09:11,282][36999] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-07-02 11:09:11,718][36999] Updated weights for policy 0, policy_version 530 (0.0031) [2024-07-02 11:09:16,095][36761] Fps is (10 sec: 40978.8, 60 sec: 40686.9, 300 sec: 32707.3). Total num frames: 8830976. Throughput: 0: 41353.0. Samples: 8928620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 11:09:16,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:09:16,101][36999] Updated weights for policy 0, policy_version 540 (0.0030) [2024-07-02 11:09:19,898][36999] Updated weights for policy 0, policy_version 550 (0.0029) [2024-07-02 11:09:21,095][36761] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 32946.7). Total num frames: 9060352. Throughput: 0: 41522.2. Samples: 9180560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 11:09:21,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:09:23,622][36999] Updated weights for policy 0, policy_version 560 (0.0025) [2024-07-02 11:09:26,095][36761] Fps is (10 sec: 44236.9, 60 sec: 40960.0, 300 sec: 33119.1). Total num frames: 9273344. Throughput: 0: 41479.7. Samples: 9428300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 11:09:26,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:09:27,552][36999] Updated weights for policy 0, policy_version 570 (0.0036) [2024-07-02 11:09:31,095][36761] Fps is (10 sec: 42598.5, 60 sec: 41233.2, 300 sec: 33285.4). Total num frames: 9486336. Throughput: 0: 41468.3. Samples: 9552380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 11:09:31,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:09:31,202][36999] Updated weights for policy 0, policy_version 580 (0.0039) [2024-07-02 11:09:35,367][36999] Updated weights for policy 0, policy_version 590 (0.0042) [2024-07-02 11:09:36,095][36761] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 33446.0). Total num frames: 9699328. Throughput: 0: 41586.2. Samples: 9808880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 11:09:36,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:09:39,496][36999] Updated weights for policy 0, policy_version 600 (0.0033) [2024-07-02 11:09:41,095][36761] Fps is (10 sec: 42597.7, 60 sec: 41779.2, 300 sec: 33601.1). Total num frames: 9912320. Throughput: 0: 41590.5. Samples: 10053240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 11:09:41,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:09:43,192][36999] Updated weights for policy 0, policy_version 610 (0.0029) [2024-07-02 11:09:46,095][36761] Fps is (10 sec: 40960.2, 60 sec: 41782.5, 300 sec: 34267.6). Total num frames: 10108928. Throughput: 0: 41708.4. Samples: 10181700. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-07-02 11:09:46,096][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:09:46,987][36999] Updated weights for policy 0, policy_version 620 (0.0036) [2024-07-02 11:09:51,071][36999] Updated weights for policy 0, policy_version 630 (0.0027) [2024-07-02 11:09:51,095][36761] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 34989.5). Total num frames: 10321920. Throughput: 0: 41894.5. Samples: 10439900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 11:09:51,096][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:09:54,622][36999] Updated weights for policy 0, policy_version 640 (0.0039) [2024-07-02 11:09:56,095][36761] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 35267.3). Total num frames: 10567680. Throughput: 0: 41818.2. Samples: 10686680. Policy #0 lag: (min: 1.0, avg: 12.3, max: 24.0) [2024-07-02 11:09:56,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:09:58,750][36999] Updated weights for policy 0, policy_version 650 (0.0037) [2024-07-02 11:10:01,095][36761] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 35822.6). Total num frames: 10731520. Throughput: 0: 41881.8. Samples: 10813300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 11:10:01,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:10:02,229][36999] Updated weights for policy 0, policy_version 660 (0.0026) [2024-07-02 11:10:06,095][36761] Fps is (10 sec: 37683.0, 60 sec: 42055.5, 300 sec: 36433.6). Total num frames: 10944512. Throughput: 0: 41893.3. Samples: 11065760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-07-02 11:10:06,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:10:06,790][36999] Updated weights for policy 0, policy_version 670 (0.0033) [2024-07-02 11:10:10,513][36999] Updated weights for policy 0, policy_version 680 (0.0055) [2024-07-02 11:10:11,095][36761] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 37100.0). Total num frames: 11173888. Throughput: 0: 42037.7. Samples: 11320000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-07-02 11:10:11,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:10:14,428][36999] Updated weights for policy 0, policy_version 690 (0.0043) [2024-07-02 11:10:16,095][36761] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 37544.3). Total num frames: 11354112. Throughput: 0: 42146.1. Samples: 11448960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 11:10:16,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:10:18,069][36999] Updated weights for policy 0, policy_version 700 (0.0033) [2024-07-02 11:10:21,095][36761] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 38099.7). Total num frames: 11583488. Throughput: 0: 41919.1. Samples: 11695240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 11:10:21,098][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:10:22,071][36999] Updated weights for policy 0, policy_version 710 (0.0024) [2024-07-02 11:10:25,944][36999] Updated weights for policy 0, policy_version 720 (0.0038) [2024-07-02 11:10:26,095][36761] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 38488.5). Total num frames: 11796480. Throughput: 0: 42190.4. Samples: 11951800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 11:10:26,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:10:30,028][36999] Updated weights for policy 0, policy_version 730 (0.0030) [2024-07-02 11:10:31,095][36761] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 38877.3). Total num frames: 12009472. Throughput: 0: 42103.9. Samples: 12076380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 11:10:31,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:10:33,428][36979] Signal inference workers to stop experience collection... (150 times) [2024-07-02 11:10:33,429][36979] Signal inference workers to resume experience collection... (150 times) [2024-07-02 11:10:33,468][36999] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-07-02 11:10:33,468][36999] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-07-02 11:10:33,583][36999] Updated weights for policy 0, policy_version 740 (0.0023) [2024-07-02 11:10:36,095][36761] Fps is (10 sec: 40959.1, 60 sec: 41779.1, 300 sec: 39210.5). Total num frames: 12206080. Throughput: 0: 41907.8. Samples: 12325760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 11:10:36,103][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:10:36,121][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000000745_12206080.pth... [2024-07-02 11:10:36,191][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000000145_2375680.pth [2024-07-02 11:10:37,662][36999] Updated weights for policy 0, policy_version 750 (0.0046) [2024-07-02 11:10:41,100][36761] Fps is (10 sec: 40941.7, 60 sec: 41776.1, 300 sec: 39487.6). Total num frames: 12419072. Throughput: 0: 42178.4. Samples: 12584900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-07-02 11:10:41,100][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:10:41,535][36999] Updated weights for policy 0, policy_version 760 (0.0031) [2024-07-02 11:10:45,259][36999] Updated weights for policy 0, policy_version 770 (0.0029) [2024-07-02 11:10:46,095][36761] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 39654.8). Total num frames: 12632064. Throughput: 0: 42056.8. Samples: 12705860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 11:10:46,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:10:49,174][36999] Updated weights for policy 0, policy_version 780 (0.0042) [2024-07-02 11:10:51,095][36761] Fps is (10 sec: 42618.1, 60 sec: 42052.3, 300 sec: 39932.6). Total num frames: 12845056. Throughput: 0: 42087.2. Samples: 12959680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-07-02 11:10:51,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:10:52,931][36999] Updated weights for policy 0, policy_version 790 (0.0040) [2024-07-02 11:10:56,095][36761] Fps is (10 sec: 40960.7, 60 sec: 41233.1, 300 sec: 40043.6). Total num frames: 13041664. Throughput: 0: 42152.5. Samples: 13216860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 11:10:56,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:10:56,117][36979] Saving new best policy, reward=0.008! [2024-07-02 11:10:57,254][36999] Updated weights for policy 0, policy_version 800 (0.0044) [2024-07-02 11:11:00,633][36999] Updated weights for policy 0, policy_version 810 (0.0035) [2024-07-02 11:11:01,095][36761] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 40321.3). Total num frames: 13271040. Throughput: 0: 41985.9. Samples: 13338320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 11:11:01,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:11:05,006][36999] Updated weights for policy 0, policy_version 820 (0.0027) [2024-07-02 11:11:06,096][36761] Fps is (10 sec: 40959.0, 60 sec: 41779.1, 300 sec: 40432.4). Total num frames: 13451264. Throughput: 0: 42158.9. Samples: 13592400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-07-02 11:11:06,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:11:08,350][36999] Updated weights for policy 0, policy_version 830 (0.0032) [2024-07-02 11:11:11,096][36761] Fps is (10 sec: 40957.5, 60 sec: 41778.8, 300 sec: 40654.5). Total num frames: 13680640. Throughput: 0: 42252.7. Samples: 13853200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 11:11:11,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:11:12,928][36999] Updated weights for policy 0, policy_version 840 (0.0032) [2024-07-02 11:11:16,095][36761] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 40821.2). Total num frames: 13910016. Throughput: 0: 42253.8. Samples: 13977800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-07-02 11:11:16,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:11:16,160][36999] Updated weights for policy 0, policy_version 850 (0.0044) [2024-07-02 11:11:20,456][36999] Updated weights for policy 0, policy_version 860 (0.0032) [2024-07-02 11:11:21,095][36761] Fps is (10 sec: 40962.5, 60 sec: 41779.3, 300 sec: 40821.2). Total num frames: 14090240. Throughput: 0: 42384.6. Samples: 14233060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 11:11:21,095][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:11:23,801][36999] Updated weights for policy 0, policy_version 870 (0.0032) [2024-07-02 11:11:26,095][36761] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 41154.4). Total num frames: 14336000. Throughput: 0: 42281.5. Samples: 14487380. Policy #0 lag: (min: 0.0, avg: 12.5, max: 28.0) [2024-07-02 11:11:26,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:11:28,104][36999] Updated weights for policy 0, policy_version 880 (0.0033) [2024-07-02 11:11:31,095][36761] Fps is (10 sec: 45874.7, 60 sec: 42325.4, 300 sec: 41321.1). Total num frames: 14548992. Throughput: 0: 42389.4. Samples: 14613380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-07-02 11:11:31,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:11:31,516][36999] Updated weights for policy 0, policy_version 890 (0.0028) [2024-07-02 11:11:35,533][36999] Updated weights for policy 0, policy_version 900 (0.0033) [2024-07-02 11:11:36,095][36761] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 41432.1). Total num frames: 14745600. Throughput: 0: 42358.5. Samples: 14865820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 11:11:36,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:11:39,404][36999] Updated weights for policy 0, policy_version 910 (0.0031) [2024-07-02 11:11:41,095][36761] Fps is (10 sec: 40960.0, 60 sec: 42328.5, 300 sec: 41376.5). Total num frames: 14958592. Throughput: 0: 42259.4. Samples: 15118540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 11:11:41,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:11:41,140][36979] Saving new best policy, reward=0.009! [2024-07-02 11:11:43,271][36999] Updated weights for policy 0, policy_version 920 (0.0029) [2024-07-02 11:11:46,095][36761] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41321.0). Total num frames: 15155200. Throughput: 0: 42278.1. Samples: 15240840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-07-02 11:11:46,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:11:47,187][36999] Updated weights for policy 0, policy_version 930 (0.0029) [2024-07-02 11:11:48,478][36979] Signal inference workers to stop experience collection... (200 times) [2024-07-02 11:11:48,506][36999] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-07-02 11:11:48,532][36979] Signal inference workers to resume experience collection... (200 times) [2024-07-02 11:11:48,534][36999] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-07-02 11:11:51,095][36761] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 15384576. Throughput: 0: 42213.5. Samples: 15492000. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-07-02 11:11:51,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:11:51,184][36999] Updated weights for policy 0, policy_version 940 (0.0051) [2024-07-02 11:11:55,271][36999] Updated weights for policy 0, policy_version 950 (0.0052) [2024-07-02 11:11:56,095][36761] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 41654.2). Total num frames: 15581184. Throughput: 0: 41946.6. Samples: 15740780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 11:11:56,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:11:59,167][36999] Updated weights for policy 0, policy_version 960 (0.0036) [2024-07-02 11:12:01,095][36761] Fps is (10 sec: 39321.3, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 15777792. Throughput: 0: 41858.7. Samples: 15861440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 11:12:01,098][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:12:03,070][36999] Updated weights for policy 0, policy_version 970 (0.0039) [2024-07-02 11:12:06,095][36761] Fps is (10 sec: 42599.2, 60 sec: 42598.6, 300 sec: 41820.9). Total num frames: 16007168. Throughput: 0: 41896.4. Samples: 16118400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 11:12:06,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:12:07,231][36999] Updated weights for policy 0, policy_version 980 (0.0037) [2024-07-02 11:12:10,763][36999] Updated weights for policy 0, policy_version 990 (0.0041) [2024-07-02 11:12:11,095][36761] Fps is (10 sec: 45875.7, 60 sec: 42598.8, 300 sec: 41876.4). Total num frames: 16236544. Throughput: 0: 41848.2. Samples: 16370540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 11:12:11,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:12:14,989][36999] Updated weights for policy 0, policy_version 1000 (0.0028) [2024-07-02 11:12:16,095][36761] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 16416768. Throughput: 0: 41850.3. Samples: 16496640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 11:12:16,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:12:18,477][36999] Updated weights for policy 0, policy_version 1010 (0.0024) [2024-07-02 11:12:21,095][36761] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 16629760. Throughput: 0: 41873.9. Samples: 16750140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-07-02 11:12:21,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:12:22,748][36999] Updated weights for policy 0, policy_version 1020 (0.0039) [2024-07-02 11:12:26,095][36761] Fps is (10 sec: 44236.1, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 16859136. Throughput: 0: 42019.1. Samples: 17009400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 11:12:26,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:12:26,386][36999] Updated weights for policy 0, policy_version 1030 (0.0053) [2024-07-02 11:12:30,593][36999] Updated weights for policy 0, policy_version 1040 (0.0035) [2024-07-02 11:12:31,095][36761] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 17055744. Throughput: 0: 42112.0. Samples: 17135880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 11:12:31,096][36761] Avg episode reward: [(0, '0.001')] [2024-07-02 11:12:33,908][36999] Updated weights for policy 0, policy_version 1050 (0.0028) [2024-07-02 11:12:36,099][36761] Fps is (10 sec: 40947.3, 60 sec: 42050.1, 300 sec: 41875.9). Total num frames: 17268736. Throughput: 0: 42120.5. Samples: 17387560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-07-02 11:12:36,099][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:12:36,128][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000001054_17268736.pth... [2024-07-02 11:12:36,186][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000000438_7176192.pth [2024-07-02 11:12:38,339][36999] Updated weights for policy 0, policy_version 1060 (0.0048) [2024-07-02 11:12:41,095][36761] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 17498112. Throughput: 0: 42337.4. Samples: 17645960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-07-02 11:12:41,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:12:41,475][36999] Updated weights for policy 0, policy_version 1070 (0.0034) [2024-07-02 11:12:46,095][36761] Fps is (10 sec: 40973.6, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 17678336. Throughput: 0: 42429.5. Samples: 17770760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-07-02 11:12:46,095][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:12:46,223][36999] Updated weights for policy 0, policy_version 1080 (0.0041) [2024-07-02 11:12:49,280][36999] Updated weights for policy 0, policy_version 1090 (0.0036) [2024-07-02 11:12:51,095][36761] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 17907712. Throughput: 0: 42297.3. Samples: 18021780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 11:12:51,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:12:53,801][36999] Updated weights for policy 0, policy_version 1100 (0.0030) [2024-07-02 11:12:56,096][36761] Fps is (10 sec: 44235.6, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 18120704. Throughput: 0: 42381.1. Samples: 18277700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-07-02 11:12:56,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:12:57,085][36999] Updated weights for policy 0, policy_version 1110 (0.0038) [2024-07-02 11:13:01,095][36761] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 18317312. Throughput: 0: 42504.5. Samples: 18409340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 11:13:01,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:13:01,383][36999] Updated weights for policy 0, policy_version 1120 (0.0046) [2024-07-02 11:13:05,160][36999] Updated weights for policy 0, policy_version 1130 (0.0039) [2024-07-02 11:13:06,095][36761] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 18530304. Throughput: 0: 42257.3. Samples: 18651720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-07-02 11:13:06,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:13:06,115][36979] Saving new best policy, reward=0.011! [2024-07-02 11:13:09,072][36999] Updated weights for policy 0, policy_version 1140 (0.0041) [2024-07-02 11:13:11,095][36761] Fps is (10 sec: 40959.4, 60 sec: 41506.0, 300 sec: 41820.8). Total num frames: 18726912. Throughput: 0: 42158.2. Samples: 18906520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 11:13:11,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:13:12,988][36999] Updated weights for policy 0, policy_version 1150 (0.0031) [2024-07-02 11:13:16,095][36761] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 18956288. Throughput: 0: 42104.1. Samples: 19030560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 11:13:16,096][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:13:16,691][36999] Updated weights for policy 0, policy_version 1160 (0.0029) [2024-07-02 11:13:20,997][36999] Updated weights for policy 0, policy_version 1170 (0.0037) [2024-07-02 11:13:21,096][36761] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 19169280. Throughput: 0: 42124.6. Samples: 19283040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-07-02 11:13:21,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:13:24,680][36999] Updated weights for policy 0, policy_version 1180 (0.0032) [2024-07-02 11:13:26,095][36761] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 19365888. Throughput: 0: 42002.3. Samples: 19536060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 11:13:26,096][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:13:28,879][36999] Updated weights for policy 0, policy_version 1190 (0.0035) [2024-07-02 11:13:31,095][36761] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 19578880. Throughput: 0: 42038.1. Samples: 19662480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 11:13:31,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:13:32,431][36999] Updated weights for policy 0, policy_version 1200 (0.0035) [2024-07-02 11:13:33,727][36979] Signal inference workers to stop experience collection... (250 times) [2024-07-02 11:13:33,727][36979] Signal inference workers to resume experience collection... (250 times) [2024-07-02 11:13:33,760][36999] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-07-02 11:13:33,760][36999] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-07-02 11:13:36,095][36761] Fps is (10 sec: 42598.2, 60 sec: 42054.5, 300 sec: 41987.5). Total num frames: 19791872. Throughput: 0: 42009.7. Samples: 19912220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-07-02 11:13:36,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:13:36,898][36999] Updated weights for policy 0, policy_version 1210 (0.0031) [2024-07-02 11:13:40,188][36999] Updated weights for policy 0, policy_version 1220 (0.0038) [2024-07-02 11:13:41,095][36761] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42043.7). Total num frames: 20004864. Throughput: 0: 41898.7. Samples: 20163140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 11:13:41,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:13:44,715][36999] Updated weights for policy 0, policy_version 1230 (0.0046) [2024-07-02 11:13:46,095][36761] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 20201472. Throughput: 0: 41865.7. Samples: 20293300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 11:13:46,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:13:47,904][36999] Updated weights for policy 0, policy_version 1240 (0.0033) [2024-07-02 11:13:51,095][36761] Fps is (10 sec: 40960.2, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 20414464. Throughput: 0: 41924.0. Samples: 20538300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 11:13:51,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:13:52,396][36999] Updated weights for policy 0, policy_version 1250 (0.0038) [2024-07-02 11:13:55,951][36999] Updated weights for policy 0, policy_version 1260 (0.0035) [2024-07-02 11:13:56,095][36761] Fps is (10 sec: 45875.6, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 20660224. Throughput: 0: 42002.4. Samples: 20796620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 19.0) [2024-07-02 11:13:56,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:14:00,300][36999] Updated weights for policy 0, policy_version 1270 (0.0038) [2024-07-02 11:14:01,095][36761] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42099.2). Total num frames: 20840448. Throughput: 0: 42194.1. Samples: 20929300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 11:14:01,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:14:03,525][36999] Updated weights for policy 0, policy_version 1280 (0.0035) [2024-07-02 11:14:06,095][36761] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 21053440. Throughput: 0: 42064.6. Samples: 21175940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 11:14:06,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:14:08,182][36999] Updated weights for policy 0, policy_version 1290 (0.0044) [2024-07-02 11:14:11,095][36761] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 21266432. Throughput: 0: 41845.4. Samples: 21419100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-07-02 11:14:11,095][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:14:11,331][36999] Updated weights for policy 0, policy_version 1300 (0.0026) [2024-07-02 11:14:16,063][36999] Updated weights for policy 0, policy_version 1310 (0.0026) [2024-07-02 11:14:16,095][36761] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 21463040. Throughput: 0: 41937.3. Samples: 21549660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 11:14:16,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:14:19,617][36999] Updated weights for policy 0, policy_version 1320 (0.0036) [2024-07-02 11:14:21,095][36761] Fps is (10 sec: 42597.7, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 21692416. Throughput: 0: 41993.3. Samples: 21801920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 11:14:21,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:14:23,618][36999] Updated weights for policy 0, policy_version 1330 (0.0033) [2024-07-02 11:14:26,095][36761] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 21905408. Throughput: 0: 42088.6. Samples: 22057120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 11:14:26,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:14:27,146][36999] Updated weights for policy 0, policy_version 1340 (0.0036) [2024-07-02 11:14:31,095][36761] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 22085632. Throughput: 0: 42041.3. Samples: 22185160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-07-02 11:14:31,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:14:31,319][36999] Updated weights for policy 0, policy_version 1350 (0.0032) [2024-07-02 11:14:35,026][36999] Updated weights for policy 0, policy_version 1360 (0.0044) [2024-07-02 11:14:36,095][36761] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 22315008. Throughput: 0: 42199.5. Samples: 22437280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 11:14:36,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:14:36,114][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000001362_22315008.pth... [2024-07-02 11:14:36,174][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000000745_12206080.pth [2024-07-02 11:14:39,013][36999] Updated weights for policy 0, policy_version 1370 (0.0028) [2024-07-02 11:14:41,095][36761] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 22544384. Throughput: 0: 42041.3. Samples: 22688480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 11:14:41,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:14:42,615][36999] Updated weights for policy 0, policy_version 1380 (0.0032) [2024-07-02 11:14:46,095][36761] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 22708224. Throughput: 0: 42046.8. Samples: 22821400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 11:14:46,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:14:46,891][36999] Updated weights for policy 0, policy_version 1390 (0.0028) [2024-07-02 11:14:50,346][36999] Updated weights for policy 0, policy_version 1400 (0.0046) [2024-07-02 11:14:51,095][36761] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 22953984. Throughput: 0: 42089.7. Samples: 23069980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 11:14:51,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:14:54,430][36979] Signal inference workers to stop experience collection... (300 times) [2024-07-02 11:14:54,430][36979] Signal inference workers to resume experience collection... (300 times) [2024-07-02 11:14:54,446][36999] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-07-02 11:14:54,446][36999] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-07-02 11:14:54,582][36999] Updated weights for policy 0, policy_version 1410 (0.0038) [2024-07-02 11:14:56,095][36761] Fps is (10 sec: 47513.8, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 23183360. Throughput: 0: 42344.5. Samples: 23324600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-07-02 11:14:56,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:14:58,003][36999] Updated weights for policy 0, policy_version 1420 (0.0045) [2024-07-02 11:15:01,095][36761] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 23363584. Throughput: 0: 42292.1. Samples: 23452800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 11:15:01,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:15:02,381][36999] Updated weights for policy 0, policy_version 1430 (0.0024) [2024-07-02 11:15:05,790][36999] Updated weights for policy 0, policy_version 1440 (0.0046) [2024-07-02 11:15:06,095][36761] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 23592960. Throughput: 0: 42220.1. Samples: 23701820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-07-02 11:15:06,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:15:10,216][36999] Updated weights for policy 0, policy_version 1450 (0.0050) [2024-07-02 11:15:11,095][36761] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 23805952. Throughput: 0: 42275.1. Samples: 23959500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 11:15:11,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:15:13,885][36999] Updated weights for policy 0, policy_version 1460 (0.0024) [2024-07-02 11:15:16,095][36761] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 24002560. Throughput: 0: 42219.0. Samples: 24085020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-07-02 11:15:16,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:15:18,256][36999] Updated weights for policy 0, policy_version 1470 (0.0038) [2024-07-02 11:15:21,095][36761] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42098.5). Total num frames: 24215552. Throughput: 0: 42143.2. Samples: 24333720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-07-02 11:15:21,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:15:21,508][36999] Updated weights for policy 0, policy_version 1480 (0.0037) [2024-07-02 11:15:25,842][36999] Updated weights for policy 0, policy_version 1490 (0.0030) [2024-07-02 11:15:26,095][36761] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 24412160. Throughput: 0: 42355.6. Samples: 24594480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 11:15:26,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:15:29,047][36999] Updated weights for policy 0, policy_version 1500 (0.0035) [2024-07-02 11:15:31,095][36761] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 24641536. Throughput: 0: 42167.0. Samples: 24718920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 11:15:31,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:15:33,421][36999] Updated weights for policy 0, policy_version 1510 (0.0036) [2024-07-02 11:15:36,095][36761] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42154.7). Total num frames: 24854528. Throughput: 0: 42272.0. Samples: 24972220. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-07-02 11:15:36,105][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:15:36,634][36999] Updated weights for policy 0, policy_version 1520 (0.0044) [2024-07-02 11:15:41,095][36761] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 25034752. Throughput: 0: 42389.7. Samples: 25232140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 11:15:41,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:15:41,333][36999] Updated weights for policy 0, policy_version 1530 (0.0034) [2024-07-02 11:15:44,726][36999] Updated weights for policy 0, policy_version 1540 (0.0038) [2024-07-02 11:15:46,095][36761] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 25280512. Throughput: 0: 42235.9. Samples: 25353420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-07-02 11:15:46,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:15:49,082][36999] Updated weights for policy 0, policy_version 1550 (0.0042) [2024-07-02 11:15:51,095][36761] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 25493504. Throughput: 0: 42223.1. Samples: 25601860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-07-02 11:15:51,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:15:52,630][36999] Updated weights for policy 0, policy_version 1560 (0.0033) [2024-07-02 11:15:56,095][36761] Fps is (10 sec: 37683.0, 60 sec: 41232.9, 300 sec: 41987.4). Total num frames: 25657344. Throughput: 0: 42135.4. Samples: 25855600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-07-02 11:15:56,096][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:15:57,043][36999] Updated weights for policy 0, policy_version 1570 (0.0037) [2024-07-02 11:16:00,560][36999] Updated weights for policy 0, policy_version 1580 (0.0030) [2024-07-02 11:16:01,095][36761] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42209.7). Total num frames: 25903104. Throughput: 0: 42003.3. Samples: 25975160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-07-02 11:16:01,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:16:04,653][36999] Updated weights for policy 0, policy_version 1590 (0.0043) [2024-07-02 11:16:06,095][36761] Fps is (10 sec: 45875.9, 60 sec: 42052.3, 300 sec: 42154.2). Total num frames: 26116096. Throughput: 0: 42218.2. Samples: 26233540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-07-02 11:16:06,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:16:08,129][36999] Updated weights for policy 0, policy_version 1600 (0.0033) [2024-07-02 11:16:11,095][36761] Fps is (10 sec: 40959.4, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 26312704. Throughput: 0: 42155.4. Samples: 26491480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 11:16:11,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:16:12,436][36999] Updated weights for policy 0, policy_version 1610 (0.0033) [2024-07-02 11:16:15,634][36999] Updated weights for policy 0, policy_version 1620 (0.0038) [2024-07-02 11:16:16,095][36761] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 26542080. Throughput: 0: 42211.2. Samples: 26618420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-07-02 11:16:16,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:16:17,698][36979] Signal inference workers to stop experience collection... (350 times) [2024-07-02 11:16:17,698][36979] Signal inference workers to resume experience collection... (350 times) [2024-07-02 11:16:17,729][36999] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-07-02 11:16:17,729][36999] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-07-02 11:16:19,966][36999] Updated weights for policy 0, policy_version 1630 (0.0029) [2024-07-02 11:16:21,095][36761] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 26755072. Throughput: 0: 42274.3. Samples: 26874560. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-07-02 11:16:21,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:16:23,150][36999] Updated weights for policy 0, policy_version 1640 (0.0048) [2024-07-02 11:16:26,095][36761] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 26968064. Throughput: 0: 42105.3. Samples: 27126880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 11:16:26,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:16:27,680][36999] Updated weights for policy 0, policy_version 1650 (0.0029) [2024-07-02 11:16:30,713][36999] Updated weights for policy 0, policy_version 1660 (0.0041) [2024-07-02 11:16:31,096][36761] Fps is (10 sec: 44233.9, 60 sec: 42598.0, 300 sec: 42209.5). Total num frames: 27197440. Throughput: 0: 42336.4. Samples: 27258580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-07-02 11:16:31,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:16:35,296][36999] Updated weights for policy 0, policy_version 1670 (0.0053) [2024-07-02 11:16:36,096][36761] Fps is (10 sec: 40955.3, 60 sec: 42051.5, 300 sec: 42098.4). Total num frames: 27377664. Throughput: 0: 42497.6. Samples: 27514300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 11:16:36,097][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:16:36,198][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000001672_27394048.pth... [2024-07-02 11:16:36,273][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000001054_17268736.pth [2024-07-02 11:16:38,887][36999] Updated weights for policy 0, policy_version 1680 (0.0034) [2024-07-02 11:16:41,095][36761] Fps is (10 sec: 40962.4, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 27607040. Throughput: 0: 42385.0. Samples: 27762920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 11:16:41,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:16:43,075][36999] Updated weights for policy 0, policy_version 1690 (0.0039) [2024-07-02 11:16:46,098][36761] Fps is (10 sec: 45866.5, 60 sec: 42596.3, 300 sec: 42209.2). Total num frames: 27836416. Throughput: 0: 42619.7. Samples: 27893180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-07-02 11:16:46,099][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:16:46,533][36999] Updated weights for policy 0, policy_version 1700 (0.0029) [2024-07-02 11:16:50,988][36999] Updated weights for policy 0, policy_version 1710 (0.0036) [2024-07-02 11:16:51,095][36761] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 28016640. Throughput: 0: 42497.8. Samples: 28145940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 11:16:51,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:16:54,205][36999] Updated weights for policy 0, policy_version 1720 (0.0034) [2024-07-02 11:16:56,100][36761] Fps is (10 sec: 40953.6, 60 sec: 43141.3, 300 sec: 42264.5). Total num frames: 28246016. Throughput: 0: 42459.7. Samples: 28402360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 11:16:56,101][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:16:58,744][36999] Updated weights for policy 0, policy_version 1730 (0.0036) [2024-07-02 11:17:01,095][36761] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42265.1). Total num frames: 28475392. Throughput: 0: 42518.5. Samples: 28531760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 11:17:01,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:17:02,205][36999] Updated weights for policy 0, policy_version 1740 (0.0043) [2024-07-02 11:17:06,095][36761] Fps is (10 sec: 39339.2, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 28639232. Throughput: 0: 42427.0. Samples: 28783780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 11:17:06,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:17:06,418][36999] Updated weights for policy 0, policy_version 1750 (0.0036) [2024-07-02 11:17:09,748][36999] Updated weights for policy 0, policy_version 1760 (0.0030) [2024-07-02 11:17:11,095][36761] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 28884992. Throughput: 0: 42420.9. Samples: 29035820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 11:17:11,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:17:14,161][36999] Updated weights for policy 0, policy_version 1770 (0.0039) [2024-07-02 11:17:16,095][36761] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 29081600. Throughput: 0: 42384.5. Samples: 29165860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-07-02 11:17:16,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:17:16,116][36979] Saving new best policy, reward=0.012! [2024-07-02 11:17:17,763][36999] Updated weights for policy 0, policy_version 1780 (0.0035) [2024-07-02 11:17:21,095][36761] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 29278208. Throughput: 0: 42199.0. Samples: 29413200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 11:17:21,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:17:21,846][36999] Updated weights for policy 0, policy_version 1790 (0.0037) [2024-07-02 11:17:25,618][36999] Updated weights for policy 0, policy_version 1800 (0.0030) [2024-07-02 11:17:26,095][36761] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 29507584. Throughput: 0: 42263.6. Samples: 29664780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 11:17:26,100][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:17:29,326][36979] Signal inference workers to stop experience collection... (400 times) [2024-07-02 11:17:29,326][36979] Signal inference workers to resume experience collection... (400 times) [2024-07-02 11:17:29,356][36999] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-07-02 11:17:29,356][36999] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-07-02 11:17:29,477][36999] Updated weights for policy 0, policy_version 1810 (0.0040) [2024-07-02 11:17:31,095][36761] Fps is (10 sec: 45874.1, 60 sec: 42325.7, 300 sec: 42265.6). Total num frames: 29736960. Throughput: 0: 42265.8. Samples: 29795020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 11:17:31,098][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:17:33,275][36999] Updated weights for policy 0, policy_version 1820 (0.0025) [2024-07-02 11:17:36,095][36761] Fps is (10 sec: 40960.4, 60 sec: 42326.2, 300 sec: 42098.6). Total num frames: 29917184. Throughput: 0: 42271.2. Samples: 30048140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 11:17:36,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:17:37,070][36999] Updated weights for policy 0, policy_version 1830 (0.0033) [2024-07-02 11:17:40,870][36999] Updated weights for policy 0, policy_version 1840 (0.0028) [2024-07-02 11:17:41,095][36761] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 30146560. Throughput: 0: 42248.8. Samples: 30303360. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-07-02 11:17:41,095][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:17:44,966][36999] Updated weights for policy 0, policy_version 1850 (0.0042) [2024-07-02 11:17:46,097][36761] Fps is (10 sec: 44227.5, 60 sec: 42053.0, 300 sec: 42209.3). Total num frames: 30359552. Throughput: 0: 42178.6. Samples: 30429880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-07-02 11:17:46,098][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:17:48,474][36999] Updated weights for policy 0, policy_version 1860 (0.0028) [2024-07-02 11:17:51,095][36761] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 30556160. Throughput: 0: 42280.9. Samples: 30686420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-07-02 11:17:51,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:17:52,673][36999] Updated weights for policy 0, policy_version 1870 (0.0037) [2024-07-02 11:17:56,095][36761] Fps is (10 sec: 42607.0, 60 sec: 42328.5, 300 sec: 42265.2). Total num frames: 30785536. Throughput: 0: 42318.2. Samples: 30940140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 11:17:56,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:17:56,393][36999] Updated weights for policy 0, policy_version 1880 (0.0042) [2024-07-02 11:18:00,209][36999] Updated weights for policy 0, policy_version 1890 (0.0043) [2024-07-02 11:18:01,096][36761] Fps is (10 sec: 44232.2, 60 sec: 42051.6, 300 sec: 42265.0). Total num frames: 30998528. Throughput: 0: 42317.7. Samples: 31070200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-07-02 11:18:01,097][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:18:03,920][36999] Updated weights for policy 0, policy_version 1900 (0.0043) [2024-07-02 11:18:06,095][36761] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 31195136. Throughput: 0: 42546.9. Samples: 31327820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 11:18:06,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:18:07,942][36999] Updated weights for policy 0, policy_version 1910 (0.0041) [2024-07-02 11:18:11,095][36761] Fps is (10 sec: 42603.0, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 31424512. Throughput: 0: 42563.6. Samples: 31580140. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-07-02 11:18:11,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:18:11,594][36999] Updated weights for policy 0, policy_version 1920 (0.0034) [2024-07-02 11:18:15,587][36999] Updated weights for policy 0, policy_version 1930 (0.0029) [2024-07-02 11:18:16,095][36761] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 31637504. Throughput: 0: 42567.6. Samples: 31710560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 11:18:16,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:18:19,294][36999] Updated weights for policy 0, policy_version 1940 (0.0034) [2024-07-02 11:18:21,095][36761] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 31817728. Throughput: 0: 42348.9. Samples: 31953840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 11:18:21,095][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:18:23,605][36999] Updated weights for policy 0, policy_version 1950 (0.0033) [2024-07-02 11:18:26,095][36761] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 32063488. Throughput: 0: 42318.2. Samples: 32207680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 11:18:26,097][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:18:27,181][36999] Updated weights for policy 0, policy_version 1960 (0.0044) [2024-07-02 11:18:31,095][36761] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 32243712. Throughput: 0: 42451.7. Samples: 32340120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-07-02 11:18:31,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:18:31,395][36999] Updated weights for policy 0, policy_version 1970 (0.0040) [2024-07-02 11:18:35,044][36999] Updated weights for policy 0, policy_version 1980 (0.0033) [2024-07-02 11:18:36,095][36761] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 32456704. Throughput: 0: 42324.9. Samples: 32591040. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-07-02 11:18:36,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:18:36,216][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000001982_32473088.pth... [2024-07-02 11:18:36,273][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000001362_22315008.pth [2024-07-02 11:18:39,017][36999] Updated weights for policy 0, policy_version 1990 (0.0037) [2024-07-02 11:18:41,095][36761] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 32686080. Throughput: 0: 42313.4. Samples: 32844240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 11:18:41,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:18:42,591][36999] Updated weights for policy 0, policy_version 2000 (0.0033) [2024-07-02 11:18:46,095][36761] Fps is (10 sec: 42598.3, 60 sec: 42053.6, 300 sec: 42265.2). Total num frames: 32882688. Throughput: 0: 42445.8. Samples: 32980220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 11:18:46,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:18:46,881][36999] Updated weights for policy 0, policy_version 2010 (0.0042) [2024-07-02 11:18:50,156][36999] Updated weights for policy 0, policy_version 2020 (0.0032) [2024-07-02 11:18:51,095][36761] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 33112064. Throughput: 0: 42246.0. Samples: 33228880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 11:18:51,096][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:18:54,831][36999] Updated weights for policy 0, policy_version 2030 (0.0034) [2024-07-02 11:18:56,095][36761] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 33341440. Throughput: 0: 42265.8. Samples: 33482100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 11:18:56,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:18:57,689][36999] Updated weights for policy 0, policy_version 2040 (0.0045) [2024-07-02 11:18:59,985][36979] Signal inference workers to stop experience collection... (450 times) [2024-07-02 11:19:00,021][36999] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-07-02 11:19:00,048][36979] Signal inference workers to resume experience collection... (450 times) [2024-07-02 11:19:00,050][36999] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-07-02 11:19:01,095][36761] Fps is (10 sec: 42598.4, 60 sec: 42326.1, 300 sec: 42320.7). Total num frames: 33538048. Throughput: 0: 42359.2. Samples: 33616720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-07-02 11:19:01,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:19:02,260][36999] Updated weights for policy 0, policy_version 2050 (0.0037) [2024-07-02 11:19:05,309][36999] Updated weights for policy 0, policy_version 2060 (0.0034) [2024-07-02 11:19:06,095][36761] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 33767424. Throughput: 0: 42621.2. Samples: 33871800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-07-02 11:19:06,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:19:09,787][36999] Updated weights for policy 0, policy_version 2070 (0.0024) [2024-07-02 11:19:11,095][36761] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 33980416. Throughput: 0: 42676.0. Samples: 34128100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-07-02 11:19:11,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:19:12,994][36999] Updated weights for policy 0, policy_version 2080 (0.0040) [2024-07-02 11:19:16,095][36761] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 34160640. Throughput: 0: 42557.4. Samples: 34255200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-07-02 11:19:16,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:19:17,407][36999] Updated weights for policy 0, policy_version 2090 (0.0034) [2024-07-02 11:19:20,674][36999] Updated weights for policy 0, policy_version 2100 (0.0033) [2024-07-02 11:19:21,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42376.3). Total num frames: 34406400. Throughput: 0: 42607.3. Samples: 34508360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 11:19:21,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:19:24,967][36999] Updated weights for policy 0, policy_version 2110 (0.0039) [2024-07-02 11:19:26,099][36761] Fps is (10 sec: 42584.7, 60 sec: 42050.0, 300 sec: 42375.8). Total num frames: 34586624. Throughput: 0: 42780.0. Samples: 34769480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-07-02 11:19:26,099][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:19:28,646][36999] Updated weights for policy 0, policy_version 2120 (0.0034) [2024-07-02 11:19:31,095][36761] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 34799616. Throughput: 0: 42453.5. Samples: 34890620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 11:19:31,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:19:32,706][36999] Updated weights for policy 0, policy_version 2130 (0.0032) [2024-07-02 11:19:36,095][36761] Fps is (10 sec: 44250.6, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 35028992. Throughput: 0: 42643.4. Samples: 35147840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 11:19:36,109][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:19:36,296][36999] Updated weights for policy 0, policy_version 2140 (0.0028) [2024-07-02 11:19:40,633][36999] Updated weights for policy 0, policy_version 2150 (0.0041) [2024-07-02 11:19:41,095][36761] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 35241984. Throughput: 0: 42651.1. Samples: 35401400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 11:19:41,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:19:43,979][36999] Updated weights for policy 0, policy_version 2160 (0.0028) [2024-07-02 11:19:46,095][36761] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 35438592. Throughput: 0: 42461.2. Samples: 35527480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-07-02 11:19:46,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:19:48,603][36999] Updated weights for policy 0, policy_version 2170 (0.0033) [2024-07-02 11:19:51,095][36761] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 35684352. Throughput: 0: 42468.5. Samples: 35782880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 11:19:51,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:19:51,755][36999] Updated weights for policy 0, policy_version 2180 (0.0039) [2024-07-02 11:19:56,095][36761] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 35864576. Throughput: 0: 42464.4. Samples: 36039000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 11:19:56,098][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:19:56,393][36999] Updated weights for policy 0, policy_version 2190 (0.0037) [2024-07-02 11:19:59,455][36999] Updated weights for policy 0, policy_version 2200 (0.0029) [2024-07-02 11:20:01,095][36761] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 36093952. Throughput: 0: 42325.7. Samples: 36159860. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-07-02 11:20:01,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:20:04,017][36999] Updated weights for policy 0, policy_version 2210 (0.0030) [2024-07-02 11:20:06,095][36761] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 36290560. Throughput: 0: 42438.2. Samples: 36418080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 11:20:06,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:20:07,345][36999] Updated weights for policy 0, policy_version 2220 (0.0050) [2024-07-02 11:20:11,095][36761] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42376.3). Total num frames: 36503552. Throughput: 0: 42343.4. Samples: 36674800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 11:20:11,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:20:11,651][36999] Updated weights for policy 0, policy_version 2230 (0.0034) [2024-07-02 11:20:14,403][36979] Signal inference workers to stop experience collection... (500 times) [2024-07-02 11:20:14,447][36999] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-07-02 11:20:14,452][36979] Signal inference workers to resume experience collection... (500 times) [2024-07-02 11:20:14,462][36999] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-07-02 11:20:14,865][36999] Updated weights for policy 0, policy_version 2240 (0.0046) [2024-07-02 11:20:16,095][36761] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 36732928. Throughput: 0: 42452.8. Samples: 36801000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-07-02 11:20:16,098][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:20:19,256][36999] Updated weights for policy 0, policy_version 2250 (0.0032) [2024-07-02 11:20:21,095][36761] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 36929536. Throughput: 0: 42324.9. Samples: 37052460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 11:20:21,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:20:22,477][36999] Updated weights for policy 0, policy_version 2260 (0.0026) [2024-07-02 11:20:26,095][36761] Fps is (10 sec: 37683.4, 60 sec: 42054.5, 300 sec: 42265.2). Total num frames: 37109760. Throughput: 0: 42465.7. Samples: 37312360. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-07-02 11:20:26,096][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:20:26,852][36999] Updated weights for policy 0, policy_version 2270 (0.0044) [2024-07-02 11:20:30,521][36999] Updated weights for policy 0, policy_version 2280 (0.0033) [2024-07-02 11:20:31,095][36761] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 37371904. Throughput: 0: 42382.3. Samples: 37434680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-07-02 11:20:31,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:20:34,548][36999] Updated weights for policy 0, policy_version 2290 (0.0038) [2024-07-02 11:20:36,097][36761] Fps is (10 sec: 45866.9, 60 sec: 42324.1, 300 sec: 42487.1). Total num frames: 37568512. Throughput: 0: 42307.2. Samples: 37686780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-07-02 11:20:36,098][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:20:36,121][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000002293_37568512.pth... [2024-07-02 11:20:36,180][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000001672_27394048.pth [2024-07-02 11:20:38,137][36999] Updated weights for policy 0, policy_version 2300 (0.0033) [2024-07-02 11:20:41,095][36761] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 37765120. Throughput: 0: 42421.9. Samples: 37947980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-07-02 11:20:41,095][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:20:42,074][36999] Updated weights for policy 0, policy_version 2310 (0.0048) [2024-07-02 11:20:46,039][36999] Updated weights for policy 0, policy_version 2320 (0.0044) [2024-07-02 11:20:46,095][36761] Fps is (10 sec: 44244.6, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 38010880. Throughput: 0: 42599.1. Samples: 38076820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 11:20:46,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:20:49,830][36999] Updated weights for policy 0, policy_version 2330 (0.0037) [2024-07-02 11:20:51,095][36761] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 38207488. Throughput: 0: 42466.5. Samples: 38329080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 11:20:51,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:20:54,211][36999] Updated weights for policy 0, policy_version 2340 (0.0036) [2024-07-02 11:20:56,096][36761] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42431.7). Total num frames: 38420480. Throughput: 0: 42425.6. Samples: 38583960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 11:20:56,096][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:20:57,629][36999] Updated weights for policy 0, policy_version 2350 (0.0030) [2024-07-02 11:21:01,095][36761] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 38649856. Throughput: 0: 42429.0. Samples: 38710300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 11:21:01,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:21:01,746][36999] Updated weights for policy 0, policy_version 2360 (0.0034) [2024-07-02 11:21:05,460][36999] Updated weights for policy 0, policy_version 2370 (0.0040) [2024-07-02 11:21:06,095][36761] Fps is (10 sec: 44237.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 38862848. Throughput: 0: 42581.3. Samples: 38968620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-07-02 11:21:06,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:21:09,167][36999] Updated weights for policy 0, policy_version 2380 (0.0042) [2024-07-02 11:21:11,095][36761] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 39059456. Throughput: 0: 42395.6. Samples: 39220160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 11:21:11,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:21:13,019][36999] Updated weights for policy 0, policy_version 2390 (0.0027) [2024-07-02 11:21:16,095][36761] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 39272448. Throughput: 0: 42597.8. Samples: 39351580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:21:16,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:21:17,163][36999] Updated weights for policy 0, policy_version 2400 (0.0035) [2024-07-02 11:21:20,831][36999] Updated weights for policy 0, policy_version 2410 (0.0044) [2024-07-02 11:21:21,095][36761] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 39501824. Throughput: 0: 42757.8. Samples: 39610800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-07-02 11:21:21,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:21:24,729][36999] Updated weights for policy 0, policy_version 2420 (0.0038) [2024-07-02 11:21:26,100][36761] Fps is (10 sec: 42578.9, 60 sec: 43141.3, 300 sec: 42375.7). Total num frames: 39698432. Throughput: 0: 42562.7. Samples: 39863500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 11:21:26,101][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:21:28,395][36999] Updated weights for policy 0, policy_version 2430 (0.0043) [2024-07-02 11:21:31,096][36761] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42487.5). Total num frames: 39911424. Throughput: 0: 42450.6. Samples: 39987100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 11:21:31,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:21:32,454][36999] Updated weights for policy 0, policy_version 2440 (0.0039) [2024-07-02 11:21:36,076][36999] Updated weights for policy 0, policy_version 2450 (0.0042) [2024-07-02 11:21:36,095][36761] Fps is (10 sec: 44256.9, 60 sec: 42872.8, 300 sec: 42487.3). Total num frames: 40140800. Throughput: 0: 42547.2. Samples: 40243700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-07-02 11:21:36,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:21:36,109][36979] Saving new best policy, reward=0.015! [2024-07-02 11:21:40,029][36999] Updated weights for policy 0, policy_version 2460 (0.0035) [2024-07-02 11:21:41,095][36761] Fps is (10 sec: 42599.2, 60 sec: 42871.4, 300 sec: 42376.7). Total num frames: 40337408. Throughput: 0: 42600.2. Samples: 40500960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 11:21:41,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:21:43,651][36999] Updated weights for policy 0, policy_version 2470 (0.0027) [2024-07-02 11:21:46,095][36761] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 40566784. Throughput: 0: 42544.0. Samples: 40624780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 11:21:46,095][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:21:48,095][36999] Updated weights for policy 0, policy_version 2480 (0.0033) [2024-07-02 11:21:51,096][36761] Fps is (10 sec: 42596.7, 60 sec: 42598.2, 300 sec: 42432.4). Total num frames: 40763392. Throughput: 0: 42523.7. Samples: 40882200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 11:21:51,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:21:51,798][36999] Updated weights for policy 0, policy_version 2490 (0.0034) [2024-07-02 11:21:53,192][36979] Signal inference workers to stop experience collection... (550 times) [2024-07-02 11:21:53,193][36979] Signal inference workers to resume experience collection... (550 times) [2024-07-02 11:21:53,207][36999] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-07-02 11:21:53,207][36999] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-07-02 11:21:55,638][36999] Updated weights for policy 0, policy_version 2500 (0.0025) [2024-07-02 11:21:56,100][36761] Fps is (10 sec: 42578.3, 60 sec: 42868.3, 300 sec: 42431.1). Total num frames: 40992768. Throughput: 0: 42544.0. Samples: 41134840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-07-02 11:21:56,101][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:21:59,311][36999] Updated weights for policy 0, policy_version 2510 (0.0035) [2024-07-02 11:22:01,095][36761] Fps is (10 sec: 42600.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 41189376. Throughput: 0: 42600.9. Samples: 41268620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 11:22:01,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:22:03,327][36999] Updated weights for policy 0, policy_version 2520 (0.0030) [2024-07-02 11:22:06,095][36761] Fps is (10 sec: 39339.7, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 41385984. Throughput: 0: 42371.1. Samples: 41517500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 11:22:06,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:22:06,967][36999] Updated weights for policy 0, policy_version 2530 (0.0033) [2024-07-02 11:22:11,013][36999] Updated weights for policy 0, policy_version 2540 (0.0039) [2024-07-02 11:22:11,095][36761] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 41615360. Throughput: 0: 42425.2. Samples: 41772440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 11:22:11,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:22:14,920][36999] Updated weights for policy 0, policy_version 2550 (0.0037) [2024-07-02 11:22:16,095][36761] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 41795584. Throughput: 0: 42502.3. Samples: 41899700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:22:16,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:22:18,788][36999] Updated weights for policy 0, policy_version 2560 (0.0027) [2024-07-02 11:22:21,095][36761] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 42041344. Throughput: 0: 42424.8. Samples: 42152820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 11:22:21,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:22:22,547][36999] Updated weights for policy 0, policy_version 2570 (0.0039) [2024-07-02 11:22:26,095][36761] Fps is (10 sec: 45875.4, 60 sec: 42601.6, 300 sec: 42431.8). Total num frames: 42254336. Throughput: 0: 42302.2. Samples: 42404560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 11:22:26,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:22:26,505][36999] Updated weights for policy 0, policy_version 2580 (0.0036) [2024-07-02 11:22:30,343][36999] Updated weights for policy 0, policy_version 2590 (0.0037) [2024-07-02 11:22:31,095][36761] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 42450944. Throughput: 0: 42431.4. Samples: 42534200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 11:22:31,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:22:34,250][36999] Updated weights for policy 0, policy_version 2600 (0.0024) [2024-07-02 11:22:36,095][36761] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 42663936. Throughput: 0: 42316.0. Samples: 42786400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:22:36,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:22:36,135][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000002605_42680320.pth... [2024-07-02 11:22:36,183][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000001982_32473088.pth [2024-07-02 11:22:38,178][36999] Updated weights for policy 0, policy_version 2610 (0.0038) [2024-07-02 11:22:41,095][36761] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 42893312. Throughput: 0: 42430.1. Samples: 43044000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 11:22:41,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:22:41,850][36999] Updated weights for policy 0, policy_version 2620 (0.0044) [2024-07-02 11:22:45,712][36999] Updated weights for policy 0, policy_version 2630 (0.0032) [2024-07-02 11:22:46,095][36761] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 43089920. Throughput: 0: 42364.7. Samples: 43175040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-07-02 11:22:46,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:22:49,434][36999] Updated weights for policy 0, policy_version 2640 (0.0027) [2024-07-02 11:22:51,095][36761] Fps is (10 sec: 39321.9, 60 sec: 42052.5, 300 sec: 42376.3). Total num frames: 43286528. Throughput: 0: 42376.0. Samples: 43424420. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-07-02 11:22:51,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:22:53,606][36999] Updated weights for policy 0, policy_version 2650 (0.0033) [2024-07-02 11:22:56,096][36761] Fps is (10 sec: 44236.6, 60 sec: 42328.5, 300 sec: 42487.5). Total num frames: 43532288. Throughput: 0: 42414.0. Samples: 43681080. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-07-02 11:22:56,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:22:57,031][36999] Updated weights for policy 0, policy_version 2660 (0.0034) [2024-07-02 11:23:01,095][36761] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 43728896. Throughput: 0: 42598.2. Samples: 43816620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-07-02 11:23:01,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:23:01,291][36999] Updated weights for policy 0, policy_version 2670 (0.0035) [2024-07-02 11:23:04,628][36999] Updated weights for policy 0, policy_version 2680 (0.0040) [2024-07-02 11:23:06,095][36761] Fps is (10 sec: 39322.8, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 43925504. Throughput: 0: 42490.8. Samples: 44064900. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) [2024-07-02 11:23:06,095][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:23:06,798][36979] Signal inference workers to stop experience collection... (600 times) [2024-07-02 11:23:06,834][36999] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-07-02 11:23:06,844][36979] Signal inference workers to resume experience collection... (600 times) [2024-07-02 11:23:06,855][36999] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-07-02 11:23:09,058][36999] Updated weights for policy 0, policy_version 2690 (0.0040) [2024-07-02 11:23:11,095][36761] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 44171264. Throughput: 0: 42660.4. Samples: 44324280. Policy #0 lag: (min: 2.0, avg: 11.8, max: 23.0) [2024-07-02 11:23:11,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:23:12,370][36999] Updated weights for policy 0, policy_version 2700 (0.0036) [2024-07-02 11:23:16,095][36761] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 44367872. Throughput: 0: 42807.9. Samples: 44460560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 11:23:16,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:23:16,657][36999] Updated weights for policy 0, policy_version 2710 (0.0022) [2024-07-02 11:23:19,923][36999] Updated weights for policy 0, policy_version 2720 (0.0034) [2024-07-02 11:23:21,095][36761] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 44580864. Throughput: 0: 42823.1. Samples: 44713440. Policy #0 lag: (min: 0.0, avg: 12.4, max: 26.0) [2024-07-02 11:23:21,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:23:24,194][36999] Updated weights for policy 0, policy_version 2730 (0.0030) [2024-07-02 11:23:26,098][36761] Fps is (10 sec: 45862.7, 60 sec: 42869.5, 300 sec: 42653.5). Total num frames: 44826624. Throughput: 0: 42713.0. Samples: 44966200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-07-02 11:23:26,098][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:23:27,543][36999] Updated weights for policy 0, policy_version 2740 (0.0034) [2024-07-02 11:23:31,100][36761] Fps is (10 sec: 42578.9, 60 sec: 42595.2, 300 sec: 42542.2). Total num frames: 45006848. Throughput: 0: 42778.0. Samples: 45100240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-07-02 11:23:31,109][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:23:31,817][36999] Updated weights for policy 0, policy_version 2750 (0.0038) [2024-07-02 11:23:35,082][36999] Updated weights for policy 0, policy_version 2760 (0.0028) [2024-07-02 11:23:36,095][36761] Fps is (10 sec: 40971.1, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 45236224. Throughput: 0: 42812.3. Samples: 45350980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-07-02 11:23:36,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:23:39,399][36999] Updated weights for policy 0, policy_version 2770 (0.0042) [2024-07-02 11:23:41,095][36761] Fps is (10 sec: 45895.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 45465600. Throughput: 0: 42702.8. Samples: 45602700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-07-02 11:23:41,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:23:42,684][36999] Updated weights for policy 0, policy_version 2780 (0.0030) [2024-07-02 11:23:46,096][36761] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 45645824. Throughput: 0: 42651.0. Samples: 45735920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-07-02 11:23:46,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:23:46,857][36999] Updated weights for policy 0, policy_version 2790 (0.0033) [2024-07-02 11:23:50,559][36999] Updated weights for policy 0, policy_version 2800 (0.0023) [2024-07-02 11:23:51,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42542.8). Total num frames: 45891584. Throughput: 0: 42823.4. Samples: 45991960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 11:23:51,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:23:54,638][36999] Updated weights for policy 0, policy_version 2810 (0.0034) [2024-07-02 11:23:56,095][36761] Fps is (10 sec: 44237.8, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 46088192. Throughput: 0: 42762.8. Samples: 46248600. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-07-02 11:23:56,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:23:58,231][36999] Updated weights for policy 0, policy_version 2820 (0.0031) [2024-07-02 11:24:01,099][36761] Fps is (10 sec: 39306.9, 60 sec: 42595.8, 300 sec: 42431.3). Total num frames: 46284800. Throughput: 0: 42503.6. Samples: 46373380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 11:24:01,100][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:24:02,237][36999] Updated weights for policy 0, policy_version 2830 (0.0031) [2024-07-02 11:24:05,097][36979] Signal inference workers to stop experience collection... (650 times) [2024-07-02 11:24:05,144][36999] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-07-02 11:24:05,207][36979] Signal inference workers to resume experience collection... (650 times) [2024-07-02 11:24:05,207][36999] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-07-02 11:24:05,909][36999] Updated weights for policy 0, policy_version 2840 (0.0027) [2024-07-02 11:24:06,095][36761] Fps is (10 sec: 45875.2, 60 sec: 43690.6, 300 sec: 42598.4). Total num frames: 46546944. Throughput: 0: 42655.1. Samples: 46632920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-07-02 11:24:06,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:24:10,134][36999] Updated weights for policy 0, policy_version 2850 (0.0050) [2024-07-02 11:24:11,095][36761] Fps is (10 sec: 44253.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 46727168. Throughput: 0: 42654.5. Samples: 46885540. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-07-02 11:24:11,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:24:13,732][36999] Updated weights for policy 0, policy_version 2860 (0.0041) [2024-07-02 11:24:16,095][36761] Fps is (10 sec: 37683.4, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 46923776. Throughput: 0: 42299.9. Samples: 47003540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 11:24:16,095][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:24:17,943][36999] Updated weights for policy 0, policy_version 2870 (0.0022) [2024-07-02 11:24:21,095][36761] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42598.8). Total num frames: 47153152. Throughput: 0: 42435.1. Samples: 47260560. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-07-02 11:24:21,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:24:21,409][36999] Updated weights for policy 0, policy_version 2880 (0.0033) [2024-07-02 11:24:25,596][36999] Updated weights for policy 0, policy_version 2890 (0.0031) [2024-07-02 11:24:26,096][36761] Fps is (10 sec: 44235.8, 60 sec: 42327.2, 300 sec: 42598.4). Total num frames: 47366144. Throughput: 0: 42551.0. Samples: 47517500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 11:24:26,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:24:29,330][36999] Updated weights for policy 0, policy_version 2900 (0.0036) [2024-07-02 11:24:31,095][36761] Fps is (10 sec: 39322.2, 60 sec: 42328.6, 300 sec: 42431.8). Total num frames: 47546368. Throughput: 0: 42355.8. Samples: 47641920. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-07-02 11:24:31,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:24:33,403][36999] Updated weights for policy 0, policy_version 2910 (0.0040) [2024-07-02 11:24:36,095][36761] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 47792128. Throughput: 0: 42440.1. Samples: 47901760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-07-02 11:24:36,096][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:24:36,202][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000002918_47808512.pth... [2024-07-02 11:24:36,255][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000002293_37568512.pth [2024-07-02 11:24:37,089][36999] Updated weights for policy 0, policy_version 2920 (0.0033) [2024-07-02 11:24:41,038][36999] Updated weights for policy 0, policy_version 2930 (0.0042) [2024-07-02 11:24:41,095][36761] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 48005120. Throughput: 0: 42476.5. Samples: 48160040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 11:24:41,095][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:24:44,730][36999] Updated weights for policy 0, policy_version 2940 (0.0034) [2024-07-02 11:24:46,095][36761] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 48201728. Throughput: 0: 42468.9. Samples: 48284320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 11:24:46,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:24:48,866][36999] Updated weights for policy 0, policy_version 2950 (0.0034) [2024-07-02 11:24:51,100][36761] Fps is (10 sec: 42578.5, 60 sec: 42322.2, 300 sec: 42597.8). Total num frames: 48431104. Throughput: 0: 42399.7. Samples: 48541100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 11:24:51,100][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:24:52,375][36999] Updated weights for policy 0, policy_version 2960 (0.0023) [2024-07-02 11:24:56,096][36761] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 48627712. Throughput: 0: 42569.7. Samples: 48801180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 11:24:56,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:24:56,627][36999] Updated weights for policy 0, policy_version 2970 (0.0029) [2024-07-02 11:24:59,920][36999] Updated weights for policy 0, policy_version 2980 (0.0040) [2024-07-02 11:25:01,095][36761] Fps is (10 sec: 42618.2, 60 sec: 42874.2, 300 sec: 42598.4). Total num frames: 48857088. Throughput: 0: 42653.8. Samples: 48922960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 11:25:01,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:25:04,051][36999] Updated weights for policy 0, policy_version 2990 (0.0027) [2024-07-02 11:25:06,095][36761] Fps is (10 sec: 44237.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 49070080. Throughput: 0: 42768.6. Samples: 49185140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 11:25:06,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:25:07,738][36999] Updated weights for policy 0, policy_version 3000 (0.0038) [2024-07-02 11:25:11,095][36761] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 49283072. Throughput: 0: 42806.3. Samples: 49443780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 11:25:11,096][36761] Avg episode reward: [(0, '0.002')] [2024-07-02 11:25:11,595][36999] Updated weights for policy 0, policy_version 3010 (0.0032) [2024-07-02 11:25:15,925][36999] Updated weights for policy 0, policy_version 3020 (0.0035) [2024-07-02 11:25:16,095][36761] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 49479680. Throughput: 0: 42657.2. Samples: 49561500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 11:25:16,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:25:18,497][36979] Signal inference workers to stop experience collection... (700 times) [2024-07-02 11:25:18,519][36999] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-07-02 11:25:18,557][36979] Signal inference workers to resume experience collection... (700 times) [2024-07-02 11:25:18,557][36999] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-07-02 11:25:19,152][36999] Updated weights for policy 0, policy_version 3030 (0.0028) [2024-07-02 11:25:21,095][36761] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 49725440. Throughput: 0: 42762.1. Samples: 49826060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 11:25:21,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:25:23,559][36999] Updated weights for policy 0, policy_version 3040 (0.0035) [2024-07-02 11:25:26,095][36761] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 49905664. Throughput: 0: 42642.5. Samples: 50078960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-07-02 11:25:26,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:25:26,888][36999] Updated weights for policy 0, policy_version 3050 (0.0037) [2024-07-02 11:25:31,056][36999] Updated weights for policy 0, policy_version 3060 (0.0038) [2024-07-02 11:25:31,095][36761] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42598.7). Total num frames: 50135040. Throughput: 0: 42588.1. Samples: 50200780. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-07-02 11:25:31,095][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:25:34,464][36999] Updated weights for policy 0, policy_version 3070 (0.0033) [2024-07-02 11:25:36,095][36761] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 50348032. Throughput: 0: 42485.6. Samples: 50452760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-07-02 11:25:36,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:25:38,781][36999] Updated weights for policy 0, policy_version 3080 (0.0040) [2024-07-02 11:25:41,095][36761] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 50544640. Throughput: 0: 42511.3. Samples: 50714180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 11:25:41,095][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:25:42,085][36999] Updated weights for policy 0, policy_version 3090 (0.0043) [2024-07-02 11:25:46,095][36761] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 50757632. Throughput: 0: 42595.1. Samples: 50839740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-07-02 11:25:46,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:25:46,427][36999] Updated weights for policy 0, policy_version 3100 (0.0036) [2024-07-02 11:25:49,575][36999] Updated weights for policy 0, policy_version 3110 (0.0030) [2024-07-02 11:25:51,095][36761] Fps is (10 sec: 44236.5, 60 sec: 42601.6, 300 sec: 42598.4). Total num frames: 50987008. Throughput: 0: 42330.6. Samples: 51090020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-07-02 11:25:51,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:25:54,420][36999] Updated weights for policy 0, policy_version 3120 (0.0038) [2024-07-02 11:25:56,095][36761] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 51200000. Throughput: 0: 42567.2. Samples: 51359300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 11:25:56,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:25:57,541][36999] Updated weights for policy 0, policy_version 3130 (0.0050) [2024-07-02 11:26:01,095][36761] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 51396608. Throughput: 0: 42589.3. Samples: 51478020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-07-02 11:26:01,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:26:02,071][36999] Updated weights for policy 0, policy_version 3140 (0.0036) [2024-07-02 11:26:05,511][36999] Updated weights for policy 0, policy_version 3150 (0.0037) [2024-07-02 11:26:06,095][36761] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 51625984. Throughput: 0: 42373.4. Samples: 51732860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 11:26:06,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:26:09,907][36999] Updated weights for policy 0, policy_version 3160 (0.0041) [2024-07-02 11:26:11,095][36761] Fps is (10 sec: 42599.3, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 51822592. Throughput: 0: 42460.5. Samples: 51989680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-07-02 11:26:11,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:26:13,094][36999] Updated weights for policy 0, policy_version 3170 (0.0039) [2024-07-02 11:26:16,095][36761] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 52035584. Throughput: 0: 42551.5. Samples: 52115600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 11:26:16,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:26:17,395][36999] Updated weights for policy 0, policy_version 3180 (0.0036) [2024-07-02 11:26:20,659][36999] Updated weights for policy 0, policy_version 3190 (0.0051) [2024-07-02 11:26:21,095][36761] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42599.0). Total num frames: 52264960. Throughput: 0: 42631.0. Samples: 52371160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-07-02 11:26:21,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:26:24,858][36999] Updated weights for policy 0, policy_version 3200 (0.0030) [2024-07-02 11:26:26,095][36761] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 52461568. Throughput: 0: 42634.1. Samples: 52632720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 11:26:26,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:26:28,252][36999] Updated weights for policy 0, policy_version 3210 (0.0036) [2024-07-02 11:26:31,095][36761] Fps is (10 sec: 40960.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 52674560. Throughput: 0: 42651.1. Samples: 52759040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 27.0) [2024-07-02 11:26:31,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:26:32,445][36999] Updated weights for policy 0, policy_version 3220 (0.0044) [2024-07-02 11:26:36,095][36761] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 52903936. Throughput: 0: 42679.6. Samples: 53010600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 11:26:36,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:26:36,136][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000003230_52920320.pth... [2024-07-02 11:26:36,143][36999] Updated weights for policy 0, policy_version 3230 (0.0030) [2024-07-02 11:26:36,192][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000002605_42680320.pth [2024-07-02 11:26:40,396][36999] Updated weights for policy 0, policy_version 3240 (0.0043) [2024-07-02 11:26:41,095][36761] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 53100544. Throughput: 0: 42498.2. Samples: 53271720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 11:26:41,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:26:43,950][36999] Updated weights for policy 0, policy_version 3250 (0.0044) [2024-07-02 11:26:46,100][36761] Fps is (10 sec: 39303.5, 60 sec: 42322.1, 300 sec: 42486.7). Total num frames: 53297152. Throughput: 0: 42648.2. Samples: 53397380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 11:26:46,100][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:26:47,906][36999] Updated weights for policy 0, policy_version 3260 (0.0045) [2024-07-02 11:26:51,095][36761] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42599.1). Total num frames: 53559296. Throughput: 0: 42636.0. Samples: 53651480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 11:26:51,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:26:51,511][36999] Updated weights for policy 0, policy_version 3270 (0.0024) [2024-07-02 11:26:55,726][36999] Updated weights for policy 0, policy_version 3280 (0.0036) [2024-07-02 11:26:56,095][36761] Fps is (10 sec: 44256.8, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 53739520. Throughput: 0: 42752.8. Samples: 53913560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 11:26:56,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:26:59,641][36999] Updated weights for policy 0, policy_version 3290 (0.0038) [2024-07-02 11:27:01,100][36761] Fps is (10 sec: 39303.6, 60 sec: 42595.3, 300 sec: 42597.7). Total num frames: 53952512. Throughput: 0: 42766.8. Samples: 54040300. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-07-02 11:27:01,100][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:27:03,262][36999] Updated weights for policy 0, policy_version 3300 (0.0032) [2024-07-02 11:27:04,281][36979] Signal inference workers to stop experience collection... (750 times) [2024-07-02 11:27:04,281][36979] Signal inference workers to resume experience collection... (750 times) [2024-07-02 11:27:04,325][36999] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-07-02 11:27:04,325][36999] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-07-02 11:27:06,095][36761] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 54198272. Throughput: 0: 42725.5. Samples: 54293800. Policy #0 lag: (min: 1.0, avg: 12.2, max: 20.0) [2024-07-02 11:27:06,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:27:07,181][36999] Updated weights for policy 0, policy_version 3310 (0.0033) [2024-07-02 11:27:10,809][36999] Updated weights for policy 0, policy_version 3320 (0.0042) [2024-07-02 11:27:11,095][36761] Fps is (10 sec: 44256.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 54394880. Throughput: 0: 42683.6. Samples: 54553480. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-07-02 11:27:11,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:27:14,767][36999] Updated weights for policy 0, policy_version 3330 (0.0029) [2024-07-02 11:27:16,095][36761] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 54591488. Throughput: 0: 42657.1. Samples: 54678620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 11:27:16,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:27:18,658][36999] Updated weights for policy 0, policy_version 3340 (0.0040) [2024-07-02 11:27:21,100][36761] Fps is (10 sec: 45854.6, 60 sec: 43141.4, 300 sec: 42708.8). Total num frames: 54853632. Throughput: 0: 42789.0. Samples: 54936300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 11:27:21,100][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:27:22,546][36999] Updated weights for policy 0, policy_version 3350 (0.0034) [2024-07-02 11:27:26,095][36761] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 55017472. Throughput: 0: 42799.6. Samples: 55197700. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-07-02 11:27:26,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:27:26,265][36999] Updated weights for policy 0, policy_version 3360 (0.0051) [2024-07-02 11:27:30,221][36999] Updated weights for policy 0, policy_version 3370 (0.0034) [2024-07-02 11:27:31,095][36761] Fps is (10 sec: 39339.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 55246848. Throughput: 0: 42649.6. Samples: 55316420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-07-02 11:27:31,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:27:33,875][36999] Updated weights for policy 0, policy_version 3380 (0.0032) [2024-07-02 11:27:36,095][36761] Fps is (10 sec: 49151.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 55508992. Throughput: 0: 42846.2. Samples: 55579560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-07-02 11:27:36,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:27:37,760][36999] Updated weights for policy 0, policy_version 3390 (0.0032) [2024-07-02 11:27:41,095][36761] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 55672832. Throughput: 0: 42788.5. Samples: 55839040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-07-02 11:27:41,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:27:41,584][36999] Updated weights for policy 0, policy_version 3400 (0.0035) [2024-07-02 11:27:45,337][36999] Updated weights for policy 0, policy_version 3410 (0.0028) [2024-07-02 11:27:46,095][36761] Fps is (10 sec: 36044.7, 60 sec: 42874.7, 300 sec: 42653.9). Total num frames: 55869440. Throughput: 0: 42793.2. Samples: 55965800. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-07-02 11:27:46,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:27:49,284][36999] Updated weights for policy 0, policy_version 3420 (0.0029) [2024-07-02 11:27:51,095][36761] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 56131584. Throughput: 0: 42941.2. Samples: 56226160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-07-02 11:27:51,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:27:52,959][36999] Updated weights for policy 0, policy_version 3430 (0.0047) [2024-07-02 11:27:56,095][36761] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 56295424. Throughput: 0: 42922.3. Samples: 56484980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-07-02 11:27:56,095][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:27:56,925][36999] Updated weights for policy 0, policy_version 3440 (0.0027) [2024-07-02 11:28:00,624][36999] Updated weights for policy 0, policy_version 3450 (0.0034) [2024-07-02 11:28:01,095][36761] Fps is (10 sec: 39321.8, 60 sec: 42874.7, 300 sec: 42709.5). Total num frames: 56524800. Throughput: 0: 42724.6. Samples: 56601220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 11:28:01,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:28:04,770][36999] Updated weights for policy 0, policy_version 3460 (0.0023) [2024-07-02 11:28:06,095][36761] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 56770560. Throughput: 0: 42846.5. Samples: 56864200. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-07-02 11:28:06,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:28:08,463][36999] Updated weights for policy 0, policy_version 3470 (0.0033) [2024-07-02 11:28:11,095][36761] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 56950784. Throughput: 0: 42729.6. Samples: 57120540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 11:28:11,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:28:12,743][36999] Updated weights for policy 0, policy_version 3480 (0.0029) [2024-07-02 11:28:16,095][36761] Fps is (10 sec: 39322.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 57163776. Throughput: 0: 42801.0. Samples: 57242460. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-07-02 11:28:16,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:28:16,371][36999] Updated weights for policy 0, policy_version 3490 (0.0039) [2024-07-02 11:28:19,458][36979] Signal inference workers to stop experience collection... (800 times) [2024-07-02 11:28:19,459][36979] Signal inference workers to resume experience collection... (800 times) [2024-07-02 11:28:19,503][36999] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-07-02 11:28:19,503][36999] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-07-02 11:28:20,426][36999] Updated weights for policy 0, policy_version 3500 (0.0050) [2024-07-02 11:28:21,095][36761] Fps is (10 sec: 42599.1, 60 sec: 42055.5, 300 sec: 42543.3). Total num frames: 57376768. Throughput: 0: 42785.8. Samples: 57504920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 11:28:21,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:28:23,937][36999] Updated weights for policy 0, policy_version 3510 (0.0033) [2024-07-02 11:28:26,095][36761] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42654.6). Total num frames: 57589760. Throughput: 0: 42700.8. Samples: 57760580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-07-02 11:28:26,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:28:28,117][36999] Updated weights for policy 0, policy_version 3520 (0.0034) [2024-07-02 11:28:31,095][36761] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 57819136. Throughput: 0: 42692.0. Samples: 57886940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 11:28:31,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:28:31,882][36999] Updated weights for policy 0, policy_version 3530 (0.0053) [2024-07-02 11:28:35,878][36999] Updated weights for policy 0, policy_version 3540 (0.0035) [2024-07-02 11:28:36,095][36761] Fps is (10 sec: 42598.7, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 58015744. Throughput: 0: 42548.0. Samples: 58140820. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-07-02 11:28:36,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:28:36,110][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000003541_58015744.pth... [2024-07-02 11:28:36,158][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000002918_47808512.pth [2024-07-02 11:28:39,651][36999] Updated weights for policy 0, policy_version 3550 (0.0033) [2024-07-02 11:28:41,095][36761] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 58228736. Throughput: 0: 42429.6. Samples: 58394320. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-07-02 11:28:41,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:28:43,449][36999] Updated weights for policy 0, policy_version 3560 (0.0034) [2024-07-02 11:28:46,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 58474496. Throughput: 0: 42812.4. Samples: 58527780. Policy #0 lag: (min: 2.0, avg: 10.5, max: 21.0) [2024-07-02 11:28:46,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:28:47,219][36999] Updated weights for policy 0, policy_version 3570 (0.0022) [2024-07-02 11:28:51,095][36761] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 58638336. Throughput: 0: 42528.9. Samples: 58778000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-07-02 11:28:51,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:28:51,355][36999] Updated weights for policy 0, policy_version 3580 (0.0038) [2024-07-02 11:28:54,846][36999] Updated weights for policy 0, policy_version 3590 (0.0036) [2024-07-02 11:28:56,096][36761] Fps is (10 sec: 39320.7, 60 sec: 42871.2, 300 sec: 42654.5). Total num frames: 58867712. Throughput: 0: 42379.9. Samples: 59027640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-07-02 11:28:56,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:28:59,191][36999] Updated weights for policy 0, policy_version 3600 (0.0033) [2024-07-02 11:29:01,095][36761] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 59080704. Throughput: 0: 42776.5. Samples: 59167400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 11:29:01,095][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:29:02,359][36999] Updated weights for policy 0, policy_version 3610 (0.0037) [2024-07-02 11:29:06,095][36761] Fps is (10 sec: 40961.0, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 59277312. Throughput: 0: 42491.1. Samples: 59417020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 11:29:06,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:29:06,719][36999] Updated weights for policy 0, policy_version 3620 (0.0029) [2024-07-02 11:29:09,943][36999] Updated weights for policy 0, policy_version 3630 (0.0037) [2024-07-02 11:29:11,095][36761] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 59506688. Throughput: 0: 42460.5. Samples: 59671300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-07-02 11:29:11,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:29:14,205][36999] Updated weights for policy 0, policy_version 3640 (0.0028) [2024-07-02 11:29:16,095][36761] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 59736064. Throughput: 0: 42682.2. Samples: 59807640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 11:29:16,095][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:29:17,381][36999] Updated weights for policy 0, policy_version 3650 (0.0041) [2024-07-02 11:29:21,095][36761] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 59932672. Throughput: 0: 42724.1. Samples: 60063400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 11:29:21,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:29:21,892][36999] Updated weights for policy 0, policy_version 3660 (0.0041) [2024-07-02 11:29:24,945][36999] Updated weights for policy 0, policy_version 3670 (0.0033) [2024-07-02 11:29:26,095][36761] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 60162048. Throughput: 0: 42761.0. Samples: 60318560. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-07-02 11:29:26,096][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:29:29,464][36999] Updated weights for policy 0, policy_version 3680 (0.0035) [2024-07-02 11:29:31,095][36761] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 60375040. Throughput: 0: 42728.9. Samples: 60450580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-07-02 11:29:31,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:29:32,795][36999] Updated weights for policy 0, policy_version 3690 (0.0037) [2024-07-02 11:29:36,095][36761] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 60571648. Throughput: 0: 42695.1. Samples: 60699280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 11:29:36,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:29:37,218][36999] Updated weights for policy 0, policy_version 3700 (0.0031) [2024-07-02 11:29:37,559][36979] Signal inference workers to stop experience collection... (850 times) [2024-07-02 11:29:37,561][36979] Signal inference workers to resume experience collection... (850 times) [2024-07-02 11:29:37,572][36999] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-07-02 11:29:37,604][36999] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-07-02 11:29:40,492][36999] Updated weights for policy 0, policy_version 3710 (0.0023) [2024-07-02 11:29:41,095][36761] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 60801024. Throughput: 0: 42754.1. Samples: 60951560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 11:29:41,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:29:44,819][36999] Updated weights for policy 0, policy_version 3720 (0.0029) [2024-07-02 11:29:46,095][36761] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42654.6). Total num frames: 61014016. Throughput: 0: 42696.8. Samples: 61088760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 11:29:46,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:29:48,126][36999] Updated weights for policy 0, policy_version 3730 (0.0026) [2024-07-02 11:29:51,100][36761] Fps is (10 sec: 40941.3, 60 sec: 42868.3, 300 sec: 42653.3). Total num frames: 61210624. Throughput: 0: 42597.1. Samples: 61334080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 11:29:51,100][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:29:52,365][36999] Updated weights for policy 0, policy_version 3740 (0.0034) [2024-07-02 11:29:56,041][36999] Updated weights for policy 0, policy_version 3750 (0.0041) [2024-07-02 11:29:56,095][36761] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 61440000. Throughput: 0: 42728.0. Samples: 61594060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 11:29:56,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:30:00,011][36999] Updated weights for policy 0, policy_version 3760 (0.0040) [2024-07-02 11:30:01,095][36761] Fps is (10 sec: 44256.2, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 61652992. Throughput: 0: 42565.6. Samples: 61723100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-07-02 11:30:01,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:30:03,963][36999] Updated weights for policy 0, policy_version 3770 (0.0046) [2024-07-02 11:30:06,095][36761] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 61849600. Throughput: 0: 42351.1. Samples: 61969200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 11:30:06,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:30:07,867][36999] Updated weights for policy 0, policy_version 3780 (0.0045) [2024-07-02 11:30:11,100][36761] Fps is (10 sec: 40942.9, 60 sec: 42595.4, 300 sec: 42653.3). Total num frames: 62062592. Throughput: 0: 42388.4. Samples: 62226220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-07-02 11:30:11,100][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:30:11,636][36999] Updated weights for policy 0, policy_version 3790 (0.0031) [2024-07-02 11:30:15,609][36999] Updated weights for policy 0, policy_version 3800 (0.0032) [2024-07-02 11:30:16,095][36761] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 62275584. Throughput: 0: 42318.3. Samples: 62354900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-07-02 11:30:16,098][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:30:19,334][36999] Updated weights for policy 0, policy_version 3810 (0.0030) [2024-07-02 11:30:21,095][36761] Fps is (10 sec: 44255.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 62504960. Throughput: 0: 42399.1. Samples: 62607240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 11:30:21,097][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:30:23,198][36999] Updated weights for policy 0, policy_version 3820 (0.0039) [2024-07-02 11:30:26,095][36761] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 62701568. Throughput: 0: 42489.6. Samples: 62863600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-07-02 11:30:26,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:30:26,978][36999] Updated weights for policy 0, policy_version 3830 (0.0031) [2024-07-02 11:30:30,776][36999] Updated weights for policy 0, policy_version 3840 (0.0030) [2024-07-02 11:30:31,095][36761] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 62930944. Throughput: 0: 42215.0. Samples: 62988440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 11:30:31,096][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:30:34,812][36999] Updated weights for policy 0, policy_version 3850 (0.0033) [2024-07-02 11:30:36,095][36761] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 63143936. Throughput: 0: 42554.5. Samples: 63248840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 11:30:36,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:30:36,142][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000003855_63160320.pth... [2024-07-02 11:30:36,200][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000003230_52920320.pth [2024-07-02 11:30:38,334][36999] Updated weights for policy 0, policy_version 3860 (0.0043) [2024-07-02 11:30:41,095][36761] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 63324160. Throughput: 0: 42464.9. Samples: 63504980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 11:30:41,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:30:42,423][36999] Updated weights for policy 0, policy_version 3870 (0.0039) [2024-07-02 11:30:46,081][36999] Updated weights for policy 0, policy_version 3880 (0.0043) [2024-07-02 11:30:46,095][36761] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 63569920. Throughput: 0: 42274.3. Samples: 63625440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-07-02 11:30:46,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:30:50,368][36979] Signal inference workers to stop experience collection... (900 times) [2024-07-02 11:30:50,426][36999] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-07-02 11:30:50,429][36979] Signal inference workers to resume experience collection... (900 times) [2024-07-02 11:30:50,442][36999] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-07-02 11:30:50,445][36999] Updated weights for policy 0, policy_version 3890 (0.0035) [2024-07-02 11:30:51,095][36761] Fps is (10 sec: 42598.3, 60 sec: 42328.5, 300 sec: 42542.9). Total num frames: 63750144. Throughput: 0: 42535.6. Samples: 63883300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-07-02 11:30:51,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:30:53,744][36999] Updated weights for policy 0, policy_version 3900 (0.0039) [2024-07-02 11:30:56,100][36761] Fps is (10 sec: 39303.8, 60 sec: 42049.0, 300 sec: 42597.8). Total num frames: 63963136. Throughput: 0: 42367.7. Samples: 64132780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 11:30:56,101][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:30:58,348][36999] Updated weights for policy 0, policy_version 3910 (0.0039) [2024-07-02 11:31:01,095][36761] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 64176128. Throughput: 0: 42272.7. Samples: 64257180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 11:31:01,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:31:01,571][36999] Updated weights for policy 0, policy_version 3920 (0.0033) [2024-07-02 11:31:06,095][36761] Fps is (10 sec: 40979.1, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 64372736. Throughput: 0: 42227.2. Samples: 64507460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 11:31:06,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:31:06,157][36999] Updated weights for policy 0, policy_version 3930 (0.0048) [2024-07-02 11:31:09,397][36999] Updated weights for policy 0, policy_version 3940 (0.0039) [2024-07-02 11:31:11,095][36761] Fps is (10 sec: 42598.7, 60 sec: 42328.3, 300 sec: 42598.4). Total num frames: 64602112. Throughput: 0: 42076.1. Samples: 64757020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 11:31:11,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:31:14,091][36999] Updated weights for policy 0, policy_version 3950 (0.0032) [2024-07-02 11:31:16,095][36761] Fps is (10 sec: 42597.7, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 64798720. Throughput: 0: 42238.7. Samples: 64889180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-07-02 11:31:16,096][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:31:17,394][36999] Updated weights for policy 0, policy_version 3960 (0.0041) [2024-07-02 11:31:21,095][36761] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 42487.3). Total num frames: 64995328. Throughput: 0: 41960.9. Samples: 65137080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 11:31:21,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:31:21,823][36999] Updated weights for policy 0, policy_version 3970 (0.0039) [2024-07-02 11:31:25,090][36999] Updated weights for policy 0, policy_version 3980 (0.0038) [2024-07-02 11:31:26,095][36761] Fps is (10 sec: 44237.6, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 65241088. Throughput: 0: 41824.5. Samples: 65387080. Policy #0 lag: (min: 2.0, avg: 11.7, max: 21.0) [2024-07-02 11:31:26,095][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:31:29,470][36999] Updated weights for policy 0, policy_version 3990 (0.0039) [2024-07-02 11:31:31,095][36761] Fps is (10 sec: 44236.5, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 65437696. Throughput: 0: 42081.0. Samples: 65519080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 11:31:31,096][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:31:32,661][36999] Updated weights for policy 0, policy_version 4000 (0.0037) [2024-07-02 11:31:36,095][36761] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 42487.3). Total num frames: 65634304. Throughput: 0: 41867.1. Samples: 65767320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 11:31:36,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:31:37,122][36999] Updated weights for policy 0, policy_version 4010 (0.0028) [2024-07-02 11:31:40,415][36999] Updated weights for policy 0, policy_version 4020 (0.0030) [2024-07-02 11:31:41,095][36761] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42654.6). Total num frames: 65880064. Throughput: 0: 41989.1. Samples: 66022100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-07-02 11:31:41,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:31:44,850][36999] Updated weights for policy 0, policy_version 4030 (0.0033) [2024-07-02 11:31:46,095][36761] Fps is (10 sec: 44236.9, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 66076672. Throughput: 0: 42258.8. Samples: 66158820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-07-02 11:31:46,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:31:48,128][36999] Updated weights for policy 0, policy_version 4040 (0.0022) [2024-07-02 11:31:51,095][36761] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 66273280. Throughput: 0: 42195.5. Samples: 66406260. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-07-02 11:31:51,096][36761] Avg episode reward: [(0, '0.004')] [2024-07-02 11:31:52,622][36999] Updated weights for policy 0, policy_version 4050 (0.0039) [2024-07-02 11:31:55,723][36999] Updated weights for policy 0, policy_version 4060 (0.0034) [2024-07-02 11:31:56,095][36761] Fps is (10 sec: 44236.8, 60 sec: 42601.7, 300 sec: 42599.1). Total num frames: 66519040. Throughput: 0: 42124.5. Samples: 66652620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 11:31:56,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:32:00,190][36999] Updated weights for policy 0, policy_version 4070 (0.0044) [2024-07-02 11:32:01,095][36761] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 66699264. Throughput: 0: 42216.9. Samples: 66788940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 11:32:01,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:32:03,608][36999] Updated weights for policy 0, policy_version 4080 (0.0036) [2024-07-02 11:32:06,095][36761] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 66912256. Throughput: 0: 42303.1. Samples: 67040720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 11:32:06,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:32:08,021][36999] Updated weights for policy 0, policy_version 4090 (0.0034) [2024-07-02 11:32:11,095][36761] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 67141632. Throughput: 0: 42215.0. Samples: 67286760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 11:32:11,100][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:32:11,475][36999] Updated weights for policy 0, policy_version 4100 (0.0039) [2024-07-02 11:32:14,860][36979] Signal inference workers to stop experience collection... (950 times) [2024-07-02 11:32:14,861][36979] Signal inference workers to resume experience collection... (950 times) [2024-07-02 11:32:14,879][36999] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-07-02 11:32:14,879][36999] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-07-02 11:32:15,927][36999] Updated weights for policy 0, policy_version 4110 (0.0032) [2024-07-02 11:32:16,095][36761] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42321.4). Total num frames: 67338240. Throughput: 0: 42031.6. Samples: 67410500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 11:32:16,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:32:19,073][36999] Updated weights for policy 0, policy_version 4120 (0.0044) [2024-07-02 11:32:21,095][36761] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 67551232. Throughput: 0: 42171.9. Samples: 67665060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 11:32:21,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:32:24,170][36999] Updated weights for policy 0, policy_version 4130 (0.0048) [2024-07-02 11:32:26,095][36761] Fps is (10 sec: 40959.4, 60 sec: 41779.1, 300 sec: 42376.2). Total num frames: 67747840. Throughput: 0: 42234.2. Samples: 67922640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-07-02 11:32:26,098][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:32:27,149][36999] Updated weights for policy 0, policy_version 4140 (0.0033) [2024-07-02 11:32:31,095][36761] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 67960832. Throughput: 0: 41852.0. Samples: 68042160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-07-02 11:32:31,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:32:32,027][36999] Updated weights for policy 0, policy_version 4150 (0.0025) [2024-07-02 11:32:35,087][36999] Updated weights for policy 0, policy_version 4160 (0.0031) [2024-07-02 11:32:36,095][36761] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 68190208. Throughput: 0: 41960.4. Samples: 68294480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 11:32:36,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:32:36,207][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000004163_68206592.pth... [2024-07-02 11:32:36,264][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000003541_58015744.pth [2024-07-02 11:32:39,853][36999] Updated weights for policy 0, policy_version 4170 (0.0032) [2024-07-02 11:32:41,095][36761] Fps is (10 sec: 39320.9, 60 sec: 41233.0, 300 sec: 42320.7). Total num frames: 68354048. Throughput: 0: 42301.6. Samples: 68556200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-07-02 11:32:41,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:32:42,964][36999] Updated weights for policy 0, policy_version 4180 (0.0034) [2024-07-02 11:32:46,095][36761] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 68616192. Throughput: 0: 41860.0. Samples: 68672640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-07-02 11:32:46,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:32:47,655][36999] Updated weights for policy 0, policy_version 4190 (0.0044) [2024-07-02 11:32:50,960][36999] Updated weights for policy 0, policy_version 4200 (0.0034) [2024-07-02 11:32:51,095][36761] Fps is (10 sec: 45876.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 68812800. Throughput: 0: 41968.5. Samples: 68929300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-07-02 11:32:51,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:32:55,203][36999] Updated weights for policy 0, policy_version 4210 (0.0034) [2024-07-02 11:32:56,095][36761] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 42320.7). Total num frames: 69009408. Throughput: 0: 42200.0. Samples: 69185760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:32:56,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:32:58,453][36999] Updated weights for policy 0, policy_version 4220 (0.0022) [2024-07-02 11:33:01,095][36761] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 69255168. Throughput: 0: 42266.2. Samples: 69312480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 11:33:01,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:33:02,917][36999] Updated weights for policy 0, policy_version 4230 (0.0037) [2024-07-02 11:33:06,095][36761] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 69451776. Throughput: 0: 42243.1. Samples: 69566000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 11:33:06,096][36761] Avg episode reward: [(0, '0.003')] [2024-07-02 11:33:06,191][36999] Updated weights for policy 0, policy_version 4240 (0.0046) [2024-07-02 11:33:10,640][36999] Updated weights for policy 0, policy_version 4250 (0.0046) [2024-07-02 11:33:11,095][36761] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 69648384. Throughput: 0: 42216.5. Samples: 69822380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-07-02 11:33:11,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:33:12,787][36979] Signal inference workers to stop experience collection... (1000 times) [2024-07-02 11:33:12,788][36979] Signal inference workers to resume experience collection... (1000 times) [2024-07-02 11:33:12,838][36999] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-07-02 11:33:12,839][36999] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-07-02 11:33:13,950][36999] Updated weights for policy 0, policy_version 4260 (0.0032) [2024-07-02 11:33:16,095][36761] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 69877760. Throughput: 0: 42304.3. Samples: 69945860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 11:33:16,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:33:18,498][36999] Updated weights for policy 0, policy_version 4270 (0.0048) [2024-07-02 11:33:21,095][36761] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 70074368. Throughput: 0: 42276.5. Samples: 70196920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-07-02 11:33:21,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:33:21,686][36999] Updated weights for policy 0, policy_version 4280 (0.0043) [2024-07-02 11:33:26,100][36761] Fps is (10 sec: 39304.0, 60 sec: 42049.1, 300 sec: 42209.0). Total num frames: 70270976. Throughput: 0: 42302.5. Samples: 70460000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-07-02 11:33:26,101][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:33:26,307][36999] Updated weights for policy 0, policy_version 4290 (0.0035) [2024-07-02 11:33:29,384][36999] Updated weights for policy 0, policy_version 4300 (0.0030) [2024-07-02 11:33:31,095][36761] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 70500352. Throughput: 0: 42433.5. Samples: 70582140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 11:33:31,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:33:33,965][36999] Updated weights for policy 0, policy_version 4310 (0.0033) [2024-07-02 11:33:36,095][36761] Fps is (10 sec: 45896.1, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 70729728. Throughput: 0: 42519.0. Samples: 70842660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 11:33:36,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:33:36,965][36999] Updated weights for policy 0, policy_version 4320 (0.0028) [2024-07-02 11:33:41,095][36761] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 70926336. Throughput: 0: 42521.3. Samples: 71099220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 11:33:41,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:33:41,524][36999] Updated weights for policy 0, policy_version 4330 (0.0036) [2024-07-02 11:33:44,536][36999] Updated weights for policy 0, policy_version 4340 (0.0042) [2024-07-02 11:33:46,095][36761] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 71139328. Throughput: 0: 42367.2. Samples: 71219000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 11:33:46,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:33:49,117][36999] Updated weights for policy 0, policy_version 4350 (0.0042) [2024-07-02 11:33:51,096][36761] Fps is (10 sec: 44236.3, 60 sec: 42598.2, 300 sec: 42376.3). Total num frames: 71368704. Throughput: 0: 42527.9. Samples: 71479760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-07-02 11:33:51,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:33:52,504][36999] Updated weights for policy 0, policy_version 4360 (0.0039) [2024-07-02 11:33:56,095][36761] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 71565312. Throughput: 0: 42531.1. Samples: 71736280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 11:33:56,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:33:56,615][36999] Updated weights for policy 0, policy_version 4370 (0.0038) [2024-07-02 11:34:00,361][36999] Updated weights for policy 0, policy_version 4380 (0.0041) [2024-07-02 11:34:01,095][36761] Fps is (10 sec: 40961.0, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 71778304. Throughput: 0: 42489.9. Samples: 71857900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 11:34:01,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:34:04,406][36999] Updated weights for policy 0, policy_version 4390 (0.0032) [2024-07-02 11:34:06,095][36761] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 72024064. Throughput: 0: 42729.2. Samples: 72119740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 11:34:06,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:34:07,856][36999] Updated weights for policy 0, policy_version 4400 (0.0036) [2024-07-02 11:34:11,095][36761] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 72220672. Throughput: 0: 42518.5. Samples: 72373140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 11:34:11,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:34:11,959][36999] Updated weights for policy 0, policy_version 4410 (0.0039) [2024-07-02 11:34:15,386][36999] Updated weights for policy 0, policy_version 4420 (0.0037) [2024-07-02 11:34:16,095][36761] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 72417280. Throughput: 0: 42517.6. Samples: 72495440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 11:34:16,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:34:19,814][36999] Updated weights for policy 0, policy_version 4430 (0.0038) [2024-07-02 11:34:21,095][36761] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 72630272. Throughput: 0: 42463.6. Samples: 72753520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-07-02 11:34:21,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:34:23,296][36999] Updated weights for policy 0, policy_version 4440 (0.0026) [2024-07-02 11:34:26,095][36761] Fps is (10 sec: 40960.7, 60 sec: 42601.7, 300 sec: 42209.6). Total num frames: 72826880. Throughput: 0: 42575.7. Samples: 73015120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-07-02 11:34:26,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:34:27,470][36999] Updated weights for policy 0, policy_version 4450 (0.0032) [2024-07-02 11:34:30,677][36999] Updated weights for policy 0, policy_version 4460 (0.0032) [2024-07-02 11:34:31,095][36761] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 73072640. Throughput: 0: 42706.6. Samples: 73140800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-07-02 11:34:31,096][36761] Avg episode reward: [(0, '0.005')] [2024-07-02 11:34:35,009][36999] Updated weights for policy 0, policy_version 4470 (0.0037) [2024-07-02 11:34:36,095][36761] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 73269248. Throughput: 0: 42695.7. Samples: 73401060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-07-02 11:34:36,104][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:34:36,253][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000004473_73285632.pth... [2024-07-02 11:34:36,324][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000003855_63160320.pth [2024-07-02 11:34:38,194][36999] Updated weights for policy 0, policy_version 4480 (0.0030) [2024-07-02 11:34:41,096][36761] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 73482240. Throughput: 0: 42820.8. Samples: 73663220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 11:34:41,105][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:34:42,639][36999] Updated weights for policy 0, policy_version 4490 (0.0036) [2024-07-02 11:34:45,879][36999] Updated weights for policy 0, policy_version 4500 (0.0038) [2024-07-02 11:34:46,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 42432.4). Total num frames: 73728000. Throughput: 0: 42895.0. Samples: 73788180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-07-02 11:34:46,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:34:50,216][36979] Signal inference workers to stop experience collection... (1050 times) [2024-07-02 11:34:50,216][36979] Signal inference workers to resume experience collection... (1050 times) [2024-07-02 11:34:50,247][36999] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-07-02 11:34:50,247][36999] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-07-02 11:34:50,365][36999] Updated weights for policy 0, policy_version 4510 (0.0037) [2024-07-02 11:34:51,095][36761] Fps is (10 sec: 42599.4, 60 sec: 42325.5, 300 sec: 42265.2). Total num frames: 73908224. Throughput: 0: 42641.1. Samples: 74038580. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-07-02 11:34:51,095][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:34:53,804][36999] Updated weights for policy 0, policy_version 4520 (0.0036) [2024-07-02 11:34:56,096][36761] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 74121216. Throughput: 0: 42756.7. Samples: 74297200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 11:34:56,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:34:58,058][36999] Updated weights for policy 0, policy_version 4530 (0.0041) [2024-07-02 11:35:01,095][36761] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 74350592. Throughput: 0: 42826.4. Samples: 74422620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 11:35:01,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:35:01,484][36999] Updated weights for policy 0, policy_version 4540 (0.0040) [2024-07-02 11:35:06,095][36761] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 42265.8). Total num frames: 74530816. Throughput: 0: 42755.0. Samples: 74677500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 11:35:06,100][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:35:06,160][36999] Updated weights for policy 0, policy_version 4550 (0.0042) [2024-07-02 11:35:09,064][36999] Updated weights for policy 0, policy_version 4560 (0.0042) [2024-07-02 11:35:11,095][36761] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 74760192. Throughput: 0: 42415.9. Samples: 74923840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-07-02 11:35:11,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:35:13,820][36999] Updated weights for policy 0, policy_version 4570 (0.0030) [2024-07-02 11:35:16,095][36761] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42320.7). Total num frames: 74989568. Throughput: 0: 42582.3. Samples: 75057000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-07-02 11:35:16,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:35:16,953][36999] Updated weights for policy 0, policy_version 4580 (0.0040) [2024-07-02 11:35:21,095][36761] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42209.7). Total num frames: 75153408. Throughput: 0: 42271.2. Samples: 75303260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-07-02 11:35:21,095][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:35:21,674][36999] Updated weights for policy 0, policy_version 4590 (0.0027) [2024-07-02 11:35:24,694][36999] Updated weights for policy 0, policy_version 4600 (0.0031) [2024-07-02 11:35:26,095][36761] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 75399168. Throughput: 0: 42086.0. Samples: 75557080. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-07-02 11:35:26,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:35:29,326][36999] Updated weights for policy 0, policy_version 4610 (0.0031) [2024-07-02 11:35:31,095][36761] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 75612160. Throughput: 0: 42249.4. Samples: 75689400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 11:35:31,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:35:32,280][36999] Updated weights for policy 0, policy_version 4620 (0.0037) [2024-07-02 11:35:36,095][36761] Fps is (10 sec: 39321.0, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 75792384. Throughput: 0: 42276.3. Samples: 75941020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 11:35:36,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:35:37,078][36999] Updated weights for policy 0, policy_version 4630 (0.0033) [2024-07-02 11:35:39,827][36999] Updated weights for policy 0, policy_version 4640 (0.0040) [2024-07-02 11:35:41,095][36761] Fps is (10 sec: 42597.8, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 76038144. Throughput: 0: 42086.3. Samples: 76191080. Policy #0 lag: (min: 0.0, avg: 8.0, max: 23.0) [2024-07-02 11:35:41,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:35:44,798][36999] Updated weights for policy 0, policy_version 4650 (0.0028) [2024-07-02 11:35:46,095][36761] Fps is (10 sec: 47513.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 76267520. Throughput: 0: 42238.1. Samples: 76323340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 11:35:46,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:35:47,974][36999] Updated weights for policy 0, policy_version 4660 (0.0028) [2024-07-02 11:35:51,095][36761] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42321.4). Total num frames: 76447744. Throughput: 0: 42240.9. Samples: 76578340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 11:35:51,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:35:52,351][36999] Updated weights for policy 0, policy_version 4670 (0.0036) [2024-07-02 11:35:55,739][36999] Updated weights for policy 0, policy_version 4680 (0.0035) [2024-07-02 11:35:56,095][36761] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 76677120. Throughput: 0: 42475.5. Samples: 76835240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-07-02 11:35:56,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:35:59,944][36999] Updated weights for policy 0, policy_version 4690 (0.0042) [2024-07-02 11:36:01,097][36761] Fps is (10 sec: 44229.3, 60 sec: 42324.1, 300 sec: 42431.5). Total num frames: 76890112. Throughput: 0: 42330.4. Samples: 76961940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 11:36:01,098][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:36:03,644][36999] Updated weights for policy 0, policy_version 4700 (0.0032) [2024-07-02 11:36:06,095][36761] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 77086720. Throughput: 0: 42477.7. Samples: 77214760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 11:36:06,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:36:07,690][36999] Updated weights for policy 0, policy_version 4710 (0.0032) [2024-07-02 11:36:11,096][36761] Fps is (10 sec: 40965.9, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 77299712. Throughput: 0: 42555.7. Samples: 77472100. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-07-02 11:36:11,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:36:11,509][36999] Updated weights for policy 0, policy_version 4720 (0.0032) [2024-07-02 11:36:15,216][36979] Signal inference workers to stop experience collection... (1100 times) [2024-07-02 11:36:15,250][36999] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-07-02 11:36:15,264][36979] Signal inference workers to resume experience collection... (1100 times) [2024-07-02 11:36:15,275][36999] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-07-02 11:36:15,282][36999] Updated weights for policy 0, policy_version 4730 (0.0035) [2024-07-02 11:36:16,095][36761] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 77529088. Throughput: 0: 42419.5. Samples: 77598280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-07-02 11:36:16,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:36:19,556][36999] Updated weights for policy 0, policy_version 4740 (0.0031) [2024-07-02 11:36:21,095][36761] Fps is (10 sec: 42599.6, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 77725696. Throughput: 0: 42465.9. Samples: 77851980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 11:36:21,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:36:22,975][36999] Updated weights for policy 0, policy_version 4750 (0.0027) [2024-07-02 11:36:26,095][36761] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 77955072. Throughput: 0: 42457.9. Samples: 78101680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 11:36:26,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:36:27,092][36999] Updated weights for policy 0, policy_version 4760 (0.0034) [2024-07-02 11:36:30,661][36999] Updated weights for policy 0, policy_version 4770 (0.0041) [2024-07-02 11:36:31,095][36761] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 78168064. Throughput: 0: 42476.1. Samples: 78234760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-07-02 11:36:31,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:36:34,683][36999] Updated weights for policy 0, policy_version 4780 (0.0032) [2024-07-02 11:36:36,095][36761] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 78348288. Throughput: 0: 42370.3. Samples: 78485000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:36:36,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:36:36,170][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000004783_78364672.pth... [2024-07-02 11:36:36,221][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000004163_68206592.pth [2024-07-02 11:36:38,314][36999] Updated weights for policy 0, policy_version 4790 (0.0046) [2024-07-02 11:36:41,095][36761] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 78577664. Throughput: 0: 42313.8. Samples: 78739360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 11:36:41,098][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:36:42,450][36999] Updated weights for policy 0, policy_version 4800 (0.0031) [2024-07-02 11:36:46,076][36999] Updated weights for policy 0, policy_version 4810 (0.0026) [2024-07-02 11:36:46,095][36761] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 78807040. Throughput: 0: 42497.5. Samples: 78874260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-07-02 11:36:46,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:36:50,060][36999] Updated weights for policy 0, policy_version 4820 (0.0034) [2024-07-02 11:36:51,095][36761] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 78987264. Throughput: 0: 42612.0. Samples: 79132300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 11:36:51,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:36:53,692][36999] Updated weights for policy 0, policy_version 4830 (0.0030) [2024-07-02 11:36:56,097][36761] Fps is (10 sec: 42591.8, 60 sec: 42597.3, 300 sec: 42487.1). Total num frames: 79233024. Throughput: 0: 42552.5. Samples: 79387020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 11:36:56,097][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:36:57,629][36999] Updated weights for policy 0, policy_version 4840 (0.0035) [2024-07-02 11:37:01,095][36761] Fps is (10 sec: 44236.6, 60 sec: 42326.4, 300 sec: 42431.8). Total num frames: 79429632. Throughput: 0: 42678.5. Samples: 79518820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-07-02 11:37:01,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:37:01,507][36999] Updated weights for policy 0, policy_version 4850 (0.0029) [2024-07-02 11:37:05,362][36999] Updated weights for policy 0, policy_version 4860 (0.0032) [2024-07-02 11:37:06,095][36761] Fps is (10 sec: 40966.8, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 79642624. Throughput: 0: 42663.6. Samples: 79771840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 11:37:06,095][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:37:09,008][36999] Updated weights for policy 0, policy_version 4870 (0.0023) [2024-07-02 11:37:11,096][36761] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 79872000. Throughput: 0: 42649.1. Samples: 80020900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-07-02 11:37:11,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:37:12,889][36999] Updated weights for policy 0, policy_version 4880 (0.0028) [2024-07-02 11:37:16,095][36761] Fps is (10 sec: 40959.2, 60 sec: 42052.1, 300 sec: 42376.2). Total num frames: 80052224. Throughput: 0: 42745.2. Samples: 80158300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-07-02 11:37:16,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:37:16,701][36999] Updated weights for policy 0, policy_version 4890 (0.0037) [2024-07-02 11:37:20,443][36999] Updated weights for policy 0, policy_version 4900 (0.0038) [2024-07-02 11:37:21,095][36761] Fps is (10 sec: 42599.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 80297984. Throughput: 0: 42751.2. Samples: 80408800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 11:37:21,095][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:37:24,666][36999] Updated weights for policy 0, policy_version 4910 (0.0036) [2024-07-02 11:37:26,095][36761] Fps is (10 sec: 47514.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 80527360. Throughput: 0: 42714.7. Samples: 80661520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 11:37:26,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:37:28,619][36999] Updated weights for policy 0, policy_version 4920 (0.0028) [2024-07-02 11:37:31,095][36761] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 80707584. Throughput: 0: 42660.1. Samples: 80793960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-07-02 11:37:31,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:37:32,289][36999] Updated weights for policy 0, policy_version 4930 (0.0042) [2024-07-02 11:37:36,092][36999] Updated weights for policy 0, policy_version 4940 (0.0043) [2024-07-02 11:37:36,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 80936960. Throughput: 0: 42480.6. Samples: 81043920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-07-02 11:37:36,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:37:39,406][36979] Signal inference workers to stop experience collection... (1150 times) [2024-07-02 11:37:39,411][36979] Signal inference workers to resume experience collection... (1150 times) [2024-07-02 11:37:39,456][36999] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-07-02 11:37:39,456][36999] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-07-02 11:37:39,886][36999] Updated weights for policy 0, policy_version 4950 (0.0035) [2024-07-02 11:37:41,095][36761] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 81149952. Throughput: 0: 42749.1. Samples: 81310660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 11:37:41,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:37:43,623][36999] Updated weights for policy 0, policy_version 4960 (0.0029) [2024-07-02 11:37:46,095][36761] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 81346560. Throughput: 0: 42510.3. Samples: 81431780. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-07-02 11:37:46,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:37:47,798][36999] Updated weights for policy 0, policy_version 4970 (0.0041) [2024-07-02 11:37:51,095][36761] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 81575936. Throughput: 0: 42641.2. Samples: 81690700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-07-02 11:37:51,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:37:51,122][36999] Updated weights for policy 0, policy_version 4980 (0.0027) [2024-07-02 11:37:55,487][36999] Updated weights for policy 0, policy_version 4990 (0.0036) [2024-07-02 11:37:56,095][36761] Fps is (10 sec: 42598.4, 60 sec: 42326.5, 300 sec: 42431.8). Total num frames: 81772544. Throughput: 0: 42963.7. Samples: 81954260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 11:37:56,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:37:58,645][36999] Updated weights for policy 0, policy_version 5000 (0.0041) [2024-07-02 11:38:01,095][36761] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 82001920. Throughput: 0: 42508.6. Samples: 82071180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-07-02 11:38:01,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:38:03,112][36999] Updated weights for policy 0, policy_version 5010 (0.0036) [2024-07-02 11:38:06,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 82231296. Throughput: 0: 42711.5. Samples: 82330820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-07-02 11:38:06,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:38:06,155][36999] Updated weights for policy 0, policy_version 5020 (0.0022) [2024-07-02 11:38:10,674][36999] Updated weights for policy 0, policy_version 5030 (0.0051) [2024-07-02 11:38:11,095][36761] Fps is (10 sec: 42598.5, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 82427904. Throughput: 0: 42833.8. Samples: 82589040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-07-02 11:38:11,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:38:14,106][36999] Updated weights for policy 0, policy_version 5040 (0.0029) [2024-07-02 11:38:16,096][36761] Fps is (10 sec: 40959.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 82640896. Throughput: 0: 42748.2. Samples: 82717640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-07-02 11:38:16,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:38:18,220][36999] Updated weights for policy 0, policy_version 5050 (0.0039) [2024-07-02 11:38:21,095][36761] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42710.1). Total num frames: 82870272. Throughput: 0: 42900.4. Samples: 82974440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-07-02 11:38:21,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:38:21,859][36999] Updated weights for policy 0, policy_version 5060 (0.0030) [2024-07-02 11:38:25,866][36999] Updated weights for policy 0, policy_version 5070 (0.0029) [2024-07-02 11:38:26,095][36761] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 83066880. Throughput: 0: 42859.4. Samples: 83239340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-07-02 11:38:26,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:38:29,367][36999] Updated weights for policy 0, policy_version 5080 (0.0040) [2024-07-02 11:38:31,095][36761] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 83279872. Throughput: 0: 42902.2. Samples: 83362380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 11:38:31,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:38:33,420][36999] Updated weights for policy 0, policy_version 5090 (0.0033) [2024-07-02 11:38:36,095][36761] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 83525632. Throughput: 0: 42837.9. Samples: 83618400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 11:38:36,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:38:36,103][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000005098_83525632.pth... [2024-07-02 11:38:36,151][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000004473_73285632.pth [2024-07-02 11:38:36,877][36999] Updated weights for policy 0, policy_version 5100 (0.0046) [2024-07-02 11:38:40,984][36999] Updated weights for policy 0, policy_version 5110 (0.0032) [2024-07-02 11:38:41,095][36761] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 83722240. Throughput: 0: 42779.5. Samples: 83879340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 11:38:41,099][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:38:45,142][36999] Updated weights for policy 0, policy_version 5120 (0.0035) [2024-07-02 11:38:46,095][36761] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42487.4). Total num frames: 83902464. Throughput: 0: 43022.2. Samples: 84007180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 11:38:46,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:38:48,435][36999] Updated weights for policy 0, policy_version 5130 (0.0028) [2024-07-02 11:38:51,095][36761] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 84148224. Throughput: 0: 42895.9. Samples: 84261140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 11:38:51,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:38:52,668][36999] Updated weights for policy 0, policy_version 5140 (0.0045) [2024-07-02 11:38:54,238][36979] Signal inference workers to stop experience collection... (1200 times) [2024-07-02 11:38:54,292][36979] Signal inference workers to resume experience collection... (1200 times) [2024-07-02 11:38:54,296][36999] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-07-02 11:38:54,324][36999] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-07-02 11:38:56,014][36999] Updated weights for policy 0, policy_version 5150 (0.0023) [2024-07-02 11:38:56,095][36761] Fps is (10 sec: 47513.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 84377600. Throughput: 0: 42917.2. Samples: 84520320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 11:38:56,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:39:00,560][36999] Updated weights for policy 0, policy_version 5160 (0.0047) [2024-07-02 11:39:01,095][36761] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 84541440. Throughput: 0: 42896.5. Samples: 84647980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 11:39:01,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:39:03,511][36999] Updated weights for policy 0, policy_version 5170 (0.0027) [2024-07-02 11:39:06,095][36761] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 84803584. Throughput: 0: 42710.6. Samples: 84896420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-07-02 11:39:06,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 11:39:06,111][36979] Saving new best policy, reward=0.019! [2024-07-02 11:39:08,273][36999] Updated weights for policy 0, policy_version 5180 (0.0030) [2024-07-02 11:39:11,095][36761] Fps is (10 sec: 47514.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 85016576. Throughput: 0: 42678.8. Samples: 85159880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 11:39:11,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:39:11,111][36999] Updated weights for policy 0, policy_version 5190 (0.0026) [2024-07-02 11:39:16,050][36999] Updated weights for policy 0, policy_version 5200 (0.0034) [2024-07-02 11:39:16,095][36761] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 85196800. Throughput: 0: 42741.3. Samples: 85285740. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-07-02 11:39:16,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:39:18,944][36999] Updated weights for policy 0, policy_version 5210 (0.0033) [2024-07-02 11:39:21,095][36761] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 85442560. Throughput: 0: 42698.6. Samples: 85539840. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-07-02 11:39:21,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:39:23,625][36999] Updated weights for policy 0, policy_version 5220 (0.0045) [2024-07-02 11:39:26,095][36761] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 85655552. Throughput: 0: 42787.9. Samples: 85804800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-07-02 11:39:26,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:39:26,698][36999] Updated weights for policy 0, policy_version 5230 (0.0042) [2024-07-02 11:39:31,095][36761] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 85819392. Throughput: 0: 42700.0. Samples: 85928680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-07-02 11:39:31,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:39:31,555][36999] Updated weights for policy 0, policy_version 5240 (0.0032) [2024-07-02 11:39:34,262][36999] Updated weights for policy 0, policy_version 5250 (0.0025) [2024-07-02 11:39:36,095][36761] Fps is (10 sec: 44237.9, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 86097920. Throughput: 0: 42687.7. Samples: 86182080. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-07-02 11:39:36,095][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:39:39,103][36999] Updated weights for policy 0, policy_version 5260 (0.0031) [2024-07-02 11:39:41,095][36761] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 86278144. Throughput: 0: 42734.6. Samples: 86443380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-07-02 11:39:41,104][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:39:42,023][36999] Updated weights for policy 0, policy_version 5270 (0.0032) [2024-07-02 11:39:46,095][36761] Fps is (10 sec: 37683.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 86474752. Throughput: 0: 42594.4. Samples: 86564720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:39:46,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:39:46,670][36999] Updated weights for policy 0, policy_version 5280 (0.0034) [2024-07-02 11:39:49,894][36999] Updated weights for policy 0, policy_version 5290 (0.0046) [2024-07-02 11:39:51,095][36761] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 86720512. Throughput: 0: 42815.6. Samples: 86823120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-07-02 11:39:51,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:39:54,276][36999] Updated weights for policy 0, policy_version 5300 (0.0034) [2024-07-02 11:39:56,095][36761] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 86900736. Throughput: 0: 42673.8. Samples: 87080200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-07-02 11:39:56,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:39:57,538][36999] Updated weights for policy 0, policy_version 5310 (0.0035) [2024-07-02 11:40:01,095][36761] Fps is (10 sec: 39321.4, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 87113728. Throughput: 0: 42508.9. Samples: 87198640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-07-02 11:40:01,100][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:40:02,215][36999] Updated weights for policy 0, policy_version 5320 (0.0031) [2024-07-02 11:40:05,188][36999] Updated weights for policy 0, policy_version 5330 (0.0029) [2024-07-02 11:40:06,095][36761] Fps is (10 sec: 47513.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 87375872. Throughput: 0: 42783.6. Samples: 87465100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 11:40:06,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:40:09,725][36999] Updated weights for policy 0, policy_version 5340 (0.0037) [2024-07-02 11:40:11,095][36761] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 87556096. Throughput: 0: 42557.3. Samples: 87719880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-07-02 11:40:11,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:40:12,701][36999] Updated weights for policy 0, policy_version 5350 (0.0031) [2024-07-02 11:40:16,096][36761] Fps is (10 sec: 37682.7, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 87752704. Throughput: 0: 42590.5. Samples: 87845260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 11:40:16,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:40:16,379][36979] Signal inference workers to stop experience collection... (1250 times) [2024-07-02 11:40:16,379][36979] Signal inference workers to resume experience collection... (1250 times) [2024-07-02 11:40:16,394][36999] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-07-02 11:40:16,394][36999] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-07-02 11:40:17,209][36999] Updated weights for policy 0, policy_version 5360 (0.0030) [2024-07-02 11:40:20,325][36999] Updated weights for policy 0, policy_version 5370 (0.0040) [2024-07-02 11:40:21,095][36761] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 87998464. Throughput: 0: 42786.1. Samples: 88107460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-07-02 11:40:21,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:40:25,254][36999] Updated weights for policy 0, policy_version 5380 (0.0033) [2024-07-02 11:40:26,095][36761] Fps is (10 sec: 42599.4, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 88178688. Throughput: 0: 42633.5. Samples: 88361880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 11:40:26,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:40:28,114][36999] Updated weights for policy 0, policy_version 5390 (0.0045) [2024-07-02 11:40:31,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 88408064. Throughput: 0: 42622.1. Samples: 88482720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-07-02 11:40:31,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:40:32,701][36999] Updated weights for policy 0, policy_version 5400 (0.0035) [2024-07-02 11:40:35,680][36999] Updated weights for policy 0, policy_version 5410 (0.0031) [2024-07-02 11:40:36,096][36761] Fps is (10 sec: 45869.9, 60 sec: 42324.5, 300 sec: 42709.3). Total num frames: 88637440. Throughput: 0: 42723.4. Samples: 88745720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-07-02 11:40:36,097][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:40:36,116][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000005410_88637440.pth... [2024-07-02 11:40:36,199][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000004783_78364672.pth [2024-07-02 11:40:40,223][36999] Updated weights for policy 0, policy_version 5420 (0.0025) [2024-07-02 11:40:41,095][36761] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 88817664. Throughput: 0: 42715.1. Samples: 89002380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 11:40:41,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:40:43,579][36999] Updated weights for policy 0, policy_version 5430 (0.0041) [2024-07-02 11:40:46,095][36761] Fps is (10 sec: 40964.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 89047040. Throughput: 0: 42852.0. Samples: 89126980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 11:40:46,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:40:47,751][36999] Updated weights for policy 0, policy_version 5440 (0.0032) [2024-07-02 11:40:51,038][36999] Updated weights for policy 0, policy_version 5450 (0.0026) [2024-07-02 11:40:51,100][36761] Fps is (10 sec: 47491.9, 60 sec: 42868.2, 300 sec: 42764.4). Total num frames: 89292800. Throughput: 0: 42708.6. Samples: 89387180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 11:40:51,100][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:40:55,502][36999] Updated weights for policy 0, policy_version 5460 (0.0036) [2024-07-02 11:40:56,095][36761] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 89473024. Throughput: 0: 42876.2. Samples: 89649300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-07-02 11:40:56,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:40:58,675][36999] Updated weights for policy 0, policy_version 5470 (0.0038) [2024-07-02 11:41:01,095][36761] Fps is (10 sec: 42617.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 89718784. Throughput: 0: 42804.6. Samples: 89771460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-07-02 11:41:01,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:41:02,993][36999] Updated weights for policy 0, policy_version 5480 (0.0041) [2024-07-02 11:41:06,095][36761] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 89915392. Throughput: 0: 42754.7. Samples: 90031420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 11:41:06,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:41:06,515][36999] Updated weights for policy 0, policy_version 5490 (0.0043) [2024-07-02 11:41:10,687][36999] Updated weights for policy 0, policy_version 5500 (0.0040) [2024-07-02 11:41:11,095][36761] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 90128384. Throughput: 0: 42784.0. Samples: 90287160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 11:41:11,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:41:14,130][36999] Updated weights for policy 0, policy_version 5510 (0.0035) [2024-07-02 11:41:16,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 90357760. Throughput: 0: 43003.1. Samples: 90417860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 11:41:16,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:41:18,370][36999] Updated weights for policy 0, policy_version 5520 (0.0027) [2024-07-02 11:41:21,095][36761] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 90570752. Throughput: 0: 42984.6. Samples: 90679980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 11:41:21,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:41:21,678][36999] Updated weights for policy 0, policy_version 5530 (0.0035) [2024-07-02 11:41:25,850][36999] Updated weights for policy 0, policy_version 5540 (0.0029) [2024-07-02 11:41:26,096][36761] Fps is (10 sec: 42596.0, 60 sec: 43417.1, 300 sec: 42764.9). Total num frames: 90783744. Throughput: 0: 42894.5. Samples: 90932660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 11:41:26,097][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:41:29,245][36999] Updated weights for policy 0, policy_version 5550 (0.0029) [2024-07-02 11:41:31,097][36761] Fps is (10 sec: 40953.4, 60 sec: 42870.3, 300 sec: 42820.3). Total num frames: 90980352. Throughput: 0: 42972.7. Samples: 91060820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 11:41:31,097][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:41:33,416][36999] Updated weights for policy 0, policy_version 5560 (0.0024) [2024-07-02 11:41:36,095][36761] Fps is (10 sec: 40961.9, 60 sec: 42599.1, 300 sec: 42765.0). Total num frames: 91193344. Throughput: 0: 43037.5. Samples: 91323680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 11:41:36,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:41:36,759][36999] Updated weights for policy 0, policy_version 5570 (0.0034) [2024-07-02 11:41:40,953][36999] Updated weights for policy 0, policy_version 5580 (0.0037) [2024-07-02 11:41:41,095][36761] Fps is (10 sec: 44243.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 91422720. Throughput: 0: 42896.3. Samples: 91579640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 11:41:41,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:41:43,894][36979] Signal inference workers to stop experience collection... (1300 times) [2024-07-02 11:41:43,894][36979] Signal inference workers to resume experience collection... (1300 times) [2024-07-02 11:41:43,912][36999] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-07-02 11:41:43,913][36999] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-07-02 11:41:44,521][36999] Updated weights for policy 0, policy_version 5590 (0.0040) [2024-07-02 11:41:46,095][36761] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 91635712. Throughput: 0: 43024.9. Samples: 91707580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-07-02 11:41:46,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:41:48,632][36999] Updated weights for policy 0, policy_version 5600 (0.0033) [2024-07-02 11:41:51,095][36761] Fps is (10 sec: 40960.5, 60 sec: 42328.6, 300 sec: 42709.7). Total num frames: 91832320. Throughput: 0: 42868.1. Samples: 91960480. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-07-02 11:41:51,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:41:52,294][36999] Updated weights for policy 0, policy_version 5610 (0.0045) [2024-07-02 11:41:56,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 92061696. Throughput: 0: 42928.0. Samples: 92218920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-07-02 11:41:56,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:41:56,126][36999] Updated weights for policy 0, policy_version 5620 (0.0042) [2024-07-02 11:42:00,174][36999] Updated weights for policy 0, policy_version 5630 (0.0032) [2024-07-02 11:42:01,095][36761] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 92274688. Throughput: 0: 42845.5. Samples: 92345900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:42:01,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:42:03,782][36999] Updated weights for policy 0, policy_version 5640 (0.0036) [2024-07-02 11:42:06,099][36761] Fps is (10 sec: 40945.2, 60 sec: 42595.9, 300 sec: 42709.0). Total num frames: 92471296. Throughput: 0: 42631.3. Samples: 92598540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 11:42:06,099][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:42:07,834][36999] Updated weights for policy 0, policy_version 5650 (0.0030) [2024-07-02 11:42:11,095][36761] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 92700672. Throughput: 0: 42679.7. Samples: 92853220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 11:42:11,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:42:11,463][36999] Updated weights for policy 0, policy_version 5660 (0.0037) [2024-07-02 11:42:15,534][36999] Updated weights for policy 0, policy_version 5670 (0.0041) [2024-07-02 11:42:16,095][36761] Fps is (10 sec: 44253.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 92913664. Throughput: 0: 42609.1. Samples: 92978160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 11:42:16,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:42:19,387][36999] Updated weights for policy 0, policy_version 5680 (0.0033) [2024-07-02 11:42:21,095][36761] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 93110272. Throughput: 0: 42401.0. Samples: 93231720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 11:42:21,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:42:23,109][36999] Updated weights for policy 0, policy_version 5690 (0.0034) [2024-07-02 11:42:26,096][36761] Fps is (10 sec: 40957.2, 60 sec: 42325.3, 300 sec: 42764.9). Total num frames: 93323264. Throughput: 0: 42414.6. Samples: 93488320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-07-02 11:42:26,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:42:27,294][36999] Updated weights for policy 0, policy_version 5700 (0.0042) [2024-07-02 11:42:30,862][36999] Updated weights for policy 0, policy_version 5710 (0.0022) [2024-07-02 11:42:31,095][36761] Fps is (10 sec: 44237.0, 60 sec: 42872.6, 300 sec: 42765.0). Total num frames: 93552640. Throughput: 0: 42393.7. Samples: 93615300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-07-02 11:42:31,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:42:35,104][36999] Updated weights for policy 0, policy_version 5720 (0.0033) [2024-07-02 11:42:36,095][36761] Fps is (10 sec: 44239.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 93765632. Throughput: 0: 42611.9. Samples: 93878020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 19.0) [2024-07-02 11:42:36,097][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:42:36,111][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000005723_93765632.pth... [2024-07-02 11:42:36,158][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000005098_83525632.pth [2024-07-02 11:42:38,563][36999] Updated weights for policy 0, policy_version 5730 (0.0031) [2024-07-02 11:42:41,095][36761] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 93978624. Throughput: 0: 42551.1. Samples: 94133720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 11:42:41,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:42:42,644][36999] Updated weights for policy 0, policy_version 5740 (0.0035) [2024-07-02 11:42:46,095][36761] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 94208000. Throughput: 0: 42607.1. Samples: 94263220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 11:42:46,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:42:46,109][36999] Updated weights for policy 0, policy_version 5750 (0.0033) [2024-07-02 11:42:50,147][36999] Updated weights for policy 0, policy_version 5760 (0.0039) [2024-07-02 11:42:51,095][36761] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 94404608. Throughput: 0: 42767.9. Samples: 94522940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 11:42:51,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:42:53,824][36999] Updated weights for policy 0, policy_version 5770 (0.0030) [2024-07-02 11:42:56,095][36761] Fps is (10 sec: 40958.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 94617600. Throughput: 0: 42863.4. Samples: 94782080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 11:42:56,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:42:57,840][36999] Updated weights for policy 0, policy_version 5780 (0.0034) [2024-07-02 11:43:01,095][36761] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 94830592. Throughput: 0: 42993.3. Samples: 94912860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 11:43:01,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:43:01,427][36999] Updated weights for policy 0, policy_version 5790 (0.0027) [2024-07-02 11:43:05,264][36999] Updated weights for policy 0, policy_version 5800 (0.0053) [2024-07-02 11:43:06,095][36761] Fps is (10 sec: 42598.5, 60 sec: 42873.9, 300 sec: 42765.0). Total num frames: 95043584. Throughput: 0: 43275.1. Samples: 95179100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-07-02 11:43:06,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:43:07,030][36979] Signal inference workers to stop experience collection... (1350 times) [2024-07-02 11:43:07,030][36979] Signal inference workers to resume experience collection... (1350 times) [2024-07-02 11:43:07,069][36999] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-07-02 11:43:07,070][36999] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-07-02 11:43:09,065][36999] Updated weights for policy 0, policy_version 5810 (0.0044) [2024-07-02 11:43:11,096][36761] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 95272960. Throughput: 0: 43209.3. Samples: 95432720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-07-02 11:43:11,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:43:12,732][36999] Updated weights for policy 0, policy_version 5820 (0.0043) [2024-07-02 11:43:16,095][36761] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 95485952. Throughput: 0: 43340.4. Samples: 95565620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-07-02 11:43:16,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:43:16,796][36999] Updated weights for policy 0, policy_version 5830 (0.0036) [2024-07-02 11:43:20,520][36999] Updated weights for policy 0, policy_version 5840 (0.0036) [2024-07-02 11:43:21,095][36761] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 95698944. Throughput: 0: 43157.9. Samples: 95820120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:43:21,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:43:24,234][36999] Updated weights for policy 0, policy_version 5850 (0.0033) [2024-07-02 11:43:26,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43145.0, 300 sec: 42820.6). Total num frames: 95911936. Throughput: 0: 43280.9. Samples: 96081360. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-07-02 11:43:26,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:43:28,195][36999] Updated weights for policy 0, policy_version 5860 (0.0036) [2024-07-02 11:43:31,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 96141312. Throughput: 0: 43276.8. Samples: 96210680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 11:43:31,096][36761] Avg episode reward: [(0, '0.017')] [2024-07-02 11:43:31,806][36999] Updated weights for policy 0, policy_version 5870 (0.0041) [2024-07-02 11:43:35,566][36999] Updated weights for policy 0, policy_version 5880 (0.0036) [2024-07-02 11:43:36,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 96354304. Throughput: 0: 43211.6. Samples: 96467460. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-07-02 11:43:36,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:43:39,483][36999] Updated weights for policy 0, policy_version 5890 (0.0028) [2024-07-02 11:43:41,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 96567296. Throughput: 0: 43273.1. Samples: 96729360. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-07-02 11:43:41,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:43:43,010][36999] Updated weights for policy 0, policy_version 5900 (0.0027) [2024-07-02 11:43:46,095][36761] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 96763904. Throughput: 0: 43223.0. Samples: 96857900. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-07-02 11:43:46,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:43:47,024][36999] Updated weights for policy 0, policy_version 5910 (0.0035) [2024-07-02 11:43:50,547][36999] Updated weights for policy 0, policy_version 5920 (0.0045) [2024-07-02 11:43:51,095][36761] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 97009664. Throughput: 0: 42875.1. Samples: 97108480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 11:43:51,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:43:54,693][36999] Updated weights for policy 0, policy_version 5930 (0.0026) [2024-07-02 11:43:56,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 97206272. Throughput: 0: 43174.4. Samples: 97375560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 25.0) [2024-07-02 11:43:56,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:43:58,080][36999] Updated weights for policy 0, policy_version 5940 (0.0037) [2024-07-02 11:44:01,095][36761] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 97419264. Throughput: 0: 42916.1. Samples: 97496840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 25.0) [2024-07-02 11:44:01,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:44:02,511][36999] Updated weights for policy 0, policy_version 5950 (0.0041) [2024-07-02 11:44:05,643][36999] Updated weights for policy 0, policy_version 5960 (0.0028) [2024-07-02 11:44:06,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43690.8, 300 sec: 42876.1). Total num frames: 97665024. Throughput: 0: 42976.9. Samples: 97754080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-07-02 11:44:06,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:44:09,999][36999] Updated weights for policy 0, policy_version 5970 (0.0037) [2024-07-02 11:44:11,095][36761] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 97845248. Throughput: 0: 43065.8. Samples: 98019320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-07-02 11:44:11,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:44:13,307][36999] Updated weights for policy 0, policy_version 5980 (0.0047) [2024-07-02 11:44:16,095][36761] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 98058240. Throughput: 0: 42858.6. Samples: 98139320. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-07-02 11:44:16,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:44:17,587][36999] Updated weights for policy 0, policy_version 5990 (0.0025) [2024-07-02 11:44:20,897][36999] Updated weights for policy 0, policy_version 6000 (0.0033) [2024-07-02 11:44:21,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 98304000. Throughput: 0: 42866.2. Samples: 98396440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 11:44:21,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:44:25,180][36999] Updated weights for policy 0, policy_version 6010 (0.0033) [2024-07-02 11:44:26,100][36761] Fps is (10 sec: 42579.4, 60 sec: 42868.2, 300 sec: 42931.0). Total num frames: 98484224. Throughput: 0: 42873.4. Samples: 98658860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-07-02 11:44:26,100][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:44:28,410][36999] Updated weights for policy 0, policy_version 6020 (0.0029) [2024-07-02 11:44:31,096][36761] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 98697216. Throughput: 0: 42742.1. Samples: 98781300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 11:44:31,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:44:33,111][36999] Updated weights for policy 0, policy_version 6030 (0.0032) [2024-07-02 11:44:35,890][36999] Updated weights for policy 0, policy_version 6040 (0.0041) [2024-07-02 11:44:36,100][36761] Fps is (10 sec: 47513.2, 60 sec: 43414.3, 300 sec: 42986.5). Total num frames: 98959360. Throughput: 0: 42920.6. Samples: 99040100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 11:44:36,101][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:44:36,108][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000006040_98959360.pth... [2024-07-02 11:44:36,155][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000005410_88637440.pth [2024-07-02 11:44:40,686][36999] Updated weights for policy 0, policy_version 6050 (0.0028) [2024-07-02 11:44:41,095][36761] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 99139584. Throughput: 0: 42824.3. Samples: 99302660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-07-02 11:44:41,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:44:42,043][36979] Signal inference workers to stop experience collection... (1400 times) [2024-07-02 11:44:42,087][36999] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-07-02 11:44:42,094][36979] Signal inference workers to resume experience collection... (1400 times) [2024-07-02 11:44:42,112][36999] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-07-02 11:44:43,541][36999] Updated weights for policy 0, policy_version 6060 (0.0038) [2024-07-02 11:44:46,095][36761] Fps is (10 sec: 39339.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 99352576. Throughput: 0: 42797.7. Samples: 99422740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 11:44:46,096][36761] Avg episode reward: [(0, '0.020')] [2024-07-02 11:44:46,114][36979] Saving new best policy, reward=0.020! [2024-07-02 11:44:48,223][36999] Updated weights for policy 0, policy_version 6070 (0.0040) [2024-07-02 11:44:51,095][36761] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 99581952. Throughput: 0: 42987.6. Samples: 99688520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 11:44:51,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:44:51,458][36999] Updated weights for policy 0, policy_version 6080 (0.0034) [2024-07-02 11:44:56,095][36761] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 99745792. Throughput: 0: 42971.6. Samples: 99953040. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-07-02 11:44:56,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:44:56,390][36999] Updated weights for policy 0, policy_version 6090 (0.0025) [2024-07-02 11:44:59,086][36999] Updated weights for policy 0, policy_version 6100 (0.0037) [2024-07-02 11:45:01,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 100007936. Throughput: 0: 42880.0. Samples: 100068920. Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-07-02 11:45:01,100][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:45:03,936][36999] Updated weights for policy 0, policy_version 6110 (0.0037) [2024-07-02 11:45:06,095][36761] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 100220928. Throughput: 0: 42937.4. Samples: 100328620. Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-07-02 11:45:06,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:45:06,738][36999] Updated weights for policy 0, policy_version 6120 (0.0038) [2024-07-02 11:45:11,095][36761] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 100401152. Throughput: 0: 42918.9. Samples: 100590020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 11:45:11,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:45:11,522][36999] Updated weights for policy 0, policy_version 6130 (0.0045) [2024-07-02 11:45:14,696][36999] Updated weights for policy 0, policy_version 6140 (0.0029) [2024-07-02 11:45:16,095][36761] Fps is (10 sec: 44236.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 100663296. Throughput: 0: 42860.0. Samples: 100710000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 11:45:16,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:45:18,997][36999] Updated weights for policy 0, policy_version 6150 (0.0038) [2024-07-02 11:45:21,095][36761] Fps is (10 sec: 45876.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 100859904. Throughput: 0: 43009.8. Samples: 100975340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 11:45:21,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:45:22,393][36999] Updated weights for policy 0, policy_version 6160 (0.0043) [2024-07-02 11:45:26,095][36761] Fps is (10 sec: 39321.9, 60 sec: 42874.7, 300 sec: 42876.1). Total num frames: 101056512. Throughput: 0: 42986.7. Samples: 101237060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 11:45:26,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:45:26,511][36999] Updated weights for policy 0, policy_version 6170 (0.0037) [2024-07-02 11:45:29,976][36999] Updated weights for policy 0, policy_version 6180 (0.0039) [2024-07-02 11:45:31,095][36761] Fps is (10 sec: 45875.0, 60 sec: 43690.8, 300 sec: 42987.3). Total num frames: 101318656. Throughput: 0: 43057.0. Samples: 101360300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-07-02 11:45:31,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:45:33,977][36999] Updated weights for policy 0, policy_version 6190 (0.0028) [2024-07-02 11:45:36,095][36761] Fps is (10 sec: 47513.8, 60 sec: 42874.8, 300 sec: 43098.2). Total num frames: 101531648. Throughput: 0: 43226.6. Samples: 101633720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-07-02 11:45:36,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:45:37,362][36999] Updated weights for policy 0, policy_version 6200 (0.0027) [2024-07-02 11:45:41,095][36761] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 101711872. Throughput: 0: 43102.6. Samples: 101892660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 11:45:41,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:45:41,515][36999] Updated weights for policy 0, policy_version 6210 (0.0034) [2024-07-02 11:45:44,839][36999] Updated weights for policy 0, policy_version 6220 (0.0029) [2024-07-02 11:45:46,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42932.3). Total num frames: 101957632. Throughput: 0: 43360.9. Samples: 102020160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 11:45:46,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:45:49,312][36999] Updated weights for policy 0, policy_version 6230 (0.0030) [2024-07-02 11:45:51,095][36761] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 102170624. Throughput: 0: 43367.9. Samples: 102280180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 11:45:51,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:45:52,611][36999] Updated weights for policy 0, policy_version 6240 (0.0031) [2024-07-02 11:45:56,095][36761] Fps is (10 sec: 40959.6, 60 sec: 43690.5, 300 sec: 42876.1). Total num frames: 102367232. Throughput: 0: 43324.9. Samples: 102539640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-07-02 11:45:56,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:45:56,698][36999] Updated weights for policy 0, policy_version 6250 (0.0032) [2024-07-02 11:46:00,186][36999] Updated weights for policy 0, policy_version 6260 (0.0032) [2024-07-02 11:46:01,097][36761] Fps is (10 sec: 44229.0, 60 sec: 43416.3, 300 sec: 43042.5). Total num frames: 102612992. Throughput: 0: 43476.6. Samples: 102666520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 11:46:01,097][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 11:46:01,098][36979] Saving new best policy, reward=0.021! [2024-07-02 11:46:04,137][36999] Updated weights for policy 0, policy_version 6270 (0.0038) [2024-07-02 11:46:06,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 102809600. Throughput: 0: 43301.2. Samples: 102923900. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-07-02 11:46:06,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:46:07,670][36999] Updated weights for policy 0, policy_version 6280 (0.0033) [2024-07-02 11:46:11,095][36761] Fps is (10 sec: 40967.3, 60 sec: 43690.8, 300 sec: 42931.6). Total num frames: 103022592. Throughput: 0: 43254.7. Samples: 103183520. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-07-02 11:46:11,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:46:11,526][36999] Updated weights for policy 0, policy_version 6290 (0.0022) [2024-07-02 11:46:15,456][36999] Updated weights for policy 0, policy_version 6300 (0.0039) [2024-07-02 11:46:16,100][36761] Fps is (10 sec: 42579.1, 60 sec: 42868.2, 300 sec: 42931.0). Total num frames: 103235584. Throughput: 0: 43467.9. Samples: 103316560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 11:46:16,101][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:46:17,995][36979] Signal inference workers to stop experience collection... (1450 times) [2024-07-02 11:46:18,044][36999] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-07-02 11:46:18,106][36979] Signal inference workers to resume experience collection... (1450 times) [2024-07-02 11:46:18,106][36999] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-07-02 11:46:19,148][36999] Updated weights for policy 0, policy_version 6310 (0.0033) [2024-07-02 11:46:21,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 103448576. Throughput: 0: 43113.3. Samples: 103573820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 11:46:21,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:46:22,897][36999] Updated weights for policy 0, policy_version 6320 (0.0034) [2024-07-02 11:46:26,095][36761] Fps is (10 sec: 44257.5, 60 sec: 43690.7, 300 sec: 43043.0). Total num frames: 103677952. Throughput: 0: 42924.0. Samples: 103824240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 11:46:26,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:46:26,582][36999] Updated weights for policy 0, policy_version 6330 (0.0033) [2024-07-02 11:46:30,422][36999] Updated weights for policy 0, policy_version 6340 (0.0034) [2024-07-02 11:46:31,095][36761] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 103890944. Throughput: 0: 43151.9. Samples: 103962000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 11:46:31,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:46:34,062][36999] Updated weights for policy 0, policy_version 6350 (0.0035) [2024-07-02 11:46:36,095][36761] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 104087552. Throughput: 0: 43100.4. Samples: 104219700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 11:46:36,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:46:36,344][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000006355_104120320.pth... [2024-07-02 11:46:36,405][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000005723_93765632.pth [2024-07-02 11:46:38,005][36999] Updated weights for policy 0, policy_version 6360 (0.0044) [2024-07-02 11:46:41,095][36761] Fps is (10 sec: 44237.3, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 104333312. Throughput: 0: 42979.2. Samples: 104473700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 11:46:41,096][36761] Avg episode reward: [(0, '0.007')] [2024-07-02 11:46:41,802][36999] Updated weights for policy 0, policy_version 6370 (0.0029) [2024-07-02 11:46:45,722][36999] Updated weights for policy 0, policy_version 6380 (0.0046) [2024-07-02 11:46:46,095][36761] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 104546304. Throughput: 0: 43062.9. Samples: 104604280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 11:46:46,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:46:49,476][36999] Updated weights for policy 0, policy_version 6390 (0.0054) [2024-07-02 11:46:51,095][36761] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 104726528. Throughput: 0: 42955.8. Samples: 104856900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 11:46:51,095][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:46:53,569][36999] Updated weights for policy 0, policy_version 6400 (0.0042) [2024-07-02 11:46:56,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 104972288. Throughput: 0: 42976.4. Samples: 105117460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 11:46:56,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:46:57,299][36999] Updated weights for policy 0, policy_version 6410 (0.0023) [2024-07-02 11:47:01,095][36761] Fps is (10 sec: 44236.8, 60 sec: 42599.7, 300 sec: 43043.3). Total num frames: 105168896. Throughput: 0: 42830.3. Samples: 105243720. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-07-02 11:47:01,095][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:47:01,231][36999] Updated weights for policy 0, policy_version 6420 (0.0024) [2024-07-02 11:47:04,859][36999] Updated weights for policy 0, policy_version 6430 (0.0027) [2024-07-02 11:47:06,095][36761] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 105381888. Throughput: 0: 42861.0. Samples: 105502560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 11:47:06,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:47:09,023][36999] Updated weights for policy 0, policy_version 6440 (0.0037) [2024-07-02 11:47:11,095][36761] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 105594880. Throughput: 0: 43064.3. Samples: 105762140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 11:47:11,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:47:12,627][36999] Updated weights for policy 0, policy_version 6450 (0.0032) [2024-07-02 11:47:16,100][36761] Fps is (10 sec: 44236.4, 60 sec: 43147.8, 300 sec: 43098.3). Total num frames: 105824256. Throughput: 0: 42837.8. Samples: 105889700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 11:47:16,100][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:47:16,560][36999] Updated weights for policy 0, policy_version 6460 (0.0026) [2024-07-02 11:47:20,451][36999] Updated weights for policy 0, policy_version 6470 (0.0036) [2024-07-02 11:47:21,095][36761] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 106037248. Throughput: 0: 42852.0. Samples: 106148040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 11:47:21,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:47:24,204][36999] Updated weights for policy 0, policy_version 6480 (0.0039) [2024-07-02 11:47:26,095][36761] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 106233856. Throughput: 0: 43068.8. Samples: 106411800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:47:26,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:47:27,907][36999] Updated weights for policy 0, policy_version 6490 (0.0032) [2024-07-02 11:47:31,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 106479616. Throughput: 0: 42990.3. Samples: 106538840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 11:47:31,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:47:31,621][36999] Updated weights for policy 0, policy_version 6500 (0.0037) [2024-07-02 11:47:35,470][36999] Updated weights for policy 0, policy_version 6510 (0.0032) [2024-07-02 11:47:36,095][36761] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 106676224. Throughput: 0: 43187.1. Samples: 106800320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 11:47:36,095][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:47:39,131][36999] Updated weights for policy 0, policy_version 6520 (0.0040) [2024-07-02 11:47:41,095][36761] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 106872832. Throughput: 0: 43102.4. Samples: 107057060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:47:41,095][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:47:43,121][36999] Updated weights for policy 0, policy_version 6530 (0.0038) [2024-07-02 11:47:46,095][36761] Fps is (10 sec: 44235.7, 60 sec: 42871.4, 300 sec: 43098.2). Total num frames: 107118592. Throughput: 0: 43115.8. Samples: 107183940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 11:47:46,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:47:46,655][36999] Updated weights for policy 0, policy_version 6540 (0.0036) [2024-07-02 11:47:50,796][36999] Updated weights for policy 0, policy_version 6550 (0.0027) [2024-07-02 11:47:51,095][36761] Fps is (10 sec: 45874.5, 60 sec: 43417.5, 300 sec: 43098.3). Total num frames: 107331584. Throughput: 0: 43267.4. Samples: 107449600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 11:47:51,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:47:54,631][36999] Updated weights for policy 0, policy_version 6560 (0.0039) [2024-07-02 11:47:56,095][36761] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 107528192. Throughput: 0: 43069.0. Samples: 107700240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 11:47:56,100][36761] Avg episode reward: [(0, '0.017')] [2024-07-02 11:47:58,589][36999] Updated weights for policy 0, policy_version 6570 (0.0029) [2024-07-02 11:48:01,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 107757568. Throughput: 0: 43055.2. Samples: 107827180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-07-02 11:48:01,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:48:02,163][36999] Updated weights for policy 0, policy_version 6580 (0.0023) [2024-07-02 11:48:06,074][36999] Updated weights for policy 0, policy_version 6590 (0.0040) [2024-07-02 11:48:06,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 107970560. Throughput: 0: 43160.9. Samples: 108090280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-07-02 11:48:06,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:48:07,342][36979] Signal inference workers to stop experience collection... (1500 times) [2024-07-02 11:48:07,401][36999] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-07-02 11:48:07,454][36979] Signal inference workers to resume experience collection... (1500 times) [2024-07-02 11:48:07,454][36999] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-07-02 11:48:09,711][36999] Updated weights for policy 0, policy_version 6600 (0.0046) [2024-07-02 11:48:11,100][36761] Fps is (10 sec: 42578.8, 60 sec: 43141.3, 300 sec: 43042.1). Total num frames: 108183552. Throughput: 0: 42913.1. Samples: 108343080. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-07-02 11:48:11,100][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:48:13,847][36999] Updated weights for policy 0, policy_version 6610 (0.0036) [2024-07-02 11:48:16,095][36761] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 108396544. Throughput: 0: 43016.4. Samples: 108474580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-07-02 11:48:16,099][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:48:17,204][36999] Updated weights for policy 0, policy_version 6620 (0.0032) [2024-07-02 11:48:21,100][36761] Fps is (10 sec: 40959.9, 60 sec: 42595.2, 300 sec: 42986.5). Total num frames: 108593152. Throughput: 0: 42900.4. Samples: 108731040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 11:48:21,100][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:48:21,469][36999] Updated weights for policy 0, policy_version 6630 (0.0032) [2024-07-02 11:48:24,630][36999] Updated weights for policy 0, policy_version 6640 (0.0025) [2024-07-02 11:48:26,100][36761] Fps is (10 sec: 42579.1, 60 sec: 43141.3, 300 sec: 42986.5). Total num frames: 108822528. Throughput: 0: 42950.6. Samples: 108990040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 11:48:26,100][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:48:28,951][36999] Updated weights for policy 0, policy_version 6650 (0.0042) [2024-07-02 11:48:31,095][36761] Fps is (10 sec: 44257.5, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 109035520. Throughput: 0: 43078.4. Samples: 109122460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 11:48:31,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 11:48:32,014][36999] Updated weights for policy 0, policy_version 6660 (0.0026) [2024-07-02 11:48:36,095][36761] Fps is (10 sec: 42618.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 109248512. Throughput: 0: 42973.9. Samples: 109383420. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-07-02 11:48:36,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 11:48:36,221][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000006669_109264896.pth... [2024-07-02 11:48:36,279][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000006040_98959360.pth [2024-07-02 11:48:36,449][36999] Updated weights for policy 0, policy_version 6670 (0.0032) [2024-07-02 11:48:39,816][36999] Updated weights for policy 0, policy_version 6680 (0.0022) [2024-07-02 11:48:41,095][36761] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 43098.2). Total num frames: 109477888. Throughput: 0: 42940.0. Samples: 109632540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-07-02 11:48:41,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:48:44,269][36999] Updated weights for policy 0, policy_version 6690 (0.0037) [2024-07-02 11:48:46,095][36761] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 109690880. Throughput: 0: 42890.6. Samples: 109757260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 11:48:46,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:48:48,110][36999] Updated weights for policy 0, policy_version 6700 (0.0031) [2024-07-02 11:48:51,095][36761] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 109903872. Throughput: 0: 42839.2. Samples: 110018040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 11:48:51,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:48:51,746][36999] Updated weights for policy 0, policy_version 6710 (0.0025) [2024-07-02 11:48:55,553][36999] Updated weights for policy 0, policy_version 6720 (0.0028) [2024-07-02 11:48:56,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 110116864. Throughput: 0: 42954.1. Samples: 110275820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 11:48:56,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:48:59,263][36999] Updated weights for policy 0, policy_version 6730 (0.0025) [2024-07-02 11:49:01,095][36761] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 110313472. Throughput: 0: 42849.5. Samples: 110402800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 11:49:01,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:49:03,130][36999] Updated weights for policy 0, policy_version 6740 (0.0032) [2024-07-02 11:49:06,095][36761] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 110559232. Throughput: 0: 42936.7. Samples: 110663000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 11:49:06,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:49:06,820][36999] Updated weights for policy 0, policy_version 6750 (0.0034) [2024-07-02 11:49:10,904][36999] Updated weights for policy 0, policy_version 6760 (0.0049) [2024-07-02 11:49:11,095][36761] Fps is (10 sec: 44236.9, 60 sec: 42874.8, 300 sec: 43042.7). Total num frames: 110755840. Throughput: 0: 42956.5. Samples: 110922880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 11:49:11,095][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:49:14,634][36999] Updated weights for policy 0, policy_version 6770 (0.0031) [2024-07-02 11:49:16,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 110985216. Throughput: 0: 42815.0. Samples: 111049140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 11:49:16,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:49:18,553][36999] Updated weights for policy 0, policy_version 6780 (0.0034) [2024-07-02 11:49:21,095][36761] Fps is (10 sec: 42597.5, 60 sec: 43147.7, 300 sec: 43043.4). Total num frames: 111181824. Throughput: 0: 42850.9. Samples: 111311720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 11:49:21,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 11:49:22,048][36999] Updated weights for policy 0, policy_version 6790 (0.0028) [2024-07-02 11:49:26,013][36999] Updated weights for policy 0, policy_version 6800 (0.0041) [2024-07-02 11:49:26,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43147.8, 300 sec: 43098.3). Total num frames: 111411200. Throughput: 0: 43044.5. Samples: 111569540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 11:49:26,104][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:49:29,717][36999] Updated weights for policy 0, policy_version 6810 (0.0029) [2024-07-02 11:49:31,100][36761] Fps is (10 sec: 45856.0, 60 sec: 43414.4, 300 sec: 42987.2). Total num frames: 111640576. Throughput: 0: 43206.5. Samples: 111701740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 11:49:31,102][36761] Avg episode reward: [(0, '0.020')] [2024-07-02 11:49:33,565][36999] Updated weights for policy 0, policy_version 6820 (0.0028) [2024-07-02 11:49:36,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 111837184. Throughput: 0: 43216.3. Samples: 111962780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 11:49:36,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:49:37,163][36999] Updated weights for policy 0, policy_version 6830 (0.0034) [2024-07-02 11:49:41,095][36761] Fps is (10 sec: 40977.5, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 112050176. Throughput: 0: 43201.7. Samples: 112219900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 11:49:41,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:49:41,332][36999] Updated weights for policy 0, policy_version 6840 (0.0036) [2024-07-02 11:49:43,261][36979] Signal inference workers to stop experience collection... (1550 times) [2024-07-02 11:49:43,309][36999] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-07-02 11:49:43,374][36979] Signal inference workers to resume experience collection... (1550 times) [2024-07-02 11:49:43,374][36999] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-07-02 11:49:45,158][36999] Updated weights for policy 0, policy_version 6850 (0.0031) [2024-07-02 11:49:46,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 112279552. Throughput: 0: 43222.1. Samples: 112347800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 11:49:46,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:49:48,835][36999] Updated weights for policy 0, policy_version 6860 (0.0029) [2024-07-02 11:49:51,095][36761] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 43153.8). Total num frames: 112476160. Throughput: 0: 43088.6. Samples: 112601980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 11:49:51,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:49:52,726][36999] Updated weights for policy 0, policy_version 6870 (0.0034) [2024-07-02 11:49:56,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 112705536. Throughput: 0: 43042.2. Samples: 112859780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-07-02 11:49:56,095][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:49:56,198][36999] Updated weights for policy 0, policy_version 6880 (0.0041) [2024-07-02 11:50:00,252][36999] Updated weights for policy 0, policy_version 6890 (0.0034) [2024-07-02 11:50:01,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 112918528. Throughput: 0: 43170.7. Samples: 112991820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-07-02 11:50:01,096][36761] Avg episode reward: [(0, '0.020')] [2024-07-02 11:50:03,788][36999] Updated weights for policy 0, policy_version 6900 (0.0038) [2024-07-02 11:50:06,096][36761] Fps is (10 sec: 39320.6, 60 sec: 42325.3, 300 sec: 43042.7). Total num frames: 113098752. Throughput: 0: 43004.0. Samples: 113246900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-07-02 11:50:06,096][36761] Avg episode reward: [(0, '0.017')] [2024-07-02 11:50:07,771][36999] Updated weights for policy 0, policy_version 6910 (0.0027) [2024-07-02 11:50:11,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 113344512. Throughput: 0: 43038.3. Samples: 113506260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 11:50:11,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:50:11,390][36999] Updated weights for policy 0, policy_version 6920 (0.0027) [2024-07-02 11:50:15,318][36999] Updated weights for policy 0, policy_version 6930 (0.0022) [2024-07-02 11:50:16,098][36761] Fps is (10 sec: 45865.2, 60 sec: 42869.8, 300 sec: 43042.4). Total num frames: 113557504. Throughput: 0: 42972.5. Samples: 113635420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 11:50:16,098][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:50:19,253][36999] Updated weights for policy 0, policy_version 6940 (0.0035) [2024-07-02 11:50:21,095][36761] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 113754112. Throughput: 0: 42820.5. Samples: 113889700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-07-02 11:50:21,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:50:22,775][36999] Updated weights for policy 0, policy_version 6950 (0.0026) [2024-07-02 11:50:26,098][36761] Fps is (10 sec: 44234.9, 60 sec: 43142.5, 300 sec: 42986.8). Total num frames: 113999872. Throughput: 0: 42937.8. Samples: 114152220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:50:26,099][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 11:50:26,666][36999] Updated weights for policy 0, policy_version 6960 (0.0038) [2024-07-02 11:50:30,482][36999] Updated weights for policy 0, policy_version 6970 (0.0031) [2024-07-02 11:50:31,095][36761] Fps is (10 sec: 45875.5, 60 sec: 42874.6, 300 sec: 42987.2). Total num frames: 114212864. Throughput: 0: 43065.0. Samples: 114285720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:50:31,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:50:34,437][36999] Updated weights for policy 0, policy_version 6980 (0.0046) [2024-07-02 11:50:36,095][36761] Fps is (10 sec: 40971.2, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 114409472. Throughput: 0: 43079.0. Samples: 114540540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 11:50:36,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:50:36,118][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000006983_114409472.pth... [2024-07-02 11:50:36,179][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000006355_104120320.pth [2024-07-02 11:50:38,056][36999] Updated weights for policy 0, policy_version 6990 (0.0033) [2024-07-02 11:50:41,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 114655232. Throughput: 0: 43135.9. Samples: 114800900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 11:50:41,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:50:42,053][36999] Updated weights for policy 0, policy_version 7000 (0.0031) [2024-07-02 11:50:45,912][36999] Updated weights for policy 0, policy_version 7010 (0.0050) [2024-07-02 11:50:46,095][36761] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 114851840. Throughput: 0: 43147.6. Samples: 114933460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 11:50:46,096][36761] Avg episode reward: [(0, '0.017')] [2024-07-02 11:50:49,757][36999] Updated weights for policy 0, policy_version 7020 (0.0037) [2024-07-02 11:50:51,095][36761] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 115048448. Throughput: 0: 43143.4. Samples: 115188340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 11:50:51,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:50:53,463][36999] Updated weights for policy 0, policy_version 7030 (0.0028) [2024-07-02 11:50:56,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42987.4). Total num frames: 115294208. Throughput: 0: 43140.0. Samples: 115447560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 11:50:56,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:50:57,577][36999] Updated weights for policy 0, policy_version 7040 (0.0046) [2024-07-02 11:51:01,026][36999] Updated weights for policy 0, policy_version 7050 (0.0036) [2024-07-02 11:51:01,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 115507200. Throughput: 0: 43193.5. Samples: 115579020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 11:51:01,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:51:05,018][36999] Updated weights for policy 0, policy_version 7060 (0.0038) [2024-07-02 11:51:06,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43690.8, 300 sec: 43042.7). Total num frames: 115720192. Throughput: 0: 43351.5. Samples: 115840520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-07-02 11:51:06,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:51:08,399][36999] Updated weights for policy 0, policy_version 7070 (0.0025) [2024-07-02 11:51:11,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 43043.4). Total num frames: 115933184. Throughput: 0: 43152.0. Samples: 116093940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 11:51:11,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:51:12,484][36999] Updated weights for policy 0, policy_version 7080 (0.0035) [2024-07-02 11:51:13,405][36979] Signal inference workers to stop experience collection... (1600 times) [2024-07-02 11:51:13,456][36999] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-07-02 11:51:13,515][36979] Signal inference workers to resume experience collection... (1600 times) [2024-07-02 11:51:13,516][36999] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-07-02 11:51:15,826][36999] Updated weights for policy 0, policy_version 7090 (0.0032) [2024-07-02 11:51:16,095][36761] Fps is (10 sec: 44237.3, 60 sec: 43419.3, 300 sec: 43098.3). Total num frames: 116162560. Throughput: 0: 43107.1. Samples: 116225540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-07-02 11:51:16,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:51:19,931][36999] Updated weights for policy 0, policy_version 7100 (0.0031) [2024-07-02 11:51:21,095][36761] Fps is (10 sec: 44237.6, 60 sec: 43690.8, 300 sec: 43042.7). Total num frames: 116375552. Throughput: 0: 43327.8. Samples: 116490280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-07-02 11:51:21,095][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:51:23,330][36999] Updated weights for policy 0, policy_version 7110 (0.0042) [2024-07-02 11:51:26,100][36761] Fps is (10 sec: 42578.6, 60 sec: 43143.2, 300 sec: 43042.1). Total num frames: 116588544. Throughput: 0: 43325.8. Samples: 116750760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-07-02 11:51:26,101][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:51:27,643][36999] Updated weights for policy 0, policy_version 7120 (0.0047) [2024-07-02 11:51:30,794][36999] Updated weights for policy 0, policy_version 7130 (0.0041) [2024-07-02 11:51:31,095][36761] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 43153.8). Total num frames: 116817920. Throughput: 0: 43245.8. Samples: 116879520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 11:51:31,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:51:35,024][36999] Updated weights for policy 0, policy_version 7140 (0.0034) [2024-07-02 11:51:36,095][36761] Fps is (10 sec: 44257.3, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 117030912. Throughput: 0: 43430.2. Samples: 117142700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 11:51:36,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:51:38,487][36999] Updated weights for policy 0, policy_version 7150 (0.0034) [2024-07-02 11:51:41,095][36761] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 117227520. Throughput: 0: 43358.2. Samples: 117398680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 11:51:41,096][36761] Avg episode reward: [(0, '0.023')] [2024-07-02 11:51:41,096][36979] Saving new best policy, reward=0.023! [2024-07-02 11:51:42,451][36999] Updated weights for policy 0, policy_version 7160 (0.0038) [2024-07-02 11:51:46,004][36999] Updated weights for policy 0, policy_version 7170 (0.0038) [2024-07-02 11:51:46,100][36761] Fps is (10 sec: 44216.6, 60 sec: 43687.3, 300 sec: 43208.6). Total num frames: 117473280. Throughput: 0: 43308.9. Samples: 117528120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 11:51:46,100][36761] Avg episode reward: [(0, '0.023')] [2024-07-02 11:51:50,533][36999] Updated weights for policy 0, policy_version 7180 (0.0038) [2024-07-02 11:51:51,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 117653504. Throughput: 0: 43239.5. Samples: 117786300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 11:51:51,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:51:53,650][36999] Updated weights for policy 0, policy_version 7190 (0.0041) [2024-07-02 11:51:56,095][36761] Fps is (10 sec: 40978.8, 60 sec: 43144.6, 300 sec: 43098.2). Total num frames: 117882880. Throughput: 0: 43293.8. Samples: 118042160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-07-02 11:51:56,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:51:58,045][36999] Updated weights for policy 0, policy_version 7200 (0.0048) [2024-07-02 11:52:01,025][36999] Updated weights for policy 0, policy_version 7210 (0.0034) [2024-07-02 11:52:01,095][36761] Fps is (10 sec: 47513.8, 60 sec: 43690.6, 300 sec: 43209.3). Total num frames: 118128640. Throughput: 0: 43443.9. Samples: 118180520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 11:52:01,096][36761] Avg episode reward: [(0, '0.017')] [2024-07-02 11:52:05,512][36999] Updated weights for policy 0, policy_version 7220 (0.0038) [2024-07-02 11:52:06,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 118308864. Throughput: 0: 43268.3. Samples: 118437360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 11:52:06,096][36761] Avg episode reward: [(0, '0.020')] [2024-07-02 11:52:08,646][36999] Updated weights for policy 0, policy_version 7230 (0.0030) [2024-07-02 11:52:11,095][36761] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 118521856. Throughput: 0: 43181.8. Samples: 118693740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 11:52:11,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:52:12,990][36999] Updated weights for policy 0, policy_version 7240 (0.0038) [2024-07-02 11:52:16,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 118751232. Throughput: 0: 43262.2. Samples: 118826320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 11:52:16,100][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:52:16,398][36999] Updated weights for policy 0, policy_version 7250 (0.0037) [2024-07-02 11:52:20,326][36999] Updated weights for policy 0, policy_version 7260 (0.0030) [2024-07-02 11:52:21,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 43153.8). Total num frames: 118964224. Throughput: 0: 43119.5. Samples: 119083080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 11:52:21,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:52:23,933][36999] Updated weights for policy 0, policy_version 7270 (0.0041) [2024-07-02 11:52:26,095][36761] Fps is (10 sec: 40959.8, 60 sec: 42874.7, 300 sec: 42987.2). Total num frames: 119160832. Throughput: 0: 43281.7. Samples: 119346360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 11:52:26,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:52:27,807][36999] Updated weights for policy 0, policy_version 7280 (0.0030) [2024-07-02 11:52:31,095][36761] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 43098.2). Total num frames: 119390208. Throughput: 0: 43149.6. Samples: 119469660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 11:52:31,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:52:31,679][36999] Updated weights for policy 0, policy_version 7290 (0.0042) [2024-07-02 11:52:35,382][36999] Updated weights for policy 0, policy_version 7300 (0.0040) [2024-07-02 11:52:36,095][36761] Fps is (10 sec: 47513.5, 60 sec: 43417.5, 300 sec: 43264.8). Total num frames: 119635968. Throughput: 0: 43251.6. Samples: 119732620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 11:52:36,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:52:36,165][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000007303_119652352.pth... [2024-07-02 11:52:36,215][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000006669_109264896.pth [2024-07-02 11:52:39,026][36999] Updated weights for policy 0, policy_version 7310 (0.0018) [2024-07-02 11:52:41,095][36761] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 119816192. Throughput: 0: 43584.0. Samples: 120003440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 11:52:41,096][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:52:43,031][36999] Updated weights for policy 0, policy_version 7320 (0.0035) [2024-07-02 11:52:46,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43147.8, 300 sec: 43153.8). Total num frames: 120061952. Throughput: 0: 43213.8. Samples: 120125140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:52:46,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:52:46,467][36999] Updated weights for policy 0, policy_version 7330 (0.0042) [2024-07-02 11:52:48,204][36979] Signal inference workers to stop experience collection... (1650 times) [2024-07-02 11:52:48,204][36979] Signal inference workers to resume experience collection... (1650 times) [2024-07-02 11:52:48,218][36999] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-07-02 11:52:48,218][36999] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-07-02 11:52:50,617][36999] Updated weights for policy 0, policy_version 7340 (0.0037) [2024-07-02 11:52:51,096][36761] Fps is (10 sec: 45872.1, 60 sec: 43690.3, 300 sec: 43209.2). Total num frames: 120274944. Throughput: 0: 43310.5. Samples: 120386360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 11:52:51,097][36761] Avg episode reward: [(0, '0.011')] [2024-07-02 11:52:54,178][36999] Updated weights for policy 0, policy_version 7350 (0.0046) [2024-07-02 11:52:56,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 120471552. Throughput: 0: 43570.2. Samples: 120654400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 11:52:56,096][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:52:58,126][36999] Updated weights for policy 0, policy_version 7360 (0.0024) [2024-07-02 11:53:01,095][36761] Fps is (10 sec: 42601.0, 60 sec: 42871.5, 300 sec: 43153.8). Total num frames: 120700928. Throughput: 0: 43401.4. Samples: 120779380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 11:53:01,096][36761] Avg episode reward: [(0, '0.018')] [2024-07-02 11:53:01,516][36999] Updated weights for policy 0, policy_version 7370 (0.0025) [2024-07-02 11:53:05,489][36999] Updated weights for policy 0, policy_version 7380 (0.0032) [2024-07-02 11:53:06,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 43154.5). Total num frames: 120913920. Throughput: 0: 43495.7. Samples: 121040380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 11:53:06,096][36761] Avg episode reward: [(0, '0.017')] [2024-07-02 11:53:08,870][36999] Updated weights for policy 0, policy_version 7390 (0.0036) [2024-07-02 11:53:11,100][36761] Fps is (10 sec: 42578.7, 60 sec: 43414.2, 300 sec: 43153.1). Total num frames: 121126912. Throughput: 0: 43607.6. Samples: 121308900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 11:53:11,101][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:53:12,877][36999] Updated weights for policy 0, policy_version 7400 (0.0023) [2024-07-02 11:53:16,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 43321.1). Total num frames: 121372672. Throughput: 0: 43754.8. Samples: 121438620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-07-02 11:53:16,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:53:16,324][36999] Updated weights for policy 0, policy_version 7410 (0.0037) [2024-07-02 11:53:20,283][36999] Updated weights for policy 0, policy_version 7420 (0.0040) [2024-07-02 11:53:21,095][36761] Fps is (10 sec: 45895.9, 60 sec: 43690.6, 300 sec: 43265.5). Total num frames: 121585664. Throughput: 0: 43846.7. Samples: 121705720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-07-02 11:53:21,096][36761] Avg episode reward: [(0, '0.018')] [2024-07-02 11:53:23,827][36999] Updated weights for policy 0, policy_version 7430 (0.0021) [2024-07-02 11:53:26,095][36761] Fps is (10 sec: 40959.6, 60 sec: 43690.7, 300 sec: 43209.3). Total num frames: 121782272. Throughput: 0: 43621.2. Samples: 121966400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-07-02 11:53:26,099][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:53:28,114][36999] Updated weights for policy 0, policy_version 7440 (0.0046) [2024-07-02 11:53:31,095][36761] Fps is (10 sec: 44237.6, 60 sec: 43963.9, 300 sec: 43320.4). Total num frames: 122028032. Throughput: 0: 43662.3. Samples: 122089940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-07-02 11:53:31,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:53:31,542][36999] Updated weights for policy 0, policy_version 7450 (0.0029) [2024-07-02 11:53:35,676][36999] Updated weights for policy 0, policy_version 7460 (0.0037) [2024-07-02 11:53:36,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 43209.3). Total num frames: 122224640. Throughput: 0: 43675.3. Samples: 122351720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 11:53:36,100][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:53:39,010][36999] Updated weights for policy 0, policy_version 7470 (0.0035) [2024-07-02 11:53:41,096][36761] Fps is (10 sec: 40957.6, 60 sec: 43690.2, 300 sec: 43209.3). Total num frames: 122437632. Throughput: 0: 43481.3. Samples: 122611080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 11:53:41,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:53:43,121][36999] Updated weights for policy 0, policy_version 7480 (0.0036) [2024-07-02 11:53:46,095][36761] Fps is (10 sec: 45875.4, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 122683392. Throughput: 0: 43524.5. Samples: 122737980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-07-02 11:53:46,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:53:46,980][36999] Updated weights for policy 0, policy_version 7490 (0.0037) [2024-07-02 11:53:50,586][36999] Updated weights for policy 0, policy_version 7500 (0.0032) [2024-07-02 11:53:51,095][36761] Fps is (10 sec: 44239.4, 60 sec: 43418.1, 300 sec: 43264.9). Total num frames: 122880000. Throughput: 0: 43627.1. Samples: 123003600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-07-02 11:53:51,095][36761] Avg episode reward: [(0, '0.009')] [2024-07-02 11:53:54,392][36999] Updated weights for policy 0, policy_version 7510 (0.0030) [2024-07-02 11:53:56,095][36761] Fps is (10 sec: 39321.2, 60 sec: 43417.5, 300 sec: 43264.9). Total num frames: 123076608. Throughput: 0: 43576.9. Samples: 123269660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 11:53:56,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:53:58,147][36999] Updated weights for policy 0, policy_version 7520 (0.0038) [2024-07-02 11:54:01,100][36761] Fps is (10 sec: 45854.1, 60 sec: 43960.4, 300 sec: 43319.8). Total num frames: 123338752. Throughput: 0: 43425.8. Samples: 123392980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 11:54:01,100][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:54:01,751][36999] Updated weights for policy 0, policy_version 7530 (0.0027) [2024-07-02 11:54:05,754][36999] Updated weights for policy 0, policy_version 7540 (0.0037) [2024-07-02 11:54:06,095][36761] Fps is (10 sec: 47513.4, 60 sec: 43963.6, 300 sec: 43375.9). Total num frames: 123551744. Throughput: 0: 43450.2. Samples: 123660980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 11:54:06,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 11:54:06,769][36979] Signal inference workers to stop experience collection... (1700 times) [2024-07-02 11:54:06,769][36979] Signal inference workers to resume experience collection... (1700 times) [2024-07-02 11:54:06,799][36999] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-07-02 11:54:06,799][36999] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-07-02 11:54:09,314][36999] Updated weights for policy 0, policy_version 7550 (0.0022) [2024-07-02 11:54:11,095][36761] Fps is (10 sec: 39339.6, 60 sec: 43421.0, 300 sec: 43209.3). Total num frames: 123731968. Throughput: 0: 43457.4. Samples: 123921980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 11:54:11,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:54:13,294][36999] Updated weights for policy 0, policy_version 7560 (0.0026) [2024-07-02 11:54:16,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 123994112. Throughput: 0: 43458.1. Samples: 124045560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 11:54:16,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:54:16,801][36999] Updated weights for policy 0, policy_version 7570 (0.0028) [2024-07-02 11:54:20,795][36999] Updated weights for policy 0, policy_version 7580 (0.0027) [2024-07-02 11:54:21,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43417.7, 300 sec: 43320.4). Total num frames: 124190720. Throughput: 0: 43575.2. Samples: 124312600. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-07-02 11:54:21,100][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:54:24,294][36999] Updated weights for policy 0, policy_version 7590 (0.0040) [2024-07-02 11:54:26,095][36761] Fps is (10 sec: 39321.8, 60 sec: 43417.6, 300 sec: 43210.0). Total num frames: 124387328. Throughput: 0: 43566.2. Samples: 124571540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 11:54:26,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:54:28,259][36999] Updated weights for policy 0, policy_version 7600 (0.0030) [2024-07-02 11:54:31,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 124649472. Throughput: 0: 43555.1. Samples: 124697960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 11:54:31,096][36761] Avg episode reward: [(0, '0.008')] [2024-07-02 11:54:31,732][36999] Updated weights for policy 0, policy_version 7610 (0.0041) [2024-07-02 11:54:35,768][36999] Updated weights for policy 0, policy_version 7620 (0.0031) [2024-07-02 11:54:36,095][36761] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 43376.0). Total num frames: 124846080. Throughput: 0: 43532.4. Samples: 124962560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 11:54:36,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:54:36,107][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000007620_124846080.pth... [2024-07-02 11:54:36,158][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000006983_114409472.pth [2024-07-02 11:54:39,193][36999] Updated weights for policy 0, policy_version 7630 (0.0034) [2024-07-02 11:54:41,100][36761] Fps is (10 sec: 37665.8, 60 sec: 43141.6, 300 sec: 43208.7). Total num frames: 125026304. Throughput: 0: 43290.3. Samples: 125217920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-07-02 11:54:41,101][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:54:43,386][36999] Updated weights for policy 0, policy_version 7640 (0.0033) [2024-07-02 11:54:46,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 43375.9). Total num frames: 125272064. Throughput: 0: 43277.7. Samples: 125340280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-07-02 11:54:46,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:54:47,102][36999] Updated weights for policy 0, policy_version 7650 (0.0029) [2024-07-02 11:54:51,033][36999] Updated weights for policy 0, policy_version 7660 (0.0029) [2024-07-02 11:54:51,095][36761] Fps is (10 sec: 47535.3, 60 sec: 43690.6, 300 sec: 43375.9). Total num frames: 125501440. Throughput: 0: 43203.2. Samples: 125605120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 11:54:51,096][36761] Avg episode reward: [(0, '0.018')] [2024-07-02 11:54:54,767][36999] Updated weights for policy 0, policy_version 7670 (0.0030) [2024-07-02 11:54:56,095][36761] Fps is (10 sec: 40959.6, 60 sec: 43417.6, 300 sec: 43264.9). Total num frames: 125681664. Throughput: 0: 42992.8. Samples: 125856660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-07-02 11:54:56,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:54:58,824][36999] Updated weights for policy 0, policy_version 7680 (0.0039) [2024-07-02 11:55:01,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43147.8, 300 sec: 43487.0). Total num frames: 125927424. Throughput: 0: 43126.3. Samples: 125986240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-07-02 11:55:01,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:55:02,677][36999] Updated weights for policy 0, policy_version 7690 (0.0032) [2024-07-02 11:55:06,095][36761] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 43320.4). Total num frames: 126124032. Throughput: 0: 43078.2. Samples: 126251120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 11:55:06,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:55:06,349][36999] Updated weights for policy 0, policy_version 7700 (0.0046) [2024-07-02 11:55:10,389][36999] Updated weights for policy 0, policy_version 7710 (0.0029) [2024-07-02 11:55:11,095][36761] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 43320.8). Total num frames: 126337024. Throughput: 0: 42901.0. Samples: 126502080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 11:55:11,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 11:55:14,087][36999] Updated weights for policy 0, policy_version 7720 (0.0038) [2024-07-02 11:55:16,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43144.7, 300 sec: 43487.0). Total num frames: 126582784. Throughput: 0: 42985.9. Samples: 126632320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 11:55:16,095][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:55:18,110][36999] Updated weights for policy 0, policy_version 7730 (0.0034) [2024-07-02 11:55:21,095][36761] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 43265.3). Total num frames: 126763008. Throughput: 0: 42924.0. Samples: 126894140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 11:55:21,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 11:55:21,874][36999] Updated weights for policy 0, policy_version 7740 (0.0034) [2024-07-02 11:55:25,775][36999] Updated weights for policy 0, policy_version 7750 (0.0029) [2024-07-02 11:55:26,096][36761] Fps is (10 sec: 40958.8, 60 sec: 43417.5, 300 sec: 43320.4). Total num frames: 126992384. Throughput: 0: 42912.6. Samples: 127148800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-07-02 11:55:26,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:55:28,512][36979] Signal inference workers to stop experience collection... (1750 times) [2024-07-02 11:55:28,544][36999] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-07-02 11:55:28,579][36979] Signal inference workers to resume experience collection... (1750 times) [2024-07-02 11:55:28,579][36999] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-07-02 11:55:29,414][36999] Updated weights for policy 0, policy_version 7760 (0.0028) [2024-07-02 11:55:31,095][36761] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 43431.5). Total num frames: 127221760. Throughput: 0: 43080.4. Samples: 127278900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 11:55:31,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:55:33,222][36999] Updated weights for policy 0, policy_version 7770 (0.0036) [2024-07-02 11:55:36,095][36761] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 43264.9). Total num frames: 127418368. Throughput: 0: 43049.4. Samples: 127542340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 11:55:36,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:55:36,918][36999] Updated weights for policy 0, policy_version 7780 (0.0035) [2024-07-02 11:55:40,895][36999] Updated weights for policy 0, policy_version 7790 (0.0031) [2024-07-02 11:55:41,100][36761] Fps is (10 sec: 42579.2, 60 sec: 43690.7, 300 sec: 43375.3). Total num frames: 127647744. Throughput: 0: 43168.2. Samples: 127799420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 11:55:41,101][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:55:44,421][36999] Updated weights for policy 0, policy_version 7800 (0.0024) [2024-07-02 11:55:46,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 43431.5). Total num frames: 127860736. Throughput: 0: 43135.7. Samples: 127927340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-07-02 11:55:46,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:55:48,361][36999] Updated weights for policy 0, policy_version 7810 (0.0023) [2024-07-02 11:55:51,095][36761] Fps is (10 sec: 40978.3, 60 sec: 42598.3, 300 sec: 43264.9). Total num frames: 128057344. Throughput: 0: 43097.6. Samples: 128190520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 11:55:51,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:55:52,269][36999] Updated weights for policy 0, policy_version 7820 (0.0031) [2024-07-02 11:55:55,991][36999] Updated weights for policy 0, policy_version 7830 (0.0025) [2024-07-02 11:55:56,095][36761] Fps is (10 sec: 42597.7, 60 sec: 43417.6, 300 sec: 43320.4). Total num frames: 128286720. Throughput: 0: 43227.4. Samples: 128447320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 11:55:56,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:55:59,776][36999] Updated weights for policy 0, policy_version 7840 (0.0046) [2024-07-02 11:56:01,095][36761] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 43376.0). Total num frames: 128516096. Throughput: 0: 43164.4. Samples: 128574720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 11:56:01,095][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:56:03,509][36999] Updated weights for policy 0, policy_version 7850 (0.0041) [2024-07-02 11:56:06,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 43320.4). Total num frames: 128712704. Throughput: 0: 43291.0. Samples: 128842240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 11:56:06,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:56:07,184][36999] Updated weights for policy 0, policy_version 7860 (0.0026) [2024-07-02 11:56:10,909][36999] Updated weights for policy 0, policy_version 7870 (0.0023) [2024-07-02 11:56:11,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 43320.4). Total num frames: 128942080. Throughput: 0: 43269.6. Samples: 129095920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 11:56:11,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:56:14,560][36999] Updated weights for policy 0, policy_version 7880 (0.0044) [2024-07-02 11:56:16,095][36761] Fps is (10 sec: 45875.9, 60 sec: 43144.5, 300 sec: 43375.9). Total num frames: 129171456. Throughput: 0: 43425.8. Samples: 129233060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 11:56:16,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:56:18,420][36999] Updated weights for policy 0, policy_version 7890 (0.0035) [2024-07-02 11:56:21,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 43321.1). Total num frames: 129368064. Throughput: 0: 43410.7. Samples: 129495820. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-07-02 11:56:21,095][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:56:22,117][36999] Updated weights for policy 0, policy_version 7900 (0.0040) [2024-07-02 11:56:25,866][36999] Updated weights for policy 0, policy_version 7910 (0.0024) [2024-07-02 11:56:26,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43690.8, 300 sec: 43375.9). Total num frames: 129613824. Throughput: 0: 43338.1. Samples: 129749440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 11:56:26,096][36761] Avg episode reward: [(0, '0.006')] [2024-07-02 11:56:29,558][36999] Updated weights for policy 0, policy_version 7920 (0.0030) [2024-07-02 11:56:31,099][36761] Fps is (10 sec: 45859.9, 60 sec: 43415.3, 300 sec: 43375.5). Total num frames: 129826816. Throughput: 0: 43452.8. Samples: 129882860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 11:56:31,100][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:56:33,274][36999] Updated weights for policy 0, policy_version 7930 (0.0036) [2024-07-02 11:56:36,095][36761] Fps is (10 sec: 40959.7, 60 sec: 43417.5, 300 sec: 43375.9). Total num frames: 130023424. Throughput: 0: 43556.0. Samples: 130150540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 11:56:36,096][36761] Avg episode reward: [(0, '0.020')] [2024-07-02 11:56:36,121][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000007936_130023424.pth... [2024-07-02 11:56:36,189][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000007303_119652352.pth [2024-07-02 11:56:36,955][36999] Updated weights for policy 0, policy_version 7940 (0.0028) [2024-07-02 11:56:40,793][36999] Updated weights for policy 0, policy_version 7950 (0.0040) [2024-07-02 11:56:41,100][36761] Fps is (10 sec: 42592.4, 60 sec: 43417.5, 300 sec: 43320.4). Total num frames: 130252800. Throughput: 0: 43460.5. Samples: 130403240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-07-02 11:56:41,101][36761] Avg episode reward: [(0, '0.020')] [2024-07-02 11:56:45,306][36999] Updated weights for policy 0, policy_version 7960 (0.0032) [2024-07-02 11:56:46,096][36761] Fps is (10 sec: 45874.7, 60 sec: 43690.4, 300 sec: 43487.0). Total num frames: 130482176. Throughput: 0: 43624.6. Samples: 130537840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 11:56:46,100][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:56:48,286][36999] Updated weights for policy 0, policy_version 7970 (0.0042) [2024-07-02 11:56:51,095][36761] Fps is (10 sec: 42618.1, 60 sec: 43690.7, 300 sec: 43375.9). Total num frames: 130678784. Throughput: 0: 43574.8. Samples: 130803100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 11:56:51,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:56:52,761][36999] Updated weights for policy 0, policy_version 7980 (0.0030) [2024-07-02 11:56:55,800][36999] Updated weights for policy 0, policy_version 7990 (0.0033) [2024-07-02 11:56:56,096][36761] Fps is (10 sec: 42595.2, 60 sec: 43690.0, 300 sec: 43320.3). Total num frames: 130908160. Throughput: 0: 43552.3. Samples: 131055820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 11:56:56,097][36761] Avg episode reward: [(0, '0.017')] [2024-07-02 11:57:00,102][36999] Updated weights for policy 0, policy_version 8000 (0.0030) [2024-07-02 11:57:01,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43690.7, 300 sec: 43487.0). Total num frames: 131137536. Throughput: 0: 43537.8. Samples: 131192260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-07-02 11:57:01,096][36761] Avg episode reward: [(0, '0.017')] [2024-07-02 11:57:03,251][36999] Updated weights for policy 0, policy_version 8010 (0.0029) [2024-07-02 11:57:06,095][36761] Fps is (10 sec: 42602.2, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 131334144. Throughput: 0: 43538.5. Samples: 131455060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-07-02 11:57:06,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 11:57:07,595][36999] Updated weights for policy 0, policy_version 8020 (0.0041) [2024-07-02 11:57:10,861][36999] Updated weights for policy 0, policy_version 8030 (0.0032) [2024-07-02 11:57:11,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 131563520. Throughput: 0: 43545.3. Samples: 131708980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 11:57:11,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:57:15,106][36999] Updated weights for policy 0, policy_version 8040 (0.0029) [2024-07-02 11:57:16,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 43431.5). Total num frames: 131776512. Throughput: 0: 43507.1. Samples: 131840540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 11:57:16,096][36761] Avg episode reward: [(0, '0.022')] [2024-07-02 11:57:18,264][36999] Updated weights for policy 0, policy_version 8050 (0.0034) [2024-07-02 11:57:21,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43417.5, 300 sec: 43431.5). Total num frames: 131973120. Throughput: 0: 43366.7. Samples: 132102040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 11:57:21,096][36761] Avg episode reward: [(0, '0.010')] [2024-07-02 11:57:21,442][36979] Signal inference workers to stop experience collection... (1800 times) [2024-07-02 11:57:21,496][36999] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-07-02 11:57:21,504][36979] Signal inference workers to resume experience collection... (1800 times) [2024-07-02 11:57:21,507][36999] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-07-02 11:57:22,467][36999] Updated weights for policy 0, policy_version 8060 (0.0032) [2024-07-02 11:57:26,095][36761] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 43376.0). Total num frames: 132186112. Throughput: 0: 43462.3. Samples: 132358840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 11:57:26,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:57:26,426][36999] Updated weights for policy 0, policy_version 8070 (0.0033) [2024-07-02 11:57:29,867][36999] Updated weights for policy 0, policy_version 8080 (0.0034) [2024-07-02 11:57:31,095][36761] Fps is (10 sec: 45875.4, 60 sec: 43420.0, 300 sec: 43376.0). Total num frames: 132431872. Throughput: 0: 43386.5. Samples: 132490220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 11:57:31,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:57:33,943][36999] Updated weights for policy 0, policy_version 8090 (0.0034) [2024-07-02 11:57:36,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 43431.5). Total num frames: 132628480. Throughput: 0: 43316.8. Samples: 132752360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 11:57:36,100][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:57:37,362][36999] Updated weights for policy 0, policy_version 8100 (0.0041) [2024-07-02 11:57:41,095][36761] Fps is (10 sec: 40960.2, 60 sec: 43147.9, 300 sec: 43320.4). Total num frames: 132841472. Throughput: 0: 43391.2. Samples: 133008380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 11:57:41,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:57:41,551][36999] Updated weights for policy 0, policy_version 8110 (0.0043) [2024-07-02 11:57:45,215][36999] Updated weights for policy 0, policy_version 8120 (0.0027) [2024-07-02 11:57:46,095][36761] Fps is (10 sec: 45875.9, 60 sec: 43417.8, 300 sec: 43431.6). Total num frames: 133087232. Throughput: 0: 43274.7. Samples: 133139620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 11:57:46,096][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 11:57:49,166][36999] Updated weights for policy 0, policy_version 8130 (0.0038) [2024-07-02 11:57:51,095][36761] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 43375.9). Total num frames: 133267456. Throughput: 0: 43215.0. Samples: 133399740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 11:57:51,096][36761] Avg episode reward: [(0, '0.020')] [2024-07-02 11:57:52,698][36999] Updated weights for policy 0, policy_version 8140 (0.0040) [2024-07-02 11:57:56,095][36761] Fps is (10 sec: 40959.8, 60 sec: 43145.2, 300 sec: 43375.9). Total num frames: 133496832. Throughput: 0: 43434.3. Samples: 133663520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 11:57:56,096][36761] Avg episode reward: [(0, '0.017')] [2024-07-02 11:57:56,729][36999] Updated weights for policy 0, policy_version 8150 (0.0032) [2024-07-02 11:58:00,232][36999] Updated weights for policy 0, policy_version 8160 (0.0030) [2024-07-02 11:58:01,095][36761] Fps is (10 sec: 47514.1, 60 sec: 43417.5, 300 sec: 43487.0). Total num frames: 133742592. Throughput: 0: 43267.1. Samples: 133787560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 11:58:01,096][36761] Avg episode reward: [(0, '0.017')] [2024-07-02 11:58:04,686][36999] Updated weights for policy 0, policy_version 8170 (0.0024) [2024-07-02 11:58:06,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 43432.2). Total num frames: 133939200. Throughput: 0: 43341.4. Samples: 134052400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 11:58:06,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:58:07,834][36999] Updated weights for policy 0, policy_version 8180 (0.0038) [2024-07-02 11:58:11,095][36761] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 43264.8). Total num frames: 134135808. Throughput: 0: 43328.8. Samples: 134308640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 11:58:11,096][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 11:58:12,278][36999] Updated weights for policy 0, policy_version 8190 (0.0027) [2024-07-02 11:58:15,497][36999] Updated weights for policy 0, policy_version 8200 (0.0041) [2024-07-02 11:58:16,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 43376.0). Total num frames: 134381568. Throughput: 0: 43184.4. Samples: 134433520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 11:58:16,096][36761] Avg episode reward: [(0, '0.023')] [2024-07-02 11:58:19,702][36999] Updated weights for policy 0, policy_version 8210 (0.0032) [2024-07-02 11:58:21,095][36761] Fps is (10 sec: 44237.4, 60 sec: 43417.6, 300 sec: 43376.0). Total num frames: 134578176. Throughput: 0: 43219.2. Samples: 134697220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 11:58:21,096][36761] Avg episode reward: [(0, '0.018')] [2024-07-02 11:58:23,006][36999] Updated weights for policy 0, policy_version 8220 (0.0025) [2024-07-02 11:58:26,095][36761] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 43209.3). Total num frames: 134774784. Throughput: 0: 43326.6. Samples: 134958080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 11:58:26,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 11:58:27,168][36999] Updated weights for policy 0, policy_version 8230 (0.0040) [2024-07-02 11:58:30,378][36999] Updated weights for policy 0, policy_version 8240 (0.0046) [2024-07-02 11:58:31,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 43375.9). Total num frames: 135020544. Throughput: 0: 43313.2. Samples: 135088720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 11:58:31,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:58:34,963][36999] Updated weights for policy 0, policy_version 8250 (0.0029) [2024-07-02 11:58:36,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 43320.5). Total num frames: 135217152. Throughput: 0: 43281.9. Samples: 135347420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-07-02 11:58:36,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:58:36,118][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000008253_135217152.pth... [2024-07-02 11:58:36,170][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000007620_124846080.pth [2024-07-02 11:58:37,824][36979] Signal inference workers to stop experience collection... (1850 times) [2024-07-02 11:58:37,825][36979] Signal inference workers to resume experience collection... (1850 times) [2024-07-02 11:58:37,855][36999] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-07-02 11:58:37,856][36999] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-07-02 11:58:37,963][36999] Updated weights for policy 0, policy_version 8260 (0.0030) [2024-07-02 11:58:41,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 43209.3). Total num frames: 135430144. Throughput: 0: 43105.3. Samples: 135603260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-07-02 11:58:41,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:58:42,493][36999] Updated weights for policy 0, policy_version 8270 (0.0041) [2024-07-02 11:58:45,569][36999] Updated weights for policy 0, policy_version 8280 (0.0045) [2024-07-02 11:58:46,095][36761] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 43375.9). Total num frames: 135675904. Throughput: 0: 43300.4. Samples: 135736080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-07-02 11:58:46,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:58:49,991][36999] Updated weights for policy 0, policy_version 8290 (0.0021) [2024-07-02 11:58:51,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43417.7, 300 sec: 43376.0). Total num frames: 135872512. Throughput: 0: 43259.1. Samples: 135999060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-07-02 11:58:51,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 11:58:53,146][36999] Updated weights for policy 0, policy_version 8300 (0.0041) [2024-07-02 11:58:56,095][36761] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 43210.0). Total num frames: 136085504. Throughput: 0: 43193.5. Samples: 136252340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 11:58:56,096][36761] Avg episode reward: [(0, '0.025')] [2024-07-02 11:58:56,104][36979] Saving new best policy, reward=0.025! [2024-07-02 11:58:57,627][36999] Updated weights for policy 0, policy_version 8310 (0.0034) [2024-07-02 11:59:00,684][36999] Updated weights for policy 0, policy_version 8320 (0.0031) [2024-07-02 11:59:01,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 43320.4). Total num frames: 136331264. Throughput: 0: 43427.1. Samples: 136387740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 11:59:01,096][36761] Avg episode reward: [(0, '0.023')] [2024-07-02 11:59:05,027][36999] Updated weights for policy 0, policy_version 8330 (0.0024) [2024-07-02 11:59:06,095][36761] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 43264.9). Total num frames: 136495104. Throughput: 0: 43428.9. Samples: 136651520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-07-02 11:59:06,095][36761] Avg episode reward: [(0, '0.022')] [2024-07-02 11:59:08,171][36999] Updated weights for policy 0, policy_version 8340 (0.0033) [2024-07-02 11:59:11,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43417.7, 300 sec: 43209.3). Total num frames: 136740864. Throughput: 0: 43266.7. Samples: 136905080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 11:59:11,096][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 11:59:12,445][36999] Updated weights for policy 0, policy_version 8350 (0.0045) [2024-07-02 11:59:15,693][36999] Updated weights for policy 0, policy_version 8360 (0.0039) [2024-07-02 11:59:16,095][36761] Fps is (10 sec: 49151.3, 60 sec: 43417.5, 300 sec: 43375.9). Total num frames: 136986624. Throughput: 0: 43340.0. Samples: 137039020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 11:59:16,096][36761] Avg episode reward: [(0, '0.020')] [2024-07-02 11:59:20,050][36999] Updated weights for policy 0, policy_version 8370 (0.0042) [2024-07-02 11:59:21,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 43320.4). Total num frames: 137166848. Throughput: 0: 43464.5. Samples: 137303320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 11:59:21,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 11:59:23,262][36999] Updated weights for policy 0, policy_version 8380 (0.0039) [2024-07-02 11:59:26,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43690.6, 300 sec: 43209.3). Total num frames: 137396224. Throughput: 0: 43466.2. Samples: 137559240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-07-02 11:59:26,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 11:59:27,597][36999] Updated weights for policy 0, policy_version 8390 (0.0032) [2024-07-02 11:59:30,759][36999] Updated weights for policy 0, policy_version 8400 (0.0046) [2024-07-02 11:59:31,095][36761] Fps is (10 sec: 47513.3, 60 sec: 43690.7, 300 sec: 43376.0). Total num frames: 137641984. Throughput: 0: 43603.7. Samples: 137698240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-07-02 11:59:31,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 11:59:35,007][36999] Updated weights for policy 0, policy_version 8410 (0.0039) [2024-07-02 11:59:36,095][36761] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 43321.1). Total num frames: 137805824. Throughput: 0: 43493.8. Samples: 137956280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 11:59:36,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:59:38,239][36999] Updated weights for policy 0, policy_version 8420 (0.0031) [2024-07-02 11:59:41,095][36761] Fps is (10 sec: 40959.7, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 138051584. Throughput: 0: 43485.7. Samples: 138209200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 11:59:41,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 11:59:42,558][36999] Updated weights for policy 0, policy_version 8430 (0.0023) [2024-07-02 11:59:45,833][36999] Updated weights for policy 0, policy_version 8440 (0.0026) [2024-07-02 11:59:46,095][36761] Fps is (10 sec: 47513.8, 60 sec: 43417.7, 300 sec: 43320.4). Total num frames: 138280960. Throughput: 0: 43482.3. Samples: 138344440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 11:59:46,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 11:59:49,971][36999] Updated weights for policy 0, policy_version 8450 (0.0036) [2024-07-02 11:59:51,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43417.7, 300 sec: 43376.0). Total num frames: 138477568. Throughput: 0: 43303.1. Samples: 138600160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 11:59:51,095][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 11:59:53,433][36999] Updated weights for policy 0, policy_version 8460 (0.0039) [2024-07-02 11:59:56,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43690.6, 300 sec: 43320.4). Total num frames: 138706944. Throughput: 0: 43266.6. Samples: 138852080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-07-02 11:59:56,099][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 11:59:57,963][36999] Updated weights for policy 0, policy_version 8470 (0.0033) [2024-07-02 12:00:01,005][36999] Updated weights for policy 0, policy_version 8480 (0.0024) [2024-07-02 12:00:01,095][36761] Fps is (10 sec: 45874.4, 60 sec: 43417.5, 300 sec: 43431.5). Total num frames: 138936320. Throughput: 0: 43186.7. Samples: 138982420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-07-02 12:00:01,096][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 12:00:05,187][36979] Signal inference workers to stop experience collection... (1900 times) [2024-07-02 12:00:05,239][36979] Signal inference workers to resume experience collection... (1900 times) [2024-07-02 12:00:05,240][36999] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-07-02 12:00:05,264][36999] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-07-02 12:00:05,381][36999] Updated weights for policy 0, policy_version 8490 (0.0036) [2024-07-02 12:00:06,095][36761] Fps is (10 sec: 40960.4, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 139116544. Throughput: 0: 43236.0. Samples: 139248940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 12:00:06,096][36761] Avg episode reward: [(0, '0.025')] [2024-07-02 12:00:08,461][36999] Updated weights for policy 0, policy_version 8500 (0.0027) [2024-07-02 12:00:11,095][36761] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 43264.9). Total num frames: 139345920. Throughput: 0: 43212.1. Samples: 139503780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 12:00:11,096][36761] Avg episode reward: [(0, '0.023')] [2024-07-02 12:00:12,735][36999] Updated weights for policy 0, policy_version 8510 (0.0034) [2024-07-02 12:00:15,968][36999] Updated weights for policy 0, policy_version 8520 (0.0037) [2024-07-02 12:00:16,095][36761] Fps is (10 sec: 47513.7, 60 sec: 43417.7, 300 sec: 43487.0). Total num frames: 139591680. Throughput: 0: 43209.8. Samples: 139642680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 12:00:16,096][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 12:00:20,140][36999] Updated weights for policy 0, policy_version 8530 (0.0023) [2024-07-02 12:00:21,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 43320.4). Total num frames: 139771904. Throughput: 0: 43172.0. Samples: 139899020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-07-02 12:00:21,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 12:00:23,559][36999] Updated weights for policy 0, policy_version 8540 (0.0027) [2024-07-02 12:00:26,095][36761] Fps is (10 sec: 39320.8, 60 sec: 43144.5, 300 sec: 43264.9). Total num frames: 139984896. Throughput: 0: 43414.6. Samples: 140162860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-07-02 12:00:26,096][36761] Avg episode reward: [(0, '0.018')] [2024-07-02 12:00:27,614][36999] Updated weights for policy 0, policy_version 8550 (0.0028) [2024-07-02 12:00:31,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 43431.5). Total num frames: 140230656. Throughput: 0: 43341.8. Samples: 140294820. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-07-02 12:00:31,095][36761] Avg episode reward: [(0, '0.017')] [2024-07-02 12:00:31,166][36999] Updated weights for policy 0, policy_version 8560 (0.0038) [2024-07-02 12:00:35,050][36999] Updated weights for policy 0, policy_version 8570 (0.0033) [2024-07-02 12:00:36,095][36761] Fps is (10 sec: 44237.3, 60 sec: 43690.6, 300 sec: 43321.1). Total num frames: 140427264. Throughput: 0: 43426.6. Samples: 140554360. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-07-02 12:00:36,096][36761] Avg episode reward: [(0, '0.012')] [2024-07-02 12:00:36,109][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000008571_140427264.pth... [2024-07-02 12:00:36,166][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000007936_130023424.pth [2024-07-02 12:00:38,707][36999] Updated weights for policy 0, policy_version 8580 (0.0025) [2024-07-02 12:00:41,095][36761] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 43320.4). Total num frames: 140640256. Throughput: 0: 43581.0. Samples: 140813220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-07-02 12:00:41,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 12:00:42,471][36999] Updated weights for policy 0, policy_version 8590 (0.0026) [2024-07-02 12:00:46,097][36999] Updated weights for policy 0, policy_version 8600 (0.0052) [2024-07-02 12:00:46,098][36761] Fps is (10 sec: 47502.7, 60 sec: 43688.9, 300 sec: 43542.2). Total num frames: 140902400. Throughput: 0: 43713.4. Samples: 140949620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-07-02 12:00:46,098][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 12:00:49,934][36999] Updated weights for policy 0, policy_version 8610 (0.0036) [2024-07-02 12:00:51,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 43376.0). Total num frames: 141082624. Throughput: 0: 43438.2. Samples: 141203660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 12:00:51,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 12:00:53,865][36999] Updated weights for policy 0, policy_version 8620 (0.0037) [2024-07-02 12:00:56,100][36761] Fps is (10 sec: 40950.8, 60 sec: 43414.3, 300 sec: 43375.3). Total num frames: 141312000. Throughput: 0: 43552.0. Samples: 141463820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 12:00:56,100][36761] Avg episode reward: [(0, '0.023')] [2024-07-02 12:00:57,768][36999] Updated weights for policy 0, policy_version 8630 (0.0038) [2024-07-02 12:01:01,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 43431.5). Total num frames: 141524992. Throughput: 0: 43598.6. Samples: 141604620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 12:01:01,096][36761] Avg episode reward: [(0, '0.022')] [2024-07-02 12:01:01,602][36999] Updated weights for policy 0, policy_version 8640 (0.0033) [2024-07-02 12:01:05,265][36999] Updated weights for policy 0, policy_version 8650 (0.0031) [2024-07-02 12:01:06,095][36761] Fps is (10 sec: 40978.7, 60 sec: 43417.5, 300 sec: 43320.4). Total num frames: 141721600. Throughput: 0: 43580.8. Samples: 141860160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 12:01:06,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 12:01:08,868][36999] Updated weights for policy 0, policy_version 8660 (0.0031) [2024-07-02 12:01:11,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43690.7, 300 sec: 43375.9). Total num frames: 141967360. Throughput: 0: 43665.4. Samples: 142127800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 12:01:11,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 12:01:12,676][36999] Updated weights for policy 0, policy_version 8670 (0.0024) [2024-07-02 12:01:15,995][36979] Signal inference workers to stop experience collection... (1950 times) [2024-07-02 12:01:15,996][36979] Signal inference workers to resume experience collection... (1950 times) [2024-07-02 12:01:16,029][36999] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-07-02 12:01:16,029][36999] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-07-02 12:01:16,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 43431.5). Total num frames: 142180352. Throughput: 0: 43722.1. Samples: 142262320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 12:01:16,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 12:01:16,299][36999] Updated weights for policy 0, policy_version 8680 (0.0028) [2024-07-02 12:01:20,165][36999] Updated weights for policy 0, policy_version 8690 (0.0040) [2024-07-02 12:01:21,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 43264.9). Total num frames: 142376960. Throughput: 0: 43658.8. Samples: 142519000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 12:01:21,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 12:01:23,715][36999] Updated weights for policy 0, policy_version 8700 (0.0030) [2024-07-02 12:01:26,100][36761] Fps is (10 sec: 44216.5, 60 sec: 43960.4, 300 sec: 43375.7). Total num frames: 142622720. Throughput: 0: 43787.4. Samples: 142783860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:01:26,101][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 12:01:27,676][36999] Updated weights for policy 0, policy_version 8710 (0.0028) [2024-07-02 12:01:31,095][36761] Fps is (10 sec: 47513.5, 60 sec: 43690.6, 300 sec: 43487.0). Total num frames: 142852096. Throughput: 0: 43648.5. Samples: 142913700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 12:01:31,096][36761] Avg episode reward: [(0, '0.017')] [2024-07-02 12:01:31,203][36999] Updated weights for policy 0, policy_version 8720 (0.0030) [2024-07-02 12:01:35,157][36999] Updated weights for policy 0, policy_version 8730 (0.0042) [2024-07-02 12:01:36,095][36761] Fps is (10 sec: 42618.1, 60 sec: 43690.7, 300 sec: 43376.6). Total num frames: 143048704. Throughput: 0: 43790.2. Samples: 143174220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 12:01:36,096][36761] Avg episode reward: [(0, '0.018')] [2024-07-02 12:01:38,618][36999] Updated weights for policy 0, policy_version 8740 (0.0033) [2024-07-02 12:01:41,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43963.7, 300 sec: 43376.0). Total num frames: 143278080. Throughput: 0: 43830.2. Samples: 143435980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:01:41,096][36761] Avg episode reward: [(0, '0.018')] [2024-07-02 12:01:42,571][36999] Updated weights for policy 0, policy_version 8750 (0.0035) [2024-07-02 12:01:46,046][36999] Updated weights for policy 0, policy_version 8760 (0.0028) [2024-07-02 12:01:46,095][36761] Fps is (10 sec: 47513.5, 60 sec: 43692.3, 300 sec: 43542.6). Total num frames: 143523840. Throughput: 0: 43638.6. Samples: 143568360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:01:46,096][36761] Avg episode reward: [(0, '0.018')] [2024-07-02 12:01:50,226][36999] Updated weights for policy 0, policy_version 8770 (0.0027) [2024-07-02 12:01:51,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43963.7, 300 sec: 43431.6). Total num frames: 143720448. Throughput: 0: 43795.5. Samples: 143830960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:01:51,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 12:01:53,458][36999] Updated weights for policy 0, policy_version 8780 (0.0037) [2024-07-02 12:01:56,095][36761] Fps is (10 sec: 39322.1, 60 sec: 43421.0, 300 sec: 43320.4). Total num frames: 143917056. Throughput: 0: 43819.2. Samples: 144099660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:01:56,096][36761] Avg episode reward: [(0, '0.024')] [2024-07-02 12:01:57,588][36999] Updated weights for policy 0, policy_version 8790 (0.0031) [2024-07-02 12:02:00,861][36999] Updated weights for policy 0, policy_version 8800 (0.0040) [2024-07-02 12:02:01,095][36761] Fps is (10 sec: 45874.9, 60 sec: 44236.7, 300 sec: 43542.6). Total num frames: 144179200. Throughput: 0: 43705.3. Samples: 144229060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:02:01,096][36761] Avg episode reward: [(0, '0.025')] [2024-07-02 12:02:05,133][36999] Updated weights for policy 0, policy_version 8810 (0.0032) [2024-07-02 12:02:06,096][36761] Fps is (10 sec: 45874.0, 60 sec: 44236.7, 300 sec: 43431.5). Total num frames: 144375808. Throughput: 0: 43888.7. Samples: 144494000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:02:06,096][36761] Avg episode reward: [(0, '0.026')] [2024-07-02 12:02:08,629][36999] Updated weights for policy 0, policy_version 8820 (0.0038) [2024-07-02 12:02:11,095][36761] Fps is (10 sec: 39321.6, 60 sec: 43417.5, 300 sec: 43375.9). Total num frames: 144572416. Throughput: 0: 43800.4. Samples: 144754680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-07-02 12:02:11,096][36761] Avg episode reward: [(0, '0.023')] [2024-07-02 12:02:12,828][36999] Updated weights for policy 0, policy_version 8830 (0.0033) [2024-07-02 12:02:16,095][36761] Fps is (10 sec: 42599.4, 60 sec: 43690.7, 300 sec: 43487.0). Total num frames: 144801792. Throughput: 0: 43785.8. Samples: 144884060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 12:02:16,095][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 12:02:16,195][36979] Signal inference workers to stop experience collection... (2000 times) [2024-07-02 12:02:16,195][36979] Signal inference workers to resume experience collection... (2000 times) [2024-07-02 12:02:16,237][36999] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-07-02 12:02:16,237][36999] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-07-02 12:02:16,332][36999] Updated weights for policy 0, policy_version 8840 (0.0031) [2024-07-02 12:02:20,315][36999] Updated weights for policy 0, policy_version 8850 (0.0040) [2024-07-02 12:02:21,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43963.7, 300 sec: 43487.0). Total num frames: 145014784. Throughput: 0: 43620.0. Samples: 145137120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 12:02:21,096][36761] Avg episode reward: [(0, '0.024')] [2024-07-02 12:02:23,793][36999] Updated weights for policy 0, policy_version 8860 (0.0036) [2024-07-02 12:02:26,095][36761] Fps is (10 sec: 40959.7, 60 sec: 43147.9, 300 sec: 43320.4). Total num frames: 145211392. Throughput: 0: 43824.5. Samples: 145408080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 12:02:26,096][36761] Avg episode reward: [(0, '0.028')] [2024-07-02 12:02:26,132][36979] Saving new best policy, reward=0.028! [2024-07-02 12:02:27,749][36999] Updated weights for policy 0, policy_version 8870 (0.0037) [2024-07-02 12:02:31,095][36761] Fps is (10 sec: 45875.9, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 145473536. Throughput: 0: 43630.4. Samples: 145531720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 12:02:31,095][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 12:02:31,227][36999] Updated weights for policy 0, policy_version 8880 (0.0027) [2024-07-02 12:02:35,219][36999] Updated weights for policy 0, policy_version 8890 (0.0033) [2024-07-02 12:02:36,095][36761] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 43431.5). Total num frames: 145653760. Throughput: 0: 43484.6. Samples: 145787760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 12:02:36,095][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 12:02:36,196][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000008891_145670144.pth... [2024-07-02 12:02:36,252][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000008253_135217152.pth [2024-07-02 12:02:38,726][36999] Updated weights for policy 0, policy_version 8900 (0.0040) [2024-07-02 12:02:41,095][36761] Fps is (10 sec: 39320.7, 60 sec: 43144.5, 300 sec: 43320.4). Total num frames: 145866752. Throughput: 0: 43438.9. Samples: 146054420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-07-02 12:02:41,096][36761] Avg episode reward: [(0, '0.015')] [2024-07-02 12:02:42,752][36999] Updated weights for policy 0, policy_version 8910 (0.0022) [2024-07-02 12:02:46,095][36761] Fps is (10 sec: 47512.8, 60 sec: 43417.6, 300 sec: 43598.1). Total num frames: 146128896. Throughput: 0: 43389.8. Samples: 146181600. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-07-02 12:02:46,096][36761] Avg episode reward: [(0, '0.025')] [2024-07-02 12:02:46,206][36999] Updated weights for policy 0, policy_version 8920 (0.0028) [2024-07-02 12:02:50,050][36999] Updated weights for policy 0, policy_version 8930 (0.0026) [2024-07-02 12:02:51,095][36761] Fps is (10 sec: 45875.9, 60 sec: 43417.7, 300 sec: 43487.0). Total num frames: 146325504. Throughput: 0: 43310.4. Samples: 146442960. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-07-02 12:02:51,096][36761] Avg episode reward: [(0, '0.026')] [2024-07-02 12:02:53,790][36999] Updated weights for policy 0, policy_version 8940 (0.0027) [2024-07-02 12:02:56,095][36761] Fps is (10 sec: 39321.4, 60 sec: 43417.5, 300 sec: 43320.4). Total num frames: 146522112. Throughput: 0: 43431.1. Samples: 146709080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 12:02:56,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 12:02:57,558][36999] Updated weights for policy 0, policy_version 8950 (0.0034) [2024-07-02 12:03:01,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43417.7, 300 sec: 43542.6). Total num frames: 146784256. Throughput: 0: 43436.0. Samples: 146838680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 12:03:01,096][36761] Avg episode reward: [(0, '0.020')] [2024-07-02 12:03:01,201][36999] Updated weights for policy 0, policy_version 8960 (0.0040) [2024-07-02 12:03:05,036][36999] Updated weights for policy 0, policy_version 8970 (0.0035) [2024-07-02 12:03:06,095][36761] Fps is (10 sec: 44237.5, 60 sec: 43144.7, 300 sec: 43487.0). Total num frames: 146964480. Throughput: 0: 43466.7. Samples: 147093120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 12:03:06,096][36761] Avg episode reward: [(0, '0.036')] [2024-07-02 12:03:06,201][36979] Saving new best policy, reward=0.036! [2024-07-02 12:03:08,842][36999] Updated weights for policy 0, policy_version 8980 (0.0024) [2024-07-02 12:03:11,095][36761] Fps is (10 sec: 39321.3, 60 sec: 43417.7, 300 sec: 43375.9). Total num frames: 147177472. Throughput: 0: 43464.4. Samples: 147363980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 12:03:11,100][36761] Avg episode reward: [(0, '0.036')] [2024-07-02 12:03:12,519][36979] Signal inference workers to stop experience collection... (2050 times) [2024-07-02 12:03:12,564][36999] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-07-02 12:03:12,574][36979] Signal inference workers to resume experience collection... (2050 times) [2024-07-02 12:03:12,585][36999] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-07-02 12:03:12,710][36999] Updated weights for policy 0, policy_version 8990 (0.0039) [2024-07-02 12:03:16,095][36761] Fps is (10 sec: 47513.5, 60 sec: 43963.7, 300 sec: 43598.1). Total num frames: 147439616. Throughput: 0: 43498.6. Samples: 147489160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-07-02 12:03:16,096][36761] Avg episode reward: [(0, '0.029')] [2024-07-02 12:03:16,411][36999] Updated weights for policy 0, policy_version 9000 (0.0044) [2024-07-02 12:03:20,827][36999] Updated weights for policy 0, policy_version 9010 (0.0039) [2024-07-02 12:03:21,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 43542.6). Total num frames: 147619840. Throughput: 0: 43491.5. Samples: 147744880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-07-02 12:03:21,096][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 12:03:23,888][36999] Updated weights for policy 0, policy_version 9020 (0.0027) [2024-07-02 12:03:26,095][36761] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 147832832. Throughput: 0: 43349.9. Samples: 148005160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 12:03:26,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 12:03:28,739][36999] Updated weights for policy 0, policy_version 9030 (0.0036) [2024-07-02 12:03:31,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43417.5, 300 sec: 43598.1). Total num frames: 148078592. Throughput: 0: 43470.7. Samples: 148137780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-07-02 12:03:31,095][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 12:03:31,318][36999] Updated weights for policy 0, policy_version 9040 (0.0028) [2024-07-02 12:03:36,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43417.5, 300 sec: 43487.0). Total num frames: 148258816. Throughput: 0: 43464.0. Samples: 148398840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-07-02 12:03:36,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 12:03:36,160][36999] Updated weights for policy 0, policy_version 9050 (0.0037) [2024-07-02 12:03:38,776][36999] Updated weights for policy 0, policy_version 9060 (0.0037) [2024-07-02 12:03:41,095][36761] Fps is (10 sec: 40959.7, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 148488192. Throughput: 0: 43401.8. Samples: 148662160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 12:03:41,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 12:03:43,780][36999] Updated weights for policy 0, policy_version 9070 (0.0031) [2024-07-02 12:03:46,095][36761] Fps is (10 sec: 47513.3, 60 sec: 43417.6, 300 sec: 43598.1). Total num frames: 148733952. Throughput: 0: 43523.0. Samples: 148797220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-07-02 12:03:46,096][36761] Avg episode reward: [(0, '0.013')] [2024-07-02 12:03:46,274][36999] Updated weights for policy 0, policy_version 9080 (0.0033) [2024-07-02 12:03:51,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 43487.0). Total num frames: 148914176. Throughput: 0: 43531.9. Samples: 149052060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-07-02 12:03:51,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 12:03:51,216][36999] Updated weights for policy 0, policy_version 9090 (0.0031) [2024-07-02 12:03:54,254][36999] Updated weights for policy 0, policy_version 9100 (0.0031) [2024-07-02 12:03:56,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 149143552. Throughput: 0: 43358.7. Samples: 149315120. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-07-02 12:03:56,096][36761] Avg episode reward: [(0, '0.032')] [2024-07-02 12:03:58,709][36999] Updated weights for policy 0, policy_version 9110 (0.0023) [2024-07-02 12:04:01,095][36761] Fps is (10 sec: 47513.8, 60 sec: 43417.6, 300 sec: 43709.2). Total num frames: 149389312. Throughput: 0: 43471.6. Samples: 149445380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 12:04:01,096][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 12:04:01,842][36999] Updated weights for policy 0, policy_version 9120 (0.0033) [2024-07-02 12:04:06,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 43487.0). Total num frames: 149569536. Throughput: 0: 43554.6. Samples: 149704840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 12:04:06,096][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 12:04:06,152][36999] Updated weights for policy 0, policy_version 9130 (0.0032) [2024-07-02 12:04:09,148][36999] Updated weights for policy 0, policy_version 9140 (0.0022) [2024-07-02 12:04:11,097][36761] Fps is (10 sec: 40952.4, 60 sec: 43689.4, 300 sec: 43431.2). Total num frames: 149798912. Throughput: 0: 43724.5. Samples: 149972840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 12:04:11,098][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 12:04:13,737][36999] Updated weights for policy 0, policy_version 9150 (0.0027) [2024-07-02 12:04:16,100][36761] Fps is (10 sec: 47492.1, 60 sec: 43414.3, 300 sec: 43652.9). Total num frames: 150044672. Throughput: 0: 43593.3. Samples: 150099680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-07-02 12:04:16,100][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 12:04:16,680][36999] Updated weights for policy 0, policy_version 9160 (0.0036) [2024-07-02 12:04:21,064][36979] Signal inference workers to stop experience collection... (2100 times) [2024-07-02 12:04:21,064][36979] Signal inference workers to resume experience collection... (2100 times) [2024-07-02 12:04:21,080][36999] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-07-02 12:04:21,080][36999] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-07-02 12:04:21,095][36761] Fps is (10 sec: 42606.4, 60 sec: 43417.6, 300 sec: 43487.0). Total num frames: 150224896. Throughput: 0: 43502.7. Samples: 150356460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-07-02 12:04:21,095][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 12:04:21,214][36999] Updated weights for policy 0, policy_version 9170 (0.0027) [2024-07-02 12:04:24,036][36999] Updated weights for policy 0, policy_version 9180 (0.0044) [2024-07-02 12:04:26,095][36761] Fps is (10 sec: 40978.7, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 150454272. Throughput: 0: 43605.4. Samples: 150624400. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-07-02 12:04:26,096][36761] Avg episode reward: [(0, '0.022')] [2024-07-02 12:04:28,625][36999] Updated weights for policy 0, policy_version 9190 (0.0029) [2024-07-02 12:04:31,095][36761] Fps is (10 sec: 47513.8, 60 sec: 43690.7, 300 sec: 43709.2). Total num frames: 150700032. Throughput: 0: 43465.9. Samples: 150753180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 12:04:31,095][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 12:04:31,567][36999] Updated weights for policy 0, policy_version 9200 (0.0023) [2024-07-02 12:04:36,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 43487.0). Total num frames: 150880256. Throughput: 0: 43545.4. Samples: 151011600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 12:04:36,096][36761] Avg episode reward: [(0, '0.023')] [2024-07-02 12:04:36,223][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000009210_150896640.pth... [2024-07-02 12:04:36,226][36999] Updated weights for policy 0, policy_version 9210 (0.0027) [2024-07-02 12:04:36,276][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000008571_140427264.pth [2024-07-02 12:04:39,281][36999] Updated weights for policy 0, policy_version 9220 (0.0034) [2024-07-02 12:04:41,095][36761] Fps is (10 sec: 40959.8, 60 sec: 43690.7, 300 sec: 43487.0). Total num frames: 151109632. Throughput: 0: 43492.5. Samples: 151272280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 12:04:41,096][36761] Avg episode reward: [(0, '0.014')] [2024-07-02 12:04:43,902][36999] Updated weights for policy 0, policy_version 9230 (0.0040) [2024-07-02 12:04:46,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 43598.1). Total num frames: 151339008. Throughput: 0: 43569.3. Samples: 151406000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-07-02 12:04:46,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 12:04:46,703][36999] Updated weights for policy 0, policy_version 9240 (0.0037) [2024-07-02 12:04:51,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 43487.0). Total num frames: 151535616. Throughput: 0: 43474.7. Samples: 151661200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-07-02 12:04:51,096][36761] Avg episode reward: [(0, '0.022')] [2024-07-02 12:04:51,430][36999] Updated weights for policy 0, policy_version 9250 (0.0024) [2024-07-02 12:04:54,237][36999] Updated weights for policy 0, policy_version 9260 (0.0039) [2024-07-02 12:04:56,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 43431.5). Total num frames: 151748608. Throughput: 0: 43423.5. Samples: 151926820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 12:04:56,096][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 12:04:58,958][36999] Updated weights for policy 0, policy_version 9270 (0.0028) [2024-07-02 12:05:01,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 43653.6). Total num frames: 151994368. Throughput: 0: 43556.0. Samples: 152059500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-07-02 12:05:01,100][36761] Avg episode reward: [(0, '0.016')] [2024-07-02 12:05:01,740][36999] Updated weights for policy 0, policy_version 9280 (0.0043) [2024-07-02 12:05:06,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 43487.0). Total num frames: 152174592. Throughput: 0: 43585.7. Samples: 152317820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-07-02 12:05:06,096][36761] Avg episode reward: [(0, '0.020')] [2024-07-02 12:05:06,599][36999] Updated weights for policy 0, policy_version 9290 (0.0034) [2024-07-02 12:05:09,366][36999] Updated weights for policy 0, policy_version 9300 (0.0033) [2024-07-02 12:05:11,095][36761] Fps is (10 sec: 39321.4, 60 sec: 43145.8, 300 sec: 43375.9). Total num frames: 152387584. Throughput: 0: 43362.6. Samples: 152575720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 12:05:11,097][36761] Avg episode reward: [(0, '0.020')] [2024-07-02 12:05:14,162][36999] Updated weights for policy 0, policy_version 9310 (0.0035) [2024-07-02 12:05:16,095][36761] Fps is (10 sec: 45875.4, 60 sec: 43147.8, 300 sec: 43598.1). Total num frames: 152633344. Throughput: 0: 43436.8. Samples: 152707840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-07-02 12:05:16,096][36761] Avg episode reward: [(0, '0.025')] [2024-07-02 12:05:17,015][36999] Updated weights for policy 0, policy_version 9320 (0.0030) [2024-07-02 12:05:21,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43417.5, 300 sec: 43542.6). Total num frames: 152829952. Throughput: 0: 43348.3. Samples: 152962280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-07-02 12:05:21,096][36761] Avg episode reward: [(0, '0.027')] [2024-07-02 12:05:21,721][36999] Updated weights for policy 0, policy_version 9330 (0.0032) [2024-07-02 12:05:24,762][36999] Updated weights for policy 0, policy_version 9340 (0.0035) [2024-07-02 12:05:26,095][36761] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 43431.4). Total num frames: 153042944. Throughput: 0: 43260.3. Samples: 153219000. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-07-02 12:05:26,096][36761] Avg episode reward: [(0, '0.027')] [2024-07-02 12:05:29,171][36999] Updated weights for policy 0, policy_version 9350 (0.0049) [2024-07-02 12:05:31,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 43598.1). Total num frames: 153288704. Throughput: 0: 43193.7. Samples: 153349720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-07-02 12:05:31,096][36761] Avg episode reward: [(0, '0.030')] [2024-07-02 12:05:32,526][36999] Updated weights for policy 0, policy_version 9360 (0.0035) [2024-07-02 12:05:36,095][36761] Fps is (10 sec: 44237.5, 60 sec: 43417.6, 300 sec: 43542.6). Total num frames: 153485312. Throughput: 0: 43326.7. Samples: 153610900. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-07-02 12:05:36,096][36761] Avg episode reward: [(0, '0.033')] [2024-07-02 12:05:36,619][36999] Updated weights for policy 0, policy_version 9370 (0.0025) [2024-07-02 12:05:40,248][36999] Updated weights for policy 0, policy_version 9380 (0.0037) [2024-07-02 12:05:41,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 43376.3). Total num frames: 153698304. Throughput: 0: 43208.4. Samples: 153871200. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-07-02 12:05:41,096][36761] Avg episode reward: [(0, '0.029')] [2024-07-02 12:05:44,401][36999] Updated weights for policy 0, policy_version 9390 (0.0039) [2024-07-02 12:05:45,619][36979] Signal inference workers to stop experience collection... (2150 times) [2024-07-02 12:05:45,663][36999] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-07-02 12:05:45,680][36979] Signal inference workers to resume experience collection... (2150 times) [2024-07-02 12:05:45,680][36999] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-07-02 12:05:46,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 43542.6). Total num frames: 153927680. Throughput: 0: 43161.3. Samples: 154001760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-07-02 12:05:46,096][36761] Avg episode reward: [(0, '0.019')] [2024-07-02 12:05:47,749][36999] Updated weights for policy 0, policy_version 9400 (0.0036) [2024-07-02 12:05:51,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 43432.2). Total num frames: 154124288. Throughput: 0: 42985.4. Samples: 154252160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-07-02 12:05:51,096][36761] Avg episode reward: [(0, '0.023')] [2024-07-02 12:05:51,938][36999] Updated weights for policy 0, policy_version 9410 (0.0034) [2024-07-02 12:05:55,384][36999] Updated weights for policy 0, policy_version 9420 (0.0041) [2024-07-02 12:05:56,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 43487.0). Total num frames: 154353664. Throughput: 0: 43092.5. Samples: 154514880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-07-02 12:05:56,096][36761] Avg episode reward: [(0, '0.023')] [2024-07-02 12:05:59,357][36999] Updated weights for policy 0, policy_version 9430 (0.0026) [2024-07-02 12:06:01,100][36761] Fps is (10 sec: 45854.1, 60 sec: 43141.3, 300 sec: 43597.4). Total num frames: 154583040. Throughput: 0: 43060.1. Samples: 154645740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-07-02 12:06:01,100][36761] Avg episode reward: [(0, '0.020')] [2024-07-02 12:06:02,879][36999] Updated weights for policy 0, policy_version 9440 (0.0029) [2024-07-02 12:06:06,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 43431.5). Total num frames: 154779648. Throughput: 0: 43269.4. Samples: 154909400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-07-02 12:06:06,096][36761] Avg episode reward: [(0, '0.020')] [2024-07-02 12:06:06,874][36999] Updated weights for policy 0, policy_version 9450 (0.0019) [2024-07-02 12:06:10,605][36999] Updated weights for policy 0, policy_version 9460 (0.0036) [2024-07-02 12:06:11,095][36761] Fps is (10 sec: 40978.8, 60 sec: 43417.7, 300 sec: 43431.5). Total num frames: 154992640. Throughput: 0: 43305.0. Samples: 155167720. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-07-02 12:06:11,096][36761] Avg episode reward: [(0, '0.026')] [2024-07-02 12:06:14,316][36999] Updated weights for policy 0, policy_version 9470 (0.0038) [2024-07-02 12:06:16,095][36761] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 43598.1). Total num frames: 155238400. Throughput: 0: 43340.9. Samples: 155300060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-07-02 12:06:16,096][36761] Avg episode reward: [(0, '0.029')] [2024-07-02 12:06:18,177][36999] Updated weights for policy 0, policy_version 9480 (0.0040) [2024-07-02 12:06:21,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 43432.2). Total num frames: 155435008. Throughput: 0: 43322.7. Samples: 155560420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 12:06:21,096][36761] Avg episode reward: [(0, '0.018')] [2024-07-02 12:06:21,847][36999] Updated weights for policy 0, policy_version 9490 (0.0040) [2024-07-02 12:06:25,794][36999] Updated weights for policy 0, policy_version 9500 (0.0031) [2024-07-02 12:06:26,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43417.7, 300 sec: 43375.9). Total num frames: 155648000. Throughput: 0: 43237.3. Samples: 155816880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:06:26,096][36761] Avg episode reward: [(0, '0.025')] [2024-07-02 12:06:29,617][36999] Updated weights for policy 0, policy_version 9510 (0.0031) [2024-07-02 12:06:31,100][36761] Fps is (10 sec: 42578.9, 60 sec: 42868.2, 300 sec: 43430.8). Total num frames: 155860992. Throughput: 0: 43189.0. Samples: 155945460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:06:31,100][36761] Avg episode reward: [(0, '0.027')] [2024-07-02 12:06:33,321][36999] Updated weights for policy 0, policy_version 9520 (0.0026) [2024-07-02 12:06:36,096][36761] Fps is (10 sec: 45874.2, 60 sec: 43690.5, 300 sec: 43487.0). Total num frames: 156106752. Throughput: 0: 43569.5. Samples: 156212800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-07-02 12:06:36,096][36761] Avg episode reward: [(0, '0.027')] [2024-07-02 12:06:36,117][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000009528_156106752.pth... [2024-07-02 12:06:36,169][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000008891_145670144.pth [2024-07-02 12:06:37,226][36999] Updated weights for policy 0, policy_version 9530 (0.0033) [2024-07-02 12:06:40,881][36999] Updated weights for policy 0, policy_version 9540 (0.0038) [2024-07-02 12:06:41,095][36761] Fps is (10 sec: 44256.5, 60 sec: 43417.5, 300 sec: 43320.4). Total num frames: 156303360. Throughput: 0: 43537.7. Samples: 156474080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 12:06:41,096][36761] Avg episode reward: [(0, '0.024')] [2024-07-02 12:06:44,629][36999] Updated weights for policy 0, policy_version 9550 (0.0038) [2024-07-02 12:06:46,095][36761] Fps is (10 sec: 42599.5, 60 sec: 43417.7, 300 sec: 43431.5). Total num frames: 156532736. Throughput: 0: 43473.3. Samples: 156601840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 12:06:46,096][36761] Avg episode reward: [(0, '0.024')] [2024-07-02 12:06:48,378][36999] Updated weights for policy 0, policy_version 9560 (0.0030) [2024-07-02 12:06:51,100][36761] Fps is (10 sec: 45854.9, 60 sec: 43960.4, 300 sec: 43541.9). Total num frames: 156762112. Throughput: 0: 43521.3. Samples: 156868060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 12:06:51,101][36761] Avg episode reward: [(0, '0.024')] [2024-07-02 12:06:52,066][36999] Updated weights for policy 0, policy_version 9570 (0.0032) [2024-07-02 12:06:55,803][36999] Updated weights for policy 0, policy_version 9580 (0.0027) [2024-07-02 12:06:56,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 43320.4). Total num frames: 156958720. Throughput: 0: 43745.8. Samples: 157136280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 12:06:56,095][36761] Avg episode reward: [(0, '0.022')] [2024-07-02 12:06:59,663][36999] Updated weights for policy 0, policy_version 9590 (0.0031) [2024-07-02 12:07:01,095][36761] Fps is (10 sec: 44256.7, 60 sec: 43693.9, 300 sec: 43487.0). Total num frames: 157204480. Throughput: 0: 43617.7. Samples: 157262860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 12:07:01,096][36761] Avg episode reward: [(0, '0.024')] [2024-07-02 12:07:03,194][36999] Updated weights for policy 0, policy_version 9600 (0.0025) [2024-07-02 12:07:06,095][36761] Fps is (10 sec: 45874.5, 60 sec: 43963.6, 300 sec: 43542.6). Total num frames: 157417472. Throughput: 0: 43729.7. Samples: 157528260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-07-02 12:07:06,096][36761] Avg episode reward: [(0, '0.023')] [2024-07-02 12:07:07,072][36999] Updated weights for policy 0, policy_version 9610 (0.0040) [2024-07-02 12:07:10,612][36999] Updated weights for policy 0, policy_version 9620 (0.0049) [2024-07-02 12:07:11,096][36761] Fps is (10 sec: 42596.5, 60 sec: 43963.3, 300 sec: 43486.9). Total num frames: 157630464. Throughput: 0: 43690.2. Samples: 157782960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-07-02 12:07:11,096][36761] Avg episode reward: [(0, '0.024')] [2024-07-02 12:07:14,853][36999] Updated weights for policy 0, policy_version 9630 (0.0051) [2024-07-02 12:07:16,095][36761] Fps is (10 sec: 42599.0, 60 sec: 43417.7, 300 sec: 43487.0). Total num frames: 157843456. Throughput: 0: 43726.3. Samples: 157912940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 12:07:16,095][36761] Avg episode reward: [(0, '0.030')] [2024-07-02 12:07:18,386][36999] Updated weights for policy 0, policy_version 9640 (0.0039) [2024-07-02 12:07:20,607][36979] Signal inference workers to stop experience collection... (2200 times) [2024-07-02 12:07:20,607][36979] Signal inference workers to resume experience collection... (2200 times) [2024-07-02 12:07:20,653][36999] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-07-02 12:07:20,653][36999] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-07-02 12:07:21,095][36761] Fps is (10 sec: 44239.1, 60 sec: 43963.7, 300 sec: 43598.1). Total num frames: 158072832. Throughput: 0: 43680.7. Samples: 158178420. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-07-02 12:07:21,096][36761] Avg episode reward: [(0, '0.024')] [2024-07-02 12:07:22,311][36999] Updated weights for policy 0, policy_version 9650 (0.0031) [2024-07-02 12:07:25,737][36999] Updated weights for policy 0, policy_version 9660 (0.0045) [2024-07-02 12:07:26,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 43375.9). Total num frames: 158269440. Throughput: 0: 43671.7. Samples: 158439300. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-07-02 12:07:26,096][36761] Avg episode reward: [(0, '0.024')] [2024-07-02 12:07:29,759][36999] Updated weights for policy 0, policy_version 9670 (0.0023) [2024-07-02 12:07:31,098][36761] Fps is (10 sec: 42585.6, 60 sec: 43964.9, 300 sec: 43542.1). Total num frames: 158498816. Throughput: 0: 43719.7. Samples: 158569360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 12:07:31,099][36761] Avg episode reward: [(0, '0.024')] [2024-07-02 12:07:33,139][36999] Updated weights for policy 0, policy_version 9680 (0.0029) [2024-07-02 12:07:36,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 43487.0). Total num frames: 158695424. Throughput: 0: 43669.8. Samples: 158833000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 12:07:36,096][36761] Avg episode reward: [(0, '0.023')] [2024-07-02 12:07:37,107][36999] Updated weights for policy 0, policy_version 9690 (0.0033) [2024-07-02 12:07:40,606][36999] Updated weights for policy 0, policy_version 9700 (0.0026) [2024-07-02 12:07:41,095][36761] Fps is (10 sec: 42611.4, 60 sec: 43690.8, 300 sec: 43376.0). Total num frames: 158924800. Throughput: 0: 43512.4. Samples: 159094340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 12:07:41,096][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 12:07:44,606][36999] Updated weights for policy 0, policy_version 9710 (0.0028) [2024-07-02 12:07:46,095][36761] Fps is (10 sec: 45874.8, 60 sec: 43690.6, 300 sec: 43487.0). Total num frames: 159154176. Throughput: 0: 43717.8. Samples: 159230160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:07:46,096][36761] Avg episode reward: [(0, '0.021')] [2024-07-02 12:07:48,015][36999] Updated weights for policy 0, policy_version 9720 (0.0042) [2024-07-02 12:07:51,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43147.9, 300 sec: 43487.0). Total num frames: 159350784. Throughput: 0: 43469.0. Samples: 159484360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:07:51,096][36761] Avg episode reward: [(0, '0.025')] [2024-07-02 12:07:52,103][36999] Updated weights for policy 0, policy_version 9730 (0.0027) [2024-07-02 12:07:55,839][36999] Updated weights for policy 0, policy_version 9740 (0.0036) [2024-07-02 12:07:56,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.5, 300 sec: 43375.9). Total num frames: 159580160. Throughput: 0: 43648.4. Samples: 159747120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-07-02 12:07:56,096][36761] Avg episode reward: [(0, '0.035')] [2024-07-02 12:07:59,644][36999] Updated weights for policy 0, policy_version 9750 (0.0026) [2024-07-02 12:08:01,100][36761] Fps is (10 sec: 45853.8, 60 sec: 43414.3, 300 sec: 43541.9). Total num frames: 159809536. Throughput: 0: 43619.9. Samples: 159876040. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-07-02 12:08:01,101][36761] Avg episode reward: [(0, '0.038')] [2024-07-02 12:08:01,101][36979] Saving new best policy, reward=0.038! [2024-07-02 12:08:03,353][36999] Updated weights for policy 0, policy_version 9760 (0.0026) [2024-07-02 12:08:06,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 43487.0). Total num frames: 160006144. Throughput: 0: 43558.5. Samples: 160138560. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-07-02 12:08:06,096][36761] Avg episode reward: [(0, '0.030')] [2024-07-02 12:08:07,125][36999] Updated weights for policy 0, policy_version 9770 (0.0028) [2024-07-02 12:08:11,005][36999] Updated weights for policy 0, policy_version 9780 (0.0025) [2024-07-02 12:08:11,100][36761] Fps is (10 sec: 42598.5, 60 sec: 43414.7, 300 sec: 43375.3). Total num frames: 160235520. Throughput: 0: 43611.1. Samples: 160402000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-07-02 12:08:11,101][36761] Avg episode reward: [(0, '0.033')] [2024-07-02 12:08:14,772][36999] Updated weights for policy 0, policy_version 9790 (0.0025) [2024-07-02 12:08:16,095][36761] Fps is (10 sec: 45876.1, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 160464896. Throughput: 0: 43659.8. Samples: 160533920. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-07-02 12:08:16,096][36761] Avg episode reward: [(0, '0.030')] [2024-07-02 12:08:18,547][36999] Updated weights for policy 0, policy_version 9800 (0.0031) [2024-07-02 12:08:21,095][36761] Fps is (10 sec: 42618.2, 60 sec: 43144.6, 300 sec: 43487.0). Total num frames: 160661504. Throughput: 0: 43575.6. Samples: 160793900. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-07-02 12:08:21,095][36761] Avg episode reward: [(0, '0.026')] [2024-07-02 12:08:22,331][36999] Updated weights for policy 0, policy_version 9810 (0.0036) [2024-07-02 12:08:25,996][36999] Updated weights for policy 0, policy_version 9820 (0.0035) [2024-07-02 12:08:26,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 160890880. Throughput: 0: 43524.4. Samples: 161052940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 12:08:26,096][36761] Avg episode reward: [(0, '0.024')] [2024-07-02 12:08:26,925][36979] Signal inference workers to stop experience collection... (2250 times) [2024-07-02 12:08:26,926][36979] Signal inference workers to resume experience collection... (2250 times) [2024-07-02 12:08:26,956][36999] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-07-02 12:08:26,956][36999] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-07-02 12:08:29,821][36999] Updated weights for policy 0, policy_version 9830 (0.0033) [2024-07-02 12:08:31,096][36761] Fps is (10 sec: 47510.7, 60 sec: 43965.5, 300 sec: 43653.6). Total num frames: 161136640. Throughput: 0: 43418.2. Samples: 161184000. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-07-02 12:08:31,097][36761] Avg episode reward: [(0, '0.026')] [2024-07-02 12:08:33,387][36999] Updated weights for policy 0, policy_version 9840 (0.0027) [2024-07-02 12:08:36,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43690.6, 300 sec: 43487.0). Total num frames: 161316864. Throughput: 0: 43618.5. Samples: 161447200. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-07-02 12:08:36,096][36761] Avg episode reward: [(0, '0.026')] [2024-07-02 12:08:36,119][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000009846_161316864.pth... [2024-07-02 12:08:36,179][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000009210_150896640.pth [2024-07-02 12:08:37,302][36999] Updated weights for policy 0, policy_version 9850 (0.0027) [2024-07-02 12:08:40,789][36999] Updated weights for policy 0, policy_version 9860 (0.0030) [2024-07-02 12:08:41,095][36761] Fps is (10 sec: 40962.3, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 161546240. Throughput: 0: 43665.0. Samples: 161712040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 12:08:41,096][36761] Avg episode reward: [(0, '0.029')] [2024-07-02 12:08:44,630][36999] Updated weights for policy 0, policy_version 9870 (0.0035) [2024-07-02 12:08:46,095][36761] Fps is (10 sec: 45876.0, 60 sec: 43690.7, 300 sec: 43598.1). Total num frames: 161775616. Throughput: 0: 43878.8. Samples: 161850380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 12:08:46,096][36761] Avg episode reward: [(0, '0.037')] [2024-07-02 12:08:48,167][36999] Updated weights for policy 0, policy_version 9880 (0.0036) [2024-07-02 12:08:51,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 43487.0). Total num frames: 161972224. Throughput: 0: 43705.1. Samples: 162105280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-07-02 12:08:51,096][36761] Avg episode reward: [(0, '0.031')] [2024-07-02 12:08:52,165][36999] Updated weights for policy 0, policy_version 9890 (0.0032) [2024-07-02 12:08:55,607][36999] Updated weights for policy 0, policy_version 9900 (0.0027) [2024-07-02 12:08:56,100][36761] Fps is (10 sec: 42576.3, 60 sec: 43687.0, 300 sec: 43430.7). Total num frames: 162201600. Throughput: 0: 43591.9. Samples: 162363660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-07-02 12:08:56,101][36761] Avg episode reward: [(0, '0.025')] [2024-07-02 12:08:59,700][36999] Updated weights for policy 0, policy_version 9910 (0.0035) [2024-07-02 12:09:01,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43694.1, 300 sec: 43598.1). Total num frames: 162430976. Throughput: 0: 43821.8. Samples: 162505900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 12:09:01,096][36761] Avg episode reward: [(0, '0.025')] [2024-07-02 12:09:02,997][36999] Updated weights for policy 0, policy_version 9920 (0.0039) [2024-07-02 12:09:06,095][36761] Fps is (10 sec: 42620.4, 60 sec: 43690.8, 300 sec: 43487.3). Total num frames: 162627584. Throughput: 0: 43671.5. Samples: 162759120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 12:09:06,096][36761] Avg episode reward: [(0, '0.030')] [2024-07-02 12:09:07,372][36999] Updated weights for policy 0, policy_version 9930 (0.0033) [2024-07-02 12:09:10,471][36999] Updated weights for policy 0, policy_version 9940 (0.0029) [2024-07-02 12:09:11,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43694.0, 300 sec: 43432.2). Total num frames: 162856960. Throughput: 0: 43708.0. Samples: 163019800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 12:09:11,096][36761] Avg episode reward: [(0, '0.030')] [2024-07-02 12:09:14,877][36999] Updated weights for policy 0, policy_version 9950 (0.0031) [2024-07-02 12:09:16,095][36761] Fps is (10 sec: 45874.6, 60 sec: 43690.6, 300 sec: 43598.1). Total num frames: 163086336. Throughput: 0: 43841.3. Samples: 163156840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 12:09:16,096][36761] Avg episode reward: [(0, '0.030')] [2024-07-02 12:09:17,925][36999] Updated weights for policy 0, policy_version 9960 (0.0040) [2024-07-02 12:09:21,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 43487.0). Total num frames: 163282944. Throughput: 0: 43753.4. Samples: 163416100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 12:09:21,096][36761] Avg episode reward: [(0, '0.033')] [2024-07-02 12:09:22,215][36999] Updated weights for policy 0, policy_version 9970 (0.0029) [2024-07-02 12:09:25,398][36999] Updated weights for policy 0, policy_version 9980 (0.0022) [2024-07-02 12:09:26,095][36761] Fps is (10 sec: 44237.4, 60 sec: 43963.7, 300 sec: 43487.0). Total num frames: 163528704. Throughput: 0: 43597.4. Samples: 163673920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 12:09:26,096][36761] Avg episode reward: [(0, '0.035')] [2024-07-02 12:09:29,837][36999] Updated weights for policy 0, policy_version 9990 (0.0032) [2024-07-02 12:09:31,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43418.0, 300 sec: 43598.1). Total num frames: 163741696. Throughput: 0: 43616.8. Samples: 163813140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-07-02 12:09:31,096][36761] Avg episode reward: [(0, '0.035')] [2024-07-02 12:09:32,846][36999] Updated weights for policy 0, policy_version 10000 (0.0032) [2024-07-02 12:09:36,095][36761] Fps is (10 sec: 40959.7, 60 sec: 43690.7, 300 sec: 43487.0). Total num frames: 163938304. Throughput: 0: 43595.0. Samples: 164067060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-07-02 12:09:36,096][36761] Avg episode reward: [(0, '0.027')] [2024-07-02 12:09:37,590][36999] Updated weights for policy 0, policy_version 10010 (0.0042) [2024-07-02 12:09:40,257][36999] Updated weights for policy 0, policy_version 10020 (0.0039) [2024-07-02 12:09:41,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43963.7, 300 sec: 43542.6). Total num frames: 164184064. Throughput: 0: 43664.1. Samples: 164328320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-07-02 12:09:41,096][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:09:44,917][36999] Updated weights for policy 0, policy_version 10030 (0.0034) [2024-07-02 12:09:46,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 43542.6). Total num frames: 164380672. Throughput: 0: 43581.2. Samples: 164467060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-07-02 12:09:46,096][36761] Avg episode reward: [(0, '0.029')] [2024-07-02 12:09:48,049][36999] Updated weights for policy 0, policy_version 10040 (0.0032) [2024-07-02 12:09:51,098][36761] Fps is (10 sec: 40950.8, 60 sec: 43689.0, 300 sec: 43542.2). Total num frames: 164593664. Throughput: 0: 43671.1. Samples: 164724420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-07-02 12:09:51,098][36761] Avg episode reward: [(0, '0.029')] [2024-07-02 12:09:52,290][36999] Updated weights for policy 0, policy_version 10050 (0.0038) [2024-07-02 12:09:52,983][36979] Signal inference workers to stop experience collection... (2300 times) [2024-07-02 12:09:52,988][36979] Signal inference workers to resume experience collection... (2300 times) [2024-07-02 12:09:53,040][36999] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-07-02 12:09:53,040][36999] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-07-02 12:09:55,439][36999] Updated weights for policy 0, policy_version 10060 (0.0037) [2024-07-02 12:09:56,095][36761] Fps is (10 sec: 44237.4, 60 sec: 43694.4, 300 sec: 43487.0). Total num frames: 164823040. Throughput: 0: 43736.5. Samples: 164987940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 12:09:56,096][36761] Avg episode reward: [(0, '0.029')] [2024-07-02 12:09:59,718][36999] Updated weights for policy 0, policy_version 10070 (0.0037) [2024-07-02 12:10:01,095][36761] Fps is (10 sec: 44246.5, 60 sec: 43417.5, 300 sec: 43598.1). Total num frames: 165036032. Throughput: 0: 43737.8. Samples: 165125040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 12:10:01,096][36761] Avg episode reward: [(0, '0.042')] [2024-07-02 12:10:01,107][36979] Saving new best policy, reward=0.042! [2024-07-02 12:10:02,946][36999] Updated weights for policy 0, policy_version 10080 (0.0037) [2024-07-02 12:10:06,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 43598.1). Total num frames: 165249024. Throughput: 0: 43631.6. Samples: 165379520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 12:10:06,096][36761] Avg episode reward: [(0, '0.039')] [2024-07-02 12:10:07,085][36999] Updated weights for policy 0, policy_version 10090 (0.0036) [2024-07-02 12:10:10,546][36999] Updated weights for policy 0, policy_version 10100 (0.0019) [2024-07-02 12:10:11,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43963.7, 300 sec: 43598.1). Total num frames: 165494784. Throughput: 0: 43790.7. Samples: 165644500. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-07-02 12:10:11,096][36761] Avg episode reward: [(0, '0.039')] [2024-07-02 12:10:14,484][36999] Updated weights for policy 0, policy_version 10110 (0.0041) [2024-07-02 12:10:16,095][36761] Fps is (10 sec: 45875.0, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 165707776. Throughput: 0: 43831.1. Samples: 165785540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-07-02 12:10:16,096][36761] Avg episode reward: [(0, '0.040')] [2024-07-02 12:10:18,261][36999] Updated weights for policy 0, policy_version 10120 (0.0043) [2024-07-02 12:10:21,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43690.7, 300 sec: 43598.1). Total num frames: 165904384. Throughput: 0: 43753.4. Samples: 166035960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:10:21,096][36761] Avg episode reward: [(0, '0.030')] [2024-07-02 12:10:22,287][36999] Updated weights for policy 0, policy_version 10130 (0.0033) [2024-07-02 12:10:25,717][36999] Updated weights for policy 0, policy_version 10140 (0.0031) [2024-07-02 12:10:26,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43690.6, 300 sec: 43598.1). Total num frames: 166150144. Throughput: 0: 43849.3. Samples: 166301540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 12:10:26,096][36761] Avg episode reward: [(0, '0.033')] [2024-07-02 12:10:29,702][36999] Updated weights for policy 0, policy_version 10150 (0.0025) [2024-07-02 12:10:31,095][36761] Fps is (10 sec: 47513.1, 60 sec: 43963.7, 300 sec: 43709.2). Total num frames: 166379520. Throughput: 0: 43802.2. Samples: 166438160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 12:10:31,096][36761] Avg episode reward: [(0, '0.039')] [2024-07-02 12:10:33,040][36999] Updated weights for policy 0, policy_version 10160 (0.0038) [2024-07-02 12:10:36,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43963.7, 300 sec: 43653.6). Total num frames: 166576128. Throughput: 0: 43939.9. Samples: 166701620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 12:10:36,096][36761] Avg episode reward: [(0, '0.043')] [2024-07-02 12:10:36,108][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000010167_166576128.pth... [2024-07-02 12:10:36,163][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000009528_156106752.pth [2024-07-02 12:10:36,167][36979] Saving new best policy, reward=0.043! [2024-07-02 12:10:37,017][36999] Updated weights for policy 0, policy_version 10170 (0.0037) [2024-07-02 12:10:40,461][36999] Updated weights for policy 0, policy_version 10180 (0.0029) [2024-07-02 12:10:41,095][36761] Fps is (10 sec: 42597.9, 60 sec: 43690.5, 300 sec: 43653.6). Total num frames: 166805504. Throughput: 0: 43817.1. Samples: 166959720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 12:10:41,096][36761] Avg episode reward: [(0, '0.036')] [2024-07-02 12:10:44,394][36999] Updated weights for policy 0, policy_version 10190 (0.0037) [2024-07-02 12:10:46,095][36761] Fps is (10 sec: 44237.6, 60 sec: 43963.9, 300 sec: 43709.2). Total num frames: 167018496. Throughput: 0: 43722.4. Samples: 167092540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-07-02 12:10:46,095][36761] Avg episode reward: [(0, '0.041')] [2024-07-02 12:10:47,883][36999] Updated weights for policy 0, policy_version 10200 (0.0029) [2024-07-02 12:10:51,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43965.3, 300 sec: 43653.6). Total num frames: 167231488. Throughput: 0: 43962.1. Samples: 167357820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-07-02 12:10:51,096][36761] Avg episode reward: [(0, '0.024')] [2024-07-02 12:10:51,903][36999] Updated weights for policy 0, policy_version 10210 (0.0028) [2024-07-02 12:10:55,357][36999] Updated weights for policy 0, policy_version 10220 (0.0038) [2024-07-02 12:10:56,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43963.7, 300 sec: 43654.3). Total num frames: 167460864. Throughput: 0: 43853.4. Samples: 167617900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-07-02 12:10:56,096][36761] Avg episode reward: [(0, '0.024')] [2024-07-02 12:10:59,267][36999] Updated weights for policy 0, policy_version 10230 (0.0027) [2024-07-02 12:11:01,095][36761] Fps is (10 sec: 45875.9, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 167690240. Throughput: 0: 43670.3. Samples: 167750700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 12:11:01,096][36761] Avg episode reward: [(0, '0.035')] [2024-07-02 12:11:02,902][36999] Updated weights for policy 0, policy_version 10240 (0.0032) [2024-07-02 12:11:06,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 167870464. Throughput: 0: 44102.2. Samples: 168020560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 12:11:06,096][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:11:06,645][36999] Updated weights for policy 0, policy_version 10250 (0.0026) [2024-07-02 12:11:10,491][36999] Updated weights for policy 0, policy_version 10260 (0.0033) [2024-07-02 12:11:11,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 43653.7). Total num frames: 168116224. Throughput: 0: 43884.1. Samples: 168276320. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-07-02 12:11:11,096][36761] Avg episode reward: [(0, '0.048')] [2024-07-02 12:11:11,100][36979] Saving new best policy, reward=0.048! [2024-07-02 12:11:14,105][36999] Updated weights for policy 0, policy_version 10270 (0.0033) [2024-07-02 12:11:16,095][36761] Fps is (10 sec: 47513.6, 60 sec: 43963.8, 300 sec: 43764.7). Total num frames: 168345600. Throughput: 0: 43816.5. Samples: 168409900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 12:11:16,096][36761] Avg episode reward: [(0, '0.048')] [2024-07-02 12:11:17,896][36999] Updated weights for policy 0, policy_version 10280 (0.0029) [2024-07-02 12:11:21,095][36761] Fps is (10 sec: 44236.2, 60 sec: 44236.7, 300 sec: 43764.7). Total num frames: 168558592. Throughput: 0: 43659.1. Samples: 168666280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 12:11:21,096][36761] Avg episode reward: [(0, '0.032')] [2024-07-02 12:11:22,102][36999] Updated weights for policy 0, policy_version 10290 (0.0032) [2024-07-02 12:11:25,640][36999] Updated weights for policy 0, policy_version 10300 (0.0037) [2024-07-02 12:11:25,653][36979] Signal inference workers to stop experience collection... (2350 times) [2024-07-02 12:11:25,654][36979] Signal inference workers to resume experience collection... (2350 times) [2024-07-02 12:11:25,704][36999] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-07-02 12:11:25,704][36999] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-07-02 12:11:26,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 43765.4). Total num frames: 168771584. Throughput: 0: 43706.8. Samples: 168926520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 12:11:26,096][36761] Avg episode reward: [(0, '0.030')] [2024-07-02 12:11:29,495][36999] Updated weights for policy 0, policy_version 10310 (0.0038) [2024-07-02 12:11:31,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43690.7, 300 sec: 43709.2). Total num frames: 169000960. Throughput: 0: 43694.6. Samples: 169058800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 12:11:31,096][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:11:33,018][36999] Updated weights for policy 0, policy_version 10320 (0.0025) [2024-07-02 12:11:36,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43963.8, 300 sec: 43764.7). Total num frames: 169213952. Throughput: 0: 43740.5. Samples: 169326140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 12:11:36,096][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:11:36,877][36999] Updated weights for policy 0, policy_version 10330 (0.0033) [2024-07-02 12:11:40,414][36999] Updated weights for policy 0, policy_version 10340 (0.0030) [2024-07-02 12:11:41,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43417.8, 300 sec: 43653.6). Total num frames: 169410560. Throughput: 0: 43828.9. Samples: 169590200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 12:11:41,096][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:11:44,186][36999] Updated weights for policy 0, policy_version 10350 (0.0029) [2024-07-02 12:11:46,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43963.6, 300 sec: 43709.8). Total num frames: 169656320. Throughput: 0: 43855.9. Samples: 169724220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:11:46,096][36761] Avg episode reward: [(0, '0.036')] [2024-07-02 12:11:47,805][36999] Updated weights for policy 0, policy_version 10360 (0.0034) [2024-07-02 12:11:51,095][36761] Fps is (10 sec: 45874.8, 60 sec: 43963.8, 300 sec: 43764.7). Total num frames: 169869312. Throughput: 0: 43790.2. Samples: 169991120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 12:11:51,096][36761] Avg episode reward: [(0, '0.047')] [2024-07-02 12:11:51,602][36999] Updated weights for policy 0, policy_version 10370 (0.0033) [2024-07-02 12:11:55,437][36999] Updated weights for policy 0, policy_version 10380 (0.0026) [2024-07-02 12:11:56,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 170082304. Throughput: 0: 43853.3. Samples: 170249720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 12:11:56,096][36761] Avg episode reward: [(0, '0.047')] [2024-07-02 12:11:59,008][36999] Updated weights for policy 0, policy_version 10390 (0.0031) [2024-07-02 12:12:01,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43690.6, 300 sec: 43709.2). Total num frames: 170311680. Throughput: 0: 43767.9. Samples: 170379460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-07-02 12:12:01,096][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:12:02,854][36999] Updated weights for policy 0, policy_version 10400 (0.0036) [2024-07-02 12:12:06,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44236.7, 300 sec: 43709.2). Total num frames: 170524672. Throughput: 0: 44011.1. Samples: 170646780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-07-02 12:12:06,096][36761] Avg episode reward: [(0, '0.046')] [2024-07-02 12:12:06,412][36999] Updated weights for policy 0, policy_version 10410 (0.0033) [2024-07-02 12:12:10,310][36999] Updated weights for policy 0, policy_version 10420 (0.0033) [2024-07-02 12:12:11,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 43709.2). Total num frames: 170737664. Throughput: 0: 43901.9. Samples: 170902100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-07-02 12:12:11,096][36761] Avg episode reward: [(0, '0.038')] [2024-07-02 12:12:14,021][36999] Updated weights for policy 0, policy_version 10430 (0.0039) [2024-07-02 12:12:16,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43690.6, 300 sec: 43709.2). Total num frames: 170967040. Throughput: 0: 43875.1. Samples: 171033180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 12:12:16,096][36761] Avg episode reward: [(0, '0.026')] [2024-07-02 12:12:17,959][36999] Updated weights for policy 0, policy_version 10440 (0.0036) [2024-07-02 12:12:21,095][36761] Fps is (10 sec: 44236.1, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 171180032. Throughput: 0: 43932.0. Samples: 171303080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 12:12:21,096][36761] Avg episode reward: [(0, '0.028')] [2024-07-02 12:12:21,389][36999] Updated weights for policy 0, policy_version 10450 (0.0022) [2024-07-02 12:12:25,426][36999] Updated weights for policy 0, policy_version 10460 (0.0029) [2024-07-02 12:12:26,097][36761] Fps is (10 sec: 42590.4, 60 sec: 43689.3, 300 sec: 43709.3). Total num frames: 171393024. Throughput: 0: 43656.8. Samples: 171554840. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-07-02 12:12:26,097][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:12:28,861][36999] Updated weights for policy 0, policy_version 10470 (0.0042) [2024-07-02 12:12:31,100][36761] Fps is (10 sec: 45854.7, 60 sec: 43960.4, 300 sec: 43875.1). Total num frames: 171638784. Throughput: 0: 43648.1. Samples: 171688580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-07-02 12:12:31,100][36761] Avg episode reward: [(0, '0.035')] [2024-07-02 12:12:32,763][36999] Updated weights for policy 0, policy_version 10480 (0.0037) [2024-07-02 12:12:36,095][36761] Fps is (10 sec: 44245.0, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 171835392. Throughput: 0: 43803.1. Samples: 171962260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-07-02 12:12:36,096][36761] Avg episode reward: [(0, '0.031')] [2024-07-02 12:12:36,180][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000010489_171851776.pth... [2024-07-02 12:12:36,242][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000009846_161316864.pth [2024-07-02 12:12:36,407][36999] Updated weights for policy 0, policy_version 10490 (0.0032) [2024-07-02 12:12:37,248][36979] Signal inference workers to stop experience collection... (2400 times) [2024-07-02 12:12:37,249][36979] Signal inference workers to resume experience collection... (2400 times) [2024-07-02 12:12:37,263][36999] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-07-02 12:12:37,264][36999] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-07-02 12:12:40,092][36999] Updated weights for policy 0, policy_version 10500 (0.0028) [2024-07-02 12:12:41,095][36761] Fps is (10 sec: 40978.7, 60 sec: 43963.7, 300 sec: 43709.2). Total num frames: 172048384. Throughput: 0: 43659.6. Samples: 172214400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 12:12:41,096][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:12:44,167][36999] Updated weights for policy 0, policy_version 10510 (0.0025) [2024-07-02 12:12:46,095][36761] Fps is (10 sec: 45875.4, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 172294144. Throughput: 0: 43741.8. Samples: 172347840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 12:12:46,096][36761] Avg episode reward: [(0, '0.051')] [2024-07-02 12:12:46,110][36979] Saving new best policy, reward=0.051! [2024-07-02 12:12:47,667][36999] Updated weights for policy 0, policy_version 10520 (0.0031) [2024-07-02 12:12:51,095][36761] Fps is (10 sec: 44236.0, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 172490752. Throughput: 0: 43771.5. Samples: 172616500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 12:12:51,096][36761] Avg episode reward: [(0, '0.056')] [2024-07-02 12:12:51,097][36979] Saving new best policy, reward=0.056! [2024-07-02 12:12:51,590][36999] Updated weights for policy 0, policy_version 10530 (0.0025) [2024-07-02 12:12:55,141][36999] Updated weights for policy 0, policy_version 10540 (0.0030) [2024-07-02 12:12:56,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43690.7, 300 sec: 43709.9). Total num frames: 172703744. Throughput: 0: 43787.5. Samples: 172872540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 12:12:56,096][36761] Avg episode reward: [(0, '0.060')] [2024-07-02 12:12:56,113][36979] Saving new best policy, reward=0.060! [2024-07-02 12:12:59,078][36999] Updated weights for policy 0, policy_version 10550 (0.0021) [2024-07-02 12:13:01,100][36761] Fps is (10 sec: 45854.9, 60 sec: 43960.4, 300 sec: 43875.1). Total num frames: 172949504. Throughput: 0: 43788.9. Samples: 173003880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 12:13:01,100][36761] Avg episode reward: [(0, '0.048')] [2024-07-02 12:13:02,671][36999] Updated weights for policy 0, policy_version 10560 (0.0029) [2024-07-02 12:13:06,100][36761] Fps is (10 sec: 44216.4, 60 sec: 43687.4, 300 sec: 43764.7). Total num frames: 173146112. Throughput: 0: 43661.4. Samples: 173268040. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-07-02 12:13:06,101][36761] Avg episode reward: [(0, '0.037')] [2024-07-02 12:13:06,685][36999] Updated weights for policy 0, policy_version 10570 (0.0028) [2024-07-02 12:13:10,046][36999] Updated weights for policy 0, policy_version 10580 (0.0030) [2024-07-02 12:13:11,095][36761] Fps is (10 sec: 39339.5, 60 sec: 43417.5, 300 sec: 43653.6). Total num frames: 173342720. Throughput: 0: 43880.9. Samples: 173529400. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-07-02 12:13:11,096][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:13:14,245][36999] Updated weights for policy 0, policy_version 10590 (0.0031) [2024-07-02 12:13:16,095][36761] Fps is (10 sec: 45896.0, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 173604864. Throughput: 0: 43752.8. Samples: 173657260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 12:13:16,096][36761] Avg episode reward: [(0, '0.029')] [2024-07-02 12:13:17,414][36999] Updated weights for policy 0, policy_version 10600 (0.0034) [2024-07-02 12:13:21,095][36761] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 173801472. Throughput: 0: 43704.9. Samples: 173928980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 12:13:21,096][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:13:21,451][36999] Updated weights for policy 0, policy_version 10610 (0.0024) [2024-07-02 12:13:24,865][36999] Updated weights for policy 0, policy_version 10620 (0.0029) [2024-07-02 12:13:26,095][36761] Fps is (10 sec: 40960.2, 60 sec: 43692.0, 300 sec: 43653.7). Total num frames: 174014464. Throughput: 0: 43808.0. Samples: 174185760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 12:13:26,100][36761] Avg episode reward: [(0, '0.042')] [2024-07-02 12:13:28,943][36999] Updated weights for policy 0, policy_version 10630 (0.0044) [2024-07-02 12:13:31,100][36761] Fps is (10 sec: 45853.8, 60 sec: 43690.6, 300 sec: 43875.1). Total num frames: 174260224. Throughput: 0: 43748.4. Samples: 174316720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:13:31,101][36761] Avg episode reward: [(0, '0.047')] [2024-07-02 12:13:32,216][36999] Updated weights for policy 0, policy_version 10640 (0.0035) [2024-07-02 12:13:36,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 174456832. Throughput: 0: 43702.3. Samples: 174583100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:13:36,096][36761] Avg episode reward: [(0, '0.026')] [2024-07-02 12:13:36,679][36999] Updated weights for policy 0, policy_version 10650 (0.0036) [2024-07-02 12:13:39,866][36999] Updated weights for policy 0, policy_version 10660 (0.0033) [2024-07-02 12:13:41,095][36761] Fps is (10 sec: 40978.9, 60 sec: 43690.6, 300 sec: 43709.2). Total num frames: 174669824. Throughput: 0: 43754.6. Samples: 174841500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 12:13:41,096][36761] Avg episode reward: [(0, '0.027')] [2024-07-02 12:13:44,194][36999] Updated weights for policy 0, policy_version 10670 (0.0042) [2024-07-02 12:13:46,100][36761] Fps is (10 sec: 45854.2, 60 sec: 43687.3, 300 sec: 43875.1). Total num frames: 174915584. Throughput: 0: 43773.7. Samples: 174973700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 12:13:46,101][36761] Avg episode reward: [(0, '0.045')] [2024-07-02 12:13:47,127][36999] Updated weights for policy 0, policy_version 10680 (0.0038) [2024-07-02 12:13:51,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43690.7, 300 sec: 43765.5). Total num frames: 175112192. Throughput: 0: 43914.1. Samples: 175243980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-07-02 12:13:51,096][36761] Avg episode reward: [(0, '0.045')] [2024-07-02 12:13:51,606][36999] Updated weights for policy 0, policy_version 10690 (0.0036) [2024-07-02 12:13:54,602][36999] Updated weights for policy 0, policy_version 10700 (0.0046) [2024-07-02 12:13:56,095][36761] Fps is (10 sec: 42617.7, 60 sec: 43963.6, 300 sec: 43764.7). Total num frames: 175341568. Throughput: 0: 43687.5. Samples: 175495340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-07-02 12:13:56,096][36761] Avg episode reward: [(0, '0.038')] [2024-07-02 12:13:57,561][36979] Signal inference workers to stop experience collection... (2450 times) [2024-07-02 12:13:57,562][36979] Signal inference workers to resume experience collection... (2450 times) [2024-07-02 12:13:57,593][36999] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-07-02 12:13:57,593][36999] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-07-02 12:13:59,022][36999] Updated weights for policy 0, policy_version 10710 (0.0023) [2024-07-02 12:14:01,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43420.8, 300 sec: 43820.2). Total num frames: 175554560. Throughput: 0: 43961.3. Samples: 175635520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 12:14:01,096][36761] Avg episode reward: [(0, '0.038')] [2024-07-02 12:14:02,072][36999] Updated weights for policy 0, policy_version 10720 (0.0030) [2024-07-02 12:14:06,095][36761] Fps is (10 sec: 42599.0, 60 sec: 43694.0, 300 sec: 43764.7). Total num frames: 175767552. Throughput: 0: 43875.2. Samples: 175903360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 12:14:06,096][36761] Avg episode reward: [(0, '0.038')] [2024-07-02 12:14:06,441][36999] Updated weights for policy 0, policy_version 10730 (0.0030) [2024-07-02 12:14:09,592][36999] Updated weights for policy 0, policy_version 10740 (0.0031) [2024-07-02 12:14:11,097][36761] Fps is (10 sec: 44229.7, 60 sec: 44235.6, 300 sec: 43764.5). Total num frames: 175996928. Throughput: 0: 43744.1. Samples: 176154320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 12:14:11,097][36761] Avg episode reward: [(0, '0.040')] [2024-07-02 12:14:13,781][36999] Updated weights for policy 0, policy_version 10750 (0.0035) [2024-07-02 12:14:16,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 43820.3). Total num frames: 176209920. Throughput: 0: 43884.2. Samples: 176291300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 12:14:16,095][36761] Avg episode reward: [(0, '0.040')] [2024-07-02 12:14:17,087][36999] Updated weights for policy 0, policy_version 10760 (0.0027) [2024-07-02 12:14:21,095][36761] Fps is (10 sec: 44244.4, 60 sec: 43963.8, 300 sec: 43764.7). Total num frames: 176439296. Throughput: 0: 43942.8. Samples: 176560520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 12:14:21,096][36761] Avg episode reward: [(0, '0.041')] [2024-07-02 12:14:21,206][36999] Updated weights for policy 0, policy_version 10770 (0.0035) [2024-07-02 12:14:24,621][36999] Updated weights for policy 0, policy_version 10780 (0.0032) [2024-07-02 12:14:26,100][36761] Fps is (10 sec: 44216.1, 60 sec: 43960.4, 300 sec: 43764.0). Total num frames: 176652288. Throughput: 0: 43868.0. Samples: 176815760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:14:26,100][36761] Avg episode reward: [(0, '0.033')] [2024-07-02 12:14:28,830][36999] Updated weights for policy 0, policy_version 10790 (0.0032) [2024-07-02 12:14:31,096][36761] Fps is (10 sec: 44235.8, 60 sec: 43693.9, 300 sec: 43875.8). Total num frames: 176881664. Throughput: 0: 43881.7. Samples: 176948180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:14:31,096][36761] Avg episode reward: [(0, '0.037')] [2024-07-02 12:14:32,205][36999] Updated weights for policy 0, policy_version 10800 (0.0027) [2024-07-02 12:14:36,095][36761] Fps is (10 sec: 42618.0, 60 sec: 43690.7, 300 sec: 43709.2). Total num frames: 177078272. Throughput: 0: 43749.0. Samples: 177212680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 12:14:36,096][36761] Avg episode reward: [(0, '0.040')] [2024-07-02 12:14:36,217][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000010809_177094656.pth... [2024-07-02 12:14:36,259][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000010167_166576128.pth [2024-07-02 12:14:36,412][36999] Updated weights for policy 0, policy_version 10810 (0.0037) [2024-07-02 12:14:39,874][36999] Updated weights for policy 0, policy_version 10820 (0.0026) [2024-07-02 12:14:41,095][36761] Fps is (10 sec: 42599.1, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 177307648. Throughput: 0: 43926.7. Samples: 177472040. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-07-02 12:14:41,096][36761] Avg episode reward: [(0, '0.042')] [2024-07-02 12:14:43,767][36999] Updated weights for policy 0, policy_version 10830 (0.0036) [2024-07-02 12:14:46,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43420.9, 300 sec: 43820.6). Total num frames: 177520640. Throughput: 0: 43714.7. Samples: 177602680. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-07-02 12:14:46,096][36761] Avg episode reward: [(0, '0.042')] [2024-07-02 12:14:47,302][36999] Updated weights for policy 0, policy_version 10840 (0.0030) [2024-07-02 12:14:51,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 177750016. Throughput: 0: 43714.2. Samples: 177870500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-07-02 12:14:51,096][36761] Avg episode reward: [(0, '0.043')] [2024-07-02 12:14:51,155][36999] Updated weights for policy 0, policy_version 10850 (0.0037) [2024-07-02 12:14:54,727][36999] Updated weights for policy 0, policy_version 10860 (0.0034) [2024-07-02 12:14:56,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 177963008. Throughput: 0: 43843.4. Samples: 178127200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-07-02 12:14:56,096][36761] Avg episode reward: [(0, '0.039')] [2024-07-02 12:14:58,727][36999] Updated weights for policy 0, policy_version 10870 (0.0037) [2024-07-02 12:15:01,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 178192384. Throughput: 0: 43743.4. Samples: 178259760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-07-02 12:15:01,096][36761] Avg episode reward: [(0, '0.037')] [2024-07-02 12:15:02,515][36999] Updated weights for policy 0, policy_version 10880 (0.0036) [2024-07-02 12:15:06,043][36999] Updated weights for policy 0, policy_version 10890 (0.0027) [2024-07-02 12:15:06,095][36761] Fps is (10 sec: 45875.3, 60 sec: 44236.7, 300 sec: 43820.2). Total num frames: 178421760. Throughput: 0: 43639.0. Samples: 178524280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-07-02 12:15:06,098][36761] Avg episode reward: [(0, '0.044')] [2024-07-02 12:15:09,993][36999] Updated weights for policy 0, policy_version 10900 (0.0039) [2024-07-02 12:15:11,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43418.8, 300 sec: 43709.2). Total num frames: 178601984. Throughput: 0: 43756.0. Samples: 178784580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 12:15:11,096][36761] Avg episode reward: [(0, '0.036')] [2024-07-02 12:15:13,506][36999] Updated weights for policy 0, policy_version 10910 (0.0036) [2024-07-02 12:15:16,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 178847744. Throughput: 0: 43610.4. Samples: 178910640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 12:15:16,096][36761] Avg episode reward: [(0, '0.038')] [2024-07-02 12:15:17,437][36999] Updated weights for policy 0, policy_version 10920 (0.0041) [2024-07-02 12:15:20,857][36999] Updated weights for policy 0, policy_version 10930 (0.0026) [2024-07-02 12:15:21,095][36761] Fps is (10 sec: 47513.7, 60 sec: 43963.7, 300 sec: 43820.3). Total num frames: 179077120. Throughput: 0: 43647.1. Samples: 179176800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 12:15:21,096][36761] Avg episode reward: [(0, '0.038')] [2024-07-02 12:15:22,928][36979] Signal inference workers to stop experience collection... (2500 times) [2024-07-02 12:15:22,929][36979] Signal inference workers to resume experience collection... (2500 times) [2024-07-02 12:15:22,972][36999] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-07-02 12:15:22,972][36999] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-07-02 12:15:25,185][36999] Updated weights for policy 0, policy_version 10940 (0.0043) [2024-07-02 12:15:26,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43420.9, 300 sec: 43653.6). Total num frames: 179257344. Throughput: 0: 43945.3. Samples: 179449580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 12:15:26,096][36761] Avg episode reward: [(0, '0.048')] [2024-07-02 12:15:28,250][36999] Updated weights for policy 0, policy_version 10950 (0.0031) [2024-07-02 12:15:31,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43690.8, 300 sec: 43820.3). Total num frames: 179503104. Throughput: 0: 43764.1. Samples: 179572060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 12:15:31,095][36761] Avg episode reward: [(0, '0.050')] [2024-07-02 12:15:32,680][36999] Updated weights for policy 0, policy_version 10960 (0.0029) [2024-07-02 12:15:35,626][36999] Updated weights for policy 0, policy_version 10970 (0.0040) [2024-07-02 12:15:36,095][36761] Fps is (10 sec: 47513.8, 60 sec: 44236.8, 300 sec: 43820.3). Total num frames: 179732480. Throughput: 0: 43606.7. Samples: 179832800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 12:15:36,096][36761] Avg episode reward: [(0, '0.038')] [2024-07-02 12:15:40,149][36999] Updated weights for policy 0, policy_version 10980 (0.0037) [2024-07-02 12:15:41,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 43709.2). Total num frames: 179912704. Throughput: 0: 43829.0. Samples: 180099500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 12:15:41,096][36761] Avg episode reward: [(0, '0.030')] [2024-07-02 12:15:43,248][36999] Updated weights for policy 0, policy_version 10990 (0.0038) [2024-07-02 12:15:46,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 180158464. Throughput: 0: 43595.6. Samples: 180221560. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-07-02 12:15:46,096][36761] Avg episode reward: [(0, '0.032')] [2024-07-02 12:15:47,527][36999] Updated weights for policy 0, policy_version 11000 (0.0048) [2024-07-02 12:15:50,625][36999] Updated weights for policy 0, policy_version 11010 (0.0043) [2024-07-02 12:15:51,095][36761] Fps is (10 sec: 47512.8, 60 sec: 43963.6, 300 sec: 43820.2). Total num frames: 180387840. Throughput: 0: 43758.1. Samples: 180493400. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-07-02 12:15:51,096][36761] Avg episode reward: [(0, '0.033')] [2024-07-02 12:15:54,853][36999] Updated weights for policy 0, policy_version 11020 (0.0044) [2024-07-02 12:15:56,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 43653.6). Total num frames: 180568064. Throughput: 0: 43861.8. Samples: 180758360. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-07-02 12:15:56,096][36761] Avg episode reward: [(0, '0.039')] [2024-07-02 12:15:58,214][36999] Updated weights for policy 0, policy_version 11030 (0.0036) [2024-07-02 12:16:01,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 180813824. Throughput: 0: 43784.8. Samples: 180880960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 12:16:01,096][36761] Avg episode reward: [(0, '0.036')] [2024-07-02 12:16:02,159][36999] Updated weights for policy 0, policy_version 11040 (0.0048) [2024-07-02 12:16:05,820][36999] Updated weights for policy 0, policy_version 11050 (0.0033) [2024-07-02 12:16:06,095][36761] Fps is (10 sec: 49152.5, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 181059584. Throughput: 0: 43883.6. Samples: 181151560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 12:16:06,096][36761] Avg episode reward: [(0, '0.029')] [2024-07-02 12:16:09,611][36999] Updated weights for policy 0, policy_version 11060 (0.0026) [2024-07-02 12:16:11,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 181223424. Throughput: 0: 43775.2. Samples: 181419460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:16:11,096][36761] Avg episode reward: [(0, '0.028')] [2024-07-02 12:16:13,371][36999] Updated weights for policy 0, policy_version 11070 (0.0020) [2024-07-02 12:16:16,095][36761] Fps is (10 sec: 42597.9, 60 sec: 43963.7, 300 sec: 43820.3). Total num frames: 181485568. Throughput: 0: 43705.2. Samples: 181538800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:16:16,096][36761] Avg episode reward: [(0, '0.033')] [2024-07-02 12:16:17,050][36999] Updated weights for policy 0, policy_version 11080 (0.0035) [2024-07-02 12:16:20,752][36999] Updated weights for policy 0, policy_version 11090 (0.0034) [2024-07-02 12:16:21,095][36761] Fps is (10 sec: 49151.7, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 181714944. Throughput: 0: 44024.8. Samples: 181813920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 12:16:21,096][36761] Avg episode reward: [(0, '0.031')] [2024-07-02 12:16:24,639][36999] Updated weights for policy 0, policy_version 11100 (0.0031) [2024-07-02 12:16:26,095][36761] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 181878784. Throughput: 0: 43936.0. Samples: 182076620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 12:16:26,095][36761] Avg episode reward: [(0, '0.032')] [2024-07-02 12:16:27,854][36979] Signal inference workers to stop experience collection... (2550 times) [2024-07-02 12:16:27,913][36979] Signal inference workers to resume experience collection... (2550 times) [2024-07-02 12:16:27,914][36999] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-07-02 12:16:27,932][36999] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-07-02 12:16:28,206][36999] Updated weights for policy 0, policy_version 11110 (0.0035) [2024-07-02 12:16:31,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43963.7, 300 sec: 43820.3). Total num frames: 182140928. Throughput: 0: 43945.4. Samples: 182199100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-07-02 12:16:31,095][36761] Avg episode reward: [(0, '0.032')] [2024-07-02 12:16:32,164][36999] Updated weights for policy 0, policy_version 11120 (0.0031) [2024-07-02 12:16:35,595][36999] Updated weights for policy 0, policy_version 11130 (0.0026) [2024-07-02 12:16:36,095][36761] Fps is (10 sec: 49151.3, 60 sec: 43963.6, 300 sec: 43931.3). Total num frames: 182370304. Throughput: 0: 43928.0. Samples: 182470160. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-07-02 12:16:36,096][36761] Avg episode reward: [(0, '0.032')] [2024-07-02 12:16:36,107][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000011131_182370304.pth... [2024-07-02 12:16:36,180][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000010489_171851776.pth [2024-07-02 12:16:39,508][36999] Updated weights for policy 0, policy_version 11140 (0.0036) [2024-07-02 12:16:41,095][36761] Fps is (10 sec: 42597.9, 60 sec: 44236.7, 300 sec: 43764.7). Total num frames: 182566912. Throughput: 0: 43976.4. Samples: 182737300. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-07-02 12:16:41,096][36761] Avg episode reward: [(0, '0.031')] [2024-07-02 12:16:42,992][36999] Updated weights for policy 0, policy_version 11150 (0.0053) [2024-07-02 12:16:46,095][36761] Fps is (10 sec: 44237.7, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 182812672. Throughput: 0: 44074.8. Samples: 182864320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 12:16:46,095][36761] Avg episode reward: [(0, '0.033')] [2024-07-02 12:16:46,913][36999] Updated weights for policy 0, policy_version 11160 (0.0029) [2024-07-02 12:16:50,611][36999] Updated weights for policy 0, policy_version 11170 (0.0030) [2024-07-02 12:16:51,100][36761] Fps is (10 sec: 44216.7, 60 sec: 43687.4, 300 sec: 43819.6). Total num frames: 183009280. Throughput: 0: 44028.8. Samples: 183133060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 12:16:51,101][36761] Avg episode reward: [(0, '0.037')] [2024-07-02 12:16:54,313][36999] Updated weights for policy 0, policy_version 11180 (0.0029) [2024-07-02 12:16:56,097][36761] Fps is (10 sec: 40951.5, 60 sec: 44235.4, 300 sec: 43764.4). Total num frames: 183222272. Throughput: 0: 44016.2. Samples: 183400280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-07-02 12:16:56,098][36761] Avg episode reward: [(0, '0.033')] [2024-07-02 12:16:58,432][36999] Updated weights for policy 0, policy_version 11190 (0.0036) [2024-07-02 12:17:01,095][36761] Fps is (10 sec: 44257.1, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 183451648. Throughput: 0: 44192.9. Samples: 183527480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-07-02 12:17:01,096][36761] Avg episode reward: [(0, '0.032')] [2024-07-02 12:17:01,649][36999] Updated weights for policy 0, policy_version 11200 (0.0029) [2024-07-02 12:17:05,787][36999] Updated weights for policy 0, policy_version 11210 (0.0034) [2024-07-02 12:17:06,100][36761] Fps is (10 sec: 45863.3, 60 sec: 43687.3, 300 sec: 43875.1). Total num frames: 183681024. Throughput: 0: 43836.5. Samples: 183786760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 12:17:06,101][36761] Avg episode reward: [(0, '0.029')] [2024-07-02 12:17:09,492][36999] Updated weights for policy 0, policy_version 11220 (0.0039) [2024-07-02 12:17:11,095][36761] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 183877632. Throughput: 0: 43936.0. Samples: 184053740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 12:17:11,095][36761] Avg episode reward: [(0, '0.025')] [2024-07-02 12:17:13,204][36999] Updated weights for policy 0, policy_version 11230 (0.0036) [2024-07-02 12:17:16,095][36761] Fps is (10 sec: 44256.7, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 184123392. Throughput: 0: 44118.5. Samples: 184184440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 12:17:16,096][36761] Avg episode reward: [(0, '0.039')] [2024-07-02 12:17:16,868][36999] Updated weights for policy 0, policy_version 11240 (0.0033) [2024-07-02 12:17:20,707][36999] Updated weights for policy 0, policy_version 11250 (0.0029) [2024-07-02 12:17:21,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 43820.5). Total num frames: 184320000. Throughput: 0: 43874.3. Samples: 184444500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 12:17:21,096][36761] Avg episode reward: [(0, '0.042')] [2024-07-02 12:17:24,258][36999] Updated weights for policy 0, policy_version 11260 (0.0037) [2024-07-02 12:17:26,095][36761] Fps is (10 sec: 44237.4, 60 sec: 44783.0, 300 sec: 43820.9). Total num frames: 184565760. Throughput: 0: 43870.3. Samples: 184711460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 12:17:26,096][36761] Avg episode reward: [(0, '0.038')] [2024-07-02 12:17:27,835][36979] Signal inference workers to stop experience collection... (2600 times) [2024-07-02 12:17:27,872][36999] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-07-02 12:17:27,888][36979] Signal inference workers to resume experience collection... (2600 times) [2024-07-02 12:17:27,899][36999] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-07-02 12:17:28,020][36999] Updated weights for policy 0, policy_version 11270 (0.0033) [2024-07-02 12:17:31,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 184778752. Throughput: 0: 43950.1. Samples: 184842080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-07-02 12:17:31,096][36761] Avg episode reward: [(0, '0.040')] [2024-07-02 12:17:31,609][36999] Updated weights for policy 0, policy_version 11280 (0.0024) [2024-07-02 12:17:35,538][36999] Updated weights for policy 0, policy_version 11290 (0.0045) [2024-07-02 12:17:36,100][36761] Fps is (10 sec: 40940.5, 60 sec: 43414.3, 300 sec: 43819.6). Total num frames: 184975360. Throughput: 0: 43774.6. Samples: 185102920. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-07-02 12:17:36,101][36761] Avg episode reward: [(0, '0.037')] [2024-07-02 12:17:39,077][36999] Updated weights for policy 0, policy_version 11300 (0.0034) [2024-07-02 12:17:41,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43963.8, 300 sec: 43764.7). Total num frames: 185204736. Throughput: 0: 43565.0. Samples: 185360620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 12:17:41,096][36761] Avg episode reward: [(0, '0.042')] [2024-07-02 12:17:42,937][36999] Updated weights for policy 0, policy_version 11310 (0.0029) [2024-07-02 12:17:46,095][36761] Fps is (10 sec: 44257.8, 60 sec: 43417.6, 300 sec: 43820.3). Total num frames: 185417728. Throughput: 0: 43768.5. Samples: 185497060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 12:17:46,096][36761] Avg episode reward: [(0, '0.040')] [2024-07-02 12:17:46,474][36999] Updated weights for policy 0, policy_version 11320 (0.0027) [2024-07-02 12:17:50,322][36999] Updated weights for policy 0, policy_version 11330 (0.0027) [2024-07-02 12:17:51,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43694.0, 300 sec: 43820.3). Total num frames: 185630720. Throughput: 0: 43795.1. Samples: 185757340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 12:17:51,096][36761] Avg episode reward: [(0, '0.035')] [2024-07-02 12:17:54,161][36999] Updated weights for policy 0, policy_version 11340 (0.0024) [2024-07-02 12:17:56,095][36761] Fps is (10 sec: 44235.9, 60 sec: 43965.1, 300 sec: 43765.4). Total num frames: 185860096. Throughput: 0: 43730.5. Samples: 186021620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 12:17:56,096][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:17:57,655][36999] Updated weights for policy 0, policy_version 11350 (0.0025) [2024-07-02 12:18:01,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43690.6, 300 sec: 43820.9). Total num frames: 186073088. Throughput: 0: 43820.0. Samples: 186156340. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-07-02 12:18:01,098][36761] Avg episode reward: [(0, '0.030')] [2024-07-02 12:18:01,782][36999] Updated weights for policy 0, policy_version 11360 (0.0028) [2024-07-02 12:18:05,334][36999] Updated weights for policy 0, policy_version 11370 (0.0030) [2024-07-02 12:18:06,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43420.9, 300 sec: 43875.8). Total num frames: 186286080. Throughput: 0: 43813.8. Samples: 186416120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-07-02 12:18:06,100][36761] Avg episode reward: [(0, '0.023')] [2024-07-02 12:18:09,518][36999] Updated weights for policy 0, policy_version 11380 (0.0039) [2024-07-02 12:18:11,095][36761] Fps is (10 sec: 45875.5, 60 sec: 44236.8, 300 sec: 43820.3). Total num frames: 186531840. Throughput: 0: 43760.4. Samples: 186680680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-07-02 12:18:11,096][36761] Avg episode reward: [(0, '0.028')] [2024-07-02 12:18:12,672][36999] Updated weights for policy 0, policy_version 11390 (0.0032) [2024-07-02 12:18:16,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 186744832. Throughput: 0: 43888.8. Samples: 186817080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 12:18:16,096][36761] Avg episode reward: [(0, '0.033')] [2024-07-02 12:18:17,069][36999] Updated weights for policy 0, policy_version 11400 (0.0026) [2024-07-02 12:18:20,058][36999] Updated weights for policy 0, policy_version 11410 (0.0051) [2024-07-02 12:18:21,100][36761] Fps is (10 sec: 42578.9, 60 sec: 43960.4, 300 sec: 43875.1). Total num frames: 186957824. Throughput: 0: 43805.0. Samples: 187074140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 12:18:21,100][36761] Avg episode reward: [(0, '0.030')] [2024-07-02 12:18:24,413][36999] Updated weights for policy 0, policy_version 11420 (0.0035) [2024-07-02 12:18:26,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43417.5, 300 sec: 43765.4). Total num frames: 187170816. Throughput: 0: 43989.8. Samples: 187340160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 12:18:26,096][36761] Avg episode reward: [(0, '0.040')] [2024-07-02 12:18:27,435][36999] Updated weights for policy 0, policy_version 11430 (0.0040) [2024-07-02 12:18:31,095][36761] Fps is (10 sec: 44256.7, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 187400192. Throughput: 0: 43872.4. Samples: 187471320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 12:18:31,096][36761] Avg episode reward: [(0, '0.035')] [2024-07-02 12:18:31,767][36999] Updated weights for policy 0, policy_version 11440 (0.0040) [2024-07-02 12:18:34,821][36999] Updated weights for policy 0, policy_version 11450 (0.0040) [2024-07-02 12:18:36,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43967.2, 300 sec: 43875.8). Total num frames: 187613184. Throughput: 0: 43826.3. Samples: 187729520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 12:18:36,096][36761] Avg episode reward: [(0, '0.037')] [2024-07-02 12:18:36,189][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000011452_187629568.pth... [2024-07-02 12:18:36,232][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000010809_177094656.pth [2024-07-02 12:18:39,158][36999] Updated weights for policy 0, policy_version 11460 (0.0035) [2024-07-02 12:18:41,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 43765.4). Total num frames: 187826176. Throughput: 0: 43990.4. Samples: 188001180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-07-02 12:18:41,095][36761] Avg episode reward: [(0, '0.033')] [2024-07-02 12:18:42,321][36999] Updated weights for policy 0, policy_version 11470 (0.0025) [2024-07-02 12:18:46,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 188055552. Throughput: 0: 43809.0. Samples: 188127740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-07-02 12:18:46,096][36761] Avg episode reward: [(0, '0.032')] [2024-07-02 12:18:47,066][36999] Updated weights for policy 0, policy_version 11480 (0.0028) [2024-07-02 12:18:49,324][36979] Signal inference workers to stop experience collection... (2650 times) [2024-07-02 12:18:49,368][36999] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-07-02 12:18:49,390][36979] Signal inference workers to resume experience collection... (2650 times) [2024-07-02 12:18:49,390][36999] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-07-02 12:18:49,684][36999] Updated weights for policy 0, policy_version 11490 (0.0025) [2024-07-02 12:18:51,095][36761] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 188284928. Throughput: 0: 43770.3. Samples: 188385780. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-07-02 12:18:51,096][36761] Avg episode reward: [(0, '0.032')] [2024-07-02 12:18:54,422][36999] Updated weights for policy 0, policy_version 11500 (0.0042) [2024-07-02 12:18:56,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.9, 300 sec: 43875.8). Total num frames: 188497920. Throughput: 0: 44026.2. Samples: 188661860. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-07-02 12:18:56,096][36761] Avg episode reward: [(0, '0.032')] [2024-07-02 12:18:57,103][36999] Updated weights for policy 0, policy_version 11510 (0.0030) [2024-07-02 12:19:01,095][36761] Fps is (10 sec: 44236.6, 60 sec: 44236.8, 300 sec: 43931.3). Total num frames: 188727296. Throughput: 0: 43843.6. Samples: 188790040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:19:01,096][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:19:01,774][36999] Updated weights for policy 0, policy_version 11520 (0.0043) [2024-07-02 12:19:04,681][36999] Updated weights for policy 0, policy_version 11530 (0.0041) [2024-07-02 12:19:06,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43963.7, 300 sec: 43820.5). Total num frames: 188923904. Throughput: 0: 43945.3. Samples: 189051480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:19:06,098][36761] Avg episode reward: [(0, '0.046')] [2024-07-02 12:19:09,030][36999] Updated weights for policy 0, policy_version 11540 (0.0030) [2024-07-02 12:19:11,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 189153280. Throughput: 0: 43946.8. Samples: 189317760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 12:19:11,095][36761] Avg episode reward: [(0, '0.045')] [2024-07-02 12:19:12,337][36999] Updated weights for policy 0, policy_version 11550 (0.0024) [2024-07-02 12:19:16,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 189382656. Throughput: 0: 44000.4. Samples: 189451340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 12:19:16,096][36761] Avg episode reward: [(0, '0.039')] [2024-07-02 12:19:16,357][36999] Updated weights for policy 0, policy_version 11560 (0.0030) [2024-07-02 12:19:19,922][36999] Updated weights for policy 0, policy_version 11570 (0.0042) [2024-07-02 12:19:21,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43967.1, 300 sec: 43876.5). Total num frames: 189595648. Throughput: 0: 44039.6. Samples: 189711300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-07-02 12:19:21,095][36761] Avg episode reward: [(0, '0.041')] [2024-07-02 12:19:23,818][36999] Updated weights for policy 0, policy_version 11580 (0.0046) [2024-07-02 12:19:26,095][36761] Fps is (10 sec: 44237.0, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 189825024. Throughput: 0: 43870.6. Samples: 189975360. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-07-02 12:19:26,096][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:19:27,433][36999] Updated weights for policy 0, policy_version 11590 (0.0043) [2024-07-02 12:19:31,095][36761] Fps is (10 sec: 44236.1, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 190038016. Throughput: 0: 43938.1. Samples: 190104960. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-07-02 12:19:31,096][36761] Avg episode reward: [(0, '0.037')] [2024-07-02 12:19:31,150][36999] Updated weights for policy 0, policy_version 11600 (0.0039) [2024-07-02 12:19:34,867][36999] Updated weights for policy 0, policy_version 11610 (0.0039) [2024-07-02 12:19:36,095][36761] Fps is (10 sec: 44236.4, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 190267392. Throughput: 0: 44178.6. Samples: 190373820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 12:19:36,096][36761] Avg episode reward: [(0, '0.037')] [2024-07-02 12:19:38,521][36999] Updated weights for policy 0, policy_version 11620 (0.0026) [2024-07-02 12:19:41,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43963.6, 300 sec: 43875.8). Total num frames: 190464000. Throughput: 0: 43947.0. Samples: 190639480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 12:19:41,096][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:19:42,380][36999] Updated weights for policy 0, policy_version 11630 (0.0033) [2024-07-02 12:19:46,037][36999] Updated weights for policy 0, policy_version 11640 (0.0025) [2024-07-02 12:19:46,095][36761] Fps is (10 sec: 44237.0, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 190709760. Throughput: 0: 44066.6. Samples: 190773040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 12:19:46,096][36761] Avg episode reward: [(0, '0.030')] [2024-07-02 12:19:50,037][36999] Updated weights for policy 0, policy_version 11650 (0.0032) [2024-07-02 12:19:51,095][36761] Fps is (10 sec: 45875.4, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 190922752. Throughput: 0: 44079.1. Samples: 191035040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 12:19:51,096][36761] Avg episode reward: [(0, '0.038')] [2024-07-02 12:19:53,593][36999] Updated weights for policy 0, policy_version 11660 (0.0025) [2024-07-02 12:19:56,095][36761] Fps is (10 sec: 42599.0, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 191135744. Throughput: 0: 43980.0. Samples: 191296860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 12:19:56,096][36761] Avg episode reward: [(0, '0.028')] [2024-07-02 12:19:57,413][36999] Updated weights for policy 0, policy_version 11670 (0.0031) [2024-07-02 12:20:01,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 43820.3). Total num frames: 191348736. Throughput: 0: 43955.5. Samples: 191429340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 12:20:01,096][36761] Avg episode reward: [(0, '0.050')] [2024-07-02 12:20:01,107][36999] Updated weights for policy 0, policy_version 11680 (0.0028) [2024-07-02 12:20:04,903][36999] Updated weights for policy 0, policy_version 11690 (0.0041) [2024-07-02 12:20:06,100][36761] Fps is (10 sec: 44216.1, 60 sec: 44233.5, 300 sec: 43986.2). Total num frames: 191578112. Throughput: 0: 44082.6. Samples: 191695220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 12:20:06,101][36761] Avg episode reward: [(0, '0.045')] [2024-07-02 12:20:08,601][36999] Updated weights for policy 0, policy_version 11700 (0.0038) [2024-07-02 12:20:11,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.6, 300 sec: 43875.8). Total num frames: 191791104. Throughput: 0: 43984.8. Samples: 191954680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:20:11,096][36761] Avg episode reward: [(0, '0.030')] [2024-07-02 12:20:12,679][36999] Updated weights for policy 0, policy_version 11710 (0.0045) [2024-07-02 12:20:16,095][36761] Fps is (10 sec: 42617.5, 60 sec: 43690.7, 300 sec: 43820.2). Total num frames: 192004096. Throughput: 0: 44045.8. Samples: 192087020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:20:16,096][36761] Avg episode reward: [(0, '0.033')] [2024-07-02 12:20:16,276][36999] Updated weights for policy 0, policy_version 11720 (0.0049) [2024-07-02 12:20:20,040][36999] Updated weights for policy 0, policy_version 11730 (0.0033) [2024-07-02 12:20:21,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.6, 300 sec: 43986.9). Total num frames: 192233472. Throughput: 0: 43929.3. Samples: 192350640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:20:21,096][36761] Avg episode reward: [(0, '0.035')] [2024-07-02 12:20:23,641][36999] Updated weights for policy 0, policy_version 11740 (0.0031) [2024-07-02 12:20:26,095][36761] Fps is (10 sec: 45874.8, 60 sec: 43963.6, 300 sec: 43931.3). Total num frames: 192462848. Throughput: 0: 43823.5. Samples: 192611540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:20:26,096][36761] Avg episode reward: [(0, '0.045')] [2024-07-02 12:20:27,689][36999] Updated weights for policy 0, policy_version 11750 (0.0035) [2024-07-02 12:20:31,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 43820.2). Total num frames: 192659456. Throughput: 0: 43859.5. Samples: 192746720. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-07-02 12:20:31,096][36761] Avg episode reward: [(0, '0.032')] [2024-07-02 12:20:31,324][36999] Updated weights for policy 0, policy_version 11760 (0.0036) [2024-07-02 12:20:35,067][36999] Updated weights for policy 0, policy_version 11770 (0.0031) [2024-07-02 12:20:35,463][36979] Signal inference workers to stop experience collection... (2700 times) [2024-07-02 12:20:35,463][36979] Signal inference workers to resume experience collection... (2700 times) [2024-07-02 12:20:35,477][36999] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-07-02 12:20:35,485][36999] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-07-02 12:20:36,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 192888832. Throughput: 0: 43789.3. Samples: 193005560. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-07-02 12:20:36,096][36761] Avg episode reward: [(0, '0.037')] [2024-07-02 12:20:36,106][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000011773_192888832.pth... [2024-07-02 12:20:36,167][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000011131_182370304.pth [2024-07-02 12:20:38,702][36999] Updated weights for policy 0, policy_version 11780 (0.0046) [2024-07-02 12:20:41,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 193101824. Throughput: 0: 43736.3. Samples: 193265000. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-07-02 12:20:41,096][36761] Avg episode reward: [(0, '0.047')] [2024-07-02 12:20:42,475][36999] Updated weights for policy 0, policy_version 11790 (0.0027) [2024-07-02 12:20:46,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43417.6, 300 sec: 43820.3). Total num frames: 193314816. Throughput: 0: 43827.6. Samples: 193401580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:20:46,096][36761] Avg episode reward: [(0, '0.050')] [2024-07-02 12:20:46,311][36999] Updated weights for policy 0, policy_version 11800 (0.0041) [2024-07-02 12:20:49,982][36999] Updated weights for policy 0, policy_version 11810 (0.0026) [2024-07-02 12:20:51,096][36761] Fps is (10 sec: 42594.6, 60 sec: 43417.0, 300 sec: 43931.2). Total num frames: 193527808. Throughput: 0: 43820.0. Samples: 193666960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:20:51,097][36761] Avg episode reward: [(0, '0.054')] [2024-07-02 12:20:53,729][36999] Updated weights for policy 0, policy_version 11820 (0.0036) [2024-07-02 12:20:56,095][36761] Fps is (10 sec: 45875.4, 60 sec: 43963.7, 300 sec: 43931.4). Total num frames: 193773568. Throughput: 0: 43695.3. Samples: 193920960. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-07-02 12:20:56,096][36761] Avg episode reward: [(0, '0.045')] [2024-07-02 12:20:57,309][36999] Updated weights for policy 0, policy_version 11830 (0.0045) [2024-07-02 12:21:01,080][36999] Updated weights for policy 0, policy_version 11840 (0.0047) [2024-07-02 12:21:01,095][36761] Fps is (10 sec: 45879.1, 60 sec: 43963.8, 300 sec: 43820.2). Total num frames: 193986560. Throughput: 0: 43815.2. Samples: 194058700. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-07-02 12:21:01,096][36761] Avg episode reward: [(0, '0.053')] [2024-07-02 12:21:04,773][36999] Updated weights for policy 0, policy_version 11850 (0.0025) [2024-07-02 12:21:06,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43694.0, 300 sec: 43986.9). Total num frames: 194199552. Throughput: 0: 43782.7. Samples: 194320860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:21:06,096][36761] Avg episode reward: [(0, '0.045')] [2024-07-02 12:21:08,566][36999] Updated weights for policy 0, policy_version 11860 (0.0032) [2024-07-02 12:21:11,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 194428928. Throughput: 0: 43775.2. Samples: 194581420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:21:11,096][36761] Avg episode reward: [(0, '0.059')] [2024-07-02 12:21:12,193][36999] Updated weights for policy 0, policy_version 11870 (0.0028) [2024-07-02 12:21:16,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 43709.2). Total num frames: 194609152. Throughput: 0: 43781.8. Samples: 194716900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 12:21:16,096][36761] Avg episode reward: [(0, '0.059')] [2024-07-02 12:21:16,242][36999] Updated weights for policy 0, policy_version 11880 (0.0030) [2024-07-02 12:21:19,695][36999] Updated weights for policy 0, policy_version 11890 (0.0043) [2024-07-02 12:21:21,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 194854912. Throughput: 0: 43824.1. Samples: 194977640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 12:21:21,096][36761] Avg episode reward: [(0, '0.059')] [2024-07-02 12:21:23,642][36999] Updated weights for policy 0, policy_version 11900 (0.0035) [2024-07-02 12:21:26,095][36761] Fps is (10 sec: 47513.8, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 195084288. Throughput: 0: 43908.9. Samples: 195240900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 12:21:26,096][36761] Avg episode reward: [(0, '0.050')] [2024-07-02 12:21:27,197][36999] Updated weights for policy 0, policy_version 11910 (0.0033) [2024-07-02 12:21:31,075][36999] Updated weights for policy 0, policy_version 11920 (0.0054) [2024-07-02 12:21:31,100][36761] Fps is (10 sec: 44216.6, 60 sec: 43960.5, 300 sec: 43819.6). Total num frames: 195297280. Throughput: 0: 43792.0. Samples: 195372420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 12:21:31,100][36761] Avg episode reward: [(0, '0.058')] [2024-07-02 12:21:34,732][36999] Updated weights for policy 0, policy_version 11930 (0.0031) [2024-07-02 12:21:36,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 195510272. Throughput: 0: 43766.6. Samples: 195636420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:21:36,096][36761] Avg episode reward: [(0, '0.063')] [2024-07-02 12:21:36,110][36979] Saving new best policy, reward=0.063! [2024-07-02 12:21:38,525][36999] Updated weights for policy 0, policy_version 11940 (0.0049) [2024-07-02 12:21:41,098][36761] Fps is (10 sec: 44245.9, 60 sec: 43961.9, 300 sec: 43819.9). Total num frames: 195739648. Throughput: 0: 43849.1. Samples: 195894280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:21:41,098][36761] Avg episode reward: [(0, '0.058')] [2024-07-02 12:21:42,158][36999] Updated weights for policy 0, policy_version 11950 (0.0026) [2024-07-02 12:21:46,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 43820.9). Total num frames: 195936256. Throughput: 0: 43821.4. Samples: 196030660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:21:46,096][36761] Avg episode reward: [(0, '0.051')] [2024-07-02 12:21:46,318][36999] Updated weights for policy 0, policy_version 11960 (0.0033) [2024-07-02 12:21:49,498][36999] Updated weights for policy 0, policy_version 11970 (0.0021) [2024-07-02 12:21:51,095][36761] Fps is (10 sec: 42608.8, 60 sec: 43964.3, 300 sec: 43876.1). Total num frames: 196165632. Throughput: 0: 43760.4. Samples: 196290080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:21:51,096][36761] Avg episode reward: [(0, '0.055')] [2024-07-02 12:21:53,750][36999] Updated weights for policy 0, policy_version 11980 (0.0044) [2024-07-02 12:21:56,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 43820.2). Total num frames: 196378624. Throughput: 0: 43739.2. Samples: 196549680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 12:21:56,096][36761] Avg episode reward: [(0, '0.054')] [2024-07-02 12:21:57,003][36999] Updated weights for policy 0, policy_version 11990 (0.0021) [2024-07-02 12:22:01,065][36999] Updated weights for policy 0, policy_version 12000 (0.0027) [2024-07-02 12:22:01,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43690.7, 300 sec: 43820.9). Total num frames: 196608000. Throughput: 0: 43790.3. Samples: 196687460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:22:01,096][36761] Avg episode reward: [(0, '0.045')] [2024-07-02 12:22:04,455][36999] Updated weights for policy 0, policy_version 12010 (0.0035) [2024-07-02 12:22:06,098][36761] Fps is (10 sec: 44225.9, 60 sec: 43688.9, 300 sec: 43875.4). Total num frames: 196820992. Throughput: 0: 43720.2. Samples: 196945160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:22:06,098][36761] Avg episode reward: [(0, '0.040')] [2024-07-02 12:22:08,602][36999] Updated weights for policy 0, policy_version 12020 (0.0039) [2024-07-02 12:22:09,212][36979] Signal inference workers to stop experience collection... (2750 times) [2024-07-02 12:22:09,213][36979] Signal inference workers to resume experience collection... (2750 times) [2024-07-02 12:22:09,239][36999] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-07-02 12:22:09,240][36999] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-07-02 12:22:11,095][36761] Fps is (10 sec: 44236.1, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 197050368. Throughput: 0: 43724.8. Samples: 197208520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 12:22:11,096][36761] Avg episode reward: [(0, '0.040')] [2024-07-02 12:22:12,154][36999] Updated weights for policy 0, policy_version 12030 (0.0042) [2024-07-02 12:22:16,095][36761] Fps is (10 sec: 42609.2, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 197246976. Throughput: 0: 43724.9. Samples: 197339840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 12:22:16,096][36761] Avg episode reward: [(0, '0.052')] [2024-07-02 12:22:16,151][36999] Updated weights for policy 0, policy_version 12040 (0.0029) [2024-07-02 12:22:19,487][36999] Updated weights for policy 0, policy_version 12050 (0.0046) [2024-07-02 12:22:21,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 197476352. Throughput: 0: 43681.2. Samples: 197602080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 12:22:21,096][36761] Avg episode reward: [(0, '0.051')] [2024-07-02 12:22:23,649][36999] Updated weights for policy 0, policy_version 12060 (0.0032) [2024-07-02 12:22:26,095][36761] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 43820.3). Total num frames: 197705728. Throughput: 0: 43768.6. Samples: 197863760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 12:22:26,096][36761] Avg episode reward: [(0, '0.050')] [2024-07-02 12:22:26,855][36999] Updated weights for policy 0, policy_version 12070 (0.0031) [2024-07-02 12:22:31,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43420.9, 300 sec: 43820.9). Total num frames: 197902336. Throughput: 0: 43614.6. Samples: 197993320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-07-02 12:22:31,096][36761] Avg episode reward: [(0, '0.054')] [2024-07-02 12:22:31,256][36999] Updated weights for policy 0, policy_version 12080 (0.0032) [2024-07-02 12:22:34,834][36999] Updated weights for policy 0, policy_version 12090 (0.0032) [2024-07-02 12:22:36,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 198131712. Throughput: 0: 43608.9. Samples: 198252480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-07-02 12:22:36,096][36761] Avg episode reward: [(0, '0.044')] [2024-07-02 12:22:36,231][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000012094_198148096.pth... [2024-07-02 12:22:36,282][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000011452_187629568.pth [2024-07-02 12:22:38,679][36999] Updated weights for policy 0, policy_version 12100 (0.0042) [2024-07-02 12:22:41,096][36761] Fps is (10 sec: 44236.1, 60 sec: 43419.3, 300 sec: 43820.2). Total num frames: 198344704. Throughput: 0: 43546.5. Samples: 198509280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 12:22:41,096][36761] Avg episode reward: [(0, '0.047')] [2024-07-02 12:22:42,516][36999] Updated weights for policy 0, policy_version 12110 (0.0040) [2024-07-02 12:22:46,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 43820.3). Total num frames: 198557696. Throughput: 0: 43347.9. Samples: 198638120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 12:22:46,096][36761] Avg episode reward: [(0, '0.047')] [2024-07-02 12:22:46,135][36999] Updated weights for policy 0, policy_version 12120 (0.0049) [2024-07-02 12:22:50,028][36999] Updated weights for policy 0, policy_version 12130 (0.0022) [2024-07-02 12:22:51,095][36761] Fps is (10 sec: 44237.8, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 198787072. Throughput: 0: 43679.4. Samples: 198910620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 12:22:51,096][36761] Avg episode reward: [(0, '0.048')] [2024-07-02 12:22:53,484][36999] Updated weights for policy 0, policy_version 12140 (0.0030) [2024-07-02 12:22:56,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 199000064. Throughput: 0: 43522.3. Samples: 199167020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 12:22:56,096][36761] Avg episode reward: [(0, '0.047')] [2024-07-02 12:22:57,525][36999] Updated weights for policy 0, policy_version 12150 (0.0029) [2024-07-02 12:23:01,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 199229440. Throughput: 0: 43438.2. Samples: 199294560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 12:23:01,096][36761] Avg episode reward: [(0, '0.043')] [2024-07-02 12:23:01,097][36999] Updated weights for policy 0, policy_version 12160 (0.0032) [2024-07-02 12:23:04,861][36999] Updated weights for policy 0, policy_version 12170 (0.0025) [2024-07-02 12:23:06,095][36761] Fps is (10 sec: 47513.6, 60 sec: 44238.6, 300 sec: 43875.8). Total num frames: 199475200. Throughput: 0: 43756.6. Samples: 199571120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 12:23:06,096][36761] Avg episode reward: [(0, '0.043')] [2024-07-02 12:23:08,570][36999] Updated weights for policy 0, policy_version 12180 (0.0032) [2024-07-02 12:23:11,099][36761] Fps is (10 sec: 44220.2, 60 sec: 43688.0, 300 sec: 43819.7). Total num frames: 199671808. Throughput: 0: 43727.5. Samples: 199831660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 12:23:11,100][36761] Avg episode reward: [(0, '0.028')] [2024-07-02 12:23:12,241][36999] Updated weights for policy 0, policy_version 12190 (0.0036) [2024-07-02 12:23:15,921][36999] Updated weights for policy 0, policy_version 12200 (0.0026) [2024-07-02 12:23:16,096][36761] Fps is (10 sec: 40957.7, 60 sec: 43963.3, 300 sec: 43820.8). Total num frames: 199884800. Throughput: 0: 43651.5. Samples: 199957660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:23:16,097][36761] Avg episode reward: [(0, '0.034')] [2024-07-02 12:23:19,635][36999] Updated weights for policy 0, policy_version 12210 (0.0031) [2024-07-02 12:23:21,095][36761] Fps is (10 sec: 45892.3, 60 sec: 44236.9, 300 sec: 43931.3). Total num frames: 200130560. Throughput: 0: 43781.8. Samples: 200222660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:23:21,096][36761] Avg episode reward: [(0, '0.040')] [2024-07-02 12:23:23,750][36999] Updated weights for policy 0, policy_version 12220 (0.0027) [2024-07-02 12:23:26,095][36761] Fps is (10 sec: 44239.5, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 200327168. Throughput: 0: 43971.8. Samples: 200488000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 12:23:26,096][36761] Avg episode reward: [(0, '0.044')] [2024-07-02 12:23:26,979][36999] Updated weights for policy 0, policy_version 12230 (0.0035) [2024-07-02 12:23:31,095][36761] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 200523776. Throughput: 0: 43936.9. Samples: 200615280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 12:23:31,096][36761] Avg episode reward: [(0, '0.045')] [2024-07-02 12:23:31,133][36999] Updated weights for policy 0, policy_version 12240 (0.0040) [2024-07-02 12:23:34,306][36999] Updated weights for policy 0, policy_version 12250 (0.0032) [2024-07-02 12:23:35,227][36979] Signal inference workers to stop experience collection... (2800 times) [2024-07-02 12:23:35,227][36979] Signal inference workers to resume experience collection... (2800 times) [2024-07-02 12:23:35,259][36999] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-07-02 12:23:35,264][36999] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-07-02 12:23:36,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 200769536. Throughput: 0: 43904.9. Samples: 200886340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:23:36,096][36761] Avg episode reward: [(0, '0.045')] [2024-07-02 12:23:38,449][36999] Updated weights for policy 0, policy_version 12260 (0.0044) [2024-07-02 12:23:41,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43963.8, 300 sec: 43820.2). Total num frames: 200982528. Throughput: 0: 44125.3. Samples: 201152660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:23:41,096][36761] Avg episode reward: [(0, '0.055')] [2024-07-02 12:23:41,697][36999] Updated weights for policy 0, policy_version 12270 (0.0049) [2024-07-02 12:23:45,809][36999] Updated weights for policy 0, policy_version 12280 (0.0042) [2024-07-02 12:23:46,095][36761] Fps is (10 sec: 44237.0, 60 sec: 44236.9, 300 sec: 43820.3). Total num frames: 201211904. Throughput: 0: 44153.8. Samples: 201281480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 12:23:46,096][36761] Avg episode reward: [(0, '0.055')] [2024-07-02 12:23:49,010][36999] Updated weights for policy 0, policy_version 12290 (0.0033) [2024-07-02 12:23:51,100][36761] Fps is (10 sec: 45854.5, 60 sec: 44233.4, 300 sec: 43875.1). Total num frames: 201441280. Throughput: 0: 43945.8. Samples: 201548880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 12:23:51,100][36761] Avg episode reward: [(0, '0.050')] [2024-07-02 12:23:53,220][36999] Updated weights for policy 0, policy_version 12300 (0.0035) [2024-07-02 12:23:56,098][36761] Fps is (10 sec: 44223.2, 60 sec: 44234.6, 300 sec: 43819.8). Total num frames: 201654272. Throughput: 0: 44131.0. Samples: 201817520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 12:23:56,099][36761] Avg episode reward: [(0, '0.057')] [2024-07-02 12:23:56,382][36999] Updated weights for policy 0, policy_version 12310 (0.0030) [2024-07-02 12:24:00,547][36999] Updated weights for policy 0, policy_version 12320 (0.0022) [2024-07-02 12:24:01,095][36761] Fps is (10 sec: 40978.7, 60 sec: 43690.6, 300 sec: 43820.3). Total num frames: 201850880. Throughput: 0: 44225.9. Samples: 201947800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 12:24:01,096][36761] Avg episode reward: [(0, '0.056')] [2024-07-02 12:24:03,923][36999] Updated weights for policy 0, policy_version 12330 (0.0036) [2024-07-02 12:24:06,095][36761] Fps is (10 sec: 44250.3, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 202096640. Throughput: 0: 44115.6. Samples: 202207860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 12:24:06,096][36761] Avg episode reward: [(0, '0.055')] [2024-07-02 12:24:07,978][36999] Updated weights for policy 0, policy_version 12340 (0.0030) [2024-07-02 12:24:11,100][36761] Fps is (10 sec: 47491.9, 60 sec: 44236.2, 300 sec: 43875.1). Total num frames: 202326016. Throughput: 0: 44035.5. Samples: 202469800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 12:24:11,100][36761] Avg episode reward: [(0, '0.046')] [2024-07-02 12:24:11,396][36999] Updated weights for policy 0, policy_version 12350 (0.0035) [2024-07-02 12:24:15,562][36999] Updated weights for policy 0, policy_version 12360 (0.0032) [2024-07-02 12:24:16,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43964.2, 300 sec: 43820.3). Total num frames: 202522624. Throughput: 0: 44249.8. Samples: 202606520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:24:16,096][36761] Avg episode reward: [(0, '0.066')] [2024-07-02 12:24:16,106][36979] Saving new best policy, reward=0.066! [2024-07-02 12:24:18,785][36999] Updated weights for policy 0, policy_version 12370 (0.0024) [2024-07-02 12:24:21,095][36761] Fps is (10 sec: 42617.8, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 202752000. Throughput: 0: 44083.9. Samples: 202870120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:24:21,096][36761] Avg episode reward: [(0, '0.071')] [2024-07-02 12:24:21,097][36979] Saving new best policy, reward=0.071! [2024-07-02 12:24:22,877][36999] Updated weights for policy 0, policy_version 12380 (0.0035) [2024-07-02 12:24:26,087][36999] Updated weights for policy 0, policy_version 12390 (0.0033) [2024-07-02 12:24:26,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44509.9, 300 sec: 43931.4). Total num frames: 202997760. Throughput: 0: 44044.5. Samples: 203134660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 12:24:26,096][36761] Avg episode reward: [(0, '0.075')] [2024-07-02 12:24:26,104][36979] Saving new best policy, reward=0.075! [2024-07-02 12:24:30,131][36999] Updated weights for policy 0, policy_version 12400 (0.0036) [2024-07-02 12:24:31,100][36761] Fps is (10 sec: 44217.0, 60 sec: 44506.5, 300 sec: 43819.6). Total num frames: 203194368. Throughput: 0: 44268.8. Samples: 203273780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 12:24:31,100][36761] Avg episode reward: [(0, '0.066')] [2024-07-02 12:24:33,375][36999] Updated weights for policy 0, policy_version 12410 (0.0031) [2024-07-02 12:24:36,095][36761] Fps is (10 sec: 40959.7, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 203407360. Throughput: 0: 44147.6. Samples: 203535320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:24:36,096][36761] Avg episode reward: [(0, '0.066')] [2024-07-02 12:24:36,112][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000012415_203407360.pth... [2024-07-02 12:24:36,166][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000011773_192888832.pth [2024-07-02 12:24:37,483][36999] Updated weights for policy 0, policy_version 12420 (0.0045) [2024-07-02 12:24:40,817][36999] Updated weights for policy 0, policy_version 12430 (0.0029) [2024-07-02 12:24:41,095][36761] Fps is (10 sec: 45896.0, 60 sec: 44509.9, 300 sec: 43875.8). Total num frames: 203653120. Throughput: 0: 43884.7. Samples: 203792200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 12:24:41,096][36761] Avg episode reward: [(0, '0.061')] [2024-07-02 12:24:44,854][36999] Updated weights for policy 0, policy_version 12440 (0.0030) [2024-07-02 12:24:46,095][36761] Fps is (10 sec: 45875.4, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 203866112. Throughput: 0: 44077.4. Samples: 203931280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 12:24:46,096][36761] Avg episode reward: [(0, '0.060')] [2024-07-02 12:24:48,202][36999] Updated weights for policy 0, policy_version 12450 (0.0028) [2024-07-02 12:24:51,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43967.0, 300 sec: 43875.8). Total num frames: 204079104. Throughput: 0: 44198.5. Samples: 204196800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 12:24:51,096][36761] Avg episode reward: [(0, '0.057')] [2024-07-02 12:24:52,510][36999] Updated weights for policy 0, policy_version 12460 (0.0041) [2024-07-02 12:24:55,932][36999] Updated weights for policy 0, policy_version 12470 (0.0045) [2024-07-02 12:24:56,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44239.0, 300 sec: 43931.3). Total num frames: 204308480. Throughput: 0: 44174.7. Samples: 204457460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 12:24:56,096][36761] Avg episode reward: [(0, '0.045')] [2024-07-02 12:24:59,796][36999] Updated weights for policy 0, policy_version 12480 (0.0041) [2024-07-02 12:25:01,098][36761] Fps is (10 sec: 45864.5, 60 sec: 44781.1, 300 sec: 43931.6). Total num frames: 204537856. Throughput: 0: 44223.3. Samples: 204596680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 12:25:01,098][36761] Avg episode reward: [(0, '0.050')] [2024-07-02 12:25:03,258][36999] Updated weights for policy 0, policy_version 12490 (0.0043) [2024-07-02 12:25:06,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 204734464. Throughput: 0: 44273.3. Samples: 204862420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 12:25:06,096][36761] Avg episode reward: [(0, '0.051')] [2024-07-02 12:25:07,095][36999] Updated weights for policy 0, policy_version 12500 (0.0025) [2024-07-02 12:25:10,798][36999] Updated weights for policy 0, policy_version 12510 (0.0033) [2024-07-02 12:25:11,095][36761] Fps is (10 sec: 42608.7, 60 sec: 43967.1, 300 sec: 43931.3). Total num frames: 204963840. Throughput: 0: 44111.5. Samples: 205119680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:25:11,096][36761] Avg episode reward: [(0, '0.043')] [2024-07-02 12:25:14,658][36999] Updated weights for policy 0, policy_version 12520 (0.0032) [2024-07-02 12:25:15,165][36979] Signal inference workers to stop experience collection... (2850 times) [2024-07-02 12:25:15,166][36979] Signal inference workers to resume experience collection... (2850 times) [2024-07-02 12:25:15,212][36999] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-07-02 12:25:15,212][36999] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-07-02 12:25:16,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 205176832. Throughput: 0: 44083.1. Samples: 205257320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:25:16,096][36761] Avg episode reward: [(0, '0.059')] [2024-07-02 12:25:18,215][36999] Updated weights for policy 0, policy_version 12530 (0.0027) [2024-07-02 12:25:21,095][36761] Fps is (10 sec: 40960.6, 60 sec: 43690.8, 300 sec: 43764.8). Total num frames: 205373440. Throughput: 0: 44025.9. Samples: 205516480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 12:25:21,095][36761] Avg episode reward: [(0, '0.052')] [2024-07-02 12:25:22,038][36999] Updated weights for policy 0, policy_version 12540 (0.0027) [2024-07-02 12:25:25,828][36999] Updated weights for policy 0, policy_version 12550 (0.0030) [2024-07-02 12:25:26,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43690.7, 300 sec: 43931.4). Total num frames: 205619200. Throughput: 0: 44140.5. Samples: 205778520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 12:25:26,096][36761] Avg episode reward: [(0, '0.047')] [2024-07-02 12:25:29,643][36999] Updated weights for policy 0, policy_version 12560 (0.0044) [2024-07-02 12:25:31,095][36761] Fps is (10 sec: 47513.3, 60 sec: 44240.2, 300 sec: 43931.4). Total num frames: 205848576. Throughput: 0: 44042.7. Samples: 205913200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:25:31,096][36761] Avg episode reward: [(0, '0.048')] [2024-07-02 12:25:33,272][36999] Updated weights for policy 0, policy_version 12570 (0.0027) [2024-07-02 12:25:36,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 206045184. Throughput: 0: 43941.8. Samples: 206174180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:25:36,096][36761] Avg episode reward: [(0, '0.060')] [2024-07-02 12:25:37,031][36999] Updated weights for policy 0, policy_version 12580 (0.0038) [2024-07-02 12:25:40,590][36999] Updated weights for policy 0, policy_version 12590 (0.0040) [2024-07-02 12:25:41,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 43931.3). Total num frames: 206274560. Throughput: 0: 44011.6. Samples: 206437980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 12:25:41,096][36761] Avg episode reward: [(0, '0.061')] [2024-07-02 12:25:44,712][36999] Updated weights for policy 0, policy_version 12600 (0.0035) [2024-07-02 12:25:46,095][36761] Fps is (10 sec: 45875.2, 60 sec: 43963.7, 300 sec: 43987.0). Total num frames: 206503936. Throughput: 0: 43937.0. Samples: 206573740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 12:25:46,096][36761] Avg episode reward: [(0, '0.042')] [2024-07-02 12:25:47,888][36999] Updated weights for policy 0, policy_version 12610 (0.0049) [2024-07-02 12:25:51,096][36761] Fps is (10 sec: 42597.6, 60 sec: 43690.6, 300 sec: 43820.2). Total num frames: 206700544. Throughput: 0: 43746.5. Samples: 206831020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 12:25:51,096][36761] Avg episode reward: [(0, '0.062')] [2024-07-02 12:25:52,166][36999] Updated weights for policy 0, policy_version 12620 (0.0027) [2024-07-02 12:25:55,258][36999] Updated weights for policy 0, policy_version 12630 (0.0038) [2024-07-02 12:25:56,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 206929920. Throughput: 0: 43891.2. Samples: 207094780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 12:25:56,095][36761] Avg episode reward: [(0, '0.069')] [2024-07-02 12:25:59,570][36999] Updated weights for policy 0, policy_version 12640 (0.0025) [2024-07-02 12:26:01,095][36761] Fps is (10 sec: 45876.4, 60 sec: 43692.5, 300 sec: 43931.4). Total num frames: 207159296. Throughput: 0: 43924.5. Samples: 207233920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:26:01,096][36761] Avg episode reward: [(0, '0.073')] [2024-07-02 12:26:03,103][36999] Updated weights for policy 0, policy_version 12650 (0.0028) [2024-07-02 12:26:06,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 207355904. Throughput: 0: 44106.2. Samples: 207501260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:26:06,096][36761] Avg episode reward: [(0, '0.060')] [2024-07-02 12:26:07,094][36999] Updated weights for policy 0, policy_version 12660 (0.0026) [2024-07-02 12:26:10,479][36999] Updated weights for policy 0, policy_version 12670 (0.0021) [2024-07-02 12:26:11,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 207601664. Throughput: 0: 44139.1. Samples: 207764780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 12:26:11,096][36761] Avg episode reward: [(0, '0.060')] [2024-07-02 12:26:14,516][36999] Updated weights for policy 0, policy_version 12680 (0.0030) [2024-07-02 12:26:16,095][36761] Fps is (10 sec: 47513.8, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 207831040. Throughput: 0: 44175.6. Samples: 207901100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-07-02 12:26:16,095][36761] Avg episode reward: [(0, '0.050')] [2024-07-02 12:26:17,829][36999] Updated weights for policy 0, policy_version 12690 (0.0024) [2024-07-02 12:26:21,095][36761] Fps is (10 sec: 42598.2, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 208027648. Throughput: 0: 44300.9. Samples: 208167720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-07-02 12:26:21,096][36761] Avg episode reward: [(0, '0.056')] [2024-07-02 12:26:21,855][36999] Updated weights for policy 0, policy_version 12700 (0.0040) [2024-07-02 12:26:25,231][36999] Updated weights for policy 0, policy_version 12710 (0.0033) [2024-07-02 12:26:26,095][36761] Fps is (10 sec: 45874.6, 60 sec: 44509.8, 300 sec: 44043.1). Total num frames: 208289792. Throughput: 0: 44213.3. Samples: 208427580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-07-02 12:26:26,096][36761] Avg episode reward: [(0, '0.068')] [2024-07-02 12:26:29,264][36999] Updated weights for policy 0, policy_version 12720 (0.0027) [2024-07-02 12:26:31,095][36761] Fps is (10 sec: 47513.6, 60 sec: 44236.7, 300 sec: 44042.4). Total num frames: 208502784. Throughput: 0: 44259.6. Samples: 208565420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-07-02 12:26:31,096][36761] Avg episode reward: [(0, '0.056')] [2024-07-02 12:26:32,495][36999] Updated weights for policy 0, policy_version 12730 (0.0031) [2024-07-02 12:26:36,095][36761] Fps is (10 sec: 40960.1, 60 sec: 44236.8, 300 sec: 43931.7). Total num frames: 208699392. Throughput: 0: 44463.7. Samples: 208831880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 12:26:36,096][36761] Avg episode reward: [(0, '0.050')] [2024-07-02 12:26:36,184][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000012739_208715776.pth... [2024-07-02 12:26:36,226][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000012094_198148096.pth [2024-07-02 12:26:36,555][36999] Updated weights for policy 0, policy_version 12740 (0.0031) [2024-07-02 12:26:36,588][36979] Signal inference workers to stop experience collection... (2900 times) [2024-07-02 12:26:36,588][36979] Signal inference workers to resume experience collection... (2900 times) [2024-07-02 12:26:36,640][36999] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-07-02 12:26:36,640][36999] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-07-02 12:26:40,035][36999] Updated weights for policy 0, policy_version 12750 (0.0037) [2024-07-02 12:26:41,095][36761] Fps is (10 sec: 44236.2, 60 sec: 44509.7, 300 sec: 44097.9). Total num frames: 208945152. Throughput: 0: 44321.1. Samples: 209089240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 12:26:41,100][36761] Avg episode reward: [(0, '0.065')] [2024-07-02 12:26:43,840][36999] Updated weights for policy 0, policy_version 12760 (0.0026) [2024-07-02 12:26:46,095][36761] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 209158144. Throughput: 0: 44347.5. Samples: 209229560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 12:26:46,096][36761] Avg episode reward: [(0, '0.062')] [2024-07-02 12:26:47,234][36999] Updated weights for policy 0, policy_version 12770 (0.0044) [2024-07-02 12:26:51,095][36761] Fps is (10 sec: 42599.4, 60 sec: 44510.1, 300 sec: 44042.4). Total num frames: 209371136. Throughput: 0: 44434.2. Samples: 209500800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 12:26:51,095][36761] Avg episode reward: [(0, '0.062')] [2024-07-02 12:26:51,155][36999] Updated weights for policy 0, policy_version 12780 (0.0036) [2024-07-02 12:26:54,548][36999] Updated weights for policy 0, policy_version 12790 (0.0037) [2024-07-02 12:26:56,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44509.8, 300 sec: 44042.4). Total num frames: 209600512. Throughput: 0: 44404.0. Samples: 209762960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 12:26:56,097][36761] Avg episode reward: [(0, '0.046')] [2024-07-02 12:26:58,424][36999] Updated weights for policy 0, policy_version 12800 (0.0030) [2024-07-02 12:27:01,095][36761] Fps is (10 sec: 45874.9, 60 sec: 44509.8, 300 sec: 44098.3). Total num frames: 209829888. Throughput: 0: 44390.6. Samples: 209898680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 12:27:01,096][36761] Avg episode reward: [(0, '0.046')] [2024-07-02 12:27:01,953][36999] Updated weights for policy 0, policy_version 12810 (0.0027) [2024-07-02 12:27:05,714][36999] Updated weights for policy 0, policy_version 12820 (0.0031) [2024-07-02 12:27:06,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44782.9, 300 sec: 44042.4). Total num frames: 210042880. Throughput: 0: 44393.8. Samples: 210165440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:27:06,096][36761] Avg episode reward: [(0, '0.048')] [2024-07-02 12:27:09,428][36999] Updated weights for policy 0, policy_version 12830 (0.0031) [2024-07-02 12:27:11,096][36761] Fps is (10 sec: 42596.2, 60 sec: 44236.4, 300 sec: 44097.9). Total num frames: 210255872. Throughput: 0: 44488.8. Samples: 210429600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:27:11,096][36761] Avg episode reward: [(0, '0.064')] [2024-07-02 12:27:13,018][36999] Updated weights for policy 0, policy_version 12840 (0.0028) [2024-07-02 12:27:16,096][36761] Fps is (10 sec: 45870.3, 60 sec: 44509.0, 300 sec: 44153.3). Total num frames: 210501632. Throughput: 0: 44347.9. Samples: 210561120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-07-02 12:27:16,097][36761] Avg episode reward: [(0, '0.078')] [2024-07-02 12:27:16,121][36979] Saving new best policy, reward=0.078! [2024-07-02 12:27:16,820][36999] Updated weights for policy 0, policy_version 12850 (0.0040) [2024-07-02 12:27:20,795][36999] Updated weights for policy 0, policy_version 12860 (0.0024) [2024-07-02 12:27:21,095][36761] Fps is (10 sec: 45877.5, 60 sec: 44783.0, 300 sec: 44098.0). Total num frames: 210714624. Throughput: 0: 44392.0. Samples: 210829520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-07-02 12:27:21,096][36761] Avg episode reward: [(0, '0.055')] [2024-07-02 12:27:24,098][36999] Updated weights for policy 0, policy_version 12870 (0.0036) [2024-07-02 12:27:26,095][36761] Fps is (10 sec: 40964.1, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 210911232. Throughput: 0: 44613.9. Samples: 211096860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 12:27:26,096][36761] Avg episode reward: [(0, '0.061')] [2024-07-02 12:27:28,063][36999] Updated weights for policy 0, policy_version 12880 (0.0035) [2024-07-02 12:27:31,095][36761] Fps is (10 sec: 45875.2, 60 sec: 44509.9, 300 sec: 44209.0). Total num frames: 211173376. Throughput: 0: 44385.3. Samples: 211226900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 12:27:31,096][36761] Avg episode reward: [(0, '0.061')] [2024-07-02 12:27:31,817][36999] Updated weights for policy 0, policy_version 12890 (0.0033) [2024-07-02 12:27:35,375][36999] Updated weights for policy 0, policy_version 12900 (0.0032) [2024-07-02 12:27:36,095][36761] Fps is (10 sec: 47514.3, 60 sec: 44783.0, 300 sec: 44209.1). Total num frames: 211386368. Throughput: 0: 44303.5. Samples: 211494460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-07-02 12:27:36,096][36761] Avg episode reward: [(0, '0.066')] [2024-07-02 12:27:39,121][36999] Updated weights for policy 0, policy_version 12910 (0.0032) [2024-07-02 12:27:41,095][36761] Fps is (10 sec: 39321.7, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 211566592. Throughput: 0: 44416.9. Samples: 211761720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-07-02 12:27:41,096][36761] Avg episode reward: [(0, '0.055')] [2024-07-02 12:27:42,737][36999] Updated weights for policy 0, policy_version 12920 (0.0038) [2024-07-02 12:27:46,095][36761] Fps is (10 sec: 44236.4, 60 sec: 44509.9, 300 sec: 44209.0). Total num frames: 211828736. Throughput: 0: 44262.6. Samples: 211890500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 12:27:46,096][36761] Avg episode reward: [(0, '0.069')] [2024-07-02 12:27:46,492][36999] Updated weights for policy 0, policy_version 12930 (0.0047) [2024-07-02 12:27:50,330][36999] Updated weights for policy 0, policy_version 12940 (0.0029) [2024-07-02 12:27:51,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44509.8, 300 sec: 44209.0). Total num frames: 212041728. Throughput: 0: 44117.0. Samples: 212150700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 12:27:51,096][36761] Avg episode reward: [(0, '0.073')] [2024-07-02 12:27:53,874][36999] Updated weights for policy 0, policy_version 12950 (0.0028) [2024-07-02 12:27:56,095][36761] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 44042.4). Total num frames: 212221952. Throughput: 0: 44266.8. Samples: 212421580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 12:27:56,096][36761] Avg episode reward: [(0, '0.073')] [2024-07-02 12:27:57,646][36999] Updated weights for policy 0, policy_version 12960 (0.0046) [2024-07-02 12:28:01,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 212467712. Throughput: 0: 44154.4. Samples: 212548020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 12:28:01,096][36761] Avg episode reward: [(0, '0.064')] [2024-07-02 12:28:01,274][36999] Updated weights for policy 0, policy_version 12970 (0.0044) [2024-07-02 12:28:05,208][36999] Updated weights for policy 0, policy_version 12980 (0.0027) [2024-07-02 12:28:06,095][36761] Fps is (10 sec: 49151.1, 60 sec: 44509.8, 300 sec: 44209.6). Total num frames: 212713472. Throughput: 0: 44066.6. Samples: 212812520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-07-02 12:28:06,096][36761] Avg episode reward: [(0, '0.063')] [2024-07-02 12:28:08,720][36999] Updated weights for policy 0, policy_version 12990 (0.0042) [2024-07-02 12:28:11,096][36761] Fps is (10 sec: 42596.2, 60 sec: 43963.7, 300 sec: 44098.0). Total num frames: 212893696. Throughput: 0: 44100.0. Samples: 213081380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-07-02 12:28:11,096][36761] Avg episode reward: [(0, '0.069')] [2024-07-02 12:28:12,722][36999] Updated weights for policy 0, policy_version 13000 (0.0040) [2024-07-02 12:28:16,034][36999] Updated weights for policy 0, policy_version 13010 (0.0030) [2024-07-02 12:28:16,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44237.6, 300 sec: 44153.5). Total num frames: 213155840. Throughput: 0: 43785.3. Samples: 213197240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 12:28:16,096][36761] Avg episode reward: [(0, '0.082')] [2024-07-02 12:28:16,108][36979] Saving new best policy, reward=0.082! [2024-07-02 12:28:20,233][36999] Updated weights for policy 0, policy_version 13020 (0.0040) [2024-07-02 12:28:21,095][36761] Fps is (10 sec: 47516.0, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 213368832. Throughput: 0: 43985.7. Samples: 213473820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 12:28:21,096][36761] Avg episode reward: [(0, '0.085')] [2024-07-02 12:28:21,100][36979] Saving new best policy, reward=0.085! [2024-07-02 12:28:23,371][36999] Updated weights for policy 0, policy_version 13030 (0.0031) [2024-07-02 12:28:26,095][36761] Fps is (10 sec: 37683.8, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 213532672. Throughput: 0: 44141.9. Samples: 213748100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 12:28:26,095][36761] Avg episode reward: [(0, '0.089')] [2024-07-02 12:28:26,116][36979] Saving new best policy, reward=0.089! [2024-07-02 12:28:26,479][36979] Signal inference workers to stop experience collection... (2950 times) [2024-07-02 12:28:26,532][36979] Signal inference workers to resume experience collection... (2950 times) [2024-07-02 12:28:26,534][36999] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-07-02 12:28:26,552][36999] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-07-02 12:28:27,580][36999] Updated weights for policy 0, policy_version 13040 (0.0032) [2024-07-02 12:28:30,962][36999] Updated weights for policy 0, policy_version 13050 (0.0036) [2024-07-02 12:28:31,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43963.7, 300 sec: 44209.0). Total num frames: 213811200. Throughput: 0: 43876.4. Samples: 213864940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 12:28:31,096][36761] Avg episode reward: [(0, '0.089')] [2024-07-02 12:28:34,818][36999] Updated weights for policy 0, policy_version 13060 (0.0029) [2024-07-02 12:28:36,095][36761] Fps is (10 sec: 50789.5, 60 sec: 44236.7, 300 sec: 44264.6). Total num frames: 214040576. Throughput: 0: 44206.1. Samples: 214139980. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-07-02 12:28:36,096][36761] Avg episode reward: [(0, '0.086')] [2024-07-02 12:28:36,105][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000013064_214040576.pth... [2024-07-02 12:28:36,162][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000012415_203407360.pth [2024-07-02 12:28:38,267][36999] Updated weights for policy 0, policy_version 13070 (0.0028) [2024-07-02 12:28:41,095][36761] Fps is (10 sec: 40960.1, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 214220800. Throughput: 0: 44267.5. Samples: 214413620. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-07-02 12:28:41,096][36761] Avg episode reward: [(0, '0.080')] [2024-07-02 12:28:42,120][36999] Updated weights for policy 0, policy_version 13080 (0.0036) [2024-07-02 12:28:45,631][36999] Updated weights for policy 0, policy_version 13090 (0.0036) [2024-07-02 12:28:46,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43963.7, 300 sec: 44154.2). Total num frames: 214466560. Throughput: 0: 44133.3. Samples: 214534020. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-07-02 12:28:46,096][36761] Avg episode reward: [(0, '0.068')] [2024-07-02 12:28:49,598][36999] Updated weights for policy 0, policy_version 13100 (0.0027) [2024-07-02 12:28:51,095][36761] Fps is (10 sec: 47513.9, 60 sec: 44236.8, 300 sec: 44209.5). Total num frames: 214695936. Throughput: 0: 44171.7. Samples: 214800240. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-07-02 12:28:51,096][36761] Avg episode reward: [(0, '0.050')] [2024-07-02 12:28:52,934][36999] Updated weights for policy 0, policy_version 13110 (0.0021) [2024-07-02 12:28:56,095][36761] Fps is (10 sec: 40960.1, 60 sec: 44236.7, 300 sec: 44153.5). Total num frames: 214876160. Throughput: 0: 44257.8. Samples: 215072960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 12:28:56,096][36761] Avg episode reward: [(0, '0.068')] [2024-07-02 12:28:57,296][36999] Updated weights for policy 0, policy_version 13120 (0.0035) [2024-07-02 12:29:00,221][36999] Updated weights for policy 0, policy_version 13130 (0.0037) [2024-07-02 12:29:01,095][36761] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 215121920. Throughput: 0: 44417.9. Samples: 215196040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 12:29:01,096][36761] Avg episode reward: [(0, '0.068')] [2024-07-02 12:29:04,768][36999] Updated weights for policy 0, policy_version 13140 (0.0032) [2024-07-02 12:29:06,095][36761] Fps is (10 sec: 47514.4, 60 sec: 43963.9, 300 sec: 44154.2). Total num frames: 215351296. Throughput: 0: 44230.8. Samples: 215464200. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-07-02 12:29:06,096][36761] Avg episode reward: [(0, '0.077')] [2024-07-02 12:29:07,722][36999] Updated weights for policy 0, policy_version 13150 (0.0032) [2024-07-02 12:29:11,100][36761] Fps is (10 sec: 42578.9, 60 sec: 44233.8, 300 sec: 44152.8). Total num frames: 215547904. Throughput: 0: 44066.1. Samples: 215731280. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-07-02 12:29:11,100][36761] Avg episode reward: [(0, '0.070')] [2024-07-02 12:29:12,274][36999] Updated weights for policy 0, policy_version 13160 (0.0028) [2024-07-02 12:29:15,070][36999] Updated weights for policy 0, policy_version 13170 (0.0028) [2024-07-02 12:29:16,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 215793664. Throughput: 0: 44210.3. Samples: 215854400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-07-02 12:29:16,096][36761] Avg episode reward: [(0, '0.073')] [2024-07-02 12:29:19,693][36999] Updated weights for policy 0, policy_version 13180 (0.0035) [2024-07-02 12:29:21,095][36761] Fps is (10 sec: 47535.0, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 216023040. Throughput: 0: 44093.4. Samples: 216124180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-07-02 12:29:21,096][36761] Avg episode reward: [(0, '0.065')] [2024-07-02 12:29:22,427][36999] Updated weights for policy 0, policy_version 13190 (0.0031) [2024-07-02 12:29:26,095][36761] Fps is (10 sec: 40960.4, 60 sec: 44509.9, 300 sec: 44098.6). Total num frames: 216203264. Throughput: 0: 43910.8. Samples: 216389600. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-07-02 12:29:26,095][36761] Avg episode reward: [(0, '0.070')] [2024-07-02 12:29:27,090][36999] Updated weights for policy 0, policy_version 13200 (0.0021) [2024-07-02 12:29:30,550][36999] Updated weights for policy 0, policy_version 13210 (0.0023) [2024-07-02 12:29:31,095][36761] Fps is (10 sec: 40960.4, 60 sec: 43690.8, 300 sec: 44153.5). Total num frames: 216432640. Throughput: 0: 43903.7. Samples: 216509680. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-07-02 12:29:31,096][36761] Avg episode reward: [(0, '0.094')] [2024-07-02 12:29:31,096][36979] Saving new best policy, reward=0.094! [2024-07-02 12:29:34,397][36999] Updated weights for policy 0, policy_version 13220 (0.0023) [2024-07-02 12:29:36,095][36761] Fps is (10 sec: 49151.4, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 216694784. Throughput: 0: 44087.1. Samples: 216784160. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-07-02 12:29:36,098][36761] Avg episode reward: [(0, '0.061')] [2024-07-02 12:29:37,845][36999] Updated weights for policy 0, policy_version 13230 (0.0027) [2024-07-02 12:29:40,274][36979] Signal inference workers to stop experience collection... (3000 times) [2024-07-02 12:29:40,274][36979] Signal inference workers to resume experience collection... (3000 times) [2024-07-02 12:29:40,298][36999] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-07-02 12:29:40,298][36999] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-07-02 12:29:41,096][36761] Fps is (10 sec: 44233.1, 60 sec: 44236.2, 300 sec: 44097.8). Total num frames: 216875008. Throughput: 0: 43866.4. Samples: 217046980. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-07-02 12:29:41,097][36761] Avg episode reward: [(0, '0.072')] [2024-07-02 12:29:41,785][36999] Updated weights for policy 0, policy_version 13240 (0.0030) [2024-07-02 12:29:45,173][36999] Updated weights for policy 0, policy_version 13250 (0.0023) [2024-07-02 12:29:46,095][36761] Fps is (10 sec: 39321.8, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 217088000. Throughput: 0: 43859.1. Samples: 217169700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 12:29:46,096][36761] Avg episode reward: [(0, '0.075')] [2024-07-02 12:29:49,410][36999] Updated weights for policy 0, policy_version 13260 (0.0026) [2024-07-02 12:29:51,095][36761] Fps is (10 sec: 47517.6, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 217350144. Throughput: 0: 43948.4. Samples: 217441880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 12:29:51,096][36761] Avg episode reward: [(0, '0.060')] [2024-07-02 12:29:52,670][36999] Updated weights for policy 0, policy_version 13270 (0.0028) [2024-07-02 12:29:56,095][36761] Fps is (10 sec: 44236.6, 60 sec: 44236.8, 300 sec: 44042.8). Total num frames: 217530368. Throughput: 0: 43976.9. Samples: 217710040. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-07-02 12:29:56,096][36761] Avg episode reward: [(0, '0.070')] [2024-07-02 12:29:56,767][36999] Updated weights for policy 0, policy_version 13280 (0.0033) [2024-07-02 12:30:00,167][36999] Updated weights for policy 0, policy_version 13290 (0.0024) [2024-07-02 12:30:01,095][36761] Fps is (10 sec: 39321.2, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 217743360. Throughput: 0: 43990.6. Samples: 217833980. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-07-02 12:30:01,096][36761] Avg episode reward: [(0, '0.067')] [2024-07-02 12:30:04,249][36999] Updated weights for policy 0, policy_version 13300 (0.0025) [2024-07-02 12:30:06,096][36761] Fps is (10 sec: 49151.2, 60 sec: 44509.7, 300 sec: 44264.5). Total num frames: 218021888. Throughput: 0: 44013.6. Samples: 218104800. Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-07-02 12:30:06,096][36761] Avg episode reward: [(0, '0.087')] [2024-07-02 12:30:07,737][36999] Updated weights for policy 0, policy_version 13310 (0.0037) [2024-07-02 12:30:11,095][36761] Fps is (10 sec: 44237.3, 60 sec: 43967.1, 300 sec: 44098.0). Total num frames: 218185728. Throughput: 0: 44047.9. Samples: 218371760. Policy #0 lag: (min: 2.0, avg: 8.7, max: 22.0) [2024-07-02 12:30:11,096][36761] Avg episode reward: [(0, '0.074')] [2024-07-02 12:30:11,595][36999] Updated weights for policy 0, policy_version 13320 (0.0032) [2024-07-02 12:30:15,331][36999] Updated weights for policy 0, policy_version 13330 (0.0039) [2024-07-02 12:30:16,095][36761] Fps is (10 sec: 37684.1, 60 sec: 43417.6, 300 sec: 44153.5). Total num frames: 218398720. Throughput: 0: 43994.7. Samples: 218489440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 12:30:16,096][36761] Avg episode reward: [(0, '0.079')] [2024-07-02 12:30:18,873][36999] Updated weights for policy 0, policy_version 13340 (0.0037) [2024-07-02 12:30:21,095][36761] Fps is (10 sec: 49152.1, 60 sec: 44236.9, 300 sec: 44264.6). Total num frames: 218677248. Throughput: 0: 44031.2. Samples: 218765560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 12:30:21,095][36761] Avg episode reward: [(0, '0.084')] [2024-07-02 12:30:22,684][36999] Updated weights for policy 0, policy_version 13350 (0.0027) [2024-07-02 12:30:26,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44509.8, 300 sec: 44153.5). Total num frames: 218873856. Throughput: 0: 44268.8. Samples: 219039040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 12:30:26,095][36761] Avg episode reward: [(0, '0.085')] [2024-07-02 12:30:26,199][36999] Updated weights for policy 0, policy_version 13360 (0.0027) [2024-07-02 12:30:29,985][36999] Updated weights for policy 0, policy_version 13370 (0.0038) [2024-07-02 12:30:31,095][36761] Fps is (10 sec: 37683.1, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 219054080. Throughput: 0: 44270.2. Samples: 219161860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 12:30:31,095][36761] Avg episode reward: [(0, '0.086')] [2024-07-02 12:30:33,498][36999] Updated weights for policy 0, policy_version 13380 (0.0033) [2024-07-02 12:30:36,095][36761] Fps is (10 sec: 47513.0, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 219348992. Throughput: 0: 44276.3. Samples: 219434320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 12:30:36,096][36761] Avg episode reward: [(0, '0.093')] [2024-07-02 12:30:36,120][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000013388_219348992.pth... [2024-07-02 12:30:36,172][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000012739_208715776.pth [2024-07-02 12:30:37,342][36999] Updated weights for policy 0, policy_version 13390 (0.0030) [2024-07-02 12:30:40,900][36979] Signal inference workers to stop experience collection... (3050 times) [2024-07-02 12:30:40,924][36999] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-07-02 12:30:40,955][36979] Signal inference workers to resume experience collection... (3050 times) [2024-07-02 12:30:40,955][36999] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-07-02 12:30:40,958][36999] Updated weights for policy 0, policy_version 13400 (0.0025) [2024-07-02 12:30:41,095][36761] Fps is (10 sec: 50789.8, 60 sec: 44783.5, 300 sec: 44264.6). Total num frames: 219561984. Throughput: 0: 44164.0. Samples: 219697420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 12:30:41,100][36761] Avg episode reward: [(0, '0.094')] [2024-07-02 12:30:44,751][36999] Updated weights for policy 0, policy_version 13410 (0.0031) [2024-07-02 12:30:46,095][36761] Fps is (10 sec: 37683.4, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 219725824. Throughput: 0: 44145.8. Samples: 219820540. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-07-02 12:30:46,096][36761] Avg episode reward: [(0, '0.096')] [2024-07-02 12:30:46,107][36979] Saving new best policy, reward=0.096! [2024-07-02 12:30:48,350][36999] Updated weights for policy 0, policy_version 13420 (0.0034) [2024-07-02 12:30:51,095][36761] Fps is (10 sec: 44237.0, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 220004352. Throughput: 0: 44127.3. Samples: 220090520. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-07-02 12:30:51,096][36761] Avg episode reward: [(0, '0.096')] [2024-07-02 12:30:52,040][36999] Updated weights for policy 0, policy_version 13430 (0.0045) [2024-07-02 12:30:55,715][36999] Updated weights for policy 0, policy_version 13440 (0.0026) [2024-07-02 12:30:56,095][36761] Fps is (10 sec: 47513.8, 60 sec: 44509.9, 300 sec: 44209.0). Total num frames: 220200960. Throughput: 0: 44032.9. Samples: 220353240. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-07-02 12:30:56,096][36761] Avg episode reward: [(0, '0.096')] [2024-07-02 12:30:59,538][36999] Updated weights for policy 0, policy_version 13450 (0.0035) [2024-07-02 12:31:01,095][36761] Fps is (10 sec: 37683.4, 60 sec: 43963.8, 300 sec: 44153.5). Total num frames: 220381184. Throughput: 0: 44400.9. Samples: 220487480. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-07-02 12:31:01,096][36761] Avg episode reward: [(0, '0.100')] [2024-07-02 12:31:01,202][36979] Saving new best policy, reward=0.100! [2024-07-02 12:31:03,038][36999] Updated weights for policy 0, policy_version 13460 (0.0025) [2024-07-02 12:31:06,095][36761] Fps is (10 sec: 45874.8, 60 sec: 43963.8, 300 sec: 44264.6). Total num frames: 220659712. Throughput: 0: 44099.9. Samples: 220750060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 12:31:06,096][36761] Avg episode reward: [(0, '0.084')] [2024-07-02 12:31:06,843][36999] Updated weights for policy 0, policy_version 13470 (0.0026) [2024-07-02 12:31:10,473][36999] Updated weights for policy 0, policy_version 13480 (0.0036) [2024-07-02 12:31:11,095][36761] Fps is (10 sec: 49152.0, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 220872704. Throughput: 0: 44047.1. Samples: 221021160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 12:31:11,096][36761] Avg episode reward: [(0, '0.065')] [2024-07-02 12:31:14,815][36999] Updated weights for policy 0, policy_version 13490 (0.0031) [2024-07-02 12:31:16,095][36761] Fps is (10 sec: 40960.6, 60 sec: 44509.9, 300 sec: 44209.1). Total num frames: 221069312. Throughput: 0: 44358.2. Samples: 221157980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:31:16,096][36761] Avg episode reward: [(0, '0.076')] [2024-07-02 12:31:18,051][36999] Updated weights for policy 0, policy_version 13500 (0.0029) [2024-07-02 12:31:21,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 221315072. Throughput: 0: 44024.6. Samples: 221415420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:31:21,096][36761] Avg episode reward: [(0, '0.094')] [2024-07-02 12:31:22,216][36999] Updated weights for policy 0, policy_version 13510 (0.0030) [2024-07-02 12:31:25,531][36999] Updated weights for policy 0, policy_version 13520 (0.0034) [2024-07-02 12:31:26,095][36761] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 221528064. Throughput: 0: 44178.8. Samples: 221685460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-07-02 12:31:26,096][36761] Avg episode reward: [(0, '0.094')] [2024-07-02 12:31:29,710][36999] Updated weights for policy 0, policy_version 13530 (0.0038) [2024-07-02 12:31:31,095][36761] Fps is (10 sec: 40959.6, 60 sec: 44509.8, 300 sec: 44153.5). Total num frames: 221724672. Throughput: 0: 44507.5. Samples: 221823380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-07-02 12:31:31,096][36761] Avg episode reward: [(0, '0.063')] [2024-07-02 12:31:32,802][36999] Updated weights for policy 0, policy_version 13540 (0.0025) [2024-07-02 12:31:36,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43690.7, 300 sec: 44153.5). Total num frames: 221970432. Throughput: 0: 44176.4. Samples: 222078460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-07-02 12:31:36,096][36761] Avg episode reward: [(0, '0.062')] [2024-07-02 12:31:37,278][36999] Updated weights for policy 0, policy_version 13550 (0.0030) [2024-07-02 12:31:40,173][36979] Signal inference workers to stop experience collection... (3100 times) [2024-07-02 12:31:40,213][36999] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-07-02 12:31:40,221][36979] Signal inference workers to resume experience collection... (3100 times) [2024-07-02 12:31:40,231][36999] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-07-02 12:31:40,233][36999] Updated weights for policy 0, policy_version 13560 (0.0027) [2024-07-02 12:31:41,095][36761] Fps is (10 sec: 47514.0, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 222199808. Throughput: 0: 44375.1. Samples: 222350120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-07-02 12:31:41,096][36761] Avg episode reward: [(0, '0.069')] [2024-07-02 12:31:44,919][36999] Updated weights for policy 0, policy_version 13570 (0.0026) [2024-07-02 12:31:46,095][36761] Fps is (10 sec: 40960.2, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 222380032. Throughput: 0: 44411.5. Samples: 222486000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-07-02 12:31:46,096][36761] Avg episode reward: [(0, '0.078')] [2024-07-02 12:31:47,627][36999] Updated weights for policy 0, policy_version 13580 (0.0025) [2024-07-02 12:31:51,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43690.7, 300 sec: 44153.5). Total num frames: 222625792. Throughput: 0: 44205.4. Samples: 222739300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-07-02 12:31:51,096][36761] Avg episode reward: [(0, '0.091')] [2024-07-02 12:31:52,311][36999] Updated weights for policy 0, policy_version 13590 (0.0023) [2024-07-02 12:31:54,895][36999] Updated weights for policy 0, policy_version 13600 (0.0034) [2024-07-02 12:31:56,095][36761] Fps is (10 sec: 50790.6, 60 sec: 44783.0, 300 sec: 44264.6). Total num frames: 222887936. Throughput: 0: 44216.0. Samples: 223010880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 12:31:56,095][36761] Avg episode reward: [(0, '0.103')] [2024-07-02 12:31:56,185][36979] Saving new best policy, reward=0.103! [2024-07-02 12:31:59,850][36999] Updated weights for policy 0, policy_version 13610 (0.0032) [2024-07-02 12:32:01,095][36761] Fps is (10 sec: 44237.0, 60 sec: 44782.9, 300 sec: 44153.5). Total num frames: 223068160. Throughput: 0: 44222.7. Samples: 223148000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 12:32:01,096][36761] Avg episode reward: [(0, '0.112')] [2024-07-02 12:32:01,100][36979] Saving new best policy, reward=0.112! [2024-07-02 12:32:02,345][36999] Updated weights for policy 0, policy_version 13620 (0.0036) [2024-07-02 12:32:06,095][36761] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 44153.6). Total num frames: 223281152. Throughput: 0: 44184.9. Samples: 223403740. Policy #0 lag: (min: 0.0, avg: 13.9, max: 24.0) [2024-07-02 12:32:06,096][36761] Avg episode reward: [(0, '0.102')] [2024-07-02 12:32:07,345][36999] Updated weights for policy 0, policy_version 13630 (0.0032) [2024-07-02 12:32:09,676][36999] Updated weights for policy 0, policy_version 13640 (0.0042) [2024-07-02 12:32:11,095][36761] Fps is (10 sec: 47513.6, 60 sec: 44509.9, 300 sec: 44209.2). Total num frames: 223543296. Throughput: 0: 44047.6. Samples: 223667600. Policy #0 lag: (min: 0.0, avg: 13.9, max: 24.0) [2024-07-02 12:32:11,095][36761] Avg episode reward: [(0, '0.114')] [2024-07-02 12:32:11,189][36979] Saving new best policy, reward=0.114! [2024-07-02 12:32:14,673][36999] Updated weights for policy 0, policy_version 13650 (0.0026) [2024-07-02 12:32:16,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 223723520. Throughput: 0: 44239.1. Samples: 223814140. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-07-02 12:32:16,096][36761] Avg episode reward: [(0, '0.112')] [2024-07-02 12:32:17,232][36999] Updated weights for policy 0, policy_version 13660 (0.0050) [2024-07-02 12:32:21,095][36761] Fps is (10 sec: 39321.1, 60 sec: 43690.6, 300 sec: 44153.5). Total num frames: 223936512. Throughput: 0: 44239.1. Samples: 224069220. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-07-02 12:32:21,096][36761] Avg episode reward: [(0, '0.101')] [2024-07-02 12:32:22,020][36999] Updated weights for policy 0, policy_version 13670 (0.0038) [2024-07-02 12:32:24,490][36999] Updated weights for policy 0, policy_version 13680 (0.0029) [2024-07-02 12:32:26,095][36761] Fps is (10 sec: 49151.9, 60 sec: 44782.8, 300 sec: 44209.0). Total num frames: 224215040. Throughput: 0: 44031.4. Samples: 224331540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-07-02 12:32:26,096][36761] Avg episode reward: [(0, '0.114')] [2024-07-02 12:32:29,387][36999] Updated weights for policy 0, policy_version 13690 (0.0054) [2024-07-02 12:32:31,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 224378880. Throughput: 0: 44266.1. Samples: 224477980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-07-02 12:32:31,096][36761] Avg episode reward: [(0, '0.123')] [2024-07-02 12:32:31,096][36979] Saving new best policy, reward=0.123! [2024-07-02 12:32:31,914][36999] Updated weights for policy 0, policy_version 13700 (0.0031) [2024-07-02 12:32:36,095][36761] Fps is (10 sec: 36044.8, 60 sec: 43417.6, 300 sec: 44097.9). Total num frames: 224575488. Throughput: 0: 44217.7. Samples: 224729100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-07-02 12:32:36,096][36761] Avg episode reward: [(0, '0.115')] [2024-07-02 12:32:36,155][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000013708_224591872.pth... [2024-07-02 12:32:36,207][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000013064_214040576.pth [2024-07-02 12:32:36,707][36999] Updated weights for policy 0, policy_version 13710 (0.0031) [2024-07-02 12:32:38,831][36979] Signal inference workers to stop experience collection... (3150 times) [2024-07-02 12:32:38,831][36979] Signal inference workers to resume experience collection... (3150 times) [2024-07-02 12:32:38,872][36999] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-07-02 12:32:38,872][36999] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-07-02 12:32:39,277][36999] Updated weights for policy 0, policy_version 13720 (0.0027) [2024-07-02 12:32:41,095][36761] Fps is (10 sec: 50790.5, 60 sec: 44782.9, 300 sec: 44264.6). Total num frames: 224886784. Throughput: 0: 43984.8. Samples: 224990200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 12:32:41,096][36761] Avg episode reward: [(0, '0.101')] [2024-07-02 12:32:44,629][36999] Updated weights for policy 0, policy_version 13730 (0.0039) [2024-07-02 12:32:46,096][36761] Fps is (10 sec: 47513.1, 60 sec: 44509.7, 300 sec: 44097.9). Total num frames: 225050624. Throughput: 0: 44208.7. Samples: 225137400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 12:32:46,096][36761] Avg episode reward: [(0, '0.096')] [2024-07-02 12:32:46,739][36999] Updated weights for policy 0, policy_version 13740 (0.0037) [2024-07-02 12:32:51,095][36761] Fps is (10 sec: 34406.3, 60 sec: 43417.5, 300 sec: 44097.9). Total num frames: 225230848. Throughput: 0: 44096.4. Samples: 225388080. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-07-02 12:32:51,096][36761] Avg episode reward: [(0, '0.102')] [2024-07-02 12:32:52,095][36999] Updated weights for policy 0, policy_version 13750 (0.0030) [2024-07-02 12:32:54,139][36999] Updated weights for policy 0, policy_version 13760 (0.0035) [2024-07-02 12:32:56,095][36761] Fps is (10 sec: 49152.3, 60 sec: 44236.7, 300 sec: 44320.1). Total num frames: 225542144. Throughput: 0: 43929.2. Samples: 225644420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-07-02 12:32:56,096][36761] Avg episode reward: [(0, '0.110')] [2024-07-02 12:32:59,446][36999] Updated weights for policy 0, policy_version 13770 (0.0038) [2024-07-02 12:33:01,095][36761] Fps is (10 sec: 49152.1, 60 sec: 44236.7, 300 sec: 44098.0). Total num frames: 225722368. Throughput: 0: 44112.0. Samples: 225799180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:33:01,096][36761] Avg episode reward: [(0, '0.104')] [2024-07-02 12:33:01,673][36999] Updated weights for policy 0, policy_version 13780 (0.0027) [2024-07-02 12:33:06,095][36761] Fps is (10 sec: 36045.0, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 225902592. Throughput: 0: 44147.6. Samples: 226055860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:33:06,096][36761] Avg episode reward: [(0, '0.100')] [2024-07-02 12:33:06,905][36999] Updated weights for policy 0, policy_version 13790 (0.0047) [2024-07-02 12:33:09,194][36999] Updated weights for policy 0, policy_version 13800 (0.0032) [2024-07-02 12:33:11,095][36761] Fps is (10 sec: 49152.2, 60 sec: 44509.8, 300 sec: 44264.6). Total num frames: 226213888. Throughput: 0: 43927.2. Samples: 226308260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-07-02 12:33:11,096][36761] Avg episode reward: [(0, '0.109')] [2024-07-02 12:33:14,496][36999] Updated weights for policy 0, policy_version 13810 (0.0034) [2024-07-02 12:33:16,095][36761] Fps is (10 sec: 47514.0, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 226377728. Throughput: 0: 44065.4. Samples: 226460920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-07-02 12:33:16,096][36761] Avg episode reward: [(0, '0.097')] [2024-07-02 12:33:16,613][36999] Updated weights for policy 0, policy_version 13820 (0.0026) [2024-07-02 12:33:21,100][36761] Fps is (10 sec: 34390.7, 60 sec: 43687.4, 300 sec: 44152.8). Total num frames: 226557952. Throughput: 0: 44164.9. Samples: 226716720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 12:33:21,100][36761] Avg episode reward: [(0, '0.097')] [2024-07-02 12:33:22,014][36999] Updated weights for policy 0, policy_version 13830 (0.0027) [2024-07-02 12:33:22,710][36979] Signal inference workers to stop experience collection... (3200 times) [2024-07-02 12:33:22,710][36979] Signal inference workers to resume experience collection... (3200 times) [2024-07-02 12:33:22,739][36999] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-07-02 12:33:22,740][36999] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-07-02 12:33:23,991][36999] Updated weights for policy 0, policy_version 13840 (0.0024) [2024-07-02 12:33:26,095][36761] Fps is (10 sec: 49151.7, 60 sec: 44236.8, 300 sec: 44264.6). Total num frames: 226869248. Throughput: 0: 44047.6. Samples: 226972340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 12:33:26,104][36761] Avg episode reward: [(0, '0.117')] [2024-07-02 12:33:29,492][36999] Updated weights for policy 0, policy_version 13850 (0.0035) [2024-07-02 12:33:31,100][36761] Fps is (10 sec: 50790.2, 60 sec: 44779.5, 300 sec: 44152.8). Total num frames: 227065856. Throughput: 0: 44275.6. Samples: 227130000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:33:31,101][36761] Avg episode reward: [(0, '0.130')] [2024-07-02 12:33:31,101][36979] Saving new best policy, reward=0.130! [2024-07-02 12:33:31,385][36999] Updated weights for policy 0, policy_version 13860 (0.0036) [2024-07-02 12:33:36,095][36761] Fps is (10 sec: 34406.6, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 227213312. Throughput: 0: 44321.4. Samples: 227382540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:33:36,096][36761] Avg episode reward: [(0, '0.118')] [2024-07-02 12:33:36,794][36999] Updated weights for policy 0, policy_version 13870 (0.0022) [2024-07-02 12:33:38,697][36999] Updated weights for policy 0, policy_version 13880 (0.0028) [2024-07-02 12:33:41,095][36761] Fps is (10 sec: 45896.2, 60 sec: 43963.7, 300 sec: 44264.6). Total num frames: 227524608. Throughput: 0: 44334.7. Samples: 227639480. Policy #0 lag: (min: 0.0, avg: 13.9, max: 21.0) [2024-07-02 12:33:41,096][36761] Avg episode reward: [(0, '0.097')] [2024-07-02 12:33:44,050][36999] Updated weights for policy 0, policy_version 13890 (0.0029) [2024-07-02 12:33:46,005][36999] Updated weights for policy 0, policy_version 13900 (0.0031) [2024-07-02 12:33:46,095][36761] Fps is (10 sec: 52428.4, 60 sec: 44783.0, 300 sec: 44209.0). Total num frames: 227737600. Throughput: 0: 44449.3. Samples: 227799400. Policy #0 lag: (min: 0.0, avg: 13.9, max: 21.0) [2024-07-02 12:33:46,096][36761] Avg episode reward: [(0, '0.117')] [2024-07-02 12:33:51,095][36761] Fps is (10 sec: 34406.8, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 227868672. Throughput: 0: 44350.3. Samples: 228051620. Policy #0 lag: (min: 0.0, avg: 13.9, max: 21.0) [2024-07-02 12:33:51,095][36761] Avg episode reward: [(0, '0.114')] [2024-07-02 12:33:51,340][36999] Updated weights for policy 0, policy_version 13910 (0.0036) [2024-07-02 12:33:53,397][36999] Updated weights for policy 0, policy_version 13920 (0.0036) [2024-07-02 12:33:56,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43963.8, 300 sec: 44264.6). Total num frames: 228179968. Throughput: 0: 44439.2. Samples: 228308020. Policy #0 lag: (min: 0.0, avg: 6.2, max: 21.0) [2024-07-02 12:33:56,096][36761] Avg episode reward: [(0, '0.118')] [2024-07-02 12:33:58,860][36999] Updated weights for policy 0, policy_version 13930 (0.0035) [2024-07-02 12:34:00,921][36999] Updated weights for policy 0, policy_version 13940 (0.0035) [2024-07-02 12:34:01,095][36761] Fps is (10 sec: 54066.5, 60 sec: 44782.9, 300 sec: 44264.5). Total num frames: 228409344. Throughput: 0: 44411.0. Samples: 228459420. Policy #0 lag: (min: 0.0, avg: 6.2, max: 21.0) [2024-07-02 12:34:01,096][36761] Avg episode reward: [(0, '0.103')] [2024-07-02 12:34:06,095][36761] Fps is (10 sec: 36044.8, 60 sec: 43963.8, 300 sec: 44043.1). Total num frames: 228540416. Throughput: 0: 44526.3. Samples: 228720200. Policy #0 lag: (min: 0.0, avg: 14.0, max: 31.0) [2024-07-02 12:34:06,096][36761] Avg episode reward: [(0, '0.092')] [2024-07-02 12:34:06,115][36999] Updated weights for policy 0, policy_version 13950 (0.0030) [2024-07-02 12:34:08,117][36999] Updated weights for policy 0, policy_version 13960 (0.0023) [2024-07-02 12:34:09,004][36979] Signal inference workers to stop experience collection... (3250 times) [2024-07-02 12:34:09,005][36979] Signal inference workers to resume experience collection... (3250 times) [2024-07-02 12:34:09,021][36999] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-07-02 12:34:09,021][36999] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-07-02 12:34:11,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 228835328. Throughput: 0: 44639.6. Samples: 228981120. Policy #0 lag: (min: 0.0, avg: 14.0, max: 31.0) [2024-07-02 12:34:11,096][36761] Avg episode reward: [(0, '0.092')] [2024-07-02 12:34:13,440][36999] Updated weights for policy 0, policy_version 13970 (0.0025) [2024-07-02 12:34:15,540][36999] Updated weights for policy 0, policy_version 13980 (0.0029) [2024-07-02 12:34:16,095][36761] Fps is (10 sec: 52429.2, 60 sec: 44783.0, 300 sec: 44209.1). Total num frames: 229064704. Throughput: 0: 44356.2. Samples: 229125820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:34:16,095][36761] Avg episode reward: [(0, '0.111')] [2024-07-02 12:34:20,746][36999] Updated weights for policy 0, policy_version 13990 (0.0040) [2024-07-02 12:34:21,095][36761] Fps is (10 sec: 39321.5, 60 sec: 44513.2, 300 sec: 44153.5). Total num frames: 229228544. Throughput: 0: 44615.1. Samples: 229390220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:34:21,096][36761] Avg episode reward: [(0, '0.126')] [2024-07-02 12:34:23,013][36999] Updated weights for policy 0, policy_version 14000 (0.0028) [2024-07-02 12:34:26,095][36761] Fps is (10 sec: 40959.2, 60 sec: 43417.5, 300 sec: 44209.0). Total num frames: 229474304. Throughput: 0: 44540.4. Samples: 229643800. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-07-02 12:34:26,096][36761] Avg episode reward: [(0, '0.125')] [2024-07-02 12:34:28,154][36999] Updated weights for policy 0, policy_version 14010 (0.0022) [2024-07-02 12:34:30,460][36999] Updated weights for policy 0, policy_version 14020 (0.0036) [2024-07-02 12:34:31,100][36761] Fps is (10 sec: 50767.4, 60 sec: 44509.9, 300 sec: 44208.4). Total num frames: 229736448. Throughput: 0: 44113.4. Samples: 229784700. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-07-02 12:34:31,100][36761] Avg episode reward: [(0, '0.108')] [2024-07-02 12:34:35,830][36999] Updated weights for policy 0, policy_version 14030 (0.0031) [2024-07-02 12:34:36,095][36761] Fps is (10 sec: 40960.0, 60 sec: 44509.8, 300 sec: 44098.1). Total num frames: 229883904. Throughput: 0: 44366.1. Samples: 230048100. Policy #0 lag: (min: 0.0, avg: 12.8, max: 26.0) [2024-07-02 12:34:36,096][36761] Avg episode reward: [(0, '0.112')] [2024-07-02 12:34:36,144][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000014032_229900288.pth... [2024-07-02 12:34:36,191][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000013388_219348992.pth [2024-07-02 12:34:37,845][36999] Updated weights for policy 0, policy_version 14040 (0.0029) [2024-07-02 12:34:41,095][36761] Fps is (10 sec: 39339.3, 60 sec: 43417.6, 300 sec: 44209.0). Total num frames: 230129664. Throughput: 0: 44384.8. Samples: 230305340. Policy #0 lag: (min: 0.0, avg: 12.8, max: 26.0) [2024-07-02 12:34:41,096][36761] Avg episode reward: [(0, '0.112')] [2024-07-02 12:34:43,234][36999] Updated weights for policy 0, policy_version 14050 (0.0025) [2024-07-02 12:34:45,167][36999] Updated weights for policy 0, policy_version 14060 (0.0035) [2024-07-02 12:34:46,095][36761] Fps is (10 sec: 50791.2, 60 sec: 44236.9, 300 sec: 44209.0). Total num frames: 230391808. Throughput: 0: 44152.6. Samples: 230446280. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-07-02 12:34:46,096][36761] Avg episode reward: [(0, '0.113')] [2024-07-02 12:34:50,637][36999] Updated weights for policy 0, policy_version 14070 (0.0041) [2024-07-02 12:34:51,095][36761] Fps is (10 sec: 42598.6, 60 sec: 44782.9, 300 sec: 44153.5). Total num frames: 230555648. Throughput: 0: 44319.5. Samples: 230714580. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-07-02 12:34:51,096][36761] Avg episode reward: [(0, '0.118')] [2024-07-02 12:34:52,707][36999] Updated weights for policy 0, policy_version 14080 (0.0034) [2024-07-02 12:34:56,100][36761] Fps is (10 sec: 39303.5, 60 sec: 43414.3, 300 sec: 44208.4). Total num frames: 230785024. Throughput: 0: 44185.7. Samples: 230969680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-07-02 12:34:56,100][36761] Avg episode reward: [(0, '0.112')] [2024-07-02 12:34:57,939][36999] Updated weights for policy 0, policy_version 14090 (0.0033) [2024-07-02 12:35:00,067][36999] Updated weights for policy 0, policy_version 14100 (0.0035) [2024-07-02 12:35:01,095][36761] Fps is (10 sec: 50790.1, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 231063552. Throughput: 0: 44021.2. Samples: 231106780. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-07-02 12:35:01,099][36761] Avg episode reward: [(0, '0.105')] [2024-07-02 12:35:05,469][36979] Signal inference workers to stop experience collection... (3300 times) [2024-07-02 12:35:05,469][36979] Signal inference workers to resume experience collection... (3300 times) [2024-07-02 12:35:05,473][36999] Updated weights for policy 0, policy_version 14110 (0.0045) [2024-07-02 12:35:05,506][36999] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-07-02 12:35:05,506][36999] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-07-02 12:35:06,095][36761] Fps is (10 sec: 44256.9, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 231227392. Throughput: 0: 44172.9. Samples: 231378000. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-07-02 12:35:06,096][36761] Avg episode reward: [(0, '0.117')] [2024-07-02 12:35:07,625][36999] Updated weights for policy 0, policy_version 14120 (0.0040) [2024-07-02 12:35:11,095][36761] Fps is (10 sec: 37683.0, 60 sec: 43417.5, 300 sec: 44209.0). Total num frames: 231440384. Throughput: 0: 44039.1. Samples: 231625560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 12:35:11,096][36761] Avg episode reward: [(0, '0.134')] [2024-07-02 12:35:11,212][36979] Saving new best policy, reward=0.134! [2024-07-02 12:35:12,866][36999] Updated weights for policy 0, policy_version 14130 (0.0025) [2024-07-02 12:35:15,058][36999] Updated weights for policy 0, policy_version 14140 (0.0030) [2024-07-02 12:35:16,095][36761] Fps is (10 sec: 49152.4, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 231718912. Throughput: 0: 43960.9. Samples: 231762740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 12:35:16,095][36761] Avg episode reward: [(0, '0.134')] [2024-07-02 12:35:20,254][36999] Updated weights for policy 0, policy_version 14150 (0.0038) [2024-07-02 12:35:21,095][36761] Fps is (10 sec: 44237.1, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 231882752. Throughput: 0: 44210.7. Samples: 232037580. Policy #0 lag: (min: 0.0, avg: 13.0, max: 20.0) [2024-07-02 12:35:21,096][36761] Avg episode reward: [(0, '0.124')] [2024-07-02 12:35:22,535][36999] Updated weights for policy 0, policy_version 14160 (0.0041) [2024-07-02 12:35:26,095][36761] Fps is (10 sec: 39321.0, 60 sec: 43963.7, 300 sec: 44264.5). Total num frames: 232112128. Throughput: 0: 44120.0. Samples: 232290740. Policy #0 lag: (min: 0.0, avg: 13.0, max: 20.0) [2024-07-02 12:35:26,098][36761] Avg episode reward: [(0, '0.109')] [2024-07-02 12:35:27,584][36999] Updated weights for policy 0, policy_version 14170 (0.0029) [2024-07-02 12:35:30,004][36999] Updated weights for policy 0, policy_version 14180 (0.0036) [2024-07-02 12:35:31,095][36761] Fps is (10 sec: 52429.3, 60 sec: 44513.3, 300 sec: 44264.6). Total num frames: 232407040. Throughput: 0: 43908.4. Samples: 232422160. Policy #0 lag: (min: 1.0, avg: 6.7, max: 21.0) [2024-07-02 12:35:31,096][36761] Avg episode reward: [(0, '0.114')] [2024-07-02 12:35:35,131][36999] Updated weights for policy 0, policy_version 14190 (0.0037) [2024-07-02 12:35:36,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44509.9, 300 sec: 44042.4). Total num frames: 232554496. Throughput: 0: 44048.8. Samples: 232696780. Policy #0 lag: (min: 1.0, avg: 6.7, max: 21.0) [2024-07-02 12:35:36,096][36761] Avg episode reward: [(0, '0.118')] [2024-07-02 12:35:37,451][36999] Updated weights for policy 0, policy_version 14200 (0.0034) [2024-07-02 12:35:41,095][36761] Fps is (10 sec: 36044.7, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 232767488. Throughput: 0: 43951.6. Samples: 232947300. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-07-02 12:35:41,096][36761] Avg episode reward: [(0, '0.121')] [2024-07-02 12:35:42,625][36999] Updated weights for policy 0, policy_version 14210 (0.0025) [2024-07-02 12:35:44,874][36999] Updated weights for policy 0, policy_version 14220 (0.0027) [2024-07-02 12:35:46,095][36761] Fps is (10 sec: 49153.0, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 233046016. Throughput: 0: 43955.7. Samples: 233084780. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-07-02 12:35:46,096][36761] Avg episode reward: [(0, '0.121')] [2024-07-02 12:35:50,098][36999] Updated weights for policy 0, policy_version 14230 (0.0026) [2024-07-02 12:35:51,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 233209856. Throughput: 0: 44042.6. Samples: 233359920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-07-02 12:35:51,096][36761] Avg episode reward: [(0, '0.134')] [2024-07-02 12:35:52,201][36999] Updated weights for policy 0, policy_version 14240 (0.0025) [2024-07-02 12:35:56,095][36761] Fps is (10 sec: 37682.9, 60 sec: 43967.1, 300 sec: 44209.0). Total num frames: 233422848. Throughput: 0: 44137.0. Samples: 233611720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-07-02 12:35:56,096][36761] Avg episode reward: [(0, '0.134')] [2024-07-02 12:35:57,341][36979] Signal inference workers to stop experience collection... (3350 times) [2024-07-02 12:35:57,387][36999] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-07-02 12:35:57,452][36979] Signal inference workers to resume experience collection... (3350 times) [2024-07-02 12:35:57,453][36999] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-07-02 12:35:57,454][36999] Updated weights for policy 0, policy_version 14250 (0.0029) [2024-07-02 12:35:59,800][36999] Updated weights for policy 0, policy_version 14260 (0.0038) [2024-07-02 12:36:01,095][36761] Fps is (10 sec: 50790.1, 60 sec: 44236.8, 300 sec: 44264.6). Total num frames: 233717760. Throughput: 0: 44065.1. Samples: 233745680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 12:36:01,096][36761] Avg episode reward: [(0, '0.121')] [2024-07-02 12:36:04,721][36999] Updated weights for policy 0, policy_version 14270 (0.0035) [2024-07-02 12:36:06,095][36761] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 233881600. Throughput: 0: 44003.6. Samples: 234017740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 12:36:06,096][36761] Avg episode reward: [(0, '0.104')] [2024-07-02 12:36:07,137][36999] Updated weights for policy 0, policy_version 14280 (0.0024) [2024-07-02 12:36:11,095][36761] Fps is (10 sec: 36045.1, 60 sec: 43963.8, 300 sec: 44097.9). Total num frames: 234078208. Throughput: 0: 44294.3. Samples: 234283980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 12:36:11,096][36761] Avg episode reward: [(0, '0.112')] [2024-07-02 12:36:12,029][36999] Updated weights for policy 0, policy_version 14290 (0.0036) [2024-07-02 12:36:14,487][36999] Updated weights for policy 0, policy_version 14300 (0.0030) [2024-07-02 12:36:16,095][36761] Fps is (10 sec: 47513.0, 60 sec: 43963.6, 300 sec: 44209.0). Total num frames: 234356736. Throughput: 0: 44044.3. Samples: 234404160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 12:36:16,096][36761] Avg episode reward: [(0, '0.125')] [2024-07-02 12:36:19,334][36999] Updated weights for policy 0, policy_version 14310 (0.0030) [2024-07-02 12:36:21,095][36761] Fps is (10 sec: 49152.3, 60 sec: 44783.0, 300 sec: 44209.0). Total num frames: 234569728. Throughput: 0: 44099.7. Samples: 234681260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 12:36:21,096][36761] Avg episode reward: [(0, '0.169')] [2024-07-02 12:36:21,098][36979] Saving new best policy, reward=0.169! [2024-07-02 12:36:21,746][36999] Updated weights for policy 0, policy_version 14320 (0.0026) [2024-07-02 12:36:26,095][36761] Fps is (10 sec: 37683.8, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 234733568. Throughput: 0: 44500.9. Samples: 234949840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 12:36:26,095][36761] Avg episode reward: [(0, '0.171')] [2024-07-02 12:36:26,174][36979] Saving new best policy, reward=0.171! [2024-07-02 12:36:26,715][36999] Updated weights for policy 0, policy_version 14330 (0.0039) [2024-07-02 12:36:29,023][36999] Updated weights for policy 0, policy_version 14340 (0.0032) [2024-07-02 12:36:31,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 44209.0). Total num frames: 235012096. Throughput: 0: 44063.5. Samples: 235067640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 12:36:31,096][36761] Avg episode reward: [(0, '0.172')] [2024-07-02 12:36:31,140][36979] Saving new best policy, reward=0.172! [2024-07-02 12:36:33,923][36999] Updated weights for policy 0, policy_version 14350 (0.0042) [2024-07-02 12:36:36,095][36761] Fps is (10 sec: 50790.3, 60 sec: 44783.1, 300 sec: 44209.0). Total num frames: 235241472. Throughput: 0: 44112.6. Samples: 235344980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 12:36:36,095][36761] Avg episode reward: [(0, '0.172')] [2024-07-02 12:36:36,189][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000014359_235257856.pth... [2024-07-02 12:36:36,240][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000013708_224591872.pth [2024-07-02 12:36:36,385][36999] Updated weights for policy 0, policy_version 14360 (0.0026) [2024-07-02 12:36:41,095][36761] Fps is (10 sec: 40960.1, 60 sec: 44236.9, 300 sec: 44209.0). Total num frames: 235421696. Throughput: 0: 44549.4. Samples: 235616440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 12:36:41,095][36761] Avg episode reward: [(0, '0.121')] [2024-07-02 12:36:41,153][36999] Updated weights for policy 0, policy_version 14370 (0.0031) [2024-07-02 12:36:43,686][36999] Updated weights for policy 0, policy_version 14380 (0.0030) [2024-07-02 12:36:46,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43963.7, 300 sec: 44264.6). Total num frames: 235683840. Throughput: 0: 44093.0. Samples: 235729860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) [2024-07-02 12:36:46,100][36761] Avg episode reward: [(0, '0.105')] [2024-07-02 12:36:48,707][36999] Updated weights for policy 0, policy_version 14390 (0.0027) [2024-07-02 12:36:49,849][36979] Signal inference workers to stop experience collection... (3400 times) [2024-07-02 12:36:49,903][36999] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-07-02 12:36:49,964][36979] Signal inference workers to resume experience collection... (3400 times) [2024-07-02 12:36:49,964][36999] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-07-02 12:36:51,083][36999] Updated weights for policy 0, policy_version 14400 (0.0028) [2024-07-02 12:36:51,095][36761] Fps is (10 sec: 50789.7, 60 sec: 45329.1, 300 sec: 44209.0). Total num frames: 235929600. Throughput: 0: 44163.0. Samples: 236005080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) [2024-07-02 12:36:51,096][36761] Avg episode reward: [(0, '0.125')] [2024-07-02 12:36:56,095][36761] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 236077056. Throughput: 0: 44348.9. Samples: 236279680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 12:36:56,096][36761] Avg episode reward: [(0, '0.106')] [2024-07-02 12:36:56,252][36999] Updated weights for policy 0, policy_version 14410 (0.0037) [2024-07-02 12:36:58,605][36999] Updated weights for policy 0, policy_version 14420 (0.0032) [2024-07-02 12:37:01,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43690.8, 300 sec: 44264.6). Total num frames: 236339200. Throughput: 0: 44199.2. Samples: 236393120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 12:37:01,096][36761] Avg episode reward: [(0, '0.110')] [2024-07-02 12:37:03,606][36999] Updated weights for policy 0, policy_version 14430 (0.0034) [2024-07-02 12:37:06,050][36999] Updated weights for policy 0, policy_version 14440 (0.0025) [2024-07-02 12:37:06,095][36761] Fps is (10 sec: 50790.1, 60 sec: 45055.9, 300 sec: 44209.0). Total num frames: 236584960. Throughput: 0: 44115.9. Samples: 236666480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-07-02 12:37:06,096][36761] Avg episode reward: [(0, '0.127')] [2024-07-02 12:37:10,935][36999] Updated weights for policy 0, policy_version 14450 (0.0027) [2024-07-02 12:37:11,095][36761] Fps is (10 sec: 40959.8, 60 sec: 44509.9, 300 sec: 44153.5). Total num frames: 236748800. Throughput: 0: 44183.0. Samples: 236938080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-07-02 12:37:11,096][36761] Avg episode reward: [(0, '0.141')] [2024-07-02 12:37:13,438][36999] Updated weights for policy 0, policy_version 14460 (0.0038) [2024-07-02 12:37:16,098][36761] Fps is (10 sec: 40948.1, 60 sec: 43961.6, 300 sec: 44264.1). Total num frames: 236994560. Throughput: 0: 44175.7. Samples: 237055680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 12:37:16,099][36761] Avg episode reward: [(0, '0.136')] [2024-07-02 12:37:18,430][36999] Updated weights for policy 0, policy_version 14470 (0.0042) [2024-07-02 12:37:20,710][36999] Updated weights for policy 0, policy_version 14480 (0.0036) [2024-07-02 12:37:21,095][36761] Fps is (10 sec: 49152.3, 60 sec: 44509.9, 300 sec: 44153.5). Total num frames: 237240320. Throughput: 0: 44117.8. Samples: 237330280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 12:37:21,096][36761] Avg episode reward: [(0, '0.126')] [2024-07-02 12:37:25,680][36999] Updated weights for policy 0, policy_version 14490 (0.0028) [2024-07-02 12:37:26,095][36761] Fps is (10 sec: 42611.2, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 237420544. Throughput: 0: 44009.3. Samples: 237596860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 12:37:26,095][36761] Avg episode reward: [(0, '0.154')] [2024-07-02 12:37:28,389][36999] Updated weights for policy 0, policy_version 14500 (0.0040) [2024-07-02 12:37:31,095][36761] Fps is (10 sec: 40959.6, 60 sec: 43963.6, 300 sec: 44320.1). Total num frames: 237649920. Throughput: 0: 44238.6. Samples: 237720600. Policy #0 lag: (min: 1.0, avg: 10.2, max: 25.0) [2024-07-02 12:37:31,096][36761] Avg episode reward: [(0, '0.168')] [2024-07-02 12:37:32,923][36999] Updated weights for policy 0, policy_version 14510 (0.0033) [2024-07-02 12:37:35,741][36999] Updated weights for policy 0, policy_version 14520 (0.0047) [2024-07-02 12:37:36,095][36761] Fps is (10 sec: 49151.6, 60 sec: 44509.8, 300 sec: 44153.5). Total num frames: 237912064. Throughput: 0: 44245.8. Samples: 237996140. Policy #0 lag: (min: 1.0, avg: 10.2, max: 25.0) [2024-07-02 12:37:36,096][36761] Avg episode reward: [(0, '0.159')] [2024-07-02 12:37:40,357][36999] Updated weights for policy 0, policy_version 14530 (0.0035) [2024-07-02 12:37:41,095][36761] Fps is (10 sec: 45875.2, 60 sec: 44782.8, 300 sec: 44264.6). Total num frames: 238108672. Throughput: 0: 44167.5. Samples: 238267220. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-07-02 12:37:41,096][36761] Avg episode reward: [(0, '0.144')] [2024-07-02 12:37:43,007][36999] Updated weights for policy 0, policy_version 14540 (0.0028) [2024-07-02 12:37:46,095][36761] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 238305280. Throughput: 0: 44288.9. Samples: 238386120. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-07-02 12:37:46,096][36761] Avg episode reward: [(0, '0.139')] [2024-07-02 12:37:47,953][36999] Updated weights for policy 0, policy_version 14550 (0.0038) [2024-07-02 12:37:50,270][36979] Signal inference workers to stop experience collection... (3450 times) [2024-07-02 12:37:50,270][36979] Signal inference workers to resume experience collection... (3450 times) [2024-07-02 12:37:50,289][36999] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-07-02 12:37:50,289][36999] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-07-02 12:37:50,406][36999] Updated weights for policy 0, policy_version 14560 (0.0024) [2024-07-02 12:37:51,095][36761] Fps is (10 sec: 47513.5, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 238583808. Throughput: 0: 44294.6. Samples: 238659740. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-07-02 12:37:51,096][36761] Avg episode reward: [(0, '0.167')] [2024-07-02 12:37:55,330][36999] Updated weights for policy 0, policy_version 14570 (0.0019) [2024-07-02 12:37:56,095][36761] Fps is (10 sec: 45874.8, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 238764032. Throughput: 0: 44317.3. Samples: 238932360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-07-02 12:37:56,096][36761] Avg episode reward: [(0, '0.166')] [2024-07-02 12:37:57,670][36999] Updated weights for policy 0, policy_version 14580 (0.0034) [2024-07-02 12:38:01,095][36761] Fps is (10 sec: 37683.3, 60 sec: 43690.6, 300 sec: 44264.6). Total num frames: 238960640. Throughput: 0: 44335.3. Samples: 239050640. Policy #0 lag: (min: 0.0, avg: 13.7, max: 21.0) [2024-07-02 12:38:01,096][36761] Avg episode reward: [(0, '0.170')] [2024-07-02 12:38:02,723][36999] Updated weights for policy 0, policy_version 14590 (0.0024) [2024-07-02 12:38:05,091][36999] Updated weights for policy 0, policy_version 14600 (0.0027) [2024-07-02 12:38:06,095][36761] Fps is (10 sec: 49151.8, 60 sec: 44509.8, 300 sec: 44209.0). Total num frames: 239255552. Throughput: 0: 44238.5. Samples: 239321020. Policy #0 lag: (min: 0.0, avg: 13.7, max: 21.0) [2024-07-02 12:38:06,096][36761] Avg episode reward: [(0, '0.166')] [2024-07-02 12:38:10,111][36999] Updated weights for policy 0, policy_version 14610 (0.0027) [2024-07-02 12:38:11,095][36761] Fps is (10 sec: 45875.5, 60 sec: 44509.9, 300 sec: 44209.0). Total num frames: 239419392. Throughput: 0: 44354.6. Samples: 239592820. Policy #0 lag: (min: 0.0, avg: 7.4, max: 22.0) [2024-07-02 12:38:11,096][36761] Avg episode reward: [(0, '0.153')] [2024-07-02 12:38:12,527][36999] Updated weights for policy 0, policy_version 14620 (0.0036) [2024-07-02 12:38:16,095][36761] Fps is (10 sec: 36045.1, 60 sec: 43692.8, 300 sec: 44265.3). Total num frames: 239616000. Throughput: 0: 44308.9. Samples: 239714500. Policy #0 lag: (min: 0.0, avg: 7.4, max: 22.0) [2024-07-02 12:38:16,096][36761] Avg episode reward: [(0, '0.173')] [2024-07-02 12:38:16,104][36979] Saving new best policy, reward=0.173! [2024-07-02 12:38:17,521][36999] Updated weights for policy 0, policy_version 14630 (0.0027) [2024-07-02 12:38:19,877][36999] Updated weights for policy 0, policy_version 14640 (0.0022) [2024-07-02 12:38:21,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 239894528. Throughput: 0: 44046.7. Samples: 239978240. Policy #0 lag: (min: 0.0, avg: 14.0, max: 24.0) [2024-07-02 12:38:21,096][36761] Avg episode reward: [(0, '0.185')] [2024-07-02 12:38:21,173][36979] Saving new best policy, reward=0.185! [2024-07-02 12:38:24,841][36999] Updated weights for policy 0, policy_version 14650 (0.0041) [2024-07-02 12:38:26,095][36761] Fps is (10 sec: 47513.6, 60 sec: 44509.8, 300 sec: 44154.2). Total num frames: 240091136. Throughput: 0: 44155.6. Samples: 240254220. Policy #0 lag: (min: 0.0, avg: 14.0, max: 24.0) [2024-07-02 12:38:26,096][36761] Avg episode reward: [(0, '0.180')] [2024-07-02 12:38:27,322][36999] Updated weights for policy 0, policy_version 14660 (0.0031) [2024-07-02 12:38:31,095][36761] Fps is (10 sec: 37682.8, 60 sec: 43690.7, 300 sec: 44264.6). Total num frames: 240271360. Throughput: 0: 44267.5. Samples: 240378160. Policy #0 lag: (min: 0.0, avg: 14.0, max: 24.0) [2024-07-02 12:38:31,096][36761] Avg episode reward: [(0, '0.159')] [2024-07-02 12:38:32,086][36999] Updated weights for policy 0, policy_version 14670 (0.0032) [2024-07-02 12:38:34,650][36999] Updated weights for policy 0, policy_version 14680 (0.0028) [2024-07-02 12:38:35,691][36979] Signal inference workers to stop experience collection... (3500 times) [2024-07-02 12:38:35,691][36979] Signal inference workers to resume experience collection... (3500 times) [2024-07-02 12:38:35,736][36999] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-07-02 12:38:35,736][36999] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-07-02 12:38:36,095][36761] Fps is (10 sec: 47513.8, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 240566272. Throughput: 0: 44081.4. Samples: 240643400. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-07-02 12:38:36,096][36761] Avg episode reward: [(0, '0.157')] [2024-07-02 12:38:36,120][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000014683_240566272.pth... [2024-07-02 12:38:36,182][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000014032_229900288.pth [2024-07-02 12:38:39,549][36999] Updated weights for policy 0, policy_version 14690 (0.0036) [2024-07-02 12:38:41,095][36761] Fps is (10 sec: 47514.2, 60 sec: 43963.8, 300 sec: 44098.0). Total num frames: 240746496. Throughput: 0: 44226.8. Samples: 240922560. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-07-02 12:38:41,095][36761] Avg episode reward: [(0, '0.171')] [2024-07-02 12:38:41,986][36999] Updated weights for policy 0, policy_version 14700 (0.0045) [2024-07-02 12:38:46,095][36761] Fps is (10 sec: 37683.4, 60 sec: 43963.8, 300 sec: 44320.1). Total num frames: 240943104. Throughput: 0: 44384.1. Samples: 241047920. Policy #0 lag: (min: 0.0, avg: 13.3, max: 21.0) [2024-07-02 12:38:46,096][36761] Avg episode reward: [(0, '0.158')] [2024-07-02 12:38:46,693][36999] Updated weights for policy 0, policy_version 14710 (0.0035) [2024-07-02 12:38:49,276][36999] Updated weights for policy 0, policy_version 14720 (0.0024) [2024-07-02 12:38:51,095][36761] Fps is (10 sec: 47513.7, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 241221632. Throughput: 0: 44255.7. Samples: 241312520. Policy #0 lag: (min: 0.0, avg: 13.3, max: 21.0) [2024-07-02 12:38:51,096][36761] Avg episode reward: [(0, '0.145')] [2024-07-02 12:38:54,297][36999] Updated weights for policy 0, policy_version 14730 (0.0025) [2024-07-02 12:38:56,095][36761] Fps is (10 sec: 49151.8, 60 sec: 44509.9, 300 sec: 44153.5). Total num frames: 241434624. Throughput: 0: 44279.6. Samples: 241585400. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-07-02 12:38:56,096][36761] Avg episode reward: [(0, '0.157')] [2024-07-02 12:38:56,884][36999] Updated weights for policy 0, policy_version 14740 (0.0026) [2024-07-02 12:39:01,095][36761] Fps is (10 sec: 40959.3, 60 sec: 44509.8, 300 sec: 44375.6). Total num frames: 241631232. Throughput: 0: 44572.8. Samples: 241720280. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-07-02 12:39:01,096][36761] Avg episode reward: [(0, '0.171')] [2024-07-02 12:39:01,603][36999] Updated weights for policy 0, policy_version 14750 (0.0037) [2024-07-02 12:39:04,099][36999] Updated weights for policy 0, policy_version 14760 (0.0035) [2024-07-02 12:39:06,095][36761] Fps is (10 sec: 44236.0, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 241876992. Throughput: 0: 44614.9. Samples: 241985920. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-07-02 12:39:06,096][36761] Avg episode reward: [(0, '0.174')] [2024-07-02 12:39:09,126][36999] Updated weights for policy 0, policy_version 14770 (0.0023) [2024-07-02 12:39:11,095][36761] Fps is (10 sec: 49152.2, 60 sec: 45056.0, 300 sec: 44264.5). Total num frames: 242122752. Throughput: 0: 44350.6. Samples: 242250000. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-07-02 12:39:11,097][36761] Avg episode reward: [(0, '0.170')] [2024-07-02 12:39:11,481][36999] Updated weights for policy 0, policy_version 14780 (0.0039) [2024-07-02 12:39:16,095][36761] Fps is (10 sec: 40960.8, 60 sec: 44509.9, 300 sec: 44264.6). Total num frames: 242286592. Throughput: 0: 44721.5. Samples: 242390620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-07-02 12:39:16,095][36761] Avg episode reward: [(0, '0.141')] [2024-07-02 12:39:16,390][36999] Updated weights for policy 0, policy_version 14790 (0.0037) [2024-07-02 12:39:19,051][36999] Updated weights for policy 0, policy_version 14800 (0.0024) [2024-07-02 12:39:21,096][36761] Fps is (10 sec: 40958.0, 60 sec: 43963.3, 300 sec: 44264.5). Total num frames: 242532352. Throughput: 0: 44515.4. Samples: 242646620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-07-02 12:39:21,096][36761] Avg episode reward: [(0, '0.171')] [2024-07-02 12:39:23,700][36999] Updated weights for policy 0, policy_version 14810 (0.0028) [2024-07-02 12:39:26,095][36761] Fps is (10 sec: 47513.4, 60 sec: 44509.9, 300 sec: 44154.2). Total num frames: 242761728. Throughput: 0: 44315.5. Samples: 242916760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 12:39:26,096][36761] Avg episode reward: [(0, '0.184')] [2024-07-02 12:39:26,487][36999] Updated weights for policy 0, policy_version 14820 (0.0033) [2024-07-02 12:39:30,879][36999] Updated weights for policy 0, policy_version 14830 (0.0028) [2024-07-02 12:39:31,095][36761] Fps is (10 sec: 45877.8, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 242991104. Throughput: 0: 44614.2. Samples: 243055560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 12:39:31,096][36761] Avg episode reward: [(0, '0.179')] [2024-07-02 12:39:33,747][36999] Updated weights for policy 0, policy_version 14840 (0.0026) [2024-07-02 12:39:36,100][36761] Fps is (10 sec: 44216.7, 60 sec: 43960.4, 300 sec: 44319.4). Total num frames: 243204096. Throughput: 0: 44588.8. Samples: 243319220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 12:39:36,100][36761] Avg episode reward: [(0, '0.140')] [2024-07-02 12:39:38,172][36999] Updated weights for policy 0, policy_version 14850 (0.0034) [2024-07-02 12:39:41,085][36999] Updated weights for policy 0, policy_version 14860 (0.0037) [2024-07-02 12:39:41,100][36761] Fps is (10 sec: 47491.7, 60 sec: 45325.6, 300 sec: 44319.4). Total num frames: 243466240. Throughput: 0: 44543.0. Samples: 243590040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 12:39:41,101][36761] Avg episode reward: [(0, '0.145')] [2024-07-02 12:39:45,100][36979] Signal inference workers to stop experience collection... (3550 times) [2024-07-02 12:39:45,147][36979] Signal inference workers to resume experience collection... (3550 times) [2024-07-02 12:39:45,148][36999] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-07-02 12:39:45,163][36999] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-07-02 12:39:45,470][36999] Updated weights for policy 0, policy_version 14870 (0.0026) [2024-07-02 12:39:46,095][36761] Fps is (10 sec: 45896.0, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 243662848. Throughput: 0: 44704.5. Samples: 243731980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 12:39:46,096][36761] Avg episode reward: [(0, '0.167')] [2024-07-02 12:39:48,634][36999] Updated weights for policy 0, policy_version 14880 (0.0039) [2024-07-02 12:39:51,095][36761] Fps is (10 sec: 39339.8, 60 sec: 43963.7, 300 sec: 44320.8). Total num frames: 243859456. Throughput: 0: 44474.9. Samples: 243987280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 12:39:51,096][36761] Avg episode reward: [(0, '0.174')] [2024-07-02 12:39:52,826][36999] Updated weights for policy 0, policy_version 14890 (0.0035) [2024-07-02 12:39:56,012][36999] Updated weights for policy 0, policy_version 14900 (0.0026) [2024-07-02 12:39:56,095][36761] Fps is (10 sec: 45874.6, 60 sec: 44782.8, 300 sec: 44264.6). Total num frames: 244121600. Throughput: 0: 44584.4. Samples: 244256300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 12:39:56,096][36761] Avg episode reward: [(0, '0.167')] [2024-07-02 12:40:00,004][36999] Updated weights for policy 0, policy_version 14910 (0.0025) [2024-07-02 12:40:01,095][36761] Fps is (10 sec: 45874.7, 60 sec: 44782.9, 300 sec: 44375.6). Total num frames: 244318208. Throughput: 0: 44595.9. Samples: 244397440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 12:40:01,096][36761] Avg episode reward: [(0, '0.155')] [2024-07-02 12:40:03,273][36999] Updated weights for policy 0, policy_version 14920 (0.0024) [2024-07-02 12:40:06,095][36761] Fps is (10 sec: 40960.3, 60 sec: 44236.9, 300 sec: 44375.7). Total num frames: 244531200. Throughput: 0: 44863.6. Samples: 244665460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 12:40:06,096][36761] Avg episode reward: [(0, '0.155')] [2024-07-02 12:40:07,326][36999] Updated weights for policy 0, policy_version 14930 (0.0029) [2024-07-02 12:40:10,468][36999] Updated weights for policy 0, policy_version 14940 (0.0031) [2024-07-02 12:40:11,095][36761] Fps is (10 sec: 47513.4, 60 sec: 44509.8, 300 sec: 44320.1). Total num frames: 244793344. Throughput: 0: 44611.0. Samples: 244924260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-07-02 12:40:11,096][36761] Avg episode reward: [(0, '0.151')] [2024-07-02 12:40:14,754][36999] Updated weights for policy 0, policy_version 14950 (0.0034) [2024-07-02 12:40:16,096][36761] Fps is (10 sec: 45871.2, 60 sec: 45055.3, 300 sec: 44431.1). Total num frames: 244989952. Throughput: 0: 44757.3. Samples: 245069680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-07-02 12:40:16,097][36761] Avg episode reward: [(0, '0.142')] [2024-07-02 12:40:17,740][36999] Updated weights for policy 0, policy_version 14960 (0.0030) [2024-07-02 12:40:21,095][36761] Fps is (10 sec: 39322.3, 60 sec: 44237.2, 300 sec: 44320.1). Total num frames: 245186560. Throughput: 0: 44774.4. Samples: 245333860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 12:40:21,096][36761] Avg episode reward: [(0, '0.123')] [2024-07-02 12:40:22,082][36999] Updated weights for policy 0, policy_version 14970 (0.0042) [2024-07-02 12:40:25,039][36999] Updated weights for policy 0, policy_version 14980 (0.0044) [2024-07-02 12:40:26,095][36761] Fps is (10 sec: 45879.2, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 245448704. Throughput: 0: 44544.5. Samples: 245594340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 12:40:26,096][36761] Avg episode reward: [(0, '0.148')] [2024-07-02 12:40:29,340][36999] Updated weights for policy 0, policy_version 14990 (0.0032) [2024-07-02 12:40:31,095][36761] Fps is (10 sec: 45875.2, 60 sec: 44236.8, 300 sec: 44375.7). Total num frames: 245645312. Throughput: 0: 44602.7. Samples: 245739100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 12:40:31,096][36761] Avg episode reward: [(0, '0.178')] [2024-07-02 12:40:32,198][36999] Updated weights for policy 0, policy_version 15000 (0.0029) [2024-07-02 12:40:36,095][36761] Fps is (10 sec: 44237.0, 60 sec: 44786.3, 300 sec: 44486.7). Total num frames: 245891072. Throughput: 0: 45009.3. Samples: 246012700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-07-02 12:40:36,096][36761] Avg episode reward: [(0, '0.168')] [2024-07-02 12:40:36,108][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000015008_245891072.pth... [2024-07-02 12:40:36,161][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000014359_235257856.pth [2024-07-02 12:40:36,490][36999] Updated weights for policy 0, policy_version 15010 (0.0027) [2024-07-02 12:40:39,830][36999] Updated weights for policy 0, policy_version 15020 (0.0034) [2024-07-02 12:40:41,095][36761] Fps is (10 sec: 47513.4, 60 sec: 44240.2, 300 sec: 44320.1). Total num frames: 246120448. Throughput: 0: 44855.2. Samples: 246274780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-07-02 12:40:41,096][36761] Avg episode reward: [(0, '0.168')] [2024-07-02 12:40:43,802][36999] Updated weights for policy 0, policy_version 15030 (0.0029) [2024-07-02 12:40:45,871][36979] Signal inference workers to stop experience collection... (3600 times) [2024-07-02 12:40:45,876][36979] Signal inference workers to resume experience collection... (3600 times) [2024-07-02 12:40:45,915][36999] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-07-02 12:40:45,915][36999] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-07-02 12:40:46,095][36761] Fps is (10 sec: 45875.0, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 246349824. Throughput: 0: 44686.3. Samples: 246408320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 12:40:46,098][36761] Avg episode reward: [(0, '0.157')] [2024-07-02 12:40:47,289][36999] Updated weights for policy 0, policy_version 15040 (0.0031) [2024-07-02 12:40:51,023][36999] Updated weights for policy 0, policy_version 15050 (0.0034) [2024-07-02 12:40:51,095][36761] Fps is (10 sec: 45875.7, 60 sec: 45329.1, 300 sec: 44597.8). Total num frames: 246579200. Throughput: 0: 44990.4. Samples: 246690020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 12:40:51,095][36761] Avg episode reward: [(0, '0.157')] [2024-07-02 12:40:54,617][36999] Updated weights for policy 0, policy_version 15060 (0.0041) [2024-07-02 12:40:56,096][36761] Fps is (10 sec: 44236.2, 60 sec: 44509.8, 300 sec: 44320.1). Total num frames: 246792192. Throughput: 0: 44847.5. Samples: 246942400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:40:56,096][36761] Avg episode reward: [(0, '0.157')] [2024-07-02 12:40:58,520][36999] Updated weights for policy 0, policy_version 15070 (0.0038) [2024-07-02 12:41:01,095][36761] Fps is (10 sec: 44236.5, 60 sec: 45056.1, 300 sec: 44542.3). Total num frames: 247021568. Throughput: 0: 44568.5. Samples: 247075220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:41:01,096][36761] Avg episode reward: [(0, '0.150')] [2024-07-02 12:41:01,857][36999] Updated weights for policy 0, policy_version 15080 (0.0027) [2024-07-02 12:41:05,750][36999] Updated weights for policy 0, policy_version 15090 (0.0034) [2024-07-02 12:41:06,095][36761] Fps is (10 sec: 44236.9, 60 sec: 45055.9, 300 sec: 44597.8). Total num frames: 247234560. Throughput: 0: 44814.5. Samples: 247350520. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-07-02 12:41:06,096][36761] Avg episode reward: [(0, '0.166')] [2024-07-02 12:41:09,127][36999] Updated weights for policy 0, policy_version 15100 (0.0036) [2024-07-02 12:41:11,095][36761] Fps is (10 sec: 40959.5, 60 sec: 43963.8, 300 sec: 44320.1). Total num frames: 247431168. Throughput: 0: 44955.1. Samples: 247617320. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-07-02 12:41:11,096][36761] Avg episode reward: [(0, '0.169')] [2024-07-02 12:41:13,189][36999] Updated weights for policy 0, policy_version 15110 (0.0041) [2024-07-02 12:41:16,095][36761] Fps is (10 sec: 45876.0, 60 sec: 45056.7, 300 sec: 44486.7). Total num frames: 247693312. Throughput: 0: 44667.6. Samples: 247749140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 12:41:16,096][36761] Avg episode reward: [(0, '0.167')] [2024-07-02 12:41:16,353][36999] Updated weights for policy 0, policy_version 15120 (0.0044) [2024-07-02 12:41:20,465][36999] Updated weights for policy 0, policy_version 15130 (0.0028) [2024-07-02 12:41:21,095][36761] Fps is (10 sec: 47513.9, 60 sec: 45329.0, 300 sec: 44653.3). Total num frames: 247906304. Throughput: 0: 44719.5. Samples: 248025080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 12:41:21,100][36761] Avg episode reward: [(0, '0.161')] [2024-07-02 12:41:23,667][36999] Updated weights for policy 0, policy_version 15140 (0.0037) [2024-07-02 12:41:26,095][36761] Fps is (10 sec: 40959.9, 60 sec: 44236.8, 300 sec: 44375.6). Total num frames: 248102912. Throughput: 0: 44872.5. Samples: 248294040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 12:41:26,096][36761] Avg episode reward: [(0, '0.151')] [2024-07-02 12:41:27,808][36999] Updated weights for policy 0, policy_version 15150 (0.0028) [2024-07-02 12:41:31,029][36999] Updated weights for policy 0, policy_version 15160 (0.0038) [2024-07-02 12:41:31,096][36761] Fps is (10 sec: 47512.8, 60 sec: 45601.9, 300 sec: 44542.2). Total num frames: 248381440. Throughput: 0: 44794.5. Samples: 248424080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-07-02 12:41:31,096][36761] Avg episode reward: [(0, '0.160')] [2024-07-02 12:41:35,254][36999] Updated weights for policy 0, policy_version 15170 (0.0032) [2024-07-02 12:41:36,095][36761] Fps is (10 sec: 45874.8, 60 sec: 44509.8, 300 sec: 44542.2). Total num frames: 248561664. Throughput: 0: 44441.2. Samples: 248689880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-07-02 12:41:36,096][36761] Avg episode reward: [(0, '0.162')] [2024-07-02 12:41:38,414][36999] Updated weights for policy 0, policy_version 15180 (0.0021) [2024-07-02 12:41:41,095][36761] Fps is (10 sec: 37684.1, 60 sec: 43963.8, 300 sec: 44320.1). Total num frames: 248758272. Throughput: 0: 44724.7. Samples: 248955000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-07-02 12:41:41,096][36761] Avg episode reward: [(0, '0.152')] [2024-07-02 12:41:42,705][36999] Updated weights for policy 0, policy_version 15190 (0.0033) [2024-07-02 12:41:46,095][36761] Fps is (10 sec: 45875.7, 60 sec: 44509.9, 300 sec: 44375.7). Total num frames: 249020416. Throughput: 0: 44481.3. Samples: 249076880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-07-02 12:41:46,096][36761] Avg episode reward: [(0, '0.146')] [2024-07-02 12:41:46,108][36999] Updated weights for policy 0, policy_version 15200 (0.0021) [2024-07-02 12:41:50,195][36999] Updated weights for policy 0, policy_version 15210 (0.0022) [2024-07-02 12:41:51,096][36761] Fps is (10 sec: 49146.2, 60 sec: 44508.9, 300 sec: 44653.2). Total num frames: 249249792. Throughput: 0: 44308.8. Samples: 249344460. Policy #0 lag: (min: 0.0, avg: 7.5, max: 22.0) [2024-07-02 12:41:51,097][36761] Avg episode reward: [(0, '0.154')] [2024-07-02 12:41:53,522][36999] Updated weights for policy 0, policy_version 15220 (0.0030) [2024-07-02 12:41:56,095][36761] Fps is (10 sec: 39321.6, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 249413632. Throughput: 0: 44538.8. Samples: 249621560. Policy #0 lag: (min: 0.0, avg: 7.5, max: 22.0) [2024-07-02 12:41:56,096][36761] Avg episode reward: [(0, '0.174')] [2024-07-02 12:41:56,739][36979] Signal inference workers to stop experience collection... (3650 times) [2024-07-02 12:41:56,742][36979] Signal inference workers to resume experience collection... (3650 times) [2024-07-02 12:41:56,781][36999] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-07-02 12:41:56,781][36999] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-07-02 12:41:57,601][36999] Updated weights for policy 0, policy_version 15230 (0.0028) [2024-07-02 12:42:00,791][36999] Updated weights for policy 0, policy_version 15240 (0.0025) [2024-07-02 12:42:01,100][36761] Fps is (10 sec: 44221.7, 60 sec: 44506.5, 300 sec: 44430.5). Total num frames: 249692160. Throughput: 0: 44251.0. Samples: 249740640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-07-02 12:42:01,109][36761] Avg episode reward: [(0, '0.189')] [2024-07-02 12:42:01,110][36979] Saving new best policy, reward=0.189! [2024-07-02 12:42:05,090][36999] Updated weights for policy 0, policy_version 15250 (0.0037) [2024-07-02 12:42:06,095][36761] Fps is (10 sec: 49151.1, 60 sec: 44509.9, 300 sec: 44597.8). Total num frames: 249905152. Throughput: 0: 44281.2. Samples: 250017740. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-07-02 12:42:06,096][36761] Avg episode reward: [(0, '0.189')] [2024-07-02 12:42:08,091][36999] Updated weights for policy 0, policy_version 15260 (0.0026) [2024-07-02 12:42:11,095][36761] Fps is (10 sec: 42618.1, 60 sec: 44783.1, 300 sec: 44487.2). Total num frames: 250118144. Throughput: 0: 44291.2. Samples: 250287140. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-07-02 12:42:11,096][36761] Avg episode reward: [(0, '0.179')] [2024-07-02 12:42:12,468][36999] Updated weights for policy 0, policy_version 15270 (0.0039) [2024-07-02 12:42:15,486][36999] Updated weights for policy 0, policy_version 15280 (0.0029) [2024-07-02 12:42:16,095][36761] Fps is (10 sec: 45876.0, 60 sec: 44509.8, 300 sec: 44486.7). Total num frames: 250363904. Throughput: 0: 44230.4. Samples: 250414440. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-07-02 12:42:16,096][36761] Avg episode reward: [(0, '0.179')] [2024-07-02 12:42:19,684][36999] Updated weights for policy 0, policy_version 15290 (0.0042) [2024-07-02 12:42:21,095][36761] Fps is (10 sec: 44236.6, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 250560512. Throughput: 0: 44321.9. Samples: 250684360. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-07-02 12:42:21,096][36761] Avg episode reward: [(0, '0.183')] [2024-07-02 12:42:22,702][36999] Updated weights for policy 0, policy_version 15300 (0.0029) [2024-07-02 12:42:26,095][36761] Fps is (10 sec: 44236.9, 60 sec: 45056.0, 300 sec: 44597.8). Total num frames: 250806272. Throughput: 0: 44369.3. Samples: 250951620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 12:42:26,096][36761] Avg episode reward: [(0, '0.183')] [2024-07-02 12:42:27,053][36999] Updated weights for policy 0, policy_version 15310 (0.0037) [2024-07-02 12:42:29,988][36999] Updated weights for policy 0, policy_version 15320 (0.0038) [2024-07-02 12:42:31,095][36761] Fps is (10 sec: 45875.2, 60 sec: 43963.9, 300 sec: 44431.2). Total num frames: 251019264. Throughput: 0: 44531.6. Samples: 251080800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 12:42:31,096][36761] Avg episode reward: [(0, '0.184')] [2024-07-02 12:42:34,385][36999] Updated weights for policy 0, policy_version 15330 (0.0028) [2024-07-02 12:42:36,095][36761] Fps is (10 sec: 44236.1, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 251248640. Throughput: 0: 44661.9. Samples: 251354200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-07-02 12:42:36,096][36761] Avg episode reward: [(0, '0.188')] [2024-07-02 12:42:36,116][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000015335_251248640.pth... [2024-07-02 12:42:36,165][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000014683_240566272.pth [2024-07-02 12:42:37,306][36999] Updated weights for policy 0, policy_version 15340 (0.0035) [2024-07-02 12:42:41,095][36761] Fps is (10 sec: 44236.6, 60 sec: 45056.0, 300 sec: 44597.8). Total num frames: 251461632. Throughput: 0: 44453.7. Samples: 251621980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-07-02 12:42:41,099][36761] Avg episode reward: [(0, '0.176')] [2024-07-02 12:42:41,942][36999] Updated weights for policy 0, policy_version 15350 (0.0020) [2024-07-02 12:42:44,691][36999] Updated weights for policy 0, policy_version 15360 (0.0030) [2024-07-02 12:42:46,095][36761] Fps is (10 sec: 42598.3, 60 sec: 44236.7, 300 sec: 44375.6). Total num frames: 251674624. Throughput: 0: 44758.1. Samples: 251754560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-07-02 12:42:46,096][36761] Avg episode reward: [(0, '0.188')] [2024-07-02 12:42:49,245][36999] Updated weights for policy 0, policy_version 15370 (0.0026) [2024-07-02 12:42:51,095][36761] Fps is (10 sec: 44236.4, 60 sec: 44237.6, 300 sec: 44542.3). Total num frames: 251904000. Throughput: 0: 44560.9. Samples: 252022980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-07-02 12:42:51,096][36761] Avg episode reward: [(0, '0.177')] [2024-07-02 12:42:52,203][36999] Updated weights for policy 0, policy_version 15380 (0.0039) [2024-07-02 12:42:56,096][36761] Fps is (10 sec: 44234.9, 60 sec: 45055.5, 300 sec: 44597.7). Total num frames: 252116992. Throughput: 0: 44414.9. Samples: 252285840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 12:42:56,096][36761] Avg episode reward: [(0, '0.163')] [2024-07-02 12:42:56,807][36999] Updated weights for policy 0, policy_version 15390 (0.0028) [2024-07-02 12:42:59,662][36999] Updated weights for policy 0, policy_version 15400 (0.0032) [2024-07-02 12:43:01,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43967.1, 300 sec: 44320.1). Total num frames: 252329984. Throughput: 0: 44483.5. Samples: 252416200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 12:43:01,096][36761] Avg episode reward: [(0, '0.192')] [2024-07-02 12:43:01,253][36979] Saving new best policy, reward=0.192! [2024-07-02 12:43:01,886][36979] Signal inference workers to stop experience collection... (3700 times) [2024-07-02 12:43:01,886][36979] Signal inference workers to resume experience collection... (3700 times) [2024-07-02 12:43:01,900][36999] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-07-02 12:43:01,900][36999] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-07-02 12:43:04,152][36999] Updated weights for policy 0, policy_version 15410 (0.0024) [2024-07-02 12:43:06,095][36761] Fps is (10 sec: 45877.8, 60 sec: 44510.0, 300 sec: 44597.8). Total num frames: 252575744. Throughput: 0: 44505.7. Samples: 252687120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 12:43:06,096][36761] Avg episode reward: [(0, '0.192')] [2024-07-02 12:43:07,336][36999] Updated weights for policy 0, policy_version 15420 (0.0035) [2024-07-02 12:43:11,095][36761] Fps is (10 sec: 45875.2, 60 sec: 44509.8, 300 sec: 44653.3). Total num frames: 252788736. Throughput: 0: 44432.4. Samples: 252951080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-07-02 12:43:11,096][36761] Avg episode reward: [(0, '0.172')] [2024-07-02 12:43:11,497][36999] Updated weights for policy 0, policy_version 15430 (0.0031) [2024-07-02 12:43:14,772][36999] Updated weights for policy 0, policy_version 15440 (0.0027) [2024-07-02 12:43:16,095][36761] Fps is (10 sec: 44236.9, 60 sec: 44236.8, 300 sec: 44486.7). Total num frames: 253018112. Throughput: 0: 44361.7. Samples: 253077080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-07-02 12:43:16,096][36761] Avg episode reward: [(0, '0.169')] [2024-07-02 12:43:19,143][36999] Updated weights for policy 0, policy_version 15450 (0.0027) [2024-07-02 12:43:21,095][36761] Fps is (10 sec: 47513.7, 60 sec: 45056.0, 300 sec: 44653.3). Total num frames: 253263872. Throughput: 0: 44470.8. Samples: 253355380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:43:21,096][36761] Avg episode reward: [(0, '0.163')] [2024-07-02 12:43:22,126][36999] Updated weights for policy 0, policy_version 15460 (0.0030) [2024-07-02 12:43:26,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43963.7, 300 sec: 44653.3). Total num frames: 253444096. Throughput: 0: 44400.8. Samples: 253620020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:43:26,096][36761] Avg episode reward: [(0, '0.176')] [2024-07-02 12:43:26,373][36999] Updated weights for policy 0, policy_version 15470 (0.0033) [2024-07-02 12:43:29,363][36999] Updated weights for policy 0, policy_version 15480 (0.0030) [2024-07-02 12:43:31,095][36761] Fps is (10 sec: 40960.4, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 253673472. Throughput: 0: 44329.1. Samples: 253749360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:43:31,095][36761] Avg episode reward: [(0, '0.189')] [2024-07-02 12:43:33,575][36999] Updated weights for policy 0, policy_version 15490 (0.0036) [2024-07-02 12:43:36,095][36761] Fps is (10 sec: 49152.4, 60 sec: 44783.0, 300 sec: 44708.9). Total num frames: 253935616. Throughput: 0: 44371.7. Samples: 254019700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:43:36,096][36761] Avg episode reward: [(0, '0.168')] [2024-07-02 12:43:36,608][36999] Updated weights for policy 0, policy_version 15500 (0.0038) [2024-07-02 12:43:40,734][36999] Updated weights for policy 0, policy_version 15510 (0.0031) [2024-07-02 12:43:41,095][36761] Fps is (10 sec: 45874.5, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 254132224. Throughput: 0: 44689.4. Samples: 254296840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 12:43:41,098][36761] Avg episode reward: [(0, '0.181')] [2024-07-02 12:43:43,813][36999] Updated weights for policy 0, policy_version 15520 (0.0028) [2024-07-02 12:43:46,095][36761] Fps is (10 sec: 40959.8, 60 sec: 44510.0, 300 sec: 44486.7). Total num frames: 254345216. Throughput: 0: 44664.4. Samples: 254426100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 12:43:46,096][36761] Avg episode reward: [(0, '0.165')] [2024-07-02 12:43:48,006][36999] Updated weights for policy 0, policy_version 15530 (0.0026) [2024-07-02 12:43:51,096][36761] Fps is (10 sec: 45874.5, 60 sec: 44782.9, 300 sec: 44597.8). Total num frames: 254590976. Throughput: 0: 44595.8. Samples: 254693940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 12:43:51,096][36761] Avg episode reward: [(0, '0.165')] [2024-07-02 12:43:51,144][36999] Updated weights for policy 0, policy_version 15540 (0.0021) [2024-07-02 12:43:55,383][36999] Updated weights for policy 0, policy_version 15550 (0.0035) [2024-07-02 12:43:56,095][36761] Fps is (10 sec: 47513.6, 60 sec: 45056.4, 300 sec: 44708.9). Total num frames: 254820352. Throughput: 0: 44752.4. Samples: 254964940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-07-02 12:43:56,096][36761] Avg episode reward: [(0, '0.161')] [2024-07-02 12:43:58,440][36999] Updated weights for policy 0, policy_version 15560 (0.0027) [2024-07-02 12:44:01,095][36761] Fps is (10 sec: 44238.0, 60 sec: 45056.1, 300 sec: 44597.8). Total num frames: 255033344. Throughput: 0: 44804.1. Samples: 255093260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-07-02 12:44:01,096][36761] Avg episode reward: [(0, '0.190')] [2024-07-02 12:44:02,598][36999] Updated weights for policy 0, policy_version 15570 (0.0026) [2024-07-02 12:44:05,708][36999] Updated weights for policy 0, policy_version 15580 (0.0028) [2024-07-02 12:44:06,098][36761] Fps is (10 sec: 44227.5, 60 sec: 44781.4, 300 sec: 44542.0). Total num frames: 255262720. Throughput: 0: 44657.4. Samples: 255365060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 12:44:06,098][36761] Avg episode reward: [(0, '0.212')] [2024-07-02 12:44:06,103][36979] Saving new best policy, reward=0.212! [2024-07-02 12:44:09,931][36999] Updated weights for policy 0, policy_version 15590 (0.0034) [2024-07-02 12:44:11,095][36761] Fps is (10 sec: 45874.4, 60 sec: 45055.9, 300 sec: 44764.4). Total num frames: 255492096. Throughput: 0: 44768.0. Samples: 255634580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 12:44:11,096][36761] Avg episode reward: [(0, '0.188')] [2024-07-02 12:44:12,910][36999] Updated weights for policy 0, policy_version 15600 (0.0034) [2024-07-02 12:44:16,095][36761] Fps is (10 sec: 44246.1, 60 sec: 44782.9, 300 sec: 44653.4). Total num frames: 255705088. Throughput: 0: 44949.2. Samples: 255772080. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-07-02 12:44:16,096][36761] Avg episode reward: [(0, '0.186')] [2024-07-02 12:44:17,149][36999] Updated weights for policy 0, policy_version 15610 (0.0033) [2024-07-02 12:44:20,703][36999] Updated weights for policy 0, policy_version 15620 (0.0031) [2024-07-02 12:44:21,095][36761] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 44597.8). Total num frames: 255918080. Throughput: 0: 44769.8. Samples: 256034340. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-07-02 12:44:21,096][36761] Avg episode reward: [(0, '0.204')] [2024-07-02 12:44:24,421][36999] Updated weights for policy 0, policy_version 15630 (0.0034) [2024-07-02 12:44:25,400][36979] Signal inference workers to stop experience collection... (3750 times) [2024-07-02 12:44:25,401][36979] Signal inference workers to resume experience collection... (3750 times) [2024-07-02 12:44:25,448][36999] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-07-02 12:44:25,448][36999] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-07-02 12:44:26,100][36761] Fps is (10 sec: 45854.3, 60 sec: 45325.7, 300 sec: 44652.6). Total num frames: 256163840. Throughput: 0: 44631.5. Samples: 256305460. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-07-02 12:44:26,100][36761] Avg episode reward: [(0, '0.208')] [2024-07-02 12:44:27,982][36999] Updated weights for policy 0, policy_version 15640 (0.0031) [2024-07-02 12:44:31,095][36761] Fps is (10 sec: 45875.2, 60 sec: 45055.9, 300 sec: 44654.0). Total num frames: 256376832. Throughput: 0: 44909.4. Samples: 256447020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 12:44:31,096][36761] Avg episode reward: [(0, '0.215')] [2024-07-02 12:44:31,096][36979] Saving new best policy, reward=0.215! [2024-07-02 12:44:31,714][36999] Updated weights for policy 0, policy_version 15650 (0.0039) [2024-07-02 12:44:35,268][36999] Updated weights for policy 0, policy_version 15660 (0.0043) [2024-07-02 12:44:36,095][36761] Fps is (10 sec: 40978.7, 60 sec: 43963.7, 300 sec: 44431.9). Total num frames: 256573440. Throughput: 0: 44638.8. Samples: 256702680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 12:44:36,096][36761] Avg episode reward: [(0, '0.194')] [2024-07-02 12:44:36,109][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000015660_256573440.pth... [2024-07-02 12:44:36,169][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000015008_245891072.pth [2024-07-02 12:44:39,036][36999] Updated weights for policy 0, policy_version 15670 (0.0033) [2024-07-02 12:44:41,100][36761] Fps is (10 sec: 45854.4, 60 sec: 45052.6, 300 sec: 44652.7). Total num frames: 256835584. Throughput: 0: 44684.4. Samples: 256975940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 12:44:41,100][36761] Avg episode reward: [(0, '0.207')] [2024-07-02 12:44:42,515][36999] Updated weights for policy 0, policy_version 15680 (0.0021) [2024-07-02 12:44:46,100][36761] Fps is (10 sec: 45854.5, 60 sec: 44779.6, 300 sec: 44652.6). Total num frames: 257032192. Throughput: 0: 44970.5. Samples: 257117140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 12:44:46,100][36761] Avg episode reward: [(0, '0.215')] [2024-07-02 12:44:46,548][36999] Updated weights for policy 0, policy_version 15690 (0.0044) [2024-07-02 12:44:50,040][36999] Updated weights for policy 0, policy_version 15700 (0.0029) [2024-07-02 12:44:51,095][36761] Fps is (10 sec: 40978.9, 60 sec: 44237.0, 300 sec: 44486.8). Total num frames: 257245184. Throughput: 0: 44767.5. Samples: 257379500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:44:51,095][36761] Avg episode reward: [(0, '0.209')] [2024-07-02 12:44:53,890][36999] Updated weights for policy 0, policy_version 15710 (0.0021) [2024-07-02 12:44:56,095][36761] Fps is (10 sec: 47534.9, 60 sec: 44782.9, 300 sec: 44708.9). Total num frames: 257507328. Throughput: 0: 44658.7. Samples: 257644220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:44:56,096][36761] Avg episode reward: [(0, '0.183')] [2024-07-02 12:44:57,357][36999] Updated weights for policy 0, policy_version 15720 (0.0054) [2024-07-02 12:45:01,095][36761] Fps is (10 sec: 45874.2, 60 sec: 44509.7, 300 sec: 44653.3). Total num frames: 257703936. Throughput: 0: 44679.4. Samples: 257782660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-07-02 12:45:01,096][36761] Avg episode reward: [(0, '0.207')] [2024-07-02 12:45:01,253][36999] Updated weights for policy 0, policy_version 15730 (0.0045) [2024-07-02 12:45:04,812][36999] Updated weights for policy 0, policy_version 15740 (0.0031) [2024-07-02 12:45:06,095][36761] Fps is (10 sec: 42598.7, 60 sec: 44511.4, 300 sec: 44542.3). Total num frames: 257933312. Throughput: 0: 44748.9. Samples: 258048040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-07-02 12:45:06,096][36761] Avg episode reward: [(0, '0.221')] [2024-07-02 12:45:06,109][36979] Saving new best policy, reward=0.221! [2024-07-02 12:45:08,489][36999] Updated weights for policy 0, policy_version 15750 (0.0031) [2024-07-02 12:45:11,095][36761] Fps is (10 sec: 47514.1, 60 sec: 44783.0, 300 sec: 44709.0). Total num frames: 258179072. Throughput: 0: 44642.3. Samples: 258314160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-07-02 12:45:11,096][36761] Avg episode reward: [(0, '0.203')] [2024-07-02 12:45:12,240][36999] Updated weights for policy 0, policy_version 15760 (0.0031) [2024-07-02 12:45:15,849][36999] Updated weights for policy 0, policy_version 15770 (0.0036) [2024-07-02 12:45:16,095][36761] Fps is (10 sec: 44236.4, 60 sec: 44509.8, 300 sec: 44708.9). Total num frames: 258375680. Throughput: 0: 44405.2. Samples: 258445260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 12:45:16,096][36761] Avg episode reward: [(0, '0.185')] [2024-07-02 12:45:19,545][36999] Updated weights for policy 0, policy_version 15780 (0.0032) [2024-07-02 12:45:21,095][36761] Fps is (10 sec: 44236.3, 60 sec: 45055.9, 300 sec: 44653.3). Total num frames: 258621440. Throughput: 0: 44813.7. Samples: 258719300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 12:45:21,096][36761] Avg episode reward: [(0, '0.186')] [2024-07-02 12:45:23,445][36999] Updated weights for policy 0, policy_version 15790 (0.0033) [2024-07-02 12:45:26,095][36761] Fps is (10 sec: 45875.8, 60 sec: 44513.3, 300 sec: 44708.9). Total num frames: 258834432. Throughput: 0: 44557.9. Samples: 258980840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 12:45:26,096][36761] Avg episode reward: [(0, '0.187')] [2024-07-02 12:45:26,906][36999] Updated weights for policy 0, policy_version 15800 (0.0026) [2024-07-02 12:45:30,820][36999] Updated weights for policy 0, policy_version 15810 (0.0028) [2024-07-02 12:45:31,095][36761] Fps is (10 sec: 40960.6, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 259031040. Throughput: 0: 44392.5. Samples: 259114600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 12:45:31,096][36761] Avg episode reward: [(0, '0.177')] [2024-07-02 12:45:34,214][36999] Updated weights for policy 0, policy_version 15820 (0.0037) [2024-07-02 12:45:36,095][36761] Fps is (10 sec: 44236.4, 60 sec: 45056.0, 300 sec: 44597.8). Total num frames: 259276800. Throughput: 0: 44665.2. Samples: 259389440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-07-02 12:45:36,096][36761] Avg episode reward: [(0, '0.177')] [2024-07-02 12:45:38,060][36999] Updated weights for policy 0, policy_version 15830 (0.0022) [2024-07-02 12:45:41,095][36761] Fps is (10 sec: 45875.4, 60 sec: 44240.2, 300 sec: 44542.3). Total num frames: 259489792. Throughput: 0: 44667.2. Samples: 259654240. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-07-02 12:45:41,096][36761] Avg episode reward: [(0, '0.185')] [2024-07-02 12:45:41,634][36999] Updated weights for policy 0, policy_version 15840 (0.0024) [2024-07-02 12:45:44,956][36979] Signal inference workers to stop experience collection... (3800 times) [2024-07-02 12:45:44,956][36979] Signal inference workers to resume experience collection... (3800 times) [2024-07-02 12:45:44,967][36999] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-07-02 12:45:44,968][36999] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-07-02 12:45:45,273][36999] Updated weights for policy 0, policy_version 15850 (0.0027) [2024-07-02 12:45:46,100][36761] Fps is (10 sec: 45854.3, 60 sec: 45056.0, 300 sec: 44597.1). Total num frames: 259735552. Throughput: 0: 44651.6. Samples: 259792180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-07-02 12:45:46,100][36761] Avg episode reward: [(0, '0.198')] [2024-07-02 12:45:48,812][36999] Updated weights for policy 0, policy_version 15860 (0.0043) [2024-07-02 12:45:51,100][36761] Fps is (10 sec: 45854.0, 60 sec: 45052.5, 300 sec: 44597.1). Total num frames: 259948544. Throughput: 0: 44726.1. Samples: 260060920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-07-02 12:45:51,100][36761] Avg episode reward: [(0, '0.256')] [2024-07-02 12:45:51,101][36979] Saving new best policy, reward=0.256! [2024-07-02 12:45:52,524][36999] Updated weights for policy 0, policy_version 15870 (0.0027) [2024-07-02 12:45:56,095][36761] Fps is (10 sec: 42618.1, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 260161536. Throughput: 0: 44868.1. Samples: 260333220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-07-02 12:45:56,096][36761] Avg episode reward: [(0, '0.210')] [2024-07-02 12:45:56,237][36999] Updated weights for policy 0, policy_version 15880 (0.0027) [2024-07-02 12:45:59,750][36999] Updated weights for policy 0, policy_version 15890 (0.0029) [2024-07-02 12:46:01,095][36761] Fps is (10 sec: 44256.9, 60 sec: 44783.0, 300 sec: 44597.8). Total num frames: 260390912. Throughput: 0: 44947.6. Samples: 260467900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:46:01,096][36761] Avg episode reward: [(0, '0.170')] [2024-07-02 12:46:03,514][36999] Updated weights for policy 0, policy_version 15900 (0.0037) [2024-07-02 12:46:06,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44509.8, 300 sec: 44653.3). Total num frames: 260603904. Throughput: 0: 44609.9. Samples: 260726740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:46:06,096][36761] Avg episode reward: [(0, '0.176')] [2024-07-02 12:46:07,103][36999] Updated weights for policy 0, policy_version 15910 (0.0035) [2024-07-02 12:46:10,781][36999] Updated weights for policy 0, policy_version 15920 (0.0032) [2024-07-02 12:46:11,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 260833280. Throughput: 0: 44804.8. Samples: 260997060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:46:11,096][36761] Avg episode reward: [(0, '0.188')] [2024-07-02 12:46:14,383][36999] Updated weights for policy 0, policy_version 15930 (0.0022) [2024-07-02 12:46:16,095][36761] Fps is (10 sec: 45875.4, 60 sec: 44783.0, 300 sec: 44597.8). Total num frames: 261062656. Throughput: 0: 44882.2. Samples: 261134300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:46:16,100][36761] Avg episode reward: [(0, '0.210')] [2024-07-02 12:46:18,350][36999] Updated weights for policy 0, policy_version 15940 (0.0034) [2024-07-02 12:46:21,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43963.8, 300 sec: 44597.8). Total num frames: 261259264. Throughput: 0: 44528.9. Samples: 261393240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:46:21,096][36761] Avg episode reward: [(0, '0.269')] [2024-07-02 12:46:21,138][36979] Saving new best policy, reward=0.269! [2024-07-02 12:46:21,832][36999] Updated weights for policy 0, policy_version 15950 (0.0038) [2024-07-02 12:46:25,976][36999] Updated weights for policy 0, policy_version 15960 (0.0034) [2024-07-02 12:46:26,095][36761] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 261488640. Throughput: 0: 44720.4. Samples: 261666660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 12:46:26,096][36761] Avg episode reward: [(0, '0.237')] [2024-07-02 12:46:29,183][36999] Updated weights for policy 0, policy_version 15970 (0.0040) [2024-07-02 12:46:31,095][36761] Fps is (10 sec: 45875.4, 60 sec: 44783.0, 300 sec: 44597.8). Total num frames: 261718016. Throughput: 0: 44470.8. Samples: 261793160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 12:46:31,096][36761] Avg episode reward: [(0, '0.237')] [2024-07-02 12:46:33,510][36999] Updated weights for policy 0, policy_version 15980 (0.0033) [2024-07-02 12:46:36,095][36761] Fps is (10 sec: 45875.2, 60 sec: 44509.9, 300 sec: 44708.9). Total num frames: 261947392. Throughput: 0: 44392.5. Samples: 262058380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 12:46:36,096][36761] Avg episode reward: [(0, '0.234')] [2024-07-02 12:46:36,159][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000015989_261963776.pth... [2024-07-02 12:46:36,214][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000015335_251248640.pth [2024-07-02 12:46:36,490][36999] Updated weights for policy 0, policy_version 15990 (0.0027) [2024-07-02 12:46:40,805][36999] Updated weights for policy 0, policy_version 16000 (0.0024) [2024-07-02 12:46:41,096][36761] Fps is (10 sec: 44235.9, 60 sec: 44509.7, 300 sec: 44542.2). Total num frames: 262160384. Throughput: 0: 44387.8. Samples: 262330680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 12:46:41,096][36761] Avg episode reward: [(0, '0.252')] [2024-07-02 12:46:43,761][36999] Updated weights for policy 0, policy_version 16010 (0.0040) [2024-07-02 12:46:46,095][36761] Fps is (10 sec: 44236.9, 60 sec: 44240.2, 300 sec: 44542.4). Total num frames: 262389760. Throughput: 0: 44214.7. Samples: 262457560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:46:46,096][36761] Avg episode reward: [(0, '0.248')] [2024-07-02 12:46:47,972][36999] Updated weights for policy 0, policy_version 16020 (0.0038) [2024-07-02 12:46:51,085][36999] Updated weights for policy 0, policy_version 16030 (0.0033) [2024-07-02 12:46:51,095][36761] Fps is (10 sec: 47514.2, 60 sec: 44786.3, 300 sec: 44819.9). Total num frames: 262635520. Throughput: 0: 44492.9. Samples: 262728920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:46:51,096][36761] Avg episode reward: [(0, '0.248')] [2024-07-02 12:46:55,245][36999] Updated weights for policy 0, policy_version 16040 (0.0030) [2024-07-02 12:46:56,095][36761] Fps is (10 sec: 44236.0, 60 sec: 44509.8, 300 sec: 44542.9). Total num frames: 262832128. Throughput: 0: 44499.9. Samples: 262999560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:46:56,096][36761] Avg episode reward: [(0, '0.257')] [2024-07-02 12:46:58,387][36999] Updated weights for policy 0, policy_version 16050 (0.0037) [2024-07-02 12:47:01,095][36761] Fps is (10 sec: 40960.0, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 263045120. Throughput: 0: 44340.8. Samples: 263129640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 12:47:01,096][36761] Avg episode reward: [(0, '0.234')] [2024-07-02 12:47:02,503][36999] Updated weights for policy 0, policy_version 16060 (0.0035) [2024-07-02 12:47:05,759][36999] Updated weights for policy 0, policy_version 16070 (0.0026) [2024-07-02 12:47:06,095][36761] Fps is (10 sec: 45875.7, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 263290880. Throughput: 0: 44658.2. Samples: 263402860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 12:47:06,096][36761] Avg episode reward: [(0, '0.212')] [2024-07-02 12:47:09,890][36999] Updated weights for policy 0, policy_version 16080 (0.0033) [2024-07-02 12:47:11,095][36761] Fps is (10 sec: 45875.9, 60 sec: 44510.0, 300 sec: 44542.3). Total num frames: 263503872. Throughput: 0: 44453.0. Samples: 263667040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 12:47:11,095][36761] Avg episode reward: [(0, '0.212')] [2024-07-02 12:47:13,279][36999] Updated weights for policy 0, policy_version 16090 (0.0040) [2024-07-02 12:47:16,095][36761] Fps is (10 sec: 42598.9, 60 sec: 44236.9, 300 sec: 44597.8). Total num frames: 263716864. Throughput: 0: 44532.5. Samples: 263797120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 12:47:16,095][36761] Avg episode reward: [(0, '0.237')] [2024-07-02 12:47:17,162][36999] Updated weights for policy 0, policy_version 16100 (0.0042) [2024-07-02 12:47:20,644][36999] Updated weights for policy 0, policy_version 16110 (0.0035) [2024-07-02 12:47:21,095][36761] Fps is (10 sec: 45874.8, 60 sec: 45056.0, 300 sec: 44597.8). Total num frames: 263962624. Throughput: 0: 44658.6. Samples: 264068020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 12:47:21,096][36761] Avg episode reward: [(0, '0.235')] [2024-07-02 12:47:24,691][36999] Updated weights for policy 0, policy_version 16120 (0.0044) [2024-07-02 12:47:25,546][36979] Signal inference workers to stop experience collection... (3850 times) [2024-07-02 12:47:25,546][36979] Signal inference workers to resume experience collection... (3850 times) [2024-07-02 12:47:25,567][36999] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-07-02 12:47:25,568][36999] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-07-02 12:47:26,097][36761] Fps is (10 sec: 44230.4, 60 sec: 44508.8, 300 sec: 44542.1). Total num frames: 264159232. Throughput: 0: 44351.7. Samples: 264326560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 12:47:26,097][36761] Avg episode reward: [(0, '0.208')] [2024-07-02 12:47:27,931][36999] Updated weights for policy 0, policy_version 16130 (0.0026) [2024-07-02 12:47:31,095][36761] Fps is (10 sec: 42598.3, 60 sec: 44509.8, 300 sec: 44542.3). Total num frames: 264388608. Throughput: 0: 44459.9. Samples: 264458260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 12:47:31,096][36761] Avg episode reward: [(0, '0.216')] [2024-07-02 12:47:31,862][36999] Updated weights for policy 0, policy_version 16140 (0.0041) [2024-07-02 12:47:35,262][36999] Updated weights for policy 0, policy_version 16150 (0.0029) [2024-07-02 12:47:36,095][36761] Fps is (10 sec: 45881.2, 60 sec: 44509.8, 300 sec: 44597.8). Total num frames: 264617984. Throughput: 0: 44481.3. Samples: 264730580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-07-02 12:47:36,098][36761] Avg episode reward: [(0, '0.247')] [2024-07-02 12:47:39,231][36999] Updated weights for policy 0, policy_version 16160 (0.0025) [2024-07-02 12:47:41,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44510.0, 300 sec: 44597.8). Total num frames: 264830976. Throughput: 0: 44314.4. Samples: 264993700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-07-02 12:47:41,096][36761] Avg episode reward: [(0, '0.249')] [2024-07-02 12:47:43,118][36999] Updated weights for policy 0, policy_version 16170 (0.0031) [2024-07-02 12:47:46,095][36761] Fps is (10 sec: 42598.2, 60 sec: 44236.7, 300 sec: 44542.3). Total num frames: 265043968. Throughput: 0: 44288.8. Samples: 265122640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-07-02 12:47:46,096][36761] Avg episode reward: [(0, '0.262')] [2024-07-02 12:47:46,819][36999] Updated weights for policy 0, policy_version 16180 (0.0024) [2024-07-02 12:47:50,387][36999] Updated weights for policy 0, policy_version 16190 (0.0032) [2024-07-02 12:47:51,095][36761] Fps is (10 sec: 45875.1, 60 sec: 44236.9, 300 sec: 44653.4). Total num frames: 265289728. Throughput: 0: 44231.2. Samples: 265393260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-07-02 12:47:51,096][36761] Avg episode reward: [(0, '0.199')] [2024-07-02 12:47:54,218][36999] Updated weights for policy 0, policy_version 16200 (0.0019) [2024-07-02 12:47:56,095][36761] Fps is (10 sec: 45875.6, 60 sec: 44510.0, 300 sec: 44653.3). Total num frames: 265502720. Throughput: 0: 44317.7. Samples: 265661340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:47:56,096][36761] Avg episode reward: [(0, '0.202')] [2024-07-02 12:47:57,737][36999] Updated weights for policy 0, policy_version 16210 (0.0032) [2024-07-02 12:48:01,095][36761] Fps is (10 sec: 42598.0, 60 sec: 44509.9, 300 sec: 44542.3). Total num frames: 265715712. Throughput: 0: 44318.1. Samples: 265791440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:48:01,096][36761] Avg episode reward: [(0, '0.210')] [2024-07-02 12:48:01,494][36999] Updated weights for policy 0, policy_version 16220 (0.0036) [2024-07-02 12:48:05,068][36999] Updated weights for policy 0, policy_version 16230 (0.0031) [2024-07-02 12:48:06,095][36761] Fps is (10 sec: 45874.6, 60 sec: 44509.8, 300 sec: 44653.3). Total num frames: 265961472. Throughput: 0: 44293.2. Samples: 266061220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 12:48:06,096][36761] Avg episode reward: [(0, '0.221')] [2024-07-02 12:48:08,939][36999] Updated weights for policy 0, policy_version 16240 (0.0043) [2024-07-02 12:48:11,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43963.7, 300 sec: 44486.7). Total num frames: 266141696. Throughput: 0: 44489.8. Samples: 266328540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 12:48:11,096][36761] Avg episode reward: [(0, '0.229')] [2024-07-02 12:48:12,457][36999] Updated weights for policy 0, policy_version 16250 (0.0024) [2024-07-02 12:48:16,095][36761] Fps is (10 sec: 40959.9, 60 sec: 44236.6, 300 sec: 44431.2). Total num frames: 266371072. Throughput: 0: 44271.0. Samples: 266450460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 12:48:16,096][36761] Avg episode reward: [(0, '0.303')] [2024-07-02 12:48:16,103][36979] Saving new best policy, reward=0.303! [2024-07-02 12:48:16,657][36999] Updated weights for policy 0, policy_version 16260 (0.0036) [2024-07-02 12:48:19,882][36999] Updated weights for policy 0, policy_version 16270 (0.0031) [2024-07-02 12:48:21,100][36761] Fps is (10 sec: 47491.9, 60 sec: 44233.4, 300 sec: 44652.7). Total num frames: 266616832. Throughput: 0: 44175.6. Samples: 266718680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-07-02 12:48:21,100][36761] Avg episode reward: [(0, '0.291')] [2024-07-02 12:48:23,961][36999] Updated weights for policy 0, policy_version 16280 (0.0041) [2024-07-02 12:48:26,095][36761] Fps is (10 sec: 44237.7, 60 sec: 44237.8, 300 sec: 44542.3). Total num frames: 266813440. Throughput: 0: 44310.2. Samples: 266987660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-07-02 12:48:26,096][36761] Avg episode reward: [(0, '0.300')] [2024-07-02 12:48:27,218][36999] Updated weights for policy 0, policy_version 16290 (0.0033) [2024-07-02 12:48:31,095][36761] Fps is (10 sec: 42618.3, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 267042816. Throughput: 0: 44399.3. Samples: 267120600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-07-02 12:48:31,096][36761] Avg episode reward: [(0, '0.299')] [2024-07-02 12:48:31,174][36999] Updated weights for policy 0, policy_version 16300 (0.0032) [2024-07-02 12:48:34,534][36999] Updated weights for policy 0, policy_version 16310 (0.0039) [2024-07-02 12:48:36,095][36761] Fps is (10 sec: 45874.9, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 267272192. Throughput: 0: 44295.9. Samples: 267386580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:48:36,096][36761] Avg episode reward: [(0, '0.299')] [2024-07-02 12:48:36,288][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000016315_267304960.pth... [2024-07-02 12:48:36,338][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000015660_256573440.pth [2024-07-02 12:48:38,514][36999] Updated weights for policy 0, policy_version 16320 (0.0035) [2024-07-02 12:48:41,095][36761] Fps is (10 sec: 44236.3, 60 sec: 44236.7, 300 sec: 44542.3). Total num frames: 267485184. Throughput: 0: 44268.9. Samples: 267653440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:48:41,096][36761] Avg episode reward: [(0, '0.288')] [2024-07-02 12:48:41,852][36999] Updated weights for policy 0, policy_version 16330 (0.0027) [2024-07-02 12:48:45,985][36999] Updated weights for policy 0, policy_version 16340 (0.0022) [2024-07-02 12:48:46,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44509.9, 300 sec: 44486.7). Total num frames: 267714560. Throughput: 0: 44197.7. Samples: 267780340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:48:46,096][36761] Avg episode reward: [(0, '0.273')] [2024-07-02 12:48:47,368][36979] Signal inference workers to stop experience collection... (3900 times) [2024-07-02 12:48:47,368][36979] Signal inference workers to resume experience collection... (3900 times) [2024-07-02 12:48:47,390][36999] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-07-02 12:48:47,390][36999] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-07-02 12:48:49,457][36999] Updated weights for policy 0, policy_version 16350 (0.0039) [2024-07-02 12:48:51,095][36761] Fps is (10 sec: 47513.9, 60 sec: 44509.9, 300 sec: 44542.3). Total num frames: 267960320. Throughput: 0: 44220.6. Samples: 268051140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 12:48:51,096][36761] Avg episode reward: [(0, '0.275')] [2024-07-02 12:48:53,383][36999] Updated weights for policy 0, policy_version 16360 (0.0027) [2024-07-02 12:48:56,095][36761] Fps is (10 sec: 44237.6, 60 sec: 44236.9, 300 sec: 44486.7). Total num frames: 268156928. Throughput: 0: 44284.5. Samples: 268321340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 12:48:56,095][36761] Avg episode reward: [(0, '0.275')] [2024-07-02 12:48:56,714][36999] Updated weights for policy 0, policy_version 16370 (0.0023) [2024-07-02 12:49:00,717][36999] Updated weights for policy 0, policy_version 16380 (0.0033) [2024-07-02 12:49:01,095][36761] Fps is (10 sec: 40959.3, 60 sec: 44236.8, 300 sec: 44431.5). Total num frames: 268369920. Throughput: 0: 44456.1. Samples: 268450980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 12:49:01,096][36761] Avg episode reward: [(0, '0.288')] [2024-07-02 12:49:04,213][36999] Updated weights for policy 0, policy_version 16390 (0.0026) [2024-07-02 12:49:06,095][36761] Fps is (10 sec: 45874.7, 60 sec: 44236.9, 300 sec: 44486.7). Total num frames: 268615680. Throughput: 0: 44463.6. Samples: 268719340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 12:49:06,096][36761] Avg episode reward: [(0, '0.274')] [2024-07-02 12:49:07,991][36999] Updated weights for policy 0, policy_version 16400 (0.0034) [2024-07-02 12:49:11,095][36761] Fps is (10 sec: 45875.3, 60 sec: 44782.9, 300 sec: 44486.7). Total num frames: 268828672. Throughput: 0: 44323.0. Samples: 268982200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 12:49:11,096][36761] Avg episode reward: [(0, '0.243')] [2024-07-02 12:49:11,557][36999] Updated weights for policy 0, policy_version 16410 (0.0040) [2024-07-02 12:49:15,681][36999] Updated weights for policy 0, policy_version 16420 (0.0034) [2024-07-02 12:49:16,095][36761] Fps is (10 sec: 40959.8, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 269025280. Throughput: 0: 44429.6. Samples: 269119940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 12:49:16,096][36761] Avg episode reward: [(0, '0.233')] [2024-07-02 12:49:18,913][36999] Updated weights for policy 0, policy_version 16430 (0.0030) [2024-07-02 12:49:21,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44240.2, 300 sec: 44431.9). Total num frames: 269271040. Throughput: 0: 44451.1. Samples: 269386880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-07-02 12:49:21,096][36761] Avg episode reward: [(0, '0.235')] [2024-07-02 12:49:23,025][36999] Updated weights for policy 0, policy_version 16440 (0.0032) [2024-07-02 12:49:26,095][36761] Fps is (10 sec: 45875.8, 60 sec: 44509.9, 300 sec: 44431.2). Total num frames: 269484032. Throughput: 0: 44403.2. Samples: 269651580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-07-02 12:49:26,096][36761] Avg episode reward: [(0, '0.256')] [2024-07-02 12:49:26,263][36999] Updated weights for policy 0, policy_version 16450 (0.0038) [2024-07-02 12:49:30,439][36999] Updated weights for policy 0, policy_version 16460 (0.0026) [2024-07-02 12:49:31,095][36761] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 44486.7). Total num frames: 269697024. Throughput: 0: 44661.0. Samples: 269790080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 12:49:31,095][36761] Avg episode reward: [(0, '0.287')] [2024-07-02 12:49:33,608][36999] Updated weights for policy 0, policy_version 16470 (0.0039) [2024-07-02 12:49:36,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44236.9, 300 sec: 44376.3). Total num frames: 269926400. Throughput: 0: 44382.7. Samples: 270048360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 12:49:36,096][36761] Avg episode reward: [(0, '0.305')] [2024-07-02 12:49:36,114][36979] Saving new best policy, reward=0.305! [2024-07-02 12:49:37,712][36999] Updated weights for policy 0, policy_version 16480 (0.0027) [2024-07-02 12:49:40,963][36999] Updated weights for policy 0, policy_version 16490 (0.0025) [2024-07-02 12:49:41,095][36761] Fps is (10 sec: 47513.5, 60 sec: 44783.0, 300 sec: 44543.0). Total num frames: 270172160. Throughput: 0: 44363.5. Samples: 270317700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 12:49:41,096][36761] Avg episode reward: [(0, '0.299')] [2024-07-02 12:49:45,013][36999] Updated weights for policy 0, policy_version 16500 (0.0029) [2024-07-02 12:49:46,099][36761] Fps is (10 sec: 44219.9, 60 sec: 44234.1, 300 sec: 44486.1). Total num frames: 270368768. Throughput: 0: 44440.8. Samples: 270450980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 12:49:46,099][36761] Avg episode reward: [(0, '0.259')] [2024-07-02 12:49:48,325][36999] Updated weights for policy 0, policy_version 16510 (0.0026) [2024-07-02 12:49:51,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43963.6, 300 sec: 44375.6). Total num frames: 270598144. Throughput: 0: 44327.5. Samples: 270714080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 12:49:51,096][36761] Avg episode reward: [(0, '0.309')] [2024-07-02 12:49:51,096][36979] Saving new best policy, reward=0.309! [2024-07-02 12:49:52,477][36999] Updated weights for policy 0, policy_version 16520 (0.0028) [2024-07-02 12:49:55,642][36999] Updated weights for policy 0, policy_version 16530 (0.0033) [2024-07-02 12:49:56,100][36761] Fps is (10 sec: 47510.0, 60 sec: 44779.5, 300 sec: 44541.6). Total num frames: 270843904. Throughput: 0: 44339.6. Samples: 270977680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 12:49:56,100][36761] Avg episode reward: [(0, '0.309')] [2024-07-02 12:49:59,827][36999] Updated weights for policy 0, policy_version 16540 (0.0037) [2024-07-02 12:50:00,476][36979] Signal inference workers to stop experience collection... (3950 times) [2024-07-02 12:50:00,477][36979] Signal inference workers to resume experience collection... (3950 times) [2024-07-02 12:50:00,495][36999] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-07-02 12:50:00,496][36999] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-07-02 12:50:01,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44509.9, 300 sec: 44431.2). Total num frames: 271040512. Throughput: 0: 44321.8. Samples: 271114420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 12:50:01,096][36761] Avg episode reward: [(0, '0.317')] [2024-07-02 12:50:01,096][36979] Saving new best policy, reward=0.317! [2024-07-02 12:50:03,032][36999] Updated weights for policy 0, policy_version 16550 (0.0041) [2024-07-02 12:50:06,101][36761] Fps is (10 sec: 42595.4, 60 sec: 44232.9, 300 sec: 44374.9). Total num frames: 271269888. Throughput: 0: 44320.6. Samples: 271381540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 12:50:06,101][36761] Avg episode reward: [(0, '0.326')] [2024-07-02 12:50:06,123][36979] Saving new best policy, reward=0.326! [2024-07-02 12:50:07,335][36999] Updated weights for policy 0, policy_version 16560 (0.0036) [2024-07-02 12:50:10,557][36999] Updated weights for policy 0, policy_version 16570 (0.0040) [2024-07-02 12:50:11,095][36761] Fps is (10 sec: 45875.4, 60 sec: 44510.0, 300 sec: 44486.7). Total num frames: 271499264. Throughput: 0: 44120.4. Samples: 271637000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-07-02 12:50:11,096][36761] Avg episode reward: [(0, '0.301')] [2024-07-02 12:50:14,844][36999] Updated weights for policy 0, policy_version 16580 (0.0038) [2024-07-02 12:50:16,095][36761] Fps is (10 sec: 44259.4, 60 sec: 44782.9, 300 sec: 44375.6). Total num frames: 271712256. Throughput: 0: 44135.8. Samples: 271776200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-07-02 12:50:16,096][36761] Avg episode reward: [(0, '0.285')] [2024-07-02 12:50:17,933][36999] Updated weights for policy 0, policy_version 16590 (0.0041) [2024-07-02 12:50:21,095][36761] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 44375.6). Total num frames: 271925248. Throughput: 0: 44199.5. Samples: 272037340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 12:50:21,096][36761] Avg episode reward: [(0, '0.354')] [2024-07-02 12:50:21,096][36979] Saving new best policy, reward=0.354! [2024-07-02 12:50:22,297][36999] Updated weights for policy 0, policy_version 16600 (0.0036) [2024-07-02 12:50:25,175][36999] Updated weights for policy 0, policy_version 16610 (0.0026) [2024-07-02 12:50:26,100][36761] Fps is (10 sec: 44217.2, 60 sec: 44506.4, 300 sec: 44486.0). Total num frames: 272154624. Throughput: 0: 44139.0. Samples: 272304160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 12:50:26,100][36761] Avg episode reward: [(0, '0.354')] [2024-07-02 12:50:29,711][36999] Updated weights for policy 0, policy_version 16620 (0.0030) [2024-07-02 12:50:31,095][36761] Fps is (10 sec: 44236.2, 60 sec: 44509.7, 300 sec: 44375.6). Total num frames: 272367616. Throughput: 0: 44285.8. Samples: 272443680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-07-02 12:50:31,096][36761] Avg episode reward: [(0, '0.354')] [2024-07-02 12:50:32,562][36999] Updated weights for policy 0, policy_version 16630 (0.0034) [2024-07-02 12:50:36,095][36761] Fps is (10 sec: 44256.7, 60 sec: 44509.8, 300 sec: 44431.2). Total num frames: 272596992. Throughput: 0: 44440.9. Samples: 272713920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-07-02 12:50:36,096][36761] Avg episode reward: [(0, '0.342')] [2024-07-02 12:50:36,110][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000016638_272596992.pth... [2024-07-02 12:50:36,161][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000015989_261963776.pth [2024-07-02 12:50:36,920][36999] Updated weights for policy 0, policy_version 16640 (0.0031) [2024-07-02 12:50:40,073][36999] Updated weights for policy 0, policy_version 16650 (0.0029) [2024-07-02 12:50:41,095][36761] Fps is (10 sec: 45875.5, 60 sec: 44236.7, 300 sec: 44376.3). Total num frames: 272826368. Throughput: 0: 44442.7. Samples: 272977400. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-07-02 12:50:41,096][36761] Avg episode reward: [(0, '0.320')] [2024-07-02 12:50:44,235][36999] Updated weights for policy 0, policy_version 16660 (0.0023) [2024-07-02 12:50:46,096][36761] Fps is (10 sec: 44235.6, 60 sec: 44512.4, 300 sec: 44376.3). Total num frames: 273039360. Throughput: 0: 44470.8. Samples: 273115620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:50:46,096][36761] Avg episode reward: [(0, '0.315')] [2024-07-02 12:50:47,355][36999] Updated weights for policy 0, policy_version 16670 (0.0032) [2024-07-02 12:50:51,095][36761] Fps is (10 sec: 42598.7, 60 sec: 44236.9, 300 sec: 44375.6). Total num frames: 273252352. Throughput: 0: 44336.8. Samples: 273376460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:50:51,096][36761] Avg episode reward: [(0, '0.307')] [2024-07-02 12:50:51,611][36999] Updated weights for policy 0, policy_version 16680 (0.0027) [2024-07-02 12:50:54,836][36999] Updated weights for policy 0, policy_version 16690 (0.0024) [2024-07-02 12:50:56,100][36761] Fps is (10 sec: 45855.9, 60 sec: 44236.8, 300 sec: 44430.5). Total num frames: 273498112. Throughput: 0: 44603.0. Samples: 273644340. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-07-02 12:50:56,100][36761] Avg episode reward: [(0, '0.331')] [2024-07-02 12:50:58,893][36999] Updated weights for policy 0, policy_version 16700 (0.0036) [2024-07-02 12:51:01,100][36761] Fps is (10 sec: 45854.3, 60 sec: 44506.5, 300 sec: 44430.5). Total num frames: 273711104. Throughput: 0: 44574.3. Samples: 273782240. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-07-02 12:51:01,100][36761] Avg episode reward: [(0, '0.331')] [2024-07-02 12:51:02,096][36999] Updated weights for policy 0, policy_version 16710 (0.0036) [2024-07-02 12:51:06,095][36761] Fps is (10 sec: 40978.5, 60 sec: 43967.6, 300 sec: 44320.1). Total num frames: 273907712. Throughput: 0: 44651.0. Samples: 274046640. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-07-02 12:51:06,096][36761] Avg episode reward: [(0, '0.330')] [2024-07-02 12:51:06,709][36999] Updated weights for policy 0, policy_version 16720 (0.0041) [2024-07-02 12:51:09,510][36999] Updated weights for policy 0, policy_version 16730 (0.0025) [2024-07-02 12:51:11,095][36761] Fps is (10 sec: 44256.5, 60 sec: 44236.7, 300 sec: 44375.6). Total num frames: 274153472. Throughput: 0: 44408.9. Samples: 274302360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 12:51:11,096][36761] Avg episode reward: [(0, '0.321')] [2024-07-02 12:51:14,148][36999] Updated weights for policy 0, policy_version 16740 (0.0028) [2024-07-02 12:51:16,095][36761] Fps is (10 sec: 47514.0, 60 sec: 44510.0, 300 sec: 44486.7). Total num frames: 274382848. Throughput: 0: 44361.1. Samples: 274439920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 12:51:16,096][36761] Avg episode reward: [(0, '0.364')] [2024-07-02 12:51:16,162][36979] Saving new best policy, reward=0.364! [2024-07-02 12:51:17,083][36999] Updated weights for policy 0, policy_version 16750 (0.0029) [2024-07-02 12:51:21,095][36761] Fps is (10 sec: 42598.3, 60 sec: 44236.7, 300 sec: 44375.6). Total num frames: 274579456. Throughput: 0: 44299.5. Samples: 274707400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 12:51:21,096][36761] Avg episode reward: [(0, '0.331')] [2024-07-02 12:51:21,347][36999] Updated weights for policy 0, policy_version 16760 (0.0038) [2024-07-02 12:51:24,276][36999] Updated weights for policy 0, policy_version 16770 (0.0031) [2024-07-02 12:51:26,095][36761] Fps is (10 sec: 44236.6, 60 sec: 44513.3, 300 sec: 44431.2). Total num frames: 274825216. Throughput: 0: 44364.0. Samples: 274973780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 12:51:26,096][36761] Avg episode reward: [(0, '0.344')] [2024-07-02 12:51:28,572][36999] Updated weights for policy 0, policy_version 16780 (0.0039) [2024-07-02 12:51:31,095][36761] Fps is (10 sec: 47514.1, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 275054592. Throughput: 0: 44400.4. Samples: 275113620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 12:51:31,096][36761] Avg episode reward: [(0, '0.337')] [2024-07-02 12:51:31,585][36999] Updated weights for policy 0, policy_version 16790 (0.0026) [2024-07-02 12:51:35,910][36999] Updated weights for policy 0, policy_version 16800 (0.0026) [2024-07-02 12:51:36,095][36761] Fps is (10 sec: 42598.4, 60 sec: 44236.9, 300 sec: 44375.7). Total num frames: 275251200. Throughput: 0: 44440.4. Samples: 275376280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 12:51:36,096][36761] Avg episode reward: [(0, '0.344')] [2024-07-02 12:51:37,639][36979] Signal inference workers to stop experience collection... (4000 times) [2024-07-02 12:51:37,639][36979] Signal inference workers to resume experience collection... (4000 times) [2024-07-02 12:51:37,663][36999] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-07-02 12:51:37,663][36999] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-07-02 12:51:38,861][36999] Updated weights for policy 0, policy_version 16810 (0.0034) [2024-07-02 12:51:41,095][36761] Fps is (10 sec: 42598.1, 60 sec: 44236.8, 300 sec: 44375.6). Total num frames: 275480576. Throughput: 0: 44454.2. Samples: 275644580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 12:51:41,096][36761] Avg episode reward: [(0, '0.344')] [2024-07-02 12:51:43,398][36999] Updated weights for policy 0, policy_version 16820 (0.0026) [2024-07-02 12:51:46,095][36761] Fps is (10 sec: 47513.8, 60 sec: 44783.2, 300 sec: 44375.7). Total num frames: 275726336. Throughput: 0: 44380.1. Samples: 275779140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 12:51:46,096][36761] Avg episode reward: [(0, '0.338')] [2024-07-02 12:51:46,164][36999] Updated weights for policy 0, policy_version 16830 (0.0027) [2024-07-02 12:51:50,793][36999] Updated weights for policy 0, policy_version 16840 (0.0037) [2024-07-02 12:51:51,095][36761] Fps is (10 sec: 44236.9, 60 sec: 44509.8, 300 sec: 44375.7). Total num frames: 275922944. Throughput: 0: 44408.0. Samples: 276045000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 12:51:51,096][36761] Avg episode reward: [(0, '0.350')] [2024-07-02 12:51:53,703][36999] Updated weights for policy 0, policy_version 16850 (0.0039) [2024-07-02 12:51:56,095][36761] Fps is (10 sec: 42598.2, 60 sec: 44240.2, 300 sec: 44431.2). Total num frames: 276152320. Throughput: 0: 44511.2. Samples: 276305360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:51:56,096][36761] Avg episode reward: [(0, '0.370')] [2024-07-02 12:51:56,162][36979] Saving new best policy, reward=0.370! [2024-07-02 12:51:58,290][36999] Updated weights for policy 0, policy_version 16860 (0.0026) [2024-07-02 12:52:01,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44240.2, 300 sec: 44320.1). Total num frames: 276365312. Throughput: 0: 44393.8. Samples: 276437640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:52:01,095][36761] Avg episode reward: [(0, '0.370')] [2024-07-02 12:52:01,620][36999] Updated weights for policy 0, policy_version 16870 (0.0039) [2024-07-02 12:52:05,720][36999] Updated weights for policy 0, policy_version 16880 (0.0036) [2024-07-02 12:52:06,095][36761] Fps is (10 sec: 42598.1, 60 sec: 44509.8, 300 sec: 44320.1). Total num frames: 276578304. Throughput: 0: 44484.9. Samples: 276709220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 12:52:06,096][36761] Avg episode reward: [(0, '0.405')] [2024-07-02 12:52:06,113][36979] Saving new best policy, reward=0.405! [2024-07-02 12:52:08,840][36999] Updated weights for policy 0, policy_version 16890 (0.0036) [2024-07-02 12:52:11,095][36761] Fps is (10 sec: 45874.9, 60 sec: 44509.9, 300 sec: 44431.2). Total num frames: 276824064. Throughput: 0: 44324.9. Samples: 276968400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:52:11,096][36761] Avg episode reward: [(0, '0.374')] [2024-07-02 12:52:13,084][36999] Updated weights for policy 0, policy_version 16900 (0.0030) [2024-07-02 12:52:16,095][36761] Fps is (10 sec: 45875.8, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 277037056. Throughput: 0: 44140.0. Samples: 277099920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:52:16,096][36761] Avg episode reward: [(0, '0.362')] [2024-07-02 12:52:16,119][36999] Updated weights for policy 0, policy_version 16910 (0.0031) [2024-07-02 12:52:20,442][36999] Updated weights for policy 0, policy_version 16920 (0.0032) [2024-07-02 12:52:21,095][36761] Fps is (10 sec: 42598.6, 60 sec: 44510.0, 300 sec: 44375.9). Total num frames: 277250048. Throughput: 0: 44362.7. Samples: 277372600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-07-02 12:52:21,096][36761] Avg episode reward: [(0, '0.362')] [2024-07-02 12:52:23,398][36999] Updated weights for policy 0, policy_version 16930 (0.0031) [2024-07-02 12:52:26,095][36761] Fps is (10 sec: 44236.9, 60 sec: 44236.8, 300 sec: 44375.7). Total num frames: 277479424. Throughput: 0: 44143.2. Samples: 277631020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-07-02 12:52:26,096][36761] Avg episode reward: [(0, '0.384')] [2024-07-02 12:52:27,753][36999] Updated weights for policy 0, policy_version 16940 (0.0024) [2024-07-02 12:52:31,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43963.8, 300 sec: 44320.1). Total num frames: 277692416. Throughput: 0: 44173.8. Samples: 277766960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-07-02 12:52:31,096][36761] Avg episode reward: [(0, '0.321')] [2024-07-02 12:52:31,169][36999] Updated weights for policy 0, policy_version 16950 (0.0027) [2024-07-02 12:52:35,174][36999] Updated weights for policy 0, policy_version 16960 (0.0028) [2024-07-02 12:52:36,095][36761] Fps is (10 sec: 44236.1, 60 sec: 44509.8, 300 sec: 44375.6). Total num frames: 277921792. Throughput: 0: 44272.8. Samples: 278037280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 12:52:36,096][36761] Avg episode reward: [(0, '0.347')] [2024-07-02 12:52:36,104][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000016963_277921792.pth... [2024-07-02 12:52:36,168][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000016315_267304960.pth [2024-07-02 12:52:38,369][36999] Updated weights for policy 0, policy_version 16970 (0.0027) [2024-07-02 12:52:41,095][36761] Fps is (10 sec: 44236.2, 60 sec: 44236.8, 300 sec: 44375.7). Total num frames: 278134784. Throughput: 0: 44261.7. Samples: 278297140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 12:52:41,096][36761] Avg episode reward: [(0, '0.345')] [2024-07-02 12:52:42,541][36999] Updated weights for policy 0, policy_version 16980 (0.0039) [2024-07-02 12:52:45,709][36999] Updated weights for policy 0, policy_version 16990 (0.0025) [2024-07-02 12:52:46,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43963.6, 300 sec: 44320.1). Total num frames: 278364160. Throughput: 0: 44272.7. Samples: 278429920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 12:52:46,096][36761] Avg episode reward: [(0, '0.356')] [2024-07-02 12:52:49,970][36999] Updated weights for policy 0, policy_version 17000 (0.0026) [2024-07-02 12:52:51,095][36761] Fps is (10 sec: 45875.3, 60 sec: 44509.9, 300 sec: 44375.6). Total num frames: 278593536. Throughput: 0: 44388.9. Samples: 278706720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 12:52:51,096][36761] Avg episode reward: [(0, '0.377')] [2024-07-02 12:52:52,890][36999] Updated weights for policy 0, policy_version 17010 (0.0030) [2024-07-02 12:52:56,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43963.7, 300 sec: 44320.1). Total num frames: 278790144. Throughput: 0: 44616.9. Samples: 278976160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 12:52:56,096][36761] Avg episode reward: [(0, '0.377')] [2024-07-02 12:52:57,206][36999] Updated weights for policy 0, policy_version 17020 (0.0036) [2024-07-02 12:53:00,081][36999] Updated weights for policy 0, policy_version 17030 (0.0031) [2024-07-02 12:53:01,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44509.9, 300 sec: 44320.1). Total num frames: 279035904. Throughput: 0: 44502.7. Samples: 279102540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 12:53:01,096][36761] Avg episode reward: [(0, '0.400')] [2024-07-02 12:53:04,422][36999] Updated weights for policy 0, policy_version 17040 (0.0039) [2024-07-02 12:53:06,095][36761] Fps is (10 sec: 47513.9, 60 sec: 44783.0, 300 sec: 44486.7). Total num frames: 279265280. Throughput: 0: 44521.4. Samples: 279376060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 12:53:06,096][36761] Avg episode reward: [(0, '0.346')] [2024-07-02 12:53:06,676][36979] Signal inference workers to stop experience collection... (4050 times) [2024-07-02 12:53:06,725][36979] Signal inference workers to resume experience collection... (4050 times) [2024-07-02 12:53:06,727][36999] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-07-02 12:53:06,746][36999] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-07-02 12:53:07,305][36999] Updated weights for policy 0, policy_version 17050 (0.0029) [2024-07-02 12:53:11,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 279445504. Throughput: 0: 44886.6. Samples: 279650920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 12:53:11,096][36761] Avg episode reward: [(0, '0.338')] [2024-07-02 12:53:11,836][36999] Updated weights for policy 0, policy_version 17060 (0.0030) [2024-07-02 12:53:14,626][36999] Updated weights for policy 0, policy_version 17070 (0.0043) [2024-07-02 12:53:16,095][36761] Fps is (10 sec: 45874.6, 60 sec: 44782.8, 300 sec: 44431.9). Total num frames: 279724032. Throughput: 0: 44636.3. Samples: 279775600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 12:53:16,096][36761] Avg episode reward: [(0, '0.366')] [2024-07-02 12:53:19,354][36999] Updated weights for policy 0, policy_version 17080 (0.0038) [2024-07-02 12:53:21,097][36761] Fps is (10 sec: 49142.4, 60 sec: 44781.5, 300 sec: 44486.4). Total num frames: 279937024. Throughput: 0: 44663.5. Samples: 280047220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:53:21,098][36761] Avg episode reward: [(0, '0.366')] [2024-07-02 12:53:21,871][36999] Updated weights for policy 0, policy_version 17090 (0.0037) [2024-07-02 12:53:26,095][36761] Fps is (10 sec: 40960.3, 60 sec: 44236.7, 300 sec: 44375.6). Total num frames: 280133632. Throughput: 0: 44994.3. Samples: 280321880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:53:26,096][36761] Avg episode reward: [(0, '0.381')] [2024-07-02 12:53:26,657][36999] Updated weights for policy 0, policy_version 17100 (0.0036) [2024-07-02 12:53:29,249][36999] Updated weights for policy 0, policy_version 17110 (0.0034) [2024-07-02 12:53:31,100][36761] Fps is (10 sec: 44225.1, 60 sec: 44779.5, 300 sec: 44430.5). Total num frames: 280379392. Throughput: 0: 44768.0. Samples: 280444680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 12:53:31,100][36761] Avg episode reward: [(0, '0.371')] [2024-07-02 12:53:33,938][36999] Updated weights for policy 0, policy_version 17120 (0.0033) [2024-07-02 12:53:36,095][36761] Fps is (10 sec: 47514.2, 60 sec: 44783.1, 300 sec: 44486.7). Total num frames: 280608768. Throughput: 0: 44605.9. Samples: 280713980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 12:53:36,095][36761] Avg episode reward: [(0, '0.333')] [2024-07-02 12:53:36,779][36999] Updated weights for policy 0, policy_version 17130 (0.0038) [2024-07-02 12:53:41,095][36761] Fps is (10 sec: 42617.6, 60 sec: 44509.9, 300 sec: 44375.7). Total num frames: 280805376. Throughput: 0: 44583.1. Samples: 280982400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 12:53:41,096][36761] Avg episode reward: [(0, '0.362')] [2024-07-02 12:53:41,256][36999] Updated weights for policy 0, policy_version 17140 (0.0029) [2024-07-02 12:53:44,381][36999] Updated weights for policy 0, policy_version 17150 (0.0041) [2024-07-02 12:53:46,095][36761] Fps is (10 sec: 44236.4, 60 sec: 44783.0, 300 sec: 44375.6). Total num frames: 281051136. Throughput: 0: 44529.3. Samples: 281106360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 12:53:46,096][36761] Avg episode reward: [(0, '0.401')] [2024-07-02 12:53:48,676][36999] Updated weights for policy 0, policy_version 17160 (0.0030) [2024-07-02 12:53:51,095][36761] Fps is (10 sec: 47513.5, 60 sec: 44782.9, 300 sec: 44486.7). Total num frames: 281280512. Throughput: 0: 44501.7. Samples: 281378640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 12:53:51,096][36761] Avg episode reward: [(0, '0.401')] [2024-07-02 12:53:51,815][36999] Updated weights for policy 0, policy_version 17170 (0.0032) [2024-07-02 12:53:56,024][36999] Updated weights for policy 0, policy_version 17180 (0.0035) [2024-07-02 12:53:56,095][36761] Fps is (10 sec: 42598.2, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 281477120. Throughput: 0: 44326.1. Samples: 281645600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 12:53:56,096][36761] Avg episode reward: [(0, '0.334')] [2024-07-02 12:53:59,395][36999] Updated weights for policy 0, policy_version 17190 (0.0043) [2024-07-02 12:54:01,100][36761] Fps is (10 sec: 44216.7, 60 sec: 44779.5, 300 sec: 44430.5). Total num frames: 281722880. Throughput: 0: 44230.7. Samples: 281766180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-07-02 12:54:01,100][36761] Avg episode reward: [(0, '0.384')] [2024-07-02 12:54:03,460][36999] Updated weights for policy 0, policy_version 17200 (0.0044) [2024-07-02 12:54:06,095][36761] Fps is (10 sec: 47514.2, 60 sec: 44783.0, 300 sec: 44486.8). Total num frames: 281952256. Throughput: 0: 44421.1. Samples: 282046080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-07-02 12:54:06,095][36761] Avg episode reward: [(0, '0.384')] [2024-07-02 12:54:06,771][36999] Updated weights for policy 0, policy_version 17210 (0.0029) [2024-07-02 12:54:10,742][36999] Updated weights for policy 0, policy_version 17220 (0.0031) [2024-07-02 12:54:11,099][36761] Fps is (10 sec: 40961.9, 60 sec: 44779.8, 300 sec: 44430.6). Total num frames: 282132480. Throughput: 0: 44184.4. Samples: 282310360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 12:54:11,100][36761] Avg episode reward: [(0, '0.399')] [2024-07-02 12:54:14,125][36999] Updated weights for policy 0, policy_version 17230 (0.0039) [2024-07-02 12:54:16,095][36761] Fps is (10 sec: 42598.1, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 282378240. Throughput: 0: 44261.4. Samples: 282436240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 12:54:16,096][36761] Avg episode reward: [(0, '0.347')] [2024-07-02 12:54:18,108][36999] Updated weights for policy 0, policy_version 17240 (0.0035) [2024-07-02 12:54:21,100][36761] Fps is (10 sec: 47511.5, 60 sec: 44507.9, 300 sec: 44486.0). Total num frames: 282607616. Throughput: 0: 44267.4. Samples: 282706220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 12:54:21,100][36761] Avg episode reward: [(0, '0.371')] [2024-07-02 12:54:21,457][36999] Updated weights for policy 0, policy_version 17250 (0.0027) [2024-07-02 12:54:25,674][36999] Updated weights for policy 0, policy_version 17260 (0.0026) [2024-07-02 12:54:26,095][36761] Fps is (10 sec: 42598.4, 60 sec: 44509.9, 300 sec: 44431.2). Total num frames: 282804224. Throughput: 0: 44317.4. Samples: 282976680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 12:54:26,096][36761] Avg episode reward: [(0, '0.374')] [2024-07-02 12:54:28,674][36999] Updated weights for policy 0, policy_version 17270 (0.0023) [2024-07-02 12:54:31,096][36761] Fps is (10 sec: 42615.6, 60 sec: 44239.7, 300 sec: 44431.1). Total num frames: 283033600. Throughput: 0: 44402.1. Samples: 283104480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 12:54:31,096][36761] Avg episode reward: [(0, '0.410')] [2024-07-02 12:54:31,100][36979] Saving new best policy, reward=0.410! [2024-07-02 12:54:33,150][36999] Updated weights for policy 0, policy_version 17280 (0.0035) [2024-07-02 12:54:36,095][36761] Fps is (10 sec: 45874.8, 60 sec: 44236.7, 300 sec: 44375.6). Total num frames: 283262976. Throughput: 0: 44248.0. Samples: 283369800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-07-02 12:54:36,096][36761] Avg episode reward: [(0, '0.441')] [2024-07-02 12:54:36,230][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000017290_283279360.pth... [2024-07-02 12:54:36,235][36999] Updated weights for policy 0, policy_version 17290 (0.0036) [2024-07-02 12:54:36,292][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000016638_272596992.pth [2024-07-02 12:54:36,297][36979] Saving new best policy, reward=0.441! [2024-07-02 12:54:36,871][36979] Signal inference workers to stop experience collection... (4100 times) [2024-07-02 12:54:36,872][36979] Signal inference workers to resume experience collection... (4100 times) [2024-07-02 12:54:36,887][36999] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-07-02 12:54:36,887][36999] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-07-02 12:54:40,449][36999] Updated weights for policy 0, policy_version 17300 (0.0029) [2024-07-02 12:54:41,095][36761] Fps is (10 sec: 42600.6, 60 sec: 44236.8, 300 sec: 44376.2). Total num frames: 283459584. Throughput: 0: 44247.1. Samples: 283636720. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-07-02 12:54:41,096][36761] Avg episode reward: [(0, '0.405')] [2024-07-02 12:54:43,554][36999] Updated weights for policy 0, policy_version 17310 (0.0029) [2024-07-02 12:54:46,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43963.7, 300 sec: 44375.7). Total num frames: 283688960. Throughput: 0: 44417.4. Samples: 283764760. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-07-02 12:54:46,096][36761] Avg episode reward: [(0, '0.371')] [2024-07-02 12:54:47,801][36999] Updated weights for policy 0, policy_version 17320 (0.0044) [2024-07-02 12:54:51,088][36999] Updated weights for policy 0, policy_version 17330 (0.0025) [2024-07-02 12:54:51,095][36761] Fps is (10 sec: 47513.9, 60 sec: 44236.9, 300 sec: 44376.3). Total num frames: 283934720. Throughput: 0: 44088.4. Samples: 284030060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-07-02 12:54:51,096][36761] Avg episode reward: [(0, '0.389')] [2024-07-02 12:54:55,303][36999] Updated weights for policy 0, policy_version 17340 (0.0049) [2024-07-02 12:54:56,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43963.7, 300 sec: 44320.1). Total num frames: 284114944. Throughput: 0: 44181.8. Samples: 284298360. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-07-02 12:54:56,096][36761] Avg episode reward: [(0, '0.442')] [2024-07-02 12:54:56,124][36979] Saving new best policy, reward=0.442! [2024-07-02 12:54:58,446][36999] Updated weights for policy 0, policy_version 17350 (0.0032) [2024-07-02 12:55:01,095][36761] Fps is (10 sec: 40959.7, 60 sec: 43694.0, 300 sec: 44320.9). Total num frames: 284344320. Throughput: 0: 44151.5. Samples: 284423060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 12:55:01,099][36761] Avg episode reward: [(0, '0.447')] [2024-07-02 12:55:01,100][36979] Saving new best policy, reward=0.447! [2024-07-02 12:55:03,171][36999] Updated weights for policy 0, policy_version 17360 (0.0040) [2024-07-02 12:55:05,980][36999] Updated weights for policy 0, policy_version 17370 (0.0034) [2024-07-02 12:55:06,095][36761] Fps is (10 sec: 47513.4, 60 sec: 43963.6, 300 sec: 44375.6). Total num frames: 284590080. Throughput: 0: 44096.4. Samples: 284690360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 12:55:06,096][36761] Avg episode reward: [(0, '0.445')] [2024-07-02 12:55:10,374][36999] Updated weights for policy 0, policy_version 17380 (0.0031) [2024-07-02 12:55:11,095][36761] Fps is (10 sec: 45874.9, 60 sec: 44512.9, 300 sec: 44375.7). Total num frames: 284803072. Throughput: 0: 44148.3. Samples: 284963360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 12:55:11,098][36761] Avg episode reward: [(0, '0.446')] [2024-07-02 12:55:13,207][36999] Updated weights for policy 0, policy_version 17390 (0.0044) [2024-07-02 12:55:16,095][36761] Fps is (10 sec: 40960.4, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 284999680. Throughput: 0: 44193.0. Samples: 285093140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:55:16,096][36761] Avg episode reward: [(0, '0.446')] [2024-07-02 12:55:17,658][36999] Updated weights for policy 0, policy_version 17400 (0.0033) [2024-07-02 12:55:20,497][36999] Updated weights for policy 0, policy_version 17410 (0.0038) [2024-07-02 12:55:21,095][36761] Fps is (10 sec: 45875.1, 60 sec: 44240.1, 300 sec: 44431.9). Total num frames: 285261824. Throughput: 0: 44227.1. Samples: 285360020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 12:55:21,096][36761] Avg episode reward: [(0, '0.446')] [2024-07-02 12:55:24,949][36999] Updated weights for policy 0, policy_version 17420 (0.0022) [2024-07-02 12:55:26,095][36761] Fps is (10 sec: 47513.9, 60 sec: 44509.9, 300 sec: 44431.2). Total num frames: 285474816. Throughput: 0: 44404.1. Samples: 285634900. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-07-02 12:55:26,095][36761] Avg episode reward: [(0, '0.446')] [2024-07-02 12:55:27,727][36999] Updated weights for policy 0, policy_version 17430 (0.0026) [2024-07-02 12:55:31,095][36761] Fps is (10 sec: 39322.1, 60 sec: 43691.1, 300 sec: 44264.6). Total num frames: 285655040. Throughput: 0: 44437.8. Samples: 285764460. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-07-02 12:55:31,096][36761] Avg episode reward: [(0, '0.517')] [2024-07-02 12:55:31,147][36979] Saving new best policy, reward=0.517! [2024-07-02 12:55:32,267][36999] Updated weights for policy 0, policy_version 17440 (0.0031) [2024-07-02 12:55:35,166][36999] Updated weights for policy 0, policy_version 17450 (0.0024) [2024-07-02 12:55:36,097][36761] Fps is (10 sec: 45868.2, 60 sec: 44508.8, 300 sec: 44431.0). Total num frames: 285933568. Throughput: 0: 44375.8. Samples: 286027040. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-07-02 12:55:36,097][36761] Avg episode reward: [(0, '0.449')] [2024-07-02 12:55:39,693][36999] Updated weights for policy 0, policy_version 17460 (0.0036) [2024-07-02 12:55:41,095][36761] Fps is (10 sec: 49151.1, 60 sec: 44782.8, 300 sec: 44431.2). Total num frames: 286146560. Throughput: 0: 44310.6. Samples: 286292340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 12:55:41,096][36761] Avg episode reward: [(0, '0.399')] [2024-07-02 12:55:42,609][36999] Updated weights for policy 0, policy_version 17470 (0.0036) [2024-07-02 12:55:46,095][36761] Fps is (10 sec: 39326.8, 60 sec: 43963.7, 300 sec: 44320.1). Total num frames: 286326784. Throughput: 0: 44405.2. Samples: 286421300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 12:55:46,096][36761] Avg episode reward: [(0, '0.399')] [2024-07-02 12:55:46,276][36979] Signal inference workers to stop experience collection... (4150 times) [2024-07-02 12:55:46,334][36999] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-07-02 12:55:46,340][36979] Signal inference workers to resume experience collection... (4150 times) [2024-07-02 12:55:46,357][36999] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-07-02 12:55:47,113][36999] Updated weights for policy 0, policy_version 17480 (0.0035) [2024-07-02 12:55:50,051][36999] Updated weights for policy 0, policy_version 17490 (0.0031) [2024-07-02 12:55:51,095][36761] Fps is (10 sec: 44237.7, 60 sec: 44236.8, 300 sec: 44376.3). Total num frames: 286588928. Throughput: 0: 44436.1. Samples: 286689980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:55:51,096][36761] Avg episode reward: [(0, '0.401')] [2024-07-02 12:55:54,394][36999] Updated weights for policy 0, policy_version 17500 (0.0025) [2024-07-02 12:55:56,095][36761] Fps is (10 sec: 47514.5, 60 sec: 44783.0, 300 sec: 44376.3). Total num frames: 286801920. Throughput: 0: 44234.8. Samples: 286953920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:55:56,096][36761] Avg episode reward: [(0, '0.402')] [2024-07-02 12:55:57,631][36999] Updated weights for policy 0, policy_version 17510 (0.0041) [2024-07-02 12:56:01,095][36761] Fps is (10 sec: 40960.2, 60 sec: 44236.9, 300 sec: 44375.7). Total num frames: 286998528. Throughput: 0: 44117.8. Samples: 287078440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:56:01,095][36761] Avg episode reward: [(0, '0.413')] [2024-07-02 12:56:01,876][36999] Updated weights for policy 0, policy_version 17520 (0.0047) [2024-07-02 12:56:05,004][36999] Updated weights for policy 0, policy_version 17530 (0.0035) [2024-07-02 12:56:06,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44236.9, 300 sec: 44375.7). Total num frames: 287244288. Throughput: 0: 44127.6. Samples: 287345760. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-07-02 12:56:06,096][36761] Avg episode reward: [(0, '0.442')] [2024-07-02 12:56:09,197][36999] Updated weights for policy 0, policy_version 17540 (0.0025) [2024-07-02 12:56:11,096][36761] Fps is (10 sec: 47512.3, 60 sec: 44509.8, 300 sec: 44375.6). Total num frames: 287473664. Throughput: 0: 43935.3. Samples: 287612000. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-07-02 12:56:11,096][36761] Avg episode reward: [(0, '0.450')] [2024-07-02 12:56:12,283][36999] Updated weights for policy 0, policy_version 17550 (0.0030) [2024-07-02 12:56:16,096][36761] Fps is (10 sec: 40955.4, 60 sec: 44236.0, 300 sec: 44319.9). Total num frames: 287653888. Throughput: 0: 43990.4. Samples: 287744080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:56:16,097][36761] Avg episode reward: [(0, '0.468')] [2024-07-02 12:56:16,647][36999] Updated weights for policy 0, policy_version 17560 (0.0041) [2024-07-02 12:56:19,582][36999] Updated weights for policy 0, policy_version 17570 (0.0040) [2024-07-02 12:56:21,095][36761] Fps is (10 sec: 44238.0, 60 sec: 44236.9, 300 sec: 44375.7). Total num frames: 287916032. Throughput: 0: 44042.4. Samples: 288008880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:56:21,096][36761] Avg episode reward: [(0, '0.457')] [2024-07-02 12:56:24,216][36999] Updated weights for policy 0, policy_version 17580 (0.0028) [2024-07-02 12:56:26,095][36761] Fps is (10 sec: 47518.8, 60 sec: 44236.7, 300 sec: 44320.1). Total num frames: 288129024. Throughput: 0: 44060.6. Samples: 288275060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:56:26,096][36761] Avg episode reward: [(0, '0.445')] [2024-07-02 12:56:27,006][36999] Updated weights for policy 0, policy_version 17590 (0.0035) [2024-07-02 12:56:31,095][36761] Fps is (10 sec: 40959.9, 60 sec: 44509.9, 300 sec: 44320.1). Total num frames: 288325632. Throughput: 0: 44262.9. Samples: 288413120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 12:56:31,096][36761] Avg episode reward: [(0, '0.448')] [2024-07-02 12:56:31,581][36999] Updated weights for policy 0, policy_version 17600 (0.0032) [2024-07-02 12:56:34,450][36999] Updated weights for policy 0, policy_version 17610 (0.0034) [2024-07-02 12:56:36,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43964.8, 300 sec: 44375.6). Total num frames: 288571392. Throughput: 0: 43995.0. Samples: 288669760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 12:56:36,099][36761] Avg episode reward: [(0, '0.451')] [2024-07-02 12:56:36,113][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000017613_288571392.pth... [2024-07-02 12:56:36,165][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000016963_277921792.pth [2024-07-02 12:56:39,128][36999] Updated weights for policy 0, policy_version 17620 (0.0037) [2024-07-02 12:56:41,095][36761] Fps is (10 sec: 45874.4, 60 sec: 43963.8, 300 sec: 44264.5). Total num frames: 288784384. Throughput: 0: 44134.5. Samples: 288939980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-07-02 12:56:41,096][36761] Avg episode reward: [(0, '0.424')] [2024-07-02 12:56:41,782][36999] Updated weights for policy 0, policy_version 17630 (0.0026) [2024-07-02 12:56:46,095][36761] Fps is (10 sec: 40960.1, 60 sec: 44236.9, 300 sec: 44264.6). Total num frames: 288980992. Throughput: 0: 44468.3. Samples: 289079520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-07-02 12:56:46,096][36761] Avg episode reward: [(0, '0.437')] [2024-07-02 12:56:46,517][36999] Updated weights for policy 0, policy_version 17640 (0.0034) [2024-07-02 12:56:47,477][36979] Signal inference workers to stop experience collection... (4200 times) [2024-07-02 12:56:47,477][36979] Signal inference workers to resume experience collection... (4200 times) [2024-07-02 12:56:47,496][36999] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-07-02 12:56:47,523][36999] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-07-02 12:56:49,130][36999] Updated weights for policy 0, policy_version 17650 (0.0036) [2024-07-02 12:56:51,095][36761] Fps is (10 sec: 44237.3, 60 sec: 43963.7, 300 sec: 44320.1). Total num frames: 289226752. Throughput: 0: 44260.4. Samples: 289337480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-07-02 12:56:51,096][36761] Avg episode reward: [(0, '0.449')] [2024-07-02 12:56:53,942][36999] Updated weights for policy 0, policy_version 17660 (0.0043) [2024-07-02 12:56:56,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44236.8, 300 sec: 44375.6). Total num frames: 289456128. Throughput: 0: 44319.3. Samples: 289606360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:56:56,096][36761] Avg episode reward: [(0, '0.472')] [2024-07-02 12:56:56,485][36999] Updated weights for policy 0, policy_version 17670 (0.0035) [2024-07-02 12:57:01,095][36761] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 289652736. Throughput: 0: 44388.7. Samples: 289741520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 12:57:01,096][36761] Avg episode reward: [(0, '0.438')] [2024-07-02 12:57:01,325][36999] Updated weights for policy 0, policy_version 17680 (0.0037) [2024-07-02 12:57:03,898][36999] Updated weights for policy 0, policy_version 17690 (0.0029) [2024-07-02 12:57:06,095][36761] Fps is (10 sec: 42597.9, 60 sec: 43963.6, 300 sec: 44264.6). Total num frames: 289882112. Throughput: 0: 44228.3. Samples: 289999160. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-07-02 12:57:06,096][36761] Avg episode reward: [(0, '0.460')] [2024-07-02 12:57:08,619][36999] Updated weights for policy 0, policy_version 17700 (0.0025) [2024-07-02 12:57:11,095][36761] Fps is (10 sec: 47513.3, 60 sec: 44236.9, 300 sec: 44375.6). Total num frames: 290127872. Throughput: 0: 44194.7. Samples: 290263820. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-07-02 12:57:11,096][36761] Avg episode reward: [(0, '0.412')] [2024-07-02 12:57:11,475][36999] Updated weights for policy 0, policy_version 17710 (0.0030) [2024-07-02 12:57:16,095][36999] Updated weights for policy 0, policy_version 17720 (0.0034) [2024-07-02 12:57:16,095][36761] Fps is (10 sec: 44237.3, 60 sec: 44510.7, 300 sec: 44320.1). Total num frames: 290324480. Throughput: 0: 44308.8. Samples: 290407020. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-07-02 12:57:16,096][36761] Avg episode reward: [(0, '0.418')] [2024-07-02 12:57:18,842][36999] Updated weights for policy 0, policy_version 17730 (0.0021) [2024-07-02 12:57:21,095][36761] Fps is (10 sec: 40959.8, 60 sec: 43690.6, 300 sec: 44264.5). Total num frames: 290537472. Throughput: 0: 44233.8. Samples: 290660280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 12:57:21,096][36761] Avg episode reward: [(0, '0.511')] [2024-07-02 12:57:23,526][36999] Updated weights for policy 0, policy_version 17740 (0.0028) [2024-07-02 12:57:26,095][36761] Fps is (10 sec: 47513.5, 60 sec: 44509.9, 300 sec: 44431.2). Total num frames: 290799616. Throughput: 0: 44145.4. Samples: 290926520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 12:57:26,096][36761] Avg episode reward: [(0, '0.460')] [2024-07-02 12:57:26,241][36999] Updated weights for policy 0, policy_version 17750 (0.0034) [2024-07-02 12:57:31,043][36999] Updated weights for policy 0, policy_version 17760 (0.0035) [2024-07-02 12:57:31,095][36761] Fps is (10 sec: 44237.0, 60 sec: 44236.8, 300 sec: 44264.6). Total num frames: 290979840. Throughput: 0: 44208.0. Samples: 291068880. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-07-02 12:57:31,096][36761] Avg episode reward: [(0, '0.457')] [2024-07-02 12:57:33,660][36999] Updated weights for policy 0, policy_version 17770 (0.0022) [2024-07-02 12:57:36,095][36761] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 44264.6). Total num frames: 291192832. Throughput: 0: 44150.3. Samples: 291324240. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-07-02 12:57:36,096][36761] Avg episode reward: [(0, '0.457')] [2024-07-02 12:57:38,347][36999] Updated weights for policy 0, policy_version 17780 (0.0036) [2024-07-02 12:57:41,040][36999] Updated weights for policy 0, policy_version 17790 (0.0033) [2024-07-02 12:57:41,095][36761] Fps is (10 sec: 49152.3, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 291471360. Throughput: 0: 44319.6. Samples: 291600740. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-07-02 12:57:41,096][36761] Avg episode reward: [(0, '0.457')] [2024-07-02 12:57:45,626][36999] Updated weights for policy 0, policy_version 17800 (0.0020) [2024-07-02 12:57:46,095][36761] Fps is (10 sec: 45874.8, 60 sec: 44509.9, 300 sec: 44264.6). Total num frames: 291651584. Throughput: 0: 44432.4. Samples: 291740980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 12:57:46,096][36761] Avg episode reward: [(0, '0.457')] [2024-07-02 12:57:48,570][36999] Updated weights for policy 0, policy_version 17810 (0.0021) [2024-07-02 12:57:51,095][36761] Fps is (10 sec: 40959.7, 60 sec: 44236.8, 300 sec: 44375.6). Total num frames: 291880960. Throughput: 0: 44481.4. Samples: 292000820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 12:57:51,096][36761] Avg episode reward: [(0, '0.457')] [2024-07-02 12:57:52,932][36999] Updated weights for policy 0, policy_version 17820 (0.0030) [2024-07-02 12:57:54,063][36979] Signal inference workers to stop experience collection... (4250 times) [2024-07-02 12:57:54,064][36979] Signal inference workers to resume experience collection... (4250 times) [2024-07-02 12:57:54,103][36999] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-07-02 12:57:54,103][36999] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-07-02 12:57:56,049][36999] Updated weights for policy 0, policy_version 17830 (0.0041) [2024-07-02 12:57:56,095][36761] Fps is (10 sec: 47514.0, 60 sec: 44509.9, 300 sec: 44375.6). Total num frames: 292126720. Throughput: 0: 44631.2. Samples: 292272220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 12:57:56,095][36761] Avg episode reward: [(0, '0.456')] [2024-07-02 12:58:00,248][36999] Updated weights for policy 0, policy_version 17840 (0.0036) [2024-07-02 12:58:01,098][36761] Fps is (10 sec: 45861.8, 60 sec: 44780.7, 300 sec: 44319.7). Total num frames: 292339712. Throughput: 0: 44476.7. Samples: 292408600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 12:58:01,099][36761] Avg episode reward: [(0, '0.466')] [2024-07-02 12:58:03,475][36999] Updated weights for policy 0, policy_version 17850 (0.0019) [2024-07-02 12:58:06,095][36761] Fps is (10 sec: 40959.8, 60 sec: 44236.9, 300 sec: 44375.6). Total num frames: 292536320. Throughput: 0: 44583.2. Samples: 292666520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-07-02 12:58:06,096][36761] Avg episode reward: [(0, '0.408')] [2024-07-02 12:58:07,567][36999] Updated weights for policy 0, policy_version 17860 (0.0029) [2024-07-02 12:58:11,002][36999] Updated weights for policy 0, policy_version 17870 (0.0027) [2024-07-02 12:58:11,095][36761] Fps is (10 sec: 44250.0, 60 sec: 44236.8, 300 sec: 44264.6). Total num frames: 292782080. Throughput: 0: 44638.3. Samples: 292935240. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-07-02 12:58:11,096][36761] Avg episode reward: [(0, '0.435')] [2024-07-02 12:58:14,871][36999] Updated weights for policy 0, policy_version 17880 (0.0025) [2024-07-02 12:58:16,095][36761] Fps is (10 sec: 47514.0, 60 sec: 44783.0, 300 sec: 44320.4). Total num frames: 293011456. Throughput: 0: 44627.2. Samples: 293077100. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-07-02 12:58:16,095][36761] Avg episode reward: [(0, '0.455')] [2024-07-02 12:58:18,350][36999] Updated weights for policy 0, policy_version 17890 (0.0029) [2024-07-02 12:58:21,095][36761] Fps is (10 sec: 40959.3, 60 sec: 44236.8, 300 sec: 44264.6). Total num frames: 293191680. Throughput: 0: 44760.3. Samples: 293338460. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-07-02 12:58:21,096][36761] Avg episode reward: [(0, '0.452')] [2024-07-02 12:58:22,260][36999] Updated weights for policy 0, policy_version 17900 (0.0027) [2024-07-02 12:58:25,697][36999] Updated weights for policy 0, policy_version 17910 (0.0031) [2024-07-02 12:58:26,095][36761] Fps is (10 sec: 44236.3, 60 sec: 44236.8, 300 sec: 44320.8). Total num frames: 293453824. Throughput: 0: 44375.5. Samples: 293597640. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-07-02 12:58:26,096][36761] Avg episode reward: [(0, '0.444')] [2024-07-02 12:58:29,646][36999] Updated weights for policy 0, policy_version 17920 (0.0033) [2024-07-02 12:58:31,095][36761] Fps is (10 sec: 49151.9, 60 sec: 45055.9, 300 sec: 44320.1). Total num frames: 293683200. Throughput: 0: 44480.3. Samples: 293742600. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-07-02 12:58:31,096][36761] Avg episode reward: [(0, '0.484')] [2024-07-02 12:58:33,016][36999] Updated weights for policy 0, policy_version 17930 (0.0025) [2024-07-02 12:58:36,095][36761] Fps is (10 sec: 39321.8, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 293847040. Throughput: 0: 44408.5. Samples: 293999200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-07-02 12:58:36,096][36761] Avg episode reward: [(0, '0.481')] [2024-07-02 12:58:36,125][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000017936_293863424.pth... [2024-07-02 12:58:36,172][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000017290_283279360.pth [2024-07-02 12:58:37,008][36999] Updated weights for policy 0, policy_version 17940 (0.0036) [2024-07-02 12:58:40,400][36999] Updated weights for policy 0, policy_version 17950 (0.0033) [2024-07-02 12:58:41,095][36761] Fps is (10 sec: 40960.8, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 294092800. Throughput: 0: 44300.9. Samples: 294265760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-07-02 12:58:41,096][36761] Avg episode reward: [(0, '0.504')] [2024-07-02 12:58:44,458][36999] Updated weights for policy 0, policy_version 17960 (0.0051) [2024-07-02 12:58:45,300][36979] Signal inference workers to stop experience collection... (4300 times) [2024-07-02 12:58:45,300][36979] Signal inference workers to resume experience collection... (4300 times) [2024-07-02 12:58:45,341][36999] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-07-02 12:58:45,341][36999] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-07-02 12:58:46,098][36761] Fps is (10 sec: 50775.0, 60 sec: 45053.8, 300 sec: 44319.7). Total num frames: 294354944. Throughput: 0: 44284.8. Samples: 294401420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-07-02 12:58:46,099][36761] Avg episode reward: [(0, '0.465')] [2024-07-02 12:58:47,739][36999] Updated weights for policy 0, policy_version 17970 (0.0032) [2024-07-02 12:58:51,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43963.7, 300 sec: 44209.0). Total num frames: 294518784. Throughput: 0: 44447.0. Samples: 294666640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 12:58:51,096][36761] Avg episode reward: [(0, '0.474')] [2024-07-02 12:58:51,895][36999] Updated weights for policy 0, policy_version 17980 (0.0030) [2024-07-02 12:58:55,034][36999] Updated weights for policy 0, policy_version 17990 (0.0038) [2024-07-02 12:58:56,095][36761] Fps is (10 sec: 40972.2, 60 sec: 43963.7, 300 sec: 44209.7). Total num frames: 294764544. Throughput: 0: 44373.3. Samples: 294932040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 12:58:56,096][36761] Avg episode reward: [(0, '0.495')] [2024-07-02 12:58:59,270][36999] Updated weights for policy 0, policy_version 18000 (0.0026) [2024-07-02 12:59:01,095][36761] Fps is (10 sec: 50791.0, 60 sec: 44785.2, 300 sec: 44320.1). Total num frames: 295026688. Throughput: 0: 44330.2. Samples: 295071960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-07-02 12:59:01,096][36761] Avg episode reward: [(0, '0.465')] [2024-07-02 12:59:02,261][36999] Updated weights for policy 0, policy_version 18010 (0.0034) [2024-07-02 12:59:06,095][36761] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44265.2). Total num frames: 295190528. Throughput: 0: 44410.8. Samples: 295336940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-07-02 12:59:06,096][36761] Avg episode reward: [(0, '0.460')] [2024-07-02 12:59:06,749][36999] Updated weights for policy 0, policy_version 18020 (0.0046) [2024-07-02 12:59:09,621][36999] Updated weights for policy 0, policy_version 18030 (0.0023) [2024-07-02 12:59:11,095][36761] Fps is (10 sec: 39321.8, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 295419904. Throughput: 0: 44510.3. Samples: 295600600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-07-02 12:59:11,096][36761] Avg episode reward: [(0, '0.460')] [2024-07-02 12:59:14,010][36999] Updated weights for policy 0, policy_version 18040 (0.0038) [2024-07-02 12:59:16,095][36761] Fps is (10 sec: 49151.7, 60 sec: 44509.8, 300 sec: 44320.8). Total num frames: 295682048. Throughput: 0: 44380.1. Samples: 295739700. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-07-02 12:59:16,096][36761] Avg episode reward: [(0, '0.506')] [2024-07-02 12:59:16,847][36999] Updated weights for policy 0, policy_version 18050 (0.0026) [2024-07-02 12:59:21,095][36761] Fps is (10 sec: 44236.2, 60 sec: 44509.9, 300 sec: 44264.6). Total num frames: 295862272. Throughput: 0: 44504.3. Samples: 296001900. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-07-02 12:59:21,096][36761] Avg episode reward: [(0, '0.456')] [2024-07-02 12:59:21,397][36999] Updated weights for policy 0, policy_version 18060 (0.0041) [2024-07-02 12:59:24,293][36999] Updated weights for policy 0, policy_version 18070 (0.0036) [2024-07-02 12:59:26,095][36761] Fps is (10 sec: 39322.2, 60 sec: 43690.8, 300 sec: 44209.1). Total num frames: 296075264. Throughput: 0: 44343.6. Samples: 296261220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 12:59:26,095][36761] Avg episode reward: [(0, '0.472')] [2024-07-02 12:59:28,783][36999] Updated weights for policy 0, policy_version 18080 (0.0024) [2024-07-02 12:59:31,095][36761] Fps is (10 sec: 47514.2, 60 sec: 44237.0, 300 sec: 44320.1). Total num frames: 296337408. Throughput: 0: 44232.8. Samples: 296391760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 12:59:31,095][36761] Avg episode reward: [(0, '0.479')] [2024-07-02 12:59:31,657][36999] Updated weights for policy 0, policy_version 18090 (0.0028) [2024-07-02 12:59:36,095][36761] Fps is (10 sec: 45874.3, 60 sec: 44782.8, 300 sec: 44320.1). Total num frames: 296534016. Throughput: 0: 44446.2. Samples: 296666720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 12:59:36,096][36761] Avg episode reward: [(0, '0.500')] [2024-07-02 12:59:36,123][36999] Updated weights for policy 0, policy_version 18100 (0.0031) [2024-07-02 12:59:39,453][36999] Updated weights for policy 0, policy_version 18110 (0.0025) [2024-07-02 12:59:41,098][36761] Fps is (10 sec: 40949.4, 60 sec: 44234.9, 300 sec: 44264.2). Total num frames: 296747008. Throughput: 0: 44267.3. Samples: 296924180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-07-02 12:59:41,098][36761] Avg episode reward: [(0, '0.448')] [2024-07-02 12:59:43,589][36999] Updated weights for policy 0, policy_version 18120 (0.0031) [2024-07-02 12:59:46,095][36761] Fps is (10 sec: 47513.5, 60 sec: 44238.9, 300 sec: 44320.1). Total num frames: 297009152. Throughput: 0: 43946.5. Samples: 297049560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-07-02 12:59:46,096][36761] Avg episode reward: [(0, '0.452')] [2024-07-02 12:59:46,993][36999] Updated weights for policy 0, policy_version 18130 (0.0038) [2024-07-02 12:59:50,624][36979] Signal inference workers to stop experience collection... (4350 times) [2024-07-02 12:59:50,625][36979] Signal inference workers to resume experience collection... (4350 times) [2024-07-02 12:59:50,655][36999] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-07-02 12:59:50,656][36999] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-07-02 12:59:50,931][36999] Updated weights for policy 0, policy_version 18140 (0.0041) [2024-07-02 12:59:51,095][36761] Fps is (10 sec: 45886.4, 60 sec: 44782.9, 300 sec: 44375.6). Total num frames: 297205760. Throughput: 0: 44205.2. Samples: 297326180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-07-02 12:59:51,096][36761] Avg episode reward: [(0, '0.460')] [2024-07-02 12:59:54,338][36999] Updated weights for policy 0, policy_version 18150 (0.0026) [2024-07-02 12:59:56,098][36761] Fps is (10 sec: 39312.7, 60 sec: 43962.0, 300 sec: 44264.2). Total num frames: 297402368. Throughput: 0: 44148.7. Samples: 297587400. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-07-02 12:59:56,107][36761] Avg episode reward: [(0, '0.450')] [2024-07-02 12:59:58,672][36999] Updated weights for policy 0, policy_version 18160 (0.0032) [2024-07-02 13:00:01,095][36761] Fps is (10 sec: 45875.6, 60 sec: 43963.7, 300 sec: 44320.1). Total num frames: 297664512. Throughput: 0: 43890.7. Samples: 297714780. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-07-02 13:00:01,096][36761] Avg episode reward: [(0, '0.450')] [2024-07-02 13:00:01,749][36999] Updated weights for policy 0, policy_version 18170 (0.0043) [2024-07-02 13:00:05,973][36999] Updated weights for policy 0, policy_version 18180 (0.0025) [2024-07-02 13:00:06,095][36761] Fps is (10 sec: 45885.8, 60 sec: 44509.8, 300 sec: 44264.6). Total num frames: 297861120. Throughput: 0: 44189.3. Samples: 297990420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:00:06,104][36761] Avg episode reward: [(0, '0.450')] [2024-07-02 13:00:09,081][36999] Updated weights for policy 0, policy_version 18190 (0.0033) [2024-07-02 13:00:11,095][36761] Fps is (10 sec: 39321.9, 60 sec: 43963.7, 300 sec: 44264.6). Total num frames: 298057728. Throughput: 0: 44280.9. Samples: 298253860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:00:11,095][36761] Avg episode reward: [(0, '0.450')] [2024-07-02 13:00:13,286][36999] Updated weights for policy 0, policy_version 18200 (0.0033) [2024-07-02 13:00:16,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43963.8, 300 sec: 44264.6). Total num frames: 298319872. Throughput: 0: 44247.5. Samples: 298382900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:00:16,096][36761] Avg episode reward: [(0, '0.450')] [2024-07-02 13:00:16,463][36999] Updated weights for policy 0, policy_version 18210 (0.0036) [2024-07-02 13:00:20,631][36999] Updated weights for policy 0, policy_version 18220 (0.0035) [2024-07-02 13:00:21,095][36761] Fps is (10 sec: 47513.5, 60 sec: 44510.0, 300 sec: 44264.6). Total num frames: 298532864. Throughput: 0: 44077.5. Samples: 298650200. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-07-02 13:00:21,096][36761] Avg episode reward: [(0, '0.565')] [2024-07-02 13:00:21,205][36979] Saving new best policy, reward=0.565! [2024-07-02 13:00:24,255][36999] Updated weights for policy 0, policy_version 18230 (0.0030) [2024-07-02 13:00:26,100][36761] Fps is (10 sec: 40941.1, 60 sec: 44233.3, 300 sec: 44319.4). Total num frames: 298729472. Throughput: 0: 44218.8. Samples: 298914120. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-07-02 13:00:26,101][36761] Avg episode reward: [(0, '0.475')] [2024-07-02 13:00:28,053][36999] Updated weights for policy 0, policy_version 18240 (0.0023) [2024-07-02 13:00:31,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.7, 300 sec: 44209.3). Total num frames: 298975232. Throughput: 0: 44120.2. Samples: 299034960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-07-02 13:00:31,096][36761] Avg episode reward: [(0, '0.469')] [2024-07-02 13:00:31,662][36999] Updated weights for policy 0, policy_version 18250 (0.0029) [2024-07-02 13:00:35,466][36999] Updated weights for policy 0, policy_version 18260 (0.0033) [2024-07-02 13:00:36,095][36761] Fps is (10 sec: 47536.0, 60 sec: 44510.0, 300 sec: 44264.6). Total num frames: 299204608. Throughput: 0: 44142.4. Samples: 299312580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-07-02 13:00:36,096][36761] Avg episode reward: [(0, '0.545')] [2024-07-02 13:00:36,105][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000018263_299220992.pth... [2024-07-02 13:00:36,154][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000017613_288571392.pth [2024-07-02 13:00:39,048][36999] Updated weights for policy 0, policy_version 18270 (0.0040) [2024-07-02 13:00:41,095][36761] Fps is (10 sec: 40959.6, 60 sec: 43965.6, 300 sec: 44264.6). Total num frames: 299384832. Throughput: 0: 44079.6. Samples: 299570880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-07-02 13:00:41,096][36761] Avg episode reward: [(0, '0.540')] [2024-07-02 13:00:43,096][36999] Updated weights for policy 0, policy_version 18280 (0.0030) [2024-07-02 13:00:46,100][36761] Fps is (10 sec: 42578.6, 60 sec: 43687.4, 300 sec: 44208.3). Total num frames: 299630592. Throughput: 0: 44024.0. Samples: 299696060. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-07-02 13:00:46,100][36761] Avg episode reward: [(0, '0.549')] [2024-07-02 13:00:46,539][36999] Updated weights for policy 0, policy_version 18290 (0.0034) [2024-07-02 13:00:50,528][36999] Updated weights for policy 0, policy_version 18300 (0.0042) [2024-07-02 13:00:51,095][36761] Fps is (10 sec: 47513.6, 60 sec: 44236.8, 300 sec: 44264.6). Total num frames: 299859968. Throughput: 0: 44147.1. Samples: 299977040. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-07-02 13:00:51,096][36761] Avg episode reward: [(0, '0.562')] [2024-07-02 13:00:53,796][36979] Signal inference workers to stop experience collection... (4400 times) [2024-07-02 13:00:53,796][36979] Signal inference workers to resume experience collection... (4400 times) [2024-07-02 13:00:53,809][36999] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-07-02 13:00:53,809][36999] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-07-02 13:00:53,944][36999] Updated weights for policy 0, policy_version 18310 (0.0037) [2024-07-02 13:00:56,095][36761] Fps is (10 sec: 42617.3, 60 sec: 44238.5, 300 sec: 44264.5). Total num frames: 300056576. Throughput: 0: 44078.5. Samples: 300237400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 13:00:56,096][36761] Avg episode reward: [(0, '0.565')] [2024-07-02 13:00:57,881][36999] Updated weights for policy 0, policy_version 18320 (0.0022) [2024-07-02 13:01:01,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 300285952. Throughput: 0: 43915.1. Samples: 300359080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 13:01:01,096][36761] Avg episode reward: [(0, '0.555')] [2024-07-02 13:01:01,390][36999] Updated weights for policy 0, policy_version 18330 (0.0036) [2024-07-02 13:01:05,303][36999] Updated weights for policy 0, policy_version 18340 (0.0034) [2024-07-02 13:01:06,095][36761] Fps is (10 sec: 47514.5, 60 sec: 44510.0, 300 sec: 44264.6). Total num frames: 300531712. Throughput: 0: 44234.2. Samples: 300640740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 13:01:06,095][36761] Avg episode reward: [(0, '0.549')] [2024-07-02 13:01:08,767][36999] Updated weights for policy 0, policy_version 18350 (0.0028) [2024-07-02 13:01:11,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43963.7, 300 sec: 44209.2). Total num frames: 300695552. Throughput: 0: 44149.9. Samples: 300900660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 13:01:11,096][36761] Avg episode reward: [(0, '0.507')] [2024-07-02 13:01:12,784][36999] Updated weights for policy 0, policy_version 18360 (0.0031) [2024-07-02 13:01:16,069][36999] Updated weights for policy 0, policy_version 18370 (0.0032) [2024-07-02 13:01:16,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44236.8, 300 sec: 44264.6). Total num frames: 300974080. Throughput: 0: 44198.2. Samples: 301023880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 13:01:16,096][36761] Avg episode reward: [(0, '0.507')] [2024-07-02 13:01:20,030][36999] Updated weights for policy 0, policy_version 18380 (0.0032) [2024-07-02 13:01:21,095][36761] Fps is (10 sec: 50790.1, 60 sec: 44509.8, 300 sec: 44320.1). Total num frames: 301203456. Throughput: 0: 44230.6. Samples: 301302960. Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-07-02 13:01:21,097][36761] Avg episode reward: [(0, '0.491')] [2024-07-02 13:01:23,385][36999] Updated weights for policy 0, policy_version 18390 (0.0040) [2024-07-02 13:01:26,095][36761] Fps is (10 sec: 39321.4, 60 sec: 43967.1, 300 sec: 44209.0). Total num frames: 301367296. Throughput: 0: 44387.5. Samples: 301568320. Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-07-02 13:01:26,096][36761] Avg episode reward: [(0, '0.521')] [2024-07-02 13:01:27,440][36999] Updated weights for policy 0, policy_version 18400 (0.0031) [2024-07-02 13:01:30,648][36999] Updated weights for policy 0, policy_version 18410 (0.0042) [2024-07-02 13:01:31,095][36761] Fps is (10 sec: 44236.6, 60 sec: 44509.8, 300 sec: 44320.1). Total num frames: 301645824. Throughput: 0: 44314.2. Samples: 301690000. Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-07-02 13:01:31,096][36761] Avg episode reward: [(0, '0.521')] [2024-07-02 13:01:34,836][36999] Updated weights for policy 0, policy_version 18420 (0.0039) [2024-07-02 13:01:36,095][36761] Fps is (10 sec: 50790.0, 60 sec: 44509.7, 300 sec: 44375.6). Total num frames: 301875200. Throughput: 0: 44246.1. Samples: 301968120. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-07-02 13:01:36,096][36761] Avg episode reward: [(0, '0.497')] [2024-07-02 13:01:37,977][36999] Updated weights for policy 0, policy_version 18430 (0.0041) [2024-07-02 13:01:41,095][36761] Fps is (10 sec: 37683.7, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 302022656. Throughput: 0: 44575.3. Samples: 302243280. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-07-02 13:01:41,096][36761] Avg episode reward: [(0, '0.521')] [2024-07-02 13:01:42,195][36999] Updated weights for policy 0, policy_version 18440 (0.0023) [2024-07-02 13:01:45,429][36999] Updated weights for policy 0, policy_version 18450 (0.0033) [2024-07-02 13:01:46,095][36761] Fps is (10 sec: 42598.7, 60 sec: 44513.2, 300 sec: 44320.1). Total num frames: 302301184. Throughput: 0: 44434.6. Samples: 302358640. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-07-02 13:01:46,096][36761] Avg episode reward: [(0, '0.559')] [2024-07-02 13:01:49,550][36999] Updated weights for policy 0, policy_version 18460 (0.0036) [2024-07-02 13:01:51,095][36761] Fps is (10 sec: 50790.1, 60 sec: 44509.9, 300 sec: 44320.1). Total num frames: 302530560. Throughput: 0: 44144.8. Samples: 302627260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:01:51,096][36761] Avg episode reward: [(0, '0.577')] [2024-07-02 13:01:51,129][36979] Saving new best policy, reward=0.577! [2024-07-02 13:01:51,521][36979] Signal inference workers to stop experience collection... (4450 times) [2024-07-02 13:01:51,536][36999] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-07-02 13:01:51,578][36979] Signal inference workers to resume experience collection... (4450 times) [2024-07-02 13:01:51,578][36999] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-07-02 13:01:52,791][36999] Updated weights for policy 0, policy_version 18470 (0.0032) [2024-07-02 13:01:56,095][36761] Fps is (10 sec: 39321.6, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 302694400. Throughput: 0: 44646.6. Samples: 302909760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:01:56,096][36761] Avg episode reward: [(0, '0.595')] [2024-07-02 13:01:56,175][36979] Saving new best policy, reward=0.595! [2024-07-02 13:01:56,855][36999] Updated weights for policy 0, policy_version 18480 (0.0024) [2024-07-02 13:02:00,340][36999] Updated weights for policy 0, policy_version 18490 (0.0028) [2024-07-02 13:02:01,095][36761] Fps is (10 sec: 42598.7, 60 sec: 44509.9, 300 sec: 44320.1). Total num frames: 302956544. Throughput: 0: 44505.8. Samples: 303026640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-07-02 13:02:01,096][36761] Avg episode reward: [(0, '0.570')] [2024-07-02 13:02:04,189][36999] Updated weights for policy 0, policy_version 18500 (0.0044) [2024-07-02 13:02:06,100][36761] Fps is (10 sec: 50768.4, 60 sec: 44506.6, 300 sec: 44319.5). Total num frames: 303202304. Throughput: 0: 44180.2. Samples: 303291260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-07-02 13:02:06,100][36761] Avg episode reward: [(0, '0.549')] [2024-07-02 13:02:07,703][36999] Updated weights for policy 0, policy_version 18510 (0.0032) [2024-07-02 13:02:11,095][36761] Fps is (10 sec: 42598.4, 60 sec: 44783.0, 300 sec: 44264.6). Total num frames: 303382528. Throughput: 0: 44442.3. Samples: 303568220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-07-02 13:02:11,096][36761] Avg episode reward: [(0, '0.555')] [2024-07-02 13:02:11,693][36999] Updated weights for policy 0, policy_version 18520 (0.0025) [2024-07-02 13:02:15,246][36999] Updated weights for policy 0, policy_version 18530 (0.0033) [2024-07-02 13:02:16,095][36761] Fps is (10 sec: 40978.0, 60 sec: 43963.7, 300 sec: 44320.1). Total num frames: 303611904. Throughput: 0: 44385.0. Samples: 303687320. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-07-02 13:02:16,096][36761] Avg episode reward: [(0, '0.539')] [2024-07-02 13:02:19,195][36999] Updated weights for policy 0, policy_version 18540 (0.0036) [2024-07-02 13:02:21,095][36761] Fps is (10 sec: 47512.7, 60 sec: 44236.7, 300 sec: 44264.6). Total num frames: 303857664. Throughput: 0: 44104.4. Samples: 303952820. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-07-02 13:02:21,096][36761] Avg episode reward: [(0, '0.541')] [2024-07-02 13:02:22,868][36999] Updated weights for policy 0, policy_version 18550 (0.0034) [2024-07-02 13:02:26,100][36761] Fps is (10 sec: 44216.5, 60 sec: 44779.5, 300 sec: 44319.4). Total num frames: 304054272. Throughput: 0: 44132.8. Samples: 304229460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-07-02 13:02:26,100][36761] Avg episode reward: [(0, '0.541')] [2024-07-02 13:02:26,386][36999] Updated weights for policy 0, policy_version 18560 (0.0037) [2024-07-02 13:02:30,272][36999] Updated weights for policy 0, policy_version 18570 (0.0035) [2024-07-02 13:02:31,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 304267264. Throughput: 0: 44441.3. Samples: 304358500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-07-02 13:02:31,096][36761] Avg episode reward: [(0, '0.541')] [2024-07-02 13:02:33,800][36999] Updated weights for policy 0, policy_version 18580 (0.0028) [2024-07-02 13:02:36,095][36761] Fps is (10 sec: 45896.3, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 304513024. Throughput: 0: 44220.5. Samples: 304617180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-07-02 13:02:36,096][36761] Avg episode reward: [(0, '0.541')] [2024-07-02 13:02:36,107][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000018586_304513024.pth... [2024-07-02 13:02:36,157][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000017936_293863424.pth [2024-07-02 13:02:37,684][36999] Updated weights for policy 0, policy_version 18590 (0.0033) [2024-07-02 13:02:41,095][36761] Fps is (10 sec: 45875.5, 60 sec: 45056.0, 300 sec: 44320.1). Total num frames: 304726016. Throughput: 0: 44111.6. Samples: 304894780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 13:02:41,096][36761] Avg episode reward: [(0, '0.575')] [2024-07-02 13:02:41,116][36999] Updated weights for policy 0, policy_version 18600 (0.0026) [2024-07-02 13:02:45,183][36999] Updated weights for policy 0, policy_version 18610 (0.0027) [2024-07-02 13:02:46,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43963.7, 300 sec: 44264.6). Total num frames: 304939008. Throughput: 0: 44474.6. Samples: 305028000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 13:02:46,096][36761] Avg episode reward: [(0, '0.621')] [2024-07-02 13:02:46,109][36979] Saving new best policy, reward=0.621! [2024-07-02 13:02:48,472][36999] Updated weights for policy 0, policy_version 18620 (0.0034) [2024-07-02 13:02:51,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 305168384. Throughput: 0: 44191.4. Samples: 305279680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 13:02:51,096][36761] Avg episode reward: [(0, '0.609')] [2024-07-02 13:02:52,459][36999] Updated weights for policy 0, policy_version 18630 (0.0025) [2024-07-02 13:02:55,828][36999] Updated weights for policy 0, policy_version 18640 (0.0039) [2024-07-02 13:02:56,095][36761] Fps is (10 sec: 47514.1, 60 sec: 45329.1, 300 sec: 44320.6). Total num frames: 305414144. Throughput: 0: 44111.6. Samples: 305553240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 13:02:56,096][36761] Avg episode reward: [(0, '0.609')] [2024-07-02 13:02:59,904][36999] Updated weights for policy 0, policy_version 18650 (0.0039) [2024-07-02 13:03:01,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 305577984. Throughput: 0: 44465.8. Samples: 305688280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 13:03:01,096][36761] Avg episode reward: [(0, '0.592')] [2024-07-02 13:03:02,891][36979] Signal inference workers to stop experience collection... (4500 times) [2024-07-02 13:03:02,896][36979] Signal inference workers to resume experience collection... (4500 times) [2024-07-02 13:03:02,915][36999] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-07-02 13:03:02,916][36999] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-07-02 13:03:03,194][36999] Updated weights for policy 0, policy_version 18660 (0.0049) [2024-07-02 13:03:06,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43693.9, 300 sec: 44209.0). Total num frames: 305823744. Throughput: 0: 44212.2. Samples: 305942360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 13:03:06,096][36761] Avg episode reward: [(0, '0.549')] [2024-07-02 13:03:07,274][36999] Updated weights for policy 0, policy_version 18670 (0.0031) [2024-07-02 13:03:10,661][36999] Updated weights for policy 0, policy_version 18680 (0.0028) [2024-07-02 13:03:11,095][36761] Fps is (10 sec: 47513.9, 60 sec: 44509.9, 300 sec: 44209.0). Total num frames: 306053120. Throughput: 0: 43928.6. Samples: 306206040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-07-02 13:03:11,096][36761] Avg episode reward: [(0, '0.524')] [2024-07-02 13:03:14,728][36999] Updated weights for policy 0, policy_version 18690 (0.0032) [2024-07-02 13:03:16,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43963.7, 300 sec: 44264.6). Total num frames: 306249728. Throughput: 0: 44115.6. Samples: 306343700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-07-02 13:03:16,096][36761] Avg episode reward: [(0, '0.568')] [2024-07-02 13:03:17,874][36999] Updated weights for policy 0, policy_version 18700 (0.0026) [2024-07-02 13:03:21,095][36761] Fps is (10 sec: 44236.2, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 306495488. Throughput: 0: 44166.6. Samples: 306604680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-07-02 13:03:21,096][36761] Avg episode reward: [(0, '0.597')] [2024-07-02 13:03:22,371][36999] Updated weights for policy 0, policy_version 18710 (0.0026) [2024-07-02 13:03:25,214][36999] Updated weights for policy 0, policy_version 18720 (0.0025) [2024-07-02 13:03:26,095][36761] Fps is (10 sec: 49152.1, 60 sec: 44786.3, 300 sec: 44264.6). Total num frames: 306741248. Throughput: 0: 44032.9. Samples: 306876260. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-07-02 13:03:26,096][36761] Avg episode reward: [(0, '0.614')] [2024-07-02 13:03:29,615][36999] Updated weights for policy 0, policy_version 18730 (0.0037) [2024-07-02 13:03:31,095][36761] Fps is (10 sec: 42598.8, 60 sec: 44236.9, 300 sec: 44320.1). Total num frames: 306921472. Throughput: 0: 44176.1. Samples: 307015920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-07-02 13:03:31,096][36761] Avg episode reward: [(0, '0.569')] [2024-07-02 13:03:32,492][36999] Updated weights for policy 0, policy_version 18740 (0.0040) [2024-07-02 13:03:36,100][36761] Fps is (10 sec: 42578.9, 60 sec: 44233.4, 300 sec: 44319.4). Total num frames: 307167232. Throughput: 0: 44408.3. Samples: 307278260. Policy #0 lag: (min: 2.0, avg: 11.9, max: 23.0) [2024-07-02 13:03:36,101][36761] Avg episode reward: [(0, '0.550')] [2024-07-02 13:03:36,839][36999] Updated weights for policy 0, policy_version 18750 (0.0035) [2024-07-02 13:03:39,922][36999] Updated weights for policy 0, policy_version 18760 (0.0029) [2024-07-02 13:03:41,095][36761] Fps is (10 sec: 49151.6, 60 sec: 44782.9, 300 sec: 44265.0). Total num frames: 307412992. Throughput: 0: 44164.8. Samples: 307540660. Policy #0 lag: (min: 2.0, avg: 11.9, max: 23.0) [2024-07-02 13:03:41,096][36761] Avg episode reward: [(0, '0.539')] [2024-07-02 13:03:44,550][36999] Updated weights for policy 0, policy_version 18770 (0.0030) [2024-07-02 13:03:46,096][36761] Fps is (10 sec: 39338.9, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 307560448. Throughput: 0: 44277.6. Samples: 307680780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-07-02 13:03:46,096][36761] Avg episode reward: [(0, '0.559')] [2024-07-02 13:03:47,494][36999] Updated weights for policy 0, policy_version 18780 (0.0024) [2024-07-02 13:03:51,095][36761] Fps is (10 sec: 40960.1, 60 sec: 44236.8, 300 sec: 44264.6). Total num frames: 307822592. Throughput: 0: 44397.3. Samples: 307940240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-07-02 13:03:51,096][36761] Avg episode reward: [(0, '0.581')] [2024-07-02 13:03:51,912][36999] Updated weights for policy 0, policy_version 18790 (0.0046) [2024-07-02 13:03:55,121][36999] Updated weights for policy 0, policy_version 18800 (0.0024) [2024-07-02 13:03:56,095][36761] Fps is (10 sec: 54068.3, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 308101120. Throughput: 0: 44342.2. Samples: 308201440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-07-02 13:03:56,096][36761] Avg episode reward: [(0, '0.612')] [2024-07-02 13:03:59,288][36999] Updated weights for policy 0, policy_version 18810 (0.0045) [2024-07-02 13:04:01,095][36761] Fps is (10 sec: 42598.5, 60 sec: 44509.9, 300 sec: 44264.6). Total num frames: 308248576. Throughput: 0: 44484.1. Samples: 308345480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 13:04:01,096][36761] Avg episode reward: [(0, '0.612')] [2024-07-02 13:04:02,440][36999] Updated weights for policy 0, policy_version 18820 (0.0041) [2024-07-02 13:04:06,095][36761] Fps is (10 sec: 39321.3, 60 sec: 44509.8, 300 sec: 44320.1). Total num frames: 308494336. Throughput: 0: 44608.4. Samples: 308612060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 13:04:06,096][36761] Avg episode reward: [(0, '0.629')] [2024-07-02 13:04:06,102][36979] Saving new best policy, reward=0.629! [2024-07-02 13:04:06,532][36999] Updated weights for policy 0, policy_version 18830 (0.0038) [2024-07-02 13:04:09,542][36979] Signal inference workers to stop experience collection... (4550 times) [2024-07-02 13:04:09,542][36979] Signal inference workers to resume experience collection... (4550 times) [2024-07-02 13:04:09,557][36999] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-07-02 13:04:09,557][36999] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-07-02 13:04:09,841][36999] Updated weights for policy 0, policy_version 18840 (0.0036) [2024-07-02 13:04:11,095][36761] Fps is (10 sec: 50790.5, 60 sec: 45056.0, 300 sec: 44320.1). Total num frames: 308756480. Throughput: 0: 44161.0. Samples: 308863500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 13:04:11,096][36761] Avg episode reward: [(0, '0.614')] [2024-07-02 13:04:13,848][36999] Updated weights for policy 0, policy_version 18850 (0.0035) [2024-07-02 13:04:16,095][36761] Fps is (10 sec: 42598.7, 60 sec: 44509.9, 300 sec: 44264.6). Total num frames: 308920320. Throughput: 0: 44443.5. Samples: 309015880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-07-02 13:04:16,096][36761] Avg episode reward: [(0, '0.601')] [2024-07-02 13:04:17,126][36999] Updated weights for policy 0, policy_version 18860 (0.0041) [2024-07-02 13:04:21,095][36761] Fps is (10 sec: 37682.7, 60 sec: 43963.7, 300 sec: 44264.5). Total num frames: 309133312. Throughput: 0: 44257.8. Samples: 309269660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-07-02 13:04:21,096][36761] Avg episode reward: [(0, '0.605')] [2024-07-02 13:04:21,359][36999] Updated weights for policy 0, policy_version 18870 (0.0027) [2024-07-02 13:04:24,539][36999] Updated weights for policy 0, policy_version 18880 (0.0033) [2024-07-02 13:04:26,095][36761] Fps is (10 sec: 47513.6, 60 sec: 44236.8, 300 sec: 44264.6). Total num frames: 309395456. Throughput: 0: 44256.0. Samples: 309532180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-07-02 13:04:26,096][36761] Avg episode reward: [(0, '0.566')] [2024-07-02 13:04:28,650][36999] Updated weights for policy 0, policy_version 18890 (0.0029) [2024-07-02 13:04:31,095][36761] Fps is (10 sec: 45875.7, 60 sec: 44509.8, 300 sec: 44264.6). Total num frames: 309592064. Throughput: 0: 44273.1. Samples: 309673060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-07-02 13:04:31,096][36761] Avg episode reward: [(0, '0.588')] [2024-07-02 13:04:32,028][36999] Updated weights for policy 0, policy_version 18900 (0.0030) [2024-07-02 13:04:35,967][36999] Updated weights for policy 0, policy_version 18910 (0.0033) [2024-07-02 13:04:36,095][36761] Fps is (10 sec: 42597.8, 60 sec: 44240.1, 300 sec: 44320.5). Total num frames: 309821440. Throughput: 0: 44369.2. Samples: 309936860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-07-02 13:04:36,096][36761] Avg episode reward: [(0, '0.597')] [2024-07-02 13:04:36,113][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000018910_309821440.pth... [2024-07-02 13:04:36,167][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000018263_299220992.pth [2024-07-02 13:04:39,518][36999] Updated weights for policy 0, policy_version 18920 (0.0030) [2024-07-02 13:04:41,095][36761] Fps is (10 sec: 45874.6, 60 sec: 43963.7, 300 sec: 44209.0). Total num frames: 310050816. Throughput: 0: 44386.5. Samples: 310198840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-07-02 13:04:41,096][36761] Avg episode reward: [(0, '0.595')] [2024-07-02 13:04:43,548][36999] Updated weights for policy 0, policy_version 18930 (0.0042) [2024-07-02 13:04:46,095][36761] Fps is (10 sec: 44237.6, 60 sec: 45056.2, 300 sec: 44264.6). Total num frames: 310263808. Throughput: 0: 44279.1. Samples: 310338040. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-07-02 13:04:46,096][36761] Avg episode reward: [(0, '0.602')] [2024-07-02 13:04:46,743][36999] Updated weights for policy 0, policy_version 18940 (0.0026) [2024-07-02 13:04:50,987][36999] Updated weights for policy 0, policy_version 18950 (0.0029) [2024-07-02 13:04:51,096][36761] Fps is (10 sec: 42596.7, 60 sec: 44236.4, 300 sec: 44320.4). Total num frames: 310476800. Throughput: 0: 44158.2. Samples: 310599200. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-07-02 13:04:51,096][36761] Avg episode reward: [(0, '0.602')] [2024-07-02 13:04:54,237][36999] Updated weights for policy 0, policy_version 18960 (0.0037) [2024-07-02 13:04:56,100][36761] Fps is (10 sec: 45854.1, 60 sec: 43687.3, 300 sec: 44263.9). Total num frames: 310722560. Throughput: 0: 44357.2. Samples: 310859780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-07-02 13:04:56,101][36761] Avg episode reward: [(0, '0.604')] [2024-07-02 13:04:58,453][36999] Updated weights for policy 0, policy_version 18970 (0.0024) [2024-07-02 13:05:01,095][36761] Fps is (10 sec: 45877.4, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 310935552. Throughput: 0: 44130.6. Samples: 311001760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-07-02 13:05:01,100][36761] Avg episode reward: [(0, '0.604')] [2024-07-02 13:05:01,601][36999] Updated weights for policy 0, policy_version 18980 (0.0030) [2024-07-02 13:05:05,775][36999] Updated weights for policy 0, policy_version 18990 (0.0035) [2024-07-02 13:05:06,095][36761] Fps is (10 sec: 40978.9, 60 sec: 43963.8, 300 sec: 44320.1). Total num frames: 311132160. Throughput: 0: 44392.6. Samples: 311267320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:05:06,095][36761] Avg episode reward: [(0, '0.603')] [2024-07-02 13:05:09,020][36999] Updated weights for policy 0, policy_version 19000 (0.0031) [2024-07-02 13:05:11,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 44209.0). Total num frames: 311361536. Throughput: 0: 44388.4. Samples: 311529660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:05:11,096][36761] Avg episode reward: [(0, '0.593')] [2024-07-02 13:05:13,193][36999] Updated weights for policy 0, policy_version 19010 (0.0026) [2024-07-02 13:05:16,095][36761] Fps is (10 sec: 45874.5, 60 sec: 44509.8, 300 sec: 44264.5). Total num frames: 311590912. Throughput: 0: 44179.9. Samples: 311661160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:05:16,096][36761] Avg episode reward: [(0, '0.620')] [2024-07-02 13:05:16,367][36999] Updated weights for policy 0, policy_version 19020 (0.0041) [2024-07-02 13:05:20,573][36999] Updated weights for policy 0, policy_version 19030 (0.0035) [2024-07-02 13:05:21,095][36761] Fps is (10 sec: 44237.3, 60 sec: 44510.0, 300 sec: 44320.8). Total num frames: 311803904. Throughput: 0: 44295.7. Samples: 311930160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 13:05:21,096][36761] Avg episode reward: [(0, '0.630')] [2024-07-02 13:05:23,679][36999] Updated weights for policy 0, policy_version 19040 (0.0026) [2024-07-02 13:05:26,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 312016896. Throughput: 0: 44437.4. Samples: 312198520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 13:05:26,096][36761] Avg episode reward: [(0, '0.629')] [2024-07-02 13:05:27,907][36999] Updated weights for policy 0, policy_version 19050 (0.0022) [2024-07-02 13:05:31,095][36761] Fps is (10 sec: 45875.1, 60 sec: 44509.9, 300 sec: 44264.6). Total num frames: 312262656. Throughput: 0: 44267.5. Samples: 312330080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 13:05:31,096][36761] Avg episode reward: [(0, '0.645')] [2024-07-02 13:05:31,108][36979] Saving new best policy, reward=0.645! [2024-07-02 13:05:31,118][36999] Updated weights for policy 0, policy_version 19060 (0.0025) [2024-07-02 13:05:34,998][36979] Signal inference workers to stop experience collection... (4600 times) [2024-07-02 13:05:34,999][36979] Signal inference workers to resume experience collection... (4600 times) [2024-07-02 13:05:35,011][36999] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-07-02 13:05:35,011][36999] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-07-02 13:05:35,318][36999] Updated weights for policy 0, policy_version 19070 (0.0021) [2024-07-02 13:05:36,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43963.8, 300 sec: 44320.1). Total num frames: 312459264. Throughput: 0: 44368.5. Samples: 312595760. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-07-02 13:05:36,096][36761] Avg episode reward: [(0, '0.651')] [2024-07-02 13:05:36,171][36979] Saving new best policy, reward=0.651! [2024-07-02 13:05:38,679][36999] Updated weights for policy 0, policy_version 19080 (0.0032) [2024-07-02 13:05:41,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43690.8, 300 sec: 44209.7). Total num frames: 312672256. Throughput: 0: 44445.0. Samples: 312859600. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-07-02 13:05:41,096][36761] Avg episode reward: [(0, '0.644')] [2024-07-02 13:05:42,634][36999] Updated weights for policy 0, policy_version 19090 (0.0034) [2024-07-02 13:05:46,007][36999] Updated weights for policy 0, policy_version 19100 (0.0036) [2024-07-02 13:05:46,095][36761] Fps is (10 sec: 47512.9, 60 sec: 44509.7, 300 sec: 44320.1). Total num frames: 312934400. Throughput: 0: 44146.1. Samples: 312988340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 13:05:46,096][36761] Avg episode reward: [(0, '0.645')] [2024-07-02 13:05:50,072][36999] Updated weights for policy 0, policy_version 19110 (0.0031) [2024-07-02 13:05:51,100][36761] Fps is (10 sec: 47491.5, 60 sec: 44506.8, 300 sec: 44375.0). Total num frames: 313147392. Throughput: 0: 44324.3. Samples: 313262120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 13:05:51,100][36761] Avg episode reward: [(0, '0.615')] [2024-07-02 13:05:53,286][36999] Updated weights for policy 0, policy_version 19120 (0.0030) [2024-07-02 13:05:56,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43693.9, 300 sec: 44264.6). Total num frames: 313344000. Throughput: 0: 44380.9. Samples: 313526800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 13:05:56,096][36761] Avg episode reward: [(0, '0.583')] [2024-07-02 13:05:57,554][36999] Updated weights for policy 0, policy_version 19130 (0.0045) [2024-07-02 13:06:00,604][36999] Updated weights for policy 0, policy_version 19140 (0.0024) [2024-07-02 13:06:01,100][36761] Fps is (10 sec: 45875.2, 60 sec: 44506.5, 300 sec: 44319.4). Total num frames: 313606144. Throughput: 0: 44278.7. Samples: 313653900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-07-02 13:06:01,101][36761] Avg episode reward: [(0, '0.616')] [2024-07-02 13:06:05,000][36999] Updated weights for policy 0, policy_version 19150 (0.0031) [2024-07-02 13:06:06,095][36761] Fps is (10 sec: 45875.0, 60 sec: 44509.8, 300 sec: 44431.2). Total num frames: 313802752. Throughput: 0: 44376.3. Samples: 313927100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-07-02 13:06:06,096][36761] Avg episode reward: [(0, '0.654')] [2024-07-02 13:06:06,280][36979] Saving new best policy, reward=0.654! [2024-07-02 13:06:07,992][36999] Updated weights for policy 0, policy_version 19160 (0.0036) [2024-07-02 13:06:11,095][36761] Fps is (10 sec: 40978.8, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 314015744. Throughput: 0: 44327.2. Samples: 314193240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-07-02 13:06:11,096][36761] Avg episode reward: [(0, '0.609')] [2024-07-02 13:06:12,501][36999] Updated weights for policy 0, policy_version 19170 (0.0024) [2024-07-02 13:06:15,471][36999] Updated weights for policy 0, policy_version 19180 (0.0034) [2024-07-02 13:06:16,095][36761] Fps is (10 sec: 45875.0, 60 sec: 44509.8, 300 sec: 44264.6). Total num frames: 314261504. Throughput: 0: 44171.8. Samples: 314317820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 13:06:16,100][36761] Avg episode reward: [(0, '0.591')] [2024-07-02 13:06:19,929][36999] Updated weights for policy 0, policy_version 19190 (0.0028) [2024-07-02 13:06:21,095][36761] Fps is (10 sec: 45875.4, 60 sec: 44509.9, 300 sec: 44431.2). Total num frames: 314474496. Throughput: 0: 44325.4. Samples: 314590400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 13:06:21,096][36761] Avg episode reward: [(0, '0.591')] [2024-07-02 13:06:22,779][36999] Updated weights for policy 0, policy_version 19200 (0.0034) [2024-07-02 13:06:26,095][36761] Fps is (10 sec: 40960.3, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 314671104. Throughput: 0: 44336.8. Samples: 314854760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:06:26,096][36761] Avg episode reward: [(0, '0.613')] [2024-07-02 13:06:27,271][36999] Updated weights for policy 0, policy_version 19210 (0.0029) [2024-07-02 13:06:30,070][36999] Updated weights for policy 0, policy_version 19220 (0.0029) [2024-07-02 13:06:31,095][36761] Fps is (10 sec: 45875.4, 60 sec: 44509.9, 300 sec: 44264.6). Total num frames: 314933248. Throughput: 0: 44387.3. Samples: 314985760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:06:31,096][36761] Avg episode reward: [(0, '0.609')] [2024-07-02 13:06:34,757][36999] Updated weights for policy 0, policy_version 19230 (0.0031) [2024-07-02 13:06:36,095][36761] Fps is (10 sec: 45875.5, 60 sec: 44509.9, 300 sec: 44431.2). Total num frames: 315129856. Throughput: 0: 44327.2. Samples: 315256640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:06:36,096][36761] Avg episode reward: [(0, '0.665')] [2024-07-02 13:06:36,161][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000019235_315146240.pth... [2024-07-02 13:06:36,213][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000018586_304513024.pth [2024-07-02 13:06:36,217][36979] Saving new best policy, reward=0.665! [2024-07-02 13:06:37,486][36999] Updated weights for policy 0, policy_version 19240 (0.0040) [2024-07-02 13:06:41,097][36761] Fps is (10 sec: 39313.6, 60 sec: 44235.3, 300 sec: 44153.2). Total num frames: 315326464. Throughput: 0: 44304.8. Samples: 315520600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 13:06:41,098][36761] Avg episode reward: [(0, '0.690')] [2024-07-02 13:06:41,153][36979] Saving new best policy, reward=0.690! [2024-07-02 13:06:42,078][36999] Updated weights for policy 0, policy_version 19250 (0.0034) [2024-07-02 13:06:44,818][36999] Updated weights for policy 0, policy_version 19260 (0.0031) [2024-07-02 13:06:46,095][36761] Fps is (10 sec: 47513.0, 60 sec: 44509.9, 300 sec: 44320.1). Total num frames: 315604992. Throughput: 0: 44298.6. Samples: 315647140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 13:06:46,096][36761] Avg episode reward: [(0, '0.696')] [2024-07-02 13:06:46,115][36979] Saving new best policy, reward=0.696! [2024-07-02 13:06:49,379][36999] Updated weights for policy 0, policy_version 19270 (0.0036) [2024-07-02 13:06:51,095][36761] Fps is (10 sec: 47522.6, 60 sec: 44240.2, 300 sec: 44431.2). Total num frames: 315801600. Throughput: 0: 44185.4. Samples: 315915440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 13:06:51,096][36761] Avg episode reward: [(0, '0.686')] [2024-07-02 13:06:52,192][36999] Updated weights for policy 0, policy_version 19280 (0.0022) [2024-07-02 13:06:56,095][36761] Fps is (10 sec: 37683.7, 60 sec: 43963.8, 300 sec: 44153.5). Total num frames: 315981824. Throughput: 0: 44238.2. Samples: 316183960. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-07-02 13:06:56,096][36761] Avg episode reward: [(0, '0.633')] [2024-07-02 13:06:57,054][36999] Updated weights for policy 0, policy_version 19290 (0.0039) [2024-07-02 13:06:59,241][36979] Signal inference workers to stop experience collection... (4650 times) [2024-07-02 13:06:59,241][36979] Signal inference workers to resume experience collection... (4650 times) [2024-07-02 13:06:59,283][36999] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-07-02 13:06:59,283][36999] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-07-02 13:06:59,621][36999] Updated weights for policy 0, policy_version 19300 (0.0026) [2024-07-02 13:07:01,095][36761] Fps is (10 sec: 45874.9, 60 sec: 44240.1, 300 sec: 44265.2). Total num frames: 316260352. Throughput: 0: 44341.4. Samples: 316313180. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-07-02 13:07:01,096][36761] Avg episode reward: [(0, '0.620')] [2024-07-02 13:07:04,282][36999] Updated weights for policy 0, policy_version 19310 (0.0032) [2024-07-02 13:07:06,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44236.9, 300 sec: 44320.1). Total num frames: 316456960. Throughput: 0: 44335.5. Samples: 316585500. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-07-02 13:07:06,096][36761] Avg episode reward: [(0, '0.663')] [2024-07-02 13:07:07,171][36999] Updated weights for policy 0, policy_version 19320 (0.0031) [2024-07-02 13:07:11,095][36761] Fps is (10 sec: 40960.1, 60 sec: 44236.7, 300 sec: 44264.6). Total num frames: 316669952. Throughput: 0: 44542.2. Samples: 316859160. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-07-02 13:07:11,096][36761] Avg episode reward: [(0, '0.680')] [2024-07-02 13:07:11,588][36999] Updated weights for policy 0, policy_version 19330 (0.0025) [2024-07-02 13:07:14,580][36999] Updated weights for policy 0, policy_version 19340 (0.0032) [2024-07-02 13:07:16,095][36761] Fps is (10 sec: 45874.8, 60 sec: 44236.9, 300 sec: 44264.6). Total num frames: 316915712. Throughput: 0: 44284.7. Samples: 316978580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-07-02 13:07:16,098][36761] Avg episode reward: [(0, '0.680')] [2024-07-02 13:07:18,981][36999] Updated weights for policy 0, policy_version 19350 (0.0019) [2024-07-02 13:07:21,095][36761] Fps is (10 sec: 47514.0, 60 sec: 44509.8, 300 sec: 44376.3). Total num frames: 317145088. Throughput: 0: 44195.5. Samples: 317245440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 13:07:21,096][36761] Avg episode reward: [(0, '0.661')] [2024-07-02 13:07:22,048][36999] Updated weights for policy 0, policy_version 19360 (0.0039) [2024-07-02 13:07:26,095][36761] Fps is (10 sec: 40960.2, 60 sec: 44236.8, 300 sec: 44264.6). Total num frames: 317325312. Throughput: 0: 44477.9. Samples: 317522020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 13:07:26,096][36761] Avg episode reward: [(0, '0.661')] [2024-07-02 13:07:26,302][36999] Updated weights for policy 0, policy_version 19370 (0.0030) [2024-07-02 13:07:29,289][36999] Updated weights for policy 0, policy_version 19380 (0.0031) [2024-07-02 13:07:31,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43963.7, 300 sec: 44264.6). Total num frames: 317571072. Throughput: 0: 44407.8. Samples: 317645480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 13:07:31,096][36761] Avg episode reward: [(0, '0.687')] [2024-07-02 13:07:33,588][36999] Updated weights for policy 0, policy_version 19390 (0.0024) [2024-07-02 13:07:36,100][36761] Fps is (10 sec: 49129.5, 60 sec: 44779.5, 300 sec: 44375.0). Total num frames: 317816832. Throughput: 0: 44405.7. Samples: 317913900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-07-02 13:07:36,100][36761] Avg episode reward: [(0, '0.638')] [2024-07-02 13:07:36,565][36999] Updated weights for policy 0, policy_version 19400 (0.0042) [2024-07-02 13:07:41,095][36761] Fps is (10 sec: 42597.8, 60 sec: 44511.3, 300 sec: 44264.6). Total num frames: 317997056. Throughput: 0: 44413.3. Samples: 318182560. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-07-02 13:07:41,096][36761] Avg episode reward: [(0, '0.629')] [2024-07-02 13:07:41,258][36999] Updated weights for policy 0, policy_version 19410 (0.0036) [2024-07-02 13:07:43,916][36999] Updated weights for policy 0, policy_version 19420 (0.0028) [2024-07-02 13:07:46,095][36761] Fps is (10 sec: 42617.4, 60 sec: 43963.7, 300 sec: 44320.1). Total num frames: 318242816. Throughput: 0: 44309.8. Samples: 318307120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-07-02 13:07:46,100][36761] Avg episode reward: [(0, '0.657')] [2024-07-02 13:07:48,483][36999] Updated weights for policy 0, policy_version 19430 (0.0030) [2024-07-02 13:07:51,095][36761] Fps is (10 sec: 47514.2, 60 sec: 44509.9, 300 sec: 44264.6). Total num frames: 318472192. Throughput: 0: 44320.0. Samples: 318579900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 13:07:51,095][36761] Avg episode reward: [(0, '0.676')] [2024-07-02 13:07:51,426][36999] Updated weights for policy 0, policy_version 19440 (0.0034) [2024-07-02 13:07:55,774][36999] Updated weights for policy 0, policy_version 19450 (0.0030) [2024-07-02 13:07:56,095][36761] Fps is (10 sec: 44237.5, 60 sec: 45056.0, 300 sec: 44431.2). Total num frames: 318685184. Throughput: 0: 44185.9. Samples: 318847520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 13:07:56,096][36761] Avg episode reward: [(0, '0.688')] [2024-07-02 13:07:59,058][36999] Updated weights for policy 0, policy_version 19460 (0.0035) [2024-07-02 13:08:01,095][36761] Fps is (10 sec: 40959.8, 60 sec: 43690.8, 300 sec: 44264.6). Total num frames: 318881792. Throughput: 0: 44269.4. Samples: 318970700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:08:01,096][36761] Avg episode reward: [(0, '0.689')] [2024-07-02 13:08:03,051][36999] Updated weights for policy 0, policy_version 19470 (0.0027) [2024-07-02 13:08:06,095][36761] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 44375.6). Total num frames: 319143936. Throughput: 0: 44344.0. Samples: 319240920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:08:06,096][36761] Avg episode reward: [(0, '0.708')] [2024-07-02 13:08:06,108][36979] Saving new best policy, reward=0.708! [2024-07-02 13:08:06,366][36999] Updated weights for policy 0, policy_version 19480 (0.0031) [2024-07-02 13:08:10,419][36999] Updated weights for policy 0, policy_version 19490 (0.0034) [2024-07-02 13:08:11,095][36761] Fps is (10 sec: 49151.5, 60 sec: 45056.0, 300 sec: 44486.7). Total num frames: 319373312. Throughput: 0: 44061.3. Samples: 319504780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:08:11,096][36761] Avg episode reward: [(0, '0.684')] [2024-07-02 13:08:13,737][36999] Updated weights for policy 0, policy_version 19500 (0.0027) [2024-07-02 13:08:16,096][36761] Fps is (10 sec: 40955.6, 60 sec: 43963.0, 300 sec: 44264.4). Total num frames: 319553536. Throughput: 0: 44254.0. Samples: 319636960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 13:08:16,097][36761] Avg episode reward: [(0, '0.694')] [2024-07-02 13:08:17,898][36999] Updated weights for policy 0, policy_version 19510 (0.0023) [2024-07-02 13:08:20,252][36979] Signal inference workers to stop experience collection... (4700 times) [2024-07-02 13:08:20,252][36979] Signal inference workers to resume experience collection... (4700 times) [2024-07-02 13:08:20,273][36999] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-07-02 13:08:20,273][36999] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-07-02 13:08:21,095][36761] Fps is (10 sec: 42598.8, 60 sec: 44236.8, 300 sec: 44264.6). Total num frames: 319799296. Throughput: 0: 44301.9. Samples: 319907280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 13:08:21,096][36761] Avg episode reward: [(0, '0.686')] [2024-07-02 13:08:21,141][36999] Updated weights for policy 0, policy_version 19520 (0.0029) [2024-07-02 13:08:25,158][36999] Updated weights for policy 0, policy_version 19530 (0.0023) [2024-07-02 13:08:26,096][36761] Fps is (10 sec: 47517.4, 60 sec: 45055.8, 300 sec: 44431.1). Total num frames: 320028672. Throughput: 0: 44181.6. Samples: 320170740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 13:08:26,096][36761] Avg episode reward: [(0, '0.708')] [2024-07-02 13:08:28,416][36999] Updated weights for policy 0, policy_version 19540 (0.0031) [2024-07-02 13:08:31,095][36761] Fps is (10 sec: 42598.6, 60 sec: 44236.8, 300 sec: 44265.3). Total num frames: 320225280. Throughput: 0: 44419.3. Samples: 320305980. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-07-02 13:08:31,096][36761] Avg episode reward: [(0, '0.673')] [2024-07-02 13:08:32,527][36999] Updated weights for policy 0, policy_version 19550 (0.0028) [2024-07-02 13:08:35,816][36999] Updated weights for policy 0, policy_version 19560 (0.0033) [2024-07-02 13:08:36,095][36761] Fps is (10 sec: 45876.4, 60 sec: 44513.3, 300 sec: 44320.1). Total num frames: 320487424. Throughput: 0: 44320.4. Samples: 320574320. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-07-02 13:08:36,096][36761] Avg episode reward: [(0, '0.660')] [2024-07-02 13:08:36,109][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000019561_320487424.pth... [2024-07-02 13:08:36,165][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000018910_309821440.pth [2024-07-02 13:08:39,880][36999] Updated weights for policy 0, policy_version 19570 (0.0031) [2024-07-02 13:08:41,095][36761] Fps is (10 sec: 45874.7, 60 sec: 44782.9, 300 sec: 44486.7). Total num frames: 320684032. Throughput: 0: 44185.3. Samples: 320835860. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-07-02 13:08:41,096][36761] Avg episode reward: [(0, '0.675')] [2024-07-02 13:08:43,209][36999] Updated weights for policy 0, policy_version 19580 (0.0029) [2024-07-02 13:08:46,095][36761] Fps is (10 sec: 39321.3, 60 sec: 43963.8, 300 sec: 44264.6). Total num frames: 320880640. Throughput: 0: 44443.9. Samples: 320970680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:08:46,096][36761] Avg episode reward: [(0, '0.675')] [2024-07-02 13:08:47,230][36999] Updated weights for policy 0, policy_version 19590 (0.0027) [2024-07-02 13:08:50,570][36999] Updated weights for policy 0, policy_version 19600 (0.0027) [2024-07-02 13:08:51,095][36761] Fps is (10 sec: 45875.4, 60 sec: 44509.8, 300 sec: 44209.0). Total num frames: 321142784. Throughput: 0: 44337.8. Samples: 321236120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:08:51,096][36761] Avg episode reward: [(0, '0.674')] [2024-07-02 13:08:54,586][36999] Updated weights for policy 0, policy_version 19610 (0.0031) [2024-07-02 13:08:56,100][36761] Fps is (10 sec: 45854.7, 60 sec: 44233.4, 300 sec: 44375.0). Total num frames: 321339392. Throughput: 0: 44382.3. Samples: 321502180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 13:08:56,100][36761] Avg episode reward: [(0, '0.642')] [2024-07-02 13:08:57,894][36999] Updated weights for policy 0, policy_version 19620 (0.0028) [2024-07-02 13:09:01,095][36761] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 321536000. Throughput: 0: 44342.8. Samples: 321632340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 13:09:01,096][36761] Avg episode reward: [(0, '0.657')] [2024-07-02 13:09:02,102][36999] Updated weights for policy 0, policy_version 19630 (0.0032) [2024-07-02 13:09:05,300][36999] Updated weights for policy 0, policy_version 19640 (0.0025) [2024-07-02 13:09:06,099][36761] Fps is (10 sec: 45879.9, 60 sec: 44234.2, 300 sec: 44208.5). Total num frames: 321798144. Throughput: 0: 44140.5. Samples: 321893760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 13:09:06,099][36761] Avg episode reward: [(0, '0.715')] [2024-07-02 13:09:06,111][36979] Saving new best policy, reward=0.715! [2024-07-02 13:09:09,546][36999] Updated weights for policy 0, policy_version 19650 (0.0021) [2024-07-02 13:09:11,095][36761] Fps is (10 sec: 45875.6, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 321994752. Throughput: 0: 44359.5. Samples: 322166900. Policy #0 lag: (min: 1.0, avg: 8.9, max: 23.0) [2024-07-02 13:09:11,095][36761] Avg episode reward: [(0, '0.715')] [2024-07-02 13:09:12,815][36999] Updated weights for policy 0, policy_version 19660 (0.0037) [2024-07-02 13:09:16,095][36761] Fps is (10 sec: 40974.2, 60 sec: 44237.6, 300 sec: 44320.1). Total num frames: 322207744. Throughput: 0: 44209.2. Samples: 322295400. Policy #0 lag: (min: 1.0, avg: 8.9, max: 23.0) [2024-07-02 13:09:16,096][36761] Avg episode reward: [(0, '0.719')] [2024-07-02 13:09:16,101][36979] Saving new best policy, reward=0.719! [2024-07-02 13:09:16,940][36999] Updated weights for policy 0, policy_version 19670 (0.0033) [2024-07-02 13:09:20,143][36999] Updated weights for policy 0, policy_version 19680 (0.0022) [2024-07-02 13:09:21,096][36761] Fps is (10 sec: 47510.5, 60 sec: 44509.5, 300 sec: 44320.0). Total num frames: 322469888. Throughput: 0: 44148.8. Samples: 322561040. Policy #0 lag: (min: 1.0, avg: 8.9, max: 23.0) [2024-07-02 13:09:21,096][36761] Avg episode reward: [(0, '0.727')] [2024-07-02 13:09:21,100][36979] Saving new best policy, reward=0.727! [2024-07-02 13:09:24,437][36999] Updated weights for policy 0, policy_version 19690 (0.0040) [2024-07-02 13:09:26,095][36761] Fps is (10 sec: 47513.3, 60 sec: 44236.9, 300 sec: 44375.6). Total num frames: 322682880. Throughput: 0: 44325.7. Samples: 322830520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-07-02 13:09:26,096][36761] Avg episode reward: [(0, '0.729')] [2024-07-02 13:09:26,115][36979] Saving new best policy, reward=0.729! [2024-07-02 13:09:27,570][36999] Updated weights for policy 0, policy_version 19700 (0.0033) [2024-07-02 13:09:31,095][36761] Fps is (10 sec: 39323.6, 60 sec: 43963.7, 300 sec: 44209.0). Total num frames: 322863104. Throughput: 0: 44288.5. Samples: 322963660. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-07-02 13:09:31,096][36761] Avg episode reward: [(0, '0.696')] [2024-07-02 13:09:31,853][36999] Updated weights for policy 0, policy_version 19710 (0.0040) [2024-07-02 13:09:34,896][36999] Updated weights for policy 0, policy_version 19720 (0.0024) [2024-07-02 13:09:36,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43963.6, 300 sec: 44320.1). Total num frames: 323125248. Throughput: 0: 44231.0. Samples: 323226520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-07-02 13:09:36,096][36761] Avg episode reward: [(0, '0.695')] [2024-07-02 13:09:39,179][36999] Updated weights for policy 0, policy_version 19730 (0.0031) [2024-07-02 13:09:40,385][36979] Signal inference workers to stop experience collection... (4750 times) [2024-07-02 13:09:40,385][36979] Signal inference workers to resume experience collection... (4750 times) [2024-07-02 13:09:40,405][36999] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-07-02 13:09:40,433][36999] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-07-02 13:09:41,095][36761] Fps is (10 sec: 47514.0, 60 sec: 44236.9, 300 sec: 44320.1). Total num frames: 323338240. Throughput: 0: 44249.9. Samples: 323493220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 13:09:41,095][36761] Avg episode reward: [(0, '0.695')] [2024-07-02 13:09:42,218][36999] Updated weights for policy 0, policy_version 19740 (0.0029) [2024-07-02 13:09:46,095][36761] Fps is (10 sec: 40960.6, 60 sec: 44236.8, 300 sec: 44264.6). Total num frames: 323534848. Throughput: 0: 44460.0. Samples: 323633040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 13:09:46,096][36761] Avg episode reward: [(0, '0.695')] [2024-07-02 13:09:46,534][36999] Updated weights for policy 0, policy_version 19750 (0.0041) [2024-07-02 13:09:49,543][36999] Updated weights for policy 0, policy_version 19760 (0.0037) [2024-07-02 13:09:51,100][36761] Fps is (10 sec: 44216.4, 60 sec: 43960.4, 300 sec: 44264.6). Total num frames: 323780608. Throughput: 0: 44420.3. Samples: 323892720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-07-02 13:09:51,101][36761] Avg episode reward: [(0, '0.698')] [2024-07-02 13:09:53,823][36999] Updated weights for policy 0, policy_version 19770 (0.0025) [2024-07-02 13:09:56,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44513.2, 300 sec: 44320.1). Total num frames: 324009984. Throughput: 0: 44351.5. Samples: 324162720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-07-02 13:09:56,096][36761] Avg episode reward: [(0, '0.709')] [2024-07-02 13:09:56,861][36999] Updated weights for policy 0, policy_version 19780 (0.0029) [2024-07-02 13:10:01,095][36761] Fps is (10 sec: 42617.9, 60 sec: 44509.9, 300 sec: 44320.1). Total num frames: 324206592. Throughput: 0: 44475.2. Samples: 324296780. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-07-02 13:10:01,096][36761] Avg episode reward: [(0, '0.713')] [2024-07-02 13:10:01,365][36999] Updated weights for policy 0, policy_version 19790 (0.0032) [2024-07-02 13:10:04,396][36999] Updated weights for policy 0, policy_version 19800 (0.0026) [2024-07-02 13:10:06,095][36761] Fps is (10 sec: 44236.9, 60 sec: 44239.4, 300 sec: 44375.7). Total num frames: 324452352. Throughput: 0: 44221.0. Samples: 324550960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-07-02 13:10:06,096][36761] Avg episode reward: [(0, '0.693')] [2024-07-02 13:10:08,737][36999] Updated weights for policy 0, policy_version 19810 (0.0034) [2024-07-02 13:10:11,096][36761] Fps is (10 sec: 45874.2, 60 sec: 44509.7, 300 sec: 44320.1). Total num frames: 324665344. Throughput: 0: 44283.0. Samples: 324823260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-07-02 13:10:11,096][36761] Avg episode reward: [(0, '0.705')] [2024-07-02 13:10:11,945][36999] Updated weights for policy 0, policy_version 19820 (0.0032) [2024-07-02 13:10:16,095][36761] Fps is (10 sec: 42598.5, 60 sec: 44509.9, 300 sec: 44320.1). Total num frames: 324878336. Throughput: 0: 44255.6. Samples: 324955160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-07-02 13:10:16,096][36761] Avg episode reward: [(0, '0.716')] [2024-07-02 13:10:16,127][36999] Updated weights for policy 0, policy_version 19830 (0.0028) [2024-07-02 13:10:19,357][36999] Updated weights for policy 0, policy_version 19840 (0.0048) [2024-07-02 13:10:21,100][36761] Fps is (10 sec: 44217.6, 60 sec: 43960.8, 300 sec: 44375.0). Total num frames: 325107712. Throughput: 0: 44313.0. Samples: 325220800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 13:10:21,100][36761] Avg episode reward: [(0, '0.701')] [2024-07-02 13:10:23,483][36999] Updated weights for policy 0, policy_version 19850 (0.0035) [2024-07-02 13:10:26,095][36761] Fps is (10 sec: 45875.0, 60 sec: 44236.9, 300 sec: 44320.1). Total num frames: 325337088. Throughput: 0: 44415.9. Samples: 325491940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 13:10:26,096][36761] Avg episode reward: [(0, '0.700')] [2024-07-02 13:10:26,781][36999] Updated weights for policy 0, policy_version 19860 (0.0027) [2024-07-02 13:10:31,062][36999] Updated weights for policy 0, policy_version 19870 (0.0039) [2024-07-02 13:10:31,100][36761] Fps is (10 sec: 44236.6, 60 sec: 44779.5, 300 sec: 44375.0). Total num frames: 325550080. Throughput: 0: 44280.8. Samples: 325625880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 13:10:31,101][36761] Avg episode reward: [(0, '0.710')] [2024-07-02 13:10:34,244][36999] Updated weights for policy 0, policy_version 19880 (0.0033) [2024-07-02 13:10:36,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43963.8, 300 sec: 44375.6). Total num frames: 325763072. Throughput: 0: 44209.3. Samples: 325881940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:10:36,096][36761] Avg episode reward: [(0, '0.673')] [2024-07-02 13:10:36,110][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000019883_325763072.pth... [2024-07-02 13:10:36,168][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000019235_315146240.pth [2024-07-02 13:10:38,468][36999] Updated weights for policy 0, policy_version 19890 (0.0032) [2024-07-02 13:10:41,095][36761] Fps is (10 sec: 44257.3, 60 sec: 44236.8, 300 sec: 44264.6). Total num frames: 325992448. Throughput: 0: 44158.3. Samples: 326149840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:10:41,095][36761] Avg episode reward: [(0, '0.661')] [2024-07-02 13:10:41,648][36999] Updated weights for policy 0, policy_version 19900 (0.0034) [2024-07-02 13:10:45,821][36999] Updated weights for policy 0, policy_version 19910 (0.0033) [2024-07-02 13:10:46,097][36761] Fps is (10 sec: 44228.4, 60 sec: 44508.4, 300 sec: 44265.0). Total num frames: 326205440. Throughput: 0: 44208.7. Samples: 326286260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:10:46,098][36761] Avg episode reward: [(0, '0.709')] [2024-07-02 13:10:49,098][36999] Updated weights for policy 0, policy_version 19920 (0.0042) [2024-07-02 13:10:51,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43967.0, 300 sec: 44320.1). Total num frames: 326418432. Throughput: 0: 44255.1. Samples: 326542440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 13:10:51,096][36761] Avg episode reward: [(0, '0.749')] [2024-07-02 13:10:51,096][36979] Saving new best policy, reward=0.749! [2024-07-02 13:10:53,269][36999] Updated weights for policy 0, policy_version 19930 (0.0036) [2024-07-02 13:10:56,095][36761] Fps is (10 sec: 45883.8, 60 sec: 44236.7, 300 sec: 44265.2). Total num frames: 326664192. Throughput: 0: 44171.2. Samples: 326810960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 13:10:56,096][36761] Avg episode reward: [(0, '0.735')] [2024-07-02 13:10:56,514][36999] Updated weights for policy 0, policy_version 19940 (0.0025) [2024-07-02 13:11:00,593][36999] Updated weights for policy 0, policy_version 19950 (0.0024) [2024-07-02 13:11:01,095][36761] Fps is (10 sec: 44237.4, 60 sec: 44236.9, 300 sec: 44264.6). Total num frames: 326860800. Throughput: 0: 44311.2. Samples: 326949160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 13:11:01,096][36761] Avg episode reward: [(0, '0.718')] [2024-07-02 13:11:04,177][36999] Updated weights for policy 0, policy_version 19960 (0.0034) [2024-07-02 13:11:06,095][36761] Fps is (10 sec: 40959.8, 60 sec: 43690.6, 300 sec: 44264.6). Total num frames: 327073792. Throughput: 0: 44210.6. Samples: 327210080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 13:11:06,098][36761] Avg episode reward: [(0, '0.723')] [2024-07-02 13:11:06,754][36979] Signal inference workers to stop experience collection... (4800 times) [2024-07-02 13:11:06,755][36979] Signal inference workers to resume experience collection... (4800 times) [2024-07-02 13:11:06,795][36999] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-07-02 13:11:06,795][36999] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-07-02 13:11:07,965][36999] Updated weights for policy 0, policy_version 19970 (0.0037) [2024-07-02 13:11:11,095][36761] Fps is (10 sec: 45874.5, 60 sec: 44236.9, 300 sec: 44264.6). Total num frames: 327319552. Throughput: 0: 44007.5. Samples: 327472280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 13:11:11,096][36761] Avg episode reward: [(0, '0.723')] [2024-07-02 13:11:11,523][36999] Updated weights for policy 0, policy_version 19980 (0.0025) [2024-07-02 13:11:15,511][36999] Updated weights for policy 0, policy_version 19990 (0.0037) [2024-07-02 13:11:16,095][36761] Fps is (10 sec: 44237.7, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 327516160. Throughput: 0: 44084.1. Samples: 327609460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-07-02 13:11:16,095][36761] Avg episode reward: [(0, '0.766')] [2024-07-02 13:11:16,113][36979] Saving new best policy, reward=0.766! [2024-07-02 13:11:18,866][36999] Updated weights for policy 0, policy_version 20000 (0.0036) [2024-07-02 13:11:21,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43694.0, 300 sec: 44264.6). Total num frames: 327729152. Throughput: 0: 44205.0. Samples: 327871160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-07-02 13:11:21,096][36761] Avg episode reward: [(0, '0.768')] [2024-07-02 13:11:21,209][36979] Saving new best policy, reward=0.768! [2024-07-02 13:11:22,874][36999] Updated weights for policy 0, policy_version 20010 (0.0037) [2024-07-02 13:11:26,095][36761] Fps is (10 sec: 47512.7, 60 sec: 44236.7, 300 sec: 44264.5). Total num frames: 327991296. Throughput: 0: 44014.9. Samples: 328130520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-07-02 13:11:26,096][36761] Avg episode reward: [(0, '0.729')] [2024-07-02 13:11:26,232][36999] Updated weights for policy 0, policy_version 20020 (0.0027) [2024-07-02 13:11:30,291][36999] Updated weights for policy 0, policy_version 20030 (0.0033) [2024-07-02 13:11:31,098][36761] Fps is (10 sec: 45864.2, 60 sec: 43965.4, 300 sec: 44264.2). Total num frames: 328187904. Throughput: 0: 43883.6. Samples: 328261040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:11:31,098][36761] Avg episode reward: [(0, '0.758')] [2024-07-02 13:11:33,910][36999] Updated weights for policy 0, policy_version 20040 (0.0022) [2024-07-02 13:11:36,095][36761] Fps is (10 sec: 40960.6, 60 sec: 43963.8, 300 sec: 44320.4). Total num frames: 328400896. Throughput: 0: 44125.8. Samples: 328528100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:11:36,096][36761] Avg episode reward: [(0, '0.758')] [2024-07-02 13:11:37,721][36999] Updated weights for policy 0, policy_version 20050 (0.0036) [2024-07-02 13:11:41,095][36761] Fps is (10 sec: 44247.4, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 328630272. Throughput: 0: 43827.7. Samples: 328783200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:11:41,096][36761] Avg episode reward: [(0, '0.752')] [2024-07-02 13:11:41,375][36999] Updated weights for policy 0, policy_version 20060 (0.0028) [2024-07-02 13:11:45,591][36999] Updated weights for policy 0, policy_version 20070 (0.0035) [2024-07-02 13:11:46,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43965.2, 300 sec: 44209.0). Total num frames: 328843264. Throughput: 0: 43653.7. Samples: 328913580. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-07-02 13:11:46,095][36761] Avg episode reward: [(0, '0.737')] [2024-07-02 13:11:48,804][36999] Updated weights for policy 0, policy_version 20080 (0.0025) [2024-07-02 13:11:51,095][36761] Fps is (10 sec: 44236.4, 60 sec: 44236.8, 300 sec: 44375.6). Total num frames: 329072640. Throughput: 0: 43870.7. Samples: 329184260. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-07-02 13:11:51,096][36761] Avg episode reward: [(0, '0.747')] [2024-07-02 13:11:52,979][36999] Updated weights for policy 0, policy_version 20090 (0.0028) [2024-07-02 13:11:56,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43690.7, 300 sec: 44153.5). Total num frames: 329285632. Throughput: 0: 43804.4. Samples: 329443480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 13:11:56,107][36761] Avg episode reward: [(0, '0.792')] [2024-07-02 13:11:56,145][36979] Saving new best policy, reward=0.792! [2024-07-02 13:11:56,751][36999] Updated weights for policy 0, policy_version 20100 (0.0038) [2024-07-02 13:12:00,449][36999] Updated weights for policy 0, policy_version 20110 (0.0036) [2024-07-02 13:12:01,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44236.7, 300 sec: 44264.6). Total num frames: 329515008. Throughput: 0: 43640.8. Samples: 329573300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 13:12:01,096][36761] Avg episode reward: [(0, '0.777')] [2024-07-02 13:12:04,373][36999] Updated weights for policy 0, policy_version 20120 (0.0040) [2024-07-02 13:12:06,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 329711616. Throughput: 0: 43816.0. Samples: 329842880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 13:12:06,096][36761] Avg episode reward: [(0, '0.772')] [2024-07-02 13:12:07,801][36999] Updated weights for policy 0, policy_version 20130 (0.0029) [2024-07-02 13:12:11,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43963.7, 300 sec: 44209.0). Total num frames: 329957376. Throughput: 0: 43843.2. Samples: 330103460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:12:11,097][36761] Avg episode reward: [(0, '0.772')] [2024-07-02 13:12:11,706][36999] Updated weights for policy 0, policy_version 20140 (0.0035) [2024-07-02 13:12:15,253][36999] Updated weights for policy 0, policy_version 20150 (0.0033) [2024-07-02 13:12:16,095][36761] Fps is (10 sec: 47513.2, 60 sec: 44509.8, 300 sec: 44209.0). Total num frames: 330186752. Throughput: 0: 44068.0. Samples: 330244000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:12:16,096][36761] Avg episode reward: [(0, '0.772')] [2024-07-02 13:12:19,093][36999] Updated weights for policy 0, policy_version 20160 (0.0030) [2024-07-02 13:12:21,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43963.7, 300 sec: 44209.0). Total num frames: 330366976. Throughput: 0: 43967.5. Samples: 330506640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:12:21,096][36761] Avg episode reward: [(0, '0.746')] [2024-07-02 13:12:22,698][36999] Updated weights for policy 0, policy_version 20170 (0.0029) [2024-07-02 13:12:26,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43690.8, 300 sec: 44209.0). Total num frames: 330612736. Throughput: 0: 43898.2. Samples: 330758620. Policy #0 lag: (min: 2.0, avg: 12.6, max: 23.0) [2024-07-02 13:12:26,096][36761] Avg episode reward: [(0, '0.779')] [2024-07-02 13:12:26,562][36999] Updated weights for policy 0, policy_version 20180 (0.0038) [2024-07-02 13:12:27,592][36979] Signal inference workers to stop experience collection... (4850 times) [2024-07-02 13:12:27,625][36999] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-07-02 13:12:27,651][36979] Signal inference workers to resume experience collection... (4850 times) [2024-07-02 13:12:27,652][36999] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-07-02 13:12:30,177][36999] Updated weights for policy 0, policy_version 20190 (0.0026) [2024-07-02 13:12:31,095][36761] Fps is (10 sec: 47513.9, 60 sec: 44238.6, 300 sec: 44154.2). Total num frames: 330842112. Throughput: 0: 44062.7. Samples: 330896400. Policy #0 lag: (min: 2.0, avg: 12.6, max: 23.0) [2024-07-02 13:12:31,096][36761] Avg episode reward: [(0, '0.785')] [2024-07-02 13:12:33,787][36999] Updated weights for policy 0, policy_version 20200 (0.0041) [2024-07-02 13:12:36,095][36761] Fps is (10 sec: 42597.7, 60 sec: 43963.6, 300 sec: 44209.0). Total num frames: 331038720. Throughput: 0: 44017.2. Samples: 331165040. Policy #0 lag: (min: 2.0, avg: 12.6, max: 23.0) [2024-07-02 13:12:36,096][36761] Avg episode reward: [(0, '0.772')] [2024-07-02 13:12:36,106][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000020205_331038720.pth... [2024-07-02 13:12:36,156][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000019561_320487424.pth [2024-07-02 13:12:37,535][36999] Updated weights for policy 0, policy_version 20210 (0.0030) [2024-07-02 13:12:41,096][36761] Fps is (10 sec: 42595.5, 60 sec: 43963.2, 300 sec: 44153.4). Total num frames: 331268096. Throughput: 0: 44017.6. Samples: 331424300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:12:41,096][36761] Avg episode reward: [(0, '0.775')] [2024-07-02 13:12:41,576][36999] Updated weights for policy 0, policy_version 20220 (0.0021) [2024-07-02 13:12:44,901][36999] Updated weights for policy 0, policy_version 20230 (0.0038) [2024-07-02 13:12:46,095][36761] Fps is (10 sec: 45875.7, 60 sec: 44236.7, 300 sec: 44153.5). Total num frames: 331497472. Throughput: 0: 44207.5. Samples: 331562640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:12:46,096][36761] Avg episode reward: [(0, '0.763')] [2024-07-02 13:12:48,952][36999] Updated weights for policy 0, policy_version 20240 (0.0032) [2024-07-02 13:12:51,095][36761] Fps is (10 sec: 42601.1, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 331694080. Throughput: 0: 44052.9. Samples: 331825260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:12:51,096][36761] Avg episode reward: [(0, '0.788')] [2024-07-02 13:12:52,325][36999] Updated weights for policy 0, policy_version 20250 (0.0029) [2024-07-02 13:12:56,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43963.8, 300 sec: 44209.0). Total num frames: 331923456. Throughput: 0: 44108.5. Samples: 332088340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-07-02 13:12:56,096][36761] Avg episode reward: [(0, '0.812')] [2024-07-02 13:12:56,106][36979] Saving new best policy, reward=0.812! [2024-07-02 13:12:56,331][36999] Updated weights for policy 0, policy_version 20260 (0.0035) [2024-07-02 13:12:59,910][36999] Updated weights for policy 0, policy_version 20270 (0.0037) [2024-07-02 13:13:01,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 332169216. Throughput: 0: 43917.8. Samples: 332220300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-07-02 13:13:01,096][36761] Avg episode reward: [(0, '0.797')] [2024-07-02 13:13:03,744][36999] Updated weights for policy 0, policy_version 20280 (0.0026) [2024-07-02 13:13:06,095][36761] Fps is (10 sec: 44236.0, 60 sec: 44236.7, 300 sec: 44042.4). Total num frames: 332365824. Throughput: 0: 43963.9. Samples: 332485020. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-07-02 13:13:06,096][36761] Avg episode reward: [(0, '0.797')] [2024-07-02 13:13:07,299][36999] Updated weights for policy 0, policy_version 20290 (0.0039) [2024-07-02 13:13:11,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43690.7, 300 sec: 44153.6). Total num frames: 332578816. Throughput: 0: 44068.4. Samples: 332741700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-07-02 13:13:11,096][36761] Avg episode reward: [(0, '0.754')] [2024-07-02 13:13:11,108][36999] Updated weights for policy 0, policy_version 20300 (0.0031) [2024-07-02 13:13:14,853][36999] Updated weights for policy 0, policy_version 20310 (0.0026) [2024-07-02 13:13:16,095][36761] Fps is (10 sec: 44237.7, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 332808192. Throughput: 0: 43991.1. Samples: 332876000. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-07-02 13:13:16,096][36761] Avg episode reward: [(0, '0.768')] [2024-07-02 13:13:18,457][36999] Updated weights for policy 0, policy_version 20320 (0.0031) [2024-07-02 13:13:21,095][36761] Fps is (10 sec: 44236.9, 60 sec: 44236.8, 300 sec: 44042.5). Total num frames: 333021184. Throughput: 0: 44077.0. Samples: 333148500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 13:13:21,096][36761] Avg episode reward: [(0, '0.781')] [2024-07-02 13:13:22,143][36999] Updated weights for policy 0, policy_version 20330 (0.0023) [2024-07-02 13:13:25,758][36999] Updated weights for policy 0, policy_version 20340 (0.0034) [2024-07-02 13:13:26,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 333250560. Throughput: 0: 43978.4. Samples: 333403300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 13:13:26,096][36761] Avg episode reward: [(0, '0.780')] [2024-07-02 13:13:29,602][36999] Updated weights for policy 0, policy_version 20350 (0.0028) [2024-07-02 13:13:31,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 333463552. Throughput: 0: 43893.8. Samples: 333537860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 13:13:31,096][36761] Avg episode reward: [(0, '0.771')] [2024-07-02 13:13:33,101][36999] Updated weights for policy 0, policy_version 20360 (0.0025) [2024-07-02 13:13:36,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43963.9, 300 sec: 44042.4). Total num frames: 333676544. Throughput: 0: 44072.5. Samples: 333808520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 13:13:36,096][36761] Avg episode reward: [(0, '0.771')] [2024-07-02 13:13:37,035][36999] Updated weights for policy 0, policy_version 20370 (0.0029) [2024-07-02 13:13:40,473][36999] Updated weights for policy 0, policy_version 20380 (0.0037) [2024-07-02 13:13:41,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43964.2, 300 sec: 44153.5). Total num frames: 333905920. Throughput: 0: 43884.0. Samples: 334063120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 13:13:41,096][36761] Avg episode reward: [(0, '0.775')] [2024-07-02 13:13:44,503][36999] Updated weights for policy 0, policy_version 20390 (0.0049) [2024-07-02 13:13:46,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 334118912. Throughput: 0: 43981.4. Samples: 334199460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 13:13:46,096][36761] Avg episode reward: [(0, '0.798')] [2024-07-02 13:13:48,284][36999] Updated weights for policy 0, policy_version 20400 (0.0040) [2024-07-02 13:13:51,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43963.7, 300 sec: 44043.1). Total num frames: 334331904. Throughput: 0: 43988.1. Samples: 334464480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-07-02 13:13:51,096][36761] Avg episode reward: [(0, '0.785')] [2024-07-02 13:13:52,010][36999] Updated weights for policy 0, policy_version 20410 (0.0028) [2024-07-02 13:13:55,682][36999] Updated weights for policy 0, policy_version 20420 (0.0035) [2024-07-02 13:13:56,095][36761] Fps is (10 sec: 45875.2, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 334577664. Throughput: 0: 43896.0. Samples: 334717020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-07-02 13:13:56,096][36761] Avg episode reward: [(0, '0.757')] [2024-07-02 13:13:59,369][36999] Updated weights for policy 0, policy_version 20430 (0.0027) [2024-07-02 13:14:00,294][36979] Signal inference workers to stop experience collection... (4900 times) [2024-07-02 13:14:00,356][36999] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-07-02 13:14:00,414][36979] Signal inference workers to resume experience collection... (4900 times) [2024-07-02 13:14:00,414][36999] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-07-02 13:14:01,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 43987.4). Total num frames: 334774272. Throughput: 0: 44008.4. Samples: 334856380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-07-02 13:14:01,096][36761] Avg episode reward: [(0, '0.757')] [2024-07-02 13:14:03,066][36999] Updated weights for policy 0, policy_version 20440 (0.0036) [2024-07-02 13:14:06,095][36761] Fps is (10 sec: 40959.8, 60 sec: 43690.8, 300 sec: 44042.4). Total num frames: 334987264. Throughput: 0: 43823.5. Samples: 335120560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 13:14:06,096][36761] Avg episode reward: [(0, '0.693')] [2024-07-02 13:14:06,794][36999] Updated weights for policy 0, policy_version 20450 (0.0037) [2024-07-02 13:14:10,477][36999] Updated weights for policy 0, policy_version 20460 (0.0029) [2024-07-02 13:14:11,095][36761] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 335233024. Throughput: 0: 43876.9. Samples: 335377760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 13:14:11,096][36761] Avg episode reward: [(0, '0.727')] [2024-07-02 13:14:14,162][36999] Updated weights for policy 0, policy_version 20470 (0.0025) [2024-07-02 13:14:16,095][36761] Fps is (10 sec: 45874.9, 60 sec: 43963.6, 300 sec: 43986.9). Total num frames: 335446016. Throughput: 0: 44057.7. Samples: 335520460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 13:14:16,096][36761] Avg episode reward: [(0, '0.723')] [2024-07-02 13:14:17,769][36999] Updated weights for policy 0, policy_version 20480 (0.0042) [2024-07-02 13:14:21,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 335659008. Throughput: 0: 43955.1. Samples: 335786500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:14:21,096][36761] Avg episode reward: [(0, '0.763')] [2024-07-02 13:14:21,648][36999] Updated weights for policy 0, policy_version 20490 (0.0029) [2024-07-02 13:14:25,136][36999] Updated weights for policy 0, policy_version 20500 (0.0040) [2024-07-02 13:14:26,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 335888384. Throughput: 0: 43999.9. Samples: 336043120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:14:26,096][36761] Avg episode reward: [(0, '0.805')] [2024-07-02 13:14:29,170][36999] Updated weights for policy 0, policy_version 20510 (0.0035) [2024-07-02 13:14:31,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 336101376. Throughput: 0: 43962.2. Samples: 336177760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 13:14:31,096][36761] Avg episode reward: [(0, '0.805')] [2024-07-02 13:14:32,543][36999] Updated weights for policy 0, policy_version 20520 (0.0032) [2024-07-02 13:14:36,095][36761] Fps is (10 sec: 44237.0, 60 sec: 44236.7, 300 sec: 44042.4). Total num frames: 336330752. Throughput: 0: 44066.3. Samples: 336447460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 13:14:36,096][36761] Avg episode reward: [(0, '0.803')] [2024-07-02 13:14:36,110][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000020528_336330752.pth... [2024-07-02 13:14:36,172][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000019883_325763072.pth [2024-07-02 13:14:36,681][36999] Updated weights for policy 0, policy_version 20530 (0.0031) [2024-07-02 13:14:39,924][36999] Updated weights for policy 0, policy_version 20540 (0.0034) [2024-07-02 13:14:41,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43963.7, 300 sec: 44098.0). Total num frames: 336543744. Throughput: 0: 44214.6. Samples: 336706680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 13:14:41,100][36761] Avg episode reward: [(0, '0.788')] [2024-07-02 13:14:44,012][36999] Updated weights for policy 0, policy_version 20550 (0.0033) [2024-07-02 13:14:46,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43963.7, 300 sec: 43987.5). Total num frames: 336756736. Throughput: 0: 44091.0. Samples: 336840480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-07-02 13:14:46,096][36761] Avg episode reward: [(0, '0.776')] [2024-07-02 13:14:47,561][36999] Updated weights for policy 0, policy_version 20560 (0.0031) [2024-07-02 13:14:51,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 336969728. Throughput: 0: 44179.7. Samples: 337108640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-07-02 13:14:51,096][36761] Avg episode reward: [(0, '0.770')] [2024-07-02 13:14:51,423][36999] Updated weights for policy 0, policy_version 20570 (0.0035) [2024-07-02 13:14:54,877][36999] Updated weights for policy 0, policy_version 20580 (0.0029) [2024-07-02 13:14:56,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43690.7, 300 sec: 44042.4). Total num frames: 337199104. Throughput: 0: 44234.7. Samples: 337368320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-07-02 13:14:56,096][36761] Avg episode reward: [(0, '0.758')] [2024-07-02 13:14:58,610][36979] Signal inference workers to stop experience collection... (4950 times) [2024-07-02 13:14:58,611][36979] Signal inference workers to resume experience collection... (4950 times) [2024-07-02 13:14:58,649][36999] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-07-02 13:14:58,649][36999] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-07-02 13:14:58,755][36999] Updated weights for policy 0, policy_version 20590 (0.0032) [2024-07-02 13:15:01,100][36761] Fps is (10 sec: 45854.0, 60 sec: 44233.4, 300 sec: 43986.2). Total num frames: 337428480. Throughput: 0: 43993.9. Samples: 337500380. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-07-02 13:15:01,100][36761] Avg episode reward: [(0, '0.768')] [2024-07-02 13:15:02,382][36999] Updated weights for policy 0, policy_version 20600 (0.0037) [2024-07-02 13:15:06,095][36761] Fps is (10 sec: 45874.4, 60 sec: 44509.8, 300 sec: 44042.4). Total num frames: 337657856. Throughput: 0: 43986.9. Samples: 337765920. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-07-02 13:15:06,096][36761] Avg episode reward: [(0, '0.761')] [2024-07-02 13:15:06,151][36999] Updated weights for policy 0, policy_version 20610 (0.0029) [2024-07-02 13:15:10,022][36999] Updated weights for policy 0, policy_version 20620 (0.0032) [2024-07-02 13:15:11,095][36761] Fps is (10 sec: 42618.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 337854464. Throughput: 0: 44141.4. Samples: 338029480. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-07-02 13:15:11,096][36761] Avg episode reward: [(0, '0.800')] [2024-07-02 13:15:13,508][36999] Updated weights for policy 0, policy_version 20630 (0.0027) [2024-07-02 13:15:16,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43963.7, 300 sec: 43987.5). Total num frames: 338083840. Throughput: 0: 43982.5. Samples: 338156980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:15:16,096][36761] Avg episode reward: [(0, '0.813')] [2024-07-02 13:15:17,489][36999] Updated weights for policy 0, policy_version 20640 (0.0033) [2024-07-02 13:15:21,040][36999] Updated weights for policy 0, policy_version 20650 (0.0030) [2024-07-02 13:15:21,095][36761] Fps is (10 sec: 47513.5, 60 sec: 44509.8, 300 sec: 44042.4). Total num frames: 338329600. Throughput: 0: 44103.1. Samples: 338432100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:15:21,096][36761] Avg episode reward: [(0, '0.818')] [2024-07-02 13:15:21,168][36979] Saving new best policy, reward=0.818! [2024-07-02 13:15:24,754][36999] Updated weights for policy 0, policy_version 20660 (0.0035) [2024-07-02 13:15:26,095][36761] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 44043.1). Total num frames: 338542592. Throughput: 0: 44329.7. Samples: 338701520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:15:26,096][36761] Avg episode reward: [(0, '0.778')] [2024-07-02 13:15:28,461][36999] Updated weights for policy 0, policy_version 20670 (0.0039) [2024-07-02 13:15:31,095][36761] Fps is (10 sec: 42598.9, 60 sec: 44236.9, 300 sec: 44042.4). Total num frames: 338755584. Throughput: 0: 44139.3. Samples: 338826740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 13:15:31,096][36761] Avg episode reward: [(0, '0.781')] [2024-07-02 13:15:32,086][36999] Updated weights for policy 0, policy_version 20680 (0.0027) [2024-07-02 13:15:35,690][36999] Updated weights for policy 0, policy_version 20690 (0.0028) [2024-07-02 13:15:36,096][36761] Fps is (10 sec: 45872.9, 60 sec: 44509.5, 300 sec: 44097.9). Total num frames: 339001344. Throughput: 0: 44309.1. Samples: 339102580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 13:15:36,096][36761] Avg episode reward: [(0, '0.770')] [2024-07-02 13:15:39,437][36999] Updated weights for policy 0, policy_version 20700 (0.0047) [2024-07-02 13:15:41,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43963.8, 300 sec: 43987.2). Total num frames: 339181568. Throughput: 0: 44380.9. Samples: 339365460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 13:15:41,096][36761] Avg episode reward: [(0, '0.782')] [2024-07-02 13:15:43,138][36999] Updated weights for policy 0, policy_version 20710 (0.0037) [2024-07-02 13:15:46,095][36761] Fps is (10 sec: 42601.1, 60 sec: 44509.9, 300 sec: 44098.0). Total num frames: 339427328. Throughput: 0: 44273.4. Samples: 339492480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 13:15:46,096][36761] Avg episode reward: [(0, '0.767')] [2024-07-02 13:15:46,931][36999] Updated weights for policy 0, policy_version 20720 (0.0028) [2024-07-02 13:15:50,454][36999] Updated weights for policy 0, policy_version 20730 (0.0022) [2024-07-02 13:15:51,095][36761] Fps is (10 sec: 47513.4, 60 sec: 44782.9, 300 sec: 44042.4). Total num frames: 339656704. Throughput: 0: 44363.2. Samples: 339762260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 13:15:51,096][36761] Avg episode reward: [(0, '0.785')] [2024-07-02 13:15:54,328][36999] Updated weights for policy 0, policy_version 20740 (0.0048) [2024-07-02 13:15:56,095][36761] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 339853312. Throughput: 0: 44438.7. Samples: 340029220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-07-02 13:15:56,096][36761] Avg episode reward: [(0, '0.781')] [2024-07-02 13:15:57,785][36999] Updated weights for policy 0, policy_version 20750 (0.0031) [2024-07-02 13:16:01,095][36761] Fps is (10 sec: 44237.1, 60 sec: 44513.3, 300 sec: 44153.5). Total num frames: 340099072. Throughput: 0: 44489.5. Samples: 340159000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-07-02 13:16:01,096][36761] Avg episode reward: [(0, '0.816')] [2024-07-02 13:16:01,699][36999] Updated weights for policy 0, policy_version 20760 (0.0033) [2024-07-02 13:16:05,419][36999] Updated weights for policy 0, policy_version 20770 (0.0023) [2024-07-02 13:16:06,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 340295680. Throughput: 0: 44303.5. Samples: 340425760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-07-02 13:16:06,097][36761] Avg episode reward: [(0, '0.801')] [2024-07-02 13:16:09,021][36999] Updated weights for policy 0, policy_version 20780 (0.0028) [2024-07-02 13:16:11,096][36761] Fps is (10 sec: 40957.5, 60 sec: 44236.4, 300 sec: 44042.3). Total num frames: 340508672. Throughput: 0: 44284.0. Samples: 340694320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 13:16:11,097][36761] Avg episode reward: [(0, '0.786')] [2024-07-02 13:16:13,102][36999] Updated weights for policy 0, policy_version 20790 (0.0032) [2024-07-02 13:16:16,095][36761] Fps is (10 sec: 45875.2, 60 sec: 44509.9, 300 sec: 44153.5). Total num frames: 340754432. Throughput: 0: 44273.1. Samples: 340819040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 13:16:16,097][36761] Avg episode reward: [(0, '0.779')] [2024-07-02 13:16:16,355][36999] Updated weights for policy 0, policy_version 20800 (0.0032) [2024-07-02 13:16:20,279][36979] Signal inference workers to stop experience collection... (5000 times) [2024-07-02 13:16:20,279][36979] Signal inference workers to resume experience collection... (5000 times) [2024-07-02 13:16:20,320][36999] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-07-02 13:16:20,320][36999] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-07-02 13:16:20,414][36999] Updated weights for policy 0, policy_version 20810 (0.0037) [2024-07-02 13:16:21,095][36761] Fps is (10 sec: 45877.6, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 340967424. Throughput: 0: 44034.8. Samples: 341084120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 13:16:21,096][36761] Avg episode reward: [(0, '0.797')] [2024-07-02 13:16:23,665][36999] Updated weights for policy 0, policy_version 20820 (0.0035) [2024-07-02 13:16:26,096][36761] Fps is (10 sec: 40959.0, 60 sec: 43690.5, 300 sec: 43987.2). Total num frames: 341164032. Throughput: 0: 44309.0. Samples: 341359380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:16:26,096][36761] Avg episode reward: [(0, '0.789')] [2024-07-02 13:16:27,766][36999] Updated weights for policy 0, policy_version 20830 (0.0028) [2024-07-02 13:16:30,992][36999] Updated weights for policy 0, policy_version 20840 (0.0044) [2024-07-02 13:16:31,095][36761] Fps is (10 sec: 47513.5, 60 sec: 44782.8, 300 sec: 44209.0). Total num frames: 341442560. Throughput: 0: 44309.2. Samples: 341486400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:16:31,096][36761] Avg episode reward: [(0, '0.765')] [2024-07-02 13:16:35,125][36999] Updated weights for policy 0, policy_version 20850 (0.0026) [2024-07-02 13:16:36,096][36761] Fps is (10 sec: 45875.9, 60 sec: 43691.0, 300 sec: 44042.4). Total num frames: 341622784. Throughput: 0: 44092.3. Samples: 341746420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:16:36,107][36761] Avg episode reward: [(0, '0.767')] [2024-07-02 13:16:36,124][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000020851_341622784.pth... [2024-07-02 13:16:36,175][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000020205_331038720.pth [2024-07-02 13:16:38,549][36999] Updated weights for policy 0, policy_version 20860 (0.0034) [2024-07-02 13:16:41,095][36761] Fps is (10 sec: 37683.0, 60 sec: 43963.6, 300 sec: 43986.9). Total num frames: 341819392. Throughput: 0: 44106.1. Samples: 342014000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 13:16:41,096][36761] Avg episode reward: [(0, '0.745')] [2024-07-02 13:16:42,505][36999] Updated weights for policy 0, policy_version 20870 (0.0030) [2024-07-02 13:16:45,919][36999] Updated weights for policy 0, policy_version 20880 (0.0036) [2024-07-02 13:16:46,098][36761] Fps is (10 sec: 47503.4, 60 sec: 44508.1, 300 sec: 44153.2). Total num frames: 342097920. Throughput: 0: 44163.9. Samples: 342146480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 13:16:46,098][36761] Avg episode reward: [(0, '0.760')] [2024-07-02 13:16:49,856][36999] Updated weights for policy 0, policy_version 20890 (0.0035) [2024-07-02 13:16:51,095][36761] Fps is (10 sec: 45875.4, 60 sec: 43690.6, 300 sec: 44042.4). Total num frames: 342278144. Throughput: 0: 44056.9. Samples: 342408320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 13:16:51,096][36761] Avg episode reward: [(0, '0.781')] [2024-07-02 13:16:53,207][36999] Updated weights for policy 0, policy_version 20900 (0.0030) [2024-07-02 13:16:56,095][36761] Fps is (10 sec: 39330.5, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 342491136. Throughput: 0: 44169.4. Samples: 342681920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:16:56,097][36761] Avg episode reward: [(0, '0.791')] [2024-07-02 13:16:57,310][36999] Updated weights for policy 0, policy_version 20910 (0.0027) [2024-07-02 13:17:00,934][36999] Updated weights for policy 0, policy_version 20920 (0.0032) [2024-07-02 13:17:01,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44236.7, 300 sec: 44209.0). Total num frames: 342753280. Throughput: 0: 44260.5. Samples: 342810760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:17:01,096][36761] Avg episode reward: [(0, '0.785')] [2024-07-02 13:17:04,683][36999] Updated weights for policy 0, policy_version 20930 (0.0035) [2024-07-02 13:17:06,095][36761] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 342949888. Throughput: 0: 44143.1. Samples: 343070560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:17:06,096][36761] Avg episode reward: [(0, '0.778')] [2024-07-02 13:17:08,222][36999] Updated weights for policy 0, policy_version 20940 (0.0041) [2024-07-02 13:17:11,095][36761] Fps is (10 sec: 40960.2, 60 sec: 44237.2, 300 sec: 43986.9). Total num frames: 343162880. Throughput: 0: 44041.7. Samples: 343341240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:17:11,096][36761] Avg episode reward: [(0, '0.774')] [2024-07-02 13:17:12,119][36999] Updated weights for policy 0, policy_version 20950 (0.0022) [2024-07-02 13:17:15,625][36999] Updated weights for policy 0, policy_version 20960 (0.0044) [2024-07-02 13:17:16,095][36761] Fps is (10 sec: 45874.8, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 343408640. Throughput: 0: 44141.3. Samples: 343472760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:17:16,096][36761] Avg episode reward: [(0, '0.814')] [2024-07-02 13:17:19,577][36999] Updated weights for policy 0, policy_version 20970 (0.0036) [2024-07-02 13:17:21,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 343605248. Throughput: 0: 44025.9. Samples: 343727580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:17:21,096][36761] Avg episode reward: [(0, '0.824')] [2024-07-02 13:17:21,096][36979] Saving new best policy, reward=0.824! [2024-07-02 13:17:23,237][36999] Updated weights for policy 0, policy_version 20980 (0.0035) [2024-07-02 13:17:26,095][36761] Fps is (10 sec: 40960.7, 60 sec: 44237.1, 300 sec: 43986.9). Total num frames: 343818240. Throughput: 0: 44155.3. Samples: 344000980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 13:17:26,095][36761] Avg episode reward: [(0, '0.795')] [2024-07-02 13:17:26,923][36999] Updated weights for policy 0, policy_version 20990 (0.0030) [2024-07-02 13:17:30,676][36999] Updated weights for policy 0, policy_version 21000 (0.0028) [2024-07-02 13:17:31,100][36761] Fps is (10 sec: 47491.9, 60 sec: 43960.4, 300 sec: 44208.4). Total num frames: 344080384. Throughput: 0: 44133.4. Samples: 344132580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 13:17:31,101][36761] Avg episode reward: [(0, '0.793')] [2024-07-02 13:17:34,325][36999] Updated weights for policy 0, policy_version 21010 (0.0032) [2024-07-02 13:17:36,095][36761] Fps is (10 sec: 44236.3, 60 sec: 43963.8, 300 sec: 44042.5). Total num frames: 344260608. Throughput: 0: 44044.9. Samples: 344390340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 13:17:36,096][36761] Avg episode reward: [(0, '0.817')] [2024-07-02 13:17:38,363][36999] Updated weights for policy 0, policy_version 21020 (0.0044) [2024-07-02 13:17:41,095][36761] Fps is (10 sec: 42618.2, 60 sec: 44783.1, 300 sec: 44098.0). Total num frames: 344506368. Throughput: 0: 43917.0. Samples: 344658180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-07-02 13:17:41,096][36761] Avg episode reward: [(0, '0.816')] [2024-07-02 13:17:41,650][36999] Updated weights for policy 0, policy_version 21030 (0.0032) [2024-07-02 13:17:46,083][36999] Updated weights for policy 0, policy_version 21040 (0.0032) [2024-07-02 13:17:46,095][36761] Fps is (10 sec: 45875.4, 60 sec: 43692.3, 300 sec: 44153.5). Total num frames: 344719360. Throughput: 0: 43995.1. Samples: 344790540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-07-02 13:17:46,096][36761] Avg episode reward: [(0, '0.825')] [2024-07-02 13:17:46,213][36979] Saving new best policy, reward=0.825! [2024-07-02 13:17:49,417][36999] Updated weights for policy 0, policy_version 21050 (0.0030) [2024-07-02 13:17:51,100][36761] Fps is (10 sec: 42578.6, 60 sec: 44233.5, 300 sec: 44097.3). Total num frames: 344932352. Throughput: 0: 44044.0. Samples: 345052740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-07-02 13:17:51,101][36761] Avg episode reward: [(0, '0.804')] [2024-07-02 13:17:53,340][36999] Updated weights for policy 0, policy_version 21060 (0.0027) [2024-07-02 13:17:56,095][36761] Fps is (10 sec: 44237.1, 60 sec: 44509.9, 300 sec: 44042.4). Total num frames: 345161728. Throughput: 0: 43956.9. Samples: 345319300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:17:56,096][36761] Avg episode reward: [(0, '0.790')] [2024-07-02 13:17:56,722][36999] Updated weights for policy 0, policy_version 21070 (0.0033) [2024-07-02 13:17:58,766][36979] Signal inference workers to stop experience collection... (5050 times) [2024-07-02 13:17:58,807][36999] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-07-02 13:17:58,881][36979] Signal inference workers to resume experience collection... (5050 times) [2024-07-02 13:17:58,881][36999] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-07-02 13:18:00,625][36999] Updated weights for policy 0, policy_version 21080 (0.0044) [2024-07-02 13:18:01,095][36761] Fps is (10 sec: 45895.9, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 345391104. Throughput: 0: 44076.0. Samples: 345456180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:18:01,096][36761] Avg episode reward: [(0, '0.807')] [2024-07-02 13:18:04,627][36999] Updated weights for policy 0, policy_version 21090 (0.0036) [2024-07-02 13:18:06,095][36761] Fps is (10 sec: 42597.5, 60 sec: 43963.6, 300 sec: 44097.9). Total num frames: 345587712. Throughput: 0: 44213.2. Samples: 345717180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:18:06,096][36761] Avg episode reward: [(0, '0.823')] [2024-07-02 13:18:08,265][36999] Updated weights for policy 0, policy_version 21100 (0.0039) [2024-07-02 13:18:11,095][36761] Fps is (10 sec: 44237.4, 60 sec: 44509.9, 300 sec: 44153.5). Total num frames: 345833472. Throughput: 0: 44030.2. Samples: 345982340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 13:18:11,096][36761] Avg episode reward: [(0, '0.795')] [2024-07-02 13:18:11,889][36999] Updated weights for policy 0, policy_version 21110 (0.0031) [2024-07-02 13:18:15,591][36999] Updated weights for policy 0, policy_version 21120 (0.0030) [2024-07-02 13:18:16,095][36761] Fps is (10 sec: 45875.6, 60 sec: 43963.8, 300 sec: 44153.5). Total num frames: 346046464. Throughput: 0: 43992.4. Samples: 346112040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 13:18:16,096][36761] Avg episode reward: [(0, '0.806')] [2024-07-02 13:18:19,121][36999] Updated weights for policy 0, policy_version 21130 (0.0039) [2024-07-02 13:18:21,095][36761] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 346259456. Throughput: 0: 44247.2. Samples: 346381460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 13:18:21,096][36761] Avg episode reward: [(0, '0.812')] [2024-07-02 13:18:22,873][36999] Updated weights for policy 0, policy_version 21140 (0.0050) [2024-07-02 13:18:26,095][36761] Fps is (10 sec: 45875.8, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 346505216. Throughput: 0: 44279.1. Samples: 346650740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:18:26,096][36761] Avg episode reward: [(0, '0.809')] [2024-07-02 13:18:26,804][36999] Updated weights for policy 0, policy_version 21150 (0.0034) [2024-07-02 13:18:30,175][36999] Updated weights for policy 0, policy_version 21160 (0.0026) [2024-07-02 13:18:31,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43694.0, 300 sec: 44153.5). Total num frames: 346701824. Throughput: 0: 44205.8. Samples: 346779800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:18:31,096][36761] Avg episode reward: [(0, '0.840')] [2024-07-02 13:18:31,096][36979] Saving new best policy, reward=0.840! [2024-07-02 13:18:34,063][36999] Updated weights for policy 0, policy_version 21170 (0.0042) [2024-07-02 13:18:36,100][36761] Fps is (10 sec: 44216.2, 60 sec: 44779.6, 300 sec: 44208.3). Total num frames: 346947584. Throughput: 0: 44421.3. Samples: 347051700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:18:36,100][36761] Avg episode reward: [(0, '0.839')] [2024-07-02 13:18:36,113][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000021176_346947584.pth... [2024-07-02 13:18:36,163][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000020528_336330752.pth [2024-07-02 13:18:37,489][36999] Updated weights for policy 0, policy_version 21180 (0.0027) [2024-07-02 13:18:41,095][36761] Fps is (10 sec: 44236.3, 60 sec: 43963.6, 300 sec: 44153.5). Total num frames: 347144192. Throughput: 0: 44276.7. Samples: 347311760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-07-02 13:18:41,096][36761] Avg episode reward: [(0, '0.797')] [2024-07-02 13:18:41,408][36999] Updated weights for policy 0, policy_version 21190 (0.0035) [2024-07-02 13:18:44,834][36999] Updated weights for policy 0, policy_version 21200 (0.0030) [2024-07-02 13:18:46,095][36761] Fps is (10 sec: 40978.9, 60 sec: 43963.8, 300 sec: 44153.5). Total num frames: 347357184. Throughput: 0: 44208.1. Samples: 347445540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-07-02 13:18:46,096][36761] Avg episode reward: [(0, '0.801')] [2024-07-02 13:18:48,659][36999] Updated weights for policy 0, policy_version 21210 (0.0036) [2024-07-02 13:18:51,095][36761] Fps is (10 sec: 45875.8, 60 sec: 44513.3, 300 sec: 44153.5). Total num frames: 347602944. Throughput: 0: 44459.3. Samples: 347717840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-07-02 13:18:51,096][36761] Avg episode reward: [(0, '0.803')] [2024-07-02 13:18:52,189][36999] Updated weights for policy 0, policy_version 21220 (0.0037) [2024-07-02 13:18:55,950][36999] Updated weights for policy 0, policy_version 21230 (0.0029) [2024-07-02 13:18:56,095][36761] Fps is (10 sec: 47513.2, 60 sec: 44509.8, 300 sec: 44264.6). Total num frames: 347832320. Throughput: 0: 44396.4. Samples: 347980180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-07-02 13:18:56,099][36761] Avg episode reward: [(0, '0.832')] [2024-07-02 13:18:59,459][36999] Updated weights for policy 0, policy_version 21240 (0.0038) [2024-07-02 13:19:01,100][36761] Fps is (10 sec: 42578.6, 60 sec: 43960.4, 300 sec: 44208.3). Total num frames: 348028928. Throughput: 0: 44448.4. Samples: 348112420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-07-02 13:19:01,100][36761] Avg episode reward: [(0, '0.851')] [2024-07-02 13:19:01,103][36979] Saving new best policy, reward=0.851! [2024-07-02 13:19:03,371][36999] Updated weights for policy 0, policy_version 21250 (0.0024) [2024-07-02 13:19:06,095][36761] Fps is (10 sec: 44236.3, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 348274688. Throughput: 0: 44358.5. Samples: 348377600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-07-02 13:19:06,096][36761] Avg episode reward: [(0, '0.851')] [2024-07-02 13:19:06,928][36999] Updated weights for policy 0, policy_version 21260 (0.0032) [2024-07-02 13:19:10,842][36999] Updated weights for policy 0, policy_version 21270 (0.0030) [2024-07-02 13:19:11,095][36761] Fps is (10 sec: 45896.4, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 348487680. Throughput: 0: 44307.9. Samples: 348644600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:19:11,096][36761] Avg episode reward: [(0, '0.788')] [2024-07-02 13:19:14,875][36999] Updated weights for policy 0, policy_version 21280 (0.0027) [2024-07-02 13:19:16,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 348684288. Throughput: 0: 44305.7. Samples: 348773560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:19:16,096][36761] Avg episode reward: [(0, '0.775')] [2024-07-02 13:19:18,207][36999] Updated weights for policy 0, policy_version 21290 (0.0026) [2024-07-02 13:19:21,096][36761] Fps is (10 sec: 45872.7, 60 sec: 44782.5, 300 sec: 44264.5). Total num frames: 348946432. Throughput: 0: 44185.3. Samples: 349039860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 13:19:21,096][36761] Avg episode reward: [(0, '0.820')] [2024-07-02 13:19:22,482][36999] Updated weights for policy 0, policy_version 21300 (0.0034) [2024-07-02 13:19:25,613][36999] Updated weights for policy 0, policy_version 21310 (0.0032) [2024-07-02 13:19:26,095][36761] Fps is (10 sec: 47513.2, 60 sec: 44236.6, 300 sec: 44264.5). Total num frames: 349159424. Throughput: 0: 44329.7. Samples: 349306600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 13:19:26,096][36761] Avg episode reward: [(0, '0.829')] [2024-07-02 13:19:29,938][36979] Signal inference workers to stop experience collection... (5100 times) [2024-07-02 13:19:29,938][36979] Signal inference workers to resume experience collection... (5100 times) [2024-07-02 13:19:29,945][36999] Updated weights for policy 0, policy_version 21320 (0.0032) [2024-07-02 13:19:29,972][36999] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-07-02 13:19:29,972][36999] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-07-02 13:19:31,095][36761] Fps is (10 sec: 40962.0, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 349356032. Throughput: 0: 44347.9. Samples: 349441200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 13:19:31,099][36761] Avg episode reward: [(0, '0.832')] [2024-07-02 13:19:32,957][36999] Updated weights for policy 0, policy_version 21330 (0.0032) [2024-07-02 13:19:36,095][36761] Fps is (10 sec: 44237.5, 60 sec: 44240.2, 300 sec: 44264.6). Total num frames: 349601792. Throughput: 0: 44137.7. Samples: 349704040. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-07-02 13:19:36,096][36761] Avg episode reward: [(0, '0.828')] [2024-07-02 13:19:37,231][36999] Updated weights for policy 0, policy_version 21340 (0.0028) [2024-07-02 13:19:40,294][36999] Updated weights for policy 0, policy_version 21350 (0.0027) [2024-07-02 13:19:41,095][36761] Fps is (10 sec: 45875.5, 60 sec: 44510.0, 300 sec: 44264.6). Total num frames: 349814784. Throughput: 0: 44249.8. Samples: 349971420. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-07-02 13:19:41,096][36761] Avg episode reward: [(0, '0.845')] [2024-07-02 13:19:44,550][36999] Updated weights for policy 0, policy_version 21360 (0.0028) [2024-07-02 13:19:46,095][36761] Fps is (10 sec: 40959.9, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 350011392. Throughput: 0: 44278.3. Samples: 350104740. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-07-02 13:19:46,096][36761] Avg episode reward: [(0, '0.832')] [2024-07-02 13:19:47,657][36999] Updated weights for policy 0, policy_version 21370 (0.0027) [2024-07-02 13:19:51,100][36761] Fps is (10 sec: 44216.5, 60 sec: 44233.4, 300 sec: 44263.9). Total num frames: 350257152. Throughput: 0: 44169.9. Samples: 350365440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 13:19:51,100][36761] Avg episode reward: [(0, '0.772')] [2024-07-02 13:19:52,205][36999] Updated weights for policy 0, policy_version 21380 (0.0041) [2024-07-02 13:19:55,022][36999] Updated weights for policy 0, policy_version 21390 (0.0039) [2024-07-02 13:19:56,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43690.7, 300 sec: 44154.2). Total num frames: 350453760. Throughput: 0: 44188.4. Samples: 350633080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 13:19:56,096][36761] Avg episode reward: [(0, '0.784')] [2024-07-02 13:19:59,591][36999] Updated weights for policy 0, policy_version 21400 (0.0026) [2024-07-02 13:20:01,095][36761] Fps is (10 sec: 42617.8, 60 sec: 44240.2, 300 sec: 44153.5). Total num frames: 350683136. Throughput: 0: 44252.5. Samples: 350764920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 13:20:01,096][36761] Avg episode reward: [(0, '0.823')] [2024-07-02 13:20:02,741][36999] Updated weights for policy 0, policy_version 21410 (0.0042) [2024-07-02 13:20:06,095][36761] Fps is (10 sec: 45875.0, 60 sec: 43963.8, 300 sec: 44264.6). Total num frames: 350912512. Throughput: 0: 44101.8. Samples: 351024420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-07-02 13:20:06,096][36761] Avg episode reward: [(0, '0.825')] [2024-07-02 13:20:06,914][36999] Updated weights for policy 0, policy_version 21420 (0.0029) [2024-07-02 13:20:10,071][36999] Updated weights for policy 0, policy_version 21430 (0.0036) [2024-07-02 13:20:11,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43963.7, 300 sec: 44209.0). Total num frames: 351125504. Throughput: 0: 44137.1. Samples: 351292760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-07-02 13:20:11,096][36761] Avg episode reward: [(0, '0.826')] [2024-07-02 13:20:14,284][36999] Updated weights for policy 0, policy_version 21440 (0.0037) [2024-07-02 13:20:16,095][36761] Fps is (10 sec: 42598.0, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 351338496. Throughput: 0: 44151.0. Samples: 351428000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-07-02 13:20:16,096][36761] Avg episode reward: [(0, '0.826')] [2024-07-02 13:20:17,500][36999] Updated weights for policy 0, policy_version 21450 (0.0038) [2024-07-02 13:20:21,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43691.1, 300 sec: 44153.5). Total num frames: 351567872. Throughput: 0: 44094.2. Samples: 351688280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 13:20:21,096][36761] Avg episode reward: [(0, '0.848')] [2024-07-02 13:20:21,525][36999] Updated weights for policy 0, policy_version 21460 (0.0030) [2024-07-02 13:20:24,853][36999] Updated weights for policy 0, policy_version 21470 (0.0030) [2024-07-02 13:20:26,095][36761] Fps is (10 sec: 47514.6, 60 sec: 44236.9, 300 sec: 44264.6). Total num frames: 351813632. Throughput: 0: 44164.9. Samples: 351958840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 13:20:26,096][36761] Avg episode reward: [(0, '0.854')] [2024-07-02 13:20:26,201][36979] Saving new best policy, reward=0.854! [2024-07-02 13:20:28,799][36999] Updated weights for policy 0, policy_version 21480 (0.0026) [2024-07-02 13:20:31,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 352010240. Throughput: 0: 44135.6. Samples: 352090840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 13:20:31,096][36761] Avg episode reward: [(0, '0.840')] [2024-07-02 13:20:32,599][36999] Updated weights for policy 0, policy_version 21490 (0.0023) [2024-07-02 13:20:33,548][36979] Signal inference workers to stop experience collection... (5150 times) [2024-07-02 13:20:33,549][36979] Signal inference workers to resume experience collection... (5150 times) [2024-07-02 13:20:33,596][36999] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-07-02 13:20:33,596][36999] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-07-02 13:20:36,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43963.7, 300 sec: 44264.6). Total num frames: 352239616. Throughput: 0: 44011.5. Samples: 352345760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-07-02 13:20:36,096][36761] Avg episode reward: [(0, '0.843')] [2024-07-02 13:20:36,107][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000021499_352239616.pth... [2024-07-02 13:20:36,169][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000020851_341622784.pth [2024-07-02 13:20:36,764][36999] Updated weights for policy 0, policy_version 21500 (0.0032) [2024-07-02 13:20:40,002][36999] Updated weights for policy 0, policy_version 21510 (0.0035) [2024-07-02 13:20:41,096][36761] Fps is (10 sec: 45870.6, 60 sec: 44236.0, 300 sec: 44208.9). Total num frames: 352468992. Throughput: 0: 43883.9. Samples: 352607900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-07-02 13:20:41,097][36761] Avg episode reward: [(0, '0.846')] [2024-07-02 13:20:44,098][36999] Updated weights for policy 0, policy_version 21520 (0.0031) [2024-07-02 13:20:46,095][36761] Fps is (10 sec: 42598.9, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 352665600. Throughput: 0: 43933.8. Samples: 352741940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-07-02 13:20:46,096][36761] Avg episode reward: [(0, '0.832')] [2024-07-02 13:20:47,327][36999] Updated weights for policy 0, policy_version 21530 (0.0039) [2024-07-02 13:20:51,095][36761] Fps is (10 sec: 42602.7, 60 sec: 43967.1, 300 sec: 44209.0). Total num frames: 352894976. Throughput: 0: 44061.0. Samples: 353007160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 13:20:51,096][36761] Avg episode reward: [(0, '0.837')] [2024-07-02 13:20:51,419][36999] Updated weights for policy 0, policy_version 21540 (0.0030) [2024-07-02 13:20:54,654][36999] Updated weights for policy 0, policy_version 21550 (0.0034) [2024-07-02 13:20:56,095][36761] Fps is (10 sec: 45874.4, 60 sec: 44509.8, 300 sec: 44153.5). Total num frames: 353124352. Throughput: 0: 43868.8. Samples: 353266860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 13:20:56,096][36761] Avg episode reward: [(0, '0.841')] [2024-07-02 13:20:58,735][36999] Updated weights for policy 0, policy_version 21560 (0.0035) [2024-07-02 13:21:01,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43963.8, 300 sec: 44153.5). Total num frames: 353320960. Throughput: 0: 43956.6. Samples: 353406040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 13:21:01,096][36761] Avg episode reward: [(0, '0.843')] [2024-07-02 13:21:02,061][36999] Updated weights for policy 0, policy_version 21570 (0.0039) [2024-07-02 13:21:06,095][36761] Fps is (10 sec: 40960.7, 60 sec: 43690.8, 300 sec: 44153.6). Total num frames: 353533952. Throughput: 0: 43966.3. Samples: 353666760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 13:21:06,095][36761] Avg episode reward: [(0, '0.812')] [2024-07-02 13:21:06,241][36999] Updated weights for policy 0, policy_version 21580 (0.0028) [2024-07-02 13:21:09,566][36999] Updated weights for policy 0, policy_version 21590 (0.0029) [2024-07-02 13:21:11,095][36761] Fps is (10 sec: 47513.4, 60 sec: 44509.9, 300 sec: 44209.0). Total num frames: 353796096. Throughput: 0: 43785.7. Samples: 353929200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 13:21:11,096][36761] Avg episode reward: [(0, '0.812')] [2024-07-02 13:21:13,715][36999] Updated weights for policy 0, policy_version 21600 (0.0027) [2024-07-02 13:21:16,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.9, 300 sec: 44098.0). Total num frames: 353976320. Throughput: 0: 43950.7. Samples: 354068620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 13:21:16,095][36761] Avg episode reward: [(0, '0.829')] [2024-07-02 13:21:17,304][36999] Updated weights for policy 0, policy_version 21610 (0.0032) [2024-07-02 13:21:21,096][36761] Fps is (10 sec: 40955.7, 60 sec: 43963.0, 300 sec: 44208.9). Total num frames: 354205696. Throughput: 0: 44161.7. Samples: 354333080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 13:21:21,097][36761] Avg episode reward: [(0, '0.826')] [2024-07-02 13:21:21,488][36999] Updated weights for policy 0, policy_version 21620 (0.0035) [2024-07-02 13:21:24,516][36999] Updated weights for policy 0, policy_version 21630 (0.0035) [2024-07-02 13:21:26,095][36761] Fps is (10 sec: 49151.3, 60 sec: 44236.7, 300 sec: 44153.5). Total num frames: 354467840. Throughput: 0: 44127.6. Samples: 354593600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 13:21:26,096][36761] Avg episode reward: [(0, '0.816')] [2024-07-02 13:21:28,825][36999] Updated weights for policy 0, policy_version 21640 (0.0034) [2024-07-02 13:21:31,095][36761] Fps is (10 sec: 44241.2, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 354648064. Throughput: 0: 44363.9. Samples: 354738320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 13:21:31,096][36761] Avg episode reward: [(0, '0.829')] [2024-07-02 13:21:31,975][36999] Updated weights for policy 0, policy_version 21650 (0.0032) [2024-07-02 13:21:32,871][36979] Signal inference workers to stop experience collection... (5200 times) [2024-07-02 13:21:32,872][36979] Signal inference workers to resume experience collection... (5200 times) [2024-07-02 13:21:32,911][36999] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-07-02 13:21:32,911][36999] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-07-02 13:21:36,036][36999] Updated weights for policy 0, policy_version 21660 (0.0030) [2024-07-02 13:21:36,095][36761] Fps is (10 sec: 40960.5, 60 sec: 43963.8, 300 sec: 44264.6). Total num frames: 354877440. Throughput: 0: 44233.8. Samples: 354997680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-07-02 13:21:36,096][36761] Avg episode reward: [(0, '0.824')] [2024-07-02 13:21:39,285][36999] Updated weights for policy 0, policy_version 21670 (0.0042) [2024-07-02 13:21:41,095][36761] Fps is (10 sec: 49152.4, 60 sec: 44510.6, 300 sec: 44209.4). Total num frames: 355139584. Throughput: 0: 44325.0. Samples: 355261480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-07-02 13:21:41,096][36761] Avg episode reward: [(0, '0.850')] [2024-07-02 13:21:43,343][36999] Updated weights for policy 0, policy_version 21680 (0.0030) [2024-07-02 13:21:46,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 355303424. Throughput: 0: 44458.1. Samples: 355406660. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-07-02 13:21:46,096][36761] Avg episode reward: [(0, '0.841')] [2024-07-02 13:21:46,711][36999] Updated weights for policy 0, policy_version 21690 (0.0035) [2024-07-02 13:21:50,689][36999] Updated weights for policy 0, policy_version 21700 (0.0027) [2024-07-02 13:21:51,095][36761] Fps is (10 sec: 39321.0, 60 sec: 43963.6, 300 sec: 44209.0). Total num frames: 355532800. Throughput: 0: 44319.8. Samples: 355661160. Policy #0 lag: (min: 2.0, avg: 12.8, max: 21.0) [2024-07-02 13:21:51,096][36761] Avg episode reward: [(0, '0.850')] [2024-07-02 13:21:54,231][36999] Updated weights for policy 0, policy_version 21710 (0.0032) [2024-07-02 13:21:56,095][36761] Fps is (10 sec: 49152.2, 60 sec: 44509.9, 300 sec: 44209.0). Total num frames: 355794944. Throughput: 0: 44208.0. Samples: 355918560. Policy #0 lag: (min: 2.0, avg: 12.8, max: 21.0) [2024-07-02 13:21:56,096][36761] Avg episode reward: [(0, '0.860')] [2024-07-02 13:21:56,116][36979] Saving new best policy, reward=0.860! [2024-07-02 13:21:58,009][36999] Updated weights for policy 0, policy_version 21720 (0.0037) [2024-07-02 13:22:01,095][36761] Fps is (10 sec: 44237.0, 60 sec: 44236.7, 300 sec: 44153.5). Total num frames: 355975168. Throughput: 0: 44412.3. Samples: 356067180. Policy #0 lag: (min: 2.0, avg: 12.8, max: 21.0) [2024-07-02 13:22:01,096][36761] Avg episode reward: [(0, '0.870')] [2024-07-02 13:22:01,211][36979] Saving new best policy, reward=0.870! [2024-07-02 13:22:01,592][36999] Updated weights for policy 0, policy_version 21730 (0.0027) [2024-07-02 13:22:05,414][36999] Updated weights for policy 0, policy_version 21740 (0.0043) [2024-07-02 13:22:06,095][36761] Fps is (10 sec: 39321.3, 60 sec: 44236.7, 300 sec: 44153.5). Total num frames: 356188160. Throughput: 0: 44215.2. Samples: 356322720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 13:22:06,098][36761] Avg episode reward: [(0, '0.833')] [2024-07-02 13:22:08,910][36999] Updated weights for policy 0, policy_version 21750 (0.0039) [2024-07-02 13:22:11,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 356433920. Throughput: 0: 44257.0. Samples: 356585160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 13:22:11,096][36761] Avg episode reward: [(0, '0.828')] [2024-07-02 13:22:12,715][36999] Updated weights for policy 0, policy_version 21760 (0.0029) [2024-07-02 13:22:16,095][36761] Fps is (10 sec: 45876.0, 60 sec: 44509.9, 300 sec: 44209.0). Total num frames: 356646912. Throughput: 0: 44269.9. Samples: 356730460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 13:22:16,096][36761] Avg episode reward: [(0, '0.866')] [2024-07-02 13:22:16,288][36999] Updated weights for policy 0, policy_version 21770 (0.0030) [2024-07-02 13:22:20,014][36999] Updated weights for policy 0, policy_version 21780 (0.0028) [2024-07-02 13:22:21,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43964.5, 300 sec: 44153.5). Total num frames: 356843520. Throughput: 0: 44250.2. Samples: 356988940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 13:22:21,096][36761] Avg episode reward: [(0, '0.875')] [2024-07-02 13:22:21,131][36979] Saving new best policy, reward=0.875! [2024-07-02 13:22:23,642][36999] Updated weights for policy 0, policy_version 21790 (0.0046) [2024-07-02 13:22:26,095][36761] Fps is (10 sec: 45875.0, 60 sec: 43963.8, 300 sec: 44154.2). Total num frames: 357105664. Throughput: 0: 44230.2. Samples: 357251840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 13:22:26,096][36761] Avg episode reward: [(0, '0.869')] [2024-07-02 13:22:27,804][36999] Updated weights for policy 0, policy_version 21800 (0.0026) [2024-07-02 13:22:31,095][36761] Fps is (10 sec: 47513.1, 60 sec: 44509.9, 300 sec: 44264.6). Total num frames: 357318656. Throughput: 0: 44185.8. Samples: 357395020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 13:22:31,096][36761] Avg episode reward: [(0, '0.865')] [2024-07-02 13:22:31,185][36999] Updated weights for policy 0, policy_version 21810 (0.0030) [2024-07-02 13:22:35,215][36999] Updated weights for policy 0, policy_version 21820 (0.0026) [2024-07-02 13:22:36,095][36761] Fps is (10 sec: 40959.5, 60 sec: 43963.7, 300 sec: 44097.9). Total num frames: 357515264. Throughput: 0: 44295.6. Samples: 357654460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 13:22:36,096][36761] Avg episode reward: [(0, '0.865')] [2024-07-02 13:22:36,103][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000021821_357515264.pth... [2024-07-02 13:22:36,160][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000021176_346947584.pth [2024-07-02 13:22:38,454][36979] Signal inference workers to stop experience collection... (5250 times) [2024-07-02 13:22:38,510][36999] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-07-02 13:22:38,512][36979] Signal inference workers to resume experience collection... (5250 times) [2024-07-02 13:22:38,524][36999] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-07-02 13:22:38,527][36999] Updated weights for policy 0, policy_version 21830 (0.0022) [2024-07-02 13:22:41,100][36761] Fps is (10 sec: 44216.9, 60 sec: 43687.3, 300 sec: 44208.4). Total num frames: 357761024. Throughput: 0: 44380.4. Samples: 357915880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 13:22:41,100][36761] Avg episode reward: [(0, '0.865')] [2024-07-02 13:22:42,678][36999] Updated weights for policy 0, policy_version 21840 (0.0030) [2024-07-02 13:22:45,959][36999] Updated weights for policy 0, policy_version 21850 (0.0020) [2024-07-02 13:22:46,095][36761] Fps is (10 sec: 49152.4, 60 sec: 45056.1, 300 sec: 44320.8). Total num frames: 358006784. Throughput: 0: 44281.4. Samples: 358059840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 13:22:46,096][36761] Avg episode reward: [(0, '0.868')] [2024-07-02 13:22:49,979][36999] Updated weights for policy 0, policy_version 21860 (0.0026) [2024-07-02 13:22:51,095][36761] Fps is (10 sec: 40979.0, 60 sec: 43963.9, 300 sec: 44098.0). Total num frames: 358170624. Throughput: 0: 44452.2. Samples: 358323060. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-07-02 13:22:51,096][36761] Avg episode reward: [(0, '0.867')] [2024-07-02 13:22:53,407][36999] Updated weights for policy 0, policy_version 21870 (0.0028) [2024-07-02 13:22:56,095][36761] Fps is (10 sec: 40959.7, 60 sec: 43690.6, 300 sec: 44153.5). Total num frames: 358416384. Throughput: 0: 44233.3. Samples: 358575660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-07-02 13:22:56,096][36761] Avg episode reward: [(0, '0.878')] [2024-07-02 13:22:56,108][36979] Saving new best policy, reward=0.878! [2024-07-02 13:22:57,340][36999] Updated weights for policy 0, policy_version 21880 (0.0034) [2024-07-02 13:23:00,831][36999] Updated weights for policy 0, policy_version 21890 (0.0037) [2024-07-02 13:23:01,095][36761] Fps is (10 sec: 47513.3, 60 sec: 44509.9, 300 sec: 44264.6). Total num frames: 358645760. Throughput: 0: 44090.6. Samples: 358714540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-07-02 13:23:01,096][36761] Avg episode reward: [(0, '0.880')] [2024-07-02 13:23:01,096][36979] Saving new best policy, reward=0.880! [2024-07-02 13:23:04,851][36999] Updated weights for policy 0, policy_version 21900 (0.0030) [2024-07-02 13:23:06,096][36761] Fps is (10 sec: 44234.6, 60 sec: 44509.5, 300 sec: 44153.4). Total num frames: 358858752. Throughput: 0: 44203.8. Samples: 358978140. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-07-02 13:23:06,096][36761] Avg episode reward: [(0, '0.846')] [2024-07-02 13:23:08,178][36999] Updated weights for policy 0, policy_version 21910 (0.0042) [2024-07-02 13:23:11,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 359071744. Throughput: 0: 44189.3. Samples: 359240360. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-07-02 13:23:11,096][36761] Avg episode reward: [(0, '0.848')] [2024-07-02 13:23:12,339][36999] Updated weights for policy 0, policy_version 21920 (0.0035) [2024-07-02 13:23:15,595][36999] Updated weights for policy 0, policy_version 21930 (0.0045) [2024-07-02 13:23:16,095][36761] Fps is (10 sec: 45877.3, 60 sec: 44509.7, 300 sec: 44264.5). Total num frames: 359317504. Throughput: 0: 44081.7. Samples: 359378700. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-07-02 13:23:16,096][36761] Avg episode reward: [(0, '0.837')] [2024-07-02 13:23:19,765][36999] Updated weights for policy 0, policy_version 21940 (0.0033) [2024-07-02 13:23:21,095][36761] Fps is (10 sec: 42598.8, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 359497728. Throughput: 0: 44103.7. Samples: 359639120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-07-02 13:23:21,096][36761] Avg episode reward: [(0, '0.848')] [2024-07-02 13:23:22,970][36999] Updated weights for policy 0, policy_version 21950 (0.0028) [2024-07-02 13:23:26,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43690.6, 300 sec: 44153.5). Total num frames: 359727104. Throughput: 0: 44074.6. Samples: 359899040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-07-02 13:23:26,096][36761] Avg episode reward: [(0, '0.846')] [2024-07-02 13:23:27,267][36999] Updated weights for policy 0, policy_version 21960 (0.0042) [2024-07-02 13:23:30,422][36999] Updated weights for policy 0, policy_version 21970 (0.0042) [2024-07-02 13:23:31,095][36761] Fps is (10 sec: 45874.8, 60 sec: 43963.8, 300 sec: 44098.6). Total num frames: 359956480. Throughput: 0: 43837.3. Samples: 360032520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-07-02 13:23:31,096][36761] Avg episode reward: [(0, '0.873')] [2024-07-02 13:23:34,692][36999] Updated weights for policy 0, policy_version 21980 (0.0033) [2024-07-02 13:23:36,095][36761] Fps is (10 sec: 44236.3, 60 sec: 44236.7, 300 sec: 44153.5). Total num frames: 360169472. Throughput: 0: 43801.6. Samples: 360294140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 13:23:36,096][36761] Avg episode reward: [(0, '0.866')] [2024-07-02 13:23:37,998][36999] Updated weights for policy 0, policy_version 21990 (0.0029) [2024-07-02 13:23:38,807][36979] Signal inference workers to stop experience collection... (5300 times) [2024-07-02 13:23:38,852][36999] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-07-02 13:23:38,860][36979] Signal inference workers to resume experience collection... (5300 times) [2024-07-02 13:23:38,872][36999] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-07-02 13:23:41,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43967.1, 300 sec: 44209.0). Total num frames: 360398848. Throughput: 0: 44065.8. Samples: 360558620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 13:23:41,096][36761] Avg episode reward: [(0, '0.870')] [2024-07-02 13:23:42,169][36999] Updated weights for policy 0, policy_version 22000 (0.0022) [2024-07-02 13:23:45,680][36999] Updated weights for policy 0, policy_version 22010 (0.0035) [2024-07-02 13:23:46,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43690.6, 300 sec: 44153.5). Total num frames: 360628224. Throughput: 0: 43891.9. Samples: 360689680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 13:23:46,096][36761] Avg episode reward: [(0, '0.885')] [2024-07-02 13:23:46,112][36979] Saving new best policy, reward=0.885! [2024-07-02 13:23:49,653][36999] Updated weights for policy 0, policy_version 22020 (0.0041) [2024-07-02 13:23:51,096][36761] Fps is (10 sec: 44234.4, 60 sec: 44509.4, 300 sec: 44097.9). Total num frames: 360841216. Throughput: 0: 43948.5. Samples: 360955820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 13:23:51,097][36761] Avg episode reward: [(0, '0.877')] [2024-07-02 13:23:53,061][36999] Updated weights for policy 0, policy_version 22030 (0.0026) [2024-07-02 13:23:56,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43963.8, 300 sec: 44154.2). Total num frames: 361054208. Throughput: 0: 43933.8. Samples: 361217380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 13:23:56,095][36761] Avg episode reward: [(0, '0.882')] [2024-07-02 13:23:57,068][36999] Updated weights for policy 0, policy_version 22040 (0.0033) [2024-07-02 13:24:00,468][36999] Updated weights for policy 0, policy_version 22050 (0.0026) [2024-07-02 13:24:01,095][36761] Fps is (10 sec: 44238.9, 60 sec: 43963.7, 300 sec: 44098.0). Total num frames: 361283584. Throughput: 0: 43843.6. Samples: 361351660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 13:24:01,096][36761] Avg episode reward: [(0, '0.882')] [2024-07-02 13:24:04,414][36999] Updated weights for policy 0, policy_version 22060 (0.0028) [2024-07-02 13:24:06,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43691.0, 300 sec: 44042.4). Total num frames: 361480192. Throughput: 0: 44047.4. Samples: 361621260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:24:06,096][36761] Avg episode reward: [(0, '0.878')] [2024-07-02 13:24:07,884][36999] Updated weights for policy 0, policy_version 22070 (0.0045) [2024-07-02 13:24:11,100][36761] Fps is (10 sec: 42578.6, 60 sec: 43960.3, 300 sec: 44152.8). Total num frames: 361709568. Throughput: 0: 44036.7. Samples: 361880900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:24:11,101][36761] Avg episode reward: [(0, '0.890')] [2024-07-02 13:24:11,101][36979] Saving new best policy, reward=0.890! [2024-07-02 13:24:12,375][36999] Updated weights for policy 0, policy_version 22080 (0.0036) [2024-07-02 13:24:15,237][36999] Updated weights for policy 0, policy_version 22090 (0.0031) [2024-07-02 13:24:16,095][36761] Fps is (10 sec: 47514.3, 60 sec: 43963.8, 300 sec: 44098.0). Total num frames: 361955328. Throughput: 0: 44120.5. Samples: 362017940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:24:16,096][36761] Avg episode reward: [(0, '0.883')] [2024-07-02 13:24:19,674][36999] Updated weights for policy 0, policy_version 22100 (0.0037) [2024-07-02 13:24:21,095][36761] Fps is (10 sec: 42618.4, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 362135552. Throughput: 0: 44079.2. Samples: 362277700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-07-02 13:24:21,096][36761] Avg episode reward: [(0, '0.900')] [2024-07-02 13:24:21,245][36979] Saving new best policy, reward=0.900! [2024-07-02 13:24:22,630][36999] Updated weights for policy 0, policy_version 22110 (0.0040) [2024-07-02 13:24:26,098][36761] Fps is (10 sec: 40947.6, 60 sec: 43961.6, 300 sec: 44097.5). Total num frames: 362364928. Throughput: 0: 44031.3. Samples: 362540160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-07-02 13:24:26,099][36761] Avg episode reward: [(0, '0.875')] [2024-07-02 13:24:27,172][36999] Updated weights for policy 0, policy_version 22120 (0.0030) [2024-07-02 13:24:30,041][36999] Updated weights for policy 0, policy_version 22130 (0.0033) [2024-07-02 13:24:31,095][36761] Fps is (10 sec: 45875.4, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 362594304. Throughput: 0: 44001.4. Samples: 362669740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-07-02 13:24:31,096][36761] Avg episode reward: [(0, '0.875')] [2024-07-02 13:24:34,611][36999] Updated weights for policy 0, policy_version 22140 (0.0029) [2024-07-02 13:24:36,095][36761] Fps is (10 sec: 45888.4, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 362823680. Throughput: 0: 44062.2. Samples: 362938600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 13:24:36,096][36761] Avg episode reward: [(0, '0.868')] [2024-07-02 13:24:36,109][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000022145_362823680.pth... [2024-07-02 13:24:36,158][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000021499_352239616.pth [2024-07-02 13:24:37,547][36999] Updated weights for policy 0, policy_version 22150 (0.0045) [2024-07-02 13:24:41,100][36761] Fps is (10 sec: 44216.8, 60 sec: 43960.4, 300 sec: 44152.8). Total num frames: 363036672. Throughput: 0: 44075.5. Samples: 363200980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 13:24:41,100][36761] Avg episode reward: [(0, '0.868')] [2024-07-02 13:24:41,980][36999] Updated weights for policy 0, policy_version 22160 (0.0026) [2024-07-02 13:24:44,913][36999] Updated weights for policy 0, policy_version 22170 (0.0024) [2024-07-02 13:24:46,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43963.7, 300 sec: 44098.6). Total num frames: 363266048. Throughput: 0: 44043.6. Samples: 363333620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 13:24:46,096][36761] Avg episode reward: [(0, '0.882')] [2024-07-02 13:24:49,660][36999] Updated weights for policy 0, policy_version 22180 (0.0041) [2024-07-02 13:24:51,095][36761] Fps is (10 sec: 44256.8, 60 sec: 43964.1, 300 sec: 44153.5). Total num frames: 363479040. Throughput: 0: 44073.0. Samples: 363604540. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-07-02 13:24:51,096][36761] Avg episode reward: [(0, '0.869')] [2024-07-02 13:24:52,481][36999] Updated weights for policy 0, policy_version 22190 (0.0029) [2024-07-02 13:24:56,100][36761] Fps is (10 sec: 42579.0, 60 sec: 43960.3, 300 sec: 44097.3). Total num frames: 363692032. Throughput: 0: 43984.1. Samples: 363860180. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-07-02 13:24:56,101][36761] Avg episode reward: [(0, '0.883')] [2024-07-02 13:24:56,976][36999] Updated weights for policy 0, policy_version 22200 (0.0036) [2024-07-02 13:24:59,981][36999] Updated weights for policy 0, policy_version 22210 (0.0035) [2024-07-02 13:25:01,095][36761] Fps is (10 sec: 45875.3, 60 sec: 44236.9, 300 sec: 44153.5). Total num frames: 363937792. Throughput: 0: 43927.1. Samples: 363994660. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-07-02 13:25:01,096][36761] Avg episode reward: [(0, '0.865')] [2024-07-02 13:25:04,305][36999] Updated weights for policy 0, policy_version 22220 (0.0040) [2024-07-02 13:25:06,095][36761] Fps is (10 sec: 45895.9, 60 sec: 44509.9, 300 sec: 44153.5). Total num frames: 364150784. Throughput: 0: 44144.4. Samples: 364264200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-07-02 13:25:06,096][36761] Avg episode reward: [(0, '0.880')] [2024-07-02 13:25:07,353][36999] Updated weights for policy 0, policy_version 22230 (0.0046) [2024-07-02 13:25:11,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43967.2, 300 sec: 44098.0). Total num frames: 364347392. Throughput: 0: 44179.0. Samples: 364528080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-07-02 13:25:11,096][36761] Avg episode reward: [(0, '0.867')] [2024-07-02 13:25:11,601][36999] Updated weights for policy 0, policy_version 22240 (0.0027) [2024-07-02 13:25:12,450][36979] Signal inference workers to stop experience collection... (5350 times) [2024-07-02 13:25:12,450][36979] Signal inference workers to resume experience collection... (5350 times) [2024-07-02 13:25:12,485][36999] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-07-02 13:25:12,485][36999] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-07-02 13:25:14,832][36999] Updated weights for policy 0, policy_version 22250 (0.0027) [2024-07-02 13:25:16,096][36761] Fps is (10 sec: 44234.8, 60 sec: 43963.3, 300 sec: 44153.4). Total num frames: 364593152. Throughput: 0: 44221.2. Samples: 364659720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-07-02 13:25:16,096][36761] Avg episode reward: [(0, '0.890')] [2024-07-02 13:25:18,906][36999] Updated weights for policy 0, policy_version 22260 (0.0026) [2024-07-02 13:25:21,095][36761] Fps is (10 sec: 45875.2, 60 sec: 44509.9, 300 sec: 44042.4). Total num frames: 364806144. Throughput: 0: 44178.8. Samples: 364926640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-07-02 13:25:21,096][36761] Avg episode reward: [(0, '0.894')] [2024-07-02 13:25:22,141][36999] Updated weights for policy 0, policy_version 22270 (0.0033) [2024-07-02 13:25:26,095][36761] Fps is (10 sec: 40961.9, 60 sec: 43965.9, 300 sec: 44042.4). Total num frames: 365002752. Throughput: 0: 44154.1. Samples: 365187720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-07-02 13:25:26,096][36761] Avg episode reward: [(0, '0.893')] [2024-07-02 13:25:26,470][36999] Updated weights for policy 0, policy_version 22280 (0.0030) [2024-07-02 13:25:29,621][36999] Updated weights for policy 0, policy_version 22290 (0.0025) [2024-07-02 13:25:31,095][36761] Fps is (10 sec: 44235.9, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 365248512. Throughput: 0: 44017.3. Samples: 365314400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-07-02 13:25:31,096][36761] Avg episode reward: [(0, '0.904')] [2024-07-02 13:25:31,096][36979] Saving new best policy, reward=0.904! [2024-07-02 13:25:33,910][36999] Updated weights for policy 0, policy_version 22300 (0.0030) [2024-07-02 13:25:36,095][36761] Fps is (10 sec: 45875.9, 60 sec: 43963.9, 300 sec: 44042.6). Total num frames: 365461504. Throughput: 0: 43957.4. Samples: 365582620. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-07-02 13:25:36,095][36761] Avg episode reward: [(0, '0.904')] [2024-07-02 13:25:37,098][36999] Updated weights for policy 0, policy_version 22310 (0.0033) [2024-07-02 13:25:41,100][36761] Fps is (10 sec: 40942.0, 60 sec: 43690.7, 300 sec: 44041.7). Total num frames: 365658112. Throughput: 0: 44057.0. Samples: 365842740. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-07-02 13:25:41,101][36761] Avg episode reward: [(0, '0.877')] [2024-07-02 13:25:41,712][36999] Updated weights for policy 0, policy_version 22320 (0.0024) [2024-07-02 13:25:44,655][36999] Updated weights for policy 0, policy_version 22330 (0.0031) [2024-07-02 13:25:46,095][36761] Fps is (10 sec: 44235.9, 60 sec: 43963.7, 300 sec: 44097.9). Total num frames: 365903872. Throughput: 0: 43963.8. Samples: 365973040. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-07-02 13:25:46,096][36761] Avg episode reward: [(0, '0.887')] [2024-07-02 13:25:49,067][36999] Updated weights for policy 0, policy_version 22340 (0.0041) [2024-07-02 13:25:51,095][36761] Fps is (10 sec: 47535.3, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 366133248. Throughput: 0: 43859.7. Samples: 366237880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-07-02 13:25:51,096][36761] Avg episode reward: [(0, '0.879')] [2024-07-02 13:25:52,192][36999] Updated weights for policy 0, policy_version 22350 (0.0036) [2024-07-02 13:25:56,100][36761] Fps is (10 sec: 40941.7, 60 sec: 43690.7, 300 sec: 44041.7). Total num frames: 366313472. Throughput: 0: 43897.2. Samples: 366503660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-07-02 13:25:56,101][36761] Avg episode reward: [(0, '0.855')] [2024-07-02 13:25:56,450][36999] Updated weights for policy 0, policy_version 22360 (0.0044) [2024-07-02 13:25:59,784][36999] Updated weights for policy 0, policy_version 22370 (0.0033) [2024-07-02 13:26:01,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 44153.5). Total num frames: 366559232. Throughput: 0: 43928.5. Samples: 366636480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-07-02 13:26:01,096][36761] Avg episode reward: [(0, '0.864')] [2024-07-02 13:26:03,856][36999] Updated weights for policy 0, policy_version 22380 (0.0026) [2024-07-02 13:26:06,095][36761] Fps is (10 sec: 45896.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 366772224. Throughput: 0: 43774.1. Samples: 366896480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 13:26:06,098][36761] Avg episode reward: [(0, '0.863')] [2024-07-02 13:26:07,288][36999] Updated weights for policy 0, policy_version 22390 (0.0036) [2024-07-02 13:26:11,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43963.7, 300 sec: 44097.9). Total num frames: 366985216. Throughput: 0: 43920.1. Samples: 367164120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 13:26:11,096][36761] Avg episode reward: [(0, '0.887')] [2024-07-02 13:26:11,124][36999] Updated weights for policy 0, policy_version 22400 (0.0033) [2024-07-02 13:26:14,619][36999] Updated weights for policy 0, policy_version 22410 (0.0035) [2024-07-02 13:26:16,095][36761] Fps is (10 sec: 44237.4, 60 sec: 43691.1, 300 sec: 44098.1). Total num frames: 367214592. Throughput: 0: 43854.4. Samples: 367287840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 13:26:16,096][36761] Avg episode reward: [(0, '0.890')] [2024-07-02 13:26:18,571][36999] Updated weights for policy 0, policy_version 22420 (0.0026) [2024-07-02 13:26:21,095][36761] Fps is (10 sec: 45875.0, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 367443968. Throughput: 0: 43781.7. Samples: 367552800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 13:26:21,096][36761] Avg episode reward: [(0, '0.871')] [2024-07-02 13:26:21,982][36999] Updated weights for policy 0, policy_version 22430 (0.0037) [2024-07-02 13:26:25,908][36999] Updated weights for policy 0, policy_version 22440 (0.0038) [2024-07-02 13:26:26,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 367656960. Throughput: 0: 44007.5. Samples: 367822880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 13:26:26,099][36761] Avg episode reward: [(0, '0.874')] [2024-07-02 13:26:26,360][36979] Signal inference workers to stop experience collection... (5400 times) [2024-07-02 13:26:26,409][36999] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-07-02 13:26:26,485][36979] Signal inference workers to resume experience collection... (5400 times) [2024-07-02 13:26:26,486][36999] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-07-02 13:26:29,348][36999] Updated weights for policy 0, policy_version 22450 (0.0029) [2024-07-02 13:26:31,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43690.8, 300 sec: 44042.4). Total num frames: 367869952. Throughput: 0: 43887.8. Samples: 367947980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 13:26:31,095][36761] Avg episode reward: [(0, '0.870')] [2024-07-02 13:26:33,348][36999] Updated weights for policy 0, policy_version 22460 (0.0027) [2024-07-02 13:26:36,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 368099328. Throughput: 0: 43840.9. Samples: 368210720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 13:26:36,096][36761] Avg episode reward: [(0, '0.910')] [2024-07-02 13:26:36,147][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000022468_368115712.pth... [2024-07-02 13:26:36,199][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000021821_357515264.pth [2024-07-02 13:26:36,208][36979] Saving new best policy, reward=0.910! [2024-07-02 13:26:36,981][36999] Updated weights for policy 0, policy_version 22470 (0.0028) [2024-07-02 13:26:41,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43967.1, 300 sec: 44042.4). Total num frames: 368295936. Throughput: 0: 43841.9. Samples: 368476340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:26:41,096][36761] Avg episode reward: [(0, '0.905')] [2024-07-02 13:26:41,151][36999] Updated weights for policy 0, policy_version 22480 (0.0027) [2024-07-02 13:26:44,298][36999] Updated weights for policy 0, policy_version 22490 (0.0021) [2024-07-02 13:26:46,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43963.8, 300 sec: 44098.0). Total num frames: 368541696. Throughput: 0: 43665.3. Samples: 368601420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:26:46,096][36761] Avg episode reward: [(0, '0.868')] [2024-07-02 13:26:48,541][36999] Updated weights for policy 0, policy_version 22500 (0.0046) [2024-07-02 13:26:51,095][36761] Fps is (10 sec: 47513.9, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 368771072. Throughput: 0: 43783.7. Samples: 368866740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:26:51,095][36761] Avg episode reward: [(0, '0.891')] [2024-07-02 13:26:52,085][36999] Updated weights for policy 0, policy_version 22510 (0.0038) [2024-07-02 13:26:56,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43967.1, 300 sec: 43986.9). Total num frames: 368951296. Throughput: 0: 43777.7. Samples: 369134120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 13:26:56,096][36761] Avg episode reward: [(0, '0.909')] [2024-07-02 13:26:56,247][36999] Updated weights for policy 0, policy_version 22520 (0.0023) [2024-07-02 13:26:59,382][36999] Updated weights for policy 0, policy_version 22530 (0.0034) [2024-07-02 13:27:01,096][36761] Fps is (10 sec: 42595.4, 60 sec: 43963.3, 300 sec: 44097.9). Total num frames: 369197056. Throughput: 0: 43800.3. Samples: 369258880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 13:27:01,096][36761] Avg episode reward: [(0, '0.903')] [2024-07-02 13:27:03,618][36999] Updated weights for policy 0, policy_version 22540 (0.0032) [2024-07-02 13:27:06,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 369426432. Throughput: 0: 43847.1. Samples: 369525920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 13:27:06,096][36761] Avg episode reward: [(0, '0.870')] [2024-07-02 13:27:06,763][36999] Updated weights for policy 0, policy_version 22550 (0.0026) [2024-07-02 13:27:10,950][36999] Updated weights for policy 0, policy_version 22560 (0.0029) [2024-07-02 13:27:11,095][36761] Fps is (10 sec: 42600.9, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 369623040. Throughput: 0: 43974.7. Samples: 369801740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 13:27:11,096][36761] Avg episode reward: [(0, '0.881')] [2024-07-02 13:27:14,154][36999] Updated weights for policy 0, policy_version 22570 (0.0027) [2024-07-02 13:27:16,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43963.7, 300 sec: 44098.0). Total num frames: 369852416. Throughput: 0: 43930.2. Samples: 369924840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 13:27:16,096][36761] Avg episode reward: [(0, '0.907')] [2024-07-02 13:27:18,487][36999] Updated weights for policy 0, policy_version 22580 (0.0042) [2024-07-02 13:27:21,095][36761] Fps is (10 sec: 45874.8, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 370081792. Throughput: 0: 43839.9. Samples: 370183520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 13:27:21,096][36761] Avg episode reward: [(0, '0.887')] [2024-07-02 13:27:21,557][36999] Updated weights for policy 0, policy_version 22590 (0.0029) [2024-07-02 13:27:25,841][36999] Updated weights for policy 0, policy_version 22600 (0.0029) [2024-07-02 13:27:26,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 370278400. Throughput: 0: 43999.5. Samples: 370456320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 13:27:26,096][36761] Avg episode reward: [(0, '0.873')] [2024-07-02 13:27:29,079][36999] Updated weights for policy 0, policy_version 22610 (0.0025) [2024-07-02 13:27:31,095][36761] Fps is (10 sec: 44236.6, 60 sec: 44236.6, 300 sec: 44097.9). Total num frames: 370524160. Throughput: 0: 44111.5. Samples: 370586440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 13:27:31,096][36761] Avg episode reward: [(0, '0.877')] [2024-07-02 13:27:33,266][36999] Updated weights for policy 0, policy_version 22620 (0.0030) [2024-07-02 13:27:36,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43963.7, 300 sec: 43987.6). Total num frames: 370737152. Throughput: 0: 43959.5. Samples: 370844920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 13:27:36,096][36761] Avg episode reward: [(0, '0.908')] [2024-07-02 13:27:36,581][36999] Updated weights for policy 0, policy_version 22630 (0.0027) [2024-07-02 13:27:40,560][36999] Updated weights for policy 0, policy_version 22640 (0.0034) [2024-07-02 13:27:41,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44509.8, 300 sec: 43931.3). Total num frames: 370966528. Throughput: 0: 44031.1. Samples: 371115520. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-07-02 13:27:41,096][36761] Avg episode reward: [(0, '0.902')] [2024-07-02 13:27:44,226][36999] Updated weights for policy 0, policy_version 22650 (0.0025) [2024-07-02 13:27:46,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43690.7, 300 sec: 44042.4). Total num frames: 371163136. Throughput: 0: 44096.5. Samples: 371243200. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-07-02 13:27:46,096][36761] Avg episode reward: [(0, '0.891')] [2024-07-02 13:27:48,223][36999] Updated weights for policy 0, policy_version 22660 (0.0027) [2024-07-02 13:27:51,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 371392512. Throughput: 0: 43892.9. Samples: 371501100. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-07-02 13:27:51,096][36761] Avg episode reward: [(0, '0.885')] [2024-07-02 13:27:51,626][36999] Updated weights for policy 0, policy_version 22670 (0.0030) [2024-07-02 13:27:55,648][36999] Updated weights for policy 0, policy_version 22680 (0.0026) [2024-07-02 13:27:56,095][36761] Fps is (10 sec: 45875.3, 60 sec: 44509.9, 300 sec: 43986.9). Total num frames: 371621888. Throughput: 0: 43750.2. Samples: 371770500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 13:27:56,096][36761] Avg episode reward: [(0, '0.890')] [2024-07-02 13:27:59,299][36999] Updated weights for policy 0, policy_version 22690 (0.0035) [2024-07-02 13:28:01,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43691.1, 300 sec: 43931.4). Total num frames: 371818496. Throughput: 0: 43927.5. Samples: 371901580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 13:28:01,096][36761] Avg episode reward: [(0, '0.890')] [2024-07-02 13:28:03,121][36999] Updated weights for policy 0, policy_version 22700 (0.0031) [2024-07-02 13:28:06,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 372047872. Throughput: 0: 44024.4. Samples: 372164620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 13:28:06,096][36761] Avg episode reward: [(0, '0.902')] [2024-07-02 13:28:06,684][36999] Updated weights for policy 0, policy_version 22710 (0.0034) [2024-07-02 13:28:09,883][36979] Signal inference workers to stop experience collection... (5450 times) [2024-07-02 13:28:09,901][36999] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-07-02 13:28:09,965][36979] Signal inference workers to resume experience collection... (5450 times) [2024-07-02 13:28:09,965][36999] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-07-02 13:28:10,492][36999] Updated weights for policy 0, policy_version 22720 (0.0032) [2024-07-02 13:28:11,095][36761] Fps is (10 sec: 45875.5, 60 sec: 44236.9, 300 sec: 43931.4). Total num frames: 372277248. Throughput: 0: 43837.0. Samples: 372428980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 13:28:11,096][36761] Avg episode reward: [(0, '0.905')] [2024-07-02 13:28:13,934][36999] Updated weights for policy 0, policy_version 22730 (0.0028) [2024-07-02 13:28:16,095][36761] Fps is (10 sec: 44237.3, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 372490240. Throughput: 0: 44041.9. Samples: 372568320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 13:28:16,100][36761] Avg episode reward: [(0, '0.905')] [2024-07-02 13:28:17,836][36999] Updated weights for policy 0, policy_version 22740 (0.0030) [2024-07-02 13:28:21,095][36761] Fps is (10 sec: 44236.1, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 372719616. Throughput: 0: 44094.6. Samples: 372829180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 13:28:21,096][36761] Avg episode reward: [(0, '0.916')] [2024-07-02 13:28:21,097][36979] Saving new best policy, reward=0.916! [2024-07-02 13:28:21,303][36999] Updated weights for policy 0, policy_version 22750 (0.0035) [2024-07-02 13:28:25,251][36999] Updated weights for policy 0, policy_version 22760 (0.0029) [2024-07-02 13:28:26,095][36761] Fps is (10 sec: 44237.1, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 372932608. Throughput: 0: 44017.4. Samples: 373096300. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-07-02 13:28:26,096][36761] Avg episode reward: [(0, '0.932')] [2024-07-02 13:28:26,208][36979] Saving new best policy, reward=0.932! [2024-07-02 13:28:28,649][36999] Updated weights for policy 0, policy_version 22770 (0.0039) [2024-07-02 13:28:31,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 373145600. Throughput: 0: 44039.5. Samples: 373224980. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-07-02 13:28:31,096][36761] Avg episode reward: [(0, '0.937')] [2024-07-02 13:28:31,097][36979] Saving new best policy, reward=0.937! [2024-07-02 13:28:32,569][36999] Updated weights for policy 0, policy_version 22780 (0.0041) [2024-07-02 13:28:35,954][36999] Updated weights for policy 0, policy_version 22790 (0.0039) [2024-07-02 13:28:36,095][36761] Fps is (10 sec: 45874.7, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 373391360. Throughput: 0: 44229.4. Samples: 373491420. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-07-02 13:28:36,096][36761] Avg episode reward: [(0, '0.882')] [2024-07-02 13:28:36,102][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000022790_373391360.pth... [2024-07-02 13:28:36,156][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000022145_362823680.pth [2024-07-02 13:28:39,882][36999] Updated weights for policy 0, policy_version 22800 (0.0033) [2024-07-02 13:28:41,100][36761] Fps is (10 sec: 45854.5, 60 sec: 43960.4, 300 sec: 43986.2). Total num frames: 373604352. Throughput: 0: 44222.6. Samples: 373760720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 13:28:41,101][36761] Avg episode reward: [(0, '0.882')] [2024-07-02 13:28:43,167][36999] Updated weights for policy 0, policy_version 22810 (0.0032) [2024-07-02 13:28:46,095][36761] Fps is (10 sec: 42598.7, 60 sec: 44236.9, 300 sec: 43987.0). Total num frames: 373817344. Throughput: 0: 44296.0. Samples: 373894900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 13:28:46,096][36761] Avg episode reward: [(0, '0.914')] [2024-07-02 13:28:47,236][36999] Updated weights for policy 0, policy_version 22820 (0.0031) [2024-07-02 13:28:50,611][36999] Updated weights for policy 0, policy_version 22830 (0.0029) [2024-07-02 13:28:51,100][36761] Fps is (10 sec: 44236.9, 60 sec: 44233.5, 300 sec: 44041.7). Total num frames: 374046720. Throughput: 0: 44317.8. Samples: 374159120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 13:28:51,101][36761] Avg episode reward: [(0, '0.906')] [2024-07-02 13:28:54,555][36999] Updated weights for policy 0, policy_version 22840 (0.0034) [2024-07-02 13:28:56,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 374259712. Throughput: 0: 44276.5. Samples: 374421420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:28:56,096][36761] Avg episode reward: [(0, '0.899')] [2024-07-02 13:28:57,939][36999] Updated weights for policy 0, policy_version 22850 (0.0033) [2024-07-02 13:29:01,095][36761] Fps is (10 sec: 42618.3, 60 sec: 44236.9, 300 sec: 44042.4). Total num frames: 374472704. Throughput: 0: 44109.9. Samples: 374553260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:29:01,095][36761] Avg episode reward: [(0, '0.923')] [2024-07-02 13:29:02,286][36999] Updated weights for policy 0, policy_version 22860 (0.0038) [2024-07-02 13:29:05,276][36999] Updated weights for policy 0, policy_version 22870 (0.0029) [2024-07-02 13:29:06,095][36761] Fps is (10 sec: 44236.7, 60 sec: 44236.9, 300 sec: 44043.1). Total num frames: 374702080. Throughput: 0: 44230.8. Samples: 374819560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:29:06,096][36761] Avg episode reward: [(0, '0.923')] [2024-07-02 13:29:09,622][36999] Updated weights for policy 0, policy_version 22880 (0.0035) [2024-07-02 13:29:11,095][36761] Fps is (10 sec: 45874.5, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 374931456. Throughput: 0: 44229.7. Samples: 375086640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:29:11,096][36761] Avg episode reward: [(0, '0.918')] [2024-07-02 13:29:12,733][36999] Updated weights for policy 0, policy_version 22890 (0.0041) [2024-07-02 13:29:16,095][36761] Fps is (10 sec: 44236.9, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 375144448. Throughput: 0: 44396.6. Samples: 375222820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:29:16,096][36761] Avg episode reward: [(0, '0.914')] [2024-07-02 13:29:16,934][36999] Updated weights for policy 0, policy_version 22900 (0.0025) [2024-07-02 13:29:20,116][36999] Updated weights for policy 0, policy_version 22910 (0.0036) [2024-07-02 13:29:21,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43963.8, 300 sec: 44042.8). Total num frames: 375357440. Throughput: 0: 44267.1. Samples: 375483440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:29:21,096][36761] Avg episode reward: [(0, '0.922')] [2024-07-02 13:29:24,354][36999] Updated weights for policy 0, policy_version 22920 (0.0030) [2024-07-02 13:29:26,098][36761] Fps is (10 sec: 44222.8, 60 sec: 44234.5, 300 sec: 44041.9). Total num frames: 375586816. Throughput: 0: 44107.7. Samples: 375745500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:29:26,099][36761] Avg episode reward: [(0, '0.905')] [2024-07-02 13:29:27,616][36999] Updated weights for policy 0, policy_version 22930 (0.0052) [2024-07-02 13:29:31,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 375799808. Throughput: 0: 44115.1. Samples: 375880080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 13:29:31,096][36761] Avg episode reward: [(0, '0.909')] [2024-07-02 13:29:31,907][36979] Signal inference workers to stop experience collection... (5500 times) [2024-07-02 13:29:31,907][36979] Signal inference workers to resume experience collection... (5500 times) [2024-07-02 13:29:31,952][36999] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-07-02 13:29:31,952][36999] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-07-02 13:29:32,045][36999] Updated weights for policy 0, policy_version 22940 (0.0036) [2024-07-02 13:29:35,029][36999] Updated weights for policy 0, policy_version 22950 (0.0044) [2024-07-02 13:29:36,095][36761] Fps is (10 sec: 44250.4, 60 sec: 43963.7, 300 sec: 44043.1). Total num frames: 376029184. Throughput: 0: 44058.7. Samples: 376141560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 13:29:36,096][36761] Avg episode reward: [(0, '0.906')] [2024-07-02 13:29:39,544][36999] Updated weights for policy 0, policy_version 22960 (0.0046) [2024-07-02 13:29:41,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43967.1, 300 sec: 43986.9). Total num frames: 376242176. Throughput: 0: 44151.5. Samples: 376408240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 13:29:41,096][36761] Avg episode reward: [(0, '0.889')] [2024-07-02 13:29:42,429][36999] Updated weights for policy 0, policy_version 22970 (0.0031) [2024-07-02 13:29:46,095][36761] Fps is (10 sec: 42597.9, 60 sec: 43963.6, 300 sec: 43986.8). Total num frames: 376455168. Throughput: 0: 44149.5. Samples: 376540000. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-07-02 13:29:46,096][36761] Avg episode reward: [(0, '0.894')] [2024-07-02 13:29:46,902][36999] Updated weights for policy 0, policy_version 22980 (0.0038) [2024-07-02 13:29:49,778][36999] Updated weights for policy 0, policy_version 22990 (0.0026) [2024-07-02 13:29:51,096][36761] Fps is (10 sec: 44235.1, 60 sec: 43966.8, 300 sec: 44043.0). Total num frames: 376684544. Throughput: 0: 43940.9. Samples: 376796920. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-07-02 13:29:51,096][36761] Avg episode reward: [(0, '0.888')] [2024-07-02 13:29:54,226][36999] Updated weights for policy 0, policy_version 23000 (0.0027) [2024-07-02 13:29:56,095][36761] Fps is (10 sec: 44237.7, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 376897536. Throughput: 0: 44129.0. Samples: 377072440. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-07-02 13:29:56,096][36761] Avg episode reward: [(0, '0.915')] [2024-07-02 13:29:57,086][36999] Updated weights for policy 0, policy_version 23010 (0.0033) [2024-07-02 13:30:01,095][36761] Fps is (10 sec: 44238.7, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 377126912. Throughput: 0: 44047.5. Samples: 377204960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 13:30:01,096][36761] Avg episode reward: [(0, '0.920')] [2024-07-02 13:30:01,548][36999] Updated weights for policy 0, policy_version 23020 (0.0030) [2024-07-02 13:30:04,678][36999] Updated weights for policy 0, policy_version 23030 (0.0035) [2024-07-02 13:30:06,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 377339904. Throughput: 0: 44108.1. Samples: 377468300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 13:30:06,095][36761] Avg episode reward: [(0, '0.913')] [2024-07-02 13:30:08,871][36999] Updated weights for policy 0, policy_version 23040 (0.0028) [2024-07-02 13:30:11,096][36761] Fps is (10 sec: 45874.2, 60 sec: 44236.7, 300 sec: 44042.5). Total num frames: 377585664. Throughput: 0: 44253.1. Samples: 377736760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 13:30:11,096][36761] Avg episode reward: [(0, '0.936')] [2024-07-02 13:30:12,666][36999] Updated weights for policy 0, policy_version 23050 (0.0038) [2024-07-02 13:30:16,095][36761] Fps is (10 sec: 45874.8, 60 sec: 44236.7, 300 sec: 44042.4). Total num frames: 377798656. Throughput: 0: 44240.0. Samples: 377870880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:30:16,096][36761] Avg episode reward: [(0, '0.926')] [2024-07-02 13:30:16,211][36999] Updated weights for policy 0, policy_version 23060 (0.0020) [2024-07-02 13:30:19,956][36999] Updated weights for policy 0, policy_version 23070 (0.0034) [2024-07-02 13:30:21,095][36761] Fps is (10 sec: 44237.4, 60 sec: 44509.9, 300 sec: 44153.5). Total num frames: 378028032. Throughput: 0: 44334.2. Samples: 378136600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:30:21,098][36761] Avg episode reward: [(0, '0.911')] [2024-07-02 13:30:23,530][36999] Updated weights for policy 0, policy_version 23080 (0.0040) [2024-07-02 13:30:26,095][36761] Fps is (10 sec: 44236.2, 60 sec: 44239.0, 300 sec: 44042.4). Total num frames: 378241024. Throughput: 0: 44332.3. Samples: 378403200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:30:26,096][36761] Avg episode reward: [(0, '0.913')] [2024-07-02 13:30:27,297][36999] Updated weights for policy 0, policy_version 23090 (0.0038) [2024-07-02 13:30:30,959][36999] Updated weights for policy 0, policy_version 23100 (0.0027) [2024-07-02 13:30:31,095][36761] Fps is (10 sec: 44237.3, 60 sec: 44509.9, 300 sec: 44098.0). Total num frames: 378470400. Throughput: 0: 44345.5. Samples: 378535540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 13:30:31,096][36761] Avg episode reward: [(0, '0.893')] [2024-07-02 13:30:34,648][36999] Updated weights for policy 0, policy_version 23110 (0.0041) [2024-07-02 13:30:36,095][36761] Fps is (10 sec: 44237.1, 60 sec: 44236.8, 300 sec: 44154.2). Total num frames: 378683392. Throughput: 0: 44537.7. Samples: 378801100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 13:30:36,096][36761] Avg episode reward: [(0, '0.896')] [2024-07-02 13:30:36,205][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000023114_378699776.pth... [2024-07-02 13:30:36,231][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000022468_368115712.pth [2024-07-02 13:30:38,651][36999] Updated weights for policy 0, policy_version 23120 (0.0035) [2024-07-02 13:30:41,095][36761] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 378896384. Throughput: 0: 44360.0. Samples: 379068640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 13:30:41,096][36761] Avg episode reward: [(0, '0.894')] [2024-07-02 13:30:42,046][36999] Updated weights for policy 0, policy_version 23130 (0.0040) [2024-07-02 13:30:46,095][36761] Fps is (10 sec: 42599.3, 60 sec: 44237.0, 300 sec: 43986.9). Total num frames: 379109376. Throughput: 0: 44247.7. Samples: 379196100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:30:46,095][36761] Avg episode reward: [(0, '0.894')] [2024-07-02 13:30:46,163][36999] Updated weights for policy 0, policy_version 23140 (0.0032) [2024-07-02 13:30:49,503][36999] Updated weights for policy 0, policy_version 23150 (0.0040) [2024-07-02 13:30:51,095][36761] Fps is (10 sec: 44236.6, 60 sec: 44237.1, 300 sec: 44154.2). Total num frames: 379338752. Throughput: 0: 44252.3. Samples: 379459660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:30:51,096][36761] Avg episode reward: [(0, '0.937')] [2024-07-02 13:30:53,526][36999] Updated weights for policy 0, policy_version 23160 (0.0031) [2024-07-02 13:30:56,095][36761] Fps is (10 sec: 45874.9, 60 sec: 44509.9, 300 sec: 44098.0). Total num frames: 379568128. Throughput: 0: 44258.5. Samples: 379728380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:30:56,096][36761] Avg episode reward: [(0, '0.935')] [2024-07-02 13:30:57,257][36999] Updated weights for policy 0, policy_version 23170 (0.0028) [2024-07-02 13:30:58,321][36979] Signal inference workers to stop experience collection... (5550 times) [2024-07-02 13:30:58,321][36979] Signal inference workers to resume experience collection... (5550 times) [2024-07-02 13:30:58,363][36999] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-07-02 13:30:58,363][36999] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-07-02 13:31:00,897][36999] Updated weights for policy 0, policy_version 23180 (0.0042) [2024-07-02 13:31:01,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 379781120. Throughput: 0: 44031.1. Samples: 379852280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:31:01,096][36761] Avg episode reward: [(0, '0.915')] [2024-07-02 13:31:04,653][36999] Updated weights for policy 0, policy_version 23190 (0.0030) [2024-07-02 13:31:06,095][36761] Fps is (10 sec: 45874.7, 60 sec: 44782.8, 300 sec: 44209.0). Total num frames: 380026880. Throughput: 0: 44163.6. Samples: 380123960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:31:06,096][36761] Avg episode reward: [(0, '0.915')] [2024-07-02 13:31:08,200][36999] Updated weights for policy 0, policy_version 23200 (0.0032) [2024-07-02 13:31:11,095][36761] Fps is (10 sec: 45874.9, 60 sec: 44236.9, 300 sec: 44153.5). Total num frames: 380239872. Throughput: 0: 44023.6. Samples: 380384260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:31:11,096][36761] Avg episode reward: [(0, '0.932')] [2024-07-02 13:31:12,292][36999] Updated weights for policy 0, policy_version 23210 (0.0039) [2024-07-02 13:31:15,538][36999] Updated weights for policy 0, policy_version 23220 (0.0033) [2024-07-02 13:31:16,095][36761] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 380452864. Throughput: 0: 44001.7. Samples: 380515620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:31:16,100][36761] Avg episode reward: [(0, '0.919')] [2024-07-02 13:31:19,570][36999] Updated weights for policy 0, policy_version 23230 (0.0028) [2024-07-02 13:31:21,095][36761] Fps is (10 sec: 44237.4, 60 sec: 44236.9, 300 sec: 44153.5). Total num frames: 380682240. Throughput: 0: 44141.0. Samples: 380787440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:31:21,096][36761] Avg episode reward: [(0, '0.919')] [2024-07-02 13:31:23,032][36999] Updated weights for policy 0, policy_version 23240 (0.0032) [2024-07-02 13:31:26,095][36761] Fps is (10 sec: 44236.6, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 380895232. Throughput: 0: 43923.9. Samples: 381045220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:31:26,096][36761] Avg episode reward: [(0, '0.922')] [2024-07-02 13:31:26,970][36999] Updated weights for policy 0, policy_version 23250 (0.0028) [2024-07-02 13:31:30,551][36999] Updated weights for policy 0, policy_version 23260 (0.0036) [2024-07-02 13:31:31,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43963.7, 300 sec: 44097.9). Total num frames: 381108224. Throughput: 0: 44037.6. Samples: 381177800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:31:31,096][36761] Avg episode reward: [(0, '0.917')] [2024-07-02 13:31:34,353][36999] Updated weights for policy 0, policy_version 23270 (0.0030) [2024-07-02 13:31:36,095][36761] Fps is (10 sec: 44237.4, 60 sec: 44236.9, 300 sec: 44209.0). Total num frames: 381337600. Throughput: 0: 44213.9. Samples: 381449280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 13:31:36,096][36761] Avg episode reward: [(0, '0.908')] [2024-07-02 13:31:38,006][36999] Updated weights for policy 0, policy_version 23280 (0.0028) [2024-07-02 13:31:41,095][36761] Fps is (10 sec: 44237.3, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 381550592. Throughput: 0: 44039.1. Samples: 381710140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 13:31:41,096][36761] Avg episode reward: [(0, '0.900')] [2024-07-02 13:31:41,909][36999] Updated weights for policy 0, policy_version 23290 (0.0029) [2024-07-02 13:31:45,417][36999] Updated weights for policy 0, policy_version 23300 (0.0049) [2024-07-02 13:31:46,095][36761] Fps is (10 sec: 44236.2, 60 sec: 44509.7, 300 sec: 44097.9). Total num frames: 381779968. Throughput: 0: 44215.1. Samples: 381841960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 13:31:46,096][36761] Avg episode reward: [(0, '0.907')] [2024-07-02 13:31:49,230][36999] Updated weights for policy 0, policy_version 23310 (0.0034) [2024-07-02 13:31:51,096][36761] Fps is (10 sec: 45874.0, 60 sec: 44509.7, 300 sec: 44264.5). Total num frames: 382009344. Throughput: 0: 44053.7. Samples: 382106380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:31:51,096][36761] Avg episode reward: [(0, '0.930')] [2024-07-02 13:31:53,070][36999] Updated weights for policy 0, policy_version 23320 (0.0037) [2024-07-02 13:31:56,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43963.7, 300 sec: 44098.0). Total num frames: 382205952. Throughput: 0: 44093.0. Samples: 382368440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:31:56,096][36761] Avg episode reward: [(0, '0.945')] [2024-07-02 13:31:56,100][36979] Saving new best policy, reward=0.945! [2024-07-02 13:31:56,713][36999] Updated weights for policy 0, policy_version 23330 (0.0032) [2024-07-02 13:32:00,441][36999] Updated weights for policy 0, policy_version 23340 (0.0035) [2024-07-02 13:32:01,095][36761] Fps is (10 sec: 40961.0, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 382418944. Throughput: 0: 44063.2. Samples: 382498460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:32:01,096][36761] Avg episode reward: [(0, '0.963')] [2024-07-02 13:32:01,206][36979] Saving new best policy, reward=0.963! [2024-07-02 13:32:04,037][36999] Updated weights for policy 0, policy_version 23350 (0.0035) [2024-07-02 13:32:06,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43417.7, 300 sec: 44098.0). Total num frames: 382631936. Throughput: 0: 43893.4. Samples: 382762640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:32:06,095][36761] Avg episode reward: [(0, '0.928')] [2024-07-02 13:32:08,324][36999] Updated weights for policy 0, policy_version 23360 (0.0027) [2024-07-02 13:32:11,095][36761] Fps is (10 sec: 44236.3, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 382861312. Throughput: 0: 43921.8. Samples: 383021700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:32:11,096][36761] Avg episode reward: [(0, '0.935')] [2024-07-02 13:32:11,638][36999] Updated weights for policy 0, policy_version 23370 (0.0034) [2024-07-02 13:32:15,681][36979] Signal inference workers to stop experience collection... (5600 times) [2024-07-02 13:32:15,735][36979] Signal inference workers to resume experience collection... (5600 times) [2024-07-02 13:32:15,736][36999] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-07-02 13:32:15,738][36999] Updated weights for policy 0, policy_version 23380 (0.0022) [2024-07-02 13:32:15,749][36999] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-07-02 13:32:16,095][36761] Fps is (10 sec: 45874.7, 60 sec: 43963.8, 300 sec: 44098.0). Total num frames: 383090688. Throughput: 0: 44009.8. Samples: 383158240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:32:16,096][36761] Avg episode reward: [(0, '0.944')] [2024-07-02 13:32:19,224][36999] Updated weights for policy 0, policy_version 23390 (0.0025) [2024-07-02 13:32:21,098][36761] Fps is (10 sec: 44225.7, 60 sec: 43688.8, 300 sec: 44153.1). Total num frames: 383303680. Throughput: 0: 43768.6. Samples: 383418980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-07-02 13:32:21,098][36761] Avg episode reward: [(0, '0.952')] [2024-07-02 13:32:23,209][36999] Updated weights for policy 0, policy_version 23400 (0.0032) [2024-07-02 13:32:26,100][36761] Fps is (10 sec: 42578.8, 60 sec: 43687.4, 300 sec: 44041.7). Total num frames: 383516672. Throughput: 0: 43768.4. Samples: 383679920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-07-02 13:32:26,101][36761] Avg episode reward: [(0, '0.950')] [2024-07-02 13:32:26,546][36999] Updated weights for policy 0, policy_version 23410 (0.0042) [2024-07-02 13:32:30,535][36999] Updated weights for policy 0, policy_version 23420 (0.0029) [2024-07-02 13:32:31,098][36761] Fps is (10 sec: 44234.2, 60 sec: 43961.5, 300 sec: 44097.5). Total num frames: 383746048. Throughput: 0: 43858.8. Samples: 383815740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-07-02 13:32:31,099][36761] Avg episode reward: [(0, '0.923')] [2024-07-02 13:32:33,856][36999] Updated weights for policy 0, policy_version 23430 (0.0034) [2024-07-02 13:32:36,095][36761] Fps is (10 sec: 44257.2, 60 sec: 43690.6, 300 sec: 44042.4). Total num frames: 383959040. Throughput: 0: 43801.5. Samples: 384077440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-07-02 13:32:36,096][36761] Avg episode reward: [(0, '0.934')] [2024-07-02 13:32:36,114][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000023435_383959040.pth... [2024-07-02 13:32:36,175][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000022790_373391360.pth [2024-07-02 13:32:37,824][36999] Updated weights for policy 0, policy_version 23440 (0.0029) [2024-07-02 13:32:41,095][36761] Fps is (10 sec: 44250.9, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 384188416. Throughput: 0: 43825.8. Samples: 384340600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:32:41,096][36761] Avg episode reward: [(0, '0.929')] [2024-07-02 13:32:41,378][36999] Updated weights for policy 0, policy_version 23450 (0.0032) [2024-07-02 13:32:45,236][36999] Updated weights for policy 0, policy_version 23460 (0.0030) [2024-07-02 13:32:46,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43417.7, 300 sec: 44042.4). Total num frames: 384385024. Throughput: 0: 43860.9. Samples: 384472200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:32:46,095][36761] Avg episode reward: [(0, '0.939')] [2024-07-02 13:32:48,786][36999] Updated weights for policy 0, policy_version 23470 (0.0041) [2024-07-02 13:32:51,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 384630784. Throughput: 0: 43842.6. Samples: 384735560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:32:51,096][36761] Avg episode reward: [(0, '0.935')] [2024-07-02 13:32:52,605][36999] Updated weights for policy 0, policy_version 23480 (0.0034) [2024-07-02 13:32:56,095][36761] Fps is (10 sec: 45874.5, 60 sec: 43963.6, 300 sec: 44153.5). Total num frames: 384843776. Throughput: 0: 43952.4. Samples: 384999560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 13:32:56,096][36761] Avg episode reward: [(0, '0.926')] [2024-07-02 13:32:56,373][36999] Updated weights for policy 0, policy_version 23490 (0.0055) [2024-07-02 13:33:00,210][36999] Updated weights for policy 0, policy_version 23500 (0.0034) [2024-07-02 13:33:01,095][36761] Fps is (10 sec: 44236.6, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 385073152. Throughput: 0: 43879.6. Samples: 385132820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 13:33:01,096][36761] Avg episode reward: [(0, '0.926')] [2024-07-02 13:33:03,711][36999] Updated weights for policy 0, policy_version 23510 (0.0037) [2024-07-02 13:33:06,095][36761] Fps is (10 sec: 44237.1, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 385286144. Throughput: 0: 43904.7. Samples: 385394580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 13:33:06,096][36761] Avg episode reward: [(0, '0.918')] [2024-07-02 13:33:07,750][36999] Updated weights for policy 0, policy_version 23520 (0.0046) [2024-07-02 13:33:11,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43963.8, 300 sec: 44098.0). Total num frames: 385499136. Throughput: 0: 44009.4. Samples: 385660140. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-07-02 13:33:11,096][36761] Avg episode reward: [(0, '0.923')] [2024-07-02 13:33:11,143][36999] Updated weights for policy 0, policy_version 23530 (0.0026) [2024-07-02 13:33:15,327][36999] Updated weights for policy 0, policy_version 23540 (0.0038) [2024-07-02 13:33:16,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43963.7, 300 sec: 44098.0). Total num frames: 385728512. Throughput: 0: 43925.7. Samples: 385792260. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-07-02 13:33:16,096][36761] Avg episode reward: [(0, '0.931')] [2024-07-02 13:33:18,687][36999] Updated weights for policy 0, policy_version 23550 (0.0030) [2024-07-02 13:33:21,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43965.6, 300 sec: 44097.9). Total num frames: 385941504. Throughput: 0: 43989.4. Samples: 386056960. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-07-02 13:33:21,096][36761] Avg episode reward: [(0, '0.932')] [2024-07-02 13:33:22,601][36999] Updated weights for policy 0, policy_version 23560 (0.0026) [2024-07-02 13:33:26,009][36999] Updated weights for policy 0, policy_version 23570 (0.0037) [2024-07-02 13:33:26,095][36761] Fps is (10 sec: 44236.9, 60 sec: 44240.2, 300 sec: 44153.5). Total num frames: 386170880. Throughput: 0: 43980.0. Samples: 386319700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:33:26,096][36761] Avg episode reward: [(0, '0.942')] [2024-07-02 13:33:30,148][36999] Updated weights for policy 0, policy_version 23580 (0.0035) [2024-07-02 13:33:31,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43692.9, 300 sec: 43986.9). Total num frames: 386367488. Throughput: 0: 44023.0. Samples: 386453240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:33:31,096][36761] Avg episode reward: [(0, '0.944')] [2024-07-02 13:33:33,374][36999] Updated weights for policy 0, policy_version 23590 (0.0027) [2024-07-02 13:33:36,100][36761] Fps is (10 sec: 42578.8, 60 sec: 43960.4, 300 sec: 44042.4). Total num frames: 386596864. Throughput: 0: 43933.3. Samples: 386712760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:33:36,100][36761] Avg episode reward: [(0, '0.945')] [2024-07-02 13:33:37,492][36999] Updated weights for policy 0, policy_version 23600 (0.0034) [2024-07-02 13:33:40,919][36999] Updated weights for policy 0, policy_version 23610 (0.0027) [2024-07-02 13:33:41,097][36761] Fps is (10 sec: 45869.5, 60 sec: 43962.7, 300 sec: 44097.8). Total num frames: 386826240. Throughput: 0: 44005.5. Samples: 386979860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:33:41,097][36761] Avg episode reward: [(0, '0.942')] [2024-07-02 13:33:44,874][36999] Updated weights for policy 0, policy_version 23620 (0.0030) [2024-07-02 13:33:46,095][36761] Fps is (10 sec: 44256.6, 60 sec: 44236.7, 300 sec: 44043.1). Total num frames: 387039232. Throughput: 0: 44039.0. Samples: 387114580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:33:46,096][36761] Avg episode reward: [(0, '0.942')] [2024-07-02 13:33:48,238][36999] Updated weights for policy 0, policy_version 23630 (0.0032) [2024-07-02 13:33:51,095][36761] Fps is (10 sec: 40965.6, 60 sec: 43417.6, 300 sec: 43986.9). Total num frames: 387235840. Throughput: 0: 43959.6. Samples: 387372760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:33:51,096][36761] Avg episode reward: [(0, '0.933')] [2024-07-02 13:33:52,194][36999] Updated weights for policy 0, policy_version 23640 (0.0035) [2024-07-02 13:33:55,634][36999] Updated weights for policy 0, policy_version 23650 (0.0028) [2024-07-02 13:33:56,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43963.8, 300 sec: 44097.9). Total num frames: 387481600. Throughput: 0: 43911.9. Samples: 387636180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-07-02 13:33:56,098][36761] Avg episode reward: [(0, '0.917')] [2024-07-02 13:33:56,770][36979] Signal inference workers to stop experience collection... (5650 times) [2024-07-02 13:33:56,770][36979] Signal inference workers to resume experience collection... (5650 times) [2024-07-02 13:33:56,792][36999] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-07-02 13:33:56,822][36999] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-07-02 13:33:59,540][36999] Updated weights for policy 0, policy_version 23660 (0.0030) [2024-07-02 13:34:01,095][36761] Fps is (10 sec: 45874.6, 60 sec: 43690.6, 300 sec: 44042.4). Total num frames: 387694592. Throughput: 0: 44101.7. Samples: 387776840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-07-02 13:34:01,096][36761] Avg episode reward: [(0, '0.925')] [2024-07-02 13:34:02,943][36999] Updated weights for policy 0, policy_version 23670 (0.0026) [2024-07-02 13:34:06,096][36761] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 387907584. Throughput: 0: 43972.7. Samples: 388035740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-07-02 13:34:06,096][36761] Avg episode reward: [(0, '0.921')] [2024-07-02 13:34:06,906][36999] Updated weights for policy 0, policy_version 23680 (0.0047) [2024-07-02 13:34:10,591][36999] Updated weights for policy 0, policy_version 23690 (0.0027) [2024-07-02 13:34:11,095][36761] Fps is (10 sec: 45875.7, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 388153344. Throughput: 0: 43932.0. Samples: 388296640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-07-02 13:34:11,096][36761] Avg episode reward: [(0, '0.921')] [2024-07-02 13:34:14,688][36999] Updated weights for policy 0, policy_version 23700 (0.0032) [2024-07-02 13:34:16,095][36761] Fps is (10 sec: 44237.8, 60 sec: 43690.7, 300 sec: 44042.4). Total num frames: 388349952. Throughput: 0: 43983.7. Samples: 388432500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 13:34:16,095][36761] Avg episode reward: [(0, '0.938')] [2024-07-02 13:34:18,032][36999] Updated weights for policy 0, policy_version 23710 (0.0046) [2024-07-02 13:34:21,095][36761] Fps is (10 sec: 40959.8, 60 sec: 43690.6, 300 sec: 43987.3). Total num frames: 388562944. Throughput: 0: 43916.9. Samples: 388688820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 13:34:21,096][36761] Avg episode reward: [(0, '0.949')] [2024-07-02 13:34:22,638][36999] Updated weights for policy 0, policy_version 23720 (0.0029) [2024-07-02 13:34:25,732][36999] Updated weights for policy 0, policy_version 23730 (0.0039) [2024-07-02 13:34:26,095][36761] Fps is (10 sec: 45874.4, 60 sec: 43963.6, 300 sec: 44097.9). Total num frames: 388808704. Throughput: 0: 43784.7. Samples: 388950120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 13:34:26,096][36761] Avg episode reward: [(0, '0.972')] [2024-07-02 13:34:26,121][36979] Saving new best policy, reward=0.972! [2024-07-02 13:34:30,007][36999] Updated weights for policy 0, policy_version 23740 (0.0034) [2024-07-02 13:34:31,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 389005312. Throughput: 0: 43789.8. Samples: 389085120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 13:34:31,096][36761] Avg episode reward: [(0, '0.958')] [2024-07-02 13:34:33,065][36999] Updated weights for policy 0, policy_version 23750 (0.0035) [2024-07-02 13:34:36,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43967.0, 300 sec: 44042.4). Total num frames: 389234688. Throughput: 0: 43876.3. Samples: 389347200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 13:34:36,096][36761] Avg episode reward: [(0, '0.956')] [2024-07-02 13:34:36,123][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000023757_389234688.pth... [2024-07-02 13:34:36,207][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000023114_378699776.pth [2024-07-02 13:34:37,276][36999] Updated weights for policy 0, policy_version 23760 (0.0027) [2024-07-02 13:34:40,477][36999] Updated weights for policy 0, policy_version 23770 (0.0032) [2024-07-02 13:34:41,097][36761] Fps is (10 sec: 44229.7, 60 sec: 43690.4, 300 sec: 44042.2). Total num frames: 389447680. Throughput: 0: 43902.9. Samples: 389611880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 13:34:41,097][36761] Avg episode reward: [(0, '0.944')] [2024-07-02 13:34:44,739][36999] Updated weights for policy 0, policy_version 23780 (0.0027) [2024-07-02 13:34:46,096][36761] Fps is (10 sec: 44236.4, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 389677056. Throughput: 0: 43785.2. Samples: 389747180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 13:34:46,096][36761] Avg episode reward: [(0, '0.966')] [2024-07-02 13:34:47,823][36999] Updated weights for policy 0, policy_version 23790 (0.0036) [2024-07-02 13:34:51,095][36761] Fps is (10 sec: 44244.1, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 389890048. Throughput: 0: 43965.5. Samples: 390014180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 13:34:51,096][36761] Avg episode reward: [(0, '0.966')] [2024-07-02 13:34:52,027][36999] Updated weights for policy 0, policy_version 23800 (0.0036) [2024-07-02 13:34:55,170][36999] Updated weights for policy 0, policy_version 23810 (0.0035) [2024-07-02 13:34:56,095][36761] Fps is (10 sec: 42599.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 390103040. Throughput: 0: 44054.1. Samples: 390279080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 13:34:56,096][36761] Avg episode reward: [(0, '0.962')] [2024-07-02 13:34:59,583][36999] Updated weights for policy 0, policy_version 23820 (0.0034) [2024-07-02 13:35:01,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 390332416. Throughput: 0: 44008.0. Samples: 390412860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:35:01,095][36761] Avg episode reward: [(0, '0.963')] [2024-07-02 13:35:03,202][36999] Updated weights for policy 0, policy_version 23830 (0.0030) [2024-07-02 13:35:06,095][36761] Fps is (10 sec: 45875.4, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 390561792. Throughput: 0: 44089.8. Samples: 390672860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:35:06,096][36761] Avg episode reward: [(0, '0.963')] [2024-07-02 13:35:06,891][36999] Updated weights for policy 0, policy_version 23840 (0.0029) [2024-07-02 13:35:10,404][36999] Updated weights for policy 0, policy_version 23850 (0.0022) [2024-07-02 13:35:11,095][36761] Fps is (10 sec: 45874.9, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 390791168. Throughput: 0: 44300.6. Samples: 390943640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:35:11,096][36761] Avg episode reward: [(0, '0.962')] [2024-07-02 13:35:14,255][36999] Updated weights for policy 0, policy_version 23860 (0.0024) [2024-07-02 13:35:16,095][36761] Fps is (10 sec: 45874.8, 60 sec: 44509.8, 300 sec: 44042.4). Total num frames: 391020544. Throughput: 0: 44356.4. Samples: 391081160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:35:16,096][36761] Avg episode reward: [(0, '0.973')] [2024-07-02 13:35:17,748][36999] Updated weights for policy 0, policy_version 23870 (0.0033) [2024-07-02 13:35:21,095][36761] Fps is (10 sec: 40959.6, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 391200768. Throughput: 0: 44315.1. Samples: 391341380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 13:35:21,096][36761] Avg episode reward: [(0, '0.955')] [2024-07-02 13:35:21,698][36999] Updated weights for policy 0, policy_version 23880 (0.0027) [2024-07-02 13:35:25,059][36999] Updated weights for policy 0, policy_version 23890 (0.0035) [2024-07-02 13:35:26,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 391446528. Throughput: 0: 44331.8. Samples: 391606740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 13:35:26,096][36761] Avg episode reward: [(0, '0.943')] [2024-07-02 13:35:29,043][36999] Updated weights for policy 0, policy_version 23900 (0.0030) [2024-07-02 13:35:31,095][36761] Fps is (10 sec: 47513.5, 60 sec: 44509.8, 300 sec: 44042.4). Total num frames: 391675904. Throughput: 0: 44439.2. Samples: 391746940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 13:35:31,096][36761] Avg episode reward: [(0, '0.949')] [2024-07-02 13:35:32,303][36999] Updated weights for policy 0, policy_version 23910 (0.0044) [2024-07-02 13:35:36,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 391872512. Throughput: 0: 44344.9. Samples: 392009700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:35:36,096][36761] Avg episode reward: [(0, '0.952')] [2024-07-02 13:35:36,472][36999] Updated weights for policy 0, policy_version 23920 (0.0025) [2024-07-02 13:35:40,092][36999] Updated weights for policy 0, policy_version 23930 (0.0037) [2024-07-02 13:35:41,095][36761] Fps is (10 sec: 44236.6, 60 sec: 44511.0, 300 sec: 44097.9). Total num frames: 392118272. Throughput: 0: 44130.5. Samples: 392264960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:35:41,096][36761] Avg episode reward: [(0, '0.962')] [2024-07-02 13:35:43,794][36999] Updated weights for policy 0, policy_version 23940 (0.0034) [2024-07-02 13:35:44,822][36979] Signal inference workers to stop experience collection... (5700 times) [2024-07-02 13:35:44,822][36979] Signal inference workers to resume experience collection... (5700 times) [2024-07-02 13:35:44,853][36999] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-07-02 13:35:44,853][36999] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-07-02 13:35:46,095][36761] Fps is (10 sec: 47513.3, 60 sec: 44510.0, 300 sec: 44098.0). Total num frames: 392347648. Throughput: 0: 44257.2. Samples: 392404440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:35:46,096][36761] Avg episode reward: [(0, '0.962')] [2024-07-02 13:35:47,465][36999] Updated weights for policy 0, policy_version 23950 (0.0037) [2024-07-02 13:35:51,095][36761] Fps is (10 sec: 42598.8, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 392544256. Throughput: 0: 44352.8. Samples: 392668740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 13:35:51,096][36761] Avg episode reward: [(0, '0.957')] [2024-07-02 13:35:51,376][36999] Updated weights for policy 0, policy_version 23960 (0.0039) [2024-07-02 13:35:54,817][36999] Updated weights for policy 0, policy_version 23970 (0.0039) [2024-07-02 13:35:56,095][36761] Fps is (10 sec: 40960.0, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 392757248. Throughput: 0: 44066.6. Samples: 392926640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 13:35:56,096][36761] Avg episode reward: [(0, '0.960')] [2024-07-02 13:35:58,937][36999] Updated weights for policy 0, policy_version 23980 (0.0029) [2024-07-02 13:36:01,095][36761] Fps is (10 sec: 45876.0, 60 sec: 44509.9, 300 sec: 43986.9). Total num frames: 393003008. Throughput: 0: 43971.3. Samples: 393059860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 13:36:01,096][36761] Avg episode reward: [(0, '0.961')] [2024-07-02 13:36:02,169][36999] Updated weights for policy 0, policy_version 23990 (0.0025) [2024-07-02 13:36:06,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 393183232. Throughput: 0: 44108.9. Samples: 393326280. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-07-02 13:36:06,096][36761] Avg episode reward: [(0, '0.952')] [2024-07-02 13:36:06,625][36999] Updated weights for policy 0, policy_version 24000 (0.0026) [2024-07-02 13:36:09,543][36999] Updated weights for policy 0, policy_version 24010 (0.0034) [2024-07-02 13:36:11,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 393428992. Throughput: 0: 44014.3. Samples: 393587380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-07-02 13:36:11,096][36761] Avg episode reward: [(0, '0.946')] [2024-07-02 13:36:14,166][36999] Updated weights for policy 0, policy_version 24020 (0.0027) [2024-07-02 13:36:16,095][36761] Fps is (10 sec: 49152.1, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 393674752. Throughput: 0: 43979.6. Samples: 393726020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-07-02 13:36:16,096][36761] Avg episode reward: [(0, '0.959')] [2024-07-02 13:36:16,827][36999] Updated weights for policy 0, policy_version 24030 (0.0046) [2024-07-02 13:36:21,095][36761] Fps is (10 sec: 40959.3, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 393838592. Throughput: 0: 43925.6. Samples: 393986360. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-07-02 13:36:21,096][36761] Avg episode reward: [(0, '0.962')] [2024-07-02 13:36:21,493][36999] Updated weights for policy 0, policy_version 24040 (0.0033) [2024-07-02 13:36:24,309][36999] Updated weights for policy 0, policy_version 24050 (0.0029) [2024-07-02 13:36:26,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 394084352. Throughput: 0: 44015.7. Samples: 394245660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-07-02 13:36:26,104][36761] Avg episode reward: [(0, '0.977')] [2024-07-02 13:36:26,137][36979] Saving new best policy, reward=0.977! [2024-07-02 13:36:29,014][36999] Updated weights for policy 0, policy_version 24060 (0.0032) [2024-07-02 13:36:31,095][36761] Fps is (10 sec: 49152.3, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 394330112. Throughput: 0: 43911.5. Samples: 394380460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-07-02 13:36:31,104][36761] Avg episode reward: [(0, '0.986')] [2024-07-02 13:36:31,108][36979] Saving new best policy, reward=0.986! [2024-07-02 13:36:31,758][36999] Updated weights for policy 0, policy_version 24070 (0.0034) [2024-07-02 13:36:36,095][36761] Fps is (10 sec: 39321.6, 60 sec: 43417.5, 300 sec: 43820.2). Total num frames: 394477568. Throughput: 0: 43771.1. Samples: 394638440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-07-02 13:36:36,096][36761] Avg episode reward: [(0, '0.968')] [2024-07-02 13:36:36,209][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000024078_394493952.pth... [2024-07-02 13:36:36,282][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000023435_383959040.pth [2024-07-02 13:36:36,568][36999] Updated weights for policy 0, policy_version 24080 (0.0042) [2024-07-02 13:36:39,406][36999] Updated weights for policy 0, policy_version 24090 (0.0021) [2024-07-02 13:36:41,095][36761] Fps is (10 sec: 39322.1, 60 sec: 43417.7, 300 sec: 43875.8). Total num frames: 394723328. Throughput: 0: 43709.8. Samples: 394893580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-07-02 13:36:41,096][36761] Avg episode reward: [(0, '0.948')] [2024-07-02 13:36:44,078][36999] Updated weights for policy 0, policy_version 24100 (0.0033) [2024-07-02 13:36:46,095][36761] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 43875.8). Total num frames: 394952704. Throughput: 0: 43870.5. Samples: 395034040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-07-02 13:36:46,096][36761] Avg episode reward: [(0, '0.960')] [2024-07-02 13:36:46,624][36999] Updated weights for policy 0, policy_version 24110 (0.0031) [2024-07-02 13:36:48,783][36979] Signal inference workers to stop experience collection... (5750 times) [2024-07-02 13:36:48,823][36999] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-07-02 13:36:48,838][36979] Signal inference workers to resume experience collection... (5750 times) [2024-07-02 13:36:48,848][36999] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-07-02 13:36:51,095][36761] Fps is (10 sec: 40959.1, 60 sec: 43144.5, 300 sec: 43820.2). Total num frames: 395132928. Throughput: 0: 43757.3. Samples: 395295360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-07-02 13:36:51,096][36761] Avg episode reward: [(0, '0.946')] [2024-07-02 13:36:51,484][36999] Updated weights for policy 0, policy_version 24120 (0.0022) [2024-07-02 13:36:54,236][36999] Updated weights for policy 0, policy_version 24130 (0.0025) [2024-07-02 13:36:56,096][36761] Fps is (10 sec: 44234.2, 60 sec: 43963.3, 300 sec: 43986.8). Total num frames: 395395072. Throughput: 0: 43834.4. Samples: 395559960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 13:36:56,096][36761] Avg episode reward: [(0, '0.958')] [2024-07-02 13:36:58,766][36999] Updated weights for policy 0, policy_version 24140 (0.0038) [2024-07-02 13:37:01,095][36761] Fps is (10 sec: 50791.5, 60 sec: 43963.7, 300 sec: 44097.9). Total num frames: 395640832. Throughput: 0: 43682.3. Samples: 395691720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 13:37:01,095][36761] Avg episode reward: [(0, '0.958')] [2024-07-02 13:37:01,654][36999] Updated weights for policy 0, policy_version 24150 (0.0034) [2024-07-02 13:37:06,095][36761] Fps is (10 sec: 42601.3, 60 sec: 43963.8, 300 sec: 43931.4). Total num frames: 395821056. Throughput: 0: 43882.4. Samples: 395961060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 13:37:06,096][36761] Avg episode reward: [(0, '0.962')] [2024-07-02 13:37:06,296][36999] Updated weights for policy 0, policy_version 24160 (0.0042) [2024-07-02 13:37:09,526][36999] Updated weights for policy 0, policy_version 24170 (0.0026) [2024-07-02 13:37:11,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 396083200. Throughput: 0: 43772.0. Samples: 396215400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 13:37:11,096][36761] Avg episode reward: [(0, '0.957')] [2024-07-02 13:37:13,731][36999] Updated weights for policy 0, policy_version 24180 (0.0025) [2024-07-02 13:37:16,095][36761] Fps is (10 sec: 47513.5, 60 sec: 43690.7, 300 sec: 44042.8). Total num frames: 396296192. Throughput: 0: 43876.5. Samples: 396354900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 13:37:16,096][36761] Avg episode reward: [(0, '0.957')] [2024-07-02 13:37:16,808][36999] Updated weights for policy 0, policy_version 24190 (0.0035) [2024-07-02 13:37:21,095][36761] Fps is (10 sec: 39322.0, 60 sec: 43963.9, 300 sec: 43932.0). Total num frames: 396476416. Throughput: 0: 43887.7. Samples: 396613380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 13:37:21,096][36761] Avg episode reward: [(0, '0.963')] [2024-07-02 13:37:21,242][36999] Updated weights for policy 0, policy_version 24200 (0.0027) [2024-07-02 13:37:24,179][36999] Updated weights for policy 0, policy_version 24210 (0.0039) [2024-07-02 13:37:26,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43963.8, 300 sec: 43987.3). Total num frames: 396722176. Throughput: 0: 43958.2. Samples: 396871700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 13:37:26,096][36761] Avg episode reward: [(0, '0.957')] [2024-07-02 13:37:28,611][36999] Updated weights for policy 0, policy_version 24220 (0.0029) [2024-07-02 13:37:31,095][36761] Fps is (10 sec: 47513.1, 60 sec: 43690.7, 300 sec: 44042.4). Total num frames: 396951552. Throughput: 0: 43920.5. Samples: 397010460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-07-02 13:37:31,096][36761] Avg episode reward: [(0, '0.941')] [2024-07-02 13:37:31,836][36999] Updated weights for policy 0, policy_version 24230 (0.0036) [2024-07-02 13:37:35,952][36999] Updated weights for policy 0, policy_version 24240 (0.0027) [2024-07-02 13:37:36,095][36761] Fps is (10 sec: 42598.7, 60 sec: 44509.9, 300 sec: 43931.3). Total num frames: 397148160. Throughput: 0: 44101.6. Samples: 397279920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-07-02 13:37:36,096][36761] Avg episode reward: [(0, '0.953')] [2024-07-02 13:37:39,162][36999] Updated weights for policy 0, policy_version 24250 (0.0036) [2024-07-02 13:37:41,095][36761] Fps is (10 sec: 44236.7, 60 sec: 44509.8, 300 sec: 44097.9). Total num frames: 397393920. Throughput: 0: 43879.7. Samples: 397534520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-07-02 13:37:41,096][36761] Avg episode reward: [(0, '0.967')] [2024-07-02 13:37:43,195][36979] Signal inference workers to stop experience collection... (5800 times) [2024-07-02 13:37:43,195][36979] Signal inference workers to resume experience collection... (5800 times) [2024-07-02 13:37:43,240][36999] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-07-02 13:37:43,240][36999] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-07-02 13:37:43,364][36999] Updated weights for policy 0, policy_version 24260 (0.0030) [2024-07-02 13:37:46,095][36761] Fps is (10 sec: 45874.3, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 397606912. Throughput: 0: 43939.8. Samples: 397669020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 13:37:46,096][36761] Avg episode reward: [(0, '0.952')] [2024-07-02 13:37:46,568][36999] Updated weights for policy 0, policy_version 24270 (0.0029) [2024-07-02 13:37:50,803][36999] Updated weights for policy 0, policy_version 24280 (0.0032) [2024-07-02 13:37:51,100][36761] Fps is (10 sec: 40941.6, 60 sec: 44506.6, 300 sec: 43930.7). Total num frames: 397803520. Throughput: 0: 43996.0. Samples: 397941080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 13:37:51,100][36761] Avg episode reward: [(0, '0.944')] [2024-07-02 13:37:54,197][36999] Updated weights for policy 0, policy_version 24290 (0.0026) [2024-07-02 13:37:56,095][36761] Fps is (10 sec: 44237.8, 60 sec: 44237.3, 300 sec: 43986.9). Total num frames: 398049280. Throughput: 0: 44022.3. Samples: 398196400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-07-02 13:37:56,096][36761] Avg episode reward: [(0, '0.952')] [2024-07-02 13:37:58,115][36999] Updated weights for policy 0, policy_version 24300 (0.0031) [2024-07-02 13:38:01,095][36761] Fps is (10 sec: 45896.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 398262272. Throughput: 0: 43928.9. Samples: 398331700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:38:01,096][36761] Avg episode reward: [(0, '0.949')] [2024-07-02 13:38:01,765][36999] Updated weights for policy 0, policy_version 24310 (0.0026) [2024-07-02 13:38:05,792][36999] Updated weights for policy 0, policy_version 24320 (0.0030) [2024-07-02 13:38:06,095][36761] Fps is (10 sec: 42597.7, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 398475264. Throughput: 0: 44166.5. Samples: 398600880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:38:06,096][36761] Avg episode reward: [(0, '0.946')] [2024-07-02 13:38:09,090][36999] Updated weights for policy 0, policy_version 24330 (0.0037) [2024-07-02 13:38:11,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 398704640. Throughput: 0: 44015.2. Samples: 398852380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:38:11,096][36761] Avg episode reward: [(0, '0.950')] [2024-07-02 13:38:13,362][36999] Updated weights for policy 0, policy_version 24340 (0.0035) [2024-07-02 13:38:16,096][36761] Fps is (10 sec: 44236.4, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 398917632. Throughput: 0: 43971.0. Samples: 398989160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:38:16,096][36761] Avg episode reward: [(0, '0.963')] [2024-07-02 13:38:16,454][36999] Updated weights for policy 0, policy_version 24350 (0.0031) [2024-07-02 13:38:20,799][36999] Updated weights for policy 0, policy_version 24360 (0.0030) [2024-07-02 13:38:21,095][36761] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 43931.3). Total num frames: 399130624. Throughput: 0: 43958.7. Samples: 399258060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:38:21,096][36761] Avg episode reward: [(0, '0.965')] [2024-07-02 13:38:23,809][36999] Updated weights for policy 0, policy_version 24370 (0.0030) [2024-07-02 13:38:26,097][36761] Fps is (10 sec: 42590.3, 60 sec: 43689.2, 300 sec: 43986.6). Total num frames: 399343616. Throughput: 0: 44038.0. Samples: 399516320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:38:26,098][36761] Avg episode reward: [(0, '0.965')] [2024-07-02 13:38:28,199][36999] Updated weights for policy 0, policy_version 24380 (0.0031) [2024-07-02 13:38:31,095][36761] Fps is (10 sec: 45875.4, 60 sec: 43963.8, 300 sec: 44043.1). Total num frames: 399589376. Throughput: 0: 43814.0. Samples: 399640640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:38:31,096][36761] Avg episode reward: [(0, '0.980')] [2024-07-02 13:38:31,349][36999] Updated weights for policy 0, policy_version 24390 (0.0045) [2024-07-02 13:38:35,619][36999] Updated weights for policy 0, policy_version 24400 (0.0033) [2024-07-02 13:38:36,095][36761] Fps is (10 sec: 45884.8, 60 sec: 44236.7, 300 sec: 43987.1). Total num frames: 399802368. Throughput: 0: 43788.8. Samples: 399911380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:38:36,096][36761] Avg episode reward: [(0, '0.970')] [2024-07-02 13:38:36,114][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000024402_399802368.pth... [2024-07-02 13:38:36,180][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000023757_389234688.pth [2024-07-02 13:38:38,710][36999] Updated weights for policy 0, policy_version 24410 (0.0030) [2024-07-02 13:38:41,095][36761] Fps is (10 sec: 40959.1, 60 sec: 43417.5, 300 sec: 43931.3). Total num frames: 399998976. Throughput: 0: 43853.1. Samples: 400169800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:38:41,096][36761] Avg episode reward: [(0, '0.940')] [2024-07-02 13:38:42,910][36999] Updated weights for policy 0, policy_version 24420 (0.0039) [2024-07-02 13:38:46,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43963.9, 300 sec: 44098.0). Total num frames: 400244736. Throughput: 0: 43846.7. Samples: 400304800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:38:46,095][36761] Avg episode reward: [(0, '0.943')] [2024-07-02 13:38:46,131][36999] Updated weights for policy 0, policy_version 24430 (0.0022) [2024-07-02 13:38:50,193][36999] Updated weights for policy 0, policy_version 24440 (0.0040) [2024-07-02 13:38:51,098][36761] Fps is (10 sec: 45864.2, 60 sec: 44238.3, 300 sec: 43986.5). Total num frames: 400457728. Throughput: 0: 43874.1. Samples: 400575320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 13:38:51,098][36761] Avg episode reward: [(0, '0.939')] [2024-07-02 13:38:53,542][36999] Updated weights for policy 0, policy_version 24450 (0.0019) [2024-07-02 13:38:56,100][36761] Fps is (10 sec: 40940.8, 60 sec: 43414.2, 300 sec: 43930.7). Total num frames: 400654336. Throughput: 0: 44102.5. Samples: 400837200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 13:38:56,101][36761] Avg episode reward: [(0, '0.968')] [2024-07-02 13:38:57,447][36979] Signal inference workers to stop experience collection... (5850 times) [2024-07-02 13:38:57,484][36999] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-07-02 13:38:57,506][36979] Signal inference workers to resume experience collection... (5850 times) [2024-07-02 13:38:57,508][36999] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-07-02 13:38:57,652][36999] Updated weights for policy 0, policy_version 24460 (0.0033) [2024-07-02 13:39:01,095][36761] Fps is (10 sec: 44247.6, 60 sec: 43963.6, 300 sec: 44042.4). Total num frames: 400900096. Throughput: 0: 43856.5. Samples: 400962700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 13:39:01,096][36761] Avg episode reward: [(0, '0.971')] [2024-07-02 13:39:01,154][36999] Updated weights for policy 0, policy_version 24470 (0.0027) [2024-07-02 13:39:05,176][36999] Updated weights for policy 0, policy_version 24480 (0.0025) [2024-07-02 13:39:06,095][36761] Fps is (10 sec: 47534.9, 60 sec: 44236.7, 300 sec: 43986.8). Total num frames: 401129472. Throughput: 0: 43953.6. Samples: 401235980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 13:39:06,096][36761] Avg episode reward: [(0, '0.934')] [2024-07-02 13:39:08,631][36999] Updated weights for policy 0, policy_version 24490 (0.0035) [2024-07-02 13:39:11,095][36761] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 43931.3). Total num frames: 401309696. Throughput: 0: 43991.0. Samples: 401495820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 13:39:11,096][36761] Avg episode reward: [(0, '0.935')] [2024-07-02 13:39:12,903][36999] Updated weights for policy 0, policy_version 24500 (0.0035) [2024-07-02 13:39:16,095][36761] Fps is (10 sec: 44238.2, 60 sec: 44237.0, 300 sec: 44098.0). Total num frames: 401571840. Throughput: 0: 44089.8. Samples: 401624680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 13:39:16,095][36761] Avg episode reward: [(0, '0.963')] [2024-07-02 13:39:16,101][36999] Updated weights for policy 0, policy_version 24510 (0.0035) [2024-07-02 13:39:20,234][36999] Updated weights for policy 0, policy_version 24520 (0.0032) [2024-07-02 13:39:21,095][36761] Fps is (10 sec: 47513.3, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 401784832. Throughput: 0: 44274.6. Samples: 401903740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 13:39:21,096][36761] Avg episode reward: [(0, '0.914')] [2024-07-02 13:39:23,812][36999] Updated weights for policy 0, policy_version 24530 (0.0030) [2024-07-02 13:39:26,095][36761] Fps is (10 sec: 40959.5, 60 sec: 43965.3, 300 sec: 43986.9). Total num frames: 401981440. Throughput: 0: 44358.8. Samples: 402165940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 13:39:26,096][36761] Avg episode reward: [(0, '0.915')] [2024-07-02 13:39:27,639][36999] Updated weights for policy 0, policy_version 24540 (0.0034) [2024-07-02 13:39:31,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 402210816. Throughput: 0: 44026.2. Samples: 402285980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 13:39:31,096][36761] Avg episode reward: [(0, '0.923')] [2024-07-02 13:39:31,191][36999] Updated weights for policy 0, policy_version 24550 (0.0031) [2024-07-02 13:39:34,975][36999] Updated weights for policy 0, policy_version 24560 (0.0031) [2024-07-02 13:39:36,095][36761] Fps is (10 sec: 45874.9, 60 sec: 43963.7, 300 sec: 44042.7). Total num frames: 402440192. Throughput: 0: 44078.9. Samples: 402558760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 13:39:36,096][36761] Avg episode reward: [(0, '0.961')] [2024-07-02 13:39:38,542][36999] Updated weights for policy 0, policy_version 24570 (0.0025) [2024-07-02 13:39:41,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43963.8, 300 sec: 43931.4). Total num frames: 402636800. Throughput: 0: 44141.4. Samples: 402823360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:39:41,096][36761] Avg episode reward: [(0, '0.966')] [2024-07-02 13:39:42,381][36999] Updated weights for policy 0, policy_version 24580 (0.0027) [2024-07-02 13:39:45,885][36999] Updated weights for policy 0, policy_version 24590 (0.0025) [2024-07-02 13:39:46,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 402882560. Throughput: 0: 44153.4. Samples: 402949600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:39:46,096][36761] Avg episode reward: [(0, '0.973')] [2024-07-02 13:39:49,939][36999] Updated weights for policy 0, policy_version 24600 (0.0032) [2024-07-02 13:39:51,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43692.5, 300 sec: 43986.9). Total num frames: 403079168. Throughput: 0: 43944.1. Samples: 403213460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:39:51,096][36761] Avg episode reward: [(0, '0.981')] [2024-07-02 13:39:53,237][36999] Updated weights for policy 0, policy_version 24610 (0.0037) [2024-07-02 13:39:56,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43967.2, 300 sec: 43931.3). Total num frames: 403292160. Throughput: 0: 43935.6. Samples: 403472920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:39:56,095][36761] Avg episode reward: [(0, '0.976')] [2024-07-02 13:39:57,536][36999] Updated weights for policy 0, policy_version 24620 (0.0023) [2024-07-02 13:40:00,703][36999] Updated weights for policy 0, policy_version 24630 (0.0031) [2024-07-02 13:40:01,100][36761] Fps is (10 sec: 45854.7, 60 sec: 43960.5, 300 sec: 43986.2). Total num frames: 403537920. Throughput: 0: 43961.2. Samples: 403603140. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-07-02 13:40:01,100][36761] Avg episode reward: [(0, '0.979')] [2024-07-02 13:40:04,991][36999] Updated weights for policy 0, policy_version 24640 (0.0030) [2024-07-02 13:40:06,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43690.8, 300 sec: 43931.3). Total num frames: 403750912. Throughput: 0: 43781.8. Samples: 403873920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-07-02 13:40:06,096][36761] Avg episode reward: [(0, '0.976')] [2024-07-02 13:40:06,850][36979] Signal inference workers to stop experience collection... (5900 times) [2024-07-02 13:40:06,850][36979] Signal inference workers to resume experience collection... (5900 times) [2024-07-02 13:40:06,871][36999] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-07-02 13:40:06,871][36999] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-07-02 13:40:08,034][36999] Updated weights for policy 0, policy_version 24650 (0.0025) [2024-07-02 13:40:11,095][36761] Fps is (10 sec: 40978.5, 60 sec: 43963.7, 300 sec: 43820.3). Total num frames: 403947520. Throughput: 0: 43706.6. Samples: 404132740. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-07-02 13:40:11,096][36761] Avg episode reward: [(0, '0.954')] [2024-07-02 13:40:12,419][36999] Updated weights for policy 0, policy_version 24660 (0.0030) [2024-07-02 13:40:15,400][36999] Updated weights for policy 0, policy_version 24670 (0.0040) [2024-07-02 13:40:16,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43690.5, 300 sec: 44042.4). Total num frames: 404193280. Throughput: 0: 43919.5. Samples: 404262360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-07-02 13:40:16,096][36761] Avg episode reward: [(0, '0.957')] [2024-07-02 13:40:19,764][36999] Updated weights for policy 0, policy_version 24680 (0.0026) [2024-07-02 13:40:21,095][36761] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 404406272. Throughput: 0: 43842.8. Samples: 404531680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-07-02 13:40:21,096][36761] Avg episode reward: [(0, '0.983')] [2024-07-02 13:40:22,715][36999] Updated weights for policy 0, policy_version 24690 (0.0029) [2024-07-02 13:40:26,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 404619264. Throughput: 0: 43800.5. Samples: 404794380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-07-02 13:40:26,096][36761] Avg episode reward: [(0, '0.979')] [2024-07-02 13:40:27,649][36999] Updated weights for policy 0, policy_version 24700 (0.0031) [2024-07-02 13:40:30,643][36999] Updated weights for policy 0, policy_version 24710 (0.0035) [2024-07-02 13:40:31,095][36761] Fps is (10 sec: 45875.0, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 404865024. Throughput: 0: 43812.5. Samples: 404921160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-07-02 13:40:31,096][36761] Avg episode reward: [(0, '0.979')] [2024-07-02 13:40:35,037][36999] Updated weights for policy 0, policy_version 24720 (0.0029) [2024-07-02 13:40:36,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 405061632. Throughput: 0: 43998.3. Samples: 405193380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-07-02 13:40:36,095][36761] Avg episode reward: [(0, '0.977')] [2024-07-02 13:40:36,243][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000024725_405094400.pth... [2024-07-02 13:40:36,296][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000024078_394493952.pth [2024-07-02 13:40:37,973][36999] Updated weights for policy 0, policy_version 24730 (0.0026) [2024-07-02 13:40:41,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43963.7, 300 sec: 43820.3). Total num frames: 405274624. Throughput: 0: 44031.5. Samples: 405454340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-07-02 13:40:41,100][36761] Avg episode reward: [(0, '0.972')] [2024-07-02 13:40:42,441][36999] Updated weights for policy 0, policy_version 24740 (0.0049) [2024-07-02 13:40:45,413][36999] Updated weights for policy 0, policy_version 24750 (0.0022) [2024-07-02 13:40:46,099][36761] Fps is (10 sec: 45857.9, 60 sec: 43961.0, 300 sec: 43986.3). Total num frames: 405520384. Throughput: 0: 43986.6. Samples: 405582500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-07-02 13:40:46,099][36761] Avg episode reward: [(0, '0.971')] [2024-07-02 13:40:49,869][36999] Updated weights for policy 0, policy_version 24760 (0.0026) [2024-07-02 13:40:51,095][36761] Fps is (10 sec: 45875.4, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 405733376. Throughput: 0: 43940.9. Samples: 405851260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 13:40:51,096][36761] Avg episode reward: [(0, '0.971')] [2024-07-02 13:40:52,923][36999] Updated weights for policy 0, policy_version 24770 (0.0030) [2024-07-02 13:40:56,095][36761] Fps is (10 sec: 40975.2, 60 sec: 43963.7, 300 sec: 43820.2). Total num frames: 405929984. Throughput: 0: 43946.2. Samples: 406110320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 13:40:56,096][36761] Avg episode reward: [(0, '0.961')] [2024-07-02 13:40:57,290][36999] Updated weights for policy 0, policy_version 24780 (0.0021) [2024-07-02 13:41:00,659][36999] Updated weights for policy 0, policy_version 24790 (0.0036) [2024-07-02 13:41:01,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43694.0, 300 sec: 43986.9). Total num frames: 406159360. Throughput: 0: 43894.7. Samples: 406237620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 13:41:01,096][36761] Avg episode reward: [(0, '0.968')] [2024-07-02 13:41:04,662][36999] Updated weights for policy 0, policy_version 24800 (0.0049) [2024-07-02 13:41:06,095][36761] Fps is (10 sec: 45875.2, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 406388736. Throughput: 0: 43909.3. Samples: 406507600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 13:41:06,096][36761] Avg episode reward: [(0, '0.968')] [2024-07-02 13:41:08,373][36999] Updated weights for policy 0, policy_version 24810 (0.0041) [2024-07-02 13:41:11,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43963.8, 300 sec: 43764.7). Total num frames: 406585344. Throughput: 0: 43912.5. Samples: 406770440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 13:41:11,095][36761] Avg episode reward: [(0, '0.971')] [2024-07-02 13:41:12,063][36999] Updated weights for policy 0, policy_version 24820 (0.0028) [2024-07-02 13:41:15,852][36999] Updated weights for policy 0, policy_version 24830 (0.0039) [2024-07-02 13:41:16,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 406814720. Throughput: 0: 43930.7. Samples: 406898040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 13:41:16,095][36761] Avg episode reward: [(0, '0.961')] [2024-07-02 13:41:19,419][36999] Updated weights for policy 0, policy_version 24840 (0.0032) [2024-07-02 13:41:21,095][36761] Fps is (10 sec: 47513.3, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 407060480. Throughput: 0: 43753.7. Samples: 407162300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 13:41:21,096][36761] Avg episode reward: [(0, '0.973')] [2024-07-02 13:41:23,236][36999] Updated weights for policy 0, policy_version 24850 (0.0036) [2024-07-02 13:41:26,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 407257088. Throughput: 0: 43865.4. Samples: 407428280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 13:41:26,096][36761] Avg episode reward: [(0, '0.971')] [2024-07-02 13:41:26,789][36999] Updated weights for policy 0, policy_version 24860 (0.0038) [2024-07-02 13:41:30,595][36979] Signal inference workers to stop experience collection... (5950 times) [2024-07-02 13:41:30,596][36979] Signal inference workers to resume experience collection... (5950 times) [2024-07-02 13:41:30,644][36999] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-07-02 13:41:30,644][36999] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-07-02 13:41:30,734][36999] Updated weights for policy 0, policy_version 24870 (0.0043) [2024-07-02 13:41:31,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 44042.4). Total num frames: 407470080. Throughput: 0: 43796.6. Samples: 407553180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 13:41:31,096][36761] Avg episode reward: [(0, '0.966')] [2024-07-02 13:41:34,486][36999] Updated weights for policy 0, policy_version 24880 (0.0027) [2024-07-02 13:41:36,095][36761] Fps is (10 sec: 45874.5, 60 sec: 44236.7, 300 sec: 44042.4). Total num frames: 407715840. Throughput: 0: 43756.7. Samples: 407820320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 13:41:36,096][36761] Avg episode reward: [(0, '0.974')] [2024-07-02 13:41:38,135][36999] Updated weights for policy 0, policy_version 24890 (0.0033) [2024-07-02 13:41:41,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 407896064. Throughput: 0: 43782.3. Samples: 408080520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 13:41:41,096][36761] Avg episode reward: [(0, '0.989')] [2024-07-02 13:41:41,180][36979] Saving new best policy, reward=0.989! [2024-07-02 13:41:41,851][36999] Updated weights for policy 0, policy_version 24900 (0.0030) [2024-07-02 13:41:45,523][36999] Updated weights for policy 0, policy_version 24910 (0.0028) [2024-07-02 13:41:46,095][36761] Fps is (10 sec: 42599.0, 60 sec: 43693.4, 300 sec: 44098.0). Total num frames: 408141824. Throughput: 0: 43706.7. Samples: 408204420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 13:41:46,096][36761] Avg episode reward: [(0, '0.985')] [2024-07-02 13:41:49,246][36999] Updated weights for policy 0, policy_version 24920 (0.0038) [2024-07-02 13:41:51,095][36761] Fps is (10 sec: 47513.5, 60 sec: 43963.7, 300 sec: 43987.0). Total num frames: 408371200. Throughput: 0: 43718.7. Samples: 408474940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 13:41:51,096][36761] Avg episode reward: [(0, '0.985')] [2024-07-02 13:41:52,880][36999] Updated weights for policy 0, policy_version 24930 (0.0033) [2024-07-02 13:41:56,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 408551424. Throughput: 0: 43854.2. Samples: 408743880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 13:41:56,096][36761] Avg episode reward: [(0, '0.972')] [2024-07-02 13:41:56,825][36999] Updated weights for policy 0, policy_version 24940 (0.0031) [2024-07-02 13:42:00,260][36999] Updated weights for policy 0, policy_version 24950 (0.0031) [2024-07-02 13:42:01,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 408797184. Throughput: 0: 43707.0. Samples: 408864860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 13:42:01,096][36761] Avg episode reward: [(0, '0.981')] [2024-07-02 13:42:04,237][36999] Updated weights for policy 0, policy_version 24960 (0.0035) [2024-07-02 13:42:06,095][36761] Fps is (10 sec: 49151.5, 60 sec: 44236.8, 300 sec: 43931.3). Total num frames: 409042944. Throughput: 0: 43849.3. Samples: 409135520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 13:42:06,096][36761] Avg episode reward: [(0, '1.002')] [2024-07-02 13:42:06,120][36979] Saving new best policy, reward=1.002! [2024-07-02 13:42:07,896][36999] Updated weights for policy 0, policy_version 24970 (0.0036) [2024-07-02 13:42:11,102][36761] Fps is (10 sec: 40932.8, 60 sec: 43685.7, 300 sec: 43763.7). Total num frames: 409206784. Throughput: 0: 43911.2. Samples: 409404580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 13:42:11,103][36761] Avg episode reward: [(0, '0.991')] [2024-07-02 13:42:11,820][36999] Updated weights for policy 0, policy_version 24980 (0.0028) [2024-07-02 13:42:15,210][36999] Updated weights for policy 0, policy_version 24990 (0.0026) [2024-07-02 13:42:16,095][36761] Fps is (10 sec: 40960.6, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 409452544. Throughput: 0: 43956.0. Samples: 409531200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-07-02 13:42:16,096][36761] Avg episode reward: [(0, '0.988')] [2024-07-02 13:42:19,234][36999] Updated weights for policy 0, policy_version 25000 (0.0029) [2024-07-02 13:42:21,099][36761] Fps is (10 sec: 47527.1, 60 sec: 43687.9, 300 sec: 43930.8). Total num frames: 409681920. Throughput: 0: 43793.7. Samples: 409791200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-07-02 13:42:21,100][36761] Avg episode reward: [(0, '0.972')] [2024-07-02 13:42:22,554][36999] Updated weights for policy 0, policy_version 25010 (0.0030) [2024-07-02 13:42:26,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43690.6, 300 sec: 43820.3). Total num frames: 409878528. Throughput: 0: 44185.2. Samples: 410068860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-07-02 13:42:26,096][36761] Avg episode reward: [(0, '0.968')] [2024-07-02 13:42:26,546][36999] Updated weights for policy 0, policy_version 25020 (0.0021) [2024-07-02 13:42:29,878][36999] Updated weights for policy 0, policy_version 25030 (0.0027) [2024-07-02 13:42:31,095][36761] Fps is (10 sec: 45892.7, 60 sec: 44509.8, 300 sec: 44042.4). Total num frames: 410140672. Throughput: 0: 44207.5. Samples: 410193760. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-07-02 13:42:31,096][36761] Avg episode reward: [(0, '0.968')] [2024-07-02 13:42:34,141][36999] Updated weights for policy 0, policy_version 25040 (0.0046) [2024-07-02 13:42:36,095][36761] Fps is (10 sec: 45874.8, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 410337280. Throughput: 0: 43968.3. Samples: 410453520. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-07-02 13:42:36,096][36761] Avg episode reward: [(0, '0.977')] [2024-07-02 13:42:36,106][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000025045_410337280.pth... [2024-07-02 13:42:36,169][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000024402_399802368.pth [2024-07-02 13:42:37,337][36999] Updated weights for policy 0, policy_version 25050 (0.0038) [2024-07-02 13:42:41,095][36761] Fps is (10 sec: 39321.8, 60 sec: 43963.7, 300 sec: 43820.3). Total num frames: 410533888. Throughput: 0: 44015.1. Samples: 410724560. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-07-02 13:42:41,096][36761] Avg episode reward: [(0, '0.976')] [2024-07-02 13:42:41,767][36999] Updated weights for policy 0, policy_version 25060 (0.0022) [2024-07-02 13:42:44,733][36999] Updated weights for policy 0, policy_version 25070 (0.0035) [2024-07-02 13:42:46,096][36761] Fps is (10 sec: 45872.9, 60 sec: 44236.3, 300 sec: 44043.0). Total num frames: 410796032. Throughput: 0: 44085.2. Samples: 410848720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:42:46,097][36761] Avg episode reward: [(0, '0.969')] [2024-07-02 13:42:47,907][36979] Signal inference workers to stop experience collection... (6000 times) [2024-07-02 13:42:47,907][36979] Signal inference workers to resume experience collection... (6000 times) [2024-07-02 13:42:47,956][36999] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-07-02 13:42:47,956][36999] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-07-02 13:42:49,164][36999] Updated weights for policy 0, policy_version 25080 (0.0040) [2024-07-02 13:42:51,095][36761] Fps is (10 sec: 47513.5, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 411009024. Throughput: 0: 43889.4. Samples: 411110540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:42:51,096][36761] Avg episode reward: [(0, '0.972')] [2024-07-02 13:42:52,056][36999] Updated weights for policy 0, policy_version 25090 (0.0024) [2024-07-02 13:42:56,095][36761] Fps is (10 sec: 39324.0, 60 sec: 43963.7, 300 sec: 43820.2). Total num frames: 411189248. Throughput: 0: 44066.5. Samples: 411387280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:42:56,096][36761] Avg episode reward: [(0, '0.993')] [2024-07-02 13:42:56,491][36999] Updated weights for policy 0, policy_version 25100 (0.0030) [2024-07-02 13:42:59,398][36999] Updated weights for policy 0, policy_version 25110 (0.0023) [2024-07-02 13:43:01,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 411435008. Throughput: 0: 44014.5. Samples: 411511860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:43:01,096][36761] Avg episode reward: [(0, '0.993')] [2024-07-02 13:43:04,085][36999] Updated weights for policy 0, policy_version 25120 (0.0035) [2024-07-02 13:43:06,095][36761] Fps is (10 sec: 47513.9, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 411664384. Throughput: 0: 44078.9. Samples: 411774580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:43:06,096][36761] Avg episode reward: [(0, '0.994')] [2024-07-02 13:43:06,784][36999] Updated weights for policy 0, policy_version 25130 (0.0046) [2024-07-02 13:43:11,095][36761] Fps is (10 sec: 42598.1, 60 sec: 44241.6, 300 sec: 43875.8). Total num frames: 411860992. Throughput: 0: 44018.1. Samples: 412049680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:43:11,096][36761] Avg episode reward: [(0, '0.974')] [2024-07-02 13:43:11,407][36999] Updated weights for policy 0, policy_version 25140 (0.0027) [2024-07-02 13:43:14,799][36999] Updated weights for policy 0, policy_version 25150 (0.0035) [2024-07-02 13:43:16,095][36761] Fps is (10 sec: 44236.1, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 412106752. Throughput: 0: 43976.8. Samples: 412172720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:43:16,096][36761] Avg episode reward: [(0, '0.969')] [2024-07-02 13:43:18,683][36999] Updated weights for policy 0, policy_version 25160 (0.0032) [2024-07-02 13:43:21,095][36761] Fps is (10 sec: 45876.0, 60 sec: 43966.6, 300 sec: 43987.2). Total num frames: 412319744. Throughput: 0: 44085.1. Samples: 412437340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-07-02 13:43:21,096][36761] Avg episode reward: [(0, '0.985')] [2024-07-02 13:43:22,229][36999] Updated weights for policy 0, policy_version 25170 (0.0024) [2024-07-02 13:43:26,095][36761] Fps is (10 sec: 42598.6, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 412532736. Throughput: 0: 44083.9. Samples: 412708340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-07-02 13:43:26,096][36761] Avg episode reward: [(0, '0.978')] [2024-07-02 13:43:26,338][36999] Updated weights for policy 0, policy_version 25180 (0.0033) [2024-07-02 13:43:29,580][36999] Updated weights for policy 0, policy_version 25190 (0.0037) [2024-07-02 13:43:31,095][36761] Fps is (10 sec: 45874.6, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 412778496. Throughput: 0: 44191.2. Samples: 412837300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-07-02 13:43:31,096][36761] Avg episode reward: [(0, '0.983')] [2024-07-02 13:43:33,702][36999] Updated weights for policy 0, policy_version 25200 (0.0032) [2024-07-02 13:43:36,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 412975104. Throughput: 0: 44323.1. Samples: 413105080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-07-02 13:43:36,096][36761] Avg episode reward: [(0, '0.968')] [2024-07-02 13:43:36,954][36999] Updated weights for policy 0, policy_version 25210 (0.0036) [2024-07-02 13:43:41,095][36761] Fps is (10 sec: 40960.6, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 413188096. Throughput: 0: 44114.3. Samples: 413372420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 13:43:41,096][36761] Avg episode reward: [(0, '0.972')] [2024-07-02 13:43:41,118][36999] Updated weights for policy 0, policy_version 25220 (0.0023) [2024-07-02 13:43:44,302][36999] Updated weights for policy 0, policy_version 25230 (0.0030) [2024-07-02 13:43:46,095][36761] Fps is (10 sec: 45875.6, 60 sec: 43964.2, 300 sec: 43987.3). Total num frames: 413433856. Throughput: 0: 44196.5. Samples: 413500700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 13:43:46,096][36761] Avg episode reward: [(0, '0.990')] [2024-07-02 13:43:48,466][36999] Updated weights for policy 0, policy_version 25240 (0.0027) [2024-07-02 13:43:51,095][36761] Fps is (10 sec: 45875.2, 60 sec: 43963.8, 300 sec: 44043.1). Total num frames: 413646848. Throughput: 0: 44295.1. Samples: 413767860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-07-02 13:43:51,096][36761] Avg episode reward: [(0, '1.003')] [2024-07-02 13:43:51,571][36999] Updated weights for policy 0, policy_version 25250 (0.0024) [2024-07-02 13:43:55,694][36999] Updated weights for policy 0, policy_version 25260 (0.0025) [2024-07-02 13:43:56,095][36761] Fps is (10 sec: 42598.1, 60 sec: 44509.8, 300 sec: 43931.3). Total num frames: 413859840. Throughput: 0: 44063.2. Samples: 414032520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 13:43:56,096][36761] Avg episode reward: [(0, '0.998')] [2024-07-02 13:43:59,018][36999] Updated weights for policy 0, policy_version 25270 (0.0038) [2024-07-02 13:44:00,080][36979] Signal inference workers to stop experience collection... (6050 times) [2024-07-02 13:44:00,080][36979] Signal inference workers to resume experience collection... (6050 times) [2024-07-02 13:44:00,135][36999] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-07-02 13:44:00,136][36999] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-07-02 13:44:01,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44236.9, 300 sec: 43931.4). Total num frames: 414089216. Throughput: 0: 44235.7. Samples: 414163320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 13:44:01,095][36761] Avg episode reward: [(0, '0.999')] [2024-07-02 13:44:02,984][36999] Updated weights for policy 0, policy_version 25280 (0.0032) [2024-07-02 13:44:06,095][36761] Fps is (10 sec: 45875.5, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 414318592. Throughput: 0: 44325.8. Samples: 414432000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 13:44:06,096][36761] Avg episode reward: [(0, '0.989')] [2024-07-02 13:44:06,447][36999] Updated weights for policy 0, policy_version 25290 (0.0044) [2024-07-02 13:44:10,429][36999] Updated weights for policy 0, policy_version 25300 (0.0041) [2024-07-02 13:44:11,095][36761] Fps is (10 sec: 44236.7, 60 sec: 44510.0, 300 sec: 43931.3). Total num frames: 414531584. Throughput: 0: 44218.8. Samples: 414698180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 13:44:11,096][36761] Avg episode reward: [(0, '0.984')] [2024-07-02 13:44:13,729][36999] Updated weights for policy 0, policy_version 25310 (0.0027) [2024-07-02 13:44:16,095][36761] Fps is (10 sec: 44236.9, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 414760960. Throughput: 0: 44114.8. Samples: 414822460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 13:44:16,096][36761] Avg episode reward: [(0, '0.983')] [2024-07-02 13:44:17,798][36999] Updated weights for policy 0, policy_version 25320 (0.0045) [2024-07-02 13:44:21,073][36999] Updated weights for policy 0, policy_version 25330 (0.0042) [2024-07-02 13:44:21,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44783.0, 300 sec: 44153.5). Total num frames: 415006720. Throughput: 0: 44198.8. Samples: 415094020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 13:44:21,096][36761] Avg episode reward: [(0, '0.984')] [2024-07-02 13:44:25,183][36999] Updated weights for policy 0, policy_version 25340 (0.0032) [2024-07-02 13:44:26,100][36761] Fps is (10 sec: 42578.6, 60 sec: 44233.4, 300 sec: 43986.2). Total num frames: 415186944. Throughput: 0: 44104.8. Samples: 415357340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 13:44:26,101][36761] Avg episode reward: [(0, '0.979')] [2024-07-02 13:44:28,736][36999] Updated weights for policy 0, policy_version 25350 (0.0024) [2024-07-02 13:44:31,095][36761] Fps is (10 sec: 40959.4, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 415416320. Throughput: 0: 44186.6. Samples: 415489100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 13:44:31,096][36761] Avg episode reward: [(0, '0.975')] [2024-07-02 13:44:32,498][36999] Updated weights for policy 0, policy_version 25360 (0.0023) [2024-07-02 13:44:36,095][36761] Fps is (10 sec: 45896.6, 60 sec: 44509.9, 300 sec: 44098.0). Total num frames: 415645696. Throughput: 0: 44136.4. Samples: 415754000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 13:44:36,096][36761] Avg episode reward: [(0, '0.981')] [2024-07-02 13:44:36,204][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000025370_415662080.pth... [2024-07-02 13:44:36,209][36999] Updated weights for policy 0, policy_version 25370 (0.0036) [2024-07-02 13:44:36,260][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000024725_405094400.pth [2024-07-02 13:44:39,910][36999] Updated weights for policy 0, policy_version 25380 (0.0041) [2024-07-02 13:44:41,095][36761] Fps is (10 sec: 42598.5, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 415842304. Throughput: 0: 44158.2. Samples: 416019640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 13:44:41,096][36761] Avg episode reward: [(0, '0.988')] [2024-07-02 13:44:43,570][36999] Updated weights for policy 0, policy_version 25390 (0.0036) [2024-07-02 13:44:46,095][36761] Fps is (10 sec: 42597.9, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 416071680. Throughput: 0: 44025.2. Samples: 416144460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 13:44:46,096][36761] Avg episode reward: [(0, '0.976')] [2024-07-02 13:44:47,635][36999] Updated weights for policy 0, policy_version 25400 (0.0037) [2024-07-02 13:44:51,095][36761] Fps is (10 sec: 45874.9, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 416301056. Throughput: 0: 44038.1. Samples: 416413720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 13:44:51,096][36761] Avg episode reward: [(0, '0.985')] [2024-07-02 13:44:51,155][36999] Updated weights for policy 0, policy_version 25410 (0.0031) [2024-07-02 13:44:54,958][36999] Updated weights for policy 0, policy_version 25420 (0.0030) [2024-07-02 13:44:56,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44236.8, 300 sec: 43987.6). Total num frames: 416514048. Throughput: 0: 44056.4. Samples: 416680720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 13:44:56,096][36761] Avg episode reward: [(0, '0.985')] [2024-07-02 13:44:58,809][36999] Updated weights for policy 0, policy_version 25430 (0.0039) [2024-07-02 13:45:01,095][36761] Fps is (10 sec: 42599.0, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 416727040. Throughput: 0: 44138.7. Samples: 416808700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 13:45:01,096][36761] Avg episode reward: [(0, '0.992')] [2024-07-02 13:45:02,341][36999] Updated weights for policy 0, policy_version 25440 (0.0029) [2024-07-02 13:45:06,095][36761] Fps is (10 sec: 44236.3, 60 sec: 43963.6, 300 sec: 44097.9). Total num frames: 416956416. Throughput: 0: 43952.3. Samples: 417071880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 13:45:06,096][36761] Avg episode reward: [(0, '0.988')] [2024-07-02 13:45:06,141][36999] Updated weights for policy 0, policy_version 25450 (0.0030) [2024-07-02 13:45:09,823][36999] Updated weights for policy 0, policy_version 25460 (0.0029) [2024-07-02 13:45:11,095][36761] Fps is (10 sec: 44236.2, 60 sec: 43963.6, 300 sec: 43986.9). Total num frames: 417169408. Throughput: 0: 43972.4. Samples: 417335900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 13:45:11,096][36761] Avg episode reward: [(0, '0.986')] [2024-07-02 13:45:13,931][36999] Updated weights for policy 0, policy_version 25470 (0.0029) [2024-07-02 13:45:16,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 417382400. Throughput: 0: 43926.3. Samples: 417465780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 13:45:16,099][36761] Avg episode reward: [(0, '1.007')] [2024-07-02 13:45:16,111][36979] Saving new best policy, reward=1.007! [2024-07-02 13:45:17,158][36999] Updated weights for policy 0, policy_version 25480 (0.0039) [2024-07-02 13:45:21,097][36761] Fps is (10 sec: 44228.2, 60 sec: 43416.1, 300 sec: 44042.1). Total num frames: 417611776. Throughput: 0: 43790.9. Samples: 417724680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 13:45:21,098][36761] Avg episode reward: [(0, '1.005')] [2024-07-02 13:45:21,310][36999] Updated weights for policy 0, policy_version 25490 (0.0024) [2024-07-02 13:45:24,467][36999] Updated weights for policy 0, policy_version 25500 (0.0035) [2024-07-02 13:45:26,095][36761] Fps is (10 sec: 45874.8, 60 sec: 44240.1, 300 sec: 43986.9). Total num frames: 417841152. Throughput: 0: 44036.4. Samples: 418001280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 13:45:26,096][36761] Avg episode reward: [(0, '0.999')] [2024-07-02 13:45:28,574][36999] Updated weights for policy 0, policy_version 25510 (0.0040) [2024-07-02 13:45:31,095][36761] Fps is (10 sec: 44245.3, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 418054144. Throughput: 0: 44154.6. Samples: 418131420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 13:45:31,096][36761] Avg episode reward: [(0, '0.994')] [2024-07-02 13:45:32,116][36999] Updated weights for policy 0, policy_version 25520 (0.0037) [2024-07-02 13:45:36,010][36999] Updated weights for policy 0, policy_version 25530 (0.0028) [2024-07-02 13:45:36,095][36761] Fps is (10 sec: 44237.6, 60 sec: 43963.8, 300 sec: 44098.0). Total num frames: 418283520. Throughput: 0: 44129.9. Samples: 418399560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:45:36,096][36761] Avg episode reward: [(0, '0.974')] [2024-07-02 13:45:40,145][36999] Updated weights for policy 0, policy_version 25540 (0.0030) [2024-07-02 13:45:40,181][36979] Signal inference workers to stop experience collection... (6100 times) [2024-07-02 13:45:40,182][36979] Signal inference workers to resume experience collection... (6100 times) [2024-07-02 13:45:40,232][36999] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-07-02 13:45:40,232][36999] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-07-02 13:45:41,095][36761] Fps is (10 sec: 44237.8, 60 sec: 44236.9, 300 sec: 43987.4). Total num frames: 418496512. Throughput: 0: 43968.1. Samples: 418659280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:45:41,095][36761] Avg episode reward: [(0, '0.994')] [2024-07-02 13:45:43,733][36999] Updated weights for policy 0, policy_version 25550 (0.0042) [2024-07-02 13:45:46,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 418709504. Throughput: 0: 44112.0. Samples: 418793740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:45:46,096][36761] Avg episode reward: [(0, '0.998')] [2024-07-02 13:45:47,476][36999] Updated weights for policy 0, policy_version 25560 (0.0032) [2024-07-02 13:45:51,035][36999] Updated weights for policy 0, policy_version 25570 (0.0037) [2024-07-02 13:45:51,095][36761] Fps is (10 sec: 44236.1, 60 sec: 43963.8, 300 sec: 44097.9). Total num frames: 418938880. Throughput: 0: 44050.3. Samples: 419054140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 13:45:51,096][36761] Avg episode reward: [(0, '0.997')] [2024-07-02 13:45:54,831][36999] Updated weights for policy 0, policy_version 25580 (0.0033) [2024-07-02 13:45:56,095][36761] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 419168256. Throughput: 0: 44053.0. Samples: 419318280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:45:56,096][36761] Avg episode reward: [(0, '0.995')] [2024-07-02 13:45:58,284][36999] Updated weights for policy 0, policy_version 25590 (0.0028) [2024-07-02 13:46:01,095][36761] Fps is (10 sec: 44237.4, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 419381248. Throughput: 0: 44126.8. Samples: 419451480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:46:01,095][36761] Avg episode reward: [(0, '1.007')] [2024-07-02 13:46:02,107][36999] Updated weights for policy 0, policy_version 25600 (0.0027) [2024-07-02 13:46:05,545][36999] Updated weights for policy 0, policy_version 25610 (0.0029) [2024-07-02 13:46:06,095][36761] Fps is (10 sec: 44236.2, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 419610624. Throughput: 0: 44361.5. Samples: 419720860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:46:06,096][36761] Avg episode reward: [(0, '1.007')] [2024-07-02 13:46:09,606][36999] Updated weights for policy 0, policy_version 25620 (0.0043) [2024-07-02 13:46:11,096][36761] Fps is (10 sec: 42597.4, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 419807232. Throughput: 0: 44028.4. Samples: 419982560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 13:46:11,096][36761] Avg episode reward: [(0, '1.002')] [2024-07-02 13:46:12,867][36999] Updated weights for policy 0, policy_version 25630 (0.0033) [2024-07-02 13:46:16,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44509.8, 300 sec: 44042.4). Total num frames: 420052992. Throughput: 0: 44081.4. Samples: 420115080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 13:46:16,096][36761] Avg episode reward: [(0, '1.003')] [2024-07-02 13:46:16,918][36999] Updated weights for policy 0, policy_version 25640 (0.0027) [2024-07-02 13:46:20,359][36999] Updated weights for policy 0, policy_version 25650 (0.0030) [2024-07-02 13:46:21,095][36761] Fps is (10 sec: 44237.5, 60 sec: 43965.2, 300 sec: 44042.4). Total num frames: 420249600. Throughput: 0: 43875.9. Samples: 420373980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 13:46:21,096][36761] Avg episode reward: [(0, '0.990')] [2024-07-02 13:46:24,421][36999] Updated weights for policy 0, policy_version 25660 (0.0031) [2024-07-02 13:46:26,096][36761] Fps is (10 sec: 42598.0, 60 sec: 43963.7, 300 sec: 44097.9). Total num frames: 420478976. Throughput: 0: 43978.4. Samples: 420638320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 13:46:26,096][36761] Avg episode reward: [(0, '0.997')] [2024-07-02 13:46:28,110][36999] Updated weights for policy 0, policy_version 25670 (0.0036) [2024-07-02 13:46:31,095][36761] Fps is (10 sec: 45875.2, 60 sec: 44236.9, 300 sec: 44042.4). Total num frames: 420708352. Throughput: 0: 43913.3. Samples: 420769840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-07-02 13:46:31,096][36761] Avg episode reward: [(0, '0.997')] [2024-07-02 13:46:31,824][36999] Updated weights for policy 0, policy_version 25680 (0.0028) [2024-07-02 13:46:35,557][36999] Updated weights for policy 0, policy_version 25690 (0.0034) [2024-07-02 13:46:36,095][36761] Fps is (10 sec: 44237.9, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 420921344. Throughput: 0: 43962.8. Samples: 421032460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-07-02 13:46:36,096][36761] Avg episode reward: [(0, '0.995')] [2024-07-02 13:46:36,107][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000025691_420921344.pth... [2024-07-02 13:46:36,162][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000025045_410337280.pth [2024-07-02 13:46:39,126][36999] Updated weights for policy 0, policy_version 25700 (0.0025) [2024-07-02 13:46:41,095][36761] Fps is (10 sec: 44236.9, 60 sec: 44236.7, 300 sec: 44098.0). Total num frames: 421150720. Throughput: 0: 44219.5. Samples: 421308160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-07-02 13:46:41,096][36761] Avg episode reward: [(0, '0.994')] [2024-07-02 13:46:42,923][36999] Updated weights for policy 0, policy_version 25710 (0.0032) [2024-07-02 13:46:46,095][36761] Fps is (10 sec: 44236.9, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 421363712. Throughput: 0: 44169.3. Samples: 421439100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 13:46:46,096][36761] Avg episode reward: [(0, '0.992')] [2024-07-02 13:46:46,487][36999] Updated weights for policy 0, policy_version 25720 (0.0041) [2024-07-02 13:46:50,230][36999] Updated weights for policy 0, policy_version 25730 (0.0044) [2024-07-02 13:46:51,095][36761] Fps is (10 sec: 44236.4, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 421593088. Throughput: 0: 44003.1. Samples: 421701000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 13:46:51,096][36761] Avg episode reward: [(0, '0.996')] [2024-07-02 13:46:53,917][36999] Updated weights for policy 0, policy_version 25740 (0.0037) [2024-07-02 13:46:56,095][36761] Fps is (10 sec: 45874.3, 60 sec: 44236.7, 300 sec: 44153.5). Total num frames: 421822464. Throughput: 0: 44176.5. Samples: 421970500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 13:46:56,096][36761] Avg episode reward: [(0, '0.995')] [2024-07-02 13:46:57,588][36999] Updated weights for policy 0, policy_version 25750 (0.0037) [2024-07-02 13:47:01,095][36761] Fps is (10 sec: 42599.0, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 422019072. Throughput: 0: 44133.5. Samples: 422101080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 13:47:01,095][36761] Avg episode reward: [(0, '0.995')] [2024-07-02 13:47:01,478][36999] Updated weights for policy 0, policy_version 25760 (0.0033) [2024-07-02 13:47:04,987][36999] Updated weights for policy 0, policy_version 25770 (0.0038) [2024-07-02 13:47:06,095][36761] Fps is (10 sec: 42599.2, 60 sec: 43963.9, 300 sec: 44210.0). Total num frames: 422248448. Throughput: 0: 44253.0. Samples: 422365360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:47:06,095][36761] Avg episode reward: [(0, '0.988')] [2024-07-02 13:47:08,944][36999] Updated weights for policy 0, policy_version 25780 (0.0044) [2024-07-02 13:47:11,095][36761] Fps is (10 sec: 45874.7, 60 sec: 44510.0, 300 sec: 44153.5). Total num frames: 422477824. Throughput: 0: 44148.2. Samples: 422624980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:47:11,096][36761] Avg episode reward: [(0, '1.006')] [2024-07-02 13:47:12,645][36999] Updated weights for policy 0, policy_version 25790 (0.0021) [2024-07-02 13:47:16,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43690.7, 300 sec: 44043.0). Total num frames: 422674432. Throughput: 0: 44132.0. Samples: 422755780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:47:16,096][36761] Avg episode reward: [(0, '0.999')] [2024-07-02 13:47:16,346][36999] Updated weights for policy 0, policy_version 25800 (0.0036) [2024-07-02 13:47:19,841][36979] Signal inference workers to stop experience collection... (6150 times) [2024-07-02 13:47:19,842][36979] Signal inference workers to resume experience collection... (6150 times) [2024-07-02 13:47:19,880][36999] InferenceWorker_p0-w0: stopping experience collection (6150 times) [2024-07-02 13:47:19,880][36999] InferenceWorker_p0-w0: resuming experience collection (6150 times) [2024-07-02 13:47:19,975][36999] Updated weights for policy 0, policy_version 25810 (0.0027) [2024-07-02 13:47:21,100][36761] Fps is (10 sec: 44216.7, 60 sec: 44506.5, 300 sec: 44208.3). Total num frames: 422920192. Throughput: 0: 44315.0. Samples: 423026840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-07-02 13:47:21,101][36761] Avg episode reward: [(0, '1.003')] [2024-07-02 13:47:23,772][36999] Updated weights for policy 0, policy_version 25820 (0.0040) [2024-07-02 13:47:26,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 423116800. Throughput: 0: 43986.6. Samples: 423287560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-07-02 13:47:26,096][36761] Avg episode reward: [(0, '0.985')] [2024-07-02 13:47:27,601][36999] Updated weights for policy 0, policy_version 25830 (0.0034) [2024-07-02 13:47:31,095][36761] Fps is (10 sec: 40979.0, 60 sec: 43690.7, 300 sec: 44042.4). Total num frames: 423329792. Throughput: 0: 44051.5. Samples: 423421420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-07-02 13:47:31,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 13:47:31,260][36979] Saving new best policy, reward=1.015! [2024-07-02 13:47:31,267][36999] Updated weights for policy 0, policy_version 25840 (0.0022) [2024-07-02 13:47:34,916][36999] Updated weights for policy 0, policy_version 25850 (0.0024) [2024-07-02 13:47:36,095][36761] Fps is (10 sec: 45875.3, 60 sec: 44236.7, 300 sec: 44209.0). Total num frames: 423575552. Throughput: 0: 44165.8. Samples: 423688460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-07-02 13:47:36,096][36761] Avg episode reward: [(0, '1.003')] [2024-07-02 13:47:38,930][36999] Updated weights for policy 0, policy_version 25860 (0.0035) [2024-07-02 13:47:41,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44236.8, 300 sec: 44098.1). Total num frames: 423804928. Throughput: 0: 43898.9. Samples: 423945940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 13:47:41,095][36761] Avg episode reward: [(0, '0.986')] [2024-07-02 13:47:42,311][36999] Updated weights for policy 0, policy_version 25870 (0.0025) [2024-07-02 13:47:46,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43963.6, 300 sec: 44042.4). Total num frames: 424001536. Throughput: 0: 44096.3. Samples: 424085420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 13:47:46,096][36761] Avg episode reward: [(0, '0.992')] [2024-07-02 13:47:46,213][36999] Updated weights for policy 0, policy_version 25880 (0.0039) [2024-07-02 13:47:49,638][36999] Updated weights for policy 0, policy_version 25890 (0.0027) [2024-07-02 13:47:51,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43963.7, 300 sec: 44209.0). Total num frames: 424230912. Throughput: 0: 44016.3. Samples: 424346100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 13:47:51,096][36761] Avg episode reward: [(0, '0.977')] [2024-07-02 13:47:53,540][36999] Updated weights for policy 0, policy_version 25900 (0.0026) [2024-07-02 13:47:56,095][36761] Fps is (10 sec: 45875.9, 60 sec: 43963.9, 300 sec: 44153.5). Total num frames: 424460288. Throughput: 0: 44129.0. Samples: 424610780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 13:47:56,095][36761] Avg episode reward: [(0, '1.001')] [2024-07-02 13:47:56,968][36999] Updated weights for policy 0, policy_version 25910 (0.0024) [2024-07-02 13:48:01,052][36999] Updated weights for policy 0, policy_version 25920 (0.0035) [2024-07-02 13:48:01,095][36761] Fps is (10 sec: 44237.1, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 424673280. Throughput: 0: 44315.6. Samples: 424749980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 13:48:01,096][36761] Avg episode reward: [(0, '0.988')] [2024-07-02 13:48:04,272][36999] Updated weights for policy 0, policy_version 25930 (0.0022) [2024-07-02 13:48:06,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 424869888. Throughput: 0: 44078.8. Samples: 425010180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 13:48:06,096][36761] Avg episode reward: [(0, '0.979')] [2024-07-02 13:48:08,385][36999] Updated weights for policy 0, policy_version 25940 (0.0024) [2024-07-02 13:48:11,095][36761] Fps is (10 sec: 45874.9, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 425132032. Throughput: 0: 44049.8. Samples: 425269800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 13:48:11,096][36761] Avg episode reward: [(0, '0.972')] [2024-07-02 13:48:11,850][36999] Updated weights for policy 0, policy_version 25950 (0.0033) [2024-07-02 13:48:15,718][36999] Updated weights for policy 0, policy_version 25960 (0.0039) [2024-07-02 13:48:16,095][36761] Fps is (10 sec: 45874.8, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 425328640. Throughput: 0: 44263.9. Samples: 425413300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:48:16,096][36761] Avg episode reward: [(0, '0.985')] [2024-07-02 13:48:19,199][36999] Updated weights for policy 0, policy_version 25970 (0.0026) [2024-07-02 13:48:21,095][36761] Fps is (10 sec: 40960.4, 60 sec: 43694.0, 300 sec: 44098.0). Total num frames: 425541632. Throughput: 0: 44045.9. Samples: 425670520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:48:21,096][36761] Avg episode reward: [(0, '0.991')] [2024-07-02 13:48:23,334][36999] Updated weights for policy 0, policy_version 25980 (0.0023) [2024-07-02 13:48:26,095][36761] Fps is (10 sec: 45875.0, 60 sec: 44509.9, 300 sec: 44098.0). Total num frames: 425787392. Throughput: 0: 44149.6. Samples: 425932680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-07-02 13:48:26,096][36761] Avg episode reward: [(0, '1.009')] [2024-07-02 13:48:26,690][36999] Updated weights for policy 0, policy_version 25990 (0.0031) [2024-07-02 13:48:30,681][36999] Updated weights for policy 0, policy_version 26000 (0.0037) [2024-07-02 13:48:31,095][36761] Fps is (10 sec: 45875.6, 60 sec: 44509.9, 300 sec: 44153.5). Total num frames: 426000384. Throughput: 0: 44166.4. Samples: 426072900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 13:48:31,096][36761] Avg episode reward: [(0, '1.001')] [2024-07-02 13:48:34,019][36999] Updated weights for policy 0, policy_version 26010 (0.0035) [2024-07-02 13:48:36,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43963.7, 300 sec: 44153.5). Total num frames: 426213376. Throughput: 0: 44080.0. Samples: 426329700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 13:48:36,096][36761] Avg episode reward: [(0, '0.999')] [2024-07-02 13:48:36,111][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000026014_426213376.pth... [2024-07-02 13:48:36,172][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000025370_415662080.pth [2024-07-02 13:48:38,097][36999] Updated weights for policy 0, policy_version 26020 (0.0032) [2024-07-02 13:48:41,096][36761] Fps is (10 sec: 44235.6, 60 sec: 43963.6, 300 sec: 44097.9). Total num frames: 426442752. Throughput: 0: 44018.9. Samples: 426591640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 13:48:41,096][36761] Avg episode reward: [(0, '1.003')] [2024-07-02 13:48:41,497][36999] Updated weights for policy 0, policy_version 26030 (0.0030) [2024-07-02 13:48:45,487][36999] Updated weights for policy 0, policy_version 26040 (0.0026) [2024-07-02 13:48:46,100][36761] Fps is (10 sec: 44217.0, 60 sec: 44233.5, 300 sec: 44097.3). Total num frames: 426655744. Throughput: 0: 44023.6. Samples: 426731240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 13:48:46,100][36761] Avg episode reward: [(0, '1.008')] [2024-07-02 13:48:48,712][36979] Signal inference workers to stop experience collection... (6200 times) [2024-07-02 13:48:48,729][36999] InferenceWorker_p0-w0: stopping experience collection (6200 times) [2024-07-02 13:48:48,770][36979] Signal inference workers to resume experience collection... (6200 times) [2024-07-02 13:48:48,770][36999] InferenceWorker_p0-w0: resuming experience collection (6200 times) [2024-07-02 13:48:48,909][36999] Updated weights for policy 0, policy_version 26050 (0.0027) [2024-07-02 13:48:51,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 426885120. Throughput: 0: 44130.6. Samples: 426996060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 13:48:51,097][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 13:48:52,988][36999] Updated weights for policy 0, policy_version 26060 (0.0025) [2024-07-02 13:48:56,095][36761] Fps is (10 sec: 44256.3, 60 sec: 43963.6, 300 sec: 44097.9). Total num frames: 427098112. Throughput: 0: 43985.3. Samples: 427249140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 13:48:56,096][36761] Avg episode reward: [(0, '0.998')] [2024-07-02 13:48:56,506][36999] Updated weights for policy 0, policy_version 26070 (0.0033) [2024-07-02 13:49:00,587][36999] Updated weights for policy 0, policy_version 26080 (0.0047) [2024-07-02 13:49:01,096][36761] Fps is (10 sec: 42598.0, 60 sec: 43963.6, 300 sec: 44042.4). Total num frames: 427311104. Throughput: 0: 43875.9. Samples: 427387720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 13:49:01,096][36761] Avg episode reward: [(0, '1.001')] [2024-07-02 13:49:03,931][36999] Updated weights for policy 0, policy_version 26090 (0.0037) [2024-07-02 13:49:06,095][36761] Fps is (10 sec: 44237.5, 60 sec: 44509.9, 300 sec: 44097.9). Total num frames: 427540480. Throughput: 0: 43984.0. Samples: 427649800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:49:06,096][36761] Avg episode reward: [(0, '0.999')] [2024-07-02 13:49:08,058][36999] Updated weights for policy 0, policy_version 26100 (0.0029) [2024-07-02 13:49:11,095][36761] Fps is (10 sec: 45876.2, 60 sec: 43963.8, 300 sec: 44098.0). Total num frames: 427769856. Throughput: 0: 44101.0. Samples: 427917220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:49:11,096][36761] Avg episode reward: [(0, '1.003')] [2024-07-02 13:49:11,368][36999] Updated weights for policy 0, policy_version 26110 (0.0030) [2024-07-02 13:49:15,692][36999] Updated weights for policy 0, policy_version 26120 (0.0027) [2024-07-02 13:49:16,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 427966464. Throughput: 0: 43962.4. Samples: 428051220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:49:16,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 13:49:16,111][36979] Saving new best policy, reward=1.021! [2024-07-02 13:49:18,721][36999] Updated weights for policy 0, policy_version 26130 (0.0034) [2024-07-02 13:49:21,100][36761] Fps is (10 sec: 42578.7, 60 sec: 44233.4, 300 sec: 44098.0). Total num frames: 428195840. Throughput: 0: 43937.4. Samples: 428307080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:49:21,101][36761] Avg episode reward: [(0, '1.002')] [2024-07-02 13:49:23,129][36999] Updated weights for policy 0, policy_version 26140 (0.0040) [2024-07-02 13:49:26,078][36999] Updated weights for policy 0, policy_version 26150 (0.0046) [2024-07-02 13:49:26,095][36761] Fps is (10 sec: 47514.4, 60 sec: 44236.9, 300 sec: 44153.5). Total num frames: 428441600. Throughput: 0: 43950.4. Samples: 428569400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 13:49:26,096][36761] Avg episode reward: [(0, '1.005')] [2024-07-02 13:49:30,501][36999] Updated weights for policy 0, policy_version 26160 (0.0039) [2024-07-02 13:49:31,095][36761] Fps is (10 sec: 40979.0, 60 sec: 43417.6, 300 sec: 43931.3). Total num frames: 428605440. Throughput: 0: 43834.7. Samples: 428703600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 13:49:31,095][36761] Avg episode reward: [(0, '1.005')] [2024-07-02 13:49:33,567][36999] Updated weights for policy 0, policy_version 26170 (0.0039) [2024-07-02 13:49:36,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43963.9, 300 sec: 44098.0). Total num frames: 428851200. Throughput: 0: 43825.5. Samples: 428968200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 13:49:36,095][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 13:49:38,102][36999] Updated weights for policy 0, policy_version 26180 (0.0033) [2024-07-02 13:49:41,020][36999] Updated weights for policy 0, policy_version 26190 (0.0023) [2024-07-02 13:49:41,095][36761] Fps is (10 sec: 49151.5, 60 sec: 44236.9, 300 sec: 44153.5). Total num frames: 429096960. Throughput: 0: 43941.9. Samples: 429226520. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-07-02 13:49:41,096][36761] Avg episode reward: [(0, '0.999')] [2024-07-02 13:49:45,544][36999] Updated weights for policy 0, policy_version 26200 (0.0030) [2024-07-02 13:49:46,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43694.0, 300 sec: 43986.9). Total num frames: 429277184. Throughput: 0: 43880.6. Samples: 429362340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-07-02 13:49:46,096][36761] Avg episode reward: [(0, '1.002')] [2024-07-02 13:49:48,414][36999] Updated weights for policy 0, policy_version 26210 (0.0022) [2024-07-02 13:49:51,095][36761] Fps is (10 sec: 40959.8, 60 sec: 43690.7, 300 sec: 44042.4). Total num frames: 429506560. Throughput: 0: 43856.3. Samples: 429623340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-07-02 13:49:51,096][36761] Avg episode reward: [(0, '1.000')] [2024-07-02 13:49:52,987][36999] Updated weights for policy 0, policy_version 26220 (0.0032) [2024-07-02 13:49:56,063][36999] Updated weights for policy 0, policy_version 26230 (0.0031) [2024-07-02 13:49:56,095][36761] Fps is (10 sec: 47514.0, 60 sec: 44237.0, 300 sec: 44153.5). Total num frames: 429752320. Throughput: 0: 43619.2. Samples: 429880080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-07-02 13:49:56,095][36761] Avg episode reward: [(0, '0.998')] [2024-07-02 13:50:00,542][36999] Updated weights for policy 0, policy_version 26240 (0.0033) [2024-07-02 13:50:01,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43963.9, 300 sec: 44042.4). Total num frames: 429948928. Throughput: 0: 43793.5. Samples: 430021920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-07-02 13:50:01,096][36761] Avg episode reward: [(0, '0.994')] [2024-07-02 13:50:03,502][36999] Updated weights for policy 0, policy_version 26250 (0.0035) [2024-07-02 13:50:06,095][36761] Fps is (10 sec: 40959.5, 60 sec: 43690.6, 300 sec: 44042.4). Total num frames: 430161920. Throughput: 0: 43882.7. Samples: 430281600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-07-02 13:50:06,096][36761] Avg episode reward: [(0, '1.001')] [2024-07-02 13:50:07,885][36999] Updated weights for policy 0, policy_version 26260 (0.0026) [2024-07-02 13:50:09,430][36979] Signal inference workers to stop experience collection... (6250 times) [2024-07-02 13:50:09,436][36979] Signal inference workers to resume experience collection... (6250 times) [2024-07-02 13:50:09,471][36999] InferenceWorker_p0-w0: stopping experience collection (6250 times) [2024-07-02 13:50:09,472][36999] InferenceWorker_p0-w0: resuming experience collection (6250 times) [2024-07-02 13:50:10,832][36999] Updated weights for policy 0, policy_version 26270 (0.0031) [2024-07-02 13:50:11,095][36761] Fps is (10 sec: 47513.3, 60 sec: 44236.7, 300 sec: 44209.0). Total num frames: 430424064. Throughput: 0: 43816.8. Samples: 430541160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-07-02 13:50:11,096][36761] Avg episode reward: [(0, '0.997')] [2024-07-02 13:50:15,390][36999] Updated weights for policy 0, policy_version 26280 (0.0023) [2024-07-02 13:50:16,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43963.8, 300 sec: 44042.7). Total num frames: 430604288. Throughput: 0: 43924.8. Samples: 430680220. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-07-02 13:50:16,096][36761] Avg episode reward: [(0, '0.998')] [2024-07-02 13:50:18,295][36999] Updated weights for policy 0, policy_version 26290 (0.0029) [2024-07-02 13:50:21,100][36761] Fps is (10 sec: 39303.9, 60 sec: 43690.7, 300 sec: 43986.2). Total num frames: 430817280. Throughput: 0: 43653.2. Samples: 430932800. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-07-02 13:50:21,100][36761] Avg episode reward: [(0, '0.995')] [2024-07-02 13:50:22,879][36999] Updated weights for policy 0, policy_version 26300 (0.0029) [2024-07-02 13:50:26,018][36999] Updated weights for policy 0, policy_version 26310 (0.0032) [2024-07-02 13:50:26,099][36761] Fps is (10 sec: 45858.3, 60 sec: 43687.9, 300 sec: 44097.4). Total num frames: 431063040. Throughput: 0: 43889.7. Samples: 431201720. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-07-02 13:50:26,099][36761] Avg episode reward: [(0, '0.985')] [2024-07-02 13:50:30,327][36999] Updated weights for policy 0, policy_version 26320 (0.0022) [2024-07-02 13:50:31,095][36761] Fps is (10 sec: 44257.0, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 431259648. Throughput: 0: 43994.6. Samples: 431342100. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-07-02 13:50:31,096][36761] Avg episode reward: [(0, '0.986')] [2024-07-02 13:50:33,510][36999] Updated weights for policy 0, policy_version 26330 (0.0030) [2024-07-02 13:50:36,095][36761] Fps is (10 sec: 40974.8, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 431472640. Throughput: 0: 43716.8. Samples: 431590600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 13:50:36,096][36761] Avg episode reward: [(0, '1.001')] [2024-07-02 13:50:36,101][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000026335_431472640.pth... [2024-07-02 13:50:36,151][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000025691_420921344.pth [2024-07-02 13:50:37,845][36999] Updated weights for policy 0, policy_version 26340 (0.0032) [2024-07-02 13:50:40,909][36999] Updated weights for policy 0, policy_version 26350 (0.0033) [2024-07-02 13:50:41,100][36761] Fps is (10 sec: 45854.4, 60 sec: 43687.4, 300 sec: 44097.3). Total num frames: 431718400. Throughput: 0: 43804.8. Samples: 431851500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 13:50:41,101][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 13:50:45,429][36999] Updated weights for policy 0, policy_version 26360 (0.0043) [2024-07-02 13:50:46,095][36761] Fps is (10 sec: 42599.1, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 431898624. Throughput: 0: 43764.0. Samples: 431991300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-07-02 13:50:46,096][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 13:50:48,374][36999] Updated weights for policy 0, policy_version 26370 (0.0031) [2024-07-02 13:50:51,100][36761] Fps is (10 sec: 40959.9, 60 sec: 43687.4, 300 sec: 43930.6). Total num frames: 432128000. Throughput: 0: 43755.6. Samples: 432250800. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-07-02 13:50:51,100][36761] Avg episode reward: [(0, '1.003')] [2024-07-02 13:50:52,908][36999] Updated weights for policy 0, policy_version 26380 (0.0038) [2024-07-02 13:50:55,811][36999] Updated weights for policy 0, policy_version 26390 (0.0035) [2024-07-02 13:50:56,095][36761] Fps is (10 sec: 47513.5, 60 sec: 43690.6, 300 sec: 44042.4). Total num frames: 432373760. Throughput: 0: 43827.2. Samples: 432513380. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-07-02 13:50:56,096][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 13:51:00,337][36999] Updated weights for policy 0, policy_version 26400 (0.0032) [2024-07-02 13:51:01,095][36761] Fps is (10 sec: 44257.3, 60 sec: 43690.7, 300 sec: 43931.4). Total num frames: 432570368. Throughput: 0: 43850.8. Samples: 432653500. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-07-02 13:51:01,096][36761] Avg episode reward: [(0, '0.955')] [2024-07-02 13:51:03,195][36999] Updated weights for policy 0, policy_version 26410 (0.0036) [2024-07-02 13:51:06,095][36761] Fps is (10 sec: 40959.4, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 432783360. Throughput: 0: 43883.9. Samples: 432907380. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-07-02 13:51:06,100][36761] Avg episode reward: [(0, '0.994')] [2024-07-02 13:51:07,643][36999] Updated weights for policy 0, policy_version 26420 (0.0028) [2024-07-02 13:51:10,626][36999] Updated weights for policy 0, policy_version 26430 (0.0022) [2024-07-02 13:51:11,095][36761] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 43986.9). Total num frames: 433029120. Throughput: 0: 43674.7. Samples: 433166920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-07-02 13:51:11,098][36761] Avg episode reward: [(0, '1.001')] [2024-07-02 13:51:15,436][36999] Updated weights for policy 0, policy_version 26440 (0.0024) [2024-07-02 13:51:16,096][36761] Fps is (10 sec: 44236.4, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 433225728. Throughput: 0: 43813.1. Samples: 433313700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-07-02 13:51:16,100][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 13:51:17,974][36999] Updated weights for policy 0, policy_version 26450 (0.0035) [2024-07-02 13:51:21,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43694.0, 300 sec: 43931.4). Total num frames: 433438720. Throughput: 0: 44011.2. Samples: 433571100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-07-02 13:51:21,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 13:51:22,821][36999] Updated weights for policy 0, policy_version 26460 (0.0036) [2024-07-02 13:51:25,881][36999] Updated weights for policy 0, policy_version 26470 (0.0031) [2024-07-02 13:51:26,095][36761] Fps is (10 sec: 45876.5, 60 sec: 43693.4, 300 sec: 43986.9). Total num frames: 433684480. Throughput: 0: 43895.6. Samples: 433826600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-07-02 13:51:26,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 13:51:30,172][36999] Updated weights for policy 0, policy_version 26480 (0.0029) [2024-07-02 13:51:31,097][36761] Fps is (10 sec: 44229.1, 60 sec: 43689.4, 300 sec: 43931.1). Total num frames: 433881088. Throughput: 0: 43924.0. Samples: 433967960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-07-02 13:51:31,098][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 13:51:33,267][36999] Updated weights for policy 0, policy_version 26490 (0.0022) [2024-07-02 13:51:36,095][36761] Fps is (10 sec: 40959.8, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 434094080. Throughput: 0: 43853.8. Samples: 434224020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-07-02 13:51:36,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 13:51:37,604][36999] Updated weights for policy 0, policy_version 26500 (0.0034) [2024-07-02 13:51:40,647][36999] Updated weights for policy 0, policy_version 26510 (0.0030) [2024-07-02 13:51:41,095][36761] Fps is (10 sec: 47522.1, 60 sec: 43967.1, 300 sec: 44042.4). Total num frames: 434356224. Throughput: 0: 43740.4. Samples: 434481700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-07-02 13:51:41,096][36761] Avg episode reward: [(0, '1.003')] [2024-07-02 13:51:45,129][36999] Updated weights for policy 0, policy_version 26520 (0.0044) [2024-07-02 13:51:46,095][36761] Fps is (10 sec: 45874.7, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 434552832. Throughput: 0: 43847.0. Samples: 434626620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:51:46,096][36761] Avg episode reward: [(0, '1.011')] [2024-07-02 13:51:47,832][36979] Signal inference workers to stop experience collection... (6300 times) [2024-07-02 13:51:47,877][36999] InferenceWorker_p0-w0: stopping experience collection (6300 times) [2024-07-02 13:51:47,886][36979] Signal inference workers to resume experience collection... (6300 times) [2024-07-02 13:51:47,893][36999] InferenceWorker_p0-w0: resuming experience collection (6300 times) [2024-07-02 13:51:48,194][36999] Updated weights for policy 0, policy_version 26530 (0.0031) [2024-07-02 13:51:51,095][36761] Fps is (10 sec: 39321.8, 60 sec: 43694.0, 300 sec: 43820.3). Total num frames: 434749440. Throughput: 0: 43865.9. Samples: 434881340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:51:51,096][36761] Avg episode reward: [(0, '1.011')] [2024-07-02 13:51:52,589][36999] Updated weights for policy 0, policy_version 26540 (0.0029) [2024-07-02 13:51:55,477][36999] Updated weights for policy 0, policy_version 26550 (0.0033) [2024-07-02 13:51:56,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 435011584. Throughput: 0: 43784.5. Samples: 435137220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:51:56,103][36761] Avg episode reward: [(0, '1.009')] [2024-07-02 13:52:00,074][36999] Updated weights for policy 0, policy_version 26560 (0.0029) [2024-07-02 13:52:01,095][36761] Fps is (10 sec: 45875.4, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 435208192. Throughput: 0: 43826.1. Samples: 435285860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:52:01,096][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 13:52:03,046][36999] Updated weights for policy 0, policy_version 26570 (0.0037) [2024-07-02 13:52:06,095][36761] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 435404800. Throughput: 0: 43711.6. Samples: 435538120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:52:06,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 13:52:07,502][36999] Updated weights for policy 0, policy_version 26580 (0.0037) [2024-07-02 13:52:10,502][36999] Updated weights for policy 0, policy_version 26590 (0.0031) [2024-07-02 13:52:11,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 435666944. Throughput: 0: 43832.0. Samples: 435799040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:52:11,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 13:52:14,970][36999] Updated weights for policy 0, policy_version 26600 (0.0033) [2024-07-02 13:52:16,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43963.9, 300 sec: 43876.5). Total num frames: 435863552. Throughput: 0: 43899.1. Samples: 435943340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 13:52:16,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 13:52:16,264][36979] Saving new best policy, reward=1.023! [2024-07-02 13:52:17,927][36999] Updated weights for policy 0, policy_version 26610 (0.0041) [2024-07-02 13:52:21,095][36761] Fps is (10 sec: 39321.5, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 436060160. Throughput: 0: 43905.4. Samples: 436199760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-07-02 13:52:21,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 13:52:21,096][36979] Saving new best policy, reward=1.028! [2024-07-02 13:52:22,708][36999] Updated weights for policy 0, policy_version 26620 (0.0029) [2024-07-02 13:52:25,323][36999] Updated weights for policy 0, policy_version 26630 (0.0034) [2024-07-02 13:52:26,095][36761] Fps is (10 sec: 45874.8, 60 sec: 43963.6, 300 sec: 44042.4). Total num frames: 436322304. Throughput: 0: 43848.4. Samples: 436454880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-07-02 13:52:26,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 13:52:30,063][36999] Updated weights for policy 0, policy_version 26640 (0.0038) [2024-07-02 13:52:31,095][36761] Fps is (10 sec: 45874.5, 60 sec: 43965.0, 300 sec: 43875.8). Total num frames: 436518912. Throughput: 0: 43816.4. Samples: 436598360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-07-02 13:52:31,096][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 13:52:33,101][36999] Updated weights for policy 0, policy_version 26650 (0.0036) [2024-07-02 13:52:36,095][36761] Fps is (10 sec: 39321.7, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 436715520. Throughput: 0: 43867.0. Samples: 436855360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-07-02 13:52:36,096][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 13:52:36,112][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000026655_436715520.pth... [2024-07-02 13:52:36,160][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000026014_426213376.pth [2024-07-02 13:52:37,550][36999] Updated weights for policy 0, policy_version 26660 (0.0037) [2024-07-02 13:52:40,789][36999] Updated weights for policy 0, policy_version 26670 (0.0029) [2024-07-02 13:52:41,095][36761] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 436977664. Throughput: 0: 43964.0. Samples: 437115600. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-07-02 13:52:41,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 13:52:44,971][36999] Updated weights for policy 0, policy_version 26680 (0.0032) [2024-07-02 13:52:45,469][36979] Signal inference workers to stop experience collection... (6350 times) [2024-07-02 13:52:45,518][36979] Signal inference workers to resume experience collection... (6350 times) [2024-07-02 13:52:45,516][36999] InferenceWorker_p0-w0: stopping experience collection (6350 times) [2024-07-02 13:52:45,535][36999] InferenceWorker_p0-w0: resuming experience collection (6350 times) [2024-07-02 13:52:46,095][36761] Fps is (10 sec: 45874.7, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 437174272. Throughput: 0: 43806.0. Samples: 437257140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-07-02 13:52:46,096][36761] Avg episode reward: [(0, '1.007')] [2024-07-02 13:52:48,234][36999] Updated weights for policy 0, policy_version 26690 (0.0022) [2024-07-02 13:52:51,095][36761] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 437370880. Throughput: 0: 43863.1. Samples: 437511960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-07-02 13:52:51,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 13:52:52,285][36999] Updated weights for policy 0, policy_version 26700 (0.0047) [2024-07-02 13:52:55,598][36999] Updated weights for policy 0, policy_version 26710 (0.0039) [2024-07-02 13:52:56,095][36761] Fps is (10 sec: 45876.1, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 437633024. Throughput: 0: 43896.9. Samples: 437774400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:52:56,096][36761] Avg episode reward: [(0, '1.011')] [2024-07-02 13:52:59,759][36999] Updated weights for policy 0, policy_version 26720 (0.0033) [2024-07-02 13:53:01,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 43931.3). Total num frames: 437829632. Throughput: 0: 43778.6. Samples: 437913380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:53:01,096][36761] Avg episode reward: [(0, '1.001')] [2024-07-02 13:53:03,106][36999] Updated weights for policy 0, policy_version 26730 (0.0031) [2024-07-02 13:53:06,095][36761] Fps is (10 sec: 40959.4, 60 sec: 43963.7, 300 sec: 43764.7). Total num frames: 438042624. Throughput: 0: 43756.3. Samples: 438168800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:53:06,096][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 13:53:07,295][36999] Updated weights for policy 0, policy_version 26740 (0.0029) [2024-07-02 13:53:10,374][36999] Updated weights for policy 0, policy_version 26750 (0.0024) [2024-07-02 13:53:11,096][36761] Fps is (10 sec: 45874.5, 60 sec: 43690.5, 300 sec: 43931.3). Total num frames: 438288384. Throughput: 0: 43996.3. Samples: 438434720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:53:11,096][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 13:53:14,608][36999] Updated weights for policy 0, policy_version 26760 (0.0035) [2024-07-02 13:53:16,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 438484992. Throughput: 0: 43894.3. Samples: 438573600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-07-02 13:53:16,096][36761] Avg episode reward: [(0, '1.002')] [2024-07-02 13:53:18,257][36999] Updated weights for policy 0, policy_version 26770 (0.0023) [2024-07-02 13:53:21,095][36761] Fps is (10 sec: 40960.4, 60 sec: 43963.6, 300 sec: 43764.7). Total num frames: 438697984. Throughput: 0: 43872.0. Samples: 438829600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-07-02 13:53:21,096][36761] Avg episode reward: [(0, '1.006')] [2024-07-02 13:53:22,098][36999] Updated weights for policy 0, policy_version 26780 (0.0040) [2024-07-02 13:53:25,622][36999] Updated weights for policy 0, policy_version 26790 (0.0020) [2024-07-02 13:53:26,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 43820.2). Total num frames: 438927360. Throughput: 0: 43940.9. Samples: 439092940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-07-02 13:53:26,096][36761] Avg episode reward: [(0, '1.010')] [2024-07-02 13:53:29,501][36999] Updated weights for policy 0, policy_version 26800 (0.0034) [2024-07-02 13:53:31,095][36761] Fps is (10 sec: 44237.3, 60 sec: 43690.8, 300 sec: 43820.3). Total num frames: 439140352. Throughput: 0: 43838.0. Samples: 439229840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:53:31,096][36761] Avg episode reward: [(0, '0.994')] [2024-07-02 13:53:33,030][36999] Updated weights for policy 0, policy_version 26810 (0.0029) [2024-07-02 13:53:36,097][36761] Fps is (10 sec: 42590.3, 60 sec: 43962.3, 300 sec: 43764.5). Total num frames: 439353344. Throughput: 0: 43915.1. Samples: 439488220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:53:36,098][36761] Avg episode reward: [(0, '1.006')] [2024-07-02 13:53:36,879][36999] Updated weights for policy 0, policy_version 26820 (0.0041) [2024-07-02 13:53:40,503][36999] Updated weights for policy 0, policy_version 26830 (0.0025) [2024-07-02 13:53:41,095][36761] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 43876.5). Total num frames: 439599104. Throughput: 0: 43809.8. Samples: 439745840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:53:41,096][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 13:53:44,387][36999] Updated weights for policy 0, policy_version 26840 (0.0032) [2024-07-02 13:53:46,095][36761] Fps is (10 sec: 44245.0, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 439795712. Throughput: 0: 43855.0. Samples: 439886860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:53:46,096][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 13:53:47,942][36999] Updated weights for policy 0, policy_version 26850 (0.0043) [2024-07-02 13:53:51,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43963.8, 300 sec: 43764.7). Total num frames: 440008704. Throughput: 0: 43950.3. Samples: 440146560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 13:53:51,096][36761] Avg episode reward: [(0, '0.999')] [2024-07-02 13:53:51,810][36999] Updated weights for policy 0, policy_version 26860 (0.0031) [2024-07-02 13:53:55,326][36999] Updated weights for policy 0, policy_version 26870 (0.0029) [2024-07-02 13:53:56,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43417.5, 300 sec: 43820.3). Total num frames: 440238080. Throughput: 0: 43826.8. Samples: 440406920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 13:53:56,096][36761] Avg episode reward: [(0, '1.002')] [2024-07-02 13:53:59,415][36999] Updated weights for policy 0, policy_version 26880 (0.0039) [2024-07-02 13:54:01,095][36761] Fps is (10 sec: 45875.7, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 440467456. Throughput: 0: 43682.8. Samples: 440539320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 13:54:01,095][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 13:54:02,890][36999] Updated weights for policy 0, policy_version 26890 (0.0034) [2024-07-02 13:54:06,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43963.7, 300 sec: 43764.7). Total num frames: 440680448. Throughput: 0: 43924.4. Samples: 440806200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 13:54:06,096][36761] Avg episode reward: [(0, '1.007')] [2024-07-02 13:54:07,003][36999] Updated weights for policy 0, policy_version 26900 (0.0030) [2024-07-02 13:54:10,318][36999] Updated weights for policy 0, policy_version 26910 (0.0043) [2024-07-02 13:54:11,097][36761] Fps is (10 sec: 44226.8, 60 sec: 43689.2, 300 sec: 43875.5). Total num frames: 440909824. Throughput: 0: 43766.0. Samples: 441062500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 13:54:11,098][36761] Avg episode reward: [(0, '1.005')] [2024-07-02 13:54:14,423][36999] Updated weights for policy 0, policy_version 26920 (0.0022) [2024-07-02 13:54:16,095][36761] Fps is (10 sec: 44237.7, 60 sec: 43963.8, 300 sec: 43820.9). Total num frames: 441122816. Throughput: 0: 43800.9. Samples: 441200880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 13:54:16,095][36761] Avg episode reward: [(0, '1.011')] [2024-07-02 13:54:17,004][36979] Signal inference workers to stop experience collection... (6400 times) [2024-07-02 13:54:17,005][36979] Signal inference workers to resume experience collection... (6400 times) [2024-07-02 13:54:17,040][36999] InferenceWorker_p0-w0: stopping experience collection (6400 times) [2024-07-02 13:54:17,044][36999] InferenceWorker_p0-w0: resuming experience collection (6400 times) [2024-07-02 13:54:17,829][36999] Updated weights for policy 0, policy_version 26930 (0.0026) [2024-07-02 13:54:21,095][36761] Fps is (10 sec: 42607.2, 60 sec: 43963.7, 300 sec: 43709.2). Total num frames: 441335808. Throughput: 0: 43878.3. Samples: 441462660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 13:54:21,099][36761] Avg episode reward: [(0, '1.005')] [2024-07-02 13:54:21,838][36999] Updated weights for policy 0, policy_version 26940 (0.0034) [2024-07-02 13:54:25,370][36999] Updated weights for policy 0, policy_version 26950 (0.0035) [2024-07-02 13:54:26,095][36761] Fps is (10 sec: 44235.9, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 441565184. Throughput: 0: 43899.8. Samples: 441721340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:54:26,096][36761] Avg episode reward: [(0, '0.994')] [2024-07-02 13:54:29,211][36999] Updated weights for policy 0, policy_version 26960 (0.0025) [2024-07-02 13:54:31,096][36761] Fps is (10 sec: 45874.7, 60 sec: 44236.6, 300 sec: 43875.8). Total num frames: 441794560. Throughput: 0: 43711.5. Samples: 441853880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:54:31,096][36761] Avg episode reward: [(0, '1.007')] [2024-07-02 13:54:33,145][36999] Updated weights for policy 0, policy_version 26970 (0.0033) [2024-07-02 13:54:36,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43965.1, 300 sec: 43709.2). Total num frames: 441991168. Throughput: 0: 43876.0. Samples: 442120980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:54:36,096][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 13:54:36,161][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000026978_442007552.pth... [2024-07-02 13:54:36,213][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000026335_431472640.pth [2024-07-02 13:54:36,763][36999] Updated weights for policy 0, policy_version 26980 (0.0034) [2024-07-02 13:54:40,594][36999] Updated weights for policy 0, policy_version 26990 (0.0035) [2024-07-02 13:54:41,100][36761] Fps is (10 sec: 40942.2, 60 sec: 43414.3, 300 sec: 43819.6). Total num frames: 442204160. Throughput: 0: 43776.6. Samples: 442377060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 13:54:41,100][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 13:54:44,280][36999] Updated weights for policy 0, policy_version 27000 (0.0040) [2024-07-02 13:54:46,095][36761] Fps is (10 sec: 45875.7, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 442449920. Throughput: 0: 43841.3. Samples: 442512180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 13:54:46,096][36761] Avg episode reward: [(0, '1.011')] [2024-07-02 13:54:48,201][36999] Updated weights for policy 0, policy_version 27010 (0.0033) [2024-07-02 13:54:51,095][36761] Fps is (10 sec: 44256.8, 60 sec: 43963.7, 300 sec: 43709.2). Total num frames: 442646528. Throughput: 0: 43709.0. Samples: 442773100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 13:54:51,096][36761] Avg episode reward: [(0, '1.011')] [2024-07-02 13:54:51,578][36999] Updated weights for policy 0, policy_version 27020 (0.0037) [2024-07-02 13:54:55,778][36999] Updated weights for policy 0, policy_version 27030 (0.0020) [2024-07-02 13:54:56,095][36761] Fps is (10 sec: 42597.7, 60 sec: 43963.7, 300 sec: 43820.2). Total num frames: 442875904. Throughput: 0: 43954.0. Samples: 443040340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 13:54:56,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 13:54:59,091][36999] Updated weights for policy 0, policy_version 27040 (0.0028) [2024-07-02 13:55:01,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43690.6, 300 sec: 43820.3). Total num frames: 443088896. Throughput: 0: 43755.5. Samples: 443169880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 13:55:01,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 13:55:03,127][36999] Updated weights for policy 0, policy_version 27050 (0.0030) [2024-07-02 13:55:06,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43963.8, 300 sec: 43709.2). Total num frames: 443318272. Throughput: 0: 43780.0. Samples: 443432760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 13:55:06,096][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 13:55:06,479][36999] Updated weights for policy 0, policy_version 27060 (0.0045) [2024-07-02 13:55:10,522][36999] Updated weights for policy 0, policy_version 27070 (0.0027) [2024-07-02 13:55:11,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43692.2, 300 sec: 43820.3). Total num frames: 443531264. Throughput: 0: 43947.6. Samples: 443698980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 13:55:11,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 13:55:13,978][36999] Updated weights for policy 0, policy_version 27080 (0.0026) [2024-07-02 13:55:16,095][36761] Fps is (10 sec: 44237.5, 60 sec: 43963.8, 300 sec: 43876.5). Total num frames: 443760640. Throughput: 0: 43929.6. Samples: 443830700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 13:55:16,095][36761] Avg episode reward: [(0, '1.005')] [2024-07-02 13:55:17,917][36999] Updated weights for policy 0, policy_version 27090 (0.0042) [2024-07-02 13:55:21,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43963.8, 300 sec: 43765.3). Total num frames: 443973632. Throughput: 0: 43872.0. Samples: 444095220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 13:55:21,096][36761] Avg episode reward: [(0, '1.008')] [2024-07-02 13:55:21,409][36999] Updated weights for policy 0, policy_version 27100 (0.0029) [2024-07-02 13:55:25,335][36999] Updated weights for policy 0, policy_version 27110 (0.0034) [2024-07-02 13:55:26,096][36761] Fps is (10 sec: 42597.2, 60 sec: 43690.6, 300 sec: 43820.2). Total num frames: 444186624. Throughput: 0: 44040.7. Samples: 444358700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 13:55:26,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 13:55:28,833][36999] Updated weights for policy 0, policy_version 27120 (0.0031) [2024-07-02 13:55:31,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43690.9, 300 sec: 43875.8). Total num frames: 444416000. Throughput: 0: 43869.8. Samples: 444486320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 13:55:31,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 13:55:32,632][36999] Updated weights for policy 0, policy_version 27130 (0.0031) [2024-07-02 13:55:36,095][36761] Fps is (10 sec: 45875.9, 60 sec: 44236.8, 300 sec: 43820.9). Total num frames: 444645376. Throughput: 0: 44041.3. Samples: 444754960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 13:55:36,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 13:55:36,102][36999] Updated weights for policy 0, policy_version 27140 (0.0036) [2024-07-02 13:55:40,002][36999] Updated weights for policy 0, policy_version 27150 (0.0034) [2024-07-02 13:55:41,100][36761] Fps is (10 sec: 42578.6, 60 sec: 43963.7, 300 sec: 43875.1). Total num frames: 444841984. Throughput: 0: 43957.4. Samples: 445018620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 13:55:41,100][36761] Avg episode reward: [(0, '1.006')] [2024-07-02 13:55:43,696][36999] Updated weights for policy 0, policy_version 27160 (0.0034) [2024-07-02 13:55:46,023][36979] Signal inference workers to stop experience collection... (6450 times) [2024-07-02 13:55:46,070][36979] Signal inference workers to resume experience collection... (6450 times) [2024-07-02 13:55:46,072][36999] InferenceWorker_p0-w0: stopping experience collection (6450 times) [2024-07-02 13:55:46,086][36999] InferenceWorker_p0-w0: resuming experience collection (6450 times) [2024-07-02 13:55:46,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43690.6, 300 sec: 43876.5). Total num frames: 445071360. Throughput: 0: 43826.3. Samples: 445142060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 13:55:46,096][36761] Avg episode reward: [(0, '1.005')] [2024-07-02 13:55:47,393][36999] Updated weights for policy 0, policy_version 27170 (0.0027) [2024-07-02 13:55:51,095][36761] Fps is (10 sec: 45896.5, 60 sec: 44236.9, 300 sec: 43820.3). Total num frames: 445300736. Throughput: 0: 44008.1. Samples: 445413120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 13:55:51,096][36761] Avg episode reward: [(0, '0.994')] [2024-07-02 13:55:51,107][36999] Updated weights for policy 0, policy_version 27180 (0.0030) [2024-07-02 13:55:54,819][36999] Updated weights for policy 0, policy_version 27190 (0.0032) [2024-07-02 13:55:56,095][36761] Fps is (10 sec: 42597.6, 60 sec: 43690.6, 300 sec: 43820.2). Total num frames: 445497344. Throughput: 0: 43981.3. Samples: 445678140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:55:56,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 13:55:58,406][36999] Updated weights for policy 0, policy_version 27200 (0.0025) [2024-07-02 13:56:01,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 445726720. Throughput: 0: 43835.4. Samples: 445803300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:56:01,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 13:56:02,263][36999] Updated weights for policy 0, policy_version 27210 (0.0036) [2024-07-02 13:56:06,095][36761] Fps is (10 sec: 44237.7, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 445939712. Throughput: 0: 43888.5. Samples: 446070200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:56:06,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 13:56:06,304][36999] Updated weights for policy 0, policy_version 27220 (0.0024) [2024-07-02 13:56:09,551][36999] Updated weights for policy 0, policy_version 27230 (0.0034) [2024-07-02 13:56:11,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 446152704. Throughput: 0: 43982.4. Samples: 446337900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 13:56:11,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 13:56:13,768][36999] Updated weights for policy 0, policy_version 27240 (0.0034) [2024-07-02 13:56:16,095][36761] Fps is (10 sec: 45875.2, 60 sec: 43963.7, 300 sec: 43931.4). Total num frames: 446398464. Throughput: 0: 44000.9. Samples: 446466360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 13:56:16,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 13:56:16,966][36999] Updated weights for policy 0, policy_version 27250 (0.0024) [2024-07-02 13:56:21,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 446611456. Throughput: 0: 43940.1. Samples: 446732260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 13:56:21,096][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 13:56:21,098][36999] Updated weights for policy 0, policy_version 27260 (0.0043) [2024-07-02 13:56:25,289][36999] Updated weights for policy 0, policy_version 27270 (0.0038) [2024-07-02 13:56:26,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43690.8, 300 sec: 43820.5). Total num frames: 446808064. Throughput: 0: 43999.2. Samples: 446998380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 13:56:26,095][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 13:56:28,480][36999] Updated weights for policy 0, policy_version 27280 (0.0041) [2024-07-02 13:56:31,095][36761] Fps is (10 sec: 45875.2, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 447070208. Throughput: 0: 44061.8. Samples: 447124840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 13:56:31,096][36761] Avg episode reward: [(0, '1.005')] [2024-07-02 13:56:32,680][36999] Updated weights for policy 0, policy_version 27290 (0.0035) [2024-07-02 13:56:35,835][36999] Updated weights for policy 0, policy_version 27300 (0.0035) [2024-07-02 13:56:36,096][36761] Fps is (10 sec: 47512.5, 60 sec: 43963.6, 300 sec: 43820.2). Total num frames: 447283200. Throughput: 0: 43994.0. Samples: 447392860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 13:56:36,096][36761] Avg episode reward: [(0, '0.993')] [2024-07-02 13:56:36,114][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000027300_447283200.pth... [2024-07-02 13:56:36,179][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000026655_436715520.pth [2024-07-02 13:56:39,985][36999] Updated weights for policy 0, policy_version 27310 (0.0040) [2024-07-02 13:56:41,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43967.1, 300 sec: 43820.3). Total num frames: 447479808. Throughput: 0: 44158.9. Samples: 447665280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 13:56:41,096][36761] Avg episode reward: [(0, '0.995')] [2024-07-02 13:56:43,146][36999] Updated weights for policy 0, policy_version 27320 (0.0032) [2024-07-02 13:56:46,095][36761] Fps is (10 sec: 45876.2, 60 sec: 44509.8, 300 sec: 44042.4). Total num frames: 447741952. Throughput: 0: 44180.5. Samples: 447791420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 13:56:46,096][36761] Avg episode reward: [(0, '0.998')] [2024-07-02 13:56:47,511][36999] Updated weights for policy 0, policy_version 27330 (0.0034) [2024-07-02 13:56:50,851][36999] Updated weights for policy 0, policy_version 27340 (0.0033) [2024-07-02 13:56:51,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 447954944. Throughput: 0: 44235.1. Samples: 448060780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:56:51,096][36761] Avg episode reward: [(0, '1.009')] [2024-07-02 13:56:54,835][36999] Updated weights for policy 0, policy_version 27350 (0.0029) [2024-07-02 13:56:56,095][36761] Fps is (10 sec: 40959.7, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 448151552. Throughput: 0: 44115.5. Samples: 448323100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:56:56,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 13:56:58,186][36999] Updated weights for policy 0, policy_version 27360 (0.0042) [2024-07-02 13:57:01,095][36761] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 448380928. Throughput: 0: 44168.9. Samples: 448453960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:57:01,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 13:57:01,096][36979] Saving new best policy, reward=1.030! [2024-07-02 13:57:02,185][36999] Updated weights for policy 0, policy_version 27370 (0.0027) [2024-07-02 13:57:05,632][36999] Updated weights for policy 0, policy_version 27380 (0.0037) [2024-07-02 13:57:06,095][36761] Fps is (10 sec: 45875.3, 60 sec: 44509.8, 300 sec: 43875.8). Total num frames: 448610304. Throughput: 0: 44131.0. Samples: 448718160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 13:57:06,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 13:57:09,834][36999] Updated weights for policy 0, policy_version 27390 (0.0040) [2024-07-02 13:57:11,095][36761] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 448806912. Throughput: 0: 44006.6. Samples: 448978680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 13:57:11,096][36761] Avg episode reward: [(0, '1.007')] [2024-07-02 13:57:12,941][36999] Updated weights for policy 0, policy_version 27400 (0.0029) [2024-07-02 13:57:16,100][36761] Fps is (10 sec: 44216.7, 60 sec: 44233.4, 300 sec: 44041.7). Total num frames: 449052672. Throughput: 0: 44134.6. Samples: 449111100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 13:57:16,101][36761] Avg episode reward: [(0, '1.008')] [2024-07-02 13:57:17,115][36999] Updated weights for policy 0, policy_version 27410 (0.0029) [2024-07-02 13:57:19,384][36979] Signal inference workers to stop experience collection... (6500 times) [2024-07-02 13:57:19,416][36999] InferenceWorker_p0-w0: stopping experience collection (6500 times) [2024-07-02 13:57:19,442][36979] Signal inference workers to resume experience collection... (6500 times) [2024-07-02 13:57:19,443][36999] InferenceWorker_p0-w0: resuming experience collection (6500 times) [2024-07-02 13:57:20,420][36999] Updated weights for policy 0, policy_version 27420 (0.0036) [2024-07-02 13:57:21,095][36761] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 449265664. Throughput: 0: 44077.5. Samples: 449376340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 13:57:21,096][36761] Avg episode reward: [(0, '0.994')] [2024-07-02 13:57:24,678][36999] Updated weights for policy 0, policy_version 27430 (0.0035) [2024-07-02 13:57:26,095][36761] Fps is (10 sec: 42618.1, 60 sec: 44509.9, 300 sec: 43931.4). Total num frames: 449478656. Throughput: 0: 43899.1. Samples: 449640740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:57:26,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 13:57:27,920][36999] Updated weights for policy 0, policy_version 27440 (0.0020) [2024-07-02 13:57:31,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 449708032. Throughput: 0: 44026.6. Samples: 449772620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:57:31,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 13:57:32,109][36999] Updated weights for policy 0, policy_version 27450 (0.0021) [2024-07-02 13:57:35,521][36999] Updated weights for policy 0, policy_version 27460 (0.0026) [2024-07-02 13:57:36,095][36761] Fps is (10 sec: 42597.9, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 449904640. Throughput: 0: 43792.8. Samples: 450031460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:57:36,096][36761] Avg episode reward: [(0, '1.007')] [2024-07-02 13:57:39,638][36999] Updated weights for policy 0, policy_version 27470 (0.0020) [2024-07-02 13:57:41,095][36761] Fps is (10 sec: 44236.9, 60 sec: 44509.8, 300 sec: 43986.9). Total num frames: 450150400. Throughput: 0: 43731.1. Samples: 450291000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 13:57:41,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 13:57:43,168][36999] Updated weights for policy 0, policy_version 27480 (0.0037) [2024-07-02 13:57:46,100][36761] Fps is (10 sec: 45854.7, 60 sec: 43687.3, 300 sec: 44041.7). Total num frames: 450363392. Throughput: 0: 43963.1. Samples: 450432500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:57:46,100][36761] Avg episode reward: [(0, '1.010')] [2024-07-02 13:57:46,976][36999] Updated weights for policy 0, policy_version 27490 (0.0033) [2024-07-02 13:57:50,721][36999] Updated weights for policy 0, policy_version 27500 (0.0037) [2024-07-02 13:57:51,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43417.5, 300 sec: 43820.2). Total num frames: 450560000. Throughput: 0: 43706.7. Samples: 450684960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:57:51,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 13:57:54,789][36999] Updated weights for policy 0, policy_version 27510 (0.0021) [2024-07-02 13:57:56,095][36761] Fps is (10 sec: 40978.3, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 450772992. Throughput: 0: 43661.3. Samples: 450943440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:57:56,099][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 13:57:58,167][36999] Updated weights for policy 0, policy_version 27520 (0.0031) [2024-07-02 13:58:01,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 451018752. Throughput: 0: 43663.1. Samples: 451075740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 13:58:01,097][36761] Avg episode reward: [(0, '1.008')] [2024-07-02 13:58:02,339][36999] Updated weights for policy 0, policy_version 27530 (0.0034) [2024-07-02 13:58:05,592][36999] Updated weights for policy 0, policy_version 27540 (0.0032) [2024-07-02 13:58:06,100][36761] Fps is (10 sec: 44217.0, 60 sec: 43414.3, 300 sec: 43819.6). Total num frames: 451215360. Throughput: 0: 43524.5. Samples: 451335140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-07-02 13:58:06,100][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 13:58:09,663][36999] Updated weights for policy 0, policy_version 27550 (0.0034) [2024-07-02 13:58:11,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 451428352. Throughput: 0: 43453.2. Samples: 451596140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-07-02 13:58:11,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 13:58:13,081][36999] Updated weights for policy 0, policy_version 27560 (0.0037) [2024-07-02 13:58:16,095][36761] Fps is (10 sec: 45896.2, 60 sec: 43694.0, 300 sec: 43986.9). Total num frames: 451674112. Throughput: 0: 43512.5. Samples: 451730680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-07-02 13:58:16,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 13:58:17,140][36999] Updated weights for policy 0, policy_version 27570 (0.0034) [2024-07-02 13:58:20,612][36999] Updated weights for policy 0, policy_version 27580 (0.0032) [2024-07-02 13:58:21,095][36761] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 43875.8). Total num frames: 451870720. Throughput: 0: 43596.2. Samples: 451993280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-07-02 13:58:21,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 13:58:24,521][36999] Updated weights for policy 0, policy_version 27590 (0.0040) [2024-07-02 13:58:26,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 452100096. Throughput: 0: 43646.7. Samples: 452255100. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-07-02 13:58:26,096][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 13:58:28,478][36999] Updated weights for policy 0, policy_version 27600 (0.0030) [2024-07-02 13:58:31,095][36761] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 43931.6). Total num frames: 452313088. Throughput: 0: 43402.6. Samples: 452385420. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-07-02 13:58:31,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 13:58:32,018][36999] Updated weights for policy 0, policy_version 27610 (0.0032) [2024-07-02 13:58:35,899][36999] Updated weights for policy 0, policy_version 27620 (0.0026) [2024-07-02 13:58:36,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 452526080. Throughput: 0: 43683.6. Samples: 452650720. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-07-02 13:58:36,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 13:58:36,192][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000027621_452542464.pth... [2024-07-02 13:58:36,264][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000026978_442007552.pth [2024-07-02 13:58:38,489][36979] Signal inference workers to stop experience collection... (6550 times) [2024-07-02 13:58:38,494][36979] Signal inference workers to resume experience collection... (6550 times) [2024-07-02 13:58:38,515][36999] InferenceWorker_p0-w0: stopping experience collection (6550 times) [2024-07-02 13:58:38,516][36999] InferenceWorker_p0-w0: resuming experience collection (6550 times) [2024-07-02 13:58:39,519][36999] Updated weights for policy 0, policy_version 27630 (0.0031) [2024-07-02 13:58:41,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 43931.3). Total num frames: 452755456. Throughput: 0: 43637.3. Samples: 452907120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 13:58:41,096][36761] Avg episode reward: [(0, '0.998')] [2024-07-02 13:58:43,318][36999] Updated weights for policy 0, policy_version 27640 (0.0025) [2024-07-02 13:58:46,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43420.9, 300 sec: 43931.3). Total num frames: 452968448. Throughput: 0: 43631.7. Samples: 453039160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 13:58:46,095][36761] Avg episode reward: [(0, '1.008')] [2024-07-02 13:58:47,159][36999] Updated weights for policy 0, policy_version 27650 (0.0037) [2024-07-02 13:58:50,665][36999] Updated weights for policy 0, policy_version 27660 (0.0026) [2024-07-02 13:58:51,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 453181440. Throughput: 0: 43754.2. Samples: 453303880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 13:58:51,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 13:58:54,484][36999] Updated weights for policy 0, policy_version 27670 (0.0030) [2024-07-02 13:58:56,095][36761] Fps is (10 sec: 45875.2, 60 sec: 44236.9, 300 sec: 43931.3). Total num frames: 453427200. Throughput: 0: 43770.8. Samples: 453565820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 13:58:56,095][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 13:58:57,979][36999] Updated weights for policy 0, policy_version 27680 (0.0029) [2024-07-02 13:59:01,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 43875.8). Total num frames: 453623808. Throughput: 0: 43869.3. Samples: 453704800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 13:59:01,096][36761] Avg episode reward: [(0, '1.003')] [2024-07-02 13:59:01,892][36999] Updated weights for policy 0, policy_version 27690 (0.0035) [2024-07-02 13:59:05,381][36999] Updated weights for policy 0, policy_version 27700 (0.0029) [2024-07-02 13:59:06,095][36761] Fps is (10 sec: 42597.7, 60 sec: 43967.0, 300 sec: 43876.1). Total num frames: 453853184. Throughput: 0: 43844.7. Samples: 453966300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 13:59:06,096][36761] Avg episode reward: [(0, '1.008')] [2024-07-02 13:59:09,442][36999] Updated weights for policy 0, policy_version 27710 (0.0028) [2024-07-02 13:59:11,100][36761] Fps is (10 sec: 45854.4, 60 sec: 44233.5, 300 sec: 43930.7). Total num frames: 454082560. Throughput: 0: 43782.6. Samples: 454225520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 13:59:11,100][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 13:59:13,355][36999] Updated weights for policy 0, policy_version 27720 (0.0037) [2024-07-02 13:59:16,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43690.6, 300 sec: 43931.3). Total num frames: 454295552. Throughput: 0: 43886.6. Samples: 454360320. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-07-02 13:59:16,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 13:59:16,811][36999] Updated weights for policy 0, policy_version 27730 (0.0031) [2024-07-02 13:59:20,742][36999] Updated weights for policy 0, policy_version 27740 (0.0022) [2024-07-02 13:59:21,095][36761] Fps is (10 sec: 42618.0, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 454508544. Throughput: 0: 43956.9. Samples: 454628780. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-07-02 13:59:21,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 13:59:24,252][36999] Updated weights for policy 0, policy_version 27750 (0.0029) [2024-07-02 13:59:26,095][36761] Fps is (10 sec: 44237.7, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 454737920. Throughput: 0: 43917.5. Samples: 454883400. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-07-02 13:59:26,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 13:59:28,040][36999] Updated weights for policy 0, policy_version 27760 (0.0032) [2024-07-02 13:59:31,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 454950912. Throughput: 0: 43981.7. Samples: 455018340. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-07-02 13:59:31,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 13:59:31,598][36999] Updated weights for policy 0, policy_version 27770 (0.0031) [2024-07-02 13:59:35,434][36999] Updated weights for policy 0, policy_version 27780 (0.0042) [2024-07-02 13:59:36,095][36761] Fps is (10 sec: 42597.6, 60 sec: 43963.6, 300 sec: 43932.0). Total num frames: 455163904. Throughput: 0: 44086.2. Samples: 455287760. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-07-02 13:59:36,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 13:59:38,950][36999] Updated weights for policy 0, policy_version 27790 (0.0050) [2024-07-02 13:59:41,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 43820.2). Total num frames: 455376896. Throughput: 0: 43928.0. Samples: 455542580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-07-02 13:59:41,096][36761] Avg episode reward: [(0, '1.007')] [2024-07-02 13:59:42,840][36999] Updated weights for policy 0, policy_version 27800 (0.0032) [2024-07-02 13:59:46,095][36761] Fps is (10 sec: 45875.6, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 455622656. Throughput: 0: 43905.8. Samples: 455680560. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-07-02 13:59:46,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 13:59:46,627][36999] Updated weights for policy 0, policy_version 27810 (0.0040) [2024-07-02 13:59:50,825][36999] Updated weights for policy 0, policy_version 27820 (0.0048) [2024-07-02 13:59:51,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43690.6, 300 sec: 43820.3). Total num frames: 455802880. Throughput: 0: 43817.8. Samples: 455938100. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-07-02 13:59:51,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 13:59:54,023][36979] Signal inference workers to stop experience collection... (6600 times) [2024-07-02 13:59:54,058][36999] InferenceWorker_p0-w0: stopping experience collection (6600 times) [2024-07-02 13:59:54,080][36979] Signal inference workers to resume experience collection... (6600 times) [2024-07-02 13:59:54,082][36999] InferenceWorker_p0-w0: resuming experience collection (6600 times) [2024-07-02 13:59:54,089][36999] Updated weights for policy 0, policy_version 27830 (0.0027) [2024-07-02 13:59:56,099][36761] Fps is (10 sec: 40944.5, 60 sec: 43414.8, 300 sec: 43875.2). Total num frames: 456032256. Throughput: 0: 43839.4. Samples: 456198260. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-07-02 13:59:56,100][36761] Avg episode reward: [(0, '1.011')] [2024-07-02 13:59:58,146][36999] Updated weights for policy 0, policy_version 27840 (0.0038) [2024-07-02 14:00:01,095][36761] Fps is (10 sec: 47514.0, 60 sec: 44236.8, 300 sec: 43931.3). Total num frames: 456278016. Throughput: 0: 43815.7. Samples: 456332020. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-07-02 14:00:01,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:00:01,448][36999] Updated weights for policy 0, policy_version 27850 (0.0033) [2024-07-02 14:00:06,035][36999] Updated weights for policy 0, policy_version 27860 (0.0029) [2024-07-02 14:00:06,095][36761] Fps is (10 sec: 42614.8, 60 sec: 43417.7, 300 sec: 43820.3). Total num frames: 456458240. Throughput: 0: 43635.1. Samples: 456592360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-07-02 14:00:06,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:00:08,823][36999] Updated weights for policy 0, policy_version 27870 (0.0028) [2024-07-02 14:00:11,095][36761] Fps is (10 sec: 40959.4, 60 sec: 43420.8, 300 sec: 43820.2). Total num frames: 456687616. Throughput: 0: 43743.8. Samples: 456851880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 14:00:11,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:00:13,342][36999] Updated weights for policy 0, policy_version 27880 (0.0020) [2024-07-02 14:00:16,095][36761] Fps is (10 sec: 47513.7, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 456933376. Throughput: 0: 43821.0. Samples: 456990280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 14:00:16,095][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:00:16,201][36999] Updated weights for policy 0, policy_version 27890 (0.0051) [2024-07-02 14:00:20,764][36999] Updated weights for policy 0, policy_version 27900 (0.0028) [2024-07-02 14:00:21,095][36761] Fps is (10 sec: 42599.4, 60 sec: 43417.6, 300 sec: 43820.3). Total num frames: 457113600. Throughput: 0: 43705.1. Samples: 457254480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 14:00:21,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:00:23,610][36999] Updated weights for policy 0, policy_version 27910 (0.0032) [2024-07-02 14:00:26,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 457359360. Throughput: 0: 43780.4. Samples: 457512700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 14:00:26,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:00:28,210][36999] Updated weights for policy 0, policy_version 27920 (0.0032) [2024-07-02 14:00:30,867][36999] Updated weights for policy 0, policy_version 27930 (0.0020) [2024-07-02 14:00:31,095][36761] Fps is (10 sec: 49150.8, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 457605120. Throughput: 0: 43827.4. Samples: 457652800. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-07-02 14:00:31,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:00:35,671][36999] Updated weights for policy 0, policy_version 27940 (0.0030) [2024-07-02 14:00:36,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43417.7, 300 sec: 43820.9). Total num frames: 457768960. Throughput: 0: 43771.7. Samples: 457907820. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-07-02 14:00:36,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:00:36,108][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000027940_457768960.pth... [2024-07-02 14:00:36,151][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000027300_447283200.pth [2024-07-02 14:00:38,421][36999] Updated weights for policy 0, policy_version 27950 (0.0046) [2024-07-02 14:00:41,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 458014720. Throughput: 0: 43699.2. Samples: 458164560. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-07-02 14:00:41,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:00:43,115][36999] Updated weights for policy 0, policy_version 27960 (0.0028) [2024-07-02 14:00:45,993][36999] Updated weights for policy 0, policy_version 27970 (0.0021) [2024-07-02 14:00:46,100][36761] Fps is (10 sec: 49129.5, 60 sec: 43960.4, 300 sec: 43930.7). Total num frames: 458260480. Throughput: 0: 43811.1. Samples: 458303720. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-07-02 14:00:46,100][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:00:50,904][36999] Updated weights for policy 0, policy_version 27980 (0.0036) [2024-07-02 14:00:51,095][36761] Fps is (10 sec: 40960.2, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 458424320. Throughput: 0: 43745.3. Samples: 458560900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:00:51,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:00:54,133][36999] Updated weights for policy 0, policy_version 27990 (0.0026) [2024-07-02 14:00:56,095][36761] Fps is (10 sec: 39339.6, 60 sec: 43693.5, 300 sec: 43820.3). Total num frames: 458653696. Throughput: 0: 43608.6. Samples: 458814260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:00:56,095][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:00:58,378][36999] Updated weights for policy 0, policy_version 28000 (0.0030) [2024-07-02 14:01:01,095][36761] Fps is (10 sec: 47513.7, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 458899456. Throughput: 0: 43562.6. Samples: 458950600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:01:01,096][36761] Avg episode reward: [(0, '1.010')] [2024-07-02 14:01:01,482][36999] Updated weights for policy 0, policy_version 28010 (0.0037) [2024-07-02 14:01:05,643][36999] Updated weights for policy 0, policy_version 28020 (0.0029) [2024-07-02 14:01:06,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 43820.3). Total num frames: 459079680. Throughput: 0: 43684.8. Samples: 459220300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:01:06,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:01:08,795][36999] Updated weights for policy 0, policy_version 28030 (0.0025) [2024-07-02 14:01:11,095][36761] Fps is (10 sec: 40959.8, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 459309056. Throughput: 0: 43653.3. Samples: 459477100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-07-02 14:01:11,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:01:13,133][36999] Updated weights for policy 0, policy_version 28040 (0.0040) [2024-07-02 14:01:16,095][36761] Fps is (10 sec: 47512.9, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 459554816. Throughput: 0: 43576.5. Samples: 459613740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-07-02 14:01:16,096][36761] Avg episode reward: [(0, '1.007')] [2024-07-02 14:01:16,257][36999] Updated weights for policy 0, policy_version 28050 (0.0023) [2024-07-02 14:01:20,515][36999] Updated weights for policy 0, policy_version 28060 (0.0025) [2024-07-02 14:01:21,096][36761] Fps is (10 sec: 42597.7, 60 sec: 43690.5, 300 sec: 43820.2). Total num frames: 459735040. Throughput: 0: 43717.1. Samples: 459875100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-07-02 14:01:21,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:01:21,505][36979] Signal inference workers to stop experience collection... (6650 times) [2024-07-02 14:01:21,558][36999] InferenceWorker_p0-w0: stopping experience collection (6650 times) [2024-07-02 14:01:21,564][36979] Signal inference workers to resume experience collection... (6650 times) [2024-07-02 14:01:21,576][36999] InferenceWorker_p0-w0: resuming experience collection (6650 times) [2024-07-02 14:01:23,689][36999] Updated weights for policy 0, policy_version 28070 (0.0043) [2024-07-02 14:01:26,095][36761] Fps is (10 sec: 40960.8, 60 sec: 43417.6, 300 sec: 43709.2). Total num frames: 459964416. Throughput: 0: 43771.7. Samples: 460134280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 14:01:26,096][36761] Avg episode reward: [(0, '1.003')] [2024-07-02 14:01:28,398][36999] Updated weights for policy 0, policy_version 28080 (0.0032) [2024-07-02 14:01:30,934][36999] Updated weights for policy 0, policy_version 28090 (0.0027) [2024-07-02 14:01:31,095][36761] Fps is (10 sec: 49152.4, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 460226560. Throughput: 0: 43735.9. Samples: 460271640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 14:01:31,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:01:35,839][36999] Updated weights for policy 0, policy_version 28100 (0.0045) [2024-07-02 14:01:36,095][36761] Fps is (10 sec: 44236.1, 60 sec: 43963.6, 300 sec: 43820.2). Total num frames: 460406784. Throughput: 0: 43944.8. Samples: 460538420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 14:01:36,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:01:38,328][36999] Updated weights for policy 0, policy_version 28110 (0.0027) [2024-07-02 14:01:41,095][36761] Fps is (10 sec: 39321.9, 60 sec: 43417.6, 300 sec: 43653.6). Total num frames: 460619776. Throughput: 0: 44005.7. Samples: 460794520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 14:01:41,096][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 14:01:43,267][36999] Updated weights for policy 0, policy_version 28120 (0.0034) [2024-07-02 14:01:45,793][36999] Updated weights for policy 0, policy_version 28130 (0.0031) [2024-07-02 14:01:46,095][36761] Fps is (10 sec: 47514.2, 60 sec: 43694.0, 300 sec: 43820.3). Total num frames: 460881920. Throughput: 0: 44070.2. Samples: 460933760. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-07-02 14:01:46,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:01:50,651][36999] Updated weights for policy 0, policy_version 28140 (0.0031) [2024-07-02 14:01:51,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 43709.2). Total num frames: 461045760. Throughput: 0: 43976.8. Samples: 461199260. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-07-02 14:01:51,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:01:53,212][36999] Updated weights for policy 0, policy_version 28150 (0.0032) [2024-07-02 14:01:56,100][36761] Fps is (10 sec: 40941.1, 60 sec: 43960.3, 300 sec: 43764.0). Total num frames: 461291520. Throughput: 0: 44049.3. Samples: 461459520. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-07-02 14:01:56,101][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:01:58,089][36999] Updated weights for policy 0, policy_version 28160 (0.0031) [2024-07-02 14:02:00,575][36999] Updated weights for policy 0, policy_version 28170 (0.0025) [2024-07-02 14:02:01,096][36761] Fps is (10 sec: 49151.6, 60 sec: 43963.6, 300 sec: 43820.2). Total num frames: 461537280. Throughput: 0: 44086.7. Samples: 461597640. Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-07-02 14:02:01,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:02:05,663][36999] Updated weights for policy 0, policy_version 28180 (0.0046) [2024-07-02 14:02:06,096][36761] Fps is (10 sec: 42617.0, 60 sec: 43963.6, 300 sec: 43764.7). Total num frames: 461717504. Throughput: 0: 44046.2. Samples: 461857180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:02:06,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:02:07,962][36999] Updated weights for policy 0, policy_version 28190 (0.0031) [2024-07-02 14:02:11,095][36761] Fps is (10 sec: 39322.3, 60 sec: 43690.7, 300 sec: 43654.3). Total num frames: 461930496. Throughput: 0: 43892.0. Samples: 462109420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:02:11,096][36761] Avg episode reward: [(0, '1.005')] [2024-07-02 14:02:13,201][36999] Updated weights for policy 0, policy_version 28200 (0.0032) [2024-07-02 14:02:15,902][36999] Updated weights for policy 0, policy_version 28210 (0.0029) [2024-07-02 14:02:16,095][36761] Fps is (10 sec: 47515.0, 60 sec: 43963.9, 300 sec: 43820.3). Total num frames: 462192640. Throughput: 0: 43844.6. Samples: 462244640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:02:16,095][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:02:20,589][36999] Updated weights for policy 0, policy_version 28220 (0.0048) [2024-07-02 14:02:21,095][36761] Fps is (10 sec: 44236.2, 60 sec: 43963.8, 300 sec: 43709.2). Total num frames: 462372864. Throughput: 0: 43811.6. Samples: 462509940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-07-02 14:02:21,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:02:23,348][36999] Updated weights for policy 0, policy_version 28230 (0.0032) [2024-07-02 14:02:26,100][36761] Fps is (10 sec: 40941.0, 60 sec: 43960.3, 300 sec: 43708.5). Total num frames: 462602240. Throughput: 0: 43957.3. Samples: 462772800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-07-02 14:02:26,101][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 14:02:27,985][36999] Updated weights for policy 0, policy_version 28240 (0.0034) [2024-07-02 14:02:30,701][36999] Updated weights for policy 0, policy_version 28250 (0.0019) [2024-07-02 14:02:31,095][36761] Fps is (10 sec: 47514.2, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 462848000. Throughput: 0: 43791.1. Samples: 462904360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-07-02 14:02:31,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:02:35,570][36999] Updated weights for policy 0, policy_version 28260 (0.0039) [2024-07-02 14:02:36,095][36761] Fps is (10 sec: 42618.0, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 463028224. Throughput: 0: 43734.7. Samples: 463167320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-07-02 14:02:36,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:02:36,279][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000028263_463060992.pth... [2024-07-02 14:02:36,326][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000027621_452542464.pth [2024-07-02 14:02:38,203][36999] Updated weights for policy 0, policy_version 28270 (0.0035) [2024-07-02 14:02:41,095][36761] Fps is (10 sec: 39321.5, 60 sec: 43690.7, 300 sec: 43654.3). Total num frames: 463241216. Throughput: 0: 43746.3. Samples: 463427900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 14:02:41,096][36761] Avg episode reward: [(0, '1.010')] [2024-07-02 14:02:42,778][36979] Signal inference workers to stop experience collection... (6700 times) [2024-07-02 14:02:42,778][36979] Signal inference workers to resume experience collection... (6700 times) [2024-07-02 14:02:42,789][36999] InferenceWorker_p0-w0: stopping experience collection (6700 times) [2024-07-02 14:02:42,789][36999] InferenceWorker_p0-w0: resuming experience collection (6700 times) [2024-07-02 14:02:42,933][36999] Updated weights for policy 0, policy_version 28280 (0.0035) [2024-07-02 14:02:45,913][36999] Updated weights for policy 0, policy_version 28290 (0.0025) [2024-07-02 14:02:46,095][36761] Fps is (10 sec: 49151.4, 60 sec: 43963.6, 300 sec: 43931.3). Total num frames: 463519744. Throughput: 0: 43452.0. Samples: 463552980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 14:02:46,096][36761] Avg episode reward: [(0, '1.006')] [2024-07-02 14:02:50,271][36999] Updated weights for policy 0, policy_version 28300 (0.0031) [2024-07-02 14:02:51,095][36761] Fps is (10 sec: 47513.3, 60 sec: 44509.9, 300 sec: 43875.8). Total num frames: 463716352. Throughput: 0: 43703.7. Samples: 463823840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 14:02:51,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:02:53,220][36999] Updated weights for policy 0, policy_version 28310 (0.0030) [2024-07-02 14:02:56,095][36761] Fps is (10 sec: 39322.2, 60 sec: 43694.0, 300 sec: 43709.2). Total num frames: 463912960. Throughput: 0: 43996.4. Samples: 464089260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 14:02:56,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:02:57,697][36999] Updated weights for policy 0, policy_version 28320 (0.0023) [2024-07-02 14:03:00,666][36999] Updated weights for policy 0, policy_version 28330 (0.0024) [2024-07-02 14:03:01,095][36761] Fps is (10 sec: 44237.4, 60 sec: 43690.8, 300 sec: 43876.5). Total num frames: 464158720. Throughput: 0: 43791.1. Samples: 464215240. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-07-02 14:03:01,095][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:03:05,371][36999] Updated weights for policy 0, policy_version 28340 (0.0032) [2024-07-02 14:03:06,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43963.9, 300 sec: 43820.3). Total num frames: 464355328. Throughput: 0: 43883.6. Samples: 464484700. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-07-02 14:03:06,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:03:08,092][36999] Updated weights for policy 0, policy_version 28350 (0.0028) [2024-07-02 14:03:11,095][36761] Fps is (10 sec: 40959.4, 60 sec: 43963.6, 300 sec: 43709.2). Total num frames: 464568320. Throughput: 0: 43741.7. Samples: 464740980. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-07-02 14:03:11,096][36761] Avg episode reward: [(0, '1.041')] [2024-07-02 14:03:11,096][36979] Saving new best policy, reward=1.041! [2024-07-02 14:03:12,797][36999] Updated weights for policy 0, policy_version 28360 (0.0033) [2024-07-02 14:03:15,829][36999] Updated weights for policy 0, policy_version 28370 (0.0040) [2024-07-02 14:03:16,095][36761] Fps is (10 sec: 47513.4, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 464830464. Throughput: 0: 43691.0. Samples: 464870460. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-07-02 14:03:16,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:03:20,229][36999] Updated weights for policy 0, policy_version 28380 (0.0041) [2024-07-02 14:03:21,095][36761] Fps is (10 sec: 45875.2, 60 sec: 44236.8, 300 sec: 43820.2). Total num frames: 465027072. Throughput: 0: 43774.6. Samples: 465137180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:03:21,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:03:23,351][36999] Updated weights for policy 0, policy_version 28390 (0.0035) [2024-07-02 14:03:26,095][36761] Fps is (10 sec: 39321.7, 60 sec: 43694.0, 300 sec: 43764.7). Total num frames: 465223680. Throughput: 0: 43969.3. Samples: 465406520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:03:26,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:03:27,574][36999] Updated weights for policy 0, policy_version 28400 (0.0031) [2024-07-02 14:03:30,616][36999] Updated weights for policy 0, policy_version 28410 (0.0033) [2024-07-02 14:03:31,095][36761] Fps is (10 sec: 45875.7, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 465485824. Throughput: 0: 44119.7. Samples: 465538360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:03:31,100][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:03:34,955][36999] Updated weights for policy 0, policy_version 28420 (0.0028) [2024-07-02 14:03:36,095][36761] Fps is (10 sec: 45875.8, 60 sec: 44236.9, 300 sec: 43820.3). Total num frames: 465682432. Throughput: 0: 44064.2. Samples: 465806720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 14:03:36,095][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:03:38,210][36999] Updated weights for policy 0, policy_version 28430 (0.0030) [2024-07-02 14:03:41,095][36761] Fps is (10 sec: 39321.3, 60 sec: 43963.7, 300 sec: 43764.7). Total num frames: 465879040. Throughput: 0: 43885.7. Samples: 466064120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 14:03:41,096][36761] Avg episode reward: [(0, '1.008')] [2024-07-02 14:03:42,391][36999] Updated weights for policy 0, policy_version 28440 (0.0031) [2024-07-02 14:03:45,915][36999] Updated weights for policy 0, policy_version 28450 (0.0035) [2024-07-02 14:03:46,095][36761] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 43875.8). Total num frames: 466124800. Throughput: 0: 44050.1. Samples: 466197500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 14:03:46,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:03:49,758][36999] Updated weights for policy 0, policy_version 28460 (0.0026) [2024-07-02 14:03:51,100][36761] Fps is (10 sec: 45854.6, 60 sec: 43687.4, 300 sec: 43764.0). Total num frames: 466337792. Throughput: 0: 43962.7. Samples: 466463220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 14:03:51,100][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:03:53,387][36999] Updated weights for policy 0, policy_version 28470 (0.0026) [2024-07-02 14:03:56,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43963.7, 300 sec: 43820.3). Total num frames: 466550784. Throughput: 0: 44154.7. Samples: 466727940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 14:03:56,096][36761] Avg episode reward: [(0, '1.038')] [2024-07-02 14:03:57,290][36999] Updated weights for policy 0, policy_version 28480 (0.0026) [2024-07-02 14:04:00,781][36999] Updated weights for policy 0, policy_version 28490 (0.0034) [2024-07-02 14:04:01,095][36761] Fps is (10 sec: 44256.9, 60 sec: 43690.6, 300 sec: 43820.3). Total num frames: 466780160. Throughput: 0: 44158.7. Samples: 466857600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 14:04:01,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:04:04,698][36999] Updated weights for policy 0, policy_version 28500 (0.0031) [2024-07-02 14:04:06,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43963.7, 300 sec: 43765.4). Total num frames: 466993152. Throughput: 0: 44057.7. Samples: 467119780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 14:04:06,096][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 14:04:08,201][36999] Updated weights for policy 0, policy_version 28510 (0.0027) [2024-07-02 14:04:11,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43963.7, 300 sec: 43764.7). Total num frames: 467206144. Throughput: 0: 44012.8. Samples: 467387100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-07-02 14:04:11,096][36761] Avg episode reward: [(0, '1.010')] [2024-07-02 14:04:12,038][36999] Updated weights for policy 0, policy_version 28520 (0.0024) [2024-07-02 14:04:15,616][36999] Updated weights for policy 0, policy_version 28530 (0.0031) [2024-07-02 14:04:16,095][36761] Fps is (10 sec: 45875.8, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 467451904. Throughput: 0: 43868.4. Samples: 467512440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 14:04:16,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:04:19,691][36999] Updated weights for policy 0, policy_version 28540 (0.0028) [2024-07-02 14:04:21,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 467681280. Throughput: 0: 43802.5. Samples: 467777840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 14:04:21,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:04:23,368][36999] Updated weights for policy 0, policy_version 28550 (0.0025) [2024-07-02 14:04:26,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43963.8, 300 sec: 43764.7). Total num frames: 467861504. Throughput: 0: 43925.4. Samples: 468040760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 14:04:26,096][36761] Avg episode reward: [(0, '0.977')] [2024-07-02 14:04:26,876][36979] Signal inference workers to stop experience collection... (6750 times) [2024-07-02 14:04:26,877][36979] Signal inference workers to resume experience collection... (6750 times) [2024-07-02 14:04:26,898][36999] InferenceWorker_p0-w0: stopping experience collection (6750 times) [2024-07-02 14:04:26,898][36999] InferenceWorker_p0-w0: resuming experience collection (6750 times) [2024-07-02 14:04:27,188][36999] Updated weights for policy 0, policy_version 28560 (0.0026) [2024-07-02 14:04:30,630][36999] Updated weights for policy 0, policy_version 28570 (0.0022) [2024-07-02 14:04:31,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 43820.3). Total num frames: 468090880. Throughput: 0: 43722.7. Samples: 468165020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 14:04:31,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:04:34,631][36999] Updated weights for policy 0, policy_version 28580 (0.0023) [2024-07-02 14:04:36,100][36761] Fps is (10 sec: 45853.8, 60 sec: 43960.3, 300 sec: 43875.1). Total num frames: 468320256. Throughput: 0: 43742.6. Samples: 468431640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 14:04:36,101][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:04:36,112][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000028584_468320256.pth... [2024-07-02 14:04:36,167][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000027940_457768960.pth [2024-07-02 14:04:37,938][36999] Updated weights for policy 0, policy_version 28590 (0.0025) [2024-07-02 14:04:41,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 468500480. Throughput: 0: 43842.2. Samples: 468700840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 14:04:41,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 14:04:42,002][36999] Updated weights for policy 0, policy_version 28600 (0.0032) [2024-07-02 14:04:45,514][36999] Updated weights for policy 0, policy_version 28610 (0.0035) [2024-07-02 14:04:46,095][36761] Fps is (10 sec: 42617.5, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 468746240. Throughput: 0: 43739.9. Samples: 468825900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 14:04:46,096][36761] Avg episode reward: [(0, '1.007')] [2024-07-02 14:04:49,363][36999] Updated weights for policy 0, policy_version 28620 (0.0026) [2024-07-02 14:04:51,095][36761] Fps is (10 sec: 47513.1, 60 sec: 43967.0, 300 sec: 43876.3). Total num frames: 468975616. Throughput: 0: 43924.9. Samples: 469096400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 14:04:51,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:04:52,788][36999] Updated weights for policy 0, policy_version 28630 (0.0023) [2024-07-02 14:04:56,095][36761] Fps is (10 sec: 45875.6, 60 sec: 44236.8, 300 sec: 43820.2). Total num frames: 469204992. Throughput: 0: 43867.1. Samples: 469361120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-07-02 14:04:56,099][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:04:56,832][36999] Updated weights for policy 0, policy_version 28640 (0.0036) [2024-07-02 14:05:00,338][36999] Updated weights for policy 0, policy_version 28650 (0.0035) [2024-07-02 14:05:01,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 469417984. Throughput: 0: 43920.9. Samples: 469488880. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-07-02 14:05:01,098][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:05:04,173][36999] Updated weights for policy 0, policy_version 28660 (0.0038) [2024-07-02 14:05:06,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 469630976. Throughput: 0: 43916.5. Samples: 469754080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-07-02 14:05:06,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:05:07,967][36999] Updated weights for policy 0, policy_version 28670 (0.0026) [2024-07-02 14:05:11,097][36761] Fps is (10 sec: 44230.0, 60 sec: 44235.7, 300 sec: 43820.0). Total num frames: 469860352. Throughput: 0: 44028.2. Samples: 470022100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:05:11,106][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:05:11,509][36999] Updated weights for policy 0, policy_version 28680 (0.0047) [2024-07-02 14:05:15,524][36999] Updated weights for policy 0, policy_version 28690 (0.0045) [2024-07-02 14:05:16,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 470089728. Throughput: 0: 44150.2. Samples: 470151780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:05:16,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:05:18,957][36999] Updated weights for policy 0, policy_version 28700 (0.0035) [2024-07-02 14:05:21,095][36761] Fps is (10 sec: 44243.9, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 470302720. Throughput: 0: 44150.8. Samples: 470418220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:05:21,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:05:22,902][36999] Updated weights for policy 0, policy_version 28710 (0.0023) [2024-07-02 14:05:26,100][36761] Fps is (10 sec: 42578.9, 60 sec: 44233.4, 300 sec: 43764.1). Total num frames: 470515712. Throughput: 0: 44034.2. Samples: 470682580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:05:26,100][36761] Avg episode reward: [(0, '1.007')] [2024-07-02 14:05:26,385][36999] Updated weights for policy 0, policy_version 28720 (0.0033) [2024-07-02 14:05:30,261][36999] Updated weights for policy 0, policy_version 28730 (0.0037) [2024-07-02 14:05:31,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 470728704. Throughput: 0: 43966.8. Samples: 470804400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 14:05:31,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:05:34,051][36999] Updated weights for policy 0, policy_version 28740 (0.0026) [2024-07-02 14:05:36,096][36761] Fps is (10 sec: 44256.2, 60 sec: 43967.0, 300 sec: 43875.8). Total num frames: 470958080. Throughput: 0: 43971.5. Samples: 471075120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 14:05:36,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:05:37,691][36999] Updated weights for policy 0, policy_version 28750 (0.0023) [2024-07-02 14:05:39,864][36979] Signal inference workers to stop experience collection... (6800 times) [2024-07-02 14:05:39,865][36979] Signal inference workers to resume experience collection... (6800 times) [2024-07-02 14:05:39,886][36999] InferenceWorker_p0-w0: stopping experience collection (6800 times) [2024-07-02 14:05:39,886][36999] InferenceWorker_p0-w0: resuming experience collection (6800 times) [2024-07-02 14:05:41,095][36761] Fps is (10 sec: 45875.0, 60 sec: 44782.9, 300 sec: 43820.9). Total num frames: 471187456. Throughput: 0: 43924.9. Samples: 471337740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 14:05:41,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:05:41,376][36999] Updated weights for policy 0, policy_version 28760 (0.0036) [2024-07-02 14:05:45,150][36999] Updated weights for policy 0, policy_version 28770 (0.0035) [2024-07-02 14:05:46,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 471384064. Throughput: 0: 43999.1. Samples: 471468840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 14:05:46,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:05:48,939][36999] Updated weights for policy 0, policy_version 28780 (0.0028) [2024-07-02 14:05:51,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 471613440. Throughput: 0: 43945.2. Samples: 471731620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:05:51,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:05:52,536][36999] Updated weights for policy 0, policy_version 28790 (0.0031) [2024-07-02 14:05:56,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 471842816. Throughput: 0: 43855.2. Samples: 471995520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:05:56,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:05:56,222][36999] Updated weights for policy 0, policy_version 28800 (0.0029) [2024-07-02 14:05:59,976][36999] Updated weights for policy 0, policy_version 28810 (0.0039) [2024-07-02 14:06:01,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 43875.8). Total num frames: 472023040. Throughput: 0: 43826.6. Samples: 472123980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:06:01,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:06:03,538][36999] Updated weights for policy 0, policy_version 28820 (0.0029) [2024-07-02 14:06:06,095][36761] Fps is (10 sec: 42599.0, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 472268800. Throughput: 0: 43802.7. Samples: 472389340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:06:06,095][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:06:07,621][36999] Updated weights for policy 0, policy_version 28830 (0.0051) [2024-07-02 14:06:11,095][36761] Fps is (10 sec: 47513.7, 60 sec: 43964.9, 300 sec: 43875.8). Total num frames: 472498176. Throughput: 0: 43756.0. Samples: 472651400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 14:06:11,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:06:11,306][36999] Updated weights for policy 0, policy_version 28840 (0.0023) [2024-07-02 14:06:15,002][36999] Updated weights for policy 0, policy_version 28850 (0.0026) [2024-07-02 14:06:16,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 43931.4). Total num frames: 472694784. Throughput: 0: 44045.3. Samples: 472786440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 14:06:16,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:06:18,842][36999] Updated weights for policy 0, policy_version 28860 (0.0042) [2024-07-02 14:06:21,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 43931.3). Total num frames: 472924160. Throughput: 0: 43904.1. Samples: 473050800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 14:06:21,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:06:22,313][36999] Updated weights for policy 0, policy_version 28870 (0.0026) [2024-07-02 14:06:26,097][36761] Fps is (10 sec: 44228.1, 60 sec: 43692.5, 300 sec: 43764.4). Total num frames: 473137152. Throughput: 0: 43900.3. Samples: 473313340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-07-02 14:06:26,098][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:06:26,281][36999] Updated weights for policy 0, policy_version 28880 (0.0034) [2024-07-02 14:06:30,316][36999] Updated weights for policy 0, policy_version 28890 (0.0037) [2024-07-02 14:06:31,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 473350144. Throughput: 0: 43881.8. Samples: 473443520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:06:31,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:06:33,706][36999] Updated weights for policy 0, policy_version 28900 (0.0029) [2024-07-02 14:06:36,095][36761] Fps is (10 sec: 45884.0, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 473595904. Throughput: 0: 43960.0. Samples: 473709820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:06:36,099][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:06:36,121][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000028906_473595904.pth... [2024-07-02 14:06:36,202][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000028263_463060992.pth [2024-07-02 14:06:37,719][36999] Updated weights for policy 0, policy_version 28910 (0.0040) [2024-07-02 14:06:41,095][36761] Fps is (10 sec: 47514.5, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 473825280. Throughput: 0: 43907.7. Samples: 473971360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:06:41,095][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:06:41,101][36999] Updated weights for policy 0, policy_version 28920 (0.0032) [2024-07-02 14:06:45,601][36999] Updated weights for policy 0, policy_version 28930 (0.0031) [2024-07-02 14:06:46,095][36761] Fps is (10 sec: 40960.2, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 474005504. Throughput: 0: 43902.7. Samples: 474099600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-07-02 14:06:46,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:06:48,420][36999] Updated weights for policy 0, policy_version 28940 (0.0037) [2024-07-02 14:06:51,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43963.8, 300 sec: 43932.0). Total num frames: 474251264. Throughput: 0: 43969.8. Samples: 474367980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-07-02 14:06:51,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:06:53,002][36999] Updated weights for policy 0, policy_version 28950 (0.0041) [2024-07-02 14:06:55,614][36979] Signal inference workers to stop experience collection... (6850 times) [2024-07-02 14:06:55,669][36979] Signal inference workers to resume experience collection... (6850 times) [2024-07-02 14:06:55,671][36999] InferenceWorker_p0-w0: stopping experience collection (6850 times) [2024-07-02 14:06:55,699][36999] InferenceWorker_p0-w0: resuming experience collection (6850 times) [2024-07-02 14:06:55,803][36999] Updated weights for policy 0, policy_version 28960 (0.0035) [2024-07-02 14:06:56,095][36761] Fps is (10 sec: 47513.3, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 474480640. Throughput: 0: 43917.3. Samples: 474627680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-07-02 14:06:56,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:07:00,305][36999] Updated weights for policy 0, policy_version 28970 (0.0038) [2024-07-02 14:07:01,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44509.9, 300 sec: 43986.9). Total num frames: 474693632. Throughput: 0: 43927.6. Samples: 474763180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-07-02 14:07:01,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:07:03,538][36999] Updated weights for policy 0, policy_version 28980 (0.0020) [2024-07-02 14:07:06,095][36761] Fps is (10 sec: 44237.0, 60 sec: 44236.7, 300 sec: 44042.4). Total num frames: 474923008. Throughput: 0: 44000.0. Samples: 475030800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:07:06,096][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 14:07:07,578][36999] Updated weights for policy 0, policy_version 28990 (0.0031) [2024-07-02 14:07:11,008][36999] Updated weights for policy 0, policy_version 29000 (0.0032) [2024-07-02 14:07:11,096][36761] Fps is (10 sec: 44234.8, 60 sec: 43963.4, 300 sec: 43875.7). Total num frames: 475136000. Throughput: 0: 43964.2. Samples: 475291660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:07:11,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:07:14,945][36999] Updated weights for policy 0, policy_version 29010 (0.0022) [2024-07-02 14:07:16,095][36761] Fps is (10 sec: 42598.8, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 475348992. Throughput: 0: 44125.5. Samples: 475429160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:07:16,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 14:07:18,360][36999] Updated weights for policy 0, policy_version 29020 (0.0034) [2024-07-02 14:07:21,095][36761] Fps is (10 sec: 42600.6, 60 sec: 43963.8, 300 sec: 43932.0). Total num frames: 475561984. Throughput: 0: 44083.3. Samples: 475693560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:07:21,095][36761] Avg episode reward: [(0, '1.038')] [2024-07-02 14:07:22,209][36999] Updated weights for policy 0, policy_version 29030 (0.0031) [2024-07-02 14:07:25,773][36999] Updated weights for policy 0, policy_version 29040 (0.0030) [2024-07-02 14:07:26,095][36761] Fps is (10 sec: 45875.0, 60 sec: 44511.4, 300 sec: 43931.3). Total num frames: 475807744. Throughput: 0: 44038.1. Samples: 475953080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-07-02 14:07:26,096][36761] Avg episode reward: [(0, '0.960')] [2024-07-02 14:07:29,587][36999] Updated weights for policy 0, policy_version 29050 (0.0028) [2024-07-02 14:07:31,095][36761] Fps is (10 sec: 45874.7, 60 sec: 44509.9, 300 sec: 44042.4). Total num frames: 476020736. Throughput: 0: 44135.5. Samples: 476085700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-07-02 14:07:31,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:07:33,150][36999] Updated weights for policy 0, policy_version 29060 (0.0044) [2024-07-02 14:07:36,095][36761] Fps is (10 sec: 42597.9, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 476233728. Throughput: 0: 44068.3. Samples: 476351060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-07-02 14:07:36,098][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:07:37,078][36999] Updated weights for policy 0, policy_version 29070 (0.0030) [2024-07-02 14:07:40,663][36999] Updated weights for policy 0, policy_version 29080 (0.0046) [2024-07-02 14:07:41,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 43820.3). Total num frames: 476446720. Throughput: 0: 44044.9. Samples: 476609700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-07-02 14:07:41,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:07:44,505][36999] Updated weights for policy 0, policy_version 29090 (0.0030) [2024-07-02 14:07:46,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44509.9, 300 sec: 43931.3). Total num frames: 476676096. Throughput: 0: 43999.1. Samples: 476743140. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-07-02 14:07:46,096][36761] Avg episode reward: [(0, '1.006')] [2024-07-02 14:07:48,613][36999] Updated weights for policy 0, policy_version 29100 (0.0027) [2024-07-02 14:07:51,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43690.6, 300 sec: 43931.3). Total num frames: 476872704. Throughput: 0: 43856.5. Samples: 477004340. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-07-02 14:07:51,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:07:52,144][36999] Updated weights for policy 0, policy_version 29110 (0.0045) [2024-07-02 14:07:56,018][36999] Updated weights for policy 0, policy_version 29120 (0.0025) [2024-07-02 14:07:56,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 477102080. Throughput: 0: 43855.6. Samples: 477265140. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-07-02 14:07:56,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:07:59,567][36999] Updated weights for policy 0, policy_version 29130 (0.0027) [2024-07-02 14:08:01,095][36761] Fps is (10 sec: 47513.1, 60 sec: 44236.7, 300 sec: 44042.4). Total num frames: 477347840. Throughput: 0: 43694.5. Samples: 477395420. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-07-02 14:08:01,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:08:03,388][36999] Updated weights for policy 0, policy_version 29140 (0.0030) [2024-07-02 14:08:06,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 477544448. Throughput: 0: 43648.0. Samples: 477657720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-07-02 14:08:06,096][36761] Avg episode reward: [(0, '1.003')] [2024-07-02 14:08:06,898][36999] Updated weights for policy 0, policy_version 29150 (0.0033) [2024-07-02 14:08:10,849][36999] Updated weights for policy 0, policy_version 29160 (0.0037) [2024-07-02 14:08:11,095][36761] Fps is (10 sec: 40960.2, 60 sec: 43691.0, 300 sec: 43820.3). Total num frames: 477757440. Throughput: 0: 43773.3. Samples: 477922880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-07-02 14:08:11,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:08:14,531][36999] Updated weights for policy 0, policy_version 29170 (0.0034) [2024-07-02 14:08:16,095][36761] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 478003200. Throughput: 0: 43788.1. Samples: 478056160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-07-02 14:08:16,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:08:17,340][36979] Signal inference workers to stop experience collection... (6900 times) [2024-07-02 14:08:17,340][36979] Signal inference workers to resume experience collection... (6900 times) [2024-07-02 14:08:17,359][36999] InferenceWorker_p0-w0: stopping experience collection (6900 times) [2024-07-02 14:08:17,364][36999] InferenceWorker_p0-w0: resuming experience collection (6900 times) [2024-07-02 14:08:18,387][36999] Updated weights for policy 0, policy_version 29180 (0.0032) [2024-07-02 14:08:21,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 43931.3). Total num frames: 478183424. Throughput: 0: 43717.8. Samples: 478318360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-07-02 14:08:21,098][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:08:21,862][36999] Updated weights for policy 0, policy_version 29190 (0.0024) [2024-07-02 14:08:26,057][36999] Updated weights for policy 0, policy_version 29200 (0.0036) [2024-07-02 14:08:26,095][36761] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 43820.2). Total num frames: 478412800. Throughput: 0: 43774.6. Samples: 478579560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-07-02 14:08:26,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:08:29,345][36999] Updated weights for policy 0, policy_version 29210 (0.0038) [2024-07-02 14:08:31,095][36761] Fps is (10 sec: 47514.3, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 478658560. Throughput: 0: 43791.6. Samples: 478713760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-07-02 14:08:31,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:08:33,541][36999] Updated weights for policy 0, policy_version 29220 (0.0033) [2024-07-02 14:08:36,095][36761] Fps is (10 sec: 42599.0, 60 sec: 43417.7, 300 sec: 43931.4). Total num frames: 478838784. Throughput: 0: 43907.1. Samples: 478980160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-07-02 14:08:36,095][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:08:36,118][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000029227_478855168.pth... [2024-07-02 14:08:36,162][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000028584_468320256.pth [2024-07-02 14:08:36,834][36999] Updated weights for policy 0, policy_version 29230 (0.0030) [2024-07-02 14:08:40,889][36999] Updated weights for policy 0, policy_version 29240 (0.0027) [2024-07-02 14:08:41,095][36761] Fps is (10 sec: 40959.5, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 479068160. Throughput: 0: 43887.5. Samples: 479240080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:08:41,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:08:44,371][36999] Updated weights for policy 0, policy_version 29250 (0.0043) [2024-07-02 14:08:46,095][36761] Fps is (10 sec: 49151.8, 60 sec: 44236.8, 300 sec: 44043.1). Total num frames: 479330304. Throughput: 0: 43894.3. Samples: 479370660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:08:46,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:08:48,281][36999] Updated weights for policy 0, policy_version 29260 (0.0022) [2024-07-02 14:08:51,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 479494144. Throughput: 0: 43795.9. Samples: 479628540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:08:51,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:08:52,116][36999] Updated weights for policy 0, policy_version 29270 (0.0032) [2024-07-02 14:08:56,094][36999] Updated weights for policy 0, policy_version 29280 (0.0034) [2024-07-02 14:08:56,095][36761] Fps is (10 sec: 39321.2, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 479723520. Throughput: 0: 43796.4. Samples: 479893720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:08:56,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:08:59,499][36999] Updated weights for policy 0, policy_version 29290 (0.0031) [2024-07-02 14:09:01,095][36761] Fps is (10 sec: 47513.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 479969280. Throughput: 0: 43719.1. Samples: 480023520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:09:01,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:09:03,432][36999] Updated weights for policy 0, policy_version 29300 (0.0032) [2024-07-02 14:09:06,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43690.6, 300 sec: 43931.3). Total num frames: 480165888. Throughput: 0: 43849.4. Samples: 480291580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:09:06,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:09:06,843][36999] Updated weights for policy 0, policy_version 29310 (0.0022) [2024-07-02 14:09:10,793][36999] Updated weights for policy 0, policy_version 29320 (0.0036) [2024-07-02 14:09:11,095][36761] Fps is (10 sec: 40959.8, 60 sec: 43690.6, 300 sec: 43820.2). Total num frames: 480378880. Throughput: 0: 43862.7. Samples: 480553380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:09:11,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:09:14,194][36999] Updated weights for policy 0, policy_version 29330 (0.0039) [2024-07-02 14:09:16,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 480624640. Throughput: 0: 43752.0. Samples: 480682600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:09:16,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:09:18,147][36999] Updated weights for policy 0, policy_version 29340 (0.0039) [2024-07-02 14:09:21,095][36761] Fps is (10 sec: 45875.6, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 480837632. Throughput: 0: 43775.5. Samples: 480950060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:09:21,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:09:21,639][36999] Updated weights for policy 0, policy_version 29350 (0.0032) [2024-07-02 14:09:25,538][36999] Updated weights for policy 0, policy_version 29360 (0.0042) [2024-07-02 14:09:26,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 481050624. Throughput: 0: 43935.2. Samples: 481217160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:09:26,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:09:28,990][36999] Updated weights for policy 0, policy_version 29370 (0.0034) [2024-07-02 14:09:31,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43690.7, 300 sec: 43932.0). Total num frames: 481280000. Throughput: 0: 43998.7. Samples: 481350600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:09:31,095][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:09:32,946][36999] Updated weights for policy 0, policy_version 29380 (0.0025) [2024-07-02 14:09:36,095][36761] Fps is (10 sec: 44236.6, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 481492992. Throughput: 0: 44095.6. Samples: 481612840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:09:36,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:09:36,535][36979] Signal inference workers to stop experience collection... (6950 times) [2024-07-02 14:09:36,574][36999] InferenceWorker_p0-w0: stopping experience collection (6950 times) [2024-07-02 14:09:36,590][36979] Signal inference workers to resume experience collection... (6950 times) [2024-07-02 14:09:36,596][36999] InferenceWorker_p0-w0: resuming experience collection (6950 times) [2024-07-02 14:09:36,744][36999] Updated weights for policy 0, policy_version 29390 (0.0033) [2024-07-02 14:09:40,349][36999] Updated weights for policy 0, policy_version 29400 (0.0043) [2024-07-02 14:09:41,095][36761] Fps is (10 sec: 44236.2, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 481722368. Throughput: 0: 44063.1. Samples: 481876560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-07-02 14:09:41,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:09:44,061][36999] Updated weights for policy 0, policy_version 29410 (0.0030) [2024-07-02 14:09:46,095][36761] Fps is (10 sec: 45875.0, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 481951744. Throughput: 0: 44236.9. Samples: 482014180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-07-02 14:09:46,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:09:47,674][36999] Updated weights for policy 0, policy_version 29420 (0.0028) [2024-07-02 14:09:51,095][36761] Fps is (10 sec: 42599.1, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 482148352. Throughput: 0: 44166.4. Samples: 482279060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-07-02 14:09:51,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:09:51,456][36999] Updated weights for policy 0, policy_version 29430 (0.0036) [2024-07-02 14:09:55,238][36999] Updated weights for policy 0, policy_version 29440 (0.0021) [2024-07-02 14:09:56,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 482361344. Throughput: 0: 44098.2. Samples: 482537800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-07-02 14:09:56,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:09:58,836][36999] Updated weights for policy 0, policy_version 29450 (0.0024) [2024-07-02 14:10:01,095][36761] Fps is (10 sec: 45874.3, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 482607104. Throughput: 0: 44215.4. Samples: 482672300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:10:01,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:10:02,612][36999] Updated weights for policy 0, policy_version 29460 (0.0036) [2024-07-02 14:10:06,095][36761] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 43931.6). Total num frames: 482820096. Throughput: 0: 44145.7. Samples: 482936620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:10:06,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:10:06,245][36999] Updated weights for policy 0, policy_version 29470 (0.0038) [2024-07-02 14:10:09,952][36999] Updated weights for policy 0, policy_version 29480 (0.0032) [2024-07-02 14:10:11,095][36761] Fps is (10 sec: 42599.2, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 483033088. Throughput: 0: 44142.3. Samples: 483203560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:10:11,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:10:13,489][36999] Updated weights for policy 0, policy_version 29490 (0.0042) [2024-07-02 14:10:16,096][36761] Fps is (10 sec: 44234.5, 60 sec: 43963.3, 300 sec: 43931.2). Total num frames: 483262464. Throughput: 0: 43889.6. Samples: 483325660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:10:16,096][36761] Avg episode reward: [(0, '1.046')] [2024-07-02 14:10:16,104][36979] Saving new best policy, reward=1.046! [2024-07-02 14:10:17,611][36999] Updated weights for policy 0, policy_version 29500 (0.0036) [2024-07-02 14:10:20,971][36999] Updated weights for policy 0, policy_version 29510 (0.0028) [2024-07-02 14:10:21,095][36761] Fps is (10 sec: 45874.6, 60 sec: 44236.8, 300 sec: 43987.5). Total num frames: 483491840. Throughput: 0: 43967.9. Samples: 483591400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:10:21,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:10:25,114][36999] Updated weights for policy 0, policy_version 29520 (0.0031) [2024-07-02 14:10:26,095][36761] Fps is (10 sec: 42600.4, 60 sec: 43963.6, 300 sec: 43931.3). Total num frames: 483688448. Throughput: 0: 44009.8. Samples: 483857000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:10:26,096][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 14:10:28,605][36999] Updated weights for policy 0, policy_version 29530 (0.0031) [2024-07-02 14:10:31,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 483901440. Throughput: 0: 43775.2. Samples: 483984060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:10:31,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:10:32,555][36999] Updated weights for policy 0, policy_version 29540 (0.0041) [2024-07-02 14:10:35,922][36999] Updated weights for policy 0, policy_version 29550 (0.0031) [2024-07-02 14:10:36,095][36761] Fps is (10 sec: 45875.4, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 484147200. Throughput: 0: 43761.6. Samples: 484248340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:10:36,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:10:36,113][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000029550_484147200.pth... [2024-07-02 14:10:36,165][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000028906_473595904.pth [2024-07-02 14:10:40,047][36999] Updated weights for policy 0, policy_version 29560 (0.0033) [2024-07-02 14:10:41,095][36761] Fps is (10 sec: 45874.8, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 484360192. Throughput: 0: 43834.2. Samples: 484510340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 14:10:41,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:10:43,261][36999] Updated weights for policy 0, policy_version 29570 (0.0043) [2024-07-02 14:10:46,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 43931.3). Total num frames: 484573184. Throughput: 0: 43792.4. Samples: 484642960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 14:10:46,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:10:47,669][36999] Updated weights for policy 0, policy_version 29580 (0.0042) [2024-07-02 14:10:50,789][36999] Updated weights for policy 0, policy_version 29590 (0.0030) [2024-07-02 14:10:51,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44236.6, 300 sec: 43931.3). Total num frames: 484802560. Throughput: 0: 43748.4. Samples: 484905300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 14:10:51,096][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 14:10:55,008][36999] Updated weights for policy 0, policy_version 29600 (0.0034) [2024-07-02 14:10:56,095][36761] Fps is (10 sec: 42599.2, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 484999168. Throughput: 0: 43733.8. Samples: 485171580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:10:56,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:10:58,179][36999] Updated weights for policy 0, policy_version 29610 (0.0029) [2024-07-02 14:11:01,095][36761] Fps is (10 sec: 40960.5, 60 sec: 43417.7, 300 sec: 43875.8). Total num frames: 485212160. Throughput: 0: 43813.9. Samples: 485297260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:11:01,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:11:02,277][36999] Updated weights for policy 0, policy_version 29620 (0.0028) [2024-07-02 14:11:05,648][36999] Updated weights for policy 0, policy_version 29630 (0.0035) [2024-07-02 14:11:06,095][36761] Fps is (10 sec: 45875.2, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 485457920. Throughput: 0: 43942.3. Samples: 485568800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:11:06,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:11:09,328][36979] Signal inference workers to stop experience collection... (7000 times) [2024-07-02 14:11:09,329][36979] Signal inference workers to resume experience collection... (7000 times) [2024-07-02 14:11:09,347][36999] InferenceWorker_p0-w0: stopping experience collection (7000 times) [2024-07-02 14:11:09,347][36999] InferenceWorker_p0-w0: resuming experience collection (7000 times) [2024-07-02 14:11:09,641][36999] Updated weights for policy 0, policy_version 29640 (0.0027) [2024-07-02 14:11:11,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 485670912. Throughput: 0: 43826.8. Samples: 485829200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:11:11,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:11:13,430][36999] Updated weights for policy 0, policy_version 29650 (0.0024) [2024-07-02 14:11:16,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43418.1, 300 sec: 43875.8). Total num frames: 485867520. Throughput: 0: 43990.7. Samples: 485963640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 14:11:16,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:11:17,078][36999] Updated weights for policy 0, policy_version 29660 (0.0036) [2024-07-02 14:11:20,709][36999] Updated weights for policy 0, policy_version 29670 (0.0032) [2024-07-02 14:11:21,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43690.7, 300 sec: 43987.2). Total num frames: 486113280. Throughput: 0: 44054.8. Samples: 486230800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 14:11:21,095][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:11:24,542][36999] Updated weights for policy 0, policy_version 29680 (0.0032) [2024-07-02 14:11:26,095][36761] Fps is (10 sec: 45874.5, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 486326272. Throughput: 0: 44078.2. Samples: 486493860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 14:11:26,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:11:28,095][36999] Updated weights for policy 0, policy_version 29690 (0.0033) [2024-07-02 14:11:31,095][36761] Fps is (10 sec: 44236.4, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 486555648. Throughput: 0: 43951.2. Samples: 486620760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 14:11:31,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:11:32,407][36999] Updated weights for policy 0, policy_version 29700 (0.0034) [2024-07-02 14:11:35,473][36999] Updated weights for policy 0, policy_version 29710 (0.0033) [2024-07-02 14:11:36,098][36761] Fps is (10 sec: 44223.2, 60 sec: 43688.4, 300 sec: 43875.3). Total num frames: 486768640. Throughput: 0: 43925.5. Samples: 486882080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-07-02 14:11:36,099][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:11:39,848][36999] Updated weights for policy 0, policy_version 29720 (0.0035) [2024-07-02 14:11:41,100][36761] Fps is (10 sec: 42579.0, 60 sec: 43687.4, 300 sec: 43986.2). Total num frames: 486981632. Throughput: 0: 43936.3. Samples: 487148920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-07-02 14:11:41,100][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:11:43,397][36999] Updated weights for policy 0, policy_version 29730 (0.0032) [2024-07-02 14:11:46,095][36761] Fps is (10 sec: 42611.9, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 487194624. Throughput: 0: 44108.0. Samples: 487282120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-07-02 14:11:46,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:11:47,189][36999] Updated weights for policy 0, policy_version 29740 (0.0037) [2024-07-02 14:11:50,866][36999] Updated weights for policy 0, policy_version 29750 (0.0031) [2024-07-02 14:11:51,095][36761] Fps is (10 sec: 44257.4, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 487424000. Throughput: 0: 43884.4. Samples: 487543600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-07-02 14:11:51,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:11:54,714][36999] Updated weights for policy 0, policy_version 29760 (0.0049) [2024-07-02 14:11:56,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 487636992. Throughput: 0: 43915.0. Samples: 487805380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:11:56,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:11:58,351][36999] Updated weights for policy 0, policy_version 29770 (0.0031) [2024-07-02 14:12:01,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 487849984. Throughput: 0: 43809.7. Samples: 487935080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:12:01,096][36761] Avg episode reward: [(0, '1.010')] [2024-07-02 14:12:02,195][36999] Updated weights for policy 0, policy_version 29780 (0.0041) [2024-07-02 14:12:05,719][36999] Updated weights for policy 0, policy_version 29790 (0.0021) [2024-07-02 14:12:06,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43690.6, 300 sec: 43875.9). Total num frames: 488079360. Throughput: 0: 43775.0. Samples: 488200680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:12:06,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:12:09,759][36999] Updated weights for policy 0, policy_version 29800 (0.0030) [2024-07-02 14:12:11,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 488292352. Throughput: 0: 43777.4. Samples: 488463840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:12:11,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:12:13,426][36999] Updated weights for policy 0, policy_version 29810 (0.0044) [2024-07-02 14:12:16,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 488505344. Throughput: 0: 43988.0. Samples: 488600220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:12:16,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:12:17,100][36999] Updated weights for policy 0, policy_version 29820 (0.0032) [2024-07-02 14:12:20,847][36999] Updated weights for policy 0, policy_version 29830 (0.0029) [2024-07-02 14:12:21,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43690.6, 300 sec: 43820.3). Total num frames: 488734720. Throughput: 0: 43790.6. Samples: 488852520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:12:21,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:12:24,660][36999] Updated weights for policy 0, policy_version 29840 (0.0039) [2024-07-02 14:12:26,096][36761] Fps is (10 sec: 44234.5, 60 sec: 43690.3, 300 sec: 43820.2). Total num frames: 488947712. Throughput: 0: 43768.0. Samples: 489118300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:12:26,097][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:12:28,223][36999] Updated weights for policy 0, policy_version 29850 (0.0034) [2024-07-02 14:12:31,100][36761] Fps is (10 sec: 44216.7, 60 sec: 43687.4, 300 sec: 43875.1). Total num frames: 489177088. Throughput: 0: 43759.5. Samples: 489251500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:12:31,100][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:12:32,053][36999] Updated weights for policy 0, policy_version 29860 (0.0030) [2024-07-02 14:12:32,596][36979] Signal inference workers to stop experience collection... (7050 times) [2024-07-02 14:12:32,603][36979] Signal inference workers to resume experience collection... (7050 times) [2024-07-02 14:12:32,640][36999] InferenceWorker_p0-w0: stopping experience collection (7050 times) [2024-07-02 14:12:32,640][36999] InferenceWorker_p0-w0: resuming experience collection (7050 times) [2024-07-02 14:12:35,913][36999] Updated weights for policy 0, policy_version 29870 (0.0033) [2024-07-02 14:12:36,095][36761] Fps is (10 sec: 44239.3, 60 sec: 43693.0, 300 sec: 43875.8). Total num frames: 489390080. Throughput: 0: 43700.0. Samples: 489510100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 14:12:36,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:12:36,118][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000029870_489390080.pth... [2024-07-02 14:12:36,183][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000029227_478855168.pth [2024-07-02 14:12:39,342][36999] Updated weights for policy 0, policy_version 29880 (0.0032) [2024-07-02 14:12:41,095][36761] Fps is (10 sec: 44257.2, 60 sec: 43967.1, 300 sec: 43875.8). Total num frames: 489619456. Throughput: 0: 43889.9. Samples: 489780420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 14:12:41,096][36761] Avg episode reward: [(0, '1.037')] [2024-07-02 14:12:43,367][36999] Updated weights for policy 0, policy_version 29890 (0.0045) [2024-07-02 14:12:46,095][36761] Fps is (10 sec: 45874.5, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 489848832. Throughput: 0: 43951.0. Samples: 489912880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 14:12:46,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:12:46,715][36999] Updated weights for policy 0, policy_version 29900 (0.0028) [2024-07-02 14:12:50,675][36999] Updated weights for policy 0, policy_version 29910 (0.0040) [2024-07-02 14:12:51,095][36761] Fps is (10 sec: 44236.2, 60 sec: 43963.6, 300 sec: 43931.3). Total num frames: 490061824. Throughput: 0: 44049.3. Samples: 490182900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 14:12:51,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:12:53,999][36999] Updated weights for policy 0, policy_version 29920 (0.0026) [2024-07-02 14:12:56,095][36761] Fps is (10 sec: 44237.1, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 490291200. Throughput: 0: 43946.6. Samples: 490441440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 14:12:56,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:12:58,016][36999] Updated weights for policy 0, policy_version 29930 (0.0024) [2024-07-02 14:13:01,095][36761] Fps is (10 sec: 44237.6, 60 sec: 44236.8, 300 sec: 43931.3). Total num frames: 490504192. Throughput: 0: 43917.4. Samples: 490576500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 14:13:01,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:13:01,381][36999] Updated weights for policy 0, policy_version 29940 (0.0039) [2024-07-02 14:13:05,648][36999] Updated weights for policy 0, policy_version 29950 (0.0037) [2024-07-02 14:13:06,095][36761] Fps is (10 sec: 40960.4, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 490700800. Throughput: 0: 44136.9. Samples: 490838680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 14:13:06,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:13:08,694][36999] Updated weights for policy 0, policy_version 29960 (0.0040) [2024-07-02 14:13:11,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 490946560. Throughput: 0: 44050.3. Samples: 491100540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 14:13:11,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:13:13,173][36999] Updated weights for policy 0, policy_version 29970 (0.0031) [2024-07-02 14:13:16,095][36761] Fps is (10 sec: 47513.4, 60 sec: 44509.9, 300 sec: 44042.4). Total num frames: 491175936. Throughput: 0: 44068.9. Samples: 491234400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:13:16,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:13:16,440][36999] Updated weights for policy 0, policy_version 29980 (0.0030) [2024-07-02 14:13:20,517][36999] Updated weights for policy 0, policy_version 29990 (0.0040) [2024-07-02 14:13:21,095][36761] Fps is (10 sec: 40960.2, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 491356160. Throughput: 0: 44136.0. Samples: 491496220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:13:21,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:13:23,933][36999] Updated weights for policy 0, policy_version 30000 (0.0025) [2024-07-02 14:13:26,100][36761] Fps is (10 sec: 42579.0, 60 sec: 44233.8, 300 sec: 43875.1). Total num frames: 491601920. Throughput: 0: 43887.9. Samples: 491755580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:13:26,101][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:13:28,163][36999] Updated weights for policy 0, policy_version 30010 (0.0030) [2024-07-02 14:13:31,095][36761] Fps is (10 sec: 47512.7, 60 sec: 44240.1, 300 sec: 44042.4). Total num frames: 491831296. Throughput: 0: 44025.8. Samples: 491894040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:13:31,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:13:31,303][36999] Updated weights for policy 0, policy_version 30020 (0.0027) [2024-07-02 14:13:35,795][36999] Updated weights for policy 0, policy_version 30030 (0.0023) [2024-07-02 14:13:36,095][36761] Fps is (10 sec: 40978.3, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 492011520. Throughput: 0: 43818.7. Samples: 492154740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 14:13:36,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:13:38,910][36999] Updated weights for policy 0, policy_version 30040 (0.0029) [2024-07-02 14:13:41,095][36761] Fps is (10 sec: 42599.0, 60 sec: 43963.7, 300 sec: 43820.3). Total num frames: 492257280. Throughput: 0: 43771.6. Samples: 492411160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 14:13:41,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:13:43,654][36999] Updated weights for policy 0, policy_version 30050 (0.0035) [2024-07-02 14:13:43,889][36979] Signal inference workers to stop experience collection... (7100 times) [2024-07-02 14:13:43,900][36999] InferenceWorker_p0-w0: stopping experience collection (7100 times) [2024-07-02 14:13:43,949][36979] Signal inference workers to resume experience collection... (7100 times) [2024-07-02 14:13:43,950][36999] InferenceWorker_p0-w0: resuming experience collection (7100 times) [2024-07-02 14:13:46,100][36761] Fps is (10 sec: 45854.6, 60 sec: 43687.4, 300 sec: 43986.2). Total num frames: 492470272. Throughput: 0: 43781.7. Samples: 492546880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 14:13:46,100][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:13:46,492][36999] Updated weights for policy 0, policy_version 30060 (0.0033) [2024-07-02 14:13:50,990][36999] Updated weights for policy 0, policy_version 30070 (0.0027) [2024-07-02 14:13:51,096][36761] Fps is (10 sec: 40959.2, 60 sec: 43417.5, 300 sec: 43875.8). Total num frames: 492666880. Throughput: 0: 43723.3. Samples: 492806240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 14:13:51,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:13:53,815][36999] Updated weights for policy 0, policy_version 30080 (0.0034) [2024-07-02 14:13:56,095][36761] Fps is (10 sec: 44257.2, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 492912640. Throughput: 0: 43771.6. Samples: 493070260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-07-02 14:13:56,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:13:58,324][36999] Updated weights for policy 0, policy_version 30090 (0.0028) [2024-07-02 14:14:01,074][36999] Updated weights for policy 0, policy_version 30100 (0.0038) [2024-07-02 14:14:01,100][36761] Fps is (10 sec: 49130.3, 60 sec: 44233.3, 300 sec: 44041.7). Total num frames: 493158400. Throughput: 0: 43870.6. Samples: 493208780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-07-02 14:14:01,100][36761] Avg episode reward: [(0, '1.039')] [2024-07-02 14:14:05,663][36999] Updated weights for policy 0, policy_version 30110 (0.0028) [2024-07-02 14:14:06,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43963.8, 300 sec: 43931.4). Total num frames: 493338624. Throughput: 0: 43953.8. Samples: 493474140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-07-02 14:14:06,096][36761] Avg episode reward: [(0, '1.039')] [2024-07-02 14:14:08,686][36999] Updated weights for policy 0, policy_version 30120 (0.0031) [2024-07-02 14:14:11,095][36761] Fps is (10 sec: 42618.3, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 493584384. Throughput: 0: 43959.6. Samples: 493733560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-07-02 14:14:11,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:14:13,048][36999] Updated weights for policy 0, policy_version 30130 (0.0026) [2024-07-02 14:14:16,061][36999] Updated weights for policy 0, policy_version 30140 (0.0047) [2024-07-02 14:14:16,095][36761] Fps is (10 sec: 47512.7, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 493813760. Throughput: 0: 44044.0. Samples: 493876020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 14:14:16,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:14:20,998][36999] Updated weights for policy 0, policy_version 30150 (0.0022) [2024-07-02 14:14:21,095][36761] Fps is (10 sec: 39321.1, 60 sec: 43690.6, 300 sec: 43820.2). Total num frames: 493977600. Throughput: 0: 43948.0. Samples: 494132400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 14:14:21,097][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:14:23,466][36999] Updated weights for policy 0, policy_version 30160 (0.0046) [2024-07-02 14:14:26,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43967.0, 300 sec: 43931.3). Total num frames: 494239744. Throughput: 0: 43976.8. Samples: 494390120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 14:14:26,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:14:28,345][36999] Updated weights for policy 0, policy_version 30170 (0.0037) [2024-07-02 14:14:30,787][36999] Updated weights for policy 0, policy_version 30180 (0.0031) [2024-07-02 14:14:31,095][36761] Fps is (10 sec: 49152.3, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 494469120. Throughput: 0: 44088.0. Samples: 494530640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 14:14:31,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:14:35,690][36999] Updated weights for policy 0, policy_version 30190 (0.0032) [2024-07-02 14:14:36,100][36761] Fps is (10 sec: 40941.5, 60 sec: 43960.4, 300 sec: 43819.6). Total num frames: 494649344. Throughput: 0: 44100.1. Samples: 494790940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 14:14:36,100][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:14:36,113][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000030191_494649344.pth... [2024-07-02 14:14:36,162][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000029550_484147200.pth [2024-07-02 14:14:38,159][36999] Updated weights for policy 0, policy_version 30200 (0.0033) [2024-07-02 14:14:41,100][36761] Fps is (10 sec: 42578.6, 60 sec: 43960.3, 300 sec: 43875.1). Total num frames: 494895104. Throughput: 0: 43985.2. Samples: 495049800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 14:14:41,101][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:14:43,048][36999] Updated weights for policy 0, policy_version 30210 (0.0033) [2024-07-02 14:14:45,626][36999] Updated weights for policy 0, policy_version 30220 (0.0026) [2024-07-02 14:14:46,096][36761] Fps is (10 sec: 47534.5, 60 sec: 44240.1, 300 sec: 43986.8). Total num frames: 495124480. Throughput: 0: 44052.8. Samples: 495190960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 14:14:46,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:14:50,446][36999] Updated weights for policy 0, policy_version 30230 (0.0028) [2024-07-02 14:14:51,095][36761] Fps is (10 sec: 42617.9, 60 sec: 44236.9, 300 sec: 43931.3). Total num frames: 495321088. Throughput: 0: 44062.9. Samples: 495456980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-07-02 14:14:51,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:14:52,991][36999] Updated weights for policy 0, policy_version 30240 (0.0025) [2024-07-02 14:14:56,096][36761] Fps is (10 sec: 42598.4, 60 sec: 43963.6, 300 sec: 43875.8). Total num frames: 495550464. Throughput: 0: 44023.3. Samples: 495714620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-07-02 14:14:56,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:14:57,735][36999] Updated weights for policy 0, policy_version 30250 (0.0042) [2024-07-02 14:15:00,519][36999] Updated weights for policy 0, policy_version 30260 (0.0028) [2024-07-02 14:15:01,095][36761] Fps is (10 sec: 47514.2, 60 sec: 43967.1, 300 sec: 43986.9). Total num frames: 495796224. Throughput: 0: 43853.9. Samples: 495849440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-07-02 14:15:01,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:15:05,178][36999] Updated weights for policy 0, policy_version 30270 (0.0032) [2024-07-02 14:15:06,095][36761] Fps is (10 sec: 42599.3, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 495976448. Throughput: 0: 44033.4. Samples: 496113900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-07-02 14:15:06,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:15:08,182][36999] Updated weights for policy 0, policy_version 30280 (0.0038) [2024-07-02 14:15:11,095][36761] Fps is (10 sec: 39321.7, 60 sec: 43417.6, 300 sec: 43820.4). Total num frames: 496189440. Throughput: 0: 44060.6. Samples: 496372840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-07-02 14:15:11,096][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 14:15:12,547][36999] Updated weights for policy 0, policy_version 30290 (0.0029) [2024-07-02 14:15:15,596][36999] Updated weights for policy 0, policy_version 30300 (0.0035) [2024-07-02 14:15:16,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 496435200. Throughput: 0: 43944.0. Samples: 496508120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:15:16,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:15:19,945][36999] Updated weights for policy 0, policy_version 30310 (0.0035) [2024-07-02 14:15:21,095][36761] Fps is (10 sec: 44236.7, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 496631808. Throughput: 0: 43946.7. Samples: 496768340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:15:21,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 14:15:22,749][36979] Signal inference workers to stop experience collection... (7150 times) [2024-07-02 14:15:22,749][36979] Signal inference workers to resume experience collection... (7150 times) [2024-07-02 14:15:22,778][36999] InferenceWorker_p0-w0: stopping experience collection (7150 times) [2024-07-02 14:15:22,778][36999] InferenceWorker_p0-w0: resuming experience collection (7150 times) [2024-07-02 14:15:23,210][36999] Updated weights for policy 0, policy_version 30320 (0.0040) [2024-07-02 14:15:26,095][36761] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 43931.3). Total num frames: 496861184. Throughput: 0: 44052.0. Samples: 497031940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:15:26,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 14:15:27,440][36999] Updated weights for policy 0, policy_version 30330 (0.0026) [2024-07-02 14:15:30,614][36999] Updated weights for policy 0, policy_version 30340 (0.0031) [2024-07-02 14:15:31,095][36761] Fps is (10 sec: 49151.9, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 497123328. Throughput: 0: 43846.4. Samples: 497164040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:15:31,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:15:35,002][36999] Updated weights for policy 0, policy_version 30350 (0.0028) [2024-07-02 14:15:36,095][36761] Fps is (10 sec: 44237.1, 60 sec: 44240.2, 300 sec: 43875.8). Total num frames: 497303552. Throughput: 0: 43941.8. Samples: 497434360. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-07-02 14:15:36,096][36761] Avg episode reward: [(0, '1.004')] [2024-07-02 14:15:38,110][36999] Updated weights for policy 0, policy_version 30360 (0.0039) [2024-07-02 14:15:41,095][36761] Fps is (10 sec: 39321.4, 60 sec: 43694.0, 300 sec: 43875.8). Total num frames: 497516544. Throughput: 0: 43861.4. Samples: 497688380. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-07-02 14:15:41,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:15:42,422][36999] Updated weights for policy 0, policy_version 30370 (0.0024) [2024-07-02 14:15:45,445][36999] Updated weights for policy 0, policy_version 30380 (0.0026) [2024-07-02 14:15:46,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43963.9, 300 sec: 43931.4). Total num frames: 497762304. Throughput: 0: 43915.6. Samples: 497825640. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-07-02 14:15:46,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:15:50,223][36999] Updated weights for policy 0, policy_version 30390 (0.0034) [2024-07-02 14:15:51,100][36761] Fps is (10 sec: 44216.8, 60 sec: 43960.4, 300 sec: 43930.6). Total num frames: 497958912. Throughput: 0: 44013.3. Samples: 498094700. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-07-02 14:15:51,101][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:15:52,887][36999] Updated weights for policy 0, policy_version 30400 (0.0036) [2024-07-02 14:15:56,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43690.8, 300 sec: 43931.3). Total num frames: 498171904. Throughput: 0: 43980.0. Samples: 498351940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 14:15:56,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:15:57,501][36999] Updated weights for policy 0, policy_version 30410 (0.0029) [2024-07-02 14:16:00,581][36999] Updated weights for policy 0, policy_version 30420 (0.0028) [2024-07-02 14:16:01,099][36761] Fps is (10 sec: 47519.8, 60 sec: 43961.3, 300 sec: 43986.4). Total num frames: 498434048. Throughput: 0: 43944.8. Samples: 498485780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 14:16:01,099][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:16:05,076][36999] Updated weights for policy 0, policy_version 30430 (0.0031) [2024-07-02 14:16:06,095][36761] Fps is (10 sec: 45874.9, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 498630656. Throughput: 0: 44052.8. Samples: 498750720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 14:16:06,097][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:16:07,841][36999] Updated weights for policy 0, policy_version 30440 (0.0034) [2024-07-02 14:16:11,095][36761] Fps is (10 sec: 39334.3, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 498827264. Throughput: 0: 43926.3. Samples: 499008620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 14:16:11,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:16:12,478][36999] Updated weights for policy 0, policy_version 30450 (0.0049) [2024-07-02 14:16:15,484][36999] Updated weights for policy 0, policy_version 30460 (0.0030) [2024-07-02 14:16:16,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 499073024. Throughput: 0: 43909.3. Samples: 499139960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-07-02 14:16:16,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:16:19,722][36999] Updated weights for policy 0, policy_version 30470 (0.0034) [2024-07-02 14:16:21,095][36761] Fps is (10 sec: 47514.0, 60 sec: 44509.9, 300 sec: 43986.9). Total num frames: 499302400. Throughput: 0: 43823.6. Samples: 499406420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-07-02 14:16:21,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:16:23,103][36999] Updated weights for policy 0, policy_version 30480 (0.0031) [2024-07-02 14:16:26,095][36761] Fps is (10 sec: 40960.4, 60 sec: 43690.8, 300 sec: 43820.3). Total num frames: 499482624. Throughput: 0: 43998.3. Samples: 499668300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-07-02 14:16:26,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:16:27,117][36999] Updated weights for policy 0, policy_version 30490 (0.0036) [2024-07-02 14:16:30,422][36999] Updated weights for policy 0, policy_version 30500 (0.0026) [2024-07-02 14:16:31,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43417.5, 300 sec: 43931.8). Total num frames: 499728384. Throughput: 0: 43845.2. Samples: 499798680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-07-02 14:16:31,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:16:33,814][36979] Signal inference workers to stop experience collection... (7200 times) [2024-07-02 14:16:33,815][36979] Signal inference workers to resume experience collection... (7200 times) [2024-07-02 14:16:33,828][36999] InferenceWorker_p0-w0: stopping experience collection (7200 times) [2024-07-02 14:16:33,856][36999] InferenceWorker_p0-w0: resuming experience collection (7200 times) [2024-07-02 14:16:34,456][36999] Updated weights for policy 0, policy_version 30510 (0.0042) [2024-07-02 14:16:36,095][36761] Fps is (10 sec: 47513.3, 60 sec: 44236.8, 300 sec: 43987.6). Total num frames: 499957760. Throughput: 0: 43815.1. Samples: 500066180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 14:16:36,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:16:36,118][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000030515_499957760.pth... [2024-07-02 14:16:36,175][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000029870_489390080.pth [2024-07-02 14:16:37,833][36999] Updated weights for policy 0, policy_version 30520 (0.0020) [2024-07-02 14:16:41,095][36761] Fps is (10 sec: 40960.8, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 500137984. Throughput: 0: 43976.1. Samples: 500330860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 14:16:41,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:16:41,785][36999] Updated weights for policy 0, policy_version 30530 (0.0041) [2024-07-02 14:16:45,227][36999] Updated weights for policy 0, policy_version 30540 (0.0036) [2024-07-02 14:16:46,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 500400128. Throughput: 0: 43861.4. Samples: 500459400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 14:16:46,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:16:49,264][36999] Updated weights for policy 0, policy_version 30550 (0.0031) [2024-07-02 14:16:51,095][36761] Fps is (10 sec: 45875.0, 60 sec: 43967.1, 300 sec: 43931.3). Total num frames: 500596736. Throughput: 0: 43911.2. Samples: 500726720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 14:16:51,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:16:52,553][36999] Updated weights for policy 0, policy_version 30560 (0.0026) [2024-07-02 14:16:56,098][36761] Fps is (10 sec: 40947.8, 60 sec: 43961.5, 300 sec: 43930.9). Total num frames: 500809728. Throughput: 0: 44050.0. Samples: 500991000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 14:16:56,099][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:16:57,163][36999] Updated weights for policy 0, policy_version 30570 (0.0028) [2024-07-02 14:17:00,033][36999] Updated weights for policy 0, policy_version 30580 (0.0029) [2024-07-02 14:17:01,095][36761] Fps is (10 sec: 45875.2, 60 sec: 43693.1, 300 sec: 43986.9). Total num frames: 501055488. Throughput: 0: 43954.3. Samples: 501117900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 14:17:01,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:17:04,499][36999] Updated weights for policy 0, policy_version 30590 (0.0028) [2024-07-02 14:17:06,095][36761] Fps is (10 sec: 45888.7, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 501268480. Throughput: 0: 43970.5. Samples: 501385100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 14:17:06,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:17:07,316][36999] Updated weights for policy 0, policy_version 30600 (0.0025) [2024-07-02 14:17:11,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 501465088. Throughput: 0: 44044.5. Samples: 501650300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 14:17:11,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:17:11,825][36999] Updated weights for policy 0, policy_version 30610 (0.0037) [2024-07-02 14:17:14,985][36999] Updated weights for policy 0, policy_version 30620 (0.0028) [2024-07-02 14:17:16,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 501710848. Throughput: 0: 44005.9. Samples: 501778940. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-07-02 14:17:16,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:17:19,223][36999] Updated weights for policy 0, policy_version 30630 (0.0033) [2024-07-02 14:17:21,095][36761] Fps is (10 sec: 45874.8, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 501923840. Throughput: 0: 43903.6. Samples: 502041840. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-07-02 14:17:21,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:17:22,380][36999] Updated weights for policy 0, policy_version 30640 (0.0030) [2024-07-02 14:17:26,095][36761] Fps is (10 sec: 40960.4, 60 sec: 43963.8, 300 sec: 43876.5). Total num frames: 502120448. Throughput: 0: 44030.2. Samples: 502312220. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-07-02 14:17:26,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:17:26,601][36999] Updated weights for policy 0, policy_version 30650 (0.0042) [2024-07-02 14:17:29,779][36999] Updated weights for policy 0, policy_version 30660 (0.0049) [2024-07-02 14:17:31,096][36761] Fps is (10 sec: 45873.0, 60 sec: 44236.5, 300 sec: 44042.3). Total num frames: 502382592. Throughput: 0: 43898.2. Samples: 502434840. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-07-02 14:17:31,096][36761] Avg episode reward: [(0, '1.038')] [2024-07-02 14:17:34,215][36999] Updated weights for policy 0, policy_version 30670 (0.0032) [2024-07-02 14:17:36,095][36761] Fps is (10 sec: 47512.7, 60 sec: 43963.7, 300 sec: 43986.8). Total num frames: 502595584. Throughput: 0: 43934.5. Samples: 502703780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:17:36,096][36761] Avg episode reward: [(0, '1.041')] [2024-07-02 14:17:37,081][36999] Updated weights for policy 0, policy_version 30680 (0.0038) [2024-07-02 14:17:41,095][36761] Fps is (10 sec: 40961.9, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 502792192. Throughput: 0: 44048.3. Samples: 502973040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:17:41,097][36761] Avg episode reward: [(0, '1.037')] [2024-07-02 14:17:41,533][36999] Updated weights for policy 0, policy_version 30690 (0.0024) [2024-07-02 14:17:44,540][36999] Updated weights for policy 0, policy_version 30700 (0.0029) [2024-07-02 14:17:46,095][36761] Fps is (10 sec: 44237.6, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 503037952. Throughput: 0: 44061.8. Samples: 503100680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:17:46,100][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:17:48,835][36999] Updated weights for policy 0, policy_version 30710 (0.0043) [2024-07-02 14:17:51,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 503234560. Throughput: 0: 43973.5. Samples: 503363900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:17:51,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:17:51,742][36979] Signal inference workers to stop experience collection... (7250 times) [2024-07-02 14:17:51,742][36979] Signal inference workers to resume experience collection... (7250 times) [2024-07-02 14:17:51,764][36999] InferenceWorker_p0-w0: stopping experience collection (7250 times) [2024-07-02 14:17:51,765][36999] InferenceWorker_p0-w0: resuming experience collection (7250 times) [2024-07-02 14:17:51,886][36999] Updated weights for policy 0, policy_version 30720 (0.0035) [2024-07-02 14:17:56,095][36761] Fps is (10 sec: 42598.4, 60 sec: 44239.1, 300 sec: 43931.3). Total num frames: 503463936. Throughput: 0: 44093.8. Samples: 503634520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 14:17:56,095][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:17:56,118][36999] Updated weights for policy 0, policy_version 30730 (0.0041) [2024-07-02 14:17:59,199][36999] Updated weights for policy 0, policy_version 30740 (0.0027) [2024-07-02 14:18:01,095][36761] Fps is (10 sec: 45874.6, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 503693312. Throughput: 0: 44113.7. Samples: 503764060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 14:18:01,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:18:03,988][36999] Updated weights for policy 0, policy_version 30750 (0.0028) [2024-07-02 14:18:06,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 503906304. Throughput: 0: 44016.5. Samples: 504022580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 14:18:06,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:18:07,084][36999] Updated weights for policy 0, policy_version 30760 (0.0044) [2024-07-02 14:18:11,095][36761] Fps is (10 sec: 40960.2, 60 sec: 43963.7, 300 sec: 43820.3). Total num frames: 504102912. Throughput: 0: 44022.6. Samples: 504293240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-07-02 14:18:11,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:18:11,460][36999] Updated weights for policy 0, policy_version 30770 (0.0039) [2024-07-02 14:18:14,488][36999] Updated weights for policy 0, policy_version 30780 (0.0029) [2024-07-02 14:18:16,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 504348672. Throughput: 0: 44063.2. Samples: 504417660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 27.0) [2024-07-02 14:18:16,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:18:18,847][36999] Updated weights for policy 0, policy_version 30790 (0.0025) [2024-07-02 14:18:21,095][36761] Fps is (10 sec: 47513.4, 60 sec: 44236.8, 300 sec: 43987.5). Total num frames: 504578048. Throughput: 0: 44048.9. Samples: 504685980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 27.0) [2024-07-02 14:18:21,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:18:21,837][36999] Updated weights for policy 0, policy_version 30800 (0.0035) [2024-07-02 14:18:26,095][36761] Fps is (10 sec: 42598.0, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 504774656. Throughput: 0: 44075.5. Samples: 504956440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 27.0) [2024-07-02 14:18:26,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:18:26,302][36999] Updated weights for policy 0, policy_version 30810 (0.0027) [2024-07-02 14:18:29,099][36999] Updated weights for policy 0, policy_version 30820 (0.0026) [2024-07-02 14:18:31,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43691.0, 300 sec: 44042.4). Total num frames: 505004032. Throughput: 0: 44017.3. Samples: 505081460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 27.0) [2024-07-02 14:18:31,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:18:33,559][36999] Updated weights for policy 0, policy_version 30830 (0.0034) [2024-07-02 14:18:36,095][36761] Fps is (10 sec: 47514.1, 60 sec: 44236.9, 300 sec: 44042.4). Total num frames: 505249792. Throughput: 0: 44210.6. Samples: 505353380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 14:18:36,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:18:36,108][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000030838_505249792.pth... [2024-07-02 14:18:36,162][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000030191_494649344.pth [2024-07-02 14:18:36,497][36999] Updated weights for policy 0, policy_version 30840 (0.0037) [2024-07-02 14:18:41,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43963.7, 300 sec: 43932.0). Total num frames: 505430016. Throughput: 0: 44076.8. Samples: 505617980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 14:18:41,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:18:41,125][36999] Updated weights for policy 0, policy_version 30850 (0.0029) [2024-07-02 14:18:44,107][36999] Updated weights for policy 0, policy_version 30860 (0.0029) [2024-07-02 14:18:46,096][36761] Fps is (10 sec: 40957.4, 60 sec: 43690.2, 300 sec: 44042.3). Total num frames: 505659392. Throughput: 0: 43901.3. Samples: 505739640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 14:18:46,097][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:18:48,847][36999] Updated weights for policy 0, policy_version 30870 (0.0027) [2024-07-02 14:18:51,096][36761] Fps is (10 sec: 47512.8, 60 sec: 44509.7, 300 sec: 44042.4). Total num frames: 505905152. Throughput: 0: 44002.5. Samples: 506002700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 14:18:51,096][36761] Avg episode reward: [(0, '1.038')] [2024-07-02 14:18:51,826][36999] Updated weights for policy 0, policy_version 30880 (0.0028) [2024-07-02 14:18:56,095][36761] Fps is (10 sec: 42601.2, 60 sec: 43690.7, 300 sec: 43821.0). Total num frames: 506085376. Throughput: 0: 44016.1. Samples: 506273960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 14:18:56,095][36761] Avg episode reward: [(0, '1.038')] [2024-07-02 14:18:56,171][36999] Updated weights for policy 0, policy_version 30890 (0.0022) [2024-07-02 14:18:59,115][36999] Updated weights for policy 0, policy_version 30900 (0.0042) [2024-07-02 14:19:01,095][36761] Fps is (10 sec: 40960.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 506314752. Throughput: 0: 44082.2. Samples: 506401360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 14:19:01,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:19:03,470][36999] Updated weights for policy 0, policy_version 30910 (0.0021) [2024-07-02 14:19:06,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 506560512. Throughput: 0: 43989.0. Samples: 506665480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 14:19:06,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 14:19:06,592][36999] Updated weights for policy 0, policy_version 30920 (0.0038) [2024-07-02 14:19:10,908][36999] Updated weights for policy 0, policy_version 30930 (0.0033) [2024-07-02 14:19:11,095][36761] Fps is (10 sec: 44237.3, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 506757120. Throughput: 0: 44049.5. Samples: 506938660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 14:19:11,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:19:13,960][36999] Updated weights for policy 0, policy_version 30940 (0.0028) [2024-07-02 14:19:16,095][36761] Fps is (10 sec: 40959.5, 60 sec: 43690.6, 300 sec: 44042.4). Total num frames: 506970112. Throughput: 0: 43956.0. Samples: 507059480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 14:19:16,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:19:18,250][36999] Updated weights for policy 0, policy_version 30950 (0.0033) [2024-07-02 14:19:21,095][36761] Fps is (10 sec: 47512.9, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 507232256. Throughput: 0: 43843.0. Samples: 507326320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 14:19:21,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:19:21,211][36999] Updated weights for policy 0, policy_version 30960 (0.0028) [2024-07-02 14:19:25,772][36999] Updated weights for policy 0, policy_version 30970 (0.0032) [2024-07-02 14:19:26,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 507412480. Throughput: 0: 43961.8. Samples: 507596260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 14:19:26,096][36761] Avg episode reward: [(0, '1.039')] [2024-07-02 14:19:28,738][36999] Updated weights for policy 0, policy_version 30980 (0.0029) [2024-07-02 14:19:31,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43963.7, 300 sec: 44043.1). Total num frames: 507641856. Throughput: 0: 43972.5. Samples: 507718380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 14:19:31,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:19:32,259][36979] Signal inference workers to stop experience collection... (7300 times) [2024-07-02 14:19:32,307][36999] InferenceWorker_p0-w0: stopping experience collection (7300 times) [2024-07-02 14:19:32,314][36979] Signal inference workers to resume experience collection... (7300 times) [2024-07-02 14:19:32,328][36999] InferenceWorker_p0-w0: resuming experience collection (7300 times) [2024-07-02 14:19:33,483][36999] Updated weights for policy 0, policy_version 30990 (0.0030) [2024-07-02 14:19:35,941][36999] Updated weights for policy 0, policy_version 31000 (0.0038) [2024-07-02 14:19:36,095][36761] Fps is (10 sec: 49152.6, 60 sec: 44236.8, 300 sec: 44098.7). Total num frames: 507904000. Throughput: 0: 44283.8. Samples: 507995460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 14:19:36,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:19:40,764][36999] Updated weights for policy 0, policy_version 31010 (0.0033) [2024-07-02 14:19:41,097][36761] Fps is (10 sec: 44231.1, 60 sec: 44235.8, 300 sec: 43931.2). Total num frames: 508084224. Throughput: 0: 44215.9. Samples: 508263740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 14:19:41,097][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:19:43,366][36999] Updated weights for policy 0, policy_version 31020 (0.0044) [2024-07-02 14:19:46,095][36761] Fps is (10 sec: 40959.7, 60 sec: 44237.2, 300 sec: 44042.4). Total num frames: 508313600. Throughput: 0: 44034.3. Samples: 508382900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 14:19:46,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:19:48,089][36999] Updated weights for policy 0, policy_version 31030 (0.0037) [2024-07-02 14:19:50,769][36999] Updated weights for policy 0, policy_version 31040 (0.0039) [2024-07-02 14:19:51,095][36761] Fps is (10 sec: 49158.4, 60 sec: 44509.9, 300 sec: 44153.5). Total num frames: 508575744. Throughput: 0: 44250.1. Samples: 508656740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 14:19:51,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:19:55,494][36999] Updated weights for policy 0, policy_version 31050 (0.0025) [2024-07-02 14:19:56,095][36761] Fps is (10 sec: 42598.0, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 508739584. Throughput: 0: 44170.1. Samples: 508926320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:19:56,096][36761] Avg episode reward: [(0, '1.006')] [2024-07-02 14:19:58,131][36999] Updated weights for policy 0, policy_version 31060 (0.0022) [2024-07-02 14:20:01,095][36761] Fps is (10 sec: 39322.1, 60 sec: 44236.9, 300 sec: 44042.4). Total num frames: 508968960. Throughput: 0: 44117.4. Samples: 509044760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:20:01,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:20:03,124][36999] Updated weights for policy 0, policy_version 31070 (0.0025) [2024-07-02 14:20:05,653][36999] Updated weights for policy 0, policy_version 31080 (0.0023) [2024-07-02 14:20:06,095][36761] Fps is (10 sec: 49152.0, 60 sec: 44509.7, 300 sec: 44209.0). Total num frames: 509231104. Throughput: 0: 44190.7. Samples: 509314900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:20:06,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:20:10,499][36999] Updated weights for policy 0, policy_version 31090 (0.0039) [2024-07-02 14:20:11,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 509394944. Throughput: 0: 44153.4. Samples: 509583160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:20:11,095][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:20:13,262][36999] Updated weights for policy 0, policy_version 31100 (0.0037) [2024-07-02 14:20:16,096][36761] Fps is (10 sec: 40958.1, 60 sec: 44509.5, 300 sec: 44097.9). Total num frames: 509640704. Throughput: 0: 44237.3. Samples: 509709080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-07-02 14:20:16,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:20:17,802][36999] Updated weights for policy 0, policy_version 31110 (0.0033) [2024-07-02 14:20:20,586][36999] Updated weights for policy 0, policy_version 31120 (0.0025) [2024-07-02 14:20:21,095][36761] Fps is (10 sec: 47513.4, 60 sec: 43963.8, 300 sec: 44098.0). Total num frames: 509870080. Throughput: 0: 43906.6. Samples: 509971260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-07-02 14:20:21,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:20:25,165][36999] Updated weights for policy 0, policy_version 31130 (0.0038) [2024-07-02 14:20:26,095][36761] Fps is (10 sec: 42601.1, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 510066688. Throughput: 0: 44021.9. Samples: 510244660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-07-02 14:20:26,095][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:20:28,534][36999] Updated weights for policy 0, policy_version 31140 (0.0030) [2024-07-02 14:20:31,095][36761] Fps is (10 sec: 42598.5, 60 sec: 44236.9, 300 sec: 44042.4). Total num frames: 510296064. Throughput: 0: 44067.2. Samples: 510365920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-07-02 14:20:31,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:20:32,584][36999] Updated weights for policy 0, policy_version 31150 (0.0031) [2024-07-02 14:20:35,966][36999] Updated weights for policy 0, policy_version 31160 (0.0021) [2024-07-02 14:20:36,095][36761] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 510525440. Throughput: 0: 44036.5. Samples: 510638380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 14:20:36,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:20:36,118][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000031160_510525440.pth... [2024-07-02 14:20:36,188][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000030515_499957760.pth [2024-07-02 14:20:37,626][36979] Signal inference workers to stop experience collection... (7350 times) [2024-07-02 14:20:37,626][36979] Signal inference workers to resume experience collection... (7350 times) [2024-07-02 14:20:37,658][36999] InferenceWorker_p0-w0: stopping experience collection (7350 times) [2024-07-02 14:20:37,658][36999] InferenceWorker_p0-w0: resuming experience collection (7350 times) [2024-07-02 14:20:39,977][36999] Updated weights for policy 0, policy_version 31170 (0.0042) [2024-07-02 14:20:41,095][36761] Fps is (10 sec: 44236.2, 60 sec: 44237.8, 300 sec: 43986.9). Total num frames: 510738432. Throughput: 0: 43902.2. Samples: 510901920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 14:20:41,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:20:43,394][36999] Updated weights for policy 0, policy_version 31180 (0.0042) [2024-07-02 14:20:46,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43963.8, 300 sec: 44043.1). Total num frames: 510951424. Throughput: 0: 44052.0. Samples: 511027100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 14:20:46,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:20:47,202][36999] Updated weights for policy 0, policy_version 31190 (0.0040) [2024-07-02 14:20:50,598][36999] Updated weights for policy 0, policy_version 31200 (0.0028) [2024-07-02 14:20:51,095][36761] Fps is (10 sec: 44237.5, 60 sec: 43417.7, 300 sec: 44098.0). Total num frames: 511180800. Throughput: 0: 43900.6. Samples: 511290420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 14:20:51,095][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:20:54,787][36999] Updated weights for policy 0, policy_version 31210 (0.0036) [2024-07-02 14:20:56,098][36761] Fps is (10 sec: 44225.7, 60 sec: 44235.1, 300 sec: 43931.5). Total num frames: 511393792. Throughput: 0: 44052.6. Samples: 511565640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 14:20:56,098][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 14:20:57,938][36999] Updated weights for policy 0, policy_version 31220 (0.0031) [2024-07-02 14:21:01,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 511606784. Throughput: 0: 44040.2. Samples: 511690860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-07-02 14:21:01,095][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:21:02,150][36999] Updated weights for policy 0, policy_version 31230 (0.0033) [2024-07-02 14:21:05,419][36999] Updated weights for policy 0, policy_version 31240 (0.0032) [2024-07-02 14:21:06,095][36761] Fps is (10 sec: 44247.4, 60 sec: 43417.6, 300 sec: 44097.9). Total num frames: 511836160. Throughput: 0: 43971.9. Samples: 511950000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-07-02 14:21:06,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:21:09,571][36999] Updated weights for policy 0, policy_version 31250 (0.0036) [2024-07-02 14:21:11,095][36761] Fps is (10 sec: 44236.0, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 512049152. Throughput: 0: 43813.6. Samples: 512216280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-07-02 14:21:11,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:21:13,302][36999] Updated weights for policy 0, policy_version 31260 (0.0026) [2024-07-02 14:21:16,095][36761] Fps is (10 sec: 40960.5, 60 sec: 43418.0, 300 sec: 43875.8). Total num frames: 512245760. Throughput: 0: 43817.3. Samples: 512337700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-07-02 14:21:16,096][36761] Avg episode reward: [(0, '1.036')] [2024-07-02 14:21:17,415][36999] Updated weights for policy 0, policy_version 31270 (0.0039) [2024-07-02 14:21:20,840][36999] Updated weights for policy 0, policy_version 31280 (0.0040) [2024-07-02 14:21:21,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 512491520. Throughput: 0: 43614.6. Samples: 512601040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 14:21:21,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:21:24,609][36999] Updated weights for policy 0, policy_version 31290 (0.0033) [2024-07-02 14:21:26,095][36761] Fps is (10 sec: 45875.4, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 512704512. Throughput: 0: 43918.8. Samples: 512878260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 14:21:26,095][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 14:21:28,154][36999] Updated weights for policy 0, policy_version 31300 (0.0022) [2024-07-02 14:21:31,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 512933888. Throughput: 0: 43900.4. Samples: 513002620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 14:21:31,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 14:21:32,041][36999] Updated weights for policy 0, policy_version 31310 (0.0043) [2024-07-02 14:21:35,595][36999] Updated weights for policy 0, policy_version 31320 (0.0036) [2024-07-02 14:21:36,096][36761] Fps is (10 sec: 45871.6, 60 sec: 43963.2, 300 sec: 44153.4). Total num frames: 513163264. Throughput: 0: 43825.0. Samples: 513262580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 14:21:36,097][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:21:39,530][36999] Updated weights for policy 0, policy_version 31330 (0.0029) [2024-07-02 14:21:41,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 513359872. Throughput: 0: 43720.6. Samples: 513532960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 14:21:41,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:21:43,011][36999] Updated weights for policy 0, policy_version 31340 (0.0036) [2024-07-02 14:21:46,095][36761] Fps is (10 sec: 40963.1, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 513572864. Throughput: 0: 43796.0. Samples: 513661680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 14:21:46,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:21:46,823][36999] Updated weights for policy 0, policy_version 31350 (0.0040) [2024-07-02 14:21:50,395][36999] Updated weights for policy 0, policy_version 31360 (0.0033) [2024-07-02 14:21:51,100][36761] Fps is (10 sec: 45854.2, 60 sec: 43960.3, 300 sec: 44097.7). Total num frames: 513818624. Throughput: 0: 43846.3. Samples: 513923280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 14:21:51,101][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:21:54,260][36999] Updated weights for policy 0, policy_version 31370 (0.0042) [2024-07-02 14:21:56,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43692.5, 300 sec: 43931.3). Total num frames: 514015232. Throughput: 0: 43836.6. Samples: 514188920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 14:21:56,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:21:57,824][36999] Updated weights for policy 0, policy_version 31380 (0.0031) [2024-07-02 14:22:01,095][36761] Fps is (10 sec: 42618.2, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 514244608. Throughput: 0: 43961.8. Samples: 514315980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 14:22:01,096][36761] Avg episode reward: [(0, '1.008')] [2024-07-02 14:22:01,526][36999] Updated weights for policy 0, policy_version 31390 (0.0035) [2024-07-02 14:22:03,756][36979] Signal inference workers to stop experience collection... (7400 times) [2024-07-02 14:22:03,757][36979] Signal inference workers to resume experience collection... (7400 times) [2024-07-02 14:22:03,787][36999] InferenceWorker_p0-w0: stopping experience collection (7400 times) [2024-07-02 14:22:03,788][36999] InferenceWorker_p0-w0: resuming experience collection (7400 times) [2024-07-02 14:22:05,335][36999] Updated weights for policy 0, policy_version 31400 (0.0029) [2024-07-02 14:22:06,095][36761] Fps is (10 sec: 45875.0, 60 sec: 43963.8, 300 sec: 44097.9). Total num frames: 514473984. Throughput: 0: 43987.2. Samples: 514580460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 14:22:06,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:22:09,443][36999] Updated weights for policy 0, policy_version 31410 (0.0031) [2024-07-02 14:22:11,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43690.8, 300 sec: 43931.3). Total num frames: 514670592. Throughput: 0: 43851.1. Samples: 514851560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 14:22:11,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:22:12,866][36999] Updated weights for policy 0, policy_version 31420 (0.0027) [2024-07-02 14:22:16,095][36761] Fps is (10 sec: 44236.7, 60 sec: 44509.8, 300 sec: 44042.4). Total num frames: 514916352. Throughput: 0: 43868.0. Samples: 514976680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-07-02 14:22:16,096][36761] Avg episode reward: [(0, '0.984')] [2024-07-02 14:22:17,015][36999] Updated weights for policy 0, policy_version 31430 (0.0031) [2024-07-02 14:22:20,132][36999] Updated weights for policy 0, policy_version 31440 (0.0037) [2024-07-02 14:22:21,095][36761] Fps is (10 sec: 45874.3, 60 sec: 43963.7, 300 sec: 44097.9). Total num frames: 515129344. Throughput: 0: 43957.4. Samples: 515240640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 14:22:21,096][36761] Avg episode reward: [(0, '0.979')] [2024-07-02 14:22:24,395][36999] Updated weights for policy 0, policy_version 31450 (0.0038) [2024-07-02 14:22:26,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43690.6, 300 sec: 43875.9). Total num frames: 515325952. Throughput: 0: 43855.6. Samples: 515506460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 14:22:26,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:22:27,928][36999] Updated weights for policy 0, policy_version 31460 (0.0027) [2024-07-02 14:22:31,095][36761] Fps is (10 sec: 42599.2, 60 sec: 43690.7, 300 sec: 43931.4). Total num frames: 515555328. Throughput: 0: 43840.4. Samples: 515634500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 14:22:31,095][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:22:31,933][36999] Updated weights for policy 0, policy_version 31470 (0.0020) [2024-07-02 14:22:35,368][36999] Updated weights for policy 0, policy_version 31480 (0.0020) [2024-07-02 14:22:36,096][36761] Fps is (10 sec: 45869.9, 60 sec: 43690.3, 300 sec: 44042.2). Total num frames: 515784704. Throughput: 0: 43848.2. Samples: 515896300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-07-02 14:22:36,097][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:22:36,120][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000031481_515784704.pth... [2024-07-02 14:22:36,186][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000030838_505249792.pth [2024-07-02 14:22:39,425][36999] Updated weights for policy 0, policy_version 31490 (0.0042) [2024-07-02 14:22:41,095][36761] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 516014080. Throughput: 0: 43903.5. Samples: 516164580. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-07-02 14:22:41,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 14:22:42,895][36999] Updated weights for policy 0, policy_version 31500 (0.0030) [2024-07-02 14:22:46,095][36761] Fps is (10 sec: 40965.0, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 516194304. Throughput: 0: 43979.6. Samples: 516295060. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-07-02 14:22:46,095][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:22:46,739][36999] Updated weights for policy 0, policy_version 31510 (0.0020) [2024-07-02 14:22:50,342][36999] Updated weights for policy 0, policy_version 31520 (0.0035) [2024-07-02 14:22:51,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43694.1, 300 sec: 43986.9). Total num frames: 516440064. Throughput: 0: 43957.8. Samples: 516558560. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-07-02 14:22:51,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:22:54,130][36999] Updated weights for policy 0, policy_version 31530 (0.0027) [2024-07-02 14:22:56,095][36761] Fps is (10 sec: 47513.1, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 516669440. Throughput: 0: 43601.2. Samples: 516813620. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-07-02 14:22:56,099][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:22:57,840][36999] Updated weights for policy 0, policy_version 31540 (0.0036) [2024-07-02 14:23:01,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 43875.8). Total num frames: 516849664. Throughput: 0: 43744.5. Samples: 516945180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:23:01,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:23:01,852][36999] Updated weights for policy 0, policy_version 31550 (0.0037) [2024-07-02 14:23:05,456][36999] Updated weights for policy 0, policy_version 31560 (0.0032) [2024-07-02 14:23:06,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 44042.4). Total num frames: 517095424. Throughput: 0: 43778.8. Samples: 517210680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:23:06,096][36761] Avg episode reward: [(0, '1.041')] [2024-07-02 14:23:09,204][36999] Updated weights for policy 0, policy_version 31570 (0.0030) [2024-07-02 14:23:11,095][36761] Fps is (10 sec: 45874.8, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 517308416. Throughput: 0: 43645.3. Samples: 517470500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:23:11,096][36761] Avg episode reward: [(0, '1.047')] [2024-07-02 14:23:12,235][36979] Signal inference workers to stop experience collection... (7450 times) [2024-07-02 14:23:12,236][36979] Signal inference workers to resume experience collection... (7450 times) [2024-07-02 14:23:12,271][36999] InferenceWorker_p0-w0: stopping experience collection (7450 times) [2024-07-02 14:23:12,271][36999] InferenceWorker_p0-w0: resuming experience collection (7450 times) [2024-07-02 14:23:12,797][36999] Updated weights for policy 0, policy_version 31580 (0.0038) [2024-07-02 14:23:16,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 43875.8). Total num frames: 517521408. Throughput: 0: 43763.9. Samples: 517603880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:23:16,096][36761] Avg episode reward: [(0, '1.041')] [2024-07-02 14:23:17,245][36999] Updated weights for policy 0, policy_version 31590 (0.0031) [2024-07-02 14:23:20,361][36999] Updated weights for policy 0, policy_version 31600 (0.0026) [2024-07-02 14:23:21,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 517750784. Throughput: 0: 43779.3. Samples: 517866320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 14:23:21,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:23:24,485][36999] Updated weights for policy 0, policy_version 31610 (0.0039) [2024-07-02 14:23:26,096][36761] Fps is (10 sec: 44236.2, 60 sec: 43963.6, 300 sec: 43931.3). Total num frames: 517963776. Throughput: 0: 43640.7. Samples: 518128420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 14:23:26,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:23:27,825][36999] Updated weights for policy 0, policy_version 31620 (0.0029) [2024-07-02 14:23:31,095][36761] Fps is (10 sec: 42599.0, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 518176768. Throughput: 0: 43654.7. Samples: 518259520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 14:23:31,095][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:23:31,880][36999] Updated weights for policy 0, policy_version 31630 (0.0036) [2024-07-02 14:23:35,205][36999] Updated weights for policy 0, policy_version 31640 (0.0036) [2024-07-02 14:23:36,095][36761] Fps is (10 sec: 45875.7, 60 sec: 43964.5, 300 sec: 44042.4). Total num frames: 518422528. Throughput: 0: 43840.3. Samples: 518531380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 14:23:36,096][36761] Avg episode reward: [(0, '1.010')] [2024-07-02 14:23:39,416][36999] Updated weights for policy 0, policy_version 31650 (0.0026) [2024-07-02 14:23:41,096][36761] Fps is (10 sec: 42597.2, 60 sec: 43144.3, 300 sec: 43875.9). Total num frames: 518602752. Throughput: 0: 44013.2. Samples: 518794220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-07-02 14:23:41,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:23:42,475][36999] Updated weights for policy 0, policy_version 31660 (0.0027) [2024-07-02 14:23:46,095][36761] Fps is (10 sec: 42598.8, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 518848512. Throughput: 0: 43900.0. Samples: 518920680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-07-02 14:23:46,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:23:46,605][36999] Updated weights for policy 0, policy_version 31670 (0.0027) [2024-07-02 14:23:49,861][36999] Updated weights for policy 0, policy_version 31680 (0.0033) [2024-07-02 14:23:51,100][36761] Fps is (10 sec: 49130.8, 60 sec: 44233.4, 300 sec: 44097.3). Total num frames: 519094272. Throughput: 0: 43953.3. Samples: 519188780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-07-02 14:23:51,100][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:23:54,279][36999] Updated weights for policy 0, policy_version 31690 (0.0032) [2024-07-02 14:23:56,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 519290880. Throughput: 0: 44168.4. Samples: 519458080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-07-02 14:23:56,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:23:57,168][36999] Updated weights for policy 0, policy_version 31700 (0.0030) [2024-07-02 14:24:01,095][36761] Fps is (10 sec: 40978.5, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 519503872. Throughput: 0: 43942.2. Samples: 519581280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-07-02 14:24:01,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:24:01,605][36999] Updated weights for policy 0, policy_version 31710 (0.0031) [2024-07-02 14:24:04,945][36999] Updated weights for policy 0, policy_version 31720 (0.0033) [2024-07-02 14:24:06,095][36761] Fps is (10 sec: 45875.0, 60 sec: 44236.7, 300 sec: 44042.4). Total num frames: 519749632. Throughput: 0: 44038.2. Samples: 519848040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 14:24:06,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:24:08,943][36999] Updated weights for policy 0, policy_version 31730 (0.0025) [2024-07-02 14:24:11,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 519946240. Throughput: 0: 44194.9. Samples: 520117180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 14:24:11,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:24:12,414][36999] Updated weights for policy 0, policy_version 31740 (0.0034) [2024-07-02 14:24:16,095][36761] Fps is (10 sec: 40960.6, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 520159232. Throughput: 0: 43921.3. Samples: 520235980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 14:24:16,096][36761] Avg episode reward: [(0, '1.039')] [2024-07-02 14:24:16,406][36999] Updated weights for policy 0, policy_version 31750 (0.0027) [2024-07-02 14:24:19,903][36999] Updated weights for policy 0, policy_version 31760 (0.0027) [2024-07-02 14:24:21,095][36761] Fps is (10 sec: 45875.0, 60 sec: 44236.9, 300 sec: 44042.4). Total num frames: 520404992. Throughput: 0: 43896.9. Samples: 520506740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-07-02 14:24:21,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:24:23,819][36999] Updated weights for policy 0, policy_version 31770 (0.0040) [2024-07-02 14:24:26,098][36761] Fps is (10 sec: 44224.9, 60 sec: 43961.9, 300 sec: 43931.0). Total num frames: 520601600. Throughput: 0: 44073.2. Samples: 520777620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 14:24:26,099][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:24:27,316][36999] Updated weights for policy 0, policy_version 31780 (0.0027) [2024-07-02 14:24:31,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43963.7, 300 sec: 43764.7). Total num frames: 520814592. Throughput: 0: 43958.2. Samples: 520898800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 14:24:31,095][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:24:31,332][36999] Updated weights for policy 0, policy_version 31790 (0.0032) [2024-07-02 14:24:34,880][36999] Updated weights for policy 0, policy_version 31800 (0.0028) [2024-07-02 14:24:36,095][36761] Fps is (10 sec: 45887.6, 60 sec: 43963.8, 300 sec: 43987.1). Total num frames: 521060352. Throughput: 0: 43881.8. Samples: 521163260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 14:24:36,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:24:36,119][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000031804_521076736.pth... [2024-07-02 14:24:36,175][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000031160_510525440.pth [2024-07-02 14:24:38,767][36999] Updated weights for policy 0, policy_version 31810 (0.0028) [2024-07-02 14:24:41,095][36761] Fps is (10 sec: 44236.6, 60 sec: 44237.0, 300 sec: 43875.8). Total num frames: 521256960. Throughput: 0: 43920.5. Samples: 521434500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-07-02 14:24:41,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:24:42,545][36999] Updated weights for policy 0, policy_version 31820 (0.0056) [2024-07-02 14:24:46,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43963.7, 300 sec: 43764.7). Total num frames: 521486336. Throughput: 0: 43876.5. Samples: 521555720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 14:24:46,096][36761] Avg episode reward: [(0, '1.037')] [2024-07-02 14:24:46,171][36999] Updated weights for policy 0, policy_version 31830 (0.0040) [2024-07-02 14:24:49,846][36999] Updated weights for policy 0, policy_version 31840 (0.0028) [2024-07-02 14:24:51,095][36761] Fps is (10 sec: 47513.8, 60 sec: 43967.1, 300 sec: 44042.4). Total num frames: 521732096. Throughput: 0: 43994.4. Samples: 521827780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 14:24:51,096][36761] Avg episode reward: [(0, '1.036')] [2024-07-02 14:24:53,615][36999] Updated weights for policy 0, policy_version 31850 (0.0032) [2024-07-02 14:24:55,780][36979] Signal inference workers to stop experience collection... (7500 times) [2024-07-02 14:24:55,830][36999] InferenceWorker_p0-w0: stopping experience collection (7500 times) [2024-07-02 14:24:55,836][36979] Signal inference workers to resume experience collection... (7500 times) [2024-07-02 14:24:55,848][36999] InferenceWorker_p0-w0: resuming experience collection (7500 times) [2024-07-02 14:24:56,097][36761] Fps is (10 sec: 44227.8, 60 sec: 43962.3, 300 sec: 43931.0). Total num frames: 521928704. Throughput: 0: 44018.5. Samples: 522098100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 14:24:56,098][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:24:57,210][36999] Updated weights for policy 0, policy_version 31860 (0.0042) [2024-07-02 14:25:00,921][36999] Updated weights for policy 0, policy_version 31870 (0.0029) [2024-07-02 14:25:01,100][36761] Fps is (10 sec: 42578.9, 60 sec: 44233.5, 300 sec: 43819.6). Total num frames: 522158080. Throughput: 0: 44105.7. Samples: 522220940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-07-02 14:25:01,100][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:25:04,550][36999] Updated weights for policy 0, policy_version 31880 (0.0040) [2024-07-02 14:25:06,100][36761] Fps is (10 sec: 47501.1, 60 sec: 44233.5, 300 sec: 44097.3). Total num frames: 522403840. Throughput: 0: 44035.9. Samples: 522488560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:25:06,101][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:25:08,318][36999] Updated weights for policy 0, policy_version 31890 (0.0039) [2024-07-02 14:25:11,095][36761] Fps is (10 sec: 42617.7, 60 sec: 43963.7, 300 sec: 43875.9). Total num frames: 522584064. Throughput: 0: 44031.5. Samples: 522758920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:25:11,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:25:12,010][36999] Updated weights for policy 0, policy_version 31900 (0.0032) [2024-07-02 14:25:15,703][36999] Updated weights for policy 0, policy_version 31910 (0.0031) [2024-07-02 14:25:16,095][36761] Fps is (10 sec: 40978.5, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 522813440. Throughput: 0: 44031.4. Samples: 522880220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:25:16,096][36761] Avg episode reward: [(0, '1.038')] [2024-07-02 14:25:19,387][36999] Updated weights for policy 0, policy_version 31920 (0.0024) [2024-07-02 14:25:21,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 523059200. Throughput: 0: 44077.3. Samples: 523146740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:25:21,096][36761] Avg episode reward: [(0, '1.039')] [2024-07-02 14:25:23,455][36999] Updated weights for policy 0, policy_version 31930 (0.0034) [2024-07-02 14:25:26,095][36761] Fps is (10 sec: 42599.1, 60 sec: 43965.7, 300 sec: 43875.8). Total num frames: 523239424. Throughput: 0: 44101.8. Samples: 523419080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 14:25:26,095][36761] Avg episode reward: [(0, '1.038')] [2024-07-02 14:25:26,855][36999] Updated weights for policy 0, policy_version 31940 (0.0035) [2024-07-02 14:25:30,711][36999] Updated weights for policy 0, policy_version 31950 (0.0046) [2024-07-02 14:25:31,100][36761] Fps is (10 sec: 40941.0, 60 sec: 44233.4, 300 sec: 43875.1). Total num frames: 523468800. Throughput: 0: 44123.0. Samples: 523541460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 14:25:31,101][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:25:34,422][36999] Updated weights for policy 0, policy_version 31960 (0.0036) [2024-07-02 14:25:36,095][36761] Fps is (10 sec: 47513.1, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 523714560. Throughput: 0: 43931.0. Samples: 523804680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 14:25:36,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:25:37,985][36999] Updated weights for policy 0, policy_version 31970 (0.0038) [2024-07-02 14:25:41,095][36761] Fps is (10 sec: 40979.1, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 523878400. Throughput: 0: 44010.9. Samples: 524078500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 14:25:41,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:25:41,927][36999] Updated weights for policy 0, policy_version 31980 (0.0023) [2024-07-02 14:25:45,691][36999] Updated weights for policy 0, policy_version 31990 (0.0028) [2024-07-02 14:25:46,095][36761] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 43931.3). Total num frames: 524140544. Throughput: 0: 43920.9. Samples: 524197180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 14:25:46,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:25:49,406][36999] Updated weights for policy 0, policy_version 32000 (0.0036) [2024-07-02 14:25:51,095][36761] Fps is (10 sec: 47512.6, 60 sec: 43690.5, 300 sec: 43931.7). Total num frames: 524353536. Throughput: 0: 43904.3. Samples: 524464060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 14:25:51,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 14:25:52,946][36999] Updated weights for policy 0, policy_version 32010 (0.0032) [2024-07-02 14:25:56,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43692.1, 300 sec: 43875.8). Total num frames: 524550144. Throughput: 0: 43897.4. Samples: 524734300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 14:25:56,095][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:25:56,801][36999] Updated weights for policy 0, policy_version 32020 (0.0025) [2024-07-02 14:26:00,261][36999] Updated weights for policy 0, policy_version 32030 (0.0040) [2024-07-02 14:26:01,097][36761] Fps is (10 sec: 44231.2, 60 sec: 43966.0, 300 sec: 43931.1). Total num frames: 524795904. Throughput: 0: 44021.8. Samples: 524861260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 14:26:01,097][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:26:04,290][36999] Updated weights for policy 0, policy_version 32040 (0.0031) [2024-07-02 14:26:06,095][36761] Fps is (10 sec: 45875.0, 60 sec: 43420.9, 300 sec: 43931.4). Total num frames: 525008896. Throughput: 0: 43957.7. Samples: 525124840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 14:26:06,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:26:07,642][36999] Updated weights for policy 0, policy_version 32050 (0.0036) [2024-07-02 14:26:11,095][36761] Fps is (10 sec: 40965.8, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 525205504. Throughput: 0: 43894.1. Samples: 525394320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 14:26:11,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:26:11,668][36979] Signal inference workers to stop experience collection... (7550 times) [2024-07-02 14:26:11,669][36979] Signal inference workers to resume experience collection... (7550 times) [2024-07-02 14:26:11,682][36999] InferenceWorker_p0-w0: stopping experience collection (7550 times) [2024-07-02 14:26:11,682][36999] InferenceWorker_p0-w0: resuming experience collection (7550 times) [2024-07-02 14:26:11,806][36999] Updated weights for policy 0, policy_version 32060 (0.0043) [2024-07-02 14:26:15,234][36999] Updated weights for policy 0, policy_version 32070 (0.0026) [2024-07-02 14:26:16,095][36761] Fps is (10 sec: 44236.3, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 525451264. Throughput: 0: 43917.3. Samples: 525517540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 14:26:16,098][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:26:19,491][36999] Updated weights for policy 0, policy_version 32080 (0.0037) [2024-07-02 14:26:21,095][36761] Fps is (10 sec: 47513.7, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 525680640. Throughput: 0: 44002.7. Samples: 525784800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 14:26:21,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:26:22,856][36999] Updated weights for policy 0, policy_version 32090 (0.0032) [2024-07-02 14:26:26,095][36761] Fps is (10 sec: 42599.1, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 525877248. Throughput: 0: 43816.4. Samples: 526050240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 14:26:26,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:26:26,807][36999] Updated weights for policy 0, policy_version 32100 (0.0038) [2024-07-02 14:26:30,300][36999] Updated weights for policy 0, policy_version 32110 (0.0030) [2024-07-02 14:26:31,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43967.1, 300 sec: 43875.9). Total num frames: 526106624. Throughput: 0: 44033.4. Samples: 526178680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:26:31,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:26:34,234][36999] Updated weights for policy 0, policy_version 32120 (0.0036) [2024-07-02 14:26:36,095][36761] Fps is (10 sec: 45875.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 526336000. Throughput: 0: 43890.4. Samples: 526439120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:26:36,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:26:36,105][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000032125_526336000.pth... [2024-07-02 14:26:36,167][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000031481_515784704.pth [2024-07-02 14:26:38,078][36999] Updated weights for policy 0, policy_version 32130 (0.0033) [2024-07-02 14:26:41,096][36761] Fps is (10 sec: 40959.0, 60 sec: 43963.5, 300 sec: 43875.8). Total num frames: 526516224. Throughput: 0: 43807.8. Samples: 526705660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:26:41,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:26:41,827][36999] Updated weights for policy 0, policy_version 32140 (0.0034) [2024-07-02 14:26:45,490][36999] Updated weights for policy 0, policy_version 32150 (0.0027) [2024-07-02 14:26:46,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 43876.5). Total num frames: 526761984. Throughput: 0: 43821.9. Samples: 526833180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:26:46,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:26:49,429][36999] Updated weights for policy 0, policy_version 32160 (0.0033) [2024-07-02 14:26:51,095][36761] Fps is (10 sec: 47514.5, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 526991360. Throughput: 0: 43641.3. Samples: 527088700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:26:51,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:26:52,964][36999] Updated weights for policy 0, policy_version 32170 (0.0038) [2024-07-02 14:26:56,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 527187968. Throughput: 0: 43715.5. Samples: 527361520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:26:56,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:26:56,771][36999] Updated weights for policy 0, policy_version 32180 (0.0039) [2024-07-02 14:27:00,275][36999] Updated weights for policy 0, policy_version 32190 (0.0029) [2024-07-02 14:27:01,096][36761] Fps is (10 sec: 42597.7, 60 sec: 43691.6, 300 sec: 43875.8). Total num frames: 527417344. Throughput: 0: 43760.8. Samples: 527486780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:27:01,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:27:04,425][36999] Updated weights for policy 0, policy_version 32200 (0.0032) [2024-07-02 14:27:06,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 527630336. Throughput: 0: 43637.8. Samples: 527748500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:27:06,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 14:27:07,921][36999] Updated weights for policy 0, policy_version 32210 (0.0036) [2024-07-02 14:27:11,095][36761] Fps is (10 sec: 42599.2, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 527843328. Throughput: 0: 43646.6. Samples: 528014340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:27:11,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:27:11,921][36999] Updated weights for policy 0, policy_version 32220 (0.0031) [2024-07-02 14:27:15,246][36999] Updated weights for policy 0, policy_version 32230 (0.0029) [2024-07-02 14:27:16,095][36761] Fps is (10 sec: 44236.2, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 528072704. Throughput: 0: 43740.3. Samples: 528147000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:27:16,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 14:27:19,330][36999] Updated weights for policy 0, policy_version 32240 (0.0030) [2024-07-02 14:27:21,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 43931.3). Total num frames: 528285696. Throughput: 0: 43684.9. Samples: 528404940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:27:21,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:27:22,938][36999] Updated weights for policy 0, policy_version 32250 (0.0025) [2024-07-02 14:27:26,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 528515072. Throughput: 0: 43766.4. Samples: 528675140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:27:26,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:27:26,650][36999] Updated weights for policy 0, policy_version 32260 (0.0034) [2024-07-02 14:27:30,303][36999] Updated weights for policy 0, policy_version 32270 (0.0037) [2024-07-02 14:27:31,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43690.7, 300 sec: 43876.0). Total num frames: 528728064. Throughput: 0: 43810.7. Samples: 528804660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:27:31,095][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 14:27:34,073][36999] Updated weights for policy 0, policy_version 32280 (0.0035) [2024-07-02 14:27:36,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 43820.2). Total num frames: 528941056. Throughput: 0: 43889.7. Samples: 529063740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:27:36,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:27:37,721][36999] Updated weights for policy 0, policy_version 32290 (0.0041) [2024-07-02 14:27:41,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43963.9, 300 sec: 43931.3). Total num frames: 529154048. Throughput: 0: 43801.9. Samples: 529332600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:27:41,095][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:27:41,491][36999] Updated weights for policy 0, policy_version 32300 (0.0031) [2024-07-02 14:27:45,593][36999] Updated weights for policy 0, policy_version 32310 (0.0024) [2024-07-02 14:27:46,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 529383424. Throughput: 0: 43915.7. Samples: 529462980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:27:46,098][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:27:48,982][36999] Updated weights for policy 0, policy_version 32320 (0.0045) [2024-07-02 14:27:51,018][36979] Signal inference workers to stop experience collection... (7600 times) [2024-07-02 14:27:51,071][36999] InferenceWorker_p0-w0: stopping experience collection (7600 times) [2024-07-02 14:27:51,073][36979] Signal inference workers to resume experience collection... (7600 times) [2024-07-02 14:27:51,084][36999] InferenceWorker_p0-w0: resuming experience collection (7600 times) [2024-07-02 14:27:51,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 43820.3). Total num frames: 529596416. Throughput: 0: 43754.7. Samples: 529717460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-07-02 14:27:51,095][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:27:52,983][36999] Updated weights for policy 0, policy_version 32330 (0.0029) [2024-07-02 14:27:56,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 529825792. Throughput: 0: 43873.7. Samples: 529988660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-07-02 14:27:56,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:27:56,748][36999] Updated weights for policy 0, policy_version 32340 (0.0038) [2024-07-02 14:28:00,326][36999] Updated weights for policy 0, policy_version 32350 (0.0021) [2024-07-02 14:28:01,095][36761] Fps is (10 sec: 44235.9, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 530038784. Throughput: 0: 43750.2. Samples: 530115760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-07-02 14:28:01,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:28:04,032][36999] Updated weights for policy 0, policy_version 32360 (0.0036) [2024-07-02 14:28:06,096][36761] Fps is (10 sec: 44236.3, 60 sec: 43963.6, 300 sec: 43931.3). Total num frames: 530268160. Throughput: 0: 43844.2. Samples: 530377940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-07-02 14:28:06,096][36761] Avg episode reward: [(0, '1.010')] [2024-07-02 14:28:07,770][36999] Updated weights for policy 0, policy_version 32370 (0.0036) [2024-07-02 14:28:11,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 530481152. Throughput: 0: 43870.2. Samples: 530649300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-07-02 14:28:11,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:28:11,414][36999] Updated weights for policy 0, policy_version 32380 (0.0025) [2024-07-02 14:28:15,155][36999] Updated weights for policy 0, policy_version 32390 (0.0037) [2024-07-02 14:28:16,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 530694144. Throughput: 0: 43792.8. Samples: 530775340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-07-02 14:28:16,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:28:18,855][36999] Updated weights for policy 0, policy_version 32400 (0.0036) [2024-07-02 14:28:21,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.7, 300 sec: 43931.4). Total num frames: 530923520. Throughput: 0: 43826.7. Samples: 531035940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-07-02 14:28:21,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:28:22,929][36999] Updated weights for policy 0, policy_version 32410 (0.0042) [2024-07-02 14:28:26,095][36761] Fps is (10 sec: 45875.8, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 531152896. Throughput: 0: 43781.8. Samples: 531302780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-07-02 14:28:26,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:28:26,118][36999] Updated weights for policy 0, policy_version 32420 (0.0045) [2024-07-02 14:28:30,405][36999] Updated weights for policy 0, policy_version 32430 (0.0033) [2024-07-02 14:28:31,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 43820.3). Total num frames: 531349504. Throughput: 0: 43889.3. Samples: 531438000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-07-02 14:28:31,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:28:33,455][36999] Updated weights for policy 0, policy_version 32440 (0.0026) [2024-07-02 14:28:36,095][36761] Fps is (10 sec: 42597.7, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 531578880. Throughput: 0: 43924.2. Samples: 531694060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 14:28:36,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:28:36,111][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000032445_531578880.pth... [2024-07-02 14:28:36,165][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000031804_521076736.pth [2024-07-02 14:28:38,334][36999] Updated weights for policy 0, policy_version 32450 (0.0035) [2024-07-02 14:28:41,095][36761] Fps is (10 sec: 45875.7, 60 sec: 44236.8, 300 sec: 43931.3). Total num frames: 531808256. Throughput: 0: 43658.7. Samples: 531953300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 14:28:41,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:28:41,386][36999] Updated weights for policy 0, policy_version 32460 (0.0034) [2024-07-02 14:28:45,785][36999] Updated weights for policy 0, policy_version 32470 (0.0036) [2024-07-02 14:28:46,095][36761] Fps is (10 sec: 40960.9, 60 sec: 43417.7, 300 sec: 43709.9). Total num frames: 531988480. Throughput: 0: 43886.9. Samples: 532090660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 14:28:46,095][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:28:48,690][36999] Updated weights for policy 0, policy_version 32480 (0.0035) [2024-07-02 14:28:51,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 532234240. Throughput: 0: 43848.2. Samples: 532351100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-07-02 14:28:51,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:28:53,018][36999] Updated weights for policy 0, policy_version 32490 (0.0030) [2024-07-02 14:28:56,095][36761] Fps is (10 sec: 47513.5, 60 sec: 43963.8, 300 sec: 43931.4). Total num frames: 532463616. Throughput: 0: 43689.5. Samples: 532615320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-07-02 14:28:56,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:28:56,132][36999] Updated weights for policy 0, policy_version 32500 (0.0031) [2024-07-02 14:29:00,346][36999] Updated weights for policy 0, policy_version 32510 (0.0045) [2024-07-02 14:29:01,096][36761] Fps is (10 sec: 42597.5, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 532660224. Throughput: 0: 43955.4. Samples: 532753340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-07-02 14:29:01,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:29:03,562][36999] Updated weights for policy 0, policy_version 32520 (0.0046) [2024-07-02 14:29:06,097][36761] Fps is (10 sec: 44228.1, 60 sec: 43962.5, 300 sec: 43931.1). Total num frames: 532905984. Throughput: 0: 43967.5. Samples: 533014560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-07-02 14:29:06,098][36761] Avg episode reward: [(0, '1.036')] [2024-07-02 14:29:07,606][36999] Updated weights for policy 0, policy_version 32530 (0.0030) [2024-07-02 14:29:11,014][36999] Updated weights for policy 0, policy_version 32540 (0.0031) [2024-07-02 14:29:11,095][36761] Fps is (10 sec: 47514.4, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 533135360. Throughput: 0: 44083.0. Samples: 533286520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-07-02 14:29:11,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:29:13,582][36979] Signal inference workers to stop experience collection... (7650 times) [2024-07-02 14:29:13,625][36999] InferenceWorker_p0-w0: stopping experience collection (7650 times) [2024-07-02 14:29:13,703][36979] Signal inference workers to resume experience collection... (7650 times) [2024-07-02 14:29:13,704][36999] InferenceWorker_p0-w0: resuming experience collection (7650 times) [2024-07-02 14:29:15,013][36999] Updated weights for policy 0, policy_version 32550 (0.0028) [2024-07-02 14:29:16,095][36761] Fps is (10 sec: 42605.8, 60 sec: 43963.7, 300 sec: 43820.2). Total num frames: 533331968. Throughput: 0: 43890.2. Samples: 533413060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 14:29:16,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:29:18,515][36999] Updated weights for policy 0, policy_version 32560 (0.0024) [2024-07-02 14:29:21,095][36761] Fps is (10 sec: 44236.4, 60 sec: 44236.8, 300 sec: 43987.3). Total num frames: 533577728. Throughput: 0: 44076.9. Samples: 533677520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 14:29:21,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 14:29:22,300][36999] Updated weights for policy 0, policy_version 32570 (0.0026) [2024-07-02 14:29:25,738][36999] Updated weights for policy 0, policy_version 32580 (0.0031) [2024-07-02 14:29:26,095][36761] Fps is (10 sec: 45875.4, 60 sec: 43963.6, 300 sec: 43986.9). Total num frames: 533790720. Throughput: 0: 44172.8. Samples: 533941080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 14:29:26,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:29:29,551][36999] Updated weights for policy 0, policy_version 32590 (0.0023) [2024-07-02 14:29:31,095][36761] Fps is (10 sec: 40960.4, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 533987328. Throughput: 0: 44065.7. Samples: 534073620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 14:29:31,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:29:33,064][36999] Updated weights for policy 0, policy_version 32600 (0.0020) [2024-07-02 14:29:36,095][36761] Fps is (10 sec: 44237.0, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 534233088. Throughput: 0: 44223.9. Samples: 534341180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 14:29:36,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:29:37,177][36999] Updated weights for policy 0, policy_version 32610 (0.0038) [2024-07-02 14:29:40,621][36999] Updated weights for policy 0, policy_version 32620 (0.0041) [2024-07-02 14:29:41,095][36761] Fps is (10 sec: 45875.2, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 534446080. Throughput: 0: 44006.6. Samples: 534595620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:29:41,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:29:44,973][36999] Updated weights for policy 0, policy_version 32630 (0.0045) [2024-07-02 14:29:46,095][36761] Fps is (10 sec: 40960.7, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 534642688. Throughput: 0: 44031.4. Samples: 534734740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:29:46,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:29:47,975][36999] Updated weights for policy 0, policy_version 32640 (0.0025) [2024-07-02 14:29:51,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43963.7, 300 sec: 43876.1). Total num frames: 534872064. Throughput: 0: 43983.2. Samples: 534993720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:29:51,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:29:52,226][36999] Updated weights for policy 0, policy_version 32650 (0.0041) [2024-07-02 14:29:55,690][36999] Updated weights for policy 0, policy_version 32660 (0.0038) [2024-07-02 14:29:56,100][36761] Fps is (10 sec: 45853.6, 60 sec: 43960.3, 300 sec: 43875.8). Total num frames: 535101440. Throughput: 0: 43817.3. Samples: 535258500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:29:56,101][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:29:59,563][36999] Updated weights for policy 0, policy_version 32670 (0.0026) [2024-07-02 14:30:01,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43963.9, 300 sec: 43709.9). Total num frames: 535298048. Throughput: 0: 44003.7. Samples: 535393220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 14:30:01,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:30:03,113][36999] Updated weights for policy 0, policy_version 32680 (0.0033) [2024-07-02 14:30:06,095][36761] Fps is (10 sec: 44256.5, 60 sec: 43965.0, 300 sec: 43931.3). Total num frames: 535543808. Throughput: 0: 43895.1. Samples: 535652800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 14:30:06,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:30:06,953][36999] Updated weights for policy 0, policy_version 32690 (0.0037) [2024-07-02 14:30:10,932][36999] Updated weights for policy 0, policy_version 32700 (0.0031) [2024-07-02 14:30:11,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 535756800. Throughput: 0: 44085.1. Samples: 535924900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 14:30:11,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:30:14,727][36999] Updated weights for policy 0, policy_version 32710 (0.0037) [2024-07-02 14:30:16,095][36761] Fps is (10 sec: 44237.8, 60 sec: 44237.0, 300 sec: 43820.3). Total num frames: 535986176. Throughput: 0: 44105.9. Samples: 536058380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-07-02 14:30:16,095][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:30:18,168][36999] Updated weights for policy 0, policy_version 32720 (0.0045) [2024-07-02 14:30:21,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 536199168. Throughput: 0: 43837.4. Samples: 536313860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-07-02 14:30:21,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:30:22,155][36999] Updated weights for policy 0, policy_version 32730 (0.0033) [2024-07-02 14:30:25,527][36999] Updated weights for policy 0, policy_version 32740 (0.0030) [2024-07-02 14:30:26,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43690.8, 300 sec: 43876.5). Total num frames: 536412160. Throughput: 0: 44126.7. Samples: 536581320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-07-02 14:30:26,095][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:30:29,505][36999] Updated weights for policy 0, policy_version 32750 (0.0042) [2024-07-02 14:30:31,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43963.7, 300 sec: 43764.7). Total num frames: 536625152. Throughput: 0: 43997.6. Samples: 536714640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-07-02 14:30:31,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:30:33,225][36999] Updated weights for policy 0, policy_version 32760 (0.0028) [2024-07-02 14:30:36,096][36761] Fps is (10 sec: 44235.7, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 536854528. Throughput: 0: 44018.1. Samples: 536974540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-07-02 14:30:36,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:30:36,121][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000032768_536870912.pth... [2024-07-02 14:30:36,182][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000032125_526336000.pth [2024-07-02 14:30:36,928][36999] Updated weights for policy 0, policy_version 32770 (0.0035) [2024-07-02 14:30:40,656][36999] Updated weights for policy 0, policy_version 32780 (0.0039) [2024-07-02 14:30:41,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 537083904. Throughput: 0: 43987.5. Samples: 537237740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-07-02 14:30:41,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:30:41,239][36979] Signal inference workers to stop experience collection... (7700 times) [2024-07-02 14:30:41,240][36979] Signal inference workers to resume experience collection... (7700 times) [2024-07-02 14:30:41,286][36999] InferenceWorker_p0-w0: stopping experience collection (7700 times) [2024-07-02 14:30:41,286][36999] InferenceWorker_p0-w0: resuming experience collection (7700 times) [2024-07-02 14:30:44,378][36999] Updated weights for policy 0, policy_version 32790 (0.0025) [2024-07-02 14:30:46,095][36761] Fps is (10 sec: 42599.4, 60 sec: 43963.7, 300 sec: 43820.3). Total num frames: 537280512. Throughput: 0: 43859.6. Samples: 537366900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-07-02 14:30:46,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:30:48,080][36999] Updated weights for policy 0, policy_version 32800 (0.0029) [2024-07-02 14:30:51,095][36761] Fps is (10 sec: 44237.0, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 537526272. Throughput: 0: 43915.2. Samples: 537628980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-07-02 14:30:51,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:30:52,294][36999] Updated weights for policy 0, policy_version 32810 (0.0027) [2024-07-02 14:30:55,455][36999] Updated weights for policy 0, policy_version 32820 (0.0038) [2024-07-02 14:30:56,098][36761] Fps is (10 sec: 44224.7, 60 sec: 43692.1, 300 sec: 43820.1). Total num frames: 537722880. Throughput: 0: 43703.1. Samples: 537891660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-07-02 14:30:56,099][36761] Avg episode reward: [(0, '1.003')] [2024-07-02 14:30:59,595][36999] Updated weights for policy 0, policy_version 32830 (0.0038) [2024-07-02 14:31:01,095][36761] Fps is (10 sec: 42598.9, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 537952256. Throughput: 0: 43720.4. Samples: 538025800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-07-02 14:31:01,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:31:03,166][36999] Updated weights for policy 0, policy_version 32840 (0.0029) [2024-07-02 14:31:06,095][36761] Fps is (10 sec: 42610.3, 60 sec: 43417.8, 300 sec: 43875.8). Total num frames: 538148864. Throughput: 0: 43794.8. Samples: 538284620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:31:06,095][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:31:07,094][36999] Updated weights for policy 0, policy_version 32850 (0.0030) [2024-07-02 14:31:10,428][36999] Updated weights for policy 0, policy_version 32860 (0.0040) [2024-07-02 14:31:11,095][36761] Fps is (10 sec: 44236.2, 60 sec: 43963.6, 300 sec: 43875.8). Total num frames: 538394624. Throughput: 0: 43775.9. Samples: 538551240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:31:11,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:31:14,792][36999] Updated weights for policy 0, policy_version 32870 (0.0037) [2024-07-02 14:31:16,095][36761] Fps is (10 sec: 47513.2, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 538624000. Throughput: 0: 43708.5. Samples: 538681520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:31:16,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:31:18,085][36999] Updated weights for policy 0, policy_version 32880 (0.0038) [2024-07-02 14:31:21,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 538820608. Throughput: 0: 43682.8. Samples: 538940260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 14:31:21,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:31:22,220][36999] Updated weights for policy 0, policy_version 32890 (0.0030) [2024-07-02 14:31:25,396][36999] Updated weights for policy 0, policy_version 32900 (0.0027) [2024-07-02 14:31:26,095][36761] Fps is (10 sec: 44236.3, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 539066368. Throughput: 0: 43776.9. Samples: 539207700. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-07-02 14:31:26,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:31:29,531][36999] Updated weights for policy 0, policy_version 32910 (0.0045) [2024-07-02 14:31:31,095][36761] Fps is (10 sec: 45875.5, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 539279360. Throughput: 0: 44040.8. Samples: 539348740. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-07-02 14:31:31,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:31:32,811][36999] Updated weights for policy 0, policy_version 32920 (0.0029) [2024-07-02 14:31:36,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43690.7, 300 sec: 43931.4). Total num frames: 539475968. Throughput: 0: 43847.5. Samples: 539602120. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-07-02 14:31:36,096][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 14:31:37,026][36999] Updated weights for policy 0, policy_version 32930 (0.0030) [2024-07-02 14:31:40,160][36999] Updated weights for policy 0, policy_version 32940 (0.0041) [2024-07-02 14:31:41,095][36761] Fps is (10 sec: 44236.1, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 539721728. Throughput: 0: 43847.8. Samples: 539864700. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-07-02 14:31:41,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:31:44,373][36999] Updated weights for policy 0, policy_version 32950 (0.0035) [2024-07-02 14:31:46,095][36761] Fps is (10 sec: 45875.8, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 539934720. Throughput: 0: 43994.6. Samples: 540005560. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-07-02 14:31:46,100][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:31:47,466][36999] Updated weights for policy 0, policy_version 32960 (0.0036) [2024-07-02 14:31:51,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 540147712. Throughput: 0: 44143.4. Samples: 540271080. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-07-02 14:31:51,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:31:52,229][36999] Updated weights for policy 0, policy_version 32970 (0.0030) [2024-07-02 14:31:54,727][36999] Updated weights for policy 0, policy_version 32980 (0.0035) [2024-07-02 14:31:56,095][36761] Fps is (10 sec: 44236.2, 60 sec: 44238.7, 300 sec: 43931.3). Total num frames: 540377088. Throughput: 0: 44008.8. Samples: 540531640. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-07-02 14:31:56,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:31:59,609][36999] Updated weights for policy 0, policy_version 32990 (0.0035) [2024-07-02 14:32:01,095][36761] Fps is (10 sec: 45875.6, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 540606464. Throughput: 0: 44157.3. Samples: 540668600. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-07-02 14:32:01,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:32:02,335][36999] Updated weights for policy 0, policy_version 33000 (0.0031) [2024-07-02 14:32:04,435][36979] Signal inference workers to stop experience collection... (7750 times) [2024-07-02 14:32:04,435][36979] Signal inference workers to resume experience collection... (7750 times) [2024-07-02 14:32:04,473][36999] InferenceWorker_p0-w0: stopping experience collection (7750 times) [2024-07-02 14:32:04,473][36999] InferenceWorker_p0-w0: resuming experience collection (7750 times) [2024-07-02 14:32:06,095][36761] Fps is (10 sec: 42598.8, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 540803072. Throughput: 0: 44119.6. Samples: 540925640. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-07-02 14:32:06,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:32:06,859][36999] Updated weights for policy 0, policy_version 33010 (0.0041) [2024-07-02 14:32:10,248][36999] Updated weights for policy 0, policy_version 33020 (0.0028) [2024-07-02 14:32:11,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 541032448. Throughput: 0: 43962.2. Samples: 541186000. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-07-02 14:32:11,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:32:14,290][36999] Updated weights for policy 0, policy_version 33030 (0.0030) [2024-07-02 14:32:16,095][36761] Fps is (10 sec: 45874.7, 60 sec: 43963.6, 300 sec: 43986.8). Total num frames: 541261824. Throughput: 0: 43979.0. Samples: 541327800. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-07-02 14:32:16,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:32:17,529][36999] Updated weights for policy 0, policy_version 33040 (0.0030) [2024-07-02 14:32:21,095][36761] Fps is (10 sec: 40960.6, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 541442048. Throughput: 0: 44001.9. Samples: 541582200. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-07-02 14:32:21,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:32:21,798][36999] Updated weights for policy 0, policy_version 33050 (0.0028) [2024-07-02 14:32:24,749][36999] Updated weights for policy 0, policy_version 33060 (0.0026) [2024-07-02 14:32:26,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 541704192. Throughput: 0: 44089.9. Samples: 541848740. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-07-02 14:32:26,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:32:29,111][36999] Updated weights for policy 0, policy_version 33070 (0.0027) [2024-07-02 14:32:31,096][36761] Fps is (10 sec: 47512.3, 60 sec: 43963.6, 300 sec: 43986.9). Total num frames: 541917184. Throughput: 0: 44150.0. Samples: 541992320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 14:32:31,105][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:32:32,153][36999] Updated weights for policy 0, policy_version 33080 (0.0036) [2024-07-02 14:32:36,095][36761] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 542097408. Throughput: 0: 43977.8. Samples: 542250080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 14:32:36,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:32:36,151][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000033088_542113792.pth... [2024-07-02 14:32:36,215][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000032445_531578880.pth [2024-07-02 14:32:36,569][36999] Updated weights for policy 0, policy_version 33090 (0.0031) [2024-07-02 14:32:39,674][36999] Updated weights for policy 0, policy_version 33100 (0.0038) [2024-07-02 14:32:41,095][36761] Fps is (10 sec: 44237.5, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 542359552. Throughput: 0: 43944.0. Samples: 542509120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 14:32:41,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:32:43,962][36999] Updated weights for policy 0, policy_version 33110 (0.0020) [2024-07-02 14:32:46,095][36761] Fps is (10 sec: 47514.2, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 542572544. Throughput: 0: 44120.9. Samples: 542654040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-07-02 14:32:46,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:32:47,152][36999] Updated weights for policy 0, policy_version 33120 (0.0033) [2024-07-02 14:32:51,095][36761] Fps is (10 sec: 40959.7, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 542769152. Throughput: 0: 44091.0. Samples: 542909740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:32:51,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:32:51,401][36999] Updated weights for policy 0, policy_version 33130 (0.0026) [2024-07-02 14:32:54,518][36999] Updated weights for policy 0, policy_version 33140 (0.0029) [2024-07-02 14:32:56,095][36761] Fps is (10 sec: 44236.2, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 543014912. Throughput: 0: 44047.1. Samples: 543168120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:32:56,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:32:58,814][36999] Updated weights for policy 0, policy_version 33150 (0.0030) [2024-07-02 14:33:01,095][36761] Fps is (10 sec: 47514.1, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 543244288. Throughput: 0: 44062.3. Samples: 543310600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:33:01,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:33:01,975][36999] Updated weights for policy 0, policy_version 33160 (0.0040) [2024-07-02 14:33:06,095][36761] Fps is (10 sec: 42599.3, 60 sec: 43963.8, 300 sec: 43931.4). Total num frames: 543440896. Throughput: 0: 44203.6. Samples: 543571360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:33:06,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:33:06,266][36999] Updated weights for policy 0, policy_version 33170 (0.0036) [2024-07-02 14:33:09,350][36999] Updated weights for policy 0, policy_version 33180 (0.0029) [2024-07-02 14:33:11,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 543670272. Throughput: 0: 44065.8. Samples: 543831700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:33:11,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:33:13,687][36999] Updated weights for policy 0, policy_version 33190 (0.0026) [2024-07-02 14:33:16,095][36761] Fps is (10 sec: 45874.7, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 543899648. Throughput: 0: 43894.0. Samples: 543967540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:33:16,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:33:17,013][36999] Updated weights for policy 0, policy_version 33200 (0.0026) [2024-07-02 14:33:20,894][36999] Updated weights for policy 0, policy_version 33210 (0.0032) [2024-07-02 14:33:21,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44509.8, 300 sec: 43931.3). Total num frames: 544112640. Throughput: 0: 44125.4. Samples: 544235720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:33:21,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:33:24,367][36999] Updated weights for policy 0, policy_version 33220 (0.0031) [2024-07-02 14:33:26,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 544342016. Throughput: 0: 44183.2. Samples: 544497360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:33:26,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:33:28,373][36999] Updated weights for policy 0, policy_version 33230 (0.0038) [2024-07-02 14:33:31,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43690.9, 300 sec: 43931.4). Total num frames: 544538624. Throughput: 0: 43965.4. Samples: 544632480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 14:33:31,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:33:31,825][36999] Updated weights for policy 0, policy_version 33240 (0.0037) [2024-07-02 14:33:35,805][36999] Updated weights for policy 0, policy_version 33250 (0.0024) [2024-07-02 14:33:36,096][36761] Fps is (10 sec: 42597.7, 60 sec: 44509.8, 300 sec: 43931.3). Total num frames: 544768000. Throughput: 0: 44101.7. Samples: 544894320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:33:36,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:33:39,589][36999] Updated weights for policy 0, policy_version 33260 (0.0026) [2024-07-02 14:33:40,814][36979] Signal inference workers to stop experience collection... (7800 times) [2024-07-02 14:33:40,865][36979] Signal inference workers to resume experience collection... (7800 times) [2024-07-02 14:33:40,872][36999] InferenceWorker_p0-w0: stopping experience collection (7800 times) [2024-07-02 14:33:40,902][36999] InferenceWorker_p0-w0: resuming experience collection (7800 times) [2024-07-02 14:33:41,095][36761] Fps is (10 sec: 47513.2, 60 sec: 44236.8, 300 sec: 44153.5). Total num frames: 545013760. Throughput: 0: 44050.7. Samples: 545150400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:33:41,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:33:43,567][36999] Updated weights for policy 0, policy_version 33270 (0.0026) [2024-07-02 14:33:46,095][36761] Fps is (10 sec: 42599.5, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 545193984. Throughput: 0: 43903.7. Samples: 545286260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:33:46,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:33:47,037][36999] Updated weights for policy 0, policy_version 33280 (0.0030) [2024-07-02 14:33:50,915][36999] Updated weights for policy 0, policy_version 33290 (0.0033) [2024-07-02 14:33:51,095][36761] Fps is (10 sec: 40960.2, 60 sec: 44236.9, 300 sec: 43931.3). Total num frames: 545423360. Throughput: 0: 43916.8. Samples: 545547620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:33:51,096][36761] Avg episode reward: [(0, '0.986')] [2024-07-02 14:33:54,404][36999] Updated weights for policy 0, policy_version 33300 (0.0030) [2024-07-02 14:33:56,095][36761] Fps is (10 sec: 45874.7, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 545652736. Throughput: 0: 44026.2. Samples: 545812880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:33:56,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:33:58,266][36999] Updated weights for policy 0, policy_version 33310 (0.0043) [2024-07-02 14:34:01,095][36761] Fps is (10 sec: 44236.2, 60 sec: 43690.6, 300 sec: 43931.6). Total num frames: 545865728. Throughput: 0: 43932.3. Samples: 545944500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:34:01,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:34:01,809][36999] Updated weights for policy 0, policy_version 33320 (0.0026) [2024-07-02 14:34:05,805][36999] Updated weights for policy 0, policy_version 33330 (0.0033) [2024-07-02 14:34:06,095][36761] Fps is (10 sec: 42597.9, 60 sec: 43963.6, 300 sec: 43875.8). Total num frames: 546078720. Throughput: 0: 43771.5. Samples: 546205440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:34:06,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:34:09,228][36999] Updated weights for policy 0, policy_version 33340 (0.0038) [2024-07-02 14:34:11,100][36761] Fps is (10 sec: 44217.4, 60 sec: 43960.4, 300 sec: 43986.2). Total num frames: 546308096. Throughput: 0: 43850.3. Samples: 546470820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:34:11,100][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:34:13,443][36999] Updated weights for policy 0, policy_version 33350 (0.0025) [2024-07-02 14:34:16,096][36761] Fps is (10 sec: 44236.6, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 546521088. Throughput: 0: 43669.1. Samples: 546597600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:34:16,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:34:16,832][36999] Updated weights for policy 0, policy_version 33360 (0.0029) [2024-07-02 14:34:20,770][36999] Updated weights for policy 0, policy_version 33370 (0.0023) [2024-07-02 14:34:21,095][36761] Fps is (10 sec: 44256.3, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 546750464. Throughput: 0: 43757.8. Samples: 546863420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:34:21,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:34:24,569][36999] Updated weights for policy 0, policy_version 33380 (0.0034) [2024-07-02 14:34:26,095][36761] Fps is (10 sec: 44237.4, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 546963456. Throughput: 0: 43916.4. Samples: 547126640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:34:26,096][36761] Avg episode reward: [(0, '0.966')] [2024-07-02 14:34:28,084][36999] Updated weights for policy 0, policy_version 33390 (0.0041) [2024-07-02 14:34:31,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 547176448. Throughput: 0: 43789.7. Samples: 547256800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:34:31,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:34:31,949][36999] Updated weights for policy 0, policy_version 33400 (0.0036) [2024-07-02 14:34:35,570][36999] Updated weights for policy 0, policy_version 33410 (0.0043) [2024-07-02 14:34:36,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43963.9, 300 sec: 43931.3). Total num frames: 547405824. Throughput: 0: 43792.0. Samples: 547518260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:34:36,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:34:36,184][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000033412_547422208.pth... [2024-07-02 14:34:36,238][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000032768_536870912.pth [2024-07-02 14:34:39,448][36999] Updated weights for policy 0, policy_version 33420 (0.0032) [2024-07-02 14:34:41,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 43931.3). Total num frames: 547602432. Throughput: 0: 43796.9. Samples: 547783740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 14:34:41,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:34:42,985][36999] Updated weights for policy 0, policy_version 33430 (0.0040) [2024-07-02 14:34:46,095][36761] Fps is (10 sec: 42597.7, 60 sec: 43963.6, 300 sec: 43931.3). Total num frames: 547831808. Throughput: 0: 43725.3. Samples: 547912140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 14:34:46,096][36761] Avg episode reward: [(0, '1.139')] [2024-07-02 14:34:46,113][36979] Saving new best policy, reward=1.139! [2024-07-02 14:34:47,038][36999] Updated weights for policy 0, policy_version 33440 (0.0036) [2024-07-02 14:34:50,453][36999] Updated weights for policy 0, policy_version 33450 (0.0037) [2024-07-02 14:34:51,095][36761] Fps is (10 sec: 45875.7, 60 sec: 43963.8, 300 sec: 43932.0). Total num frames: 548061184. Throughput: 0: 43802.0. Samples: 548176520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 14:34:51,096][36761] Avg episode reward: [(0, '1.146')] [2024-07-02 14:34:51,096][36979] Saving new best policy, reward=1.146! [2024-07-02 14:34:54,303][36999] Updated weights for policy 0, policy_version 33460 (0.0032) [2024-07-02 14:34:56,095][36761] Fps is (10 sec: 45875.8, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 548290560. Throughput: 0: 43858.2. Samples: 548444240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-07-02 14:34:56,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:34:57,683][36999] Updated weights for policy 0, policy_version 33470 (0.0032) [2024-07-02 14:35:01,095][36761] Fps is (10 sec: 44236.2, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 548503552. Throughput: 0: 43869.4. Samples: 548571720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 14:35:01,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:35:01,825][36999] Updated weights for policy 0, policy_version 33480 (0.0029) [2024-07-02 14:35:05,226][36999] Updated weights for policy 0, policy_version 33490 (0.0039) [2024-07-02 14:35:06,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 548716544. Throughput: 0: 43864.6. Samples: 548837320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 14:35:06,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:35:09,164][36999] Updated weights for policy 0, policy_version 33500 (0.0035) [2024-07-02 14:35:11,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43420.8, 300 sec: 43820.2). Total num frames: 548913152. Throughput: 0: 43900.9. Samples: 549102180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 14:35:11,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:35:12,753][36999] Updated weights for policy 0, policy_version 33510 (0.0026) [2024-07-02 14:35:16,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 549158912. Throughput: 0: 43774.2. Samples: 549226640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 14:35:16,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:35:16,516][36999] Updated weights for policy 0, policy_version 33520 (0.0035) [2024-07-02 14:35:20,142][36999] Updated weights for policy 0, policy_version 33530 (0.0023) [2024-07-02 14:35:21,095][36761] Fps is (10 sec: 47513.6, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 549388288. Throughput: 0: 44014.6. Samples: 549498920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 14:35:21,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:35:23,431][36979] Signal inference workers to stop experience collection... (7850 times) [2024-07-02 14:35:23,431][36979] Signal inference workers to resume experience collection... (7850 times) [2024-07-02 14:35:23,467][36999] InferenceWorker_p0-w0: stopping experience collection (7850 times) [2024-07-02 14:35:23,467][36999] InferenceWorker_p0-w0: resuming experience collection (7850 times) [2024-07-02 14:35:23,862][36999] Updated weights for policy 0, policy_version 33540 (0.0037) [2024-07-02 14:35:26,095][36761] Fps is (10 sec: 44236.3, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 549601280. Throughput: 0: 43994.1. Samples: 549763480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 14:35:26,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:35:27,608][36999] Updated weights for policy 0, policy_version 33550 (0.0034) [2024-07-02 14:35:31,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43963.8, 300 sec: 43931.4). Total num frames: 549814272. Throughput: 0: 43931.2. Samples: 549889040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 14:35:31,096][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 14:35:31,310][36999] Updated weights for policy 0, policy_version 33560 (0.0035) [2024-07-02 14:35:34,955][36999] Updated weights for policy 0, policy_version 33570 (0.0028) [2024-07-02 14:35:36,095][36761] Fps is (10 sec: 45875.4, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 550060032. Throughput: 0: 44013.6. Samples: 550157140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 14:35:36,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:35:38,730][36999] Updated weights for policy 0, policy_version 33580 (0.0040) [2024-07-02 14:35:41,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 550256640. Throughput: 0: 44017.3. Samples: 550425020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-07-02 14:35:41,100][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:35:42,467][36999] Updated weights for policy 0, policy_version 33590 (0.0034) [2024-07-02 14:35:46,095][36761] Fps is (10 sec: 42599.2, 60 sec: 44236.9, 300 sec: 43931.4). Total num frames: 550486016. Throughput: 0: 43965.0. Samples: 550550140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 14:35:46,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:35:46,158][36999] Updated weights for policy 0, policy_version 33600 (0.0036) [2024-07-02 14:35:50,000][36999] Updated weights for policy 0, policy_version 33610 (0.0031) [2024-07-02 14:35:51,095][36761] Fps is (10 sec: 45875.8, 60 sec: 44236.8, 300 sec: 44042.8). Total num frames: 550715392. Throughput: 0: 43890.7. Samples: 550812400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 14:35:51,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:35:53,999][36999] Updated weights for policy 0, policy_version 33620 (0.0028) [2024-07-02 14:35:56,095][36761] Fps is (10 sec: 42597.7, 60 sec: 43690.6, 300 sec: 43931.3). Total num frames: 550912000. Throughput: 0: 43911.5. Samples: 551078200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 14:35:56,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:35:57,418][36999] Updated weights for policy 0, policy_version 33630 (0.0036) [2024-07-02 14:36:01,095][36761] Fps is (10 sec: 40959.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 551124992. Throughput: 0: 43894.2. Samples: 551201880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 14:36:01,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:36:01,667][36999] Updated weights for policy 0, policy_version 33640 (0.0040) [2024-07-02 14:36:04,937][36999] Updated weights for policy 0, policy_version 33650 (0.0028) [2024-07-02 14:36:06,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 551354368. Throughput: 0: 43649.8. Samples: 551463160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-07-02 14:36:06,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:36:09,180][36999] Updated weights for policy 0, policy_version 33660 (0.0044) [2024-07-02 14:36:11,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43963.7, 300 sec: 43820.3). Total num frames: 551550976. Throughput: 0: 43685.9. Samples: 551729340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-07-02 14:36:11,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:36:12,576][36999] Updated weights for policy 0, policy_version 33670 (0.0020) [2024-07-02 14:36:16,096][36761] Fps is (10 sec: 42596.1, 60 sec: 43690.3, 300 sec: 43931.3). Total num frames: 551780352. Throughput: 0: 43742.5. Samples: 551857480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-07-02 14:36:16,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:36:16,598][36999] Updated weights for policy 0, policy_version 33680 (0.0025) [2024-07-02 14:36:20,131][36999] Updated weights for policy 0, policy_version 33690 (0.0036) [2024-07-02 14:36:21,095][36761] Fps is (10 sec: 47513.5, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 552026112. Throughput: 0: 43725.8. Samples: 552124800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-07-02 14:36:21,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:36:23,916][36999] Updated weights for policy 0, policy_version 33700 (0.0039) [2024-07-02 14:36:26,095][36761] Fps is (10 sec: 45877.7, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 552239104. Throughput: 0: 43689.4. Samples: 552391040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-07-02 14:36:26,099][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:36:27,566][36999] Updated weights for policy 0, policy_version 33710 (0.0030) [2024-07-02 14:36:31,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43690.7, 300 sec: 43931.4). Total num frames: 552435712. Throughput: 0: 43761.7. Samples: 552519420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 14:36:31,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:36:31,245][36999] Updated weights for policy 0, policy_version 33720 (0.0035) [2024-07-02 14:36:34,972][36999] Updated weights for policy 0, policy_version 33730 (0.0023) [2024-07-02 14:36:36,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43690.7, 300 sec: 43931.4). Total num frames: 552681472. Throughput: 0: 43835.1. Samples: 552784980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 14:36:36,096][36761] Avg episode reward: [(0, '1.007')] [2024-07-02 14:36:36,103][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000033733_552681472.pth... [2024-07-02 14:36:36,162][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000033088_542113792.pth [2024-07-02 14:36:38,596][36999] Updated weights for policy 0, policy_version 33740 (0.0024) [2024-07-02 14:36:41,100][36761] Fps is (10 sec: 45854.2, 60 sec: 43960.5, 300 sec: 43930.7). Total num frames: 552894464. Throughput: 0: 43752.6. Samples: 553047260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 14:36:41,101][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:36:42,443][36999] Updated weights for policy 0, policy_version 33750 (0.0031) [2024-07-02 14:36:45,310][36979] Signal inference workers to stop experience collection... (7900 times) [2024-07-02 14:36:45,310][36979] Signal inference workers to resume experience collection... (7900 times) [2024-07-02 14:36:45,347][36999] InferenceWorker_p0-w0: stopping experience collection (7900 times) [2024-07-02 14:36:45,348][36999] InferenceWorker_p0-w0: resuming experience collection (7900 times) [2024-07-02 14:36:46,100][36761] Fps is (10 sec: 42578.3, 60 sec: 43687.2, 300 sec: 43930.6). Total num frames: 553107456. Throughput: 0: 43906.5. Samples: 553177880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-07-02 14:36:46,100][36761] Avg episode reward: [(0, '1.037')] [2024-07-02 14:36:46,401][36999] Updated weights for policy 0, policy_version 33760 (0.0032) [2024-07-02 14:36:49,888][36999] Updated weights for policy 0, policy_version 33770 (0.0027) [2024-07-02 14:36:51,095][36761] Fps is (10 sec: 44257.1, 60 sec: 43690.7, 300 sec: 43931.4). Total num frames: 553336832. Throughput: 0: 43952.1. Samples: 553441000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-07-02 14:36:51,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:36:53,828][36999] Updated weights for policy 0, policy_version 33780 (0.0036) [2024-07-02 14:36:56,095][36761] Fps is (10 sec: 42618.2, 60 sec: 43690.7, 300 sec: 43820.2). Total num frames: 553533440. Throughput: 0: 43834.6. Samples: 553701900. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-07-02 14:36:56,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:36:57,356][36999] Updated weights for policy 0, policy_version 33790 (0.0033) [2024-07-02 14:37:01,100][36761] Fps is (10 sec: 42578.9, 60 sec: 43960.4, 300 sec: 43930.7). Total num frames: 553762816. Throughput: 0: 43974.3. Samples: 553836500. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-07-02 14:37:01,101][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 14:37:01,755][36999] Updated weights for policy 0, policy_version 33800 (0.0030) [2024-07-02 14:37:04,971][36999] Updated weights for policy 0, policy_version 33810 (0.0030) [2024-07-02 14:37:06,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 553975808. Throughput: 0: 43811.5. Samples: 554096320. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-07-02 14:37:06,096][36761] Avg episode reward: [(0, '1.036')] [2024-07-02 14:37:09,219][36999] Updated weights for policy 0, policy_version 33820 (0.0030) [2024-07-02 14:37:11,095][36761] Fps is (10 sec: 44256.3, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 554205184. Throughput: 0: 43765.2. Samples: 554360480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-07-02 14:37:11,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:37:12,366][36999] Updated weights for policy 0, policy_version 33830 (0.0033) [2024-07-02 14:37:16,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43691.1, 300 sec: 43931.3). Total num frames: 554401792. Throughput: 0: 43712.8. Samples: 554486500. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-07-02 14:37:16,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:37:16,697][36999] Updated weights for policy 0, policy_version 33840 (0.0041) [2024-07-02 14:37:20,162][36999] Updated weights for policy 0, policy_version 33850 (0.0029) [2024-07-02 14:37:21,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43417.6, 300 sec: 43820.3). Total num frames: 554631168. Throughput: 0: 43590.6. Samples: 554746560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-07-02 14:37:21,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:37:24,013][36999] Updated weights for policy 0, policy_version 33860 (0.0036) [2024-07-02 14:37:26,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 43820.3). Total num frames: 554844160. Throughput: 0: 43813.3. Samples: 555018660. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-07-02 14:37:26,096][36761] Avg episode reward: [(0, '1.042')] [2024-07-02 14:37:27,549][36999] Updated weights for policy 0, policy_version 33870 (0.0032) [2024-07-02 14:37:31,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43690.6, 300 sec: 43931.3). Total num frames: 555057152. Throughput: 0: 43630.3. Samples: 555141040. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-07-02 14:37:31,096][36761] Avg episode reward: [(0, '1.040')] [2024-07-02 14:37:31,386][36999] Updated weights for policy 0, policy_version 33880 (0.0042) [2024-07-02 14:37:34,844][36999] Updated weights for policy 0, policy_version 33890 (0.0038) [2024-07-02 14:37:36,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 43820.3). Total num frames: 555286528. Throughput: 0: 43628.4. Samples: 555404280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 14:37:36,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:37:38,730][36999] Updated weights for policy 0, policy_version 33900 (0.0031) [2024-07-02 14:37:41,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43420.8, 300 sec: 43820.2). Total num frames: 555499520. Throughput: 0: 43767.6. Samples: 555671440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 14:37:41,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:37:42,746][36999] Updated weights for policy 0, policy_version 33910 (0.0041) [2024-07-02 14:37:46,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43694.0, 300 sec: 43931.3). Total num frames: 555728896. Throughput: 0: 43546.1. Samples: 555795880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 14:37:46,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:37:46,251][36999] Updated weights for policy 0, policy_version 33920 (0.0027) [2024-07-02 14:37:50,061][36999] Updated weights for policy 0, policy_version 33930 (0.0034) [2024-07-02 14:37:51,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 555958272. Throughput: 0: 43696.1. Samples: 556062640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 14:37:51,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:37:53,927][36999] Updated weights for policy 0, policy_version 33940 (0.0021) [2024-07-02 14:37:56,095][36761] Fps is (10 sec: 42599.2, 60 sec: 43690.8, 300 sec: 43764.7). Total num frames: 556154880. Throughput: 0: 43686.0. Samples: 556326340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 14:37:56,095][36761] Avg episode reward: [(0, '1.008')] [2024-07-02 14:37:57,265][36979] Signal inference workers to stop experience collection... (7950 times) [2024-07-02 14:37:57,304][36999] InferenceWorker_p0-w0: stopping experience collection (7950 times) [2024-07-02 14:37:57,324][36979] Signal inference workers to resume experience collection... (7950 times) [2024-07-02 14:37:57,326][36999] InferenceWorker_p0-w0: resuming experience collection (7950 times) [2024-07-02 14:37:57,465][36999] Updated weights for policy 0, policy_version 33950 (0.0035) [2024-07-02 14:38:01,096][36761] Fps is (10 sec: 40957.6, 60 sec: 43420.5, 300 sec: 43820.2). Total num frames: 556367872. Throughput: 0: 43760.8. Samples: 556455760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 14:38:01,097][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:38:01,431][36999] Updated weights for policy 0, policy_version 33960 (0.0033) [2024-07-02 14:38:04,890][36999] Updated weights for policy 0, policy_version 33970 (0.0026) [2024-07-02 14:38:06,095][36761] Fps is (10 sec: 45875.6, 60 sec: 43963.9, 300 sec: 43875.8). Total num frames: 556613632. Throughput: 0: 43830.5. Samples: 556718920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 14:38:06,095][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:38:08,824][36999] Updated weights for policy 0, policy_version 33980 (0.0047) [2024-07-02 14:38:11,095][36761] Fps is (10 sec: 45878.0, 60 sec: 43690.8, 300 sec: 43820.3). Total num frames: 556826624. Throughput: 0: 43530.2. Samples: 556977520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 14:38:11,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:38:12,424][36999] Updated weights for policy 0, policy_version 33990 (0.0029) [2024-07-02 14:38:16,095][36761] Fps is (10 sec: 42597.9, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 557039616. Throughput: 0: 43728.5. Samples: 557108820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-07-02 14:38:16,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:38:16,191][36999] Updated weights for policy 0, policy_version 34000 (0.0036) [2024-07-02 14:38:20,179][36999] Updated weights for policy 0, policy_version 34010 (0.0029) [2024-07-02 14:38:21,095][36761] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 557252608. Throughput: 0: 43828.8. Samples: 557376580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 14:38:21,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:38:23,572][36999] Updated weights for policy 0, policy_version 34020 (0.0028) [2024-07-02 14:38:26,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 557465600. Throughput: 0: 43647.2. Samples: 557635560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 14:38:26,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:38:27,622][36999] Updated weights for policy 0, policy_version 34030 (0.0036) [2024-07-02 14:38:31,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 557678592. Throughput: 0: 43738.8. Samples: 557764120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 14:38:31,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:38:31,314][36999] Updated weights for policy 0, policy_version 34040 (0.0028) [2024-07-02 14:38:35,235][36999] Updated weights for policy 0, policy_version 34050 (0.0040) [2024-07-02 14:38:36,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43963.8, 300 sec: 43764.7). Total num frames: 557924352. Throughput: 0: 43773.3. Samples: 558032440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 14:38:36,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:38:36,118][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000034053_557924352.pth... [2024-07-02 14:38:36,171][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000033412_547422208.pth [2024-07-02 14:38:38,823][36999] Updated weights for policy 0, policy_version 34060 (0.0039) [2024-07-02 14:38:41,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43690.8, 300 sec: 43820.3). Total num frames: 558120960. Throughput: 0: 43696.9. Samples: 558292700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 14:38:41,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:38:42,730][36999] Updated weights for policy 0, policy_version 34070 (0.0029) [2024-07-02 14:38:46,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 43820.2). Total num frames: 558350336. Throughput: 0: 43593.9. Samples: 558417460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 14:38:46,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:38:46,238][36999] Updated weights for policy 0, policy_version 34080 (0.0040) [2024-07-02 14:38:50,117][36999] Updated weights for policy 0, policy_version 34090 (0.0033) [2024-07-02 14:38:51,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 43764.7). Total num frames: 558563328. Throughput: 0: 43783.9. Samples: 558689200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 14:38:51,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:38:53,550][36999] Updated weights for policy 0, policy_version 34100 (0.0031) [2024-07-02 14:38:56,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43963.6, 300 sec: 43820.3). Total num frames: 558792704. Throughput: 0: 43840.8. Samples: 558950360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 14:38:56,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 14:38:57,589][36999] Updated weights for policy 0, policy_version 34110 (0.0034) [2024-07-02 14:39:00,888][36999] Updated weights for policy 0, policy_version 34120 (0.0033) [2024-07-02 14:39:01,095][36761] Fps is (10 sec: 45875.1, 60 sec: 44237.3, 300 sec: 43875.8). Total num frames: 559022080. Throughput: 0: 43895.1. Samples: 559084100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 14:39:01,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 14:39:05,052][36999] Updated weights for policy 0, policy_version 34130 (0.0041) [2024-07-02 14:39:06,095][36761] Fps is (10 sec: 42599.2, 60 sec: 43417.5, 300 sec: 43765.4). Total num frames: 559218688. Throughput: 0: 43851.3. Samples: 559349880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 14:39:06,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:39:08,575][36999] Updated weights for policy 0, policy_version 34140 (0.0026) [2024-07-02 14:39:11,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43690.6, 300 sec: 43820.3). Total num frames: 559448064. Throughput: 0: 43869.2. Samples: 559609680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 14:39:11,096][36761] Avg episode reward: [(0, '1.043')] [2024-07-02 14:39:12,701][36999] Updated weights for policy 0, policy_version 34150 (0.0042) [2024-07-02 14:39:16,081][36999] Updated weights for policy 0, policy_version 34160 (0.0032) [2024-07-02 14:39:16,095][36761] Fps is (10 sec: 45874.6, 60 sec: 43963.6, 300 sec: 43820.3). Total num frames: 559677440. Throughput: 0: 43860.4. Samples: 559737840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 14:39:16,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:39:20,030][36999] Updated weights for policy 0, policy_version 34170 (0.0033) [2024-07-02 14:39:21,095][36761] Fps is (10 sec: 44237.5, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 559890432. Throughput: 0: 43839.2. Samples: 560005200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-07-02 14:39:21,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:39:23,632][36999] Updated weights for policy 0, policy_version 34180 (0.0037) [2024-07-02 14:39:26,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 560103424. Throughput: 0: 43913.3. Samples: 560268800. Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-07-02 14:39:26,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:39:27,434][36999] Updated weights for policy 0, policy_version 34190 (0.0027) [2024-07-02 14:39:27,892][36979] Signal inference workers to stop experience collection... (8000 times) [2024-07-02 14:39:27,892][36979] Signal inference workers to resume experience collection... (8000 times) [2024-07-02 14:39:27,924][36999] InferenceWorker_p0-w0: stopping experience collection (8000 times) [2024-07-02 14:39:27,924][36999] InferenceWorker_p0-w0: resuming experience collection (8000 times) [2024-07-02 14:39:30,922][36999] Updated weights for policy 0, policy_version 34200 (0.0040) [2024-07-02 14:39:31,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 43820.3). Total num frames: 560332800. Throughput: 0: 44082.8. Samples: 560401180. Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-07-02 14:39:31,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:39:34,630][36999] Updated weights for policy 0, policy_version 34210 (0.0034) [2024-07-02 14:39:36,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 560545792. Throughput: 0: 43920.4. Samples: 560665620. Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-07-02 14:39:36,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:39:38,673][36999] Updated weights for policy 0, policy_version 34220 (0.0031) [2024-07-02 14:39:41,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 560775168. Throughput: 0: 44001.0. Samples: 560930400. Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-07-02 14:39:41,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:39:42,005][36999] Updated weights for policy 0, policy_version 34230 (0.0028) [2024-07-02 14:39:46,020][36999] Updated weights for policy 0, policy_version 34240 (0.0042) [2024-07-02 14:39:46,096][36761] Fps is (10 sec: 44235.9, 60 sec: 43963.6, 300 sec: 43820.2). Total num frames: 560988160. Throughput: 0: 43965.1. Samples: 561062540. Policy #0 lag: (min: 2.0, avg: 11.3, max: 26.0) [2024-07-02 14:39:46,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:39:49,973][36999] Updated weights for policy 0, policy_version 34250 (0.0028) [2024-07-02 14:39:51,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43963.7, 300 sec: 43764.7). Total num frames: 561201152. Throughput: 0: 43933.7. Samples: 561326900. Policy #0 lag: (min: 2.0, avg: 11.3, max: 26.0) [2024-07-02 14:39:51,099][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:39:53,414][36999] Updated weights for policy 0, policy_version 34260 (0.0031) [2024-07-02 14:39:56,095][36761] Fps is (10 sec: 42599.9, 60 sec: 43690.8, 300 sec: 43764.7). Total num frames: 561414144. Throughput: 0: 43937.6. Samples: 561586860. Policy #0 lag: (min: 2.0, avg: 11.3, max: 26.0) [2024-07-02 14:39:56,095][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:39:57,259][36999] Updated weights for policy 0, policy_version 34270 (0.0036) [2024-07-02 14:40:00,752][36999] Updated weights for policy 0, policy_version 34280 (0.0033) [2024-07-02 14:40:01,100][36761] Fps is (10 sec: 45854.2, 60 sec: 43960.3, 300 sec: 43875.1). Total num frames: 561659904. Throughput: 0: 44065.3. Samples: 561720980. Policy #0 lag: (min: 2.0, avg: 11.3, max: 26.0) [2024-07-02 14:40:01,101][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:40:04,683][36999] Updated weights for policy 0, policy_version 34290 (0.0026) [2024-07-02 14:40:06,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 561856512. Throughput: 0: 43913.8. Samples: 561981320. Policy #0 lag: (min: 2.0, avg: 11.3, max: 26.0) [2024-07-02 14:40:06,095][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:40:08,045][36999] Updated weights for policy 0, policy_version 34300 (0.0038) [2024-07-02 14:40:11,095][36761] Fps is (10 sec: 40978.8, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 562069504. Throughput: 0: 43959.0. Samples: 562246960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 14:40:11,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:40:11,968][36999] Updated weights for policy 0, policy_version 34310 (0.0035) [2024-07-02 14:40:15,619][36999] Updated weights for policy 0, policy_version 34320 (0.0026) [2024-07-02 14:40:16,100][36761] Fps is (10 sec: 45853.9, 60 sec: 43960.5, 300 sec: 43819.6). Total num frames: 562315264. Throughput: 0: 43893.7. Samples: 562376600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 14:40:16,100][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:40:19,606][36999] Updated weights for policy 0, policy_version 34330 (0.0022) [2024-07-02 14:40:21,095][36761] Fps is (10 sec: 45875.5, 60 sec: 43963.7, 300 sec: 43820.3). Total num frames: 562528256. Throughput: 0: 43938.7. Samples: 562642860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 14:40:21,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:40:23,190][36999] Updated weights for policy 0, policy_version 34340 (0.0037) [2024-07-02 14:40:26,095][36761] Fps is (10 sec: 42617.3, 60 sec: 43963.6, 300 sec: 43820.2). Total num frames: 562741248. Throughput: 0: 43935.9. Samples: 562907520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 14:40:26,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:40:27,085][36999] Updated weights for policy 0, policy_version 34350 (0.0041) [2024-07-02 14:40:30,564][36999] Updated weights for policy 0, policy_version 34360 (0.0029) [2024-07-02 14:40:31,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.7, 300 sec: 43764.7). Total num frames: 562970624. Throughput: 0: 43805.6. Samples: 563033780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:40:31,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:40:34,329][36999] Updated weights for policy 0, policy_version 34370 (0.0031) [2024-07-02 14:40:36,095][36761] Fps is (10 sec: 44237.3, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 563183616. Throughput: 0: 43837.8. Samples: 563299600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:40:36,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:40:36,151][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000034375_563200000.pth... [2024-07-02 14:40:36,205][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000033733_552681472.pth [2024-07-02 14:40:38,118][36999] Updated weights for policy 0, policy_version 34380 (0.0030) [2024-07-02 14:40:41,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 563396608. Throughput: 0: 44010.6. Samples: 563567340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:40:41,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:40:41,666][36999] Updated weights for policy 0, policy_version 34390 (0.0049) [2024-07-02 14:40:45,390][36999] Updated weights for policy 0, policy_version 34400 (0.0031) [2024-07-02 14:40:46,095][36761] Fps is (10 sec: 45875.5, 60 sec: 44237.0, 300 sec: 43820.3). Total num frames: 563642368. Throughput: 0: 43879.2. Samples: 563695340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:40:46,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:40:49,085][36999] Updated weights for policy 0, policy_version 34410 (0.0038) [2024-07-02 14:40:50,787][36979] Signal inference workers to stop experience collection... (8050 times) [2024-07-02 14:40:50,787][36979] Signal inference workers to resume experience collection... (8050 times) [2024-07-02 14:40:50,813][36999] InferenceWorker_p0-w0: stopping experience collection (8050 times) [2024-07-02 14:40:50,813][36999] InferenceWorker_p0-w0: resuming experience collection (8050 times) [2024-07-02 14:40:51,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.8, 300 sec: 43820.3). Total num frames: 563838976. Throughput: 0: 44039.9. Samples: 563963120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:40:51,096][36761] Avg episode reward: [(0, '1.037')] [2024-07-02 14:40:52,667][36999] Updated weights for policy 0, policy_version 34420 (0.0040) [2024-07-02 14:40:56,095][36761] Fps is (10 sec: 42598.1, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 564068352. Throughput: 0: 43992.0. Samples: 564226600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:40:56,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:40:56,799][36999] Updated weights for policy 0, policy_version 34430 (0.0029) [2024-07-02 14:41:00,086][36999] Updated weights for policy 0, policy_version 34440 (0.0022) [2024-07-02 14:41:01,095][36761] Fps is (10 sec: 44236.3, 60 sec: 43694.0, 300 sec: 43820.3). Total num frames: 564281344. Throughput: 0: 44050.2. Samples: 564358660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:41:01,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:41:04,232][36999] Updated weights for policy 0, policy_version 34450 (0.0039) [2024-07-02 14:41:06,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 564510720. Throughput: 0: 44171.1. Samples: 564630560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:41:06,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:41:07,389][36999] Updated weights for policy 0, policy_version 34460 (0.0022) [2024-07-02 14:41:11,095][36761] Fps is (10 sec: 44237.1, 60 sec: 44236.8, 300 sec: 43875.9). Total num frames: 564723712. Throughput: 0: 44024.5. Samples: 564888620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:41:11,096][36761] Avg episode reward: [(0, '1.039')] [2024-07-02 14:41:11,676][36999] Updated weights for policy 0, policy_version 34470 (0.0030) [2024-07-02 14:41:15,231][36999] Updated weights for policy 0, policy_version 34480 (0.0039) [2024-07-02 14:41:16,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43694.0, 300 sec: 43764.7). Total num frames: 564936704. Throughput: 0: 44128.0. Samples: 565019540. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-07-02 14:41:16,096][36761] Avg episode reward: [(0, '1.041')] [2024-07-02 14:41:19,198][36999] Updated weights for policy 0, policy_version 34490 (0.0038) [2024-07-02 14:41:21,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43963.7, 300 sec: 43820.3). Total num frames: 565166080. Throughput: 0: 44144.4. Samples: 565286100. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-07-02 14:41:21,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:41:22,682][36999] Updated weights for policy 0, policy_version 34500 (0.0038) [2024-07-02 14:41:26,095][36761] Fps is (10 sec: 44236.5, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 565379072. Throughput: 0: 43963.0. Samples: 565545680. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-07-02 14:41:26,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:41:26,750][36999] Updated weights for policy 0, policy_version 34510 (0.0030) [2024-07-02 14:41:30,216][36999] Updated weights for policy 0, policy_version 34520 (0.0027) [2024-07-02 14:41:31,095][36761] Fps is (10 sec: 42599.0, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 565592064. Throughput: 0: 43994.2. Samples: 565675080. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-07-02 14:41:31,095][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:41:34,329][36999] Updated weights for policy 0, policy_version 34530 (0.0034) [2024-07-02 14:41:36,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43963.7, 300 sec: 43820.9). Total num frames: 565821440. Throughput: 0: 43914.5. Samples: 565939280. Policy #0 lag: (min: 2.0, avg: 10.6, max: 20.0) [2024-07-02 14:41:36,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:41:37,679][36999] Updated weights for policy 0, policy_version 34540 (0.0041) [2024-07-02 14:41:41,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43963.7, 300 sec: 43821.0). Total num frames: 566034432. Throughput: 0: 43752.0. Samples: 566195440. Policy #0 lag: (min: 2.0, avg: 10.6, max: 20.0) [2024-07-02 14:41:41,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:41:41,708][36999] Updated weights for policy 0, policy_version 34550 (0.0026) [2024-07-02 14:41:45,115][36999] Updated weights for policy 0, policy_version 34560 (0.0023) [2024-07-02 14:41:46,095][36761] Fps is (10 sec: 44237.5, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 566263808. Throughput: 0: 43841.0. Samples: 566331500. Policy #0 lag: (min: 2.0, avg: 10.6, max: 20.0) [2024-07-02 14:41:46,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 14:41:49,087][36999] Updated weights for policy 0, policy_version 34570 (0.0024) [2024-07-02 14:41:51,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 566476800. Throughput: 0: 43685.8. Samples: 566596420. Policy #0 lag: (min: 2.0, avg: 10.6, max: 20.0) [2024-07-02 14:41:51,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:41:52,454][36999] Updated weights for policy 0, policy_version 34580 (0.0023) [2024-07-02 14:41:56,095][36761] Fps is (10 sec: 42597.7, 60 sec: 43690.6, 300 sec: 43820.9). Total num frames: 566689792. Throughput: 0: 43828.4. Samples: 566860900. Policy #0 lag: (min: 2.0, avg: 10.6, max: 20.0) [2024-07-02 14:41:56,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:41:56,702][36999] Updated weights for policy 0, policy_version 34590 (0.0034) [2024-07-02 14:41:59,867][36999] Updated weights for policy 0, policy_version 34600 (0.0034) [2024-07-02 14:42:01,095][36761] Fps is (10 sec: 45874.8, 60 sec: 44236.8, 300 sec: 43931.3). Total num frames: 566935552. Throughput: 0: 43843.5. Samples: 566992500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-07-02 14:42:01,096][36761] Avg episode reward: [(0, '1.045')] [2024-07-02 14:42:02,224][36979] Signal inference workers to stop experience collection... (8100 times) [2024-07-02 14:42:02,224][36979] Signal inference workers to resume experience collection... (8100 times) [2024-07-02 14:42:02,265][36999] InferenceWorker_p0-w0: stopping experience collection (8100 times) [2024-07-02 14:42:02,265][36999] InferenceWorker_p0-w0: resuming experience collection (8100 times) [2024-07-02 14:42:04,042][36999] Updated weights for policy 0, policy_version 34610 (0.0034) [2024-07-02 14:42:06,095][36761] Fps is (10 sec: 44237.4, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 567132160. Throughput: 0: 43788.5. Samples: 567256580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-07-02 14:42:06,095][36761] Avg episode reward: [(0, '0.985')] [2024-07-02 14:42:07,559][36999] Updated weights for policy 0, policy_version 34620 (0.0023) [2024-07-02 14:42:11,095][36761] Fps is (10 sec: 40960.3, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 567345152. Throughput: 0: 43700.5. Samples: 567512200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-07-02 14:42:11,097][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:42:11,586][36999] Updated weights for policy 0, policy_version 34630 (0.0038) [2024-07-02 14:42:15,078][36999] Updated weights for policy 0, policy_version 34640 (0.0037) [2024-07-02 14:42:16,095][36761] Fps is (10 sec: 45874.9, 60 sec: 44236.8, 300 sec: 43931.3). Total num frames: 567590912. Throughput: 0: 43804.8. Samples: 567646300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-07-02 14:42:16,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:42:18,860][36999] Updated weights for policy 0, policy_version 34650 (0.0033) [2024-07-02 14:42:21,100][36761] Fps is (10 sec: 45854.3, 60 sec: 43960.4, 300 sec: 43930.7). Total num frames: 567803904. Throughput: 0: 43920.5. Samples: 567915900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:42:21,100][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:42:22,330][36999] Updated weights for policy 0, policy_version 34660 (0.0029) [2024-07-02 14:42:26,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 568016896. Throughput: 0: 44104.4. Samples: 568180140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:42:26,096][36761] Avg episode reward: [(0, '1.038')] [2024-07-02 14:42:26,590][36999] Updated weights for policy 0, policy_version 34670 (0.0044) [2024-07-02 14:42:29,823][36999] Updated weights for policy 0, policy_version 34680 (0.0033) [2024-07-02 14:42:31,095][36761] Fps is (10 sec: 44257.0, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 568246272. Throughput: 0: 44005.7. Samples: 568311760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:42:31,096][36761] Avg episode reward: [(0, '1.038')] [2024-07-02 14:42:33,910][36999] Updated weights for policy 0, policy_version 34690 (0.0026) [2024-07-02 14:42:36,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 568459264. Throughput: 0: 43984.0. Samples: 568575700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:42:36,096][36761] Avg episode reward: [(0, '1.038')] [2024-07-02 14:42:36,110][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000034696_568459264.pth... [2024-07-02 14:42:36,183][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000034053_557924352.pth [2024-07-02 14:42:37,105][36999] Updated weights for policy 0, policy_version 34700 (0.0031) [2024-07-02 14:42:41,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 568655872. Throughput: 0: 43985.9. Samples: 568840260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:42:41,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:42:41,248][36999] Updated weights for policy 0, policy_version 34710 (0.0039) [2024-07-02 14:42:44,646][36999] Updated weights for policy 0, policy_version 34720 (0.0029) [2024-07-02 14:42:46,095][36761] Fps is (10 sec: 44236.1, 60 sec: 43963.6, 300 sec: 43875.8). Total num frames: 568901632. Throughput: 0: 44033.3. Samples: 568974000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 14:42:46,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:42:48,634][36999] Updated weights for policy 0, policy_version 34730 (0.0025) [2024-07-02 14:42:51,095][36761] Fps is (10 sec: 45875.8, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 569114624. Throughput: 0: 43981.0. Samples: 569235720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 14:42:51,096][36761] Avg episode reward: [(0, '1.036')] [2024-07-02 14:42:52,061][36999] Updated weights for policy 0, policy_version 34740 (0.0034) [2024-07-02 14:42:56,055][36999] Updated weights for policy 0, policy_version 34750 (0.0025) [2024-07-02 14:42:56,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44236.8, 300 sec: 43987.0). Total num frames: 569344000. Throughput: 0: 44270.6. Samples: 569504380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 14:42:56,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:42:59,530][36999] Updated weights for policy 0, policy_version 34760 (0.0035) [2024-07-02 14:43:01,095][36761] Fps is (10 sec: 44236.3, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 569556992. Throughput: 0: 44145.8. Samples: 569632860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 14:43:01,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 14:43:03,598][36999] Updated weights for policy 0, policy_version 34770 (0.0035) [2024-07-02 14:43:06,095][36761] Fps is (10 sec: 44236.7, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 569786368. Throughput: 0: 44068.4. Samples: 569898780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:43:06,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:43:06,977][36999] Updated weights for policy 0, policy_version 34780 (0.0043) [2024-07-02 14:43:11,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43963.8, 300 sec: 43875.8). Total num frames: 569982976. Throughput: 0: 44167.2. Samples: 570167660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:43:11,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:43:11,168][36999] Updated weights for policy 0, policy_version 34790 (0.0027) [2024-07-02 14:43:14,345][36999] Updated weights for policy 0, policy_version 34800 (0.0025) [2024-07-02 14:43:16,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 43931.4). Total num frames: 570212352. Throughput: 0: 44060.0. Samples: 570294460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:43:16,095][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:43:18,505][36999] Updated weights for policy 0, policy_version 34810 (0.0034) [2024-07-02 14:43:21,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43694.0, 300 sec: 43931.3). Total num frames: 570425344. Throughput: 0: 43968.9. Samples: 570554300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:43:21,095][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:43:21,894][36999] Updated weights for policy 0, policy_version 34820 (0.0029) [2024-07-02 14:43:25,471][36979] Signal inference workers to stop experience collection... (8150 times) [2024-07-02 14:43:25,507][36999] InferenceWorker_p0-w0: stopping experience collection (8150 times) [2024-07-02 14:43:25,587][36979] Signal inference workers to resume experience collection... (8150 times) [2024-07-02 14:43:25,587][36999] InferenceWorker_p0-w0: resuming experience collection (8150 times) [2024-07-02 14:43:25,724][36999] Updated weights for policy 0, policy_version 34830 (0.0027) [2024-07-02 14:43:26,095][36761] Fps is (10 sec: 44236.1, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 570654720. Throughput: 0: 44009.7. Samples: 570820700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:43:26,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:43:29,396][36999] Updated weights for policy 0, policy_version 34840 (0.0042) [2024-07-02 14:43:31,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 570867712. Throughput: 0: 44087.7. Samples: 570957940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:43:31,096][36761] Avg episode reward: [(0, '1.044')] [2024-07-02 14:43:33,695][36999] Updated weights for policy 0, policy_version 34850 (0.0030) [2024-07-02 14:43:36,096][36761] Fps is (10 sec: 42597.5, 60 sec: 43690.4, 300 sec: 43931.3). Total num frames: 571080704. Throughput: 0: 43953.4. Samples: 571213640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:43:36,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:43:36,867][36999] Updated weights for policy 0, policy_version 34860 (0.0030) [2024-07-02 14:43:41,050][36999] Updated weights for policy 0, policy_version 34870 (0.0031) [2024-07-02 14:43:41,095][36761] Fps is (10 sec: 44236.9, 60 sec: 44236.8, 300 sec: 43931.4). Total num frames: 571310080. Throughput: 0: 44037.0. Samples: 571486040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:43:41,095][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:43:44,139][36999] Updated weights for policy 0, policy_version 34880 (0.0040) [2024-07-02 14:43:46,095][36761] Fps is (10 sec: 47515.3, 60 sec: 44236.9, 300 sec: 44042.4). Total num frames: 571555840. Throughput: 0: 44048.0. Samples: 571615020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:43:46,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:43:48,273][36999] Updated weights for policy 0, policy_version 34890 (0.0040) [2024-07-02 14:43:51,095][36761] Fps is (10 sec: 45874.6, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 571768832. Throughput: 0: 43943.5. Samples: 571876240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-07-02 14:43:51,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:43:51,413][36999] Updated weights for policy 0, policy_version 34900 (0.0040) [2024-07-02 14:43:55,568][36999] Updated weights for policy 0, policy_version 34910 (0.0025) [2024-07-02 14:43:56,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 571965440. Throughput: 0: 43980.0. Samples: 572146760. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-07-02 14:43:56,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:43:59,173][36999] Updated weights for policy 0, policy_version 34920 (0.0026) [2024-07-02 14:44:01,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 572211200. Throughput: 0: 44143.5. Samples: 572280920. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-07-02 14:44:01,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:44:02,906][36999] Updated weights for policy 0, policy_version 34930 (0.0032) [2024-07-02 14:44:06,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 572424192. Throughput: 0: 44190.7. Samples: 572542880. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-07-02 14:44:06,095][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:44:06,447][36999] Updated weights for policy 0, policy_version 34940 (0.0032) [2024-07-02 14:44:10,210][36999] Updated weights for policy 0, policy_version 34950 (0.0029) [2024-07-02 14:44:11,095][36761] Fps is (10 sec: 44236.7, 60 sec: 44509.8, 300 sec: 43986.9). Total num frames: 572653568. Throughput: 0: 44233.9. Samples: 572811220. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-07-02 14:44:11,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:44:13,830][36999] Updated weights for policy 0, policy_version 34960 (0.0028) [2024-07-02 14:44:16,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 572866560. Throughput: 0: 44057.4. Samples: 572940520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-07-02 14:44:16,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:44:17,624][36999] Updated weights for policy 0, policy_version 34970 (0.0040) [2024-07-02 14:44:21,095][36761] Fps is (10 sec: 44236.1, 60 sec: 44509.7, 300 sec: 44042.4). Total num frames: 573095936. Throughput: 0: 44178.8. Samples: 573201680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-07-02 14:44:21,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:44:21,349][36999] Updated weights for policy 0, policy_version 34980 (0.0042) [2024-07-02 14:44:24,913][36999] Updated weights for policy 0, policy_version 34990 (0.0032) [2024-07-02 14:44:26,095][36761] Fps is (10 sec: 45875.1, 60 sec: 44510.0, 300 sec: 44042.4). Total num frames: 573325312. Throughput: 0: 44128.9. Samples: 573471840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-07-02 14:44:26,096][36761] Avg episode reward: [(0, '1.010')] [2024-07-02 14:44:29,012][36999] Updated weights for policy 0, policy_version 35000 (0.0030) [2024-07-02 14:44:31,095][36761] Fps is (10 sec: 44237.3, 60 sec: 44509.8, 300 sec: 44042.4). Total num frames: 573538304. Throughput: 0: 44338.2. Samples: 573610240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-07-02 14:44:31,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:44:32,206][36999] Updated weights for policy 0, policy_version 35010 (0.0030) [2024-07-02 14:44:36,095][36761] Fps is (10 sec: 42598.0, 60 sec: 44510.1, 300 sec: 43986.9). Total num frames: 573751296. Throughput: 0: 44264.9. Samples: 573868160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 14:44:36,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:44:36,110][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000035019_573751296.pth... [2024-07-02 14:44:36,167][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000034375_563200000.pth [2024-07-02 14:44:36,382][36999] Updated weights for policy 0, policy_version 35020 (0.0026) [2024-07-02 14:44:39,875][36999] Updated weights for policy 0, policy_version 35030 (0.0022) [2024-07-02 14:44:41,100][36761] Fps is (10 sec: 44216.6, 60 sec: 44506.4, 300 sec: 44041.8). Total num frames: 573980672. Throughput: 0: 44144.3. Samples: 574133460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 14:44:41,101][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:44:43,765][36999] Updated weights for policy 0, policy_version 35040 (0.0027) [2024-07-02 14:44:46,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 574193664. Throughput: 0: 44158.2. Samples: 574268040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 14:44:46,096][36761] Avg episode reward: [(0, '1.037')] [2024-07-02 14:44:47,225][36999] Updated weights for policy 0, policy_version 35050 (0.0030) [2024-07-02 14:44:51,095][36761] Fps is (10 sec: 42617.7, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 574406656. Throughput: 0: 44175.9. Samples: 574530800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 14:44:51,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:44:51,101][36999] Updated weights for policy 0, policy_version 35060 (0.0026) [2024-07-02 14:44:54,798][36999] Updated weights for policy 0, policy_version 35070 (0.0026) [2024-07-02 14:44:56,098][36761] Fps is (10 sec: 44225.8, 60 sec: 44508.0, 300 sec: 43987.2). Total num frames: 574636032. Throughput: 0: 44077.1. Samples: 574794800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 14:44:56,098][36761] Avg episode reward: [(0, '1.038')] [2024-07-02 14:44:58,289][36979] Signal inference workers to stop experience collection... (8200 times) [2024-07-02 14:44:58,336][36979] Signal inference workers to resume experience collection... (8200 times) [2024-07-02 14:44:58,337][36999] InferenceWorker_p0-w0: stopping experience collection (8200 times) [2024-07-02 14:44:58,364][36999] InferenceWorker_p0-w0: resuming experience collection (8200 times) [2024-07-02 14:44:58,485][36999] Updated weights for policy 0, policy_version 35080 (0.0047) [2024-07-02 14:45:01,095][36761] Fps is (10 sec: 44237.3, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 574849024. Throughput: 0: 44121.8. Samples: 574926000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 14:45:01,096][36761] Avg episode reward: [(0, '1.037')] [2024-07-02 14:45:02,460][36999] Updated weights for policy 0, policy_version 35090 (0.0040) [2024-07-02 14:45:05,770][36999] Updated weights for policy 0, policy_version 35100 (0.0020) [2024-07-02 14:45:06,095][36761] Fps is (10 sec: 44247.7, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 575078400. Throughput: 0: 44238.3. Samples: 575192400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 14:45:06,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:45:09,929][36999] Updated weights for policy 0, policy_version 35110 (0.0037) [2024-07-02 14:45:11,095][36761] Fps is (10 sec: 47512.7, 60 sec: 44509.8, 300 sec: 44098.6). Total num frames: 575324160. Throughput: 0: 43998.5. Samples: 575451780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 14:45:11,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:45:13,399][36999] Updated weights for policy 0, policy_version 35120 (0.0036) [2024-07-02 14:45:16,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 575504384. Throughput: 0: 43902.2. Samples: 575585840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 14:45:16,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:45:17,274][36999] Updated weights for policy 0, policy_version 35130 (0.0031) [2024-07-02 14:45:20,853][36999] Updated weights for policy 0, policy_version 35140 (0.0042) [2024-07-02 14:45:21,096][36761] Fps is (10 sec: 40959.8, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 575733760. Throughput: 0: 44099.0. Samples: 575852620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-07-02 14:45:21,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:45:24,755][36999] Updated weights for policy 0, policy_version 35150 (0.0026) [2024-07-02 14:45:26,095][36761] Fps is (10 sec: 47513.6, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 575979520. Throughput: 0: 43753.8. Samples: 576102180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-07-02 14:45:26,096][36761] Avg episode reward: [(0, '1.037')] [2024-07-02 14:45:28,421][36999] Updated weights for policy 0, policy_version 35160 (0.0032) [2024-07-02 14:45:31,095][36761] Fps is (10 sec: 42599.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 576159744. Throughput: 0: 43933.0. Samples: 576245020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-07-02 14:45:31,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:45:32,027][36999] Updated weights for policy 0, policy_version 35170 (0.0031) [2024-07-02 14:45:35,683][36999] Updated weights for policy 0, policy_version 35180 (0.0026) [2024-07-02 14:45:36,095][36761] Fps is (10 sec: 40960.2, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 576389120. Throughput: 0: 44129.9. Samples: 576516640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-07-02 14:45:36,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:45:39,398][36999] Updated weights for policy 0, policy_version 35190 (0.0023) [2024-07-02 14:45:41,095][36761] Fps is (10 sec: 47513.5, 60 sec: 44240.2, 300 sec: 44042.4). Total num frames: 576634880. Throughput: 0: 43918.0. Samples: 576771000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-07-02 14:45:41,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:45:43,551][36999] Updated weights for policy 0, policy_version 35200 (0.0028) [2024-07-02 14:45:46,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 576815104. Throughput: 0: 44127.9. Samples: 576911760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:45:46,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:45:46,857][36999] Updated weights for policy 0, policy_version 35210 (0.0036) [2024-07-02 14:45:50,962][36999] Updated weights for policy 0, policy_version 35220 (0.0037) [2024-07-02 14:45:51,100][36761] Fps is (10 sec: 42578.9, 60 sec: 44233.5, 300 sec: 44041.7). Total num frames: 577060864. Throughput: 0: 44052.5. Samples: 577174960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:45:51,100][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:45:54,399][36999] Updated weights for policy 0, policy_version 35230 (0.0039) [2024-07-02 14:45:56,095][36761] Fps is (10 sec: 47513.3, 60 sec: 44238.6, 300 sec: 44098.0). Total num frames: 577290240. Throughput: 0: 44002.7. Samples: 577431900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:45:56,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:45:58,353][36999] Updated weights for policy 0, policy_version 35240 (0.0041) [2024-07-02 14:46:01,095][36761] Fps is (10 sec: 40978.5, 60 sec: 43690.6, 300 sec: 43931.3). Total num frames: 577470464. Throughput: 0: 44163.1. Samples: 577573180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-07-02 14:46:01,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:46:01,789][36999] Updated weights for policy 0, policy_version 35250 (0.0030) [2024-07-02 14:46:05,763][36999] Updated weights for policy 0, policy_version 35260 (0.0041) [2024-07-02 14:46:06,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 577699840. Throughput: 0: 44012.6. Samples: 577833180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 14:46:06,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:46:08,782][36979] Signal inference workers to stop experience collection... (8250 times) [2024-07-02 14:46:08,810][36999] InferenceWorker_p0-w0: stopping experience collection (8250 times) [2024-07-02 14:46:08,834][36979] Signal inference workers to resume experience collection... (8250 times) [2024-07-02 14:46:08,835][36999] InferenceWorker_p0-w0: resuming experience collection (8250 times) [2024-07-02 14:46:09,113][36999] Updated weights for policy 0, policy_version 35270 (0.0035) [2024-07-02 14:46:11,095][36761] Fps is (10 sec: 49152.2, 60 sec: 43963.8, 300 sec: 44153.5). Total num frames: 577961984. Throughput: 0: 44313.4. Samples: 578096280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 14:46:11,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:46:13,059][36999] Updated weights for policy 0, policy_version 35280 (0.0030) [2024-07-02 14:46:16,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 578142208. Throughput: 0: 44217.3. Samples: 578234800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 14:46:16,096][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 14:46:16,719][36999] Updated weights for policy 0, policy_version 35290 (0.0034) [2024-07-02 14:46:20,645][36999] Updated weights for policy 0, policy_version 35300 (0.0022) [2024-07-02 14:46:21,095][36761] Fps is (10 sec: 40960.5, 60 sec: 43964.0, 300 sec: 44042.4). Total num frames: 578371584. Throughput: 0: 43998.3. Samples: 578496560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 14:46:21,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:46:24,226][36999] Updated weights for policy 0, policy_version 35310 (0.0034) [2024-07-02 14:46:26,095][36761] Fps is (10 sec: 47513.7, 60 sec: 43963.8, 300 sec: 44153.5). Total num frames: 578617344. Throughput: 0: 44121.4. Samples: 578756460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-07-02 14:46:26,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:46:28,367][36999] Updated weights for policy 0, policy_version 35320 (0.0023) [2024-07-02 14:46:31,095][36761] Fps is (10 sec: 44236.4, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 578813952. Throughput: 0: 43945.8. Samples: 578889320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 14:46:31,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:46:31,599][36999] Updated weights for policy 0, policy_version 35330 (0.0027) [2024-07-02 14:46:35,643][36999] Updated weights for policy 0, policy_version 35340 (0.0030) [2024-07-02 14:46:36,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 579026944. Throughput: 0: 44037.8. Samples: 579156460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 14:46:36,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:46:36,135][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000035342_579043328.pth... [2024-07-02 14:46:36,194][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000034696_568459264.pth [2024-07-02 14:46:39,202][36999] Updated weights for policy 0, policy_version 35350 (0.0034) [2024-07-02 14:46:41,096][36761] Fps is (10 sec: 45874.4, 60 sec: 43963.6, 300 sec: 44097.9). Total num frames: 579272704. Throughput: 0: 44048.3. Samples: 579414080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 14:46:41,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:46:43,080][36999] Updated weights for policy 0, policy_version 35360 (0.0037) [2024-07-02 14:46:46,095][36761] Fps is (10 sec: 44236.1, 60 sec: 44236.7, 300 sec: 44042.4). Total num frames: 579469312. Throughput: 0: 43947.0. Samples: 579550800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-07-02 14:46:46,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:46:46,497][36999] Updated weights for policy 0, policy_version 35370 (0.0039) [2024-07-02 14:46:50,463][36999] Updated weights for policy 0, policy_version 35380 (0.0024) [2024-07-02 14:46:51,095][36761] Fps is (10 sec: 40960.9, 60 sec: 43694.0, 300 sec: 44042.4). Total num frames: 579682304. Throughput: 0: 44072.1. Samples: 579816420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:46:51,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:46:53,959][36999] Updated weights for policy 0, policy_version 35390 (0.0025) [2024-07-02 14:46:56,095][36761] Fps is (10 sec: 45876.0, 60 sec: 43963.8, 300 sec: 44042.4). Total num frames: 579928064. Throughput: 0: 43877.8. Samples: 580070780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:46:56,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:46:57,876][36999] Updated weights for policy 0, policy_version 35400 (0.0037) [2024-07-02 14:47:01,095][36761] Fps is (10 sec: 44236.5, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 580124672. Throughput: 0: 43854.7. Samples: 580208260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:47:01,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:47:01,519][36999] Updated weights for policy 0, policy_version 35410 (0.0042) [2024-07-02 14:47:05,321][36999] Updated weights for policy 0, policy_version 35420 (0.0035) [2024-07-02 14:47:06,096][36761] Fps is (10 sec: 40959.2, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 580337664. Throughput: 0: 43955.3. Samples: 580474560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:47:06,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:47:08,923][36999] Updated weights for policy 0, policy_version 35430 (0.0031) [2024-07-02 14:47:11,095][36761] Fps is (10 sec: 47514.5, 60 sec: 43963.9, 300 sec: 44098.0). Total num frames: 580599808. Throughput: 0: 43737.0. Samples: 580724620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-07-02 14:47:11,095][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:47:12,762][36999] Updated weights for policy 0, policy_version 35440 (0.0030) [2024-07-02 14:47:16,095][36761] Fps is (10 sec: 44237.8, 60 sec: 43963.8, 300 sec: 43987.6). Total num frames: 580780032. Throughput: 0: 43928.5. Samples: 580866100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 14:47:16,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:47:16,263][36999] Updated weights for policy 0, policy_version 35450 (0.0030) [2024-07-02 14:47:20,160][36999] Updated weights for policy 0, policy_version 35460 (0.0030) [2024-07-02 14:47:21,095][36761] Fps is (10 sec: 39321.0, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 580993024. Throughput: 0: 43821.8. Samples: 581128440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 14:47:21,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:47:24,161][36999] Updated weights for policy 0, policy_version 35470 (0.0031) [2024-07-02 14:47:26,095][36761] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 44042.4). Total num frames: 581238784. Throughput: 0: 43909.9. Samples: 581390020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 14:47:26,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:47:27,349][36999] Updated weights for policy 0, policy_version 35480 (0.0032) [2024-07-02 14:47:31,095][36761] Fps is (10 sec: 44236.4, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 581435392. Throughput: 0: 43800.5. Samples: 581521820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 14:47:31,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:47:31,589][36999] Updated weights for policy 0, policy_version 35490 (0.0039) [2024-07-02 14:47:34,879][36999] Updated weights for policy 0, policy_version 35500 (0.0040) [2024-07-02 14:47:36,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43963.7, 300 sec: 44097.9). Total num frames: 581664768. Throughput: 0: 43821.2. Samples: 581788380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 14:47:36,096][36761] Avg episode reward: [(0, '0.993')] [2024-07-02 14:47:38,867][36999] Updated weights for policy 0, policy_version 35510 (0.0036) [2024-07-02 14:47:41,095][36761] Fps is (10 sec: 45875.4, 60 sec: 43690.8, 300 sec: 44042.4). Total num frames: 581894144. Throughput: 0: 44076.4. Samples: 582054220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 14:47:41,096][36761] Avg episode reward: [(0, '0.999')] [2024-07-02 14:47:42,147][36999] Updated weights for policy 0, policy_version 35520 (0.0023) [2024-07-02 14:47:46,083][36999] Updated weights for policy 0, policy_version 35530 (0.0039) [2024-07-02 14:47:46,100][36761] Fps is (10 sec: 45854.4, 60 sec: 44233.5, 300 sec: 44097.2). Total num frames: 582123520. Throughput: 0: 43989.7. Samples: 582188000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 14:47:46,101][36761] Avg episode reward: [(0, '1.039')] [2024-07-02 14:47:47,275][36979] Signal inference workers to stop experience collection... (8300 times) [2024-07-02 14:47:47,324][36979] Signal inference workers to resume experience collection... (8300 times) [2024-07-02 14:47:47,326][36999] InferenceWorker_p0-w0: stopping experience collection (8300 times) [2024-07-02 14:47:47,352][36999] InferenceWorker_p0-w0: resuming experience collection (8300 times) [2024-07-02 14:47:49,720][36999] Updated weights for policy 0, policy_version 35540 (0.0037) [2024-07-02 14:47:51,095][36761] Fps is (10 sec: 44237.1, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 582336512. Throughput: 0: 43959.3. Samples: 582452720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 14:47:51,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:47:53,385][36999] Updated weights for policy 0, policy_version 35550 (0.0043) [2024-07-02 14:47:56,095][36761] Fps is (10 sec: 42617.8, 60 sec: 43690.6, 300 sec: 44042.4). Total num frames: 582549504. Throughput: 0: 44262.4. Samples: 582716440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 14:47:56,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:47:57,135][36999] Updated weights for policy 0, policy_version 35560 (0.0029) [2024-07-02 14:48:00,758][36999] Updated weights for policy 0, policy_version 35570 (0.0028) [2024-07-02 14:48:01,096][36761] Fps is (10 sec: 45874.2, 60 sec: 44509.7, 300 sec: 44097.9). Total num frames: 582795264. Throughput: 0: 44191.3. Samples: 582854720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 14:48:01,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:48:04,518][36999] Updated weights for policy 0, policy_version 35580 (0.0040) [2024-07-02 14:48:06,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44236.9, 300 sec: 44097.9). Total num frames: 582991872. Throughput: 0: 44147.4. Samples: 583115080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 14:48:06,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:48:08,362][36999] Updated weights for policy 0, policy_version 35590 (0.0032) [2024-07-02 14:48:11,095][36761] Fps is (10 sec: 40960.7, 60 sec: 43417.5, 300 sec: 44042.4). Total num frames: 583204864. Throughput: 0: 44147.6. Samples: 583376660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 14:48:11,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:48:11,860][36999] Updated weights for policy 0, policy_version 35600 (0.0028) [2024-07-02 14:48:15,901][36999] Updated weights for policy 0, policy_version 35610 (0.0032) [2024-07-02 14:48:16,095][36761] Fps is (10 sec: 45875.7, 60 sec: 44509.9, 300 sec: 44153.5). Total num frames: 583450624. Throughput: 0: 44211.7. Samples: 583511340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 14:48:16,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:48:19,790][36999] Updated weights for policy 0, policy_version 35620 (0.0026) [2024-07-02 14:48:21,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 583647232. Throughput: 0: 44182.8. Samples: 583776600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 14:48:21,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:48:23,395][36999] Updated weights for policy 0, policy_version 35630 (0.0029) [2024-07-02 14:48:26,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43690.7, 300 sec: 44042.4). Total num frames: 583860224. Throughput: 0: 44063.2. Samples: 584037060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-07-02 14:48:26,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:48:27,119][36999] Updated weights for policy 0, policy_version 35640 (0.0036) [2024-07-02 14:48:30,641][36999] Updated weights for policy 0, policy_version 35650 (0.0034) [2024-07-02 14:48:31,095][36761] Fps is (10 sec: 44237.0, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 584089600. Throughput: 0: 43935.2. Samples: 584164880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-07-02 14:48:31,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:48:34,426][36999] Updated weights for policy 0, policy_version 35660 (0.0035) [2024-07-02 14:48:36,095][36761] Fps is (10 sec: 45874.2, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 584318976. Throughput: 0: 43894.9. Samples: 584428000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-07-02 14:48:36,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 14:48:36,102][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000035664_584318976.pth... [2024-07-02 14:48:36,171][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000035019_573751296.pth [2024-07-02 14:48:38,247][36999] Updated weights for policy 0, policy_version 35670 (0.0030) [2024-07-02 14:48:41,100][36761] Fps is (10 sec: 44216.4, 60 sec: 43960.4, 300 sec: 43986.2). Total num frames: 584531968. Throughput: 0: 43961.8. Samples: 584694920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-07-02 14:48:41,100][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:48:42,139][36999] Updated weights for policy 0, policy_version 35680 (0.0028) [2024-07-02 14:48:45,799][36999] Updated weights for policy 0, policy_version 35690 (0.0039) [2024-07-02 14:48:46,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43694.0, 300 sec: 43986.9). Total num frames: 584744960. Throughput: 0: 43774.3. Samples: 584824560. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-07-02 14:48:46,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:48:49,394][36999] Updated weights for policy 0, policy_version 35700 (0.0023) [2024-07-02 14:48:51,095][36761] Fps is (10 sec: 42617.6, 60 sec: 43690.6, 300 sec: 44042.4). Total num frames: 584957952. Throughput: 0: 43927.1. Samples: 585091800. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-07-02 14:48:51,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:48:53,109][36999] Updated weights for policy 0, policy_version 35710 (0.0035) [2024-07-02 14:48:56,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 585187328. Throughput: 0: 43883.9. Samples: 585351440. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-07-02 14:48:56,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:48:56,779][36999] Updated weights for policy 0, policy_version 35720 (0.0032) [2024-07-02 14:49:00,439][36999] Updated weights for policy 0, policy_version 35730 (0.0032) [2024-07-02 14:49:01,095][36761] Fps is (10 sec: 45875.2, 60 sec: 43690.8, 300 sec: 44042.4). Total num frames: 585416704. Throughput: 0: 43967.9. Samples: 585489900. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-07-02 14:49:01,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:49:04,200][36999] Updated weights for policy 0, policy_version 35740 (0.0038) [2024-07-02 14:49:06,095][36761] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 585613312. Throughput: 0: 43853.8. Samples: 585750020. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-07-02 14:49:06,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:49:08,080][36999] Updated weights for policy 0, policy_version 35750 (0.0042) [2024-07-02 14:49:11,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 585859072. Throughput: 0: 43874.6. Samples: 586011420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 14:49:11,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:49:11,666][36999] Updated weights for policy 0, policy_version 35760 (0.0028) [2024-07-02 14:49:14,284][36979] Signal inference workers to stop experience collection... (8350 times) [2024-07-02 14:49:14,284][36979] Signal inference workers to resume experience collection... (8350 times) [2024-07-02 14:49:14,307][36999] InferenceWorker_p0-w0: stopping experience collection (8350 times) [2024-07-02 14:49:14,307][36999] InferenceWorker_p0-w0: resuming experience collection (8350 times) [2024-07-02 14:49:15,329][36999] Updated weights for policy 0, policy_version 35770 (0.0030) [2024-07-02 14:49:16,100][36761] Fps is (10 sec: 45854.2, 60 sec: 43687.3, 300 sec: 43986.2). Total num frames: 586072064. Throughput: 0: 44166.6. Samples: 586152580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 14:49:16,100][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:49:19,185][36999] Updated weights for policy 0, policy_version 35780 (0.0031) [2024-07-02 14:49:21,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 586285056. Throughput: 0: 44062.0. Samples: 586410780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 14:49:21,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:49:23,163][36999] Updated weights for policy 0, policy_version 35790 (0.0024) [2024-07-02 14:49:26,095][36761] Fps is (10 sec: 44257.0, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 586514432. Throughput: 0: 43885.8. Samples: 586669580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 14:49:26,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:49:26,915][36999] Updated weights for policy 0, policy_version 35800 (0.0037) [2024-07-02 14:49:30,583][36999] Updated weights for policy 0, policy_version 35810 (0.0033) [2024-07-02 14:49:31,096][36761] Fps is (10 sec: 44234.3, 60 sec: 43963.3, 300 sec: 43986.8). Total num frames: 586727424. Throughput: 0: 43923.1. Samples: 586801120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-07-02 14:49:31,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:49:34,196][36999] Updated weights for policy 0, policy_version 35820 (0.0026) [2024-07-02 14:49:36,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.8, 300 sec: 43932.0). Total num frames: 586940416. Throughput: 0: 43870.2. Samples: 587065960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 14:49:36,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:49:37,865][36999] Updated weights for policy 0, policy_version 35830 (0.0039) [2024-07-02 14:49:41,095][36761] Fps is (10 sec: 44239.5, 60 sec: 43967.2, 300 sec: 43986.9). Total num frames: 587169792. Throughput: 0: 43900.2. Samples: 587326940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 14:49:41,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:49:41,387][36999] Updated weights for policy 0, policy_version 35840 (0.0030) [2024-07-02 14:49:45,429][36999] Updated weights for policy 0, policy_version 35850 (0.0031) [2024-07-02 14:49:46,095][36761] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 44042.4). Total num frames: 587399168. Throughput: 0: 43828.9. Samples: 587462200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 14:49:46,098][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:49:49,266][36999] Updated weights for policy 0, policy_version 35860 (0.0037) [2024-07-02 14:49:51,095][36761] Fps is (10 sec: 44236.3, 60 sec: 44236.8, 300 sec: 43987.3). Total num frames: 587612160. Throughput: 0: 44029.3. Samples: 587731340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-07-02 14:49:51,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:49:52,638][36999] Updated weights for policy 0, policy_version 35870 (0.0026) [2024-07-02 14:49:56,095][36761] Fps is (10 sec: 42598.0, 60 sec: 43963.7, 300 sec: 43986.8). Total num frames: 587825152. Throughput: 0: 44007.0. Samples: 587991740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-07-02 14:49:56,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:49:56,954][36999] Updated weights for policy 0, policy_version 35880 (0.0038) [2024-07-02 14:49:59,964][36999] Updated weights for policy 0, policy_version 35890 (0.0025) [2024-07-02 14:50:01,095][36761] Fps is (10 sec: 45875.5, 60 sec: 44236.9, 300 sec: 44042.4). Total num frames: 588070912. Throughput: 0: 43911.2. Samples: 588128380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-07-02 14:50:01,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:50:04,219][36999] Updated weights for policy 0, policy_version 35900 (0.0037) [2024-07-02 14:50:06,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 588267520. Throughput: 0: 44066.6. Samples: 588393780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-07-02 14:50:06,096][36761] Avg episode reward: [(0, '0.997')] [2024-07-02 14:50:07,385][36999] Updated weights for policy 0, policy_version 35910 (0.0031) [2024-07-02 14:50:11,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 588496896. Throughput: 0: 44255.5. Samples: 588661080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-07-02 14:50:11,096][36761] Avg episode reward: [(0, '0.997')] [2024-07-02 14:50:11,424][36999] Updated weights for policy 0, policy_version 35920 (0.0023) [2024-07-02 14:50:14,590][36999] Updated weights for policy 0, policy_version 35930 (0.0031) [2024-07-02 14:50:16,095][36761] Fps is (10 sec: 45875.8, 60 sec: 44240.2, 300 sec: 44042.5). Total num frames: 588726272. Throughput: 0: 44378.8. Samples: 588798140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-07-02 14:50:16,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:50:18,846][36999] Updated weights for policy 0, policy_version 35940 (0.0027) [2024-07-02 14:50:21,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43963.6, 300 sec: 43875.8). Total num frames: 588922880. Throughput: 0: 44408.8. Samples: 589064360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-07-02 14:50:21,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:50:22,031][36999] Updated weights for policy 0, policy_version 35950 (0.0043) [2024-07-02 14:50:26,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 589152256. Throughput: 0: 44325.7. Samples: 589321600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-07-02 14:50:26,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:50:26,133][36999] Updated weights for policy 0, policy_version 35960 (0.0037) [2024-07-02 14:50:29,498][36999] Updated weights for policy 0, policy_version 35970 (0.0030) [2024-07-02 14:50:31,095][36761] Fps is (10 sec: 47513.7, 60 sec: 44510.2, 300 sec: 44097.9). Total num frames: 589398016. Throughput: 0: 44155.1. Samples: 589449180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-07-02 14:50:31,104][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:50:33,673][36999] Updated weights for policy 0, policy_version 35980 (0.0027) [2024-07-02 14:50:36,095][36761] Fps is (10 sec: 44236.8, 60 sec: 44236.8, 300 sec: 43931.3). Total num frames: 589594624. Throughput: 0: 44227.6. Samples: 589721580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-07-02 14:50:36,097][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:50:36,115][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000035986_589594624.pth... [2024-07-02 14:50:36,163][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000035342_579043328.pth [2024-07-02 14:50:37,128][36999] Updated weights for policy 0, policy_version 35990 (0.0023) [2024-07-02 14:50:41,006][36999] Updated weights for policy 0, policy_version 36000 (0.0027) [2024-07-02 14:50:41,095][36761] Fps is (10 sec: 42598.5, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 589824000. Throughput: 0: 44215.6. Samples: 589981440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 14:50:41,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:50:44,494][36999] Updated weights for policy 0, policy_version 36010 (0.0021) [2024-07-02 14:50:46,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43963.8, 300 sec: 43987.6). Total num frames: 590036992. Throughput: 0: 44024.3. Samples: 590109480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 14:50:46,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:50:48,685][36999] Updated weights for policy 0, policy_version 36020 (0.0027) [2024-07-02 14:50:51,095][36761] Fps is (10 sec: 44236.4, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 590266368. Throughput: 0: 44109.7. Samples: 590378720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 14:50:51,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:50:52,014][36999] Updated weights for policy 0, policy_version 36030 (0.0028) [2024-07-02 14:50:53,550][36979] Signal inference workers to stop experience collection... (8400 times) [2024-07-02 14:50:53,550][36979] Signal inference workers to resume experience collection... (8400 times) [2024-07-02 14:50:53,567][36999] InferenceWorker_p0-w0: stopping experience collection (8400 times) [2024-07-02 14:50:53,567][36999] InferenceWorker_p0-w0: resuming experience collection (8400 times) [2024-07-02 14:50:56,025][36999] Updated weights for policy 0, policy_version 36040 (0.0035) [2024-07-02 14:50:56,095][36761] Fps is (10 sec: 44236.7, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 590479360. Throughput: 0: 44012.0. Samples: 590641620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 14:50:56,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:50:59,488][36999] Updated weights for policy 0, policy_version 36050 (0.0032) [2024-07-02 14:51:01,100][36761] Fps is (10 sec: 45854.9, 60 sec: 44233.4, 300 sec: 44152.8). Total num frames: 590725120. Throughput: 0: 43911.0. Samples: 590774340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-07-02 14:51:01,100][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:51:03,338][36999] Updated weights for policy 0, policy_version 36060 (0.0031) [2024-07-02 14:51:06,095][36761] Fps is (10 sec: 44237.1, 60 sec: 44236.9, 300 sec: 43931.3). Total num frames: 590921728. Throughput: 0: 44053.4. Samples: 591046760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-07-02 14:51:06,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:51:06,821][36999] Updated weights for policy 0, policy_version 36070 (0.0041) [2024-07-02 14:51:10,946][36999] Updated weights for policy 0, policy_version 36080 (0.0027) [2024-07-02 14:51:11,095][36761] Fps is (10 sec: 40978.3, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 591134720. Throughput: 0: 44134.1. Samples: 591307640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-07-02 14:51:11,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:51:14,110][36999] Updated weights for policy 0, policy_version 36090 (0.0037) [2024-07-02 14:51:16,095][36761] Fps is (10 sec: 45874.7, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 591380480. Throughput: 0: 44093.3. Samples: 591433380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-07-02 14:51:16,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:51:18,430][36999] Updated weights for policy 0, policy_version 36100 (0.0032) [2024-07-02 14:51:21,095][36761] Fps is (10 sec: 45875.4, 60 sec: 44509.9, 300 sec: 43986.9). Total num frames: 591593472. Throughput: 0: 44119.5. Samples: 591706960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-07-02 14:51:21,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:51:21,448][36999] Updated weights for policy 0, policy_version 36110 (0.0037) [2024-07-02 14:51:25,798][36999] Updated weights for policy 0, policy_version 36120 (0.0039) [2024-07-02 14:51:26,098][36761] Fps is (10 sec: 40950.6, 60 sec: 43962.0, 300 sec: 43986.5). Total num frames: 591790080. Throughput: 0: 44043.0. Samples: 591963480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-07-02 14:51:26,098][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:51:29,054][36999] Updated weights for policy 0, policy_version 36130 (0.0027) [2024-07-02 14:51:31,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43963.7, 300 sec: 44097.9). Total num frames: 592035840. Throughput: 0: 44066.2. Samples: 592092460. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-07-02 14:51:31,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:51:33,630][36999] Updated weights for policy 0, policy_version 36140 (0.0028) [2024-07-02 14:51:36,095][36761] Fps is (10 sec: 45886.3, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 592248832. Throughput: 0: 43958.8. Samples: 592356860. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-07-02 14:51:36,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:51:36,746][36999] Updated weights for policy 0, policy_version 36150 (0.0034) [2024-07-02 14:51:41,077][36999] Updated weights for policy 0, policy_version 36160 (0.0020) [2024-07-02 14:51:41,098][36761] Fps is (10 sec: 40949.9, 60 sec: 43688.8, 300 sec: 43986.5). Total num frames: 592445440. Throughput: 0: 44034.4. Samples: 592623280. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-07-02 14:51:41,098][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:51:44,045][36999] Updated weights for policy 0, policy_version 36170 (0.0031) [2024-07-02 14:51:46,100][36761] Fps is (10 sec: 44216.7, 60 sec: 44233.5, 300 sec: 44097.3). Total num frames: 592691200. Throughput: 0: 43916.5. Samples: 592750580. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-07-02 14:51:46,100][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:51:48,383][36999] Updated weights for policy 0, policy_version 36180 (0.0034) [2024-07-02 14:51:51,095][36761] Fps is (10 sec: 45886.3, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 592904192. Throughput: 0: 43751.0. Samples: 593015560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:51:51,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:51:51,498][36999] Updated weights for policy 0, policy_version 36190 (0.0023) [2024-07-02 14:51:55,827][36999] Updated weights for policy 0, policy_version 36200 (0.0031) [2024-07-02 14:51:56,095][36761] Fps is (10 sec: 40978.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 593100800. Throughput: 0: 43954.3. Samples: 593285580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:51:56,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:51:58,864][36999] Updated weights for policy 0, policy_version 36210 (0.0028) [2024-07-02 14:52:01,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43967.0, 300 sec: 44153.5). Total num frames: 593362944. Throughput: 0: 43928.4. Samples: 593410160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:52:01,097][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:52:03,286][36999] Updated weights for policy 0, policy_version 36220 (0.0034) [2024-07-02 14:52:06,095][36761] Fps is (10 sec: 47513.6, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 593575936. Throughput: 0: 43774.7. Samples: 593676820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:52:06,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:52:06,148][36999] Updated weights for policy 0, policy_version 36230 (0.0031) [2024-07-02 14:52:10,664][36999] Updated weights for policy 0, policy_version 36240 (0.0031) [2024-07-02 14:52:11,095][36761] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 593756160. Throughput: 0: 44085.5. Samples: 593947220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-07-02 14:52:11,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:52:13,283][36979] Signal inference workers to stop experience collection... (8450 times) [2024-07-02 14:52:13,316][36999] InferenceWorker_p0-w0: stopping experience collection (8450 times) [2024-07-02 14:52:13,342][36979] Signal inference workers to resume experience collection... (8450 times) [2024-07-02 14:52:13,343][36999] InferenceWorker_p0-w0: resuming experience collection (8450 times) [2024-07-02 14:52:13,485][36999] Updated weights for policy 0, policy_version 36250 (0.0038) [2024-07-02 14:52:16,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 594001920. Throughput: 0: 43969.0. Samples: 594071060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-07-02 14:52:16,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:52:18,088][36999] Updated weights for policy 0, policy_version 36260 (0.0032) [2024-07-02 14:52:20,861][36999] Updated weights for policy 0, policy_version 36270 (0.0042) [2024-07-02 14:52:21,095][36761] Fps is (10 sec: 49151.6, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 594247680. Throughput: 0: 44051.9. Samples: 594339200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-07-02 14:52:21,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:52:25,790][36999] Updated weights for policy 0, policy_version 36280 (0.0029) [2024-07-02 14:52:26,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43965.4, 300 sec: 44042.4). Total num frames: 594427904. Throughput: 0: 44211.8. Samples: 594612700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-07-02 14:52:26,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:52:28,643][36999] Updated weights for policy 0, policy_version 36290 (0.0023) [2024-07-02 14:52:31,095][36761] Fps is (10 sec: 40959.8, 60 sec: 43690.6, 300 sec: 44042.4). Total num frames: 594657280. Throughput: 0: 43963.0. Samples: 594728720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-07-02 14:52:31,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:52:33,217][36999] Updated weights for policy 0, policy_version 36300 (0.0042) [2024-07-02 14:52:36,049][36999] Updated weights for policy 0, policy_version 36310 (0.0028) [2024-07-02 14:52:36,095][36761] Fps is (10 sec: 47513.6, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 594903040. Throughput: 0: 43949.8. Samples: 594993300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-07-02 14:52:36,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:52:36,126][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000036310_594903040.pth... [2024-07-02 14:52:36,175][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000035664_584318976.pth [2024-07-02 14:52:41,067][36999] Updated weights for policy 0, policy_version 36320 (0.0027) [2024-07-02 14:52:41,095][36761] Fps is (10 sec: 40960.8, 60 sec: 43692.6, 300 sec: 43876.5). Total num frames: 595066880. Throughput: 0: 43960.0. Samples: 595263780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 14:52:41,095][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:52:43,889][36999] Updated weights for policy 0, policy_version 36330 (0.0028) [2024-07-02 14:52:46,095][36761] Fps is (10 sec: 42598.5, 60 sec: 43967.0, 300 sec: 44042.4). Total num frames: 595329024. Throughput: 0: 43959.6. Samples: 595388340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 14:52:46,100][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:52:48,413][36999] Updated weights for policy 0, policy_version 36340 (0.0040) [2024-07-02 14:52:51,100][36761] Fps is (10 sec: 47491.5, 60 sec: 43960.5, 300 sec: 44041.7). Total num frames: 595542016. Throughput: 0: 43820.4. Samples: 595648940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 14:52:51,101][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 14:52:51,254][36999] Updated weights for policy 0, policy_version 36350 (0.0027) [2024-07-02 14:52:55,780][36999] Updated weights for policy 0, policy_version 36360 (0.0025) [2024-07-02 14:52:56,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 595738624. Throughput: 0: 43897.7. Samples: 595922620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-07-02 14:52:56,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:52:58,586][36999] Updated weights for policy 0, policy_version 36370 (0.0025) [2024-07-02 14:53:01,095][36761] Fps is (10 sec: 44256.8, 60 sec: 43690.7, 300 sec: 44042.4). Total num frames: 595984384. Throughput: 0: 43800.8. Samples: 596042100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 14:53:01,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 14:53:03,140][36999] Updated weights for policy 0, policy_version 36380 (0.0033) [2024-07-02 14:53:05,982][36999] Updated weights for policy 0, policy_version 36390 (0.0035) [2024-07-02 14:53:06,095][36761] Fps is (10 sec: 47513.5, 60 sec: 43963.7, 300 sec: 44097.9). Total num frames: 596213760. Throughput: 0: 43692.9. Samples: 596305380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 14:53:06,096][36761] Avg episode reward: [(0, '1.011')] [2024-07-02 14:53:10,562][36999] Updated weights for policy 0, policy_version 36400 (0.0028) [2024-07-02 14:53:11,095][36761] Fps is (10 sec: 42598.4, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 596410368. Throughput: 0: 43786.2. Samples: 596583080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 14:53:11,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 14:53:13,281][36999] Updated weights for policy 0, policy_version 36410 (0.0034) [2024-07-02 14:53:16,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 596639744. Throughput: 0: 43928.1. Samples: 596705480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 14:53:16,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:53:17,587][36979] Signal inference workers to stop experience collection... (8500 times) [2024-07-02 14:53:17,616][36999] InferenceWorker_p0-w0: stopping experience collection (8500 times) [2024-07-02 14:53:17,640][36979] Signal inference workers to resume experience collection... (8500 times) [2024-07-02 14:53:17,641][36999] InferenceWorker_p0-w0: resuming experience collection (8500 times) [2024-07-02 14:53:17,917][36999] Updated weights for policy 0, policy_version 36420 (0.0045) [2024-07-02 14:53:20,658][36999] Updated weights for policy 0, policy_version 36430 (0.0034) [2024-07-02 14:53:21,095][36761] Fps is (10 sec: 45875.3, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 596869120. Throughput: 0: 43887.1. Samples: 596968220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-07-02 14:53:21,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:53:25,302][36999] Updated weights for policy 0, policy_version 36440 (0.0035) [2024-07-02 14:53:26,100][36761] Fps is (10 sec: 42579.0, 60 sec: 43960.4, 300 sec: 43986.2). Total num frames: 597065728. Throughput: 0: 43887.9. Samples: 597238940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:53:26,100][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:53:28,383][36999] Updated weights for policy 0, policy_version 36450 (0.0037) [2024-07-02 14:53:31,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 597295104. Throughput: 0: 44051.5. Samples: 597370660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:53:31,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:53:32,720][36999] Updated weights for policy 0, policy_version 36460 (0.0032) [2024-07-02 14:53:36,100][36761] Fps is (10 sec: 44236.6, 60 sec: 43414.3, 300 sec: 43986.9). Total num frames: 597508096. Throughput: 0: 43935.5. Samples: 597626040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:53:36,101][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:53:36,371][36999] Updated weights for policy 0, policy_version 36470 (0.0040) [2024-07-02 14:53:40,373][36999] Updated weights for policy 0, policy_version 36480 (0.0033) [2024-07-02 14:53:41,100][36761] Fps is (10 sec: 44217.0, 60 sec: 44506.4, 300 sec: 44041.7). Total num frames: 597737472. Throughput: 0: 43895.6. Samples: 597898120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:53:41,100][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:53:43,717][36999] Updated weights for policy 0, policy_version 36490 (0.0036) [2024-07-02 14:53:46,095][36761] Fps is (10 sec: 42618.1, 60 sec: 43417.6, 300 sec: 43986.9). Total num frames: 597934080. Throughput: 0: 44053.9. Samples: 598024520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:53:46,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:53:47,655][36999] Updated weights for policy 0, policy_version 36500 (0.0029) [2024-07-02 14:53:50,952][36999] Updated weights for policy 0, policy_version 36510 (0.0043) [2024-07-02 14:53:51,095][36761] Fps is (10 sec: 44256.8, 60 sec: 43967.1, 300 sec: 44042.4). Total num frames: 598179840. Throughput: 0: 44004.9. Samples: 598285600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:53:51,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:53:55,131][36999] Updated weights for policy 0, policy_version 36520 (0.0034) [2024-07-02 14:53:56,095][36761] Fps is (10 sec: 45875.3, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 598392832. Throughput: 0: 43806.8. Samples: 598554380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:53:56,096][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 14:53:58,611][36999] Updated weights for policy 0, policy_version 36530 (0.0030) [2024-07-02 14:54:01,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 43986.9). Total num frames: 598589440. Throughput: 0: 44009.7. Samples: 598685920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:54:01,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:54:02,529][36999] Updated weights for policy 0, policy_version 36540 (0.0026) [2024-07-02 14:54:05,845][36999] Updated weights for policy 0, policy_version 36550 (0.0029) [2024-07-02 14:54:06,095][36761] Fps is (10 sec: 44236.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 598835200. Throughput: 0: 44004.5. Samples: 598948420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:54:06,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:54:09,865][36999] Updated weights for policy 0, policy_version 36560 (0.0024) [2024-07-02 14:54:11,095][36761] Fps is (10 sec: 49152.1, 60 sec: 44509.9, 300 sec: 44098.6). Total num frames: 599080960. Throughput: 0: 43991.6. Samples: 599218360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 14:54:11,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:54:12,978][36979] Signal inference workers to stop experience collection... (8550 times) [2024-07-02 14:54:12,978][36979] Signal inference workers to resume experience collection... (8550 times) [2024-07-02 14:54:12,989][36999] InferenceWorker_p0-w0: stopping experience collection (8550 times) [2024-07-02 14:54:12,989][36999] InferenceWorker_p0-w0: resuming experience collection (8550 times) [2024-07-02 14:54:13,128][36999] Updated weights for policy 0, policy_version 36570 (0.0026) [2024-07-02 14:54:16,095][36761] Fps is (10 sec: 42597.8, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 599261184. Throughput: 0: 43999.1. Samples: 599350620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 14:54:16,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 14:54:17,118][36999] Updated weights for policy 0, policy_version 36580 (0.0031) [2024-07-02 14:54:20,742][36999] Updated weights for policy 0, policy_version 36590 (0.0026) [2024-07-02 14:54:21,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 599506944. Throughput: 0: 44258.7. Samples: 599617480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 14:54:21,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:54:24,555][36999] Updated weights for policy 0, policy_version 36600 (0.0030) [2024-07-02 14:54:26,095][36761] Fps is (10 sec: 47514.2, 60 sec: 44513.3, 300 sec: 44098.0). Total num frames: 599736320. Throughput: 0: 43962.7. Samples: 599876240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 14:54:26,096][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 14:54:28,211][36999] Updated weights for policy 0, policy_version 36610 (0.0019) [2024-07-02 14:54:31,096][36761] Fps is (10 sec: 40958.1, 60 sec: 43690.3, 300 sec: 43986.8). Total num frames: 599916544. Throughput: 0: 44170.1. Samples: 600012200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-07-02 14:54:31,097][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 14:54:31,942][36999] Updated weights for policy 0, policy_version 36620 (0.0032) [2024-07-02 14:54:35,544][36999] Updated weights for policy 0, policy_version 36630 (0.0034) [2024-07-02 14:54:36,095][36761] Fps is (10 sec: 42598.3, 60 sec: 44240.2, 300 sec: 44042.4). Total num frames: 600162304. Throughput: 0: 44243.6. Samples: 600276560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 14:54:36,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:54:36,209][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000036632_600178688.pth... [2024-07-02 14:54:36,260][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000035986_589594624.pth [2024-07-02 14:54:39,487][36999] Updated weights for policy 0, policy_version 36640 (0.0038) [2024-07-02 14:54:41,100][36761] Fps is (10 sec: 47494.4, 60 sec: 44236.8, 300 sec: 44041.7). Total num frames: 600391680. Throughput: 0: 44023.9. Samples: 600535660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 14:54:41,101][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 14:54:42,960][36999] Updated weights for policy 0, policy_version 36650 (0.0025) [2024-07-02 14:54:46,095][36761] Fps is (10 sec: 40959.6, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 600571904. Throughput: 0: 44165.3. Samples: 600673360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 14:54:46,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:54:46,925][36999] Updated weights for policy 0, policy_version 36660 (0.0039) [2024-07-02 14:54:50,524][36999] Updated weights for policy 0, policy_version 36670 (0.0032) [2024-07-02 14:54:51,095][36761] Fps is (10 sec: 42617.5, 60 sec: 43963.7, 300 sec: 44042.4). Total num frames: 600817664. Throughput: 0: 44112.8. Samples: 600933500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 14:54:51,096][36761] Avg episode reward: [(0, '1.036')] [2024-07-02 14:54:54,406][36999] Updated weights for policy 0, policy_version 36680 (0.0032) [2024-07-02 14:54:56,095][36761] Fps is (10 sec: 47513.4, 60 sec: 44236.7, 300 sec: 43986.8). Total num frames: 601047040. Throughput: 0: 43797.7. Samples: 601189260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 14:54:56,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:54:57,910][36999] Updated weights for policy 0, policy_version 36690 (0.0041) [2024-07-02 14:55:01,095][36761] Fps is (10 sec: 40960.4, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 601227264. Throughput: 0: 43935.7. Samples: 601327720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:55:01,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:55:01,774][36999] Updated weights for policy 0, policy_version 36700 (0.0035) [2024-07-02 14:55:05,237][36999] Updated weights for policy 0, policy_version 36710 (0.0039) [2024-07-02 14:55:06,095][36761] Fps is (10 sec: 44236.9, 60 sec: 44236.7, 300 sec: 44042.4). Total num frames: 601489408. Throughput: 0: 43867.5. Samples: 601591520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:55:06,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:55:09,422][36999] Updated weights for policy 0, policy_version 36720 (0.0026) [2024-07-02 14:55:11,095][36761] Fps is (10 sec: 47513.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 601702400. Throughput: 0: 43896.9. Samples: 601851600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:55:11,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:55:12,775][36999] Updated weights for policy 0, policy_version 36730 (0.0041) [2024-07-02 14:55:16,100][36761] Fps is (10 sec: 42579.4, 60 sec: 44233.5, 300 sec: 44041.7). Total num frames: 601915392. Throughput: 0: 43997.8. Samples: 601992280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:55:16,100][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:55:16,880][36999] Updated weights for policy 0, policy_version 36740 (0.0036) [2024-07-02 14:55:20,267][36999] Updated weights for policy 0, policy_version 36750 (0.0027) [2024-07-02 14:55:21,095][36761] Fps is (10 sec: 42598.1, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 602128384. Throughput: 0: 43771.9. Samples: 602246300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-07-02 14:55:21,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 14:55:24,163][36999] Updated weights for policy 0, policy_version 36760 (0.0028) [2024-07-02 14:55:26,095][36761] Fps is (10 sec: 44256.7, 60 sec: 43690.6, 300 sec: 43931.3). Total num frames: 602357760. Throughput: 0: 43820.4. Samples: 602507380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:55:26,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:55:28,140][36999] Updated weights for policy 0, policy_version 36770 (0.0032) [2024-07-02 14:55:31,095][36761] Fps is (10 sec: 44237.1, 60 sec: 44237.2, 300 sec: 43986.9). Total num frames: 602570752. Throughput: 0: 43840.5. Samples: 602646180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:55:31,099][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:55:31,543][36999] Updated weights for policy 0, policy_version 36780 (0.0043) [2024-07-02 14:55:32,314][36979] Signal inference workers to stop experience collection... (8600 times) [2024-07-02 14:55:32,360][36999] InferenceWorker_p0-w0: stopping experience collection (8600 times) [2024-07-02 14:55:32,360][36979] Signal inference workers to resume experience collection... (8600 times) [2024-07-02 14:55:32,371][36999] InferenceWorker_p0-w0: resuming experience collection (8600 times) [2024-07-02 14:55:35,414][36999] Updated weights for policy 0, policy_version 36790 (0.0025) [2024-07-02 14:55:36,095][36761] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 602783744. Throughput: 0: 43763.2. Samples: 602902840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:55:36,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:55:39,078][36999] Updated weights for policy 0, policy_version 36800 (0.0025) [2024-07-02 14:55:41,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43694.0, 300 sec: 43986.9). Total num frames: 603013120. Throughput: 0: 43880.2. Samples: 603163860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:55:41,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:55:42,801][36999] Updated weights for policy 0, policy_version 36810 (0.0034) [2024-07-02 14:55:46,100][36761] Fps is (10 sec: 44216.5, 60 sec: 44233.5, 300 sec: 43930.7). Total num frames: 603226112. Throughput: 0: 43865.3. Samples: 603301860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 14:55:46,101][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:55:46,519][36999] Updated weights for policy 0, policy_version 36820 (0.0020) [2024-07-02 14:55:50,190][36999] Updated weights for policy 0, policy_version 36830 (0.0030) [2024-07-02 14:55:51,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 603439104. Throughput: 0: 43825.0. Samples: 603563640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-07-02 14:55:51,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:55:53,792][36999] Updated weights for policy 0, policy_version 36840 (0.0041) [2024-07-02 14:55:56,095][36761] Fps is (10 sec: 44256.9, 60 sec: 43690.7, 300 sec: 43876.5). Total num frames: 603668480. Throughput: 0: 43834.6. Samples: 603824160. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-07-02 14:55:56,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:55:57,752][36999] Updated weights for policy 0, policy_version 36850 (0.0039) [2024-07-02 14:56:01,095][36761] Fps is (10 sec: 44237.2, 60 sec: 44236.8, 300 sec: 43931.3). Total num frames: 603881472. Throughput: 0: 43784.1. Samples: 603962360. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-07-02 14:56:01,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 14:56:01,521][36999] Updated weights for policy 0, policy_version 36860 (0.0046) [2024-07-02 14:56:05,565][36999] Updated weights for policy 0, policy_version 36870 (0.0030) [2024-07-02 14:56:06,095][36761] Fps is (10 sec: 44236.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 604110848. Throughput: 0: 43847.6. Samples: 604219440. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-07-02 14:56:06,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:56:08,989][36999] Updated weights for policy 0, policy_version 36880 (0.0031) [2024-07-02 14:56:11,095][36761] Fps is (10 sec: 45875.0, 60 sec: 43963.8, 300 sec: 43931.4). Total num frames: 604340224. Throughput: 0: 43806.7. Samples: 604478680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-07-02 14:56:11,096][36761] Avg episode reward: [(0, '1.042')] [2024-07-02 14:56:12,916][36999] Updated weights for policy 0, policy_version 36890 (0.0032) [2024-07-02 14:56:16,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43693.9, 300 sec: 43875.8). Total num frames: 604536832. Throughput: 0: 43813.2. Samples: 604617780. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-07-02 14:56:16,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:56:16,456][36999] Updated weights for policy 0, policy_version 36900 (0.0025) [2024-07-02 14:56:20,209][36999] Updated weights for policy 0, policy_version 36910 (0.0027) [2024-07-02 14:56:21,095][36761] Fps is (10 sec: 40959.9, 60 sec: 43690.7, 300 sec: 43931.7). Total num frames: 604749824. Throughput: 0: 43971.6. Samples: 604881560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-07-02 14:56:21,096][36761] Avg episode reward: [(0, '1.041')] [2024-07-02 14:56:23,923][36999] Updated weights for policy 0, policy_version 36920 (0.0026) [2024-07-02 14:56:26,095][36761] Fps is (10 sec: 45875.9, 60 sec: 43963.8, 300 sec: 43931.4). Total num frames: 604995584. Throughput: 0: 43929.3. Samples: 605140680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-07-02 14:56:26,096][36761] Avg episode reward: [(0, '1.038')] [2024-07-02 14:56:27,503][36999] Updated weights for policy 0, policy_version 36930 (0.0043) [2024-07-02 14:56:31,095][36761] Fps is (10 sec: 45874.7, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 605208576. Throughput: 0: 44089.3. Samples: 605285680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-07-02 14:56:31,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:56:31,219][36999] Updated weights for policy 0, policy_version 36940 (0.0026) [2024-07-02 14:56:35,089][36999] Updated weights for policy 0, policy_version 36950 (0.0033) [2024-07-02 14:56:36,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43690.7, 300 sec: 43931.7). Total num frames: 605405184. Throughput: 0: 43961.0. Samples: 605541880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 14:56:36,095][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:56:36,200][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000036952_605421568.pth... [2024-07-02 14:56:36,259][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000036310_594903040.pth [2024-07-02 14:56:38,452][36999] Updated weights for policy 0, policy_version 36960 (0.0032) [2024-07-02 14:56:41,095][36761] Fps is (10 sec: 44236.8, 60 sec: 43963.6, 300 sec: 43932.0). Total num frames: 605650944. Throughput: 0: 44076.4. Samples: 605807600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 14:56:41,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:56:42,458][36999] Updated weights for policy 0, policy_version 36970 (0.0032) [2024-07-02 14:56:46,004][36999] Updated weights for policy 0, policy_version 36980 (0.0045) [2024-07-02 14:56:46,095][36761] Fps is (10 sec: 47513.2, 60 sec: 44240.2, 300 sec: 43986.9). Total num frames: 605880320. Throughput: 0: 44145.3. Samples: 605948900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 14:56:46,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:56:49,975][36999] Updated weights for policy 0, policy_version 36990 (0.0031) [2024-07-02 14:56:51,095][36761] Fps is (10 sec: 40960.4, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 606060544. Throughput: 0: 44065.4. Samples: 606202380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 14:56:51,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:56:53,423][36999] Updated weights for policy 0, policy_version 37000 (0.0031) [2024-07-02 14:56:55,374][36979] Signal inference workers to stop experience collection... (8650 times) [2024-07-02 14:56:55,380][36979] Signal inference workers to resume experience collection... (8650 times) [2024-07-02 14:56:55,415][36999] InferenceWorker_p0-w0: stopping experience collection (8650 times) [2024-07-02 14:56:55,415][36999] InferenceWorker_p0-w0: resuming experience collection (8650 times) [2024-07-02 14:56:56,095][36761] Fps is (10 sec: 44237.0, 60 sec: 44236.9, 300 sec: 43931.4). Total num frames: 606322688. Throughput: 0: 44097.3. Samples: 606463060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-07-02 14:56:56,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:56:57,374][36999] Updated weights for policy 0, policy_version 37010 (0.0034) [2024-07-02 14:57:00,823][36999] Updated weights for policy 0, policy_version 37020 (0.0023) [2024-07-02 14:57:01,095][36761] Fps is (10 sec: 47513.2, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 606535680. Throughput: 0: 44258.7. Samples: 606609420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:57:01,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:57:04,829][36999] Updated weights for policy 0, policy_version 37030 (0.0030) [2024-07-02 14:57:06,095][36761] Fps is (10 sec: 39321.6, 60 sec: 43417.7, 300 sec: 43931.3). Total num frames: 606715904. Throughput: 0: 44106.3. Samples: 606866340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:57:06,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:57:08,491][36999] Updated weights for policy 0, policy_version 37040 (0.0035) [2024-07-02 14:57:11,097][36761] Fps is (10 sec: 44228.4, 60 sec: 43962.2, 300 sec: 43986.6). Total num frames: 606978048. Throughput: 0: 44039.8. Samples: 607122560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:57:11,098][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 14:57:12,524][36999] Updated weights for policy 0, policy_version 37050 (0.0031) [2024-07-02 14:57:16,027][36999] Updated weights for policy 0, policy_version 37060 (0.0028) [2024-07-02 14:57:16,095][36761] Fps is (10 sec: 47513.3, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 607191040. Throughput: 0: 43869.9. Samples: 607259820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:57:16,104][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:57:19,810][36999] Updated weights for policy 0, policy_version 37070 (0.0030) [2024-07-02 14:57:21,100][36761] Fps is (10 sec: 40949.4, 60 sec: 43960.4, 300 sec: 43930.7). Total num frames: 607387648. Throughput: 0: 43923.9. Samples: 607518660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:57:21,101][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:57:23,524][36999] Updated weights for policy 0, policy_version 37080 (0.0033) [2024-07-02 14:57:26,095][36761] Fps is (10 sec: 44236.3, 60 sec: 43963.6, 300 sec: 43986.9). Total num frames: 607633408. Throughput: 0: 43813.3. Samples: 607779200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 14:57:26,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:57:27,667][36999] Updated weights for policy 0, policy_version 37090 (0.0027) [2024-07-02 14:57:30,847][36999] Updated weights for policy 0, policy_version 37100 (0.0026) [2024-07-02 14:57:31,095][36761] Fps is (10 sec: 45895.9, 60 sec: 43963.7, 300 sec: 43875.8). Total num frames: 607846400. Throughput: 0: 43742.6. Samples: 607917320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 14:57:31,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:57:35,042][36999] Updated weights for policy 0, policy_version 37110 (0.0039) [2024-07-02 14:57:36,095][36761] Fps is (10 sec: 40960.8, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 608043008. Throughput: 0: 43957.0. Samples: 608180440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 14:57:36,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:57:38,143][36999] Updated weights for policy 0, policy_version 37120 (0.0030) [2024-07-02 14:57:41,095][36761] Fps is (10 sec: 45875.2, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 608305152. Throughput: 0: 43901.6. Samples: 608438640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-07-02 14:57:41,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:57:42,279][36999] Updated weights for policy 0, policy_version 37130 (0.0027) [2024-07-02 14:57:45,967][36999] Updated weights for policy 0, policy_version 37140 (0.0036) [2024-07-02 14:57:46,096][36761] Fps is (10 sec: 45874.2, 60 sec: 43690.6, 300 sec: 43932.0). Total num frames: 608501760. Throughput: 0: 43623.1. Samples: 608572460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:57:46,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:57:50,216][36999] Updated weights for policy 0, policy_version 37150 (0.0027) [2024-07-02 14:57:51,095][36761] Fps is (10 sec: 40960.5, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 608714752. Throughput: 0: 43921.8. Samples: 608842820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:57:51,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:57:53,351][36999] Updated weights for policy 0, policy_version 37160 (0.0023) [2024-07-02 14:57:56,095][36761] Fps is (10 sec: 45876.0, 60 sec: 43963.7, 300 sec: 43986.9). Total num frames: 608960512. Throughput: 0: 43902.0. Samples: 609098060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:57:56,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 14:57:57,706][36999] Updated weights for policy 0, policy_version 37170 (0.0033) [2024-07-02 14:58:00,765][36999] Updated weights for policy 0, policy_version 37180 (0.0036) [2024-07-02 14:58:01,095][36761] Fps is (10 sec: 44237.0, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 609157120. Throughput: 0: 43865.0. Samples: 609233740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:58:01,095][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:58:04,898][36999] Updated weights for policy 0, policy_version 37190 (0.0047) [2024-07-02 14:58:06,095][36761] Fps is (10 sec: 40959.7, 60 sec: 44236.7, 300 sec: 43931.3). Total num frames: 609370112. Throughput: 0: 44100.9. Samples: 609503000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-07-02 14:58:06,099][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 14:58:08,100][36999] Updated weights for policy 0, policy_version 37200 (0.0029) [2024-07-02 14:58:11,095][36761] Fps is (10 sec: 42598.2, 60 sec: 43419.1, 300 sec: 43875.8). Total num frames: 609583104. Throughput: 0: 44022.8. Samples: 609760220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:58:11,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:58:12,169][36999] Updated weights for policy 0, policy_version 37210 (0.0034) [2024-07-02 14:58:15,924][36999] Updated weights for policy 0, policy_version 37220 (0.0032) [2024-07-02 14:58:16,095][36761] Fps is (10 sec: 44237.1, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 609812480. Throughput: 0: 43800.6. Samples: 609888340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:58:16,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:58:19,583][36999] Updated weights for policy 0, policy_version 37230 (0.0028) [2024-07-02 14:58:20,584][36979] Signal inference workers to stop experience collection... (8700 times) [2024-07-02 14:58:20,585][36979] Signal inference workers to resume experience collection... (8700 times) [2024-07-02 14:58:20,612][36999] InferenceWorker_p0-w0: stopping experience collection (8700 times) [2024-07-02 14:58:20,612][36999] InferenceWorker_p0-w0: resuming experience collection (8700 times) [2024-07-02 14:58:21,100][36761] Fps is (10 sec: 45854.6, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 610041856. Throughput: 0: 43865.4. Samples: 610154580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:58:21,100][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 14:58:23,501][36999] Updated weights for policy 0, policy_version 37240 (0.0030) [2024-07-02 14:58:26,095][36761] Fps is (10 sec: 44236.2, 60 sec: 43690.7, 300 sec: 43931.3). Total num frames: 610254848. Throughput: 0: 43992.0. Samples: 610418280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:58:26,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:58:27,012][36999] Updated weights for policy 0, policy_version 37250 (0.0022) [2024-07-02 14:58:30,869][36999] Updated weights for policy 0, policy_version 37260 (0.0035) [2024-07-02 14:58:31,095][36761] Fps is (10 sec: 44256.7, 60 sec: 43963.8, 300 sec: 43987.6). Total num frames: 610484224. Throughput: 0: 44001.5. Samples: 610552520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 14:58:31,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 14:58:34,330][36999] Updated weights for policy 0, policy_version 37270 (0.0036) [2024-07-02 14:58:36,095][36761] Fps is (10 sec: 44237.6, 60 sec: 44236.8, 300 sec: 43932.0). Total num frames: 610697216. Throughput: 0: 43943.2. Samples: 610820260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:58:36,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:58:36,208][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000037275_610713600.pth... [2024-07-02 14:58:36,290][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000036632_600178688.pth [2024-07-02 14:58:38,369][36999] Updated weights for policy 0, policy_version 37280 (0.0041) [2024-07-02 14:58:41,095][36761] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 43986.9). Total num frames: 610910208. Throughput: 0: 44059.6. Samples: 611080740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:58:41,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:58:41,769][36999] Updated weights for policy 0, policy_version 37290 (0.0023) [2024-07-02 14:58:46,095][36761] Fps is (10 sec: 40960.0, 60 sec: 43417.8, 300 sec: 43820.3). Total num frames: 611106816. Throughput: 0: 43831.6. Samples: 611206160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:58:46,095][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:58:46,134][36999] Updated weights for policy 0, policy_version 37300 (0.0031) [2024-07-02 14:58:49,214][36999] Updated weights for policy 0, policy_version 37310 (0.0037) [2024-07-02 14:58:51,095][36761] Fps is (10 sec: 45875.0, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 611368960. Throughput: 0: 43817.0. Samples: 611474760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:58:51,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:58:53,607][36999] Updated weights for policy 0, policy_version 37320 (0.0032) [2024-07-02 14:58:56,095][36761] Fps is (10 sec: 47513.6, 60 sec: 43690.7, 300 sec: 44042.4). Total num frames: 611581952. Throughput: 0: 43908.1. Samples: 611736080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 14:58:56,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 14:58:56,630][36999] Updated weights for policy 0, policy_version 37330 (0.0038) [2024-07-02 14:59:00,743][36999] Updated weights for policy 0, policy_version 37340 (0.0024) [2024-07-02 14:59:01,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 611794944. Throughput: 0: 43923.6. Samples: 611864900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 14:59:01,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 14:59:04,174][36999] Updated weights for policy 0, policy_version 37350 (0.0036) [2024-07-02 14:59:06,095][36761] Fps is (10 sec: 44236.7, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 612024320. Throughput: 0: 44006.2. Samples: 612134660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 14:59:06,096][36761] Avg episode reward: [(0, '1.011')] [2024-07-02 14:59:08,204][36999] Updated weights for policy 0, policy_version 37360 (0.0037) [2024-07-02 14:59:11,100][36761] Fps is (10 sec: 44216.5, 60 sec: 44233.4, 300 sec: 43986.2). Total num frames: 612237312. Throughput: 0: 44024.5. Samples: 612399580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 14:59:11,101][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:59:11,822][36999] Updated weights for policy 0, policy_version 37370 (0.0038) [2024-07-02 14:59:15,477][36999] Updated weights for policy 0, policy_version 37380 (0.0036) [2024-07-02 14:59:16,095][36761] Fps is (10 sec: 42597.7, 60 sec: 43963.6, 300 sec: 43875.8). Total num frames: 612450304. Throughput: 0: 43872.3. Samples: 612526780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 14:59:16,098][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 14:59:19,190][36999] Updated weights for policy 0, policy_version 37390 (0.0032) [2024-07-02 14:59:21,095][36761] Fps is (10 sec: 44256.8, 60 sec: 43967.0, 300 sec: 43875.8). Total num frames: 612679680. Throughput: 0: 43870.1. Samples: 612794420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 14:59:21,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 14:59:22,883][36999] Updated weights for policy 0, policy_version 37400 (0.0035) [2024-07-02 14:59:26,095][36761] Fps is (10 sec: 44236.9, 60 sec: 43963.8, 300 sec: 43986.9). Total num frames: 612892672. Throughput: 0: 43961.6. Samples: 613059020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 14:59:26,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:59:26,682][36999] Updated weights for policy 0, policy_version 37410 (0.0040) [2024-07-02 14:59:30,749][36999] Updated weights for policy 0, policy_version 37420 (0.0039) [2024-07-02 14:59:31,095][36761] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 613105664. Throughput: 0: 44023.1. Samples: 613187200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 14:59:31,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 14:59:33,926][36999] Updated weights for policy 0, policy_version 37430 (0.0032) [2024-07-02 14:59:36,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43963.7, 300 sec: 43876.5). Total num frames: 613335040. Throughput: 0: 43920.0. Samples: 613451160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 14:59:36,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:59:38,017][36999] Updated weights for policy 0, policy_version 37440 (0.0025) [2024-07-02 14:59:41,095][36761] Fps is (10 sec: 44236.3, 60 sec: 43963.6, 300 sec: 43986.9). Total num frames: 613548032. Throughput: 0: 43895.9. Samples: 613711400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-07-02 14:59:41,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 14:59:41,791][36999] Updated weights for policy 0, policy_version 37450 (0.0037) [2024-07-02 14:59:45,388][36999] Updated weights for policy 0, policy_version 37460 (0.0036) [2024-07-02 14:59:46,100][36761] Fps is (10 sec: 44216.6, 60 sec: 44506.4, 300 sec: 43930.7). Total num frames: 613777408. Throughput: 0: 43898.6. Samples: 613840540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 14:59:46,101][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 14:59:49,056][36999] Updated weights for policy 0, policy_version 37470 (0.0033) [2024-07-02 14:59:51,095][36761] Fps is (10 sec: 44237.2, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 613990400. Throughput: 0: 43888.9. Samples: 614109660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 14:59:51,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 14:59:52,738][36999] Updated weights for policy 0, policy_version 37480 (0.0031) [2024-07-02 14:59:56,095][36761] Fps is (10 sec: 42618.1, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 614203392. Throughput: 0: 43729.8. Samples: 614367220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 14:59:56,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 14:59:56,530][36999] Updated weights for policy 0, policy_version 37490 (0.0038) [2024-07-02 14:59:59,274][36979] Signal inference workers to stop experience collection... (8750 times) [2024-07-02 14:59:59,275][36979] Signal inference workers to resume experience collection... (8750 times) [2024-07-02 14:59:59,320][36999] InferenceWorker_p0-w0: stopping experience collection (8750 times) [2024-07-02 14:59:59,321][36999] InferenceWorker_p0-w0: resuming experience collection (8750 times) [2024-07-02 15:00:00,397][36999] Updated weights for policy 0, policy_version 37500 (0.0037) [2024-07-02 15:00:01,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 43820.3). Total num frames: 614416384. Throughput: 0: 43833.0. Samples: 614499260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 15:00:01,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:00:03,819][36999] Updated weights for policy 0, policy_version 37510 (0.0022) [2024-07-02 15:00:06,095][36761] Fps is (10 sec: 45875.1, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 614662144. Throughput: 0: 43789.4. Samples: 614764940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-07-02 15:00:06,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:00:07,764][36999] Updated weights for policy 0, policy_version 37520 (0.0026) [2024-07-02 15:00:11,100][36761] Fps is (10 sec: 45854.2, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 614875136. Throughput: 0: 43748.1. Samples: 615027880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-07-02 15:00:11,101][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:00:11,372][36999] Updated weights for policy 0, policy_version 37530 (0.0033) [2024-07-02 15:00:15,090][36999] Updated weights for policy 0, policy_version 37540 (0.0032) [2024-07-02 15:00:16,095][36761] Fps is (10 sec: 40960.1, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 615071744. Throughput: 0: 43787.6. Samples: 615157640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-07-02 15:00:16,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:00:18,724][36999] Updated weights for policy 0, policy_version 37550 (0.0031) [2024-07-02 15:00:21,095][36761] Fps is (10 sec: 42617.9, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 615301120. Throughput: 0: 43828.0. Samples: 615423420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-07-02 15:00:21,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:00:22,443][36999] Updated weights for policy 0, policy_version 37560 (0.0042) [2024-07-02 15:00:26,095][36761] Fps is (10 sec: 45875.2, 60 sec: 43963.8, 300 sec: 43931.3). Total num frames: 615530496. Throughput: 0: 43875.2. Samples: 615685780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-07-02 15:00:26,095][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:00:26,136][36999] Updated weights for policy 0, policy_version 37570 (0.0036) [2024-07-02 15:00:29,991][36999] Updated weights for policy 0, policy_version 37580 (0.0029) [2024-07-02 15:00:31,098][36761] Fps is (10 sec: 42586.9, 60 sec: 43688.7, 300 sec: 43875.4). Total num frames: 615727104. Throughput: 0: 43841.4. Samples: 615813320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-07-02 15:00:31,099][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:00:33,590][36999] Updated weights for policy 0, policy_version 37590 (0.0024) [2024-07-02 15:00:36,095][36761] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 615956480. Throughput: 0: 43752.9. Samples: 616078540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-07-02 15:00:36,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:00:36,122][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000037595_615956480.pth... [2024-07-02 15:00:36,188][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000036952_605421568.pth [2024-07-02 15:00:37,267][36999] Updated weights for policy 0, policy_version 37600 (0.0031) [2024-07-02 15:00:41,095][36761] Fps is (10 sec: 45887.9, 60 sec: 43963.8, 300 sec: 43932.0). Total num frames: 616185856. Throughput: 0: 43971.1. Samples: 616345920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-07-02 15:00:41,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:00:41,139][36999] Updated weights for policy 0, policy_version 37610 (0.0023) [2024-07-02 15:00:44,627][36999] Updated weights for policy 0, policy_version 37620 (0.0025) [2024-07-02 15:00:46,095][36761] Fps is (10 sec: 42598.3, 60 sec: 43420.9, 300 sec: 43875.8). Total num frames: 616382464. Throughput: 0: 43956.0. Samples: 616477280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-07-02 15:00:46,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:00:48,613][36999] Updated weights for policy 0, policy_version 37630 (0.0024) [2024-07-02 15:00:51,095][36761] Fps is (10 sec: 44236.1, 60 sec: 43963.7, 300 sec: 43931.3). Total num frames: 616628224. Throughput: 0: 43811.5. Samples: 616736460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-07-02 15:00:51,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:00:51,822][36999] Updated weights for policy 0, policy_version 37640 (0.0024) [2024-07-02 15:00:55,096][36999] Updated weights for policy 0, policy_version 37650 (0.0025) [2024-07-02 15:00:56,095][36761] Fps is (10 sec: 50790.4, 60 sec: 44782.9, 300 sec: 44098.0). Total num frames: 616890368. Throughput: 0: 44624.1. Samples: 617035760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-07-02 15:00:56,096][36761] Avg episode reward: [(0, '1.038')] [2024-07-02 15:00:59,178][36999] Updated weights for policy 0, policy_version 37660 (0.0021) [2024-07-02 15:01:00,922][36979] Signal inference workers to stop experience collection... (8800 times) [2024-07-02 15:01:00,923][36979] Signal inference workers to resume experience collection... (8800 times) [2024-07-02 15:01:00,932][36999] InferenceWorker_p0-w0: stopping experience collection (8800 times) [2024-07-02 15:01:00,942][36999] InferenceWorker_p0-w0: resuming experience collection (8800 times) [2024-07-02 15:01:01,095][36761] Fps is (10 sec: 52428.6, 60 sec: 45602.0, 300 sec: 44209.0). Total num frames: 617152512. Throughput: 0: 45285.6. Samples: 617195500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-07-02 15:01:01,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:01:01,577][36999] Updated weights for policy 0, policy_version 37670 (0.0022) [2024-07-02 15:01:05,749][36999] Updated weights for policy 0, policy_version 37680 (0.0028) [2024-07-02 15:01:06,095][36761] Fps is (10 sec: 47513.2, 60 sec: 45055.9, 300 sec: 44153.5). Total num frames: 617365504. Throughput: 0: 45780.4. Samples: 617483540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-07-02 15:01:06,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:01:08,574][36999] Updated weights for policy 0, policy_version 37690 (0.0022) [2024-07-02 15:01:11,095][36761] Fps is (10 sec: 47513.8, 60 sec: 45878.6, 300 sec: 44375.7). Total num frames: 617627648. Throughput: 0: 46468.8. Samples: 617776880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-07-02 15:01:11,096][36761] Avg episode reward: [(0, '1.040')] [2024-07-02 15:01:12,587][36999] Updated weights for policy 0, policy_version 37700 (0.0024) [2024-07-02 15:01:15,059][36999] Updated weights for policy 0, policy_version 37710 (0.0031) [2024-07-02 15:01:16,100][36761] Fps is (10 sec: 50767.6, 60 sec: 46690.8, 300 sec: 44486.0). Total num frames: 617873408. Throughput: 0: 46961.6. Samples: 617926680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-07-02 15:01:16,100][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:01:19,151][36999] Updated weights for policy 0, policy_version 37720 (0.0023) [2024-07-02 15:01:21,095][36761] Fps is (10 sec: 50790.2, 60 sec: 47240.4, 300 sec: 44542.2). Total num frames: 618135552. Throughput: 0: 47675.8. Samples: 618223960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-07-02 15:01:21,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:01:21,909][36999] Updated weights for policy 0, policy_version 37730 (0.0027) [2024-07-02 15:01:25,592][36999] Updated weights for policy 0, policy_version 37740 (0.0027) [2024-07-02 15:01:26,100][36761] Fps is (10 sec: 47513.4, 60 sec: 46963.8, 300 sec: 44541.6). Total num frames: 618348544. Throughput: 0: 48435.4. Samples: 618525740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 15:01:26,100][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:01:28,606][36999] Updated weights for policy 0, policy_version 37750 (0.0027) [2024-07-02 15:01:31,095][36761] Fps is (10 sec: 45875.2, 60 sec: 47788.7, 300 sec: 44708.9). Total num frames: 618594304. Throughput: 0: 48659.8. Samples: 618666980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 15:01:31,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:01:32,288][36999] Updated weights for policy 0, policy_version 37760 (0.0024) [2024-07-02 15:01:35,326][36999] Updated weights for policy 0, policy_version 37770 (0.0022) [2024-07-02 15:01:36,095][36761] Fps is (10 sec: 52452.5, 60 sec: 48605.8, 300 sec: 44820.0). Total num frames: 618872832. Throughput: 0: 49546.7. Samples: 618966060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 15:01:36,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 15:01:38,554][36999] Updated weights for policy 0, policy_version 37780 (0.0032) [2024-07-02 15:01:41,095][36761] Fps is (10 sec: 52429.3, 60 sec: 48878.8, 300 sec: 44875.5). Total num frames: 619118592. Throughput: 0: 49655.9. Samples: 619270280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 15:01:41,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:01:41,578][36999] Updated weights for policy 0, policy_version 37790 (0.0025) [2024-07-02 15:01:45,053][36999] Updated weights for policy 0, policy_version 37800 (0.0022) [2024-07-02 15:01:46,095][36761] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 45042.1). Total num frames: 619347968. Throughput: 0: 49465.9. Samples: 619421460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-07-02 15:01:46,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 15:01:47,925][36999] Updated weights for policy 0, policy_version 37810 (0.0026) [2024-07-02 15:01:51,095][36761] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 45042.1). Total num frames: 619610112. Throughput: 0: 49528.5. Samples: 619712320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 15:01:51,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:01:51,671][36999] Updated weights for policy 0, policy_version 37820 (0.0021) [2024-07-02 15:01:54,944][36999] Updated weights for policy 0, policy_version 37830 (0.0021) [2024-07-02 15:01:56,095][36761] Fps is (10 sec: 50790.4, 60 sec: 49425.1, 300 sec: 45153.2). Total num frames: 619855872. Throughput: 0: 49599.2. Samples: 620008840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 15:01:56,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:01:57,518][36979] Signal inference workers to stop experience collection... (8850 times) [2024-07-02 15:01:57,518][36979] Signal inference workers to resume experience collection... (8850 times) [2024-07-02 15:01:57,528][36999] InferenceWorker_p0-w0: stopping experience collection (8850 times) [2024-07-02 15:01:57,528][36999] InferenceWorker_p0-w0: resuming experience collection (8850 times) [2024-07-02 15:01:58,522][36999] Updated weights for policy 0, policy_version 37840 (0.0031) [2024-07-02 15:02:01,095][36761] Fps is (10 sec: 49152.1, 60 sec: 49152.1, 300 sec: 45375.4). Total num frames: 620101632. Throughput: 0: 49618.8. Samples: 620159300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 15:02:01,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:02:01,367][36999] Updated weights for policy 0, policy_version 37850 (0.0024) [2024-07-02 15:02:05,240][36999] Updated weights for policy 0, policy_version 37860 (0.0022) [2024-07-02 15:02:06,095][36761] Fps is (10 sec: 50789.8, 60 sec: 49971.2, 300 sec: 45375.6). Total num frames: 620363776. Throughput: 0: 49648.9. Samples: 620458160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-07-02 15:02:06,097][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:02:08,184][36999] Updated weights for policy 0, policy_version 37870 (0.0022) [2024-07-02 15:02:11,100][36761] Fps is (10 sec: 49129.3, 60 sec: 49421.3, 300 sec: 45430.2). Total num frames: 620593152. Throughput: 0: 49589.8. Samples: 620757280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 15:02:11,101][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:02:11,886][36999] Updated weights for policy 0, policy_version 37880 (0.0029) [2024-07-02 15:02:14,997][36999] Updated weights for policy 0, policy_version 37890 (0.0021) [2024-07-02 15:02:16,095][36761] Fps is (10 sec: 47514.0, 60 sec: 49428.8, 300 sec: 45598.2). Total num frames: 620838912. Throughput: 0: 49705.9. Samples: 620903740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 15:02:16,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:02:18,219][36999] Updated weights for policy 0, policy_version 37900 (0.0028) [2024-07-02 15:02:21,100][36761] Fps is (10 sec: 49152.1, 60 sec: 49148.3, 300 sec: 45596.8). Total num frames: 621084672. Throughput: 0: 49700.3. Samples: 621202800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 15:02:21,100][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:02:21,517][36999] Updated weights for policy 0, policy_version 37910 (0.0021) [2024-07-02 15:02:24,751][36999] Updated weights for policy 0, policy_version 37920 (0.0022) [2024-07-02 15:02:26,095][36761] Fps is (10 sec: 50790.5, 60 sec: 49975.0, 300 sec: 45764.1). Total num frames: 621346816. Throughput: 0: 49636.9. Samples: 621503940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 15:02:26,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 15:02:27,924][36999] Updated weights for policy 0, policy_version 37930 (0.0023) [2024-07-02 15:02:31,095][36761] Fps is (10 sec: 49174.2, 60 sec: 49698.2, 300 sec: 45875.2). Total num frames: 621576192. Throughput: 0: 49507.9. Samples: 621649320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 15:02:31,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:02:31,368][36999] Updated weights for policy 0, policy_version 37940 (0.0023) [2024-07-02 15:02:34,390][36999] Updated weights for policy 0, policy_version 37950 (0.0022) [2024-07-02 15:02:36,095][36761] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 45819.7). Total num frames: 621821952. Throughput: 0: 49694.6. Samples: 621948580. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-07-02 15:02:36,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:02:36,205][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000037954_621838336.pth... [2024-07-02 15:02:36,258][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000037275_610713600.pth [2024-07-02 15:02:37,770][36999] Updated weights for policy 0, policy_version 37960 (0.0020) [2024-07-02 15:02:41,095][36761] Fps is (10 sec: 50790.3, 60 sec: 49425.0, 300 sec: 46041.8). Total num frames: 622084096. Throughput: 0: 49626.5. Samples: 622242040. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-07-02 15:02:41,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 15:02:41,168][36999] Updated weights for policy 0, policy_version 37970 (0.0022) [2024-07-02 15:02:44,524][36999] Updated weights for policy 0, policy_version 37980 (0.0026) [2024-07-02 15:02:46,095][36761] Fps is (10 sec: 52428.3, 60 sec: 49971.1, 300 sec: 46208.4). Total num frames: 622346240. Throughput: 0: 49589.6. Samples: 622390840. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-07-02 15:02:46,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 15:02:47,778][36999] Updated weights for policy 0, policy_version 37990 (0.0020) [2024-07-02 15:02:49,432][36979] Signal inference workers to stop experience collection... (8900 times) [2024-07-02 15:02:49,462][36999] InferenceWorker_p0-w0: stopping experience collection (8900 times) [2024-07-02 15:02:49,490][36979] Signal inference workers to resume experience collection... (8900 times) [2024-07-02 15:02:49,513][36999] InferenceWorker_p0-w0: resuming experience collection (8900 times) [2024-07-02 15:02:51,095][36761] Fps is (10 sec: 49152.6, 60 sec: 49425.1, 300 sec: 46152.9). Total num frames: 622575616. Throughput: 0: 49396.6. Samples: 622681000. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-07-02 15:02:51,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:02:51,478][36999] Updated weights for policy 0, policy_version 38000 (0.0030) [2024-07-02 15:02:54,501][36999] Updated weights for policy 0, policy_version 38010 (0.0026) [2024-07-02 15:02:56,095][36761] Fps is (10 sec: 45875.9, 60 sec: 49152.0, 300 sec: 46264.0). Total num frames: 622804992. Throughput: 0: 49392.6. Samples: 622979720. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-07-02 15:02:56,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:02:58,120][36999] Updated weights for policy 0, policy_version 38020 (0.0032) [2024-07-02 15:03:01,095][36761] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 46430.6). Total num frames: 623067136. Throughput: 0: 49464.9. Samples: 623129660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 15:03:01,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:03:01,125][36999] Updated weights for policy 0, policy_version 38030 (0.0024) [2024-07-02 15:03:04,774][36999] Updated weights for policy 0, policy_version 38040 (0.0022) [2024-07-02 15:03:06,095][36761] Fps is (10 sec: 50790.2, 60 sec: 49152.1, 300 sec: 46541.7). Total num frames: 623312896. Throughput: 0: 49394.8. Samples: 623425340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 15:03:06,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:03:08,099][36999] Updated weights for policy 0, policy_version 38050 (0.0023) [2024-07-02 15:03:11,095][36761] Fps is (10 sec: 49151.0, 60 sec: 49428.7, 300 sec: 46597.2). Total num frames: 623558656. Throughput: 0: 49363.8. Samples: 623725320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 15:03:11,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:03:11,334][36999] Updated weights for policy 0, policy_version 38060 (0.0022) [2024-07-02 15:03:14,540][36999] Updated weights for policy 0, policy_version 38070 (0.0021) [2024-07-02 15:03:16,095][36761] Fps is (10 sec: 49152.3, 60 sec: 49425.1, 300 sec: 46653.5). Total num frames: 623804416. Throughput: 0: 49478.3. Samples: 623875840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 15:03:16,095][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:03:17,591][36999] Updated weights for policy 0, policy_version 38080 (0.0023) [2024-07-02 15:03:20,739][36999] Updated weights for policy 0, policy_version 38090 (0.0020) [2024-07-02 15:03:21,095][36761] Fps is (10 sec: 52429.2, 60 sec: 49974.9, 300 sec: 46874.9). Total num frames: 624082944. Throughput: 0: 49625.3. Samples: 624181720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 15:03:21,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 15:03:23,931][36999] Updated weights for policy 0, policy_version 38100 (0.0023) [2024-07-02 15:03:26,095][36761] Fps is (10 sec: 50789.4, 60 sec: 49424.9, 300 sec: 46874.9). Total num frames: 624312320. Throughput: 0: 49634.6. Samples: 624475600. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-07-02 15:03:26,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 15:03:27,276][36999] Updated weights for policy 0, policy_version 38110 (0.0021) [2024-07-02 15:03:30,513][36979] Signal inference workers to stop experience collection... (8950 times) [2024-07-02 15:03:30,517][36979] Signal inference workers to resume experience collection... (8950 times) [2024-07-02 15:03:30,529][36999] InferenceWorker_p0-w0: stopping experience collection (8950 times) [2024-07-02 15:03:30,529][36999] InferenceWorker_p0-w0: resuming experience collection (8950 times) [2024-07-02 15:03:30,888][36999] Updated weights for policy 0, policy_version 38120 (0.0027) [2024-07-02 15:03:31,095][36761] Fps is (10 sec: 49151.6, 60 sec: 49971.1, 300 sec: 47041.5). Total num frames: 624574464. Throughput: 0: 49646.2. Samples: 624624920. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-07-02 15:03:31,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 15:03:33,728][36999] Updated weights for policy 0, policy_version 38130 (0.0021) [2024-07-02 15:03:36,096][36761] Fps is (10 sec: 49151.9, 60 sec: 49698.0, 300 sec: 47097.0). Total num frames: 624803840. Throughput: 0: 49832.6. Samples: 624923480. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-07-02 15:03:36,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:03:37,373][36999] Updated weights for policy 0, policy_version 38140 (0.0021) [2024-07-02 15:03:40,303][36999] Updated weights for policy 0, policy_version 38150 (0.0023) [2024-07-02 15:03:41,095][36761] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 47319.2). Total num frames: 625065984. Throughput: 0: 49722.1. Samples: 625217220. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-07-02 15:03:41,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:03:43,754][36999] Updated weights for policy 0, policy_version 38160 (0.0027) [2024-07-02 15:03:46,095][36761] Fps is (10 sec: 50791.1, 60 sec: 49425.1, 300 sec: 47263.7). Total num frames: 625311744. Throughput: 0: 49637.7. Samples: 625363360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-07-02 15:03:46,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:03:46,920][36999] Updated weights for policy 0, policy_version 38170 (0.0025) [2024-07-02 15:03:50,115][36999] Updated weights for policy 0, policy_version 38180 (0.0025) [2024-07-02 15:03:51,095][36761] Fps is (10 sec: 47513.9, 60 sec: 49425.0, 300 sec: 47319.2). Total num frames: 625541120. Throughput: 0: 49843.9. Samples: 625668320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-07-02 15:03:51,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 15:03:53,666][36999] Updated weights for policy 0, policy_version 38190 (0.0028) [2024-07-02 15:03:56,095][36761] Fps is (10 sec: 49152.2, 60 sec: 49971.2, 300 sec: 47485.8). Total num frames: 625803264. Throughput: 0: 49651.7. Samples: 625959640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-07-02 15:03:56,104][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 15:03:57,371][36999] Updated weights for policy 0, policy_version 38200 (0.0023) [2024-07-02 15:04:00,557][36999] Updated weights for policy 0, policy_version 38210 (0.0031) [2024-07-02 15:04:01,095][36761] Fps is (10 sec: 50790.5, 60 sec: 49698.1, 300 sec: 47541.4). Total num frames: 626049024. Throughput: 0: 49537.2. Samples: 626105020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-07-02 15:04:01,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 15:04:04,057][36999] Updated weights for policy 0, policy_version 38220 (0.0021) [2024-07-02 15:04:06,100][36761] Fps is (10 sec: 49129.1, 60 sec: 49694.3, 300 sec: 47652.4). Total num frames: 626294784. Throughput: 0: 49334.5. Samples: 626402000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-07-02 15:04:06,101][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:04:07,167][36999] Updated weights for policy 0, policy_version 38230 (0.0024) [2024-07-02 15:04:10,813][36999] Updated weights for policy 0, policy_version 38240 (0.0025) [2024-07-02 15:04:11,095][36761] Fps is (10 sec: 49151.5, 60 sec: 49698.1, 300 sec: 47763.5). Total num frames: 626540544. Throughput: 0: 49465.8. Samples: 626701560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-07-02 15:04:11,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:04:13,626][36999] Updated weights for policy 0, policy_version 38250 (0.0026) [2024-07-02 15:04:16,096][36761] Fps is (10 sec: 49174.0, 60 sec: 49697.9, 300 sec: 47819.0). Total num frames: 626786304. Throughput: 0: 49232.4. Samples: 626840380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 15:04:16,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:04:17,274][36999] Updated weights for policy 0, policy_version 38260 (0.0025) [2024-07-02 15:04:20,427][36999] Updated weights for policy 0, policy_version 38270 (0.0023) [2024-07-02 15:04:21,095][36761] Fps is (10 sec: 49152.1, 60 sec: 49152.0, 300 sec: 47930.1). Total num frames: 627032064. Throughput: 0: 49263.2. Samples: 627140320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 15:04:21,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:04:23,759][36999] Updated weights for policy 0, policy_version 38280 (0.0024) [2024-07-02 15:04:24,221][36979] Signal inference workers to stop experience collection... (9000 times) [2024-07-02 15:04:24,221][36979] Signal inference workers to resume experience collection... (9000 times) [2024-07-02 15:04:24,231][36999] InferenceWorker_p0-w0: stopping experience collection (9000 times) [2024-07-02 15:04:24,231][36999] InferenceWorker_p0-w0: resuming experience collection (9000 times) [2024-07-02 15:04:26,096][36761] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 48041.2). Total num frames: 627277824. Throughput: 0: 49337.6. Samples: 627437420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 15:04:26,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:04:26,970][36999] Updated weights for policy 0, policy_version 38290 (0.0027) [2024-07-02 15:04:30,240][36999] Updated weights for policy 0, policy_version 38300 (0.0022) [2024-07-02 15:04:31,095][36761] Fps is (10 sec: 50791.0, 60 sec: 49425.2, 300 sec: 48152.3). Total num frames: 627539968. Throughput: 0: 49601.4. Samples: 627595420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 15:04:31,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 15:04:33,829][36999] Updated weights for policy 0, policy_version 38310 (0.0020) [2024-07-02 15:04:36,095][36761] Fps is (10 sec: 50792.1, 60 sec: 49698.3, 300 sec: 48263.4). Total num frames: 627785728. Throughput: 0: 49483.7. Samples: 627895080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-07-02 15:04:36,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 15:04:36,109][36761] No heartbeat for components: RolloutWorker_w15 (237 seconds) [2024-07-02 15:04:36,218][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000038318_627802112.pth... [2024-07-02 15:04:36,282][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000037595_615956480.pth [2024-07-02 15:04:36,706][36999] Updated weights for policy 0, policy_version 38320 (0.0031) [2024-07-02 15:04:40,469][36999] Updated weights for policy 0, policy_version 38330 (0.0022) [2024-07-02 15:04:41,095][36761] Fps is (10 sec: 47513.3, 60 sec: 49152.1, 300 sec: 48264.1). Total num frames: 628015104. Throughput: 0: 49451.9. Samples: 628184980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 15:04:41,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 15:04:43,579][36999] Updated weights for policy 0, policy_version 38340 (0.0020) [2024-07-02 15:04:46,096][36761] Fps is (10 sec: 47512.4, 60 sec: 49151.9, 300 sec: 48374.4). Total num frames: 628260864. Throughput: 0: 49383.0. Samples: 628327260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 15:04:46,096][36761] Avg episode reward: [(0, '1.037')] [2024-07-02 15:04:47,213][36999] Updated weights for policy 0, policy_version 38350 (0.0025) [2024-07-02 15:04:50,101][36999] Updated weights for policy 0, policy_version 38360 (0.0022) [2024-07-02 15:04:51,095][36761] Fps is (10 sec: 50790.8, 60 sec: 49698.2, 300 sec: 48541.1). Total num frames: 628523008. Throughput: 0: 49365.1. Samples: 628623200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 15:04:51,096][36761] Avg episode reward: [(0, '1.037')] [2024-07-02 15:04:53,910][36999] Updated weights for policy 0, policy_version 38370 (0.0023) [2024-07-02 15:04:56,095][36761] Fps is (10 sec: 52429.8, 60 sec: 49698.1, 300 sec: 48707.7). Total num frames: 628785152. Throughput: 0: 49404.2. Samples: 628924740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 15:04:56,096][36761] Avg episode reward: [(0, '1.037')] [2024-07-02 15:04:56,885][36999] Updated weights for policy 0, policy_version 38380 (0.0028) [2024-07-02 15:05:00,330][36999] Updated weights for policy 0, policy_version 38390 (0.0022) [2024-07-02 15:05:01,095][36761] Fps is (10 sec: 47513.1, 60 sec: 49152.0, 300 sec: 48596.6). Total num frames: 628998144. Throughput: 0: 49601.0. Samples: 629072420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 15:05:01,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 15:05:03,459][36999] Updated weights for policy 0, policy_version 38400 (0.0026) [2024-07-02 15:05:06,096][36761] Fps is (10 sec: 45874.1, 60 sec: 49155.6, 300 sec: 48708.4). Total num frames: 629243904. Throughput: 0: 49576.8. Samples: 629371280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-07-02 15:05:06,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:05:06,799][36999] Updated weights for policy 0, policy_version 38410 (0.0022) [2024-07-02 15:05:10,042][36999] Updated weights for policy 0, policy_version 38420 (0.0026) [2024-07-02 15:05:11,095][36761] Fps is (10 sec: 50791.1, 60 sec: 49425.2, 300 sec: 48929.8). Total num frames: 629506048. Throughput: 0: 49403.5. Samples: 629660560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-07-02 15:05:11,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:05:13,650][36999] Updated weights for policy 0, policy_version 38430 (0.0024) [2024-07-02 15:05:16,095][36761] Fps is (10 sec: 52429.3, 60 sec: 49698.2, 300 sec: 49040.9). Total num frames: 629768192. Throughput: 0: 49165.2. Samples: 629807860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-07-02 15:05:16,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 15:05:16,636][36999] Updated weights for policy 0, policy_version 38440 (0.0023) [2024-07-02 15:05:20,183][36999] Updated weights for policy 0, policy_version 38450 (0.0030) [2024-07-02 15:05:21,096][36761] Fps is (10 sec: 49150.9, 60 sec: 49425.0, 300 sec: 49040.9). Total num frames: 629997568. Throughput: 0: 49324.2. Samples: 630114680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-07-02 15:05:21,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:05:22,268][36979] Signal inference workers to stop experience collection... (9050 times) [2024-07-02 15:05:22,288][36999] InferenceWorker_p0-w0: stopping experience collection (9050 times) [2024-07-02 15:05:22,329][36979] Signal inference workers to resume experience collection... (9050 times) [2024-07-02 15:05:22,329][36999] InferenceWorker_p0-w0: resuming experience collection (9050 times) [2024-07-02 15:05:23,341][36999] Updated weights for policy 0, policy_version 38460 (0.0022) [2024-07-02 15:05:26,095][36761] Fps is (10 sec: 45875.5, 60 sec: 49152.2, 300 sec: 49152.4). Total num frames: 630226944. Throughput: 0: 49382.7. Samples: 630407200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 15:05:26,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:05:27,139][36999] Updated weights for policy 0, policy_version 38470 (0.0028) [2024-07-02 15:05:30,137][36999] Updated weights for policy 0, policy_version 38480 (0.0027) [2024-07-02 15:05:31,095][36761] Fps is (10 sec: 47514.7, 60 sec: 48879.0, 300 sec: 49207.5). Total num frames: 630472704. Throughput: 0: 49084.3. Samples: 630536040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 15:05:31,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:05:33,667][36999] Updated weights for policy 0, policy_version 38490 (0.0023) [2024-07-02 15:05:36,100][36761] Fps is (10 sec: 52405.2, 60 sec: 49421.2, 300 sec: 49373.4). Total num frames: 630751232. Throughput: 0: 49386.9. Samples: 630845840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 15:05:36,100][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:05:37,051][36999] Updated weights for policy 0, policy_version 38500 (0.0022) [2024-07-02 15:05:40,023][36999] Updated weights for policy 0, policy_version 38510 (0.0023) [2024-07-02 15:05:41,100][36761] Fps is (10 sec: 50766.8, 60 sec: 49421.3, 300 sec: 49484.5). Total num frames: 630980608. Throughput: 0: 49275.0. Samples: 631142340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 15:05:41,100][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:05:43,843][36999] Updated weights for policy 0, policy_version 38520 (0.0032) [2024-07-02 15:05:46,095][36761] Fps is (10 sec: 47535.1, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 631226368. Throughput: 0: 49374.7. Samples: 631294280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-07-02 15:05:46,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:05:46,458][36999] Updated weights for policy 0, policy_version 38530 (0.0024) [2024-07-02 15:05:50,259][36999] Updated weights for policy 0, policy_version 38540 (0.0020) [2024-07-02 15:05:51,095][36761] Fps is (10 sec: 49174.3, 60 sec: 49151.9, 300 sec: 49429.7). Total num frames: 631472128. Throughput: 0: 49422.0. Samples: 631595260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:05:51,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:05:53,047][36999] Updated weights for policy 0, policy_version 38550 (0.0027) [2024-07-02 15:05:56,096][36761] Fps is (10 sec: 47512.9, 60 sec: 48605.7, 300 sec: 49318.6). Total num frames: 631701504. Throughput: 0: 49688.6. Samples: 631896560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:05:56,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:05:56,757][36999] Updated weights for policy 0, policy_version 38560 (0.0026) [2024-07-02 15:05:59,588][36999] Updated weights for policy 0, policy_version 38570 (0.0031) [2024-07-02 15:06:01,095][36761] Fps is (10 sec: 52428.7, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 631996416. Throughput: 0: 49532.5. Samples: 632036820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:06:01,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:06:03,514][36999] Updated weights for policy 0, policy_version 38580 (0.0024) [2024-07-02 15:06:06,095][36761] Fps is (10 sec: 54068.1, 60 sec: 49971.4, 300 sec: 49540.8). Total num frames: 632242176. Throughput: 0: 49444.2. Samples: 632339660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:06:06,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 15:06:06,221][36999] Updated weights for policy 0, policy_version 38590 (0.0032) [2024-07-02 15:06:10,005][36999] Updated weights for policy 0, policy_version 38600 (0.0028) [2024-07-02 15:06:11,095][36761] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 49486.0). Total num frames: 632471552. Throughput: 0: 49320.9. Samples: 632626640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:06:11,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:06:13,220][36999] Updated weights for policy 0, policy_version 38610 (0.0020) [2024-07-02 15:06:16,095][36761] Fps is (10 sec: 47513.4, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 632717312. Throughput: 0: 49768.7. Samples: 632775640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-07-02 15:06:16,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:06:16,713][36999] Updated weights for policy 0, policy_version 38620 (0.0022) [2024-07-02 15:06:19,567][36999] Updated weights for policy 0, policy_version 38630 (0.0024) [2024-07-02 15:06:21,095][36761] Fps is (10 sec: 49151.9, 60 sec: 49425.2, 300 sec: 49541.5). Total num frames: 632963072. Throughput: 0: 49422.7. Samples: 633069640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-07-02 15:06:21,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 15:06:21,241][36979] Signal inference workers to stop experience collection... (9100 times) [2024-07-02 15:06:21,241][36979] Signal inference workers to resume experience collection... (9100 times) [2024-07-02 15:06:21,256][36999] InferenceWorker_p0-w0: stopping experience collection (9100 times) [2024-07-02 15:06:21,256][36999] InferenceWorker_p0-w0: resuming experience collection (9100 times) [2024-07-02 15:06:23,338][36999] Updated weights for policy 0, policy_version 38640 (0.0026) [2024-07-02 15:06:26,095][36761] Fps is (10 sec: 50790.8, 60 sec: 49971.3, 300 sec: 49596.3). Total num frames: 633225216. Throughput: 0: 49553.5. Samples: 633372020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-07-02 15:06:26,096][36761] Avg episode reward: [(0, '1.041')] [2024-07-02 15:06:26,132][36999] Updated weights for policy 0, policy_version 38650 (0.0021) [2024-07-02 15:06:29,661][36999] Updated weights for policy 0, policy_version 38660 (0.0027) [2024-07-02 15:06:31,095][36761] Fps is (10 sec: 50791.0, 60 sec: 49971.2, 300 sec: 49485.3). Total num frames: 633470976. Throughput: 0: 49798.4. Samples: 633535200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-07-02 15:06:31,095][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:06:33,314][36999] Updated weights for policy 0, policy_version 38670 (0.0036) [2024-07-02 15:06:36,095][36761] Fps is (10 sec: 47513.8, 60 sec: 49155.8, 300 sec: 49429.7). Total num frames: 633700352. Throughput: 0: 49487.7. Samples: 633822200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-07-02 15:06:36,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:06:36,200][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000038679_633716736.pth... [2024-07-02 15:06:36,271][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000037954_621838336.pth [2024-07-02 15:06:36,422][36999] Updated weights for policy 0, policy_version 38680 (0.0021) [2024-07-02 15:06:39,838][36999] Updated weights for policy 0, policy_version 38690 (0.0026) [2024-07-02 15:06:41,095][36761] Fps is (10 sec: 47513.3, 60 sec: 49428.8, 300 sec: 49485.2). Total num frames: 633946112. Throughput: 0: 49288.2. Samples: 634114520. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-07-02 15:06:41,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 15:06:42,853][36999] Updated weights for policy 0, policy_version 38700 (0.0027) [2024-07-02 15:06:46,095][36761] Fps is (10 sec: 49151.6, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 634191872. Throughput: 0: 49354.7. Samples: 634257780. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-07-02 15:06:46,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:06:46,519][36999] Updated weights for policy 0, policy_version 38710 (0.0027) [2024-07-02 15:06:49,555][36999] Updated weights for policy 0, policy_version 38720 (0.0022) [2024-07-02 15:06:51,095][36761] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 634454016. Throughput: 0: 49354.6. Samples: 634560620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-07-02 15:06:51,098][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:06:53,249][36999] Updated weights for policy 0, policy_version 38730 (0.0023) [2024-07-02 15:06:55,936][36999] Updated weights for policy 0, policy_version 38740 (0.0022) [2024-07-02 15:06:56,095][36761] Fps is (10 sec: 52428.3, 60 sec: 50244.3, 300 sec: 49540.7). Total num frames: 634716160. Throughput: 0: 49694.6. Samples: 634862900. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-07-02 15:06:56,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:06:59,598][36999] Updated weights for policy 0, policy_version 38750 (0.0023) [2024-07-02 15:07:01,095][36761] Fps is (10 sec: 49152.7, 60 sec: 49152.1, 300 sec: 49429.7). Total num frames: 634945536. Throughput: 0: 49765.0. Samples: 635015060. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-07-02 15:07:01,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 15:07:02,629][36999] Updated weights for policy 0, policy_version 38760 (0.0034) [2024-07-02 15:07:06,095][36761] Fps is (10 sec: 47514.3, 60 sec: 49152.0, 300 sec: 49486.0). Total num frames: 635191296. Throughput: 0: 50008.5. Samples: 635320020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 15:07:06,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:07:06,371][36999] Updated weights for policy 0, policy_version 38770 (0.0022) [2024-07-02 15:07:08,855][36999] Updated weights for policy 0, policy_version 38780 (0.0023) [2024-07-02 15:07:11,095][36761] Fps is (10 sec: 50790.1, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 635453440. Throughput: 0: 49916.0. Samples: 635618240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 15:07:11,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:07:12,867][36999] Updated weights for policy 0, policy_version 38790 (0.0023) [2024-07-02 15:07:15,122][36999] Updated weights for policy 0, policy_version 38800 (0.0030) [2024-07-02 15:07:16,095][36761] Fps is (10 sec: 54066.8, 60 sec: 50244.3, 300 sec: 49652.6). Total num frames: 635731968. Throughput: 0: 49787.9. Samples: 635775660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 15:07:16,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 15:07:19,158][36999] Updated weights for policy 0, policy_version 38810 (0.0025) [2024-07-02 15:07:20,604][36979] Signal inference workers to stop experience collection... (9150 times) [2024-07-02 15:07:20,605][36979] Signal inference workers to resume experience collection... (9150 times) [2024-07-02 15:07:20,622][36999] InferenceWorker_p0-w0: stopping experience collection (9150 times) [2024-07-02 15:07:20,622][36999] InferenceWorker_p0-w0: resuming experience collection (9150 times) [2024-07-02 15:07:21,095][36761] Fps is (10 sec: 52429.2, 60 sec: 50244.4, 300 sec: 49596.3). Total num frames: 635977728. Throughput: 0: 49926.7. Samples: 636068900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 15:07:21,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:07:21,687][36999] Updated weights for policy 0, policy_version 38820 (0.0022) [2024-07-02 15:07:25,949][36999] Updated weights for policy 0, policy_version 38830 (0.0023) [2024-07-02 15:07:26,100][36761] Fps is (10 sec: 45854.4, 60 sec: 49421.3, 300 sec: 49540.0). Total num frames: 636190720. Throughput: 0: 50147.8. Samples: 636371400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-07-02 15:07:26,100][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 15:07:28,666][36999] Updated weights for policy 0, policy_version 38840 (0.0024) [2024-07-02 15:07:31,095][36761] Fps is (10 sec: 47513.1, 60 sec: 49698.0, 300 sec: 49596.3). Total num frames: 636452864. Throughput: 0: 49974.2. Samples: 636506620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 15:07:31,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:07:32,587][36999] Updated weights for policy 0, policy_version 38850 (0.0022) [2024-07-02 15:07:35,348][36999] Updated weights for policy 0, policy_version 38860 (0.0025) [2024-07-02 15:07:36,100][36761] Fps is (10 sec: 49151.7, 60 sec: 49694.3, 300 sec: 49484.5). Total num frames: 636682240. Throughput: 0: 49835.4. Samples: 636803440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 15:07:36,101][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:07:36,108][36761] No heartbeat for components: RolloutWorker_w15 (417 seconds) [2024-07-02 15:07:39,426][36999] Updated weights for policy 0, policy_version 38870 (0.0022) [2024-07-02 15:07:41,100][36761] Fps is (10 sec: 49129.6, 60 sec: 49967.4, 300 sec: 49484.5). Total num frames: 636944384. Throughput: 0: 49547.1. Samples: 637092740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 15:07:41,101][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:07:42,429][36999] Updated weights for policy 0, policy_version 38880 (0.0027) [2024-07-02 15:07:45,699][36999] Updated weights for policy 0, policy_version 38890 (0.0024) [2024-07-02 15:07:46,095][36761] Fps is (10 sec: 50813.9, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 637190144. Throughput: 0: 49518.6. Samples: 637243400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 15:07:46,096][36761] Avg episode reward: [(0, '0.972')] [2024-07-02 15:07:49,058][36999] Updated weights for policy 0, policy_version 38900 (0.0032) [2024-07-02 15:07:51,096][36761] Fps is (10 sec: 47534.6, 60 sec: 49425.0, 300 sec: 49540.7). Total num frames: 637419520. Throughput: 0: 49393.6. Samples: 637542740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-07-02 15:07:51,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 15:07:52,122][36999] Updated weights for policy 0, policy_version 38910 (0.0019) [2024-07-02 15:07:55,745][36999] Updated weights for policy 0, policy_version 38920 (0.0027) [2024-07-02 15:07:56,095][36761] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 637681664. Throughput: 0: 49344.9. Samples: 637838760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 15:07:56,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:07:58,765][36999] Updated weights for policy 0, policy_version 38930 (0.0022) [2024-07-02 15:08:01,095][36761] Fps is (10 sec: 50791.6, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 637927424. Throughput: 0: 49067.7. Samples: 637983700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 15:08:01,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 15:08:02,097][36999] Updated weights for policy 0, policy_version 38940 (0.0025) [2024-07-02 15:08:05,282][36999] Updated weights for policy 0, policy_version 38950 (0.0021) [2024-07-02 15:08:06,096][36761] Fps is (10 sec: 50787.5, 60 sec: 49970.7, 300 sec: 49596.2). Total num frames: 638189568. Throughput: 0: 49230.4. Samples: 638284300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 15:08:06,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:08:08,709][36999] Updated weights for policy 0, policy_version 38960 (0.0022) [2024-07-02 15:08:11,095][36761] Fps is (10 sec: 50790.6, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 638435328. Throughput: 0: 49229.6. Samples: 638586500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 15:08:11,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 15:08:11,781][36999] Updated weights for policy 0, policy_version 38970 (0.0026) [2024-07-02 15:08:15,474][36999] Updated weights for policy 0, policy_version 38980 (0.0023) [2024-07-02 15:08:16,095][36761] Fps is (10 sec: 47516.4, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 638664704. Throughput: 0: 49516.0. Samples: 638734840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-07-02 15:08:16,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:08:16,785][36979] Signal inference workers to stop experience collection... (9200 times) [2024-07-02 15:08:16,785][36979] Signal inference workers to resume experience collection... (9200 times) [2024-07-02 15:08:16,796][36999] InferenceWorker_p0-w0: stopping experience collection (9200 times) [2024-07-02 15:08:16,824][36999] InferenceWorker_p0-w0: resuming experience collection (9200 times) [2024-07-02 15:08:18,406][36999] Updated weights for policy 0, policy_version 38990 (0.0027) [2024-07-02 15:08:21,101][36761] Fps is (10 sec: 49124.8, 60 sec: 49147.5, 300 sec: 49539.9). Total num frames: 638926848. Throughput: 0: 49476.9. Samples: 639029940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:08:21,101][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:08:21,999][36999] Updated weights for policy 0, policy_version 39000 (0.0022) [2024-07-02 15:08:25,446][36999] Updated weights for policy 0, policy_version 39010 (0.0030) [2024-07-02 15:08:26,095][36761] Fps is (10 sec: 50789.9, 60 sec: 49701.8, 300 sec: 49485.2). Total num frames: 639172608. Throughput: 0: 49781.0. Samples: 639332660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:08:26,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 15:08:28,842][36999] Updated weights for policy 0, policy_version 39020 (0.0023) [2024-07-02 15:08:31,095][36761] Fps is (10 sec: 49178.2, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 639418368. Throughput: 0: 49796.3. Samples: 639484240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:08:31,100][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 15:08:32,109][36999] Updated weights for policy 0, policy_version 39030 (0.0024) [2024-07-02 15:08:35,384][36999] Updated weights for policy 0, policy_version 39040 (0.0022) [2024-07-02 15:08:36,095][36761] Fps is (10 sec: 49152.0, 60 sec: 49701.9, 300 sec: 49485.2). Total num frames: 639664128. Throughput: 0: 49798.3. Samples: 639783660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:08:36,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 15:08:36,114][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000039043_639680512.pth... [2024-07-02 15:08:36,200][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000038318_627802112.pth [2024-07-02 15:08:38,521][36999] Updated weights for policy 0, policy_version 39050 (0.0021) [2024-07-02 15:08:41,095][36761] Fps is (10 sec: 47514.6, 60 sec: 49155.8, 300 sec: 49429.7). Total num frames: 639893504. Throughput: 0: 49704.1. Samples: 640075440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:08:41,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:08:42,148][36999] Updated weights for policy 0, policy_version 39060 (0.0023) [2024-07-02 15:08:45,087][36999] Updated weights for policy 0, policy_version 39070 (0.0031) [2024-07-02 15:08:46,095][36761] Fps is (10 sec: 49152.7, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 640155648. Throughput: 0: 49621.3. Samples: 640216660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 15:08:46,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 15:08:48,674][36999] Updated weights for policy 0, policy_version 39080 (0.0032) [2024-07-02 15:08:51,100][36761] Fps is (10 sec: 52404.4, 60 sec: 49967.5, 300 sec: 49540.0). Total num frames: 640417792. Throughput: 0: 49315.6. Samples: 640503700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 15:08:51,101][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 15:08:51,859][36999] Updated weights for policy 0, policy_version 39090 (0.0025) [2024-07-02 15:08:55,172][36999] Updated weights for policy 0, policy_version 39100 (0.0036) [2024-07-02 15:08:56,095][36761] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 640663552. Throughput: 0: 49267.4. Samples: 640803540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 15:08:56,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:08:58,483][36999] Updated weights for policy 0, policy_version 39110 (0.0034) [2024-07-02 15:09:01,095][36761] Fps is (10 sec: 45896.3, 60 sec: 49152.0, 300 sec: 49430.5). Total num frames: 640876544. Throughput: 0: 49389.8. Samples: 640957380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 15:09:01,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:09:01,817][36999] Updated weights for policy 0, policy_version 39120 (0.0030) [2024-07-02 15:09:05,199][36999] Updated weights for policy 0, policy_version 39130 (0.0025) [2024-07-02 15:09:06,096][36761] Fps is (10 sec: 45874.7, 60 sec: 48879.3, 300 sec: 49429.7). Total num frames: 641122304. Throughput: 0: 49091.1. Samples: 641238780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 15:09:06,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:09:08,358][36999] Updated weights for policy 0, policy_version 39140 (0.0022) [2024-07-02 15:09:11,095][36761] Fps is (10 sec: 50790.1, 60 sec: 49151.9, 300 sec: 49485.3). Total num frames: 641384448. Throughput: 0: 49063.2. Samples: 641540500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 15:09:11,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:09:11,877][36999] Updated weights for policy 0, policy_version 39150 (0.0022) [2024-07-02 15:09:13,107][36979] Signal inference workers to stop experience collection... (9250 times) [2024-07-02 15:09:13,133][36999] InferenceWorker_p0-w0: stopping experience collection (9250 times) [2024-07-02 15:09:13,222][36979] Signal inference workers to resume experience collection... (9250 times) [2024-07-02 15:09:13,222][36999] InferenceWorker_p0-w0: resuming experience collection (9250 times) [2024-07-02 15:09:15,050][36999] Updated weights for policy 0, policy_version 39160 (0.0033) [2024-07-02 15:09:16,095][36761] Fps is (10 sec: 52429.2, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 641646592. Throughput: 0: 49196.0. Samples: 641698060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 15:09:16,100][36761] Avg episode reward: [(0, '1.039')] [2024-07-02 15:09:18,827][36999] Updated weights for policy 0, policy_version 39170 (0.0022) [2024-07-02 15:09:21,095][36761] Fps is (10 sec: 49151.5, 60 sec: 49156.3, 300 sec: 49485.3). Total num frames: 641875968. Throughput: 0: 49148.0. Samples: 641995320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 15:09:21,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:09:21,836][36999] Updated weights for policy 0, policy_version 39180 (0.0021) [2024-07-02 15:09:25,155][36999] Updated weights for policy 0, policy_version 39190 (0.0026) [2024-07-02 15:09:26,095][36761] Fps is (10 sec: 45875.2, 60 sec: 48878.9, 300 sec: 49374.1). Total num frames: 642105344. Throughput: 0: 49157.6. Samples: 642287540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 15:09:26,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:09:28,256][36999] Updated weights for policy 0, policy_version 39200 (0.0023) [2024-07-02 15:09:31,095][36761] Fps is (10 sec: 47513.7, 60 sec: 48878.9, 300 sec: 49374.1). Total num frames: 642351104. Throughput: 0: 49258.1. Samples: 642433280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-07-02 15:09:31,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 15:09:31,712][36999] Updated weights for policy 0, policy_version 39210 (0.0020) [2024-07-02 15:09:34,716][36999] Updated weights for policy 0, policy_version 39220 (0.0028) [2024-07-02 15:09:36,095][36761] Fps is (10 sec: 52429.0, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 642629632. Throughput: 0: 49286.7. Samples: 642721380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 15:09:36,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:09:38,518][36999] Updated weights for policy 0, policy_version 39230 (0.0028) [2024-07-02 15:09:41,095][36761] Fps is (10 sec: 52429.8, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 642875392. Throughput: 0: 49360.6. Samples: 643024760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 15:09:41,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 15:09:41,664][36999] Updated weights for policy 0, policy_version 39240 (0.0027) [2024-07-02 15:09:44,996][36999] Updated weights for policy 0, policy_version 39250 (0.0021) [2024-07-02 15:09:46,095][36761] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 643104768. Throughput: 0: 49254.6. Samples: 643173840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 15:09:46,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:09:48,376][36999] Updated weights for policy 0, policy_version 39260 (0.0027) [2024-07-02 15:09:51,096][36761] Fps is (10 sec: 47512.7, 60 sec: 48882.6, 300 sec: 49374.1). Total num frames: 643350528. Throughput: 0: 49531.6. Samples: 643467700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 15:09:51,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:09:51,843][36999] Updated weights for policy 0, policy_version 39270 (0.0026) [2024-07-02 15:09:54,711][36999] Updated weights for policy 0, policy_version 39280 (0.0022) [2024-07-02 15:09:56,095][36761] Fps is (10 sec: 50790.2, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 643612672. Throughput: 0: 49552.0. Samples: 643770340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-07-02 15:09:56,098][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:09:58,095][36999] Updated weights for policy 0, policy_version 39290 (0.0022) [2024-07-02 15:10:01,095][36761] Fps is (10 sec: 52428.9, 60 sec: 49971.1, 300 sec: 49596.3). Total num frames: 643874816. Throughput: 0: 49391.5. Samples: 643920680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:10:01,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:10:01,583][36999] Updated weights for policy 0, policy_version 39300 (0.0023) [2024-07-02 15:10:01,860][36979] Signal inference workers to stop experience collection... (9300 times) [2024-07-02 15:10:01,864][36979] Signal inference workers to resume experience collection... (9300 times) [2024-07-02 15:10:01,879][36999] InferenceWorker_p0-w0: stopping experience collection (9300 times) [2024-07-02 15:10:01,879][36999] InferenceWorker_p0-w0: resuming experience collection (9300 times) [2024-07-02 15:10:04,552][36999] Updated weights for policy 0, policy_version 39310 (0.0027) [2024-07-02 15:10:06,095][36761] Fps is (10 sec: 49151.7, 60 sec: 49698.2, 300 sec: 49485.2). Total num frames: 644104192. Throughput: 0: 49386.3. Samples: 644217700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:10:06,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:10:07,911][36999] Updated weights for policy 0, policy_version 39320 (0.0023) [2024-07-02 15:10:11,095][36761] Fps is (10 sec: 49152.9, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 644366336. Throughput: 0: 49707.3. Samples: 644524360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:10:11,095][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:10:11,223][36999] Updated weights for policy 0, policy_version 39330 (0.0026) [2024-07-02 15:10:14,493][36999] Updated weights for policy 0, policy_version 39340 (0.0024) [2024-07-02 15:10:16,095][36761] Fps is (10 sec: 49152.6, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 644595712. Throughput: 0: 49735.7. Samples: 644671380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:10:16,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:10:17,516][36999] Updated weights for policy 0, policy_version 39350 (0.0025) [2024-07-02 15:10:21,095][36761] Fps is (10 sec: 47513.9, 60 sec: 49425.3, 300 sec: 49540.8). Total num frames: 644841472. Throughput: 0: 49813.1. Samples: 644962960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:10:21,095][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:10:21,395][36999] Updated weights for policy 0, policy_version 39360 (0.0028) [2024-07-02 15:10:24,180][36999] Updated weights for policy 0, policy_version 39370 (0.0028) [2024-07-02 15:10:26,095][36761] Fps is (10 sec: 49151.7, 60 sec: 49698.2, 300 sec: 49540.7). Total num frames: 645087232. Throughput: 0: 49762.1. Samples: 645264060. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-07-02 15:10:26,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:10:27,947][36999] Updated weights for policy 0, policy_version 39380 (0.0023) [2024-07-02 15:10:31,049][36999] Updated weights for policy 0, policy_version 39390 (0.0022) [2024-07-02 15:10:31,095][36761] Fps is (10 sec: 52427.8, 60 sec: 50244.3, 300 sec: 49541.5). Total num frames: 645365760. Throughput: 0: 49611.5. Samples: 645406360. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-07-02 15:10:31,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:10:34,472][36999] Updated weights for policy 0, policy_version 39400 (0.0023) [2024-07-02 15:10:36,095][36761] Fps is (10 sec: 52428.9, 60 sec: 49698.1, 300 sec: 49597.1). Total num frames: 645611520. Throughput: 0: 49905.4. Samples: 645713440. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-07-02 15:10:36,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:10:36,125][36761] No heartbeat for components: RolloutWorker_w15 (597 seconds) [2024-07-02 15:10:36,233][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000039406_645627904.pth... [2024-07-02 15:10:36,299][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000038679_633716736.pth [2024-07-02 15:10:37,879][36999] Updated weights for policy 0, policy_version 39410 (0.0022) [2024-07-02 15:10:41,066][36999] Updated weights for policy 0, policy_version 39420 (0.0022) [2024-07-02 15:10:41,096][36761] Fps is (10 sec: 49151.4, 60 sec: 49697.9, 300 sec: 49596.3). Total num frames: 645857280. Throughput: 0: 49810.5. Samples: 646011820. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-07-02 15:10:41,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 15:10:44,600][36999] Updated weights for policy 0, policy_version 39430 (0.0026) [2024-07-02 15:10:46,095][36761] Fps is (10 sec: 47514.0, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 646086656. Throughput: 0: 49745.1. Samples: 646159200. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-07-02 15:10:46,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:10:47,546][36999] Updated weights for policy 0, policy_version 39440 (0.0023) [2024-07-02 15:10:50,910][36999] Updated weights for policy 0, policy_version 39450 (0.0026) [2024-07-02 15:10:51,095][36761] Fps is (10 sec: 49152.8, 60 sec: 49971.3, 300 sec: 49651.9). Total num frames: 646348800. Throughput: 0: 49722.3. Samples: 646455200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:10:51,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:10:54,316][36999] Updated weights for policy 0, policy_version 39460 (0.0022) [2024-07-02 15:10:56,095][36761] Fps is (10 sec: 50790.3, 60 sec: 49698.2, 300 sec: 49485.3). Total num frames: 646594560. Throughput: 0: 49321.3. Samples: 646743820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:10:56,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:10:56,776][36979] Signal inference workers to stop experience collection... (9350 times) [2024-07-02 15:10:56,823][36999] InferenceWorker_p0-w0: stopping experience collection (9350 times) [2024-07-02 15:10:56,831][36979] Signal inference workers to resume experience collection... (9350 times) [2024-07-02 15:10:56,842][36999] InferenceWorker_p0-w0: resuming experience collection (9350 times) [2024-07-02 15:10:57,955][36999] Updated weights for policy 0, policy_version 39470 (0.0026) [2024-07-02 15:11:00,705][36999] Updated weights for policy 0, policy_version 39480 (0.0027) [2024-07-02 15:11:01,095][36761] Fps is (10 sec: 49152.2, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 646840320. Throughput: 0: 49612.4. Samples: 646903940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:11:01,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:11:04,414][36999] Updated weights for policy 0, policy_version 39490 (0.0026) [2024-07-02 15:11:06,095][36761] Fps is (10 sec: 49151.3, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 647086080. Throughput: 0: 49828.2. Samples: 647205240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:11:06,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:11:07,288][36999] Updated weights for policy 0, policy_version 39500 (0.0028) [2024-07-02 15:11:10,927][36999] Updated weights for policy 0, policy_version 39510 (0.0025) [2024-07-02 15:11:11,095][36761] Fps is (10 sec: 49151.9, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 647331840. Throughput: 0: 49816.0. Samples: 647505780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-07-02 15:11:11,096][36761] Avg episode reward: [(0, '1.041')] [2024-07-02 15:11:13,893][36999] Updated weights for policy 0, policy_version 39520 (0.0026) [2024-07-02 15:11:16,095][36761] Fps is (10 sec: 50790.6, 60 sec: 49971.1, 300 sec: 49596.3). Total num frames: 647593984. Throughput: 0: 49800.5. Samples: 647647380. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-07-02 15:11:16,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:11:17,518][36999] Updated weights for policy 0, policy_version 39530 (0.0023) [2024-07-02 15:11:20,404][36999] Updated weights for policy 0, policy_version 39540 (0.0022) [2024-07-02 15:11:21,100][36761] Fps is (10 sec: 50767.3, 60 sec: 49967.3, 300 sec: 49540.0). Total num frames: 647839744. Throughput: 0: 49527.4. Samples: 647942400. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-07-02 15:11:21,101][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 15:11:24,374][36999] Updated weights for policy 0, policy_version 39550 (0.0021) [2024-07-02 15:11:26,095][36761] Fps is (10 sec: 49152.6, 60 sec: 49971.3, 300 sec: 49540.8). Total num frames: 648085504. Throughput: 0: 49588.3. Samples: 648243280. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-07-02 15:11:26,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:11:27,140][36999] Updated weights for policy 0, policy_version 39560 (0.0022) [2024-07-02 15:11:30,973][36999] Updated weights for policy 0, policy_version 39570 (0.0025) [2024-07-02 15:11:31,095][36761] Fps is (10 sec: 47535.2, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 648314880. Throughput: 0: 49315.9. Samples: 648378420. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-07-02 15:11:31,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:11:33,727][36999] Updated weights for policy 0, policy_version 39580 (0.0031) [2024-07-02 15:11:36,095][36761] Fps is (10 sec: 47513.0, 60 sec: 49152.0, 300 sec: 49540.8). Total num frames: 648560640. Throughput: 0: 49185.7. Samples: 648668560. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-07-02 15:11:36,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 15:11:37,889][36999] Updated weights for policy 0, policy_version 39590 (0.0037) [2024-07-02 15:11:40,421][36999] Updated weights for policy 0, policy_version 39600 (0.0025) [2024-07-02 15:11:41,095][36761] Fps is (10 sec: 52428.8, 60 sec: 49698.3, 300 sec: 49651.8). Total num frames: 648839168. Throughput: 0: 49248.4. Samples: 648960000. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-07-02 15:11:41,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:11:44,342][36999] Updated weights for policy 0, policy_version 39610 (0.0023) [2024-07-02 15:11:46,096][36761] Fps is (10 sec: 49151.2, 60 sec: 49424.8, 300 sec: 49485.2). Total num frames: 649052160. Throughput: 0: 49371.3. Samples: 649125660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-07-02 15:11:46,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:11:47,270][36999] Updated weights for policy 0, policy_version 39620 (0.0040) [2024-07-02 15:11:51,033][36999] Updated weights for policy 0, policy_version 39630 (0.0031) [2024-07-02 15:11:51,100][36761] Fps is (10 sec: 45854.4, 60 sec: 49148.3, 300 sec: 49428.9). Total num frames: 649297920. Throughput: 0: 48945.4. Samples: 649408000. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-07-02 15:11:51,101][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 15:11:53,017][36979] Signal inference workers to stop experience collection... (9400 times) [2024-07-02 15:11:53,062][36999] InferenceWorker_p0-w0: stopping experience collection (9400 times) [2024-07-02 15:11:53,123][36979] Signal inference workers to resume experience collection... (9400 times) [2024-07-02 15:11:53,124][36999] InferenceWorker_p0-w0: resuming experience collection (9400 times) [2024-07-02 15:11:54,077][36999] Updated weights for policy 0, policy_version 39640 (0.0023) [2024-07-02 15:11:56,095][36761] Fps is (10 sec: 50791.3, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 649560064. Throughput: 0: 48784.4. Samples: 649701080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-07-02 15:11:56,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:11:57,534][36999] Updated weights for policy 0, policy_version 39650 (0.0022) [2024-07-02 15:12:00,670][36999] Updated weights for policy 0, policy_version 39660 (0.0022) [2024-07-02 15:12:01,095][36761] Fps is (10 sec: 50813.7, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 649805824. Throughput: 0: 49129.4. Samples: 649858200. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-07-02 15:12:01,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:12:04,280][36999] Updated weights for policy 0, policy_version 39670 (0.0026) [2024-07-02 15:12:06,095][36761] Fps is (10 sec: 49152.7, 60 sec: 49425.2, 300 sec: 49485.3). Total num frames: 650051584. Throughput: 0: 49121.6. Samples: 650152640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 15:12:06,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:12:07,226][36999] Updated weights for policy 0, policy_version 39680 (0.0023) [2024-07-02 15:12:10,825][36999] Updated weights for policy 0, policy_version 39690 (0.0020) [2024-07-02 15:12:11,095][36761] Fps is (10 sec: 47513.3, 60 sec: 49152.0, 300 sec: 49318.6). Total num frames: 650280960. Throughput: 0: 49120.3. Samples: 650453700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 15:12:11,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:12:13,986][36999] Updated weights for policy 0, policy_version 39700 (0.0033) [2024-07-02 15:12:16,100][36761] Fps is (10 sec: 49128.9, 60 sec: 49148.3, 300 sec: 49373.4). Total num frames: 650543104. Throughput: 0: 49394.1. Samples: 650601380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 15:12:16,101][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:12:17,171][36999] Updated weights for policy 0, policy_version 39710 (0.0023) [2024-07-02 15:12:20,523][36999] Updated weights for policy 0, policy_version 39720 (0.0020) [2024-07-02 15:12:21,096][36761] Fps is (10 sec: 52428.0, 60 sec: 49428.7, 300 sec: 49541.5). Total num frames: 650805248. Throughput: 0: 49602.5. Samples: 650900680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 15:12:21,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 15:12:24,007][36999] Updated weights for policy 0, policy_version 39730 (0.0021) [2024-07-02 15:12:26,095][36761] Fps is (10 sec: 49175.0, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 651034624. Throughput: 0: 49773.9. Samples: 651199820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-07-02 15:12:26,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:12:26,994][36999] Updated weights for policy 0, policy_version 39740 (0.0027) [2024-07-02 15:12:30,774][36999] Updated weights for policy 0, policy_version 39750 (0.0025) [2024-07-02 15:12:31,095][36761] Fps is (10 sec: 47514.2, 60 sec: 49425.0, 300 sec: 49486.0). Total num frames: 651280384. Throughput: 0: 49284.6. Samples: 651343460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) [2024-07-02 15:12:31,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:12:33,570][36999] Updated weights for policy 0, policy_version 39760 (0.0021) [2024-07-02 15:12:36,100][36761] Fps is (10 sec: 49129.0, 60 sec: 49421.3, 300 sec: 49429.7). Total num frames: 651526144. Throughput: 0: 49419.5. Samples: 651631880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) [2024-07-02 15:12:36,100][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:12:36,128][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000039766_651526144.pth... [2024-07-02 15:12:36,205][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000039043_639680512.pth [2024-07-02 15:12:37,203][36999] Updated weights for policy 0, policy_version 39770 (0.0021) [2024-07-02 15:12:40,361][36999] Updated weights for policy 0, policy_version 39780 (0.0030) [2024-07-02 15:12:41,095][36761] Fps is (10 sec: 52429.2, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 651804672. Throughput: 0: 49589.4. Samples: 651932600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) [2024-07-02 15:12:41,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:12:43,700][36999] Updated weights for policy 0, policy_version 39790 (0.0024) [2024-07-02 15:12:46,095][36761] Fps is (10 sec: 50813.9, 60 sec: 49698.3, 300 sec: 49540.8). Total num frames: 652034048. Throughput: 0: 49448.9. Samples: 652083400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) [2024-07-02 15:12:46,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:12:46,814][36999] Updated weights for policy 0, policy_version 39800 (0.0024) [2024-07-02 15:12:46,838][36979] Signal inference workers to stop experience collection... (9450 times) [2024-07-02 15:12:46,839][36979] Signal inference workers to resume experience collection... (9450 times) [2024-07-02 15:12:46,851][36999] InferenceWorker_p0-w0: stopping experience collection (9450 times) [2024-07-02 15:12:46,852][36999] InferenceWorker_p0-w0: resuming experience collection (9450 times) [2024-07-02 15:12:50,606][36999] Updated weights for policy 0, policy_version 39810 (0.0022) [2024-07-02 15:12:51,095][36761] Fps is (10 sec: 45874.8, 60 sec: 49428.7, 300 sec: 49429.7). Total num frames: 652263424. Throughput: 0: 49533.1. Samples: 652381640. Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) [2024-07-02 15:12:51,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:12:53,685][36999] Updated weights for policy 0, policy_version 39820 (0.0025) [2024-07-02 15:12:56,100][36761] Fps is (10 sec: 49129.3, 60 sec: 49421.3, 300 sec: 49484.5). Total num frames: 652525568. Throughput: 0: 49280.4. Samples: 652671540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 15:12:56,101][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:12:57,402][36999] Updated weights for policy 0, policy_version 39830 (0.0023) [2024-07-02 15:13:00,060][36999] Updated weights for policy 0, policy_version 39840 (0.0035) [2024-07-02 15:13:01,095][36761] Fps is (10 sec: 49152.6, 60 sec: 49152.0, 300 sec: 49374.3). Total num frames: 652754944. Throughput: 0: 49317.9. Samples: 652820460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 15:13:01,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:13:03,938][36999] Updated weights for policy 0, policy_version 39850 (0.0030) [2024-07-02 15:13:06,095][36761] Fps is (10 sec: 50813.0, 60 sec: 49697.9, 300 sec: 49485.2). Total num frames: 653033472. Throughput: 0: 49210.7. Samples: 653115160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 15:13:06,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:13:06,816][36999] Updated weights for policy 0, policy_version 39860 (0.0022) [2024-07-02 15:13:10,386][36999] Updated weights for policy 0, policy_version 39870 (0.0022) [2024-07-02 15:13:11,095][36761] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 653262848. Throughput: 0: 49292.7. Samples: 653418000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 15:13:11,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:13:13,563][36999] Updated weights for policy 0, policy_version 39880 (0.0031) [2024-07-02 15:13:16,096][36761] Fps is (10 sec: 47513.5, 60 sec: 49428.7, 300 sec: 49430.6). Total num frames: 653508608. Throughput: 0: 49399.9. Samples: 653566460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 15:13:16,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:13:16,850][36999] Updated weights for policy 0, policy_version 39890 (0.0023) [2024-07-02 15:13:20,129][36999] Updated weights for policy 0, policy_version 39900 (0.0026) [2024-07-02 15:13:21,100][36761] Fps is (10 sec: 50767.6, 60 sec: 49421.5, 300 sec: 49484.5). Total num frames: 653770752. Throughput: 0: 49770.3. Samples: 653871540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-07-02 15:13:21,101][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:13:23,139][36999] Updated weights for policy 0, policy_version 39910 (0.0026) [2024-07-02 15:13:26,095][36761] Fps is (10 sec: 49152.8, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 654000128. Throughput: 0: 49735.1. Samples: 654170680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-07-02 15:13:26,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:13:26,760][36999] Updated weights for policy 0, policy_version 39920 (0.0023) [2024-07-02 15:13:30,066][36999] Updated weights for policy 0, policy_version 39930 (0.0022) [2024-07-02 15:13:31,096][36761] Fps is (10 sec: 49169.0, 60 sec: 49697.3, 300 sec: 49485.1). Total num frames: 654262272. Throughput: 0: 49711.2. Samples: 654320460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-07-02 15:13:31,097][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 15:13:33,401][36979] Signal inference workers to stop experience collection... (9500 times) [2024-07-02 15:13:33,449][36999] InferenceWorker_p0-w0: stopping experience collection (9500 times) [2024-07-02 15:13:33,459][36979] Signal inference workers to resume experience collection... (9500 times) [2024-07-02 15:13:33,469][36999] InferenceWorker_p0-w0: resuming experience collection (9500 times) [2024-07-02 15:13:33,471][36999] Updated weights for policy 0, policy_version 39940 (0.0024) [2024-07-02 15:13:36,095][36761] Fps is (10 sec: 50789.8, 60 sec: 49701.8, 300 sec: 49540.7). Total num frames: 654508032. Throughput: 0: 49562.2. Samples: 654611940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-07-02 15:13:36,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:13:36,117][36761] No heartbeat for components: RolloutWorker_w15 (777 seconds) [2024-07-02 15:13:36,789][36999] Updated weights for policy 0, policy_version 39950 (0.0029) [2024-07-02 15:13:40,157][36999] Updated weights for policy 0, policy_version 39960 (0.0021) [2024-07-02 15:13:41,095][36761] Fps is (10 sec: 49157.5, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 654753792. Throughput: 0: 49566.0. Samples: 654901780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-07-02 15:13:41,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:13:43,269][36999] Updated weights for policy 0, policy_version 39970 (0.0023) [2024-07-02 15:13:46,099][36761] Fps is (10 sec: 49136.5, 60 sec: 49422.3, 300 sec: 49429.9). Total num frames: 654999552. Throughput: 0: 49569.2. Samples: 655051240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-07-02 15:13:46,099][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 15:13:46,431][36999] Updated weights for policy 0, policy_version 39980 (0.0028) [2024-07-02 15:13:49,581][36999] Updated weights for policy 0, policy_version 39990 (0.0023) [2024-07-02 15:13:51,095][36761] Fps is (10 sec: 49151.8, 60 sec: 49698.2, 300 sec: 49429.7). Total num frames: 655245312. Throughput: 0: 49550.4. Samples: 655344920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-07-02 15:13:51,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 15:13:53,226][36999] Updated weights for policy 0, policy_version 40000 (0.0032) [2024-07-02 15:13:55,982][36999] Updated weights for policy 0, policy_version 40010 (0.0021) [2024-07-02 15:13:56,095][36761] Fps is (10 sec: 52445.5, 60 sec: 49974.9, 300 sec: 49651.8). Total num frames: 655523840. Throughput: 0: 49515.1. Samples: 655646180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-07-02 15:13:56,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 15:14:00,187][36999] Updated weights for policy 0, policy_version 40020 (0.0025) [2024-07-02 15:14:01,095][36761] Fps is (10 sec: 50789.8, 60 sec: 49971.1, 300 sec: 49596.3). Total num frames: 655753216. Throughput: 0: 49697.4. Samples: 655802840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-07-02 15:14:01,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:14:03,007][36999] Updated weights for policy 0, policy_version 40030 (0.0024) [2024-07-02 15:14:06,095][36761] Fps is (10 sec: 44237.4, 60 sec: 48879.1, 300 sec: 49429.7). Total num frames: 655966208. Throughput: 0: 49443.3. Samples: 656096260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-07-02 15:14:06,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 15:14:06,643][36999] Updated weights for policy 0, policy_version 40040 (0.0023) [2024-07-02 15:14:09,590][36999] Updated weights for policy 0, policy_version 40050 (0.0028) [2024-07-02 15:14:11,100][36761] Fps is (10 sec: 47492.7, 60 sec: 49421.4, 300 sec: 49428.9). Total num frames: 656228352. Throughput: 0: 49541.2. Samples: 656400260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-07-02 15:14:11,100][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:14:12,984][36999] Updated weights for policy 0, policy_version 40060 (0.0026) [2024-07-02 15:14:16,037][36999] Updated weights for policy 0, policy_version 40070 (0.0021) [2024-07-02 15:14:16,095][36761] Fps is (10 sec: 54067.3, 60 sec: 49971.4, 300 sec: 49596.3). Total num frames: 656506880. Throughput: 0: 49435.5. Samples: 656545000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:14:16,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:14:19,732][36999] Updated weights for policy 0, policy_version 40080 (0.0022) [2024-07-02 15:14:21,095][36761] Fps is (10 sec: 50813.1, 60 sec: 49428.8, 300 sec: 49596.3). Total num frames: 656736256. Throughput: 0: 49748.5. Samples: 656850620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:14:21,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:14:22,755][36999] Updated weights for policy 0, policy_version 40090 (0.0022) [2024-07-02 15:14:25,941][36999] Updated weights for policy 0, policy_version 40100 (0.0023) [2024-07-02 15:14:26,095][36761] Fps is (10 sec: 49151.2, 60 sec: 49971.1, 300 sec: 49651.9). Total num frames: 656998400. Throughput: 0: 49917.7. Samples: 657148080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:14:26,096][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 15:14:26,919][36979] Signal inference workers to stop experience collection... (9550 times) [2024-07-02 15:14:26,952][36999] InferenceWorker_p0-w0: stopping experience collection (9550 times) [2024-07-02 15:14:26,979][36979] Signal inference workers to resume experience collection... (9550 times) [2024-07-02 15:14:26,980][36999] InferenceWorker_p0-w0: resuming experience collection (9550 times) [2024-07-02 15:14:29,282][36999] Updated weights for policy 0, policy_version 40110 (0.0022) [2024-07-02 15:14:31,095][36761] Fps is (10 sec: 49152.7, 60 sec: 49426.0, 300 sec: 49485.3). Total num frames: 657227776. Throughput: 0: 49895.7. Samples: 657296380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:14:31,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 15:14:32,412][36999] Updated weights for policy 0, policy_version 40120 (0.0019) [2024-07-02 15:14:36,032][36999] Updated weights for policy 0, policy_version 40130 (0.0019) [2024-07-02 15:14:36,100][36761] Fps is (10 sec: 49130.0, 60 sec: 49694.5, 300 sec: 49540.0). Total num frames: 657489920. Throughput: 0: 50044.7. Samples: 657597160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:14:36,100][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:14:36,189][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000040131_657506304.pth... [2024-07-02 15:14:36,240][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000039406_645627904.pth [2024-07-02 15:14:39,176][36999] Updated weights for policy 0, policy_version 40140 (0.0029) [2024-07-02 15:14:41,095][36761] Fps is (10 sec: 50790.3, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 657735680. Throughput: 0: 49857.5. Samples: 657889760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 15:14:41,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 15:14:42,604][36999] Updated weights for policy 0, policy_version 40150 (0.0023) [2024-07-02 15:14:45,684][36999] Updated weights for policy 0, policy_version 40160 (0.0033) [2024-07-02 15:14:46,095][36761] Fps is (10 sec: 50813.3, 60 sec: 49973.9, 300 sec: 49651.9). Total num frames: 657997824. Throughput: 0: 49934.3. Samples: 658049880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 15:14:46,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:14:49,289][36999] Updated weights for policy 0, policy_version 40170 (0.0033) [2024-07-02 15:14:51,095][36761] Fps is (10 sec: 49151.3, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 658227200. Throughput: 0: 49813.2. Samples: 658337860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 15:14:51,100][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:14:52,346][36999] Updated weights for policy 0, policy_version 40180 (0.0030) [2024-07-02 15:14:55,915][36999] Updated weights for policy 0, policy_version 40190 (0.0019) [2024-07-02 15:14:56,095][36761] Fps is (10 sec: 47513.9, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 658472960. Throughput: 0: 49621.9. Samples: 658633020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 15:14:56,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:14:59,337][36999] Updated weights for policy 0, policy_version 40200 (0.0022) [2024-07-02 15:15:01,095][36761] Fps is (10 sec: 47514.1, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 658702336. Throughput: 0: 49613.7. Samples: 658777620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-07-02 15:15:01,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 15:15:02,414][36999] Updated weights for policy 0, policy_version 40210 (0.0027) [2024-07-02 15:15:05,654][36999] Updated weights for policy 0, policy_version 40220 (0.0023) [2024-07-02 15:15:06,095][36761] Fps is (10 sec: 49152.4, 60 sec: 49971.2, 300 sec: 49485.2). Total num frames: 658964480. Throughput: 0: 49572.6. Samples: 659081380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 15:15:06,095][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:15:08,861][36999] Updated weights for policy 0, policy_version 40230 (0.0025) [2024-07-02 15:15:11,095][36761] Fps is (10 sec: 50790.6, 60 sec: 49702.0, 300 sec: 49540.8). Total num frames: 659210240. Throughput: 0: 49497.1. Samples: 659375440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 15:15:11,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 15:15:12,207][36999] Updated weights for policy 0, policy_version 40240 (0.0022) [2024-07-02 15:15:15,710][36999] Updated weights for policy 0, policy_version 40250 (0.0026) [2024-07-02 15:15:16,095][36761] Fps is (10 sec: 50790.0, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 659472384. Throughput: 0: 49423.9. Samples: 659520460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 15:15:16,096][36761] Avg episode reward: [(0, '1.039')] [2024-07-02 15:15:18,700][36999] Updated weights for policy 0, policy_version 40260 (0.0022) [2024-07-02 15:15:19,702][36979] Signal inference workers to stop experience collection... (9600 times) [2024-07-02 15:15:19,703][36979] Signal inference workers to resume experience collection... (9600 times) [2024-07-02 15:15:19,713][36999] InferenceWorker_p0-w0: stopping experience collection (9600 times) [2024-07-02 15:15:19,713][36999] InferenceWorker_p0-w0: resuming experience collection (9600 times) [2024-07-02 15:15:21,095][36761] Fps is (10 sec: 50789.6, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 659718144. Throughput: 0: 49521.8. Samples: 659825420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 15:15:21,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 15:15:22,298][36999] Updated weights for policy 0, policy_version 40270 (0.0023) [2024-07-02 15:15:25,606][36999] Updated weights for policy 0, policy_version 40280 (0.0031) [2024-07-02 15:15:26,096][36761] Fps is (10 sec: 50789.5, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 659980288. Throughput: 0: 49655.3. Samples: 660124260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-07-02 15:15:26,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 15:15:28,796][36999] Updated weights for policy 0, policy_version 40290 (0.0022) [2024-07-02 15:15:31,095][36761] Fps is (10 sec: 50790.7, 60 sec: 49971.1, 300 sec: 49540.8). Total num frames: 660226048. Throughput: 0: 49372.9. Samples: 660271660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 15:15:31,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:15:31,955][36999] Updated weights for policy 0, policy_version 40300 (0.0021) [2024-07-02 15:15:35,760][36999] Updated weights for policy 0, policy_version 40310 (0.0022) [2024-07-02 15:15:36,095][36761] Fps is (10 sec: 47514.7, 60 sec: 49428.9, 300 sec: 49485.3). Total num frames: 660455424. Throughput: 0: 49624.2. Samples: 660570940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 15:15:36,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:15:38,358][36999] Updated weights for policy 0, policy_version 40320 (0.0028) [2024-07-02 15:15:41,095][36761] Fps is (10 sec: 49152.4, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 660717568. Throughput: 0: 49668.5. Samples: 660868100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 15:15:41,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:15:42,071][36999] Updated weights for policy 0, policy_version 40330 (0.0028) [2024-07-02 15:15:44,968][36999] Updated weights for policy 0, policy_version 40340 (0.0022) [2024-07-02 15:15:46,095][36761] Fps is (10 sec: 49152.2, 60 sec: 49152.1, 300 sec: 49485.3). Total num frames: 660946944. Throughput: 0: 49798.3. Samples: 661018540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 15:15:46,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 15:15:48,778][36999] Updated weights for policy 0, policy_version 40350 (0.0029) [2024-07-02 15:15:51,095][36761] Fps is (10 sec: 49151.7, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 661209088. Throughput: 0: 49490.1. Samples: 661308440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-07-02 15:15:51,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:15:52,006][36999] Updated weights for policy 0, policy_version 40360 (0.0022) [2024-07-02 15:15:55,674][36999] Updated weights for policy 0, policy_version 40370 (0.0027) [2024-07-02 15:15:56,095][36761] Fps is (10 sec: 49151.3, 60 sec: 49425.0, 300 sec: 49485.2). Total num frames: 661438464. Throughput: 0: 49696.8. Samples: 661611800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 15:15:56,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 15:15:58,439][36999] Updated weights for policy 0, policy_version 40380 (0.0028) [2024-07-02 15:16:01,095][36761] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 49540.8). Total num frames: 661700608. Throughput: 0: 49641.7. Samples: 661754340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 15:16:01,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:16:02,179][36999] Updated weights for policy 0, policy_version 40390 (0.0026) [2024-07-02 15:16:04,328][36979] Signal inference workers to stop experience collection... (9650 times) [2024-07-02 15:16:04,328][36979] Signal inference workers to resume experience collection... (9650 times) [2024-07-02 15:16:04,337][36999] InferenceWorker_p0-w0: stopping experience collection (9650 times) [2024-07-02 15:16:04,337][36999] InferenceWorker_p0-w0: resuming experience collection (9650 times) [2024-07-02 15:16:05,458][36999] Updated weights for policy 0, policy_version 40400 (0.0025) [2024-07-02 15:16:06,095][36761] Fps is (10 sec: 50790.7, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 661946368. Throughput: 0: 49331.2. Samples: 662045320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 15:16:06,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 15:16:08,770][36999] Updated weights for policy 0, policy_version 40410 (0.0023) [2024-07-02 15:16:11,095][36761] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 49429.7). Total num frames: 662175744. Throughput: 0: 49234.4. Samples: 662339800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 15:16:11,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:16:12,041][36999] Updated weights for policy 0, policy_version 40420 (0.0032) [2024-07-02 15:16:15,489][36999] Updated weights for policy 0, policy_version 40430 (0.0030) [2024-07-02 15:16:16,095][36761] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49486.0). Total num frames: 662437888. Throughput: 0: 49272.4. Samples: 662488920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-07-02 15:16:16,097][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:16:18,339][36999] Updated weights for policy 0, policy_version 40440 (0.0032) [2024-07-02 15:16:21,095][36761] Fps is (10 sec: 49151.6, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 662667264. Throughput: 0: 49285.6. Samples: 662788800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-07-02 15:16:21,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:16:21,960][36999] Updated weights for policy 0, policy_version 40450 (0.0023) [2024-07-02 15:16:25,089][36999] Updated weights for policy 0, policy_version 40460 (0.0026) [2024-07-02 15:16:26,095][36761] Fps is (10 sec: 50790.1, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 662945792. Throughput: 0: 49330.5. Samples: 663087980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-07-02 15:16:26,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:16:28,677][36999] Updated weights for policy 0, policy_version 40470 (0.0025) [2024-07-02 15:16:31,095][36761] Fps is (10 sec: 50791.5, 60 sec: 49152.1, 300 sec: 49540.8). Total num frames: 663175168. Throughput: 0: 49320.5. Samples: 663237960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-07-02 15:16:31,095][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 15:16:31,498][36999] Updated weights for policy 0, policy_version 40480 (0.0023) [2024-07-02 15:16:35,392][36999] Updated weights for policy 0, policy_version 40490 (0.0027) [2024-07-02 15:16:36,095][36761] Fps is (10 sec: 47513.6, 60 sec: 49424.9, 300 sec: 49429.7). Total num frames: 663420928. Throughput: 0: 49668.3. Samples: 663543520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-07-02 15:16:36,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 15:16:36,115][36761] No heartbeat for components: RolloutWorker_w15 (957 seconds) [2024-07-02 15:16:36,245][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000040493_663437312.pth... [2024-07-02 15:16:36,283][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000039766_651526144.pth [2024-07-02 15:16:38,200][36999] Updated weights for policy 0, policy_version 40500 (0.0027) [2024-07-02 15:16:41,096][36761] Fps is (10 sec: 47512.3, 60 sec: 48878.8, 300 sec: 49485.2). Total num frames: 663650304. Throughput: 0: 49233.2. Samples: 663827300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-07-02 15:16:41,099][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:16:42,061][36999] Updated weights for policy 0, policy_version 40510 (0.0026) [2024-07-02 15:16:44,511][36999] Updated weights for policy 0, policy_version 40520 (0.0022) [2024-07-02 15:16:46,095][36761] Fps is (10 sec: 49152.4, 60 sec: 49424.9, 300 sec: 49541.5). Total num frames: 663912448. Throughput: 0: 49376.4. Samples: 663976280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 19.0) [2024-07-02 15:16:46,104][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:16:48,536][36999] Updated weights for policy 0, policy_version 40530 (0.0031) [2024-07-02 15:16:51,095][36761] Fps is (10 sec: 52429.9, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 664174592. Throughput: 0: 49619.2. Samples: 664278180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 15:16:51,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:16:51,240][36999] Updated weights for policy 0, policy_version 40540 (0.0022) [2024-07-02 15:16:54,875][36999] Updated weights for policy 0, policy_version 40550 (0.0025) [2024-07-02 15:16:56,095][36761] Fps is (10 sec: 50790.7, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 664420352. Throughput: 0: 49617.8. Samples: 664572600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 15:16:56,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:16:58,128][36999] Updated weights for policy 0, policy_version 40560 (0.0020) [2024-07-02 15:17:01,095][36761] Fps is (10 sec: 47513.5, 60 sec: 49152.1, 300 sec: 49485.2). Total num frames: 664649728. Throughput: 0: 49450.8. Samples: 664714200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 15:17:01,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 15:17:01,875][36999] Updated weights for policy 0, policy_version 40570 (0.0024) [2024-07-02 15:17:04,575][36999] Updated weights for policy 0, policy_version 40580 (0.0027) [2024-07-02 15:17:06,095][36761] Fps is (10 sec: 49151.6, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 664911872. Throughput: 0: 49520.5. Samples: 665017220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 15:17:06,105][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:17:06,632][36979] Signal inference workers to stop experience collection... (9700 times) [2024-07-02 15:17:06,648][36999] InferenceWorker_p0-w0: stopping experience collection (9700 times) [2024-07-02 15:17:06,747][36979] Signal inference workers to resume experience collection... (9700 times) [2024-07-02 15:17:06,747][36999] InferenceWorker_p0-w0: resuming experience collection (9700 times) [2024-07-02 15:17:08,343][36999] Updated weights for policy 0, policy_version 40590 (0.0025) [2024-07-02 15:17:11,095][36761] Fps is (10 sec: 52428.9, 60 sec: 49971.3, 300 sec: 49597.1). Total num frames: 665174016. Throughput: 0: 49506.4. Samples: 665315760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-07-02 15:17:11,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:17:11,143][36999] Updated weights for policy 0, policy_version 40600 (0.0021) [2024-07-02 15:17:14,661][36999] Updated weights for policy 0, policy_version 40610 (0.0023) [2024-07-02 15:17:16,095][36761] Fps is (10 sec: 50790.9, 60 sec: 49698.2, 300 sec: 49540.8). Total num frames: 665419776. Throughput: 0: 49834.1. Samples: 665480500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 15:17:16,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 15:17:17,326][36999] Updated weights for policy 0, policy_version 40620 (0.0023) [2024-07-02 15:17:21,052][36999] Updated weights for policy 0, policy_version 40630 (0.0022) [2024-07-02 15:17:21,100][36761] Fps is (10 sec: 50767.0, 60 sec: 50240.5, 300 sec: 49651.1). Total num frames: 665681920. Throughput: 0: 49806.7. Samples: 665785040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 15:17:21,101][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:17:24,032][36999] Updated weights for policy 0, policy_version 40640 (0.0021) [2024-07-02 15:17:26,095][36761] Fps is (10 sec: 47513.5, 60 sec: 49152.1, 300 sec: 49540.8). Total num frames: 665894912. Throughput: 0: 50183.3. Samples: 666085540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 15:17:26,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:17:27,753][36999] Updated weights for policy 0, policy_version 40650 (0.0026) [2024-07-02 15:17:30,811][36999] Updated weights for policy 0, policy_version 40660 (0.0028) [2024-07-02 15:17:31,095][36761] Fps is (10 sec: 49174.0, 60 sec: 49971.0, 300 sec: 49652.6). Total num frames: 666173440. Throughput: 0: 49812.4. Samples: 666217840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 15:17:31,096][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 15:17:34,420][36999] Updated weights for policy 0, policy_version 40670 (0.0023) [2024-07-02 15:17:36,095][36761] Fps is (10 sec: 54066.5, 60 sec: 50244.3, 300 sec: 49596.3). Total num frames: 666435584. Throughput: 0: 49847.8. Samples: 666521340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-07-02 15:17:36,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 15:17:37,289][36999] Updated weights for policy 0, policy_version 40680 (0.0023) [2024-07-02 15:17:40,826][36999] Updated weights for policy 0, policy_version 40690 (0.0022) [2024-07-02 15:17:41,095][36761] Fps is (10 sec: 49152.7, 60 sec: 50244.5, 300 sec: 49596.3). Total num frames: 666664960. Throughput: 0: 49960.1. Samples: 666820800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 15:17:41,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:17:44,257][36999] Updated weights for policy 0, policy_version 40700 (0.0023) [2024-07-02 15:17:46,100][36761] Fps is (10 sec: 45854.9, 60 sec: 49694.4, 300 sec: 49595.6). Total num frames: 666894336. Throughput: 0: 50084.2. Samples: 666968220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 15:17:46,100][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 15:17:47,338][36999] Updated weights for policy 0, policy_version 40710 (0.0025) [2024-07-02 15:17:49,510][36979] Signal inference workers to stop experience collection... (9750 times) [2024-07-02 15:17:49,539][36999] InferenceWorker_p0-w0: stopping experience collection (9750 times) [2024-07-02 15:17:49,565][36979] Signal inference workers to resume experience collection... (9750 times) [2024-07-02 15:17:49,566][36999] InferenceWorker_p0-w0: resuming experience collection (9750 times) [2024-07-02 15:17:51,095][36761] Fps is (10 sec: 47512.9, 60 sec: 49424.9, 300 sec: 49541.5). Total num frames: 667140096. Throughput: 0: 49915.5. Samples: 667263420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 15:17:51,096][36761] Avg episode reward: [(0, '1.034')] [2024-07-02 15:17:51,163][36999] Updated weights for policy 0, policy_version 40720 (0.0025) [2024-07-02 15:17:54,113][36999] Updated weights for policy 0, policy_version 40730 (0.0024) [2024-07-02 15:17:56,095][36761] Fps is (10 sec: 52452.2, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 667418624. Throughput: 0: 50026.5. Samples: 667566960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 15:17:56,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:17:57,501][36999] Updated weights for policy 0, policy_version 40740 (0.0024) [2024-07-02 15:18:00,688][36999] Updated weights for policy 0, policy_version 40750 (0.0027) [2024-07-02 15:18:01,095][36761] Fps is (10 sec: 54067.8, 60 sec: 50517.3, 300 sec: 49651.9). Total num frames: 667680768. Throughput: 0: 49866.7. Samples: 667724500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-07-02 15:18:01,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 15:18:03,927][36999] Updated weights for policy 0, policy_version 40760 (0.0023) [2024-07-02 15:18:06,097][36761] Fps is (10 sec: 47506.7, 60 sec: 49696.9, 300 sec: 49596.1). Total num frames: 667893760. Throughput: 0: 49741.6. Samples: 668023260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:18:06,097][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:18:06,942][36999] Updated weights for policy 0, policy_version 40770 (0.0029) [2024-07-02 15:18:10,392][36999] Updated weights for policy 0, policy_version 40780 (0.0022) [2024-07-02 15:18:11,095][36761] Fps is (10 sec: 47513.2, 60 sec: 49698.0, 300 sec: 49651.9). Total num frames: 668155904. Throughput: 0: 49755.5. Samples: 668324540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:18:11,096][36761] Avg episode reward: [(0, '1.039')] [2024-07-02 15:18:13,434][36999] Updated weights for policy 0, policy_version 40790 (0.0026) [2024-07-02 15:18:16,095][36761] Fps is (10 sec: 50797.7, 60 sec: 49698.0, 300 sec: 49597.1). Total num frames: 668401664. Throughput: 0: 50035.5. Samples: 668469440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:18:16,100][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 15:18:17,034][36999] Updated weights for policy 0, policy_version 40800 (0.0023) [2024-07-02 15:18:19,987][36999] Updated weights for policy 0, policy_version 40810 (0.0022) [2024-07-02 15:18:21,095][36761] Fps is (10 sec: 52429.1, 60 sec: 49975.0, 300 sec: 49762.9). Total num frames: 668680192. Throughput: 0: 49877.5. Samples: 668765820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:18:21,096][36761] Avg episode reward: [(0, '1.041')] [2024-07-02 15:18:23,921][36999] Updated weights for policy 0, policy_version 40820 (0.0020) [2024-07-02 15:18:26,095][36761] Fps is (10 sec: 54067.9, 60 sec: 50790.4, 300 sec: 49763.1). Total num frames: 668942336. Throughput: 0: 50037.3. Samples: 669072480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:18:26,096][36761] Avg episode reward: [(0, '1.037')] [2024-07-02 15:18:26,261][36999] Updated weights for policy 0, policy_version 40830 (0.0027) [2024-07-02 15:18:30,322][36999] Updated weights for policy 0, policy_version 40840 (0.0027) [2024-07-02 15:18:31,095][36761] Fps is (10 sec: 49151.7, 60 sec: 49971.2, 300 sec: 49707.4). Total num frames: 669171712. Throughput: 0: 50336.1. Samples: 669233120. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-07-02 15:18:31,107][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:18:33,042][36999] Updated weights for policy 0, policy_version 40850 (0.0025) [2024-07-02 15:18:34,080][36979] Signal inference workers to stop experience collection... (9800 times) [2024-07-02 15:18:34,109][36999] InferenceWorker_p0-w0: stopping experience collection (9800 times) [2024-07-02 15:18:34,192][36979] Signal inference workers to resume experience collection... (9800 times) [2024-07-02 15:18:34,193][36999] InferenceWorker_p0-w0: resuming experience collection (9800 times) [2024-07-02 15:18:36,095][36761] Fps is (10 sec: 45874.3, 60 sec: 49425.0, 300 sec: 49651.8). Total num frames: 669401088. Throughput: 0: 50183.0. Samples: 669521660. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-07-02 15:18:36,096][36761] Avg episode reward: [(0, '1.036')] [2024-07-02 15:18:36,120][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000040857_669401088.pth... [2024-07-02 15:18:36,195][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000040131_657506304.pth [2024-07-02 15:18:37,051][36999] Updated weights for policy 0, policy_version 40860 (0.0023) [2024-07-02 15:18:39,735][36999] Updated weights for policy 0, policy_version 40870 (0.0023) [2024-07-02 15:18:41,095][36761] Fps is (10 sec: 49152.8, 60 sec: 49971.2, 300 sec: 49708.0). Total num frames: 669663232. Throughput: 0: 49965.1. Samples: 669815380. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-07-02 15:18:41,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:18:43,879][36999] Updated weights for policy 0, policy_version 40880 (0.0028) [2024-07-02 15:18:46,096][36761] Fps is (10 sec: 50790.1, 60 sec: 50247.9, 300 sec: 49707.4). Total num frames: 669908992. Throughput: 0: 49909.5. Samples: 669970440. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-07-02 15:18:46,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:18:46,674][36999] Updated weights for policy 0, policy_version 40890 (0.0023) [2024-07-02 15:18:50,405][36999] Updated weights for policy 0, policy_version 40900 (0.0023) [2024-07-02 15:18:51,095][36761] Fps is (10 sec: 47513.7, 60 sec: 49971.4, 300 sec: 49540.8). Total num frames: 670138368. Throughput: 0: 49836.9. Samples: 670265840. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-07-02 15:18:51,096][36761] Avg episode reward: [(0, '1.035')] [2024-07-02 15:18:53,455][36999] Updated weights for policy 0, policy_version 40910 (0.0023) [2024-07-02 15:18:56,095][36761] Fps is (10 sec: 47514.0, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 670384128. Throughput: 0: 49698.1. Samples: 670560960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 15:18:56,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:18:56,986][36999] Updated weights for policy 0, policy_version 40920 (0.0027) [2024-07-02 15:18:59,850][36999] Updated weights for policy 0, policy_version 40930 (0.0026) [2024-07-02 15:19:01,095][36761] Fps is (10 sec: 52427.5, 60 sec: 49698.0, 300 sec: 49818.4). Total num frames: 670662656. Throughput: 0: 49592.0. Samples: 670701080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 15:19:01,096][36761] Avg episode reward: [(0, '1.033')] [2024-07-02 15:19:03,365][36999] Updated weights for policy 0, policy_version 40940 (0.0027) [2024-07-02 15:19:06,095][36761] Fps is (10 sec: 52429.5, 60 sec: 50245.5, 300 sec: 49763.7). Total num frames: 670908416. Throughput: 0: 49594.2. Samples: 670997560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 15:19:06,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:19:06,337][36999] Updated weights for policy 0, policy_version 40950 (0.0021) [2024-07-02 15:19:10,068][36999] Updated weights for policy 0, policy_version 40960 (0.0027) [2024-07-02 15:19:11,095][36761] Fps is (10 sec: 49152.8, 60 sec: 49971.3, 300 sec: 49651.8). Total num frames: 671154176. Throughput: 0: 49502.7. Samples: 671300100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 15:19:11,104][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 15:19:12,914][36999] Updated weights for policy 0, policy_version 40970 (0.0023) [2024-07-02 15:19:16,095][36761] Fps is (10 sec: 47513.1, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 671383552. Throughput: 0: 49208.0. Samples: 671447480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 15:19:16,105][36761] Avg episode reward: [(0, '1.037')] [2024-07-02 15:19:16,523][36999] Updated weights for policy 0, policy_version 40980 (0.0023) [2024-07-02 15:19:19,227][36999] Updated weights for policy 0, policy_version 40990 (0.0023) [2024-07-02 15:19:21,095][36761] Fps is (10 sec: 47513.5, 60 sec: 49152.0, 300 sec: 49596.3). Total num frames: 671629312. Throughput: 0: 49376.6. Samples: 671743600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 15:19:21,096][36761] Avg episode reward: [(0, '1.040')] [2024-07-02 15:19:23,208][36999] Updated weights for policy 0, policy_version 41000 (0.0023) [2024-07-02 15:19:25,796][36999] Updated weights for policy 0, policy_version 41010 (0.0027) [2024-07-02 15:19:26,095][36761] Fps is (10 sec: 52428.9, 60 sec: 49425.0, 300 sec: 49762.9). Total num frames: 671907840. Throughput: 0: 49574.0. Samples: 672046220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 15:19:26,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:19:29,528][36999] Updated weights for policy 0, policy_version 41020 (0.0022) [2024-07-02 15:19:31,095][36761] Fps is (10 sec: 52429.0, 60 sec: 49698.2, 300 sec: 49708.2). Total num frames: 672153600. Throughput: 0: 49684.7. Samples: 672206240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 15:19:31,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:19:32,630][36999] Updated weights for policy 0, policy_version 41030 (0.0028) [2024-07-02 15:19:36,095][36761] Fps is (10 sec: 45875.5, 60 sec: 49425.2, 300 sec: 49596.3). Total num frames: 672366592. Throughput: 0: 49505.6. Samples: 672493600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 15:19:36,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:19:36,105][36761] No heartbeat for components: RolloutWorker_w15 (1137 seconds) [2024-07-02 15:19:36,245][36979] Signal inference workers to stop experience collection... (9850 times) [2024-07-02 15:19:36,245][36979] Signal inference workers to resume experience collection... (9850 times) [2024-07-02 15:19:36,268][36999] InferenceWorker_p0-w0: stopping experience collection (9850 times) [2024-07-02 15:19:36,268][36999] InferenceWorker_p0-w0: resuming experience collection (9850 times) [2024-07-02 15:19:36,384][36999] Updated weights for policy 0, policy_version 41040 (0.0030) [2024-07-02 15:19:39,489][36999] Updated weights for policy 0, policy_version 41050 (0.0023) [2024-07-02 15:19:41,095][36761] Fps is (10 sec: 47513.7, 60 sec: 49425.0, 300 sec: 49596.3). Total num frames: 672628736. Throughput: 0: 49323.8. Samples: 672780520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 15:19:41,095][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:19:43,238][36999] Updated weights for policy 0, policy_version 41060 (0.0022) [2024-07-02 15:19:46,051][36999] Updated weights for policy 0, policy_version 41070 (0.0031) [2024-07-02 15:19:46,100][36761] Fps is (10 sec: 52405.0, 60 sec: 49694.5, 300 sec: 49706.6). Total num frames: 672890880. Throughput: 0: 49597.3. Samples: 672933180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 15:19:46,101][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:19:49,817][36999] Updated weights for policy 0, policy_version 41080 (0.0023) [2024-07-02 15:19:51,100][36761] Fps is (10 sec: 47491.5, 60 sec: 49421.2, 300 sec: 49595.5). Total num frames: 673103872. Throughput: 0: 49614.1. Samples: 673230420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 15:19:51,100][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:19:52,763][36999] Updated weights for policy 0, policy_version 41090 (0.0022) [2024-07-02 15:19:56,095][36761] Fps is (10 sec: 47535.1, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 673366016. Throughput: 0: 49628.4. Samples: 673533380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 15:19:56,100][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:19:56,161][36999] Updated weights for policy 0, policy_version 41100 (0.0022) [2024-07-02 15:19:59,168][36999] Updated weights for policy 0, policy_version 41110 (0.0021) [2024-07-02 15:20:01,097][36761] Fps is (10 sec: 52444.5, 60 sec: 49423.9, 300 sec: 49707.1). Total num frames: 673628160. Throughput: 0: 49577.9. Samples: 673678560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 15:20:01,097][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 15:20:02,846][36999] Updated weights for policy 0, policy_version 41120 (0.0022) [2024-07-02 15:20:05,578][36999] Updated weights for policy 0, policy_version 41130 (0.0029) [2024-07-02 15:20:06,095][36761] Fps is (10 sec: 50790.9, 60 sec: 49425.1, 300 sec: 49707.4). Total num frames: 673873920. Throughput: 0: 49821.8. Samples: 673985580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 15:20:06,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 15:20:09,328][36999] Updated weights for policy 0, policy_version 41140 (0.0026) [2024-07-02 15:20:11,099][36761] Fps is (10 sec: 49143.3, 60 sec: 49422.3, 300 sec: 49651.3). Total num frames: 674119680. Throughput: 0: 49724.4. Samples: 674283980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-07-02 15:20:11,099][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:20:12,337][36999] Updated weights for policy 0, policy_version 41150 (0.0021) [2024-07-02 15:20:15,489][36999] Updated weights for policy 0, policy_version 41160 (0.0023) [2024-07-02 15:20:16,095][36761] Fps is (10 sec: 50789.6, 60 sec: 49971.2, 300 sec: 49707.4). Total num frames: 674381824. Throughput: 0: 49536.7. Samples: 674435400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 15:20:16,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:20:19,042][36999] Updated weights for policy 0, policy_version 41170 (0.0027) [2024-07-02 15:20:21,095][36761] Fps is (10 sec: 50807.3, 60 sec: 49971.2, 300 sec: 49651.9). Total num frames: 674627584. Throughput: 0: 49856.0. Samples: 674737120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 15:20:21,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 15:20:21,971][36999] Updated weights for policy 0, policy_version 41180 (0.0033) [2024-07-02 15:20:23,858][36979] Signal inference workers to stop experience collection... (9900 times) [2024-07-02 15:20:23,906][36999] InferenceWorker_p0-w0: stopping experience collection (9900 times) [2024-07-02 15:20:23,967][36979] Signal inference workers to resume experience collection... (9900 times) [2024-07-02 15:20:23,968][36999] InferenceWorker_p0-w0: resuming experience collection (9900 times) [2024-07-02 15:20:25,659][36999] Updated weights for policy 0, policy_version 41190 (0.0022) [2024-07-02 15:20:26,095][36761] Fps is (10 sec: 49151.8, 60 sec: 49425.0, 300 sec: 49651.8). Total num frames: 674873344. Throughput: 0: 49942.4. Samples: 675027940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 15:20:26,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:20:28,753][36999] Updated weights for policy 0, policy_version 41200 (0.0023) [2024-07-02 15:20:31,100][36761] Fps is (10 sec: 50767.5, 60 sec: 49694.3, 300 sec: 49762.2). Total num frames: 675135488. Throughput: 0: 49876.0. Samples: 675177600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 15:20:31,100][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:20:32,338][36999] Updated weights for policy 0, policy_version 41210 (0.0029) [2024-07-02 15:20:35,492][36999] Updated weights for policy 0, policy_version 41220 (0.0028) [2024-07-02 15:20:36,100][36761] Fps is (10 sec: 49130.1, 60 sec: 49967.4, 300 sec: 49651.1). Total num frames: 675364864. Throughput: 0: 49948.4. Samples: 675478100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-07-02 15:20:36,101][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 15:20:36,131][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000041221_675364864.pth... [2024-07-02 15:20:36,211][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000040493_663437312.pth [2024-07-02 15:20:38,736][36999] Updated weights for policy 0, policy_version 41230 (0.0027) [2024-07-02 15:20:41,095][36761] Fps is (10 sec: 45896.0, 60 sec: 49425.0, 300 sec: 49651.8). Total num frames: 675594240. Throughput: 0: 49677.8. Samples: 675768880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:20:41,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:20:42,222][36999] Updated weights for policy 0, policy_version 41240 (0.0026) [2024-07-02 15:20:45,377][36999] Updated weights for policy 0, policy_version 41250 (0.0028) [2024-07-02 15:20:46,095][36761] Fps is (10 sec: 49174.6, 60 sec: 49428.8, 300 sec: 49651.9). Total num frames: 675856384. Throughput: 0: 49696.0. Samples: 675914800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:20:46,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:20:48,954][36999] Updated weights for policy 0, policy_version 41260 (0.0023) [2024-07-02 15:20:51,095][36761] Fps is (10 sec: 50790.8, 60 sec: 49975.1, 300 sec: 49707.4). Total num frames: 676102144. Throughput: 0: 49507.5. Samples: 676213420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:20:51,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:20:52,158][36999] Updated weights for policy 0, policy_version 41270 (0.0022) [2024-07-02 15:20:55,748][36999] Updated weights for policy 0, policy_version 41280 (0.0022) [2024-07-02 15:20:56,095][36761] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 676347904. Throughput: 0: 49547.2. Samples: 676513440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:20:56,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 15:20:58,549][36999] Updated weights for policy 0, policy_version 41290 (0.0024) [2024-07-02 15:21:01,095][36761] Fps is (10 sec: 49152.0, 60 sec: 49426.4, 300 sec: 49651.9). Total num frames: 676593664. Throughput: 0: 49243.3. Samples: 676651340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:21:01,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 15:21:02,321][36999] Updated weights for policy 0, policy_version 41300 (0.0025) [2024-07-02 15:21:05,165][36999] Updated weights for policy 0, policy_version 41310 (0.0030) [2024-07-02 15:21:06,095][36761] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 49707.4). Total num frames: 676839424. Throughput: 0: 49092.4. Samples: 676946280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-07-02 15:21:06,100][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 15:21:08,825][36999] Updated weights for policy 0, policy_version 41320 (0.0025) [2024-07-02 15:21:11,095][36761] Fps is (10 sec: 50790.1, 60 sec: 49700.9, 300 sec: 49707.4). Total num frames: 677101568. Throughput: 0: 49158.4. Samples: 677240060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-07-02 15:21:11,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:21:11,946][36999] Updated weights for policy 0, policy_version 41330 (0.0021) [2024-07-02 15:21:15,643][36999] Updated weights for policy 0, policy_version 41340 (0.0022) [2024-07-02 15:21:16,095][36761] Fps is (10 sec: 50791.0, 60 sec: 49425.2, 300 sec: 49763.0). Total num frames: 677347328. Throughput: 0: 49321.5. Samples: 677396840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-07-02 15:21:16,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:21:18,674][36999] Updated weights for policy 0, policy_version 41350 (0.0022) [2024-07-02 15:21:21,095][36761] Fps is (10 sec: 45875.4, 60 sec: 48879.0, 300 sec: 49540.8). Total num frames: 677560320. Throughput: 0: 49273.1. Samples: 677695160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-07-02 15:21:21,096][36761] Avg episode reward: [(0, '1.010')] [2024-07-02 15:21:22,157][36999] Updated weights for policy 0, policy_version 41360 (0.0021) [2024-07-02 15:21:25,756][36999] Updated weights for policy 0, policy_version 41370 (0.0029) [2024-07-02 15:21:26,100][36761] Fps is (10 sec: 45854.1, 60 sec: 48875.4, 300 sec: 49595.5). Total num frames: 677806080. Throughput: 0: 49255.1. Samples: 677985580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-07-02 15:21:26,100][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:21:28,551][36999] Updated weights for policy 0, policy_version 41380 (0.0026) [2024-07-02 15:21:31,097][36761] Fps is (10 sec: 52418.0, 60 sec: 49154.1, 300 sec: 49707.1). Total num frames: 678084608. Throughput: 0: 49276.9. Samples: 678132360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-07-02 15:21:31,098][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:21:32,454][36999] Updated weights for policy 0, policy_version 41390 (0.0023) [2024-07-02 15:21:33,793][36979] Signal inference workers to stop experience collection... (9950 times) [2024-07-02 15:21:33,806][36999] InferenceWorker_p0-w0: stopping experience collection (9950 times) [2024-07-02 15:21:33,850][36979] Signal inference workers to resume experience collection... (9950 times) [2024-07-02 15:21:33,850][36999] InferenceWorker_p0-w0: resuming experience collection (9950 times) [2024-07-02 15:21:35,265][36999] Updated weights for policy 0, policy_version 41400 (0.0029) [2024-07-02 15:21:36,095][36761] Fps is (10 sec: 55730.0, 60 sec: 49974.9, 300 sec: 49874.0). Total num frames: 678363136. Throughput: 0: 49324.3. Samples: 678433020. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-07-02 15:21:36,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 15:21:38,755][36999] Updated weights for policy 0, policy_version 41410 (0.0022) [2024-07-02 15:21:41,095][36761] Fps is (10 sec: 49162.2, 60 sec: 49698.2, 300 sec: 49707.4). Total num frames: 678576128. Throughput: 0: 49404.1. Samples: 678736620. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-07-02 15:21:41,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:21:41,685][36999] Updated weights for policy 0, policy_version 41420 (0.0024) [2024-07-02 15:21:45,423][36999] Updated weights for policy 0, policy_version 41430 (0.0024) [2024-07-02 15:21:46,095][36761] Fps is (10 sec: 42598.8, 60 sec: 48878.9, 300 sec: 49540.8). Total num frames: 678789120. Throughput: 0: 49466.1. Samples: 678877320. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-07-02 15:21:46,096][36761] Avg episode reward: [(0, '1.010')] [2024-07-02 15:21:48,005][36999] Updated weights for policy 0, policy_version 41440 (0.0027) [2024-07-02 15:21:51,095][36761] Fps is (10 sec: 47513.2, 60 sec: 49151.9, 300 sec: 49596.3). Total num frames: 679051264. Throughput: 0: 49473.3. Samples: 679172580. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-07-02 15:21:51,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 15:21:52,620][36999] Updated weights for policy 0, policy_version 41450 (0.0026) [2024-07-02 15:21:54,927][36999] Updated weights for policy 0, policy_version 41460 (0.0025) [2024-07-02 15:21:56,095][36761] Fps is (10 sec: 54067.3, 60 sec: 49698.2, 300 sec: 49762.9). Total num frames: 679329792. Throughput: 0: 49431.1. Samples: 679464460. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-07-02 15:21:56,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:21:59,004][36999] Updated weights for policy 0, policy_version 41470 (0.0023) [2024-07-02 15:22:01,099][36761] Fps is (10 sec: 52409.2, 60 sec: 49695.0, 300 sec: 49706.8). Total num frames: 679575552. Throughput: 0: 49726.8. Samples: 679634740. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-07-02 15:22:01,100][36761] Avg episode reward: [(0, '1.007')] [2024-07-02 15:22:01,442][36999] Updated weights for policy 0, policy_version 41480 (0.0023) [2024-07-02 15:22:05,514][36999] Updated weights for policy 0, policy_version 41490 (0.0027) [2024-07-02 15:22:06,095][36761] Fps is (10 sec: 47513.6, 60 sec: 49425.1, 300 sec: 49596.3). Total num frames: 679804928. Throughput: 0: 49503.5. Samples: 679922820. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-07-02 15:22:06,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:22:08,262][36999] Updated weights for policy 0, policy_version 41500 (0.0022) [2024-07-02 15:22:11,095][36761] Fps is (10 sec: 45892.2, 60 sec: 48878.9, 300 sec: 49540.8). Total num frames: 680034304. Throughput: 0: 49819.1. Samples: 680227220. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-07-02 15:22:11,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 15:22:11,997][36999] Updated weights for policy 0, policy_version 41510 (0.0023) [2024-07-02 15:22:14,801][36999] Updated weights for policy 0, policy_version 41520 (0.0022) [2024-07-02 15:22:16,095][36761] Fps is (10 sec: 52428.4, 60 sec: 49698.0, 300 sec: 49652.6). Total num frames: 680329216. Throughput: 0: 49761.3. Samples: 680371520. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-07-02 15:22:16,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:22:18,377][36999] Updated weights for policy 0, policy_version 41530 (0.0022) [2024-07-02 15:22:21,095][36761] Fps is (10 sec: 52428.8, 60 sec: 49971.1, 300 sec: 49707.4). Total num frames: 680558592. Throughput: 0: 49612.1. Samples: 680665560. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-07-02 15:22:21,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:22:21,395][36999] Updated weights for policy 0, policy_version 41540 (0.0023) [2024-07-02 15:22:24,865][36999] Updated weights for policy 0, policy_version 41550 (0.0023) [2024-07-02 15:22:26,095][36761] Fps is (10 sec: 49152.0, 60 sec: 50248.0, 300 sec: 49651.8). Total num frames: 680820736. Throughput: 0: 49653.2. Samples: 680971020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 15:22:26,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:22:27,064][36979] Signal inference workers to stop experience collection... (10000 times) [2024-07-02 15:22:27,064][36979] Signal inference workers to resume experience collection... (10000 times) [2024-07-02 15:22:27,073][36999] InferenceWorker_p0-w0: stopping experience collection (10000 times) [2024-07-02 15:22:27,074][36999] InferenceWorker_p0-w0: resuming experience collection (10000 times) [2024-07-02 15:22:27,626][36999] Updated weights for policy 0, policy_version 41560 (0.0022) [2024-07-02 15:22:31,095][36761] Fps is (10 sec: 50790.8, 60 sec: 49699.8, 300 sec: 49596.3). Total num frames: 681066496. Throughput: 0: 49812.0. Samples: 681118860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 15:22:31,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 15:22:31,185][36999] Updated weights for policy 0, policy_version 41570 (0.0022) [2024-07-02 15:22:34,253][36999] Updated weights for policy 0, policy_version 41580 (0.0026) [2024-07-02 15:22:36,095][36761] Fps is (10 sec: 47513.5, 60 sec: 48878.9, 300 sec: 49596.3). Total num frames: 681295872. Throughput: 0: 49892.8. Samples: 681417760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 15:22:36,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:22:36,110][36761] No heartbeat for components: RolloutWorker_w15 (1317 seconds) [2024-07-02 15:22:36,112][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000041583_681295872.pth... [2024-07-02 15:22:36,202][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000040857_669401088.pth [2024-07-02 15:22:37,978][36999] Updated weights for policy 0, policy_version 41590 (0.0023) [2024-07-02 15:22:41,095][36761] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 49708.2). Total num frames: 681558016. Throughput: 0: 49853.8. Samples: 681707880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 15:22:41,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 15:22:41,171][36999] Updated weights for policy 0, policy_version 41600 (0.0028) [2024-07-02 15:22:44,588][36999] Updated weights for policy 0, policy_version 41610 (0.0026) [2024-07-02 15:22:46,095][36761] Fps is (10 sec: 50791.5, 60 sec: 50244.4, 300 sec: 49707.4). Total num frames: 681803776. Throughput: 0: 49482.9. Samples: 681861280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 15:22:46,095][36761] Avg episode reward: [(0, '1.012')] [2024-07-02 15:22:47,710][36999] Updated weights for policy 0, policy_version 41620 (0.0023) [2024-07-02 15:22:51,095][36761] Fps is (10 sec: 47513.1, 60 sec: 49698.1, 300 sec: 49540.8). Total num frames: 682033152. Throughput: 0: 49615.9. Samples: 682155540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-07-02 15:22:51,099][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:22:51,664][36999] Updated weights for policy 0, policy_version 41630 (0.0028) [2024-07-02 15:22:54,349][36999] Updated weights for policy 0, policy_version 41640 (0.0022) [2024-07-02 15:22:56,095][36761] Fps is (10 sec: 50789.9, 60 sec: 49698.1, 300 sec: 49596.3). Total num frames: 682311680. Throughput: 0: 49336.5. Samples: 682447360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 15:22:56,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 15:22:58,085][36999] Updated weights for policy 0, policy_version 41650 (0.0035) [2024-07-02 15:23:01,095][36761] Fps is (10 sec: 50790.3, 60 sec: 49428.1, 300 sec: 49652.1). Total num frames: 682541056. Throughput: 0: 49651.1. Samples: 682605820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 15:23:01,096][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 15:23:01,382][36999] Updated weights for policy 0, policy_version 41660 (0.0024) [2024-07-02 15:23:04,593][36999] Updated weights for policy 0, policy_version 41670 (0.0027) [2024-07-02 15:23:06,095][36761] Fps is (10 sec: 45875.3, 60 sec: 49425.1, 300 sec: 49540.8). Total num frames: 682770432. Throughput: 0: 49658.3. Samples: 682900180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 15:23:06,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:23:07,927][36999] Updated weights for policy 0, policy_version 41680 (0.0023) [2024-07-02 15:23:08,473][36979] Signal inference workers to stop experience collection... (10050 times) [2024-07-02 15:23:08,474][36979] Signal inference workers to resume experience collection... (10050 times) [2024-07-02 15:23:08,497][36999] InferenceWorker_p0-w0: stopping experience collection (10050 times) [2024-07-02 15:23:08,497][36999] InferenceWorker_p0-w0: resuming experience collection (10050 times) [2024-07-02 15:23:11,095][36761] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 49596.3). Total num frames: 683032576. Throughput: 0: 49553.3. Samples: 683200920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 15:23:11,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:23:11,593][36999] Updated weights for policy 0, policy_version 41690 (0.0023) [2024-07-02 15:23:14,503][36999] Updated weights for policy 0, policy_version 41700 (0.0025) [2024-07-02 15:23:16,097][36761] Fps is (10 sec: 52418.9, 60 sec: 49423.6, 300 sec: 49540.5). Total num frames: 683294720. Throughput: 0: 49424.6. Samples: 683343060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-07-02 15:23:16,098][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 15:23:18,048][36999] Updated weights for policy 0, policy_version 41710 (0.0038) [2024-07-02 15:23:20,928][36999] Updated weights for policy 0, policy_version 41720 (0.0029) [2024-07-02 15:23:21,095][36761] Fps is (10 sec: 52428.8, 60 sec: 49971.2, 300 sec: 49540.7). Total num frames: 683556864. Throughput: 0: 49491.1. Samples: 683644860. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-07-02 15:23:21,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:23:24,454][36999] Updated weights for policy 0, policy_version 41730 (0.0027) [2024-07-02 15:23:26,099][36761] Fps is (10 sec: 47506.1, 60 sec: 49149.3, 300 sec: 49484.7). Total num frames: 683769856. Throughput: 0: 49292.7. Samples: 683926220. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-07-02 15:23:26,099][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 15:23:27,792][36999] Updated weights for policy 0, policy_version 41740 (0.0022) [2024-07-02 15:23:31,097][36761] Fps is (10 sec: 44231.1, 60 sec: 48877.8, 300 sec: 49485.0). Total num frames: 683999232. Throughput: 0: 49201.9. Samples: 684075440. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-07-02 15:23:31,097][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 15:23:31,385][36999] Updated weights for policy 0, policy_version 41750 (0.0026) [2024-07-02 15:23:34,005][36999] Updated weights for policy 0, policy_version 41760 (0.0026) [2024-07-02 15:23:36,095][36761] Fps is (10 sec: 49169.0, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 684261376. Throughput: 0: 49349.4. Samples: 684376260. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-07-02 15:23:36,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:23:37,879][36999] Updated weights for policy 0, policy_version 41770 (0.0028) [2024-07-02 15:23:40,579][36999] Updated weights for policy 0, policy_version 41780 (0.0022) [2024-07-02 15:23:41,095][36761] Fps is (10 sec: 52435.8, 60 sec: 49425.0, 300 sec: 49540.8). Total num frames: 684523520. Throughput: 0: 49282.6. Samples: 684665080. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-07-02 15:23:41,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 15:23:44,898][36999] Updated weights for policy 0, policy_version 41790 (0.0022) [2024-07-02 15:23:46,095][36761] Fps is (10 sec: 52429.2, 60 sec: 49698.1, 300 sec: 49651.8). Total num frames: 684785664. Throughput: 0: 49443.8. Samples: 684830780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-07-02 15:23:46,095][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:23:47,103][36999] Updated weights for policy 0, policy_version 41800 (0.0033) [2024-07-02 15:23:51,095][36761] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49485.2). Total num frames: 684982272. Throughput: 0: 49327.9. Samples: 685119940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-07-02 15:23:51,099][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 15:23:51,602][36999] Updated weights for policy 0, policy_version 41810 (0.0027) [2024-07-02 15:23:53,988][36999] Updated weights for policy 0, policy_version 41820 (0.0026) [2024-07-02 15:23:56,095][36761] Fps is (10 sec: 44236.4, 60 sec: 48605.9, 300 sec: 49374.2). Total num frames: 685228032. Throughput: 0: 48999.2. Samples: 685405880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-07-02 15:23:56,096][36761] Avg episode reward: [(0, '1.014')] [2024-07-02 15:23:58,153][36999] Updated weights for policy 0, policy_version 41830 (0.0023) [2024-07-02 15:24:00,870][36999] Updated weights for policy 0, policy_version 41840 (0.0037) [2024-07-02 15:24:01,095][36761] Fps is (10 sec: 52429.2, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 685506560. Throughput: 0: 49049.6. Samples: 685550200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-07-02 15:24:01,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 15:24:04,810][36999] Updated weights for policy 0, policy_version 41850 (0.0021) [2024-07-02 15:24:05,999][36979] Signal inference workers to stop experience collection... (10100 times) [2024-07-02 15:24:05,999][36979] Signal inference workers to resume experience collection... (10100 times) [2024-07-02 15:24:06,040][36999] InferenceWorker_p0-w0: stopping experience collection (10100 times) [2024-07-02 15:24:06,041][36999] InferenceWorker_p0-w0: resuming experience collection (10100 times) [2024-07-02 15:24:06,095][36761] Fps is (10 sec: 52428.8, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 685752320. Throughput: 0: 49004.1. Samples: 685850040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-07-02 15:24:06,096][36761] Avg episode reward: [(0, '1.013')] [2024-07-02 15:24:07,643][36999] Updated weights for policy 0, policy_version 41860 (0.0027) [2024-07-02 15:24:11,095][36761] Fps is (10 sec: 45875.3, 60 sec: 48879.0, 300 sec: 49429.7). Total num frames: 685965312. Throughput: 0: 49391.8. Samples: 686148680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-07-02 15:24:11,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:24:11,360][36999] Updated weights for policy 0, policy_version 41870 (0.0025) [2024-07-02 15:24:14,379][36999] Updated weights for policy 0, policy_version 41880 (0.0027) [2024-07-02 15:24:16,096][36761] Fps is (10 sec: 45870.0, 60 sec: 48606.5, 300 sec: 49429.5). Total num frames: 686211072. Throughput: 0: 48940.3. Samples: 686277740. Policy #0 lag: (min: 3.0, avg: 10.2, max: 23.0) [2024-07-02 15:24:16,097][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:24:17,958][36999] Updated weights for policy 0, policy_version 41890 (0.0021) [2024-07-02 15:24:21,095][36761] Fps is (10 sec: 50790.4, 60 sec: 48606.0, 300 sec: 49374.2). Total num frames: 686473216. Throughput: 0: 48790.2. Samples: 686571820. Policy #0 lag: (min: 3.0, avg: 10.2, max: 23.0) [2024-07-02 15:24:21,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 15:24:21,187][36999] Updated weights for policy 0, policy_version 41900 (0.0027) [2024-07-02 15:24:24,439][36999] Updated weights for policy 0, policy_version 41910 (0.0022) [2024-07-02 15:24:26,095][36761] Fps is (10 sec: 52435.0, 60 sec: 49427.9, 300 sec: 49429.7). Total num frames: 686735360. Throughput: 0: 49085.9. Samples: 686873940. Policy #0 lag: (min: 3.0, avg: 10.2, max: 23.0) [2024-07-02 15:24:26,095][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:24:27,797][36999] Updated weights for policy 0, policy_version 41920 (0.0030) [2024-07-02 15:24:31,021][36999] Updated weights for policy 0, policy_version 41930 (0.0022) [2024-07-02 15:24:31,095][36761] Fps is (10 sec: 50789.9, 60 sec: 49699.2, 300 sec: 49540.8). Total num frames: 686981120. Throughput: 0: 48876.2. Samples: 687030220. Policy #0 lag: (min: 3.0, avg: 10.2, max: 23.0) [2024-07-02 15:24:31,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:24:34,986][36999] Updated weights for policy 0, policy_version 41940 (0.0028) [2024-07-02 15:24:36,095][36761] Fps is (10 sec: 45874.6, 60 sec: 48878.9, 300 sec: 49374.1). Total num frames: 687194112. Throughput: 0: 48973.8. Samples: 687323760. Policy #0 lag: (min: 3.0, avg: 10.2, max: 23.0) [2024-07-02 15:24:36,096][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 15:24:36,235][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000041944_687210496.pth... [2024-07-02 15:24:36,311][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000041221_675364864.pth [2024-07-02 15:24:37,775][36999] Updated weights for policy 0, policy_version 41950 (0.0028) [2024-07-02 15:24:41,095][36761] Fps is (10 sec: 45875.2, 60 sec: 48605.9, 300 sec: 49319.4). Total num frames: 687439872. Throughput: 0: 49234.1. Samples: 687621420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 19.0) [2024-07-02 15:24:41,100][36761] Avg episode reward: [(0, '1.039')] [2024-07-02 15:24:41,451][36999] Updated weights for policy 0, policy_version 41960 (0.0021) [2024-07-02 15:24:44,480][36999] Updated weights for policy 0, policy_version 41970 (0.0020) [2024-07-02 15:24:46,095][36761] Fps is (10 sec: 52428.6, 60 sec: 48878.8, 300 sec: 49541.5). Total num frames: 687718400. Throughput: 0: 49190.1. Samples: 687763760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 19.0) [2024-07-02 15:24:46,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:24:47,879][36999] Updated weights for policy 0, policy_version 41980 (0.0024) [2024-07-02 15:24:51,095][36761] Fps is (10 sec: 50790.7, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 687947776. Throughput: 0: 49197.3. Samples: 688063920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 19.0) [2024-07-02 15:24:51,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:24:51,170][36999] Updated weights for policy 0, policy_version 41990 (0.0020) [2024-07-02 15:24:54,200][36999] Updated weights for policy 0, policy_version 42000 (0.0022) [2024-07-02 15:24:54,962][36979] Signal inference workers to stop experience collection... (10150 times) [2024-07-02 15:24:54,962][36979] Signal inference workers to resume experience collection... (10150 times) [2024-07-02 15:24:54,979][36999] InferenceWorker_p0-w0: stopping experience collection (10150 times) [2024-07-02 15:24:54,979][36999] InferenceWorker_p0-w0: resuming experience collection (10150 times) [2024-07-02 15:24:56,095][36761] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 49429.9). Total num frames: 688209920. Throughput: 0: 49231.4. Samples: 688364100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 19.0) [2024-07-02 15:24:56,098][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:24:57,602][36999] Updated weights for policy 0, policy_version 42010 (0.0034) [2024-07-02 15:25:00,961][36999] Updated weights for policy 0, policy_version 42020 (0.0020) [2024-07-02 15:25:01,095][36761] Fps is (10 sec: 50790.4, 60 sec: 49152.0, 300 sec: 49429.7). Total num frames: 688455680. Throughput: 0: 49579.9. Samples: 688508780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 19.0) [2024-07-02 15:25:01,098][36761] Avg episode reward: [(0, '1.030')] [2024-07-02 15:25:04,512][36999] Updated weights for policy 0, policy_version 42030 (0.0022) [2024-07-02 15:25:06,095][36761] Fps is (10 sec: 45875.2, 60 sec: 48605.8, 300 sec: 49319.2). Total num frames: 688668672. Throughput: 0: 49775.0. Samples: 688811700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 15:25:06,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:25:07,309][36999] Updated weights for policy 0, policy_version 42040 (0.0027) [2024-07-02 15:25:10,926][36999] Updated weights for policy 0, policy_version 42050 (0.0023) [2024-07-02 15:25:11,095][36761] Fps is (10 sec: 49152.5, 60 sec: 49698.2, 300 sec: 49374.2). Total num frames: 688947200. Throughput: 0: 49685.8. Samples: 689109800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 15:25:11,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:25:13,796][36999] Updated weights for policy 0, policy_version 42060 (0.0019) [2024-07-02 15:25:16,097][36761] Fps is (10 sec: 52419.1, 60 sec: 49697.5, 300 sec: 49373.8). Total num frames: 689192960. Throughput: 0: 49386.8. Samples: 689252720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 15:25:16,098][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:25:17,615][36999] Updated weights for policy 0, policy_version 42070 (0.0029) [2024-07-02 15:25:20,586][36999] Updated weights for policy 0, policy_version 42080 (0.0024) [2024-07-02 15:25:21,095][36761] Fps is (10 sec: 50789.7, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 689455104. Throughput: 0: 49511.1. Samples: 689551760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 15:25:21,098][36761] Avg episode reward: [(0, '1.031')] [2024-07-02 15:25:24,029][36999] Updated weights for policy 0, policy_version 42090 (0.0023) [2024-07-02 15:25:26,096][36761] Fps is (10 sec: 50799.4, 60 sec: 49424.9, 300 sec: 49374.9). Total num frames: 689700864. Throughput: 0: 49483.5. Samples: 689848180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-07-02 15:25:26,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:25:27,282][36999] Updated weights for policy 0, policy_version 42100 (0.0022) [2024-07-02 15:25:30,811][36999] Updated weights for policy 0, policy_version 42110 (0.0027) [2024-07-02 15:25:31,095][36761] Fps is (10 sec: 49152.0, 60 sec: 49425.1, 300 sec: 49430.5). Total num frames: 689946624. Throughput: 0: 49857.0. Samples: 690007320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-07-02 15:25:31,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:25:33,839][36999] Updated weights for policy 0, policy_version 42120 (0.0022) [2024-07-02 15:25:36,095][36761] Fps is (10 sec: 47513.8, 60 sec: 49698.1, 300 sec: 49429.7). Total num frames: 690176000. Throughput: 0: 49839.4. Samples: 690306700. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-07-02 15:25:36,096][36761] Avg episode reward: [(0, '1.015')] [2024-07-02 15:25:36,112][36761] No heartbeat for components: RolloutWorker_w15 (1497 seconds) [2024-07-02 15:25:37,402][36999] Updated weights for policy 0, policy_version 42130 (0.0022) [2024-07-02 15:25:40,703][36999] Updated weights for policy 0, policy_version 42140 (0.0023) [2024-07-02 15:25:41,098][36761] Fps is (10 sec: 49140.3, 60 sec: 49969.2, 300 sec: 49429.3). Total num frames: 690438144. Throughput: 0: 49679.2. Samples: 690599780. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-07-02 15:25:41,098][36761] Avg episode reward: [(0, '1.016')] [2024-07-02 15:25:43,861][36999] Updated weights for policy 0, policy_version 42150 (0.0022) [2024-07-02 15:25:46,095][36761] Fps is (10 sec: 50791.4, 60 sec: 49425.2, 300 sec: 49429.7). Total num frames: 690683904. Throughput: 0: 49884.1. Samples: 690753560. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-07-02 15:25:46,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:25:47,117][36999] Updated weights for policy 0, policy_version 42160 (0.0021) [2024-07-02 15:25:48,610][36979] Signal inference workers to stop experience collection... (10200 times) [2024-07-02 15:25:48,631][36999] InferenceWorker_p0-w0: stopping experience collection (10200 times) [2024-07-02 15:25:48,666][36979] Signal inference workers to resume experience collection... (10200 times) [2024-07-02 15:25:48,666][36999] InferenceWorker_p0-w0: resuming experience collection (10200 times) [2024-07-02 15:25:50,518][36999] Updated weights for policy 0, policy_version 42170 (0.0025) [2024-07-02 15:25:51,095][36761] Fps is (10 sec: 52441.4, 60 sec: 50244.2, 300 sec: 49540.8). Total num frames: 690962432. Throughput: 0: 49925.8. Samples: 691058360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-07-02 15:25:51,096][36761] Avg episode reward: [(0, '1.019')] [2024-07-02 15:25:53,977][36999] Updated weights for policy 0, policy_version 42180 (0.0022) [2024-07-02 15:25:56,095][36761] Fps is (10 sec: 47513.1, 60 sec: 49152.1, 300 sec: 49374.1). Total num frames: 691159040. Throughput: 0: 49848.8. Samples: 691353000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-07-02 15:25:56,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:25:56,989][36999] Updated weights for policy 0, policy_version 42190 (0.0023) [2024-07-02 15:26:00,622][36999] Updated weights for policy 0, policy_version 42200 (0.0028) [2024-07-02 15:26:01,095][36761] Fps is (10 sec: 45875.6, 60 sec: 49425.1, 300 sec: 49429.7). Total num frames: 691421184. Throughput: 0: 49597.3. Samples: 691484500. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-07-02 15:26:01,096][36761] Avg episode reward: [(0, '1.028')] [2024-07-02 15:26:03,445][36999] Updated weights for policy 0, policy_version 42210 (0.0025) [2024-07-02 15:26:06,095][36761] Fps is (10 sec: 52428.6, 60 sec: 50244.3, 300 sec: 49429.7). Total num frames: 691683328. Throughput: 0: 49568.5. Samples: 691782340. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-07-02 15:26:06,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 15:26:07,302][36999] Updated weights for policy 0, policy_version 42220 (0.0024) [2024-07-02 15:26:10,089][36999] Updated weights for policy 0, policy_version 42230 (0.0022) [2024-07-02 15:26:11,095][36761] Fps is (10 sec: 54066.3, 60 sec: 50244.1, 300 sec: 49540.7). Total num frames: 691961856. Throughput: 0: 49529.4. Samples: 692077000. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-07-02 15:26:11,096][36761] Avg episode reward: [(0, '1.032')] [2024-07-02 15:26:13,856][36999] Updated weights for policy 0, policy_version 42240 (0.0026) [2024-07-02 15:26:16,095][36761] Fps is (10 sec: 49152.0, 60 sec: 49699.7, 300 sec: 49540.8). Total num frames: 692174848. Throughput: 0: 49725.8. Samples: 692244980. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-07-02 15:26:16,096][36761] Avg episode reward: [(0, '1.022')] [2024-07-02 15:26:16,546][36999] Updated weights for policy 0, policy_version 42250 (0.0025) [2024-07-02 15:26:20,388][36999] Updated weights for policy 0, policy_version 42260 (0.0022) [2024-07-02 15:26:21,095][36761] Fps is (10 sec: 44237.3, 60 sec: 49152.1, 300 sec: 49486.0). Total num frames: 692404224. Throughput: 0: 49606.3. Samples: 692538980. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-07-02 15:26:21,096][36761] Avg episode reward: [(0, '1.018')] [2024-07-02 15:26:23,038][36999] Updated weights for policy 0, policy_version 42270 (0.0028) [2024-07-02 15:26:26,095][36761] Fps is (10 sec: 47513.4, 60 sec: 49152.1, 300 sec: 49374.5). Total num frames: 692649984. Throughput: 0: 49669.3. Samples: 692834780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 15:26:26,096][36761] Avg episode reward: [(0, '1.020')] [2024-07-02 15:26:27,200][36999] Updated weights for policy 0, policy_version 42280 (0.0023) [2024-07-02 15:26:28,671][36979] Signal inference workers to stop experience collection... (10250 times) [2024-07-02 15:26:28,684][36999] InferenceWorker_p0-w0: stopping experience collection (10250 times) [2024-07-02 15:26:28,783][36979] Signal inference workers to resume experience collection... (10250 times) [2024-07-02 15:26:28,784][36999] InferenceWorker_p0-w0: resuming experience collection (10250 times) [2024-07-02 15:26:29,875][36999] Updated weights for policy 0, policy_version 42290 (0.0021) [2024-07-02 15:26:31,100][36761] Fps is (10 sec: 52404.7, 60 sec: 49694.4, 300 sec: 49373.4). Total num frames: 692928512. Throughput: 0: 49477.5. Samples: 692980280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 15:26:31,101][36761] Avg episode reward: [(0, '1.005')] [2024-07-02 15:26:33,743][36999] Updated weights for policy 0, policy_version 42300 (0.0023) [2024-07-02 15:26:36,095][36761] Fps is (10 sec: 54067.8, 60 sec: 50244.4, 300 sec: 49540.8). Total num frames: 693190656. Throughput: 0: 49502.3. Samples: 693285960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 15:26:36,096][36761] Avg episode reward: [(0, '1.029')] [2024-07-02 15:26:36,160][36979] Saving ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000042310_693207040.pth... [2024-07-02 15:26:36,163][36999] Updated weights for policy 0, policy_version 42310 (0.0023) [2024-07-02 15:26:36,198][36979] Removing ./train_dir/sample_factory/p2.sf.1/checkpoint_p0/checkpoint_000041583_681295872.pth [2024-07-02 15:26:40,018][36999] Updated weights for policy 0, policy_version 42320 (0.0023) [2024-07-02 15:26:41,095][36761] Fps is (10 sec: 45896.7, 60 sec: 49154.1, 300 sec: 49485.3). Total num frames: 693387264. Throughput: 0: 49523.2. Samples: 693581540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 15:26:41,096][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 15:26:42,674][36999] Updated weights for policy 0, policy_version 42330 (0.0020) [2024-07-02 15:26:46,095][36761] Fps is (10 sec: 45874.8, 60 sec: 49424.9, 300 sec: 49485.2). Total num frames: 693649408. Throughput: 0: 49746.6. Samples: 693723100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 15:26:46,096][36761] Avg episode reward: [(0, '1.026')] [2024-07-02 15:26:46,702][36999] Updated weights for policy 0, policy_version 42340 (0.0023) [2024-07-02 15:26:49,747][36999] Updated weights for policy 0, policy_version 42350 (0.0020) [2024-07-02 15:26:51,095][36761] Fps is (10 sec: 54067.1, 60 sec: 49425.2, 300 sec: 49485.2). Total num frames: 693927936. Throughput: 0: 49682.4. Samples: 694018040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-07-02 15:26:51,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:26:53,463][36999] Updated weights for policy 0, policy_version 42360 (0.0032) [2024-07-02 15:26:56,095][36761] Fps is (10 sec: 52429.1, 60 sec: 50244.3, 300 sec: 49485.9). Total num frames: 694173696. Throughput: 0: 49742.8. Samples: 694315420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 15:26:56,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:26:56,167][36999] Updated weights for policy 0, policy_version 42370 (0.0022) [2024-07-02 15:26:59,714][36999] Updated weights for policy 0, policy_version 42380 (0.0023) [2024-07-02 15:27:01,095][36761] Fps is (10 sec: 47513.3, 60 sec: 49698.1, 300 sec: 49485.2). Total num frames: 694403072. Throughput: 0: 49736.5. Samples: 694483120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 15:27:01,096][36761] Avg episode reward: [(0, '1.024')] [2024-07-02 15:27:02,604][36999] Updated weights for policy 0, policy_version 42390 (0.0026) [2024-07-02 15:27:06,095][36761] Fps is (10 sec: 49152.3, 60 sec: 49698.2, 300 sec: 49596.3). Total num frames: 694665216. Throughput: 0: 49782.7. Samples: 694779200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 15:27:06,096][36761] Avg episode reward: [(0, '1.025')] [2024-07-02 15:27:06,147][36999] Updated weights for policy 0, policy_version 42400 (0.0023) [2024-07-02 15:27:09,469][36999] Updated weights for policy 0, policy_version 42410 (0.0022) [2024-07-02 15:27:11,095][36761] Fps is (10 sec: 52429.0, 60 sec: 49425.2, 300 sec: 49485.3). Total num frames: 694927360. Throughput: 0: 49731.3. Samples: 695072680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 15:27:11,096][36761] Avg episode reward: [(0, '1.027')] [2024-07-02 15:27:12,966][36999] Updated weights for policy 0, policy_version 42420 (0.0026) [2024-07-02 15:27:16,095][36761] Fps is (10 sec: 50790.3, 60 sec: 49971.3, 300 sec: 49540.8). Total num frames: 695173120. Throughput: 0: 49904.3. Samples: 695225740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-07-02 15:27:16,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:27:16,097][36999] Updated weights for policy 0, policy_version 42430 (0.0023) [2024-07-02 15:27:19,434][36999] Updated weights for policy 0, policy_version 42440 (0.0025) [2024-07-02 15:27:21,095][36761] Fps is (10 sec: 47512.7, 60 sec: 49971.1, 300 sec: 49429.7). Total num frames: 695402496. Throughput: 0: 49611.8. Samples: 695518500. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-07-02 15:27:21,096][36761] Avg episode reward: [(0, '1.021')] [2024-07-02 15:27:21,229][36979] Signal inference workers to stop experience collection... (10300 times) [2024-07-02 15:27:21,255][36999] InferenceWorker_p0-w0: stopping experience collection (10300 times) [2024-07-02 15:27:21,281][36979] Signal inference workers to resume experience collection... (10300 times) [2024-07-02 15:27:21,282][36999] InferenceWorker_p0-w0: resuming experience collection (10300 times) [2024-07-02 15:27:22,718][36999] Updated weights for policy 0, policy_version 42450 (0.0023) [2024-07-02 15:27:26,095][36761] Fps is (10 sec: 47513.3, 60 sec: 49971.3, 300 sec: 49429.7). Total num frames: 695648256. Throughput: 0: 49515.5. Samples: 695809740. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-07-02 15:27:26,096][36761] Avg episode reward: [(0, '1.023')] [2024-07-02 15:27:26,204][36999] Updated weights for policy 0, policy_version 42460 (0.0023) [2024-07-02 15:27:29,206][36999] Updated weights for policy 0, policy_version 42470 (0.0027) [2024-07-02 15:27:31,095][36761] Fps is (10 sec: 50791.0, 60 sec: 49701.9, 300 sec: 49540.8). Total num frames: 695910400. Throughput: 0: 49651.2. Samples: 695957400. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-07-02 15:27:31,098][36761] Avg episode reward: [(0, '1.017')] [2024-07-02 15:27:32,944][36999] Updated weights for policy 0, policy_version 42480 (0.0027) [2024-07-02 15:27:35,709][36999] Updated weights for policy 0, policy_version 42490 (0.0022) [2024-07-02 15:27:36,095][36761] Fps is (10 sec: 50790.8, 60 sec: 49425.1, 300 sec: 49485.2). Total num frames: 696156160. Throughput: 0: 49788.0. Samples: 696258500. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-07-02 15:27:36,096][36761] Avg episode reward: [(0, '1.008')] [2024-07-02 15:27:39,415][36999] Updated weights for policy 0, policy_version 42500 (0.0027) [2024-07-02 15:27:41,096][36761] Fps is (10 sec: 50789.5, 60 sec: 50517.1, 300 sec: 49540.7). Total num frames: 696418304. Throughput: 0: 49805.6. Samples: 696556680. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-07-02 15:27:41,096][36761] Avg episode reward: [(0, '1.009')] [2024-07-02 15:27:42,401][36999] Updated weights for policy 0, policy_version 42510 (0.0021) [2024-07-02 15:27:45,948][36999] Updated weights for policy 0, policy_version 42520 (0.0023) [2024-07-02 15:27:46,095][36761] Fps is (10 sec: 49151.8, 60 sec: 49971.3, 300 sec: 49540.8). Total num frames: 696647680. Throughput: 0: 49340.0. Samples: 696703420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-07-02 15:27:46,096][36761] Avg episode reward: [(0, '1.019')]