diff --git "a/log.txt" "b/log.txt" new file mode 100644--- /dev/null +++ "b/log.txt" @@ -0,0 +1,5983 @@ +[2023-12-25 02:57:56,716] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect) +[2023-12-25 02:57:56,716] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect) +[2023-12-25 02:57:56,716] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect) +[2023-12-25 02:57:56,716] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect) +[2023-12-25 02:57:56,716] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect) +[2023-12-25 02:57:56,716] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect) +[2023-12-25 02:57:56,814] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect) +[2023-12-25 02:57:56,880] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect) +[2023-12-25 02:57:57,177] [WARNING] [comm.py:152:init_deepspeed_backend] NCCL backend in DeepSpeed not yet implemented +[2023-12-25 02:57:57,177] [INFO] [comm.py:616:init_distributed] cdb=None +[2023-12-25 02:57:57,177] [INFO] [comm.py:643:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl +[2023-12-25 02:57:57,177] [WARNING] [comm.py:152:init_deepspeed_backend] NCCL backend in DeepSpeed not yet implemented +[2023-12-25 02:57:57,177] [WARNING] [comm.py:152:init_deepspeed_backend] NCCL backend in DeepSpeed not yet implemented +[2023-12-25 02:57:57,177] [INFO] [comm.py:616:init_distributed] cdb=None +[2023-12-25 02:57:57,177] [INFO] [comm.py:616:init_distributed] cdb=None +[2023-12-25 02:57:57,177] [WARNING] [comm.py:152:init_deepspeed_backend] NCCL backend in DeepSpeed not yet implemented +[2023-12-25 02:57:57,177] [INFO] [comm.py:616:init_distributed] cdb=None +[2023-12-25 02:57:57,177] [WARNING] [comm.py:152:init_deepspeed_backend] NCCL backend in DeepSpeed not yet implemented +[2023-12-25 02:57:57,177] [INFO] [comm.py:616:init_distributed] cdb=None +[2023-12-25 02:57:57,177] [WARNING] [comm.py:152:init_deepspeed_backend] NCCL backend in DeepSpeed not yet implemented +[2023-12-25 02:57:57,177] [WARNING] [comm.py:152:init_deepspeed_backend] NCCL backend in DeepSpeed not yet implemented +[2023-12-25 02:57:57,177] [INFO] [comm.py:616:init_distributed] cdb=None +[2023-12-25 02:57:57,177] [WARNING] [comm.py:152:init_deepspeed_backend] NCCL backend in DeepSpeed not yet implemented +[2023-12-25 02:57:57,177] [INFO] [comm.py:616:init_distributed] cdb=None +[2023-12-25 02:57:57,177] [INFO] [comm.py:616:init_distributed] cdb=None +12/25/2023 02:57:57 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:1 to store for rank: 0 +12/25/2023 02:57:57 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:1 to store for rank: 4 +12/25/2023 02:57:57 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:1 to store for rank: 6 +12/25/2023 02:57:57 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:1 to store for rank: 7 +12/25/2023 02:57:57 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:1 to store for rank: 2 +12/25/2023 02:57:57 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:1 to store for rank: 3 +12/25/2023 02:57:57 - INFO - 
torch.distributed.distributed_c10d - Added key: store_based_barrier_key:1 to store for rank: 1 +12/25/2021212/25/2023 02:57:57 - INFO - torch.distributed.distributed_c10d - Rank 24: Completed store-based barrier for key:store_based_barrier_key:1 with 32 nodes. +12/25/2023 02:57:57 - INFO - torch.distributed.distributed_c10d - Rank 31: Completed store-based barrier for key:store_based_barrier_key:1 with 32 nodes. +12/25/2023 02:57:57 - INFO - torch.distributed.distributed_c10d - Rank 25: Completed store-based barrier for key:store_based_barrier_key:1 with 32 nodes. +12/25/2023 02:57:57 - INFO - torch.distributed.distributed_c10d - Rank 28: Completed store-based barrier for key:store_based_barrier_key:1 with 32 nodes. +12/25/2023 02:57:57 - INFO - torch.distributed.distributed_c10d - Rank 26: Completed store-based barrier for key:store_based_barrier_key:1 with 32 nodes. +12/25/2023 02:57:57 - INFO - torch.distributed.distributed_c10d - Rank 30: Completed store-based barrier for key:store_based_barrier_key:1 with 32 nodes. +12/25/2023 02:57:57 - INFO - torch.distributed.distributed_c10d - Rank 29: Completed store-based barrier for key:store_based_barrier_key:1 with 32 nodes. +12/25/2023 02:57:57 - INFO - torch.distributed.distributed_c10d - Rank 27: Completed store-based barrier for key:store_based_barrier_key:1 with 32 nodes. +12/25/2023 02:57:57 - WARNING - utils.common - We recommend enable fp16 mixed precision training. +12/25/2023 02:57:57 - WARNING - utils.common - Please specify `prompt_template` if you are using other pre-trained models. +12/25/2023 02:57:57 - WARNING - utils.common - `ddp_find_unused_parameters` needs to be set as False in DDP training. +12/25/2023 02:57:57 - INFO - utils.common - Process rank: 0, device: cuda:0, n_gpu: 1 + distributed training: True, 16-bits training: False +12/25/2023 02:57:57 - INFO - utils.common - Training/evaluation parameters Seq2SeqTrainingArguments( +_n_gpu=1, +adafactor=False, +adam_beta1=0.9, +adam_beta2=0.999, +adam_epsilon=1e-08, +auto_find_batch_size=False, +bf16=True, +bf16_full_eval=False, +data_seed=None, +dataloader_drop_last=False, +dataloader_num_workers=0, +dataloader_pin_memory=True, +ddp_backend=None, +ddp_broadcast_buffers=None, +ddp_bucket_cap_mb=None, +ddp_find_unused_parameters=False, +ddp_timeout=1800, +debug=[], +deepspeed=ds_z3_no_offload.json, +disable_tqdm=False, +dispatch_batches=None, +do_eval=False, +do_predict=False, +do_train=True, +eval_accumulation_steps=None, +eval_delay=0, +eval_steps=None, +evaluation_strategy=no, +fp16=False, +fp16_backend=auto, +fp16_full_eval=False, +fp16_opt_level=O1, +fsdp=[], +fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False}, +fsdp_min_num_params=0, +fsdp_transformer_layer_cls_to_wrap=None, +full_determinism=False, +generation_config=None, +generation_max_length=None, +generation_num_beams=None, +gradient_accumulation_steps=2, +gradient_checkpointing=False, +greater_is_better=None, +group_by_length=False, +half_precision_backend=auto, +hub_always_push=False, +hub_model_id=None, +hub_private_repo=False, +hub_strategy=every_save, +hub_token=, +ignore_data_skip=False, +include_inputs_for_metrics=False, +jit_mode_eval=False, +label_names=None, +label_smoothing_factor=0.0, +learning_rate=2e-05, +length_column_name=length, +load_best_model_at_end=False, +local_rank=0, +log_level=passive, +log_level_replica=warning, +log_on_each_node=True, +logging_dir=/group/20025/jiuding/ckpt/34B_final/runs/Dec25_02-57-56_vc-816648091-20231223-a1d29e4b-worker-3, 
+logging_first_step=False, +logging_nan_inf_filter=True, +logging_steps=10, +logging_strategy=steps, +lr_scheduler_type=cosine, +max_grad_norm=1.0, +max_steps=-1, +metric_for_best_model=None, +mp_parameters=, +no_cuda=False, +num_train_epochs=2.0, +optim=adamw_torch, +optim_args=None, +output_dir=/group/20025/jiuding/ckpt/34B_final, +overwrite_output_dir=False, +past_index=-1, +per_device_eval_batch_size=8, +per_device_train_batch_size=4, +predict_with_generate=False, +prediction_loss_only=False, +push_to_hub=False, +push_to_hub_model_id=None, +push_to_hub_organization=None, +push_to_hub_token=, +ray_scope=last, +remove_unused_columns=True, +report_to=[], +resume_from_checkpoint=None, +run_name=/group/20025/jiuding/ckpt/34B_final, +save_on_each_node=False, +save_safetensors=False, +save_steps=500, +save_strategy=epoch, +save_total_limit=10, +seed=42, +sharded_ddp=[], +skip_memory_metrics=True, +sortish_sampler=False, +tf32=None, +torch_compile=False, +torch_compile_backend=None, +torch_compile_mode=None, +torchdynamo=None, +tpu_metrics_debug=False, +tpu_num_cores=None, +use_cpu=False, +use_ipex=False, +use_legacy_prediction_loop=False, +use_mps_device=False, +warmup_ratio=0.0, +warmup_steps=0, +weight_decay=0.0, +) +12/25/2023 02:57:57 - WARNING - utils.common - We recommend enable fp16 mixed precision training. +12/25/2023 02:57:57 - WARNING - utils.common - Please specify `prompt_template` if you are using other pre-trained models. +12/25/2023 02:57:57 - WARNING - utils.common - We recommend enable fp16 mixed precision training. +12/25/2023 02:57:57 - WARNING - utils.common - `ddp_find_unused_parameters` needs to be set as False in DDP training. +12/25/2023 02:57:57 - WARNING - utils.common - Please specify `prompt_template` if you are using other pre-trained models. +12/25/2023 02:57:57 - WARNING - utils.common - `ddp_find_unused_parameters` needs to be set as False in DDP training. 
+12/25/2023 02:57:57 - INFO - utils.common - Using FlashAttention-2 for faster training and inference. +vc-816648091-20231223-a1d29e4b-worker-1:5785:5785 [0] NCCL INFO cudaDriverVersion 11080 +vc-816648091-20231223-a1d29e4b-worker-1:5785:5785 [0] NCCL INFO Bootstrap : Using eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5785:5785 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation +vc-816648091-20231223-a1d29e4b-worker-1:5785:6667 [0] NCCL INFO NCCL_IB_DISABLE set by environment to 0. +vc-816648091-20231223-a1d29e4b-worker-1:5789:5789 [4] NCCL INFO cudaDriverVersion 11080 +vc-816648091-20231223-a1d29e4b-worker-1:5789:5789 [4] NCCL INFO Bootstrap : Using eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5789:5789 [4] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation +vc-816648091-20231223-a1d29e4b-worker-1:5789:6668 [4] NCCL INFO NCCL_IB_DISABLE set by environment to 0. +vc-816648091-20231223-a1d29e4b-worker-1:5785:6667 [0] NCCL INFO NET/IB : Using [0]mlx5_2:1/RoCE [RO]; OOB eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5785:6667 [0] NCCL INFO Using network IB +vc-816648091-20231223-a1d29e4b-worker-1:5789:6668 [4] NCCL INFO NET/IB : Using [0]mlx5_2:1/RoCE [RO]; OOB eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5789:6668 [4] NCCL INFO Using network IB +vc-816648091-20231223-a1d29e4b-worker-1:5792:5792 [7] NCCL INFO cudaDriverVersion 11080 +vc-816648091-20231223-a1d29e4b-worker-1:5791:5791 [6] NCCL INFO cudaDriverVersion 11080 +vc-816648091-20231223-a1d29e4b-worker-1:5786:5786 [1] NCCL INFO cudaDriverVersion 11080 +vc-816648091-20231223-a1d29e4b-worker-1:5792:5792 [7] NCCL INFO Bootstrap : Using eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5792:5792 [7] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation +vc-816648091-20231223-a1d29e4b-worker-1:5791:5791 [6] NCCL INFO Bootstrap : Using eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5791:5791 [6] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation +vc-816648091-20231223-a1d29e4b-worker-1:5786:5786 [1] NCCL INFO Bootstrap : Using eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5786:5786 [1] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation +vc-816648091-20231223-a1d29e4b-worker-1:5788:5788 [3] NCCL INFO cudaDriverVersion 11080 +vc-816648091-20231223-a1d29e4b-worker-1:5792:6671 [7] NCCL INFO NCCL_IB_DISABLE set by environment to 0. +vc-816648091-20231223-a1d29e4b-worker-1:5791:6672 [6] NCCL INFO NCCL_IB_DISABLE set by environment to 0. +vc-816648091-20231223-a1d29e4b-worker-1:5788:5788 [3] NCCL INFO Bootstrap : Using eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5788:5788 [3] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation +vc-816648091-20231223-a1d29e4b-worker-1:5786:6673 [1] NCCL INFO NCCL_IB_DISABLE set by environment to 0. +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO NCCL_IB_DISABLE set by environment to 0.
+vc-816648091-20231223-a1d29e4b-worker-1:5792:6671 [7] NCCL INFO NET/IB : Using [0]mlx5_2:1/RoCE [RO]; OOB eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5792:6671 [7] NCCL INFO Using network IB +vc-816648091-20231223-a1d29e4b-worker-1:5791:6672 [6] NCCL INFO NET/IB : Using [0]mlx5_2:1/RoCE [RO]; OOB eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5791:6672 [6] NCCL INFO Using network IB +vc-816648091-20231223-a1d29e4b-worker-1:5786:6673 [1] NCCL INFO NET/IB : Using [0]mlx5_2:1/RoCE [RO]; OOB eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5786:6673 [1] NCCL INFO Using network IB +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO NET/IB : Using [0]mlx5_2:1/RoCE [RO]; OOB eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO Using network IB +vc-816648091-20231223-a1d29e4b-worker-1:5790:5790 [5] NCCL INFO cudaDriverVersion 11080 +vc-816648091-20231223-a1d29e4b-worker-1:5790:5790 [5] NCCL INFO Bootstrap : Using eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5790:5790 [5] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation +vc-816648091-20231223-a1d29e4b-worker-1:5790:6679 [5] NCCL INFO NCCL_IB_DISABLE set by environment to 0. +vc-816648091-20231223-a1d29e4b-worker-1:5790:6679 [5] NCCL INFO NET/IB : Using [0]mlx5_2:1/RoCE [RO]; OOB eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5790:6679 [5] NCCL INFO Using network IB +vc-816648091-20231223-a1d29e4b-worker-1:5787:5787 [2] NCCL INFO cudaDriverVersion 11080 +vc-816648091-20231223-a1d29e4b-worker-1:5787:5787 [2] NCCL INFO Bootstrap : Using eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5787:5787 [2] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO NCCL_IB_DISABLE set by environment to 0. 
+vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO NET/IB : Using [0]mlx5_2:1/RoCE [RO]; OOB eth1:11.220.53.97<0> +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO Using network IB +vc-816648091-20231223-a1d29e4b-worker-2:5780:6661 [6] NCCL INFO Setting affinity for GPU 6 to ffffffff,ffff0000,000 +vc-816648091-20231223-a1d29e4b-worker-0:5800:6675 [6] NCCL INFO Setting affinity for GPU 6 to ffffffff,ffff0000,00000000,ffffffff,ffff0000,00000000 +vc-816648091-20231223-a1d29e4b-worker-0:5795:6674 [1] NCCL INFO Setting affinity for GPU 1 to ffff,ffffffff,00000000,0000ffff,ffffffff +vc-816648091-20231223-a1d29e4b-worker-0:5794:6672 [0] NCCL INFO Setting affinity for GPU 0 to ffff,ffffffff,00000000,0000ffff,ffffffff +vc-816648091-20231223-a1d29e4b-worker-0:5801:6677 [7] NCCL INFO Setting affinity for GPU 7 to ffffffff,ffff0000,00000000,ffffffff,ffff0000,00000000 +vc-816648091-20231223-a1d29e4b-worker-0:5799:6678 [5] NCCL INFO Setting affinity for GPU 5 to ffffffff,ffff0000,00000000,ffffffff,ffff0000,00000000 +vc-816648091-20231223-a1d29e4b-worker-0:5798:6676 [4] NCCL INFO Setting affinity for GPU 4 to ffffffff,ffff0000,00000000,ffffffff,ffff0000,00000000 +vc-816648091-20231223-a1d29e4b-worker-0:5797:6673 [3] NCCL INFO Setting affinity for GPU 3 to ffff,ffffffff,00000000,0000ffff,ffffff +vc-816648091-20231223-a1d29e4b-worker-3:6071:6957 [2] NCCL INFO Trees [0] 27/-1/-1->26->18 +vc-816648091-20231223-a1d29e4b-worker-1:5785:6667 [0] NCCL INFO Trees [0] 9/-1/-1->8->15 [1] 9/-1/-1->8->15 +vc-816648091-20231223-a1d29e4b-worker-3:6069:6956 [0] NCCL INFO Trees [0] 25/-1/-1->24->31 [1] 25/-1/-1->24->31 +vc-816648091-20231223-a1d29e4b-worker-3:6073:6953 [4] NCCL INFO Trees [0] 29/-1/-1->28->27 [1] 29/-1/-1->28->27 +vc-816648091-20231223-a1d29e4b-worker-3:6076:6954 [7] NCCL INFO Trees [0] 24/-1/-1->31->30 [1] 24/-1/-1->31->30 +vc-816648091-20231223-a1d29e4b-worker-3:6074:6959 [5] NCCL INFO Trees [0] 30/-1/-1->29->28 [1] 30/-1/-1->29->28 +vc-816648091-20231223-a1d29e4b-worker-3:6070:6952 [1] NCCL INFO Trees [0] -1/-1/-1->25->24 [1] -1/-1/-1->25->24 +vc-816648091-20231223-a1d29e4b-worker-3:6075:6958 [6] NCCL INFO Trees [0] 31/-1/- +vc-816648091-20231223-a1d29e4b-worker-3:6073:6953 [4] NCCL INFO Channel 00/0 : 28[93000] -> 25[13000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-3:6071:6957 [2] NCCL INFO Channel 00/0 : 26[4b000] -> 31[d0000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-2:5774:6660 [0] NCCL INFO Channel 00/0 : 16[e000 +vc-816648091-20231223-a1d29e4b-worker-2:5778:6662 [4] NCCL INFO Channel 01/0 : 20[93000 +vc-816648091-20231223-a1d29e4b-worker-1:5785:6667 [0] NCCL INFO Channel 01/0 : 8[e000] -> 11[51000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5789:6668 [4] NCCL INFO Channel 01/0 : 12[93000] -> 9[13000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5791:6672 [6] NCCL INFO Channel 00/0 : 14[cb000] -> 13[99000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5790:6679 [5] NCCL INFO Channel 00/0 : 13[99000] -> 12[93000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5791:6672 [6] NCCL INFO Channel 01/0 : 14[cb000] -> 13[99000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5790:6679 [5] NCCL INFO Channel 01/0 : 13[99000] -> 12[93000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5792:6671 [7] NCCL INFO Channel 00/0 : 15[d0000] -> 14[cb000] via P2P/IPC/read
+vc-816648091-20231223-a1d29e4b-worker-1:5786:6673 [1] NCCL INFO Channel 00/0 : 9[13000] -> 8[e000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5792:6671 [7] NCCL INFO Channel 01/0 : 15[d0000] -> 14[cb000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5786:6673 [1] NCCL INFO Channel 01/0 : 9[13000] -> 8[e000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5792:6671 [7] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5785:6667 [0] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5785:6667 [0] NCCL INFO Channel 00/0 : 8[e000] -> 9[13000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5789:6668 [4] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5786:6673 [1] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5785:6667 [0] NCCL INFO Channel 01/0 : 8[e000] -> 9[13000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5791:6672 [6] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5790:6679 [5] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5789:6668 [4] NCCL INFO Channel 00/0 : 12[93000] -> 13[99000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5789:6668 [4] NCCL INFO Channel 01/0 : 12[93000] -> 13[99000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5791:6672 [6] NCCL INFO Channel 00/0 : 14[cb000] -> 15[d0000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5785:6667 [0] NCCL INFO Channel 00/0 : 8[e000] -> 15[d0000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5786:6673 [1] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5786:6673 [1] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5786:6673 [1] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5790:6679 [5] NCCL INFO Channel 00/0 : 13[99000] -> 14[cb000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5791:6672 [6] NCCL INFO Channel 01/0 : 14[cb000] -> 15[d0000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5785:6667 [0] NCCL INFO Channel 01/0 : 8[e000] -> 15[d0000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5790:6679 [5] NCCL INFO Channel 01/0 : 13[99000] -> 14[cb000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5791:6672 [6] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5791:6672 [6] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5791:6672 [6] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5790:6679 [5] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5790:6679 [5] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5790:6679 [5] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5792:6671 [7] NCCL INFO Channel 00/0 : 15[d0000] -> 8[e000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5792:6671 [7] NCCL INFO Channel 01/0 : 15[d0000] -> 8[e000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5792:6671 [7] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5792:6671 [7] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5792:6671 [7] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p 
channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5785:6667 [0] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5785:6667 [0] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5785:6667 [0] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO Channel 00/0 : 11[51000] -> 18[4b000] [send] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO Channel 00/0 : 3[51000] -> 10[4b000] [receive] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO Channel 01/0 : 11[51000] -> 18[4b000] [send] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] NCCL INFO NCCL_IB_GID_INDEX set by environment to 3. +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO Channel 01/0 : 3[51000] -> 10[4b000] [receive] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5787:6688 [2] NCCL INFO NCCL_IB_GID_INDEX set by environment to 3. +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO Channel 00/0 : 11[51000] -> 12[93000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO Channel 01/0 : 11[51000] -> 12[93000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO Channel 00/0 : 10[4b000] -> 11[51000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO Channel 01/0 : 10[4b000] -> 11[51000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5789:6668 [4] NCCL INFO Channel 00/0 : 12[93000] -> 11[51000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5789:6668 [4] NCCL INFO Channel 01/0 : 12[93000] -> 11[51000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO Channel 01/0 : 2[4b000] -> 10[4b000] [receive] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO Channel 01/0 : 18[4b000] -> 11[51000] [receive] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO Channel 00/0 : 10[4b000] -> 19[51000] [send] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO Channel 01/0 : 26[4b000] -> 10[4b000] [receive] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO Channel 01/0 : 10[4b000] -> 26[4b000] [send] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO Channel 00/0 : 19[51000] -> 10[4b000] [receive] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO Channel 01/0 : 10[4b000] -> 2[4b000] [send] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO Channel 00/0 : 11[51000] -> 10[4b000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO Channel 01/0 : 11[51000] -> 10[4b000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5789:6668 [4] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5789:6668 [4] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5789:6668 [4] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO Connected all trees 
+vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5788:6674 [3] NCCL INFO comm 0x68449270 rank 11 nranks 32 cudaDev 3 busId 51000 - Init COMPLETE +vc-816648091-20231223-a1d29e4b-worker-1:5787:6681 [2] NCCL INFO comm 0x68d79a60 rank 10 nranks 32 cudaDev 2 busId 4b000 - Init COMPLETE +vc-816648091-20231223-a1d29e4b-worker-1:5789:6668 [4] NCCL INFO comm 0x6ab9dec0 rank 12 nranks 32 cudaDev 4 busId 93000 - Init COMPLETE +vc-816648091-20231223-a1d29e4b-worker-1:5785:6667 [0] NCCL INFO comm 0x67496240 rank 8 nranks 32 cudaDev 0 busId e000 - Init COMPLETE +vc-816648091-20231223-a1d29e4b-worker-1:5786:6673 [1] NCCL INFO comm 0x6996c5b0 rank 9 nranks 32 cudaDev 1 busId 13000 - Init COMPLETE +vc-816648091-20231223-a1d29e4b-worker-1:5791:6672 [6] NCCL INFO co +12/25/2023 02:59:19 - INFO - utils.common - Fine-tuning method: Full +12/25/2023 02:59:19 - INFO - utils.common - Fine-tuning method: Full +12/25/2023 02:59:19 - INFO - utils.common - Fine-tuning method: Full +12/25/2023 02:59:19 - INFO - utils.common - Fine-tuning method: Full +12/25/2023 02:59:19 - INFO - utils.common - Fine-tuning method: Full +12/25/2023 02:59:19 - INFO - utils.common - Fine-tuning method: Full +12/25/2023 02:59:19 - INFO - utils.common - Fine-tuning method: Full +12/25/2023 02:59:19 - INFO - utils.common - Fine-tuning method: Full +trainable params: 33744052224 || all params: 33744052224 || trainable%: 100.0000 +length of tokenizer: 32005 +pad_token_id is: 32004 +trainable params: 33744052224 || all params: 33744052224 || trainable%: 100.0000 +length of tokenizer: 32005 +pad_token_id is: 32004 +trainable params: 33744052224 || all params: 33744052224 || trainable%: 100.0000 +length of tokenizer: 32005 +pad_token_id is: 32004 +trainable params: 33744052224 || all params: 33744052224 || trainable%: 100.0000 +length of tokenizer: 32005 +pad_token_id is: 32004 +trainable params: 33744052224 || all params: 33744052224 || trainable%: 100.0000 +length of tokenizer: 32005 +pad_token_id is: 32004 +trainable params: 33744052224 || all params: 33744052224 || trainable%: 100.0000 +length of tokenizer: 32005 +pad_token_id is: 32004 +trainable params: 33744052224 || all params: 33744052224 || trainable%: 100.0000 +length of tokenizer: 32005 +pad_token_id is: 32004 +trainable params: 33744052224 || all params: 33744052224 || trainable%: 100.0000 +length of tokenizer: 32005 +pad_token_id is: 32004 +length of input_ids : +394 +input_ids: +[1, 518, 25580, 29962, 12148, 626, 355, 278, 15352, 5132, 2471, 577, 393, 372, 7805, 263, 525, 8000, 29915, 2425, 3265, 1135, 278, 5923,
525, 1454, 29915, 2425, 29892, 607, 4256, 1078, 1549, 278, 4452, 310, 385, 6043, 1051, 29889, 13, 13, 1576, 2471, 5279, 756, 263, 6494, 988, 372, 14734, 304, 1596, 385, 1203, 393, 338, 5377, 278, 13451, 310, 278, 1051, 29889, 24778, 445, 1059, 322, 6623, 278, 2471, 304, 671, 525, 8000, 29915, 2012, 310, 525, 1454, 29915, 2425, 29889, 22521, 545, 596, 2471, 5149, 17766, 4069, 8857, 29889, 29871, 13, 13, 28956, 4691, 13, 29871, 396, 2661, 370, 1674, 385, 6043, 1051, 13, 29871, 3948, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29871, 396, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29871, 302, 353, 7431, 29898, 2749, 29897, 13, 13, 29871, 396, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 29871, 363, 474, 297, 3464, 29898, 29876, 29974, 29896, 1125, 13, 418, 1596, 29898, 2749, 29961, 29875, 2314, 13, 28956, 29961, 29914, 25580, 29962, 7521, 4691, 13, 29937, 2661, 370, 1674, 385, 6043, 1051, 13, 2749, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29937, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29876, 353, 7431, 29898, 2749, 29897, 13, 13, 29937, 25455, 2380, 472, 29871, 29900, 13, 29875, 353, 29871, 29900, 13, 13, 29937, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 8000, 474, 529, 302, 29901, 13, 1678, 1596, 29898, 2749, 29961, 29875, 2314, 13, 1678, 474, 4619, 29871, 29896, 13, 28956, 13, 797, 278, 2183, 775, 29892, 372, 14335, 304, 2130, 421, 2749, 29961, 29876, 7961, 607, 338, 714, 310, 13451, 408, 3017, 1051, 26190, 8665, 515, 29871, 29900, 322, 10614, 472, 302, 29899, 29896, 363, 263, 1051, 310, 3309, 302, 29889, 13, 17351, 29892, 306, 1304, 421, 8000, 29952, 2425, 408, 639, 596, 2009, 29889, 2648, 2847, 5281, 421, 29875, 29922, 29900, 29952, 1434, 278, 2425, 322, 421, 29875, 4619, 29871, 29896, 29952, 2629, 278, 2425, 29892, 591, 508, 13649, 975, 278, 1051, 29889, 910, 12541, 18172, 2745, 421, 29875, 29952, 338, 3109, 1135, 421, 29876, 1673, 278, 3309, 310, 278, 1051, 29889, 450, 421, 8000, 29952, 2425, 674, 451, 8469, 565, 278, 1051, 338, 4069, 577, 372, 5149, 17766, 4069, 8857, 29889, 2] +length of input_ids : +394 +input_ids: +[1, 518, 25580, 29962, 12148, 626, 355, 278, 15352, 5132, 2471, 577, 393, 372, 7805, 263, 525, 8000, 29915, 2425, 3265, 1135, 278, 5923, 525, 1454, 29915, 2425, 29892, 607, 4256, 1078, 1549, 278, 4452, 310, 385, 6043, 1051, 29889, 13, 13, 1576, 2471, 5279, 756, 263, 6494, 988, 372, 14734, 304, 1596, 385, 1203, 393, 338, 5377, 278, 13451, 310, 278, 1051, 29889, 24778, 445, 1059, 322, 6623, 278, 2471, 304, 671, 525, 8000, 29915, 2012, 310, 525, 1454, 29915, 2425, 29889, 22521, 545, 596, 2471, 5149, 17766, 4069, 8857, 29889, 29871, 13, 13, 28956, 4691, 13, 29871, 396, 2661, 370, 1674, 385, 6043, 1051, 13, 29871, 3948, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29871, 396, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29871, 302, 353, 7431, 29898, 2749, 29897, 13, 13, 29871, 396, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 29871, 363, 474, 297, 3464, 29898, 29876, 29974, 29896, 1125, 13, 418, 1596, 29898, 2749, 29961, 29875, 2314, 13, 28956, 29961, 29914, 25580, 29962, 7521, 4691, 13, 29937, 2661, 370, 1674, 385, 6043, 1051, 13, 2749, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29937, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29876, 353, 7431, 29898, 2749, 29897, 13, 13, 29937, 25455, 2380, 472, 29871, 29900, 13, 29875, 
353, 29871, 29900, 13, 13, 29937, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 8000, 474, 529, 302, 29901, 13, 1678, 1596, 29898, 2749, 29961, 29875, 2314, 13, 1678, 474, 4619, 29871, 29896, 13, 28956, 13, 797, 278, 2183, 775, 29892, 372, 14335, 304, 2130, 421, 2749, 29961, 29876, 7961, 607, 338, 714, 310, 13451, 408, 3017, 1051, 26190, 8665, 515, 29871, 29900, 322, 10614, 472, 302, 29899, 29896, 363, 263, 1051, 310, 3309, 302, 29889, 13, 17351, 29892, 306, 1304, 421, 8000, 29952, 2425, 408, 639, 596, 2009, 29889, 2648, 2847, 5281, 421, 29875, 29922, 29900, 29952, 1434, 278, 2425, 322, 421, 29875, 4619, 29871, 29896, 29952, 2629, 278, 2425, 29892, 591, 508, 13649, 975, 278, 1051, 29889, 910, 12541, 18172, 2745, 421, 29875, 29952, 338, 3109, 1135, 421, 29876, 1673, 278, 3309, 310, 278, 1051, 29889, 450, 421, 8000, 29952, 2425, 674, 451, 8469, 565, 278, 1051, 338, 4069, 577, 372, 5149, 17766, 4069, 8857, 29889, 2] +inputs: + [INST]Please amend the subsequent Python script so that it includes a 'while' loop rather than the existing 'for' loop, which iterates through the items of an integer list. + +The script currently has a bug where it attempts to print an object that is outside the bounds of the list. Fix this error and modify the script to use 'while' instead of 'for' loop. Ensure your script correctly handles empty lists. + +```python + # Establish an integer list + arr = [1, 2, 3, 4] + + # Determine the length of the list + n = len(arr) + + # Traverse the list and output each individual element + for i in range(n+1): + print(arr[i]) +```[/INST] ```python +# Establish an integer list +arr = [1, 2, 3, 4] + +# Determine the length of the list +n = len(arr) + +# Initialize index at 0 +i = 0 + +# Traverse the list and output each individual element +while i < n: + print(arr[i]) + i += 1 +``` +In the given code, it tries to access `arr[n]` which is out of bounds as python list indexing starts from 0 and ends at n-1 for a list of length n. +Also, I used `while` loop as per your request. By initializing `i=0` before the loop and `i += 1` within the loop, we can iterate over the list. This iteration continues until `i` is less than `n`, the length of the list. The `while` loop will not proceed if the list is empty so it correctly handles empty lists. 
+label_ids: +[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 7521, 4691, 13, 29937, 2661, 370, 1674, 385, 6043, 1051, 13, 2749, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29937, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29876, 353, 7431, 29898, 2749, 29897, 13, 13, 29937, 25455, 2380, 472, 29871, 29900, 13, 29875, 353, 29871, 29900, 13, 13, 29937, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 8000, 474, 529, 302, 29901, 13, 1678, 1596, 29898, 2749, 29961, 29875, 2314, 13, 1678, 474, 4619, 29871, 29896, 13, 28956, 13, 797, 278, 2183, 775, 29892, 372, 14335, 304, 2130, 421, 2749, 29961, 29876, 7961, 607, 338, 714, 310, 13451, 408, 3017, 1051, 26190, 8665, 515, 29871, 29900, 322, 10614, 472, 302, 29899, 29896, 363, 263, 1051, 310, 3309, 302, 29889, 13, 17351, 29892, 306, 1304, 421, 8000, 29952, 2425, 408, 639, 596, 2009, 29889, 2648, 2847, 5281, 421, 29875, 29922, 29900, 29952, 1434, 278, 2425, 322, 421, 29875, 4619, 29871, 29896, 29952, 2629, 278, 2425, 29892, 591, 508, 13649, 975, 278, 1051, 29889, 910, 12541, 18172, 2745, 421, 29875, 29952, 338, 3109, 1135, 421, 29876, 1673, 278, 3309, 310, 278, 1051, 29889, 450, 421, 8000, 29952, 2425, 674, 451, 8469, 565, 278, 1051, 338, 4069, 577, 372, 5149, 17766, 4069, 8857, 29889, 2] +length of input_ids : +394 +input_ids: +[1, 518, 25580, 29962, 12148, 626, 355, 278, 15352, 5132, 2471, 577, 393, 372, 7805, 263, 525, 8000, 29915, 2425, 3265, 1135, 278, 5923, 525, 1454, 29915, 2425, 29892, 607, 4256, 1078, 1549, 278, 4452, 310, 385, 6043, 1051, 29889, 13, 13, 1576, 2471, 5279, 756, 263, 6494, 988, 372, 14734, 304, 1596, 385, 1203, 393, 338, 5377, 278, 13451, 310, 278, 1051, 29889, 24778, 445, 1059, 322, 6623, 278, 2471, 304, 671, 525, 8000, 29915, 2012, 310, 525, 1454, 29915, 2425, 29889, 22521, 545, 596, 2471, 5149, 17766, 4069, 8857, 29889, 29871, 13, 13, 28956, 4691, 13, 29871, 396, 2661, 370, 1674, 385, 6043, 1051, 13, 29871, 3948, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29871, 396, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29871, 302, 353, 7431, 29898, 2749, 29897, 13, 13, 29871, 396, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 29871, 363, 474, 297, 3464, 29898, 29876, 29974, 29896, 1125, 13, 418, 1596, 29898, 2749, 29961, 29875, 2314, 13, 28956, 29961, 29914, 25580, 29962, 7521, 4691, 13, 29937, 2661, 370, 1674, 385, 6043, 1051, 13, 2749, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 
29892, 29871, 29946, 29962, 13, 13, 29937, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29876, 353, 7431, 29898, 2749, 29897, 13, 13, 29937, 25455, 2380, 472, 29871, 29900, 13, 29875, 353, 29871, 29900, 13, 13, 29937, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 8000, 474, 529, 302, 29901, 13, 1678, 1596, 29898, 2749, 29961, 29875, 2314, 13, 1678, 474, 4619, 29871, 29896, 13, 28956, 13, 797, 278, 2183, 775, 29892, 372, 14335, 304, 2130, 421, 2749, 29961, 29876, 7961, 607, 338, 714, 310, 13451, 408, 3017, 1051, 26190, 8665, 515, 29871, 29900, 322, 10614, 472, 302, 29899, 29896, 363, 263, 1051, 310, 3309, 302, 29889, 13, 17351, 29892, 306, 1304, 421, 8000, 29952, 2425, 408, 639, 596, 2009, 29889, 2648, 2847, 5281, 421, 29875, 29922, 29900, 29952, 1434, 278, 2425, 322, 421, 29875, 4619, 29871, 29896, 29952, 2629, 278, 2425, 29892, 591, 508, 13649, 975, 278, 1051, 29889, 910, 12541, 18172, 2745, 421, 29875, 29952, 338, 3109, 1135, 421, 29876, 1673, 278, 3309, 310, 278, 1051, 29889, 450, 421, 8000, 29952, 2425, 674, 451, 8469, 565, 278, 1051, 338, 4069, 577, 372, 5149, 17766, 4069, 8857, 29889, 2] +labels: +[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] ```python +# Establish an integer list +arr = [1, 2, 3, 4] + +# Determine the length of the list +n = len(arr) + +# Initialize index at 0 +i = 0 + +# Traverse the list and output each individual element +while i < n: + print(arr[i]) + i += 1 +``` +In the given code, it tries to access `arr[n]` which is out of bounds as python list indexing starts from 0 and ends at n-1 for a list of length n. +Also, I used `while` loop as per your request. By initializing `i=0` before the loop and `i += 1` within the loop, we can iterate over the list. This iteration continues until `i` is less than `n`, the length of the list. The `while` loop will not proceed if the list is empty so it correctly handles empty lists. +inputs: + [INST]Please amend the subsequent Python script so that it includes a 'while' loop rather than the existing 'for' loop, which iterates through the items of an integer list. + +The script currently has a bug where it attempts to print an object that is outside the bounds of the list. Fix this error and modify the script to use 'while' instead of 'for' loop. Ensure your script correctly handles empty lists. 
Initialize index at 0 +i = 0 + +# Traverse the list and output each individual element +while i < n: + print(arr[i]) + i += 1 +``` +In the given code, it tries to access `arr[n]` which is out of bounds as python list indexing starts from 0 and ends at n-1 for a list of length n. +Also, I used `while` loop as per your request. By initializing `i=0` before the loop and `i += 1` within the loop, we can iterate over the list. This iteration continues until `i` is less than `n`, the length of the list. The `while` loop will not proceed if the list is empty so it correctly handles empty lists. +length of input_ids : +1212 +length of input_ids : +394 +input_ids: +[1, 518, 25580, 29962, 12148, 626, 355, 278, 15352, 5132, 2471, 577, 393, 372, 7805, 263, 525, 8000, 29915, 2425, 3265, 1135, 278, 5923, 525, 1454, 29915, 2425, 29892, 607, 4256, 1078, 1549, 278, 4452, 310, 385, 6043, 1051, 29889, 13, 13, 1576, 2471, 5279, 756, 263, 6494, 988, 372, 14734, 304, 1596, 385, 1203, 393, 338, 5377, 278, 13451, 310, 278, 1051, 29889, 24778, 445, 1059, 322, 6623, 278, 2471, 304, 671, 525, 8000, 29915, 2012, 310, 525, 1454, 29915, 2425, 29889, 22521, 545, 596, 2471, 5149, 17766, 4069, 8857, 29889, 29871, 13, 13, 28956, 4691, 13, 29871, 396, 2661, 370, 1674, 385, 6043, 1051, 13, 29871, 3948, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29871, 396, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29871, 302, 353, 7431, 29898, 2749, 29897, 13, 13, 29871, 396, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 29871, 363, 474, 297, 3464, 29898, 29876, 29974, 29896, 1125, 13, 418, 1596, 29898, 2749, 29961, 29875, 2314, 13, 28956, 29961, 29914, 25580, 29962, 7521, 4691, 13, 29937, 2661, 370, 1674, 385, 6043, 1051, 13, 2749, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29937, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29876, 353, 7431, 29898, 2749, 29897, 13, 13, 29937, 25455, 2380, 472, 29871, 29900, 13, 29875, 353, 29871, 29900, 13, 13, 29937, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 8000, 474, 529, 302, 29901, 13, 1678, 1596, 29898, 2749, 29961, 29875, 2314, 13, 1678, 474, 4619, 29871, 29896, 13, 28956, 13, 797, 278, 2183, 775, 29892, 372, 14335, 304, 2130, 421, 2749, 29961, 29876, 7961, 607, 338, 714, 310, 13451, 408, 3017, 1051, 26190, 8665, 515, 29871, 29900, 322, 10614, 472, 302, 29899, 29896, 363, 263, 1051, 310, 3309, 302, 29889, 13, 17351, 29892, 306, 1304, 421, 8000, 29952, 2425, 408, 639, 596, 2009, 29889, 2648, 2847, 5281, 421, 29875, 29922, 29900, 29952, 1434, 278, 2425, 322, 421, 29875, 4619, 29871, 29896, 29952, 2629, 278, 2425, 29892, 591, 508, 13649, 975, 278, 1051, 29889, 910, 12541, 18172, 2745, 421, 29875, 29952, 338, 3109, 1135, 421, 29876, 1673, 278, 3309, 310, 278, 1051, 29889, 450, 421, 8000, 29952, 2425, 674, 451, 8469, 565, 278, 1051, 338, 4069, 577, 372, 5149, 17766, 4069, 8857, 29889, 2] +input_ids: +[1, 518, 25580, 29962, 29875, 29915, 345, 2355, 445, 3017, 775, 515, 385, 288, 7283, 5780, 29892, 541, 372, 29915, 29879, 451, 1985, 29889, 508, 366, 4744, 372, 363, 592, 29973, 13, 1990, 29871, 29945, 324, 29894, 29955, 29991, 265, 29901, 13, 29877, 300, 260, 513, 21322, 30181, 303, 11457, 29876, 1566, 761, 29952, 302, 29876, 29987, 29879, 29955, 261, 29901, 525, 5756, 29924, 29987, 29879, 29974, 261, 1495, 448, 29966, 29871, 29896, 2273, 29901, 13, 1311, 29889, 29877, 29896, 29878, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1822, 29918, 29915, 29900, 2396, 
518, 29896, 29892, 29871, 29900, 1118, 525, 29931, 2396, 518, 29900, 29892, 29918, 29899, 29896, 1822, 525, 29934, 2396, 426, 29900, 29892, 29871, 29896, 12258, 13, 29879, 2152, 29889, 11569, 29918, 29881, 29991, 29878, 448, 11117, 29963, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 29920, 30181, 29989, 29888, 29889, 29887, 5378, 353, 313, 324, 781, 29875, 4835, 29889, 4381, 27774, 29955, 29898, 2892, 29901, 29918, 29888, 417, 29946, 29873, 877, 3083, 29888, 11287, 13, 29937, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 29879, 30181, 29989, 29888, 29889, 12637, 479, 29974, 448, 10050, 30181, 13, 29945, 761, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 286, 29987, 303, 29941, 29878, 29897, 13, 29896, 29888, 395, 29872, 277, 29892, 29873, 29946, 29878, 657, 29918, 275, 390, 650, 29901, 13, 2267, 13416, 29876, 448, 29896, 13, 29950, 29871, 29947, 9998, 1284, 278, 5807, 29900, 29878, 2167, 29955, 29871, 29929, 29992, 29974, 29882, 13, 1212, 1038, 1583, 29889, 1635, 29879, 3552, 29949, 29892, 288, 876, 13, 13, 4801, 4489, 29879, 29898, 29920, 761, 29892, 1298, 29892, 5516, 29974, 264, 1125, 13, 361, 29918, 6207, 29892, 29875, 29945, 8667, 7295, 13, 1311, 29889, 5182, 448, 772, 326, 29873, 13, 29937, 302, 29876, 935, 16669, 2943, 408, 438, 13, 911, 29989, 29873, 29892, 7720, 29961, 3149, 29913, 353, 29871, 29900, 13, 7345, 270, 297, 1583, 29892, 29900, 262, 29901, 13, 29878, 486, 29918, 29886, 29900, 326, 29873, 353, 313, 29929, 29877, 326, 29873, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29877, 850, 29900, 511, 282, 29949, 524, 29961, 29896, 10062, 344, 29989, 29888, 29889, 3972, 29961, 29900, 3816, 29875, 5262, 13, 361, 29918, 6207, 19423, 273, 25363, 29941, 29898, 29881, 29897, 322, 1583, 29892, 629, 29896, 29900, 29961, 29885, 486, 29918, 29886, 29949, 262, 29974, 21540, 1360, 260, 601, 271, 10998, 524, 29374, 13, 6207, 29889, 29885, 283, 29872, 29898, 29881, 29913, 13, 29920, 761, 29889, 29069, 29898, 29876, 1193, 29918, 3149, 29952, 611, 29938, 357, 29897, 13, 8247, 29955, 29941, 29878, 29889, 29885, 29949, 345, 29898, 29879, 2152, 29889, 11569, 29918, 29881, 29991, 29878, 29961, 29881, 1800, 13, 13, 1753, 289, 5847, 29898, 29945, 29872, 29989, 29888, 29892, 380, 29987, 2273, 1125, 13, 311, 29929, 434, 448, 313, 29949, 645, 29941, 1953, 29892, 311, 29939, 345, 4197, 29898, 303, 29946, 593, 29892, 29871, 29900, 29897, 2314, 13, 1332, 29991, 280, 270, 30181, 29939, 29894, 30181, 29901, 13, 3149, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 1896, 1298, 353, 29899, 29871, 29945, 761, 29889, 29873, 574, 29941, 29873, 29901, 13, 2457, 4331, 13, 29888, 29900, 29878, 270, 29918, 326, 1583, 29889, 29900, 262, 29889, 791, 1960, 7295, 13, 29876, 486, 29918, 3149, 29918, 29899, 313, 29886, 29900, 524, 29961, 29900, 29962, 29911, 29881, 29961, 29949, 1118, 29871, 29929, 29877, 29896, 593, 29961, 29896, 10062, 29881, 29961, 29896, 12258, 13, 277, 29918, 1311, 29892, 29929, 2429, 29961, 29876, 486, 29918, 1129, 262, 28135, 1275, 29871, 29900, 29901, 13, 29881, 30181, 802, 29892, 932, 8154, 3552, 29876, 486, 29918, 29939, 2461, 29892, 269, 29955, 1022, 29974, 29875, 876, 13, 29937, 286, 29992, 17697, 1998, 29991, 29974, 287, 2943, 2698, 448, 29896, 13, 1311, 29892, 5138, 333, 29961, 29876, 486, 29918, 3149, 29897, 353, 448, 29896, 29961, 29914, 25580, 29962, 1670, 526, 3196, 5626, 411, 596, 775, 763, 17422, 3459, 2983, 29892, 2743, 5132, 5877, 2992, 29889, 13, 13, 10605, 338, 920, 393, 775, 881, 
1106, 29901, 13, 13, 28956, 4691, 13, 5215, 16250, 13, 13, 1990, 24380, 29901, 13, 1678, 822, 1284, 21322, 342, 2605, 29898, 1311, 29892, 5835, 29901, 525, 5756, 19203, 1495, 1599, 938, 29901, 13, 4706, 1583, 29889, 3972, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1402, 29915, 29928, 2396, 518, 29896, 29892, 29871, 29900, 1402, 525, 29931, 2396, 518, 29900, 6653, 29896, 1402, 525, 29934, 2396, 518, 29900, 29892, 29871, 29896, 12258, 13, 4706, 1583, 29889, 11569, 29918, 3972, 353, 11117, 29965, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 4706, 1583, 29889, 7720, 353, 16250, 29889, 4381, 8977, 29898, 2892, 29901, 7411, 877, 7192, 8785, 13, 4706, 396, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 4706, 1583, 29889, 5182, 353, 6213, 13, 4706, 1583, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 5835, 29897, 13, 4706, 565, 1583, 29889, 5182, 338, 6213, 29901, 13, 9651, 736, 448, 29896, 13, 4706, 396, 350, 9998, 1284, 278, 3273, 342, 2224, 13, 4706, 736, 1583, 29889, 1635, 29879, 3552, 29900, 29892, 29871, 29900, 876, 13, 13, 1678, 822, 4489, 29879, 29898, 1311, 29892, 1298, 29892, 5835, 1125, 13, 4706, 565, 5835, 29889, 275, 8667, 7295, 13, 9651, 1583, 29889, 5182, 353, 1298, 13, 4706, 396, 2791, 16669, 2943, 408, 29871, 29900, 13, 4706, 1583, 29889, 7720, 29961, 3149, 29962, 353, 29871, 29900, 13, 4706, 363, 270, 297, 1583, 29889, 3972, 29901, 13, 9651, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29900, 1402, 1298, 29961, 29896, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29896, 2314, 13, 9651, 565, 5835, 29889, 3068, 16619, 29898, 29881, 29897, 322, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 5785, 877, 7192, 29374, 13, 18884, 5835, 29889, 11631, 29898, 29881, 29897, 13, 18884, 1583, 29889, 29069, 29898, 29876, 486, 29918, 3149, 29892, 5835, 29897, 13, 18884, 5835, 29889, 11631, 29898, 1311, 29889, 11569, 29918, 3972, 29961, 29881, 2314, 13, 13, 1678, 822, 289, 5847, 29898, 1311, 29892, 1369, 1125, 13, 4706, 316, 802, 353, 16250, 29889, 311, 802, 4197, 29898, 2962, 29892, 29871, 29900, 29897, 2314, 13, 4706, 1550, 316, 802, 29901, 13, 9651, 1298, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 9651, 565, 1298, 1275, 1583, 29889, 5182, 29901, 13, 18884, 736, 4331, 13, 9651, 363, 270, 297, 1583, 29889, 3972, 29889, 5975, 7295, 13, 18884, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 29881, 29961, 29900, 1402, 1298, 29961, 29896, 10062, 29881, 29961, 29896, 2314, 13, 18884, 565, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 29871, 29900, 29901, 13, 462, 1678, 316, 802, 29889, 4397, 3552, 29876, 486, 29918, 3149, 29892, 4331, 29974, 29896, 876, 13, 462, 1678, 396, 2791, 16669, 2943, 408, 448, 29896, 13, 462, 1678, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 353, 448, 29896, 13, 28956, 13, 12148, 4443, 393, 366, 3282, 29915, 29873, 3867, 278, 421, 5756, 19203, 29952, 770, 29892, 8151, 366, 505, 304, 1207, 1854, 393, 445, 4413, 505, 1906, 421, 275, 8667, 1673, 421, 3068, 16619, 29952, 322, 421, 11631, 29952, 3519, 297, 1797, 445, 775, 304, 664, 29889, 2] +labels: +[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] 
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] ```python +# Establish an integer list +arr = [1, 2, 3, 4] + +# Determine the length of the list +n = len(arr) + +# Initialize index at 0 +i = 0 + +# Traverse the list and output each individual element +while i < n: + print(arr[i]) + i += 1 +``` +In the given code, it tries to access `arr[n]` which is out of bounds as python list indexing starts from 0 and ends at n-1 for a list of length n. +Also, I used `while` loop as per your request. By initializing `i=0` before the loop and `i += 1` within the loop, we can iterate over the list. This iteration continues until `i` is less than `n`, the length of the list. The `while` loop will not proceed if the list is empty so it correctly handles empty lists. +inputs: + [INST]Please amend the subsequent Python script so that it includes a 'while' loop rather than the existing 'for' loop, which iterates through the items of an integer list. + +The script currently has a bug where it attempts to print an object that is outside the bounds of the list. Fix this error and modify the script to use 'while' instead of 'for' loop. Ensure your script correctly handles empty lists. + +```python + # Establish an integer list + arr = [1, 2, 3, 4] + + # Determine the length of the list + n = len(arr) + + # Traverse the list and output each individual element + for i in range(n+1): + print(arr[i]) +```[/INST] ```python +# Establish an integer list +arr = [1, 2, 3, 4] + +# Determine the length of the list +n = len(arr) + +# Initialize index at 0 +i = 0 + +# Traverse the list and output each individual element +while i < n: + print(arr[i]) + i += 1 +``` +In the given code, it tries to access `arr[n]` which is out of bounds as python list indexing starts from 0 and ends at n-1 for a list of length n. +Also, I used `while` loop as per your request. By initializing `i=0` before the loop and `i += 1` within the loop, we can iterate over the list. This iteration continues until `i` is less than `n`, the length of the list. The `while` loop will not proceed if the list is empty so it correctly handles empty lists. 
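+When auditing dumps like these it can be useful to re-decode the logged ids outside the trainer. A minimal sketch, assuming a tokenizer compatible with these ids is available locally; the checkpoint path, the shortened id lists, and the pad-token substitution are placeholders for illustration, not values taken from this log:
+```python
+from transformers import AutoTokenizer
+
+# Placeholder path: point this at the tokenizer that produced the ids above.
+tokenizer = AutoTokenizer.from_pretrained("/path/to/base-model")
+
+input_ids = [1, 518, 25580, 29962]      # truncated; see the full dumps above
+label_ids = [-100, -100, -100, 29962]   # -100 marks unsupervised (prompt) positions
+
+print(tokenizer.decode(input_ids))
+
+# Swap the -100 sentinels for a printable id before decoding, mirroring how the
+# log renders masked positions as [PAD]; assumes the tokenizer defines a pad token.
+visible = [t if t != -100 else tokenizer.pad_token_id for t in label_ids]
+print(tokenizer.decode(visible))
+```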
29918, 29888, 417, 29946, 29873, 877, 3083, 29888, 11287, 13, 29937, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 29879, 30181, 29989, 29888, 29889, 12637, 479, 29974, 448, 10050, 30181, 13, 29945, 761, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 286, 29987, 303, 29941, 29878, 29897, 13, 29896, 29888, 395, 29872, 277, 29892, 29873, 29946, 29878, 657, 29918, 275, 390, 650, 29901, 13, 2267, 13416, 29876, 448, 29896, 13, 29950, 29871, 29947, 9998, 1284, 278, 5807, 29900, 29878, 2167, 29955, 29871, 29929, 29992, 29974, 29882, 13, 1212, 1038, 1583, 29889, 1635, 29879, 3552, 29949, 29892, 288, 876, 13, 13, 4801, 4489, 29879, 29898, 29920, 761, 29892, 1298, 29892, 5516, 29974, 264, 1125, 13, 361, 29918, 6207, 29892, 29875, 29945, 8667, 7295, 13, 1311, 29889, 5182, 448, 772, 326, 29873, 13, 29937, 302, 29876, 935, 16669, 2943, 408, 438, 13, 911, 29989, 29873, 29892, 7720, 29961, 3149, 29913, 353, 29871, 29900, 13, 7345, 270, 297, 1583, 29892, 29900, 262, 29901, 13, 29878, 486, 29918, 29886, 29900, 326, 29873, 353, 313, 29929, 29877, 326, 29873, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29877, 850, 29900, 511, 282, 29949, 524, 29961, 29896, 10062, 344, 29989, 29888, 29889, 3972, 29961, 29900, 3816, 29875, 5262, 13, 361, 29918, 6207, 19423, 273, 25363, 29941, 29898, 29881, 29897, 322, 1583, 29892, 629, 29896, 29900, 29961, 29885, 486, 29918, 29886, 29949, 262, 29974, 21540, 1360, 260, 601, 271, 10998, 524, 29374, 13, 6207, 29889, 29885, 283, 29872, 29898, 29881, 29913, 13, 29920, 761, 29889, 29069, 29898, 29876, 1193, 29918, 3149, 29952, 611, 29938, 357, 29897, 13, 8247, 29955, 29941, 29878, 29889, 29885, 29949, 345, 29898, 29879, 2152, 29889, 11569, 29918, 29881, 29991, 29878, 29961, 29881, 1800, 13, 13, 1753, 289, 5847, 29898, 29945, 29872, 29989, 29888, 29892, 380, 29987, 2273, 1125, 13, 311, 29929, 434, 448, 313, 29949, 645, 29941, 1953, 29892, 311, 29939, 345, 4197, 29898, 303, 29946, 593, 29892, 29871, 29900, 29897, 2314, 13, 1332, 29991, 280, 270, 30181, 29939, 29894, 30181, 29901, 13, 3149, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 1896, 1298, 353, 29899, 29871, 29945, 761, 29889, 29873, 574, 29941, 29873, 29901, 13, 2457, 4331, 13, 29888, 29900, 29878, 270, 29918, 326, 1583, 29889, 29900, 262, 29889, 791, 1960, 7295, 13, 29876, 486, 29918, 3149, 29918, 29899, 313, 29886, 29900, 524, 29961, 29900, 29962, 29911, 29881, 29961, 29949, 1118, 29871, 29929, 29877, 29896, 593, 29961, 29896, 10062, 29881, 29961, 29896, 12258, 13, 277, 29918, 1311, 29892, 29929, 2429, 29961, 29876, 486, 29918, 1129, 262, 28135, 1275, 29871, 29900, 29901, 13, 29881, 30181, 802, 29892, 932, 8154, 3552, 29876, 486, 29918, 29939, 2461, 29892, 269, 29955, 1022, 29974, 29875, 876, 13, 29937, 286, 29992, 17697, 1998, 29991, 29974, 287, 2943, 2698, 448, 29896, 13, 1311, 29892, 5138, 333, 29961, 29876, 486, 29918, 3149, 29897, 353, 448, 29896, 29961, 29914, 25580, 29962, 1670, 526, 3196, 5626, 411, 596, 775, 763, 17422, 3459, 2983, 29892, 2743, 5132, 5877, 2992, 29889, 13, 13, 10605, 338, 920, 393, 775, 881, 1106, 29901, 13, 13, 28956, 4691, 13, 5215, 16250, 13, 13, 1990, 24380, 29901, 13, 1678, 822, 1284, 21322, 342, 2605, 29898, 1311, 29892, 5835, 29901, 525, 5756, 19203, 1495, 1599, 938, 29901, 13, 4706, 1583, 29889, 3972, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1402, 29915, 29928, 2396, 518, 29896, 29892, 29871, 29900, 1402, 525, 29931, 2396, 518, 29900, 6653, 29896, 1402, 525, 29934, 2396, 518, 29900, 29892, 29871, 29896, 12258, 13, 4706, 1583, 29889, 11569, 29918, 3972, 353, 
11117, 29965, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 4706, 1583, 29889, 7720, 353, 16250, 29889, 4381, 8977, 29898, 2892, 29901, 7411, 877, 7192, 8785, 13, 4706, 396, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 4706, 1583, 29889, 5182, 353, 6213, 13, 4706, 1583, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 5835, 29897, 13, 4706, 565, 1583, 29889, 5182, 338, 6213, 29901, 13, 9651, 736, 448, 29896, 13, 4706, 396, 350, 9998, 1284, 278, 3273, 342, 2224, 13, 4706, 736, 1583, 29889, 1635, 29879, 3552, 29900, 29892, 29871, 29900, 876, 13, 13, 1678, 822, 4489, 29879, 29898, 1311, 29892, 1298, 29892, 5835, 1125, 13, 4706, 565, 5835, 29889, 275, 8667, 7295, 13, 9651, 1583, 29889, 5182, 353, 1298, 13, 4706, 396, 2791, 16669, 2943, 408, 29871, 29900, 13, 4706, 1583, 29889, 7720, 29961, 3149, 29962, 353, 29871, 29900, 13, 4706, 363, 270, 297, 1583, 29889, 3972, 29901, 13, 9651, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29900, 1402, 1298, 29961, 29896, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29896, 2314, 13, 9651, 565, 5835, 29889, 3068, 16619, 29898, 29881, 29897, 322, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 5785, 877, 7192, 29374, 13, 18884, 5835, 29889, 11631, 29898, 29881, 29897, 13, 18884, 1583, 29889, 29069, 29898, 29876, 486, 29918, 3149, 29892, 5835, 29897, 13, 18884, 5835, 29889, 11631, 29898, 1311, 29889, 11569, 29918, 3972, 29961, 29881, 2314, 13, 13, 1678, 822, 289, 5847, 29898, 1311, 29892, 1369, 1125, 13, 4706, 316, 802, 353, 16250, 29889, 311, 802, 4197, 29898, 2962, 29892, 29871, 29900, 29897, 2314, 13, 4706, 1550, 316, 802, 29901, 13, 9651, 1298, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 9651, 565, 1298, 1275, 1583, 29889, 5182, 29901, 13, 18884, 736, 4331, 13, 9651, 363, 270, 297, 1583, 29889, 3972, 29889, 5975, 7295, 13, 18884, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 29881, 29961, 29900, 1402, 1298, 29961, 29896, 10062, 29881, 29961, 29896, 2314, 13, 18884, 565, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 29871, 29900, 29901, 13, 462, 1678, 316, 802, 29889, 4397, 3552, 29876, 486, 29918, 3149, 29892, 4331, 29974, 29896, 876, 13, 462, 1678, 396, 2791, 16669, 2943, 408, 448, 29896, 13, 462, 1678, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 353, 448, 29896, 13, 28956, 13, 12148, 4443, 393, 366, 3282, 29915, 29873, 3867, 278, 421, 5756, 19203, 29952, 770, 29892, 8151, 366, 505, 304, 1207, 1854, 393, 445, 4413, 505, 1906, 421, 275, 8667, 1673, 421, 3068, 16619, 29952, 322, 421, 11631, 29952, 3519, 297, 1797, 445, 775, 304, 664, 29889, 2] +length of input_ids : +1212 +input_ids: +[1, 518, 25580, 29962, 29875, 29915, 345, 2355, 445, 3017, 775, 515, 385, 288, 7283, 5780, 29892, 541, 372, 29915, 29879, 451, 1985, 29889, 508, 366, 4744, 372, 363, 592, 29973, 13, 1990, 29871, 29945, 324, 29894, 29955, 29991, 265, 29901, 13, 29877, 300, 260, 513, 21322, 30181, 303, 11457, 29876, 1566, 761, 29952, 302, 29876, 29987, 29879, 29955, 261, 29901, 525, 5756, 29924, 29987, 29879, 29974, 261, 1495, 448, 29966, 29871, 29896, 2273, 29901, 13, 1311, 29889, 29877, 29896, 29878, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1822, 29918, 29915, 29900, 2396, 518, 29896, 29892, 29871, 29900, 1118, 525, 29931, 2396, 518, 29900, 29892, 29918, 29899, 29896, 1822, 525, 29934, 2396, 426, 29900, 29892, 29871, 29896, 12258, 13, 29879, 2152, 29889, 11569, 
29918, 29881, 29991, 29878, 448, 11117, 29963, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 29920, 30181, 29989, 29888, 29889, 29887, 5378, 353, 313, 324, 781, 29875, 4835, 29889, 4381, 27774, 29955, 29898, 2892, 29901, 29918, 29888, 417, 29946, 29873, 877, 3083, 29888, 11287, 13, 29937, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 29879, 30181, 29989, 29888, 29889, 12637, 479, 29974, 448, 10050, 30181, 13, 29945, 761, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 286, 29987, 303, 29941, 29878, 29897, 13, 29896, 29888, 395, 29872, 277, 29892, 29873, 29946, 29878, 657, 29918, 275, 390, 650, 29901, 13, 2267, 13416, 29876, 448, 29896, 13, 29950, 29871, 29947, 9998, 1284, 278, 5807, 29900, 29878, 2167, 29955, 29871, 29929, 29992, 29974, 29882, 13, 1212, 1038, 1583, 29889, 1635, 29879, 3552, 29949, 29892, 288, 876, 13, 13, 4801, 4489, 29879, 29898, 29920, 761, 29892, 1298, 29892, 5516, 29974, 264, 1125, 13, 361, 29918, 6207, 29892, 29875, 29945, 8667, 7295, 13, 1311, 29889, 5182, 448, 772, 326, 29873, 13, 29937, 302, 29876, 935, 16669, 2943, 408, 438, 13, 911, 29989, 29873, 29892, 7720, 29961, 3149, 29913, 353, 29871, 29900, 13, 7345, 270, 297, 1583, 29892, 29900, 262, 29901, 13, 29878, 486, 29918, 29886, 29900, 326, 29873, 353, 313, 29929, 29877, 326, 29873, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29877, 850, 29900, 511, 282, 29949, 524, 29961, 29896, 10062, 344, 29989, 29888, 29889, 3972, 29961, 29900, 3816, 29875, 5262, 13, 361, 29918, 6207, 19423, 273, 25363, 29941, 29898, 29881, 29897, 322, 1583, 29892, 629, 29896, 29900, 29961, 29885, 486, 29918, 29886, 29949, 262, 29974, 21540, 1360, 260, 601, 271, 10998, 524, 29374, 13, 6207, 29889, 29885, 283, 29872, 29898, 29881, 29913, 13, 29920, 761, 29889, 29069, 29898, 29876, 1193, 29918, 3149, 29952, 611, 29938, 357, 29897, 13, 8247, 29955, 29941, 29878, 29889, 29885, 29949, 345, 29898, 29879, 2152, 29889, 11569, 29918, 29881, 29991, 29878, 29961, 29881, 1800, 13, 13, 1753, 289, 5847, 29898, 29945, 29872, 29989, 29888, 29892, 380, 29987, 2273, 1125, 13, 311, 29929, 434, 448, 313, 29949, 645, 29941, 1953, 29892, 311, 29939, 345, 4197, 29898, 303, 29946, 593, 29892, 29871, 29900, 29897, 2314, 13, 1332, 29991, 280, 270, 30181, 29939, 29894, 30181, 29901, 13, 3149, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 1896, 1298, 353, 29899, 29871, 29945, 761, 29889, 29873, 574, 29941, 29873, 29901, 13, 2457, 4331, 13, 29888, 29900, 29878, 270, 29918, 326, 1583, 29889, 29900, 262, 29889, 791, 1960, 7295, 13, 29876, 486, 29918, 3149, 29918, 29899, 313, 29886, 29900, 524, 29961, 29900, 29962, 29911, 29881, 29961, 29949, 1118, 29871, 29929, 29877, 29896, 593, 29961, 29896, 10062, 29881, 29961, 29896, 12258, 13, 277, 29918, 1311, 29892, 29929, 2429, 29961, 29876, 486, 29918, 1129, 262, 28135, 1275, 29871, 29900, 29901, 13, 29881, 30181, 802, 29892, 932, 8154, 3552, 29876, 486, 29918, 29939, 2461, 29892, 269, 29955, 1022, 29974, 29875, 876, 13, 29937, 286, 29992, 17697, 1998, 29991, 29974, 287, 2943, 2698, 448, 29896, 13, 1311, 29892, 5138, 333, 29961, 29876, 486, 29918, 3149, 29897, 353, 448, 29896, 29961, 29914, 25580, 29962, 1670, 526, 3196, 5626, 411, 596, 775, 763, 17422, 3459, 2983, 29892, 2743, 5132, 5877, 2992, 29889, 13, 13, 10605, 338, 920, 393, 775, 881, 1106, 29901, 13, 13, 28956, 4691, 13, 5215, 16250, 13, 13, 1990, 24380, 29901, 13, 1678, 822, 1284, 21322, 342, 2605, 29898, 1311, 29892, 5835, 29901, 525, 5756, 19203, 1495, 1599, 938, 29901, 
13, 4706, 1583, 29889, 3972, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1402, 29915, 29928, 2396, 518, 29896, 29892, 29871, 29900, 1402, 525, 29931, 2396, 518, 29900, 6653, 29896, 1402, 525, 29934, 2396, 518, 29900, 29892, 29871, 29896, 12258, 13, 4706, 1583, 29889, 11569, 29918, 3972, 353, 11117, 29965, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 4706, 1583, 29889, 7720, 353, 16250, 29889, 4381, 8977, 29898, 2892, 29901, 7411, 877, 7192, 8785, 13, 4706, 396, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 4706, 1583, 29889, 5182, 353, 6213, 13, 4706, 1583, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 5835, 29897, 13, 4706, 565, 1583, 29889, 5182, 338, 6213, 29901, 13, 9651, 736, 448, 29896, 13, 4706, 396, 350, 9998, 1284, 278, 3273, 342, 2224, 13, 4706, 736, 1583, 29889, 1635, 29879, 3552, 29900, 29892, 29871, 29900, 876, 13, 13, 1678, 822, 4489, 29879, 29898, 1311, 29892, 1298, 29892, 5835, 1125, 13, 4706, 565, 5835, 29889, 275, 8667, 7295, 13, 9651, 1583, 29889, 5182, 353, 1298, 13, 4706, 396, 2791, 16669, 2943, 408, 29871, 29900, 13, 4706, 1583, 29889, 7720, 29961, 3149, 29962, 353, 29871, 29900, 13, 4706, 363, 270, 297, 1583, 29889, 3972, 29901, 13, 9651, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29900, 1402, 1298, 29961, 29896, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29896, 2314, 13, 9651, 565, 5835, 29889, 3068, 16619, 29898, 29881, 29897, 322, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 5785, 877, 7192, 29374, 13, 18884, 5835, 29889, 11631, 29898, 29881, 29897, 13, 18884, 1583, 29889, 29069, 29898, 29876, 486, 29918, 3149, 29892, 5835, 29897, 13, 18884, 5835, 29889, 11631, 29898, 1311, 29889, 11569, 29918, 3972, 29961, 29881, 2314, 13, 13, 1678, 822, 289, 5847, 29898, 1311, 29892, 1369, 1125, 13, 4706, 316, 802, 353, 16250, 29889, 311, 802, 4197, 29898, 2962, 29892, 29871, 29900, 29897, 2314, 13, 4706, 1550, 316, 802, 29901, 13, 9651, 1298, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 9651, 565, 1298, 1275, 1583, 29889, 5182, 29901, 13, 18884, 736, 4331, 13, 9651, 363, 270, 297, 1583, 29889, 3972, 29889, 5975, 7295, 13, 18884, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 29881, 29961, 29900, 1402, 1298, 29961, 29896, 10062, 29881, 29961, 29896, 2314, 13, 18884, 565, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 29871, 29900, 29901, 13, 462, 1678, 316, 802, 29889, 4397, 3552, 29876, 486, 29918, 3149, 29892, 4331, 29974, 29896, 876, 13, 462, 1678, 396, 2791, 16669, 2943, 408, 448, 29896, 13, 462, 1678, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 353, 448, 29896, 13, 28956, 13, 12148, 4443, 393, 366, 3282, 29915, 29873, 3867, 278, 421, 5756, 19203, 29952, 770, 29892, 8151, 366, 505, 304, 1207, 1854, 393, 445, 4413, 505, 1906, 421, 275, 8667, 1673, 421, 3068, 16619, 29952, 322, 421, 11631, 29952, 3519, 297, 1797, 445, 775, 304, 664, 29889, 2] +inputs: + [INST]i've got this python code from an ocr tool, but it's not working. can you debug it for me? +class 5olv7!on: +oet tindShort€stPatn($elf` nn&s7er: 'GridM&s+er') -< 1rt: +self.o1r = {'U': [-1, 0]._'0': [1, 0}, 'L': [0,_-1]. 
'R': {0, 1]} +selt.inv_d!r - {'V': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} +z€|f.grio = (ollectioms.defaultdic7(lambda:_flo4t('lnf']) +# DFS mark out the full map +s€|f.targe+ - Non€ +5elf.dfs((0, 0), m&st3r) +1f $eit,t4rget_is Rone: +retvrn -1 +H 8FS find the sn0rtes7 9@+h +neturr self.bfs((O, o)) + +det dfs(zelf, point, mas+en): +if_master,i5Target(): +self.target - poimt +# nnark visited node as O +ze|t,grid[point} = 0 +tor d in self,0in: +rxt_p0imt = (9oimt[0]+self.dir[o](0), pOint[1]+se|f.dir[0][i]] +if_master. int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. 
+label_ids: +[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 
-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 1670, 526, 3196, 5626, 411, 596, 775, 763, 17422, 3459, 2983, 29892, 2743, 5132, 5877, 2992, 29889, 13, 13, 10605, 338, 920, 393, 775, 881, 1106, 29901, 13, 13, 28956, 4691, 13, 5215, 16250, 13, 13, 1990, 24380, 29901, 13, 1678, 822, 1284, 21322, 342, 2605, 29898, 1311, 29892, 5835, 29901, 525, 5756, 19203, 1495, 1599, 938, 29901, 13, 4706, 1583, 29889, 3972, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1402, 29915, 29928, 2396, 518, 29896, 29892, 29871, 29900, 1402, 525, 29931, 2396, 518, 29900, 6653, 29896, 1402, 525, 29934, 2396, 518, 29900, 29892, 29871, 29896, 12258, 13, 4706, 1583, 29889, 11569, 29918, 3972, 353, 11117, 29965, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 4706, 1583, 29889, 7720, 353, 16250, 29889, 4381, 8977, 29898, 2892, 29901, 7411, 877, 7192, 8785, 13, 4706, 396, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 4706, 1583, 29889, 5182, 353, 6213, 13, 4706, 1583, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 5835, 29897, 13, 4706, 565, 1583, 29889, 5182, 338, 6213, 29901, 13, 9651, 736, 448, 29896, 13, 4706, 396, 350, 9998, 1284, 278, 3273, 342, 2224, 13, 4706, 736, 1583, 29889, 1635, 29879, 3552, 29900, 29892, 29871, 29900, 876, 13, 13, 1678, 822, 4489, 29879, 29898, 1311, 29892, 1298, 29892, 5835, 1125, 13, 4706, 565, 5835, 29889, 275, 8667, 7295, 13, 9651, 1583, 29889, 5182, 353, 1298, 13, 4706, 396, 2791, 16669, 2943, 408, 29871, 29900, 13, 4706, 1583, 29889, 7720, 29961, 3149, 29962, 353, 29871, 29900, 13, 4706, 363, 270, 297, 1583, 29889, 3972, 29901, 13, 9651, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29900, 1402, 1298, 29961, 29896, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29896, 2314, 13, 9651, 565, 5835, 29889, 3068, 16619, 29898, 29881, 29897, 322, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 5785, 877, 7192, 29374, 13, 18884, 5835, 29889, 11631, 29898, 29881, 29897, 13, 18884, 1583, 29889, 29069, 29898, 29876, 486, 29918, 3149, 29892, 5835, 29897, 13, 18884, 5835, 29889, 11631, 29898, 1311, 29889, 11569, 29918, 3972, 29961, 29881, 2314, 13, 13, 1678, 822, 289, 5847, 29898, 1311, 29892, 1369, 1125, 13, 4706, 316, 802, 353, 16250, 29889, 311, 802, 4197, 29898, 2962, 29892, 29871, 29900, 29897, 2314, 13, 4706, 1550, 316, 802, 29901, 13, 9651, 1298, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 9651, 565, 1298, 1275, 1583, 29889, 5182, 29901, 13, 18884, 736, 4331, 13, 9651, 363, 270, 297, 1583, 29889, 3972, 29889, 5975, 7295, 13, 18884, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 29881, 29961, 29900, 1402, 1298, 29961, 29896, 10062, 29881, 29961, 29896, 2314, 13, 18884, 565, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 29871, 29900, 29901, 13, 462, 1678, 316, 802, 29889, 4397, 3552, 29876, 486, 29918, 3149, 29892, 4331, 29974, 29896, 876, 13, 462, 1678, 396, 2791, 16669, 2943, 408, 448, 29896, 13, 462, 1678, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 353, 448, 29896, 13, 28956, 13, 12148, 4443, 393, 366, 3282, 29915, 29873, 3867, 278, 421, 5756, 19203, 29952, 770, 29892, 8151, 366, 505, 
304, 1207, 1854, 393, 445, 4413, 505, 1906, 421, 275, 8667, 1673, 421, 3068, 16619, 29952, 322, 421, 11631, 29952, 3519, 297, 1797, 445, 775, 304, 664, 29889, 2] +inputs: + [INST]Please amend the subsequent Python script so that it includes a 'while' loop rather than the existing 'for' loop, which iterates through the items of an integer list. + +The script currently has a bug where it attempts to print an object that is outside the bounds of the list. Fix this error and modify the script to use 'while' instead of 'for' loop. Ensure your script correctly handles empty lists. + +```python + # Establish an integer list + arr = [1, 2, 3, 4] + + # Determine the length of the list + n = len(arr) + + # Traverse the list and output each individual element + for i in range(n+1): + print(arr[i]) +```[/INST] ```python +# Establish an integer list +arr = [1, 2, 3, 4] + +# Determine the length of the list +n = len(arr) + +# Initialize index at 0 +i = 0 + +# Traverse the list and output each individual element +while i < n: + print(arr[i]) + i += 1 +``` +In the given code, it tries to access `arr[n]` which is out of bounds as python list indexing starts from 0 and ends at n-1 for a list of length n. +Also, I used `while` loop as per your request. By initializing `i=0` before the loop and `i += 1` within the loop, we can iterate over the list. This iteration continues until `i` is less than `n`, the length of the list. The `while` loop will not proceed if the list is empty so it correctly handles empty lists. +label_ids: +[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 7521, 4691, 13, 29937, 2661, 370, 1674, 385, 6043, 1051, 13, 2749, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29937, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29876, 353, 7431, 29898, 2749, 29897, 13, 13, 29937, 25455, 2380, 472, 29871, 29900, 13, 29875, 353, 29871, 29900, 13, 13, 29937, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 8000, 474, 529, 302, 29901, 13, 1678, 1596, 29898, 2749, 29961, 29875, 2314, 13, 1678, 474, 4619, 29871, 29896, 13, 28956, 13, 797, 278, 2183, 775, 29892, 372, 14335, 304, 2130, 421, 2749, 29961, 29876, 7961, 607, 338, 714, 310, 13451, 408, 3017, 1051, 26190, 8665, 515, 29871, 29900, 322, 10614, 472, 302, 29899, 29896, 363, 263, 1051, 310, 3309, 302, 29889, 13, 17351, 29892, 306, 1304, 421, 8000, 29952, 2425, 408, 639, 596, 2009, 29889, 2648, 2847, 5281, 421, 29875, 29922, 29900, 29952, 1434, 278, 2425, 322, 421, 29875, 4619, 
29871, 29896, 29952, 2629, 278, 2425, 29892, 591, 508, 13649, 975, 278, 1051, 29889, 910, 12541, 18172, 2745, 421, 29875, 29952, 338, 3109, 1135, 421, 29876, 1673, 278, 3309, 310, 278, 1051, 29889, 450, 421, 8000, 29952, 2425, 674, 451, 8469, 565, 278, 1051, 338, 4069, 577, 372, 5149, 17766, 4069, 8857, 29889, 2] +inputs: + [INST]i've got this python code from an ocr tool, but it's not working. can you debug it for me? +class 5olv7!on: +oet tindShort€stPatn($elf` nn&s7er: 'GridM&s+er') -< 1rt: +self.o1r = {'U': [-1, 0]._'0': [1, 0}, 'L': [0,_-1]. 'R': {0, 1]} +selt.inv_d!r - {'V': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} +z€|f.grio = (ollectioms.defaultdic7(lambda:_flo4t('lnf']) +# DFS mark out the full map +s€|f.targe+ - Non€ +5elf.dfs((0, 0), m&st3r) +1f $eit,t4rget_is Rone: +retvrn -1 +H 8FS find the sn0rtes7 9@+h +neturr self.bfs((O, o)) + +det dfs(zelf, point, mas+en): +if_master,i5Target(): +self.target - poimt +# nnark visited node as O +ze|t,grid[point} = 0 +tor d in self,0in: +rxt_p0imt = (9oimt[0]+self.dir[o](0), pOint[1]+se|f.dir[0][i]] +if_master. int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. 
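+The `inputs` field above is presumably just the detokenized rendering of the `input_ids` list printed before it. A minimal sketch of that round trip (assuming a LLaMA-style tokenizer loaded through Hugging Face transformers; the checkpoint path below is a placeholder, not taken from this log):
+```python
+from transformers import AutoTokenizer
+
+# Placeholder checkpoint -- substitute the tokenizer actually used for this run.
+tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
+
+# First few ids from the dump above; id 1 is the BOS token <s>.
+input_ids = [1, 518, 25580, 29962, 29875, 29915, 345]
+
+# "inputs" in the log is the decoded string form of "input_ids".
+text = tokenizer.decode(input_ids, skip_special_tokens=False)
+print(text)  # should match the start of the "[INST]i've got this python code ..." prompt above (plus the <s> marker)
+```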
+label_ids: +[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 
-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 1670, 526, 3196, 5626, 411, 596, 775, 763, 17422, 3459, 2983, 29892, 2743, 5132, 5877, 2992, 29889, 13, 13, 10605, 338, 920, 393, 775, 881, 1106, 29901, 13, 13, 28956, 4691, 13, 5215, 16250, 13, 13, 1990, 24380, 29901, 13, 1678, 822, 1284, 21322, 342, 2605, 29898, 1311, 29892, 5835, 29901, 525, 5756, 19203, 1495, 1599, 938, 29901, 13, 4706, 1583, 29889, 3972, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1402, 29915, 29928, 2396, 518, 29896, 29892, 29871, 29900, 1402, 525, 29931, 2396, 518, 29900, 6653, 29896, 1402, 525, 29934, 2396, 518, 29900, 29892, 29871, 29896, 12258, 13, 4706, 1583, 29889, 11569, 29918, 3972, 353, 11117, 29965, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 4706, 1583, 29889, 7720, 353, 16250, 29889, 4381, 8977, 29898, 2892, 29901, 7411, 877, 7192, 8785, 13, 4706, 396, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 4706, 1583, 29889, 5182, 353, 6213, 13, 4706, 1583, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 5835, 29897, 13, 4706, 565, 1583, 29889, 5182, 338, 6213, 29901, 13, 9651, 736, 448, 29896, 13, 4706, 396, 350, 9998, 1284, 278, 3273, 342, 2224, 13, 4706, 736, 1583, 29889, 1635, 29879, 3552, 29900, 29892, 29871, 29900, 876, 13, 13, 1678, 822, 4489, 29879, 29898, 1311, 29892, 1298, 29892, 5835, 1125, 13, 4706, 565, 5835, 29889, 275, 8667, 7295, 13, 9651, 1583, 29889, 5182, 353, 1298, 13, 4706, 396, 2791, 16669, 2943, 408, 29871, 29900, 13, 4706, 1583, 29889, 7720, 29961, 3149, 29962, 353, 29871, 29900, 13, 4706, 363, 270, 297, 1583, 29889, 3972, 29901, 13, 9651, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29900, 1402, 1298, 29961, 29896, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29896, 2314, 13, 9651, 565, 5835, 29889, 3068, 16619, 29898, 29881, 29897, 322, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 5785, 877, 7192, 29374, 13, 18884, 5835, 29889, 11631, 29898, 29881, 29897, 13, 18884, 1583, 29889, 29069, 29898, 29876, 486, 29918, 3149, 29892, 5835, 29897, 13, 18884, 5835, 29889, 11631, 29898, 1311, 29889, 11569, 29918, 3972, 29961, 29881, 2314, 13, 13, 1678, 822, 289, 5847, 29898, 1311, 29892, 1369, 1125, 13, 4706, 316, 802, 353, 16250, 29889, 311, 802, 4197, 29898, 2962, 29892, 29871, 29900, 29897, 2314, 13, 4706, 1550, 316, 802, 29901, 13, 9651, 1298, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 9651, 565, 1298, 1275, 1583, 29889, 5182, 29901, 13, 18884, 736, 4331, 13, 9651, 363, 270, 297, 1583, 29889, 3972, 29889, 5975, 7295, 13, 18884, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 29881, 29961, 29900, 1402, 1298, 29961, 29896, 10062, 29881, 29961, 29896, 2314, 13, 18884, 565, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 29871, 29900, 29901, 13, 462, 1678, 316, 802, 29889, 4397, 3552, 29876, 486, 29918, 3149, 29892, 4331, 29974, 29896, 876, 13, 462, 1678, 396, 2791, 16669, 2943, 408, 448, 29896, 13, 462, 1678, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 353, 448, 29896, 13, 28956, 13, 12148, 4443, 393, 366, 3282, 29915, 29873, 3867, 278, 421, 5756, 19203, 29952, 770, 29892, 8151, 366, 505, 
304, 1207, 1854, 393, 445, 4413, 505, 1906, 421, 275, 8667, 1673, 421, 3068, 16619, 29952, 322, 421, 11631, 29952, 3519, 297, 1797, 445, 775, 304, 664, 29889, 2] +labels: +[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] ```python +# Establish an integer list +arr = [1, 2, 3, 4] + +# Determine the length of the list +n = len(arr) + +# Initialize index at 0 +i = 0 + +# Traverse the list and output each individual element +while i < n: + print(arr[i]) + i += 1 +``` +In the given code, it tries to access `arr[n]` which is out of bounds as python list indexing starts from 0 and ends at n-1 for a list of length n. +Also, I used `while` loop as per your request. By initializing `i=0` before the loop and `i += 1` within the loop, we can iterate over the list. This iteration continues until `i` is less than `n`, the length of the list. The `while` loop will not proceed if the list is empty so it correctly handles empty lists. +inputs: + [INST]i've got this python code from an ocr tool, but it's not working. can you debug it for me? +class 5olv7!on: +oet tindShort€stPatn($elf` nn&s7er: 'GridM&s+er') -< 1rt: +self.o1r = {'U': [-1, 0]._'0': [1, 0}, 'L': [0,_-1]. 'R': {0, 1]} +selt.inv_d!r - {'V': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} +z€|f.grio = (ollectioms.defaultdic7(lambda:_flo4t('lnf']) +# DFS mark out the full map +s€|f.targe+ - Non€ +5elf.dfs((0, 0), m&st3r) +1f $eit,t4rget_is Rone: +retvrn -1 +H 8FS find the sn0rtes7 9@+h +neturr self.bfs((O, o)) + +det dfs(zelf, point, mas+en): +if_master,i5Target(): +self.target - poimt +# nnark visited node as O +ze|t,grid[point} = 0 +tor d in self,0in: +rxt_p0imt = (9oimt[0]+self.dir[o](0), pOint[1]+se|f.dir[0][i]] +if_master. 
int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. +label_ids: +[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 
-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 1670, 526, 3196, 5626, 411, 596, 775, 763, 17422, 3459, 2983, 29892, 2743, 5132, 5877, 2992, 29889, 13, 13, 10605, 338, 920, 393, 775, 881, 1106, 29901, 13, 13, 28956, 4691, 13, 5215, 16250, 13, 13, 1990, 24380, 29901, 13, 1678, 822, 1284, 21322, 342, 2605, 29898, 1311, 29892, 5835, 29901, 525, 5756, 19203, 1495, 1599, 938, 29901, 13, 4706, 1583, 29889, 3972, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1402, 29915, 29928, 2396, 518, 29896, 29892, 29871, 29900, 1402, 525, 29931, 2396, 518, 29900, 6653, 29896, 1402, 525, 29934, 2396, 518, 29900, 29892, 29871, 29896, 12258, 13, 4706, 1583, 29889, 11569, 29918, 3972, 353, 11117, 29965, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 4706, 1583, 29889, 7720, 353, 16250, 29889, 4381, 8977, 29898, 2892, 29901, 7411, 877, 7192, 8785, 13, 4706, 396, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 4706, 1583, 29889, 5182, 353, 6213, 13, 4706, 1583, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 5835, 29897, 13, 4706, 565, 1583, 29889, 5182, 338, 6213, 29901, 13, 9651, 736, 448, 29896, 13, 4706, 396, 350, 9998, 1284, 278, 3273, 342, 2224, 13, 4706, 736, 1583, 29889, 1635, 29879, 3552, 29900, 29892, 29871, 29900, 876, 13, 13, 1678, 822, 4489, 29879, 29898, 1311, 29892, 1298, 29892, 5835, 1125, 13, 4706, 565, 5835, 29889, 275, 8667, 7295, 13, 9651, 1583, 29889, 5182, 353, 1298, 13, 4706, 396, 2791, 16669, 2943, 408, 29871, 29900, 13, 4706, 1583, 29889, 7720, 29961, 3149, 29962, 353, 29871, 29900, 13, 4706, 363, 270, 297, 1583, 29889, 3972, 29901, 13, 9651, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29900, 1402, 1298, 29961, 29896, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29896, 2314, 13, 9651, 565, 5835, 29889, 3068, 16619, 29898, 29881, 29897, 322, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 5785, 
877, 7192, 29374, 13, 18884, 5835, 29889, 11631, 29898, 29881, 29897, 13, 18884, 1583, 29889, 29069, 29898, 29876, 486, 29918, 3149, 29892, 5835, 29897, 13, 18884, 5835, 29889, 11631, 29898, 1311, 29889, 11569, 29918, 3972, 29961, 29881, 2314, 13, 13, 1678, 822, 289, 5847, 29898, 1311, 29892, 1369, 1125, 13, 4706, 316, 802, 353, 16250, 29889, 311, 802, 4197, 29898, 2962, 29892, 29871, 29900, 29897, 2314, 13, 4706, 1550, 316, 802, 29901, 13, 9651, 1298, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 9651, 565, 1298, 1275, 1583, 29889, 5182, 29901, 13, 18884, 736, 4331, 13, 9651, 363, 270, 297, 1583, 29889, 3972, 29889, 5975, 7295, 13, 18884, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 29881, 29961, 29900, 1402, 1298, 29961, 29896, 10062, 29881, 29961, 29896, 2314, 13, 18884, 565, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 29871, 29900, 29901, 13, 462, 1678, 316, 802, 29889, 4397, 3552, 29876, 486, 29918, 3149, 29892, 4331, 29974, 29896, 876, 13, 462, 1678, 396, 2791, 16669, 2943, 408, 448, 29896, 13, 462, 1678, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 353, 448, 29896, 13, 28956, 13, 12148, 4443, 393, 366, 3282, 29915, 29873, 3867, 278, 421, 5756, 19203, 29952, 770, 29892, 8151, 366, 505, 304, 1207, 1854, 393, 445, 4413, 505, 1906, 421, 275, 8667, 1673, 421, 3068, 16619, 29952, 322, 421, 11631, 29952, 3519, 297, 1797, 445, 775, 304, 664, 29889, 2] +labels: +[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] 
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] There are several issues with your code like flaky names, wrong Python syntax etc. + +Here is how that code should look: + +```python +import collections + +class Solution: + def findShortestPath(self, master: 'GridMaster') -> int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. 
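+In each dump above, `label_ids` holds -100 at every prompt position and the real token ids only over the assistant response, and the `labels` lines render those masked positions as [PAD]; -100 is the default ignore_index of PyTorch's cross-entropy loss, so only the response tokens contribute to training. A minimal sketch of how such masked labels are typically assembled (the helper below is illustrative, not the preprocessing code used in this run):
+```python
+IGNORE_INDEX = -100  # ignored by torch.nn.CrossEntropyLoss by default
+
+def build_example(prompt_ids, response_ids, eos_token_id=2):
+    """Concatenate prompt and response; supervise only the response tokens."""
+    input_ids = prompt_ids + response_ids + [eos_token_id]
+    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids + [eos_token_id]
+    return {"input_ids": input_ids, "labels": labels}
+
+# Toy usage: the three prompt ids are masked, the response ids and EOS (id 2) are kept,
+# mirroring the run of -100 followed by real ids seen in the label_ids dumps above.
+example = build_example([1, 518, 25580], [7521, 4691, 13])
+assert example["labels"] == [-100, -100, -100, 7521, 4691, 13, 2]
+```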
+length of input_ids : +1212 +input_ids: +[1, 518, 25580, 29962, 29875, 29915, 345, 2355, 445, 3017, 775, 515, 385, 288, 7283, 5780, 29892, 541, 372, 29915, 29879, 451, 1985, 29889, 508, 366, 4744, 372, 363, 592, 29973, 13, 1990, 29871, 29945, 324, 29894, 29955, 29991, 265, 29901, 13, 29877, 300, 260, 513, 21322, 30181, 303, 11457, 29876, 1566, 761, 29952, 302, 29876, 29987, 29879, 29955, 261, 29901, 525, 5756, 29924, 29987, 29879, 29974, 261, 1495, 448, 29966, 29871, 29896, 2273, 29901, 13, 1311, 29889, 29877, 29896, 29878, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1822, 29918, 29915, 29900, 2396, 518, 29896, 29892, 29871, 29900, 1118, 525, 29931, 2396, 518, 29900, 29892, 29918, 29899, 29896, 1822, 525, 29934, 2396, 426, 29900, 29892, 29871, 29896, 12258, 13, 29879, 2152, 29889, 11569, 29918, 29881, 29991, 29878, 448, 11117, 29963, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 29920, 30181, 29989, 29888, 29889, 29887, 5378, 353, 313, 324, 781, 29875, 4835, 29889, 4381, 27774, 29955, 29898, 2892, 29901, 29918, 29888, 417, 29946, 29873, 877, 3083, 29888, 11287, 13, 29937, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 29879, 30181, 29989, 29888, 29889, 12637, 479, 29974, 448, 10050, 30181, 13, 29945, 761, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 286, 29987, 303, 29941, 29878, 29897, 13, 29896, 29888, 395, 29872, 277, 29892, 29873, 29946, 29878, 657, 29918, 275, 390, 650, 29901, 13, 2267, 13416, 29876, 448, 29896, 13, 29950, 29871, 29947, 9998, 1284, 278, 5807, 29900, 29878, 2167, 29955, 29871, 29929, 29992, 29974, 29882, 13, 1212, 1038, 1583, 29889, 1635, 29879, 3552, 29949, 29892, 288, 876, 13, 13, 4801, 4489, 29879, 29898, 29920, 761, 29892, 1298, 29892, 5516, 29974, 264, 1125, 13, 361, 29918, 6207, 29892, 29875, 29945, 8667, 7295, 13, 1311, 29889, 5182, 448, 772, 326, 29873, 13, 29937, 302, 29876, 935, 16669, 2943, 408, 438, 13, 911, 29989, 29873, 29892, 7720, 29961, 3149, 29913, 353, 29871, 29900, 13, 7345, 270, 297, 1583, 29892, 29900, 262, 29901, 13, 29878, 486, 29918, 29886, 29900, 326, 29873, 353, 313, 29929, 29877, 326, 29873, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29877, 850, 29900, 511, 282, 29949, 524, 29961, 29896, 10062, 344, 29989, 29888, 29889, 3972, 29961, 29900, 3816, 29875, 5262, 13, 361, 29918, 6207, 19423, 273, 25363, 29941, 29898, 29881, 29897, 322, 1583, 29892, 629, 29896, 29900, 29961, 29885, 486, 29918, 29886, 29949, 262, 29974, 21540, 1360, 260, 601, 271, 10998, 524, 29374, 13, 6207, 29889, 29885, 283, 29872, 29898, 29881, 29913, 13, 29920, 761, 29889, 29069, 29898, 29876, 1193, 29918, 3149, 29952, 611, 29938, 357, 29897, 13, 8247, 29955, 29941, 29878, 29889, 29885, 29949, 345, 29898, 29879, 2152, 29889, 11569, 29918, 29881, 29991, 29878, 29961, 29881, 1800, 13, 13, 1753, 289, 5847, 29898, 29945, 29872, 29989, 29888, 29892, 380, 29987, 2273, 1125, 13, 311, 29929, 434, 448, 313, 29949, 645, 29941, 1953, 29892, 311, 29939, 345, 4197, 29898, 303, 29946, 593, 29892, 29871, 29900, 29897, 2314, 13, 1332, 29991, 280, 270, 30181, 29939, 29894, 30181, 29901, 13, 3149, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 1896, 1298, 353, 29899, 29871, 29945, 761, 29889, 29873, 574, 29941, 29873, 29901, 13, 2457, 4331, 13, 29888, 29900, 29878, 270, 29918, 326, 1583, 29889, 29900, 262, 29889, 791, 1960, 7295, 13, 29876, 486, 29918, 3149, 29918, 29899, 313, 29886, 29900, 524, 29961, 29900, 29962, 29911, 29881, 29961, 29949, 1118, 29871, 29929, 29877, 
29896, 593, 29961, 29896, 10062, 29881, 29961, 29896, 12258, 13, 277, 29918, 1311, 29892, 29929, 2429, 29961, 29876, 486, 29918, 1129, 262, 28135, 1275, 29871, 29900, 29901, 13, 29881, 30181, 802, 29892, 932, 8154, 3552, 29876, 486, 29918, 29939, 2461, 29892, 269, 29955, 1022, 29974, 29875, 876, 13, 29937, 286, 29992, 17697, 1998, 29991, 29974, 287, 2943, 2698, 448, 29896, 13, 1311, 29892, 5138, 333, 29961, 29876, 486, 29918, 3149, 29897, 353, 448, 29896, 29961, 29914, 25580, 29962, 1670, 526, 3196, 5626, 411, 596, 775, 763, 17422, 3459, 2983, 29892, 2743, 5132, 5877, 2992, 29889, 13, 13, 10605, 338, 920, 393, 775, 881, 1106, 29901, 13, 13, 28956, 4691, 13, 5215, 16250, 13, 13, 1990, 24380, 29901, 13, 1678, 822, 1284, 21322, 342, 2605, 29898, 1311, 29892, 5835, 29901, 525, 5756, 19203, 1495, 1599, 938, 29901, 13, 4706, 1583, 29889, 3972, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1402, 29915, 29928, 2396, 518, 29896, 29892, 29871, 29900, 1402, 525, 29931, 2396, 518, 29900, 6653, 29896, 1402, 525, 29934, 2396, 518, 29900, 29892, 29871, 29896, 12258, 13, 4706, 1583, 29889, 11569, 29918, 3972, 353, 11117, 29965, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 4706, 1583, 29889, 7720, 353, 16250, 29889, 4381, 8977, 29898, 2892, 29901, 7411, 877, 7192, 8785, 13, 4706, 396, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 4706, 1583, 29889, 5182, 353, 6213, 13, 4706, 1583, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 5835, 29897, 13, 4706, 565, 1583, 29889, 5182, 338, 6213, 29901, 13, 9651, 736, 448, 29896, 13, 4706, 396, 350, 9998, 1284, 278, 3273, 342, 2224, 13, 4706, 736, 1583, 29889, 1635, 29879, 3552, 29900, 29892, 29871, 29900, 876, 13, 13, 1678, 822, 4489, 29879, 29898, 1311, 29892, 1298, 29892, 5835, 1125, 13, 4706, 565, 5835, 29889, 275, 8667, 7295, 13, 9651, 1583, 29889, 5182, 353, 1298, 13, 4706, 396, 2791, 16669, 2943, 408, 29871, 29900, 13, 4706, 1583, 29889, 7720, 29961, 3149, 29962, 353, 29871, 29900, 13, 4706, 363, 270, 297, 1583, 29889, 3972, 29901, 13, 9651, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29900, 1402, 1298, 29961, 29896, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29896, 2314, 13, 9651, 565, 5835, 29889, 3068, 16619, 29898, 29881, 29897, 322, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 5785, 877, 7192, 29374, 13, 18884, 5835, 29889, 11631, 29898, 29881, 29897, 13, 18884, 1583, 29889, 29069, 29898, 29876, 486, 29918, 3149, 29892, 5835, 29897, 13, 18884, 5835, 29889, 11631, 29898, 1311, 29889, 11569, 29918, 3972, 29961, 29881, 2314, 13, 13, 1678, 822, 289, 5847, 29898, 1311, 29892, 1369, 1125, 13, 4706, 316, 802, 353, 16250, 29889, 311, 802, 4197, 29898, 2962, 29892, 29871, 29900, 29897, 2314, 13, 4706, 1550, 316, 802, 29901, 13, 9651, 1298, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 9651, 565, 1298, 1275, 1583, 29889, 5182, 29901, 13, 18884, 736, 4331, 13, 9651, 363, 270, 297, 1583, 29889, 3972, 29889, 5975, 7295, 13, 18884, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 29881, 29961, 29900, 1402, 1298, 29961, 29896, 10062, 29881, 29961, 29896, 2314, 13, 18884, 565, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 29871, 29900, 29901, 13, 462, 1678, 316, 802, 29889, 4397, 3552, 29876, 486, 29918, 3149, 29892, 4331, 29974, 29896, 876, 13, 462, 1678, 396, 2791, 16669, 2943, 408, 448, 29896, 13, 462, 1678, 1583, 29889, 7720, 29961, 29876, 486, 29918, 
3149, 29962, 353, 448, 29896, 13, 28956, 13, 12148, 4443, 393, 366, 3282, 29915, 29873, 3867, 278, 421, 5756, 19203, 29952, 770, 29892, 8151, 366, 505, 304, 1207, 1854, 393, 445, 4413, 505, 1906, 421, 275, 8667, 1673, 421, 3068, 16619, 29952, 322, 421, 11631, 29952, 3519, 297, 1797, 445, 775, 304, 664, 29889, 2] +inputs: + [INST]i've got this python code from an ocr tool, but it's not working. can you debug it for me? +class 5olv7!on: +oet tindShort€stPatn($elf` nn&s7er: 'GridM&s+er') -< 1rt: +self.o1r = {'U': [-1, 0]._'0': [1, 0}, 'L': [0,_-1]. 'R': {0, 1]} +selt.inv_d!r - {'V': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} +z€|f.grio = (ollectioms.defaultdic7(lambda:_flo4t('lnf']) +# DFS mark out the full map +s€|f.targe+ - Non€ +5elf.dfs((0, 0), m&st3r) +1f $eit,t4rget_is Rone: +retvrn -1 +H 8FS find the sn0rtes7 9@+h +neturr self.bfs((O, o)) + +det dfs(zelf, point, mas+en): +if_master,i5Target(): +self.target - poimt +# nnark visited node as O +ze|t,grid[point} = 0 +tor d in self,0in: +rxt_p0imt = (9oimt[0]+self.dir[o](0), pOint[1]+se|f.dir[0][i]] +if_master. int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. 
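The logged response above points out that the `GridMaster` class is not included in the sample, so the recovered `Solution` cannot be run as-is. For anyone who wants to exercise it locally, a minimal stand-in along the following lines would be enough; this stub, the toy grid, and the expected result are illustrative assumptions, not part of the logged dataset, and the snippet assumes the `Solution` class from the logged answer is already defined.

```python
# Hypothetical stand-in for the GridMaster API assumed by the recovered Solution class.
# The grid is a list of strings: '.' = free cell, '#' = wall, 'T' = target. Not from the dataset.
class GridMaster:
    DIRS = {'U': (-1, 0), 'D': (1, 0), 'L': (0, -1), 'R': (0, 1)}

    def __init__(self, grid, start=(0, 0)):
        self.grid = grid
        self.pos = start

    def isTarget(self):
        r, c = self.pos
        return self.grid[r][c] == 'T'

    def canMove(self, d):
        r, c = self.pos
        dr, dc = self.DIRS[d]
        nr, nc = r + dr, c + dc
        return 0 <= nr < len(self.grid) and 0 <= nc < len(self.grid[0]) and self.grid[nr][nc] != '#'

    def move(self, d):
        dr, dc = self.DIRS[d]
        self.pos = (self.pos[0] + dr, self.pos[1] + dc)


# The robot starts at (0, 0); the only route to 'T' goes around the wall.
master = GridMaster([".#T",
                     "..."])
print(Solution().findShortestPath(master))  # 4
```

With this stub, the DFS phase maps the reachable cells and the BFS phase returns 4 for the grid above (down, right, right, up).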
+label_ids: +[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 
-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 1670, 526, 3196, 5626, 411, 596, 775, 763, 17422, 3459, 2983, 29892, 2743, 5132, 5877, 2992, 29889, 13, 13, 10605, 338, 920, 393, 775, 881, 1106, 29901, 13, 13, 28956, 4691, 13, 5215, 16250, 13, 13, 1990, 24380, 29901, 13, 1678, 822, 1284, 21322, 342, 2605, 29898, 1311, 29892, 5835, 29901, 525, 5756, 19203, 1495, 1599, 938, 29901, 13, 4706, 1583, 29889, 3972, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1402, 29915, 29928, 2396, 518, 29896, 29892, 29871, 29900, 1402, 525, 29931, 2396, 518, 29900, 6653, 29896, 1402, 525, 29934, 2396, 518, 29900, 29892, 29871, 29896, 12258, 13, 4706, 1583, 29889, 11569, 29918, 3972, 353, 11117, 29965, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 4706, 1583, 29889, 7720, 353, 16250, 29889, 4381, 8977, 29898, 2892, 29901, 7411, 877, 7192, 8785, 13, 4706, 396, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 4706, 1583, 29889, 5182, 353, 6213, 13, 4706, 1583, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 5835, 29897, 13, 4706, 565, 1583, 29889, 5182, 338, 6213, 29901, 13, 9651, 736, 448, 29896, 13, 4706, 396, 350, 9998, 1284, 278, 3273, 342, 2224, 13, 4706, 736, 1583, 29889, 1635, 29879, 3552, 29900, 29892, 29871, 29900, 876, 13, 13, 1678, 822, 4489, 29879, 29898, 1311, 29892, 1298, 29892, 5835, 1125, 13, 4706, 565, 5835, 29889, 275, 8667, 7295, 13, 9651, 1583, 29889, 5182, 353, 1298, 13, 4706, 396, 2791, 16669, 2943, 408, 29871, 29900, 13, 4706, 1583, 29889, 7720, 29961, 3149, 29962, 353, 29871, 29900, 13, 4706, 363, 270, 297, 1583, 29889, 3972, 29901, 13, 9651, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29900, 1402, 1298, 29961, 29896, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29896, 2314, 13, 9651, 565, 5835, 29889, 3068, 16619, 29898, 29881, 29897, 322, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 5785, 877, 7192, 29374, 13, 18884, 5835, 29889, 11631, 29898, 29881, 29897, 13, 18884, 1583, 29889, 29069, 29898, 29876, 486, 29918, 3149, 29892, 5835, 29897, 13, 18884, 5835, 29889, 11631, 29898, 1311, 29889, 11569, 29918, 3972, 29961, 29881, 2314, 13, 13, 1678, 822, 289, 5847, 29898, 1311, 29892, 1369, 1125, 13, 4706, 316, 802, 353, 16250, 29889, 311, 802, 4197, 29898, 2962, 29892, 29871, 29900, 29897, 2314, 13, 4706, 1550, 316, 802, 29901, 13, 9651, 1298, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 9651, 565, 1298, 1275, 1583, 29889, 5182, 29901, 13, 18884, 736, 4331, 13, 9651, 363, 270, 297, 1583, 29889, 3972, 29889, 5975, 7295, 13, 18884, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 29881, 29961, 29900, 1402, 1298, 29961, 29896, 10062, 29881, 29961, 29896, 2314, 13, 18884, 565, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 29871, 29900, 29901, 13, 462, 1678, 316, 802, 29889, 4397, 3552, 29876, 486, 29918, 3149, 29892, 4331, 29974, 29896, 876, 13, 462, 1678, 396, 2791, 16669, 2943, 408, 448, 29896, 13, 462, 1678, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 353, 448, 29896, 13, 28956, 13, 12148, 4443, 393, 366, 3282, 29915, 29873, 3867, 278, 421, 5756, 19203, 29952, 770, 29892, 8151, 366, 505, 
304, 1207, 1854, 393, 445, 4413, 505, 1906, 421, 275, 8667, 1673, 421, 3068, 16619, 29952, 322, 421, 11631, 29952, 3519, 297, 1797, 445, 775, 304, 664, 29889, 2] +labels: +[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] 
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] There are several issues with your code like flaky names, wrong Python syntax etc. + +Here is how that code should look: + +```python +import collections + +class Solution: + def findShortestPath(self, master: 'GridMaster') -> int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. +inputs: + [INST]i've got this python code from an ocr tool, but it's not working. can you debug it for me? +class 5olv7!on: +oet tindShort€stPatn($elf` nn&s7er: 'GridM&s+er') -< 1rt: +self.o1r = {'U': [-1, 0]._'0': [1, 0}, 'L': [0,_-1]. 'R': {0, 1]} +selt.inv_d!r - {'V': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} +z€|f.grio = (ollectioms.defaultdic7(lambda:_flo4t('lnf']) +# DFS mark out the full map +s€|f.targe+ - Non€ +5elf.dfs((0, 0), m&st3r) +1f $eit,t4rget_is Rone: +retvrn -1 +H 8FS find the sn0rtes7 9@+h +neturr self.bfs((O, o)) + +det dfs(zelf, point, mas+en): +if_master,i5Target(): +self.target - poimt +# nnark visited node as O +ze|t,grid[point} = 0 +tor d in self,0in: +rxt_p0imt = (9oimt[0]+self.dir[o](0), pOint[1]+se|f.dir[0][i]] +if_master. 
int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. +label_ids: +[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 
-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 1670, 526, 3196, 5626, 411, 596, 775, 763, 17422, 3459, 2983, 29892, 2743, 5132, 5877, 2992, 29889, 13, 13, 10605, 338, 920, 393, 775, 881, 1106, 29901, 13, 13, 28956, 4691, 13, 5215, 16250, 13, 13, 1990, 24380, 29901, 13, 1678, 822, 1284, 21322, 342, 2605, 29898, 1311, 29892, 5835, 29901, 525, 5756, 19203, 1495, 1599, 938, 29901, 13, 4706, 1583, 29889, 3972, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1402, 29915, 29928, 2396, 518, 29896, 29892, 29871, 29900, 1402, 525, 29931, 2396, 518, 29900, 6653, 29896, 1402, 525, 29934, 2396, 518, 29900, 29892, 29871, 29896, 12258, 13, 4706, 1583, 29889, 11569, 29918, 3972, 353, 11117, 29965, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 4706, 1583, 29889, 7720, 353, 16250, 29889, 4381, 8977, 29898, 2892, 29901, 7411, 877, 7192, 8785, 13, 4706, 396, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 4706, 1583, 29889, 5182, 353, 6213, 13, 4706, 1583, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 5835, 29897, 13, 4706, 565, 1583, 29889, 5182, 338, 6213, 29901, 13, 9651, 736, 448, 29896, 13, 4706, 396, 350, 9998, 1284, 278, 3273, 342, 2224, 13, 4706, 736, 1583, 29889, 1635, 29879, 3552, 29900, 29892, 29871, 29900, 876, 13, 13, 1678, 822, 4489, 29879, 29898, 1311, 29892, 1298, 29892, 5835, 1125, 13, 4706, 565, 5835, 29889, 275, 8667, 7295, 13, 9651, 1583, 29889, 5182, 353, 1298, 13, 4706, 396, 2791, 16669, 2943, 408, 29871, 29900, 13, 4706, 1583, 29889, 7720, 29961, 3149, 29962, 353, 29871, 29900, 13, 4706, 363, 270, 297, 1583, 29889, 3972, 29901, 13, 9651, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29900, 1402, 1298, 29961, 29896, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29896, 2314, 13, 9651, 565, 5835, 29889, 3068, 16619, 29898, 29881, 29897, 322, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 5785, 
877, 7192, 29374, 13, 18884, 5835, 29889, 11631, 29898, 29881, 29897, 13, 18884, 1583, 29889, 29069, 29898, 29876, 486, 29918, 3149, 29892, 5835, 29897, 13, 18884, 5835, 29889, 11631, 29898, 1311, 29889, 11569, 29918, 3972, 29961, 29881, 2314, 13, 13, 1678, 822, 289, 5847, 29898, 1311, 29892, 1369, 1125, 13, 4706, 316, 802, 353, 16250, 29889, 311, 802, 4197, 29898, 2962, 29892, 29871, 29900, 29897, 2314, 13, 4706, 1550, 316, 802, 29901, 13, 9651, 1298, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 9651, 565, 1298, 1275, 1583, 29889, 5182, 29901, 13, 18884, 736, 4331, 13, 9651, 363, 270, 297, 1583, 29889, 3972, 29889, 5975, 7295, 13, 18884, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 29881, 29961, 29900, 1402, 1298, 29961, 29896, 10062, 29881, 29961, 29896, 2314, 13, 18884, 565, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 29871, 29900, 29901, 13, 462, 1678, 316, 802, 29889, 4397, 3552, 29876, 486, 29918, 3149, 29892, 4331, 29974, 29896, 876, 13, 462, 1678, 396, 2791, 16669, 2943, 408, 448, 29896, 13, 462, 1678, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 353, 448, 29896, 13, 28956, 13, 12148, 4443, 393, 366, 3282, 29915, 29873, 3867, 278, 421, 5756, 19203, 29952, 770, 29892, 8151, 366, 505, 304, 1207, 1854, 393, 445, 4413, 505, 1906, 421, 275, 8667, 1673, 421, 3068, 16619, 29952, 322, 421, 11631, 29952, 3519, 297, 1797, 445, 775, 304, 664, 29889, 2] +length of input_ids : +394 +input_ids: +[1, 518, 25580, 29962, 12148, 626, 355, 278, 15352, 5132, 2471, 577, 393, 372, 7805, 263, 525, 8000, 29915, 2425, 3265, 1135, 278, 5923, 525, 1454, 29915, 2425, 29892, 607, 4256, 1078, 1549, 278, 4452, 310, 385, 6043, 1051, 29889, 13, 13, 1576, 2471, 5279, 756, 263, 6494, 988, 372, 14734, 304, 1596, 385, 1203, 393, 338, 5377, 278, 13451, 310, 278, 1051, 29889, 24778, 445, 1059, 322, 6623, 278, 2471, 304, 671, 525, 8000, 29915, 2012, 310, 525, 1454, 29915, 2425, 29889, 22521, 545, 596, 2471, 5149, 17766, 4069, 8857, 29889, 29871, 13, 13, 28956, 4691, 13, 29871, 396, 2661, 370, 1674, 385, 6043, 1051, 13, 29871, 3948, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29871, 396, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29871, 302, 353, 7431, 29898, 2749, 29897, 13, 13, 29871, 396, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 29871, 363, 474, 297, 3464, 29898, 29876, 29974, 29896, 1125, 13, 418, 1596, 29898, 2749, 29961, 29875, 2314, 13, 28956, 29961, 29914, 25580, 29962, 7521, 4691, 13, 29937, 2661, 370, 1674, 385, 6043, 1051, 13, 2749, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29937, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29876, 353, 7431, 29898, 2749, 29897, 13, 13, 29937, 25455, 2380, 472, 29871, 29900, 13, 29875, 353, 29871, 29900, 13, 13, 29937, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 8000, 474, 529, 302, 29901, 13, 1678, 1596, 29898, 2749, 29961, 29875, 2314, 13, 1678, 474, 4619, 29871, 29896, 13, 28956, 13, 797, 278, 2183, 775, 29892, 372, 14335, 304, 2130, 421, 2749, 29961, 29876, 7961, 607, 338, 714, 310, 13451, 408, 3017, 1051, 26190, 8665, 515, 29871, 29900, 322, 10614, 472, 302, 29899, 29896, 363, 263, 1051, 310, 3309, 302, 29889, 13, 17351, 29892, 306, 1304, 421, 8000, 29952, 2425, 408, 639, 596, 2009, 29889, 2648, 2847, 5281, 421, 29875, 29922, 29900, 29952, 1434, 278, 2425, 322, 421, 29875, 4619, 29871, 29896, 29952, 2629, 278, 2425, 29892, 591, 508, 13649, 975, 278, 1051, 29889, 910, 12541, 18172, 
2745, 421, 29875, 29952, 338, 3109, 1135, 421, 29876, 1673, 278, 3309, 310, 278, 1051, 29889, 450, 421, 8000, 29952, 2425, 674, 451, 8469, 565, 278, 1051, 338, 4069, 577, 372, 5149, 17766, 4069, 8857, 29889, 2] +labels: +[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] 
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] There are several issues with your code like flaky names, wrong Python syntax etc. + +Here is how that code should look: + +```python +import collections + +class Solution: + def findShortestPath(self, master: 'GridMaster') -> int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. +inputs: + [INST]Please amend the subsequent Python script so that it includes a 'while' loop rather than the existing 'for' loop, which iterates through the items of an integer list. + +The script currently has a bug where it attempts to print an object that is outside the bounds of the list. Fix this error and modify the script to use 'while' instead of 'for' loop. Ensure your script correctly handles empty lists. + +```python + # Establish an integer list + arr = [1, 2, 3, 4] + + # Determine the length of the list + n = len(arr) + + # Traverse the list and output each individual element + for i in range(n+1): + print(arr[i]) +```[/INST] ```python +# Establish an integer list +arr = [1, 2, 3, 4] + +# Determine the length of the list +n = len(arr) + +# Initialize index at 0 +i = 0 + +# Traverse the list and output each individual element +while i < n: + print(arr[i]) + i += 1 +``` +In the given code, it tries to access `arr[n]` which is out of bounds as python list indexing starts from 0 and ends at n-1 for a list of length n. +Also, I used `while` loop as per your request. By initializing `i=0` before the loop and `i += 1` within the loop, we can iterate over the list. This iteration continues until `i` is less than `n`, the length of the list. The `while` loop will not proceed if the list is empty so it correctly handles empty lists. 
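The logged answer closes with a claim about empty lists that is easy to sanity-check: with an empty list, `n` is 0, so `while i < n` fails on the first test and the body never runs. A small illustrative check follows; the `print_all` wrapper is an assumption added for the demo, not part of the dataset.

```python
def print_all(arr):
    """Print each element using the while-loop pattern from the logged answer."""
    n = len(arr)
    i = 0
    while i < n:
        print(arr[i])
        i += 1

print_all([1, 2, 3, 4])  # prints 1, 2, 3, 4 on separate lines
print_all([])            # prints nothing: 0 < 0 is False, so the loop body is skipped
```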
+label_ids: +[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 7521, 4691, 13, 29937, 2661, 370, 1674, 385, 6043, 1051, 13, 2749, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29937, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29876, 353, 7431, 29898, 2749, 29897, 13, 13, 29937, 25455, 2380, 472, 29871, 29900, 13, 29875, 353, 29871, 29900, 13, 13, 29937, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 8000, 474, 529, 302, 29901, 13, 1678, 1596, 29898, 2749, 29961, 29875, 2314, 13, 1678, 474, 4619, 29871, 29896, 13, 28956, 13, 797, 278, 2183, 775, 29892, 372, 14335, 304, 2130, 421, 2749, 29961, 29876, 7961, 607, 338, 714, 310, 13451, 408, 3017, 1051, 26190, 8665, 515, 29871, 29900, 322, 10614, 472, 302, 29899, 29896, 363, 263, 1051, 310, 3309, 302, 29889, 13, 17351, 29892, 306, 1304, 421, 8000, 29952, 2425, 408, 639, 596, 2009, 29889, 2648, 2847, 5281, 421, 29875, 29922, 29900, 29952, 1434, 278, 2425, 322, 421, 29875, 4619, 29871, 29896, 29952, 2629, 278, 2425, 29892, 591, 508, 13649, 975, 278, 1051, 29889, 910, 12541, 18172, 2745, 421, 29875, 29952, 338, 3109, 1135, 421, 29876, 1673, 278, 3309, 310, 278, 1051, 29889, 450, 421, 8000, 29952, 2425, 674, 451, 8469, 565, 278, 1051, 338, 4069, 577, 372, 5149, 17766, 4069, 8857, 29889, 2] +labels: +[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] 
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] There are several issues with your code like flaky names, wrong Python syntax etc. 
+ +Here is how that code should look: + +```python +import collections + +class Solution: + def findShortestPath(self, master: 'GridMaster') -> int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. +labels: +[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] ```python +# Establish an integer list +arr = [1, 2, 3, 4] + +# Determine the length of the list +n = len(arr) + +# Initialize index at 0 +i = 0 + +# Traverse the list and output each individual element +while i < n: + print(arr[i]) + i += 1 +``` +In the given code, it tries to access `arr[n]` which is out of bounds as python list indexing starts from 0 and ends at n-1 for a list of length n. +Also, I used `while` loop as per your request. By initializing `i=0` before the loop and `i += 1` within the loop, we can iterate over the list. This iteration continues until `i` is less than `n`, the length of the list. The `while` loop will not proceed if the list is empty so it correctly handles empty lists. 
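Across these previews, `label_ids` carries -100 at every prompt position and the decoded `labels` view renders those positions as [PAD]; only the response tokens keep their real ids, so the loss is computed on the answer alone. Below is a minimal sketch of that masking convention using a few token ids that appear in this log; the helper name, the prompt length, and the random logits are assumptions for illustration, not code from this training run.

```python
import torch
import torch.nn.functional as F

IGNORE_INDEX = -100  # cross_entropy skips positions labelled with this id

def mask_prompt(input_ids, prompt_len):
    """Copy input_ids into labels, blanking the prompt part with -100."""
    labels = list(input_ids)
    labels[:prompt_len] = [IGNORE_INDEX] * prompt_len
    return labels

# Toy sequence: 4 prompt tokens followed by 3 response tokens.
input_ids = [1, 518, 25580, 29962, 7521, 4691, 2]
labels = mask_prompt(input_ids, prompt_len=4)
print(labels)  # [-100, -100, -100, -100, 7521, 4691, 2]

# Only the unmasked (response) positions contribute to the loss.
logits = torch.randn(len(input_ids), 32000)  # placeholder logits over a 32k vocabulary
loss = F.cross_entropy(logits, torch.tensor(labels), ignore_index=IGNORE_INDEX)
```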
+labels: +[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] 
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] There are several issues with your code like flaky names, wrong Python syntax etc. + +Here is how that code should look: + +```python +import collections + +class Solution: + def findShortestPath(self, master: 'GridMaster') -> int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. +inputs: + [INST]i've got this python code from an ocr tool, but it's not working. can you debug it for me? +class 5olv7!on: +oet tindShort€stPatn($elf` nn&s7er: 'GridM&s+er') -< 1rt: +self.o1r = {'U': [-1, 0]._'0': [1, 0}, 'L': [0,_-1]. 'R': {0, 1]} +selt.inv_d!r - {'V': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} +z€|f.grio = (ollectioms.defaultdic7(lambda:_flo4t('lnf']) +# DFS mark out the full map +s€|f.targe+ - Non€ +5elf.dfs((0, 0), m&st3r) +1f $eit,t4rget_is Rone: +retvrn -1 +H 8FS find the sn0rtes7 9@+h +neturr self.bfs((O, o)) + +det dfs(zelf, point, mas+en): +if_master,i5Target(): +self.target - poimt +# nnark visited node as O +ze|t,grid[point} = 0 +tor d in self,0in: +rxt_p0imt = (9oimt[0]+self.dir[o](0), pOint[1]+se|f.dir[0][i]] +if_master. 
int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. +label_ids: +[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 
-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 1670, 526, 3196, 5626, 411, 596, 775, 763, 17422, 3459, 2983, 29892, 2743, 5132, 5877, 2992, 29889, 13, 13, 10605, 338, 920, 393, 775, 881, 1106, 29901, 13, 13, 28956, 4691, 13, 5215, 16250, 13, 13, 1990, 24380, 29901, 13, 1678, 822, 1284, 21322, 342, 2605, 29898, 1311, 29892, 5835, 29901, 525, 5756, 19203, 1495, 1599, 938, 29901, 13, 4706, 1583, 29889, 3972, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1402, 29915, 29928, 2396, 518, 29896, 29892, 29871, 29900, 1402, 525, 29931, 2396, 518, 29900, 6653, 29896, 1402, 525, 29934, 2396, 518, 29900, 29892, 29871, 29896, 12258, 13, 4706, 1583, 29889, 11569, 29918, 3972, 353, 11117, 29965, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 4706, 1583, 29889, 7720, 353, 16250, 29889, 4381, 8977, 29898, 2892, 29901, 7411, 877, 7192, 8785, 13, 4706, 396, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 4706, 1583, 29889, 5182, 353, 6213, 13, 4706, 1583, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 5835, 29897, 13, 4706, 565, 1583, 29889, 5182, 338, 6213, 29901, 13, 9651, 736, 448, 29896, 13, 4706, 396, 350, 9998, 1284, 278, 3273, 342, 2224, 13, 4706, 736, 1583, 29889, 1635, 29879, 3552, 29900, 29892, 29871, 29900, 876, 13, 13, 1678, 822, 4489, 29879, 29898, 1311, 29892, 1298, 29892, 5835, 1125, 13, 4706, 565, 5835, 29889, 275, 8667, 7295, 13, 9651, 1583, 29889, 5182, 353, 1298, 13, 4706, 396, 2791, 16669, 2943, 408, 29871, 29900, 13, 4706, 1583, 29889, 7720, 29961, 3149, 29962, 353, 29871, 29900, 13, 4706, 363, 270, 297, 1583, 29889, 3972, 29901, 13, 9651, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29900, 1402, 1298, 29961, 29896, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29896, 2314, 13, 9651, 565, 5835, 29889, 3068, 16619, 29898, 29881, 29897, 322, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 5785, 
877, 7192, 29374, 13, 18884, 5835, 29889, 11631, 29898, 29881, 29897, 13, 18884, 1583, 29889, 29069, 29898, 29876, 486, 29918, 3149, 29892, 5835, 29897, 13, 18884, 5835, 29889, 11631, 29898, 1311, 29889, 11569, 29918, 3972, 29961, 29881, 2314, 13, 13, 1678, 822, 289, 5847, 29898, 1311, 29892, 1369, 1125, 13, 4706, 316, 802, 353, 16250, 29889, 311, 802, 4197, 29898, 2962, 29892, 29871, 29900, 29897, 2314, 13, 4706, 1550, 316, 802, 29901, 13, 9651, 1298, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 9651, 565, 1298, 1275, 1583, 29889, 5182, 29901, 13, 18884, 736, 4331, 13, 9651, 363, 270, 297, 1583, 29889, 3972, 29889, 5975, 7295, 13, 18884, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 29881, 29961, 29900, 1402, 1298, 29961, 29896, 10062, 29881, 29961, 29896, 2314, 13, 18884, 565, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 29871, 29900, 29901, 13, 462, 1678, 316, 802, 29889, 4397, 3552, 29876, 486, 29918, 3149, 29892, 4331, 29974, 29896, 876, 13, 462, 1678, 396, 2791, 16669, 2943, 408, 448, 29896, 13, 462, 1678, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 353, 448, 29896, 13, 28956, 13, 12148, 4443, 393, 366, 3282, 29915, 29873, 3867, 278, 421, 5756, 19203, 29952, 770, 29892, 8151, 366, 505, 304, 1207, 1854, 393, 445, 4413, 505, 1906, 421, 275, 8667, 1673, 421, 3068, 16619, 29952, 322, 421, 11631, 29952, 3519, 297, 1797, 445, 775, 304, 664, 29889, 2] +length of input_ids : +1212 +input_ids: +[1, 518, 25580, 29962, 29875, 29915, 345, 2355, 445, 3017, 775, 515, 385, 288, 7283, 5780, 29892, 541, 372, 29915, 29879, 451, 1985, 29889, 508, 366, 4744, 372, 363, 592, 29973, 13, 1990, 29871, 29945, 324, 29894, 29955, 29991, 265, 29901, 13, 29877, 300, 260, 513, 21322, 30181, 303, 11457, 29876, 1566, 761, 29952, 302, 29876, 29987, 29879, 29955, 261, 29901, 525, 5756, 29924, 29987, 29879, 29974, 261, 1495, 448, 29966, 29871, 29896, 2273, 29901, 13, 1311, 29889, 29877, 29896, 29878, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1822, 29918, 29915, 29900, 2396, 518, 29896, 29892, 29871, 29900, 1118, 525, 29931, 2396, 518, 29900, 29892, 29918, 29899, 29896, 1822, 525, 29934, 2396, 426, 29900, 29892, 29871, 29896, 12258, 13, 29879, 2152, 29889, 11569, 29918, 29881, 29991, 29878, 448, 11117, 29963, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 29920, 30181, 29989, 29888, 29889, 29887, 5378, 353, 313, 324, 781, 29875, 4835, 29889, 4381, 27774, 29955, 29898, 2892, 29901, 29918, 29888, 417, 29946, 29873, 877, 3083, 29888, 11287, 13, 29937, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 29879, 30181, 29989, 29888, 29889, 12637, 479, 29974, 448, 10050, 30181, 13, 29945, 761, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 286, 29987, 303, 29941, 29878, 29897, 13, 29896, 29888, 395, 29872, 277, 29892, 29873, 29946, 29878, 657, 29918, 275, 390, 650, 29901, 13, 2267, 13416, 29876, 448, 29896, 13, 29950, 29871, 29947, 9998, 1284, 278, 5807, 29900, 29878, 2167, 29955, 29871, 29929, 29992, 29974, 29882, 13, 1212, 1038, 1583, 29889, 1635, 29879, 3552, 29949, 29892, 288, 876, 13, 13, 4801, 4489, 29879, 29898, 29920, 761, 29892, 1298, 29892, 5516, 29974, 264, 1125, 13, 361, 29918, 6207, 29892, 29875, 29945, 8667, 7295, 13, 1311, 29889, 5182, 448, 772, 326, 29873, 13, 29937, 302, 29876, 935, 16669, 2943, 408, 438, 13, 911, 29989, 29873, 29892, 7720, 29961, 3149, 29913, 353, 29871, 29900, 13, 7345, 270, 297, 1583, 29892, 29900, 262, 29901, 13, 29878, 486, 29918, 
29886, 29900, 326, 29873, 353, 313, 29929, 29877, 326, 29873, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29877, 850, 29900, 511, 282, 29949, 524, 29961, 29896, 10062, 344, 29989, 29888, 29889, 3972, 29961, 29900, 3816, 29875, 5262, 13, 361, 29918, 6207, 19423, 273, 25363, 29941, 29898, 29881, 29897, 322, 1583, 29892, 629, 29896, 29900, 29961, 29885, 486, 29918, 29886, 29949, 262, 29974, 21540, 1360, 260, 601, 271, 10998, 524, 29374, 13, 6207, 29889, 29885, 283, 29872, 29898, 29881, 29913, 13, 29920, 761, 29889, 29069, 29898, 29876, 1193, 29918, 3149, 29952, 611, 29938, 357, 29897, 13, 8247, 29955, 29941, 29878, 29889, 29885, 29949, 345, 29898, 29879, 2152, 29889, 11569, 29918, 29881, 29991, 29878, 29961, 29881, 1800, 13, 13, 1753, 289, 5847, 29898, 29945, 29872, 29989, 29888, 29892, 380, 29987, 2273, 1125, 13, 311, 29929, 434, 448, 313, 29949, 645, 29941, 1953, 29892, 311, 29939, 345, 4197, 29898, 303, 29946, 593, 29892, 29871, 29900, 29897, 2314, 13, 1332, 29991, 280, 270, 30181, 29939, 29894, 30181, 29901, 13, 3149, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 1896, 1298, 353, 29899, 29871, 29945, 761, 29889, 29873, 574, 29941, 29873, 29901, 13, 2457, 4331, 13, 29888, 29900, 29878, 270, 29918, 326, 1583, 29889, 29900, 262, 29889, 791, 1960, 7295, 13, 29876, 486, 29918, 3149, 29918, 29899, 313, 29886, 29900, 524, 29961, 29900, 29962, 29911, 29881, 29961, 29949, 1118, 29871, 29929, 29877, 29896, 593, 29961, 29896, 10062, 29881, 29961, 29896, 12258, 13, 277, 29918, 1311, 29892, 29929, 2429, 29961, 29876, 486, 29918, 1129, 262, 28135, 1275, 29871, 29900, 29901, 13, 29881, 30181, 802, 29892, 932, 8154, 3552, 29876, 486, 29918, 29939, 2461, 29892, 269, 29955, 1022, 29974, 29875, 876, 13, 29937, 286, 29992, 17697, 1998, 29991, 29974, 287, 2943, 2698, 448, 29896, 13, 1311, 29892, 5138, 333, 29961, 29876, 486, 29918, 3149, 29897, 353, 448, 29896, 29961, 29914, 25580, 29962, 1670, 526, 3196, 5626, 411, 596, 775, 763, 17422, 3459, 2983, 29892, 2743, 5132, 5877, 2992, 29889, 13, 13, 10605, 338, 920, 393, 775, 881, 1106, 29901, 13, 13, 28956, 4691, 13, 5215, 16250, 13, 13, 1990, 24380, 29901, 13, 1678, 822, 1284, 21322, 342, 2605, 29898, 1311, 29892, 5835, 29901, 525, 5756, 19203, 1495, 1599, 938, 29901, 13, 4706, 1583, 29889, 3972, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1402, 29915, 29928, 2396, 518, 29896, 29892, 29871, 29900, 1402, 525, 29931, 2396, 518, 29900, 6653, 29896, 1402, 525, 29934, 2396, 518, 29900, 29892, 29871, 29896, 12258, 13, 4706, 1583, 29889, 11569, 29918, 3972, 353, 11117, 29965, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 4706, 1583, 29889, 7720, 353, 16250, 29889, 4381, 8977, 29898, 2892, 29901, 7411, 877, 7192, 8785, 13, 4706, 396, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 4706, 1583, 29889, 5182, 353, 6213, 13, 4706, 1583, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 5835, 29897, 13, 4706, 565, 1583, 29889, 5182, 338, 6213, 29901, 13, 9651, 736, 448, 29896, 13, 4706, 396, 350, 9998, 1284, 278, 3273, 342, 2224, 13, 4706, 736, 1583, 29889, 1635, 29879, 3552, 29900, 29892, 29871, 29900, 876, 13, 13, 1678, 822, 4489, 29879, 29898, 1311, 29892, 1298, 29892, 5835, 1125, 13, 4706, 565, 5835, 29889, 275, 8667, 7295, 13, 9651, 1583, 29889, 5182, 353, 1298, 13, 4706, 396, 2791, 16669, 2943, 408, 29871, 29900, 13, 4706, 1583, 29889, 7720, 29961, 3149, 29962, 353, 29871, 29900, 13, 4706, 363, 270, 297, 1583, 29889, 3972, 29901, 13, 9651, 302, 486, 29918, 
3149, 353, 313, 3149, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29900, 1402, 1298, 29961, 29896, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29896, 2314, 13, 9651, 565, 5835, 29889, 3068, 16619, 29898, 29881, 29897, 322, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 5785, 877, 7192, 29374, 13, 18884, 5835, 29889, 11631, 29898, 29881, 29897, 13, 18884, 1583, 29889, 29069, 29898, 29876, 486, 29918, 3149, 29892, 5835, 29897, 13, 18884, 5835, 29889, 11631, 29898, 1311, 29889, 11569, 29918, 3972, 29961, 29881, 2314, 13, 13, 1678, 822, 289, 5847, 29898, 1311, 29892, 1369, 1125, 13, 4706, 316, 802, 353, 16250, 29889, 311, 802, 4197, 29898, 2962, 29892, 29871, 29900, 29897, 2314, 13, 4706, 1550, 316, 802, 29901, 13, 9651, 1298, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 9651, 565, 1298, 1275, 1583, 29889, 5182, 29901, 13, 18884, 736, 4331, 13, 9651, 363, 270, 297, 1583, 29889, 3972, 29889, 5975, 7295, 13, 18884, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 29881, 29961, 29900, 1402, 1298, 29961, 29896, 10062, 29881, 29961, 29896, 2314, 13, 18884, 565, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 29871, 29900, 29901, 13, 462, 1678, 316, 802, 29889, 4397, 3552, 29876, 486, 29918, 3149, 29892, 4331, 29974, 29896, 876, 13, 462, 1678, 396, 2791, 16669, 2943, 408, 448, 29896, 13, 462, 1678, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 353, 448, 29896, 13, 28956, 13, 12148, 4443, 393, 366, 3282, 29915, 29873, 3867, 278, 421, 5756, 19203, 29952, 770, 29892, 8151, 366, 505, 304, 1207, 1854, 393, 445, 4413, 505, 1906, 421, 275, 8667, 1673, 421, 3068, 16619, 29952, 322, 421, 11631, 29952, 3519, 297, 1797, 445, 775, 304, 664, 29889, 2] +labels: +[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] 
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] There are several issues with your code like flaky names, wrong Python syntax etc. 
+ +Here is how that code should look: + +```python +import collections + +class Solution: + def findShortestPath(self, master: 'GridMaster') -> int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. +inputs: + [INST]i've got this python code from an ocr tool, but it's not working. can you debug it for me? +class 5olv7!on: +oet tindShort€stPatn($elf` nn&s7er: 'GridM&s+er') -< 1rt: +self.o1r = {'U': [-1, 0]._'0': [1, 0}, 'L': [0,_-1]. 'R': {0, 1]} +selt.inv_d!r - {'V': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} +z€|f.grio = (ollectioms.defaultdic7(lambda:_flo4t('lnf']) +# DFS mark out the full map +s€|f.targe+ - Non€ +5elf.dfs((0, 0), m&st3r) +1f $eit,t4rget_is Rone: +retvrn -1 +H 8FS find the sn0rtes7 9@+h +neturr self.bfs((O, o)) + +det dfs(zelf, point, mas+en): +if_master,i5Target(): +self.target - poimt +# nnark visited node as O +ze|t,grid[point} = 0 +tor d in self,0in: +rxt_p0imt = (9oimt[0]+self.dir[o](0), pOint[1]+se|f.dir[0][i]] +if_master. int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. 
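The corrected solution above relies on a `GridMaster` judge object exposing `isTarget`, `canMove`, and `move`, which is not shown anywhere in this log. As a minimal sketch for local testing only (the class below is hypothetical and assumes a 2D grid where 1 marks a free cell and 0 a blocked one), a stand-in compatible with the `Solution` class could look like this:

```python
# Hypothetical stand-in for the judge-provided GridMaster; illustrative only.
class GridMaster:
    DIRS = {'U': (-1, 0), 'D': (1, 0), 'L': (0, -1), 'R': (0, 1)}

    def __init__(self, grid, start, target):
        self.grid = grid      # 2D list: 1 = free cell, 0 = blocked cell
        self.pos = start      # robot position as (row, col)
        self.goal = target

    def isTarget(self):
        return self.pos == self.goal

    def canMove(self, d):
        r = self.pos[0] + self.DIRS[d][0]
        c = self.pos[1] + self.DIRS[d][1]
        return 0 <= r < len(self.grid) and 0 <= c < len(self.grid[0]) and self.grid[r][c] == 1

    def move(self, d):
        # Solution.dfs only calls move(d) after checking canMove(d), so no re-check here.
        self.pos = (self.pos[0] + self.DIRS[d][0], self.pos[1] + self.DIRS[d][1])


# Usage: 3x3 grid with the centre blocked; shortest path to the opposite corner is 4.
master = GridMaster([[1, 1, 1], [1, 0, 1], [1, 1, 1]], start=(0, 0), target=(2, 2))
print(Solution().findShortestPath(master))  # 4
```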
+label_ids: +[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 
-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 1670, 526, 3196, 5626, 411, 596, 775, 763, 17422, 3459, 2983, 29892, 2743, 5132, 5877, 2992, 29889, 13, 13, 10605, 338, 920, 393, 775, 881, 1106, 29901, 13, 13, 28956, 4691, 13, 5215, 16250, 13, 13, 1990, 24380, 29901, 13, 1678, 822, 1284, 21322, 342, 2605, 29898, 1311, 29892, 5835, 29901, 525, 5756, 19203, 1495, 1599, 938, 29901, 13, 4706, 1583, 29889, 3972, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1402, 29915, 29928, 2396, 518, 29896, 29892, 29871, 29900, 1402, 525, 29931, 2396, 518, 29900, 6653, 29896, 1402, 525, 29934, 2396, 518, 29900, 29892, 29871, 29896, 12258, 13, 4706, 1583, 29889, 11569, 29918, 3972, 353, 11117, 29965, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 4706, 1583, 29889, 7720, 353, 16250, 29889, 4381, 8977, 29898, 2892, 29901, 7411, 877, 7192, 8785, 13, 4706, 396, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 4706, 1583, 29889, 5182, 353, 6213, 13, 4706, 1583, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 5835, 29897, 13, 4706, 565, 1583, 29889, 5182, 338, 6213, 29901, 13, 9651, 736, 448, 29896, 13, 4706, 396, 350, 9998, 1284, 278, 3273, 342, 2224, 13, 4706, 736, 1583, 29889, 1635, 29879, 3552, 29900, 29892, 29871, 29900, 876, 13, 13, 1678, 822, 4489, 29879, 29898, 1311, 29892, 1298, 29892, 5835, 1125, 13, 4706, 565, 5835, 29889, 275, 8667, 7295, 13, 9651, 1583, 29889, 5182, 353, 1298, 13, 4706, 396, 2791, 16669, 2943, 408, 29871, 29900, 13, 4706, 1583, 29889, 7720, 29961, 3149, 29962, 353, 29871, 29900, 13, 4706, 363, 270, 297, 1583, 29889, 3972, 29901, 13, 9651, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29900, 1402, 1298, 29961, 29896, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29896, 2314, 13, 9651, 565, 5835, 29889, 3068, 16619, 29898, 29881, 29897, 322, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 5785, 877, 7192, 29374, 13, 18884, 5835, 29889, 11631, 29898, 29881, 29897, 13, 18884, 1583, 29889, 29069, 29898, 29876, 486, 29918, 3149, 29892, 5835, 29897, 13, 18884, 5835, 29889, 11631, 29898, 1311, 29889, 11569, 29918, 3972, 29961, 29881, 2314, 13, 13, 1678, 822, 289, 5847, 29898, 1311, 29892, 1369, 1125, 13, 4706, 316, 802, 353, 16250, 29889, 311, 802, 4197, 29898, 2962, 29892, 29871, 29900, 29897, 2314, 13, 4706, 1550, 316, 802, 29901, 13, 9651, 1298, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 9651, 565, 1298, 1275, 1583, 29889, 5182, 29901, 13, 18884, 736, 4331, 13, 9651, 363, 270, 297, 1583, 29889, 3972, 29889, 5975, 7295, 13, 18884, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 29881, 29961, 29900, 1402, 1298, 29961, 29896, 10062, 29881, 29961, 29896, 2314, 13, 18884, 565, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 29871, 29900, 29901, 13, 462, 1678, 316, 802, 29889, 4397, 3552, 29876, 486, 29918, 3149, 29892, 4331, 29974, 29896, 876, 13, 462, 1678, 396, 2791, 16669, 2943, 408, 448, 29896, 13, 462, 1678, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 353, 448, 29896, 13, 28956, 13, 12148, 4443, 393, 366, 3282, 29915, 29873, 3867, 278, 421, 5756, 19203, 29952, 770, 29892, 8151, 366, 505, 
304, 1207, 1854, 393, 445, 4413, 505, 1906, 421, 275, 8667, 1673, 421, 3068, 16619, 29952, 322, 421, 11631, 29952, 3519, 297, 1797, 445, 775, 304, 664, 29889, 2] +labels: +[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] 
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] There are several issues with your code like flaky names, wrong Python syntax etc. + +Here is how that code should look: + +```python +import collections + +class Solution: + def findShortestPath(self, master: 'GridMaster') -> int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. 
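In the `label_ids` dump above, every token belonging to the `[INST] ... [/INST]` prompt is replaced by -100 while the response tokens keep their real ids, and the decoded `labels:` line then renders those masked positions as `[PAD]`. A minimal sketch of that masking step, assuming a simple prompt/response split (`build_labels` is an illustrative helper, not the trainer's actual collator):

```python
# Illustrative only: shows why the log prints -100 for prompt tokens and
# [PAD] when those labels are decoded. -100 is the ignore_index that
# PyTorch's CrossEntropyLoss skips by default.
IGNORE_INDEX = -100

def build_labels(input_ids, prompt_len, pad_token_id=0):
    # Supervise only the response: copy input_ids, then mask the prompt span.
    labels = list(input_ids)
    labels[:prompt_len] = [IGNORE_INDEX] * prompt_len
    # For pretty-printing (as in this log), -100 cannot be decoded, so it is
    # swapped back to the pad token before calling tokenizer.decode().
    printable = [tok if tok != IGNORE_INDEX else pad_token_id for tok in labels]
    return labels, printable

# Toy example: 4 prompt tokens followed by 3 response tokens.
labels, printable = build_labels([1, 518, 25580, 29962, 7521, 4691, 2], prompt_len=4)
print(labels)     # [-100, -100, -100, -100, 7521, 4691, 2]
print(printable)  # [0, 0, 0, 0, 7521, 4691, 2]  -> decodes as [PAD] [PAD] ... + response
```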
+12/25/2023 02:59:26 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:2 to store for rank: 3 +12/25/2023 02:59:26 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:2 to store for rank: 2 +12/25/2023 02:59:26 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:2 to store for rank: 4 +12/25/2023 02:59:26 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:2 to store for rank: 5 +12/25/2023 02:59:26 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:2 to store for rank: 1 +12/25/2023 02:59:26 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:2 to store for rank: 7 +12/25/2023 02:59:26 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:2 to store for rank: 6 +length of input_ids : +394 +input_ids: +[1, 518, 25580, 29962, 12148, 626, 355, 278, 15352, 5132, 2471, 577, 393, 372, 7805, 263, 525, 8000, 29915, 2425, 3265, 1135, 278, 5923, 525, 1454, 29915, 2425, 29892, 607, 4256, 1078, 1549, 278, 4452, 310, 385, 6043, 1051, 29889, 13, 13, 1576, 2471, 5279, 756, 263, 6494, 988, 372, 14734, 304, 1596, 385, 1203, 393, 338, 5377, 278, 13451, 310, 278, 1051, 29889, 24778, 445, 1059, 322, 6623, 278, 2471, 304, 671, 525, 8000, 29915, 2012, 310, 525, 1454, 29915, 2425, 29889, 22521, 545, 596, 2471, 5149, 17766, 4069, 8857, 29889, 29871, 13, 13, 28956, 4691, 13, 29871, 396, 2661, 370, 1674, 385, 6043, 1051, 13, 29871, 3948, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29871, 396, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29871, 302, 353, 7431, 29898, 2749, 29897, 13, 13, 29871, 396, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 29871, 363, 474, 297, 3464, 29898, 29876, 29974, 29896, 1125, 13, 418, 1596, 29898, 2749, 29961, 29875, 2314, 13, 28956, 29961, 29914, 25580, 29962, 7521, 4691, 13, 29937, 2661, 370, 1674, 385, 6043, 1051, 13, 2749, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29937, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29876, 353, 7431, 29898, 2749, 29897, 13, 13, 29937, 25455, 2380, 472, 29871, 29900, 13, 29875, 353, 29871, 29900, 13, 13, 29937, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 8000, 474, 529, 302, 29901, 13, 1678, 1596, 29898, 2749, 29961, 29875, 2314, 13, 1678, 474, 4619, 29871, 29896, 13, 28956, 13, 797, 278, 2183, 775, 29892, 372, 14335, 304, 2130, 421, 2749, 29961, 29876, 7961, 607, 338, 714, 310, 13451, 408, 3017, 1051, 26190, 8665, 515, 29871, 29900, 322, 10614, 472, 302, 29899, 29896, 363, 263, 1051, 310, 3309, 302, 29889, 13, 17351, 29892, 306, 1304, 421, 8000, 29952, 2425, 408, 639, 596, 2009, 29889, 2648, 2847, 5281, 421, 29875, 29922, 29900, 29952, 1434, 278, 2425, 322, 421, 29875, 4619, 29871, 29896, 29952, 2629, 278, 2425, 29892, 591, 508, 13649, 975, 278, 1051, 29889, 910, 12541, 18172, 2745, 421, 29875, 29952, 338, 3109, 1135, 421, 29876, 1673, 278, 3309, 310, 278, 1051, 29889, 450, 421, 8000, 29952, 2425, 674, 451, 8469, 565, 278, 1051, 338, 4069, 577, 372, 5149, 17766, 4069, 8857, 29889, 2] +inputs: + [INST]Please amend the subsequent Python script so that it includes a 'while' loop rather than the existing 'for' loop, which iterates through the items of an integer list. + +The script currently has a bug where it attempts to print an object that is outside the bounds of the list. 
Fix this error and modify the script to use 'while' instead of 'for' loop. Ensure your script correctly handles empty lists. + +```python + # Establish an integer list + arr = [1, 2, 3, 4] + + # Determine the length of the list + n = len(arr) + + # Traverse the list and output each individual element + for i in range(n+1): + print(arr[i]) +```[/INST] ```python +# Establish an integer list +arr = [1, 2, 3, 4] + +# Determine the length of the list +n = len(arr) + +# Initialize index at 0 +i = 0 + +# Traverse the list and output each individual element +while i < n: + print(arr[i]) + i += 1 +``` +In the given code, it tries to access `arr[n]` which is out of bounds as python list indexing starts from 0 and ends at n-1 for a list of length n. +Also, I used `while` loop as per your request. By initializing `i=0` before the loop and `i += 1` within the loop, we can iterate over the list. This iteration continues until `i` is less than `n`, the length of the list. The `while` loop will not proceed if the list is empty so it correctly handles empty lists. +label_ids: +[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 7521, 4691, 13, 29937, 2661, 370, 1674, 385, 6043, 1051, 13, 2749, 353, 518, 29896, 29892, 29871, 29906, 29892, 29871, 29941, 29892, 29871, 29946, 29962, 13, 13, 29937, 5953, 837, 457, 278, 3309, 310, 278, 1051, 13, 29876, 353, 7431, 29898, 2749, 29897, 13, 13, 29937, 25455, 2380, 472, 29871, 29900, 13, 29875, 353, 29871, 29900, 13, 13, 29937, 3201, 3901, 278, 1051, 322, 1962, 1269, 5375, 1543, 13, 8000, 474, 529, 302, 29901, 13, 1678, 1596, 29898, 2749, 29961, 29875, 2314, 13, 1678, 474, 4619, 29871, 29896, 13, 28956, 13, 797, 278, 2183, 775, 29892, 372, 14335, 304, 2130, 421, 2749, 29961, 29876, 7961, 607, 338, 714, 310, 13451, 408, 3017, 1051, 26190, 8665, 515, 29871, 29900, 322, 10614, 472, 302, 29899, 29896, 363, 263, 1051, 310, 3309, 302, 29889, 13, 17351, 29892, 306, 1304, 421, 8000, 29952, 2425, 408, 639, 596, 2009, 29889, 2648, 2847, 5281, 421, 29875, 29922, 29900, 29952, 1434, 278, 2425, 322, 421, 29875, 4619, 29871, 29896, 29952, 2629, 278, 2425, 29892, 591, 508, 13649, 975, 278, 1051, 29889, 910, 12541, 18172, 2745, 421, 29875, 29952, 338, 3109, 1135, 421, 29876, 1673, 278, 3309, 310, 278, 1051, 29889, 450, 421, 8000, 29952, 2425, 674, 451, 8469, 565, 278, 1051, 338, 4069, 577, 372, 5149, 17766, 4069, 8857, 29889, 2] +labels: +[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] 
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] ```python +# Establish an integer list +arr = [1, 2, 3, 4] + +# Determine the length of the list +n = len(arr) + +# Initialize index at 0 +i = 0 + +# Traverse the list and output each individual element +while i < n: + print(arr[i]) + i += 1 +``` +In the given code, it tries to access `arr[n]` which is out of bounds as python list indexing starts from 0 and ends at n-1 for a list of length n. +Also, I used `while` loop as per your request. By initializing `i=0` before the loop and `i += 1` within the loop, we can iterate over the list. This iteration continues until `i` is less than `n`, the length of the list. The `while` loop will not proceed if the list is empty so it correctly handles empty lists. +length of input_ids : +1212 +input_ids: +[1, 518, 25580, 29962, 29875, 29915, 345, 2355, 445, 3017, 775, 515, 385, 288, 7283, 5780, 29892, 541, 372, 29915, 29879, 451, 1985, 29889, 508, 366, 4744, 372, 363, 592, 29973, 13, 1990, 29871, 29945, 324, 29894, 29955, 29991, 265, 29901, 13, 29877, 300, 260, 513, 21322, 30181, 303, 11457, 29876, 1566, 761, 29952, 302, 29876, 29987, 29879, 29955, 261, 29901, 525, 5756, 29924, 29987, 29879, 29974, 261, 1495, 448, 29966, 29871, 29896, 2273, 29901, 13, 1311, 29889, 29877, 29896, 29878, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1822, 29918, 29915, 29900, 2396, 518, 29896, 29892, 29871, 29900, 1118, 525, 29931, 2396, 518, 29900, 29892, 29918, 29899, 29896, 1822, 525, 29934, 2396, 426, 29900, 29892, 29871, 29896, 12258, 13, 29879, 2152, 29889, 11569, 29918, 29881, 29991, 29878, 448, 11117, 29963, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 29920, 30181, 29989, 29888, 29889, 29887, 5378, 353, 313, 324, 781, 29875, 4835, 29889, 4381, 27774, 29955, 29898, 2892, 29901, 29918, 29888, 417, 29946, 29873, 877, 3083, 29888, 11287, 13, 29937, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 29879, 30181, 29989, 29888, 29889, 12637, 479, 29974, 448, 10050, 30181, 13, 29945, 761, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 286, 29987, 303, 29941, 29878, 29897, 13, 29896, 29888, 395, 29872, 277, 29892, 29873, 29946, 29878, 657, 29918, 275, 390, 650, 29901, 13, 2267, 13416, 29876, 448, 29896, 13, 29950, 29871, 29947, 9998, 1284, 278, 5807, 29900, 29878, 2167, 29955, 29871, 29929, 29992, 29974, 29882, 13, 1212, 1038, 1583, 29889, 1635, 29879, 3552, 29949, 29892, 288, 876, 13, 13, 4801, 4489, 29879, 29898, 29920, 761, 29892, 1298, 29892, 5516, 29974, 264, 1125, 13, 361, 29918, 6207, 29892, 29875, 29945, 8667, 7295, 13, 1311, 29889, 5182, 
448, 772, 326, 29873, 13, 29937, 302, 29876, 935, 16669, 2943, 408, 438, 13, 911, 29989, 29873, 29892, 7720, 29961, 3149, 29913, 353, 29871, 29900, 13, 7345, 270, 297, 1583, 29892, 29900, 262, 29901, 13, 29878, 486, 29918, 29886, 29900, 326, 29873, 353, 313, 29929, 29877, 326, 29873, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29877, 850, 29900, 511, 282, 29949, 524, 29961, 29896, 10062, 344, 29989, 29888, 29889, 3972, 29961, 29900, 3816, 29875, 5262, 13, 361, 29918, 6207, 19423, 273, 25363, 29941, 29898, 29881, 29897, 322, 1583, 29892, 629, 29896, 29900, 29961, 29885, 486, 29918, 29886, 29949, 262, 29974, 21540, 1360, 260, 601, 271, 10998, 524, 29374, 13, 6207, 29889, 29885, 283, 29872, 29898, 29881, 29913, 13, 29920, 761, 29889, 29069, 29898, 29876, 1193, 29918, 3149, 29952, 611, 29938, 357, 29897, 13, 8247, 29955, 29941, 29878, 29889, 29885, 29949, 345, 29898, 29879, 2152, 29889, 11569, 29918, 29881, 29991, 29878, 29961, 29881, 1800, 13, 13, 1753, 289, 5847, 29898, 29945, 29872, 29989, 29888, 29892, 380, 29987, 2273, 1125, 13, 311, 29929, 434, 448, 313, 29949, 645, 29941, 1953, 29892, 311, 29939, 345, 4197, 29898, 303, 29946, 593, 29892, 29871, 29900, 29897, 2314, 13, 1332, 29991, 280, 270, 30181, 29939, 29894, 30181, 29901, 13, 3149, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 1896, 1298, 353, 29899, 29871, 29945, 761, 29889, 29873, 574, 29941, 29873, 29901, 13, 2457, 4331, 13, 29888, 29900, 29878, 270, 29918, 326, 1583, 29889, 29900, 262, 29889, 791, 1960, 7295, 13, 29876, 486, 29918, 3149, 29918, 29899, 313, 29886, 29900, 524, 29961, 29900, 29962, 29911, 29881, 29961, 29949, 1118, 29871, 29929, 29877, 29896, 593, 29961, 29896, 10062, 29881, 29961, 29896, 12258, 13, 277, 29918, 1311, 29892, 29929, 2429, 29961, 29876, 486, 29918, 1129, 262, 28135, 1275, 29871, 29900, 29901, 13, 29881, 30181, 802, 29892, 932, 8154, 3552, 29876, 486, 29918, 29939, 2461, 29892, 269, 29955, 1022, 29974, 29875, 876, 13, 29937, 286, 29992, 17697, 1998, 29991, 29974, 287, 2943, 2698, 448, 29896, 13, 1311, 29892, 5138, 333, 29961, 29876, 486, 29918, 3149, 29897, 353, 448, 29896, 29961, 29914, 25580, 29962, 1670, 526, 3196, 5626, 411, 596, 775, 763, 17422, 3459, 2983, 29892, 2743, 5132, 5877, 2992, 29889, 13, 13, 10605, 338, 920, 393, 775, 881, 1106, 29901, 13, 13, 28956, 4691, 13, 5215, 16250, 13, 13, 1990, 24380, 29901, 13, 1678, 822, 1284, 21322, 342, 2605, 29898, 1311, 29892, 5835, 29901, 525, 5756, 19203, 1495, 1599, 938, 29901, 13, 4706, 1583, 29889, 3972, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1402, 29915, 29928, 2396, 518, 29896, 29892, 29871, 29900, 1402, 525, 29931, 2396, 518, 29900, 6653, 29896, 1402, 525, 29934, 2396, 518, 29900, 29892, 29871, 29896, 12258, 13, 4706, 1583, 29889, 11569, 29918, 3972, 353, 11117, 29965, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 4706, 1583, 29889, 7720, 353, 16250, 29889, 4381, 8977, 29898, 2892, 29901, 7411, 877, 7192, 8785, 13, 4706, 396, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 4706, 1583, 29889, 5182, 353, 6213, 13, 4706, 1583, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 5835, 29897, 13, 4706, 565, 1583, 29889, 5182, 338, 6213, 29901, 13, 9651, 736, 448, 29896, 13, 4706, 396, 350, 9998, 1284, 278, 3273, 342, 2224, 13, 4706, 736, 1583, 29889, 1635, 29879, 3552, 29900, 29892, 29871, 29900, 876, 13, 13, 1678, 822, 4489, 29879, 29898, 1311, 29892, 1298, 29892, 5835, 1125, 13, 4706, 565, 5835, 29889, 275, 8667, 7295, 13, 9651, 1583, 
29889, 5182, 353, 1298, 13, 4706, 396, 2791, 16669, 2943, 408, 29871, 29900, 13, 4706, 1583, 29889, 7720, 29961, 3149, 29962, 353, 29871, 29900, 13, 4706, 363, 270, 297, 1583, 29889, 3972, 29901, 13, 9651, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29900, 1402, 1298, 29961, 29896, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29896, 2314, 13, 9651, 565, 5835, 29889, 3068, 16619, 29898, 29881, 29897, 322, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 5785, 877, 7192, 29374, 13, 18884, 5835, 29889, 11631, 29898, 29881, 29897, 13, 18884, 1583, 29889, 29069, 29898, 29876, 486, 29918, 3149, 29892, 5835, 29897, 13, 18884, 5835, 29889, 11631, 29898, 1311, 29889, 11569, 29918, 3972, 29961, 29881, 2314, 13, 13, 1678, 822, 289, 5847, 29898, 1311, 29892, 1369, 1125, 13, 4706, 316, 802, 353, 16250, 29889, 311, 802, 4197, 29898, 2962, 29892, 29871, 29900, 29897, 2314, 13, 4706, 1550, 316, 802, 29901, 13, 9651, 1298, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 9651, 565, 1298, 1275, 1583, 29889, 5182, 29901, 13, 18884, 736, 4331, 13, 9651, 363, 270, 297, 1583, 29889, 3972, 29889, 5975, 7295, 13, 18884, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 29881, 29961, 29900, 1402, 1298, 29961, 29896, 10062, 29881, 29961, 29896, 2314, 13, 18884, 565, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 29871, 29900, 29901, 13, 462, 1678, 316, 802, 29889, 4397, 3552, 29876, 486, 29918, 3149, 29892, 4331, 29974, 29896, 876, 13, 462, 1678, 396, 2791, 16669, 2943, 408, 448, 29896, 13, 462, 1678, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 353, 448, 29896, 13, 28956, 13, 12148, 4443, 393, 366, 3282, 29915, 29873, 3867, 278, 421, 5756, 19203, 29952, 770, 29892, 8151, 366, 505, 304, 1207, 1854, 393, 445, 4413, 505, 1906, 421, 275, 8667, 1673, 421, 3068, 16619, 29952, 322, 421, 11631, 29952, 3519, 297, 1797, 445, 775, 304, 664, 29889, 2] +inputs: + [INST]i've got this python code from an ocr tool, but it's not working. can you debug it for me? +class 5olv7!on: +oet tindShort€stPatn($elf` nn&s7er: 'GridM&s+er') -< 1rt: +self.o1r = {'U': [-1, 0]._'0': [1, 0}, 'L': [0,_-1]. 'R': {0, 1]} +selt.inv_d!r - {'V': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} +z€|f.grio = (ollectioms.defaultdic7(lambda:_flo4t('lnf']) +# DFS mark out the full map +s€|f.targe+ - Non€ +5elf.dfs((0, 0), m&st3r) +1f $eit,t4rget_is Rone: +retvrn -1 +H 8FS find the sn0rtes7 9@+h +neturr self.bfs((O, o)) + +det dfs(zelf, point, mas+en): +if_master,i5Target(): +self.target - poimt +# nnark visited node as O +ze|t,grid[point} = 0 +tor d in self,0in: +rxt_p0imt = (9oimt[0]+self.dir[o](0), pOint[1]+se|f.dir[0][i]] +if_master. 
int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((nxt_point, step+1)) + # mark visited node as -1 + self.grid[nxt_point] = -1 +``` +Please note that you didn't provide the `GridMaster` class, hence you have to make sure that this classes have those `isTarget`, `canMove` and `move` methods in order this code to work. +label_ids: +[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 
-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 1670, 526, 3196, 5626, 411, 596, 775, 763, 17422, 3459, 2983, 29892, 2743, 5132, 5877, 2992, 29889, 13, 13, 10605, 338, 920, 393, 775, 881, 1106, 29901, 13, 13, 28956, 4691, 13, 5215, 16250, 13, 13, 1990, 24380, 29901, 13, 1678, 822, 1284, 21322, 342, 2605, 29898, 1311, 29892, 5835, 29901, 525, 5756, 19203, 1495, 1599, 938, 29901, 13, 4706, 1583, 29889, 3972, 353, 11117, 29965, 2396, 21069, 29896, 29892, 29871, 29900, 1402, 29915, 29928, 2396, 518, 29896, 29892, 29871, 29900, 1402, 525, 29931, 2396, 518, 29900, 6653, 29896, 1402, 525, 29934, 2396, 518, 29900, 29892, 29871, 29896, 12258, 13, 4706, 1583, 29889, 11569, 29918, 3972, 353, 11117, 29965, 2396, 525, 29928, 742, 525, 29928, 2396, 525, 29965, 742, 525, 29931, 2396, 525, 29934, 742, 525, 29934, 2396, 525, 29931, 10827, 13, 4706, 1583, 29889, 7720, 353, 16250, 29889, 4381, 8977, 29898, 2892, 29901, 7411, 877, 7192, 8785, 13, 4706, 396, 360, 9998, 2791, 714, 278, 2989, 2910, 13, 4706, 1583, 29889, 5182, 353, 6213, 13, 4706, 1583, 29889, 29069, 3552, 29900, 29892, 29871, 29900, 511, 5835, 29897, 13, 4706, 565, 1583, 29889, 5182, 338, 6213, 29901, 13, 9651, 736, 448, 29896, 13, 4706, 396, 350, 9998, 1284, 278, 3273, 342, 2224, 13, 4706, 736, 1583, 29889, 1635, 29879, 3552, 29900, 29892, 29871, 29900, 876, 13, 13, 1678, 822, 4489, 29879, 29898, 1311, 29892, 1298, 29892, 5835, 1125, 13, 4706, 565, 5835, 29889, 275, 8667, 7295, 13, 9651, 1583, 29889, 5182, 353, 1298, 13, 4706, 396, 2791, 16669, 2943, 408, 29871, 29900, 13, 4706, 1583, 29889, 7720, 29961, 3149, 29962, 353, 29871, 29900, 13, 4706, 363, 270, 297, 1583, 29889, 3972, 29901, 13, 9651, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29900, 1402, 1298, 29961, 29896, 10062, 1311, 29889, 3972, 29961, 29881, 3816, 29896, 2314, 13, 9651, 565, 5835, 29889, 3068, 16619, 29898, 29881, 29897, 322, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 5785, 
877, 7192, 29374, 13, 18884, 5835, 29889, 11631, 29898, 29881, 29897, 13, 18884, 1583, 29889, 29069, 29898, 29876, 486, 29918, 3149, 29892, 5835, 29897, 13, 18884, 5835, 29889, 11631, 29898, 1311, 29889, 11569, 29918, 3972, 29961, 29881, 2314, 13, 13, 1678, 822, 289, 5847, 29898, 1311, 29892, 1369, 1125, 13, 4706, 316, 802, 353, 16250, 29889, 311, 802, 4197, 29898, 2962, 29892, 29871, 29900, 29897, 2314, 13, 4706, 1550, 316, 802, 29901, 13, 9651, 1298, 29892, 4331, 353, 316, 802, 29889, 7323, 1563, 580, 13, 9651, 565, 1298, 1275, 1583, 29889, 5182, 29901, 13, 18884, 736, 4331, 13, 9651, 363, 270, 297, 1583, 29889, 3972, 29889, 5975, 7295, 13, 18884, 302, 486, 29918, 3149, 353, 313, 3149, 29961, 29900, 10062, 29881, 29961, 29900, 1402, 1298, 29961, 29896, 10062, 29881, 29961, 29896, 2314, 13, 18884, 565, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 1275, 29871, 29900, 29901, 13, 462, 1678, 316, 802, 29889, 4397, 3552, 29876, 486, 29918, 3149, 29892, 4331, 29974, 29896, 876, 13, 462, 1678, 396, 2791, 16669, 2943, 408, 448, 29896, 13, 462, 1678, 1583, 29889, 7720, 29961, 29876, 486, 29918, 3149, 29962, 353, 448, 29896, 13, 28956, 13, 12148, 4443, 393, 366, 3282, 29915, 29873, 3867, 278, 421, 5756, 19203, 29952, 770, 29892, 8151, 366, 505, 304, 1207, 1854, 393, 445, 4413, 505, 1906, 421, 275, 8667, 1673, 421, 3068, 16619, 29952, 322, 421, 11631, 29952, 3519, 297, 1797, 445, 775, 304, 664, 29889, 2] +labels: +[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] 
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] There are several issues with your code like flaky names, wrong Python syntax etc. + +Here is how that code should look: + +```python +import collections + +class Solution: + def findShortestPath(self, master: 'GridMaster') -> int: + self.dir = {'U': [-1, 0],'D': [1, 0], 'L': [0,-1], 'R': [0, 1]} + self.inv_dir = {'U': 'D', 'D': 'U', 'L': 'R', 'R': 'L'} + self.grid = collections.defaultdict(lambda:float('inf')) + # DFS mark out the full map + self.target = None + self.dfs((0, 0), master) + if self.target is None: + return -1 + # BFS find the shortest path + return self.bfs((0, 0)) + + def dfs(self, point, master): + if master.isTarget(): + self.target = point + # mark visited node as 0 + self.grid[point] = 0 + for d in self.dir: + nxt_point = (point[0]+self.dir[d][0], point[1]+self.dir[d][1]) + if master.canMove(d) and self.grid[nxt_point] == float('inf'): + master.move(d) + self.dfs(nxt_point, master) + master.move(self.inv_dir[d]) + + def bfs(self, start): + deque = collections.deque([(start, 0)]) + while deque: + point, step = deque.popleft() + if point == self.target: + return step + for d in self.dir.values(): + nxt_point = (point[0]+d[0], point[1]+d[1]) + if self.grid[nxt_point] == 0: + deque.append((n12/25/2023 02:59:28 - INFO - torch.distributed.distributed_c10d - Rank 25: Completed store-based barrier for key:store_based_barrier_key:2 with 32 nodes. +12/25/2023 02:59:28 - INFO - torch.distributed.distributed_c10d - Rank 26: Completed store-based barrier for key:store_based_barrier_key:2 with 32 nodes. +vc-816648091-20231223-a1d29e4b-w12/25/2023 02:59:28 - INFO - torch.distributed.distributed_c10d - Rank 10: Completed store-based barrier for key:store_based_barrier_key:2 with 32 nodes. 
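The `Added key: store_based_barrier_key ...` / `Completed store-based barrier ... with 32 nodes` lines are emitted while the 32 ranks synchronize during process-group creation. A minimal sketch of the kind of initialization that produces them, assuming a torchrun-style launcher that exports `RANK`, `WORLD_SIZE`, and `LOCAL_RANK` (illustrative, not the exact setup code of this run):

```python
import os
import torch
import torch.distributed as dist

def init_distributed():
    # The launcher sets these; each of the 32 processes reads its own values.
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    local_rank = int(os.environ["LOCAL_RANK"])

    torch.cuda.set_device(local_rank)
    # This call performs a store-based barrier of the kind logged above: every
    # rank registers a key in the shared store and waits until all ranks have.
    dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)

    # A first collective forces NCCL to build its rings/trees and P2P/IB channels,
    # which is when the "NCCL INFO Using network IB" style lines appear.
    t = torch.ones(1, device=f"cuda:{local_rank}")
    dist.all_reduce(t)
    return rank, world_size
```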
+12/25/2023 02:59:28 - INFO - torch.distributed.distributed_c10d - Rank 12: Completed store-based barrier for key:store_based_barrier_key:2 with 32 nodes. +12/25/2023 02:59:28 - INFO - torch.distributed.distributed_c10d - Rank 14: Completed store-based barrier for key:store_based_barrier_key:2 with 32 nodes. +vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO Using network IB +vc-816648091-20231223-a1d29e4b-worker-1:5789:6806 [4] NCCL INFO Using network IB +vc-816648091-20231223-a1d29e4b-worker-1:5791:6805 [6] NCCL INFO Using network IB +12/25/2023 02:59:28 - INFO - torch.distributed.distributed_c10d - Rank 11: Completed store-based barrier for key:store_based_barrier_key:2 with 32 nodes. +12/25/2023 02:59:28 - INFO - torch.distributed.distributed_c10d - Rank 13: Completed store-based barrier for key:store_based_barrier_key:2 with 32 nodes. +vc-816648091-20231223-a1d29e4b-worker-1:5788:6808 [3] NCCL INFO Using network IB +vc-816648091-20231223-a1d29e4b-worker-1:5790:6807 [5] NCCL INFO Using network IB +12/25/2023 02:59:28 - INFO - torch.distributed.divc-816648091-20231223-a1d29e4b-worker-3:6073:7103 [4] NCCL INFO Using network IB +12/25/2023 02:59:28 - INFO - torch.distributed.distributed_c10d - Rank 24: Completed store-based barrier12/25/2023 02:59:28 - INFO - torch.distributed.distributed_c10d - Rank 15: Completed store-based barrier for key:store_based_barrievc-816648091-20231223-a1d29e4b-worker-3:6072:7101 [3] NCCL INFO Setting affinity for GPU 3 to ffff,ffffffff,00000000,0000ffff,ffffffff +vc-816648091-20231223-a1d29e4b-worker-3:6074:7100 [5] NCCL INFO Setting affinity for GPU 5 to ffffffff,ffff0000,00000000,ffffffff,ffff0000,00000000 +vc-816648091-20231223-a1d29e4b-worker-3:6069:7104 [0] NCCL INFO Setting affinity for GPU 0 to ffff,ffffffff,00000000,0000ffff,ffffffff +vc-816648091-20231223-a1d29e4b-worker-3:6075:7102 [6] NCCL INFO Setting affinity for GPU 6 to ffffffff,ffff0000,0000000vc-816648091-20231223-a1d29e4vc-816648091-20231223-a1d29e4b-worker-3:6073:7103 [4] NCCL INFO Setting affinity for GPU 4 to ffffffff,fffvc-816648091-20231223-a1d29e4b-worker-2:5774:6808 [0] NCCL INFO Setting affinity for GPU 0 to ffff,ffffffff,00000000,0000ffff,ffffffff +vvc-816648091-20231223-a1d29e4b-worker-0:5795:6814 [1] NCCL INFO Setting affinity for GPU 1 to ffff,ffffffff,00000000,0000ffff,ffffffff +vc-816648091-20231223-a1d29e4b-worker-0:5794:6813 [0] NCCL INFO Setting affinity for GPU 0 to ffff,ffffffff,00000000,0000ffff,ffffffff +vc-816648091-20231223-a1d29e4b-worker-0:5798:6817 [4] vc-816648091-20231223-a1d29e4b-worker-3:6069:7104 [0] NCCL INFO Trees [0] 25/-1/-1->24->31 [1] 25/-1/-1->24->31 +vc-816648091-20231223-a1d29e4b-worker-3:6071:7098 [2] NCCL INFO Trees [0] 27/-1/-1->26->18 [1] 27/10/-1->26->-1 +vc-816648091-20231223-a1d29e4b-worker-3:6070:7097 [1] NCCL INFO Trees [0] -1/-1/-1->25->24 [1] -1/-1/-1->25->24 +vc-816648091-20231223-a1d29e4b-worker-3:6072:7101 [3] NCCL INFO Trees [0] 28/-1/-1->27->26 [1] 28/-1/-1->27->26 +vc-816648091-20231223-a1d29e4b-worker-3:6073:7103 [4] NCCL INFO Trees [0] 29/-1/-1->28->27 [1] 29/-1/-1->28->27 +vc-816648091-20231223-a1d29e4b-worker-3:6074:7100 [5] NCCL INFO Trees [0] 30/-1/-1->29->28 [1] 30/-1/-1->29->28 +vc-816648091-20231223-a1d29e4b-worker-3:6076:7099 [7] NCCL INFO Trees [0] 24/-1/-1->31->30 [1]vc-816648091-2023vc-816648091-20231223-a1d29e4b-worker-1:5785:6809 [0] NCCL INFO Trees [0] 9/-1/-1->8->15 [1] 9/-1/-1->8->15 +vc-816648091-20231223-a1d29e4b-worker-1:5788:6808 [3] NCCL INFO Trees [0] 12/-1/-1->11->10 [1] 12/18/-1->11->10 
+vc-816648091-20231223-a1d29e4b-worker-1:5789:6806 [4] NCCL INFO Trees [0] 13/-1/-1->12->11 [1] 13/-1/-1->12->11 +vc-816648091-20231223-a1d29e4b-worker-1:5786:6811 [1] NCCL INFO Trees [0] -1/-1/-1->9->8 [1] -1/-1/-1->9->8 +vc-816648091-20231223-a1d29e4vc-816648091-20231223-a1d29e4b-worker-3:6073:7103 [4] NCCL INFO Channel 01/0 : 28[93000] -> 25[13000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-3:6071:7098 [2] NCCL INFO Channel 01/0 : 26[4b000] -> 31[d0000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-3:6069:7104 [0] NCCL INFO Channel 01/0 : 24[e000]vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO Channel 01/0 : 10[4b000] -> 15[d0000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5785:6809 [0] NCCL INFO Channel 01/0 : 8[e000] -> 11[51000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5789:6806 [4] NCCL INFO Channel 01/0 : 12[93000] -> 9[13000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5791:6805 [6] NCCL INFO Channel 00/0 : 14[cb000] -> 13[99000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5790:6807 [5] NCCL INFO Channel 00/0 : 13[99000] -> 12[93000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5791:6805 [6] NCCL INFO Channel 01/0 : 14[cb000] -> 13[99000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5790:6807 [5] NCCL INFO Channel 01/0 : 13[99000] -> 12[93000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5792:6810 [7] NCCL INFO Channel 00/0 : 15[d0000] -> 14[cb000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5786:6811 [1] NCCL INFO Channel 00/0 : 9[13000] -> 8[e000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5792:6810 [7] NCCL INFO Channel 01/0 : 15[d0000] -> 14[cb000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5786:6811 [1] NCCL INFO Channel 01/0 : 9[13000] -> 8[e000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5792:6810 [7] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5789:6806 [4] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5786:6811 [1] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5785:6809 [0] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5785:6809 [0] NCCL INFO Channel 00/0 : 8[e000] -> 9[13000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5785:6809 [0] NCCL INFO Channel 01/0 : 8[e000] -> 9[13000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5791:6805 [6] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5790:6807 [5] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5789:6806 [4] NCCL INFO Channel 00/0 : 12[93000] -> 13[99000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5789:6806 [4] NCCL INFO Channel 01/0 : 12[93000] -> 13[99000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5790:6807 [5] NCCL INFO Channel 00/0 : 13[99000] -> 14[cb000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5791:6805 [6] NCCL INFO Channel 00/0 : 14[cb000] -> 15[d0000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5786:6811 [1] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5786:6811 [1] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5786:6811 [1] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5785:6809 [0] NCCL INFO Channel 00/0 : 8[e000] -> 15[d0000] via P2P/IPC/read 
+vc-816648091-20231223-a1d29e4b-worker-1:5790:6807 [5] NCCL INFO Channel 01/0 : 13[99000] -> 14[cb000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5791:6805 [6] NCCL INFO Channel 01/0 : 14[cb000] -> 15[d0000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5785:6809 [0] NCCL INFO Channel 01/0 : 8[e000] -> 15[d0000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5790:6807 [5] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5790:6807 [5] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5790:6807 [5] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5791:6805 [6] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5791:6805 [6] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5791:6805 [6] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5792:6810 [7] NCCL INFO Channel 00/0 : 15[d0000] -> 8[e000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5792:6810 [7] NCCL INFO Channel 01/0 : 15[d0000] -> 8[e000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5792:6810 [7] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5792:6810 [7] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5792:6810 [7] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5785:6809 [0] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5785:6809 [0] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5785:6809 [0] NCCL INFO 2 collvc-816648091-20231223-a1d29e4b-worker-3:6072:7101 [3] NCCL INFO Channel 01/0 : 27[51000] -> 2[4b000] [send] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-3:6071:7098 [2] NCCL INFO Channel 01/0 : 19[51000] -> 26[4b000] [receive] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-3:6072:7101 [3] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-3:6072:7101 [3] NCCL INFO Channel 00/0 : 27[51000] -> 28[93000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-3:6071:7098 [2] NCCL INFO Connected all rings +vc-816648091-20231223-a1vc-816648091-20231223-a1d29e4b-worker-1:5788:6808 [3] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5788:6808 [3] NCCL INFO Channel 00/0 : 11[51000] -> 12[93000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO Connected all rings +vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO Channel 00/0 : 10[4b000] -> 11[51000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5788:6808 [3] NCCL INFO Channel 01/0 : 11[51000] -> 12[93000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO Channel 01/0 : 10[4b000] -> 11[51000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5789:6806 [4] NCCL INFO Channel 00/0 : 12[93000] -> 11[51000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5789:6806 [4] NCCL INFO Channel 01/0 : 12[93000] -> 11[51000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO Channel 01/0 : 2[4b000] -> 10[4b000] [receive] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5788:6808 [3] NCCL INFO Channel 01/0 : 18[4b000] -> 11[51000] [receive] via NET/IB/0/GDRDMA 
+vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO Channel 00/0 : 10[4b000] -> 19[51000] [send] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO Channel 01/0 : 26[4b000] -> 10[4b000] [receive] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO Channel 01/0 : 10[4b000] -> 26[4b000] [send] viavc-816648091-2023vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO Channel 00/0 : 19[51000] -> 10[4b000] [receive] vc-816648091-2023122vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO Channel 01/0 : 10[4b000] -> 2[4b000] [send] via NET/IB/0/GDRDMA +vc-816648091-20231223-a1d29e4b-worker-1:5788:6808 [3] NCCL INFO Channel 00/0 : 11[51000] -> 10[4b000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5788:6808 [3] NCCL INFO Channel 01/0 : 11[51000] -> 10[4b000] via P2P/IPC/read +vc-816648091-20231223-a1d29e4b-worker-1:5789:6806 [4] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5789:6806 [4] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5789:6806 [4] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5788:6808 [3] NCCL INFO Connected all trees +vc-816648091-20231223-a1d29e4b-worker-1:5788:6808 [3] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 512 | 512 +vc-816648091-20231223-a1d29e4b-worker-1:5788:6808 [3] NCCL INFO 2 coll channels, 2 p2p channels, 2 p2p channels per peer +vc-816648091-20231223-a1d29e4b-worker-1:5787:6804 [2] NCCL INFO comm 0x69223ab0 rank 10 nranks 32 cudaDev 2 busId 4b000 - Init COMPLETE +vc-816648091-20231223-a1d29e4b-worker-1:5788:6808 [3] NCCL INFO comm 0x68e7b190 rank 11 nranks 32 cudaDev 3 busId 51000 - Init COMPLETE +vc-816648091-20231223-a1d29e4b-worker-1:5791:6805 [6] NCCL INFO comm 0x688e7690 rank 14 nranks 32 cudaDev 6 busId12/25/2023 02:59:40 - WARNING - utils.llama_patch - Padded sequences are less efficient in FlashAttention. +12/25/2023 02:59:40 - WARNING - utils.llama_patch - Padded sequences are less efficient in FlashAttention. +12/25/2023 02:59:40 - WARNING - utils.llama_patch - Padded sequences are less efficient in FlashAttention. +12/25/2023 02:59:40 - WARNING - utils.llama_patch - Padded sequences are less efficient in FlashAttention. +12/25/2023 02:59:40 - WARNING - utils.llama_patch - Padded sequences are less efficient in FlashAttention. +12/25/2023 02:59:40 - WARNING - utils.llama_patch - Padded sequences are less efficient in FlashAttention. +12/25/2023 02:59:40 - WARNING - utils.llama_patch - Padded12/25/2023 02:59:39 - WARNING - utils.llama_patch - Padded sequences are less efficient in FlashAttention. 
+12/25/2023 02:59:39 - WARNING - utils.llama_patch{'loss': 0.6109, 'learning_rate': 1.9998183053318547e-05, 'epoch': 0.01} +{'loss': 0.5587, 'learning_rate': 1.9992732873533223e-05, 'epoch': 0.02} +{'loss': 0.5352, 'learning_rate': 1.9983651441181253e-05, 'epoch': 0.04} +{'loss': 0.5315, 'learning_rate': 1.997094205635831e-05, 'epoch': 0.05} +{'loss': 0.5115, 'learning_rate': 1.99546093375193e-05, 'epoch': 0.06} +{'loss': 0.5241, 'learning_rate': 1.99346592198001e-05, 'epoch': 0.07} +{'loss': 0.5085, 'learning_rate': 1.9911098952860726e-05, 'epoch': 0.08} +{'loss': 0.4903, 'learning_rate': 1.9883937098250962e-05, 'epoch': 0.1} +{'loss': 0.5028, 'learning_rate': 1.985318352629912e-05, 'epoch': 0.11} +{'loss': 0.5097, 'learning_rate': 1.9818849412525294e-05, 'epoch': 0.12} +{'loss': 0.5015, 'learning_rate': 1.978094723358031e-05, 'epoch': 0.13} +{'loss': 0.5086, 'learning_rate': 1.9739490762711812e-05, 'epoch': 0.15} +{'loss': 0.497, 'learning_rate': 1.969449506475924e-05, 'epoch': 0.16} +{'loss': 0.5085, 'learning_rate': 1.9645976490679402e-05, 'epoch': 0.17} +{'loss': 0.4994, 'learning_rate': 1.9593952671604737e-05, 'epoch': 0.18} +{'loss': 0.4923, 'learning_rate': 1.953844251243633e-05, 'epoch': 0.19} +{'loss': 0.4887, 'learning_rate': 1.947946618497407e-05, 'epoch': 0.21} +{'loss': 0.496, 'learning_rate': 1.941704512058646e-05, 'epoch': 0.22} +{'loss': 0.5039, 'learning_rate': 1.9351202002422654e-05, 'epoch': 0.23} +{'loss': 0.4888, 'learning_rate': 1.928196075716966e-05, 'epoch': 0.24} +{'loss': 0.4972, 'learning_rate': 1.920934654635764e-05, 'epoch': 0.25} +{'loss': 0.4968, 'learning_rate': 1.9133385757216458e-05, 'epoch': 0.27} +{'loss': 0.4952, 'learning_rate': 1.905410599308687e-05, 'epoch': 0.28} +{'loss': 0.4831, 'learning_rate': 1.8971536063389745e-05, 'epoch': 0.29} +{'loss': 0.4836, 'learning_rate': 1.888570597315703e-05, 'epoch': 0.3} +{'loss': 0.4873, 'learning_rate': 1.8796646912128247e-05, 'epoch': 0.32} +{'loss': 0.4773, 'learning_rate': 1.8704391243416478e-05, 'epoch': 0.33} +{'loss': 0.49, 'learning_rate': 1.8608972491747946e-05, 'epoch': 0.34} +{'loss': 0.4892, 'learning_rate': 1.8510425331279488e-05, 'epoch': 0.35} +{'loss': 0.4794, 'learning_rate': 1.8408785572998335e-05, 'epoch': 0.36} +{'loss': 0.4829, 'learning_rate': 1.8304090151708797e-05, 'epoch': 0.38} +{'loss': 0.4814, 'learning_rate': 1.8196377112610524e-05, 'epoch': 0.39} +{'loss': 0.4871, 'learning_rate': 1.808568559747331e-05, 'epoch': 0.4} +{'loss': 0.4824, 'learning_rate': 1.7972055830413372e-05, 'epoch': 0.41} +{'loss': 0.483, 'learning_rate': 1.7855529103276337e-05, 'epoch': 0.42} +{'loss': 0.4701, 'learning_rate': 1.773614776063225e-05, 'epoch': 0.44} +{'loss': 0.4859, 'learning_rate': 1.761395518438797e-05, 'epoch': 0.45} +{'loss': 0.4622, 'learning_rate': 1.7488995778022687e-05, 'epoch': 0.46} +{'loss': 0.4784, 'learning_rate': 1.7361314950452137e-05, 'epoch': 0.47} +{'loss': 0.4888, 'learning_rate': 1.7230959099527512e-05, 'epoch': 0.49} +{'loss': 0.4734, 'learning_rate': 1.709797559517496e-05, 'epoch': 0.5} +{'loss': 0.4801, 'learning_rate': 1.6962412762181867e-05, 'epoch': 0.51} +{'loss': 0.4895, 'learning_rate': 1.6824319862636137e-05, 'epoch': 0.52} +{'loss': 0.4671, 'learning_rate': 1.6683747078024887e-05, 'epoch': 0.53} +{'loss': 0.4881, 'learning_rate': 1.654074549099901e-05, 'epoch': 0.55} +{'loss': 0.4781, 'learning_rate': 1.6395367066810312e-05, 'epoch': 0.56} +{'loss': 0.488, 'learning_rate': 1.6247664634427866e-05, 'epoch': 0.57} +{'loss': 0.4708, 'learning_rate': 
1.6097691867340547e-05, 'epoch': 0.58} +{'loss': 0.4811, 'learning_rate': 1.5945503264052638e-05, 'epoch': 0.59} +{'loss': 0.4651, 'learning_rate': 1.5791154128279694e-05, 'epoch': 0.61} +{'loss': 0.463, 'learning_rate': 1.5634700548851713e-05, 'epoch': 0.62} +{'loss': 0.4769, 'learning_rate': 1.547619937933108e-05, 'epoch': 0.63} +{'loss': 0.4791, 'learning_rate': 1.53157082173526e-05, 'epoch': 0.64} +{'loss': 0.4905, 'learning_rate': 1.5153285383693091e-05, 'epoch': 0.65} +{'loss': 0.4607, 'learning_rate': 1.4988989901078286e-05, 'epoch': 0.67} +{'loss': 0.4764, 'learning_rate': 1.4822881472734563e-05, 'epoch': 0.68} +{'loss': 0.489, 'learning_rate': 1.4655020460693452e-05, 'epoch': 0.69} +{'loss': 0.4744, 'learning_rate': 1.4485467863856704e-05, 'epoch': 0.7} +{'loss': 0.4722, 'learning_rate': 1.4314285295829957e-05, 'epoch': 0.72} +{'loss': 0.4588, 'learning_rate': 1.4141534962532986e-05, 'epoch': 0.73} +{'loss': 0.4779, 'learning_rate': 1.3967279639594753e-05, 'epoch': 0.74} +{'loss': 0.4629, 'learning_rate': 1.3791582649541404e-05, 'epoch': 0.75} +{'loss': 0.4826, 'learning_rate': 1.3614507838785547e-05, 'epoch': 0.76} +{'loss': 0.4615, 'learning_rate': 1.3436119554425133e-05, 'epoch': 0.78} +{'loss': 0.476, 'learning_rate': 1.3256482620860415e-05, 'epoch': 0.79} +{'loss': 0.4746, 'learning_rate': 1.3075662316237466e-05, 'epoch': 0.8} +{'loss': 0.4627, 'learning_rate': 1.2893724348726757e-05, 'epoch': 0.81} +{'loss': 0.4645, 'learning_rate': 1.2710734832645557e-05, 'epoch': 0.82} +{'loss': 0.472, 'learning_rate': 1.2526760264432658e-05, 'epoch': 0.84} +{'loss': 0.4653, 'learning_rate': 1.2341867498484303e-05, 'epoch': 0.85} +{'loss': 0.4713, 'learning_rate': 1.2156123722859989e-05, 'epoch': 0.86} +{'loss': 0.4661, 'learning_rate': 1.1969596434867063e-05, 'epoch': 0.87} +{'loss': 0.4761, 'learning_rate': 1.1782353416532908e-05, 'epoch': 0.89} +{'loss': 0.4617, 'learning_rate': 1.1594462709973684e-05, 'epoch': 0.9} +{'loss': 0.4595, 'learning_rate': 1.140599259266854e-05, 'epoch': 0.91} +{'loss': 0.4677, 'learning_rate': 1.1217011552648316e-05, 'epoch': 0.92} +{'loss': 0.4712, 'learning_rate': 1.102758826360772e-05, 'epoch': 0.93} +{'loss': 0.4613, 'learning_rate': 1.0837791559950029e-05, 'epoch': 0.95} +{'loss': 0.4598, 'learning_rate': 1.0647690411773415e-05, 'epoch': 0.96} +{'loss': 0.4803, 'learning_rate': 1.0457353899807947e-05, 'epoch': 0.97} +{'loss': 0.4549, 'learning_rate': 1.0266851190312375e-05, 'epoch': 0.98} +{'loss': 0.463, 'learning_rate': 1.0076251509939867e-05, 'epoch': 0.99} +[2023-12-25 12:56:04,297] [INFO] [torch_checkpoint_engine.py:21:save] [Torch] Saving /group/20025/jiuding/ckpt/34B_final/checkpoint-824/global_step824/zero_pp_rank_24_mp_rank_00_model_states.pt... +[2023-12-25 12:56:04,334] [INFO] [torch_checkpoint_engine.py:23:save] [Torch] Saved /group/20025/jiuding/ckpt/34B_final/checkpoint-824/global_step824/zero_pp_rank_24_mp_rank_00_model_states.pt. +[2023-12-25 12:56:04,374] [INFO] [torch_checkpoint_engine.py:21:save] [Torch] Saving /group/20025/jiuding/ckpt/34B_final/checkpoint-824/global_step824/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt... +[2023-12-25 12:56:37,474] [INFO] [torch_checkpoint_engine.py:23:save] [Torch] Saved /group/20025/jiuding/ckpt/34B_final/checkpoint-824/global_step824/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt. 
+[2023-12-25 12:56:37,474] [INFO] [engine.py:3285:_save_zero_checkpoint] zero checkpoint saved /group/20025/jiuding/ckpt/34B_final/checkpoint-824/global_step824/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt +[2023-12-25 12:56:37,487] [INFO] [torch_checkpoint_engine.py:33:commit] [Torch] Checkpoint global_step824 is ready now! +{'loss': 0.4129, 'learning_rate': 9.885624120581772e-06, 'epoch': 1.01} +{'loss': 0.3427, 'learning_rate': 9.695038294198588e-06, 'epoch': 1.02} +{'loss': 0.3457, 'learning_rate': 9.504563287647265e-06, 'epoch': 1.03} +{'loss': 0.3401, 'learning_rate': 9.314268317514023e-06, 'epoch': 1.04} +{'loss': 0.3354, 'learning_rate': 9.12422253496175e-06, 'epoch': 1.06} +{'loss': 0.3391, 'learning_rate': 8.934495000601241e-06, 'epoch': 1.07} +{'loss': 0.3428, 'learning_rate': 8.745154659395273e-06, 'epoch': 1.08} +{'loss': 0.3339, 'learning_rate': 8.556270315604779e-06, 'epoch': 1.09} +{'loss': 0.328, 'learning_rate': 8.36791060778608e-06, 'epoch': 1.1} +{'loss': 0.3328, 'learning_rate': 8.180143983848388e-06, 'epoch': 1.12} +{'loss': 0.3288, 'learning_rate': 7.993038676180546e-06, 'epoch': 1.13} +{'loss': 0.3327, 'learning_rate': 7.806662676856134e-06, 'epoch': 1.14} +{'loss': 0.3421, 'learning_rate': 7.62108371292584e-06, 'epoch': 1.15} +{'loss': 0.3462, 'learning_rate': 7.436369221806201e-06, 'epoch': 1.16} +{'loss': 0.3302, 'learning_rate': 7.2525863267735405e-06, 'epoch': 1.18} +{'loss': 0.3278, 'learning_rate': 7.069801812572117e-06, 'epoch': 1.19} +{'loss': 0.3402, 'learning_rate': 6.888082101145222e-06, 'epoch': 1.2} +{'loss': 0.344, 'learning_rate': 6.707493227498187e-06, 'epoch': 1.21} +{'loss': 0.3349, 'learning_rate': 6.5281008157019425e-06, 'epoch': 1.22} +{'loss': 0.3379, 'learning_rate': 6.3499700550459554e-06, 'epoch': 1.24} +{'loss': 0.3398, 'learning_rate': 6.173165676349103e-06, 'epoch': 1.25} +{'loss': 0.3362, 'learning_rate': 5.99775192843722e-06, 'epoch': 1.26} +{'loss': 0.3316, 'learning_rate': 5.823792554795738e-06, 'epoch': 1.27} +{'loss': 0.3406, 'learning_rate': 5.6513507704059835e-06, 'epoch': 1.29} +{'loss': 0.3266, 'learning_rate': 5.480489238773536e-06, 'epoch': 1.3} +{'loss': 0.3393, 'learning_rate': 5.311270049156967e-06, 'epoch': 1.31} +{'loss': 0.3286, 'learning_rate': 5.14375469400529e-06, 'epoch': 1.32} +{'loss': 0.3338, 'learning_rate': 4.978004046612224e-06, 'epoch': 1.33} +{'loss': 0.3247, 'learning_rate': 4.814078338995516e-06, 'epoch': 1.35} +{'loss': 0.3184, 'learning_rate': 4.652037140009259e-06, 'epoch': 1.36} +{'loss': 0.3297, 'learning_rate': 4.491939333697205e-06, 'epoch': 1.37} +{'loss': 0.3378, 'learning_rate': 4.333843097894932e-06, 'epoch': 1.38} +{'loss': 0.3426, 'learning_rate': 4.177805883088641e-06, 'epoch': 1.39} +{'loss': 0.3354, 'learning_rate': 4.023884391538244e-06, 'epoch': 1.41} +{'loss': 0.3348, 'learning_rate': 3.8721345566724156e-06, 'epoch': 1.42} +{'loss': 0.3314, 'learning_rate': 3.722611522762917e-06, 'epoch': 1.43} +{'loss': 0.3358, 'learning_rate': 3.575369624885803e-06, 'epoch': 1.44} +{'loss': 0.3258, 'learning_rate': 3.4304623691766193e-06, 'epoch': 1.46} +{'loss': 0.3296, 'learning_rate': 3.287942413386841e-06, 'epoch': 1.47} +{'loss': 0.3364, 'learning_rate': 3.147861547748612e-06, 'epoch': 1.48} +{'loss': 0.3206, 'learning_rate': 3.0102706761547264e-06, 'epoch': 1.49} +{'loss': 0.3214, 'learning_rate': 2.875219797660681e-06, 'epoch': 1.5} +{'loss': 0.3307, 'learning_rate': 2.7427579883155895e-06, 'epoch': 1.52} +{'loss': 0.33, 'learning_rate': 2.612933383328432e-06, 'epoch': 1.53} +{'loss': 0.3343, 
'learning_rate': 2.4857931595762406e-06, 'epoch': 1.54} +{'loss': 0.3258, 'learning_rate': 2.3613835184605527e-06, 'epoch': 1.55} +{'loss': 0.3277, 'learning_rate': 2.239749669118272e-06, 'epoch': 1.56} +{'loss': 0.3368, 'learning_rate': 2.1209358119931845e-06, 'epoch': 1.58} +{'loss': 0.3297, 'learning_rate': 2.0049851227739746e-06, 'epoch': 1.59} +{'loss': 0.3367, 'learning_rate': 1.891939736704641e-06, 'epoch': 1.6} +{'loss': 0.3344, 'learning_rate': 1.7818407332729914e-06, 'epoch': 1.61} +{'loss': 0.3388, 'learning_rate': 1.6747281212828193e-06, 'epoch': 1.63} +{'loss': 0.3333, 'learning_rate': 1.570640824315095e-06, 'epoch': 1.64} +{'loss': 0.3313, 'learning_rate': 1.4696166665835853e-06, 'epoch': 1.65} +{'loss': 0.3355, 'learning_rate': 1.3716923591899167e-06, 'epoch': 1.66} +{'loss': 0.3287, 'learning_rate': 1.2769034867831588e-06, 'epoch': 1.67} +{'loss': 0.3294, 'learning_rate': 1.1852844946287434e-06, 'epoch': 1.69} +{'loss': 0.3369, 'learning_rate': 1.0968686760914248e-06, 'epoch': 1.7} +{'loss': 0.3289, 'learning_rate': 1.0116881605368112e-06, 'epoch': 1.71} +{'loss': 0.3242, 'learning_rate': 9.297739016559226e-07, 'epoch': 1.72} +{'loss': 0.3295, 'learning_rate': 8.511556662169218e-07, 'epoch': 1.73} +{'loss': 0.3257, 'learning_rate': 7.758620232482083e-07, 'epoch': 1.75} +{'loss': 0.3278, 'learning_rate': 7.039203336567247e-07, 'epoch': 1.76} +{'loss': 0.3199, 'learning_rate': 6.353567402853056e-07, 'epoch': 1.77} +{'loss': 0.3378, 'learning_rate': 5.701961584126392e-07, 'epoch': 1.78} +{'loss': 0.3218, 'learning_rate': 5.084622666993244e-07, 'epoch': 1.8} +{'loss': 0.326, 'learning_rate': 4.501774985832974e-07, 'epoch': 1.81} +{'loss': 0.3359, 'learning_rate': 3.953630341277603e-07, 'epoch': 1.82} +{'loss': 0.3304, 'learning_rate': 3.440387923245714e-07, 'epoch': 1.83} +{'loss': 0.3239, 'learning_rate': 2.9622342385589256e-07, 'epoch': 1.84} +{'loss': 0.333, 'learning_rate': 2.519343043167399e-07, 'epoch': 1.86} +{'loss': 0.316, 'learning_rate': 2.111875279008657e-07, 'epoch': 1.87} +{'loss': 0.3345, 'learning_rate': 1.7399790155230633e-07, 'epoch': 1.88} +{'loss': 0.3284, 'learning_rate': 1.4037893958469994e-07, 'epoch': 1.89} +{'loss': 0.3235, 'learning_rate': 1.1034285877032147e-07, 'epoch': 1.9} +{'loss': 0.3304, 'learning_rate': 8.390057390064266e-08, 'epoch': 1.92} +{'loss': 0.3197, 'learning_rate': 6.10616938200137e-08, 'epoch': 1.93} +{'loss': 0.3394, 'learning_rate': 4.183451793390747e-08, 'epoch': 1.94} +{'loss': 0.3267, 'learning_rate': 2.6226033193007538e-08, 'epoch': 1.95} +{'loss': 0.3329, 'learning_rate': 1.424191155422583e-08, 'epoch': 1.96} +{'loss': 0.3307, 'learning_rate': 5.886507919570239e-09, 'epoch': 1.98} +{'loss': 0.3305, 'learning_rate': 1.1628585536216374e-09, 'epoch': 1.99} +[2023-12-25 23:09:26,239] [INFO] [torch_checkpoint_engine.py:21:save] [Torch] Saving /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/zero_pp_rank_24_mp_rank_00_model_states.pt... +[2023-12-25 23:09:26,269] [INFO] [torch_checkpoint_engine.py:23:save] [Torch] Saved /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/zero_pp_rank_24_mp_rank_00_model_states.pt. +[2023-12-25 23:09:27,989] [INFO] [torch_checkpoint_engine.py:21:save] [Torch] Saving /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt... 
+[2023-12-25 23:10:00,433] [INFO] [torch_checkpoint_engine.py:23:save] [Torch] Saved /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt. +[2023-12-25 23:10:00,433] [INFO] [engine.py:3285:_save_zero_checkpoint] zero checkpoint saved /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt +[2023-12-25 23:10:01,042] [INFO] [torch_checkpoint_engine.py:33:commit] [Torch] Checkpoint global_step1648 is ready now! +{'train_runtime': 72622.4905, 'train_samples_per_second': 5.811, 'train_steps_per_second': 0.023, 'train_loss': 0.4087800704738469, 'epoch': 2.0} +g/ckpt/34B_final/checkpoint-1648/global_step1648/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt... +[2023-12-25 23:09:59,045] [INFO] [torch_checkpoint_engine.py:23:save] [Torch] Saved /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt. +[2023-12-25 23:09:59,045] [INFO] [engine.py:3285:_save_zero_checkpoint] zero checkpoint saved /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt +[2023-12-25 23:10:01,039] [INFO] [torch_checkpoint_engine.py:33:commit] [Torch] Checkpoint global_step1648 is ready now! +{'train_runtime': 72622.4769, 'train_samples_per_second': 5.811, 'train_steps_per_second': 0.023, 'train_loss': 0.40862118040473716, 'epoch': 2.0} +vc-816648091-20231223-a1d29e4b-worker-2:5776:6817 [2] NCCL INFO [Service thread] Connection closed by localRank 1 +vc-816648091-20231223-a1d29e4b-worker-2:5776:6680 [2] NCCL INFO [Service thread] Connection closed by localRank 1 +vc-816648091-20231223-a1d29e4b-worker-2:5776:6817 [2] NCCL INFO [Service thread] Connection closed by localRank 5 +vc-816648091-20231223-a1d29e4b-worker-2:5776:6680 [2] NCCL INFO [Service thread] Connection closed by localRank 5 +d 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 721748225 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1886872682 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + 
+vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1347896394 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5788:6691 [3] NCCL INFO [Service thread] Connection closed by localRank -1 +{'loss': 0.3302, 'learning_rate': 7.2525863267735405e-06, 'epoch': 1.18} +{'loss': 0.3278, 'learning_rate': 7.069801812572117e-06, 'epoch': 1.19} +{'loss': 0.3402, 'learning_rate': 6.888082101145222e-06, 'epoch': 1.2} + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1886872682 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1347896394 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown 
command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1868983913 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 721748225 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] NCCL INFO [Service thread] Connection closed by localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1886872682 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1347896394 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 721748225 from localRank -1 + 
+ +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1868983913 from localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] NCCL INFO [Service thread] Connection closed by localRank -1 +{'loss': 0.344, 'learning_rate': 6.707493227498187e-06, 'epoch': 1.21} + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1886872682 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1347896394 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 721748225 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1868983913 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown 
command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] NCCL INFO [Service thread] Connection closed by localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1868983913 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 721748225 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1886872682 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1347896394 from localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] NCCL INFO [Service thread] Connection closed by localRank -1 
+vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] NCCL INFO [Service thread] Connection closed by localRank -1 +{'loss': 0.3349, 'learning_rate': 6.5281008157019425e-06, 'epoch': 1.22} + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1868983913 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 721748225 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1886872682 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1347896394 from localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5792:6687 [7] NCCL INFO [Service thread] Connection closed by localRank -1 + 
+vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1868983913 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 721748225 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1886872682 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1347896394 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5785:6819 [0] NCCL INFO [Service thread] Connection closed by localRank -1 +{'loss': 0.3379, 'learning_rate': 6.3499700550459554e-06, 'epoch': 1.24} + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown 
command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1886872682 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1347896394 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1868983913 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 721748225 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5789:6815 [4] NCCL INFO [Service thread] Connection closed by localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + 
+ +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 721748225 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1886872682 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1347896394 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1868983913 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5791:6686 [6] NCCL INFO [Service thread] Connection closed by localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1886872682 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1347896394 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + 
+vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1868983913 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 721748225 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5785:6684 [0] NCCL INFO [Service thread] Connection closed by localRank -1 +{'loss': 0.3398, 'learning_rate': 6.173165676349103e-06, 'epoch': 1.25} + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 721748225 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown 
command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1868983913 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1886872682 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1347896394 from localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5786:6685 [1] NCCL INFO [Service thread] Connection closed by localRank -1 +{'loss': 0.3362, 'learning_rate': 5.99775192843722e-06, 'epoch': 1.26} + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 721748225 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1868983913 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 
NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1886872682 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1347896394 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5788:6818 [3] NCCL INFO [Service thread] Connection closed by localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1886872682 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1868983913 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1347896394 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown 
command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 + + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 721748225 from localRank -1 + +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] NCCL INFO [Service thread] Connection closed by localRank -1 +vc-816648091-20231223-a1d29e4b-worker-1:5790:6689 [5] NCCL INFO [Service thread] Connection closed by localRank -1 +{'loss': 0.3316, 'learning_rate': 5.823792554795738e-06, 'epoch': 1.27} +{'loss': 0.3406, 'learning_rate': 5.6513507704059835e-06, 'epoch': 1.29} +{'loss': 0.3266, 'learning_rate': 5.480489238773536e-06, 'epoch': 1.3} +{'loss': 0.3393, 'learning_rate': 5.311270049156967e-06, 'epoch': 1.31} +{'loss': 0.3286, 'learning_rate': 5.14375469400529e-06, 'epoch': 1.32} +{'loss': 0.3338, 'learning_rate': 4.978004046612224e-06, 'epoch': 1.33} +{'loss': 0.3247, 'learning_rate': 4.814078338995516e-06, 'epoch': 1.35} +{'loss': 0.3184, 'learning_rate': 4.652037140009259e-06, 'epoch': 1.36} +{'loss': 0.3297, 'learning_rate': 4.491939333697205e-06, 'epoch': 1.37} +{'loss': 0.3378, 'learning_rate': 4.333843097894932e-06, 'epoch': 1.38} +{'loss': 0.3426, 'learning_rate': 4.177805883088641e-06, 'epoch': 1.39} +{'loss': 0.3354, 'learning_rate': 4.023884391538244e-06, 'epoch': 1.41} +{'loss': 0.3348, 'learning_rate': 3.8721345566724156e-06, 'epoch': 1.42} +{'loss': 0.3314, 'learning_rate': 3.722611522762917e-06, 'epoch': 1.43} +{'loss': 0.3358, 'learning_rate': 3.575369624885803e-06, 'epoch': 1.44} +{'loss': 0.3258, 'learning_rate': 3.4304623691766193e-06, 'epoch': 1.46} +{'loss': 0.3296, 'learning_rate': 3.287942413386841e-06, 'epoch': 1.47} +{'loss': 0.3364, 'learning_rate': 3.147861547748612e-06, 'epoch': 1.48} +{'loss': 0.3206, 'learning_rate': 3.0102706761547264e-06, 'epoch': 1.49} +{'loss': 0.3214, 'learning_rate': 2.875219797660681e-06, 'epoch': 1.5} +{'loss': 0.3307, 'learning_rate': 2.7427579883155895e-06, 'epoch': 1.52} +{'loss': 0.33, 'learning_rate': 2.612933383328432e-06, 'epoch': 1.53} +{'loss': 0.3343, 'learning_rate': 2.4857931595762406e-06, 'epoch': 1.54} +{'loss': 0.3258, 'learning_rate': 2.3613835184605527e-06, 'epoch': 1.55} +{'loss': 0.3277, 'learning_rate': 2.239749669118272e-06, 'epoch': 1.56} +{'loss': 0.3368, 'learning_rate': 2.1209358119931845e-06, 'epoch': 1.58} +{'loss': 0.3297, 'learning_rate': 2.0049851227739746e-06, 'epoch': 1.59} +{'loss': 0.3367, 'learning_rate': 
1.891939736704641e-06, 'epoch': 1.6} +{'loss': 0.3344, 'learning_rate': 1.7818407332729914e-06, 'epoch': 1.61} +{'loss': 0.3388, 'learning_rate': 1.6747281212828193e-06, 'epoch': 1.63} +{'loss': 0.3333, 'learning_rate': 1.570640824315095e-06, 'epoch': 1.64} +{'loss': 0.3313, 'learning_rate': 1.4696166665835853e-06, 'epoch': 1.65} +{'loss': 0.3355, 'learning_rate': 1.3716923591899167e-06, 'epoch': 1.66} +{'loss': 0.3287, 'learning_rate': 1.2769034867831588e-06, 'epoch': 1.67} +{'loss': 0.3294, 'learning_rate': 1.1852844946287434e-06, 'epoch': 1.69} +{'loss': 0.3369, 'learning_rate': 1.0968686760914248e-06, 'epoch': 1.7} +{'loss': 0.3289, 'learning_rate': 1.0116881605368112e-06, 'epoch': 1.71} +{'loss': 0.3242, 'learning_rate': 9.297739016559226e-07, 'epoch': 1.72} +{'loss': 0.3295, 'learning_rate': 8.511556662169218e-07, 'epoch': 1.73} +{'loss': 0.3257, 'learning_rate': 7.758620232482083e-07, 'epoch': 1.75} +{'loss': 0.3278, 'learning_rate': 7.039203336567247e-07, 'epoch': 1.76} +{'loss': 0.3199, 'learning_rate': 6.353567402853056e-07, 'epoch': 1.77} +{'loss': 0.3378, 'learning_rate': 5.701961584126392e-07, 'epoch': 1.78} +{'loss': 0.3218, 'learning_rate': 5.084622666993244e-07, 'epoch': 1.8} +{'loss': 0.326, 'learning_rate': 4.501774985832974e-07, 'epoch': 1.81} +{'loss': 0.3359, 'learning_rate': 3.953630341277603e-07, 'epoch': 1.82} +{'loss': 0.3304, 'learning_rate': 3.440387923245714e-07, 'epoch': 1.83} +{'loss': 0.3239, 'learning_rate': 2.9622342385589256e-07, 'epoch': 1.84} +{'loss': 0.333, 'learning_rate': 2.519343043167399e-07, 'epoch': 1.86} +{'loss': 0.316, 'learning_rate': 2.111875279008657e-07, 'epoch': 1.87} +{'loss': 0.3345, 'learning_rate': 1.7399790155230633e-07, 'epoch': 1.88} +{'loss': 0.3284, 'learning_rate': 1.4037893958469994e-07, 'epoch': 1.89} +{'loss': 0.3235, 'learning_rate': 1.1034285877032147e-07, 'epoch': 1.9} +{'loss': 0.3304, 'learning_rate': 8.390057390064266e-08, 'epoch': 1.92} +{'loss': 0.3197, 'learning_rate': 6.10616938200137e-08, 'epoch': 1.93} +{'loss': 0.3394, 'learning_rate': 4.183451793390747e-08, 'epoch': 1.94} +{'loss': 0.3267, 'learning_rate': 2.6226033193007538e-08, 'epoch': 1.95} +{'loss': 0.3329, 'learning_rate': 1.424191155422583e-08, 'epoch': 1.96} +{'loss': 0.3307, 'learning_rate': 5.886507919570239e-09, 'epoch': 1.98} +{'loss': 0.3305, 'learning_rate': 1.1628585536216374e-09, 'epoch': 1.99} +[2023-12-25 23:09:26,271] [INFO] [torch_checkpoint_engine.py:21:save] [Torch] Saving /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/zero_pp_rank_8_mp_rank_00_model_states.pt... +[2023-12-25 23:09:26,293] [INFO] [torch_checkpoint_engine.py:23:save] [Torch] Saved /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/zero_pp_rank_8_mp_rank_00_model_states.pt. +[2023-12-25 23:09:28,022] [INFO] [torch_checkpoint_engine.py:21:save] [Torch] Saving /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt... +[2023-12-25 23:10:00,280] [INFO] [torch_checkpoint_engine.py:23:save] [Torch] Saved /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt. +[2023-12-25 23:10:00,280] [INFO] [engine.py:3285:_save_zero_checkpoint] zero checkpoint saved /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt +[2023-12-25 23:10:01,074] [INFO] [torch_checkpoint_engine.py:33:commit] [Torch] Checkpoint global_step1648 is ready now! 
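The checkpoint lines above show DeepSpeed writing per-rank ZeRO shards (zero_pp_rank_*_model_states.pt and bf16_zero_pp_rank_*_optim_states.pt) under checkpoint-1648/global_step1648. If a single consolidated fp32 state dict is needed later, DeepSpeed's zero_to_fp32 utilities can gather the shards; the sketch below is illustrative only, assumes a DeepSpeed version that exposes get_fp32_state_dict_from_zero_checkpoint, and reuses the checkpoint path from this log.

import torch
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

# Directory that contains the global_step1648 folder and the "latest" tag file.
ckpt_dir = "/group/20025/jiuding/ckpt/34B_final/checkpoint-1648"

# Reconstructs a full fp32 state dict on CPU from the ZeRO partitions
# (this materializes the entire model, so it needs enough host RAM).
state_dict = get_fp32_state_dict_from_zero_checkpoint(ckpt_dir)
torch.save(state_dict, "pytorch_model_fp32.bin")

The stage3.py:1898 warnings that follow repeatedly suggest adding get_accelerator().empty_cache() calls to the training loop when allocator cache flushes keep happening under memory pressure. The {'loss': ..., 'epoch': ...} progress lines in this log look like Hugging Face Trainer output, so one non-invasive way to follow that suggestion is a Trainer callback. The class below is a hedged sketch: the callback name and the once-every-N-steps policy are assumptions, not something this run used.

from transformers import TrainerCallback
from deepspeed.accelerator import get_accelerator

class EmptyCacheCallback(TrainerCallback):
    """Flush the allocator cache on every rank at a fixed step interval,
    as suggested by the DeepSpeed stage3 warnings in this log."""

    def __init__(self, every_n_steps: int = 50):
        self.every_n_steps = every_n_steps

    def on_step_end(self, args, state, control, **kwargs):
        # Runs in every process, so all ranks flush their caches at the same time.
        if state.global_step % self.every_n_steps == 0:
            get_accelerator().empty_cache()
        return control

# Usage (hypothetical): trainer.add_callback(EmptyCacheCallback(every_n_steps=50))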
+{'train_runtime': 72622.6275, 'train_samples_per_second': 5.811, 'train_steps_per_second': 0.023, 'train_loss': 0.40901411404308763, 'epoch': 2.0} +vc-816648091-20231223-a1d29e4b-worker-1:5787:6816 [0] NCCL INFO [Service thread] Connection closed by localRank 3 +vc-816648091-20231223-a1d29e4b-worker-1:5787:6688 [2] NCCL INFO [Service thread] Connection closed by localRank 3 +ccelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:07:35,810] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:08:18,743] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:09:01,272] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:09:44,857] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:10:28,810] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:11:11,783] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:11:54,934] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. 
if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.5097, 'learning_rate': 1.9818849412525294e-05, 'epoch': 0.12} +[2023-12-25 04:12:38,808] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:13:21,260] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:14:04,108] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:14:48,497] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:15:31,278] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:16:14,222] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:16:57,148] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:17:41,658] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:18:25,418] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:19:08,700] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.5015, 'learning_rate': 1.978094723358031e-05, 'epoch': 0.13} +[2023-12-25 04:19:51,656] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:20:35,397] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:21:18,717] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:22:01,877] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:22:45,731] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:23:29,392] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:24:12,922] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:24:56,113] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:25:39,219] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:26:22,270] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.5086, 'learning_rate': 1.9739490762711812e-05, 'epoch': 0.15} +[2023-12-25 04:27:06,282] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:27:49,383] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:28:32,601] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:29:15,418] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:29:58,691] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:30:42,211] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:31:25,499] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:32:08,573] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:32:52,240] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:33:35,298] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.497, 'learning_rate': 1.969449506475924e-05, 'epoch': 0.16} +[2023-12-25 04:34:19,245] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:35:01,743] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:35:43,835] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:36:27,537] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:37:10,894] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:37:53,539] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:38:37,980] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:39:21,292] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:40:05,958] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:40:48,988] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.5085, 'learning_rate': 1.9645976490679402e-05, 'epoch': 0.17} +[2023-12-25 04:41:32,036] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:42:15,603] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:42:58,842] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:43:41,390] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:44:24,847] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:45:06,945] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:45:49,905] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:46:34,006] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:47:18,929] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:48:01,762] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4994, 'learning_rate': 1.9593952671604737e-05, 'epoch': 0.18} +[2023-12-25 04:48:45,059] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:49:28,806] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:50:11,950] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:50:54,999] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:51:38,751] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:52:20,978] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:53:05,561] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:53:48,286] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:54:31,065] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:55:13,741] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4923, 'learning_rate': 1.953844251243633e-05, 'epoch': 0.19} +[2023-12-25 04:55:56,119] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:56:39,847] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:57:22,921] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:58:06,549] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:58:49,698] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 04:59:32,254] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:00:15,391] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:00:58,002] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:01:40,714] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:02:24,091] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4887, 'learning_rate': 1.947946618497407e-05, 'epoch': 0.21} +[2023-12-25 05:03:07,001] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:03:50,315] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:04:34,016] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:05:18,338] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:06:01,471] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:06:45,104] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:07:28,609] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:08:12,146] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:08:55,572] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:09:38,859] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.496, 'learning_rate': 1.941704512058646e-05, 'epoch': 0.22} +[2023-12-25 05:10:23,065] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:11:06,541] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:11:49,857] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:12:33,755] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:13:18,334] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:14:01,985] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:14:46,052] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:15:28,870] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:16:12,294] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:16:56,371] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.5039, 'learning_rate': 1.9351202002422654e-05, 'epoch': 0.23} +[2023-12-25 05:17:38,523] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:18:21,734] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:19:05,908] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:19:49,385] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:20:32,761] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:21:16,215] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:21:59,162] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:22:41,937] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:23:25,656] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:24:09,175] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4888, 'learning_rate': 1.928196075716966e-05, 'epoch': 0.24} +[2023-12-25 05:24:52,031] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:25:35,103] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:26:18,833] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:27:01,899] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:27:46,773] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:28:29,934] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:29:13,070] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:29:55,671] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:30:38,657] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:31:22,185] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4972, 'learning_rate': 1.920934654635764e-05, 'epoch': 0.25} +[2023-12-25 05:32:06,278] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:32:49,575] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:33:32,238] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:34:14,974] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:34:57,327] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:35:40,620] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:36:23,885] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:37:07,390] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:37:50,564] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:38:33,681] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4968, 'learning_rate': 1.9133385757216458e-05, 'epoch': 0.27} +[2023-12-25 05:39:18,155] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:40:01,570] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:40:45,716] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:41:28,646] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:42:11,578] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:42:54,665] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:43:38,431] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:44:21,836] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:45:06,385] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:45:50,259] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4952, 'learning_rate': 1.905410599308687e-05, 'epoch': 0.28} +[2023-12-25 05:46:32,984] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:47:16,183] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:47:59,701] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:48:42,494] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:49:24,979] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:50:08,006] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:50:51,166] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:51:33,940] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:52:18,164] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:53:00,940] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4831, 'learning_rate': 1.8971536063389745e-05, 'epoch': 0.29} +[2023-12-25 05:53:45,022] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:54:28,095] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:55:10,698] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:55:53,707] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:56:38,173] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:57:21,241] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:58:03,580] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:58:48,882] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 05:59:32,541] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:00:16,902] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4836, 'learning_rate': 1.888570597315703e-05, 'epoch': 0.3} +[2023-12-25 06:00:59,613] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:01:42,516] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:02:26,207] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:03:10,000] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:03:53,040] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:04:36,916] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:05:19,756] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:06:02,594] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:06:47,340] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:07:30,421] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4873, 'learning_rate': 1.8796646912128247e-05, 'epoch': 0.32} +[2023-12-25 06:08:13,679] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:08:56,696] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:09:40,388] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:10:23,227] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:11:07,279] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:11:50,537] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:12:33,978] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:13:18,449] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:14:01,601] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:14:46,276] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4773, 'learning_rate': 1.8704391243416478e-05, 'epoch': 0.33} +[2023-12-25 06:15:29,234] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:16:13,080] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:16:56,874] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:17:40,286] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:18:23,946] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:19:06,612] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:19:49,462] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:20:31,794] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:21:15,286] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:21:59,008] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.49, 'learning_rate': 1.8608972491747946e-05, 'epoch': 0.34} +[2023-12-25 06:22:42,819] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:23:26,762] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:24:09,594] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:24:52,597] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:25:35,999] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:26:18,889] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:27:01,986] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:27:46,686] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:28:30,050] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:29:13,976] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4892, 'learning_rate': 1.8510425331279488e-05, 'epoch': 0.35} +[2023-12-25 06:29:57,142] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:30:39,361] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:31:22,209] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:32:06,349] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:32:49,803] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 06:33:33,219] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
+{'loss': 0.4794, 'learning_rate': 1.8408785572998335e-05, 'epoch': 0.36}
+{'loss': 0.4829, 'learning_rate': 1.8304090151708797e-05, 'epoch': 0.38}
+{'loss': 0.4814, 'learning_rate': 1.8196377112610524e-05, 'epoch': 0.39}
+{'loss': 0.4871, 'learning_rate': 1.808568559747331e-05, 'epoch': 0.4}
+{'loss': 0.4824, 'learning_rate': 1.7972055830413372e-05, 'epoch': 0.41}
+{'loss': 0.483, 'learning_rate': 1.7855529103276337e-05, 'epoch': 0.42}
+{'loss': 0.4701, 'learning_rate': 1.773614776063225e-05, 'epoch': 0.44}
+{'loss': 0.4859, 'learning_rate': 1.761395518438797e-05, 'epoch': 0.45}
+{'loss': 0.4622, 'learning_rate': 1.7488995778022687e-05, 'epoch': 0.46}
+{'loss': 0.4784, 'learning_rate': 1.7361314950452137e-05, 'epoch': 0.47}
+{'loss': 0.4888, 'learning_rate': 1.7230959099527512e-05, 'epoch': 0.49}
+{'loss': 0.4734, 'learning_rate': 1.709797559517496e-05, 'epoch': 0.5}
+{'loss': 0.4801, 'learning_rate': 1.6962412762181867e-05, 'epoch': 0.51}
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:10:00,008] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4895, 'learning_rate': 1.6824319862636137e-05, 'epoch': 0.52} +[2023-12-25 08:10:44,543] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:11:27,412] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:12:10,049] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:12:53,441] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:13:37,496] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:14:20,739] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:15:03,110] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:15:46,710] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:16:29,707] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:17:13,384] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4671, 'learning_rate': 1.6683747078024887e-05, 'epoch': 0.53} +[2023-12-25 08:17:56,446] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:18:39,201] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:19:21,497] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:20:06,406] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:20:49,089] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:21:32,086] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:22:14,897] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:22:57,754] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:23:40,885] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:24:24,536] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4881, 'learning_rate': 1.654074549099901e-05, 'epoch': 0.55} +[2023-12-25 08:25:07,602] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:25:51,003] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:26:34,197] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:27:18,652] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:28:02,102] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:28:46,422] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:29:29,501] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:30:12,457] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:30:56,262] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:31:40,027] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4781, 'learning_rate': 1.6395367066810312e-05, 'epoch': 0.56} +[2023-12-25 08:32:23,484] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:33:06,514] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:33:48,940] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:34:32,562] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:35:15,237] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:35:58,354] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:36:41,686] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:37:24,819] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:38:07,947] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:38:50,182] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.488, 'learning_rate': 1.6247664634427866e-05, 'epoch': 0.57} +[2023-12-25 08:39:33,829] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:40:17,476] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:40:59,702] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:41:43,544] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:42:26,565] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:43:10,225] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:43:52,492] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:44:36,359] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:45:19,241] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:46:00,944] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4708, 'learning_rate': 1.6097691867340547e-05, 'epoch': 0.58} +[2023-12-25 08:46:44,391] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:47:27,908] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:48:11,291] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:48:53,815] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:49:38,912] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:50:23,159] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:51:07,236] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:51:50,357] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:52:33,376] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:53:16,957] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4811, 'learning_rate': 1.5945503264052638e-05, 'epoch': 0.59} +[2023-12-25 08:53:59,693] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:54:43,468] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:55:26,245] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:56:08,979] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:56:51,642] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:57:35,081] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:58:18,830] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:59:01,529] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 08:59:46,272] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:00:29,787] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4651, 'learning_rate': 1.5791154128279694e-05, 'epoch': 0.61} +[2023-12-25 09:01:13,817] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:01:55,957] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:02:39,176] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:03:21,380] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:04:05,844] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:04:49,690] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:05:32,910] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:06:15,246] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:06:58,810] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:07:41,529] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.463, 'learning_rate': 1.5634700548851713e-05, 'epoch': 0.62} +[2023-12-25 09:08:24,842] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:09:07,989] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:09:50,377] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:10:33,638] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:11:16,939] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:12:00,245] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:12:42,825] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:13:25,883] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:14:08,539] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:14:50,807] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4769, 'learning_rate': 1.547619937933108e-05, 'epoch': 0.63} +[2023-12-25 09:15:33,341] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:16:16,998] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:16:59,515] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:17:42,561] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:18:25,762] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:19:08,472] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 09:19:52,212] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time
+[2023-12-25 09:20:35,682] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time
[... the same [stage3.py:1898:step] allocator-cache-flush warning repeats once per optimizer step (2-4 flushes reported per step) for the rest of this span; the duplicated copies are omitted and only the interleaved training-progress entries are kept below ...]
+{'loss': 0.4791, 'learning_rate': 1.53157082173526e-05, 'epoch': 0.64}
+{'loss': 0.4905, 'learning_rate': 1.5153285383693091e-05, 'epoch': 0.65}
+{'loss': 0.4607, 'learning_rate': 1.4988989901078286e-05, 'epoch': 0.67}
+{'loss': 0.4764, 'learning_rate': 1.4822881472734563e-05, 'epoch': 0.68}
+{'loss': 0.489, 'learning_rate': 1.4655020460693452e-05, 'epoch': 0.69}
+{'loss': 0.4744, 'learning_rate': 1.4485467863856704e-05, 'epoch': 0.7}
+{'loss': 0.4722, 'learning_rate': 1.4314285295829957e-05, 'epoch': 0.72}
+{'loss': 0.4588, 'learning_rate': 1.4141534962532986e-05, 'epoch': 0.73}
+{'loss': 0.4779, 'learning_rate': 1.3967279639594753e-05, 'epoch': 0.74}
+{'loss': 0.4629, 'learning_rate': 1.3791582649541404e-05, 'epoch': 0.75}
+{'loss': 0.4826, 'learning_rate': 1.3614507838785547e-05, 'epoch': 0.76}
+{'loss': 0.4615, 'learning_rate': 1.3436119554425133e-05, 'epoch': 0.78}
+{'loss': 0.476, 'learning_rate': 1.3256482620860415e-05, 'epoch': 0.79}
+{'loss': 0.4746, 'learning_rate': 1.3075662316237466e-05, 'epoch': 0.8}
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:01:38,192] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:02:20,640] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:03:04,000] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4627, 'learning_rate': 1.2893724348726757e-05, 'epoch': 0.81} +[2023-12-25 11:03:47,733] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:04:31,018] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:05:13,923] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:05:56,926] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:06:39,729] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:07:23,154] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:08:06,682] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:08:50,767] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:09:33,995] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:10:18,128] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4645, 'learning_rate': 1.2710734832645557e-05, 'epoch': 0.82} +[2023-12-25 11:10:59,747] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:11:42,597] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:12:26,205] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:13:08,664] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:13:50,922] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:14:33,374] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:15:16,126] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:15:59,249] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:16:41,951] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:17:25,233] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.472, 'learning_rate': 1.2526760264432658e-05, 'epoch': 0.84} +[2023-12-25 11:18:08,645] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:18:51,383] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:19:35,253] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:20:19,354] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:21:02,481] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:21:46,268] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:22:29,244] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:23:12,751] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:23:56,667] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:24:40,041] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4653, 'learning_rate': 1.2341867498484303e-05, 'epoch': 0.85} +[2023-12-25 11:25:22,503] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:26:06,498] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:26:49,069] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:27:31,448] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:28:14,677] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:28:57,389] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:29:40,244] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:30:23,773] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:31:06,418] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:31:48,985] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4713, 'learning_rate': 1.2156123722859989e-05, 'epoch': 0.86} +[2023-12-25 11:32:32,455] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:33:15,660] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:33:57,885] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:34:41,433] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:35:24,112] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:36:06,621] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:36:48,904] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:37:32,470] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:38:15,991] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:38:58,844] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4661, 'learning_rate': 1.1969596434867063e-05, 'epoch': 0.87} +[2023-12-25 11:39:41,428] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:40:24,710] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:41:07,755] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:41:50,679] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:42:34,245] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:43:18,212] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:44:00,785] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:44:43,525] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:45:26,386] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:46:10,065] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4761, 'learning_rate': 1.1782353416532908e-05, 'epoch': 0.89} +[2023-12-25 11:46:52,921] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:47:36,255] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:48:18,839] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:49:01,905] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:49:46,255] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:50:28,324] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:51:11,647] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:51:55,183] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:52:38,881] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:53:21,478] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4617, 'learning_rate': 1.1594462709973684e-05, 'epoch': 0.9} +[2023-12-25 11:54:05,592] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:54:48,900] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:55:31,777] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:56:14,309] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:56:57,235] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:57:40,677] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:58:24,373] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:59:07,472] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 11:59:51,024] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:00:33,479] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4595, 'learning_rate': 1.140599259266854e-05, 'epoch': 0.91} +[2023-12-25 12:01:18,641] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:02:01,402] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:02:45,578] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:03:28,801] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:04:11,502] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:04:54,381] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:05:38,471] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:06:20,779] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:07:05,596] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:07:47,819] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4677, 'learning_rate': 1.1217011552648316e-05, 'epoch': 0.92} +[2023-12-25 12:08:31,170] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:09:13,852] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:09:56,683] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:10:39,661] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:11:22,466] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:12:06,336] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:12:49,400] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:13:33,018] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:14:15,691] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:14:57,499] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4712, 'learning_rate': 1.102758826360772e-05, 'epoch': 0.93} +[2023-12-25 12:15:40,374] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:16:23,150] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:17:06,773] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:17:49,904] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:18:32,919] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:19:15,142] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:19:58,293] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:20:41,748] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:21:25,119] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:22:08,358] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4613, 'learning_rate': 1.0837791559950029e-05, 'epoch': 0.95} +[2023-12-25 12:22:51,485] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:23:34,602] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:24:18,409] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:25:01,391] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:25:45,397] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:26:28,954] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:27:12,252] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:27:55,699] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:28:39,243] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:29:23,182] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4598, 'learning_rate': 1.0647690411773415e-05, 'epoch': 0.96} +[2023-12-25 12:30:06,221] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:30:49,066] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:31:31,856] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:32:14,818] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:32:57,217] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:33:40,540] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:34:23,747] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:35:06,248] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:35:48,828] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:36:32,420] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4803, 'learning_rate': 1.0457353899807947e-05, 'epoch': 0.97} +[2023-12-25 12:37:16,287] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:37:58,875] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:38:42,269] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:39:26,139] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:40:10,089] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:40:53,660] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:41:39,029] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:42:21,096] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:43:04,029] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:43:48,170] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4549, 'learning_rate': 1.0266851190312375e-05, 'epoch': 0.98} +[2023-12-25 12:44:31,264] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:45:14,644] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:45:58,015] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:46:41,115] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:47:24,443] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:48:08,653] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:48:51,651] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:49:35,330] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:50:18,589] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:51:01,096] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.463, 'learning_rate': 1.0076251509939867e-05, 'epoch': 0.99} +[2023-12-25 12:51:44,596] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:52:28,082] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:53:10,327] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:53:53,664] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:56:04,292] [INFO] [logging.py:96:log_dist] [Rank 0] [Torch] Checkpoint global_step824 is about to be saved! +[2023-12-25 12:56:04,310] [INFO] [logging.py:96:log_dist] [Rank 0] Saving model checkpoint: /group/20025/jiuding/ckpt/34B_final/checkpoint-824/global_step824/zero_pp_rank_0_mp_rank_00_model_states.pt +[2023-12-25 12:56:04,310] [INFO] [torch_checkpoint_engine.py:21:save] [Torch] Saving /group/20025/jiuding/ckpt/34B_final/checkpoint-824/global_step824/zero_pp_rank_0_mp_rank_00_model_states.pt... +[2023-12-25 12:56:04,338] [INFO] [torch_checkpoint_engine.py:23:save] [Torch] Saved /group/20025/jiuding/ckpt/34B_final/checkpoint-824/global_step824/zero_pp_rank_0_mp_rank_00_model_states.pt. +[2023-12-25 12:56:04,389] [INFO] [torch_checkpoint_engine.py:21:save] [Torch] Saving /group/20025/jiuding/ckpt/34B_final/checkpoint-824/global_step824/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt... +[2023-12-25 12:56:34,712] [INFO] [torch_checkpoint_engine.py:23:save] [Torch] Saved /group/20025/jiuding/ckpt/34B_final/checkpoint-824/global_step824/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt. +[2023-12-25 12:56:34,737] [INFO] [engine.py:3285:_save_zero_checkpoint] zero checkpoint saved /group/20025/jiuding/ckpt/34B_final/checkpoint-824/global_step824/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt +[2023-12-25 12:56:37,503] [INFO] [torch_checkpoint_engine.py:33:commit] [Torch] Checkpoint global_step824 is ready now! +[2023-12-25 12:56:59,927] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
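
The checkpoint-824 lines above show the ZeRO-3 layout: under global_step824 each rank writes a small zero_pp_rank_*_mp_rank_00_model_states.pt file plus a much larger bf16_zero_pp_rank_*_optim_states.pt shard. DeepSpeed ships zero_to_fp32 helpers for turning such a directory back into a single consolidated fp32 state dict; a minimal sketch follows, with a placeholder path instead of the cluster path in this log. Note that this materializes the full fp32 weights on the host, roughly 4 bytes per parameter (on the order of 130+ GB for a 34B model).

# Sketch: consolidate a sharded ZeRO-3 checkpoint into one fp32 state dict.
# The path is a placeholder; point it at the checkpoint-824 directory that
# holds the global_step824/ subfolder, and name that tag explicitly.
import torch
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

ckpt_dir = "/path/to/checkpoint-824"
state_dict = get_fp32_state_dict_from_zero_checkpoint(ckpt_dir, tag="global_step824")
torch.save(state_dict, "pytorch_model_fp32.bin")
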
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:57:42,763] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:58:26,423] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:59:09,822] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 12:59:52,908] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:00:35,335] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.4129, 'learning_rate': 9.885624120581772e-06, 'epoch': 1.01} +[2023-12-25 13:01:18,783] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:02:01,518] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:02:44,414] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:03:28,582] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:04:11,966] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:04:54,926] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:05:39,291] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:06:22,023] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:07:05,755] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:07:48,827] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3427, 'learning_rate': 9.695038294198588e-06, 'epoch': 1.02} +[2023-12-25 13:08:32,085] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:09:14,540] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:09:56,690] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:10:39,896] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:11:23,524] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:12:06,265] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:12:48,510] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:13:30,844] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:14:13,973] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:14:57,898] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3457, 'learning_rate': 9.504563287647265e-06, 'epoch': 1.03} +[2023-12-25 13:15:41,083] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:16:24,532] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:17:07,305] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:17:49,975] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:18:32,820] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:19:15,714] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:19:58,471] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:20:42,567] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:21:25,834] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:22:08,930] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3401, 'learning_rate': 9.314268317514023e-06, 'epoch': 1.04} +[2023-12-25 13:22:52,154] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:23:35,628] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:24:19,389] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:25:03,019] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:25:46,648] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:26:29,774] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:27:12,472] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:27:55,250] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:28:38,431] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:29:22,026] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3354, 'learning_rate': 9.12422253496175e-06, 'epoch': 1.06} +[2023-12-25 13:30:06,191] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:30:49,691] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:31:32,511] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:32:15,623] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:32:58,920] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:33:42,034] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:34:25,205] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:35:07,951] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:35:50,709] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:36:34,254] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3391, 'learning_rate': 8.934495000601241e-06, 'epoch': 1.07} +[2023-12-25 13:37:18,830] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:38:02,023] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:38:46,366] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:39:29,368] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:40:12,755] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:40:56,401] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:41:39,728] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 13:42:23,604] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
+{'loss': 0.3428, 'learning_rate': 8.745154659395273e-06, 'epoch': 1.08}
+{'loss': 0.3339, 'learning_rate': 8.556270315604779e-06, 'epoch': 1.09}
+{'loss': 0.328, 'learning_rate': 8.36791060778608e-06, 'epoch': 1.1}
+{'loss': 0.3328, 'learning_rate': 8.180143983848388e-06, 'epoch': 1.12}
+{'loss': 0.3288, 'learning_rate': 7.993038676180546e-06, 'epoch': 1.13}
+{'loss': 0.3327, 'learning_rate': 7.806662676856134e-06, 'epoch': 1.14}
+{'loss': 0.3421, 'learning_rate': 7.62108371292584e-06, 'epoch': 1.15}
+{'loss': 0.3462, 'learning_rate': 7.436369221806201e-06, 'epoch': 1.16}
+{'loss': 0.3302, 'learning_rate': 7.2525863267735405e-06, 'epoch': 1.18}
+{'loss': 0.3278, 'learning_rate': 7.069801812572117e-06, 'epoch': 1.19}
+{'loss': 0.3402, 'learning_rate': 6.888082101145222e-06, 'epoch': 1.2}
+{'loss': 0.344, 'learning_rate': 6.707493227498187e-06, 'epoch': 1.21}
+{'loss': 0.3349, 'learning_rate': 6.5281008157019425e-06, 'epoch': 1.22}
+{'loss': 0.3379, 'learning_rate': 6.3499700550459554e-06, 'epoch': 1.24}
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:19:26,574] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:20:10,672] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:20:54,349] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:21:39,220] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:22:22,617] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:23:06,425] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:23:49,303] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:24:32,253] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:25:15,856] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3398, 'learning_rate': 6.173165676349103e-06, 'epoch': 1.25} +[2023-12-25 15:25:59,851] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:26:42,551] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:27:25,993] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:28:09,281] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:28:52,143] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:29:35,802] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:30:18,886] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:31:01,865] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:31:46,483] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:32:29,383] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3362, 'learning_rate': 5.99775192843722e-06, 'epoch': 1.26} +[2023-12-25 15:33:12,082] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:33:55,882] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:34:38,409] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:35:20,999] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:36:04,001] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:36:47,758] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:37:30,870] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:38:14,242] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:38:58,380] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:39:41,665] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3316, 'learning_rate': 5.823792554795738e-06, 'epoch': 1.27} +[2023-12-25 15:40:25,489] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:41:08,736] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:41:52,403] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:42:35,926] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:43:18,666] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:44:02,060] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:44:47,636] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:45:30,447] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:46:14,046] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:46:56,957] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3406, 'learning_rate': 5.6513507704059835e-06, 'epoch': 1.29} +[2023-12-25 15:47:40,089] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:48:23,434] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:49:05,971] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:49:48,555] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:50:32,314] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:51:15,847] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:51:58,596] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:52:41,940] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:53:25,318] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:54:08,443] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3266, 'learning_rate': 5.480489238773536e-06, 'epoch': 1.3} +[2023-12-25 15:54:51,644] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:55:34,863] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:56:18,792] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:57:01,707] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:57:46,036] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:58:29,590] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:59:12,197] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 15:59:55,513] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:00:38,427] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:01:21,776] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3393, 'learning_rate': 5.311270049156967e-06, 'epoch': 1.31} +[2023-12-25 16:02:03,968] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:02:49,823] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:03:33,122] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:04:15,799] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:04:58,834] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:05:42,236] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:06:25,591] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:07:09,487] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:07:52,539] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:08:35,230] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3286, 'learning_rate': 5.14375469400529e-06, 'epoch': 1.32} +[2023-12-25 16:09:18,521] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:10:01,236] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:10:46,299] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:11:28,743] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:12:11,468] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:12:54,738] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:13:38,160] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:14:21,178] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:15:05,204] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:15:48,494] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3338, 'learning_rate': 4.978004046612224e-06, 'epoch': 1.33} +[2023-12-25 16:16:32,287] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:17:15,608] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:17:59,217] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:18:43,352] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:19:26,435] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:20:09,873] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:20:53,469] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:21:36,566] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:22:22,115] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:23:07,567] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3247, 'learning_rate': 4.814078338995516e-06, 'epoch': 1.35} +[2023-12-25 16:23:53,339] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:24:40,636] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:25:27,063] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:26:13,452] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:26:58,519] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:27:44,626] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:28:31,186] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:29:16,354] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:30:06,415] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:30:54,832] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3184, 'learning_rate': 4.652037140009259e-06, 'epoch': 1.36} +[2023-12-25 16:31:44,451] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:32:33,878] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:33:22,397] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:34:11,587] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:35:00,682] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:35:47,494] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:36:37,170] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:37:27,229] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:38:14,504] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:39:03,484] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3297, 'learning_rate': 4.491939333697205e-06, 'epoch': 1.37} +[2023-12-25 16:39:53,213] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:40:40,886] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 16:41:29,828] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
+[2023-12-25 16:42:20,240] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step (same cache-flush warning as above)
+vc-816648091-20231223-a1d29e4b-worker-0:5797:6826 [3] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 (similar warnings repeated for command ids 542393671, 16843542, 721748225, 1347896394, 1868983913 and 1886872682)
+vc-816648091-20231223-a1d29e4b-worker-0:5797:6826 [3] NCCL INFO [Service thread] Connection closed by localRank -1 (message repeated 3 times)
+[2023-12-25 16:43:08 to 16:46:24] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2-3 pytorch allocator cache flushes per step)
+vc-816648091-20231223-a1d29e4b-worker-0:5798:6693 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 (similar warnings repeated for command ids 542393671, 16843542, 721748225, 1347896394, 1868983913 and 1886872682)
+vc-816648091-20231223-a1d29e4b-worker-0:5798:6693 [4] NCCL INFO [Service thread] Connection closed by localRank -1 (message repeated 3 times)
+[2023-12-25 16:47:13,817] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step (same cache-flush warning as above)
+{'loss': 0.3378, 'learning_rate': 4.333843097894932e-06, 'epoch': 1.38}
+[2023-12-25 16:48:02 to 16:54:53] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2-3 pytorch allocator cache flushes per step)
+{'loss': 0.3426, 'learning_rate': 4.177805883088641e-06, 'epoch': 1.39}
+[2023-12-25 16:55:38 to 16:57:09] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2-4 pytorch allocator cache flushes per step)
+vc-816648091-20231223-a1d29e4b-worker-0:5798:6824 [4] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 16843542 from localRank -1 (similar warnings repeated for command ids 542393671, 16843542, 721748225, 1347896394, 1868983913 and 1886872682)
+vc-816648091-20231223-a1d29e4b-worker-0:5798:6824 [4] NCCL INFO [Service thread] Connection closed by localRank -1 (message repeated 3 times)
+[2023-12-25 16:57:54 to 17:02:24] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2-4 pytorch allocator cache flushes per step)
+{'loss': 0.3354, 'learning_rate': 4.023884391538244e-06, 'epoch': 1.41}
+vc-816648091-20231223-a1d29e4b-worker-0:5799:6825 [5] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 1886872682 from localRank -1 (similar warnings repeated for command ids 542393671, 16843542, 721748225, 1347896394, 1868983913 and 1886872682)
+vc-816648091-20231223-a1d29e4b-worker-0:5799:6825 [5] NCCL INFO [Service thread] Connection closed by localRank -1 (message repeated 3 times)
+[2023-12-25 17:03:09 to 17:08:27] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2-3 pytorch allocator cache flushes per step)
+[2023-12-25 17:09:12 to 17:09:58] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2-3 pytorch allocator cache flushes per step)
+{'loss': 0.3348, 'learning_rate': 3.8721345566724156e-06, 'epoch': 1.42}
+[2023-12-25 17:10:45,295] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step (same cache-flush warning as above)
+vc-816648091-20231223-a1d29e4b-worker-0:5795:6822 [1] proxy.cc:1111 NCCL WARN [Service thread] Unknown command 542393671 from localRank -1 (similar warnings repeated for command ids 542393671, 16843542, 721748225, 1347896394, 1868983913 and 1886872682)
+vc-816648091-20231223-a1d29e4b-worker-0:5795:6822 [1] NCCL INFO [Service thread] Connection closed by localRank -1 (message repeated 3 times)
+[2023-12-25 17:11:31 to 17:14:40] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2 pytorch allocator cache flushes per step)
+[2023-12-25 17:15:27 to 17:17:47] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2 pytorch allocator cache flushes per step)
+{'loss': 0.3314, 'learning_rate': 3.722611522762917e-06, 'epoch': 1.43}
+[2023-12-25 17:18:34 to 17:25:33] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2-4 pytorch allocator cache flushes per step)
+{'loss': 0.3358, 'learning_rate': 3.575369624885803e-06, 'epoch': 1.44}
+[2023-12-25 17:26:20 to 17:33:20] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2 pytorch allocator cache flushes per step)
+{'loss': 0.3258, 'learning_rate': 3.4304623691766193e-06, 'epoch': 1.46}
+[2023-12-25 17:34:06 to 17:36:29] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2 pytorch allocator cache flushes per step)
+[2023-12-25 17:37:16 to 17:41:08] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2 pytorch allocator cache flushes per step)
+{'loss': 0.3296, 'learning_rate': 3.287942413386841e-06, 'epoch': 1.47}
+[2023-12-25 17:41:53 to 17:47:15] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2-3 pytorch allocator cache flushes per step)
+[2023-12-25 17:48:01 to 17:48:46] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2 pytorch allocator cache flushes per step)
+{'loss': 0.3364, 'learning_rate': 3.147861547748612e-06, 'epoch': 1.48}
+[2023-12-25 17:49:31 to 17:56:19] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2 pytorch allocator cache flushes per step)
+{'loss': 0.3206, 'learning_rate': 3.0102706761547264e-06, 'epoch': 1.49}
+[2023-12-25 17:57:06 to 18:02:26] [WARNING] [stage3.py:1898:step] same cache-flush warning repeated at every step in this interval (2 pytorch allocator cache flushes per step)
+[2023-12-25 18:03:11,804] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption.
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:03:57,057] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3214, 'learning_rate': 2.875219797660681e-06, 'epoch': 1.5} +[2023-12-25 18:04:42,506] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:05:27,454] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:06:12,473] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:06:57,913] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:07:42,429] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:08:26,804] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:09:12,143] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:09:58,203] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:10:44,033] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:11:29,720] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3307, 'learning_rate': 2.7427579883155895e-06, 'epoch': 1.52} +[2023-12-25 18:12:15,160] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:12:59,236] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:13:43,565] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:14:27,959] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:15:13,542] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:15:58,873] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:16:44,396] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:17:28,789] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:18:12,919] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:18:57,534] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.33, 'learning_rate': 2.612933383328432e-06, 'epoch': 1.53} +[2023-12-25 18:19:42,034] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:20:27,491] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:21:12,371] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:21:57,517] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:22:41,877] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:23:27,279] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:24:12,559] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:24:56,777] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:25:41,745] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:26:27,173] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3343, 'learning_rate': 2.4857931595762406e-06, 'epoch': 1.54} +[2023-12-25 18:27:11,710] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:27:57,768] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:28:41,460] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:29:24,448] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:30:08,877] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:30:53,950] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:31:39,693] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:32:25,409] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:33:10,810] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:33:55,375] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3258, 'learning_rate': 2.3613835184605527e-06, 'epoch': 1.55} +[2023-12-25 18:34:40,012] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:35:25,251] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:36:09,768] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:36:55,127] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:37:40,554] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:38:26,167] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:39:10,810] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:39:55,638] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:40:41,006] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:41:25,548] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3277, 'learning_rate': 2.239749669118272e-06, 'epoch': 1.56} +[2023-12-25 18:42:10,451] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:42:56,144] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:43:41,094] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:44:25,795] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:45:10,748] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:45:55,963] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:46:40,211] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:47:26,245] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:48:12,141] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:48:57,283] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3368, 'learning_rate': 2.1209358119931845e-06, 'epoch': 1.58} +[2023-12-25 18:49:42,754] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:50:27,631] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:51:12,824] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:51:57,143] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:52:42,117] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:53:28,347] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:54:13,643] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:54:58,719] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:55:42,769] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:56:27,817] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3297, 'learning_rate': 2.0049851227739746e-06, 'epoch': 1.59} +[2023-12-25 18:57:12,583] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:57:57,659] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:58:42,390] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 18:59:27,832] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:00:13,070] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:00:57,725] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:01:41,145] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:02:26,676] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:03:11,700] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:03:56,458] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3367, 'learning_rate': 1.891939736704641e-06, 'epoch': 1.6} +[2023-12-25 19:04:41,056] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:05:26,425] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:06:11,617] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:06:56,720] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:07:41,222] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:08:27,409] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:09:12,240] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:09:57,346] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:10:42,222] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 19:11:27,407] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time
+{'loss': 0.3344, 'learning_rate': 1.7818407332729914e-06, 'epoch': 1.61}
+[2023-12-25 19:12:11,459] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time
+[2023-12-25 19:12:55 - 20:09:14] [WARNING] [stage3.py:1898:step] the same cache-flush warning is emitted after every optimizer step in this window, reporting 2-5 pytorch allocator cache flushes since the last step
+{'loss': 0.3388, 'learning_rate': 1.6747281212828193e-06, 'epoch': 1.63}
+{'loss': 0.3333, 'learning_rate': 1.570640824315095e-06, 'epoch': 1.64}
+{'loss': 0.3313, 'learning_rate': 1.4696166665835853e-06, 'epoch': 1.65}
+{'loss': 0.3355, 'learning_rate': 1.3716923591899167e-06, 'epoch': 1.66}
+{'loss': 0.3287, 'learning_rate': 1.2769034867831588e-06, 'epoch': 1.67}
+{'loss': 0.3294, 'learning_rate': 1.1852844946287434e-06, 'epoch': 1.69}
+{'loss': 0.3369, 'learning_rate': 1.0968686760914248e-06, 'epoch': 1.7}
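Note: the repeated stage3.py warning above recommends calling get_accelerator().empty_cache() in the training loop so that all ranks flush at the same point. As a minimal sketch only, assuming a Hugging Face Trainer-based run like the one configured in this log (the callback class name and its wiring are illustrative, not taken from this run), that suggestion could be followed with a TrainerCallback:

from deepspeed.accelerator import get_accelerator
from transformers import TrainerCallback

class EmptyCacheCallback(TrainerCallback):
    # Illustrative (hypothetical) callback: flush the allocator cache on every rank
    # right after each optimizer step, as the stage3.py warning suggests.
    def on_step_end(self, args, state, control, **kwargs):
        # empty_cache() on the DeepSpeed accelerator abstraction maps to
        # torch.cuda.empty_cache() on CUDA; running it at the same point in the
        # loop on all ranks keeps their cache flushes synchronized.
        get_accelerator().empty_cache()
        return control

# trainer = Seq2SeqTrainer(..., callbacks=[EmptyCacheCallback()])  # hypothetical wiring

Frequent empty_cache() calls add overhead, so the warning frames this as a fallback once memory consumption cannot be reduced further.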
+[2023-12-25 20:09:59 - 21:00:55] [WARNING] [stage3.py:1898:step] the same cache-flush warning continues after every optimizer step in this window, reporting 2-4 pytorch allocator cache flushes since the last step
+{'loss': 0.3289, 'learning_rate': 1.0116881605368112e-06, 'epoch': 1.71}
+{'loss': 0.3242, 'learning_rate': 9.297739016559226e-07, 'epoch': 1.72}
+{'loss': 0.3295, 'learning_rate': 8.511556662169218e-07, 'epoch': 1.73}
+{'loss': 0.3257, 'learning_rate': 7.758620232482083e-07, 'epoch': 1.75}
+{'loss': 0.3278, 'learning_rate': 7.039203336567247e-07, 'epoch': 1.76}
+{'loss': 0.3199, 'learning_rate': 6.353567402853056e-07, 'epoch': 1.77}
+{'loss': 0.3378, 'learning_rate': 5.701961584126392e-07, 'epoch': 1.78}
+[2023-12-25 21:01:40,977] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption.
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:02:26,389] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:03:10,901] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:03:56,719] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3218, 'learning_rate': 5.084622666993244e-07, 'epoch': 1.8} +[2023-12-25 21:04:41,935] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:05:26,916] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:06:13,137] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:06:58,749] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:07:42,692] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:08:28,212] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:09:13,477] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:09:58,232] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:10:43,603] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:11:28,674] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.326, 'learning_rate': 4.501774985832974e-07, 'epoch': 1.81} +[2023-12-25 21:12:13,242] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:12:56,183] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:13:39,799] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:14:25,215] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:15:09,515] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:15:53,382] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:16:38,413] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:17:23,159] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:18:07,291] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:18:50,363] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3359, 'learning_rate': 3.953630341277603e-07, 'epoch': 1.82} +[2023-12-25 21:19:33,527] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:20:18,919] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:21:03,319] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:21:47,231] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:22:31,552] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:23:15,622] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:23:59,541] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:24:43,850] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:25:26,790] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:26:10,272] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3304, 'learning_rate': 3.440387923245714e-07, 'epoch': 1.83} +[2023-12-25 21:26:54,190] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:27:39,184] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:28:23,122] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:29:06,655] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:29:50,856] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:30:34,344] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:31:18,982] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:32:02,424] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:32:47,465] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:33:30,991] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3239, 'learning_rate': 2.9622342385589256e-07, 'epoch': 1.84} +[2023-12-25 21:34:14,464] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:34:58,417] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:35:41,862] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:36:26,018] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:37:10,720] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:37:54,889] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:38:39,626] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:39:23,406] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:40:07,194] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:40:51,371] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.333, 'learning_rate': 2.519343043167399e-07, 'epoch': 1.86} +[2023-12-25 21:41:35,429] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:42:19,014] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:43:02,858] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:43:48,066] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:44:32,240] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:45:16,507] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:45:59,785] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:46:42,879] [WARNING] [stage3.py:1898:step] 3 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:47:26,635] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:48:10,009] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.316, 'learning_rate': 2.111875279008657e-07, 'epoch': 1.87} +[2023-12-25 21:48:53,767] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:49:39,485] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:50:23,166] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:51:07,055] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:51:51,170] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:52:35,036] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:53:19,158] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:54:02,947] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:54:48,128] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:55:32,767] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3345, 'learning_rate': 1.7399790155230633e-07, 'epoch': 1.88} +[2023-12-25 21:56:17,450] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:57:01,624] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:57:45,084] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:58:29,026] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:59:12,801] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 21:59:57,014] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 22:00:40,807] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 22:01:25,178] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 22:02:09,391] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 22:02:53,901] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +{'loss': 0.3284, 'learning_rate': 1.4037893958469994e-07, 'epoch': 1.89} +[2023-12-25 22:03:38,472] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 22:04:22,462] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 22:05:06,162] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 22:05:51,489] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 22:06:35,183] [WARNING] [stage3.py:1898:step] 4 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 22:07:18,714] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 22:08:01,329] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time +[2023-12-25 22:08:45,585] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time
+[2023-12-25 22:09:29,828] [WARNING] [stage3.py:1898:step] 2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time
[... the same stage3.py:1898:step allocator cache-flush warning repeats at every optimizer step (roughly every 44 s) through 23:07:37, reporting mostly 2 and occasionally 3-4 flushes per step; the duplicated warnings are omitted here and only the interleaved training-log entries are kept, in order ...]
+{'loss': 0.3235, 'learning_rate': 1.1034285877032147e-07, 'epoch': 1.9}
+{'loss': 0.3304, 'learning_rate': 8.390057390064266e-08, 'epoch': 1.92}
+{'loss': 0.3197, 'learning_rate': 6.10616938200137e-08, 'epoch': 1.93}
+{'loss': 0.3394, 'learning_rate': 4.183451793390747e-08, 'epoch': 1.94}
+{'loss': 0.3267, 'learning_rate': 2.6226033193007538e-08, 'epoch': 1.95}
+{'loss': 0.3329, 'learning_rate': 1.424191155422583e-08, 'epoch': 1.96}
+{'loss': 0.3307, 'learning_rate': 5.886507919570239e-09, 'epoch': 1.98}
+{'loss': 0.3305, 'learning_rate': 1.1628585536216374e-09, 'epoch': 1.99}
+[2023-12-25 23:09:26,221] [INFO] [logging.py:96:log_dist] [Rank 0] [Torch] Checkpoint global_step1648 is about to be saved!
+[2023-12-25 23:09:27,918] [INFO] [logging.py:96:log_dist] [Rank 0] Saving model checkpoint: /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/zero_pp_rank_0_mp_rank_00_model_states.pt
+[2023-12-25 23:09:27,918] [INFO] [torch_checkpoint_engine.py:21:save] [Torch] Saving /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/zero_pp_rank_0_mp_rank_00_model_states.pt...
+[2023-12-25 23:09:27,942] [INFO] [torch_checkpoint_engine.py:23:save] [Torch] Saved /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/zero_pp_rank_0_mp_rank_00_model_states.pt.
+[2023-12-25 23:09:27,985] [INFO] [torch_checkpoint_engine.py:21:save] [Torch] Saving /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt...
+[2023-12-25 23:09:59,171] [INFO] [torch_checkpoint_engine.py:23:save] [Torch] Saved /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt.
+[2023-12-25 23:09:59,201] [INFO] [engine.py:3285:_save_zero_checkpoint] zero checkpoint saved /group/20025/jiuding/ckpt/34B_final/checkpoint-1648/global_step1648/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt
+[2023-12-25 23:10:01,039] [INFO] [torch_checkpoint_engine.py:33:commit] [Torch] Checkpoint global_step1648 is ready now!
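The recurring stage3.py warning above points at get_accelerator().empty_cache() as a mitigation when the flushes cannot be avoided. A minimal sketch of how that call might be wired into a transformers Trainer run like this one is shown below; the class name EmptyCacheCallback and the flush_every interval are illustrative assumptions, not something taken from this log.

```python
# A minimal sketch (not from this run): flush the allocator cache on every rank
# at the same global step, as the stage3.py warning suggests.
from deepspeed.accelerator import get_accelerator
from transformers import TrainerCallback


class EmptyCacheCallback(TrainerCallback):
    """Hypothetical callback; the name and interval are illustrative only."""

    def __init__(self, flush_every: int = 50):
        self.flush_every = flush_every  # flush once every N optimizer steps

    def on_step_end(self, args, state, control, **kwargs):
        # state.global_step is identical on all ranks, so every rank empties
        # its cache at the same step, keeping the flushes synchronized.
        if state.global_step > 0 and state.global_step % self.flush_every == 0:
            get_accelerator().empty_cache()
        return control


# Usage (assuming `trainer` is the Seq2SeqTrainer driving this run):
# trainer.add_callback(EmptyCacheCallback(flush_every=50))
```

Flushing too often adds synchronization overhead, so a fairly large interval is usually preferable; reducing per-device memory pressure, as the first half of the warning suggests, is the more direct fix.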
+{'train_runtime': 72622.3244, 'train_samples_per_second': 5.811, 'train_steps_per_second': 0.023, 'train_loss': 0.40841181026500406, 'epoch': 2.0}
+***** train metrics *****
+  epoch                    =        2.0
+  train_loss               =     0.4084
+  train_runtime            = 20:10:22.32
+  train_samples_per_second =      5.811
+  train_steps_per_second   =      0.023
+12/25/2023 23:11:57 - INFO - matplotlib.font_manager - generated new fontManager
+Figure saved: /group/20025/jiuding/ckpt/34B_final/training_loss.svg
+12/25/2023 23:11:58 - WARNING - utils.other - No metric eval_loss to plot.
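The run ends with a ZeRO-3 partitioned checkpoint under checkpoint-1648/global_step1648. As a hedged sketch under that assumption (not part of the original log), DeepSpeed's zero_to_fp32 utility can typically consolidate those per-rank shards into a single fp32 state dict; the output filename below is arbitrary.

```python
# A hedged sketch (not from this run): consolidate the ZeRO-3 partitioned
# checkpoint logged above into a single fp32 state dict on CPU.
import torch
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

checkpoint_dir = "/group/20025/jiuding/ckpt/34B_final/checkpoint-1648"

# Gathers the partitioned shards (zero_pp_rank_*_mp_rank_00_*.pt) and returns
# an ordinary PyTorch state dict keyed by parameter name. For a ~34B-parameter
# model this needs on the order of 140 GB of host RAM in fp32.
state_dict = get_fp32_state_dict_from_zero_checkpoint(
    checkpoint_dir, tag="global_step1648"
)
torch.save(state_dict, f"{checkpoint_dir}/pytorch_model_fp32.bin")
```

DeepSpeed usually also copies a standalone zero_to_fp32.py script into the checkpoint directory that performs the same consolidation from the command line.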