2023-01-29 15:37:40 | INFO | fairseq.distributed.utils | distributed init (rank 1): tcp://localhost:13784
2023-01-29 15:37:40 | INFO | fairseq.distributed.utils | distributed init (rank 3): tcp://localhost:13784
2023-01-29 15:37:40 | INFO | fairseq.distributed.utils | distributed init (rank 2): tcp://localhost:13784
2023-01-29 15:37:40 | INFO | fairseq.distributed.utils | distributed init (rank 0): tcp://localhost:13784
2023-01-29 15:37:41 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 1
2023-01-29 15:37:41 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 3
2023-01-29 15:37:41 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 2
2023-01-29 15:37:41 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 0
2023-01-29 15:37:41 | INFO | torch.distributed.distributed_c10d | Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes.
2023-01-29 15:37:41 | INFO | torch.distributed.distributed_c10d | Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes.
2023-01-29 15:37:41 | INFO | fairseq.distributed.utils | initialized host ubuntu as rank 1
2023-01-29 15:37:41 | INFO | fairseq.distributed.utils | initialized host ubuntu as rank 0
2023-01-29 15:37:41 | INFO | torch.distributed.distributed_c10d | Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes.
2023-01-29 15:37:41 | INFO | torch.distributed.distributed_c10d | Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes.
2023-01-29 15:37:41 | INFO | fairseq.distributed.utils | initialized host ubuntu as rank 3
2023-01-29 15:37:41 | INFO | fairseq.distributed.utils | initialized host ubuntu as rank 2
2023-01-29 15:37:45 | INFO | fairseq_cli.train | {'_name': None, 'common': {'_name': None, 'no_progress_bar': False, 'log_interval': 10, 'log_format': 'json', 'log_file': None, 'tensorboard_logdir': '/home/wangrui/projects/SpeechT5/experimental/s2c', 'wandb_project': None, 'azureml_logging': False, 'seed': 1, 'cpu': False, 'tpu': False, 'bf16': False, 'memory_efficient_bf16': False, 'fp16': True, 'memory_efficient_fp16': False, 'fp16_no_flatten_grads': False, 'fp16_init_scale': 128, 'fp16_scale_window': None, 'fp16_scale_tolerance': 0.0, 'on_cpu_convert_precision': False, 'min_loss_scale': 0.0001, 'threshold_loss_scale': None, 'amp': False, 'amp_batch_retries': 2, 'amp_init_scale': 128, 'amp_scale_window': None, 'user_dir': '/home/wangrui/projects/SpeechT5/SpeechT5/fairseq/examples/speecht5', 'empty_cache_freq': 0, 'all_gather_list_size': 16384, 'model_parallel_size': 1, 'quantization_config_path': None, 'profile': False, 'reset_logging': False, 'suppress_crashes': False, 'use_plasma_view': False, 'plasma_path': '/tmp/plasma'}, 'common_eval': {'_name': None, 'path': None, 'post_process': 'sentencepiece', 'quiet': False, 'model_overrides': '{}', 'results_path': None}, 'distributed_training': {'_name': None, 'distributed_world_size': 4, 'distributed_num_procs': 4, 'distributed_rank': 0, 'distributed_backend': 'nccl', 'distributed_init_method': 'tcp://localhost:13784', 'distributed_port': 0, 'device_id': 0, 'distributed_no_spawn': False, 'ddp_backend': 'legacy_ddp', 'ddp_comm_hook': 'none', 'bucket_cap_mb': 25, 'fix_batches_to_gpus': False, 'find_unused_parameters': True, 'fast_stat_sync': False, 'heartbeat_timeout': -1, 'broadcast_buffers': False, 'slowmo_momentum': None, 'slowmo_algorithm': 'LocalSGD', 'localsgd_frequency': 3, 'nprocs_per_node': 4, 'pipeline_model_parallel': False,
'pipeline_balance': None, 'pipeline_devices': None, 'pipeline_chunks': 0, 'pipeline_encoder_balance': None, 'pipeline_encoder_devices': None, 'pipeline_decoder_balance': None, 'pipeline_decoder_devices': None, 'pipeline_checkpoint': 'never', 'zero_sharding': 'none', 'fp16': True, 'memory_efficient_fp16': False, 'tpu': False, 'no_reshard_after_forward': False, 'fp32_reduce_scatter': False, 'cpu_offload': False, 'use_sharded_state': False}, 'dataset': {'_name': None, 'num_workers': 4, 'skip_invalid_size_inputs_valid_test': True, 'max_tokens': None, 'batch_size': 8, 'required_batch_size_multiple': 1, 'required_seq_len_multiple': 1, 'dataset_impl': None, 'data_buffer_size': 0, 'train_subset': 'train', 'valid_subset': 'valid', 'combine_valid_subsets': None, 'ignore_unused_valid_subsets': False, 'validate_interval': 1, 'validate_interval_updates': 0, 'validate_after_updates': 20000, 'fixed_validation_seed': None, 'disable_validation': False, 'max_tokens_valid': None, 'batch_size_valid': 8, 'max_valid_steps': None, 'curriculum': 0, 'gen_subset': 'test', 'num_shards': 1, 'shard_id': 0}, 'optimization': {'_name': None, 'max_epoch': 0, 'max_update': 60000, 'stop_time_hours': 0.0, 'clip_norm': 0.0, 'sentence_avg': False, 'update_freq': [2], 'lr': [1e-08], 'stop_min_lr': -1.0, 'use_bmuf': False}, 'checkpoint': {'_name': None, 'save_dir': '/home/wangrui/projects/SpeechT5/experimental/s2c', 'restore_file': 'checkpoint_last.pt', 'finetune_from_model': '/nfs-data/user1/PhDHub/ckpt/speecht5_base.pt', 'reset_dataloader': False, 'reset_lr_scheduler': False, 'reset_meters': False, 'reset_optimizer': False, 'optimizer_overrides': '{}', 'save_interval': 1, 'save_interval_updates': 10000, 'keep_interval_updates': -1, 'keep_interval_updates_pattern': -1, 'keep_last_epochs': -1, 'keep_best_checkpoints': -1, 'no_save': False, 'no_epoch_checkpoints': True, 'no_last_checkpoints': False, 'no_save_optimizer_state': False, 'best_checkpoint_metric': 's2c_accuracy', 
'maximize_best_checkpoint_metric': True, 'patience': -1, 'checkpoint_suffix': '', 'checkpoint_shard_count': 1, 'load_checkpoint_on_all_dp_ranks': False, 'write_checkpoints_asynchronously': False, 'model_parallel_size': 1}, 'bmuf': {'_name': None, 'block_lr': 1.0, 'block_momentum': 0.875, 'global_sync_iter': 50, 'warmup_iterations': 500, 'use_nbm': False, 'average_sync': False, 'distributed_world_size': 4}, 'generation': {'_name': None, 'beam': 5, 'nbest': 1, 'max_len_a': 0.0, 'max_len_b': 200, 'min_len': 1, 'match_source_len': False, 'unnormalized': False, 'no_early_stop': False, 'no_beamable_mm': False, 'lenpen': 1.0, 'unkpen': 0.0, 'replace_unk': None, 'sacrebleu': False, 'score_reference': False, 'prefix_size': 0, 'no_repeat_ngram_size': 0, 'sampling': False, 'sampling_topk': -1, 'sampling_topp': -1.0, 'constraints': None, 'temperature': 1.0, 'diverse_beam_groups': -1, 'diverse_beam_strength': 0.5, 'diversity_rate': -1.0, 'print_alignment': None, 'print_step': False, 'lm_path': None, 'lm_weight': 0.0, 'iter_decode_eos_penalty': 0.0, 'iter_decode_max_iter': 10, 'iter_decode_force_max_iter': False, 'iter_decode_with_beam': 1, 'iter_decode_with_external_reranker': False, 'retain_iter_history': False, 'retain_dropout': False, 'retain_dropout_modules': None, 'decoding_format': None, 'no_seed_provided': False}, 'eval_lm': {'_name': None, 'output_word_probs': False, 'output_word_stats': False, 'context_window': 0, 'softmax_batch': 9223372036854775807}, 'interactive': {'_name': None, 'buffer_size': 0, 'input': '-'}, 'model': Namespace(_name='t5_transformer_base_asr', activation_dropout=0.1, activation_fn='gelu', adam_betas=(0.9, 0.999), adam_eps=1e-08, adaptive_input=False, adaptive_softmax_cutoff=None, adaptive_softmax_dropout=0, all_gather_list_size=16384, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, arch='t5_transformer_base_asr', attention_dropout=0.1, azureml_logging=False, bart_weight=1.0, batch_ratio=None, batch_size=8, 
batch_size_valid=8, bce_loss_lambda=1.0, bce_pos_weight=5.0, bert_init=True, best_checkpoint_metric='s2c_accuracy', bf16=False, bpe=None, bpe_tokenizer=None, broadcast_buffers=False, bucket_cap_mb=25, ce_weight=1.0, checkpoint_shard_count=1, checkpoint_suffix='', clip_norm=0.0, codebook_prob=0.5, combine_valid_subsets=None, config_yaml='config.yaml', conv_bias=False, conv_channels=1024, conv_feature_layers='[(512,10,5)] + [(512,3,2)] * 4 + [(512,2,2)] * 2', conv_kernel_sizes='5,5', conv_pos=128, conv_pos_groups=16, cpu=False, cpu_offload=False, criterion='speecht5', ctc_weight=0.0, curriculum=0, data='/home/wangrui/projects/SpeechT5/manifest', data_buffer_size=0, dataset_impl=None, ddp_backend='legacy_ddp', ddp_comm_hook='none', dec_use_scaled_pos_enc=True, dec_weight=1.0, decoder_attention_heads=12, decoder_embed_dim=768, decoder_ffn_embed_dim=3072, decoder_input_dim=768, decoder_layerdrop=0.1, decoder_layers=6, decoder_learned_pos=False, decoder_max_relative_position=160, decoder_normalize_before=False, decoder_output_dim=768, device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, distributed_no_spawn=False, distributed_num_procs=4, distributed_port=0, distributed_rank=0, distributed_world_size=4, dprenet_dropout_rate=0.5, dprenet_layers=2, dprenet_units=256, dropout=0.1, empty_cache_freq=0, enable_padding=False, enc_use_scaled_pos_enc=True, encoder_attention_heads=12, encoder_attn_branch='identity,full', encoder_embed_dim=768, encoder_ffn_embed_dim=3072, encoder_layerdrop=0.05, encoder_layers=12, encoder_max_relative_position=160, encoder_normalize_before=False, encoder_reduction_factor=1, encoder_sliding_window_attn=None, encoder_speech_prenet='conv', eos=2, eprenet_conv_chans=0, eprenet_conv_filts=0, eprenet_conv_layers=0, eprenet_dropout_rate=0.0, extractor_mode='default', fast_stat_sync=False, feature_grad_mult=1.0, final_dim=256, find_unused_parameters=True, 
finetune_from_model='/nfs-data/user1/PhDHub/ckpt/speecht5_base.pt', finetune_from_modules=None, finetune_out_of_modules=None, fix_batches_to_gpus=False, fixed_validation_seed=None, fp16=True, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, fp32_reduce_scatter=False, freeze_decoder_updates=0, freeze_encoder_updates=0, gen_subset='test', guided_attn_loss_lambda=10.0, guided_attn_loss_sigma=0.4, heartbeat_timeout=-1, hubert_label_dir=None, hubert_labels=['km'], hubert_mask_length=10, hubert_weight=1.0, ignore_prefix_size=0, ignore_unused_valid_subsets=False, iid_noise_target=False, initial_decoder_alpha=1.0, initial_encoder_alpha=1.0, insert=0.0, keep_best_checkpoints=-1, keep_interval_updates=-1, keep_interval_updates_pattern=-1, keep_last_epochs=-1, label_rates=-1, label_smoothing=0.0, latent_dim=0, latent_groups=2, latent_temp=(2, 0.5, 0.999995), latent_vars=100, layer_norm_eps=1e-05, layer_norm_first=False, load_checkpoint_on_all_dp_ranks=False, localsgd_frequency=3, log_file=None, log_format='json', log_interval=10, log_keys=[], logit_temp=0.1, loss_type='L1', loss_weights=[0.1], lr=[1e-08], lr_period_updates=60000.0, lr_scheduler='triangular', lr_shrink=0.5, mask=0.3, mask_channel_length=64, mask_channel_min_space=1, mask_channel_other=0, mask_channel_prob=0.0, mask_channel_selection='static', mask_length='span-poisson', mask_min_space=1, mask_other=0, mask_prob=0.0, mask_random=0.1, mask_selection='static', max_distance=1280, max_epoch=0, max_lr=0.0002, max_speech_positions=4000, max_speech_sample_size=None, max_text_positions=600, max_tokens=None, max_tokens_valid=None, max_update=60000, max_valid_steps=None, maximize_best_checkpoint_metric=True, memory_efficient_bf16=False, memory_efficient_fp16=False, min_loss_scale=0.0001, min_speech_sample_size=None, model_parallel_size=1, modules_applied_guided_attn=('encoder-decoder',), modules_filter=None, no_epoch_checkpoints=True, no_freeze_encoder_layer=None, 
no_last_checkpoints=False, no_mask_channel_overlap=False, no_mask_overlap=False, no_progress_bar=False, no_reshard_after_forward=False, no_save=False, no_save_optimizer_state=False, no_scale_embedding=True, no_seed_provided=False, no_token_positional_embeddings=False, normalize=False, nprocs_per_node=4, num_buckets=320, num_heads_applied_guided_attn=2, num_layers_applied_guided_attn=2, num_shards=1, num_workers=4, on_cpu_convert_precision=False, optimizer='adam', optimizer_overrides='{}', pad=1, pad_audio=False, patience=-1, permute=0.0, permute_sentences=0.0, pipeline_balance=None, pipeline_checkpoint='never', pipeline_chunks=0, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_devices=None, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_model_parallel=False, plasma_path='/tmp/plasma', poisson_lambda=3.5, post_process='sentencepiece', postnet_chans=256, postnet_dropout_rate=0.5, postnet_filts=5, postnet_layers=5, pred_masked_weight=1.0, pred_nomask_weight=0.0, profile=False, quant_noise_pq=0, quantization_config_path=None, quantizer_depth=1, quantizer_factor=3, random_crop=False, reduction_factor=2, relative_position_embedding=True, replace_length=1, report_accuracy=True, required_batch_size_multiple=1, required_seq_len_multiple=1, reset_dataloader=False, reset_logging=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', rotate=0.0, sample_break_mode='eos', sample_rate=16000.0, sample_ratios=None, save_dir='/home/wangrui/projects/SpeechT5/experimental/s2c', save_interval=1, save_interval_updates=10000, scoring='bleu', se_decoder_input='previous_target', se_predict=None, seed=1, sentence_avg=False, shard_id=0, share_ctc_embed=False, share_input_output_embed=True, shorten_data_split_list='', shorten_method='none', shrink_min=False, sid_decoder_attn_dim=128, sid_embed_dim=128, sid_encoder_cls=None, sid_no_embed_postnet=True, sid_no_pooling_bn=True, 
sid_pooling_layer='decoder', sid_softmax_type='softmax', single_target=False, skip_invalid_size_inputs_valid_test=True, skip_masked=False, skip_nomask=False, slowmo_algorithm='LocalSGD', slowmo_momentum=None, softmax_easy_margin=False, softmax_margin=0.0, softmax_scale=1.0, spk_embed_dim=512, spk_embed_integration_type='pre', stop_min_lr=-1.0, stop_time_hours=0, subsample_stride='2,2', suppress_crashes=False, t5_task='s2c', target_glu=False, task='speecht5', tensorboard_logdir='/home/wangrui/projects/SpeechT5/experimental/s2c', threshold_loss_scale=None, tokenizer=None, tokens_per_sample=512, tpu=False, train_subset='train', transformer_dec_positional_dropout_rate=0.1, transformer_enc_positional_dropout_rate=0.1, unb_enc_layer=-1, unk=3, untie_final_proj=True, update_freq=[2], use_batch_norm=True, use_bmuf=False, use_codebook=False, use_conv_pos=True, use_guided_attn_loss=False, use_masking=True, use_old_adam=False, use_plasma_view=False, use_sent_enc_layer=True, use_sharded_state=False, use_sinc_pos=True, use_weighted_masking=False, user_dir='/home/wangrui/projects/SpeechT5/SpeechT5/fairseq/examples/speecht5', valid_subset='valid', validate_after_updates=20000, validate_interval=1, validate_interval_updates=0, wandb_project=None, weight_decay=0.1, wer_args=None, wer_kenlm_model=None, wer_lexicon=None, wer_lm_weight=2.0, wer_word_score=-1.0, write_checkpoints_asynchronously=False, zero_infinity=False, zero_sharding='none'), 'task': Namespace(_name='speecht5', activation_dropout=0.1, activation_fn='gelu', adam_betas=(0.9, 0.999), adam_eps=1e-08, adaptive_input=False, adaptive_softmax_cutoff=None, adaptive_softmax_dropout=0, all_gather_list_size=16384, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, arch='t5_transformer_base_asr', attention_dropout=0.1, azureml_logging=False, bart_weight=1.0, batch_ratio=None, batch_size=8, batch_size_valid=8, bce_loss_lambda=1.0, bce_pos_weight=5.0, bert_init=True, best_checkpoint_metric='s2c_accuracy', 
bf16=False, bpe=None, bpe_tokenizer=None, broadcast_buffers=False, bucket_cap_mb=25, ce_weight=1.0, checkpoint_shard_count=1, checkpoint_suffix='', clip_norm=0.0, codebook_prob=0.5, combine_valid_subsets=None, config_yaml='config.yaml', conv_bias=False, conv_channels=1024, conv_feature_layers='[(512,10,5)] + [(512,3,2)] * 4 + [(512,2,2)] * 2', conv_kernel_sizes='5,5', conv_pos=128, conv_pos_groups=16, cpu=False, cpu_offload=False, criterion='speecht5', ctc_weight=0.0, curriculum=0, data='/home/wangrui/projects/SpeechT5/manifest', data_buffer_size=0, dataset_impl=None, ddp_backend='legacy_ddp', ddp_comm_hook='none', dec_use_scaled_pos_enc=True, dec_weight=1.0, decoder_attention_heads=12, decoder_embed_dim=768, decoder_ffn_embed_dim=3072, decoder_input_dim=768, decoder_layerdrop=0.1, decoder_layers=6, decoder_learned_pos=False, decoder_max_relative_position=160, decoder_normalize_before=False, decoder_output_dim=768, device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, distributed_no_spawn=False, distributed_num_procs=4, distributed_port=0, distributed_rank=0, distributed_world_size=4, dprenet_dropout_rate=0.5, dprenet_layers=2, dprenet_units=256, dropout=0.1, empty_cache_freq=0, enable_padding=False, enc_use_scaled_pos_enc=True, encoder_attention_heads=12, encoder_attn_branch='identity,full', encoder_embed_dim=768, encoder_ffn_embed_dim=3072, encoder_layerdrop=0.05, encoder_layers=12, encoder_max_relative_position=160, encoder_normalize_before=False, encoder_reduction_factor=1, encoder_sliding_window_attn=None, encoder_speech_prenet='conv', eos=2, eprenet_conv_chans=0, eprenet_conv_filts=0, eprenet_conv_layers=0, eprenet_dropout_rate=0.0, extractor_mode='default', fast_stat_sync=False, feature_grad_mult=1.0, final_dim=256, find_unused_parameters=True, finetune_from_model='/nfs-data/user1/PhDHub/ckpt/speecht5_base.pt', finetune_from_modules=None, finetune_out_of_modules=None, fix_batches_to_gpus=False, 
fixed_validation_seed=None, fp16=True, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, fp32_reduce_scatter=False, freeze_decoder_updates=0, freeze_encoder_updates=0, gen_subset='test', guided_attn_loss_lambda=10.0, guided_attn_loss_sigma=0.4, heartbeat_timeout=-1, hubert_label_dir=None, hubert_labels=['km'], hubert_mask_length=10, hubert_weight=1.0, ignore_prefix_size=0, ignore_unused_valid_subsets=False, iid_noise_target=False, initial_decoder_alpha=1.0, initial_encoder_alpha=1.0, insert=0.0, keep_best_checkpoints=-1, keep_interval_updates=-1, keep_interval_updates_pattern=-1, keep_last_epochs=-1, label_rates=-1, label_smoothing=0.0, latent_dim=0, latent_groups=2, latent_temp=(2, 0.5, 0.999995), latent_vars=100, layer_norm_eps=1e-05, layer_norm_first=False, load_checkpoint_on_all_dp_ranks=False, localsgd_frequency=3, log_file=None, log_format='json', log_interval=10, log_keys=[], logit_temp=0.1, loss_type='L1', loss_weights=[0.1], lr=[1e-08], lr_period_updates=60000.0, lr_scheduler='triangular', lr_shrink=0.5, mask=0.3, mask_channel_length=64, mask_channel_min_space=1, mask_channel_other=0, mask_channel_prob=0.0, mask_channel_selection='static', mask_length='span-poisson', mask_min_space=1, mask_other=0, mask_prob=0.0, mask_random=0.1, mask_selection='static', max_distance=1280, max_epoch=0, max_lr=0.0002, max_speech_positions=4000, max_speech_sample_size=None, max_text_positions=600, max_tokens=None, max_tokens_valid=None, max_update=60000, max_valid_steps=None, maximize_best_checkpoint_metric=True, memory_efficient_bf16=False, memory_efficient_fp16=False, min_loss_scale=0.0001, min_speech_sample_size=None, model_parallel_size=1, modules_applied_guided_attn=('encoder-decoder',), modules_filter=None, no_epoch_checkpoints=True, no_freeze_encoder_layer=None, no_last_checkpoints=False, no_mask_channel_overlap=False, no_mask_overlap=False, no_progress_bar=False, no_reshard_after_forward=False, no_save=False, 
no_save_optimizer_state=False, no_scale_embedding=True, no_seed_provided=False, no_token_positional_embeddings=False, normalize=False, nprocs_per_node=4, num_buckets=320, num_heads_applied_guided_attn=2, num_layers_applied_guided_attn=2, num_shards=1, num_workers=4, on_cpu_convert_precision=False, optimizer='adam', optimizer_overrides='{}', pad=1, pad_audio=False, patience=-1, permute=0.0, permute_sentences=0.0, pipeline_balance=None, pipeline_checkpoint='never', pipeline_chunks=0, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_devices=None, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_model_parallel=False, plasma_path='/tmp/plasma', poisson_lambda=3.5, post_process='sentencepiece', postnet_chans=256, postnet_dropout_rate=0.5, postnet_filts=5, postnet_layers=5, pred_masked_weight=1.0, pred_nomask_weight=0.0, profile=False, quant_noise_pq=0, quantization_config_path=None, quantizer_depth=1, quantizer_factor=3, random_crop=False, reduction_factor=2, relative_position_embedding=True, replace_length=1, report_accuracy=True, required_batch_size_multiple=1, required_seq_len_multiple=1, reset_dataloader=False, reset_logging=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', rotate=0.0, sample_break_mode='eos', sample_rate=16000.0, sample_ratios=None, save_dir='/home/wangrui/projects/SpeechT5/experimental/s2c', save_interval=1, save_interval_updates=10000, scoring='bleu', se_decoder_input='previous_target', se_predict=None, seed=1, sentence_avg=False, shard_id=0, share_ctc_embed=False, share_input_output_embed=True, shorten_data_split_list='', shorten_method='none', shrink_min=False, sid_decoder_attn_dim=128, sid_embed_dim=128, sid_encoder_cls=None, sid_no_embed_postnet=True, sid_no_pooling_bn=True, sid_pooling_layer='decoder', sid_softmax_type='softmax', single_target=False, skip_invalid_size_inputs_valid_test=True, skip_masked=False, skip_nomask=False, 
slowmo_algorithm='LocalSGD', slowmo_momentum=None, softmax_easy_margin=False, softmax_margin=0.0, softmax_scale=1.0, spk_embed_dim=512, spk_embed_integration_type='pre', stop_min_lr=-1.0, stop_time_hours=0, subsample_stride='2,2', suppress_crashes=False, t5_task='s2c', target_glu=False, task='speecht5', tensorboard_logdir='/home/wangrui/projects/SpeechT5/experimental/s2c', threshold_loss_scale=None, tokenizer=None, tokens_per_sample=512, tpu=False, train_subset='train', transformer_dec_positional_dropout_rate=0.1, transformer_enc_positional_dropout_rate=0.1, unb_enc_layer=-1, unk=3, untie_final_proj=True, update_freq=[2], use_batch_norm=True, use_bmuf=False, use_codebook=False, use_conv_pos=True, use_guided_attn_loss=False, use_masking=True, use_old_adam=False, use_plasma_view=False, use_sent_enc_layer=True, use_sharded_state=False, use_sinc_pos=True, use_weighted_masking=False, user_dir='/home/wangrui/projects/SpeechT5/SpeechT5/fairseq/examples/speecht5', valid_subset='valid', validate_after_updates=20000, validate_interval=1, validate_interval_updates=0, wandb_project=None, weight_decay=0.1, wer_args=None, wer_kenlm_model=None, wer_lexicon=None, wer_lm_weight=2.0, wer_word_score=-1.0, write_checkpoints_asynchronously=False, zero_infinity=False, zero_sharding='none'), 'criterion': {'_name': 'speecht5', 'zero_infinity': False, 'sentence_avg': False, 'post_process': 'sentencepiece', 'wer_kenlm_model': None, 'wer_lexicon': None, 'wer_lm_weight': 2.0, 'wer_word_score': -1.0, 'wer_args': None, 'label_smoothing': 0.0, 'report_accuracy': True, 'ignore_prefix_size': 0, 'ce_weight': 1.0, 'ctc_weight': 0.0, 'use_masking': True, 'use_weighted_masking': False, 'loss_type': 'L1', 'bce_pos_weight': 5.0, 'bce_loss_lambda': 1.0, 'use_guided_attn_loss': False, 'guided_attn_loss_sigma': 0.4, 'guided_attn_loss_lambda': 10.0, 'num_layers_applied_guided_attn': 2, 'num_heads_applied_guided_attn': 2, 'modules_applied_guided_attn': ['encoder-decoder'], 'pred_masked_weight': 1.0, 
'pred_nomask_weight': 0.0, 'loss_weights': [0.1], 'log_keys': [], 'hubert_weight': 1.0, 'dec_weight': 1.0, 'bart_weight': 1.0}, 'optimizer': {'_name': 'adam', 'adam_betas': [0.9, 0.999], 'adam_eps': 1e-08, 'weight_decay': 0.1, 'use_old_adam': False, 'tpu': False, 'lr': [1e-08]}, 'lr_scheduler': {'_name': 'triangular', 'max_lr': 0.0002, 'lr_period_updates': 60000.0, 'lr_shrink': 0.5, 'shrink_min': False, 'lr': [1e-08]}, 'scoring': {'_name': 'bleu', 'pad': 1, 'eos': 2, 'unk': 3}, 'bpe': None, 'tokenizer': None}
2023-01-29 15:37:45 | INFO | speecht5.tasks.speecht5 | No config file for s2c
2023-01-29 15:37:45 | INFO | speecht5.tasks.speecht5 | Cannot set input_feat_per_channel, input_channels, since:
2023-01-29 15:37:45 | WARNING | speecht5.tasks.speecht5 | 'NoneType' object has no attribute 'input_feat_per_channel'
2023-01-29 15:37:45 | INFO | speecht5.tasks.speecht5 | Set to: 80 and 1
2023-01-29 15:37:45 | WARNING | speecht5.tasks.speecht5 | 'NoneType' object has no attribute 'input_feat_per_channel'
2023-01-29 15:37:45 | WARNING | speecht5.tasks.speecht5 | 'NoneType' object has no attribute 'input_feat_per_channel'
2023-01-29 15:37:45 | WARNING | speecht5.tasks.speecht5 | 'NoneType' object has no attribute 'input_feat_per_channel'
2023-01-29 15:37:48 | INFO | speecht5.criterions.speech_to_text_loss | Only using CE loss
2023-01-29 15:37:48 | INFO | fairseq_cli.train | T5TransformerModel( (encoder): TransformerEncoder( (dropout_module): FairseqDropout() (layers): ModuleList( (0): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False)
(self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (1): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (2): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (3): 
TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (4): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (5): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, 
bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (6): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (7): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) 
(final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (8): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (9): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (10): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, 
out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (11): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) ) (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (proj): Linear(in_features=768, out_features=1257, bias=True) (pos_emb): RelativePositionalEncoding( (pe_k): Embedding(320, 64) ) ) (decoder): TransformerDecoder( (dropout_module): FairseqDropout() (layers): LayerDropModuleList( (0): TransformerDecoderLayer( (dropout_module): FairseqDropout() (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, 
bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (activation_dropout_module): FairseqDropout() (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (encoder_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (encoder_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (1): TransformerDecoderLayer( (dropout_module): FairseqDropout() (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (activation_dropout_module): FairseqDropout() (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (encoder_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (encoder_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): 
LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (2): TransformerDecoderLayer( (dropout_module): FairseqDropout() (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (activation_dropout_module): FairseqDropout() (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (encoder_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (encoder_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (3): TransformerDecoderLayer( (dropout_module): FairseqDropout() (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (activation_dropout_module): FairseqDropout() (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (encoder_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, 
bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (encoder_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (4): TransformerDecoderLayer( (dropout_module): FairseqDropout() (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (activation_dropout_module): FairseqDropout() (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (encoder_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (encoder_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (5): TransformerDecoderLayer( (dropout_module): FairseqDropout() (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): 
Linear(in_features=768, out_features=768, bias=True) ) (activation_dropout_module): FairseqDropout() (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (encoder_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (encoder_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) ) (pos_emb): RelativePositionalEncoding( (pe_k): Embedding(320, 64) ) ) (text_encoder_prenet): TextEncoderPrenet( (encoder_prenet): Sequential( (0): Embedding(1257, 768, padding_idx=1) (1): ScaledPositionalEncoding( (dropout): Dropout(p=0.1, inplace=False) ) ) ) (speech_encoder_prenet): SpeechEncoderPrenet( (dropout_module): FairseqDropout() (feature_extractor): ConvFeatureExtractionModel( (conv_layers): ModuleList( (0): Sequential( (0): Conv1d(1, 512, kernel_size=(10,), stride=(5,), bias=False) (1): Dropout(p=0.0, inplace=False) (2): Fp32GroupNorm(512, 512, eps=1e-05, affine=True) (3): GELU() ) (1): Sequential( (0): Conv1d(512, 512, kernel_size=(3,), stride=(2,), bias=False) (1): Dropout(p=0.0, inplace=False) (2): GELU() ) (2): Sequential( (0): Conv1d(512, 512, kernel_size=(3,), stride=(2,), bias=False) (1): Dropout(p=0.0, inplace=False) (2): GELU() ) (3): Sequential( (0): Conv1d(512, 512, kernel_size=(3,), stride=(2,), bias=False) (1): Dropout(p=0.0, inplace=False) (2): GELU() ) (4): Sequential( (0): Conv1d(512, 512, kernel_size=(3,), stride=(2,), bias=False) (1): Dropout(p=0.0, inplace=False) (2): GELU() ) (5): Sequential( (0): 
Conv1d(512, 512, kernel_size=(2,), stride=(2,), bias=False) (1): Dropout(p=0.0, inplace=False) (2): GELU() ) (6): Sequential( (0): Conv1d(512, 512, kernel_size=(2,), stride=(2,), bias=False) (1): Dropout(p=0.0, inplace=False) (2): GELU() ) ) ) (post_extract_proj): Linear(in_features=512, out_features=768, bias=True) (layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (pos_conv): Sequential( (0): Conv1d(768, 768, kernel_size=(128,), stride=(1,), padding=(64,), groups=16) (1): SamePad() (2): GELU() ) (embed_positions): SinusoidalPositionalEmbedding() ) (text_decoder_prenet): TextDecoderPrenet( (dropout_module): FairseqDropout() (embed_tokens): Embedding(1257, 768, padding_idx=1) (embed_positions): SinusoidalPositionalEmbedding() ) (speech_decoder_prenet): SpeechDecoderPrenet( (decoder_prenet): Sequential( (0): Sequential( (0): Prenet( (prenet): ModuleList( (0): Sequential( (0): Linear(in_features=80, out_features=256, bias=True) (1): ReLU() ) (1): Sequential( (0): Linear(in_features=256, out_features=256, bias=True) (1): ReLU() ) ) ) (1): Linear(in_features=256, out_features=768, bias=True) ) (1): ScaledPositionalEncoding( (dropout): Dropout(p=0.1, inplace=False) ) ) (spkembs_layer): Sequential( (0): Linear(in_features=1280, out_features=768, bias=True) (1): ReLU() ) ) (text_decoder_postnet): TextDecoderPostnet( (output_projection): Linear(in_features=768, out_features=1257, bias=False) ) (speech_decoder_postnet): SpeechDecoderPostnet( (feat_out): Linear(in_features=768, out_features=160, bias=True) (prob_out): Linear(in_features=768, out_features=2, bias=True) (postnet): Postnet( (postnet): ModuleList( (0): Sequential( (0): Conv1d(80, 256, kernel_size=(5,), stride=(1,), padding=(2,), bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): Tanh() (3): Dropout(p=0.5, inplace=False) ) (1): Sequential( (0): Conv1d(256, 256, kernel_size=(5,), stride=(1,), padding=(2,), bias=False) (1): BatchNorm1d(256, 
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): Tanh() (3): Dropout(p=0.5, inplace=False) ) (2): Sequential( (0): Conv1d(256, 256, kernel_size=(5,), stride=(1,), padding=(2,), bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): Tanh() (3): Dropout(p=0.5, inplace=False) ) (3): Sequential( (0): Conv1d(256, 256, kernel_size=(5,), stride=(1,), padding=(2,), bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): Tanh() (3): Dropout(p=0.5, inplace=False) ) (4): Sequential( (0): Conv1d(256, 80, kernel_size=(5,), stride=(1,), padding=(2,), bias=False) (1): BatchNorm1d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): Dropout(p=0.5, inplace=False) ) ) ) ) (speaker_decoder_postnet): SpeakerDecoderPostnet( (output_projection): Linear(in_features=768, out_features=1257, bias=False) ) ) 2023-01-29 15:37:48 | INFO | fairseq_cli.train | task: SpeechT5Task 2023-01-29 15:37:48 | INFO | fairseq_cli.train | model: T5TransformerModel 2023-01-29 15:37:48 | INFO | fairseq_cli.train | criterion: SpeechT5Criterion 2023-01-29 15:37:48 | INFO | fairseq_cli.train | num. shared model params: 156,605,357 (num. trained: 156,605,357) 2023-01-29 15:37:48 | INFO | fairseq_cli.train | num. expert model params: 0 (num. trained: 0) 2023-01-29 15:37:48 | INFO | speecht5.data.speech_to_class_dataset | max_keep=1024000, min_keep=None, loaded 6903, skipped 0 short and 1 long, longest-loaded=975361, shortest-loaded=63361 2023-01-29 15:37:48 | INFO | speecht5.data.speech_to_class_dataset | max_length=76800, normalize=False 2023-01-29 15:37:48 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:2 to store for rank: 0 2023-01-29 15:37:48 | INFO | torch.distributed.distributed_c10d | Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: text_encoder_prenet.encoder_prenet.0.weight <- text_decoder_prenet.embed_tokens.weight
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: text_encoder_prenet.encoder_prenet.0.weight <- text_decoder_postnet.output_projection.weight
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_encoder_prenet.feature_extractor.conv_layers.1.0.bias
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_encoder_prenet.feature_extractor.conv_layers.2.0.bias
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_encoder_prenet.feature_extractor.conv_layers.3.0.bias
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_encoder_prenet.feature_extractor.conv_layers.4.0.bias
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_encoder_prenet.feature_extractor.conv_layers.5.0.bias
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_encoder_prenet.feature_extractor.conv_layers.6.0.bias
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- text_decoder_postnet.output_projection.bias
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_decoder_postnet.postnet.postnet.0.0.bias
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_decoder_postnet.postnet.postnet.1.0.bias
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_decoder_postnet.postnet.postnet.2.0.bias
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_decoder_postnet.postnet.postnet.3.0.bias
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_decoder_postnet.postnet.postnet.4.0.bias
2023-01-29 15:37:48 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speaker_decoder_postnet.output_projection.bias
2023-01-29 15:37:48 | INFO | fairseq.utils | ***********************CUDA enviroments for all 4 workers***********************
2023-01-29 15:37:48 | INFO | fairseq.utils | rank 0: capabilities = 8.6 ; total memory = 11.771 GB ; name = NVIDIA GeForce RTX 3080 Ti
2023-01-29 15:37:48 | INFO | fairseq.utils | rank 1: capabilities = 8.6 ; total memory = 11.771 GB ; name = NVIDIA GeForce RTX 3080 Ti
2023-01-29 15:37:48 | INFO | fairseq.utils | rank 2: capabilities = 8.6 ; total memory = 11.771 GB ; name = NVIDIA GeForce RTX 3080 Ti
2023-01-29 15:37:48 | INFO | fairseq.utils | rank 3: capabilities = 8.6 ; total memory = 11.771 GB ; name = NVIDIA GeForce RTX 3080 Ti
2023-01-29 15:37:48 | INFO | fairseq.utils | ***********************CUDA enviroments for all 4 workers***********************
2023-01-29 15:37:48 | INFO | fairseq_cli.train | training on 4 devices (GPUs/TPUs)
2023-01-29 15:37:48 | INFO | fairseq_cli.train | max tokens per device = None and max sentences per device = 8
2023-01-29 15:37:48 | INFO | fairseq.checkpoint_utils | loading pretrained model from /nfs-data/user1/PhDHub/ckpt/speecht5_base.pt: optimizer, lr scheduler, meters, dataloader will be reset
2023-01-29 15:37:48 | INFO | fairseq.trainer | Preparing to load checkpoint /nfs-data/user1/PhDHub/ckpt/speecht5_base.pt
2023-01-29 15:37:50 | WARNING | speecht5.models.speecht5 | not equal dictionary between model and checkpoint: 1257 vs 81
2023-01-29 15:37:50 | INFO | speecht5.models.speecht5 | reset model dictionary with size of 1257
2023-01-29 15:37:50 | INFO | speecht5.models.speecht5 | removed loaded checkpoint: encoder.proj.weight
2023-01-29 15:37:50 | INFO | speecht5.models.speecht5 | removed loaded checkpoint: encoder.proj.bias
2023-01-29 15:37:50 | INFO | speecht5.models.speecht5 | removed loaded checkpoint: text_encoder_prenet.encoder_prenet.0.weight
2023-01-29 15:37:50 | INFO | speecht5.models.speecht5 | removed loaded checkpoint: text_encoder_prenet.encoder_prenet.1.alpha
2023-01-29 15:37:50 | INFO | speecht5.models.speecht5 | removed loaded checkpoint: text_decoder_prenet.embed_tokens.weight
2023-01-29 15:37:50 | INFO | speecht5.models.speecht5 | removed loaded checkpoint: text_decoder_prenet.embed_positions._float_tensor
2023-01-29 15:37:50 | INFO | speecht5.models.speecht5 | removed loaded checkpoint: text_decoder_postnet.output_projection.weight
2023-01-29 15:37:51 | INFO | fairseq.trainer | Loaded checkpoint /nfs-data/user1/PhDHub/ckpt/speecht5_base.pt (epoch 39 @ 0 updates)
2023-01-29 15:37:51 | INFO | fairseq.trainer | loading train data for epoch 1
2023-01-29 15:37:51 | INFO | speecht5.data.speech_to_class_dataset | max_keep=1024000, min_keep=None, loaded 138333, skipped 0 short and 28 long, longest-loaded=1015681, shortest-loaded=63361
2023-01-29 15:37:51 | INFO | speecht5.data.speech_to_class_dataset | max_length=51200, normalize=False
2023-01-29 15:37:51 | WARNING | speecht5.models.speecht5 | not equal dictionary between model and checkpoint: 1257 vs 81
2023-01-29 15:37:51 | WARNING | speecht5.models.speecht5 | not equal dictionary between model and checkpoint: 1257 vs 81
2023-01-29 15:37:51 | WARNING | 
speecht5.models.speecht5 | not equal dictionary between model and checkpoint: 1257 vs 81 2023-01-29 15:37:57 | INFO | fairseq.trainer | begin training epoch 1 2023-01-29 15:37:57 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 15:38:01 | INFO | train_inner | {"epoch": 1, "update": 0.005, "s2c_loss": "10.381", "loss": "7.19523", "s2c_nll_loss": "10.381", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "10", "lr": "7.66633e-08", "gnorm": "5.896", "loss_scale": "128", "train_wall": "4", "gb_free": "7.3", "wall": "13"} 2023-01-29 15:38:04 | INFO | train_inner | {"epoch": 1, "update": 0.009, "s2c_loss": "10.358", "loss": "7.17947", "s2c_nll_loss": "10.358", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "20", "lr": "1.43327e-07", "gnorm": "5.624", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "15"} 2023-01-29 15:38:06 | INFO | train_inner | {"epoch": 1, "update": 0.014, "s2c_loss": "10.343", "loss": "7.16907", "s2c_nll_loss": "10.343", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "30", "lr": "2.0999e-07", "gnorm": "5.58", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "18"} 2023-01-29 15:38:09 | INFO | train_inner | {"epoch": 1, "update": 0.019, "s2c_loss": "10.358", "loss": "7.17943", "s2c_nll_loss": "10.358", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "40", "lr": "2.76653e-07", "gnorm": "5.798", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "21"} 2023-01-29 15:38:11 | INFO | train_inner | {"epoch": 1, "update": 0.023, "s2c_loss": "10.361", "loss": "7.18196", "s2c_nll_loss": "10.361", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": 
"0.1", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "50", "lr": "3.43317e-07", "gnorm": "5.792", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "23"} 2023-01-29 15:38:14 | INFO | train_inner | {"epoch": 1, "update": 0.028, "s2c_loss": "10.334", "loss": "7.16303", "s2c_nll_loss": "10.334", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "245", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "60", "lr": "4.0998e-07", "gnorm": "5.753", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "26"} 2023-01-29 15:38:17 | INFO | train_inner | {"epoch": 1, "update": 0.032, "s2c_loss": "10.334", "loss": "7.16288", "s2c_nll_loss": "10.334", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "70", "lr": "4.76643e-07", "gnorm": "5.814", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "28"} 2023-01-29 15:38:19 | INFO | train_inner | {"epoch": 1, "update": 0.037, "s2c_loss": "10.365", "loss": "7.1846", "s2c_nll_loss": "10.365", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "80", "lr": "5.43307e-07", "gnorm": "5.733", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "31"} 2023-01-29 15:38:22 | INFO | train_inner | {"epoch": 1, "update": 0.042, "s2c_loss": "10.348", "loss": "7.17294", "s2c_nll_loss": "10.348", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "90", "lr": "6.0997e-07", "gnorm": "5.592", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "33"} 2023-01-29 15:38:24 | INFO | train_inner | {"epoch": 1, "update": 0.046, "s2c_loss": "10.348", "loss": "7.17296", "s2c_nll_loss": "10.348", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "255.9", 
"ups": "4", "wpb": "64", "bsz": "64", "num_updates": "100", "lr": "6.76633e-07", "gnorm": "5.821", "loss_scale": "128", "train_wall": "2", "gb_free": "7.4", "wall": "36"} 2023-01-29 15:38:27 | INFO | train_inner | {"epoch": 1, "update": 0.051, "s2c_loss": "10.314", "loss": "7.1494", "s2c_nll_loss": "10.314", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "110", "lr": "7.43297e-07", "gnorm": "5.897", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "38"} 2023-01-29 15:38:29 | INFO | train_inner | {"epoch": 1, "update": 0.056, "s2c_loss": "10.358", "loss": "7.17992", "s2c_nll_loss": "10.358", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "120", "lr": "8.0996e-07", "gnorm": "5.691", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "41"} 2023-01-29 15:38:32 | INFO | train_inner | {"epoch": 1, "update": 0.06, "s2c_loss": "10.363", "loss": "7.18338", "s2c_nll_loss": "10.363", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "130", "lr": "8.76623e-07", "gnorm": "5.689", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "43"} 2023-01-29 15:38:34 | INFO | train_inner | {"epoch": 1, "update": 0.065, "s2c_loss": "10.331", "loss": "7.16082", "s2c_nll_loss": "10.331", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "140", "lr": "9.43287e-07", "gnorm": "5.506", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "46"} 2023-01-29 15:38:37 | INFO | train_inner | {"epoch": 1, "update": 0.069, "s2c_loss": "10.359", "loss": "7.18053", "s2c_nll_loss": "10.359", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "256.7", "ups": "4.01", "wpb": "64", 
"bsz": "64", "num_updates": "150", "lr": "1.00995e-06", "gnorm": "5.432", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "48"} 2023-01-29 15:38:39 | INFO | train_inner | {"epoch": 1, "update": 0.074, "s2c_loss": "10.321", "loss": "7.15425", "s2c_nll_loss": "10.321", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "160", "lr": "1.07661e-06", "gnorm": "5.411", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "51"} 2023-01-29 15:38:42 | INFO | train_inner | {"epoch": 1, "update": 0.079, "s2c_loss": "10.372", "loss": "7.18956", "s2c_nll_loss": "10.372", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "170", "lr": "1.14328e-06", "gnorm": "5.359", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "53"} 2023-01-29 15:38:44 | INFO | train_inner | {"epoch": 1, "update": 0.083, "s2c_loss": "10.313", "loss": "7.14858", "s2c_nll_loss": "10.313", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "180", "lr": "1.20994e-06", "gnorm": "5.472", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "56"} 2023-01-29 15:38:47 | INFO | train_inner | {"epoch": 1, "update": 0.088, "s2c_loss": "10.351", "loss": "7.1751", "s2c_nll_loss": "10.351", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "190", "lr": "1.2766e-06", "gnorm": "5.432", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "59"} 2023-01-29 15:38:49 | INFO | train_inner | {"epoch": 1, "update": 0.093, "s2c_loss": "10.34", "loss": "7.16685", "s2c_nll_loss": "10.34", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "200", 
"lr": "1.34327e-06", "gnorm": "5.126", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "61"} 2023-01-29 15:38:52 | INFO | train_inner | {"epoch": 1, "update": 0.097, "s2c_loss": "10.334", "loss": "7.16314", "s2c_nll_loss": "10.334", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "210", "lr": "1.40993e-06", "gnorm": "5.352", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "64"} 2023-01-29 15:38:54 | INFO | train_inner | {"epoch": 1, "update": 0.102, "s2c_loss": "10.338", "loss": "7.16557", "s2c_nll_loss": "10.338", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "220", "lr": "1.47659e-06", "gnorm": "5.231", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "66"} 2023-01-29 15:38:57 | INFO | train_inner | {"epoch": 1, "update": 0.106, "s2c_loss": "10.342", "loss": "7.1685", "s2c_nll_loss": "10.342", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "230", "lr": "1.54326e-06", "gnorm": "4.941", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "69"} 2023-01-29 15:39:00 | INFO | train_inner | {"epoch": 1, "update": 0.111, "s2c_loss": "10.352", "loss": "7.17572", "s2c_nll_loss": "10.352", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "240", "lr": "1.60992e-06", "gnorm": "4.858", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "71"} 2023-01-29 15:39:02 | INFO | train_inner | {"epoch": 1, "update": 0.116, "s2c_loss": "10.341", "loss": "7.16814", "s2c_nll_loss": "10.341", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "250", "lr": "1.67658e-06", "gnorm": 
"4.927", "loss_scale": "128", "train_wall": "3", "gb_free": "7.4", "wall": "74"} 2023-01-29 15:39:05 | INFO | train_inner | {"epoch": 1, "update": 0.12, "s2c_loss": "10.335", "loss": "7.16351", "s2c_nll_loss": "10.335", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "259.8", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "260", "lr": "1.74325e-06", "gnorm": "4.952", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "76"} 2023-01-29 15:39:07 | INFO | train_inner | {"epoch": 1, "update": 0.125, "s2c_loss": "10.312", "loss": "7.14771", "s2c_nll_loss": "10.312", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "270", "lr": "1.80991e-06", "gnorm": "4.556", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "79"} 2023-01-29 15:39:10 | INFO | train_inner | {"epoch": 1, "update": 0.13, "s2c_loss": "10.337", "loss": "7.16516", "s2c_nll_loss": "10.337", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "258.8", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "280", "lr": "1.87657e-06", "gnorm": "4.648", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "81"} 2023-01-29 15:39:12 | INFO | train_inner | {"epoch": 1, "update": 0.134, "s2c_loss": "10.331", "loss": "7.16122", "s2c_nll_loss": "10.331", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "258.2", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "290", "lr": "1.94324e-06", "gnorm": "4.44", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "84"} 2023-01-29 15:39:15 | INFO | train_inner | {"epoch": 1, "update": 0.139, "s2c_loss": "10.318", "loss": "7.15189", "s2c_nll_loss": "10.318", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "300", "lr": "2.0099e-06", "gnorm": "4.318", 
"loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "86"} 2023-01-29 15:39:17 | INFO | train_inner | {"epoch": 1, "update": 0.143, "s2c_loss": "10.31", "loss": "7.14649", "s2c_nll_loss": "10.31", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "310", "lr": "2.07656e-06", "gnorm": "4.522", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "89"} 2023-01-29 15:39:20 | INFO | train_inner | {"epoch": 1, "update": 0.148, "s2c_loss": "10.312", "loss": "7.14751", "s2c_nll_loss": "10.312", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "320", "lr": "2.14323e-06", "gnorm": "4.065", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "91"} 2023-01-29 15:39:22 | INFO | train_inner | {"epoch": 1, "update": 0.153, "s2c_loss": "10.323", "loss": "7.15565", "s2c_nll_loss": "10.323", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "330", "lr": "2.20989e-06", "gnorm": "4.165", "loss_scale": "128", "train_wall": "3", "gb_free": "7.4", "wall": "94"} 2023-01-29 15:39:25 | INFO | train_inner | {"epoch": 1, "update": 0.157, "s2c_loss": "10.306", "loss": "7.14386", "s2c_nll_loss": "10.306", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "340", "lr": "2.27655e-06", "gnorm": "4.35", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "97"} 2023-01-29 15:39:27 | INFO | train_inner | {"epoch": 1, "update": 0.162, "s2c_loss": "10.313", "loss": "7.1482", "s2c_nll_loss": "10.313", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "350", "lr": "2.34322e-06", "gnorm": "3.797", "loss_scale": "128", 
"train_wall": "3", "gb_free": "7.2", "wall": "99"} 2023-01-29 15:39:30 | INFO | train_inner | {"epoch": 1, "update": 0.167, "s2c_loss": "10.307", "loss": "7.14416", "s2c_nll_loss": "10.307", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "360", "lr": "2.40988e-06", "gnorm": "4.108", "loss_scale": "128", "train_wall": "2", "gb_free": "7.4", "wall": "102"} 2023-01-29 15:39:32 | INFO | train_inner | {"epoch": 1, "update": 0.171, "s2c_loss": "10.307", "loss": "7.14454", "s2c_nll_loss": "10.307", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "370", "lr": "2.47654e-06", "gnorm": "4.224", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "104"} 2023-01-29 15:39:35 | INFO | train_inner | {"epoch": 1, "update": 0.176, "s2c_loss": "10.317", "loss": "7.15105", "s2c_nll_loss": "10.317", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "380", "lr": "2.54321e-06", "gnorm": "3.989", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "107"} 2023-01-29 15:39:38 | INFO | train_inner | {"epoch": 1, "update": 0.18, "s2c_loss": "10.317", "loss": "7.15141", "s2c_nll_loss": "10.317", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "390", "lr": "2.60987e-06", "gnorm": "4.165", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "109"} 2023-01-29 15:39:40 | INFO | train_inner | {"epoch": 1, "update": 0.185, "s2c_loss": "10.29", "loss": "7.1327", "s2c_nll_loss": "10.29", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "400", "lr": "2.67653e-06", "gnorm": "3.973", "loss_scale": "128", 
"train_wall": "3", "gb_free": "7.2", "wall": "112"} 2023-01-29 15:39:43 | INFO | train_inner | {"epoch": 1, "update": 0.19, "s2c_loss": "10.295", "loss": "7.13593", "s2c_nll_loss": "10.295", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "410", "lr": "2.7432e-06", "gnorm": "3.827", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "114"} 2023-01-29 15:39:45 | INFO | train_inner | {"epoch": 1, "update": 0.194, "s2c_loss": "10.3", "loss": "7.13961", "s2c_nll_loss": "10.3", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "420", "lr": "2.80986e-06", "gnorm": "3.668", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "117"} 2023-01-29 15:39:48 | INFO | train_inner | {"epoch": 1, "update": 0.199, "s2c_loss": "10.278", "loss": "7.12414", "s2c_nll_loss": "10.278", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "430", "lr": "2.87652e-06", "gnorm": "3.516", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "120"} 2023-01-29 15:39:50 | INFO | train_inner | {"epoch": 1, "update": 0.204, "s2c_loss": "10.312", "loss": "7.14758", "s2c_nll_loss": "10.312", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "440", "lr": "2.94319e-06", "gnorm": "3.663", "loss_scale": "128", "train_wall": "2", "gb_free": "7.4", "wall": "122"} 2023-01-29 15:39:53 | INFO | train_inner | {"epoch": 1, "update": 0.208, "s2c_loss": "10.296", "loss": "7.13638", "s2c_nll_loss": "10.296", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "450", "lr": "3.00985e-06", "gnorm": "3.637", "loss_scale": "128", "train_wall": "3", 
"gb_free": "7.4", "wall": "125"} 2023-01-29 15:39:55 | INFO | train_inner | {"epoch": 1, "update": 0.213, "s2c_loss": "10.293", "loss": "7.13473", "s2c_nll_loss": "10.293", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "460", "lr": "3.07651e-06", "gnorm": "3.487", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "127"} 2023-01-29 15:39:58 | INFO | train_inner | {"epoch": 1, "update": 0.217, "s2c_loss": "10.311", "loss": "7.14681", "s2c_nll_loss": "10.311", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "470", "lr": "3.14318e-06", "gnorm": "3.552", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "130"} 2023-01-29 15:40:00 | INFO | train_inner | {"epoch": 1, "update": 0.222, "s2c_loss": "10.301", "loss": "7.14041", "s2c_nll_loss": "10.301", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "480", "lr": "3.20984e-06", "gnorm": "3.365", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "132"} 2023-01-29 15:40:03 | INFO | train_inner | {"epoch": 1, "update": 0.227, "s2c_loss": "10.29", "loss": "7.1323", "s2c_nll_loss": "10.29", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "490", "lr": "3.2765e-06", "gnorm": "3.417", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "135"} 2023-01-29 15:40:05 | INFO | train_inner | {"epoch": 1, "update": 0.231, "s2c_loss": "10.289", "loss": "7.13209", "s2c_nll_loss": "10.289", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "500", "lr": "3.34317e-06", "gnorm": "3.537", "loss_scale": "128", "train_wall": "2", "gb_free": 
"7.4", "wall": "137"} 2023-01-29 15:40:08 | INFO | train_inner | {"epoch": 1, "update": 0.236, "s2c_loss": "10.283", "loss": "7.12785", "s2c_nll_loss": "10.283", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "510", "lr": "3.40983e-06", "gnorm": "3.44", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "140"} 2023-01-29 15:40:11 | INFO | train_inner | {"epoch": 1, "update": 0.241, "s2c_loss": "10.283", "loss": "7.12792", "s2c_nll_loss": "10.283", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "520", "lr": "3.47649e-06", "gnorm": "3.749", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "142"} 2023-01-29 15:40:13 | INFO | train_inner | {"epoch": 1, "update": 0.245, "s2c_loss": "10.269", "loss": "7.11801", "s2c_nll_loss": "10.269", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "530", "lr": "3.54316e-06", "gnorm": "3.749", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "145"} 2023-01-29 15:40:16 | INFO | train_inner | {"epoch": 1, "update": 0.25, "s2c_loss": "10.278", "loss": "7.12395", "s2c_nll_loss": "10.278", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "540", "lr": "3.60982e-06", "gnorm": "3.719", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "147"} 2023-01-29 15:40:18 | INFO | train_inner | {"epoch": 1, "update": 0.254, "s2c_loss": "10.274", "loss": "7.12118", "s2c_nll_loss": "10.274", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "550", "lr": "3.67648e-06", "gnorm": "4.097", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", 
"wall": "150"} 2023-01-29 15:40:21 | INFO | train_inner | {"epoch": 1, "update": 0.259, "s2c_loss": "10.28", "loss": "7.12532", "s2c_nll_loss": "10.28", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "560", "lr": "3.74315e-06", "gnorm": "3.499", "loss_scale": "128", "train_wall": "2", "gb_free": "7.4", "wall": "152"} 2023-01-29 15:40:23 | INFO | train_inner | {"epoch": 1, "update": 0.264, "s2c_loss": "10.267", "loss": "7.11666", "s2c_nll_loss": "10.267", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "570", "lr": "3.80981e-06", "gnorm": "3.759", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "155"} 2023-01-29 15:40:26 | INFO | train_inner | {"epoch": 1, "update": 0.268, "s2c_loss": "10.294", "loss": "7.13541", "s2c_nll_loss": "10.294", "s2c_accuracy": "0", "s2c_total": "64", "s2c_n_correct": "0", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "580", "lr": "3.87647e-06", "gnorm": "3.573", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "158"} 2023-01-29 15:40:28 | INFO | train_inner | {"epoch": 1, "update": 0.273, "s2c_loss": "10.277", "loss": "7.12318", "s2c_nll_loss": "10.277", "s2c_accuracy": "0.781", "s2c_total": "64", "s2c_n_correct": "0.5", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "590", "lr": "3.94314e-06", "gnorm": "4.097", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "160"} 2023-01-29 15:40:31 | INFO | train_inner | {"epoch": 1, "update": 0.278, "s2c_loss": "10.282", "loss": "7.12687", "s2c_nll_loss": "10.282", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "600", "lr": "4.0098e-06", "gnorm": "3.831", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "163"} 
2023-01-29 15:40:33 | INFO | train_inner | {"epoch": 1, "update": 0.282, "s2c_loss": "10.278", "loss": "7.1239", "s2c_nll_loss": "10.278", "s2c_accuracy": "0.625", "s2c_total": "64", "s2c_n_correct": "0.4", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "610", "lr": "4.07646e-06", "gnorm": "4.056", "loss_scale": "128", "train_wall": "3", "gb_free": "7.4", "wall": "165"} 2023-01-29 15:40:36 | INFO | train_inner | {"epoch": 1, "update": 0.287, "s2c_loss": "10.268", "loss": "7.11703", "s2c_nll_loss": "10.268", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "620", "lr": "4.14313e-06", "gnorm": "4.269", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "168"} 2023-01-29 15:40:39 | INFO | train_inner | {"epoch": 1, "update": 0.291, "s2c_loss": "10.222", "loss": "7.08525", "s2c_nll_loss": "10.222", "s2c_accuracy": "0.938", "s2c_total": "64", "s2c_n_correct": "0.6", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "630", "lr": "4.20979e-06", "gnorm": "4.171", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "170"} 2023-01-29 15:40:41 | INFO | train_inner | {"epoch": 1, "update": 0.296, "s2c_loss": "10.271", "loss": "7.1193", "s2c_nll_loss": "10.271", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "640", "lr": "4.27645e-06", "gnorm": "4.306", "loss_scale": "128", "train_wall": "3", "gb_free": "7.4", "wall": "173"} 2023-01-29 15:40:44 | INFO | train_inner | {"epoch": 1, "update": 0.301, "s2c_loss": "10.287", "loss": "7.13028", "s2c_nll_loss": "10.287", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "650", "lr": "4.34312e-06", "gnorm": "4.548", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "175"} 
2023-01-29 15:40:46 | INFO | train_inner | {"epoch": 1, "update": 0.305, "s2c_loss": "10.235", "loss": "7.09419", "s2c_nll_loss": "10.235", "s2c_accuracy": "0.625", "s2c_total": "64", "s2c_n_correct": "0.4", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "660", "lr": "4.40978e-06", "gnorm": "4.376", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "178"} 2023-01-29 15:40:49 | INFO | train_inner | {"epoch": 1, "update": 0.31, "s2c_loss": "10.255", "loss": "7.10821", "s2c_nll_loss": "10.255", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "670", "lr": "4.47644e-06", "gnorm": "4.529", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "180"} 2023-01-29 15:40:51 | INFO | train_inner | {"epoch": 1, "update": 0.315, "s2c_loss": "10.243", "loss": "7.09986", "s2c_nll_loss": "10.243", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "680", "lr": "4.54311e-06", "gnorm": "4.602", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "183"} 2023-01-29 15:40:54 | INFO | train_inner | {"epoch": 1, "update": 0.319, "s2c_loss": "10.258", "loss": "7.11008", "s2c_nll_loss": "10.258", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "259.5", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "690", "lr": "4.60977e-06", "gnorm": "4.692", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "185"} 2023-01-29 15:40:56 | INFO | train_inner | {"epoch": 1, "update": 0.324, "s2c_loss": "10.225", "loss": "7.08771", "s2c_nll_loss": "10.225", "s2c_accuracy": "0.625", "s2c_total": "64", "s2c_n_correct": "0.4", "wps": "256.3", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "700", "lr": "4.67643e-06", "gnorm": "4.471", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "188"} 2023-01-29 
15:40:59 | INFO | train_inner | {"epoch": 1, "update": 0.328, "s2c_loss": "10.226", "loss": "7.08809", "s2c_nll_loss": "10.226", "s2c_accuracy": "0.625", "s2c_total": "64", "s2c_n_correct": "0.4", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "710", "lr": "4.7431e-06", "gnorm": "4.352", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "190"} 2023-01-29 15:41:01 | INFO | train_inner | {"epoch": 1, "update": 0.333, "s2c_loss": "10.23", "loss": "7.0907", "s2c_nll_loss": "10.23", "s2c_accuracy": "1.25", "s2c_total": "64", "s2c_n_correct": "0.8", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "720", "lr": "4.80976e-06", "gnorm": "4.366", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "193"} 2023-01-29 15:41:04 | INFO | train_inner | {"epoch": 1, "update": 0.338, "s2c_loss": "10.237", "loss": "7.09559", "s2c_nll_loss": "10.237", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "730", "lr": "4.87642e-06", "gnorm": "4.306", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "195"} 2023-01-29 15:41:06 | INFO | train_inner | {"epoch": 1, "update": 0.342, "s2c_loss": "10.222", "loss": "7.08516", "s2c_nll_loss": "10.222", "s2c_accuracy": "0.781", "s2c_total": "64", "s2c_n_correct": "0.5", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "740", "lr": "4.94309e-06", "gnorm": "4.318", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "198"} 2023-01-29 15:41:09 | INFO | train_inner | {"epoch": 1, "update": 0.347, "s2c_loss": "10.2", "loss": "7.07031", "s2c_nll_loss": "10.2", "s2c_accuracy": "0.938", "s2c_total": "64", "s2c_n_correct": "0.6", "wps": "246.5", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "750", "lr": "5.00975e-06", "gnorm": "4.103", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "201"} 2023-01-29 15:41:11 | INFO 
| train_inner | {"epoch": 1, "update": 0.352, "s2c_loss": "10.199", "loss": "7.06974", "s2c_nll_loss": "10.199", "s2c_accuracy": "0.625", "s2c_total": "64", "s2c_n_correct": "0.4", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "760", "lr": "5.07641e-06", "gnorm": "4.023", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "203"} 2023-01-29 15:41:14 | INFO | train_inner | {"epoch": 1, "update": 0.356, "s2c_loss": "10.168", "loss": "7.04779", "s2c_nll_loss": "10.168", "s2c_accuracy": "0.781", "s2c_total": "64", "s2c_n_correct": "0.5", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "770", "lr": "5.14308e-06", "gnorm": "4.154", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "206"} 2023-01-29 15:41:16 | INFO | train_inner | {"epoch": 1, "update": 0.361, "s2c_loss": "10.202", "loss": "7.0712", "s2c_nll_loss": "10.202", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "780", "lr": "5.20974e-06", "gnorm": "4.082", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "208"} 2023-01-29 15:41:19 | INFO | train_inner | {"epoch": 1, "update": 0.365, "s2c_loss": "10.18", "loss": "7.05612", "s2c_nll_loss": "10.18", "s2c_accuracy": "0.938", "s2c_total": "64", "s2c_n_correct": "0.6", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "790", "lr": "5.2764e-06", "gnorm": "4.251", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "211"} 2023-01-29 15:41:21 | INFO | train_inner | {"epoch": 1, "update": 0.37, "s2c_loss": "10.202", "loss": "7.07125", "s2c_nll_loss": "10.202", "s2c_accuracy": "0.625", "s2c_total": "64", "s2c_n_correct": "0.4", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "800", "lr": "5.34307e-06", "gnorm": "4.032", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "213"} 2023-01-29 15:41:24 | INFO | train_inner 
| {"epoch": 1, "update": 0.375, "s2c_loss": "10.144", "loss": "7.03099", "s2c_nll_loss": "10.144", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "810", "lr": "5.40973e-06", "gnorm": "3.938", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "216"} 2023-01-29 15:41:27 | INFO | train_inner | {"epoch": 1, "update": 0.379, "s2c_loss": "10.176", "loss": "7.05336", "s2c_nll_loss": "10.176", "s2c_accuracy": "0.625", "s2c_total": "64", "s2c_n_correct": "0.4", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "820", "lr": "5.47639e-06", "gnorm": "3.729", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "218"} 2023-01-29 15:41:29 | INFO | train_inner | {"epoch": 1, "update": 0.384, "s2c_loss": "10.206", "loss": "7.07412", "s2c_nll_loss": "10.206", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "830", "lr": "5.54306e-06", "gnorm": "3.972", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "221"} 2023-01-29 15:41:32 | INFO | train_inner | {"epoch": 1, "update": 0.389, "s2c_loss": "10.149", "loss": "7.03504", "s2c_nll_loss": "10.149", "s2c_accuracy": "0.938", "s2c_total": "64", "s2c_n_correct": "0.6", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "840", "lr": "5.60972e-06", "gnorm": "3.666", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "223"} 2023-01-29 15:41:34 | INFO | train_inner | {"epoch": 1, "update": 0.393, "s2c_loss": "10.204", "loss": "7.0726", "s2c_nll_loss": "10.204", "s2c_accuracy": "0.156", "s2c_total": "64", "s2c_n_correct": "0.1", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "850", "lr": "5.67638e-06", "gnorm": "3.659", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "226"} 2023-01-29 15:41:37 | INFO | train_inner | 
{"epoch": 1, "update": 0.398, "s2c_loss": "10.121", "loss": "7.01502", "s2c_nll_loss": "10.121", "s2c_accuracy": "0.625", "s2c_total": "64", "s2c_n_correct": "0.4", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "860", "lr": "5.74305e-06", "gnorm": "3.73", "loss_scale": "128", "train_wall": "2", "gb_free": "7.4", "wall": "228"} 2023-01-29 15:41:39 | INFO | train_inner | {"epoch": 1, "update": 0.402, "s2c_loss": "10.137", "loss": "7.02637", "s2c_nll_loss": "10.137", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "870", "lr": "5.80971e-06", "gnorm": "3.794", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "231"} 2023-01-29 15:41:42 | INFO | train_inner | {"epoch": 1, "update": 0.407, "s2c_loss": "10.164", "loss": "7.04527", "s2c_nll_loss": "10.164", "s2c_accuracy": "0.938", "s2c_total": "64", "s2c_n_correct": "0.6", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "880", "lr": "5.87637e-06", "gnorm": "3.795", "loss_scale": "128", "train_wall": "2", "gb_free": "7.4", "wall": "233"} 2023-01-29 15:41:44 | INFO | train_inner | {"epoch": 1, "update": 0.412, "s2c_loss": "10.118", "loss": "7.01359", "s2c_nll_loss": "10.118", "s2c_accuracy": "1.094", "s2c_total": "64", "s2c_n_correct": "0.7", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "890", "lr": "5.94304e-06", "gnorm": "3.721", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "236"} 2023-01-29 15:41:47 | INFO | train_inner | {"epoch": 1, "update": 0.416, "s2c_loss": "10.1", "loss": "7.0008", "s2c_nll_loss": "10.1", "s2c_accuracy": "0.938", "s2c_total": "64", "s2c_n_correct": "0.6", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "900", "lr": "6.0097e-06", "gnorm": "3.729", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "238"} 2023-01-29 15:41:49 | INFO | train_inner | {"epoch": 1, 
"update": 0.421, "s2c_loss": "10.107", "loss": "7.00591", "s2c_nll_loss": "10.107", "s2c_accuracy": "0.781", "s2c_total": "64", "s2c_n_correct": "0.5", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "910", "lr": "6.07636e-06", "gnorm": "3.741", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "241"} 2023-01-29 15:41:52 | INFO | train_inner | {"epoch": 1, "update": 0.426, "s2c_loss": "10.141", "loss": "7.02919", "s2c_nll_loss": "10.141", "s2c_accuracy": "0.625", "s2c_total": "64", "s2c_n_correct": "0.4", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "920", "lr": "6.14303e-06", "gnorm": "3.73", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "243"} 2023-01-29 15:41:54 | INFO | train_inner | {"epoch": 1, "update": 0.43, "s2c_loss": "10.168", "loss": "7.04782", "s2c_nll_loss": "10.168", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "930", "lr": "6.20969e-06", "gnorm": "3.751", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "246"} 2023-01-29 15:41:57 | INFO | train_inner | {"epoch": 1, "update": 0.435, "s2c_loss": "10.072", "loss": "6.9815", "s2c_nll_loss": "10.072", "s2c_accuracy": "0.781", "s2c_total": "64", "s2c_n_correct": "0.5", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "940", "lr": "6.27635e-06", "gnorm": "3.918", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "249"} 2023-01-29 15:41:59 | INFO | train_inner | {"epoch": 1, "update": 0.439, "s2c_loss": "10.13", "loss": "7.02129", "s2c_nll_loss": "10.13", "s2c_accuracy": "0.938", "s2c_total": "64", "s2c_n_correct": "0.6", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "950", "lr": "6.34302e-06", "gnorm": "3.795", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "251"} 2023-01-29 15:42:02 | INFO | train_inner | {"epoch": 1, "update": 
0.444, "s2c_loss": "10.099", "loss": "7.00043", "s2c_nll_loss": "10.099", "s2c_accuracy": "0.938", "s2c_total": "64", "s2c_n_correct": "0.6", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "960", "lr": "6.40968e-06", "gnorm": "3.717", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "254"} 2023-01-29 15:42:05 | INFO | train_inner | {"epoch": 1, "update": 0.449, "s2c_loss": "10.062", "loss": "6.9747", "s2c_nll_loss": "10.062", "s2c_accuracy": "0.938", "s2c_total": "64", "s2c_n_correct": "0.6", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "970", "lr": "6.47634e-06", "gnorm": "3.786", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "256"} 2023-01-29 15:42:07 | INFO | train_inner | {"epoch": 1, "update": 0.453, "s2c_loss": "10.047", "loss": "6.96413", "s2c_nll_loss": "10.047", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "980", "lr": "6.54301e-06", "gnorm": "3.688", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "259"} 2023-01-29 15:42:10 | INFO | train_inner | {"epoch": 1, "update": 0.458, "s2c_loss": "10.08", "loss": "6.98722", "s2c_nll_loss": "10.08", "s2c_accuracy": "1.25", "s2c_total": "64", "s2c_n_correct": "0.8", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "990", "lr": "6.60967e-06", "gnorm": "3.824", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "261"} 2023-01-29 15:42:12 | INFO | train_inner | {"epoch": 1, "update": 0.463, "s2c_loss": "10.039", "loss": "6.95877", "s2c_nll_loss": "10.039", "s2c_accuracy": "1.094", "s2c_total": "64", "s2c_n_correct": "0.7", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "1000", "lr": "6.67633e-06", "gnorm": "3.872", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "264"} 2023-01-29 15:42:15 | INFO | train_inner | {"epoch": 1, "update": 0.467, 
"s2c_loss": "10.06", "loss": "6.97323", "s2c_nll_loss": "10.06", "s2c_accuracy": "1.094", "s2c_total": "64", "s2c_n_correct": "0.7", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "1010", "lr": "6.743e-06", "gnorm": "3.974", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "266"} 2023-01-29 15:42:17 | INFO | train_inner | {"epoch": 1, "update": 0.472, "s2c_loss": "10.067", "loss": "6.97762", "s2c_nll_loss": "10.067", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "246.6", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "1020", "lr": "6.80966e-06", "gnorm": "3.686", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "269"} 2023-01-29 15:42:20 | INFO | train_inner | {"epoch": 1, "update": 0.476, "s2c_loss": "10.101", "loss": "7.00156", "s2c_nll_loss": "10.101", "s2c_accuracy": "0.312", "s2c_total": "64", "s2c_n_correct": "0.2", "wps": "257.6", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "1030", "lr": "6.87632e-06", "gnorm": "3.845", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "271"} 2023-01-29 15:42:22 | INFO | train_inner | {"epoch": 1, "update": 0.481, "s2c_loss": "10.043", "loss": "6.96119", "s2c_nll_loss": "10.043", "s2c_accuracy": "0.938", "s2c_total": "64", "s2c_n_correct": "0.6", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "1040", "lr": "6.94299e-06", "gnorm": "3.716", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "274"} 2023-01-29 15:42:25 | INFO | train_inner | {"epoch": 1, "update": 0.486, "s2c_loss": "10.031", "loss": "6.95271", "s2c_nll_loss": "10.031", "s2c_accuracy": "1.094", "s2c_total": "64", "s2c_n_correct": "0.7", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "1050", "lr": "7.00965e-06", "gnorm": "3.814", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "276"} 2023-01-29 15:42:27 | INFO | train_inner | {"epoch": 1, "update": 0.49, 
"s2c_loss": "9.923", "loss": "6.87804", "s2c_nll_loss": "9.923", "s2c_accuracy": "2.188", "s2c_total": "64", "s2c_n_correct": "1.4", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "1060", "lr": "7.07631e-06", "gnorm": "3.704", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "279"} 2023-01-29 15:42:30 | INFO | train_inner | {"epoch": 1, "update": 0.495, "s2c_loss": "10.015", "loss": "6.94218", "s2c_nll_loss": "10.015", "s2c_accuracy": "1.875", "s2c_total": "64", "s2c_n_correct": "1.2", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "1070", "lr": "7.14298e-06", "gnorm": "3.676", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "281"} 2023-01-29 15:42:32 | INFO | train_inner | {"epoch": 1, "update": 0.5, "s2c_loss": "9.97", "loss": "6.91078", "s2c_nll_loss": "9.97", "s2c_accuracy": "0.469", "s2c_total": "64", "s2c_n_correct": "0.3", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "1080", "lr": "7.20964e-06", "gnorm": "3.615", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "284"} 2023-01-29 15:42:35 | INFO | train_inner | {"epoch": 1, "update": 0.504, "s2c_loss": "10.013", "loss": "6.94071", "s2c_nll_loss": "10.013", "s2c_accuracy": "0.781", "s2c_total": "64", "s2c_n_correct": "0.5", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "1090", "lr": "7.2763e-06", "gnorm": "3.779", "loss_scale": "128", "train_wall": "2", "gb_free": "7.4", "wall": "287"} 2023-01-29 15:42:37 | INFO | train_inner | {"epoch": 1, "update": 0.509, "s2c_loss": "9.941", "loss": "6.89035", "s2c_nll_loss": "9.941", "s2c_accuracy": "0.938", "s2c_total": "64", "s2c_n_correct": "0.6", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "1100", "lr": "7.34297e-06", "gnorm": "3.64", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "289"} 2023-01-29 15:42:40 | INFO | train_inner | {"epoch": 1, "update": 0.513, "s2c_loss": 
"9.973", "loss": "6.91265", "s2c_nll_loss": "9.973", "s2c_accuracy": "0.938", "s2c_total": "64", "s2c_n_correct": "0.6", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "1110", "lr": "7.40963e-06", "gnorm": "3.682", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "292"} 2023-01-29 15:42:42 | INFO | train_inner | {"epoch": 1, "update": 0.518, "s2c_loss": "9.966", "loss": "6.90807", "s2c_nll_loss": "9.966", "s2c_accuracy": "1.406", "s2c_total": "64", "s2c_n_correct": "0.9", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "1120", "lr": "7.47629e-06", "gnorm": "3.52", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "294"} 2023-01-29 15:42:45 | INFO | train_inner | {"epoch": 1, "update": 0.523, "s2c_loss": "9.947", "loss": "6.89504", "s2c_nll_loss": "9.947", "s2c_accuracy": "0.625", "s2c_total": "64", "s2c_n_correct": "0.4", "wps": "258.4", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "1130", "lr": "7.54296e-06", "gnorm": "3.704", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "297"} 2023-01-29 15:42:47 | INFO | train_inner | {"epoch": 1, "update": 0.527, "s2c_loss": "9.919", "loss": "6.87514", "s2c_nll_loss": "9.919", "s2c_accuracy": "0.625", "s2c_total": "64", "s2c_n_correct": "0.4", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "1140", "lr": "7.60962e-06", "gnorm": "3.644", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "299"} 2023-01-29 15:42:50 | INFO | train_inner | {"epoch": 1, "update": 0.532, "s2c_loss": "9.941", "loss": "6.89036", "s2c_nll_loss": "9.941", "s2c_accuracy": "1.719", "s2c_total": "64", "s2c_n_correct": "1.1", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "1150", "lr": "7.67628e-06", "gnorm": "3.609", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "302"} 2023-01-29 15:42:52 | INFO | train_inner | {"epoch": 1, "update": 0.537, "s2c_loss": "9.895", "loss": 
"6.85871", "s2c_nll_loss": "9.895", "s2c_accuracy": "1.25", "s2c_total": "64", "s2c_n_correct": "0.8", "wps": "255", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "1160", "lr": "7.74295e-06", "gnorm": "3.643", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "304"} 2023-01-29 15:42:55 | INFO | train_inner | {"epoch": 1, "update": 0.541, "s2c_loss": "9.9", "loss": "6.86199", "s2c_nll_loss": "9.9", "s2c_accuracy": "1.562", "s2c_total": "64", "s2c_n_correct": "1", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "1170", "lr": "7.80961e-06", "gnorm": "3.6", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "307"} 2023-01-29 15:42:57 | INFO | train_inner | {"epoch": 1, "update": 0.546, "s2c_loss": "9.913", "loss": "6.87126", "s2c_nll_loss": "9.913", "s2c_accuracy": "0.781", "s2c_total": "64", "s2c_n_correct": "0.5", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "1180", "lr": "7.87627e-06", "gnorm": "3.641", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "309"} 2023-01-29 15:43:00 | INFO | train_inner | {"epoch": 1, "update": 0.55, "s2c_loss": "9.894", "loss": "6.8582", "s2c_nll_loss": "9.894", "s2c_accuracy": "1.406", "s2c_total": "64", "s2c_n_correct": "0.9", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "1190", "lr": "7.94294e-06", "gnorm": "3.625", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "312"} 2023-01-29 15:43:03 | INFO | train_inner | {"epoch": 1, "update": 0.555, "s2c_loss": "9.87", "loss": "6.84108", "s2c_nll_loss": "9.87", "s2c_accuracy": "1.25", "s2c_total": "64", "s2c_n_correct": "0.8", "wps": "260.1", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "1200", "lr": "8.0096e-06", "gnorm": "3.609", "loss_scale": "128", "train_wall": "2", "gb_free": "7.4", "wall": "314"} 2023-01-29 15:43:05 | INFO | train_inner | {"epoch": 1, "update": 0.56, "s2c_loss": "9.833", "loss": "6.81572", "s2c_nll_loss": 
"9.833", "s2c_accuracy": "1.562", "s2c_total": "64", "s2c_n_correct": "1", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "1210", "lr": "8.07626e-06", "gnorm": "3.615", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "317"} 2023-01-29 15:43:08 | INFO | train_inner | {"epoch": 1, "update": 0.564, "s2c_loss": "9.846", "loss": "6.82484", "s2c_nll_loss": "9.846", "s2c_accuracy": "1.406", "s2c_total": "64", "s2c_n_correct": "0.9", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "1220", "lr": "8.14293e-06", "gnorm": "3.679", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "319"} 2023-01-29 15:43:10 | INFO | train_inner | {"epoch": 1, "update": 0.569, "s2c_loss": "9.817", "loss": "6.80447", "s2c_nll_loss": "9.817", "s2c_accuracy": "1.406", "s2c_total": "64", "s2c_n_correct": "0.9", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "1230", "lr": "8.20959e-06", "gnorm": "3.663", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "322"} 2023-01-29 15:43:13 | INFO | train_inner | {"epoch": 1, "update": 0.574, "s2c_loss": "9.784", "loss": "6.78162", "s2c_nll_loss": "9.784", "s2c_accuracy": "1.562", "s2c_total": "64", "s2c_n_correct": "1", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "1240", "lr": "8.27625e-06", "gnorm": "3.661", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "324"} 2023-01-29 15:43:15 | INFO | train_inner | {"epoch": 1, "update": 0.578, "s2c_loss": "9.787", "loss": "6.78401", "s2c_nll_loss": "9.787", "s2c_accuracy": "2.656", "s2c_total": "64", "s2c_n_correct": "1.7", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "1250", "lr": "8.34292e-06", "gnorm": "3.638", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "327"} 2023-01-29 15:43:18 | INFO | train_inner | {"epoch": 1, "update": 0.583, "s2c_loss": "9.76", "loss": "6.7649", "s2c_nll_loss": "9.76", 
"s2c_accuracy": "1.875", "s2c_total": "64", "s2c_n_correct": "1.2", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "1260", "lr": "8.40958e-06", "gnorm": "3.723", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "329"} 2023-01-29 15:43:20 | INFO | train_inner | {"epoch": 1, "update": 0.587, "s2c_loss": "9.833", "loss": "6.8159", "s2c_nll_loss": "9.833", "s2c_accuracy": "1.719", "s2c_total": "64", "s2c_n_correct": "1.1", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "1270", "lr": "8.47624e-06", "gnorm": "3.583", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "332"} 2023-01-29 15:43:23 | INFO | train_inner | {"epoch": 1, "update": 0.592, "s2c_loss": "9.759", "loss": "6.76449", "s2c_nll_loss": "9.759", "s2c_accuracy": "2.188", "s2c_total": "64", "s2c_n_correct": "1.4", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "1280", "lr": "8.54291e-06", "gnorm": "3.644", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "334"} 2023-01-29 15:43:25 | INFO | train_inner | {"epoch": 1, "update": 0.597, "s2c_loss": "9.782", "loss": "6.78034", "s2c_nll_loss": "9.782", "s2c_accuracy": "1.875", "s2c_total": "64", "s2c_n_correct": "1.2", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "1290", "lr": "8.60957e-06", "gnorm": "3.634", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "337"} 2023-01-29 15:43:28 | INFO | train_inner | {"epoch": 1, "update": 0.601, "s2c_loss": "9.791", "loss": "6.78634", "s2c_nll_loss": "9.791", "s2c_accuracy": "2.041", "s2c_total": "63.7", "s2c_n_correct": "1.3", "wps": "253.3", "ups": "3.98", "wpb": "63.7", "bsz": "63.7", "num_updates": "1300", "lr": "8.67623e-06", "gnorm": "3.634", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "339"} 2023-01-29 15:43:30 | INFO | train_inner | {"epoch": 1, "update": 0.606, "s2c_loss": "9.795", "loss": "6.78961", "s2c_nll_loss": "9.795", 
"s2c_accuracy": "1.25", "s2c_total": "64", "s2c_n_correct": "0.8", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "1310", "lr": "8.7429e-06", "gnorm": "3.539", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "342"} 2023-01-29 15:43:33 | INFO | train_inner | {"epoch": 1, "update": 0.611, "s2c_loss": "9.736", "loss": "6.74865", "s2c_nll_loss": "9.736", "s2c_accuracy": "2.5", "s2c_total": "64", "s2c_n_correct": "1.6", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "1320", "lr": "8.80956e-06", "gnorm": "3.774", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "344"} 2023-01-29 15:43:35 | INFO | train_inner | {"epoch": 1, "update": 0.615, "s2c_loss": "9.741", "loss": "6.75195", "s2c_nll_loss": "9.741", "s2c_accuracy": "1.875", "s2c_total": "64", "s2c_n_correct": "1.2", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "1330", "lr": "8.87622e-06", "gnorm": "3.825", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "347"} 2023-01-29 15:43:38 | INFO | train_inner | {"epoch": 1, "update": 0.62, "s2c_loss": "9.702", "loss": "6.72471", "s2c_nll_loss": "9.702", "s2c_accuracy": "1.875", "s2c_total": "64", "s2c_n_correct": "1.2", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "1340", "lr": "8.94289e-06", "gnorm": "3.729", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "349"} 2023-01-29 15:43:40 | INFO | train_inner | {"epoch": 1, "update": 0.624, "s2c_loss": "9.667", "loss": "6.70075", "s2c_nll_loss": "9.667", "s2c_accuracy": "2.031", "s2c_total": "64", "s2c_n_correct": "1.3", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "1350", "lr": "9.00955e-06", "gnorm": "3.546", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "352"} 2023-01-29 15:43:43 | INFO | train_inner | {"epoch": 1, "update": 0.629, "s2c_loss": "9.695", "loss": "6.71996", "s2c_nll_loss": "9.695", "s2c_accuracy": 
"2.656", "s2c_total": "64", "s2c_n_correct": "1.7", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "1360", "lr": "9.07621e-06", "gnorm": "3.624", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "355"} 2023-01-29 15:43:45 | INFO | train_inner | {"epoch": 1, "update": 0.634, "s2c_loss": "9.676", "loss": "6.70712", "s2c_nll_loss": "9.676", "s2c_accuracy": "2.344", "s2c_total": "64", "s2c_n_correct": "1.5", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "1370", "lr": "9.14288e-06", "gnorm": "3.736", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "357"} 2023-01-29 15:43:48 | INFO | train_inner | {"epoch": 1, "update": 0.638, "s2c_loss": "9.677", "loss": "6.7079", "s2c_nll_loss": "9.677", "s2c_accuracy": "2.5", "s2c_total": "64", "s2c_n_correct": "1.6", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "1380", "lr": "9.20954e-06", "gnorm": "3.593", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "360"} 2023-01-29 15:43:50 | INFO | train_inner | {"epoch": 1, "update": 0.643, "s2c_loss": "9.718", "loss": "6.73595", "s2c_nll_loss": "9.718", "s2c_accuracy": "2.812", "s2c_total": "64", "s2c_n_correct": "1.8", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "1390", "lr": "9.2762e-06", "gnorm": "3.622", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "362"} 2023-01-29 15:43:53 | INFO | train_inner | {"epoch": 1, "update": 0.648, "s2c_loss": "9.634", "loss": "6.67752", "s2c_nll_loss": "9.634", "s2c_accuracy": "2.188", "s2c_total": "64", "s2c_n_correct": "1.4", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "1400", "lr": "9.34287e-06", "gnorm": "3.843", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "365"} 2023-01-29 15:43:55 | INFO | train_inner | {"epoch": 1, "update": 0.652, "s2c_loss": "9.579", "loss": "6.6394", "s2c_nll_loss": "9.579", "s2c_accuracy": "2.969", 
"s2c_total": "64", "s2c_n_correct": "1.9", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "1410", "lr": "9.40953e-06", "gnorm": "3.739", "loss_scale": "128", "train_wall": "2", "gb_free": "7.4", "wall": "367"} 2023-01-29 15:43:58 | INFO | train_inner | {"epoch": 1, "update": 0.657, "s2c_loss": "9.602", "loss": "6.65537", "s2c_nll_loss": "9.602", "s2c_accuracy": "3.125", "s2c_total": "64", "s2c_n_correct": "2", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "1420", "lr": "9.47619e-06", "gnorm": "3.768", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "370"} 2023-01-29 15:44:01 | INFO | train_inner | {"epoch": 1, "update": 0.661, "s2c_loss": "9.598", "loss": "6.65309", "s2c_nll_loss": "9.598", "s2c_accuracy": "2.188", "s2c_total": "64", "s2c_n_correct": "1.4", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "1430", "lr": "9.54286e-06", "gnorm": "3.71", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "372"} 2023-01-29 15:44:03 | INFO | train_inner | {"epoch": 1, "update": 0.666, "s2c_loss": "9.651", "loss": "6.68936", "s2c_nll_loss": "9.651", "s2c_accuracy": "2.344", "s2c_total": "64", "s2c_n_correct": "1.5", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "1440", "lr": "9.60952e-06", "gnorm": "3.639", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "375"} 2023-01-29 15:44:06 | INFO | train_inner | {"epoch": 1, "update": 0.671, "s2c_loss": "9.625", "loss": "6.67138", "s2c_nll_loss": "9.625", "s2c_accuracy": "3.125", "s2c_total": "64", "s2c_n_correct": "2", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "1450", "lr": "9.67618e-06", "gnorm": "3.686", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "377"} 2023-01-29 15:44:08 | INFO | train_inner | {"epoch": 1, "update": 0.675, "s2c_loss": "9.556", "loss": "6.62367", "s2c_nll_loss": "9.556", "s2c_accuracy": "2.5", "s2c_total": "64", 
"s2c_n_correct": "1.6", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "1460", "lr": "9.74285e-06", "gnorm": "3.768", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "380"} 2023-01-29 15:44:11 | INFO | train_inner | {"epoch": 1, "update": 0.68, "s2c_loss": "9.586", "loss": "6.64482", "s2c_nll_loss": "9.586", "s2c_accuracy": "3.75", "s2c_total": "64", "s2c_n_correct": "2.4", "wps": "258.1", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "1470", "lr": "9.80951e-06", "gnorm": "3.725", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "382"} 2023-01-29 15:44:13 | INFO | train_inner | {"epoch": 1, "update": 0.685, "s2c_loss": "9.601", "loss": "6.65493", "s2c_nll_loss": "9.601", "s2c_accuracy": "2.969", "s2c_total": "64", "s2c_n_correct": "1.9", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "1480", "lr": "9.87617e-06", "gnorm": "3.786", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "385"} 2023-01-29 15:44:16 | INFO | train_inner | {"epoch": 1, "update": 0.689, "s2c_loss": "9.445", "loss": "6.54674", "s2c_nll_loss": "9.445", "s2c_accuracy": "4.531", "s2c_total": "64", "s2c_n_correct": "2.9", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "1490", "lr": "9.94284e-06", "gnorm": "3.833", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "387"} 2023-01-29 15:44:18 | INFO | train_inner | {"epoch": 1, "update": 0.694, "s2c_loss": "9.509", "loss": "6.59122", "s2c_nll_loss": "9.509", "s2c_accuracy": "2.969", "s2c_total": "64", "s2c_n_correct": "1.9", "wps": "258.4", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "1500", "lr": "1.00095e-05", "gnorm": "3.808", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "390"} 2023-01-29 15:44:21 | INFO | train_inner | {"epoch": 1, "update": 0.698, "s2c_loss": "9.521", "loss": "6.59924", "s2c_nll_loss": "9.521", "s2c_accuracy": "2.812", "s2c_total": "64", 
"s2c_n_correct": "1.8", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "1510", "lr": "1.00762e-05", "gnorm": "3.861", "loss_scale": "128", "train_wall": "2", "gb_free": "7.4", "wall": "392"} 2023-01-29 15:44:23 | INFO | train_inner | {"epoch": 1, "update": 0.703, "s2c_loss": "9.425", "loss": "6.53287", "s2c_nll_loss": "9.425", "s2c_accuracy": "3.75", "s2c_total": "64", "s2c_n_correct": "2.4", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "1520", "lr": "1.01428e-05", "gnorm": "3.741", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "395"} 2023-01-29 15:44:26 | INFO | train_inner | {"epoch": 1, "update": 0.708, "s2c_loss": "9.498", "loss": "6.58371", "s2c_nll_loss": "9.498", "s2c_accuracy": "4.688", "s2c_total": "64", "s2c_n_correct": "3", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "1530", "lr": "1.02095e-05", "gnorm": "3.717", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "397"} 2023-01-29 15:44:28 | INFO | train_inner | {"epoch": 1, "update": 0.712, "s2c_loss": "9.412", "loss": "6.52357", "s2c_nll_loss": "9.412", "s2c_accuracy": "5", "s2c_total": "64", "s2c_n_correct": "3.2", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "1540", "lr": "1.02762e-05", "gnorm": "3.716", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "400"} 2023-01-29 15:44:31 | INFO | train_inner | {"epoch": 1, "update": 0.717, "s2c_loss": "9.347", "loss": "6.47896", "s2c_nll_loss": "9.347", "s2c_accuracy": "3.594", "s2c_total": "64", "s2c_n_correct": "2.3", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "1550", "lr": "1.03428e-05", "gnorm": "3.796", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "402"} 2023-01-29 15:44:33 | INFO | train_inner | {"epoch": 1, "update": 0.722, "s2c_loss": "9.492", "loss": "6.57923", "s2c_nll_loss": "9.492", "s2c_accuracy": "4.219", "s2c_total": "64", "s2c_n_correct": 
"2.7", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "1560", "lr": "1.04095e-05", "gnorm": "3.845", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "405"} 2023-01-29 15:44:36 | INFO | train_inner | {"epoch": 1, "update": 0.726, "s2c_loss": "9.38", "loss": "6.5017", "s2c_nll_loss": "9.38", "s2c_accuracy": "4.844", "s2c_total": "64", "s2c_n_correct": "3.1", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "1570", "lr": "1.04761e-05", "gnorm": "3.917", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "408"} 2023-01-29 15:44:38 | INFO | train_inner | {"epoch": 1, "update": 0.731, "s2c_loss": "9.4", "loss": "6.51587", "s2c_nll_loss": "9.4", "s2c_accuracy": "4.219", "s2c_total": "64", "s2c_n_correct": "2.7", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "1580", "lr": "1.05428e-05", "gnorm": "3.925", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "410"} 2023-01-29 15:44:41 | INFO | train_inner | {"epoch": 1, "update": 0.735, "s2c_loss": "9.48", "loss": "6.5713", "s2c_nll_loss": "9.48", "s2c_accuracy": "2.5", "s2c_total": "64", "s2c_n_correct": "1.6", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "1590", "lr": "1.06095e-05", "gnorm": "3.781", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "413"} 2023-01-29 15:44:43 | INFO | train_inner | {"epoch": 1, "update": 0.74, "s2c_loss": "9.324", "loss": "6.46272", "s2c_nll_loss": "9.324", "s2c_accuracy": "5.312", "s2c_total": "64", "s2c_n_correct": "3.4", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "1600", "lr": "1.06761e-05", "gnorm": "3.874", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "415"} 2023-01-29 15:44:46 | INFO | train_inner | {"epoch": 1, "update": 0.745, "s2c_loss": "9.476", "loss": "6.56835", "s2c_nll_loss": "9.476", "s2c_accuracy": "3.594", "s2c_total": "64", "s2c_n_correct": "2.3", "wps": "257.2", 
"ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "1610", "lr": "1.07428e-05", "gnorm": "4.049", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "418"} 2023-01-29 15:44:48 | INFO | train_inner | {"epoch": 1, "update": 0.749, "s2c_loss": "9.34", "loss": "6.47404", "s2c_nll_loss": "9.34", "s2c_accuracy": "3.125", "s2c_total": "64", "s2c_n_correct": "2", "wps": "257.6", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "1620", "lr": "1.08095e-05", "gnorm": "3.893", "loss_scale": "128", "train_wall": "2", "gb_free": "7.4", "wall": "420"} 2023-01-29 15:44:51 | INFO | train_inner | {"epoch": 1, "update": 0.754, "s2c_loss": "9.421", "loss": "6.53027", "s2c_nll_loss": "9.421", "s2c_accuracy": "2.969", "s2c_total": "64", "s2c_n_correct": "1.9", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "1630", "lr": "1.08761e-05", "gnorm": "3.681", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "423"} 2023-01-29 15:44:54 | INFO | train_inner | {"epoch": 1, "update": 0.759, "s2c_loss": "9.278", "loss": "6.43127", "s2c_nll_loss": "9.278", "s2c_accuracy": "4.844", "s2c_total": "64", "s2c_n_correct": "3.1", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "1640", "lr": "1.09428e-05", "gnorm": "3.838", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "425"} 2023-01-29 15:44:56 | INFO | train_inner | {"epoch": 1, "update": 0.763, "s2c_loss": "9.441", "loss": "6.54411", "s2c_nll_loss": "9.441", "s2c_accuracy": "3.75", "s2c_total": "64", "s2c_n_correct": "2.4", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "1650", "lr": "1.10094e-05", "gnorm": "3.619", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "428"} 2023-01-29 15:44:59 | INFO | train_inner | {"epoch": 1, "update": 0.768, "s2c_loss": "9.36", "loss": "6.48813", "s2c_nll_loss": "9.36", "s2c_accuracy": "5.156", "s2c_total": "64", "s2c_n_correct": "3.3", "wps": "252.3", "ups": "3.94", 
"wpb": "64", "bsz": "64", "num_updates": "1660", "lr": "1.10761e-05", "gnorm": "3.765", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "430"} 2023-01-29 15:45:01 | INFO | train_inner | {"epoch": 1, "update": 0.772, "s2c_loss": "9.245", "loss": "6.4079", "s2c_nll_loss": "9.245", "s2c_accuracy": "5.156", "s2c_total": "64", "s2c_n_correct": "3.3", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "1670", "lr": "1.11428e-05", "gnorm": "3.988", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "433"} 2023-01-29 15:45:04 | INFO | train_inner | {"epoch": 1, "update": 0.777, "s2c_loss": "9.181", "loss": "6.36385", "s2c_nll_loss": "9.181", "s2c_accuracy": "5.781", "s2c_total": "64", "s2c_n_correct": "3.7", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "1680", "lr": "1.12094e-05", "gnorm": "3.925", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "435"} 2023-01-29 15:45:06 | INFO | train_inner | {"epoch": 1, "update": 0.782, "s2c_loss": "9.301", "loss": "6.44674", "s2c_nll_loss": "9.301", "s2c_accuracy": "4.375", "s2c_total": "64", "s2c_n_correct": "2.8", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "1690", "lr": "1.12761e-05", "gnorm": "4.454", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "438"} 2023-01-29 15:45:09 | INFO | train_inner | {"epoch": 1, "update": 0.786, "s2c_loss": "9.354", "loss": "6.4838", "s2c_nll_loss": "9.354", "s2c_accuracy": "5", "s2c_total": "64", "s2c_n_correct": "3.2", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "1700", "lr": "1.13428e-05", "gnorm": "3.797", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "440"} 2023-01-29 15:45:11 | INFO | train_inner | {"epoch": 1, "update": 0.791, "s2c_loss": "9.215", "loss": "6.38711", "s2c_nll_loss": "9.215", "s2c_accuracy": "4.375", "s2c_total": "64", "s2c_n_correct": "2.8", "wps": "253.5", "ups": "3.96", "wpb": "64", 
"bsz": "64", "num_updates": "1710", "lr": "1.14094e-05", "gnorm": "3.885", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "443"} 2023-01-29 15:45:14 | INFO | train_inner | {"epoch": 1, "update": 0.796, "s2c_loss": "9.217", "loss": "6.38908", "s2c_nll_loss": "9.217", "s2c_accuracy": "5.781", "s2c_total": "64", "s2c_n_correct": "3.7", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "1720", "lr": "1.14761e-05", "gnorm": "3.89", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "445"} 2023-01-29 15:45:16 | INFO | train_inner | {"epoch": 1, "update": 0.8, "s2c_loss": "9.226", "loss": "6.39481", "s2c_nll_loss": "9.226", "s2c_accuracy": "2.344", "s2c_total": "64", "s2c_n_correct": "1.5", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "1730", "lr": "1.15428e-05", "gnorm": "4.033", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "448"} 2023-01-29 15:45:19 | INFO | train_inner | {"epoch": 1, "update": 0.805, "s2c_loss": "9.323", "loss": "6.46213", "s2c_nll_loss": "9.323", "s2c_accuracy": "3.281", "s2c_total": "64", "s2c_n_correct": "2.1", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "1740", "lr": "1.16094e-05", "gnorm": "3.878", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "451"} 2023-01-29 15:45:21 | INFO | train_inner | {"epoch": 1, "update": 0.809, "s2c_loss": "9.195", "loss": "6.37359", "s2c_nll_loss": "9.195", "s2c_accuracy": "4.844", "s2c_total": "64", "s2c_n_correct": "3.1", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "1750", "lr": "1.16761e-05", "gnorm": "3.773", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "453"} 2023-01-29 15:45:24 | INFO | train_inner | {"epoch": 1, "update": 0.814, "s2c_loss": "9.126", "loss": "6.32561", "s2c_nll_loss": "9.126", "s2c_accuracy": "5.938", "s2c_total": "64", "s2c_n_correct": "3.8", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", 
"num_updates": "1760", "lr": "1.17427e-05", "gnorm": "3.977", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "456"} 2023-01-29 15:45:27 | INFO | train_inner | {"epoch": 1, "update": 0.819, "s2c_loss": "9.215", "loss": "6.38717", "s2c_nll_loss": "9.215", "s2c_accuracy": "4.844", "s2c_total": "64", "s2c_n_correct": "3.1", "wps": "256.3", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "1770", "lr": "1.18094e-05", "gnorm": "4.318", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "458"} 2023-01-29 15:45:29 | INFO | train_inner | {"epoch": 1, "update": 0.823, "s2c_loss": "9.088", "loss": "6.29935", "s2c_nll_loss": "9.088", "s2c_accuracy": "6.406", "s2c_total": "64", "s2c_n_correct": "4.1", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "1780", "lr": "1.18761e-05", "gnorm": "3.87", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "461"} 2023-01-29 15:45:32 | INFO | train_inner | {"epoch": 1, "update": 0.828, "s2c_loss": "9.118", "loss": "6.31991", "s2c_nll_loss": "9.118", "s2c_accuracy": "5.781", "s2c_total": "64", "s2c_n_correct": "3.7", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "1790", "lr": "1.19427e-05", "gnorm": "4.312", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "463"} 2023-01-29 15:45:34 | INFO | train_inner | {"epoch": 1, "update": 0.833, "s2c_loss": "9.126", "loss": "6.32544", "s2c_nll_loss": "9.126", "s2c_accuracy": "5.781", "s2c_total": "64", "s2c_n_correct": "3.7", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "1800", "lr": "1.20094e-05", "gnorm": "4.094", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "466"} 2023-01-29 15:45:37 | INFO | train_inner | {"epoch": 1, "update": 0.837, "s2c_loss": "9.141", "loss": "6.33574", "s2c_nll_loss": "9.141", "s2c_accuracy": "5.469", "s2c_total": "64", "s2c_n_correct": "3.5", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", 
"num_updates": "1810", "lr": "1.20761e-05", "gnorm": "3.968", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "468"} 2023-01-29 15:45:39 | INFO | train_inner | {"epoch": 1, "update": 0.842, "s2c_loss": "9.18", "loss": "6.36291", "s2c_nll_loss": "9.18", "s2c_accuracy": "5", "s2c_total": "64", "s2c_n_correct": "3.2", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "1820", "lr": "1.21427e-05", "gnorm": "3.868", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "471"} 2023-01-29 15:45:42 | INFO | train_inner | {"epoch": 1, "update": 0.846, "s2c_loss": "9.166", "loss": "6.35334", "s2c_nll_loss": "9.166", "s2c_accuracy": "5", "s2c_total": "64", "s2c_n_correct": "3.2", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "1830", "lr": "1.22094e-05", "gnorm": "3.735", "loss_scale": "128", "train_wall": "2", "gb_free": "7.4", "wall": "473"} 2023-01-29 15:45:44 | INFO | train_inner | {"epoch": 1, "update": 0.851, "s2c_loss": "9.001", "loss": "6.23894", "s2c_nll_loss": "9.001", "s2c_accuracy": "6.875", "s2c_total": "64", "s2c_n_correct": "4.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "1840", "lr": "1.22761e-05", "gnorm": "3.863", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "476"} 2023-01-29 15:45:47 | INFO | train_inner | {"epoch": 1, "update": 0.856, "s2c_loss": "9.132", "loss": "6.32951", "s2c_nll_loss": "9.132", "s2c_accuracy": "4.688", "s2c_total": "64", "s2c_n_correct": "3", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "1850", "lr": "1.23427e-05", "gnorm": "3.774", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "478"} 2023-01-29 15:45:49 | INFO | train_inner | {"epoch": 1, "update": 0.86, "s2c_loss": "8.998", "loss": "6.2371", "s2c_nll_loss": "8.998", "s2c_accuracy": "6.406", "s2c_total": "64", "s2c_n_correct": "4.1", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "1860", "lr": 
"1.24094e-05", "gnorm": "4.035", "loss_scale": "128", "train_wall": "2", "gb_free": "7.4", "wall": "481"} 2023-01-29 15:45:52 | INFO | train_inner | {"epoch": 1, "update": 0.865, "s2c_loss": "9.048", "loss": "6.27187", "s2c_nll_loss": "9.048", "s2c_accuracy": "5.781", "s2c_total": "64", "s2c_n_correct": "3.7", "wps": "259.1", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "1870", "lr": "1.2476e-05", "gnorm": "4.088", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "483"} 2023-01-29 15:45:54 | INFO | train_inner | {"epoch": 1, "update": 0.87, "s2c_loss": "8.98", "loss": "6.22455", "s2c_nll_loss": "8.98", "s2c_accuracy": "5.938", "s2c_total": "64", "s2c_n_correct": "3.8", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "1880", "lr": "1.25427e-05", "gnorm": "3.994", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "486"} 2023-01-29 15:45:57 | INFO | train_inner | {"epoch": 1, "update": 0.874, "s2c_loss": "8.96", "loss": "6.21032", "s2c_nll_loss": "8.96", "s2c_accuracy": "6.094", "s2c_total": "64", "s2c_n_correct": "3.9", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "1890", "lr": "1.26094e-05", "gnorm": "3.899", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "489"} 2023-01-29 15:45:59 | INFO | train_inner | {"epoch": 1, "update": 0.879, "s2c_loss": "9.05", "loss": "6.27294", "s2c_nll_loss": "9.05", "s2c_accuracy": "5", "s2c_total": "64", "s2c_n_correct": "3.2", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "1900", "lr": "1.2676e-05", "gnorm": "4.127", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "491"} 2023-01-29 15:46:02 | INFO | train_inner | {"epoch": 1, "update": 0.883, "s2c_loss": "8.947", "loss": "6.20128", "s2c_nll_loss": "8.947", "s2c_accuracy": "8.125", "s2c_total": "64", "s2c_n_correct": "5.2", "wps": "255", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "1910", "lr": "1.27427e-05", "gnorm": 
"4.065", "loss_scale": "128", "train_wall": "2", "gb_free": "7.4", "wall": "494"} 2023-01-29 15:46:04 | INFO | train_inner | {"epoch": 1, "update": 0.888, "s2c_loss": "8.957", "loss": "6.2085", "s2c_nll_loss": "8.957", "s2c_accuracy": "5.469", "s2c_total": "64", "s2c_n_correct": "3.5", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "1920", "lr": "1.28094e-05", "gnorm": "4.046", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "496"} 2023-01-29 15:46:07 | INFO | train_inner | {"epoch": 1, "update": 0.893, "s2c_loss": "8.926", "loss": "6.18724", "s2c_nll_loss": "8.926", "s2c_accuracy": "6.406", "s2c_total": "64", "s2c_n_correct": "4.1", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "1930", "lr": "1.2876e-05", "gnorm": "3.835", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "499"} 2023-01-29 15:46:10 | INFO | train_inner | {"epoch": 1, "update": 0.897, "s2c_loss": "8.858", "loss": "6.13993", "s2c_nll_loss": "8.858", "s2c_accuracy": "7.031", "s2c_total": "64", "s2c_n_correct": "4.5", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "1940", "lr": "1.29427e-05", "gnorm": "4.068", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "501"} 2023-01-29 15:46:12 | INFO | train_inner | {"epoch": 1, "update": 0.902, "s2c_loss": "8.903", "loss": "6.17075", "s2c_nll_loss": "8.903", "s2c_accuracy": "6.25", "s2c_total": "64", "s2c_n_correct": "4", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "1950", "lr": "1.30093e-05", "gnorm": "4.11", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "504"} 2023-01-29 15:46:14 | INFO | train_inner | {"epoch": 1, "update": 0.907, "s2c_loss": "9.087", "loss": "6.29887", "s2c_nll_loss": "9.087", "s2c_accuracy": "5.312", "s2c_total": "64", "s2c_n_correct": "3.4", "wps": "257.6", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "1960", "lr": "1.3076e-05", "gnorm": "4.148", "loss_scale": 
"128", "train_wall": "2", "gb_free": "7.3", "wall": "506"} 2023-01-29 15:46:17 | INFO | train_inner | {"epoch": 1, "update": 0.911, "s2c_loss": "8.8", "loss": "6.09968", "s2c_nll_loss": "8.8", "s2c_accuracy": "8.125", "s2c_total": "64", "s2c_n_correct": "5.2", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "1970", "lr": "1.31427e-05", "gnorm": "4.364", "loss_scale": "128", "train_wall": "3", "gb_free": "7.2", "wall": "509"} 2023-01-29 15:46:20 | INFO | train_inner | {"epoch": 1, "update": 0.916, "s2c_loss": "8.916", "loss": "6.17989", "s2c_nll_loss": "8.916", "s2c_accuracy": "7.5", "s2c_total": "64", "s2c_n_correct": "4.8", "wps": "257.6", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "1980", "lr": "1.32093e-05", "gnorm": "4.116", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "511"} 2023-01-29 15:46:22 | INFO | train_inner | {"epoch": 1, "update": 0.92, "s2c_loss": "8.785", "loss": "6.08942", "s2c_nll_loss": "8.785", "s2c_accuracy": "6.406", "s2c_total": "64", "s2c_n_correct": "4.1", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "1990", "lr": "1.3276e-05", "gnorm": "3.949", "loss_scale": "128", "train_wall": "3", "gb_free": "7.3", "wall": "514"} 2023-01-29 15:46:25 | INFO | train_inner | {"epoch": 1, "update": 0.925, "s2c_loss": "8.893", "loss": "6.16437", "s2c_nll_loss": "8.893", "s2c_accuracy": "5.938", "s2c_total": "64", "s2c_n_correct": "3.8", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "2000", "lr": "1.33427e-05", "gnorm": "4.281", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "516"} 2023-01-29 15:46:27 | INFO | train_inner | {"epoch": 1, "update": 0.93, "s2c_loss": "8.818", "loss": "6.11193", "s2c_nll_loss": "8.818", "s2c_accuracy": "6.719", "s2c_total": "64", "s2c_n_correct": "4.3", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "2010", "lr": "1.34093e-05", "gnorm": "4.159", "loss_scale": "128", 
"train_wall": "2", "gb_free": "7.2", "wall": "519"} 2023-01-29 15:46:30 | INFO | train_inner | {"epoch": 1, "update": 0.934, "s2c_loss": "8.928", "loss": "6.18847", "s2c_nll_loss": "8.928", "s2c_accuracy": "6.094", "s2c_total": "64", "s2c_n_correct": "3.9", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "2020", "lr": "1.3476e-05", "gnorm": "4.03", "loss_scale": "128", "train_wall": "2", "gb_free": "7.2", "wall": "521"} 2023-01-29 15:46:32 | INFO | train_inner | {"epoch": 1, "update": 0.939, "s2c_loss": "8.848", "loss": "6.13298", "s2c_nll_loss": "8.848", "s2c_accuracy": "7.5", "s2c_total": "64", "s2c_n_correct": "4.8", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "2030", "lr": "1.35427e-05", "gnorm": "4.107", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "524"} 2023-01-29 15:46:35 | INFO | train_inner | {"epoch": 1, "update": 0.944, "s2c_loss": "8.799", "loss": "6.09903", "s2c_nll_loss": "8.799", "s2c_accuracy": "7.969", "s2c_total": "64", "s2c_n_correct": "5.1", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "2040", "lr": "1.36093e-05", "gnorm": "3.973", "loss_scale": "128", "train_wall": "2", "gb_free": "7.3", "wall": "526"} 2023-01-29 15:46:37 | INFO | train_inner | {"epoch": 1, "update": 0.948, "s2c_loss": "8.821", "loss": "6.11417", "s2c_nll_loss": "8.821", "s2c_accuracy": "7.188", "s2c_total": "64", "s2c_n_correct": "4.6", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "2050", "lr": "1.3676e-05", "gnorm": "4.113", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "529"} 2023-01-29 15:46:40 | INFO | train_inner | {"epoch": 1, "update": 0.953, "s2c_loss": "8.816", "loss": "6.11072", "s2c_nll_loss": "8.816", "s2c_accuracy": "6.406", "s2c_total": "64", "s2c_n_correct": "4.1", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "2060", "lr": "1.37426e-05", "gnorm": "4.099", "loss_scale": "256", "train_wall": "2", 
"gb_free": "7.2", "wall": "531"} 2023-01-29 15:46:42 | INFO | train_inner | {"epoch": 1, "update": 0.957, "s2c_loss": "8.733", "loss": "6.05325", "s2c_nll_loss": "8.733", "s2c_accuracy": "6.406", "s2c_total": "64", "s2c_n_correct": "4.1", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "2070", "lr": "1.38093e-05", "gnorm": "3.93", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "534"} 2023-01-29 15:46:45 | INFO | train_inner | {"epoch": 1, "update": 0.962, "s2c_loss": "8.75", "loss": "6.0647", "s2c_nll_loss": "8.75", "s2c_accuracy": "7.031", "s2c_total": "64", "s2c_n_correct": "4.5", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "2080", "lr": "1.3876e-05", "gnorm": "4.15", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "536"} 2023-01-29 15:46:47 | INFO | train_inner | {"epoch": 1, "update": 0.967, "s2c_loss": "8.73", "loss": "6.05133", "s2c_nll_loss": "8.73", "s2c_accuracy": "7.5", "s2c_total": "64", "s2c_n_correct": "4.8", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "2090", "lr": "1.39426e-05", "gnorm": "4.466", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "539"} 2023-01-29 15:46:50 | INFO | train_inner | {"epoch": 1, "update": 0.971, "s2c_loss": "8.658", "loss": "6.00158", "s2c_nll_loss": "8.658", "s2c_accuracy": "9.219", "s2c_total": "64", "s2c_n_correct": "5.9", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "2100", "lr": "1.40093e-05", "gnorm": "4.161", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "541"} 2023-01-29 15:46:52 | INFO | train_inner | {"epoch": 1, "update": 0.976, "s2c_loss": "8.852", "loss": "6.13542", "s2c_nll_loss": "8.852", "s2c_accuracy": "7.969", "s2c_total": "64", "s2c_n_correct": "5.1", "wps": "260.5", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "2110", "lr": "1.4076e-05", "gnorm": "4.421", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": 
"544"} 2023-01-29 15:46:55 | INFO | train_inner | {"epoch": 1, "update": 0.981, "s2c_loss": "8.7", "loss": "6.03031", "s2c_nll_loss": "8.7", "s2c_accuracy": "7.812", "s2c_total": "64", "s2c_n_correct": "5", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "2120", "lr": "1.41426e-05", "gnorm": "4.456", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "546"} 2023-01-29 15:46:57 | INFO | train_inner | {"epoch": 1, "update": 0.985, "s2c_loss": "8.601", "loss": "5.96154", "s2c_nll_loss": "8.601", "s2c_accuracy": "8.438", "s2c_total": "64", "s2c_n_correct": "5.4", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "2130", "lr": "1.42093e-05", "gnorm": "4.439", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "549"} 2023-01-29 15:47:00 | INFO | train_inner | {"epoch": 1, "update": 0.99, "s2c_loss": "8.708", "loss": "6.03607", "s2c_nll_loss": "8.708", "s2c_accuracy": "7.344", "s2c_total": "64", "s2c_n_correct": "4.7", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "2140", "lr": "1.4276e-05", "gnorm": "4.2", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "551"} 2023-01-29 15:47:02 | INFO | train_inner | {"epoch": 1, "update": 0.994, "s2c_loss": "8.664", "loss": "6.00573", "s2c_nll_loss": "8.664", "s2c_accuracy": "6.719", "s2c_total": "64", "s2c_n_correct": "4.3", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "2150", "lr": "1.43426e-05", "gnorm": "3.963", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "554"} 2023-01-29 15:47:05 | INFO | train_inner | {"epoch": 1, "update": 0.999, "s2c_loss": "8.563", "loss": "5.93525", "s2c_nll_loss": "8.563", "s2c_accuracy": "8.125", "s2c_total": "64", "s2c_n_correct": "5.2", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "2160", "lr": "1.44093e-05", "gnorm": "4.081", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "557"} 2023-01-29 15:47:05 
| INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 1 @ 2162 updates
2023-01-29 15:47:05 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt
2023-01-29 15:47:09 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt
2023-01-29 15:47:09 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 1 @ 2162 updates, score None) (writing took 3.261515055783093 seconds)
2023-01-29 15:47:09 | INFO | fairseq_cli.train | end of epoch 1 (average epoch stats below)
2023-01-29 15:47:09 | INFO | train | {"epoch": 1, "train_s2c_loss": "9.794", "train_loss": "6.78837", "train_s2c_nll_loss": "9.794", "train_s2c_accuracy": "2.346", "train_s2c_total": "63.9838", "train_s2c_n_correct": "1.50093", "train_wps": "251.6", "train_ups": "3.93", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "2162", "train_lr": "1.44226e-05", "train_gnorm": "4.101", "train_loss_scale": "256", "train_train_wall": "540", "train_gb_free": "7.5", "train_wall": "560"}
2023-01-29 16:11:33 | INFO | fairseq.distributed.utils | distributed init (rank 3): tcp://localhost:12742
2023-01-29 16:11:33 | INFO | fairseq.distributed.utils | distributed init (rank 1): tcp://localhost:12742
2023-01-29 16:11:33 | INFO | fairseq.distributed.utils | distributed init (rank 2): tcp://localhost:12742
2023-01-29 16:11:33 | INFO | fairseq.distributed.utils | distributed init (rank 0): tcp://localhost:12742
2023-01-29 16:11:34 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 3
2023-01-29 16:11:34 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 1
2023-01-29 16:11:34 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 2
2023-01-29 16:11:34 | INFO | 
torch.distributed.distributed_c10d | Added key: store_based_barrier_key:1 to store for rank: 0 2023-01-29 16:11:34 | INFO | torch.distributed.distributed_c10d | Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2023-01-29 16:11:34 | INFO | fairseq.distributed.utils | initialized host ubuntu as rank 0 2023-01-29 16:11:34 | INFO | torch.distributed.distributed_c10d | Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2023-01-29 16:11:34 | INFO | fairseq.distributed.utils | initialized host ubuntu as rank 2 2023-01-29 16:11:34 | INFO | torch.distributed.distributed_c10d | Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2023-01-29 16:11:34 | INFO | torch.distributed.distributed_c10d | Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2023-01-29 16:11:34 | INFO | fairseq.distributed.utils | initialized host ubuntu as rank 3 2023-01-29 16:11:34 | INFO | fairseq.distributed.utils | initialized host ubuntu as rank 1 2023-01-29 16:11:39 | INFO | fairseq_cli.train | {'_name': None, 'common': {'_name': None, 'no_progress_bar': False, 'log_interval': 10, 'log_format': 'json', 'log_file': None, 'tensorboard_logdir': '/home/wangrui/projects/SpeechT5/experimental/s2c', 'wandb_project': None, 'azureml_logging': False, 'seed': 1, 'cpu': False, 'tpu': False, 'bf16': False, 'memory_efficient_bf16': False, 'fp16': True, 'memory_efficient_fp16': False, 'fp16_no_flatten_grads': False, 'fp16_init_scale': 128, 'fp16_scale_window': None, 'fp16_scale_tolerance': 0.0, 'on_cpu_convert_precision': False, 'min_loss_scale': 0.0001, 'threshold_loss_scale': None, 'amp': False, 'amp_batch_retries': 2, 'amp_init_scale': 128, 'amp_scale_window': None, 'user_dir': '/home/wangrui/projects/SpeechT5/SpeechT5/fairseq/examples/speecht5', 'empty_cache_freq': 0, 'all_gather_list_size': 16384, 'model_parallel_size': 1, 'quantization_config_path': None, 'profile': 
False, 'reset_logging': False, 'suppress_crashes': False, 'use_plasma_view': False, 'plasma_path': '/tmp/plasma'}, 'common_eval': {'_name': None, 'path': None, 'post_process': 'sentencepiece', 'quiet': False, 'model_overrides': '{}', 'results_path': None}, 'distributed_training': {'_name': None, 'distributed_world_size': 4, 'distributed_num_procs': 4, 'distributed_rank': 0, 'distributed_backend': 'nccl', 'distributed_init_method': 'tcp://localhost:12742', 'distributed_port': 0, 'device_id': 0, 'distributed_no_spawn': False, 'ddp_backend': 'legacy_ddp', 'ddp_comm_hook': 'none', 'bucket_cap_mb': 25, 'fix_batches_to_gpus': False, 'find_unused_parameters': True, 'fast_stat_sync': False, 'heartbeat_timeout': -1, 'broadcast_buffers': False, 'slowmo_momentum': None, 'slowmo_algorithm': 'LocalSGD', 'localsgd_frequency': 3, 'nprocs_per_node': 4, 'pipeline_model_parallel': False, 'pipeline_balance': None, 'pipeline_devices': None, 'pipeline_chunks': 0, 'pipeline_encoder_balance': None, 'pipeline_encoder_devices': None, 'pipeline_decoder_balance': None, 'pipeline_decoder_devices': None, 'pipeline_checkpoint': 'never', 'zero_sharding': 'none', 'fp16': True, 'memory_efficient_fp16': False, 'tpu': False, 'no_reshard_after_forward': False, 'fp32_reduce_scatter': False, 'cpu_offload': False, 'use_sharded_state': False}, 'dataset': {'_name': None, 'num_workers': 4, 'skip_invalid_size_inputs_valid_test': True, 'max_tokens': None, 'batch_size': 8, 'required_batch_size_multiple': 1, 'required_seq_len_multiple': 1, 'dataset_impl': None, 'data_buffer_size': 0, 'train_subset': 'train', 'valid_subset': 'valid', 'combine_valid_subsets': None, 'ignore_unused_valid_subsets': False, 'validate_interval': 1, 'validate_interval_updates': 0, 'validate_after_updates': 20000, 'fixed_validation_seed': None, 'disable_validation': False, 'max_tokens_valid': None, 'batch_size_valid': 8, 'max_valid_steps': None, 'curriculum': 0, 'gen_subset': 'test', 'num_shards': 1, 'shard_id': 0}, 'optimization': 
{'_name': None, 'max_epoch': 0, 'max_update': 60000, 'stop_time_hours': 0.0, 'clip_norm': 0.0, 'sentence_avg': False, 'update_freq': [2], 'lr': [1e-08], 'stop_min_lr': -1.0, 'use_bmuf': False}, 'checkpoint': {'_name': None, 'save_dir': '/home/wangrui/projects/SpeechT5/experimental/s2c', 'restore_file': 'checkpoint_last.pt', 'finetune_from_model': '/nfs-data/user1/PhDHub/ckpt/speecht5_base.pt', 'reset_dataloader': False, 'reset_lr_scheduler': False, 'reset_meters': False, 'reset_optimizer': False, 'optimizer_overrides': '{}', 'save_interval': 1, 'save_interval_updates': 10000, 'keep_interval_updates': -1, 'keep_interval_updates_pattern': -1, 'keep_last_epochs': -1, 'keep_best_checkpoints': -1, 'no_save': False, 'no_epoch_checkpoints': True, 'no_last_checkpoints': False, 'no_save_optimizer_state': False, 'best_checkpoint_metric': 's2c_accuracy', 'maximize_best_checkpoint_metric': True, 'patience': -1, 'checkpoint_suffix': '', 'checkpoint_shard_count': 1, 'load_checkpoint_on_all_dp_ranks': False, 'write_checkpoints_asynchronously': False, 'model_parallel_size': 1}, 'bmuf': {'_name': None, 'block_lr': 1.0, 'block_momentum': 0.875, 'global_sync_iter': 50, 'warmup_iterations': 500, 'use_nbm': False, 'average_sync': False, 'distributed_world_size': 4}, 'generation': {'_name': None, 'beam': 5, 'nbest': 1, 'max_len_a': 0.0, 'max_len_b': 200, 'min_len': 1, 'match_source_len': False, 'unnormalized': False, 'no_early_stop': False, 'no_beamable_mm': False, 'lenpen': 1.0, 'unkpen': 0.0, 'replace_unk': None, 'sacrebleu': False, 'score_reference': False, 'prefix_size': 0, 'no_repeat_ngram_size': 0, 'sampling': False, 'sampling_topk': -1, 'sampling_topp': -1.0, 'constraints': None, 'temperature': 1.0, 'diverse_beam_groups': -1, 'diverse_beam_strength': 0.5, 'diversity_rate': -1.0, 'print_alignment': None, 'print_step': False, 'lm_path': None, 'lm_weight': 0.0, 'iter_decode_eos_penalty': 0.0, 'iter_decode_max_iter': 10, 'iter_decode_force_max_iter': False, 'iter_decode_with_beam': 
1, 'iter_decode_with_external_reranker': False, 'retain_iter_history': False, 'retain_dropout': False, 'retain_dropout_modules': None, 'decoding_format': None, 'no_seed_provided': False}, 'eval_lm': {'_name': None, 'output_word_probs': False, 'output_word_stats': False, 'context_window': 0, 'softmax_batch': 9223372036854775807}, 'interactive': {'_name': None, 'buffer_size': 0, 'input': '-'}, 'model': Namespace(_name='t5_transformer_base_asr', activation_dropout=0.1, activation_fn='gelu', adam_betas=(0.9, 0.999), adam_eps=1e-08, adaptive_input=False, adaptive_softmax_cutoff=None, adaptive_softmax_dropout=0, all_gather_list_size=16384, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, arch='t5_transformer_base_asr', attention_dropout=0.1, azureml_logging=False, bart_weight=1.0, batch_ratio=None, batch_size=8, batch_size_valid=8, bce_loss_lambda=1.0, bce_pos_weight=5.0, bert_init=True, best_checkpoint_metric='s2c_accuracy', bf16=False, bpe=None, bpe_tokenizer=None, broadcast_buffers=False, bucket_cap_mb=25, ce_weight=1.0, checkpoint_shard_count=1, checkpoint_suffix='', clip_norm=0.0, codebook_prob=0.5, combine_valid_subsets=None, config_yaml='config.yaml', conv_bias=False, conv_channels=1024, conv_feature_layers='[(512,10,5)] + [(512,3,2)] * 4 + [(512,2,2)] * 2', conv_kernel_sizes='5,5', conv_pos=128, conv_pos_groups=16, cpu=False, cpu_offload=False, criterion='speecht5', ctc_weight=0.0, curriculum=0, data='/home/wangrui/projects/SpeechT5/manifest', data_buffer_size=0, dataset_impl=None, ddp_backend='legacy_ddp', ddp_comm_hook='none', dec_use_scaled_pos_enc=True, dec_weight=1.0, decoder_attention_heads=12, decoder_embed_dim=768, decoder_ffn_embed_dim=3072, decoder_input_dim=768, decoder_layerdrop=0.1, decoder_layers=6, decoder_learned_pos=False, decoder_max_relative_position=160, decoder_normalize_before=False, decoder_output_dim=768, device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, 
distributed_no_spawn=False, distributed_num_procs=4, distributed_port=0, distributed_rank=0, distributed_world_size=4, dprenet_dropout_rate=0.5, dprenet_layers=2, dprenet_units=256, dropout=0.1, empty_cache_freq=0, enable_padding=False, enc_use_scaled_pos_enc=True, encoder_attention_heads=12, encoder_attn_branch='identity,full', encoder_embed_dim=768, encoder_ffn_embed_dim=3072, encoder_layerdrop=0.05, encoder_layers=12, encoder_max_relative_position=160, encoder_normalize_before=False, encoder_reduction_factor=1, encoder_sliding_window_attn=None, encoder_speech_prenet='conv', eos=2, eprenet_conv_chans=0, eprenet_conv_filts=0, eprenet_conv_layers=0, eprenet_dropout_rate=0.0, extractor_mode='default', fast_stat_sync=False, feature_grad_mult=1.0, final_dim=256, find_unused_parameters=True, finetune_from_model='/nfs-data/user1/PhDHub/ckpt/speecht5_base.pt', finetune_from_modules=None, finetune_out_of_modules=None, fix_batches_to_gpus=False, fixed_validation_seed=None, fp16=True, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, fp32_reduce_scatter=False, freeze_decoder_updates=0, freeze_encoder_updates=0, gen_subset='test', guided_attn_loss_lambda=10.0, guided_attn_loss_sigma=0.4, heartbeat_timeout=-1, hubert_label_dir=None, hubert_labels=['km'], hubert_mask_length=10, hubert_weight=1.0, ignore_prefix_size=0, ignore_unused_valid_subsets=False, iid_noise_target=False, initial_decoder_alpha=1.0, initial_encoder_alpha=1.0, insert=0.0, keep_best_checkpoints=-1, keep_interval_updates=-1, keep_interval_updates_pattern=-1, keep_last_epochs=-1, label_rates=-1, label_smoothing=0.0, latent_dim=0, latent_groups=2, latent_temp=(2, 0.5, 0.999995), latent_vars=100, layer_norm_eps=1e-05, layer_norm_first=False, load_checkpoint_on_all_dp_ranks=False, localsgd_frequency=3, log_file=None, log_format='json', log_interval=10, log_keys=[], logit_temp=0.1, loss_type='L1', loss_weights=[0.1], lr=[1e-08], lr_period_updates=60000.0, 
lr_scheduler='triangular', lr_shrink=0.5, mask=0.3, mask_channel_length=64, mask_channel_min_space=1, mask_channel_other=0, mask_channel_prob=0.0, mask_channel_selection='static', mask_length='span-poisson', mask_min_space=1, mask_other=0, mask_prob=0.0, mask_random=0.1, mask_selection='static', max_distance=1280, max_epoch=0, max_lr=0.0002, max_speech_positions=4000, max_speech_sample_size=None, max_text_positions=600, max_tokens=None, max_tokens_valid=None, max_update=60000, max_valid_steps=None, maximize_best_checkpoint_metric=True, memory_efficient_bf16=False, memory_efficient_fp16=False, min_loss_scale=0.0001, min_speech_sample_size=None, model_parallel_size=1, modules_applied_guided_attn=('encoder-decoder',), modules_filter=None, no_epoch_checkpoints=True, no_freeze_encoder_layer=None, no_last_checkpoints=False, no_mask_channel_overlap=False, no_mask_overlap=False, no_progress_bar=False, no_reshard_after_forward=False, no_save=False, no_save_optimizer_state=False, no_scale_embedding=True, no_seed_provided=False, no_token_positional_embeddings=False, normalize=False, nprocs_per_node=4, num_buckets=320, num_heads_applied_guided_attn=2, num_layers_applied_guided_attn=2, num_shards=1, num_workers=4, on_cpu_convert_precision=False, optimizer='adam', optimizer_overrides='{}', pad=1, pad_audio=False, patience=-1, permute=0.0, permute_sentences=0.0, pipeline_balance=None, pipeline_checkpoint='never', pipeline_chunks=0, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_devices=None, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_model_parallel=False, plasma_path='/tmp/plasma', poisson_lambda=3.5, post_process='sentencepiece', postnet_chans=256, postnet_dropout_rate=0.5, postnet_filts=5, postnet_layers=5, pred_masked_weight=1.0, pred_nomask_weight=0.0, profile=False, quant_noise_pq=0, quantization_config_path=None, quantizer_depth=1, quantizer_factor=3, random_crop=False, reduction_factor=2, relative_position_embedding=True, 
replace_length=1, report_accuracy=True, required_batch_size_multiple=1, required_seq_len_multiple=1, reset_dataloader=False, reset_logging=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', rotate=0.0, sample_break_mode='eos', sample_rate=16000.0, sample_ratios=None, save_dir='/home/wangrui/projects/SpeechT5/experimental/s2c', save_interval=1, save_interval_updates=10000, scoring='bleu', se_decoder_input='previous_target', se_predict=None, seed=1, sentence_avg=False, shard_id=0, share_ctc_embed=False, share_input_output_embed=True, shorten_data_split_list='', shorten_method='none', shrink_min=False, sid_decoder_attn_dim=128, sid_embed_dim=128, sid_encoder_cls=None, sid_no_embed_postnet=True, sid_no_pooling_bn=True, sid_pooling_layer='decoder', sid_softmax_type='softmax', single_target=False, skip_invalid_size_inputs_valid_test=True, skip_masked=False, skip_nomask=False, slowmo_algorithm='LocalSGD', slowmo_momentum=None, softmax_easy_margin=False, softmax_margin=0.0, softmax_scale=1.0, spk_embed_dim=512, spk_embed_integration_type='pre', stop_min_lr=-1.0, stop_time_hours=0, subsample_stride='2,2', suppress_crashes=False, t5_task='s2c', target_glu=False, task='speecht5', tensorboard_logdir='/home/wangrui/projects/SpeechT5/experimental/s2c', threshold_loss_scale=None, tokenizer=None, tokens_per_sample=512, tpu=False, train_subset='train', transformer_dec_positional_dropout_rate=0.1, transformer_enc_positional_dropout_rate=0.1, unb_enc_layer=-1, unk=3, untie_final_proj=True, update_freq=[2], use_batch_norm=True, use_bmuf=False, use_codebook=False, use_conv_pos=True, use_guided_attn_loss=False, use_masking=True, use_old_adam=False, use_plasma_view=False, use_sent_enc_layer=True, use_sharded_state=False, use_sinc_pos=True, use_weighted_masking=False, user_dir='/home/wangrui/projects/SpeechT5/SpeechT5/fairseq/examples/speecht5', valid_subset='valid', validate_after_updates=20000, validate_interval=1, 
validate_interval_updates=0, wandb_project=None, weight_decay=0.1, wer_args=None, wer_kenlm_model=None, wer_lexicon=None, wer_lm_weight=2.0, wer_word_score=-1.0, write_checkpoints_asynchronously=False, zero_infinity=False, zero_sharding='none'), 'task': Namespace(_name='speecht5', activation_dropout=0.1, activation_fn='gelu', adam_betas=(0.9, 0.999), adam_eps=1e-08, adaptive_input=False, adaptive_softmax_cutoff=None, adaptive_softmax_dropout=0, all_gather_list_size=16384, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, arch='t5_transformer_base_asr', attention_dropout=0.1, azureml_logging=False, bart_weight=1.0, batch_ratio=None, batch_size=8, batch_size_valid=8, bce_loss_lambda=1.0, bce_pos_weight=5.0, bert_init=True, best_checkpoint_metric='s2c_accuracy', bf16=False, bpe=None, bpe_tokenizer=None, broadcast_buffers=False, bucket_cap_mb=25, ce_weight=1.0, checkpoint_shard_count=1, checkpoint_suffix='', clip_norm=0.0, codebook_prob=0.5, combine_valid_subsets=None, config_yaml='config.yaml', conv_bias=False, conv_channels=1024, conv_feature_layers='[(512,10,5)] + [(512,3,2)] * 4 + [(512,2,2)] * 2', conv_kernel_sizes='5,5', conv_pos=128, conv_pos_groups=16, cpu=False, cpu_offload=False, criterion='speecht5', ctc_weight=0.0, curriculum=0, data='/home/wangrui/projects/SpeechT5/manifest', data_buffer_size=0, dataset_impl=None, ddp_backend='legacy_ddp', ddp_comm_hook='none', dec_use_scaled_pos_enc=True, dec_weight=1.0, decoder_attention_heads=12, decoder_embed_dim=768, decoder_ffn_embed_dim=3072, decoder_input_dim=768, decoder_layerdrop=0.1, decoder_layers=6, decoder_learned_pos=False, decoder_max_relative_position=160, decoder_normalize_before=False, decoder_output_dim=768, device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, distributed_no_spawn=False, distributed_num_procs=4, distributed_port=0, distributed_rank=0, distributed_world_size=4, dprenet_dropout_rate=0.5, dprenet_layers=2, 
dprenet_units=256, dropout=0.1, empty_cache_freq=0, enable_padding=False, enc_use_scaled_pos_enc=True, encoder_attention_heads=12, encoder_attn_branch='identity,full', encoder_embed_dim=768, encoder_ffn_embed_dim=3072, encoder_layerdrop=0.05, encoder_layers=12, encoder_max_relative_position=160, encoder_normalize_before=False, encoder_reduction_factor=1, encoder_sliding_window_attn=None, encoder_speech_prenet='conv', eos=2, eprenet_conv_chans=0, eprenet_conv_filts=0, eprenet_conv_layers=0, eprenet_dropout_rate=0.0, extractor_mode='default', fast_stat_sync=False, feature_grad_mult=1.0, final_dim=256, find_unused_parameters=True, finetune_from_model='/nfs-data/user1/PhDHub/ckpt/speecht5_base.pt', finetune_from_modules=None, finetune_out_of_modules=None, fix_batches_to_gpus=False, fixed_validation_seed=None, fp16=True, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, fp32_reduce_scatter=False, freeze_decoder_updates=0, freeze_encoder_updates=0, gen_subset='test', guided_attn_loss_lambda=10.0, guided_attn_loss_sigma=0.4, heartbeat_timeout=-1, hubert_label_dir=None, hubert_labels=['km'], hubert_mask_length=10, hubert_weight=1.0, ignore_prefix_size=0, ignore_unused_valid_subsets=False, iid_noise_target=False, initial_decoder_alpha=1.0, initial_encoder_alpha=1.0, insert=0.0, keep_best_checkpoints=-1, keep_interval_updates=-1, keep_interval_updates_pattern=-1, keep_last_epochs=-1, label_rates=-1, label_smoothing=0.0, latent_dim=0, latent_groups=2, latent_temp=(2, 0.5, 0.999995), latent_vars=100, layer_norm_eps=1e-05, layer_norm_first=False, load_checkpoint_on_all_dp_ranks=False, localsgd_frequency=3, log_file=None, log_format='json', log_interval=10, log_keys=[], logit_temp=0.1, loss_type='L1', loss_weights=[0.1], lr=[1e-08], lr_period_updates=60000.0, lr_scheduler='triangular', lr_shrink=0.5, mask=0.3, mask_channel_length=64, mask_channel_min_space=1, mask_channel_other=0, mask_channel_prob=0.0, 
mask_channel_selection='static', mask_length='span-poisson', mask_min_space=1, mask_other=0, mask_prob=0.0, mask_random=0.1, mask_selection='static', max_distance=1280, max_epoch=0, max_lr=0.0002, max_speech_positions=4000, max_speech_sample_size=None, max_text_positions=600, max_tokens=None, max_tokens_valid=None, max_update=60000, max_valid_steps=None, maximize_best_checkpoint_metric=True, memory_efficient_bf16=False, memory_efficient_fp16=False, min_loss_scale=0.0001, min_speech_sample_size=None, model_parallel_size=1, modules_applied_guided_attn=('encoder-decoder',), modules_filter=None, no_epoch_checkpoints=True, no_freeze_encoder_layer=None, no_last_checkpoints=False, no_mask_channel_overlap=False, no_mask_overlap=False, no_progress_bar=False, no_reshard_after_forward=False, no_save=False, no_save_optimizer_state=False, no_scale_embedding=True, no_seed_provided=False, no_token_positional_embeddings=False, normalize=False, nprocs_per_node=4, num_buckets=320, num_heads_applied_guided_attn=2, num_layers_applied_guided_attn=2, num_shards=1, num_workers=4, on_cpu_convert_precision=False, optimizer='adam', optimizer_overrides='{}', pad=1, pad_audio=False, patience=-1, permute=0.0, permute_sentences=0.0, pipeline_balance=None, pipeline_checkpoint='never', pipeline_chunks=0, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_devices=None, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_model_parallel=False, plasma_path='/tmp/plasma', poisson_lambda=3.5, post_process='sentencepiece', postnet_chans=256, postnet_dropout_rate=0.5, postnet_filts=5, postnet_layers=5, pred_masked_weight=1.0, pred_nomask_weight=0.0, profile=False, quant_noise_pq=0, quantization_config_path=None, quantizer_depth=1, quantizer_factor=3, random_crop=False, reduction_factor=2, relative_position_embedding=True, replace_length=1, report_accuracy=True, required_batch_size_multiple=1, required_seq_len_multiple=1, reset_dataloader=False, reset_logging=False, 
reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', rotate=0.0, sample_break_mode='eos', sample_rate=16000.0, sample_ratios=None, save_dir='/home/wangrui/projects/SpeechT5/experimental/s2c', save_interval=1, save_interval_updates=10000, scoring='bleu', se_decoder_input='previous_target', se_predict=None, seed=1, sentence_avg=False, shard_id=0, share_ctc_embed=False, share_input_output_embed=True, shorten_data_split_list='', shorten_method='none', shrink_min=False, sid_decoder_attn_dim=128, sid_embed_dim=128, sid_encoder_cls=None, sid_no_embed_postnet=True, sid_no_pooling_bn=True, sid_pooling_layer='decoder', sid_softmax_type='softmax', single_target=False, skip_invalid_size_inputs_valid_test=True, skip_masked=False, skip_nomask=False, slowmo_algorithm='LocalSGD', slowmo_momentum=None, softmax_easy_margin=False, softmax_margin=0.0, softmax_scale=1.0, spk_embed_dim=512, spk_embed_integration_type='pre', stop_min_lr=-1.0, stop_time_hours=0, subsample_stride='2,2', suppress_crashes=False, t5_task='s2c', target_glu=False, task='speecht5', tensorboard_logdir='/home/wangrui/projects/SpeechT5/experimental/s2c', threshold_loss_scale=None, tokenizer=None, tokens_per_sample=512, tpu=False, train_subset='train', transformer_dec_positional_dropout_rate=0.1, transformer_enc_positional_dropout_rate=0.1, unb_enc_layer=-1, unk=3, untie_final_proj=True, update_freq=[2], use_batch_norm=True, use_bmuf=False, use_codebook=False, use_conv_pos=True, use_guided_attn_loss=False, use_masking=True, use_old_adam=False, use_plasma_view=False, use_sent_enc_layer=True, use_sharded_state=False, use_sinc_pos=True, use_weighted_masking=False, user_dir='/home/wangrui/projects/SpeechT5/SpeechT5/fairseq/examples/speecht5', valid_subset='valid', validate_after_updates=20000, validate_interval=1, validate_interval_updates=0, wandb_project=None, weight_decay=0.1, wer_args=None, wer_kenlm_model=None, wer_lexicon=None, wer_lm_weight=2.0, 
wer_word_score=-1.0, write_checkpoints_asynchronously=False, zero_infinity=False, zero_sharding='none'), 'criterion': {'_name': 'speecht5', 'zero_infinity': False, 'sentence_avg': False, 'post_process': 'sentencepiece', 'wer_kenlm_model': None, 'wer_lexicon': None, 'wer_lm_weight': 2.0, 'wer_word_score': -1.0, 'wer_args': None, 'label_smoothing': 0.0, 'report_accuracy': True, 'ignore_prefix_size': 0, 'ce_weight': 1.0, 'ctc_weight': 0.0, 'use_masking': True, 'use_weighted_masking': False, 'loss_type': 'L1', 'bce_pos_weight': 5.0, 'bce_loss_lambda': 1.0, 'use_guided_attn_loss': False, 'guided_attn_loss_sigma': 0.4, 'guided_attn_loss_lambda': 10.0, 'num_layers_applied_guided_attn': 2, 'num_heads_applied_guided_attn': 2, 'modules_applied_guided_attn': ['encoder-decoder'], 'pred_masked_weight': 1.0, 'pred_nomask_weight': 0.0, 'loss_weights': [0.1], 'log_keys': [], 'hubert_weight': 1.0, 'dec_weight': 1.0, 'bart_weight': 1.0}, 'optimizer': {'_name': 'adam', 'adam_betas': [0.9, 0.999], 'adam_eps': 1e-08, 'weight_decay': 0.1, 'use_old_adam': False, 'tpu': False, 'lr': [1e-08]}, 'lr_scheduler': {'_name': 'triangular', 'max_lr': 0.0002, 'lr_period_updates': 60000.0, 'lr_shrink': 0.5, 'shrink_min': False, 'lr': [1e-08]}, 'scoring': {'_name': 'bleu', 'pad': 1, 'eos': 2, 'unk': 3}, 'bpe': None, 'tokenizer': None} 2023-01-29 16:11:39 | INFO | speecht5.tasks.speecht5 | No config file for s2c 2023-01-29 16:11:39 | INFO | speecht5.tasks.speecht5 | Cannot set input_feat_per_channel, input_channels, since: 2023-01-29 16:11:39 | WARNING | speecht5.tasks.speecht5 | 'NoneType' object has no attribute 'input_feat_per_channel' 2023-01-29 16:11:39 | INFO | speecht5.tasks.speecht5 | Set to: 80 and 1 2023-01-29 16:11:39 | WARNING | speecht5.tasks.speecht5 | 'NoneType' object has no attribute 'input_feat_per_channel' 2023-01-29 16:11:39 | WARNING | speecht5.tasks.speecht5 | 'NoneType' object has no attribute 'input_feat_per_channel' 2023-01-29 16:11:39 | WARNING | speecht5.tasks.speecht5 | 
'NoneType' object has no attribute 'input_feat_per_channel' 2023-01-29 16:11:42 | INFO | speecht5.criterions.speech_to_text_loss | Only using CE loss 2023-01-29 16:11:42 | INFO | fairseq_cli.train | T5TransformerModel( (encoder): TransformerEncoder( (dropout_module): FairseqDropout() (layers): ModuleList( (0): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (1): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (2): TransformerSentenceEncoderLayer( 
(self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (3): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (4): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): 
Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (5): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (6): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): 
LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (7): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (8): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (9): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, 
bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (10): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (11): TransformerSentenceEncoderLayer( (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, 
elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) ) (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (proj): Linear(in_features=768, out_features=1257, bias=True) (pos_emb): RelativePositionalEncoding( (pe_k): Embedding(320, 64) ) ) (decoder): TransformerDecoder( (dropout_module): FairseqDropout() (layers): LayerDropModuleList( (0): TransformerDecoderLayer( (dropout_module): FairseqDropout() (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (activation_dropout_module): FairseqDropout() (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (encoder_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (encoder_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (1): TransformerDecoderLayer( (dropout_module): FairseqDropout() (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, 
out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (activation_dropout_module): FairseqDropout() (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (encoder_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (encoder_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (2): TransformerDecoderLayer( (dropout_module): FairseqDropout() (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (activation_dropout_module): FairseqDropout() (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (encoder_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (encoder_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) 
(final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (3): TransformerDecoderLayer( (dropout_module): FairseqDropout() (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (activation_dropout_module): FairseqDropout() (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (encoder_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (encoder_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (4): TransformerDecoderLayer( (dropout_module): FairseqDropout() (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (activation_dropout_module): FairseqDropout() (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (encoder_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): 
Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (encoder_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) (5): TransformerDecoderLayer( (dropout_module): FairseqDropout() (self_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (activation_dropout_module): FairseqDropout() (self_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (encoder_attn): MultiheadAttention( (dropout_module): FairseqDropout() (k_proj): Linear(in_features=768, out_features=768, bias=True) (v_proj): Linear(in_features=768, out_features=768, bias=True) (q_proj): Linear(in_features=768, out_features=768, bias=True) (out_proj): Linear(in_features=768, out_features=768, bias=True) ) (encoder_attn_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (fc1): Linear(in_features=768, out_features=3072, bias=True) (fc2): Linear(in_features=3072, out_features=768, bias=True) (final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (norm_k): LayerNorm((64,), eps=1e-05, elementwise_affine=True) ) ) (pos_emb): RelativePositionalEncoding( (pe_k): Embedding(320, 64) ) ) (text_encoder_prenet): TextEncoderPrenet( (encoder_prenet): Sequential( (0): Embedding(1257, 768, padding_idx=1) (1): ScaledPositionalEncoding( (dropout): Dropout(p=0.1, inplace=False) ) ) ) (speech_encoder_prenet): 
SpeechEncoderPrenet( (dropout_module): FairseqDropout() (feature_extractor): ConvFeatureExtractionModel( (conv_layers): ModuleList( (0): Sequential( (0): Conv1d(1, 512, kernel_size=(10,), stride=(5,), bias=False) (1): Dropout(p=0.0, inplace=False) (2): Fp32GroupNorm(512, 512, eps=1e-05, affine=True) (3): GELU() ) (1): Sequential( (0): Conv1d(512, 512, kernel_size=(3,), stride=(2,), bias=False) (1): Dropout(p=0.0, inplace=False) (2): GELU() ) (2): Sequential( (0): Conv1d(512, 512, kernel_size=(3,), stride=(2,), bias=False) (1): Dropout(p=0.0, inplace=False) (2): GELU() ) (3): Sequential( (0): Conv1d(512, 512, kernel_size=(3,), stride=(2,), bias=False) (1): Dropout(p=0.0, inplace=False) (2): GELU() ) (4): Sequential( (0): Conv1d(512, 512, kernel_size=(3,), stride=(2,), bias=False) (1): Dropout(p=0.0, inplace=False) (2): GELU() ) (5): Sequential( (0): Conv1d(512, 512, kernel_size=(2,), stride=(2,), bias=False) (1): Dropout(p=0.0, inplace=False) (2): GELU() ) (6): Sequential( (0): Conv1d(512, 512, kernel_size=(2,), stride=(2,), bias=False) (1): Dropout(p=0.0, inplace=False) (2): GELU() ) ) ) (post_extract_proj): Linear(in_features=512, out_features=768, bias=True) (layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (pos_conv): Sequential( (0): Conv1d(768, 768, kernel_size=(128,), stride=(1,), padding=(64,), groups=16) (1): SamePad() (2): GELU() ) (embed_positions): SinusoidalPositionalEmbedding() ) (text_decoder_prenet): TextDecoderPrenet( (dropout_module): FairseqDropout() (embed_tokens): Embedding(1257, 768, padding_idx=1) (embed_positions): SinusoidalPositionalEmbedding() ) (speech_decoder_prenet): SpeechDecoderPrenet( (decoder_prenet): Sequential( (0): Sequential( (0): Prenet( (prenet): ModuleList( (0): Sequential( (0): Linear(in_features=80, out_features=256, bias=True) (1): ReLU() ) (1): Sequential( (0): Linear(in_features=256, out_features=256, bias=True) (1): ReLU() ) ) ) (1): Linear(in_features=256, out_features=768, bias=True) ) (1): 
ScaledPositionalEncoding( (dropout): Dropout(p=0.1, inplace=False) ) ) (spkembs_layer): Sequential( (0): Linear(in_features=1280, out_features=768, bias=True) (1): ReLU() ) ) (text_decoder_postnet): TextDecoderPostnet( (output_projection): Linear(in_features=768, out_features=1257, bias=False) ) (speech_decoder_postnet): SpeechDecoderPostnet( (feat_out): Linear(in_features=768, out_features=160, bias=True) (prob_out): Linear(in_features=768, out_features=2, bias=True) (postnet): Postnet( (postnet): ModuleList( (0): Sequential( (0): Conv1d(80, 256, kernel_size=(5,), stride=(1,), padding=(2,), bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): Tanh() (3): Dropout(p=0.5, inplace=False) ) (1): Sequential( (0): Conv1d(256, 256, kernel_size=(5,), stride=(1,), padding=(2,), bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): Tanh() (3): Dropout(p=0.5, inplace=False) ) (2): Sequential( (0): Conv1d(256, 256, kernel_size=(5,), stride=(1,), padding=(2,), bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): Tanh() (3): Dropout(p=0.5, inplace=False) ) (3): Sequential( (0): Conv1d(256, 256, kernel_size=(5,), stride=(1,), padding=(2,), bias=False) (1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): Tanh() (3): Dropout(p=0.5, inplace=False) ) (4): Sequential( (0): Conv1d(256, 80, kernel_size=(5,), stride=(1,), padding=(2,), bias=False) (1): BatchNorm1d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (2): Dropout(p=0.5, inplace=False) ) ) ) ) (speaker_decoder_postnet): SpeakerDecoderPostnet( (output_projection): Linear(in_features=768, out_features=1257, bias=False) ) ) 2023-01-29 16:11:42 | INFO | fairseq_cli.train | task: SpeechT5Task 2023-01-29 16:11:42 | INFO | fairseq_cli.train | model: T5TransformerModel 2023-01-29 16:11:42 | INFO | fairseq_cli.train | criterion: 
SpeechT5Criterion 2023-01-29 16:11:42 | INFO | fairseq_cli.train | num. shared model params: 156,605,357 (num. trained: 156,605,357) 2023-01-29 16:11:42 | INFO | fairseq_cli.train | num. expert model params: 0 (num. trained: 0) 2023-01-29 16:11:42 | INFO | speecht5.data.speech_to_class_dataset | max_keep=1024000, min_keep=None, loaded 6903, skipped 0 short and 1 long, longest-loaded=975361, shortest-loaded=63361 2023-01-29 16:11:42 | INFO | speecht5.data.speech_to_class_dataset | max_length=76800, normalize=False 2023-01-29 16:11:42 | INFO | torch.distributed.distributed_c10d | Added key: store_based_barrier_key:2 to store for rank: 0 2023-01-29 16:11:42 | INFO | torch.distributed.distributed_c10d | Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: text_encoder_prenet.encoder_prenet.0.weight <- text_decoder_prenet.embed_tokens.weight 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: text_encoder_prenet.encoder_prenet.0.weight <- text_decoder_postnet.output_projection.weight 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_encoder_prenet.feature_extractor.conv_layers.1.0.bias 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_encoder_prenet.feature_extractor.conv_layers.2.0.bias 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_encoder_prenet.feature_extractor.conv_layers.3.0.bias 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_encoder_prenet.feature_extractor.conv_layers.4.0.bias 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: 
speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_encoder_prenet.feature_extractor.conv_layers.5.0.bias 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_encoder_prenet.feature_extractor.conv_layers.6.0.bias 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- text_decoder_postnet.output_projection.bias 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_decoder_postnet.postnet.postnet.0.0.bias 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_decoder_postnet.postnet.postnet.1.0.bias 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_decoder_postnet.postnet.postnet.2.0.bias 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_decoder_postnet.postnet.postnet.3.0.bias 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speech_decoder_postnet.postnet.postnet.4.0.bias 2023-01-29 16:11:42 | INFO | fairseq.trainer | detected shared parameter: speech_encoder_prenet.feature_extractor.conv_layers.0.0.bias <- speaker_decoder_postnet.output_projection.bias 2023-01-29 16:11:42 | INFO | fairseq.utils | ***********************CUDA enviroments for all 4 workers*********************** 2023-01-29 16:11:42 | INFO | fairseq.utils | rank 0: capabilities = 8.6 ; total memory = 11.771 GB ; name = NVIDIA GeForce RTX 3080 Ti 2023-01-29 16:11:42 | INFO | fairseq.utils | rank 1: capabilities = 8.6 ; total memory = 11.771 GB ; name = NVIDIA 
GeForce RTX 3080 Ti 2023-01-29 16:11:42 | INFO | fairseq.utils | rank 2: capabilities = 8.6 ; total memory = 11.771 GB ; name = NVIDIA GeForce RTX 3080 Ti 2023-01-29 16:11:42 | INFO | fairseq.utils | rank 3: capabilities = 8.6 ; total memory = 11.771 GB ; name = NVIDIA GeForce RTX 3080 Ti 2023-01-29 16:11:42 | INFO | fairseq.utils | ***********************CUDA enviroments for all 4 workers*********************** 2023-01-29 16:11:42 | INFO | fairseq_cli.train | training on 4 devices (GPUs/TPUs) 2023-01-29 16:11:42 | INFO | fairseq_cli.train | max tokens per device = None and max sentences per device = 8 2023-01-29 16:11:42 | INFO | fairseq.trainer | Preparing to load checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 16:11:45 | INFO | fairseq.trainer | Loaded checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 2 @ 2162 updates) 2023-01-29 16:11:45 | INFO | fairseq.trainer | loading train data for epoch 2 2023-01-29 16:11:46 | INFO | speecht5.data.speech_to_class_dataset | max_keep=1024000, min_keep=None, loaded 138333, skipped 0 short and 28 long, longest-loaded=1015681, shortest-loaded=63361 2023-01-29 16:11:46 | INFO | speecht5.data.speech_to_class_dataset | max_length=51200, normalize=False 2023-01-29 16:11:52 | INFO | fairseq.trainer | begin training epoch 2 2023-01-29 16:11:52 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 16:11:55 | INFO | train_inner | {"epoch": 2, "update": 1.004, "s2c_loss": "8.723", "loss": "6.04611", "s2c_nll_loss": "8.723", "s2c_accuracy": "9.18", "s2c_total": "64", "s2c_n_correct": "5.875", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "2170", "lr": "1.44759e-05", "gnorm": "4.206", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "13"} 2023-01-29 16:11:58 | INFO | train_inner | {"epoch": 2, "update": 1.008, "s2c_loss": "8.657", "loss": "6.00057", "s2c_nll_loss": "8.657", "s2c_accuracy": "7.969", 
"s2c_total": "64", "s2c_n_correct": "5.1", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "2180", "lr": "1.45426e-05", "gnorm": "4.474", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "16"} 2023-01-29 16:12:01 | INFO | train_inner | {"epoch": 2, "update": 1.013, "s2c_loss": "8.55", "loss": "5.92655", "s2c_nll_loss": "8.55", "s2c_accuracy": "8.281", "s2c_total": "64", "s2c_n_correct": "5.3", "wps": "246.7", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "2190", "lr": "1.46093e-05", "gnorm": "4.27", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "19"} 2023-01-29 16:12:03 | INFO | train_inner | {"epoch": 2, "update": 1.018, "s2c_loss": "8.465", "loss": "5.86777", "s2c_nll_loss": "8.465", "s2c_accuracy": "10", "s2c_total": "64", "s2c_n_correct": "6.4", "wps": "246.4", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "2200", "lr": "1.46759e-05", "gnorm": "4.255", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "21"} 2023-01-29 16:12:06 | INFO | train_inner | {"epoch": 2, "update": 1.022, "s2c_loss": "8.479", "loss": "5.8773", "s2c_nll_loss": "8.479", "s2c_accuracy": "9.844", "s2c_total": "64", "s2c_n_correct": "6.3", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "2210", "lr": "1.47426e-05", "gnorm": "4.575", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "24"} 2023-01-29 16:12:08 | INFO | train_inner | {"epoch": 2, "update": 1.027, "s2c_loss": "8.584", "loss": "5.94977", "s2c_nll_loss": "8.584", "s2c_accuracy": "10.156", "s2c_total": "64", "s2c_n_correct": "6.5", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "2220", "lr": "1.48093e-05", "gnorm": "4.734", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "26"} 2023-01-29 16:12:11 | INFO | train_inner | {"epoch": 2, "update": 1.031, "s2c_loss": "8.5", "loss": "5.89201", "s2c_nll_loss": "8.5", "s2c_accuracy": "7.656", "s2c_total": "64", 
"s2c_n_correct": "4.9", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "2230", "lr": "1.48759e-05", "gnorm": "4.653", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "29"} 2023-01-29 16:12:14 | INFO | train_inner | {"epoch": 2, "update": 1.036, "s2c_loss": "8.613", "loss": "5.96987", "s2c_nll_loss": "8.613", "s2c_accuracy": "7.969", "s2c_total": "64", "s2c_n_correct": "5.1", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "2240", "lr": "1.49426e-05", "gnorm": "4.43", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "31"} 2023-01-29 16:12:16 | INFO | train_inner | {"epoch": 2, "update": 1.041, "s2c_loss": "8.413", "loss": "5.83176", "s2c_nll_loss": "8.413", "s2c_accuracy": "8.75", "s2c_total": "64", "s2c_n_correct": "5.6", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "2250", "lr": "1.50092e-05", "gnorm": "4.465", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "34"} 2023-01-29 16:12:19 | INFO | train_inner | {"epoch": 2, "update": 1.045, "s2c_loss": "8.278", "loss": "5.73806", "s2c_nll_loss": "8.278", "s2c_accuracy": "14.531", "s2c_total": "64", "s2c_n_correct": "9.3", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "2260", "lr": "1.50759e-05", "gnorm": "4.309", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "37"} 2023-01-29 16:12:21 | INFO | train_inner | {"epoch": 2, "update": 1.05, "s2c_loss": "8.451", "loss": "5.85774", "s2c_nll_loss": "8.451", "s2c_accuracy": "8.906", "s2c_total": "64", "s2c_n_correct": "5.7", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "2270", "lr": "1.51426e-05", "gnorm": "4.45", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "39"} 2023-01-29 16:12:24 | INFO | train_inner | {"epoch": 2, "update": 1.055, "s2c_loss": "8.52", "loss": "5.9057", "s2c_nll_loss": "8.52", "s2c_accuracy": "9.219", "s2c_total": "64", "s2c_n_correct": 
"5.9", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "2280", "lr": "1.52092e-05", "gnorm": "4.242", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "42"} 2023-01-29 16:12:26 | INFO | train_inner | {"epoch": 2, "update": 1.059, "s2c_loss": "8.467", "loss": "5.86854", "s2c_nll_loss": "8.467", "s2c_accuracy": "9.219", "s2c_total": "64", "s2c_n_correct": "5.9", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "2290", "lr": "1.52759e-05", "gnorm": "4.533", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "44"} 2023-01-29 16:12:29 | INFO | train_inner | {"epoch": 2, "update": 1.064, "s2c_loss": "8.292", "loss": "5.7474", "s2c_nll_loss": "8.292", "s2c_accuracy": "12.656", "s2c_total": "64", "s2c_n_correct": "8.1", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "2300", "lr": "1.53426e-05", "gnorm": "4.475", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "47"} 2023-01-29 16:12:31 | INFO | train_inner | {"epoch": 2, "update": 1.068, "s2c_loss": "8.382", "loss": "5.80972", "s2c_nll_loss": "8.382", "s2c_accuracy": "10", "s2c_total": "64", "s2c_n_correct": "6.4", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "2310", "lr": "1.54092e-05", "gnorm": "4.454", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "49"} 2023-01-29 16:12:34 | INFO | train_inner | {"epoch": 2, "update": 1.073, "s2c_loss": "8.295", "loss": "5.7498", "s2c_nll_loss": "8.295", "s2c_accuracy": "11.25", "s2c_total": "64", "s2c_n_correct": "7.2", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "2320", "lr": "1.54759e-05", "gnorm": "4.531", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "52"} 2023-01-29 16:12:36 | INFO | train_inner | {"epoch": 2, "update": 1.078, "s2c_loss": "8.491", "loss": "5.88549", "s2c_nll_loss": "8.491", "s2c_accuracy": "9.375", "s2c_total": "64", "s2c_n_correct": "6", "wps": 
"253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "2330", "lr": "1.55426e-05", "gnorm": "4.473", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "54"} 2023-01-29 16:12:39 | INFO | train_inner | {"epoch": 2, "update": 1.082, "s2c_loss": "8.376", "loss": "5.80608", "s2c_nll_loss": "8.376", "s2c_accuracy": "11.562", "s2c_total": "64", "s2c_n_correct": "7.4", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "2340", "lr": "1.56092e-05", "gnorm": "4.773", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "57"} 2023-01-29 16:12:41 | INFO | train_inner | {"epoch": 2, "update": 1.087, "s2c_loss": "8.324", "loss": "5.77004", "s2c_nll_loss": "8.324", "s2c_accuracy": "11.719", "s2c_total": "64", "s2c_n_correct": "7.5", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "2350", "lr": "1.56759e-05", "gnorm": "4.655", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "59"} 2023-01-29 16:12:44 | INFO | train_inner | {"epoch": 2, "update": 1.092, "s2c_loss": "8.466", "loss": "5.86841", "s2c_nll_loss": "8.466", "s2c_accuracy": "8.75", "s2c_total": "64", "s2c_n_correct": "5.6", "wps": "249.3", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "2360", "lr": "1.57425e-05", "gnorm": "4.675", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "62"} 2023-01-29 16:12:47 | INFO | train_inner | {"epoch": 2, "update": 1.096, "s2c_loss": "8.352", "loss": "5.78908", "s2c_nll_loss": "8.352", "s2c_accuracy": "10.469", "s2c_total": "64", "s2c_n_correct": "6.7", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "2370", "lr": "1.58092e-05", "gnorm": "4.574", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "64"} 2023-01-29 16:12:49 | INFO | train_inner | {"epoch": 2, "update": 1.101, "s2c_loss": "8.215", "loss": "5.69419", "s2c_nll_loss": "8.215", "s2c_accuracy": "12.031", "s2c_total": "64", "s2c_n_correct": "7.7", "wps": "254", 
"ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "2380", "lr": "1.58759e-05", "gnorm": "4.696", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "67"} 2023-01-29 16:12:52 | INFO | train_inner | {"epoch": 2, "update": 1.105, "s2c_loss": "8.338", "loss": "5.77979", "s2c_nll_loss": "8.338", "s2c_accuracy": "10.469", "s2c_total": "64", "s2c_n_correct": "6.7", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "2390", "lr": "1.59425e-05", "gnorm": "4.814", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "70"} 2023-01-29 16:12:54 | INFO | train_inner | {"epoch": 2, "update": 1.11, "s2c_loss": "8.327", "loss": "5.77198", "s2c_nll_loss": "8.327", "s2c_accuracy": "10.938", "s2c_total": "64", "s2c_n_correct": "7", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "2400", "lr": "1.60092e-05", "gnorm": "4.659", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "72"} 2023-01-29 16:12:57 | INFO | train_inner | {"epoch": 2, "update": 1.115, "s2c_loss": "8.333", "loss": "5.77632", "s2c_nll_loss": "8.333", "s2c_accuracy": "11.25", "s2c_total": "64", "s2c_n_correct": "7.2", "wps": "246.7", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "2410", "lr": "1.60759e-05", "gnorm": "5.18", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "75"} 2023-01-29 16:12:59 | INFO | train_inner | {"epoch": 2, "update": 1.119, "s2c_loss": "8.213", "loss": "5.69282", "s2c_nll_loss": "8.213", "s2c_accuracy": "11.562", "s2c_total": "64", "s2c_n_correct": "7.4", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "2420", "lr": "1.61425e-05", "gnorm": "4.488", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "77"} 2023-01-29 16:13:02 | INFO | train_inner | {"epoch": 2, "update": 1.124, "s2c_loss": "8.236", "loss": "5.70849", "s2c_nll_loss": "8.236", "s2c_accuracy": "11.406", "s2c_total": "64", "s2c_n_correct": "7.3", "wps": "249.8", "ups": "3.9", 
"wpb": "64", "bsz": "64", "num_updates": "2430", "lr": "1.62092e-05", "gnorm": "4.555", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "80"} 2023-01-29 16:13:04 | INFO | train_inner | {"epoch": 2, "update": 1.129, "s2c_loss": "8.31", "loss": "5.76002", "s2c_nll_loss": "8.31", "s2c_accuracy": "10.938", "s2c_total": "64", "s2c_n_correct": "7", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "2440", "lr": "1.62759e-05", "gnorm": "4.785", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "82"} 2023-01-29 16:13:07 | INFO | train_inner | {"epoch": 2, "update": 1.133, "s2c_loss": "8.179", "loss": "5.66899", "s2c_nll_loss": "8.179", "s2c_accuracy": "11.406", "s2c_total": "64", "s2c_n_correct": "7.3", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "2450", "lr": "1.63425e-05", "gnorm": "4.873", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "85"} 2023-01-29 16:13:09 | INFO | train_inner | {"epoch": 2, "update": 1.138, "s2c_loss": "8.135", "loss": "5.63887", "s2c_nll_loss": "8.135", "s2c_accuracy": "12.656", "s2c_total": "64", "s2c_n_correct": "8.1", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "2460", "lr": "1.64092e-05", "gnorm": "4.427", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "87"} 2023-01-29 16:13:12 | INFO | train_inner | {"epoch": 2, "update": 1.142, "s2c_loss": "8.162", "loss": "5.65758", "s2c_nll_loss": "8.162", "s2c_accuracy": "11.719", "s2c_total": "64", "s2c_n_correct": "7.5", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "2470", "lr": "1.64758e-05", "gnorm": "4.542", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "90"} 2023-01-29 16:13:14 | INFO | train_inner | {"epoch": 2, "update": 1.147, "s2c_loss": "8.088", "loss": "5.60631", "s2c_nll_loss": "8.088", "s2c_accuracy": "12.344", "s2c_total": "64", "s2c_n_correct": "7.9", "wps": "249.5", "ups": "3.9", "wpb": "64", 
"bsz": "64", "num_updates": "2480", "lr": "1.65425e-05", "gnorm": "4.753", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "92"} 2023-01-29 16:13:17 | INFO | train_inner | {"epoch": 2, "update": 1.152, "s2c_loss": "8.183", "loss": "5.67234", "s2c_nll_loss": "8.183", "s2c_accuracy": "11.094", "s2c_total": "64", "s2c_n_correct": "7.1", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "2490", "lr": "1.66092e-05", "gnorm": "4.67", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "95"} 2023-01-29 16:13:20 | INFO | train_inner | {"epoch": 2, "update": 1.156, "s2c_loss": "7.993", "loss": "5.54", "s2c_nll_loss": "7.993", "s2c_accuracy": "12.5", "s2c_total": "64", "s2c_n_correct": "8", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "2500", "lr": "1.66758e-05", "gnorm": "4.828", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "98"} 2023-01-29 16:13:22 | INFO | train_inner | {"epoch": 2, "update": 1.161, "s2c_loss": "8.02", "loss": "5.55926", "s2c_nll_loss": "8.02", "s2c_accuracy": "13.906", "s2c_total": "64", "s2c_n_correct": "8.9", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "2510", "lr": "1.67425e-05", "gnorm": "4.619", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "100"} 2023-01-29 16:13:25 | INFO | train_inner | {"epoch": 2, "update": 1.166, "s2c_loss": "8.117", "loss": "5.62628", "s2c_nll_loss": "8.117", "s2c_accuracy": "12.656", "s2c_total": "64", "s2c_n_correct": "8.1", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "2520", "lr": "1.68092e-05", "gnorm": "4.4", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "103"} 2023-01-29 16:13:27 | INFO | train_inner | {"epoch": 2, "update": 1.17, "s2c_loss": "8.101", "loss": "5.61541", "s2c_nll_loss": "8.101", "s2c_accuracy": "12.5", "s2c_total": "64", "s2c_n_correct": "8", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", 
"num_updates": "2530", "lr": "1.68758e-05", "gnorm": "4.736", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "105"} 2023-01-29 16:13:30 | INFO | train_inner | {"epoch": 2, "update": 1.175, "s2c_loss": "8.054", "loss": "5.58262", "s2c_nll_loss": "8.054", "s2c_accuracy": "13.125", "s2c_total": "64", "s2c_n_correct": "8.4", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "2540", "lr": "1.69425e-05", "gnorm": "4.678", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "108"} 2023-01-29 16:13:32 | INFO | train_inner | {"epoch": 2, "update": 1.179, "s2c_loss": "8.115", "loss": "5.62467", "s2c_nll_loss": "8.115", "s2c_accuracy": "10.781", "s2c_total": "64", "s2c_n_correct": "6.9", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "2550", "lr": "1.70091e-05", "gnorm": "4.827", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "110"} 2023-01-29 16:13:35 | INFO | train_inner | {"epoch": 2, "update": 1.184, "s2c_loss": "8.025", "loss": "5.56221", "s2c_nll_loss": "8.025", "s2c_accuracy": "13.125", "s2c_total": "64", "s2c_n_correct": "8.4", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "2560", "lr": "1.70758e-05", "gnorm": "4.78", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "113"} 2023-01-29 16:13:37 | INFO | train_inner | {"epoch": 2, "update": 1.189, "s2c_loss": "8.027", "loss": "5.56418", "s2c_nll_loss": "8.027", "s2c_accuracy": "13.438", "s2c_total": "64", "s2c_n_correct": "8.6", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "2570", "lr": "1.71425e-05", "gnorm": "4.953", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "115"} 2023-01-29 16:13:40 | INFO | train_inner | {"epoch": 2, "update": 1.193, "s2c_loss": "7.999", "loss": "5.54438", "s2c_nll_loss": "7.999", "s2c_accuracy": "14.062", "s2c_total": "64", "s2c_n_correct": "9", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", 
"num_updates": "2580", "lr": "1.72091e-05", "gnorm": "5.021", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "118"} 2023-01-29 16:13:43 | INFO | train_inner | {"epoch": 2, "update": 1.198, "s2c_loss": "7.967", "loss": "5.52263", "s2c_nll_loss": "7.967", "s2c_accuracy": "13.438", "s2c_total": "64", "s2c_n_correct": "8.6", "wps": "247.5", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "2590", "lr": "1.72758e-05", "gnorm": "4.993", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "120"} 2023-01-29 16:13:45 | INFO | train_inner | {"epoch": 2, "update": 1.203, "s2c_loss": "7.886", "loss": "5.46612", "s2c_nll_loss": "7.886", "s2c_accuracy": "14.375", "s2c_total": "64", "s2c_n_correct": "9.2", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "2600", "lr": "1.73425e-05", "gnorm": "4.733", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "123"} 2023-01-29 16:13:48 | INFO | train_inner | {"epoch": 2, "update": 1.207, "s2c_loss": "7.803", "loss": "5.40842", "s2c_nll_loss": "7.803", "s2c_accuracy": "17.188", "s2c_total": "64", "s2c_n_correct": "11", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "2610", "lr": "1.74091e-05", "gnorm": "4.947", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "126"} 2023-01-29 16:13:50 | INFO | train_inner | {"epoch": 2, "update": 1.212, "s2c_loss": "7.801", "loss": "5.40725", "s2c_nll_loss": "7.801", "s2c_accuracy": "16.25", "s2c_total": "64", "s2c_n_correct": "10.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "2620", "lr": "1.74758e-05", "gnorm": "5.151", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "128"} 2023-01-29 16:13:53 | INFO | train_inner | {"epoch": 2, "update": 1.216, "s2c_loss": "7.922", "loss": "5.49114", "s2c_nll_loss": "7.922", "s2c_accuracy": "13.594", "s2c_total": "64", "s2c_n_correct": "8.7", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", 
"num_updates": "2630", "lr": "1.75425e-05", "gnorm": "5.129", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "131"} 2023-01-29 16:13:55 | INFO | train_inner | {"epoch": 2, "update": 1.221, "s2c_loss": "7.911", "loss": "5.48339", "s2c_nll_loss": "7.911", "s2c_accuracy": "12.969", "s2c_total": "64", "s2c_n_correct": "8.3", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "2640", "lr": "1.76091e-05", "gnorm": "5.061", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "133"} 2023-01-29 16:13:58 | INFO | train_inner | {"epoch": 2, "update": 1.226, "s2c_loss": "7.929", "loss": "5.49611", "s2c_nll_loss": "7.929", "s2c_accuracy": "12.656", "s2c_total": "64", "s2c_n_correct": "8.1", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "2650", "lr": "1.76758e-05", "gnorm": "5.09", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "136"} 2023-01-29 16:14:00 | INFO | train_inner | {"epoch": 2, "update": 1.23, "s2c_loss": "7.933", "loss": "5.49882", "s2c_nll_loss": "7.933", "s2c_accuracy": "12.031", "s2c_total": "64", "s2c_n_correct": "7.7", "wps": "256.3", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "2660", "lr": "1.77424e-05", "gnorm": "4.946", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "138"} 2023-01-29 16:14:03 | INFO | train_inner | {"epoch": 2, "update": 1.235, "s2c_loss": "7.958", "loss": "5.51596", "s2c_nll_loss": "7.958", "s2c_accuracy": "15.938", "s2c_total": "64", "s2c_n_correct": "10.2", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "2670", "lr": "1.78091e-05", "gnorm": "5.008", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "141"} 2023-01-29 16:14:05 | INFO | train_inner | {"epoch": 2, "update": 1.24, "s2c_loss": "7.897", "loss": "5.47355", "s2c_nll_loss": "7.897", "s2c_accuracy": "13.906", "s2c_total": "64", "s2c_n_correct": "8.9", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", 
"num_updates": "2680", "lr": "1.78758e-05", "gnorm": "5.033", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "143"} 2023-01-29 16:14:08 | INFO | train_inner | {"epoch": 2, "update": 1.244, "s2c_loss": "7.812", "loss": "5.41466", "s2c_nll_loss": "7.812", "s2c_accuracy": "14.375", "s2c_total": "64", "s2c_n_correct": "9.2", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "2690", "lr": "1.79424e-05", "gnorm": "5.037", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "146"} 2023-01-29 16:14:10 | INFO | train_inner | {"epoch": 2, "update": 1.249, "s2c_loss": "7.684", "loss": "5.32612", "s2c_nll_loss": "7.684", "s2c_accuracy": "15.781", "s2c_total": "64", "s2c_n_correct": "10.1", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "2700", "lr": "1.80091e-05", "gnorm": "4.996", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "148"} 2023-01-29 16:14:13 | INFO | train_inner | {"epoch": 2, "update": 1.253, "s2c_loss": "7.736", "loss": "5.3621", "s2c_nll_loss": "7.736", "s2c_accuracy": "16.25", "s2c_total": "64", "s2c_n_correct": "10.4", "wps": "246.7", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "2710", "lr": "1.80758e-05", "gnorm": "5.233", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "151"} 2023-01-29 16:14:16 | INFO | train_inner | {"epoch": 2, "update": 1.258, "s2c_loss": "7.621", "loss": "5.28241", "s2c_nll_loss": "7.621", "s2c_accuracy": "17.812", "s2c_total": "64", "s2c_n_correct": "11.4", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "2720", "lr": "1.81424e-05", "gnorm": "4.824", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "154"} 2023-01-29 16:14:18 | INFO | train_inner | {"epoch": 2, "update": 1.263, "s2c_loss": "7.627", "loss": "5.28666", "s2c_nll_loss": "7.627", "s2c_accuracy": "17.344", "s2c_total": "64", "s2c_n_correct": "11.1", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", 
"num_updates": "2730", "lr": "1.82091e-05", "gnorm": "5.11", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "156"} 2023-01-29 16:14:21 | INFO | train_inner | {"epoch": 2, "update": 1.267, "s2c_loss": "7.841", "loss": "5.43486", "s2c_nll_loss": "7.841", "s2c_accuracy": "15", "s2c_total": "64", "s2c_n_correct": "9.6", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "2740", "lr": "1.82758e-05", "gnorm": "4.716", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "159"} 2023-01-29 16:14:23 | INFO | train_inner | {"epoch": 2, "update": 1.272, "s2c_loss": "7.678", "loss": "5.32233", "s2c_nll_loss": "7.678", "s2c_accuracy": "15.312", "s2c_total": "64", "s2c_n_correct": "9.8", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "2750", "lr": "1.83424e-05", "gnorm": "5.16", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "161"} 2023-01-29 16:14:26 | INFO | train_inner | {"epoch": 2, "update": 1.277, "s2c_loss": "7.634", "loss": "5.29144", "s2c_nll_loss": "7.634", "s2c_accuracy": "16.875", "s2c_total": "64", "s2c_n_correct": "10.8", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "2760", "lr": "1.84091e-05", "gnorm": "4.965", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "164"} 2023-01-29 16:14:28 | INFO | train_inner | {"epoch": 2, "update": 1.281, "s2c_loss": "7.743", "loss": "5.36688", "s2c_nll_loss": "7.743", "s2c_accuracy": "15.156", "s2c_total": "64", "s2c_n_correct": "9.7", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "2770", "lr": "1.84757e-05", "gnorm": "5.092", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "166"} 2023-01-29 16:14:31 | INFO | train_inner | {"epoch": 2, "update": 1.286, "s2c_loss": "7.785", "loss": "5.39586", "s2c_nll_loss": "7.785", "s2c_accuracy": "13.906", "s2c_total": "64", "s2c_n_correct": "8.9", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", 
"num_updates": "2780", "lr": "1.85424e-05", "gnorm": "4.811", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "169"} 2023-01-29 16:14:34 | INFO | train_inner | {"epoch": 2, "update": 1.29, "s2c_loss": "7.694", "loss": "5.333", "s2c_nll_loss": "7.694", "s2c_accuracy": "18.281", "s2c_total": "64", "s2c_n_correct": "11.7", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "2790", "lr": "1.86091e-05", "gnorm": "4.902", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "171"} 2023-01-29 16:14:36 | INFO | train_inner | {"epoch": 2, "update": 1.295, "s2c_loss": "7.601", "loss": "5.26832", "s2c_nll_loss": "7.601", "s2c_accuracy": "18.125", "s2c_total": "64", "s2c_n_correct": "11.6", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "2800", "lr": "1.86757e-05", "gnorm": "5.054", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "174"} 2023-01-29 16:14:39 | INFO | train_inner | {"epoch": 2, "update": 1.3, "s2c_loss": "7.71", "loss": "5.34407", "s2c_nll_loss": "7.71", "s2c_accuracy": "14.062", "s2c_total": "64", "s2c_n_correct": "9", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "2810", "lr": "1.87424e-05", "gnorm": "5.242", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "176"} 2023-01-29 16:14:41 | INFO | train_inner | {"epoch": 2, "update": 1.304, "s2c_loss": "7.632", "loss": "5.28988", "s2c_nll_loss": "7.632", "s2c_accuracy": "16.875", "s2c_total": "64", "s2c_n_correct": "10.8", "wps": "257.6", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "2820", "lr": "1.88091e-05", "gnorm": "4.989", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "179"} 2023-01-29 16:14:44 | INFO | train_inner | {"epoch": 2, "update": 1.309, "s2c_loss": "7.752", "loss": "5.37338", "s2c_nll_loss": "7.752", "s2c_accuracy": "16.094", "s2c_total": "64", "s2c_n_correct": "10.3", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", 
"num_updates": "2830", "lr": "1.88757e-05", "gnorm": "5.084", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "181"} 2023-01-29 16:14:46 | INFO | train_inner | {"epoch": 2, "update": 1.314, "s2c_loss": "7.502", "loss": "5.20031", "s2c_nll_loss": "7.502", "s2c_accuracy": "18.75", "s2c_total": "64", "s2c_n_correct": "12", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "2840", "lr": "1.89424e-05", "gnorm": "5.273", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "184"} 2023-01-29 16:14:49 | INFO | train_inner | {"epoch": 2, "update": 1.318, "s2c_loss": "7.357", "loss": "5.09922", "s2c_nll_loss": "7.357", "s2c_accuracy": "20.312", "s2c_total": "64", "s2c_n_correct": "13", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "2850", "lr": "1.9009e-05", "gnorm": "5.355", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "187"} 2023-01-29 16:14:51 | INFO | train_inner | {"epoch": 2, "update": 1.323, "s2c_loss": "7.739", "loss": "5.36439", "s2c_nll_loss": "7.739", "s2c_accuracy": "14.375", "s2c_total": "64", "s2c_n_correct": "9.2", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "2860", "lr": "1.90757e-05", "gnorm": "4.987", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "189"} 2023-01-29 16:14:54 | INFO | train_inner | {"epoch": 2, "update": 1.327, "s2c_loss": "7.393", "loss": "5.12471", "s2c_nll_loss": "7.393", "s2c_accuracy": "19.062", "s2c_total": "64", "s2c_n_correct": "12.2", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "2870", "lr": "1.91424e-05", "gnorm": "5.034", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "192"} 2023-01-29 16:14:56 | INFO | train_inner | {"epoch": 2, "update": 1.332, "s2c_loss": "7.439", "loss": "5.15653", "s2c_nll_loss": "7.439", "s2c_accuracy": "16.25", "s2c_total": "64", "s2c_n_correct": "10.4", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", 
"num_updates": "2880", "lr": "1.9209e-05", "gnorm": "4.939", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "194"} 2023-01-29 16:14:59 | INFO | train_inner | {"epoch": 2, "update": 1.337, "s2c_loss": "7.7", "loss": "5.33718", "s2c_nll_loss": "7.7", "s2c_accuracy": "16.719", "s2c_total": "64", "s2c_n_correct": "10.7", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "2890", "lr": "1.92757e-05", "gnorm": "5.071", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "197"} 2023-01-29 16:15:01 | INFO | train_inner | {"epoch": 2, "update": 1.341, "s2c_loss": "7.542", "loss": "5.22758", "s2c_nll_loss": "7.542", "s2c_accuracy": "17.5", "s2c_total": "64", "s2c_n_correct": "11.2", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "2900", "lr": "1.93424e-05", "gnorm": "5.366", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "199"} 2023-01-29 16:15:04 | INFO | train_inner | {"epoch": 2, "update": 1.346, "s2c_loss": "7.445", "loss": "5.16065", "s2c_nll_loss": "7.445", "s2c_accuracy": "18.281", "s2c_total": "64", "s2c_n_correct": "11.7", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "2910", "lr": "1.9409e-05", "gnorm": "4.92", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "202"} 2023-01-29 16:15:06 | INFO | train_inner | {"epoch": 2, "update": 1.351, "s2c_loss": "7.568", "loss": "5.24593", "s2c_nll_loss": "7.568", "s2c_accuracy": "17.656", "s2c_total": "64", "s2c_n_correct": "11.3", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "2920", "lr": "1.94757e-05", "gnorm": "4.865", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "204"} 2023-01-29 16:15:09 | INFO | train_inner | {"epoch": 2, "update": 1.355, "s2c_loss": "7.4", "loss": "5.12935", "s2c_nll_loss": "7.4", "s2c_accuracy": "17.656", "s2c_total": "64", "s2c_n_correct": "11.3", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": 
"2930", "lr": "1.95424e-05", "gnorm": "5.072", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "207"} 2023-01-29 16:15:11 | INFO | train_inner | {"epoch": 2, "update": 1.36, "s2c_loss": "7.36", "loss": "5.10157", "s2c_nll_loss": "7.36", "s2c_accuracy": "19.375", "s2c_total": "64", "s2c_n_correct": "12.4", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "2940", "lr": "1.9609e-05", "gnorm": "5.373", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "209"} 2023-01-29 16:15:14 | INFO | train_inner | {"epoch": 2, "update": 1.364, "s2c_loss": "7.462", "loss": "5.17258", "s2c_nll_loss": "7.462", "s2c_accuracy": "17.344", "s2c_total": "64", "s2c_n_correct": "11.1", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "2950", "lr": "1.96757e-05", "gnorm": "5.46", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "212"} 2023-01-29 16:15:17 | INFO | train_inner | {"epoch": 2, "update": 1.369, "s2c_loss": "7.373", "loss": "5.11029", "s2c_nll_loss": "7.373", "s2c_accuracy": "16.875", "s2c_total": "64", "s2c_n_correct": "10.8", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "2960", "lr": "1.97423e-05", "gnorm": "5.189", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "214"} 2023-01-29 16:15:19 | INFO | train_inner | {"epoch": 2, "update": 1.374, "s2c_loss": "7.407", "loss": "5.13389", "s2c_nll_loss": "7.407", "s2c_accuracy": "18.125", "s2c_total": "64", "s2c_n_correct": "11.6", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "2970", "lr": "1.9809e-05", "gnorm": "5.243", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "217"} 2023-01-29 16:15:22 | INFO | train_inner | {"epoch": 2, "update": 1.378, "s2c_loss": "7.385", "loss": "5.11884", "s2c_nll_loss": "7.385", "s2c_accuracy": "19.531", "s2c_total": "64", "s2c_n_correct": "12.5", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "2980", 
"lr": "1.98757e-05", "gnorm": "5.663", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "220"} 2023-01-29 16:15:24 | INFO | train_inner | {"epoch": 2, "update": 1.383, "s2c_loss": "7.319", "loss": "5.07328", "s2c_nll_loss": "7.319", "s2c_accuracy": "20", "s2c_total": "64", "s2c_n_correct": "12.8", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "2990", "lr": "1.99423e-05", "gnorm": "5.599", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "222"} 2023-01-29 16:15:27 | INFO | train_inner | {"epoch": 2, "update": 1.388, "s2c_loss": "7.325", "loss": "5.07721", "s2c_nll_loss": "7.325", "s2c_accuracy": "20.781", "s2c_total": "64", "s2c_n_correct": "13.3", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "3000", "lr": "2.0009e-05", "gnorm": "5.87", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "225"} 2023-01-29 16:15:29 | INFO | train_inner | {"epoch": 2, "update": 1.392, "s2c_loss": "7.295", "loss": "5.05683", "s2c_nll_loss": "7.295", "s2c_accuracy": "19.375", "s2c_total": "64", "s2c_n_correct": "12.4", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "3010", "lr": "2.00757e-05", "gnorm": "5.871", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "227"} 2023-01-29 16:15:32 | INFO | train_inner | {"epoch": 2, "update": 1.397, "s2c_loss": "7.39", "loss": "5.1222", "s2c_nll_loss": "7.39", "s2c_accuracy": "18.438", "s2c_total": "64", "s2c_n_correct": "11.8", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "3020", "lr": "2.01423e-05", "gnorm": "5.573", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "230"} 2023-01-29 16:15:34 | INFO | train_inner | {"epoch": 2, "update": 1.401, "s2c_loss": "7.48", "loss": "5.185", "s2c_nll_loss": "7.48", "s2c_accuracy": "17.656", "s2c_total": "64", "s2c_n_correct": "11.3", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "3030", "lr": 
"2.0209e-05", "gnorm": "5.266", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "232"} 2023-01-29 16:15:37 | INFO | train_inner | {"epoch": 2, "update": 1.406, "s2c_loss": "7.092", "loss": "4.91588", "s2c_nll_loss": "7.092", "s2c_accuracy": "22.969", "s2c_total": "64", "s2c_n_correct": "14.7", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "3040", "lr": "2.02757e-05", "gnorm": "5.134", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "235"} 2023-01-29 16:15:39 | INFO | train_inner | {"epoch": 2, "update": 1.411, "s2c_loss": "7.266", "loss": "5.03614", "s2c_nll_loss": "7.266", "s2c_accuracy": "19.688", "s2c_total": "64", "s2c_n_correct": "12.6", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "3050", "lr": "2.03423e-05", "gnorm": "4.91", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "237"} 2023-01-29 16:15:42 | INFO | train_inner | {"epoch": 2, "update": 1.415, "s2c_loss": "7.259", "loss": "5.03155", "s2c_nll_loss": "7.259", "s2c_accuracy": "18.75", "s2c_total": "64", "s2c_n_correct": "12", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "3060", "lr": "2.0409e-05", "gnorm": "5.102", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "240"} 2023-01-29 16:15:44 | INFO | train_inner | {"epoch": 2, "update": 1.42, "s2c_loss": "6.971", "loss": "4.83193", "s2c_nll_loss": "6.971", "s2c_accuracy": "24.062", "s2c_total": "64", "s2c_n_correct": "15.4", "wps": "258.7", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "3070", "lr": "2.04756e-05", "gnorm": "5.681", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "242"} 2023-01-29 16:15:47 | INFO | train_inner | {"epoch": 2, "update": 1.425, "s2c_loss": "7.078", "loss": "4.90611", "s2c_nll_loss": "7.078", "s2c_accuracy": "22.344", "s2c_total": "64", "s2c_n_correct": "14.3", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "3080", "lr": 
"2.05423e-05", "gnorm": "5.361", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "245"} 2023-01-29 16:15:50 | INFO | train_inner | {"epoch": 2, "update": 1.429, "s2c_loss": "7.201", "loss": "4.99154", "s2c_nll_loss": "7.201", "s2c_accuracy": "19.688", "s2c_total": "64", "s2c_n_correct": "12.6", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "3090", "lr": "2.0609e-05", "gnorm": "5.784", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "247"} 2023-01-29 16:15:52 | INFO | train_inner | {"epoch": 2, "update": 1.434, "s2c_loss": "7.028", "loss": "4.87127", "s2c_nll_loss": "7.028", "s2c_accuracy": "25.312", "s2c_total": "64", "s2c_n_correct": "16.2", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "3100", "lr": "2.06756e-05", "gnorm": "5.351", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "250"} 2023-01-29 16:15:55 | INFO | train_inner | {"epoch": 2, "update": 1.438, "s2c_loss": "7.062", "loss": "4.89481", "s2c_nll_loss": "7.062", "s2c_accuracy": "20.469", "s2c_total": "64", "s2c_n_correct": "13.1", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "3110", "lr": "2.07423e-05", "gnorm": "5.597", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "253"} 2023-01-29 16:15:57 | INFO | train_inner | {"epoch": 2, "update": 1.443, "s2c_loss": "6.854", "loss": "4.75113", "s2c_nll_loss": "6.854", "s2c_accuracy": "24.844", "s2c_total": "64", "s2c_n_correct": "15.9", "wps": "247.5", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "3120", "lr": "2.0809e-05", "gnorm": "5.687", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "255"} 2023-01-29 16:16:00 | INFO | train_inner | {"epoch": 2, "update": 1.448, "s2c_loss": "6.925", "loss": "4.80028", "s2c_nll_loss": "6.925", "s2c_accuracy": "23.75", "s2c_total": "64", "s2c_n_correct": "15.2", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "3130", "lr": 
"2.08756e-05", "gnorm": "5.91", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "258"} 2023-01-29 16:16:02 | INFO | train_inner | {"epoch": 2, "update": 1.452, "s2c_loss": "7.146", "loss": "4.95339", "s2c_nll_loss": "7.146", "s2c_accuracy": "22.188", "s2c_total": "64", "s2c_n_correct": "14.2", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "3140", "lr": "2.09423e-05", "gnorm": "5.832", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "260"} 2023-01-29 16:16:05 | INFO | train_inner | {"epoch": 2, "update": 1.457, "s2c_loss": "7.155", "loss": "4.95923", "s2c_nll_loss": "7.155", "s2c_accuracy": "20.781", "s2c_total": "64", "s2c_n_correct": "13.3", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "3150", "lr": "2.10089e-05", "gnorm": "5.315", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "263"} 2023-01-29 16:16:07 | INFO | train_inner | {"epoch": 2, "update": 1.462, "s2c_loss": "6.931", "loss": "4.80448", "s2c_nll_loss": "6.931", "s2c_accuracy": "22.812", "s2c_total": "64", "s2c_n_correct": "14.6", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "3160", "lr": "2.10756e-05", "gnorm": "5.955", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "265"} 2023-01-29 16:16:10 | INFO | train_inner | {"epoch": 2, "update": 1.466, "s2c_loss": "6.898", "loss": "4.78141", "s2c_nll_loss": "6.898", "s2c_accuracy": "23.75", "s2c_total": "64", "s2c_n_correct": "15.2", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "3170", "lr": "2.11423e-05", "gnorm": "5.89", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "268"} 2023-01-29 16:16:12 | INFO | train_inner | {"epoch": 2, "update": 1.471, "s2c_loss": "7.065", "loss": "4.89683", "s2c_nll_loss": "7.065", "s2c_accuracy": "20.781", "s2c_total": "64", "s2c_n_correct": "13.3", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "3180", "lr": 
"2.12089e-05", "gnorm": "5.688", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "270"} 2023-01-29 16:16:15 | INFO | train_inner | {"epoch": 2, "update": 1.475, "s2c_loss": "7.065", "loss": "4.89715", "s2c_nll_loss": "7.065", "s2c_accuracy": "21.25", "s2c_total": "64", "s2c_n_correct": "13.6", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "3190", "lr": "2.12756e-05", "gnorm": "6.222", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "273"} 2023-01-29 16:16:18 | INFO | train_inner | {"epoch": 2, "update": 1.48, "s2c_loss": "6.924", "loss": "4.79959", "s2c_nll_loss": "6.924", "s2c_accuracy": "23.594", "s2c_total": "64", "s2c_n_correct": "15.1", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "3200", "lr": "2.13423e-05", "gnorm": "5.554", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "275"} 2023-01-29 16:16:20 | INFO | train_inner | {"epoch": 2, "update": 1.485, "s2c_loss": "6.863", "loss": "4.75721", "s2c_nll_loss": "6.863", "s2c_accuracy": "21.719", "s2c_total": "64", "s2c_n_correct": "13.9", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "3210", "lr": "2.14089e-05", "gnorm": "5.636", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "278"} 2023-01-29 16:16:23 | INFO | train_inner | {"epoch": 2, "update": 1.489, "s2c_loss": "6.827", "loss": "4.7318", "s2c_nll_loss": "6.827", "s2c_accuracy": "23.125", "s2c_total": "64", "s2c_n_correct": "14.8", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "3220", "lr": "2.14756e-05", "gnorm": "5.822", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "281"} 2023-01-29 16:16:25 | INFO | train_inner | {"epoch": 2, "update": 1.494, "s2c_loss": "6.794", "loss": "4.70907", "s2c_nll_loss": "6.794", "s2c_accuracy": "23.594", "s2c_total": "64", "s2c_n_correct": "15.1", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "3230", "lr": 
"2.15423e-05", "gnorm": "5.756", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "283"} 2023-01-29 16:16:28 | INFO | train_inner | {"epoch": 2, "update": 1.499, "s2c_loss": "7.001", "loss": "4.85255", "s2c_nll_loss": "7.001", "s2c_accuracy": "21.25", "s2c_total": "64", "s2c_n_correct": "13.6", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "3240", "lr": "2.16089e-05", "gnorm": "6.133", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "286"} 2023-01-29 16:16:30 | INFO | train_inner | {"epoch": 2, "update": 1.503, "s2c_loss": "6.763", "loss": "4.68765", "s2c_nll_loss": "6.763", "s2c_accuracy": "24.844", "s2c_total": "64", "s2c_n_correct": "15.9", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "3250", "lr": "2.16756e-05", "gnorm": "6.076", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "288"} 2023-01-29 16:16:33 | INFO | train_inner | {"epoch": 2, "update": 1.508, "s2c_loss": "7.04", "loss": "4.88008", "s2c_nll_loss": "7.04", "s2c_accuracy": "22.969", "s2c_total": "64", "s2c_n_correct": "14.7", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "3260", "lr": "2.17422e-05", "gnorm": "5.632", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "291"} 2023-01-29 16:16:35 | INFO | train_inner | {"epoch": 2, "update": 1.512, "s2c_loss": "6.805", "loss": "4.71705", "s2c_nll_loss": "6.805", "s2c_accuracy": "24.219", "s2c_total": "64", "s2c_n_correct": "15.5", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "3270", "lr": "2.18089e-05", "gnorm": "6.264", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "293"} 2023-01-29 16:16:38 | INFO | train_inner | {"epoch": 2, "update": 1.517, "s2c_loss": "7.103", "loss": "4.92318", "s2c_nll_loss": "7.103", "s2c_accuracy": "20.781", "s2c_total": "64", "s2c_n_correct": "13.3", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "3280", "lr": 
"2.18756e-05", "gnorm": "5.78", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "296"} 2023-01-29 16:16:41 | INFO | train_inner | {"epoch": 2, "update": 1.522, "s2c_loss": "7.108", "loss": "4.92715", "s2c_nll_loss": "7.108", "s2c_accuracy": "20.156", "s2c_total": "64", "s2c_n_correct": "12.9", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "3290", "lr": "2.19422e-05", "gnorm": "6.268", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "299"} 2023-01-29 16:16:43 | INFO | train_inner | {"epoch": 2, "update": 1.526, "s2c_loss": "7.057", "loss": "4.89125", "s2c_nll_loss": "7.057", "s2c_accuracy": "20.625", "s2c_total": "64", "s2c_n_correct": "13.2", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "3300", "lr": "2.20089e-05", "gnorm": "6.039", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "301"} 2023-01-29 16:16:46 | INFO | train_inner | {"epoch": 2, "update": 1.531, "s2c_loss": "6.697", "loss": "4.64223", "s2c_nll_loss": "6.697", "s2c_accuracy": "25.625", "s2c_total": "64", "s2c_n_correct": "16.4", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "3310", "lr": "2.20756e-05", "gnorm": "6.748", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "304"} 2023-01-29 16:16:48 | INFO | train_inner | {"epoch": 2, "update": 1.536, "s2c_loss": "6.72", "loss": "4.65768", "s2c_nll_loss": "6.72", "s2c_accuracy": "24.531", "s2c_total": "64", "s2c_n_correct": "15.7", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "3320", "lr": "2.21422e-05", "gnorm": "6.022", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "306"} 2023-01-29 16:16:51 | INFO | train_inner | {"epoch": 2, "update": 1.54, "s2c_loss": "6.648", "loss": "4.60812", "s2c_nll_loss": "6.648", "s2c_accuracy": "25.469", "s2c_total": "64", "s2c_n_correct": "16.3", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "3330", "lr": 
"2.22089e-05", "gnorm": "6.15", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "309"} 2023-01-29 16:16:53 | INFO | train_inner | {"epoch": 2, "update": 1.545, "s2c_loss": "6.996", "loss": "4.84907", "s2c_nll_loss": "6.996", "s2c_accuracy": "22.344", "s2c_total": "64", "s2c_n_correct": "14.3", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "3340", "lr": "2.22756e-05", "gnorm": "5.632", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "311"} 2023-01-29 16:16:56 | INFO | train_inner | {"epoch": 2, "update": 1.549, "s2c_loss": "6.705", "loss": "4.64749", "s2c_nll_loss": "6.705", "s2c_accuracy": "26.094", "s2c_total": "64", "s2c_n_correct": "16.7", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "3350", "lr": "2.23422e-05", "gnorm": "6.013", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "314"} 2023-01-29 16:16:58 | INFO | train_inner | {"epoch": 2, "update": 1.554, "s2c_loss": "6.924", "loss": "4.79911", "s2c_nll_loss": "6.924", "s2c_accuracy": "23.125", "s2c_total": "64", "s2c_n_correct": "14.8", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "3360", "lr": "2.24089e-05", "gnorm": "6.787", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "316"} 2023-01-29 16:17:01 | INFO | train_inner | {"epoch": 2, "update": 1.559, "s2c_loss": "6.832", "loss": "4.73541", "s2c_nll_loss": "6.832", "s2c_accuracy": "21.094", "s2c_total": "64", "s2c_n_correct": "13.5", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "3370", "lr": "2.24755e-05", "gnorm": "5.984", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "319"} 2023-01-29 16:17:03 | INFO | train_inner | {"epoch": 2, "update": 1.563, "s2c_loss": "6.798", "loss": "4.71215", "s2c_nll_loss": "6.798", "s2c_accuracy": "23.75", "s2c_total": "64", "s2c_n_correct": "15.2", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "3380", "lr": 
"2.25422e-05", "gnorm": "6.245", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "321"} 2023-01-29 16:17:06 | INFO | train_inner | {"epoch": 2, "update": 1.568, "s2c_loss": "6.719", "loss": "4.65719", "s2c_nll_loss": "6.719", "s2c_accuracy": "27.031", "s2c_total": "64", "s2c_n_correct": "17.3", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "3390", "lr": "2.26089e-05", "gnorm": "6.053", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "324"} 2023-01-29 16:17:08 | INFO | train_inner | {"epoch": 2, "update": 1.573, "s2c_loss": "6.624", "loss": "4.5913", "s2c_nll_loss": "6.624", "s2c_accuracy": "28.438", "s2c_total": "64", "s2c_n_correct": "18.2", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "3400", "lr": "2.26755e-05", "gnorm": "6.298", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "326"} 2023-01-29 16:17:11 | INFO | train_inner | {"epoch": 2, "update": 1.577, "s2c_loss": "6.706", "loss": "4.64792", "s2c_nll_loss": "6.706", "s2c_accuracy": "24.219", "s2c_total": "64", "s2c_n_correct": "15.5", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "3410", "lr": "2.27422e-05", "gnorm": "6.037", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "329"} 2023-01-29 16:17:13 | INFO | train_inner | {"epoch": 2, "update": 1.582, "s2c_loss": "6.499", "loss": "4.50478", "s2c_nll_loss": "6.499", "s2c_accuracy": "27.969", "s2c_total": "64", "s2c_n_correct": "17.9", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "3420", "lr": "2.28089e-05", "gnorm": "6.635", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "331"} 2023-01-29 16:17:16 | INFO | train_inner | {"epoch": 2, "update": 1.586, "s2c_loss": "6.731", "loss": "4.66528", "s2c_nll_loss": "6.731", "s2c_accuracy": "24.375", "s2c_total": "64", "s2c_n_correct": "15.6", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "3430", "lr": 
"2.28755e-05", "gnorm": "6.625", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "334"} 2023-01-29 16:17:19 | INFO | train_inner | {"epoch": 2, "update": 1.591, "s2c_loss": "6.662", "loss": "4.61803", "s2c_nll_loss": "6.662", "s2c_accuracy": "26.719", "s2c_total": "64", "s2c_n_correct": "17.1", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "3440", "lr": "2.29422e-05", "gnorm": "6.33", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "336"} 2023-01-29 16:17:21 | INFO | train_inner | {"epoch": 2, "update": 1.596, "s2c_loss": "6.884", "loss": "4.77142", "s2c_nll_loss": "6.884", "s2c_accuracy": "23.906", "s2c_total": "64", "s2c_n_correct": "15.3", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "3450", "lr": "2.30089e-05", "gnorm": "5.965", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "339"} 2023-01-29 16:17:24 | INFO | train_inner | {"epoch": 2, "update": 1.6, "s2c_loss": "6.619", "loss": "4.58771", "s2c_nll_loss": "6.619", "s2c_accuracy": "28.906", "s2c_total": "64", "s2c_n_correct": "18.5", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "3460", "lr": "2.30755e-05", "gnorm": "5.822", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "342"} 2023-01-29 16:17:26 | INFO | train_inner | {"epoch": 2, "update": 1.605, "s2c_loss": "6.781", "loss": "4.70041", "s2c_nll_loss": "6.781", "s2c_accuracy": "23.281", "s2c_total": "64", "s2c_n_correct": "14.9", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "3470", "lr": "2.31422e-05", "gnorm": "5.608", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "344"} 2023-01-29 16:17:29 | INFO | train_inner | {"epoch": 2, "update": 1.61, "s2c_loss": "6.483", "loss": "4.49376", "s2c_nll_loss": "6.483", "s2c_accuracy": "27.969", "s2c_total": "64", "s2c_n_correct": "17.9", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "3480", "lr": 
"2.32088e-05", "gnorm": "6.38", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "347"} 2023-01-29 16:17:31 | INFO | train_inner | {"epoch": 2, "update": 1.614, "s2c_loss": "6.404", "loss": "4.43909", "s2c_nll_loss": "6.404", "s2c_accuracy": "26.406", "s2c_total": "64", "s2c_n_correct": "16.9", "wps": "246.7", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "3490", "lr": "2.32755e-05", "gnorm": "5.97", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "349"} 2023-01-29 16:17:34 | INFO | train_inner | {"epoch": 2, "update": 1.619, "s2c_loss": "6.385", "loss": "4.42561", "s2c_nll_loss": "6.385", "s2c_accuracy": "27.031", "s2c_total": "64", "s2c_n_correct": "17.3", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "3500", "lr": "2.33422e-05", "gnorm": "6.171", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "352"} 2023-01-29 16:17:36 | INFO | train_inner | {"epoch": 2, "update": 1.623, "s2c_loss": "6.426", "loss": "4.45386", "s2c_nll_loss": "6.426", "s2c_accuracy": "28.594", "s2c_total": "64", "s2c_n_correct": "18.3", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "3510", "lr": "2.34088e-05", "gnorm": "5.825", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "354"} 2023-01-29 16:17:39 | INFO | train_inner | {"epoch": 2, "update": 1.628, "s2c_loss": "6.029", "loss": "4.1791", "s2c_nll_loss": "6.029", "s2c_accuracy": "32.5", "s2c_total": "64", "s2c_n_correct": "20.8", "wps": "245.3", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "3520", "lr": "2.34755e-05", "gnorm": "6.104", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "357"} 2023-01-29 16:17:42 | INFO | train_inner | {"epoch": 2, "update": 1.633, "s2c_loss": "6.398", "loss": "4.43502", "s2c_nll_loss": "6.398", "s2c_accuracy": "28.75", "s2c_total": "64", "s2c_n_correct": "18.4", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "3530", "lr": 
"2.35422e-05", "gnorm": "6.319", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "360"} 2023-01-29 16:17:44 | INFO | train_inner | {"epoch": 2, "update": 1.637, "s2c_loss": "6.251", "loss": "4.33292", "s2c_nll_loss": "6.251", "s2c_accuracy": "29.062", "s2c_total": "64", "s2c_n_correct": "18.6", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "3540", "lr": "2.36088e-05", "gnorm": "6.461", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "362"} 2023-01-29 16:17:47 | INFO | train_inner | {"epoch": 2, "update": 1.642, "s2c_loss": "6.243", "loss": "4.32759", "s2c_nll_loss": "6.243", "s2c_accuracy": "29.375", "s2c_total": "64", "s2c_n_correct": "18.8", "wps": "241.3", "ups": "3.77", "wpb": "64", "bsz": "64", "num_updates": "3550", "lr": "2.36755e-05", "gnorm": "7.057", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "365"} 2023-01-29 16:17:50 | INFO | train_inner | {"epoch": 2, "update": 1.647, "s2c_loss": "6.484", "loss": "4.49409", "s2c_nll_loss": "6.484", "s2c_accuracy": "27.031", "s2c_total": "64", "s2c_n_correct": "17.3", "wps": "240.8", "ups": "3.76", "wpb": "64", "bsz": "64", "num_updates": "3560", "lr": "2.37421e-05", "gnorm": "6.63", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "367"} 2023-01-29 16:17:52 | INFO | train_inner | {"epoch": 2, "update": 1.651, "s2c_loss": "6.37", "loss": "4.41512", "s2c_nll_loss": "6.37", "s2c_accuracy": "28.75", "s2c_total": "64", "s2c_n_correct": "18.4", "wps": "241.3", "ups": "3.77", "wpb": "64", "bsz": "64", "num_updates": "3570", "lr": "2.38088e-05", "gnorm": "6.287", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "370"} 2023-01-29 16:17:55 | INFO | train_inner | {"epoch": 2, "update": 1.656, "s2c_loss": "6.218", "loss": "4.30975", "s2c_nll_loss": "6.218", "s2c_accuracy": "30.469", "s2c_total": "64", "s2c_n_correct": "19.5", "wps": "248", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "3580", "lr": 
"2.38755e-05", "gnorm": "6.611", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "373"} 2023-01-29 16:17:57 | INFO | train_inner | {"epoch": 2, "update": 1.66, "s2c_loss": "6.38", "loss": "4.42231", "s2c_nll_loss": "6.38", "s2c_accuracy": "29.844", "s2c_total": "64", "s2c_n_correct": "19.1", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "3590", "lr": "2.39421e-05", "gnorm": "6.649", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "375"} 2023-01-29 16:18:00 | INFO | train_inner | {"epoch": 2, "update": 1.665, "s2c_loss": "6.397", "loss": "4.43391", "s2c_nll_loss": "6.397", "s2c_accuracy": "28.281", "s2c_total": "64", "s2c_n_correct": "18.1", "wps": "259", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "3600", "lr": "2.40088e-05", "gnorm": "6.217", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "378"} 2023-01-29 16:18:02 | INFO | train_inner | {"epoch": 2, "update": 1.67, "s2c_loss": "6.391", "loss": "4.43021", "s2c_nll_loss": "6.391", "s2c_accuracy": "30", "s2c_total": "64", "s2c_n_correct": "19.2", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "3610", "lr": "2.40755e-05", "gnorm": "6.294", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "380"} 2023-01-29 16:18:05 | INFO | train_inner | {"epoch": 2, "update": 1.674, "s2c_loss": "6.119", "loss": "4.24104", "s2c_nll_loss": "6.119", "s2c_accuracy": "32.5", "s2c_total": "64", "s2c_n_correct": "20.8", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "3620", "lr": "2.41421e-05", "gnorm": "6.8", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "383"} 2023-01-29 16:18:07 | INFO | train_inner | {"epoch": 2, "update": 1.679, "s2c_loss": "6.355", "loss": "4.40483", "s2c_nll_loss": "6.355", "s2c_accuracy": "26.875", "s2c_total": "64", "s2c_n_correct": "17.2", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "3630", "lr": "2.42088e-05", 
"gnorm": "6.124", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "385"} 2023-01-29 16:18:10 | INFO | train_inner | {"epoch": 2, "update": 1.684, "s2c_loss": "6.449", "loss": "4.47038", "s2c_nll_loss": "6.449", "s2c_accuracy": "25.469", "s2c_total": "64", "s2c_n_correct": "16.3", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "3640", "lr": "2.42755e-05", "gnorm": "7.01", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "388"} 2023-01-29 16:18:12 | INFO | train_inner | {"epoch": 2, "update": 1.688, "s2c_loss": "6.141", "loss": "4.25687", "s2c_nll_loss": "6.141", "s2c_accuracy": "29.844", "s2c_total": "64", "s2c_n_correct": "19.1", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "3650", "lr": "2.43421e-05", "gnorm": "6.474", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "390"} 2023-01-29 16:18:15 | INFO | train_inner | {"epoch": 2, "update": 1.693, "s2c_loss": "6.345", "loss": "4.39785", "s2c_nll_loss": "6.345", "s2c_accuracy": "27.188", "s2c_total": "64", "s2c_n_correct": "17.4", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "3660", "lr": "2.44088e-05", "gnorm": "6.033", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "393"} 2023-01-29 16:18:18 | INFO | train_inner | {"epoch": 2, "update": 1.698, "s2c_loss": "6.164", "loss": "4.27246", "s2c_nll_loss": "6.164", "s2c_accuracy": "29.844", "s2c_total": "64", "s2c_n_correct": "19.1", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "3670", "lr": "2.44754e-05", "gnorm": "6.481", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "395"} 2023-01-29 16:18:20 | INFO | train_inner | {"epoch": 2, "update": 1.702, "s2c_loss": "6.407", "loss": "4.44116", "s2c_nll_loss": "6.407", "s2c_accuracy": "28.281", "s2c_total": "64", "s2c_n_correct": "18.1", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "3680", "lr": "2.45421e-05", 
"gnorm": "6.51", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "398"} 2023-01-29 16:18:23 | INFO | train_inner | {"epoch": 2, "update": 1.707, "s2c_loss": "6.104", "loss": "4.2312", "s2c_nll_loss": "6.104", "s2c_accuracy": "32.031", "s2c_total": "64", "s2c_n_correct": "20.5", "wps": "244.9", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "3690", "lr": "2.46088e-05", "gnorm": "6.933", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "401"} 2023-01-29 16:18:25 | INFO | train_inner | {"epoch": 2, "update": 1.711, "s2c_loss": "6.289", "loss": "4.35918", "s2c_nll_loss": "6.289", "s2c_accuracy": "31.875", "s2c_total": "64", "s2c_n_correct": "20.4", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "3700", "lr": "2.46754e-05", "gnorm": "6.85", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "403"} 2023-01-29 16:18:28 | INFO | train_inner | {"epoch": 2, "update": 1.716, "s2c_loss": "6.107", "loss": "4.23303", "s2c_nll_loss": "6.107", "s2c_accuracy": "31.875", "s2c_total": "64", "s2c_n_correct": "20.4", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "3710", "lr": "2.47421e-05", "gnorm": "6.281", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "406"} 2023-01-29 16:18:30 | INFO | train_inner | {"epoch": 2, "update": 1.721, "s2c_loss": "6.251", "loss": "4.33261", "s2c_nll_loss": "6.251", "s2c_accuracy": "27.969", "s2c_total": "64", "s2c_n_correct": "17.9", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "3720", "lr": "2.48088e-05", "gnorm": "6.699", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "408"} 2023-01-29 16:18:33 | INFO | train_inner | {"epoch": 2, "update": 1.725, "s2c_loss": "6.119", "loss": "4.24121", "s2c_nll_loss": "6.119", "s2c_accuracy": "30.625", "s2c_total": "64", "s2c_n_correct": "19.6", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "3730", "lr": "2.48754e-05", 
"gnorm": "6.494", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "411"} 2023-01-29 16:18:35 | INFO | train_inner | {"epoch": 2, "update": 1.73, "s2c_loss": "6.226", "loss": "4.31542", "s2c_nll_loss": "6.226", "s2c_accuracy": "31.25", "s2c_total": "64", "s2c_n_correct": "20", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "3740", "lr": "2.49421e-05", "gnorm": "6.984", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "413"} 2023-01-29 16:18:38 | INFO | train_inner | {"epoch": 2, "update": 1.735, "s2c_loss": "5.879", "loss": "4.07472", "s2c_nll_loss": "5.879", "s2c_accuracy": "34.688", "s2c_total": "64", "s2c_n_correct": "22.2", "wps": "246.3", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "3750", "lr": "2.50088e-05", "gnorm": "6.208", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "416"} 2023-01-29 16:18:41 | INFO | train_inner | {"epoch": 2, "update": 1.739, "s2c_loss": "5.995", "loss": "4.15576", "s2c_nll_loss": "5.995", "s2c_accuracy": "32.344", "s2c_total": "64", "s2c_n_correct": "20.7", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "3760", "lr": "2.50754e-05", "gnorm": "5.993", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "418"} 2023-01-29 16:18:43 | INFO | train_inner | {"epoch": 2, "update": 1.744, "s2c_loss": "6.427", "loss": "4.4549", "s2c_nll_loss": "6.427", "s2c_accuracy": "28.75", "s2c_total": "64", "s2c_n_correct": "18.4", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "3770", "lr": "2.51421e-05", "gnorm": "6.896", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "421"} 2023-01-29 16:18:46 | INFO | train_inner | {"epoch": 2, "update": 1.748, "s2c_loss": "6.078", "loss": "4.21261", "s2c_nll_loss": "6.078", "s2c_accuracy": "30.938", "s2c_total": "64", "s2c_n_correct": "19.8", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "3780", "lr": "2.52087e-05", "gnorm": 
"6.664", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "424"} 2023-01-29 16:18:48 | INFO | train_inner | {"epoch": 2, "update": 1.753, "s2c_loss": "6.058", "loss": "4.1989", "s2c_nll_loss": "6.058", "s2c_accuracy": "29.688", "s2c_total": "64", "s2c_n_correct": "19", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "3790", "lr": "2.52754e-05", "gnorm": "6.692", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "426"} 2023-01-29 16:18:51 | INFO | train_inner | {"epoch": 2, "update": 1.758, "s2c_loss": "6.065", "loss": "4.20386", "s2c_nll_loss": "6.065", "s2c_accuracy": "30.312", "s2c_total": "64", "s2c_n_correct": "19.4", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "3800", "lr": "2.53421e-05", "gnorm": "6.46", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "429"} 2023-01-29 16:18:53 | INFO | train_inner | {"epoch": 2, "update": 1.762, "s2c_loss": "5.946", "loss": "4.12165", "s2c_nll_loss": "5.946", "s2c_accuracy": "31.875", "s2c_total": "64", "s2c_n_correct": "20.4", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "3810", "lr": "2.54087e-05", "gnorm": "6.626", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "431"} 2023-01-29 16:18:56 | INFO | train_inner | {"epoch": 2, "update": 1.767, "s2c_loss": "5.976", "loss": "4.14205", "s2c_nll_loss": "5.976", "s2c_accuracy": "32.031", "s2c_total": "64", "s2c_n_correct": "20.5", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "3820", "lr": "2.54754e-05", "gnorm": "6.477", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "434"} 2023-01-29 16:18:58 | INFO | train_inner | {"epoch": 2, "update": 1.772, "s2c_loss": "5.939", "loss": "4.11659", "s2c_nll_loss": "5.939", "s2c_accuracy": "32.812", "s2c_total": "64", "s2c_n_correct": "21", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "3830", "lr": "2.55421e-05", "gnorm": 
"6.547", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "436"} 2023-01-29 16:19:01 | INFO | train_inner | {"epoch": 2, "update": 1.776, "s2c_loss": "6.026", "loss": "4.17686", "s2c_nll_loss": "6.026", "s2c_accuracy": "29.219", "s2c_total": "64", "s2c_n_correct": "18.7", "wps": "247", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "3840", "lr": "2.56087e-05", "gnorm": "6.6", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "439"} 2023-01-29 16:19:03 | INFO | train_inner | {"epoch": 2, "update": 1.781, "s2c_loss": "6.024", "loss": "4.17568", "s2c_nll_loss": "6.024", "s2c_accuracy": "30.312", "s2c_total": "64", "s2c_n_correct": "19.4", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "3850", "lr": "2.56754e-05", "gnorm": "7.449", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "441"} 2023-01-29 16:19:06 | INFO | train_inner | {"epoch": 2, "update": 1.785, "s2c_loss": "5.88", "loss": "4.07563", "s2c_nll_loss": "5.88", "s2c_accuracy": "32.969", "s2c_total": "64", "s2c_n_correct": "21.1", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "3860", "lr": "2.5742e-05", "gnorm": "7.563", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "444"} 2023-01-29 16:19:08 | INFO | train_inner | {"epoch": 2, "update": 1.79, "s2c_loss": "5.902", "loss": "4.09099", "s2c_nll_loss": "5.902", "s2c_accuracy": "34.062", "s2c_total": "64", "s2c_n_correct": "21.8", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "3870", "lr": "2.58087e-05", "gnorm": "6.484", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "446"} 2023-01-29 16:19:11 | INFO | train_inner | {"epoch": 2, "update": 1.795, "s2c_loss": "5.827", "loss": "4.03928", "s2c_nll_loss": "5.827", "s2c_accuracy": "32.812", "s2c_total": "64", "s2c_n_correct": "21", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "3880", "lr": "2.58754e-05", "gnorm": "7.188", 
"loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "449"} 2023-01-29 16:19:13 | INFO | train_inner | {"epoch": 2, "update": 1.799, "s2c_loss": "5.782", "loss": "4.00794", "s2c_nll_loss": "5.782", "s2c_accuracy": "34.062", "s2c_total": "64", "s2c_n_correct": "21.8", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "3890", "lr": "2.5942e-05", "gnorm": "6.864", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "451"} 2023-01-29 16:19:16 | INFO | train_inner | {"epoch": 2, "update": 1.804, "s2c_loss": "5.85", "loss": "4.05459", "s2c_nll_loss": "5.85", "s2c_accuracy": "33.594", "s2c_total": "64", "s2c_n_correct": "21.5", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "3900", "lr": "2.60087e-05", "gnorm": "7.391", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "454"} 2023-01-29 16:19:19 | INFO | train_inner | {"epoch": 2, "update": 1.809, "s2c_loss": "5.851", "loss": "4.0554", "s2c_nll_loss": "5.851", "s2c_accuracy": "34.844", "s2c_total": "64", "s2c_n_correct": "22.3", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "3910", "lr": "2.60754e-05", "gnorm": "6.777", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "456"} 2023-01-29 16:19:21 | INFO | train_inner | {"epoch": 2, "update": 1.813, "s2c_loss": "5.702", "loss": "3.95215", "s2c_nll_loss": "5.702", "s2c_accuracy": "33.125", "s2c_total": "64", "s2c_n_correct": "21.2", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "3920", "lr": "2.6142e-05", "gnorm": "6.453", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "459"} 2023-01-29 16:19:24 | INFO | train_inner | {"epoch": 2, "update": 1.818, "s2c_loss": "5.867", "loss": "4.06649", "s2c_nll_loss": "5.867", "s2c_accuracy": "33.438", "s2c_total": "64", "s2c_n_correct": "21.4", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "3930", "lr": "2.62087e-05", "gnorm": "7.176", 
"loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "461"} 2023-01-29 16:19:26 | INFO | train_inner | {"epoch": 2, "update": 1.822, "s2c_loss": "5.854", "loss": "4.0579", "s2c_nll_loss": "5.854", "s2c_accuracy": "32.031", "s2c_total": "64", "s2c_n_correct": "20.5", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "3940", "lr": "2.62754e-05", "gnorm": "7.414", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "464"} 2023-01-29 16:19:29 | INFO | train_inner | {"epoch": 2, "update": 1.827, "s2c_loss": "5.806", "loss": "4.02425", "s2c_nll_loss": "5.806", "s2c_accuracy": "35", "s2c_total": "64", "s2c_n_correct": "22.4", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "3950", "lr": "2.6342e-05", "gnorm": "6.092", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "467"} 2023-01-29 16:19:31 | INFO | train_inner | {"epoch": 2, "update": 1.832, "s2c_loss": "5.973", "loss": "4.14004", "s2c_nll_loss": "5.973", "s2c_accuracy": "30.625", "s2c_total": "64", "s2c_n_correct": "19.6", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "3960", "lr": "2.64087e-05", "gnorm": "6.651", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "469"} 2023-01-29 16:19:34 | INFO | train_inner | {"epoch": 2, "update": 1.836, "s2c_loss": "5.627", "loss": "3.90064", "s2c_nll_loss": "5.627", "s2c_accuracy": "37.344", "s2c_total": "64", "s2c_n_correct": "23.9", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "3970", "lr": "2.64753e-05", "gnorm": "6.543", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "472"} 2023-01-29 16:19:36 | INFO | train_inner | {"epoch": 2, "update": 1.841, "s2c_loss": "5.587", "loss": "3.8728", "s2c_nll_loss": "5.587", "s2c_accuracy": "36.719", "s2c_total": "64", "s2c_n_correct": "23.5", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "3980", "lr": "2.6542e-05", "gnorm": "7.272", 
"loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "474"} 2023-01-29 16:19:39 | INFO | train_inner | {"epoch": 2, "update": 1.846, "s2c_loss": "5.606", "loss": "3.88603", "s2c_nll_loss": "5.606", "s2c_accuracy": "35.469", "s2c_total": "64", "s2c_n_correct": "22.7", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "3990", "lr": "2.66087e-05", "gnorm": "7.231", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "477"} 2023-01-29 16:19:41 | INFO | train_inner | {"epoch": 2, "update": 1.85, "s2c_loss": "5.698", "loss": "3.94958", "s2c_nll_loss": "5.698", "s2c_accuracy": "36.094", "s2c_total": "64", "s2c_n_correct": "23.1", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "4000", "lr": "2.66753e-05", "gnorm": "6.762", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "479"} 2023-01-29 16:19:44 | INFO | train_inner | {"epoch": 2, "update": 1.855, "s2c_loss": "5.629", "loss": "3.90199", "s2c_nll_loss": "5.629", "s2c_accuracy": "36.719", "s2c_total": "64", "s2c_n_correct": "23.5", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "4010", "lr": "2.6742e-05", "gnorm": "7.056", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "482"} 2023-01-29 16:19:46 | INFO | train_inner | {"epoch": 2, "update": 1.859, "s2c_loss": "5.728", "loss": "3.97007", "s2c_nll_loss": "5.728", "s2c_accuracy": "35.781", "s2c_total": "64", "s2c_n_correct": "22.9", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "4020", "lr": "2.68087e-05", "gnorm": "6.724", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "484"} 2023-01-29 16:19:49 | INFO | train_inner | {"epoch": 2, "update": 1.864, "s2c_loss": "5.702", "loss": "3.95244", "s2c_nll_loss": "5.702", "s2c_accuracy": "35.469", "s2c_total": "64", "s2c_n_correct": "22.7", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "4030", "lr": "2.68753e-05", "gnorm": "6.493", 
"loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "487"} 2023-01-29 16:19:51 | INFO | train_inner | {"epoch": 2, "update": 1.869, "s2c_loss": "5.721", "loss": "3.96537", "s2c_nll_loss": "5.721", "s2c_accuracy": "34.844", "s2c_total": "64", "s2c_n_correct": "22.3", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "4040", "lr": "2.6942e-05", "gnorm": "7.399", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "489"} 2023-01-29 16:19:54 | INFO | train_inner | {"epoch": 2, "update": 1.873, "s2c_loss": "5.385", "loss": "3.7329", "s2c_nll_loss": "5.385", "s2c_accuracy": "38.125", "s2c_total": "64", "s2c_n_correct": "24.4", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "4050", "lr": "2.70087e-05", "gnorm": "7.129", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "492"} 2023-01-29 16:19:56 | INFO | train_inner | {"epoch": 2, "update": 1.878, "s2c_loss": "5.753", "loss": "3.98786", "s2c_nll_loss": "5.753", "s2c_accuracy": "35.312", "s2c_total": "64", "s2c_n_correct": "22.6", "wps": "257.6", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "4060", "lr": "2.70753e-05", "gnorm": "6.17", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "494"} 2023-01-29 16:19:59 | INFO | train_inner | {"epoch": 2, "update": 1.883, "s2c_loss": "5.333", "loss": "3.69633", "s2c_nll_loss": "5.333", "s2c_accuracy": "37.969", "s2c_total": "64", "s2c_n_correct": "24.3", "wps": "246.4", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "4070", "lr": "2.7142e-05", "gnorm": "7.838", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "497"} 2023-01-29 16:20:02 | INFO | train_inner | {"epoch": 2, "update": 1.887, "s2c_loss": "5.557", "loss": "3.85184", "s2c_nll_loss": "5.557", "s2c_accuracy": "35.625", "s2c_total": "64", "s2c_n_correct": "22.8", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "4080", "lr": "2.72086e-05", "gnorm": "7.485", 
"loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "500"} 2023-01-29 16:20:04 | INFO | train_inner | {"epoch": 2, "update": 1.892, "s2c_loss": "5.581", "loss": "3.86876", "s2c_nll_loss": "5.581", "s2c_accuracy": "36.406", "s2c_total": "64", "s2c_n_correct": "23.3", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "4090", "lr": "2.72753e-05", "gnorm": "6.577", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "502"} 2023-01-29 16:20:07 | INFO | train_inner | {"epoch": 2, "update": 1.896, "s2c_loss": "5.416", "loss": "3.75409", "s2c_nll_loss": "5.416", "s2c_accuracy": "36.875", "s2c_total": "64", "s2c_n_correct": "23.6", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "4100", "lr": "2.7342e-05", "gnorm": "7.397", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "505"} 2023-01-29 16:20:09 | INFO | train_inner | {"epoch": 2, "update": 1.901, "s2c_loss": "5.313", "loss": "3.68261", "s2c_nll_loss": "5.313", "s2c_accuracy": "39.062", "s2c_total": "64", "s2c_n_correct": "25", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "4110", "lr": "2.74086e-05", "gnorm": "7.834", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "507"} 2023-01-29 16:20:12 | INFO | train_inner | {"epoch": 2, "update": 1.906, "s2c_loss": "5.7", "loss": "3.95064", "s2c_nll_loss": "5.7", "s2c_accuracy": "36.25", "s2c_total": "64", "s2c_n_correct": "23.2", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "4120", "lr": "2.74753e-05", "gnorm": "6.874", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "510"} 2023-01-29 16:20:14 | INFO | train_inner | {"epoch": 2, "update": 1.91, "s2c_loss": "5.444", "loss": "3.77841", "s2c_nll_loss": "5.444", "s2c_accuracy": "37.834", "s2c_total": "63.7", "s2c_n_correct": "24.1", "wps": "249.7", "ups": "3.92", "wpb": "63.7", "bsz": "63.7", "num_updates": "4130", "lr": "2.7542e-05", "gnorm": "7.526", 
"loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "512"} 2023-01-29 16:20:17 | INFO | train_inner | {"epoch": 2, "update": 1.915, "s2c_loss": "5.422", "loss": "3.75858", "s2c_nll_loss": "5.422", "s2c_accuracy": "37.031", "s2c_total": "64", "s2c_n_correct": "23.7", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "4140", "lr": "2.76086e-05", "gnorm": "7.201", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "515"} 2023-01-29 16:20:19 | INFO | train_inner | {"epoch": 2, "update": 1.92, "s2c_loss": "5.643", "loss": "3.91138", "s2c_nll_loss": "5.643", "s2c_accuracy": "37.344", "s2c_total": "64", "s2c_n_correct": "23.9", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "4150", "lr": "2.76753e-05", "gnorm": "7.425", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "517"} 2023-01-29 16:20:22 | INFO | train_inner | {"epoch": 2, "update": 1.924, "s2c_loss": "5.432", "loss": "3.76491", "s2c_nll_loss": "5.432", "s2c_accuracy": "36.562", "s2c_total": "64", "s2c_n_correct": "23.4", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "4160", "lr": "2.77419e-05", "gnorm": "6.997", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "520"} 2023-01-29 16:20:25 | INFO | train_inner | {"epoch": 2, "update": 1.929, "s2c_loss": "5.453", "loss": "3.77958", "s2c_nll_loss": "5.453", "s2c_accuracy": "36.719", "s2c_total": "64", "s2c_n_correct": "23.5", "wps": "242.5", "ups": "3.79", "wpb": "64", "bsz": "64", "num_updates": "4170", "lr": "2.78086e-05", "gnorm": "7.046", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "522"} 2023-01-29 16:20:27 | INFO | train_inner | {"epoch": 2, "update": 1.933, "s2c_loss": "5.134", "loss": "3.55892", "s2c_nll_loss": "5.134", "s2c_accuracy": "41.094", "s2c_total": "64", "s2c_n_correct": "26.3", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "4180", "lr": "2.78753e-05", "gnorm": "6.81", 
"loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "525"} 2023-01-29 16:20:30 | INFO | train_inner | {"epoch": 2, "update": 1.938, "s2c_loss": "5.376", "loss": "3.72605", "s2c_nll_loss": "5.376", "s2c_accuracy": "38.75", "s2c_total": "64", "s2c_n_correct": "24.8", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "4190", "lr": "2.79419e-05", "gnorm": "6.478", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "528"} 2023-01-29 16:20:32 | INFO | train_inner | {"epoch": 2, "update": 1.943, "s2c_loss": "5.442", "loss": "3.7724", "s2c_nll_loss": "5.442", "s2c_accuracy": "39.688", "s2c_total": "64", "s2c_n_correct": "25.4", "wps": "246.1", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "4200", "lr": "2.80086e-05", "gnorm": "6.574", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "530"} 2023-01-29 16:20:35 | INFO | train_inner | {"epoch": 2, "update": 1.947, "s2c_loss": "5.238", "loss": "3.63063", "s2c_nll_loss": "5.238", "s2c_accuracy": "39.219", "s2c_total": "64", "s2c_n_correct": "25.1", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "4210", "lr": "2.80753e-05", "gnorm": "7.404", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "533"} 2023-01-29 16:20:37 | INFO | train_inner | {"epoch": 2, "update": 1.952, "s2c_loss": "5.193", "loss": "3.59957", "s2c_nll_loss": "5.193", "s2c_accuracy": "41.094", "s2c_total": "64", "s2c_n_correct": "26.3", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "4220", "lr": "2.81419e-05", "gnorm": "7.887", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "535"} 2023-01-29 16:20:40 | INFO | train_inner | {"epoch": 2, "update": 1.957, "s2c_loss": "5.232", "loss": "3.6264", "s2c_nll_loss": "5.232", "s2c_accuracy": "39.844", "s2c_total": "64", "s2c_n_correct": "25.5", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "4230", "lr": "2.82086e-05", "gnorm": "7.717", 
"loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "538"} 2023-01-29 16:20:42 | INFO | train_inner | {"epoch": 2, "update": 1.961, "s2c_loss": "5.132", "loss": "3.55752", "s2c_nll_loss": "5.132", "s2c_accuracy": "41.25", "s2c_total": "64", "s2c_n_correct": "26.4", "wps": "244.7", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "4240", "lr": "2.82753e-05", "gnorm": "7.615", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "540"} 2023-01-29 16:20:45 | INFO | train_inner | {"epoch": 2, "update": 1.966, "s2c_loss": "5.435", "loss": "3.76699", "s2c_nll_loss": "5.435", "s2c_accuracy": "39.844", "s2c_total": "64", "s2c_n_correct": "25.5", "wps": "246.9", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "4250", "lr": "2.83419e-05", "gnorm": "6.947", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "543"} 2023-01-29 16:20:48 | INFO | train_inner | {"epoch": 2, "update": 1.97, "s2c_loss": "5.294", "loss": "3.66935", "s2c_nll_loss": "5.294", "s2c_accuracy": "36.562", "s2c_total": "64", "s2c_n_correct": "23.4", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "4260", "lr": "2.84086e-05", "gnorm": "7.601", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "546"} 2023-01-29 16:20:50 | INFO | train_inner | {"epoch": 2, "update": 1.975, "s2c_loss": "5.367", "loss": "3.71983", "s2c_nll_loss": "5.367", "s2c_accuracy": "40.312", "s2c_total": "64", "s2c_n_correct": "25.8", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "4270", "lr": "2.84752e-05", "gnorm": "7.191", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "548"} 2023-01-29 16:20:53 | INFO | train_inner | {"epoch": 2, "update": 1.98, "s2c_loss": "5.425", "loss": "3.76028", "s2c_nll_loss": "5.425", "s2c_accuracy": "38.75", "s2c_total": "64", "s2c_n_correct": "24.8", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "4280", "lr": "2.85419e-05", "gnorm": "7.029", 
"loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "551"} 2023-01-29 16:20:55 | INFO | train_inner | {"epoch": 2, "update": 1.984, "s2c_loss": "5.287", "loss": "3.66501", "s2c_nll_loss": "5.287", "s2c_accuracy": "39.219", "s2c_total": "64", "s2c_n_correct": "25.1", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "4290", "lr": "2.86086e-05", "gnorm": "7.489", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "553"} 2023-01-29 16:20:58 | INFO | train_inner | {"epoch": 2, "update": 1.989, "s2c_loss": "5.083", "loss": "3.52344", "s2c_nll_loss": "5.083", "s2c_accuracy": "40.312", "s2c_total": "64", "s2c_n_correct": "25.8", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "4300", "lr": "2.86752e-05", "gnorm": "6.933", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "556"} 2023-01-29 16:21:00 | INFO | train_inner | {"epoch": 2, "update": 1.994, "s2c_loss": "4.906", "loss": "3.4009", "s2c_nll_loss": "4.906", "s2c_accuracy": "42.656", "s2c_total": "64", "s2c_n_correct": "27.3", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "4310", "lr": "2.87419e-05", "gnorm": "7.883", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "558"} 2023-01-29 16:21:03 | INFO | train_inner | {"epoch": 2, "update": 1.998, "s2c_loss": "5.348", "loss": "3.7072", "s2c_nll_loss": "5.348", "s2c_accuracy": "39.688", "s2c_total": "64", "s2c_n_correct": "25.4", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "4320", "lr": "2.88086e-05", "gnorm": "8.355", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "561"} 2023-01-29 16:21:04 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 2 @ 4324 updates 2023-01-29 16:21:04 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 16:21:11 | INFO | fairseq.trainer | Finished saving checkpoint to 
/home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 16:21:11 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 2 @ 4324 updates, score None) (writing took 6.920269948896021 seconds) 2023-01-29 16:21:11 | INFO | fairseq_cli.train | end of epoch 2 (average epoch stats below) 2023-01-29 16:21:11 | INFO | train | {"epoch": 2, "train_s2c_loss": "6.906", "train_loss": "4.78672", "train_s2c_nll_loss": "6.906", "train_s2c_accuracy": "23.655", "train_s2c_total": "63.9838", "train_s2c_n_correct": "15.1351", "train_wps": "248.2", "train_ups": "3.88", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "4324", "train_lr": "2.88352e-05", "train_gnorm": "5.868", "train_loss_scale": "512", "train_train_wall": "544", "train_gb_free": "7.5", "train_wall": "569"} 2023-01-29 16:21:17 | INFO | fairseq.trainer | begin training epoch 3 2023-01-29 16:21:17 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 16:21:19 | INFO | train_inner | {"epoch": 3, "update": 2.003, "s2c_loss": "5.365", "loss": "3.71858", "s2c_nll_loss": "5.365", "s2c_accuracy": "36.842", "s2c_total": "60.8", "s2c_n_correct": "22.4", "wps": "38.8", "ups": "0.64", "wpb": "60.8", "bsz": "60.8", "num_updates": "4330", "lr": "2.88752e-05", "gnorm": "8.817", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "576"} 2023-01-29 16:21:21 | INFO | train_inner | {"epoch": 3, "update": 2.007, "s2c_loss": "4.868", "loss": "3.37731", "s2c_nll_loss": "4.868", "s2c_accuracy": "45.997", "s2c_total": "63.7", "s2c_n_correct": "29.3", "wps": "251.6", "ups": "3.95", "wpb": "63.7", "bsz": "63.7", "num_updates": "4340", "lr": "2.89419e-05", "gnorm": "8.938", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "579"} 2023-01-29 16:21:24 | INFO | train_inner | {"epoch": 3, "update": 2.012, "s2c_loss": "5.026", "loss": "3.48351", "s2c_nll_loss": "5.026", "s2c_accuracy": "43.906", 
"s2c_total": "64", "s2c_n_correct": "28.1", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "4350", "lr": "2.90086e-05", "gnorm": "7.21", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "582"} 2023-01-29 16:21:26 | INFO | train_inner | {"epoch": 3, "update": 2.017, "s2c_loss": "5.298", "loss": "3.67196", "s2c_nll_loss": "5.298", "s2c_accuracy": "38.281", "s2c_total": "64", "s2c_n_correct": "24.5", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "4360", "lr": "2.90752e-05", "gnorm": "8.156", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "584"} 2023-01-29 16:21:29 | INFO | train_inner | {"epoch": 3, "update": 2.021, "s2c_loss": "5.055", "loss": "3.50413", "s2c_nll_loss": "5.055", "s2c_accuracy": "41.719", "s2c_total": "64", "s2c_n_correct": "26.7", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "4370", "lr": "2.91419e-05", "gnorm": "7.634", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "587"} 2023-01-29 16:21:31 | INFO | train_inner | {"epoch": 3, "update": 2.026, "s2c_loss": "4.749", "loss": "3.29149", "s2c_nll_loss": "4.749", "s2c_accuracy": "46.562", "s2c_total": "64", "s2c_n_correct": "29.8", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "4380", "lr": "2.92085e-05", "gnorm": "7.569", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "589"} 2023-01-29 16:21:34 | INFO | train_inner | {"epoch": 3, "update": 2.031, "s2c_loss": "5.143", "loss": "3.56487", "s2c_nll_loss": "5.143", "s2c_accuracy": "39.688", "s2c_total": "64", "s2c_n_correct": "25.4", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "4390", "lr": "2.92752e-05", "gnorm": "7.272", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "592"} 2023-01-29 16:21:36 | INFO | train_inner | {"epoch": 3, "update": 2.035, "s2c_loss": "4.855", "loss": "3.36534", "s2c_nll_loss": "4.855", "s2c_accuracy": "45.312", 
"s2c_total": "64", "s2c_n_correct": "29", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "4400", "lr": "2.93419e-05", "gnorm": "7.654", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "594"} 2023-01-29 16:21:39 | INFO | train_inner | {"epoch": 3, "update": 2.04, "s2c_loss": "5.001", "loss": "3.46659", "s2c_nll_loss": "5.001", "s2c_accuracy": "44.219", "s2c_total": "64", "s2c_n_correct": "28.3", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "4410", "lr": "2.94085e-05", "gnorm": "7.359", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "597"} 2023-01-29 16:21:42 | INFO | train_inner | {"epoch": 3, "update": 2.044, "s2c_loss": "4.903", "loss": "3.39872", "s2c_nll_loss": "4.903", "s2c_accuracy": "43.906", "s2c_total": "64", "s2c_n_correct": "28.1", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "4420", "lr": "2.94752e-05", "gnorm": "6.777", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "599"} 2023-01-29 16:21:44 | INFO | train_inner | {"epoch": 3, "update": 2.049, "s2c_loss": "4.905", "loss": "3.40021", "s2c_nll_loss": "4.905", "s2c_accuracy": "44.375", "s2c_total": "64", "s2c_n_correct": "28.4", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "4430", "lr": "2.95419e-05", "gnorm": "6.984", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "602"} 2023-01-29 16:21:47 | INFO | train_inner | {"epoch": 3, "update": 2.054, "s2c_loss": "4.867", "loss": "3.37341", "s2c_nll_loss": "4.867", "s2c_accuracy": "44.531", "s2c_total": "64", "s2c_n_correct": "28.5", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "4440", "lr": "2.96085e-05", "gnorm": "7.531", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "605"} 2023-01-29 16:21:49 | INFO | train_inner | {"epoch": 3, "update": 2.058, "s2c_loss": "4.777", "loss": "3.31133", "s2c_nll_loss": "4.777", "s2c_accuracy": "45.156", 
"s2c_total": "64", "s2c_n_correct": "28.9", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "4450", "lr": "2.96752e-05", "gnorm": "7.466", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "607"} 2023-01-29 16:21:52 | INFO | train_inner | {"epoch": 3, "update": 2.063, "s2c_loss": "4.842", "loss": "3.3559", "s2c_nll_loss": "4.842", "s2c_accuracy": "45.156", "s2c_total": "64", "s2c_n_correct": "28.9", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "4460", "lr": "2.97418e-05", "gnorm": "7.029", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "610"} 2023-01-29 16:21:54 | INFO | train_inner | {"epoch": 3, "update": 2.068, "s2c_loss": "4.789", "loss": "3.31971", "s2c_nll_loss": "4.789", "s2c_accuracy": "47.656", "s2c_total": "64", "s2c_n_correct": "30.5", "wps": "258.7", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "4470", "lr": "2.98085e-05", "gnorm": "7.695", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "612"} 2023-01-29 16:21:57 | INFO | train_inner | {"epoch": 3, "update": 2.072, "s2c_loss": "4.768", "loss": "3.30512", "s2c_nll_loss": "4.768", "s2c_accuracy": "45.625", "s2c_total": "64", "s2c_n_correct": "29.2", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "4480", "lr": "2.98752e-05", "gnorm": "7.673", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "615"} 2023-01-29 16:21:59 | INFO | train_inner | {"epoch": 3, "update": 2.077, "s2c_loss": "4.865", "loss": "3.37228", "s2c_nll_loss": "4.865", "s2c_accuracy": "44.062", "s2c_total": "64", "s2c_n_correct": "28.2", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "4490", "lr": "2.99418e-05", "gnorm": "7.348", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "617"} 2023-01-29 16:22:02 | INFO | train_inner | {"epoch": 3, "update": 2.081, "s2c_loss": "4.883", "loss": "3.38459", "s2c_nll_loss": "4.883", "s2c_accuracy": "44.219", 
"s2c_total": "64", "s2c_n_correct": "28.3", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "4500", "lr": "3.00085e-05", "gnorm": "7.15", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "620"} 2023-01-29 16:22:04 | INFO | train_inner | {"epoch": 3, "update": 2.086, "s2c_loss": "4.995", "loss": "3.46251", "s2c_nll_loss": "4.995", "s2c_accuracy": "41.406", "s2c_total": "64", "s2c_n_correct": "26.5", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "4510", "lr": "3.00752e-05", "gnorm": "7.95", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "622"} 2023-01-29 16:22:07 | INFO | train_inner | {"epoch": 3, "update": 2.091, "s2c_loss": "5.182", "loss": "3.59215", "s2c_nll_loss": "5.182", "s2c_accuracy": "40.625", "s2c_total": "64", "s2c_n_correct": "26", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "4520", "lr": "3.01418e-05", "gnorm": "7.327", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "625"} 2023-01-29 16:22:09 | INFO | train_inner | {"epoch": 3, "update": 2.095, "s2c_loss": "4.891", "loss": "3.38988", "s2c_nll_loss": "4.891", "s2c_accuracy": "42.5", "s2c_total": "64", "s2c_n_correct": "27.2", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "4530", "lr": "3.02085e-05", "gnorm": "7.603", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "627"} 2023-01-29 16:22:12 | INFO | train_inner | {"epoch": 3, "update": 2.1, "s2c_loss": "5.042", "loss": "3.49477", "s2c_nll_loss": "5.042", "s2c_accuracy": "41.562", "s2c_total": "64", "s2c_n_correct": "26.6", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "4540", "lr": "3.02752e-05", "gnorm": "7.631", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "630"} 2023-01-29 16:22:14 | INFO | train_inner | {"epoch": 3, "update": 2.105, "s2c_loss": "4.788", "loss": "3.31867", "s2c_nll_loss": "4.788", "s2c_accuracy": "42.188", 
"s2c_total": "64", "s2c_n_correct": "27", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "4550", "lr": "3.03418e-05", "gnorm": "7.478", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "632"} 2023-01-29 16:22:17 | INFO | train_inner | {"epoch": 3, "update": 2.109, "s2c_loss": "4.883", "loss": "3.38466", "s2c_nll_loss": "4.883", "s2c_accuracy": "44.844", "s2c_total": "64", "s2c_n_correct": "28.7", "wps": "243.8", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "4560", "lr": "3.04085e-05", "gnorm": "7.542", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "635"} 2023-01-29 16:22:20 | INFO | train_inner | {"epoch": 3, "update": 2.114, "s2c_loss": "4.964", "loss": "3.44053", "s2c_nll_loss": "4.964", "s2c_accuracy": "43.125", "s2c_total": "64", "s2c_n_correct": "27.6", "wps": "258.2", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "4570", "lr": "3.04751e-05", "gnorm": "6.84", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "638"} 2023-01-29 16:22:22 | INFO | train_inner | {"epoch": 3, "update": 2.118, "s2c_loss": "4.505", "loss": "3.12268", "s2c_nll_loss": "4.505", "s2c_accuracy": "49.062", "s2c_total": "64", "s2c_n_correct": "31.4", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "4580", "lr": "3.05418e-05", "gnorm": "8.19", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "640"} 2023-01-29 16:22:25 | INFO | train_inner | {"epoch": 3, "update": 2.123, "s2c_loss": "4.867", "loss": "3.37344", "s2c_nll_loss": "4.867", "s2c_accuracy": "43.906", "s2c_total": "64", "s2c_n_correct": "28.1", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "4590", "lr": "3.06085e-05", "gnorm": "8.331", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "643"} 2023-01-29 16:22:27 | INFO | train_inner | {"epoch": 3, "update": 2.128, "s2c_loss": "4.514", "loss": "3.12891", "s2c_nll_loss": "4.514", "s2c_accuracy": "46.25", 
"s2c_total": "64", "s2c_n_correct": "29.6", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "4600", "lr": "3.06751e-05", "gnorm": "8.584", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "645"} 2023-01-29 16:22:30 | INFO | train_inner | {"epoch": 3, "update": 2.132, "s2c_loss": "4.698", "loss": "3.25626", "s2c_nll_loss": "4.698", "s2c_accuracy": "45.625", "s2c_total": "64", "s2c_n_correct": "29.2", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "4610", "lr": "3.07418e-05", "gnorm": "7.863", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "648"} 2023-01-29 16:22:32 | INFO | train_inner | {"epoch": 3, "update": 2.137, "s2c_loss": "4.919", "loss": "3.4097", "s2c_nll_loss": "4.919", "s2c_accuracy": "43.906", "s2c_total": "64", "s2c_n_correct": "28.1", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "4620", "lr": "3.08085e-05", "gnorm": "7.94", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "650"} 2023-01-29 16:22:35 | INFO | train_inner | {"epoch": 3, "update": 2.142, "s2c_loss": "4.592", "loss": "3.18299", "s2c_nll_loss": "4.592", "s2c_accuracy": "46.406", "s2c_total": "64", "s2c_n_correct": "29.7", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "4630", "lr": "3.08751e-05", "gnorm": "7.751", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "653"} 2023-01-29 16:22:37 | INFO | train_inner | {"epoch": 3, "update": 2.146, "s2c_loss": "4.535", "loss": "3.14341", "s2c_nll_loss": "4.535", "s2c_accuracy": "45.781", "s2c_total": "64", "s2c_n_correct": "29.3", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "4640", "lr": "3.09418e-05", "gnorm": "7.683", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "655"} 2023-01-29 16:22:40 | INFO | train_inner | {"epoch": 3, "update": 2.151, "s2c_loss": "4.798", "loss": "3.32562", "s2c_nll_loss": "4.798", "s2c_accuracy": "45.781", 
"s2c_total": "64", "s2c_n_correct": "29.3", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "4650", "lr": "3.10085e-05", "gnorm": "7.428", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "658"} 2023-01-29 16:22:42 | INFO | train_inner | {"epoch": 3, "update": 2.155, "s2c_loss": "4.47", "loss": "3.09848", "s2c_nll_loss": "4.47", "s2c_accuracy": "46.25", "s2c_total": "64", "s2c_n_correct": "29.6", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "4660", "lr": "3.10751e-05", "gnorm": "9.9", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "660"} 2023-01-29 16:22:45 | INFO | train_inner | {"epoch": 3, "update": 2.16, "s2c_loss": "4.697", "loss": "3.25583", "s2c_nll_loss": "4.697", "s2c_accuracy": "46.406", "s2c_total": "64", "s2c_n_correct": "29.7", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "4670", "lr": "3.11418e-05", "gnorm": "8.127", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "663"} 2023-01-29 16:22:47 | INFO | train_inner | {"epoch": 3, "update": 2.165, "s2c_loss": "4.458", "loss": "3.09023", "s2c_nll_loss": "4.458", "s2c_accuracy": "48.906", "s2c_total": "64", "s2c_n_correct": "31.3", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "4680", "lr": "3.12084e-05", "gnorm": "8.675", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "665"} 2023-01-29 16:22:50 | INFO | train_inner | {"epoch": 3, "update": 2.169, "s2c_loss": "4.654", "loss": "3.22613", "s2c_nll_loss": "4.654", "s2c_accuracy": "43.281", "s2c_total": "64", "s2c_n_correct": "27.7", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "4690", "lr": "3.12751e-05", "gnorm": "7.659", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "668"} 2023-01-29 16:22:53 | INFO | train_inner | {"epoch": 3, "update": 2.174, "s2c_loss": "4.615", "loss": "3.19875", "s2c_nll_loss": "4.615", "s2c_accuracy": "47.656", 
"s2c_total": "64", "s2c_n_correct": "30.5", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "4700", "lr": "3.13418e-05", "gnorm": "8.298", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "670"} 2023-01-29 16:22:55 | INFO | train_inner | {"epoch": 3, "update": 2.179, "s2c_loss": "4.575", "loss": "3.17113", "s2c_nll_loss": "4.575", "s2c_accuracy": "46.25", "s2c_total": "64", "s2c_n_correct": "29.6", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "4710", "lr": "3.14084e-05", "gnorm": "9.053", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "673"} 2023-01-29 16:22:58 | INFO | train_inner | {"epoch": 3, "update": 2.183, "s2c_loss": "4.286", "loss": "2.97112", "s2c_nll_loss": "4.286", "s2c_accuracy": "47.5", "s2c_total": "64", "s2c_n_correct": "30.4", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "4720", "lr": "3.14751e-05", "gnorm": "8.305", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "676"} 2023-01-29 16:23:00 | INFO | train_inner | {"epoch": 3, "update": 2.188, "s2c_loss": "4.624", "loss": "3.2048", "s2c_nll_loss": "4.624", "s2c_accuracy": "44.531", "s2c_total": "64", "s2c_n_correct": "28.5", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "4730", "lr": "3.15418e-05", "gnorm": "8.776", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "678"} 2023-01-29 16:23:03 | INFO | train_inner | {"epoch": 3, "update": 2.192, "s2c_loss": "4.677", "loss": "3.24172", "s2c_nll_loss": "4.677", "s2c_accuracy": "45.156", "s2c_total": "64", "s2c_n_correct": "28.9", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "4740", "lr": "3.16084e-05", "gnorm": "7.381", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "681"} 2023-01-29 16:23:05 | INFO | train_inner | {"epoch": 3, "update": 2.197, "s2c_loss": "4.33", "loss": "3.00151", "s2c_nll_loss": "4.33", "s2c_accuracy": "48.906", 
"s2c_total": "64", "s2c_n_correct": "31.3", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "4750", "lr": "3.16751e-05", "gnorm": "7.978", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "683"} 2023-01-29 16:23:08 | INFO | train_inner | {"epoch": 3, "update": 2.202, "s2c_loss": "4.453", "loss": "3.08629", "s2c_nll_loss": "4.453", "s2c_accuracy": "48.75", "s2c_total": "64", "s2c_n_correct": "31.2", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "4760", "lr": "3.17417e-05", "gnorm": "7.93", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "686"} 2023-01-29 16:23:10 | INFO | train_inner | {"epoch": 3, "update": 2.206, "s2c_loss": "4.827", "loss": "3.34553", "s2c_nll_loss": "4.827", "s2c_accuracy": "42.188", "s2c_total": "64", "s2c_n_correct": "27", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "4770", "lr": "3.18084e-05", "gnorm": "7.976", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "688"} 2023-01-29 16:23:13 | INFO | train_inner | {"epoch": 3, "update": 2.211, "s2c_loss": "4.638", "loss": "3.21455", "s2c_nll_loss": "4.638", "s2c_accuracy": "45.781", "s2c_total": "64", "s2c_n_correct": "29.3", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "4780", "lr": "3.18751e-05", "gnorm": "7.83", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "691"} 2023-01-29 16:23:15 | INFO | train_inner | {"epoch": 3, "update": 2.216, "s2c_loss": "4.446", "loss": "3.08146", "s2c_nll_loss": "4.446", "s2c_accuracy": "48.281", "s2c_total": "64", "s2c_n_correct": "30.9", "wps": "259.5", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "4790", "lr": "3.19417e-05", "gnorm": "8.073", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "693"} 2023-01-29 16:23:18 | INFO | train_inner | {"epoch": 3, "update": 2.22, "s2c_loss": "4.556", "loss": "3.15823", "s2c_nll_loss": "4.556", "s2c_accuracy": "48.75", 
"s2c_total": "64", "s2c_n_correct": "31.2", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "4800", "lr": "3.20084e-05", "gnorm": "7.454", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "696"} 2023-01-29 16:23:20 | INFO | train_inner | {"epoch": 3, "update": 2.225, "s2c_loss": "4.328", "loss": "2.99964", "s2c_nll_loss": "4.328", "s2c_accuracy": "49.531", "s2c_total": "64", "s2c_n_correct": "31.7", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "4810", "lr": "3.20751e-05", "gnorm": "7.874", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "698"} 2023-01-29 16:23:23 | INFO | train_inner | {"epoch": 3, "update": 2.229, "s2c_loss": "4.391", "loss": "3.04337", "s2c_nll_loss": "4.391", "s2c_accuracy": "47.5", "s2c_total": "64", "s2c_n_correct": "30.4", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "4820", "lr": "3.21417e-05", "gnorm": "7.525", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "701"} 2023-01-29 16:23:26 | INFO | train_inner | {"epoch": 3, "update": 2.234, "s2c_loss": "4.335", "loss": "3.00496", "s2c_nll_loss": "4.335", "s2c_accuracy": "48.906", "s2c_total": "64", "s2c_n_correct": "31.3", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "4830", "lr": "3.22084e-05", "gnorm": "8.086", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "703"} 2023-01-29 16:23:28 | INFO | train_inner | {"epoch": 3, "update": 2.239, "s2c_loss": "4.27", "loss": "2.95999", "s2c_nll_loss": "4.27", "s2c_accuracy": "49.688", "s2c_total": "64", "s2c_n_correct": "31.8", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "4840", "lr": "3.22751e-05", "gnorm": "7.912", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "706"} 2023-01-29 16:23:31 | INFO | train_inner | {"epoch": 3, "update": 2.243, "s2c_loss": "4.459", "loss": "3.09075", "s2c_nll_loss": "4.459", "s2c_accuracy": "47.969", 
"s2c_total": "64", "s2c_n_correct": "30.7", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "4850", "lr": "3.23417e-05", "gnorm": "8.362", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "709"} 2023-01-29 16:23:33 | INFO | train_inner | {"epoch": 3, "update": 2.248, "s2c_loss": "4.508", "loss": "3.12457", "s2c_nll_loss": "4.508", "s2c_accuracy": "48.594", "s2c_total": "64", "s2c_n_correct": "31.1", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "4860", "lr": "3.24084e-05", "gnorm": "8.148", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "711"} 2023-01-29 16:23:36 | INFO | train_inner | {"epoch": 3, "update": 2.253, "s2c_loss": "4.26", "loss": "2.95279", "s2c_nll_loss": "4.26", "s2c_accuracy": "46.875", "s2c_total": "64", "s2c_n_correct": "30", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "4870", "lr": "3.2475e-05", "gnorm": "9.166", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "714"} 2023-01-29 16:23:38 | INFO | train_inner | {"epoch": 3, "update": 2.257, "s2c_loss": "4.382", "loss": "3.03743", "s2c_nll_loss": "4.382", "s2c_accuracy": "49.844", "s2c_total": "64", "s2c_n_correct": "31.9", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "4880", "lr": "3.25417e-05", "gnorm": "8.024", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "716"} 2023-01-29 16:23:41 | INFO | train_inner | {"epoch": 3, "update": 2.262, "s2c_loss": "4.388", "loss": "3.0415", "s2c_nll_loss": "4.388", "s2c_accuracy": "49.062", "s2c_total": "64", "s2c_n_correct": "31.4", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "4890", "lr": "3.26084e-05", "gnorm": "8.228", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "719"} 2023-01-29 16:23:43 | INFO | train_inner | {"epoch": 3, "update": 2.266, "s2c_loss": "4.409", "loss": "3.05593", "s2c_nll_loss": "4.409", "s2c_accuracy": "46.094", 
"s2c_total": "64", "s2c_n_correct": "29.5", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "4900", "lr": "3.2675e-05", "gnorm": "8.859", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "721"} 2023-01-29 16:23:46 | INFO | train_inner | {"epoch": 3, "update": 2.271, "s2c_loss": "4.264", "loss": "2.95579", "s2c_nll_loss": "4.264", "s2c_accuracy": "48.438", "s2c_total": "64", "s2c_n_correct": "31", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "4910", "lr": "3.27417e-05", "gnorm": "8.355", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "724"} 2023-01-29 16:23:48 | INFO | train_inner | {"epoch": 3, "update": 2.276, "s2c_loss": "4.268", "loss": "2.95854", "s2c_nll_loss": "4.268", "s2c_accuracy": "52.031", "s2c_total": "64", "s2c_n_correct": "33.3", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "4920", "lr": "3.28084e-05", "gnorm": "7.838", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "726"} 2023-01-29 16:23:51 | INFO | train_inner | {"epoch": 3, "update": 2.28, "s2c_loss": "4.377", "loss": "3.0338", "s2c_nll_loss": "4.377", "s2c_accuracy": "49.375", "s2c_total": "64", "s2c_n_correct": "31.6", "wps": "247.9", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "4930", "lr": "3.2875e-05", "gnorm": "7.584", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "729"} 2023-01-29 16:23:54 | INFO | train_inner | {"epoch": 3, "update": 2.285, "s2c_loss": "4.12", "loss": "2.85565", "s2c_nll_loss": "4.12", "s2c_accuracy": "50.625", "s2c_total": "64", "s2c_n_correct": "32.4", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "4940", "lr": "3.29417e-05", "gnorm": "8.801", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "731"} 2023-01-29 16:23:56 | INFO | train_inner | {"epoch": 3, "update": 2.29, "s2c_loss": "4.325", "loss": "2.99819", "s2c_nll_loss": "4.325", "s2c_accuracy": "49.844", 
"s2c_total": "64", "s2c_n_correct": "31.9", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "4950", "lr": "3.30084e-05", "gnorm": "7.088", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "734"} 2023-01-29 16:23:59 | INFO | train_inner | {"epoch": 3, "update": 2.294, "s2c_loss": "4.43", "loss": "3.07063", "s2c_nll_loss": "4.43", "s2c_accuracy": "47.344", "s2c_total": "64", "s2c_n_correct": "30.3", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "4960", "lr": "3.3075e-05", "gnorm": "7.326", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "737"} 2023-01-29 16:24:01 | INFO | train_inner | {"epoch": 3, "update": 2.299, "s2c_loss": "4.256", "loss": "2.94976", "s2c_nll_loss": "4.256", "s2c_accuracy": "50.312", "s2c_total": "64", "s2c_n_correct": "32.2", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "4970", "lr": "3.31417e-05", "gnorm": "8.034", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "739"} 2023-01-29 16:24:04 | INFO | train_inner | {"epoch": 3, "update": 2.303, "s2c_loss": "4.288", "loss": "2.97205", "s2c_nll_loss": "4.288", "s2c_accuracy": "47.656", "s2c_total": "64", "s2c_n_correct": "30.5", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "4980", "lr": "3.32083e-05", "gnorm": "7.823", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "742"} 2023-01-29 16:24:06 | INFO | train_inner | {"epoch": 3, "update": 2.308, "s2c_loss": "4.349", "loss": "3.01448", "s2c_nll_loss": "4.349", "s2c_accuracy": "47.031", "s2c_total": "64", "s2c_n_correct": "30.1", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "4990", "lr": "3.3275e-05", "gnorm": "8.5", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "744"} 2023-01-29 16:24:09 | INFO | train_inner | {"epoch": 3, "update": 2.313, "s2c_loss": "4.385", "loss": "3.03941", "s2c_nll_loss": "4.385", "s2c_accuracy": "48.438", 
"s2c_total": "64", "s2c_n_correct": "31", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "5000", "lr": "3.33417e-05", "gnorm": "8.118", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "747"} 2023-01-29 16:24:11 | INFO | train_inner | {"epoch": 3, "update": 2.317, "s2c_loss": "4.153", "loss": "2.87856", "s2c_nll_loss": "4.153", "s2c_accuracy": "49.531", "s2c_total": "64", "s2c_n_correct": "31.7", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "5010", "lr": "3.34083e-05", "gnorm": "8.451", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "749"} 2023-01-29 16:24:14 | INFO | train_inner | {"epoch": 3, "update": 2.322, "s2c_loss": "4.094", "loss": "2.83804", "s2c_nll_loss": "4.094", "s2c_accuracy": "52.969", "s2c_total": "64", "s2c_n_correct": "33.9", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "5020", "lr": "3.3475e-05", "gnorm": "7.841", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "752"} 2023-01-29 16:24:16 | INFO | train_inner | {"epoch": 3, "update": 2.327, "s2c_loss": "4.165", "loss": "2.88663", "s2c_nll_loss": "4.165", "s2c_accuracy": "50.312", "s2c_total": "64", "s2c_n_correct": "32.2", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "5030", "lr": "3.35417e-05", "gnorm": "8.712", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "754"} 2023-01-29 16:24:19 | INFO | train_inner | {"epoch": 3, "update": 2.331, "s2c_loss": "4.201", "loss": "2.91205", "s2c_nll_loss": "4.201", "s2c_accuracy": "51.875", "s2c_total": "64", "s2c_n_correct": "33.2", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "5040", "lr": "3.36083e-05", "gnorm": "7.93", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "757"} 2023-01-29 16:24:21 | INFO | train_inner | {"epoch": 3, "update": 2.336, "s2c_loss": "4.349", "loss": "3.01466", "s2c_nll_loss": "4.349", "s2c_accuracy": "48.594", 
"s2c_total": "64", "s2c_n_correct": "31.1", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "5050", "lr": "3.3675e-05", "gnorm": "8.784", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "759"} 2023-01-29 16:24:24 | INFO | train_inner | {"epoch": 3, "update": 2.34, "s2c_loss": "4.182", "loss": "2.89864", "s2c_nll_loss": "4.182", "s2c_accuracy": "51.719", "s2c_total": "64", "s2c_n_correct": "33.1", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "5060", "lr": "3.37416e-05", "gnorm": "8.144", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "762"} 2023-01-29 16:24:27 | INFO | train_inner | {"epoch": 3, "update": 2.345, "s2c_loss": "4.031", "loss": "2.79401", "s2c_nll_loss": "4.031", "s2c_accuracy": "52.656", "s2c_total": "64", "s2c_n_correct": "33.7", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "5070", "lr": "3.38083e-05", "gnorm": "9.064", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "765"} 2023-01-29 16:24:29 | INFO | train_inner | {"epoch": 3, "update": 2.35, "s2c_loss": "4.121", "loss": "2.85678", "s2c_nll_loss": "4.121", "s2c_accuracy": "52.812", "s2c_total": "64", "s2c_n_correct": "33.8", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "5080", "lr": "3.3875e-05", "gnorm": "8.42", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "767"} 2023-01-29 16:24:32 | INFO | train_inner | {"epoch": 3, "update": 2.354, "s2c_loss": "4.42", "loss": "3.06405", "s2c_nll_loss": "4.42", "s2c_accuracy": "46.25", "s2c_total": "64", "s2c_n_correct": "29.6", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "5090", "lr": "3.39416e-05", "gnorm": "7.971", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "770"} 2023-01-29 16:24:34 | INFO | train_inner | {"epoch": 3, "update": 2.359, "s2c_loss": "4.052", "loss": "2.8085", "s2c_nll_loss": "4.052", "s2c_accuracy": "51.719", 
"s2c_total": "64", "s2c_n_correct": "33.1", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "5100", "lr": "3.40083e-05", "gnorm": "8.177", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "772"} 2023-01-29 16:24:37 | INFO | train_inner | {"epoch": 3, "update": 2.364, "s2c_loss": "4.181", "loss": "2.89835", "s2c_nll_loss": "4.181", "s2c_accuracy": "50.312", "s2c_total": "64", "s2c_n_correct": "32.2", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "5110", "lr": "3.4075e-05", "gnorm": "9.561", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "775"} 2023-01-29 16:24:39 | INFO | train_inner | {"epoch": 3, "update": 2.368, "s2c_loss": "4.114", "loss": "2.85126", "s2c_nll_loss": "4.114", "s2c_accuracy": "49.688", "s2c_total": "64", "s2c_n_correct": "31.8", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "5120", "lr": "3.41416e-05", "gnorm": "8.712", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "777"} 2023-01-29 16:24:42 | INFO | train_inner | {"epoch": 3, "update": 2.373, "s2c_loss": "3.991", "loss": "2.76653", "s2c_nll_loss": "3.991", "s2c_accuracy": "50.781", "s2c_total": "64", "s2c_n_correct": "32.5", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "5130", "lr": "3.42083e-05", "gnorm": "8.44", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "780"} 2023-01-29 16:24:44 | INFO | train_inner | {"epoch": 3, "update": 2.377, "s2c_loss": "4.245", "loss": "2.94266", "s2c_nll_loss": "4.245", "s2c_accuracy": "48.438", "s2c_total": "64", "s2c_n_correct": "31", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "5140", "lr": "3.4275e-05", "gnorm": "9.169", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "782"} 2023-01-29 16:24:47 | INFO | train_inner | {"epoch": 3, "update": 2.382, "s2c_loss": "3.962", "loss": "2.74596", "s2c_nll_loss": "3.962", "s2c_accuracy": "52.969", 
"s2c_total": "64", "s2c_n_correct": "33.9", "wps": "258.3", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "5150", "lr": "3.43416e-05", "gnorm": "9.113", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "785"} 2023-01-29 16:24:49 | INFO | train_inner | {"epoch": 3, "update": 2.387, "s2c_loss": "4.142", "loss": "2.87104", "s2c_nll_loss": "4.142", "s2c_accuracy": "52.969", "s2c_total": "64", "s2c_n_correct": "33.9", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "5160", "lr": "3.44083e-05", "gnorm": "8.265", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "787"} 2023-01-29 16:24:52 | INFO | train_inner | {"epoch": 3, "update": 2.391, "s2c_loss": "4.237", "loss": "2.93695", "s2c_nll_loss": "4.237", "s2c_accuracy": "51.406", "s2c_total": "64", "s2c_n_correct": "32.9", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "5170", "lr": "3.44749e-05", "gnorm": "8.508", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "790"} 2023-01-29 16:24:54 | INFO | train_inner | {"epoch": 3, "update": 2.396, "s2c_loss": "4.114", "loss": "2.85157", "s2c_nll_loss": "4.114", "s2c_accuracy": "50.469", "s2c_total": "64", "s2c_n_correct": "32.3", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "5180", "lr": "3.45416e-05", "gnorm": "8.386", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "792"} 2023-01-29 16:24:57 | INFO | train_inner | {"epoch": 3, "update": 2.401, "s2c_loss": "3.915", "loss": "2.71385", "s2c_nll_loss": "3.915", "s2c_accuracy": "54.062", "s2c_total": "64", "s2c_n_correct": "34.6", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "5190", "lr": "3.46083e-05", "gnorm": "8.655", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "795"} 2023-01-29 16:25:00 | INFO | train_inner | {"epoch": 3, "update": 2.405, "s2c_loss": "3.879", "loss": "2.6884", "s2c_nll_loss": "3.879", "s2c_accuracy": "56.562", 
"s2c_total": "64", "s2c_n_correct": "36.2", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "5200", "lr": "3.46749e-05", "gnorm": "7.899", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "797"} 2023-01-29 16:25:02 | INFO | train_inner | {"epoch": 3, "update": 2.41, "s2c_loss": "3.879", "loss": "2.68893", "s2c_nll_loss": "3.879", "s2c_accuracy": "54.844", "s2c_total": "64", "s2c_n_correct": "35.1", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "5210", "lr": "3.47416e-05", "gnorm": "7.608", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "800"} 2023-01-29 16:25:05 | INFO | train_inner | {"epoch": 3, "update": 2.414, "s2c_loss": "3.915", "loss": "2.71395", "s2c_nll_loss": "3.915", "s2c_accuracy": "57.656", "s2c_total": "64", "s2c_n_correct": "36.9", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "5220", "lr": "3.48083e-05", "gnorm": "7.97", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "803"} 2023-01-29 16:25:07 | INFO | train_inner | {"epoch": 3, "update": 2.419, "s2c_loss": "4.265", "loss": "2.95641", "s2c_nll_loss": "4.265", "s2c_accuracy": "53.906", "s2c_total": "64", "s2c_n_correct": "34.5", "wps": "257", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "5230", "lr": "3.48749e-05", "gnorm": "8.139", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "805"} 2023-01-29 16:25:10 | INFO | train_inner | {"epoch": 3, "update": 2.424, "s2c_loss": "3.921", "loss": "2.71758", "s2c_nll_loss": "3.921", "s2c_accuracy": "52.5", "s2c_total": "64", "s2c_n_correct": "33.6", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "5240", "lr": "3.49416e-05", "gnorm": "8.436", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "808"} 2023-01-29 16:25:12 | INFO | train_inner | {"epoch": 3, "update": 2.428, "s2c_loss": "3.772", "loss": "2.61453", "s2c_nll_loss": "3.772", "s2c_accuracy": "54.375", 
"s2c_total": "64", "s2c_n_correct": "34.8", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "5250", "lr": "3.50083e-05", "gnorm": "9.171", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "810"} 2023-01-29 16:25:15 | INFO | train_inner | {"epoch": 3, "update": 2.433, "s2c_loss": "3.755", "loss": "2.60267", "s2c_nll_loss": "3.755", "s2c_accuracy": "53.906", "s2c_total": "64", "s2c_n_correct": "34.5", "wps": "245.2", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "5260", "lr": "3.50749e-05", "gnorm": "8.944", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "813"} 2023-01-29 16:25:17 | INFO | train_inner | {"epoch": 3, "update": 2.438, "s2c_loss": "3.908", "loss": "2.70913", "s2c_nll_loss": "3.908", "s2c_accuracy": "55.781", "s2c_total": "64", "s2c_n_correct": "35.7", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "5270", "lr": "3.51416e-05", "gnorm": "9.126", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "815"} 2023-01-29 16:25:20 | INFO | train_inner | {"epoch": 3, "update": 2.442, "s2c_loss": "3.825", "loss": "2.65106", "s2c_nll_loss": "3.825", "s2c_accuracy": "54.219", "s2c_total": "64", "s2c_n_correct": "34.7", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "5280", "lr": "3.52082e-05", "gnorm": "9.406", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "818"} 2023-01-29 16:25:22 | INFO | train_inner | {"epoch": 3, "update": 2.447, "s2c_loss": "3.912", "loss": "2.71185", "s2c_nll_loss": "3.912", "s2c_accuracy": "55.625", "s2c_total": "64", "s2c_n_correct": "35.6", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "5290", "lr": "3.52749e-05", "gnorm": "9.378", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "820"} 2023-01-29 16:25:25 | INFO | train_inner | {"epoch": 3, "update": 2.451, "s2c_loss": "4.169", "loss": "2.8894", "s2c_nll_loss": "4.169", "s2c_accuracy": "51.094", 
"s2c_total": "64", "s2c_n_correct": "32.7", "wps": "257.8", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "5300", "lr": "3.53416e-05", "gnorm": "8.908", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "823"} 2023-01-29 16:25:27 | INFO | train_inner | {"epoch": 3, "update": 2.456, "s2c_loss": "3.828", "loss": "2.65358", "s2c_nll_loss": "3.828", "s2c_accuracy": "52.969", "s2c_total": "64", "s2c_n_correct": "33.9", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "5310", "lr": "3.54082e-05", "gnorm": "8.34", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "825"} 2023-01-29 16:25:28 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 256.0 2023-01-29 16:25:30 | INFO | train_inner | {"epoch": 3, "update": 2.461, "s2c_loss": "3.808", "loss": "2.63947", "s2c_nll_loss": "3.808", "s2c_accuracy": "56.562", "s2c_total": "64", "s2c_n_correct": "36.2", "wps": "231.6", "ups": "3.62", "wpb": "64", "bsz": "64", "num_updates": "5320", "lr": "3.54749e-05", "gnorm": "8.441", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "828"} 2023-01-29 16:25:33 | INFO | train_inner | {"epoch": 3, "update": 2.466, "s2c_loss": "3.825", "loss": "2.65118", "s2c_nll_loss": "3.825", "s2c_accuracy": "55", "s2c_total": "64", "s2c_n_correct": "35.2", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "5330", "lr": "3.55416e-05", "gnorm": "8.731", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "831"} 2023-01-29 16:25:35 | INFO | train_inner | {"epoch": 3, "update": 2.47, "s2c_loss": "3.752", "loss": "2.60058", "s2c_nll_loss": "3.752", "s2c_accuracy": "56.094", "s2c_total": "64", "s2c_n_correct": "35.9", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "5340", "lr": "3.56082e-05", "gnorm": "8.186", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "833"} 2023-01-29 16:25:38 | INFO | train_inner 
| {"epoch": 3, "update": 2.475, "s2c_loss": "3.705", "loss": "2.56829", "s2c_nll_loss": "3.705", "s2c_accuracy": "57.5", "s2c_total": "64", "s2c_n_correct": "36.8", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "5350", "lr": "3.56749e-05", "gnorm": "9.579", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "836"} 2023-01-29 16:25:40 | INFO | train_inner | {"epoch": 3, "update": 2.48, "s2c_loss": "3.743", "loss": "2.59453", "s2c_nll_loss": "3.743", "s2c_accuracy": "55.312", "s2c_total": "64", "s2c_n_correct": "35.4", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "5360", "lr": "3.57415e-05", "gnorm": "9.095", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "838"} 2023-01-29 16:25:43 | INFO | train_inner | {"epoch": 3, "update": 2.484, "s2c_loss": "3.912", "loss": "2.7118", "s2c_nll_loss": "3.912", "s2c_accuracy": "51.875", "s2c_total": "64", "s2c_n_correct": "33.2", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "5370", "lr": "3.58082e-05", "gnorm": "9.986", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "841"} 2023-01-29 16:25:46 | INFO | train_inner | {"epoch": 3, "update": 2.489, "s2c_loss": "3.714", "loss": "2.57423", "s2c_nll_loss": "3.714", "s2c_accuracy": "56.25", "s2c_total": "64", "s2c_n_correct": "36", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "5380", "lr": "3.58749e-05", "gnorm": "9.645", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "843"} 2023-01-29 16:25:48 | INFO | train_inner | {"epoch": 3, "update": 2.494, "s2c_loss": "4.031", "loss": "2.79395", "s2c_nll_loss": "4.031", "s2c_accuracy": "51.406", "s2c_total": "64", "s2c_n_correct": "32.9", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "5390", "lr": "3.59415e-05", "gnorm": "8.512", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "846"} 2023-01-29 16:25:51 | INFO | train_inner | 
{"epoch": 3, "update": 2.498, "s2c_loss": "3.57", "loss": "2.47428", "s2c_nll_loss": "3.57", "s2c_accuracy": "55.938", "s2c_total": "64", "s2c_n_correct": "35.8", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "5400", "lr": "3.60082e-05", "gnorm": "9.222", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "848"} 2023-01-29 16:25:53 | INFO | train_inner | {"epoch": 3, "update": 2.503, "s2c_loss": "3.947", "loss": "2.73558", "s2c_nll_loss": "3.947", "s2c_accuracy": "55.156", "s2c_total": "64", "s2c_n_correct": "35.3", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "5410", "lr": "3.60749e-05", "gnorm": "8.782", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "851"} 2023-01-29 16:25:56 | INFO | train_inner | {"epoch": 3, "update": 2.507, "s2c_loss": "3.788", "loss": "2.62564", "s2c_nll_loss": "3.788", "s2c_accuracy": "55.781", "s2c_total": "64", "s2c_n_correct": "35.7", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "5420", "lr": "3.61415e-05", "gnorm": "8.055", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "854"} 2023-01-29 16:25:58 | INFO | train_inner | {"epoch": 3, "update": 2.512, "s2c_loss": "3.5", "loss": "2.42605", "s2c_nll_loss": "3.5", "s2c_accuracy": "55.781", "s2c_total": "64", "s2c_n_correct": "35.7", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "5430", "lr": "3.62082e-05", "gnorm": "8.132", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "856"} 2023-01-29 16:26:01 | INFO | train_inner | {"epoch": 3, "update": 2.517, "s2c_loss": "3.822", "loss": "2.64947", "s2c_nll_loss": "3.822", "s2c_accuracy": "54.375", "s2c_total": "64", "s2c_n_correct": "34.8", "wps": "247.5", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "5440", "lr": "3.62749e-05", "gnorm": "8.374", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "859"} 2023-01-29 16:26:03 | INFO | train_inner | 
{"epoch": 3, "update": 2.521, "s2c_loss": "3.473", "loss": "2.40738", "s2c_nll_loss": "3.473", "s2c_accuracy": "58.594", "s2c_total": "64", "s2c_n_correct": "37.5", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "5450", "lr": "3.63415e-05", "gnorm": "8.277", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "861"} 2023-01-29 16:26:06 | INFO | train_inner | {"epoch": 3, "update": 2.526, "s2c_loss": "3.485", "loss": "2.41595", "s2c_nll_loss": "3.485", "s2c_accuracy": "59.531", "s2c_total": "64", "s2c_n_correct": "38.1", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "5460", "lr": "3.64082e-05", "gnorm": "9.121", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "864"} 2023-01-29 16:26:08 | INFO | train_inner | {"epoch": 3, "update": 2.531, "s2c_loss": "3.767", "loss": "2.61104", "s2c_nll_loss": "3.767", "s2c_accuracy": "54.688", "s2c_total": "64", "s2c_n_correct": "35", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "5470", "lr": "3.64748e-05", "gnorm": "8.703", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "866"} 2023-01-29 16:26:11 | INFO | train_inner | {"epoch": 3, "update": 2.535, "s2c_loss": "3.754", "loss": "2.60189", "s2c_nll_loss": "3.754", "s2c_accuracy": "54.219", "s2c_total": "64", "s2c_n_correct": "34.7", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "5480", "lr": "3.65415e-05", "gnorm": "9.185", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "869"} 2023-01-29 16:26:13 | INFO | train_inner | {"epoch": 3, "update": 2.54, "s2c_loss": "3.864", "loss": "2.67846", "s2c_nll_loss": "3.864", "s2c_accuracy": "53.75", "s2c_total": "64", "s2c_n_correct": "34.4", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "5490", "lr": "3.66082e-05", "gnorm": "8.905", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "871"} 2023-01-29 16:26:16 | INFO | train_inner | 
{"epoch": 3, "update": 2.544, "s2c_loss": "3.698", "loss": "2.56352", "s2c_nll_loss": "3.698", "s2c_accuracy": "53.906", "s2c_total": "64", "s2c_n_correct": "34.5", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "5500", "lr": "3.66748e-05", "gnorm": "9.547", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "874"} 2023-01-29 16:26:18 | INFO | train_inner | {"epoch": 3, "update": 2.549, "s2c_loss": "3.53", "loss": "2.44684", "s2c_nll_loss": "3.53", "s2c_accuracy": "59.062", "s2c_total": "64", "s2c_n_correct": "37.8", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "5510", "lr": "3.67415e-05", "gnorm": "8.781", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "876"} 2023-01-29 16:26:21 | INFO | train_inner | {"epoch": 3, "update": 2.554, "s2c_loss": "3.586", "loss": "2.48563", "s2c_nll_loss": "3.586", "s2c_accuracy": "57.656", "s2c_total": "64", "s2c_n_correct": "36.9", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "5520", "lr": "3.68082e-05", "gnorm": "8.757", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "879"} 2023-01-29 16:26:23 | INFO | train_inner | {"epoch": 3, "update": 2.558, "s2c_loss": "3.827", "loss": "2.65273", "s2c_nll_loss": "3.827", "s2c_accuracy": "54.688", "s2c_total": "64", "s2c_n_correct": "35", "wps": "259.6", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "5530", "lr": "3.68748e-05", "gnorm": "8.424", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "881"} 2023-01-29 16:26:26 | INFO | train_inner | {"epoch": 3, "update": 2.563, "s2c_loss": "3.642", "loss": "2.52458", "s2c_nll_loss": "3.642", "s2c_accuracy": "56.094", "s2c_total": "64", "s2c_n_correct": "35.9", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "5540", "lr": "3.69415e-05", "gnorm": "8.942", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "884"} 2023-01-29 16:26:28 | INFO | train_inner | 
{"epoch": 3, "update": 2.568, "s2c_loss": "3.578", "loss": "2.4804", "s2c_nll_loss": "3.578", "s2c_accuracy": "57.188", "s2c_total": "64", "s2c_n_correct": "36.6", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "5550", "lr": "3.70082e-05", "gnorm": "9.299", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "886"} 2023-01-29 16:26:31 | INFO | train_inner | {"epoch": 3, "update": 2.572, "s2c_loss": "3.471", "loss": "2.40558", "s2c_nll_loss": "3.471", "s2c_accuracy": "59.062", "s2c_total": "64", "s2c_n_correct": "37.8", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "5560", "lr": "3.70748e-05", "gnorm": "8.533", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "889"} 2023-01-29 16:26:33 | INFO | train_inner | {"epoch": 3, "update": 2.577, "s2c_loss": "3.512", "loss": "2.43459", "s2c_nll_loss": "3.512", "s2c_accuracy": "59.844", "s2c_total": "64", "s2c_n_correct": "38.3", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "5570", "lr": "3.71415e-05", "gnorm": "8.555", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "891"} 2023-01-29 16:26:36 | INFO | train_inner | {"epoch": 3, "update": 2.581, "s2c_loss": "3.508", "loss": "2.43153", "s2c_nll_loss": "3.508", "s2c_accuracy": "56.719", "s2c_total": "64", "s2c_n_correct": "36.3", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "5580", "lr": "3.72081e-05", "gnorm": "8.847", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "894"} 2023-01-29 16:26:39 | INFO | train_inner | {"epoch": 3, "update": 2.586, "s2c_loss": "3.233", "loss": "2.24115", "s2c_nll_loss": "3.233", "s2c_accuracy": "62.344", "s2c_total": "64", "s2c_n_correct": "39.9", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "5590", "lr": "3.72748e-05", "gnorm": "8.855", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "896"} 2023-01-29 16:26:41 | INFO | train_inner | 
{"epoch": 3, "update": 2.591, "s2c_loss": "3.334", "loss": "2.31122", "s2c_nll_loss": "3.334", "s2c_accuracy": "58.438", "s2c_total": "64", "s2c_n_correct": "37.4", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "5600", "lr": "3.73415e-05", "gnorm": "8.622", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "899"} 2023-01-29 16:26:44 | INFO | train_inner | {"epoch": 3, "update": 2.595, "s2c_loss": "3.298", "loss": "2.28616", "s2c_nll_loss": "3.298", "s2c_accuracy": "58.594", "s2c_total": "64", "s2c_n_correct": "37.5", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "5610", "lr": "3.74081e-05", "gnorm": "8.536", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "902"} 2023-01-29 16:26:46 | INFO | train_inner | {"epoch": 3, "update": 2.6, "s2c_loss": "3.505", "loss": "2.42923", "s2c_nll_loss": "3.505", "s2c_accuracy": "57.188", "s2c_total": "64", "s2c_n_correct": "36.6", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "5620", "lr": "3.74748e-05", "gnorm": "8.701", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "904"} 2023-01-29 16:26:49 | INFO | train_inner | {"epoch": 3, "update": 2.605, "s2c_loss": "3.254", "loss": "2.25527", "s2c_nll_loss": "3.254", "s2c_accuracy": "62.812", "s2c_total": "64", "s2c_n_correct": "40.2", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "5630", "lr": "3.75415e-05", "gnorm": "8.959", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "907"} 2023-01-29 16:26:51 | INFO | train_inner | {"epoch": 3, "update": 2.609, "s2c_loss": "3.459", "loss": "2.39734", "s2c_nll_loss": "3.459", "s2c_accuracy": "61.094", "s2c_total": "64", "s2c_n_correct": "39.1", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "5640", "lr": "3.76081e-05", "gnorm": "8.686", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "909"} 2023-01-29 16:26:54 | INFO | train_inner | 
{"epoch": 3, "update": 2.614, "s2c_loss": "3.527", "loss": "2.44463", "s2c_nll_loss": "3.527", "s2c_accuracy": "57.812", "s2c_total": "64", "s2c_n_correct": "37", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "5650", "lr": "3.76748e-05", "gnorm": "8.628", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "912"} 2023-01-29 16:26:56 | INFO | train_inner | {"epoch": 3, "update": 2.618, "s2c_loss": "3.548", "loss": "2.45955", "s2c_nll_loss": "3.548", "s2c_accuracy": "58.125", "s2c_total": "64", "s2c_n_correct": "37.2", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "5660", "lr": "3.77414e-05", "gnorm": "8.897", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "914"} 2023-01-29 16:26:59 | INFO | train_inner | {"epoch": 3, "update": 2.623, "s2c_loss": "3.426", "loss": "2.37477", "s2c_nll_loss": "3.426", "s2c_accuracy": "55.781", "s2c_total": "64", "s2c_n_correct": "35.7", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "5670", "lr": "3.78081e-05", "gnorm": "9.19", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "917"} 2023-01-29 16:27:01 | INFO | train_inner | {"epoch": 3, "update": 2.628, "s2c_loss": "3.468", "loss": "2.40359", "s2c_nll_loss": "3.468", "s2c_accuracy": "57.812", "s2c_total": "64", "s2c_n_correct": "37", "wps": "248", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "5680", "lr": "3.78748e-05", "gnorm": "9.132", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "919"} 2023-01-29 16:27:04 | INFO | train_inner | {"epoch": 3, "update": 2.632, "s2c_loss": "3.68", "loss": "2.55094", "s2c_nll_loss": "3.68", "s2c_accuracy": "52.969", "s2c_total": "64", "s2c_n_correct": "33.9", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "5690", "lr": "3.79414e-05", "gnorm": "9.549", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "922"} 2023-01-29 16:27:06 | INFO | train_inner | {"epoch": 
3, "update": 2.637, "s2c_loss": "3.319", "loss": "2.30031", "s2c_nll_loss": "3.319", "s2c_accuracy": "61.25", "s2c_total": "64", "s2c_n_correct": "39.2", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "5700", "lr": "3.80081e-05", "gnorm": "8.742", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "924"} 2023-01-29 16:27:09 | INFO | train_inner | {"epoch": 3, "update": 2.642, "s2c_loss": "3.55", "loss": "2.46094", "s2c_nll_loss": "3.55", "s2c_accuracy": "56.875", "s2c_total": "64", "s2c_n_correct": "36.4", "wps": "249.3", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "5710", "lr": "3.80748e-05", "gnorm": "8.761", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "927"} 2023-01-29 16:27:12 | INFO | train_inner | {"epoch": 3, "update": 2.646, "s2c_loss": "3.417", "loss": "2.36823", "s2c_nll_loss": "3.417", "s2c_accuracy": "59.531", "s2c_total": "64", "s2c_n_correct": "38.1", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "5720", "lr": "3.81414e-05", "gnorm": "8.003", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "929"} 2023-01-29 16:27:14 | INFO | train_inner | {"epoch": 3, "update": 2.651, "s2c_loss": "3.143", "loss": "2.17869", "s2c_nll_loss": "3.143", "s2c_accuracy": "60.625", "s2c_total": "64", "s2c_n_correct": "38.8", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "5730", "lr": "3.82081e-05", "gnorm": "9.21", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "932"} 2023-01-29 16:27:17 | INFO | train_inner | {"epoch": 3, "update": 2.655, "s2c_loss": "3.48", "loss": "2.41215", "s2c_nll_loss": "3.48", "s2c_accuracy": "58.438", "s2c_total": "64", "s2c_n_correct": "37.4", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "5740", "lr": "3.82748e-05", "gnorm": "9.098", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "935"} 2023-01-29 16:27:19 | INFO | train_inner | {"epoch": 3, 
"update": 2.66, "s2c_loss": "3.567", "loss": "2.47227", "s2c_nll_loss": "3.567", "s2c_accuracy": "57.969", "s2c_total": "64", "s2c_n_correct": "37.1", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "5750", "lr": "3.83414e-05", "gnorm": "8.762", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "937"} 2023-01-29 16:27:22 | INFO | train_inner | {"epoch": 3, "update": 2.665, "s2c_loss": "3.109", "loss": "2.15515", "s2c_nll_loss": "3.109", "s2c_accuracy": "61.094", "s2c_total": "64", "s2c_n_correct": "39.1", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "5760", "lr": "3.84081e-05", "gnorm": "9.189", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "940"} 2023-01-29 16:27:24 | INFO | train_inner | {"epoch": 3, "update": 2.669, "s2c_loss": "2.984", "loss": "2.06844", "s2c_nll_loss": "2.984", "s2c_accuracy": "63.594", "s2c_total": "64", "s2c_n_correct": "40.7", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "5770", "lr": "3.84747e-05", "gnorm": "8.992", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "942"} 2023-01-29 16:27:27 | INFO | train_inner | {"epoch": 3, "update": 2.674, "s2c_loss": "3.456", "loss": "2.39518", "s2c_nll_loss": "3.456", "s2c_accuracy": "58.906", "s2c_total": "64", "s2c_n_correct": "37.7", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "5780", "lr": "3.85414e-05", "gnorm": "9.318", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "945"} 2023-01-29 16:27:29 | INFO | train_inner | {"epoch": 3, "update": 2.679, "s2c_loss": "3.273", "loss": "2.26877", "s2c_nll_loss": "3.273", "s2c_accuracy": "58.438", "s2c_total": "64", "s2c_n_correct": "37.4", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "5790", "lr": "3.86081e-05", "gnorm": "9.531", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "947"} 2023-01-29 16:27:32 | INFO | train_inner | {"epoch": 3, 
"update": 2.683, "s2c_loss": "3.056", "loss": "2.11818", "s2c_nll_loss": "3.056", "s2c_accuracy": "63.438", "s2c_total": "64", "s2c_n_correct": "40.6", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "5800", "lr": "3.86747e-05", "gnorm": "9.668", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "950"} 2023-01-29 16:27:34 | INFO | train_inner | {"epoch": 3, "update": 2.688, "s2c_loss": "2.967", "loss": "2.05637", "s2c_nll_loss": "2.967", "s2c_accuracy": "65.938", "s2c_total": "64", "s2c_n_correct": "42.2", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "5810", "lr": "3.87414e-05", "gnorm": "10.08", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "952"} 2023-01-29 16:27:37 | INFO | train_inner | {"epoch": 3, "update": 2.692, "s2c_loss": "3.056", "loss": "2.1184", "s2c_nll_loss": "3.056", "s2c_accuracy": "61.719", "s2c_total": "64", "s2c_n_correct": "39.5", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "5820", "lr": "3.88081e-05", "gnorm": "9.159", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "955"} 2023-01-29 16:27:39 | INFO | train_inner | {"epoch": 3, "update": 2.697, "s2c_loss": "3.191", "loss": "2.21183", "s2c_nll_loss": "3.191", "s2c_accuracy": "62.812", "s2c_total": "64", "s2c_n_correct": "40.2", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "5830", "lr": "3.88747e-05", "gnorm": "9.354", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "957"} 2023-01-29 16:27:42 | INFO | train_inner | {"epoch": 3, "update": 2.702, "s2c_loss": "3.06", "loss": "2.12072", "s2c_nll_loss": "3.06", "s2c_accuracy": "63.906", "s2c_total": "64", "s2c_n_correct": "40.9", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "5840", "lr": "3.89414e-05", "gnorm": "9.09", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "960"} 2023-01-29 16:27:45 | INFO | train_inner | {"epoch": 3, "update": 
2.706, "s2c_loss": "2.973", "loss": "2.06044", "s2c_nll_loss": "2.973", "s2c_accuracy": "62.188", "s2c_total": "64", "s2c_n_correct": "39.8", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "5850", "lr": "3.90081e-05", "gnorm": "8.946", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "962"} 2023-01-29 16:27:47 | INFO | train_inner | {"epoch": 3, "update": 2.711, "s2c_loss": "3.551", "loss": "2.46126", "s2c_nll_loss": "3.551", "s2c_accuracy": "56.094", "s2c_total": "64", "s2c_n_correct": "35.9", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "5860", "lr": "3.90747e-05", "gnorm": "8.838", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "965"} 2023-01-29 16:27:50 | INFO | train_inner | {"epoch": 3, "update": 2.716, "s2c_loss": "3.377", "loss": "2.34093", "s2c_nll_loss": "3.377", "s2c_accuracy": "59.688", "s2c_total": "64", "s2c_n_correct": "38.2", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "5870", "lr": "3.91414e-05", "gnorm": "9.011", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "968"} 2023-01-29 16:27:52 | INFO | train_inner | {"epoch": 3, "update": 2.72, "s2c_loss": "3.334", "loss": "2.31067", "s2c_nll_loss": "3.334", "s2c_accuracy": "60.156", "s2c_total": "64", "s2c_n_correct": "38.5", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "5880", "lr": "3.9208e-05", "gnorm": "8.22", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "970"} 2023-01-29 16:27:55 | INFO | train_inner | {"epoch": 3, "update": 2.725, "s2c_loss": "3.126", "loss": "2.16694", "s2c_nll_loss": "3.126", "s2c_accuracy": "61.25", "s2c_total": "64", "s2c_n_correct": "39.2", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "5890", "lr": "3.92747e-05", "gnorm": "8.929", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "973"} 2023-01-29 16:27:57 | INFO | train_inner | {"epoch": 3, "update": 
2.729, "s2c_loss": "3.05", "loss": "2.11414", "s2c_nll_loss": "3.05", "s2c_accuracy": "62.969", "s2c_total": "64", "s2c_n_correct": "40.3", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "5900", "lr": "3.93414e-05", "gnorm": "8.957", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "975"} 2023-01-29 16:28:00 | INFO | train_inner | {"epoch": 3, "update": 2.734, "s2c_loss": "3.133", "loss": "2.17166", "s2c_nll_loss": "3.133", "s2c_accuracy": "63.75", "s2c_total": "64", "s2c_n_correct": "40.8", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "5910", "lr": "3.9408e-05", "gnorm": "9.051", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "978"} 2023-01-29 16:28:02 | INFO | train_inner | {"epoch": 3, "update": 2.739, "s2c_loss": "3.325", "loss": "2.30473", "s2c_nll_loss": "3.325", "s2c_accuracy": "59.844", "s2c_total": "64", "s2c_n_correct": "38.3", "wps": "259.3", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "5920", "lr": "3.94747e-05", "gnorm": "10.308", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "980"} 2023-01-29 16:28:05 | INFO | train_inner | {"epoch": 3, "update": 2.743, "s2c_loss": "3.177", "loss": "2.20195", "s2c_nll_loss": "3.177", "s2c_accuracy": "61.875", "s2c_total": "64", "s2c_n_correct": "39.6", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "5930", "lr": "3.95414e-05", "gnorm": "10.125", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "983"} 2023-01-29 16:28:07 | INFO | train_inner | {"epoch": 3, "update": 2.748, "s2c_loss": "2.905", "loss": "2.01355", "s2c_nll_loss": "2.905", "s2c_accuracy": "65.156", "s2c_total": "64", "s2c_n_correct": "41.7", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "5940", "lr": "3.9608e-05", "gnorm": "9.508", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "985"} 2023-01-29 16:28:10 | INFO | train_inner | {"epoch": 3, "update": 2.753, 
"s2c_loss": "3.394", "loss": "2.35251", "s2c_nll_loss": "3.394", "s2c_accuracy": "55", "s2c_total": "64", "s2c_n_correct": "35.2", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "5950", "lr": "3.96747e-05", "gnorm": "10.299", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "988"} 2023-01-29 16:28:12 | INFO | train_inner | {"epoch": 3, "update": 2.757, "s2c_loss": "3.202", "loss": "2.21916", "s2c_nll_loss": "3.202", "s2c_accuracy": "61.875", "s2c_total": "64", "s2c_n_correct": "39.6", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "5960", "lr": "3.97413e-05", "gnorm": "9.745", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "990"} 2023-01-29 16:28:15 | INFO | train_inner | {"epoch": 3, "update": 2.762, "s2c_loss": "3.188", "loss": "2.20982", "s2c_nll_loss": "3.188", "s2c_accuracy": "61.562", "s2c_total": "64", "s2c_n_correct": "39.4", "wps": "246.2", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "5970", "lr": "3.9808e-05", "gnorm": "9.119", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "993"} 2023-01-29 16:28:17 | INFO | train_inner | {"epoch": 3, "update": 2.766, "s2c_loss": "3.357", "loss": "2.32668", "s2c_nll_loss": "3.357", "s2c_accuracy": "62.188", "s2c_total": "64", "s2c_n_correct": "39.8", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "5980", "lr": "3.98747e-05", "gnorm": "8.583", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "995"} 2023-01-29 16:28:20 | INFO | train_inner | {"epoch": 3, "update": 2.771, "s2c_loss": "2.968", "loss": "2.0572", "s2c_nll_loss": "2.968", "s2c_accuracy": "64.062", "s2c_total": "64", "s2c_n_correct": "41", "wps": "258.3", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "5990", "lr": "3.99413e-05", "gnorm": "9.082", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "998"} 2023-01-29 16:28:22 | INFO | train_inner | {"epoch": 3, "update": 2.776, 
"s2c_loss": "3.082", "loss": "2.1362", "s2c_nll_loss": "3.082", "s2c_accuracy": "63.125", "s2c_total": "64", "s2c_n_correct": "40.4", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "6000", "lr": "4.0008e-05", "gnorm": "8.822", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "1000"} 2023-01-29 16:28:25 | INFO | train_inner | {"epoch": 3, "update": 2.78, "s2c_loss": "2.961", "loss": "2.05219", "s2c_nll_loss": "2.961", "s2c_accuracy": "62.812", "s2c_total": "64", "s2c_n_correct": "40.2", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "6010", "lr": "4.00747e-05", "gnorm": "9.876", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1003"} 2023-01-29 16:28:28 | INFO | train_inner | {"epoch": 3, "update": 2.785, "s2c_loss": "3.231", "loss": "2.2398", "s2c_nll_loss": "3.231", "s2c_accuracy": "59.844", "s2c_total": "64", "s2c_n_correct": "38.3", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "6020", "lr": "4.01413e-05", "gnorm": "9.826", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1005"} 2023-01-29 16:28:30 | INFO | train_inner | {"epoch": 3, "update": 2.79, "s2c_loss": "3.335", "loss": "2.31154", "s2c_nll_loss": "3.335", "s2c_accuracy": "58.125", "s2c_total": "64", "s2c_n_correct": "37.2", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "6030", "lr": "4.0208e-05", "gnorm": "8.736", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1008"} 2023-01-29 16:28:33 | INFO | train_inner | {"epoch": 3, "update": 2.794, "s2c_loss": "3.141", "loss": "2.17744", "s2c_nll_loss": "3.141", "s2c_accuracy": "61.562", "s2c_total": "64", "s2c_n_correct": "39.4", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "6040", "lr": "4.02747e-05", "gnorm": "9.831", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1011"} 2023-01-29 16:28:35 | INFO | train_inner | {"epoch": 3, "update": 2.799, 
"s2c_loss": "2.946", "loss": "2.04169", "s2c_nll_loss": "2.946", "s2c_accuracy": "62.031", "s2c_total": "64", "s2c_n_correct": "39.7", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "6050", "lr": "4.03413e-05", "gnorm": "9.84", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1013"} 2023-01-29 16:28:38 | INFO | train_inner | {"epoch": 3, "update": 2.803, "s2c_loss": "2.875", "loss": "1.99291", "s2c_nll_loss": "2.875", "s2c_accuracy": "62.969", "s2c_total": "64", "s2c_n_correct": "40.3", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "6060", "lr": "4.0408e-05", "gnorm": "10.193", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1016"} 2023-01-29 16:28:40 | INFO | train_inner | {"epoch": 3, "update": 2.808, "s2c_loss": "2.954", "loss": "2.04732", "s2c_nll_loss": "2.954", "s2c_accuracy": "64.219", "s2c_total": "64", "s2c_n_correct": "41.1", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "6070", "lr": "4.04746e-05", "gnorm": "10.797", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1018"} 2023-01-29 16:28:43 | INFO | train_inner | {"epoch": 3, "update": 2.813, "s2c_loss": "3.14", "loss": "2.17662", "s2c_nll_loss": "3.14", "s2c_accuracy": "61.094", "s2c_total": "64", "s2c_n_correct": "39.1", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "6080", "lr": "4.05413e-05", "gnorm": "9.721", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1021"} 2023-01-29 16:28:45 | INFO | train_inner | {"epoch": 3, "update": 2.817, "s2c_loss": "3.104", "loss": "2.1514", "s2c_nll_loss": "3.104", "s2c_accuracy": "61.562", "s2c_total": "64", "s2c_n_correct": "39.4", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "6090", "lr": "4.0608e-05", "gnorm": "9.869", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1023"} 2023-01-29 16:28:48 | INFO | train_inner | {"epoch": 3, "update": 
2.822, "s2c_loss": "2.83", "loss": "1.96146", "s2c_nll_loss": "2.83", "s2c_accuracy": "63.594", "s2c_total": "64", "s2c_n_correct": "40.7", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "6100", "lr": "4.06746e-05", "gnorm": "9.012", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1026"} 2023-01-29 16:28:50 | INFO | train_inner | {"epoch": 3, "update": 2.827, "s2c_loss": "2.847", "loss": "1.97309", "s2c_nll_loss": "2.847", "s2c_accuracy": "67.812", "s2c_total": "64", "s2c_n_correct": "43.4", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "6110", "lr": "4.07413e-05", "gnorm": "8.679", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1028"} 2023-01-29 16:28:53 | INFO | train_inner | {"epoch": 3, "update": 2.831, "s2c_loss": "3.058", "loss": "2.1195", "s2c_nll_loss": "3.058", "s2c_accuracy": "60.312", "s2c_total": "64", "s2c_n_correct": "38.6", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "6120", "lr": "4.0808e-05", "gnorm": "8.773", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "1031"} 2023-01-29 16:28:55 | INFO | train_inner | {"epoch": 3, "update": 2.836, "s2c_loss": "2.716", "loss": "1.88288", "s2c_nll_loss": "2.716", "s2c_accuracy": "66.875", "s2c_total": "64", "s2c_n_correct": "42.8", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "6130", "lr": "4.08746e-05", "gnorm": "9.051", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1033"} 2023-01-29 16:28:58 | INFO | train_inner | {"epoch": 3, "update": 2.84, "s2c_loss": "2.808", "loss": "1.94666", "s2c_nll_loss": "2.808", "s2c_accuracy": "66.406", "s2c_total": "64", "s2c_n_correct": "42.5", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "6140", "lr": "4.09413e-05", "gnorm": "9.619", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1036"} 2023-01-29 16:29:01 | INFO | train_inner | {"epoch": 3, "update": 
2.845, "s2c_loss": "2.802", "loss": "1.94192", "s2c_nll_loss": "2.802", "s2c_accuracy": "65.781", "s2c_total": "64", "s2c_n_correct": "42.1", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "6150", "lr": "4.1008e-05", "gnorm": "8.99", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1038"} 2023-01-29 16:29:03 | INFO | train_inner | {"epoch": 3, "update": 2.85, "s2c_loss": "3.104", "loss": "2.15132", "s2c_nll_loss": "3.104", "s2c_accuracy": "60.625", "s2c_total": "64", "s2c_n_correct": "38.8", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "6160", "lr": "4.10746e-05", "gnorm": "10.695", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "1041"} 2023-01-29 16:29:06 | INFO | train_inner | {"epoch": 3, "update": 2.854, "s2c_loss": "2.786", "loss": "1.93112", "s2c_nll_loss": "2.786", "s2c_accuracy": "64.844", "s2c_total": "64", "s2c_n_correct": "41.5", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "6170", "lr": "4.11413e-05", "gnorm": "10.253", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1044"} 2023-01-29 16:29:08 | INFO | train_inner | {"epoch": 3, "update": 2.859, "s2c_loss": "2.989", "loss": "2.0717", "s2c_nll_loss": "2.989", "s2c_accuracy": "64.375", "s2c_total": "64", "s2c_n_correct": "41.2", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "6180", "lr": "4.12079e-05", "gnorm": "9.242", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1046"} 2023-01-29 16:29:11 | INFO | train_inner | {"epoch": 3, "update": 2.864, "s2c_loss": "2.875", "loss": "1.99295", "s2c_nll_loss": "2.875", "s2c_accuracy": "64.375", "s2c_total": "64", "s2c_n_correct": "41.2", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "6190", "lr": "4.12746e-05", "gnorm": "9.305", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1049"} 2023-01-29 16:29:13 | INFO | train_inner | {"epoch": 3, 
"update": 2.868, "s2c_loss": "2.98", "loss": "2.0658", "s2c_nll_loss": "2.98", "s2c_accuracy": "65.781", "s2c_total": "64", "s2c_n_correct": "42.1", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "6200", "lr": "4.13413e-05", "gnorm": "9.119", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1051"} 2023-01-29 16:29:16 | INFO | train_inner | {"epoch": 3, "update": 2.873, "s2c_loss": "2.706", "loss": "1.87593", "s2c_nll_loss": "2.706", "s2c_accuracy": "68.125", "s2c_total": "64", "s2c_n_correct": "43.6", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "6210", "lr": "4.14079e-05", "gnorm": "8.298", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "1054"} 2023-01-29 16:29:18 | INFO | train_inner | {"epoch": 3, "update": 2.877, "s2c_loss": "3.043", "loss": "2.10918", "s2c_nll_loss": "3.043", "s2c_accuracy": "61.562", "s2c_total": "64", "s2c_n_correct": "39.4", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "6220", "lr": "4.14746e-05", "gnorm": "9.665", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1056"} 2023-01-29 16:29:21 | INFO | train_inner | {"epoch": 3, "update": 2.882, "s2c_loss": "2.732", "loss": "1.89392", "s2c_nll_loss": "2.732", "s2c_accuracy": "67.812", "s2c_total": "64", "s2c_n_correct": "43.4", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "6230", "lr": "4.15413e-05", "gnorm": "10.452", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1059"} 2023-01-29 16:29:23 | INFO | train_inner | {"epoch": 3, "update": 2.887, "s2c_loss": "2.654", "loss": "1.83994", "s2c_nll_loss": "2.654", "s2c_accuracy": "66.719", "s2c_total": "64", "s2c_n_correct": "42.7", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "6240", "lr": "4.16079e-05", "gnorm": "9.656", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1061"} 2023-01-29 16:29:26 | INFO | train_inner | {"epoch": 
3, "update": 2.891, "s2c_loss": "2.601", "loss": "1.80285", "s2c_nll_loss": "2.601", "s2c_accuracy": "66.875", "s2c_total": "64", "s2c_n_correct": "42.8", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "6250", "lr": "4.16746e-05", "gnorm": "10.038", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1064"} 2023-01-29 16:29:29 | INFO | train_inner | {"epoch": 3, "update": 2.896, "s2c_loss": "3.13", "loss": "2.16955", "s2c_nll_loss": "3.13", "s2c_accuracy": "61.406", "s2c_total": "64", "s2c_n_correct": "39.3", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "6260", "lr": "4.17412e-05", "gnorm": "9.943", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1067"} 2023-01-29 16:29:31 | INFO | train_inner | {"epoch": 3, "update": 2.901, "s2c_loss": "3.203", "loss": "2.22049", "s2c_nll_loss": "3.203", "s2c_accuracy": "63.125", "s2c_total": "64", "s2c_n_correct": "40.4", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "6270", "lr": "4.18079e-05", "gnorm": "9.706", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1069"} 2023-01-29 16:29:34 | INFO | train_inner | {"epoch": 3, "update": 2.905, "s2c_loss": "2.585", "loss": "1.79203", "s2c_nll_loss": "2.585", "s2c_accuracy": "68.125", "s2c_total": "64", "s2c_n_correct": "43.6", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "6280", "lr": "4.18746e-05", "gnorm": "9.834", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1072"} 2023-01-29 16:29:36 | INFO | train_inner | {"epoch": 3, "update": 2.91, "s2c_loss": "2.776", "loss": "1.92428", "s2c_nll_loss": "2.776", "s2c_accuracy": "64.531", "s2c_total": "64", "s2c_n_correct": "41.3", "wps": "247.9", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "6290", "lr": "4.19412e-05", "gnorm": "9.451", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1074"} 2023-01-29 16:29:39 | INFO | train_inner | 
{"epoch": 3, "update": 2.914, "s2c_loss": "2.992", "loss": "2.07394", "s2c_nll_loss": "2.992", "s2c_accuracy": "64.844", "s2c_total": "64", "s2c_n_correct": "41.5", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "6300", "lr": "4.20079e-05", "gnorm": "9.239", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1077"} 2023-01-29 16:29:41 | INFO | train_inner | {"epoch": 3, "update": 2.919, "s2c_loss": "2.59", "loss": "1.79516", "s2c_nll_loss": "2.59", "s2c_accuracy": "68.125", "s2c_total": "64", "s2c_n_correct": "43.6", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "6310", "lr": "4.20746e-05", "gnorm": "8.972", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1079"} 2023-01-29 16:29:44 | INFO | train_inner | {"epoch": 3, "update": 2.924, "s2c_loss": "3.037", "loss": "2.10525", "s2c_nll_loss": "3.037", "s2c_accuracy": "63.438", "s2c_total": "64", "s2c_n_correct": "40.6", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "6320", "lr": "4.21412e-05", "gnorm": "8.587", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1082"} 2023-01-29 16:29:46 | INFO | train_inner | {"epoch": 3, "update": 2.928, "s2c_loss": "2.574", "loss": "1.78407", "s2c_nll_loss": "2.574", "s2c_accuracy": "67.031", "s2c_total": "64", "s2c_n_correct": "42.9", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "6330", "lr": "4.22079e-05", "gnorm": "8.739", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1084"} 2023-01-29 16:29:49 | INFO | train_inner | {"epoch": 3, "update": 2.933, "s2c_loss": "2.662", "loss": "1.84487", "s2c_nll_loss": "2.662", "s2c_accuracy": "65.625", "s2c_total": "64", "s2c_n_correct": "42", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "6340", "lr": "4.22746e-05", "gnorm": "9.352", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1087"} 2023-01-29 16:29:51 | INFO | train_inner | 
{"epoch": 3, "update": 2.938, "s2c_loss": "2.699", "loss": "1.87096", "s2c_nll_loss": "2.699", "s2c_accuracy": "67.344", "s2c_total": "64", "s2c_n_correct": "43.1", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "6350", "lr": "4.23412e-05", "gnorm": "9.51", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1089"} 2023-01-29 16:29:54 | INFO | train_inner | {"epoch": 3, "update": 2.942, "s2c_loss": "2.645", "loss": "1.83329", "s2c_nll_loss": "2.645", "s2c_accuracy": "65", "s2c_total": "64", "s2c_n_correct": "41.6", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "6360", "lr": "4.24079e-05", "gnorm": "8.85", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1092"} 2023-01-29 16:29:56 | INFO | train_inner | {"epoch": 3, "update": 2.947, "s2c_loss": "2.646", "loss": "1.83404", "s2c_nll_loss": "2.646", "s2c_accuracy": "64.062", "s2c_total": "64", "s2c_n_correct": "41", "wps": "247", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "6370", "lr": "4.24745e-05", "gnorm": "9.936", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1094"} 2023-01-29 16:29:59 | INFO | train_inner | {"epoch": 3, "update": 2.951, "s2c_loss": "2.536", "loss": "1.75778", "s2c_nll_loss": "2.536", "s2c_accuracy": "69.844", "s2c_total": "64", "s2c_n_correct": "44.7", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "6380", "lr": "4.25412e-05", "gnorm": "9.223", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1097"} 2023-01-29 16:30:01 | INFO | train_inner | {"epoch": 3, "update": 2.956, "s2c_loss": "2.482", "loss": "1.72013", "s2c_nll_loss": "2.482", "s2c_accuracy": "67.031", "s2c_total": "64", "s2c_n_correct": "42.9", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "6390", "lr": "4.26079e-05", "gnorm": "10.311", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1099"} 2023-01-29 16:30:04 | INFO | train_inner | 
{"epoch": 3, "update": 2.961, "s2c_loss": "2.938", "loss": "2.03657", "s2c_nll_loss": "2.938", "s2c_accuracy": "65.312", "s2c_total": "64", "s2c_n_correct": "41.8", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "6400", "lr": "4.26745e-05", "gnorm": "10.355", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1102"} 2023-01-29 16:30:07 | INFO | train_inner | {"epoch": 3, "update": 2.965, "s2c_loss": "2.607", "loss": "1.80723", "s2c_nll_loss": "2.607", "s2c_accuracy": "69.531", "s2c_total": "64", "s2c_n_correct": "44.5", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "6410", "lr": "4.27412e-05", "gnorm": "9.667", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1104"} 2023-01-29 16:30:09 | INFO | train_inner | {"epoch": 3, "update": 2.97, "s2c_loss": "2.754", "loss": "1.90907", "s2c_nll_loss": "2.754", "s2c_accuracy": "65", "s2c_total": "64", "s2c_n_correct": "41.6", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "6420", "lr": "4.28079e-05", "gnorm": "9.156", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1107"} 2023-01-29 16:30:12 | INFO | train_inner | {"epoch": 3, "update": 2.975, "s2c_loss": "2.722", "loss": "1.88662", "s2c_nll_loss": "2.722", "s2c_accuracy": "65.312", "s2c_total": "64", "s2c_n_correct": "41.8", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "6430", "lr": "4.28745e-05", "gnorm": "9.538", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1110"} 2023-01-29 16:30:14 | INFO | train_inner | {"epoch": 3, "update": 2.979, "s2c_loss": "2.687", "loss": "1.8622", "s2c_nll_loss": "2.687", "s2c_accuracy": "66.406", "s2c_total": "64", "s2c_n_correct": "42.5", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "6440", "lr": "4.29412e-05", "gnorm": "10.952", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1112"} 2023-01-29 16:30:17 | INFO | train_inner | 
{"epoch": 3, "update": 2.984, "s2c_loss": "2.796", "loss": "1.93827", "s2c_nll_loss": "2.796", "s2c_accuracy": "64.375", "s2c_total": "64", "s2c_n_correct": "41.2", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "6450", "lr": "4.30079e-05", "gnorm": "8.769", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1115"} 2023-01-29 16:30:19 | INFO | train_inner | {"epoch": 3, "update": 2.988, "s2c_loss": "2.443", "loss": "1.69358", "s2c_nll_loss": "2.443", "s2c_accuracy": "70.625", "s2c_total": "64", "s2c_n_correct": "45.2", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "6460", "lr": "4.30745e-05", "gnorm": "8.823", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1117"} 2023-01-29 16:30:22 | INFO | train_inner | {"epoch": 3, "update": 2.993, "s2c_loss": "2.549", "loss": "1.767", "s2c_nll_loss": "2.549", "s2c_accuracy": "65.781", "s2c_total": "64", "s2c_n_correct": "42.1", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "6470", "lr": "4.31412e-05", "gnorm": "9.425", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1120"} 2023-01-29 16:30:24 | INFO | train_inner | {"epoch": 3, "update": 2.998, "s2c_loss": "2.519", "loss": "1.74576", "s2c_nll_loss": "2.519", "s2c_accuracy": "68.125", "s2c_total": "64", "s2c_n_correct": "43.6", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "6480", "lr": "4.32078e-05", "gnorm": "8.606", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1122"} 2023-01-29 16:30:25 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 3 @ 6485 updates 2023-01-29 16:30:25 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 16:30:32 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 16:30:32 | INFO | 
fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 3 @ 6485 updates, score None) (writing took 7.009676571004093 seconds) 2023-01-29 16:30:32 | INFO | fairseq_cli.train | end of epoch 3 (average epoch stats below) 2023-01-29 16:30:32 | INFO | train | {"epoch": 3, "train_s2c_loss": "3.768", "train_loss": "2.61192", "train_s2c_nll_loss": "3.768", "train_s2c_accuracy": "55.212", "train_s2c_total": "63.9838", "train_s2c_n_correct": "35.3267", "train_wps": "246.1", "train_ups": "3.85", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "6485", "train_lr": "4.32412e-05", "train_gnorm": "8.709", "train_loss_scale": "256", "train_train_wall": "541", "train_gb_free": "7.4", "train_wall": "1130"} 2023-01-29 16:30:39 | INFO | fairseq.trainer | begin training epoch 4 2023-01-29 16:30:39 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 16:30:40 | INFO | train_inner | {"epoch": 4, "update": 3.002, "s2c_loss": "2.907", "loss": "2.01472", "s2c_nll_loss": "2.907", "s2c_accuracy": "65.461", "s2c_total": "60.8", "s2c_n_correct": "39.8", "wps": "38.5", "ups": "0.63", "wpb": "60.8", "bsz": "60.8", "num_updates": "6490", "lr": "4.32745e-05", "gnorm": "10.009", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1138"} 2023-01-29 16:30:43 | INFO | train_inner | {"epoch": 4, "update": 3.007, "s2c_loss": "2.749", "loss": "1.90544", "s2c_nll_loss": "2.749", "s2c_accuracy": "66.406", "s2c_total": "64", "s2c_n_correct": "42.5", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "6500", "lr": "4.33412e-05", "gnorm": "8.913", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1141"} 2023-01-29 16:30:45 | INFO | train_inner | {"epoch": 4, "update": 3.012, "s2c_loss": "2.694", "loss": "1.86768", "s2c_nll_loss": "2.694", "s2c_accuracy": "67.031", "s2c_total": "64", "s2c_n_correct": "42.9", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", 
"num_updates": "6510", "lr": "4.34078e-05", "gnorm": "9.554", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1143"} 2023-01-29 16:30:48 | INFO | train_inner | {"epoch": 4, "update": 3.016, "s2c_loss": "2.376", "loss": "1.64724", "s2c_nll_loss": "2.376", "s2c_accuracy": "68.125", "s2c_total": "64", "s2c_n_correct": "43.6", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "6520", "lr": "4.34745e-05", "gnorm": "9.832", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1146"} 2023-01-29 16:30:50 | INFO | train_inner | {"epoch": 4, "update": 3.021, "s2c_loss": "2.679", "loss": "1.85676", "s2c_nll_loss": "2.679", "s2c_accuracy": "66.094", "s2c_total": "64", "s2c_n_correct": "42.3", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "6530", "lr": "4.35412e-05", "gnorm": "10.469", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1148"} 2023-01-29 16:30:53 | INFO | train_inner | {"epoch": 4, "update": 3.025, "s2c_loss": "2.697", "loss": "1.86973", "s2c_nll_loss": "2.697", "s2c_accuracy": "71.094", "s2c_total": "64", "s2c_n_correct": "45.5", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "6540", "lr": "4.36078e-05", "gnorm": "8.946", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1151"} 2023-01-29 16:30:55 | INFO | train_inner | {"epoch": 4, "update": 3.03, "s2c_loss": "2.602", "loss": "1.80383", "s2c_nll_loss": "2.602", "s2c_accuracy": "69.062", "s2c_total": "64", "s2c_n_correct": "44.2", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "6550", "lr": "4.36745e-05", "gnorm": "8.791", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1153"} 2023-01-29 16:30:58 | INFO | train_inner | {"epoch": 4, "update": 3.035, "s2c_loss": "2.541", "loss": "1.761", "s2c_nll_loss": "2.541", "s2c_accuracy": "67.5", "s2c_total": "64", "s2c_n_correct": "43.2", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": 
"64", "num_updates": "6560", "lr": "4.37411e-05", "gnorm": "10.203", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1156"} 2023-01-29 16:31:00 | INFO | train_inner | {"epoch": 4, "update": 3.039, "s2c_loss": "2.566", "loss": "1.77853", "s2c_nll_loss": "2.566", "s2c_accuracy": "68.438", "s2c_total": "64", "s2c_n_correct": "43.8", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "6570", "lr": "4.38078e-05", "gnorm": "9.74", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1158"} 2023-01-29 16:31:03 | INFO | train_inner | {"epoch": 4, "update": 3.044, "s2c_loss": "2.198", "loss": "1.52365", "s2c_nll_loss": "2.198", "s2c_accuracy": "71.875", "s2c_total": "64", "s2c_n_correct": "46", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "6580", "lr": "4.38745e-05", "gnorm": "9.688", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1161"} 2023-01-29 16:31:06 | INFO | train_inner | {"epoch": 4, "update": 3.049, "s2c_loss": "2.494", "loss": "1.72856", "s2c_nll_loss": "2.494", "s2c_accuracy": "70.781", "s2c_total": "64", "s2c_n_correct": "45.3", "wps": "243.8", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "6590", "lr": "4.39411e-05", "gnorm": "9.161", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1164"} 2023-01-29 16:31:08 | INFO | train_inner | {"epoch": 4, "update": 3.053, "s2c_loss": "2.421", "loss": "1.67828", "s2c_nll_loss": "2.421", "s2c_accuracy": "72.188", "s2c_total": "64", "s2c_n_correct": "46.2", "wps": "235.5", "ups": "3.68", "wpb": "64", "bsz": "64", "num_updates": "6600", "lr": "4.40078e-05", "gnorm": "9.344", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1166"} 2023-01-29 16:31:11 | INFO | train_inner | {"epoch": 4, "update": 3.058, "s2c_loss": "2.487", "loss": "1.7237", "s2c_nll_loss": "2.487", "s2c_accuracy": "69.375", "s2c_total": "64", "s2c_n_correct": "44.4", "wps": "244.6", "ups": "3.82", "wpb": "64", 
"bsz": "64", "num_updates": "6610", "lr": "4.40745e-05", "gnorm": "9.756", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1169"} 2023-01-29 16:31:14 | INFO | train_inner | {"epoch": 4, "update": 3.062, "s2c_loss": "2.863", "loss": "1.98438", "s2c_nll_loss": "2.863", "s2c_accuracy": "65.312", "s2c_total": "64", "s2c_n_correct": "41.8", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "6620", "lr": "4.41411e-05", "gnorm": "9.433", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1171"} 2023-01-29 16:31:16 | INFO | train_inner | {"epoch": 4, "update": 3.067, "s2c_loss": "2.35", "loss": "1.62895", "s2c_nll_loss": "2.35", "s2c_accuracy": "69.219", "s2c_total": "64", "s2c_n_correct": "44.3", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "6630", "lr": "4.42078e-05", "gnorm": "10.668", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1174"} 2023-01-29 16:31:19 | INFO | train_inner | {"epoch": 4, "update": 3.072, "s2c_loss": "2.38", "loss": "1.64973", "s2c_nll_loss": "2.38", "s2c_accuracy": "70.938", "s2c_total": "64", "s2c_n_correct": "45.4", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "6640", "lr": "4.42745e-05", "gnorm": "8.869", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1177"} 2023-01-29 16:31:21 | INFO | train_inner | {"epoch": 4, "update": 3.076, "s2c_loss": "2.547", "loss": "1.76518", "s2c_nll_loss": "2.547", "s2c_accuracy": "68.125", "s2c_total": "64", "s2c_n_correct": "43.6", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "6650", "lr": "4.43411e-05", "gnorm": "9.902", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1179"} 2023-01-29 16:31:24 | INFO | train_inner | {"epoch": 4, "update": 3.081, "s2c_loss": "2.072", "loss": "1.43611", "s2c_nll_loss": "2.072", "s2c_accuracy": "74.531", "s2c_total": "64", "s2c_n_correct": "47.7", "wps": "250", "ups": "3.91", "wpb": "64", 
"bsz": "64", "num_updates": "6660", "lr": "4.44078e-05", "gnorm": "10.13", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1182"} 2023-01-29 16:31:26 | INFO | train_inner | {"epoch": 4, "update": 3.086, "s2c_loss": "2.483", "loss": "1.72112", "s2c_nll_loss": "2.483", "s2c_accuracy": "69.688", "s2c_total": "64", "s2c_n_correct": "44.6", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "6670", "lr": "4.44744e-05", "gnorm": "9.139", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1184"} 2023-01-29 16:31:29 | INFO | train_inner | {"epoch": 4, "update": 3.09, "s2c_loss": "2.136", "loss": "1.48079", "s2c_nll_loss": "2.136", "s2c_accuracy": "75.469", "s2c_total": "64", "s2c_n_correct": "48.3", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "6680", "lr": "4.45411e-05", "gnorm": "8.402", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1187"} 2023-01-29 16:31:31 | INFO | train_inner | {"epoch": 4, "update": 3.095, "s2c_loss": "2.408", "loss": "1.66922", "s2c_nll_loss": "2.408", "s2c_accuracy": "67.812", "s2c_total": "64", "s2c_n_correct": "43.4", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "6690", "lr": "4.46078e-05", "gnorm": "10.036", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1189"} 2023-01-29 16:31:34 | INFO | train_inner | {"epoch": 4, "update": 3.099, "s2c_loss": "2.416", "loss": "1.67445", "s2c_nll_loss": "2.416", "s2c_accuracy": "70", "s2c_total": "64", "s2c_n_correct": "44.8", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "6700", "lr": "4.46744e-05", "gnorm": "9.786", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1192"} 2023-01-29 16:31:36 | INFO | train_inner | {"epoch": 4, "update": 3.104, "s2c_loss": "2.229", "loss": "1.54487", "s2c_nll_loss": "2.229", "s2c_accuracy": "72.344", "s2c_total": "64", "s2c_n_correct": "46.3", "wps": "250.6", "ups": "3.92", "wpb": 
"64", "bsz": "64", "num_updates": "6710", "lr": "4.47411e-05", "gnorm": "8.246", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1194"} 2023-01-29 16:31:39 | INFO | train_inner | {"epoch": 4, "update": 3.109, "s2c_loss": "2.169", "loss": "1.50357", "s2c_nll_loss": "2.169", "s2c_accuracy": "74.062", "s2c_total": "64", "s2c_n_correct": "47.4", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "6720", "lr": "4.48078e-05", "gnorm": "9.548", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1197"} 2023-01-29 16:31:41 | INFO | train_inner | {"epoch": 4, "update": 3.113, "s2c_loss": "2.559", "loss": "1.77358", "s2c_nll_loss": "2.559", "s2c_accuracy": "69.688", "s2c_total": "64", "s2c_n_correct": "44.6", "wps": "258.3", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "6730", "lr": "4.48744e-05", "gnorm": "9.324", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1199"} 2023-01-29 16:31:44 | INFO | train_inner | {"epoch": 4, "update": 3.118, "s2c_loss": "2.276", "loss": "1.57762", "s2c_nll_loss": "2.276", "s2c_accuracy": "72.656", "s2c_total": "64", "s2c_n_correct": "46.5", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "6740", "lr": "4.49411e-05", "gnorm": "9.669", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1202"} 2023-01-29 16:31:46 | INFO | train_inner | {"epoch": 4, "update": 3.123, "s2c_loss": "2.198", "loss": "1.52335", "s2c_nll_loss": "2.198", "s2c_accuracy": "72.812", "s2c_total": "64", "s2c_n_correct": "46.6", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "6750", "lr": "4.50078e-05", "gnorm": "8.823", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1204"} 2023-01-29 16:31:49 | INFO | train_inner | {"epoch": 4, "update": 3.127, "s2c_loss": "2.269", "loss": "1.57262", "s2c_nll_loss": "2.269", "s2c_accuracy": "71.094", "s2c_total": "64", "s2c_n_correct": "45.5", "wps": "252.2", "ups": "3.94", 
"wpb": "64", "bsz": "64", "num_updates": "6760", "lr": "4.50744e-05", "gnorm": "9.527", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "1207"} 2023-01-29 16:31:51 | INFO | train_inner | {"epoch": 4, "update": 3.132, "s2c_loss": "2.631", "loss": "1.82388", "s2c_nll_loss": "2.631", "s2c_accuracy": "71.406", "s2c_total": "64", "s2c_n_correct": "45.7", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "6770", "lr": "4.51411e-05", "gnorm": "9.288", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1209"} 2023-01-29 16:31:54 | INFO | train_inner | {"epoch": 4, "update": 3.136, "s2c_loss": "2.254", "loss": "1.56243", "s2c_nll_loss": "2.254", "s2c_accuracy": "70.469", "s2c_total": "64", "s2c_n_correct": "45.1", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "6780", "lr": "4.52077e-05", "gnorm": "10.086", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1212"} 2023-01-29 16:31:56 | INFO | train_inner | {"epoch": 4, "update": 3.141, "s2c_loss": "2.384", "loss": "1.65245", "s2c_nll_loss": "2.384", "s2c_accuracy": "70.469", "s2c_total": "64", "s2c_n_correct": "45.1", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "6790", "lr": "4.52744e-05", "gnorm": "10.076", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "1214"} 2023-01-29 16:31:59 | INFO | train_inner | {"epoch": 4, "update": 3.146, "s2c_loss": "2.094", "loss": "1.45179", "s2c_nll_loss": "2.094", "s2c_accuracy": "71.719", "s2c_total": "64", "s2c_n_correct": "45.9", "wps": "246.8", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "6800", "lr": "4.53411e-05", "gnorm": "9.138", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1217"} 2023-01-29 16:32:02 | INFO | train_inner | {"epoch": 4, "update": 3.15, "s2c_loss": "2.224", "loss": "1.54159", "s2c_nll_loss": "2.224", "s2c_accuracy": "70.781", "s2c_total": "64", "s2c_n_correct": "45.3", "wps": "252.2", "ups": 
"3.94", "wpb": "64", "bsz": "64", "num_updates": "6810", "lr": "4.54077e-05", "gnorm": "9.526", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1219"} 2023-01-29 16:32:04 | INFO | train_inner | {"epoch": 4, "update": 3.155, "s2c_loss": "2.357", "loss": "1.63354", "s2c_nll_loss": "2.357", "s2c_accuracy": "69.062", "s2c_total": "64", "s2c_n_correct": "44.2", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "6820", "lr": "4.54744e-05", "gnorm": "10.337", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1222"} 2023-01-29 16:32:07 | INFO | train_inner | {"epoch": 4, "update": 3.16, "s2c_loss": "2.372", "loss": "1.64404", "s2c_nll_loss": "2.372", "s2c_accuracy": "71.094", "s2c_total": "64", "s2c_n_correct": "45.5", "wps": "257.8", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "6830", "lr": "4.55411e-05", "gnorm": "9.916", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1224"} 2023-01-29 16:32:09 | INFO | train_inner | {"epoch": 4, "update": 3.164, "s2c_loss": "2.328", "loss": "1.61371", "s2c_nll_loss": "2.328", "s2c_accuracy": "70.625", "s2c_total": "64", "s2c_n_correct": "45.2", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "6840", "lr": "4.56077e-05", "gnorm": "9.828", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1227"} 2023-01-29 16:32:12 | INFO | train_inner | {"epoch": 4, "update": 3.169, "s2c_loss": "2.116", "loss": "1.46664", "s2c_nll_loss": "2.116", "s2c_accuracy": "71.875", "s2c_total": "64", "s2c_n_correct": "46", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "6850", "lr": "4.56744e-05", "gnorm": "9.569", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1230"} 2023-01-29 16:32:14 | INFO | train_inner | {"epoch": 4, "update": 3.173, "s2c_loss": "2.596", "loss": "1.7991", "s2c_nll_loss": "2.596", "s2c_accuracy": "70.625", "s2c_total": "64", "s2c_n_correct": "45.2", "wps": "252.2", "ups": 
"3.94", "wpb": "64", "bsz": "64", "num_updates": "6860", "lr": "4.5741e-05", "gnorm": "9.486", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1232"} 2023-01-29 16:32:17 | INFO | train_inner | {"epoch": 4, "update": 3.178, "s2c_loss": "2.199", "loss": "1.52437", "s2c_nll_loss": "2.199", "s2c_accuracy": "73.75", "s2c_total": "64", "s2c_n_correct": "47.2", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "6870", "lr": "4.58077e-05", "gnorm": "8.704", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1235"} 2023-01-29 16:32:19 | INFO | train_inner | {"epoch": 4, "update": 3.183, "s2c_loss": "2.149", "loss": "1.4896", "s2c_nll_loss": "2.149", "s2c_accuracy": "73.594", "s2c_total": "64", "s2c_n_correct": "47.1", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "6880", "lr": "4.58744e-05", "gnorm": "9.181", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1237"} 2023-01-29 16:32:22 | INFO | train_inner | {"epoch": 4, "update": 3.187, "s2c_loss": "2.24", "loss": "1.55265", "s2c_nll_loss": "2.24", "s2c_accuracy": "73.438", "s2c_total": "64", "s2c_n_correct": "47", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "6890", "lr": "4.5941e-05", "gnorm": "10.083", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1240"} 2023-01-29 16:32:24 | INFO | train_inner | {"epoch": 4, "update": 3.192, "s2c_loss": "2.163", "loss": "1.49928", "s2c_nll_loss": "2.163", "s2c_accuracy": "73.281", "s2c_total": "64", "s2c_n_correct": "46.9", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "6900", "lr": "4.60077e-05", "gnorm": "9.748", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1242"} 2023-01-29 16:32:27 | INFO | train_inner | {"epoch": 4, "update": 3.197, "s2c_loss": "2.004", "loss": "1.38928", "s2c_nll_loss": "2.004", "s2c_accuracy": "76.094", "s2c_total": "64", "s2c_n_correct": "48.7", "wps": "246.4", "ups": 
"3.85", "wpb": "64", "bsz": "64", "num_updates": "6910", "lr": "4.60744e-05", "gnorm": "9.577", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1245"} 2023-01-29 16:32:30 | INFO | train_inner | {"epoch": 4, "update": 3.201, "s2c_loss": "2.22", "loss": "1.53903", "s2c_nll_loss": "2.22", "s2c_accuracy": "72.969", "s2c_total": "64", "s2c_n_correct": "46.7", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "6920", "lr": "4.6141e-05", "gnorm": "10.012", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1248"} 2023-01-29 16:32:32 | INFO | train_inner | {"epoch": 4, "update": 3.206, "s2c_loss": "2.41", "loss": "1.67036", "s2c_nll_loss": "2.41", "s2c_accuracy": "71.875", "s2c_total": "64", "s2c_n_correct": "46", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "6930", "lr": "4.62077e-05", "gnorm": "9.653", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1250"} 2023-01-29 16:32:35 | INFO | train_inner | {"epoch": 4, "update": 3.21, "s2c_loss": "1.986", "loss": "1.37664", "s2c_nll_loss": "1.986", "s2c_accuracy": "74.062", "s2c_total": "64", "s2c_n_correct": "47.4", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "6940", "lr": "4.62744e-05", "gnorm": "9.315", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1253"} 2023-01-29 16:32:37 | INFO | train_inner | {"epoch": 4, "update": 3.215, "s2c_loss": "2.376", "loss": "1.64672", "s2c_nll_loss": "2.376", "s2c_accuracy": "70.781", "s2c_total": "64", "s2c_n_correct": "45.3", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "6950", "lr": "4.6341e-05", "gnorm": "9.696", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1255"} 2023-01-29 16:32:40 | INFO | train_inner | {"epoch": 4, "update": 3.22, "s2c_loss": "2.411", "loss": "1.67126", "s2c_nll_loss": "2.411", "s2c_accuracy": "68.438", "s2c_total": "64", "s2c_n_correct": "43.8", "wps": "250.7", "ups": "3.92", 
"wpb": "64", "bsz": "64", "num_updates": "6960", "lr": "4.64077e-05", "gnorm": "9.401", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1258"} 2023-01-29 16:32:42 | INFO | train_inner | {"epoch": 4, "update": 3.224, "s2c_loss": "2.086", "loss": "1.44578", "s2c_nll_loss": "2.086", "s2c_accuracy": "73.906", "s2c_total": "64", "s2c_n_correct": "47.3", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "6970", "lr": "4.64743e-05", "gnorm": "8.81", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1260"} 2023-01-29 16:32:45 | INFO | train_inner | {"epoch": 4, "update": 3.229, "s2c_loss": "2.144", "loss": "1.48638", "s2c_nll_loss": "2.144", "s2c_accuracy": "73.125", "s2c_total": "64", "s2c_n_correct": "46.8", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "6980", "lr": "4.6541e-05", "gnorm": "9.198", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1263"} 2023-01-29 16:32:47 | INFO | train_inner | {"epoch": 4, "update": 3.234, "s2c_loss": "1.986", "loss": "1.37685", "s2c_nll_loss": "1.986", "s2c_accuracy": "74.219", "s2c_total": "64", "s2c_n_correct": "47.5", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "6990", "lr": "4.66077e-05", "gnorm": "9.825", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1265"} 2023-01-29 16:32:50 | INFO | train_inner | {"epoch": 4, "update": 3.238, "s2c_loss": "2.027", "loss": "1.40502", "s2c_nll_loss": "2.027", "s2c_accuracy": "75.781", "s2c_total": "64", "s2c_n_correct": "48.5", "wps": "245.8", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "7000", "lr": "4.66743e-05", "gnorm": "9.652", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "1268"} 2023-01-29 16:32:53 | INFO | train_inner | {"epoch": 4, "update": 3.243, "s2c_loss": "2.171", "loss": "1.50495", "s2c_nll_loss": "2.171", "s2c_accuracy": "74.375", "s2c_total": "64", "s2c_n_correct": "47.6", "wps": "248.5", "ups": 
"3.88", "wpb": "64", "bsz": "64", "num_updates": "7010", "lr": "4.6741e-05", "gnorm": "9.832", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1270"} 2023-01-29 16:32:55 | INFO | train_inner | {"epoch": 4, "update": 3.247, "s2c_loss": "1.957", "loss": "1.35632", "s2c_nll_loss": "1.957", "s2c_accuracy": "74.844", "s2c_total": "64", "s2c_n_correct": "47.9", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "7020", "lr": "4.68077e-05", "gnorm": "9.617", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1273"} 2023-01-29 16:32:58 | INFO | train_inner | {"epoch": 4, "update": 3.252, "s2c_loss": "1.961", "loss": "1.35931", "s2c_nll_loss": "1.961", "s2c_accuracy": "75", "s2c_total": "64", "s2c_n_correct": "48", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "7030", "lr": "4.68743e-05", "gnorm": "9.455", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1276"} 2023-01-29 16:33:00 | INFO | train_inner | {"epoch": 4, "update": 3.257, "s2c_loss": "2.219", "loss": "1.53778", "s2c_nll_loss": "2.219", "s2c_accuracy": "74.219", "s2c_total": "64", "s2c_n_correct": "47.5", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "7040", "lr": "4.6941e-05", "gnorm": "10.467", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1278"} 2023-01-29 16:33:03 | INFO | train_inner | {"epoch": 4, "update": 3.261, "s2c_loss": "2.164", "loss": "1.50021", "s2c_nll_loss": "2.164", "s2c_accuracy": "72.5", "s2c_total": "64", "s2c_n_correct": "46.4", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "7050", "lr": "4.70076e-05", "gnorm": "9.157", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1281"} 2023-01-29 16:33:05 | INFO | train_inner | {"epoch": 4, "update": 3.266, "s2c_loss": "1.655", "loss": "1.14685", "s2c_nll_loss": "1.655", "s2c_accuracy": "78.438", "s2c_total": "64", "s2c_n_correct": "50.2", "wps": "247.6", "ups": 
"3.87", "wpb": "64", "bsz": "64", "num_updates": "7060", "lr": "4.70743e-05", "gnorm": "9.776", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1283"} 2023-01-29 16:33:08 | INFO | train_inner | {"epoch": 4, "update": 3.271, "s2c_loss": "1.848", "loss": "1.28116", "s2c_nll_loss": "1.848", "s2c_accuracy": "77.5", "s2c_total": "64", "s2c_n_correct": "49.6", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "7070", "lr": "4.7141e-05", "gnorm": "8.741", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1286"} 2023-01-29 16:33:10 | INFO | train_inner | {"epoch": 4, "update": 3.275, "s2c_loss": "2.126", "loss": "1.47341", "s2c_nll_loss": "2.126", "s2c_accuracy": "73.281", "s2c_total": "64", "s2c_n_correct": "46.9", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "7080", "lr": "4.72076e-05", "gnorm": "9.441", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1288"} 2023-01-29 16:33:13 | INFO | train_inner | {"epoch": 4, "update": 3.28, "s2c_loss": "1.971", "loss": "1.366", "s2c_nll_loss": "1.971", "s2c_accuracy": "73.594", "s2c_total": "64", "s2c_n_correct": "47.1", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "7090", "lr": "4.72743e-05", "gnorm": "9.544", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1291"} 2023-01-29 16:33:16 | INFO | train_inner | {"epoch": 4, "update": 3.284, "s2c_loss": "1.935", "loss": "1.34152", "s2c_nll_loss": "1.935", "s2c_accuracy": "73.906", "s2c_total": "64", "s2c_n_correct": "47.3", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "7100", "lr": "4.7341e-05", "gnorm": "10.878", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1294"} 2023-01-29 16:33:18 | INFO | train_inner | {"epoch": 4, "update": 3.289, "s2c_loss": "2.178", "loss": "1.50936", "s2c_nll_loss": "2.178", "s2c_accuracy": "73.75", "s2c_total": "64", "s2c_n_correct": "47.2", "wps": "252", "ups": 
"3.94", "wpb": "64", "bsz": "64", "num_updates": "7110", "lr": "4.74076e-05", "gnorm": "10.402", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1296"} 2023-01-29 16:33:21 | INFO | train_inner | {"epoch": 4, "update": 3.294, "s2c_loss": "1.897", "loss": "1.31515", "s2c_nll_loss": "1.897", "s2c_accuracy": "76.719", "s2c_total": "64", "s2c_n_correct": "49.1", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "7120", "lr": "4.74743e-05", "gnorm": "9.069", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1299"} 2023-01-29 16:33:23 | INFO | train_inner | {"epoch": 4, "update": 3.298, "s2c_loss": "2.004", "loss": "1.38881", "s2c_nll_loss": "2.004", "s2c_accuracy": "74.531", "s2c_total": "64", "s2c_n_correct": "47.7", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "7130", "lr": "4.7541e-05", "gnorm": "9.434", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1301"} 2023-01-29 16:33:26 | INFO | train_inner | {"epoch": 4, "update": 3.303, "s2c_loss": "2.147", "loss": "1.48842", "s2c_nll_loss": "2.147", "s2c_accuracy": "71.875", "s2c_total": "64", "s2c_n_correct": "46", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "7140", "lr": "4.76076e-05", "gnorm": "9.826", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1304"} 2023-01-29 16:33:28 | INFO | train_inner | {"epoch": 4, "update": 3.308, "s2c_loss": "2.172", "loss": "1.5054", "s2c_nll_loss": "2.172", "s2c_accuracy": "71.25", "s2c_total": "64", "s2c_n_correct": "45.6", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "7150", "lr": "4.76743e-05", "gnorm": "9.438", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1306"} 2023-01-29 16:33:31 | INFO | train_inner | {"epoch": 4, "update": 3.312, "s2c_loss": "2.088", "loss": "1.44727", "s2c_nll_loss": "2.088", "s2c_accuracy": "72.656", "s2c_total": "64", "s2c_n_correct": "46.5", "wps": "252.3", "ups": 
"3.94", "wpb": "64", "bsz": "64", "num_updates": "7160", "lr": "4.77409e-05", "gnorm": "10.155", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "1309"} 2023-01-29 16:33:33 | INFO | train_inner | {"epoch": 4, "update": 3.317, "s2c_loss": "2.104", "loss": "1.45859", "s2c_nll_loss": "2.104", "s2c_accuracy": "72.5", "s2c_total": "64", "s2c_n_correct": "46.4", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "7170", "lr": "4.78076e-05", "gnorm": "9.698", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "1311"} 2023-01-29 16:33:36 | INFO | train_inner | {"epoch": 4, "update": 3.321, "s2c_loss": "1.875", "loss": "1.29977", "s2c_nll_loss": "1.875", "s2c_accuracy": "77.656", "s2c_total": "64", "s2c_n_correct": "49.7", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "7180", "lr": "4.78743e-05", "gnorm": "9.37", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1314"} 2023-01-29 16:33:38 | INFO | train_inner | {"epoch": 4, "update": 3.326, "s2c_loss": "1.861", "loss": "1.28966", "s2c_nll_loss": "1.861", "s2c_accuracy": "76.562", "s2c_total": "64", "s2c_n_correct": "49", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "7190", "lr": "4.79409e-05", "gnorm": "10.242", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1316"} 2023-01-29 16:33:41 | INFO | train_inner | {"epoch": 4, "update": 3.331, "s2c_loss": "2.169", "loss": "1.50345", "s2c_nll_loss": "2.169", "s2c_accuracy": "72.656", "s2c_total": "64", "s2c_n_correct": "46.5", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "7200", "lr": "4.80076e-05", "gnorm": "9.583", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "1319"} 2023-01-29 16:33:43 | INFO | train_inner | {"epoch": 4, "update": 3.335, "s2c_loss": "2.076", "loss": "1.43894", "s2c_nll_loss": "2.076", "s2c_accuracy": "74.688", "s2c_total": "64", "s2c_n_correct": "47.8", "wps": "246.3", "ups": 
"3.85", "wpb": "64", "bsz": "64", "num_updates": "7210", "lr": "4.80743e-05", "gnorm": "8.821", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1321"} 2023-01-29 16:33:46 | INFO | train_inner | {"epoch": 4, "update": 3.34, "s2c_loss": "2.066", "loss": "1.43238", "s2c_nll_loss": "2.066", "s2c_accuracy": "76.094", "s2c_total": "64", "s2c_n_correct": "48.7", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "7220", "lr": "4.81409e-05", "gnorm": "9.393", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1324"} 2023-01-29 16:33:49 | INFO | train_inner | {"epoch": 4, "update": 3.345, "s2c_loss": "1.926", "loss": "1.33506", "s2c_nll_loss": "1.926", "s2c_accuracy": "76.094", "s2c_total": "64", "s2c_n_correct": "48.7", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "7230", "lr": "4.82076e-05", "gnorm": "9.577", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1326"} 2023-01-29 16:33:51 | INFO | train_inner | {"epoch": 4, "update": 3.349, "s2c_loss": "1.946", "loss": "1.34904", "s2c_nll_loss": "1.946", "s2c_accuracy": "77.656", "s2c_total": "64", "s2c_n_correct": "49.7", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "7240", "lr": "4.82743e-05", "gnorm": "9.93", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1329"} 2023-01-29 16:33:54 | INFO | train_inner | {"epoch": 4, "update": 3.354, "s2c_loss": "1.949", "loss": "1.35126", "s2c_nll_loss": "1.949", "s2c_accuracy": "77.031", "s2c_total": "64", "s2c_n_correct": "49.3", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "7250", "lr": "4.83409e-05", "gnorm": "9.788", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1332"} 2023-01-29 16:33:56 | INFO | train_inner | {"epoch": 4, "update": 3.358, "s2c_loss": "2.2", "loss": "1.52519", "s2c_nll_loss": "2.2", "s2c_accuracy": "72.812", "s2c_total": "64", "s2c_n_correct": "46.6", "wps": "252.9", "ups": 
"3.95", "wpb": "64", "bsz": "64", "num_updates": "7260", "lr": "4.84076e-05", "gnorm": "11.142", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1334"} 2023-01-29 16:33:59 | INFO | train_inner | {"epoch": 4, "update": 3.363, "s2c_loss": "2.15", "loss": "1.48993", "s2c_nll_loss": "2.15", "s2c_accuracy": "72.969", "s2c_total": "64", "s2c_n_correct": "46.7", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "7270", "lr": "4.84742e-05", "gnorm": "10.3", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1337"} 2023-01-29 16:34:01 | INFO | train_inner | {"epoch": 4, "update": 3.368, "s2c_loss": "1.931", "loss": "1.33821", "s2c_nll_loss": "1.931", "s2c_accuracy": "74.844", "s2c_total": "64", "s2c_n_correct": "47.9", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "7280", "lr": "4.85409e-05", "gnorm": "10.133", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1339"} 2023-01-29 16:34:04 | INFO | train_inner | {"epoch": 4, "update": 3.372, "s2c_loss": "1.816", "loss": "1.25875", "s2c_nll_loss": "1.816", "s2c_accuracy": "77.031", "s2c_total": "64", "s2c_n_correct": "49.3", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "7290", "lr": "4.86076e-05", "gnorm": "9.196", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1342"} 2023-01-29 16:34:06 | INFO | train_inner | {"epoch": 4, "update": 3.377, "s2c_loss": "2.015", "loss": "1.39688", "s2c_nll_loss": "2.015", "s2c_accuracy": "74.062", "s2c_total": "64", "s2c_n_correct": "47.4", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "7300", "lr": "4.86742e-05", "gnorm": "10.484", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1344"} 2023-01-29 16:34:09 | INFO | train_inner | {"epoch": 4, "update": 3.382, "s2c_loss": "1.954", "loss": "1.35453", "s2c_nll_loss": "1.954", "s2c_accuracy": "76.25", "s2c_total": "64", "s2c_n_correct": "48.8", "wps": "253", 
"ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "7310", "lr": "4.87409e-05", "gnorm": "9.166", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "1347"} 2023-01-29 16:34:11 | INFO | train_inner | {"epoch": 4, "update": 3.386, "s2c_loss": "2.152", "loss": "1.4914", "s2c_nll_loss": "2.152", "s2c_accuracy": "74.844", "s2c_total": "64", "s2c_n_correct": "47.9", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "7320", "lr": "4.88076e-05", "gnorm": "9.941", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "1349"} 2023-01-29 16:34:14 | INFO | train_inner | {"epoch": 4, "update": 3.391, "s2c_loss": "1.889", "loss": "1.30965", "s2c_nll_loss": "1.889", "s2c_accuracy": "77.969", "s2c_total": "64", "s2c_n_correct": "49.9", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "7330", "lr": "4.88742e-05", "gnorm": "9.001", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "1352"} 2023-01-29 16:34:17 | INFO | train_inner | {"epoch": 4, "update": 3.395, "s2c_loss": "1.848", "loss": "1.28077", "s2c_nll_loss": "1.848", "s2c_accuracy": "74.844", "s2c_total": "64", "s2c_n_correct": "47.9", "wps": "245.7", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "7340", "lr": "4.89409e-05", "gnorm": "9.534", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "1354"} 2023-01-29 16:34:19 | INFO | train_inner | {"epoch": 4, "update": 3.4, "s2c_loss": "1.97", "loss": "1.36517", "s2c_nll_loss": "1.97", "s2c_accuracy": "73.906", "s2c_total": "64", "s2c_n_correct": "47.3", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "7350", "lr": "4.90076e-05", "gnorm": "9.256", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "1357"} 2023-01-29 16:34:22 | INFO | train_inner | {"epoch": 4, "update": 3.405, "s2c_loss": "1.76", "loss": "1.22028", "s2c_nll_loss": "1.76", "s2c_accuracy": "76.875", "s2c_total": "64", "s2c_n_correct": "49.2", "wps": "254.4", 
"ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "7360", "lr": "4.90742e-05", "gnorm": "9.498", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1359"} 2023-01-29 16:34:24 | INFO | train_inner | {"epoch": 4, "update": 3.409, "s2c_loss": "1.865", "loss": "1.2929", "s2c_nll_loss": "1.865", "s2c_accuracy": "76.25", "s2c_total": "64", "s2c_n_correct": "48.8", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "7370", "lr": "4.91409e-05", "gnorm": "9.45", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1362"} 2023-01-29 16:34:27 | INFO | train_inner | {"epoch": 4, "update": 3.414, "s2c_loss": "1.769", "loss": "1.22623", "s2c_nll_loss": "1.769", "s2c_accuracy": "75.625", "s2c_total": "64", "s2c_n_correct": "48.4", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "7380", "lr": "4.92075e-05", "gnorm": "9.538", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1365"} 2023-01-29 16:34:29 | INFO | train_inner | {"epoch": 4, "update": 3.419, "s2c_loss": "2.052", "loss": "1.42224", "s2c_nll_loss": "2.052", "s2c_accuracy": "72.031", "s2c_total": "64", "s2c_n_correct": "46.1", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "7390", "lr": "4.92742e-05", "gnorm": "9.655", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1367"} 2023-01-29 16:34:32 | INFO | train_inner | {"epoch": 4, "update": 3.423, "s2c_loss": "2.086", "loss": "1.44559", "s2c_nll_loss": "2.086", "s2c_accuracy": "74.531", "s2c_total": "64", "s2c_n_correct": "47.7", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "7400", "lr": "4.93409e-05", "gnorm": "8.849", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1370"} 2023-01-29 16:34:34 | INFO | train_inner | {"epoch": 4, "update": 3.428, "s2c_loss": "2.025", "loss": "1.40383", "s2c_nll_loss": "2.025", "s2c_accuracy": "76.25", "s2c_total": "64", "s2c_n_correct": "48.8", "wps": 
"258.8", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "7410", "lr": "4.94075e-05", "gnorm": "9.319", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1372"} 2023-01-29 16:34:37 | INFO | train_inner | {"epoch": 4, "update": 3.432, "s2c_loss": "1.854", "loss": "1.28497", "s2c_nll_loss": "1.854", "s2c_accuracy": "76.562", "s2c_total": "64", "s2c_n_correct": "49", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "7420", "lr": "4.94742e-05", "gnorm": "9.535", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1375"} 2023-01-29 16:34:39 | INFO | train_inner | {"epoch": 4, "update": 3.437, "s2c_loss": "1.531", "loss": "1.06095", "s2c_nll_loss": "1.531", "s2c_accuracy": "80.781", "s2c_total": "64", "s2c_n_correct": "51.7", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "7430", "lr": "4.95409e-05", "gnorm": "10.289", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1377"} 2023-01-29 16:34:42 | INFO | train_inner | {"epoch": 4, "update": 3.442, "s2c_loss": "2.193", "loss": "1.52006", "s2c_nll_loss": "2.193", "s2c_accuracy": "73.125", "s2c_total": "64", "s2c_n_correct": "46.8", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "7440", "lr": "4.96075e-05", "gnorm": "10.532", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1380"} 2023-01-29 16:34:44 | INFO | train_inner | {"epoch": 4, "update": 3.446, "s2c_loss": "1.825", "loss": "1.26468", "s2c_nll_loss": "1.825", "s2c_accuracy": "75.625", "s2c_total": "64", "s2c_n_correct": "48.4", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "7450", "lr": "4.96742e-05", "gnorm": "10.846", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1382"} 2023-01-29 16:34:47 | INFO | train_inner | {"epoch": 4, "update": 3.451, "s2c_loss": "1.862", "loss": "1.29039", "s2c_nll_loss": "1.862", "s2c_accuracy": "77.812", "s2c_total": "64", "s2c_n_correct": "49.8", 
"wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "7460", "lr": "4.97408e-05", "gnorm": "9.538", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1385"} 2023-01-29 16:34:49 | INFO | train_inner | {"epoch": 4, "update": 3.456, "s2c_loss": "1.927", "loss": "1.3356", "s2c_nll_loss": "1.927", "s2c_accuracy": "73.594", "s2c_total": "64", "s2c_n_correct": "47.1", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "7470", "lr": "4.98075e-05", "gnorm": "10.466", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1387"} 2023-01-29 16:34:52 | INFO | train_inner | {"epoch": 4, "update": 3.46, "s2c_loss": "1.83", "loss": "1.26863", "s2c_nll_loss": "1.83", "s2c_accuracy": "76.406", "s2c_total": "64", "s2c_n_correct": "48.9", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "7480", "lr": "4.98742e-05", "gnorm": "9.129", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1390"} 2023-01-29 16:34:54 | INFO | train_inner | {"epoch": 4, "update": 3.465, "s2c_loss": "1.693", "loss": "1.17317", "s2c_nll_loss": "1.693", "s2c_accuracy": "77.812", "s2c_total": "64", "s2c_n_correct": "49.8", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "7490", "lr": "4.99408e-05", "gnorm": "9.4", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1392"} 2023-01-29 16:34:57 | INFO | train_inner | {"epoch": 4, "update": 3.469, "s2c_loss": "1.704", "loss": "1.1811", "s2c_nll_loss": "1.704", "s2c_accuracy": "77.344", "s2c_total": "64", "s2c_n_correct": "49.5", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "7500", "lr": "5.00075e-05", "gnorm": "9.416", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "1395"} 2023-01-29 16:34:59 | INFO | train_inner | {"epoch": 4, "update": 3.474, "s2c_loss": "1.846", "loss": "1.2794", "s2c_nll_loss": "1.846", "s2c_accuracy": "75.312", "s2c_total": "64", "s2c_n_correct": "48.2", 
"wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "7510", "lr": "5.00742e-05", "gnorm": "9.714", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1397"} 2023-01-29 16:35:02 | INFO | train_inner | {"epoch": 4, "update": 3.479, "s2c_loss": "1.713", "loss": "1.18734", "s2c_nll_loss": "1.713", "s2c_accuracy": "78.438", "s2c_total": "64", "s2c_n_correct": "50.2", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "7520", "lr": "5.01408e-05", "gnorm": "9.404", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1400"} 2023-01-29 16:35:04 | INFO | train_inner | {"epoch": 4, "update": 3.483, "s2c_loss": "1.891", "loss": "1.31073", "s2c_nll_loss": "1.891", "s2c_accuracy": "75.469", "s2c_total": "64", "s2c_n_correct": "48.3", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "7530", "lr": "5.02075e-05", "gnorm": "9.849", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1402"} 2023-01-29 16:35:07 | INFO | train_inner | {"epoch": 4, "update": 3.488, "s2c_loss": "1.872", "loss": "1.29748", "s2c_nll_loss": "1.872", "s2c_accuracy": "77.812", "s2c_total": "64", "s2c_n_correct": "49.8", "wps": "257.6", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "7540", "lr": "5.02742e-05", "gnorm": "9.926", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1405"} 2023-01-29 16:35:09 | INFO | train_inner | {"epoch": 4, "update": 3.493, "s2c_loss": "2.042", "loss": "1.41527", "s2c_nll_loss": "2.042", "s2c_accuracy": "76.562", "s2c_total": "64", "s2c_n_correct": "49", "wps": "260", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "7550", "lr": "5.03408e-05", "gnorm": "10.013", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1407"} 2023-01-29 16:35:12 | INFO | train_inner | {"epoch": 4, "update": 3.497, "s2c_loss": "1.876", "loss": "1.30044", "s2c_nll_loss": "1.876", "s2c_accuracy": "76.719", "s2c_total": "64", "s2c_n_correct": 
"49.1", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "7560", "lr": "5.04075e-05", "gnorm": "10.326", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1410"} 2023-01-29 16:35:14 | INFO | train_inner | {"epoch": 4, "update": 3.502, "s2c_loss": "2.052", "loss": "1.42214", "s2c_nll_loss": "2.052", "s2c_accuracy": "74.219", "s2c_total": "64", "s2c_n_correct": "47.5", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "7570", "lr": "5.04741e-05", "gnorm": "9.68", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1412"} 2023-01-29 16:35:17 | INFO | train_inner | {"epoch": 4, "update": 3.506, "s2c_loss": "1.524", "loss": "1.05642", "s2c_nll_loss": "1.524", "s2c_accuracy": "80.781", "s2c_total": "64", "s2c_n_correct": "51.7", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "7580", "lr": "5.05408e-05", "gnorm": "9.415", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1415"} 2023-01-29 16:35:19 | INFO | train_inner | {"epoch": 4, "update": 3.511, "s2c_loss": "1.88", "loss": "1.30287", "s2c_nll_loss": "1.88", "s2c_accuracy": "77.969", "s2c_total": "64", "s2c_n_correct": "49.9", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "7590", "lr": "5.06075e-05", "gnorm": "10.299", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1417"} 2023-01-29 16:35:22 | INFO | train_inner | {"epoch": 4, "update": 3.516, "s2c_loss": "1.732", "loss": "1.20031", "s2c_nll_loss": "1.732", "s2c_accuracy": "80.312", "s2c_total": "64", "s2c_n_correct": "51.4", "wps": "247.9", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "7600", "lr": "5.06741e-05", "gnorm": "9.535", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1420"} 2023-01-29 16:35:25 | INFO | train_inner | {"epoch": 4, "update": 3.52, "s2c_loss": "1.714", "loss": "1.18832", "s2c_nll_loss": "1.714", "s2c_accuracy": "77.344", "s2c_total": "64", 
"s2c_n_correct": "49.5", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "7610", "lr": "5.07408e-05", "gnorm": "9.498", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1423"} 2023-01-29 16:35:27 | INFO | train_inner | {"epoch": 4, "update": 3.525, "s2c_loss": "1.75", "loss": "1.21314", "s2c_nll_loss": "1.75", "s2c_accuracy": "77.188", "s2c_total": "64", "s2c_n_correct": "49.4", "wps": "257.6", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "7620", "lr": "5.08075e-05", "gnorm": "9.633", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1425"} 2023-01-29 16:35:30 | INFO | train_inner | {"epoch": 4, "update": 3.53, "s2c_loss": "1.623", "loss": "1.12521", "s2c_nll_loss": "1.623", "s2c_accuracy": "78.281", "s2c_total": "64", "s2c_n_correct": "50.1", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "7630", "lr": "5.08741e-05", "gnorm": "9.775", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1428"} 2023-01-29 16:35:32 | INFO | train_inner | {"epoch": 4, "update": 3.534, "s2c_loss": "1.588", "loss": "1.10041", "s2c_nll_loss": "1.588", "s2c_accuracy": "78.594", "s2c_total": "64", "s2c_n_correct": "50.3", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "7640", "lr": "5.09408e-05", "gnorm": "9.216", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1430"} 2023-01-29 16:35:35 | INFO | train_inner | {"epoch": 4, "update": 3.539, "s2c_loss": "1.665", "loss": "1.15399", "s2c_nll_loss": "1.665", "s2c_accuracy": "77.344", "s2c_total": "64", "s2c_n_correct": "49.5", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "7650", "lr": "5.10074e-05", "gnorm": "10.231", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1433"} 2023-01-29 16:35:37 | INFO | train_inner | {"epoch": 4, "update": 3.543, "s2c_loss": "1.808", "loss": "1.25355", "s2c_nll_loss": "1.808", "s2c_accuracy": "76.875", "s2c_total": 
"64", "s2c_n_correct": "49.2", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "7660", "lr": "5.10741e-05", "gnorm": "9.239", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1435"} 2023-01-29 16:35:40 | INFO | train_inner | {"epoch": 4, "update": 3.548, "s2c_loss": "1.787", "loss": "1.23862", "s2c_nll_loss": "1.787", "s2c_accuracy": "79.219", "s2c_total": "64", "s2c_n_correct": "50.7", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "7670", "lr": "5.11408e-05", "gnorm": "10.034", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1438"} 2023-01-29 16:35:42 | INFO | train_inner | {"epoch": 4, "update": 3.553, "s2c_loss": "1.75", "loss": "1.21325", "s2c_nll_loss": "1.75", "s2c_accuracy": "79.062", "s2c_total": "64", "s2c_n_correct": "50.6", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "7680", "lr": "5.12074e-05", "gnorm": "9.18", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1440"} 2023-01-29 16:35:45 | INFO | train_inner | {"epoch": 4, "update": 3.557, "s2c_loss": "1.905", "loss": "1.32029", "s2c_nll_loss": "1.905", "s2c_accuracy": "75.938", "s2c_total": "64", "s2c_n_correct": "48.6", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "7690", "lr": "5.12741e-05", "gnorm": "10.547", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1443"} 2023-01-29 16:35:47 | INFO | train_inner | {"epoch": 4, "update": 3.562, "s2c_loss": "1.49", "loss": "1.03308", "s2c_nll_loss": "1.49", "s2c_accuracy": "80.625", "s2c_total": "64", "s2c_n_correct": "51.6", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "7700", "lr": "5.13408e-05", "gnorm": "9.165", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1445"} 2023-01-29 16:35:50 | INFO | train_inner | {"epoch": 4, "update": 3.567, "s2c_loss": "1.723", "loss": "1.19417", "s2c_nll_loss": "1.723", "s2c_accuracy": "79.844", "s2c_total": "64", 
"s2c_n_correct": "51.1", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "7710", "lr": "5.14074e-05", "gnorm": "9.22", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1448"} 2023-01-29 16:35:52 | INFO | train_inner | {"epoch": 4, "update": 3.571, "s2c_loss": "1.635", "loss": "1.13357", "s2c_nll_loss": "1.635", "s2c_accuracy": "80", "s2c_total": "64", "s2c_n_correct": "51.2", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "7720", "lr": "5.14741e-05", "gnorm": "9.5", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1450"} 2023-01-29 16:35:55 | INFO | train_inner | {"epoch": 4, "update": 3.576, "s2c_loss": "1.829", "loss": "1.26789", "s2c_nll_loss": "1.829", "s2c_accuracy": "79.062", "s2c_total": "64", "s2c_n_correct": "50.6", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "7730", "lr": "5.15408e-05", "gnorm": "9.784", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1453"} 2023-01-29 16:35:57 | INFO | train_inner | {"epoch": 4, "update": 3.58, "s2c_loss": "1.62", "loss": "1.12314", "s2c_nll_loss": "1.62", "s2c_accuracy": "77.969", "s2c_total": "64", "s2c_n_correct": "49.9", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "7740", "lr": "5.16074e-05", "gnorm": "9.499", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "1455"} 2023-01-29 16:36:00 | INFO | train_inner | {"epoch": 4, "update": 3.585, "s2c_loss": "1.748", "loss": "1.21154", "s2c_nll_loss": "1.748", "s2c_accuracy": "75.938", "s2c_total": "64", "s2c_n_correct": "48.6", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "7750", "lr": "5.16741e-05", "gnorm": "9.477", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1458"} 2023-01-29 16:36:03 | INFO | train_inner | {"epoch": 4, "update": 3.59, "s2c_loss": "1.628", "loss": "1.12864", "s2c_nll_loss": "1.628", "s2c_accuracy": "80.938", "s2c_total": "64", 
"s2c_n_correct": "51.8", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "7760", "lr": "5.17407e-05", "gnorm": "9.959", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1461"} 2023-01-29 16:36:05 | INFO | train_inner | {"epoch": 4, "update": 3.594, "s2c_loss": "1.913", "loss": "1.32631", "s2c_nll_loss": "1.913", "s2c_accuracy": "75.156", "s2c_total": "64", "s2c_n_correct": "48.1", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "7770", "lr": "5.18074e-05", "gnorm": "10.333", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1463"} 2023-01-29 16:36:08 | INFO | train_inner | {"epoch": 4, "update": 3.599, "s2c_loss": "1.816", "loss": "1.25849", "s2c_nll_loss": "1.816", "s2c_accuracy": "76.562", "s2c_total": "64", "s2c_n_correct": "49", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "7780", "lr": "5.18741e-05", "gnorm": "9.1", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1466"} 2023-01-29 16:36:10 | INFO | train_inner | {"epoch": 4, "update": 3.604, "s2c_loss": "1.757", "loss": "1.21776", "s2c_nll_loss": "1.757", "s2c_accuracy": "78.438", "s2c_total": "64", "s2c_n_correct": "50.2", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "7790", "lr": "5.19407e-05", "gnorm": "8.977", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1468"} 2023-01-29 16:36:13 | INFO | train_inner | {"epoch": 4, "update": 3.608, "s2c_loss": "1.702", "loss": "1.17987", "s2c_nll_loss": "1.702", "s2c_accuracy": "76.719", "s2c_total": "64", "s2c_n_correct": "49.1", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "7800", "lr": "5.20074e-05", "gnorm": "9.726", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1471"} 2023-01-29 16:36:15 | INFO | train_inner | {"epoch": 4, "update": 3.613, "s2c_loss": "1.845", "loss": "1.27891", "s2c_nll_loss": "1.845", "s2c_accuracy": "77.656", "s2c_total": 
"64", "s2c_n_correct": "49.7", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "7810", "lr": "5.20741e-05", "gnorm": "9.391", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1473"} 2023-01-29 16:36:18 | INFO | train_inner | {"epoch": 4, "update": 3.617, "s2c_loss": "1.997", "loss": "1.38445", "s2c_nll_loss": "1.997", "s2c_accuracy": "76.25", "s2c_total": "64", "s2c_n_correct": "48.8", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "7820", "lr": "5.21407e-05", "gnorm": "9.412", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "1476"} 2023-01-29 16:36:20 | INFO | train_inner | {"epoch": 4, "update": 3.622, "s2c_loss": "1.462", "loss": "1.01351", "s2c_nll_loss": "1.462", "s2c_accuracy": "79.531", "s2c_total": "64", "s2c_n_correct": "50.9", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "7830", "lr": "5.22074e-05", "gnorm": "10.71", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1478"} 2023-01-29 16:36:23 | INFO | train_inner | {"epoch": 4, "update": 3.627, "s2c_loss": "1.56", "loss": "1.08147", "s2c_nll_loss": "1.56", "s2c_accuracy": "78.438", "s2c_total": "64", "s2c_n_correct": "50.2", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "7840", "lr": "5.22741e-05", "gnorm": "10.306", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1481"} 2023-01-29 16:36:25 | INFO | train_inner | {"epoch": 4, "update": 3.631, "s2c_loss": "1.703", "loss": "1.1805", "s2c_nll_loss": "1.703", "s2c_accuracy": "77.969", "s2c_total": "64", "s2c_n_correct": "49.9", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "7850", "lr": "5.23407e-05", "gnorm": "10.328", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1483"} 2023-01-29 16:36:28 | INFO | train_inner | {"epoch": 4, "update": 3.636, "s2c_loss": "1.501", "loss": "1.0401", "s2c_nll_loss": "1.501", "s2c_accuracy": "80.156", "s2c_total": 
"64", "s2c_n_correct": "51.3", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "7860", "lr": "5.24074e-05", "gnorm": "10.396", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1486"} 2023-01-29 16:36:30 | INFO | train_inner | {"epoch": 4, "update": 3.641, "s2c_loss": "1.593", "loss": "1.104", "s2c_nll_loss": "1.593", "s2c_accuracy": "77.969", "s2c_total": "64", "s2c_n_correct": "49.9", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "7870", "lr": "5.2474e-05", "gnorm": "10.667", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1488"} 2023-01-29 16:36:33 | INFO | train_inner | {"epoch": 4, "update": 3.645, "s2c_loss": "1.781", "loss": "1.23424", "s2c_nll_loss": "1.781", "s2c_accuracy": "73.594", "s2c_total": "64", "s2c_n_correct": "47.1", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "7880", "lr": "5.25407e-05", "gnorm": "10.382", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1491"} 2023-01-29 16:36:35 | INFO | train_inner | {"epoch": 4, "update": 3.65, "s2c_loss": "1.62", "loss": "1.12324", "s2c_nll_loss": "1.62", "s2c_accuracy": "79.375", "s2c_total": "64", "s2c_n_correct": "50.8", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "7890", "lr": "5.26074e-05", "gnorm": "9.639", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1493"} 2023-01-29 16:36:38 | INFO | train_inner | {"epoch": 4, "update": 3.654, "s2c_loss": "1.855", "loss": "1.28545", "s2c_nll_loss": "1.855", "s2c_accuracy": "76.094", "s2c_total": "64", "s2c_n_correct": "48.7", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "7900", "lr": "5.2674e-05", "gnorm": "10.106", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1496"} 2023-01-29 16:36:40 | INFO | train_inner | {"epoch": 4, "update": 3.659, "s2c_loss": "1.754", "loss": "1.21582", "s2c_nll_loss": "1.754", "s2c_accuracy": "78.438", "s2c_total": 
"64", "s2c_n_correct": "50.2", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "7910", "lr": "5.27407e-05", "gnorm": "10.098", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1498"} 2023-01-29 16:36:43 | INFO | train_inner | {"epoch": 4, "update": 3.664, "s2c_loss": "1.713", "loss": "1.18767", "s2c_nll_loss": "1.713", "s2c_accuracy": "77.812", "s2c_total": "64", "s2c_n_correct": "49.8", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "7920", "lr": "5.28074e-05", "gnorm": "10.129", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1501"} 2023-01-29 16:36:46 | INFO | train_inner | {"epoch": 4, "update": 3.668, "s2c_loss": "1.514", "loss": "1.04953", "s2c_nll_loss": "1.514", "s2c_accuracy": "82.969", "s2c_total": "64", "s2c_n_correct": "53.1", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "7930", "lr": "5.2874e-05", "gnorm": "9.728", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "1503"} 2023-01-29 16:36:48 | INFO | train_inner | {"epoch": 4, "update": 3.673, "s2c_loss": "2.076", "loss": "1.43875", "s2c_nll_loss": "2.076", "s2c_accuracy": "76.875", "s2c_total": "64", "s2c_n_correct": "49.2", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "7940", "lr": "5.29407e-05", "gnorm": "9.092", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1506"} 2023-01-29 16:36:51 | INFO | train_inner | {"epoch": 4, "update": 3.678, "s2c_loss": "1.65", "loss": "1.14349", "s2c_nll_loss": "1.65", "s2c_accuracy": "78.594", "s2c_total": "64", "s2c_n_correct": "50.3", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "7950", "lr": "5.30074e-05", "gnorm": "8.839", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1508"} 2023-01-29 16:36:53 | INFO | train_inner | {"epoch": 4, "update": 3.682, "s2c_loss": "1.899", "loss": "1.31628", "s2c_nll_loss": "1.899", "s2c_accuracy": "77.031", "s2c_total": 
"64", "s2c_n_correct": "49.3", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "7960", "lr": "5.3074e-05", "gnorm": "8.867", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1511"} 2023-01-29 16:36:56 | INFO | train_inner | {"epoch": 4, "update": 3.687, "s2c_loss": "1.654", "loss": "1.14627", "s2c_nll_loss": "1.654", "s2c_accuracy": "77.031", "s2c_total": "64", "s2c_n_correct": "49.3", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "7970", "lr": "5.31407e-05", "gnorm": "10.582", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "1514"} 2023-01-29 16:36:58 | INFO | train_inner | {"epoch": 4, "update": 3.691, "s2c_loss": "1.691", "loss": "1.1719", "s2c_nll_loss": "1.691", "s2c_accuracy": "77.969", "s2c_total": "64", "s2c_n_correct": "49.9", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "7980", "lr": "5.32073e-05", "gnorm": "10.382", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1516"} 2023-01-29 16:37:01 | INFO | train_inner | {"epoch": 4, "update": 3.696, "s2c_loss": "1.688", "loss": "1.17018", "s2c_nll_loss": "1.688", "s2c_accuracy": "77.344", "s2c_total": "64", "s2c_n_correct": "49.5", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "7990", "lr": "5.3274e-05", "gnorm": "12.473", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1519"} 2023-01-29 16:37:03 | INFO | train_inner | {"epoch": 4, "update": 3.701, "s2c_loss": "1.635", "loss": "1.13363", "s2c_nll_loss": "1.635", "s2c_accuracy": "79.375", "s2c_total": "64", "s2c_n_correct": "50.8", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "8000", "lr": "5.33407e-05", "gnorm": "9.554", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1521"} 2023-01-29 16:37:06 | INFO | train_inner | {"epoch": 4, "update": 3.705, "s2c_loss": "1.509", "loss": "1.04595", "s2c_nll_loss": "1.509", "s2c_accuracy": "79.062", 
"s2c_total": "64", "s2c_n_correct": "50.6", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "8010", "lr": "5.34073e-05", "gnorm": "10.965", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1524"} 2023-01-29 16:37:08 | INFO | train_inner | {"epoch": 4, "update": 3.71, "s2c_loss": "1.616", "loss": "1.11999", "s2c_nll_loss": "1.616", "s2c_accuracy": "79.844", "s2c_total": "64", "s2c_n_correct": "51.1", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "8020", "lr": "5.3474e-05", "gnorm": "10.184", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1526"} 2023-01-29 16:37:11 | INFO | train_inner | {"epoch": 4, "update": 3.715, "s2c_loss": "1.752", "loss": "1.21432", "s2c_nll_loss": "1.752", "s2c_accuracy": "79.375", "s2c_total": "64", "s2c_n_correct": "50.8", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "8030", "lr": "5.35407e-05", "gnorm": "9.169", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1529"} 2023-01-29 16:37:13 | INFO | train_inner | {"epoch": 4, "update": 3.719, "s2c_loss": "1.396", "loss": "0.96729", "s2c_nll_loss": "1.396", "s2c_accuracy": "81.719", "s2c_total": "64", "s2c_n_correct": "52.3", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "8040", "lr": "5.36073e-05", "gnorm": "8.978", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1531"} 2023-01-29 16:37:16 | INFO | train_inner | {"epoch": 4, "update": 3.724, "s2c_loss": "1.612", "loss": "1.11757", "s2c_nll_loss": "1.612", "s2c_accuracy": "79.688", "s2c_total": "64", "s2c_n_correct": "51", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "8050", "lr": "5.3674e-05", "gnorm": "10.242", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1534"} 2023-01-29 16:37:18 | INFO | train_inner | {"epoch": 4, "update": 3.728, "s2c_loss": "1.603", "loss": "1.11121", "s2c_nll_loss": "1.603", "s2c_accuracy": 
"77.188", "s2c_total": "64", "s2c_n_correct": "49.4", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "8060", "lr": "5.37406e-05", "gnorm": "9.801", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1536"} 2023-01-29 16:37:21 | INFO | train_inner | {"epoch": 4, "update": 3.733, "s2c_loss": "1.605", "loss": "1.1128", "s2c_nll_loss": "1.605", "s2c_accuracy": "77.812", "s2c_total": "64", "s2c_n_correct": "49.8", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "8070", "lr": "5.38073e-05", "gnorm": "10.471", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1539"} 2023-01-29 16:37:24 | INFO | train_inner | {"epoch": 4, "update": 3.738, "s2c_loss": "1.701", "loss": "1.17902", "s2c_nll_loss": "1.701", "s2c_accuracy": "77.656", "s2c_total": "64", "s2c_n_correct": "49.7", "wps": "244.7", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "8080", "lr": "5.3874e-05", "gnorm": "10.291", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1542"} 2023-01-29 16:37:26 | INFO | train_inner | {"epoch": 4, "update": 3.742, "s2c_loss": "1.48", "loss": "1.02585", "s2c_nll_loss": "1.48", "s2c_accuracy": "80.469", "s2c_total": "64", "s2c_n_correct": "51.5", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "8090", "lr": "5.39406e-05", "gnorm": "9.385", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1544"} 2023-01-29 16:37:29 | INFO | train_inner | {"epoch": 4, "update": 3.747, "s2c_loss": "1.752", "loss": "1.21442", "s2c_nll_loss": "1.752", "s2c_accuracy": "79.375", "s2c_total": "64", "s2c_n_correct": "50.8", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "8100", "lr": "5.40073e-05", "gnorm": "9.168", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1547"} 2023-01-29 16:37:31 | INFO | train_inner | {"epoch": 4, "update": 3.752, "s2c_loss": "1.587", "loss": "1.10187", "s2c_nll_loss": "1.587", 
"s2c_accuracy": "79.278", "s2c_total": "63.7", "s2c_n_correct": "50.5", "wps": "252.6", "ups": "3.97", "wpb": "63.7", "bsz": "63.7", "num_updates": "8110", "lr": "5.4074e-05", "gnorm": "9.481", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1549"} 2023-01-29 16:37:34 | INFO | train_inner | {"epoch": 4, "update": 3.756, "s2c_loss": "1.736", "loss": "1.20313", "s2c_nll_loss": "1.736", "s2c_accuracy": "79.688", "s2c_total": "64", "s2c_n_correct": "51", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "8120", "lr": "5.41406e-05", "gnorm": "9.769", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1552"} 2023-01-29 16:37:36 | INFO | train_inner | {"epoch": 4, "update": 3.761, "s2c_loss": "1.718", "loss": "1.19108", "s2c_nll_loss": "1.718", "s2c_accuracy": "79.219", "s2c_total": "64", "s2c_n_correct": "50.7", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "8130", "lr": "5.42073e-05", "gnorm": "10.886", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1554"} 2023-01-29 16:37:39 | INFO | train_inner | {"epoch": 4, "update": 3.765, "s2c_loss": "1.584", "loss": "1.09805", "s2c_nll_loss": "1.584", "s2c_accuracy": "79.531", "s2c_total": "64", "s2c_n_correct": "50.9", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "8140", "lr": "5.4274e-05", "gnorm": "10.43", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1557"} 2023-01-29 16:37:41 | INFO | train_inner | {"epoch": 4, "update": 3.77, "s2c_loss": "1.479", "loss": "1.02502", "s2c_nll_loss": "1.479", "s2c_accuracy": "79.219", "s2c_total": "64", "s2c_n_correct": "50.7", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "8150", "lr": "5.43406e-05", "gnorm": "9.58", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1559"} 2023-01-29 16:37:44 | INFO | train_inner | {"epoch": 4, "update": 3.775, "s2c_loss": "1.57", "loss": "1.08829", "s2c_nll_loss": 
"1.57", "s2c_accuracy": "78.594", "s2c_total": "64", "s2c_n_correct": "50.3", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "8160", "lr": "5.44073e-05", "gnorm": "10.483", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1562"} 2023-01-29 16:37:47 | INFO | train_inner | {"epoch": 4, "update": 3.779, "s2c_loss": "1.319", "loss": "0.91401", "s2c_nll_loss": "1.319", "s2c_accuracy": "83.75", "s2c_total": "64", "s2c_n_correct": "53.6", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "8170", "lr": "5.44739e-05", "gnorm": "8.515", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1564"} 2023-01-29 16:37:49 | INFO | train_inner | {"epoch": 4, "update": 3.784, "s2c_loss": "1.482", "loss": "1.0269", "s2c_nll_loss": "1.482", "s2c_accuracy": "80.312", "s2c_total": "64", "s2c_n_correct": "51.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "8180", "lr": "5.45406e-05", "gnorm": "9.727", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1567"} 2023-01-29 16:37:52 | INFO | train_inner | {"epoch": 4, "update": 3.789, "s2c_loss": "1.499", "loss": "1.03884", "s2c_nll_loss": "1.499", "s2c_accuracy": "78.438", "s2c_total": "64", "s2c_n_correct": "50.2", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "8190", "lr": "5.46073e-05", "gnorm": "10.443", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1570"} 2023-01-29 16:37:54 | INFO | train_inner | {"epoch": 4, "update": 3.793, "s2c_loss": "1.499", "loss": "1.03927", "s2c_nll_loss": "1.499", "s2c_accuracy": "79.062", "s2c_total": "64", "s2c_n_correct": "50.6", "wps": "247.9", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "8200", "lr": "5.46739e-05", "gnorm": "11.256", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1572"} 2023-01-29 16:37:57 | INFO | train_inner | {"epoch": 4, "update": 3.798, "s2c_loss": "1.446", "loss": "1.00199", 
"s2c_nll_loss": "1.446", "s2c_accuracy": "81.094", "s2c_total": "64", "s2c_n_correct": "51.9", "wps": "245", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "8210", "lr": "5.47406e-05", "gnorm": "10.344", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1575"} 2023-01-29 16:37:59 | INFO | train_inner | {"epoch": 4, "update": 3.802, "s2c_loss": "1.653", "loss": "1.14555", "s2c_nll_loss": "1.653", "s2c_accuracy": "78.594", "s2c_total": "64", "s2c_n_correct": "50.3", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "8220", "lr": "5.48073e-05", "gnorm": "10.754", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1577"} 2023-01-29 16:38:02 | INFO | train_inner | {"epoch": 4, "update": 3.807, "s2c_loss": "1.632", "loss": "1.13099", "s2c_nll_loss": "1.632", "s2c_accuracy": "79.688", "s2c_total": "64", "s2c_n_correct": "51", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "8230", "lr": "5.48739e-05", "gnorm": "9.454", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1580"} 2023-01-29 16:38:04 | INFO | train_inner | {"epoch": 4, "update": 3.812, "s2c_loss": "1.706", "loss": "1.18229", "s2c_nll_loss": "1.706", "s2c_accuracy": "79.062", "s2c_total": "64", "s2c_n_correct": "50.6", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "8240", "lr": "5.49406e-05", "gnorm": "10.628", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1582"} 2023-01-29 16:38:07 | INFO | train_inner | {"epoch": 4, "update": 3.816, "s2c_loss": "1.641", "loss": "1.13765", "s2c_nll_loss": "1.641", "s2c_accuracy": "78.281", "s2c_total": "64", "s2c_n_correct": "50.1", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "8250", "lr": "5.50072e-05", "gnorm": "11.072", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "1585"} 2023-01-29 16:38:09 | INFO | train_inner | {"epoch": 4, "update": 3.821, "s2c_loss": "1.77", "loss": 
"1.22687", "s2c_nll_loss": "1.77", "s2c_accuracy": "77.812", "s2c_total": "64", "s2c_n_correct": "49.8", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "8260", "lr": "5.50739e-05", "gnorm": "10.151", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1587"} 2023-01-29 16:38:12 | INFO | train_inner | {"epoch": 4, "update": 3.826, "s2c_loss": "1.74", "loss": "1.2061", "s2c_nll_loss": "1.74", "s2c_accuracy": "76.094", "s2c_total": "64", "s2c_n_correct": "48.7", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "8270", "lr": "5.51406e-05", "gnorm": "9.978", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1590"} 2023-01-29 16:38:15 | INFO | train_inner | {"epoch": 4, "update": 3.83, "s2c_loss": "1.615", "loss": "1.11918", "s2c_nll_loss": "1.615", "s2c_accuracy": "79.844", "s2c_total": "64", "s2c_n_correct": "51.1", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "8280", "lr": "5.52072e-05", "gnorm": "11.119", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1592"} 2023-01-29 16:38:17 | INFO | train_inner | {"epoch": 4, "update": 3.835, "s2c_loss": "1.725", "loss": "1.19545", "s2c_nll_loss": "1.725", "s2c_accuracy": "79.062", "s2c_total": "64", "s2c_n_correct": "50.6", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "8290", "lr": "5.52739e-05", "gnorm": "9.695", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1595"} 2023-01-29 16:38:20 | INFO | train_inner | {"epoch": 4, "update": 3.84, "s2c_loss": "1.536", "loss": "1.06501", "s2c_nll_loss": "1.536", "s2c_accuracy": "79.219", "s2c_total": "64", "s2c_n_correct": "50.7", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "8300", "lr": "5.53406e-05", "gnorm": "10.358", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1598"} 2023-01-29 16:38:22 | INFO | train_inner | {"epoch": 4, "update": 3.844, "s2c_loss": "1.49", 
"loss": "1.03287", "s2c_nll_loss": "1.49", "s2c_accuracy": "82.5", "s2c_total": "64", "s2c_n_correct": "52.8", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "8310", "lr": "5.54072e-05", "gnorm": "9.341", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1600"} 2023-01-29 16:38:25 | INFO | train_inner | {"epoch": 4, "update": 3.849, "s2c_loss": "1.387", "loss": "0.96144", "s2c_nll_loss": "1.387", "s2c_accuracy": "80.781", "s2c_total": "64", "s2c_n_correct": "51.7", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "8320", "lr": "5.54739e-05", "gnorm": "10.26", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1603"} 2023-01-29 16:38:27 | INFO | train_inner | {"epoch": 4, "update": 3.853, "s2c_loss": "1.631", "loss": "1.13026", "s2c_nll_loss": "1.631", "s2c_accuracy": "80.156", "s2c_total": "64", "s2c_n_correct": "51.3", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "8330", "lr": "5.55406e-05", "gnorm": "9.837", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1605"} 2023-01-29 16:38:30 | INFO | train_inner | {"epoch": 4, "update": 3.858, "s2c_loss": "1.763", "loss": "1.22206", "s2c_nll_loss": "1.763", "s2c_accuracy": "77.656", "s2c_total": "64", "s2c_n_correct": "49.7", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "8340", "lr": "5.56072e-05", "gnorm": "9.8", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1608"} 2023-01-29 16:38:32 | INFO | train_inner | {"epoch": 4, "update": 3.863, "s2c_loss": "1.65", "loss": "1.14372", "s2c_nll_loss": "1.65", "s2c_accuracy": "78.281", "s2c_total": "64", "s2c_n_correct": "50.1", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "8350", "lr": "5.56739e-05", "gnorm": "9.391", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1610"} 2023-01-29 16:38:35 | INFO | train_inner | {"epoch": 4, "update": 3.867, "s2c_loss": "1.347", 
"loss": "0.93358", "s2c_nll_loss": "1.347", "s2c_accuracy": "81.406", "s2c_total": "64", "s2c_n_correct": "52.1", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "8360", "lr": "5.57405e-05", "gnorm": "9.091", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1613"} 2023-01-29 16:38:37 | INFO | train_inner | {"epoch": 4, "update": 3.872, "s2c_loss": "1.754", "loss": "1.21606", "s2c_nll_loss": "1.754", "s2c_accuracy": "78.125", "s2c_total": "64", "s2c_n_correct": "50", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "8370", "lr": "5.58072e-05", "gnorm": "10.164", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1615"} 2023-01-29 16:38:40 | INFO | train_inner | {"epoch": 4, "update": 3.877, "s2c_loss": "1.543", "loss": "1.06941", "s2c_nll_loss": "1.543", "s2c_accuracy": "79.688", "s2c_total": "64", "s2c_n_correct": "51", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "8380", "lr": "5.58739e-05", "gnorm": "10.633", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1618"} 2023-01-29 16:38:42 | INFO | train_inner | {"epoch": 4, "update": 3.881, "s2c_loss": "1.746", "loss": "1.21032", "s2c_nll_loss": "1.746", "s2c_accuracy": "76.562", "s2c_total": "64", "s2c_n_correct": "49", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "8390", "lr": "5.59405e-05", "gnorm": "9.512", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1620"} 2023-01-29 16:38:45 | INFO | train_inner | {"epoch": 4, "update": 3.886, "s2c_loss": "1.421", "loss": "0.98482", "s2c_nll_loss": "1.421", "s2c_accuracy": "81.719", "s2c_total": "64", "s2c_n_correct": "52.3", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "8400", "lr": "5.60072e-05", "gnorm": "8.909", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1623"} 2023-01-29 16:38:47 | INFO | train_inner | {"epoch": 4, "update": 3.89, "s2c_loss": "1.561", 
"loss": "1.08187", "s2c_nll_loss": "1.561", "s2c_accuracy": "79.844", "s2c_total": "64", "s2c_n_correct": "51.1", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "8410", "lr": "5.60739e-05", "gnorm": "10.409", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1625"} 2023-01-29 16:38:50 | INFO | train_inner | {"epoch": 4, "update": 3.895, "s2c_loss": "1.566", "loss": "1.08518", "s2c_nll_loss": "1.566", "s2c_accuracy": "80.625", "s2c_total": "64", "s2c_n_correct": "51.6", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "8420", "lr": "5.61405e-05", "gnorm": "11.197", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1628"} 2023-01-29 16:38:53 | INFO | train_inner | {"epoch": 4, "update": 3.9, "s2c_loss": "1.425", "loss": "0.98798", "s2c_nll_loss": "1.425", "s2c_accuracy": "82.5", "s2c_total": "64", "s2c_n_correct": "52.8", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "8430", "lr": "5.62072e-05", "gnorm": "10.551", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1630"} 2023-01-29 16:38:55 | INFO | train_inner | {"epoch": 4, "update": 3.904, "s2c_loss": "1.394", "loss": "0.96596", "s2c_nll_loss": "1.394", "s2c_accuracy": "80.938", "s2c_total": "64", "s2c_n_correct": "51.8", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "8440", "lr": "5.62739e-05", "gnorm": "10.023", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1633"} 2023-01-29 16:38:58 | INFO | train_inner | {"epoch": 4, "update": 3.909, "s2c_loss": "1.361", "loss": "0.94363", "s2c_nll_loss": "1.361", "s2c_accuracy": "81.875", "s2c_total": "64", "s2c_n_correct": "52.4", "wps": "246.9", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "8450", "lr": "5.63405e-05", "gnorm": "10.493", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1636"} 2023-01-29 16:39:00 | INFO | train_inner | {"epoch": 4, "update": 3.914, "s2c_loss": 
"1.409", "loss": "0.97636", "s2c_nll_loss": "1.409", "s2c_accuracy": "80.938", "s2c_total": "64", "s2c_n_correct": "51.8", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "8460", "lr": "5.64072e-05", "gnorm": "10.102", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1638"} 2023-01-29 16:39:03 | INFO | train_inner | {"epoch": 4, "update": 3.918, "s2c_loss": "1.379", "loss": "0.95577", "s2c_nll_loss": "1.379", "s2c_accuracy": "82.188", "s2c_total": "64", "s2c_n_correct": "52.6", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "8470", "lr": "5.64738e-05", "gnorm": "10.084", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1641"} 2023-01-29 16:39:05 | INFO | train_inner | {"epoch": 4, "update": 3.923, "s2c_loss": "1.505", "loss": "1.04322", "s2c_nll_loss": "1.505", "s2c_accuracy": "80", "s2c_total": "64", "s2c_n_correct": "51.2", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "8480", "lr": "5.65405e-05", "gnorm": "8.991", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1643"} 2023-01-29 16:39:08 | INFO | train_inner | {"epoch": 4, "update": 3.927, "s2c_loss": "1.407", "loss": "0.97555", "s2c_nll_loss": "1.407", "s2c_accuracy": "81.406", "s2c_total": "64", "s2c_n_correct": "52.1", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "8490", "lr": "5.66072e-05", "gnorm": "8.838", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1646"} 2023-01-29 16:39:10 | INFO | train_inner | {"epoch": 4, "update": 3.932, "s2c_loss": "1.559", "loss": "1.08068", "s2c_nll_loss": "1.559", "s2c_accuracy": "79.844", "s2c_total": "64", "s2c_n_correct": "51.1", "wps": "247.9", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "8500", "lr": "5.66738e-05", "gnorm": "9.89", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1648"} 2023-01-29 16:39:13 | INFO | train_inner | {"epoch": 4, "update": 3.937, "s2c_loss": 
"1.356", "loss": "0.93997", "s2c_nll_loss": "1.356", "s2c_accuracy": "81.094", "s2c_total": "64", "s2c_n_correct": "51.9", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "8510", "lr": "5.67405e-05", "gnorm": "10.046", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1651"} 2023-01-29 16:39:16 | INFO | train_inner | {"epoch": 4, "update": 3.941, "s2c_loss": "1.312", "loss": "0.90937", "s2c_nll_loss": "1.312", "s2c_accuracy": "83.75", "s2c_total": "64", "s2c_n_correct": "53.6", "wps": "247.9", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "8520", "lr": "5.68072e-05", "gnorm": "8.879", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1653"} 2023-01-29 16:39:18 | INFO | train_inner | {"epoch": 4, "update": 3.946, "s2c_loss": "1.27", "loss": "0.88041", "s2c_nll_loss": "1.27", "s2c_accuracy": "83.125", "s2c_total": "64", "s2c_n_correct": "53.2", "wps": "246.5", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "8530", "lr": "5.68738e-05", "gnorm": "9.11", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1656"} 2023-01-29 16:39:21 | INFO | train_inner | {"epoch": 4, "update": 3.951, "s2c_loss": "1.289", "loss": "0.89348", "s2c_nll_loss": "1.289", "s2c_accuracy": "82.812", "s2c_total": "64", "s2c_n_correct": "53", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "8540", "lr": "5.69405e-05", "gnorm": "10.28", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1659"} 2023-01-29 16:39:23 | INFO | train_inner | {"epoch": 4, "update": 3.955, "s2c_loss": "1.33", "loss": "0.92195", "s2c_nll_loss": "1.33", "s2c_accuracy": "82.656", "s2c_total": "64", "s2c_n_correct": "52.9", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "8550", "lr": "5.70071e-05", "gnorm": "9.305", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1661"} 2023-01-29 16:39:26 | INFO | train_inner | {"epoch": 4, "update": 3.96, "s2c_loss": 
"1.407", "loss": "0.97509", "s2c_nll_loss": "1.407", "s2c_accuracy": "79.688", "s2c_total": "64", "s2c_n_correct": "51", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "8560", "lr": "5.70738e-05", "gnorm": "11.734", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1664"} 2023-01-29 16:39:28 | INFO | train_inner | {"epoch": 4, "update": 3.964, "s2c_loss": "1.53", "loss": "1.0607", "s2c_nll_loss": "1.53", "s2c_accuracy": "80", "s2c_total": "64", "s2c_n_correct": "51.2", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "8570", "lr": "5.71405e-05", "gnorm": "10.471", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1666"} 2023-01-29 16:39:31 | INFO | train_inner | {"epoch": 4, "update": 3.969, "s2c_loss": "1.388", "loss": "0.96177", "s2c_nll_loss": "1.388", "s2c_accuracy": "81.562", "s2c_total": "64", "s2c_n_correct": "52.2", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "8580", "lr": "5.72071e-05", "gnorm": "9.158", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1669"} 2023-01-29 16:39:33 | INFO | train_inner | {"epoch": 4, "update": 3.974, "s2c_loss": "1.446", "loss": "1.00252", "s2c_nll_loss": "1.446", "s2c_accuracy": "80.312", "s2c_total": "64", "s2c_n_correct": "51.4", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "8590", "lr": "5.72738e-05", "gnorm": "9.989", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1671"} 2023-01-29 16:39:36 | INFO | train_inner | {"epoch": 4, "update": 3.978, "s2c_loss": "1.438", "loss": "0.99705", "s2c_nll_loss": "1.438", "s2c_accuracy": "79.531", "s2c_total": "64", "s2c_n_correct": "50.9", "wps": "245.7", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "8600", "lr": "5.73405e-05", "gnorm": "9.853", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1674"} 2023-01-29 16:39:38 | INFO | train_inner | {"epoch": 4, "update": 3.983, "s2c_loss": 
"1.29", "loss": "0.89412", "s2c_nll_loss": "1.29", "s2c_accuracy": "82.344", "s2c_total": "64", "s2c_n_correct": "52.7", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "8610", "lr": "5.74071e-05", "gnorm": "9.913", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1676"} 2023-01-29 16:39:41 | INFO | train_inner | {"epoch": 4, "update": 3.988, "s2c_loss": "1.36", "loss": "0.94265", "s2c_nll_loss": "1.36", "s2c_accuracy": "81.875", "s2c_total": "64", "s2c_n_correct": "52.4", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "8620", "lr": "5.74738e-05", "gnorm": "10.02", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1679"} 2023-01-29 16:39:44 | INFO | train_inner | {"epoch": 4, "update": 3.992, "s2c_loss": "1.513", "loss": "1.04882", "s2c_nll_loss": "1.513", "s2c_accuracy": "81.25", "s2c_total": "64", "s2c_n_correct": "52", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "8630", "lr": "5.75405e-05", "gnorm": "9.881", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1682"} 2023-01-29 16:39:46 | INFO | train_inner | {"epoch": 4, "update": 3.997, "s2c_loss": "1.46", "loss": "1.0118", "s2c_nll_loss": "1.46", "s2c_accuracy": "80.938", "s2c_total": "64", "s2c_n_correct": "51.8", "wps": "241.8", "ups": "3.78", "wpb": "64", "bsz": "64", "num_updates": "8640", "lr": "5.76071e-05", "gnorm": "11.077", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1684"} 2023-01-29 16:39:48 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 4 @ 8647 updates 2023-01-29 16:39:48 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 16:39:55 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 16:39:55 | INFO | fairseq.checkpoint_utils | Saved checkpoint 
/home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 4 @ 8647 updates, score None) (writing took 7.023365391883999 seconds) 2023-01-29 16:39:55 | INFO | fairseq_cli.train | end of epoch 4 (average epoch stats below) 2023-01-29 16:39:55 | INFO | train | {"epoch": 4, "train_s2c_loss": "1.873", "train_loss": "1.29841", "train_s2c_nll_loss": "1.873", "train_s2c_accuracy": "76.31", "train_s2c_total": "63.9838", "train_s2c_n_correct": "48.8261", "train_wps": "245.9", "train_ups": "3.84", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "8647", "train_lr": "5.76538e-05", "train_gnorm": "9.783", "train_loss_scale": "512", "train_train_wall": "542", "train_gb_free": "7.5", "train_wall": "1693"} 2023-01-29 16:40:01 | INFO | fairseq.trainer | begin training epoch 5 2023-01-29 16:40:01 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 16:40:02 | INFO | train_inner | {"epoch": 5, "update": 4.001, "s2c_loss": "1.221", "loss": "0.84644", "s2c_nll_loss": "1.221", "s2c_accuracy": "84.046", "s2c_total": "60.8", "s2c_n_correct": "51.1", "wps": "38.1", "ups": "0.63", "wpb": "60.8", "bsz": "60.8", "num_updates": "8650", "lr": "5.76738e-05", "gnorm": "10.311", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1700"} 2023-01-29 16:40:05 | INFO | train_inner | {"epoch": 5, "update": 4.006, "s2c_loss": "1.63", "loss": "1.12963", "s2c_nll_loss": "1.63", "s2c_accuracy": "79.531", "s2c_total": "64", "s2c_n_correct": "50.9", "wps": "246.6", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "8660", "lr": "5.77404e-05", "gnorm": "9.974", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "1703"} 2023-01-29 16:40:07 | INFO | train_inner | {"epoch": 5, "update": 4.011, "s2c_loss": "1.313", "loss": "0.90996", "s2c_nll_loss": "1.313", "s2c_accuracy": "82.188", "s2c_total": "64", "s2c_n_correct": "52.6", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "8670", "lr": "5.78071e-05", "gnorm": 
"9.306", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "1705"} 2023-01-29 16:40:10 | INFO | train_inner | {"epoch": 5, "update": 4.015, "s2c_loss": "1.436", "loss": "0.99547", "s2c_nll_loss": "1.436", "s2c_accuracy": "82.969", "s2c_total": "64", "s2c_n_correct": "53.1", "wps": "244.5", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "8680", "lr": "5.78738e-05", "gnorm": "8.442", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1708"} 2023-01-29 16:40:13 | INFO | train_inner | {"epoch": 5, "update": 4.02, "s2c_loss": "1.271", "loss": "0.8811", "s2c_nll_loss": "1.271", "s2c_accuracy": "83.281", "s2c_total": "64", "s2c_n_correct": "53.3", "wps": "244.4", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "8690", "lr": "5.79404e-05", "gnorm": "9.47", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1711"} 2023-01-29 16:40:15 | INFO | train_inner | {"epoch": 5, "update": 4.025, "s2c_loss": "1.354", "loss": "0.93879", "s2c_nll_loss": "1.354", "s2c_accuracy": "81.875", "s2c_total": "64", "s2c_n_correct": "52.4", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "8700", "lr": "5.80071e-05", "gnorm": "10.55", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1713"} 2023-01-29 16:40:18 | INFO | train_inner | {"epoch": 5, "update": 4.029, "s2c_loss": "1.219", "loss": "0.84473", "s2c_nll_loss": "1.219", "s2c_accuracy": "82.656", "s2c_total": "64", "s2c_n_correct": "52.9", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "8710", "lr": "5.80738e-05", "gnorm": "10.651", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "1716"} 2023-01-29 16:40:20 | INFO | train_inner | {"epoch": 5, "update": 4.034, "s2c_loss": "1.275", "loss": "0.88372", "s2c_nll_loss": "1.275", "s2c_accuracy": "85.469", "s2c_total": "64", "s2c_n_correct": "54.7", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "8720", "lr": "5.81404e-05", 
"gnorm": "8.975", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1718"} 2023-01-29 16:40:23 | INFO | train_inner | {"epoch": 5, "update": 4.038, "s2c_loss": "1.29", "loss": "0.89436", "s2c_nll_loss": "1.29", "s2c_accuracy": "83.594", "s2c_total": "64", "s2c_n_correct": "53.5", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "8730", "lr": "5.82071e-05", "gnorm": "9.098", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1721"} 2023-01-29 16:40:25 | INFO | train_inner | {"epoch": 5, "update": 4.043, "s2c_loss": "1.128", "loss": "0.78209", "s2c_nll_loss": "1.128", "s2c_accuracy": "84.375", "s2c_total": "64", "s2c_n_correct": "54", "wps": "244.2", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "8740", "lr": "5.82738e-05", "gnorm": "9.033", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1723"} 2023-01-29 16:40:28 | INFO | train_inner | {"epoch": 5, "update": 4.048, "s2c_loss": "1.369", "loss": "0.94895", "s2c_nll_loss": "1.369", "s2c_accuracy": "84.688", "s2c_total": "64", "s2c_n_correct": "54.2", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "8750", "lr": "5.83404e-05", "gnorm": "8.898", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1726"} 2023-01-29 16:40:31 | INFO | train_inner | {"epoch": 5, "update": 4.052, "s2c_loss": "1.21", "loss": "0.83857", "s2c_nll_loss": "1.21", "s2c_accuracy": "83.75", "s2c_total": "64", "s2c_n_correct": "53.6", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "8760", "lr": "5.84071e-05", "gnorm": "9.23", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1728"} 2023-01-29 16:40:33 | INFO | train_inner | {"epoch": 5, "update": 4.057, "s2c_loss": "1.223", "loss": "0.84794", "s2c_nll_loss": "1.223", "s2c_accuracy": "83.906", "s2c_total": "64", "s2c_n_correct": "53.7", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "8770", "lr": "5.84737e-05", 
"gnorm": "8.933", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1731"} 2023-01-29 16:40:36 | INFO | train_inner | {"epoch": 5, "update": 4.062, "s2c_loss": "1.097", "loss": "0.76028", "s2c_nll_loss": "1.097", "s2c_accuracy": "85.938", "s2c_total": "64", "s2c_n_correct": "55", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "8780", "lr": "5.85404e-05", "gnorm": "9.348", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1733"} 2023-01-29 16:40:38 | INFO | train_inner | {"epoch": 5, "update": 4.066, "s2c_loss": "1.294", "loss": "0.89724", "s2c_nll_loss": "1.294", "s2c_accuracy": "83.125", "s2c_total": "64", "s2c_n_correct": "53.2", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "8790", "lr": "5.86071e-05", "gnorm": "8.401", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1736"} 2023-01-29 16:40:41 | INFO | train_inner | {"epoch": 5, "update": 4.071, "s2c_loss": "1.348", "loss": "0.9345", "s2c_nll_loss": "1.348", "s2c_accuracy": "83.75", "s2c_total": "64", "s2c_n_correct": "53.6", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "8800", "lr": "5.86737e-05", "gnorm": "9.338", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1739"} 2023-01-29 16:40:43 | INFO | train_inner | {"epoch": 5, "update": 4.075, "s2c_loss": "1.038", "loss": "0.71922", "s2c_nll_loss": "1.038", "s2c_accuracy": "86.562", "s2c_total": "64", "s2c_n_correct": "55.4", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "8810", "lr": "5.87404e-05", "gnorm": "9.263", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1741"} 2023-01-29 16:40:46 | INFO | train_inner | {"epoch": 5, "update": 4.08, "s2c_loss": "1.067", "loss": "0.73943", "s2c_nll_loss": "1.067", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "8820", "lr": "5.88071e-05", 
"gnorm": "9.217", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1744"} 2023-01-29 16:40:48 | INFO | train_inner | {"epoch": 5, "update": 4.085, "s2c_loss": "1.31", "loss": "0.90821", "s2c_nll_loss": "1.31", "s2c_accuracy": "82.656", "s2c_total": "64", "s2c_n_correct": "52.9", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "8830", "lr": "5.88737e-05", "gnorm": "10.831", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1746"} 2023-01-29 16:40:51 | INFO | train_inner | {"epoch": 5, "update": 4.089, "s2c_loss": "1.333", "loss": "0.92412", "s2c_nll_loss": "1.333", "s2c_accuracy": "82.5", "s2c_total": "64", "s2c_n_correct": "52.8", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "8840", "lr": "5.89404e-05", "gnorm": "9.74", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1749"} 2023-01-29 16:40:53 | INFO | train_inner | {"epoch": 5, "update": 4.094, "s2c_loss": "1.31", "loss": "0.90781", "s2c_nll_loss": "1.31", "s2c_accuracy": "83.75", "s2c_total": "64", "s2c_n_correct": "53.6", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "8850", "lr": "5.9007e-05", "gnorm": "9.894", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1751"} 2023-01-29 16:40:56 | INFO | train_inner | {"epoch": 5, "update": 4.099, "s2c_loss": "1.14", "loss": "0.79001", "s2c_nll_loss": "1.14", "s2c_accuracy": "85.938", "s2c_total": "64", "s2c_n_correct": "55", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "8860", "lr": "5.90737e-05", "gnorm": "9.566", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1754"} 2023-01-29 16:40:58 | INFO | train_inner | {"epoch": 5, "update": 4.103, "s2c_loss": "1.235", "loss": "0.8559", "s2c_nll_loss": "1.235", "s2c_accuracy": "85", "s2c_total": "64", "s2c_n_correct": "54.4", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "8870", "lr": "5.91404e-05", "gnorm": 
"9.137", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1756"} 2023-01-29 16:41:01 | INFO | train_inner | {"epoch": 5, "update": 4.108, "s2c_loss": "1.439", "loss": "0.99716", "s2c_nll_loss": "1.439", "s2c_accuracy": "80.625", "s2c_total": "64", "s2c_n_correct": "51.6", "wps": "259.2", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "8880", "lr": "5.9207e-05", "gnorm": "9.318", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "1759"} 2023-01-29 16:41:03 | INFO | train_inner | {"epoch": 5, "update": 4.112, "s2c_loss": "1.232", "loss": "0.85413", "s2c_nll_loss": "1.232", "s2c_accuracy": "83.594", "s2c_total": "64", "s2c_n_correct": "53.5", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "8890", "lr": "5.92737e-05", "gnorm": "8.882", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1761"} 2023-01-29 16:41:06 | INFO | train_inner | {"epoch": 5, "update": 4.117, "s2c_loss": "1.384", "loss": "0.95901", "s2c_nll_loss": "1.384", "s2c_accuracy": "82.188", "s2c_total": "64", "s2c_n_correct": "52.6", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "8900", "lr": "5.93404e-05", "gnorm": "10.86", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1764"} 2023-01-29 16:41:08 | INFO | train_inner | {"epoch": 5, "update": 4.122, "s2c_loss": "1.31", "loss": "0.90792", "s2c_nll_loss": "1.31", "s2c_accuracy": "83.438", "s2c_total": "64", "s2c_n_correct": "53.4", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "8910", "lr": "5.9407e-05", "gnorm": "8.716", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1766"} 2023-01-29 16:41:11 | INFO | train_inner | {"epoch": 5, "update": 4.126, "s2c_loss": "1.035", "loss": "0.7177", "s2c_nll_loss": "1.035", "s2c_accuracy": "85.625", "s2c_total": "64", "s2c_n_correct": "54.8", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "8920", "lr": "5.94737e-05", "gnorm": 
"9.495", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1769"} 2023-01-29 16:41:13 | INFO | train_inner | {"epoch": 5, "update": 4.131, "s2c_loss": "1.569", "loss": "1.08734", "s2c_nll_loss": "1.569", "s2c_accuracy": "80.781", "s2c_total": "64", "s2c_n_correct": "51.7", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "8930", "lr": "5.95404e-05", "gnorm": "10.319", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1771"} 2023-01-29 16:41:16 | INFO | train_inner | {"epoch": 5, "update": 4.136, "s2c_loss": "1.009", "loss": "0.69928", "s2c_nll_loss": "1.009", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "8940", "lr": "5.9607e-05", "gnorm": "8.598", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1774"} 2023-01-29 16:41:18 | INFO | train_inner | {"epoch": 5, "update": 4.14, "s2c_loss": "1.173", "loss": "0.81281", "s2c_nll_loss": "1.173", "s2c_accuracy": "82.5", "s2c_total": "64", "s2c_n_correct": "52.8", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "8950", "lr": "5.96737e-05", "gnorm": "9.479", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1776"} 2023-01-29 16:41:21 | INFO | train_inner | {"epoch": 5, "update": 4.145, "s2c_loss": "1.171", "loss": "0.81169", "s2c_nll_loss": "1.171", "s2c_accuracy": "85.156", "s2c_total": "64", "s2c_n_correct": "54.5", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "8960", "lr": "5.97403e-05", "gnorm": "9.185", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1779"} 2023-01-29 16:41:24 | INFO | train_inner | {"epoch": 5, "update": 4.149, "s2c_loss": "1.007", "loss": "0.69812", "s2c_nll_loss": "1.007", "s2c_accuracy": "85", "s2c_total": "64", "s2c_n_correct": "54.4", "wps": "248", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "8970", "lr": "5.9807e-05", "gnorm": 
"9.734", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1782"} 2023-01-29 16:41:26 | INFO | train_inner | {"epoch": 5, "update": 4.154, "s2c_loss": "1.462", "loss": "1.01307", "s2c_nll_loss": "1.462", "s2c_accuracy": "82.5", "s2c_total": "64", "s2c_n_correct": "52.8", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "8980", "lr": "5.98737e-05", "gnorm": "9.334", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1784"} 2023-01-29 16:41:29 | INFO | train_inner | {"epoch": 5, "update": 4.159, "s2c_loss": "1.368", "loss": "0.94838", "s2c_nll_loss": "1.368", "s2c_accuracy": "84.375", "s2c_total": "64", "s2c_n_correct": "54", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "8990", "lr": "5.99403e-05", "gnorm": "9.952", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1787"} 2023-01-29 16:41:31 | INFO | train_inner | {"epoch": 5, "update": 4.163, "s2c_loss": "1.268", "loss": "0.87925", "s2c_nll_loss": "1.268", "s2c_accuracy": "83.438", "s2c_total": "64", "s2c_n_correct": "53.4", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "9000", "lr": "6.0007e-05", "gnorm": "10.198", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1789"} 2023-01-29 16:41:34 | INFO | train_inner | {"epoch": 5, "update": 4.168, "s2c_loss": "1.114", "loss": "0.77208", "s2c_nll_loss": "1.114", "s2c_accuracy": "84.375", "s2c_total": "64", "s2c_n_correct": "54", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "9010", "lr": "6.00737e-05", "gnorm": "10.19", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1792"} 2023-01-29 16:41:36 | INFO | train_inner | {"epoch": 5, "update": 4.173, "s2c_loss": "1.164", "loss": "0.80658", "s2c_nll_loss": "1.164", "s2c_accuracy": "84.531", "s2c_total": "64", "s2c_n_correct": "54.1", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "9020", "lr": "6.01403e-05", "gnorm": 
"9.615", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1794"} 2023-01-29 16:41:39 | INFO | train_inner | {"epoch": 5, "update": 4.177, "s2c_loss": "1.206", "loss": "0.83598", "s2c_nll_loss": "1.206", "s2c_accuracy": "82.656", "s2c_total": "64", "s2c_n_correct": "52.9", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "9030", "lr": "6.0207e-05", "gnorm": "10.619", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1797"} 2023-01-29 16:41:41 | INFO | train_inner | {"epoch": 5, "update": 4.182, "s2c_loss": "1.654", "loss": "1.14626", "s2c_nll_loss": "1.654", "s2c_accuracy": "81.406", "s2c_total": "64", "s2c_n_correct": "52.1", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "9040", "lr": "6.02737e-05", "gnorm": "9.075", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1799"} 2023-01-29 16:41:44 | INFO | train_inner | {"epoch": 5, "update": 4.186, "s2c_loss": "1.208", "loss": "0.83708", "s2c_nll_loss": "1.208", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "9050", "lr": "6.03403e-05", "gnorm": "9.592", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1802"} 2023-01-29 16:41:46 | INFO | train_inner | {"epoch": 5, "update": 4.191, "s2c_loss": "0.997", "loss": "0.69105", "s2c_nll_loss": "0.997", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "9060", "lr": "6.0407e-05", "gnorm": "9.516", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1804"} 2023-01-29 16:41:49 | INFO | train_inner | {"epoch": 5, "update": 4.196, "s2c_loss": "1.271", "loss": "0.88126", "s2c_nll_loss": "1.271", "s2c_accuracy": "82.812", "s2c_total": "64", "s2c_n_correct": "53", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "9070", "lr": "6.04736e-05", 
"gnorm": "9.142", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1807"} 2023-01-29 16:41:52 | INFO | train_inner | {"epoch": 5, "update": 4.2, "s2c_loss": "1.133", "loss": "0.78513", "s2c_nll_loss": "1.133", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "9080", "lr": "6.05403e-05", "gnorm": "9.015", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1809"} 2023-01-29 16:41:54 | INFO | train_inner | {"epoch": 5, "update": 4.205, "s2c_loss": "1.083", "loss": "0.75047", "s2c_nll_loss": "1.083", "s2c_accuracy": "85.469", "s2c_total": "64", "s2c_n_correct": "54.7", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "9090", "lr": "6.0607e-05", "gnorm": "10.584", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1812"} 2023-01-29 16:41:57 | INFO | train_inner | {"epoch": 5, "update": 4.21, "s2c_loss": "1.157", "loss": "0.80206", "s2c_nll_loss": "1.157", "s2c_accuracy": "83.125", "s2c_total": "64", "s2c_n_correct": "53.2", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "9100", "lr": "6.06736e-05", "gnorm": "9.938", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1815"} 2023-01-29 16:41:59 | INFO | train_inner | {"epoch": 5, "update": 4.214, "s2c_loss": "1.201", "loss": "0.83254", "s2c_nll_loss": "1.201", "s2c_accuracy": "85", "s2c_total": "64", "s2c_n_correct": "54.4", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "9110", "lr": "6.07403e-05", "gnorm": "9.159", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1817"} 2023-01-29 16:42:02 | INFO | train_inner | {"epoch": 5, "update": 4.219, "s2c_loss": "1.13", "loss": "0.78297", "s2c_nll_loss": "1.13", "s2c_accuracy": "83.906", "s2c_total": "64", "s2c_n_correct": "53.7", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "9120", "lr": "6.0807e-05", 
"gnorm": "8.956", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1820"} 2023-01-29 16:42:04 | INFO | train_inner | {"epoch": 5, "update": 4.223, "s2c_loss": "1.139", "loss": "0.78933", "s2c_nll_loss": "1.139", "s2c_accuracy": "84.062", "s2c_total": "64", "s2c_n_correct": "53.8", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "9130", "lr": "6.08736e-05", "gnorm": "9.431", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1822"} 2023-01-29 16:42:07 | INFO | train_inner | {"epoch": 5, "update": 4.228, "s2c_loss": "0.999", "loss": "0.69249", "s2c_nll_loss": "0.999", "s2c_accuracy": "85.625", "s2c_total": "64", "s2c_n_correct": "54.8", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "9140", "lr": "6.09403e-05", "gnorm": "9.843", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1825"} 2023-01-29 16:42:09 | INFO | train_inner | {"epoch": 5, "update": 4.233, "s2c_loss": "1.253", "loss": "0.8687", "s2c_nll_loss": "1.253", "s2c_accuracy": "81.875", "s2c_total": "64", "s2c_n_correct": "52.4", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "9150", "lr": "6.10069e-05", "gnorm": "10.455", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1827"} 2023-01-29 16:42:12 | INFO | train_inner | {"epoch": 5, "update": 4.237, "s2c_loss": "1.163", "loss": "0.8061", "s2c_nll_loss": "1.163", "s2c_accuracy": "84.844", "s2c_total": "64", "s2c_n_correct": "54.3", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "9160", "lr": "6.10736e-05", "gnorm": "10.246", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1830"} 2023-01-29 16:42:14 | INFO | train_inner | {"epoch": 5, "update": 4.242, "s2c_loss": "1.056", "loss": "0.73208", "s2c_nll_loss": "1.056", "s2c_accuracy": "86.25", "s2c_total": "64", "s2c_n_correct": "55.2", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "9170", "lr": "6.11403e-05", 
"gnorm": "9.066", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1832"} 2023-01-29 16:42:17 | INFO | train_inner | {"epoch": 5, "update": 4.247, "s2c_loss": "1.14", "loss": "0.79006", "s2c_nll_loss": "1.14", "s2c_accuracy": "84.531", "s2c_total": "64", "s2c_n_correct": "54.1", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "9180", "lr": "6.12069e-05", "gnorm": "9.192", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1835"} 2023-01-29 16:42:20 | INFO | train_inner | {"epoch": 5, "update": 4.251, "s2c_loss": "1.212", "loss": "0.83985", "s2c_nll_loss": "1.212", "s2c_accuracy": "86.562", "s2c_total": "64", "s2c_n_correct": "55.4", "wps": "246.5", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "9190", "lr": "6.12736e-05", "gnorm": "9.595", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1837"} 2023-01-29 16:42:22 | INFO | train_inner | {"epoch": 5, "update": 4.256, "s2c_loss": "1.046", "loss": "0.72514", "s2c_nll_loss": "1.046", "s2c_accuracy": "85.938", "s2c_total": "64", "s2c_n_correct": "55", "wps": "257.8", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "9200", "lr": "6.13403e-05", "gnorm": "9.436", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "1840"} 2023-01-29 16:42:25 | INFO | train_inner | {"epoch": 5, "update": 4.26, "s2c_loss": "1.002", "loss": "0.69434", "s2c_nll_loss": "1.002", "s2c_accuracy": "86.094", "s2c_total": "64", "s2c_n_correct": "55.1", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "9210", "lr": "6.14069e-05", "gnorm": "9.933", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1842"} 2023-01-29 16:42:27 | INFO | train_inner | {"epoch": 5, "update": 4.265, "s2c_loss": "1.098", "loss": "0.76087", "s2c_nll_loss": "1.098", "s2c_accuracy": "85.312", "s2c_total": "64", "s2c_n_correct": "54.6", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "9220", "lr": "6.14736e-05", 
"gnorm": "9.251", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "1845"} 2023-01-29 16:42:30 | INFO | train_inner | {"epoch": 5, "update": 4.27, "s2c_loss": "1.34", "loss": "0.92893", "s2c_nll_loss": "1.34", "s2c_accuracy": "82.812", "s2c_total": "64", "s2c_n_correct": "53", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "9230", "lr": "6.15403e-05", "gnorm": "10.043", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1847"} 2023-01-29 16:42:32 | INFO | train_inner | {"epoch": 5, "update": 4.274, "s2c_loss": "1.337", "loss": "0.92659", "s2c_nll_loss": "1.337", "s2c_accuracy": "85", "s2c_total": "64", "s2c_n_correct": "54.4", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "9240", "lr": "6.16069e-05", "gnorm": "9.379", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1850"} 2023-01-29 16:42:35 | INFO | train_inner | {"epoch": 5, "update": 4.279, "s2c_loss": "1.066", "loss": "0.73858", "s2c_nll_loss": "1.066", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "9250", "lr": "6.16736e-05", "gnorm": "9.832", "loss_scale": "512", "train_wall": "2", "gb_free": "7.5", "wall": "1853"} 2023-01-29 16:42:37 | INFO | train_inner | {"epoch": 5, "update": 4.284, "s2c_loss": "1.228", "loss": "0.85115", "s2c_nll_loss": "1.228", "s2c_accuracy": "83.281", "s2c_total": "64", "s2c_n_correct": "53.3", "wps": "261.1", "ups": "4.08", "wpb": "64", "bsz": "64", "num_updates": "9260", "lr": "6.17402e-05", "gnorm": "9.424", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1855"} 2023-01-29 16:42:40 | INFO | train_inner | {"epoch": 5, "update": 4.288, "s2c_loss": "1.082", "loss": "0.74991", "s2c_nll_loss": "1.082", "s2c_accuracy": "84.844", "s2c_total": "64", "s2c_n_correct": "54.3", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "9270", "lr": "6.18069e-05", 
"gnorm": "10.629", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1857"} 2023-01-29 16:42:42 | INFO | train_inner | {"epoch": 5, "update": 4.293, "s2c_loss": "1.238", "loss": "0.8584", "s2c_nll_loss": "1.238", "s2c_accuracy": "85.312", "s2c_total": "64", "s2c_n_correct": "54.6", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "9280", "lr": "6.18736e-05", "gnorm": "9.541", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1860"} 2023-01-29 16:42:45 | INFO | train_inner | {"epoch": 5, "update": 4.297, "s2c_loss": "0.95", "loss": "0.65863", "s2c_nll_loss": "0.95", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "9290", "lr": "6.19402e-05", "gnorm": "10.861", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1863"} 2023-01-29 16:42:47 | INFO | train_inner | {"epoch": 5, "update": 4.302, "s2c_loss": "1.065", "loss": "0.73821", "s2c_nll_loss": "1.065", "s2c_accuracy": "84.844", "s2c_total": "64", "s2c_n_correct": "54.3", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "9300", "lr": "6.20069e-05", "gnorm": "10.14", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1865"} 2023-01-29 16:42:50 | INFO | train_inner | {"epoch": 5, "update": 4.307, "s2c_loss": "0.996", "loss": "0.69021", "s2c_nll_loss": "0.996", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "9310", "lr": "6.20736e-05", "gnorm": "9.203", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1868"} 2023-01-29 16:42:52 | INFO | train_inner | {"epoch": 5, "update": 4.311, "s2c_loss": "1.143", "loss": "0.79206", "s2c_nll_loss": "1.143", "s2c_accuracy": "82.5", "s2c_total": "64", "s2c_n_correct": "52.8", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "9320", "lr": 
"6.21402e-05", "gnorm": "10.233", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1870"} 2023-01-29 16:42:55 | INFO | train_inner | {"epoch": 5, "update": 4.316, "s2c_loss": "1.422", "loss": "0.98559", "s2c_nll_loss": "1.422", "s2c_accuracy": "82.344", "s2c_total": "64", "s2c_n_correct": "52.7", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "9330", "lr": "6.22069e-05", "gnorm": "9.569", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1873"} 2023-01-29 16:42:57 | INFO | train_inner | {"epoch": 5, "update": 4.321, "s2c_loss": "1.021", "loss": "0.70755", "s2c_nll_loss": "1.021", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "9340", "lr": "6.22736e-05", "gnorm": "9.129", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1875"} 2023-01-29 16:43:00 | INFO | train_inner | {"epoch": 5, "update": 4.325, "s2c_loss": "1.065", "loss": "0.7379", "s2c_nll_loss": "1.065", "s2c_accuracy": "85.156", "s2c_total": "64", "s2c_n_correct": "54.5", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "9350", "lr": "6.23402e-05", "gnorm": "8.904", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "1878"} 2023-01-29 16:43:02 | INFO | train_inner | {"epoch": 5, "update": 4.33, "s2c_loss": "1.087", "loss": "0.75352", "s2c_nll_loss": "1.087", "s2c_accuracy": "84.844", "s2c_total": "64", "s2c_n_correct": "54.3", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "9360", "lr": "6.24069e-05", "gnorm": "9.624", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1880"} 2023-01-29 16:43:05 | INFO | train_inner | {"epoch": 5, "update": 4.334, "s2c_loss": "1.048", "loss": "0.72619", "s2c_nll_loss": "1.048", "s2c_accuracy": "86.25", "s2c_total": "64", "s2c_n_correct": "55.2", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "9370", "lr": 
"6.24735e-05", "gnorm": "8.046", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1883"} 2023-01-29 16:43:07 | INFO | train_inner | {"epoch": 5, "update": 4.339, "s2c_loss": "1.282", "loss": "0.88834", "s2c_nll_loss": "1.282", "s2c_accuracy": "85.938", "s2c_total": "64", "s2c_n_correct": "55", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "9380", "lr": "6.25402e-05", "gnorm": "9.003", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "1885"} 2023-01-29 16:43:10 | INFO | train_inner | {"epoch": 5, "update": 4.344, "s2c_loss": "1.145", "loss": "0.79396", "s2c_nll_loss": "1.145", "s2c_accuracy": "84.219", "s2c_total": "64", "s2c_n_correct": "53.9", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "9390", "lr": "6.26069e-05", "gnorm": "9.36", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "1888"} 2023-01-29 16:43:12 | INFO | train_inner | {"epoch": 5, "update": 4.348, "s2c_loss": "1.134", "loss": "0.78629", "s2c_nll_loss": "1.134", "s2c_accuracy": "85.938", "s2c_total": "64", "s2c_n_correct": "55", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "9400", "lr": "6.26735e-05", "gnorm": "8.893", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "1890"} 2023-01-29 16:43:15 | INFO | train_inner | {"epoch": 5, "update": 4.353, "s2c_loss": "1.006", "loss": "0.69706", "s2c_nll_loss": "1.006", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "9410", "lr": "6.27402e-05", "gnorm": "8.734", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1893"} 2023-01-29 16:43:18 | INFO | train_inner | {"epoch": 5, "update": 4.358, "s2c_loss": "0.958", "loss": "0.6642", "s2c_nll_loss": "0.958", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "9420", "lr": 
"6.28069e-05", "gnorm": "8.917", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "1895"} 2023-01-29 16:43:20 | INFO | train_inner | {"epoch": 5, "update": 4.362, "s2c_loss": "1.073", "loss": "0.74388", "s2c_nll_loss": "1.073", "s2c_accuracy": "85.625", "s2c_total": "64", "s2c_n_correct": "54.8", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "9430", "lr": "6.28735e-05", "gnorm": "10.157", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1898"} 2023-01-29 16:43:23 | INFO | train_inner | {"epoch": 5, "update": 4.367, "s2c_loss": "1.065", "loss": "0.73815", "s2c_nll_loss": "1.065", "s2c_accuracy": "85.469", "s2c_total": "64", "s2c_n_correct": "54.7", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "9440", "lr": "6.29402e-05", "gnorm": "10.018", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "1901"} 2023-01-29 16:43:25 | INFO | train_inner | {"epoch": 5, "update": 4.371, "s2c_loss": "1.116", "loss": "0.7736", "s2c_nll_loss": "1.116", "s2c_accuracy": "84.688", "s2c_total": "64", "s2c_n_correct": "54.2", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "9450", "lr": "6.30068e-05", "gnorm": "9.115", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1903"} 2023-01-29 16:43:28 | INFO | train_inner | {"epoch": 5, "update": 4.376, "s2c_loss": "1.167", "loss": "0.80862", "s2c_nll_loss": "1.167", "s2c_accuracy": "83.906", "s2c_total": "64", "s2c_n_correct": "53.7", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "9460", "lr": "6.30735e-05", "gnorm": "9.521", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "1906"} 2023-01-29 16:43:30 | INFO | train_inner | {"epoch": 5, "update": 4.381, "s2c_loss": "0.966", "loss": "0.66984", "s2c_nll_loss": "0.966", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": 
"9470", "lr": "6.31402e-05", "gnorm": "9.239", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "1908"} 2023-01-29 16:43:33 | INFO | train_inner | {"epoch": 5, "update": 4.385, "s2c_loss": "1.377", "loss": "0.95416", "s2c_nll_loss": "1.377", "s2c_accuracy": "82.188", "s2c_total": "64", "s2c_n_correct": "52.6", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "9480", "lr": "6.32068e-05", "gnorm": "10.335", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "1911"} 2023-01-29 16:43:35 | INFO | train_inner | {"epoch": 5, "update": 4.39, "s2c_loss": "1.083", "loss": "0.7509", "s2c_nll_loss": "1.083", "s2c_accuracy": "84.062", "s2c_total": "64", "s2c_n_correct": "53.8", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "9490", "lr": "6.32735e-05", "gnorm": "10.514", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "1913"} 2023-01-29 16:43:38 | INFO | train_inner | {"epoch": 5, "update": 4.395, "s2c_loss": "1.32", "loss": "0.91505", "s2c_nll_loss": "1.32", "s2c_accuracy": "80.938", "s2c_total": "64", "s2c_n_correct": "51.8", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "9500", "lr": "6.33402e-05", "gnorm": "11.286", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1916"} 2023-01-29 16:43:40 | INFO | train_inner | {"epoch": 5, "update": 4.399, "s2c_loss": "1.148", "loss": "0.79606", "s2c_nll_loss": "1.148", "s2c_accuracy": "84.219", "s2c_total": "64", "s2c_n_correct": "53.9", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "9510", "lr": "6.34068e-05", "gnorm": "9.665", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1918"} 2023-01-29 16:43:43 | INFO | train_inner | {"epoch": 5, "update": 4.404, "s2c_loss": "1.159", "loss": "0.80317", "s2c_nll_loss": "1.159", "s2c_accuracy": "84.688", "s2c_total": "64", "s2c_n_correct": "54.2", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", 
"num_updates": "9520", "lr": "6.34735e-05", "gnorm": "10.193", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "1921"} 2023-01-29 16:43:45 | INFO | train_inner | {"epoch": 5, "update": 4.408, "s2c_loss": "1.171", "loss": "0.81197", "s2c_nll_loss": "1.171", "s2c_accuracy": "83.281", "s2c_total": "64", "s2c_n_correct": "53.3", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "9530", "lr": "6.35402e-05", "gnorm": "10.21", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1923"} 2023-01-29 16:43:48 | INFO | train_inner | {"epoch": 5, "update": 4.413, "s2c_loss": "1.241", "loss": "0.86035", "s2c_nll_loss": "1.241", "s2c_accuracy": "82.812", "s2c_total": "64", "s2c_n_correct": "53", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "9540", "lr": "6.36068e-05", "gnorm": "10.888", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "1926"} 2023-01-29 16:43:50 | INFO | train_inner | {"epoch": 5, "update": 4.418, "s2c_loss": "1.081", "loss": "0.749", "s2c_nll_loss": "1.081", "s2c_accuracy": "85", "s2c_total": "64", "s2c_n_correct": "54.4", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "9550", "lr": "6.36735e-05", "gnorm": "10.048", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "1928"} 2023-01-29 16:43:53 | INFO | train_inner | {"epoch": 5, "update": 4.422, "s2c_loss": "1.304", "loss": "0.90395", "s2c_nll_loss": "1.304", "s2c_accuracy": "83.594", "s2c_total": "64", "s2c_n_correct": "53.5", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "9560", "lr": "6.37401e-05", "gnorm": "9.74", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1931"} 2023-01-29 16:43:55 | INFO | train_inner | {"epoch": 5, "update": 4.427, "s2c_loss": "1.072", "loss": "0.74339", "s2c_nll_loss": "1.072", "s2c_accuracy": "84.688", "s2c_total": "64", "s2c_n_correct": "54.2", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", 
"num_updates": "9570", "lr": "6.38068e-05", "gnorm": "10.832", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1933"} 2023-01-29 16:43:58 | INFO | train_inner | {"epoch": 5, "update": 4.432, "s2c_loss": "1.101", "loss": "0.76286", "s2c_nll_loss": "1.101", "s2c_accuracy": "85.156", "s2c_total": "64", "s2c_n_correct": "54.5", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "9580", "lr": "6.38735e-05", "gnorm": "9.735", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "1936"} 2023-01-29 16:44:00 | INFO | train_inner | {"epoch": 5, "update": 4.436, "s2c_loss": "1.348", "loss": "0.93458", "s2c_nll_loss": "1.348", "s2c_accuracy": "82.656", "s2c_total": "64", "s2c_n_correct": "52.9", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "9590", "lr": "6.39401e-05", "gnorm": "10.487", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1938"} 2023-01-29 16:44:03 | INFO | train_inner | {"epoch": 5, "update": 4.441, "s2c_loss": "1.18", "loss": "0.81763", "s2c_nll_loss": "1.18", "s2c_accuracy": "85", "s2c_total": "64", "s2c_n_correct": "54.4", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "9600", "lr": "6.40068e-05", "gnorm": "9.823", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "1941"} 2023-01-29 16:44:06 | INFO | train_inner | {"epoch": 5, "update": 4.445, "s2c_loss": "1.197", "loss": "0.83002", "s2c_nll_loss": "1.197", "s2c_accuracy": "84.531", "s2c_total": "64", "s2c_n_correct": "54.1", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "9610", "lr": "6.40735e-05", "gnorm": "9.713", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1943"} 2023-01-29 16:44:08 | INFO | train_inner | {"epoch": 5, "update": 4.45, "s2c_loss": "1.189", "loss": "0.82398", "s2c_nll_loss": "1.189", "s2c_accuracy": "85", "s2c_total": "64", "s2c_n_correct": "54.4", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": 
"64", "num_updates": "9620", "lr": "6.41401e-05", "gnorm": "9.572", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "1946"} 2023-01-29 16:44:11 | INFO | train_inner | {"epoch": 5, "update": 4.455, "s2c_loss": "1.135", "loss": "0.78681", "s2c_nll_loss": "1.135", "s2c_accuracy": "82.969", "s2c_total": "64", "s2c_n_correct": "53.1", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "9630", "lr": "6.42068e-05", "gnorm": "10.672", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "1948"} 2023-01-29 16:44:13 | INFO | train_inner | {"epoch": 5, "update": 4.459, "s2c_loss": "1.089", "loss": "0.75459", "s2c_nll_loss": "1.089", "s2c_accuracy": "82.5", "s2c_total": "64", "s2c_n_correct": "52.8", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "9640", "lr": "6.42735e-05", "gnorm": "11.282", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1951"} 2023-01-29 16:44:16 | INFO | train_inner | {"epoch": 5, "update": 4.464, "s2c_loss": "1.188", "loss": "0.82359", "s2c_nll_loss": "1.188", "s2c_accuracy": "85.938", "s2c_total": "64", "s2c_n_correct": "55", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "9650", "lr": "6.43401e-05", "gnorm": "9.754", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "1954"} 2023-01-29 16:44:18 | INFO | train_inner | {"epoch": 5, "update": 4.469, "s2c_loss": "1.285", "loss": "0.89064", "s2c_nll_loss": "1.285", "s2c_accuracy": "82.344", "s2c_total": "64", "s2c_n_correct": "52.7", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "9660", "lr": "6.44068e-05", "gnorm": "10.488", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "1956"} 2023-01-29 16:44:21 | INFO | train_inner | {"epoch": 5, "update": 4.473, "s2c_loss": "1.129", "loss": "0.78267", "s2c_nll_loss": "1.129", "s2c_accuracy": "84.375", "s2c_total": "64", "s2c_n_correct": "54", "wps": "254.2", "ups": "3.97", "wpb": "64", 
"bsz": "64", "num_updates": "9670", "lr": "6.44734e-05", "gnorm": "9.774", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1959"} 2023-01-29 16:44:23 | INFO | train_inner | {"epoch": 5, "update": 4.478, "s2c_loss": "1.033", "loss": "0.71624", "s2c_nll_loss": "1.033", "s2c_accuracy": "85.312", "s2c_total": "64", "s2c_n_correct": "54.6", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "9680", "lr": "6.45401e-05", "gnorm": "10.075", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "1961"} 2023-01-29 16:44:26 | INFO | train_inner | {"epoch": 5, "update": 4.482, "s2c_loss": "1.092", "loss": "0.7571", "s2c_nll_loss": "1.092", "s2c_accuracy": "84.219", "s2c_total": "64", "s2c_n_correct": "53.9", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "9690", "lr": "6.46068e-05", "gnorm": "11.279", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "1964"} 2023-01-29 16:44:28 | INFO | train_inner | {"epoch": 5, "update": 4.487, "s2c_loss": "1.281", "loss": "0.88772", "s2c_nll_loss": "1.281", "s2c_accuracy": "83.281", "s2c_total": "64", "s2c_n_correct": "53.3", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "9700", "lr": "6.46734e-05", "gnorm": "12.174", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "1966"} 2023-01-29 16:44:31 | INFO | train_inner | {"epoch": 5, "update": 4.492, "s2c_loss": "1.192", "loss": "0.82636", "s2c_nll_loss": "1.192", "s2c_accuracy": "81.562", "s2c_total": "64", "s2c_n_correct": "52.2", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "9710", "lr": "6.47401e-05", "gnorm": "10.73", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1969"} 2023-01-29 16:44:33 | INFO | train_inner | {"epoch": 5, "update": 4.496, "s2c_loss": "0.949", "loss": "0.65803", "s2c_nll_loss": "0.949", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "252", "ups": "3.94", 
"wpb": "64", "bsz": "64", "num_updates": "9720", "lr": "6.48068e-05", "gnorm": "9.611", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "1971"} 2023-01-29 16:44:36 | INFO | train_inner | {"epoch": 5, "update": 4.501, "s2c_loss": "0.952", "loss": "0.65955", "s2c_nll_loss": "0.952", "s2c_accuracy": "84.375", "s2c_total": "64", "s2c_n_correct": "54", "wps": "245.5", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "9730", "lr": "6.48734e-05", "gnorm": "9.755", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "1974"} 2023-01-29 16:44:39 | INFO | train_inner | {"epoch": 5, "update": 4.506, "s2c_loss": "0.995", "loss": "0.68973", "s2c_nll_loss": "0.995", "s2c_accuracy": "84.531", "s2c_total": "64", "s2c_n_correct": "54.1", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "9740", "lr": "6.49401e-05", "gnorm": "9.94", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "1976"} 2023-01-29 16:44:41 | INFO | train_inner | {"epoch": 5, "update": 4.51, "s2c_loss": "1.292", "loss": "0.89546", "s2c_nll_loss": "1.292", "s2c_accuracy": "82.188", "s2c_total": "64", "s2c_n_correct": "52.6", "wps": "246.6", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "9750", "lr": "6.50067e-05", "gnorm": "9.777", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "1979"} 2023-01-29 16:44:44 | INFO | train_inner | {"epoch": 5, "update": 4.515, "s2c_loss": "0.947", "loss": "0.65644", "s2c_nll_loss": "0.947", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "9760", "lr": "6.50734e-05", "gnorm": "9.39", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "1982"} 2023-01-29 16:44:46 | INFO | train_inner | {"epoch": 5, "update": 4.519, "s2c_loss": "0.931", "loss": "0.64559", "s2c_nll_loss": "0.931", "s2c_accuracy": "85.312", "s2c_total": "64", "s2c_n_correct": "54.6", "wps": "254.4", "ups": 
"3.97", "wpb": "64", "bsz": "64", "num_updates": "9770", "lr": "6.51401e-05", "gnorm": "9.501", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1984"} 2023-01-29 16:44:49 | INFO | train_inner | {"epoch": 5, "update": 4.524, "s2c_loss": "1.156", "loss": "0.80145", "s2c_nll_loss": "1.156", "s2c_accuracy": "84.531", "s2c_total": "64", "s2c_n_correct": "54.1", "wps": "259.6", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "9780", "lr": "6.52067e-05", "gnorm": "9.793", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "1987"} 2023-01-29 16:44:51 | INFO | train_inner | {"epoch": 5, "update": 4.529, "s2c_loss": "0.851", "loss": "0.58971", "s2c_nll_loss": "0.851", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "9790", "lr": "6.52734e-05", "gnorm": "8.974", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1989"} 2023-01-29 16:44:54 | INFO | train_inner | {"epoch": 5, "update": 4.533, "s2c_loss": "0.923", "loss": "0.63973", "s2c_nll_loss": "0.923", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "9800", "lr": "6.53401e-05", "gnorm": "8.108", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "1992"} 2023-01-29 16:44:56 | INFO | train_inner | {"epoch": 5, "update": 4.538, "s2c_loss": "1.01", "loss": "0.70039", "s2c_nll_loss": "1.01", "s2c_accuracy": "85.312", "s2c_total": "64", "s2c_n_correct": "54.6", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "9810", "lr": "6.54067e-05", "gnorm": "9.489", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "1994"} 2023-01-29 16:44:59 | INFO | train_inner | {"epoch": 5, "update": 4.543, "s2c_loss": "1.18", "loss": "0.81761", "s2c_nll_loss": "1.18", "s2c_accuracy": "85.469", "s2c_total": "64", "s2c_n_correct": "54.7", "wps": "257.2", 
"ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "9820", "lr": "6.54734e-05", "gnorm": "9.591", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "1997"} 2023-01-29 16:45:01 | INFO | train_inner | {"epoch": 5, "update": 4.547, "s2c_loss": "0.801", "loss": "0.55522", "s2c_nll_loss": "0.801", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "9830", "lr": "6.55401e-05", "gnorm": "10.022", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "1999"} 2023-01-29 16:45:04 | INFO | train_inner | {"epoch": 5, "update": 4.552, "s2c_loss": "1.065", "loss": "0.73789", "s2c_nll_loss": "1.065", "s2c_accuracy": "84.062", "s2c_total": "64", "s2c_n_correct": "53.8", "wps": "245", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "9840", "lr": "6.56067e-05", "gnorm": "11.234", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2002"} 2023-01-29 16:45:07 | INFO | train_inner | {"epoch": 5, "update": 4.556, "s2c_loss": "1", "loss": "0.69326", "s2c_nll_loss": "1", "s2c_accuracy": "83.906", "s2c_total": "64", "s2c_n_correct": "53.7", "wps": "242.7", "ups": "3.79", "wpb": "64", "bsz": "64", "num_updates": "9850", "lr": "6.56734e-05", "gnorm": "11.521", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "2005"} 2023-01-29 16:45:09 | INFO | train_inner | {"epoch": 5, "update": 4.561, "s2c_loss": "1.132", "loss": "0.78447", "s2c_nll_loss": "1.132", "s2c_accuracy": "84.531", "s2c_total": "64", "s2c_n_correct": "54.1", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "9860", "lr": "6.574e-05", "gnorm": "9.597", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2007"} 2023-01-29 16:45:12 | INFO | train_inner | {"epoch": 5, "update": 4.566, "s2c_loss": "1.091", "loss": "0.75607", "s2c_nll_loss": "1.091", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "248.7", 
"ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "9870", "lr": "6.58067e-05", "gnorm": "9.301", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2010"} 2023-01-29 16:45:14 | INFO | train_inner | {"epoch": 5, "update": 4.57, "s2c_loss": "1.114", "loss": "0.77187", "s2c_nll_loss": "1.114", "s2c_accuracy": "84.062", "s2c_total": "64", "s2c_n_correct": "53.8", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "9880", "lr": "6.58734e-05", "gnorm": "10.527", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2012"} 2023-01-29 16:45:17 | INFO | train_inner | {"epoch": 5, "update": 4.575, "s2c_loss": "1.444", "loss": "1.00125", "s2c_nll_loss": "1.444", "s2c_accuracy": "83.125", "s2c_total": "64", "s2c_n_correct": "53.2", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "9890", "lr": "6.594e-05", "gnorm": "9.779", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2015"} 2023-01-29 16:45:19 | INFO | train_inner | {"epoch": 5, "update": 4.58, "s2c_loss": "0.952", "loss": "0.65962", "s2c_nll_loss": "0.952", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "248", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "9900", "lr": "6.60067e-05", "gnorm": "9.175", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2017"} 2023-01-29 16:45:22 | INFO | train_inner | {"epoch": 5, "update": 4.584, "s2c_loss": "0.936", "loss": "0.6489", "s2c_nll_loss": "0.936", "s2c_accuracy": "86.094", "s2c_total": "64", "s2c_n_correct": "55.1", "wps": "243.5", "ups": "3.8", "wpb": "64", "bsz": "64", "num_updates": "9910", "lr": "6.60734e-05", "gnorm": "10.103", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2020"} 2023-01-29 16:45:25 | INFO | train_inner | {"epoch": 5, "update": 4.589, "s2c_loss": "0.972", "loss": "0.67378", "s2c_nll_loss": "0.972", "s2c_accuracy": "86.25", "s2c_total": "64", "s2c_n_correct": "55.2", "wps": 
"244.2", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "9920", "lr": "6.614e-05", "gnorm": "10.585", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2023"} 2023-01-29 16:45:27 | INFO | train_inner | {"epoch": 5, "update": 4.593, "s2c_loss": "1.219", "loss": "0.84481", "s2c_nll_loss": "1.219", "s2c_accuracy": "83.125", "s2c_total": "64", "s2c_n_correct": "53.2", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "9930", "lr": "6.62067e-05", "gnorm": "9.555", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2025"} 2023-01-29 16:45:30 | INFO | train_inner | {"epoch": 5, "update": 4.598, "s2c_loss": "0.974", "loss": "0.67508", "s2c_nll_loss": "0.974", "s2c_accuracy": "86.094", "s2c_total": "64", "s2c_n_correct": "55.1", "wps": "244.7", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "9940", "lr": "6.62734e-05", "gnorm": "12.496", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2028"} 2023-01-29 16:45:32 | INFO | train_inner | {"epoch": 5, "update": 4.603, "s2c_loss": "1.03", "loss": "0.71383", "s2c_nll_loss": "1.03", "s2c_accuracy": "85.312", "s2c_total": "64", "s2c_n_correct": "54.6", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "9950", "lr": "6.634e-05", "gnorm": "9.718", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2030"} 2023-01-29 16:45:35 | INFO | train_inner | {"epoch": 5, "update": 4.607, "s2c_loss": "1.323", "loss": "0.91693", "s2c_nll_loss": "1.323", "s2c_accuracy": "83.75", "s2c_total": "64", "s2c_n_correct": "53.6", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "9960", "lr": "6.64067e-05", "gnorm": "10.005", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2033"} 2023-01-29 16:45:38 | INFO | train_inner | {"epoch": 5, "update": 4.612, "s2c_loss": "0.883", "loss": "0.61196", "s2c_nll_loss": "0.883", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", 
"wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "9970", "lr": "6.64733e-05", "gnorm": "9.454", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2036"} 2023-01-29 16:45:40 | INFO | train_inner | {"epoch": 5, "update": 4.617, "s2c_loss": "1.021", "loss": "0.7077", "s2c_nll_loss": "1.021", "s2c_accuracy": "85.312", "s2c_total": "64", "s2c_n_correct": "54.6", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "9980", "lr": "6.654e-05", "gnorm": "9.569", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2038"} 2023-01-29 16:45:43 | INFO | train_inner | {"epoch": 5, "update": 4.621, "s2c_loss": "0.933", "loss": "0.64645", "s2c_nll_loss": "0.933", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "258.3", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "9990", "lr": "6.66067e-05", "gnorm": "9.24", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2041"} 2023-01-29 16:45:45 | INFO | train_inner | {"epoch": 5, "update": 4.626, "s2c_loss": "1.02", "loss": "0.70669", "s2c_nll_loss": "1.02", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "10000", "lr": "6.66733e-05", "gnorm": "10.106", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2043"} 2023-01-29 16:45:48 | INFO | train_inner | {"epoch": 5, "update": 4.63, "s2c_loss": "1.209", "loss": "0.83821", "s2c_nll_loss": "1.209", "s2c_accuracy": "84.844", "s2c_total": "64", "s2c_n_correct": "54.3", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "10010", "lr": "6.674e-05", "gnorm": "9.725", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2046"} 2023-01-29 16:45:50 | INFO | train_inner | {"epoch": 5, "update": 4.635, "s2c_loss": "1.035", "loss": "0.71746", "s2c_nll_loss": "1.035", "s2c_accuracy": "84.688", "s2c_total": "64", "s2c_n_correct": 
"54.2", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "10020", "lr": "6.68067e-05", "gnorm": "8.845", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2048"} 2023-01-29 16:45:53 | INFO | train_inner | {"epoch": 5, "update": 4.64, "s2c_loss": "0.846", "loss": "0.58625", "s2c_nll_loss": "0.846", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "10030", "lr": "6.68733e-05", "gnorm": "8.865", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2051"} 2023-01-29 16:45:55 | INFO | train_inner | {"epoch": 5, "update": 4.644, "s2c_loss": "1.097", "loss": "0.76007", "s2c_nll_loss": "1.097", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "10040", "lr": "6.694e-05", "gnorm": "9.156", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2053"} 2023-01-29 16:45:58 | INFO | train_inner | {"epoch": 5, "update": 4.649, "s2c_loss": "0.897", "loss": "0.6216", "s2c_nll_loss": "0.897", "s2c_accuracy": "86.094", "s2c_total": "64", "s2c_n_correct": "55.1", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "10050", "lr": "6.70066e-05", "gnorm": "9.442", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2056"} 2023-01-29 16:46:00 | INFO | train_inner | {"epoch": 5, "update": 4.654, "s2c_loss": "0.995", "loss": "0.68986", "s2c_nll_loss": "0.995", "s2c_accuracy": "86.25", "s2c_total": "64", "s2c_n_correct": "55.2", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "10060", "lr": "6.70733e-05", "gnorm": "9.726", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.5", "wall": "2058"} 2023-01-29 16:46:03 | INFO | train_inner | {"epoch": 5, "update": 4.658, "s2c_loss": "1.121", "loss": "0.77732", "s2c_nll_loss": "1.121", "s2c_accuracy": "83.125", "s2c_total": "64", 
"s2c_n_correct": "53.2", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "10070", "lr": "6.714e-05", "gnorm": "11.061", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2061"} 2023-01-29 16:46:05 | INFO | train_inner | {"epoch": 5, "update": 4.663, "s2c_loss": "1.229", "loss": "0.85177", "s2c_nll_loss": "1.229", "s2c_accuracy": "81.094", "s2c_total": "64", "s2c_n_correct": "51.9", "wps": "258.2", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "10080", "lr": "6.72066e-05", "gnorm": "11.047", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2063"} 2023-01-29 16:46:08 | INFO | train_inner | {"epoch": 5, "update": 4.667, "s2c_loss": "1.296", "loss": "0.89843", "s2c_nll_loss": "1.296", "s2c_accuracy": "81.094", "s2c_total": "64", "s2c_n_correct": "51.9", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "10090", "lr": "6.72733e-05", "gnorm": "10.745", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2066"} 2023-01-29 16:46:10 | INFO | train_inner | {"epoch": 5, "update": 4.672, "s2c_loss": "1.15", "loss": "0.79732", "s2c_nll_loss": "1.15", "s2c_accuracy": "85", "s2c_total": "64", "s2c_n_correct": "54.4", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "10100", "lr": "6.734e-05", "gnorm": "9.819", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2068"} 2023-01-29 16:46:13 | INFO | train_inner | {"epoch": 5, "update": 4.677, "s2c_loss": "1.191", "loss": "0.8255", "s2c_nll_loss": "1.191", "s2c_accuracy": "82.656", "s2c_total": "64", "s2c_n_correct": "52.9", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "10110", "lr": "6.74066e-05", "gnorm": "10.022", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2071"} 2023-01-29 16:46:16 | INFO | train_inner | {"epoch": 5, "update": 4.681, "s2c_loss": "1.178", "loss": "0.81639", "s2c_nll_loss": "1.178", "s2c_accuracy": "82.344", "s2c_total": 
"64", "s2c_n_correct": "52.7", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "10120", "lr": "6.74733e-05", "gnorm": "10.389", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2074"} 2023-01-29 16:46:18 | INFO | train_inner | {"epoch": 5, "update": 4.686, "s2c_loss": "1.13", "loss": "0.78308", "s2c_nll_loss": "1.13", "s2c_accuracy": "82.812", "s2c_total": "64", "s2c_n_correct": "53", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "10130", "lr": "6.754e-05", "gnorm": "9.501", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2076"} 2023-01-29 16:46:21 | INFO | train_inner | {"epoch": 5, "update": 4.691, "s2c_loss": "0.94", "loss": "0.65151", "s2c_nll_loss": "0.94", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "10140", "lr": "6.76066e-05", "gnorm": "9.572", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2079"} 2023-01-29 16:46:23 | INFO | train_inner | {"epoch": 5, "update": 4.695, "s2c_loss": "0.917", "loss": "0.63567", "s2c_nll_loss": "0.917", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "10150", "lr": "6.76733e-05", "gnorm": "10.291", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2081"} 2023-01-29 16:46:26 | INFO | train_inner | {"epoch": 5, "update": 4.7, "s2c_loss": "1.139", "loss": "0.78961", "s2c_nll_loss": "1.139", "s2c_accuracy": "84.844", "s2c_total": "64", "s2c_n_correct": "54.3", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "10160", "lr": "6.77399e-05", "gnorm": "9.846", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2084"} 2023-01-29 16:46:28 | INFO | train_inner | {"epoch": 5, "update": 4.704, "s2c_loss": "1.155", "loss": "0.80092", "s2c_nll_loss": "1.155", "s2c_accuracy": "84.062", 
"s2c_total": "64", "s2c_n_correct": "53.8", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "10170", "lr": "6.78066e-05", "gnorm": "10.995", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2086"} 2023-01-29 16:46:31 | INFO | train_inner | {"epoch": 5, "update": 4.709, "s2c_loss": "1.216", "loss": "0.8431", "s2c_nll_loss": "1.216", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "10180", "lr": "6.78733e-05", "gnorm": "10.008", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2089"} 2023-01-29 16:46:33 | INFO | train_inner | {"epoch": 5, "update": 4.714, "s2c_loss": "1.2", "loss": "0.83201", "s2c_nll_loss": "1.2", "s2c_accuracy": "84.844", "s2c_total": "64", "s2c_n_correct": "54.3", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "10190", "lr": "6.79399e-05", "gnorm": "11.039", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2091"} 2023-01-29 16:46:36 | INFO | train_inner | {"epoch": 5, "update": 4.718, "s2c_loss": "0.919", "loss": "0.63673", "s2c_nll_loss": "0.919", "s2c_accuracy": "86.25", "s2c_total": "64", "s2c_n_correct": "55.2", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "10200", "lr": "6.80066e-05", "gnorm": "10.513", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2094"} 2023-01-29 16:46:38 | INFO | train_inner | {"epoch": 5, "update": 4.723, "s2c_loss": "1.237", "loss": "0.85721", "s2c_nll_loss": "1.237", "s2c_accuracy": "82.969", "s2c_total": "64", "s2c_n_correct": "53.1", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "10210", "lr": "6.80733e-05", "gnorm": "10.648", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2096"} 2023-01-29 16:46:41 | INFO | train_inner | {"epoch": 5, "update": 4.728, "s2c_loss": "1.213", "loss": "0.84052", "s2c_nll_loss": "1.213", 
"s2c_accuracy": "80.781", "s2c_total": "64", "s2c_n_correct": "51.7", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "10220", "lr": "6.81399e-05", "gnorm": "9.66", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2099"} 2023-01-29 16:46:43 | INFO | train_inner | {"epoch": 5, "update": 4.732, "s2c_loss": "0.871", "loss": "0.6038", "s2c_nll_loss": "0.871", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "10230", "lr": "6.82066e-05", "gnorm": "8.977", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2101"} 2023-01-29 16:46:46 | INFO | train_inner | {"epoch": 5, "update": 4.737, "s2c_loss": "1.151", "loss": "0.79814", "s2c_nll_loss": "1.151", "s2c_accuracy": "83.75", "s2c_total": "64", "s2c_n_correct": "53.6", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "10240", "lr": "6.82733e-05", "gnorm": "9.839", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2104"} 2023-01-29 16:46:48 | INFO | train_inner | {"epoch": 5, "update": 4.741, "s2c_loss": "1.114", "loss": "0.77209", "s2c_nll_loss": "1.114", "s2c_accuracy": "84.688", "s2c_total": "64", "s2c_n_correct": "54.2", "wps": "259.6", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "10250", "lr": "6.83399e-05", "gnorm": "10.337", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2106"} 2023-01-29 16:46:51 | INFO | train_inner | {"epoch": 5, "update": 4.746, "s2c_loss": "0.982", "loss": "0.68043", "s2c_nll_loss": "0.982", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "246.3", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "10260", "lr": "6.84066e-05", "gnorm": "9.439", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2109"} 2023-01-29 16:46:54 | INFO | train_inner | {"epoch": 5, "update": 4.751, "s2c_loss": "0.867", "loss": "0.60111", "s2c_nll_loss": 
"0.867", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "10270", "lr": "6.84732e-05", "gnorm": "10.541", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2111"} 2023-01-29 16:46:56 | INFO | train_inner | {"epoch": 5, "update": 4.755, "s2c_loss": "1.071", "loss": "0.74204", "s2c_nll_loss": "1.071", "s2c_accuracy": "84.844", "s2c_total": "64", "s2c_n_correct": "54.3", "wps": "246.3", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "10280", "lr": "6.85399e-05", "gnorm": "10.155", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "2114"} 2023-01-29 16:46:59 | INFO | train_inner | {"epoch": 5, "update": 4.76, "s2c_loss": "0.919", "loss": "0.63691", "s2c_nll_loss": "0.919", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "10290", "lr": "6.86066e-05", "gnorm": "9.884", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2117"} 2023-01-29 16:47:01 | INFO | train_inner | {"epoch": 5, "update": 4.765, "s2c_loss": "0.801", "loss": "0.55552", "s2c_nll_loss": "0.801", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "10300", "lr": "6.86732e-05", "gnorm": "9.42", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2119"} 2023-01-29 16:47:04 | INFO | train_inner | {"epoch": 5, "update": 4.769, "s2c_loss": "0.937", "loss": "0.64954", "s2c_nll_loss": "0.937", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "10310", "lr": "6.87399e-05", "gnorm": "10.418", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2122"} 2023-01-29 16:47:06 | INFO | train_inner | {"epoch": 5, "update": 4.774, "s2c_loss": "1.009", "loss": "0.69925", 
"s2c_nll_loss": "1.009", "s2c_accuracy": "83.125", "s2c_total": "64", "s2c_n_correct": "53.2", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "10320", "lr": "6.88066e-05", "gnorm": "9.994", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2124"} 2023-01-29 16:47:09 | INFO | train_inner | {"epoch": 5, "update": 4.778, "s2c_loss": "1.393", "loss": "0.96535", "s2c_nll_loss": "1.393", "s2c_accuracy": "82.5", "s2c_total": "64", "s2c_n_correct": "52.8", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "10330", "lr": "6.88732e-05", "gnorm": "11.225", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2127"} 2023-01-29 16:47:11 | INFO | train_inner | {"epoch": 5, "update": 4.783, "s2c_loss": "0.934", "loss": "0.64713", "s2c_nll_loss": "0.934", "s2c_accuracy": "86.562", "s2c_total": "64", "s2c_n_correct": "55.4", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "10340", "lr": "6.89399e-05", "gnorm": "10.032", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2129"} 2023-01-29 16:47:14 | INFO | train_inner | {"epoch": 5, "update": 4.788, "s2c_loss": "0.948", "loss": "0.65691", "s2c_nll_loss": "0.948", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "10350", "lr": "6.90065e-05", "gnorm": "9.511", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2132"} 2023-01-29 16:47:16 | INFO | train_inner | {"epoch": 5, "update": 4.792, "s2c_loss": "1.092", "loss": "0.75672", "s2c_nll_loss": "1.092", "s2c_accuracy": "82.812", "s2c_total": "64", "s2c_n_correct": "53", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "10360", "lr": "6.90732e-05", "gnorm": "11.59", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2134"} 2023-01-29 16:47:19 | INFO | train_inner | {"epoch": 5, "update": 4.797, "s2c_loss": "1.082", "loss": 
"0.75017", "s2c_nll_loss": "1.082", "s2c_accuracy": "83.906", "s2c_total": "64", "s2c_n_correct": "53.7", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "10370", "lr": "6.91399e-05", "gnorm": "10.558", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2137"} 2023-01-29 16:47:21 | INFO | train_inner | {"epoch": 5, "update": 4.802, "s2c_loss": "1.017", "loss": "0.705", "s2c_nll_loss": "1.017", "s2c_accuracy": "84.531", "s2c_total": "64", "s2c_n_correct": "54.1", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "10380", "lr": "6.92065e-05", "gnorm": "9.86", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2139"} 2023-01-29 16:47:24 | INFO | train_inner | {"epoch": 5, "update": 4.806, "s2c_loss": "0.947", "loss": "0.6563", "s2c_nll_loss": "0.947", "s2c_accuracy": "86.562", "s2c_total": "64", "s2c_n_correct": "55.4", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "10390", "lr": "6.92732e-05", "gnorm": "9.427", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2142"} 2023-01-29 16:47:26 | INFO | train_inner | {"epoch": 5, "update": 4.811, "s2c_loss": "1.269", "loss": "0.87956", "s2c_nll_loss": "1.269", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "10400", "lr": "6.93399e-05", "gnorm": "11.014", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2144"} 2023-01-29 16:47:29 | INFO | train_inner | {"epoch": 5, "update": 4.815, "s2c_loss": "0.821", "loss": "0.56915", "s2c_nll_loss": "0.821", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "10410", "lr": "6.94065e-05", "gnorm": "9.139", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2147"} 2023-01-29 16:47:32 | INFO | train_inner | {"epoch": 5, "update": 4.82, "s2c_loss": 
"0.948", "loss": "0.65731", "s2c_nll_loss": "0.948", "s2c_accuracy": "85.625", "s2c_total": "64", "s2c_n_correct": "54.8", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "10420", "lr": "6.94732e-05", "gnorm": "9.941", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2149"} 2023-01-29 16:47:34 | INFO | train_inner | {"epoch": 5, "update": 4.825, "s2c_loss": "1.186", "loss": "0.82178", "s2c_nll_loss": "1.186", "s2c_accuracy": "81.094", "s2c_total": "64", "s2c_n_correct": "51.9", "wps": "239.2", "ups": "3.74", "wpb": "64", "bsz": "64", "num_updates": "10430", "lr": "6.95399e-05", "gnorm": "10.965", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2152"} 2023-01-29 16:47:37 | INFO | train_inner | {"epoch": 5, "update": 4.829, "s2c_loss": "1.069", "loss": "0.74098", "s2c_nll_loss": "1.069", "s2c_accuracy": "86.25", "s2c_total": "64", "s2c_n_correct": "55.2", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "10440", "lr": "6.96065e-05", "gnorm": "11.188", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2155"} 2023-01-29 16:47:39 | INFO | train_inner | {"epoch": 5, "update": 4.834, "s2c_loss": "0.942", "loss": "0.65301", "s2c_nll_loss": "0.942", "s2c_accuracy": "86.562", "s2c_total": "64", "s2c_n_correct": "55.4", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "10450", "lr": "6.96732e-05", "gnorm": "9.675", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2157"} 2023-01-29 16:47:42 | INFO | train_inner | {"epoch": 5, "update": 4.839, "s2c_loss": "0.942", "loss": "0.65297", "s2c_nll_loss": "0.942", "s2c_accuracy": "85.156", "s2c_total": "64", "s2c_n_correct": "54.5", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "10460", "lr": "6.97398e-05", "gnorm": "11.257", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2160"} 2023-01-29 16:47:44 | INFO | train_inner | {"epoch": 5, "update": 
4.843, "s2c_loss": "0.868", "loss": "0.60175", "s2c_nll_loss": "0.868", "s2c_accuracy": "86.562", "s2c_total": "64", "s2c_n_correct": "55.4", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "10470", "lr": "6.98065e-05", "gnorm": "10.009", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2162"} 2023-01-29 16:47:47 | INFO | train_inner | {"epoch": 5, "update": 4.848, "s2c_loss": "0.894", "loss": "0.61944", "s2c_nll_loss": "0.894", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "241.7", "ups": "3.78", "wpb": "64", "bsz": "64", "num_updates": "10480", "lr": "6.98732e-05", "gnorm": "9.44", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2165"} 2023-01-29 16:47:50 | INFO | train_inner | {"epoch": 5, "update": 4.852, "s2c_loss": "1.141", "loss": "0.79069", "s2c_nll_loss": "1.141", "s2c_accuracy": "85.312", "s2c_total": "64", "s2c_n_correct": "54.6", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "10490", "lr": "6.99398e-05", "gnorm": "9.476", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2168"} 2023-01-29 16:47:52 | INFO | train_inner | {"epoch": 5, "update": 4.857, "s2c_loss": "0.94", "loss": "0.65129", "s2c_nll_loss": "0.94", "s2c_accuracy": "85.312", "s2c_total": "64", "s2c_n_correct": "54.6", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "10500", "lr": "7.00065e-05", "gnorm": "9.074", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2170"} 2023-01-29 16:47:55 | INFO | train_inner | {"epoch": 5, "update": 4.862, "s2c_loss": "0.908", "loss": "0.62965", "s2c_nll_loss": "0.908", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "10510", "lr": "7.00732e-05", "gnorm": "8.859", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2173"} 2023-01-29 16:47:57 | INFO | train_inner | {"epoch": 
5, "update": 4.866, "s2c_loss": "0.811", "loss": "0.56189", "s2c_nll_loss": "0.811", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "10520", "lr": "7.01398e-05", "gnorm": "8.791", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2175"} 2023-01-29 16:48:00 | INFO | train_inner | {"epoch": 5, "update": 4.871, "s2c_loss": "0.905", "loss": "0.62737", "s2c_nll_loss": "0.905", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "10530", "lr": "7.02065e-05", "gnorm": "8.871", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2178"} 2023-01-29 16:48:02 | INFO | train_inner | {"epoch": 5, "update": 4.876, "s2c_loss": "0.891", "loss": "0.61784", "s2c_nll_loss": "0.891", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "10540", "lr": "7.02732e-05", "gnorm": "9.068", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2180"} 2023-01-29 16:48:05 | INFO | train_inner | {"epoch": 5, "update": 4.88, "s2c_loss": "0.917", "loss": "0.63546", "s2c_nll_loss": "0.917", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "240.2", "ups": "3.75", "wpb": "64", "bsz": "64", "num_updates": "10550", "lr": "7.03398e-05", "gnorm": "9.401", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "2183"} 2023-01-29 16:48:08 | INFO | train_inner | {"epoch": 5, "update": 4.885, "s2c_loss": "1.136", "loss": "0.78752", "s2c_nll_loss": "1.136", "s2c_accuracy": "84.062", "s2c_total": "64", "s2c_n_correct": "53.8", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "10560", "lr": "7.04065e-05", "gnorm": "9.716", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2185"} 2023-01-29 16:48:10 | INFO | 
train_inner | {"epoch": 5, "update": 4.889, "s2c_loss": "0.892", "loss": "0.6185", "s2c_nll_loss": "0.892", "s2c_accuracy": "86.25", "s2c_total": "64", "s2c_n_correct": "55.2", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "10570", "lr": "7.04731e-05", "gnorm": "10.234", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2188"} 2023-01-29 16:48:13 | INFO | train_inner | {"epoch": 5, "update": 4.894, "s2c_loss": "0.791", "loss": "0.5485", "s2c_nll_loss": "0.791", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "10580", "lr": "7.05398e-05", "gnorm": "10.307", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2191"} 2023-01-29 16:48:15 | INFO | train_inner | {"epoch": 5, "update": 4.899, "s2c_loss": "1.118", "loss": "0.775", "s2c_nll_loss": "1.118", "s2c_accuracy": "85", "s2c_total": "64", "s2c_n_correct": "54.4", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "10590", "lr": "7.06065e-05", "gnorm": "11.637", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2193"} 2023-01-29 16:48:18 | INFO | train_inner | {"epoch": 5, "update": 4.903, "s2c_loss": "0.938", "loss": "0.65033", "s2c_nll_loss": "0.938", "s2c_accuracy": "85.938", "s2c_total": "64", "s2c_n_correct": "55", "wps": "244.2", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "10600", "lr": "7.06731e-05", "gnorm": "9.646", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "2196"} 2023-01-29 16:48:20 | INFO | train_inner | {"epoch": 5, "update": 4.908, "s2c_loss": "0.969", "loss": "0.67165", "s2c_nll_loss": "0.969", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "10610", "lr": "7.07398e-05", "gnorm": "9.532", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2198"} 2023-01-29 16:48:23 | 
INFO | train_inner | {"epoch": 5, "update": 4.913, "s2c_loss": "0.802", "loss": "0.55558", "s2c_nll_loss": "0.802", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "10620", "lr": "7.08065e-05", "gnorm": "8.538", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2201"} 2023-01-29 16:48:25 | INFO | train_inner | {"epoch": 5, "update": 4.917, "s2c_loss": "0.977", "loss": "0.67734", "s2c_nll_loss": "0.977", "s2c_accuracy": "85.156", "s2c_total": "64", "s2c_n_correct": "54.5", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "10630", "lr": "7.08731e-05", "gnorm": "9.695", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2203"} 2023-01-29 16:48:28 | INFO | train_inner | {"epoch": 5, "update": 4.922, "s2c_loss": "0.924", "loss": "0.64046", "s2c_nll_loss": "0.924", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "10640", "lr": "7.09398e-05", "gnorm": "9.295", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2206"} 2023-01-29 16:48:30 | INFO | train_inner | {"epoch": 5, "update": 4.926, "s2c_loss": "1.047", "loss": "0.72506", "s2c_nll_loss": "1.047", "s2c_accuracy": "82.575", "s2c_total": "63.7", "s2c_n_correct": "52.6", "wps": "256.6", "ups": "4.03", "wpb": "63.7", "bsz": "63.7", "num_updates": "10650", "lr": "7.10064e-05", "gnorm": "11.968", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2208"} 2023-01-29 16:48:33 | INFO | train_inner | {"epoch": 5, "update": 4.931, "s2c_loss": "0.903", "loss": "0.62625", "s2c_nll_loss": "0.903", "s2c_accuracy": "86.562", "s2c_total": "64", "s2c_n_correct": "55.4", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "10660", "lr": "7.10731e-05", "gnorm": "10.057", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": 
"2211"} 2023-01-29 16:48:35 | INFO | train_inner | {"epoch": 5, "update": 4.936, "s2c_loss": "0.955", "loss": "0.66168", "s2c_nll_loss": "0.955", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "10670", "lr": "7.11398e-05", "gnorm": "10.91", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2213"} 2023-01-29 16:48:38 | INFO | train_inner | {"epoch": 5, "update": 4.94, "s2c_loss": "0.875", "loss": "0.60649", "s2c_nll_loss": "0.875", "s2c_accuracy": "86.094", "s2c_total": "64", "s2c_n_correct": "55.1", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "10680", "lr": "7.12064e-05", "gnorm": "9.501", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2216"} 2023-01-29 16:48:40 | INFO | train_inner | {"epoch": 5, "update": 4.945, "s2c_loss": "0.977", "loss": "0.67698", "s2c_nll_loss": "0.977", "s2c_accuracy": "84.531", "s2c_total": "64", "s2c_n_correct": "54.1", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "10690", "lr": "7.12731e-05", "gnorm": "9.609", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2218"} 2023-01-29 16:48:43 | INFO | train_inner | {"epoch": 5, "update": 4.95, "s2c_loss": "0.702", "loss": "0.48681", "s2c_nll_loss": "0.702", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "10700", "lr": "7.13398e-05", "gnorm": "9.067", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2221"} 2023-01-29 16:48:46 | INFO | train_inner | {"epoch": 5, "update": 4.954, "s2c_loss": "0.989", "loss": "0.68553", "s2c_nll_loss": "0.989", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "10710", "lr": "7.14064e-05", "gnorm": "9.862", "loss_scale": "1024", "train_wall": "2", "gb_free": 
"7.3", "wall": "2223"} 2023-01-29 16:48:48 | INFO | train_inner | {"epoch": 5, "update": 4.959, "s2c_loss": "1.047", "loss": "0.72595", "s2c_nll_loss": "1.047", "s2c_accuracy": "84.375", "s2c_total": "64", "s2c_n_correct": "54", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "10720", "lr": "7.14731e-05", "gnorm": "10.042", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2226"} 2023-01-29 16:48:51 | INFO | train_inner | {"epoch": 5, "update": 4.963, "s2c_loss": "0.82", "loss": "0.56831", "s2c_nll_loss": "0.82", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "10730", "lr": "7.15398e-05", "gnorm": "9.075", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2229"} 2023-01-29 16:48:53 | INFO | train_inner | {"epoch": 5, "update": 4.968, "s2c_loss": "0.877", "loss": "0.60804", "s2c_nll_loss": "0.877", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "10740", "lr": "7.16064e-05", "gnorm": "9.854", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2231"} 2023-01-29 16:48:56 | INFO | train_inner | {"epoch": 5, "update": 4.973, "s2c_loss": "0.83", "loss": "0.57527", "s2c_nll_loss": "0.83", "s2c_accuracy": "86.094", "s2c_total": "64", "s2c_n_correct": "55.1", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "10750", "lr": "7.16731e-05", "gnorm": "9.027", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2234"} 2023-01-29 16:48:58 | INFO | train_inner | {"epoch": 5, "update": 4.977, "s2c_loss": "0.872", "loss": "0.60452", "s2c_nll_loss": "0.872", "s2c_accuracy": "85.156", "s2c_total": "64", "s2c_n_correct": "54.5", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "10760", "lr": "7.17397e-05", "gnorm": "10.403", "loss_scale": "1024", "train_wall": "3", 
"gb_free": "7.4", "wall": "2236"} 2023-01-29 16:49:01 | INFO | train_inner | {"epoch": 5, "update": 4.982, "s2c_loss": "1.048", "loss": "0.72612", "s2c_nll_loss": "1.048", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "10770", "lr": "7.18064e-05", "gnorm": "9.408", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2239"} 2023-01-29 16:49:03 | INFO | train_inner | {"epoch": 5, "update": 4.987, "s2c_loss": "1.048", "loss": "0.72646", "s2c_nll_loss": "1.048", "s2c_accuracy": "84.219", "s2c_total": "64", "s2c_n_correct": "53.9", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "10780", "lr": "7.18731e-05", "gnorm": "11.259", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2241"} 2023-01-29 16:49:06 | INFO | train_inner | {"epoch": 5, "update": 4.991, "s2c_loss": "0.976", "loss": "0.67634", "s2c_nll_loss": "0.976", "s2c_accuracy": "86.25", "s2c_total": "64", "s2c_n_correct": "55.2", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "10790", "lr": "7.19397e-05", "gnorm": "10.092", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2244"} 2023-01-29 16:49:08 | INFO | train_inner | {"epoch": 5, "update": 4.996, "s2c_loss": "1.009", "loss": "0.69924", "s2c_nll_loss": "1.009", "s2c_accuracy": "85.469", "s2c_total": "64", "s2c_n_correct": "54.7", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "10800", "lr": "7.20064e-05", "gnorm": "12.298", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2246"} 2023-01-29 16:49:10 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 5 @ 10809 updates 2023-01-29 16:49:10 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 16:49:18 | INFO | fairseq.trainer | Finished saving checkpoint to 
/home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 16:49:18 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 5 @ 10809 updates, score None) (writing took 7.132177956867963 seconds) 2023-01-29 16:49:18 | INFO | fairseq_cli.train | end of epoch 5 (average epoch stats below) 2023-01-29 16:49:18 | INFO | train | {"epoch": 5, "train_s2c_loss": "1.1", "train_loss": "0.76226", "train_s2c_nll_loss": "1.1", "train_s2c_accuracy": "84.949", "train_s2c_total": "63.9838", "train_s2c_n_correct": "54.3538", "train_wps": "245.9", "train_ups": "3.84", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "10809", "train_lr": "7.20664e-05", "train_gnorm": "9.837", "train_loss_scale": "1024", "train_train_wall": "541", "train_gb_free": "7.5", "train_wall": "2256"} 2023-01-29 16:49:24 | INFO | fairseq.trainer | begin training epoch 6 2023-01-29 16:49:24 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 16:49:24 | INFO | train_inner | {"epoch": 6, "update": 5.0, "s2c_loss": "0.854", "loss": "0.59204", "s2c_nll_loss": "0.854", "s2c_accuracy": "88.487", "s2c_total": "60.8", "s2c_n_correct": "53.8", "wps": "37.8", "ups": "0.62", "wpb": "60.8", "bsz": "60.8", "num_updates": "10810", "lr": "7.20731e-05", "gnorm": "9.896", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2262"} 2023-01-29 16:49:27 | INFO | train_inner | {"epoch": 6, "update": 5.005, "s2c_loss": "0.878", "loss": "0.60857", "s2c_nll_loss": "0.878", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "10820", "lr": "7.21397e-05", "gnorm": "8.674", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2265"} 2023-01-29 16:49:30 | INFO | train_inner | {"epoch": 6, "update": 5.01, "s2c_loss": "1.101", "loss": "0.76302", "s2c_nll_loss": "1.101", "s2c_accuracy": "84.219", "s2c_total": 
"64", "s2c_n_correct": "53.9", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "10830", "lr": "7.22064e-05", "gnorm": "10.441", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2267"} 2023-01-29 16:49:32 | INFO | train_inner | {"epoch": 6, "update": 5.014, "s2c_loss": "0.773", "loss": "0.53558", "s2c_nll_loss": "0.773", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "10840", "lr": "7.22731e-05", "gnorm": "8.873", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2270"} 2023-01-29 16:49:35 | INFO | train_inner | {"epoch": 6, "update": 5.019, "s2c_loss": "0.727", "loss": "0.50411", "s2c_nll_loss": "0.727", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "10850", "lr": "7.23397e-05", "gnorm": "9.648", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2273"} 2023-01-29 16:49:37 | INFO | train_inner | {"epoch": 6, "update": 5.024, "s2c_loss": "0.811", "loss": "0.56181", "s2c_nll_loss": "0.811", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "10860", "lr": "7.24064e-05", "gnorm": "10.552", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2275"} 2023-01-29 16:49:40 | INFO | train_inner | {"epoch": 6, "update": 5.028, "s2c_loss": "0.764", "loss": "0.52988", "s2c_nll_loss": "0.764", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "10870", "lr": "7.2473e-05", "gnorm": "9.255", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2278"} 2023-01-29 16:49:42 | INFO | train_inner | {"epoch": 6, "update": 5.033, "s2c_loss": "0.799", "loss": "0.55371", "s2c_nll_loss": "0.799", "s2c_accuracy": "87.969", 
"s2c_total": "64", "s2c_n_correct": "56.3", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "10880", "lr": "7.25397e-05", "gnorm": "9.028", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2280"} 2023-01-29 16:49:45 | INFO | train_inner | {"epoch": 6, "update": 5.037, "s2c_loss": "0.753", "loss": "0.52197", "s2c_nll_loss": "0.753", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "246.6", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "10890", "lr": "7.26064e-05", "gnorm": "8.671", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2283"} 2023-01-29 16:49:48 | INFO | train_inner | {"epoch": 6, "update": 5.042, "s2c_loss": "0.655", "loss": "0.45376", "s2c_nll_loss": "0.655", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "246.8", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "10900", "lr": "7.2673e-05", "gnorm": "8.415", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2285"} 2023-01-29 16:49:50 | INFO | train_inner | {"epoch": 6, "update": 5.047, "s2c_loss": "0.931", "loss": "0.64561", "s2c_nll_loss": "0.931", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "10910", "lr": "7.27397e-05", "gnorm": "9.851", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "2288"} 2023-01-29 16:49:53 | INFO | train_inner | {"epoch": 6, "update": 5.051, "s2c_loss": "0.731", "loss": "0.50647", "s2c_nll_loss": "0.731", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "10920", "lr": "7.28064e-05", "gnorm": "8.745", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2290"} 2023-01-29 16:49:55 | INFO | train_inner | {"epoch": 6, "update": 5.056, "s2c_loss": "0.854", "loss": "0.59185", "s2c_nll_loss": "0.854", "s2c_accuracy": 
"87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "10930", "lr": "7.2873e-05", "gnorm": "9.895", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2293"} 2023-01-29 16:49:58 | INFO | train_inner | {"epoch": 6, "update": 5.061, "s2c_loss": "0.846", "loss": "0.58654", "s2c_nll_loss": "0.846", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "235", "ups": "3.67", "wpb": "64", "bsz": "64", "num_updates": "10940", "lr": "7.29397e-05", "gnorm": "8.875", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2296"} 2023-01-29 16:50:00 | INFO | train_inner | {"epoch": 6, "update": 5.065, "s2c_loss": "0.7", "loss": "0.48524", "s2c_nll_loss": "0.7", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "10950", "lr": "7.30063e-05", "gnorm": "8.181", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2298"} 2023-01-29 16:50:03 | INFO | train_inner | {"epoch": 6, "update": 5.07, "s2c_loss": "0.765", "loss": "0.52999", "s2c_nll_loss": "0.765", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "10960", "lr": "7.3073e-05", "gnorm": "9.883", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2301"} 2023-01-29 16:50:05 | INFO | train_inner | {"epoch": 6, "update": 5.074, "s2c_loss": "0.821", "loss": "0.56933", "s2c_nll_loss": "0.821", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "260.3", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "10970", "lr": "7.31397e-05", "gnorm": "8.804", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2303"} 2023-01-29 16:50:08 | INFO | train_inner | {"epoch": 6, "update": 5.079, "s2c_loss": "0.826", "loss": "0.57259", "s2c_nll_loss": "0.826", 
"s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "10980", "lr": "7.32063e-05", "gnorm": "8.434", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2306"} 2023-01-29 16:50:10 | INFO | train_inner | {"epoch": 6, "update": 5.084, "s2c_loss": "0.748", "loss": "0.51873", "s2c_nll_loss": "0.748", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "10990", "lr": "7.3273e-05", "gnorm": "8.799", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2308"} 2023-01-29 16:50:13 | INFO | train_inner | {"epoch": 6, "update": 5.088, "s2c_loss": "0.573", "loss": "0.39723", "s2c_nll_loss": "0.573", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "11000", "lr": "7.33397e-05", "gnorm": "8.971", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "2311"} 2023-01-29 16:50:16 | INFO | train_inner | {"epoch": 6, "update": 5.093, "s2c_loss": "0.616", "loss": "0.42674", "s2c_nll_loss": "0.616", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "11010", "lr": "7.34063e-05", "gnorm": "8.541", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2313"} 2023-01-29 16:50:18 | INFO | train_inner | {"epoch": 6, "update": 5.098, "s2c_loss": "0.968", "loss": "0.67103", "s2c_nll_loss": "0.968", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "11020", "lr": "7.3473e-05", "gnorm": "12.409", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2316"} 2023-01-29 16:50:21 | INFO | train_inner | {"epoch": 6, "update": 5.102, "s2c_loss": "0.739", "loss": "0.51209", 
"s2c_nll_loss": "0.739", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "11030", "lr": "7.35397e-05", "gnorm": "9.434", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2318"} 2023-01-29 16:50:23 | INFO | train_inner | {"epoch": 6, "update": 5.107, "s2c_loss": "0.739", "loss": "0.51232", "s2c_nll_loss": "0.739", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "11040", "lr": "7.36063e-05", "gnorm": "10.005", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2321"} 2023-01-29 16:50:26 | INFO | train_inner | {"epoch": 6, "update": 5.111, "s2c_loss": "0.8", "loss": "0.55481", "s2c_nll_loss": "0.8", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "11050", "lr": "7.3673e-05", "gnorm": "9.809", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2323"} 2023-01-29 16:50:28 | INFO | train_inner | {"epoch": 6, "update": 5.116, "s2c_loss": "0.783", "loss": "0.54307", "s2c_nll_loss": "0.783", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "11060", "lr": "7.37396e-05", "gnorm": "9.909", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2326"} 2023-01-29 16:50:31 | INFO | train_inner | {"epoch": 6, "update": 5.121, "s2c_loss": "0.942", "loss": "0.653", "s2c_nll_loss": "0.942", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "11070", "lr": "7.38063e-05", "gnorm": "8.646", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2329"} 2023-01-29 16:50:33 | INFO | train_inner | {"epoch": 6, "update": 5.125, "s2c_loss": "0.835", "loss": 
"0.57884", "s2c_nll_loss": "0.835", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "11080", "lr": "7.3873e-05", "gnorm": "7.439", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "2331"} 2023-01-29 16:50:36 | INFO | train_inner | {"epoch": 6, "update": 5.13, "s2c_loss": "0.828", "loss": "0.574", "s2c_nll_loss": "0.828", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "246.7", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "11090", "lr": "7.39396e-05", "gnorm": "9.28", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2334"} 2023-01-29 16:50:38 | INFO | train_inner | {"epoch": 6, "update": 5.135, "s2c_loss": "0.837", "loss": "0.58042", "s2c_nll_loss": "0.837", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "11100", "lr": "7.40063e-05", "gnorm": "9.167", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2336"} 2023-01-29 16:50:41 | INFO | train_inner | {"epoch": 6, "update": 5.139, "s2c_loss": "0.826", "loss": "0.57255", "s2c_nll_loss": "0.826", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "11110", "lr": "7.4073e-05", "gnorm": "10.007", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2339"} 2023-01-29 16:50:43 | INFO | train_inner | {"epoch": 6, "update": 5.144, "s2c_loss": "0.865", "loss": "0.59964", "s2c_nll_loss": "0.865", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "11120", "lr": "7.41396e-05", "gnorm": "10.225", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2341"} 2023-01-29 16:50:46 | INFO | train_inner | {"epoch": 6, "update": 5.148, "s2c_loss": 
"0.788", "loss": "0.54591", "s2c_nll_loss": "0.788", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "11130", "lr": "7.42063e-05", "gnorm": "9.38", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2344"} 2023-01-29 16:50:48 | INFO | train_inner | {"epoch": 6, "update": 5.153, "s2c_loss": "0.752", "loss": "0.52113", "s2c_nll_loss": "0.752", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "11140", "lr": "7.4273e-05", "gnorm": "10.446", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2346"} 2023-01-29 16:50:51 | INFO | train_inner | {"epoch": 6, "update": 5.158, "s2c_loss": "0.74", "loss": "0.51311", "s2c_nll_loss": "0.74", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "11150", "lr": "7.43396e-05", "gnorm": "8.847", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2349"} 2023-01-29 16:50:54 | INFO | train_inner | {"epoch": 6, "update": 5.162, "s2c_loss": "0.84", "loss": "0.58237", "s2c_nll_loss": "0.84", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "11160", "lr": "7.44063e-05", "gnorm": "9.901", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2352"} 2023-01-29 16:50:56 | INFO | train_inner | {"epoch": 6, "update": 5.167, "s2c_loss": "0.85", "loss": "0.58942", "s2c_nll_loss": "0.85", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "11170", "lr": "7.44729e-05", "gnorm": "9.502", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2354"} 2023-01-29 16:50:59 | INFO | train_inner | {"epoch": 6, "update": 5.172, 
"s2c_loss": "0.677", "loss": "0.46892", "s2c_nll_loss": "0.677", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "247.5", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "11180", "lr": "7.45396e-05", "gnorm": "8.384", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2357"} 2023-01-29 16:51:01 | INFO | train_inner | {"epoch": 6, "update": 5.176, "s2c_loss": "0.798", "loss": "0.55315", "s2c_nll_loss": "0.798", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "11190", "lr": "7.46063e-05", "gnorm": "8.398", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2359"} 2023-01-29 16:51:04 | INFO | train_inner | {"epoch": 6, "update": 5.181, "s2c_loss": "0.749", "loss": "0.51902", "s2c_nll_loss": "0.749", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "11200", "lr": "7.46729e-05", "gnorm": "9.102", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2362"} 2023-01-29 16:51:06 | INFO | train_inner | {"epoch": 6, "update": 5.185, "s2c_loss": "1.081", "loss": "0.74951", "s2c_nll_loss": "1.081", "s2c_accuracy": "85.156", "s2c_total": "64", "s2c_n_correct": "54.5", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "11210", "lr": "7.47396e-05", "gnorm": "9.823", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "2364"} 2023-01-29 16:51:09 | INFO | train_inner | {"epoch": 6, "update": 5.19, "s2c_loss": "0.684", "loss": "0.47416", "s2c_nll_loss": "0.684", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "11220", "lr": "7.48063e-05", "gnorm": "10.69", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2367"} 2023-01-29 16:51:12 | INFO | train_inner | {"epoch": 6, 
"update": 5.195, "s2c_loss": "0.771", "loss": "0.53463", "s2c_nll_loss": "0.771", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "11230", "lr": "7.48729e-05", "gnorm": "9.303", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2370"} 2023-01-29 16:51:14 | INFO | train_inner | {"epoch": 6, "update": 5.199, "s2c_loss": "0.796", "loss": "0.55188", "s2c_nll_loss": "0.796", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "11240", "lr": "7.49396e-05", "gnorm": "10.294", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2372"} 2023-01-29 16:51:17 | INFO | train_inner | {"epoch": 6, "update": 5.204, "s2c_loss": "0.626", "loss": "0.43404", "s2c_nll_loss": "0.626", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "11250", "lr": "7.50062e-05", "gnorm": "8.66", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2375"} 2023-01-29 16:51:19 | INFO | train_inner | {"epoch": 6, "update": 5.209, "s2c_loss": "0.778", "loss": "0.53953", "s2c_nll_loss": "0.778", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "11260", "lr": "7.50729e-05", "gnorm": "9.63", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2377"} 2023-01-29 16:51:22 | INFO | train_inner | {"epoch": 6, "update": 5.213, "s2c_loss": "0.785", "loss": "0.5439", "s2c_nll_loss": "0.785", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "11270", "lr": "7.51396e-05", "gnorm": "9.607", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2380"} 2023-01-29 16:51:24 | INFO | train_inner | 
{"epoch": 6, "update": 5.218, "s2c_loss": "0.986", "loss": "0.68374", "s2c_nll_loss": "0.986", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "11280", "lr": "7.52062e-05", "gnorm": "10.373", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2382"} 2023-01-29 16:51:27 | INFO | train_inner | {"epoch": 6, "update": 5.222, "s2c_loss": "0.926", "loss": "0.64196", "s2c_nll_loss": "0.926", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "11290", "lr": "7.52729e-05", "gnorm": "8.881", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2385"} 2023-01-29 16:51:29 | INFO | train_inner | {"epoch": 6, "update": 5.227, "s2c_loss": "0.826", "loss": "0.57239", "s2c_nll_loss": "0.826", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "11300", "lr": "7.53396e-05", "gnorm": "10.064", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "2387"} 2023-01-29 16:51:32 | INFO | train_inner | {"epoch": 6, "update": 5.232, "s2c_loss": "0.862", "loss": "0.59783", "s2c_nll_loss": "0.862", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "259", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "11310", "lr": "7.54062e-05", "gnorm": "10.734", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2390"} 2023-01-29 16:51:35 | INFO | train_inner | {"epoch": 6, "update": 5.236, "s2c_loss": "0.958", "loss": "0.66435", "s2c_nll_loss": "0.958", "s2c_accuracy": "86.562", "s2c_total": "64", "s2c_n_correct": "55.4", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "11320", "lr": "7.54729e-05", "gnorm": "11.308", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2392"} 2023-01-29 16:51:37 | INFO 
| train_inner | {"epoch": 6, "update": 5.241, "s2c_loss": "0.659", "loss": "0.4565", "s2c_nll_loss": "0.659", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "11330", "lr": "7.55396e-05", "gnorm": "9.086", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2395"} 2023-01-29 16:51:40 | INFO | train_inner | {"epoch": 6, "update": 5.246, "s2c_loss": "0.892", "loss": "0.6183", "s2c_nll_loss": "0.892", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "11340", "lr": "7.56062e-05", "gnorm": "9.224", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2398"} 2023-01-29 16:51:42 | INFO | train_inner | {"epoch": 6, "update": 5.25, "s2c_loss": "0.727", "loss": "0.50399", "s2c_nll_loss": "0.727", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "11350", "lr": "7.56729e-05", "gnorm": "9.476", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2400"} 2023-01-29 16:51:45 | INFO | train_inner | {"epoch": 6, "update": 5.255, "s2c_loss": "0.757", "loss": "0.52495", "s2c_nll_loss": "0.757", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "11360", "lr": "7.57395e-05", "gnorm": "9.264", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2403"} 2023-01-29 16:51:47 | INFO | train_inner | {"epoch": 6, "update": 5.259, "s2c_loss": "0.888", "loss": "0.61522", "s2c_nll_loss": "0.888", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "11370", "lr": "7.58062e-05", "gnorm": "10.025", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2405"} 2023-01-29 
16:51:50 | INFO | train_inner | {"epoch": 6, "update": 5.264, "s2c_loss": "0.571", "loss": "0.39595", "s2c_nll_loss": "0.571", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "11380", "lr": "7.58729e-05", "gnorm": "8.049", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2408"} 2023-01-29 16:51:52 | INFO | train_inner | {"epoch": 6, "update": 5.269, "s2c_loss": "0.835", "loss": "0.57849", "s2c_nll_loss": "0.835", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "11390", "lr": "7.59395e-05", "gnorm": "8.312", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2410"} 2023-01-29 16:51:55 | INFO | train_inner | {"epoch": 6, "update": 5.273, "s2c_loss": "0.78", "loss": "0.5406", "s2c_nll_loss": "0.78", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "11400", "lr": "7.60062e-05", "gnorm": "8.481", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2413"} 2023-01-29 16:51:57 | INFO | train_inner | {"epoch": 6, "update": 5.278, "s2c_loss": "0.878", "loss": "0.60877", "s2c_nll_loss": "0.878", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "11410", "lr": "7.60729e-05", "gnorm": "7.827", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2415"} 2023-01-29 16:52:00 | INFO | train_inner | {"epoch": 6, "update": 5.283, "s2c_loss": "0.932", "loss": "0.6457", "s2c_nll_loss": "0.932", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "11420", "lr": "7.61395e-05", "gnorm": "8.921", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2418"} 
2023-01-29 16:52:02 | INFO | train_inner | {"epoch": 6, "update": 5.287, "s2c_loss": "0.837", "loss": "0.57988", "s2c_nll_loss": "0.837", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "11430", "lr": "7.62062e-05", "gnorm": "9.008", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2420"} 2023-01-29 16:52:05 | INFO | train_inner | {"epoch": 6, "update": 5.292, "s2c_loss": "0.772", "loss": "0.53487", "s2c_nll_loss": "0.772", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "11440", "lr": "7.62729e-05", "gnorm": "8.935", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2423"} 2023-01-29 16:52:07 | INFO | train_inner | {"epoch": 6, "update": 5.296, "s2c_loss": "0.854", "loss": "0.59185", "s2c_nll_loss": "0.854", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "11450", "lr": "7.63395e-05", "gnorm": "8.719", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2425"} 2023-01-29 16:52:10 | INFO | train_inner | {"epoch": 6, "update": 5.301, "s2c_loss": "0.93", "loss": "0.64468", "s2c_nll_loss": "0.93", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "11460", "lr": "7.64062e-05", "gnorm": "9.817", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "2428"} 2023-01-29 16:52:13 | INFO | train_inner | {"epoch": 6, "update": 5.306, "s2c_loss": "0.853", "loss": "0.59151", "s2c_nll_loss": "0.853", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "11470", "lr": "7.64728e-05", "gnorm": "8.706", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", 
"wall": "2430"} 2023-01-29 16:52:15 | INFO | train_inner | {"epoch": 6, "update": 5.31, "s2c_loss": "0.667", "loss": "0.46231", "s2c_nll_loss": "0.667", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "11480", "lr": "7.65395e-05", "gnorm": "9.612", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "2433"} 2023-01-29 16:52:18 | INFO | train_inner | {"epoch": 6, "update": 5.315, "s2c_loss": "0.834", "loss": "0.57838", "s2c_nll_loss": "0.834", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "11490", "lr": "7.66062e-05", "gnorm": "8.991", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "2436"} 2023-01-29 16:52:20 | INFO | train_inner | {"epoch": 6, "update": 5.32, "s2c_loss": "0.751", "loss": "0.52037", "s2c_nll_loss": "0.751", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "11500", "lr": "7.66728e-05", "gnorm": "10.183", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "2438"} 2023-01-29 16:52:23 | INFO | train_inner | {"epoch": 6, "update": 5.324, "s2c_loss": "0.805", "loss": "0.55823", "s2c_nll_loss": "0.805", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "246.8", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "11510", "lr": "7.67395e-05", "gnorm": "9.558", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "2441"} 2023-01-29 16:52:25 | INFO | train_inner | {"epoch": 6, "update": 5.329, "s2c_loss": "0.749", "loss": "0.51925", "s2c_nll_loss": "0.749", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "11520", "lr": "7.68062e-05", "gnorm": "9.377", "loss_scale": "2048", "train_wall": "3", 
"gb_free": "7.3", "wall": "2443"} 2023-01-29 16:52:28 | INFO | train_inner | {"epoch": 6, "update": 5.333, "s2c_loss": "0.757", "loss": "0.52474", "s2c_nll_loss": "0.757", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "11530", "lr": "7.68728e-05", "gnorm": "9.297", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "2446"} 2023-01-29 16:52:30 | INFO | train_inner | {"epoch": 6, "update": 5.338, "s2c_loss": "0.795", "loss": "0.55105", "s2c_nll_loss": "0.795", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "11540", "lr": "7.69395e-05", "gnorm": "9.828", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "2448"} 2023-01-29 16:52:33 | INFO | train_inner | {"epoch": 6, "update": 5.343, "s2c_loss": "0.887", "loss": "0.61504", "s2c_nll_loss": "0.887", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "11550", "lr": "7.70061e-05", "gnorm": "9.575", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "2451"} 2023-01-29 16:52:36 | INFO | train_inner | {"epoch": 6, "update": 5.347, "s2c_loss": "0.692", "loss": "0.47964", "s2c_nll_loss": "0.692", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "11560", "lr": "7.70728e-05", "gnorm": "7.636", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "2453"} 2023-01-29 16:52:38 | INFO | train_inner | {"epoch": 6, "update": 5.352, "s2c_loss": "0.718", "loss": "0.49773", "s2c_nll_loss": "0.718", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "246.5", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "11570", "lr": "7.71395e-05", "gnorm": "7.998", "loss_scale": "2048", 
"train_wall": "3", "gb_free": "7.3", "wall": "2456"} 2023-01-29 16:52:41 | INFO | train_inner | {"epoch": 6, "update": 5.357, "s2c_loss": "0.698", "loss": "0.48411", "s2c_nll_loss": "0.698", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "11580", "lr": "7.72061e-05", "gnorm": "8.847", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "2459"} 2023-01-29 16:52:43 | INFO | train_inner | {"epoch": 6, "update": 5.361, "s2c_loss": "0.823", "loss": "0.57012", "s2c_nll_loss": "0.823", "s2c_accuracy": "86.562", "s2c_total": "64", "s2c_n_correct": "55.4", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "11590", "lr": "7.72728e-05", "gnorm": "9.709", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "2461"} 2023-01-29 16:52:46 | INFO | train_inner | {"epoch": 6, "update": 5.366, "s2c_loss": "0.714", "loss": "0.49515", "s2c_nll_loss": "0.714", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "11600", "lr": "7.73395e-05", "gnorm": "8.949", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "2464"} 2023-01-29 16:52:48 | INFO | train_inner | {"epoch": 6, "update": 5.37, "s2c_loss": "0.766", "loss": "0.53069", "s2c_nll_loss": "0.766", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "11610", "lr": "7.74061e-05", "gnorm": "10.156", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "2466"} 2023-01-29 16:52:51 | INFO | train_inner | {"epoch": 6, "update": 5.375, "s2c_loss": "0.819", "loss": "0.56753", "s2c_nll_loss": "0.819", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "11620", "lr": "7.74728e-05", "gnorm": "9.826", 
"loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "2469"} 2023-01-29 16:52:53 | INFO | train_inner | {"epoch": 6, "update": 5.38, "s2c_loss": "0.797", "loss": "0.55269", "s2c_nll_loss": "0.797", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "11630", "lr": "7.75395e-05", "gnorm": "8.889", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "2471"} 2023-01-29 16:52:56 | INFO | train_inner | {"epoch": 6, "update": 5.384, "s2c_loss": "0.675", "loss": "0.46771", "s2c_nll_loss": "0.675", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "11640", "lr": "7.76061e-05", "gnorm": "9.124", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "2474"} 2023-01-29 16:52:59 | INFO | train_inner | {"epoch": 6, "update": 5.389, "s2c_loss": "0.759", "loss": "0.52626", "s2c_nll_loss": "0.759", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "11650", "lr": "7.76728e-05", "gnorm": "9.454", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "2476"} 2023-01-29 16:53:01 | INFO | train_inner | {"epoch": 6, "update": 5.394, "s2c_loss": "0.827", "loss": "0.57341", "s2c_nll_loss": "0.827", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "11660", "lr": "7.77394e-05", "gnorm": "9.51", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "2479"} 2023-01-29 16:53:04 | INFO | train_inner | {"epoch": 6, "update": 5.398, "s2c_loss": "0.941", "loss": "0.65239", "s2c_nll_loss": "0.941", "s2c_accuracy": "84.375", "s2c_total": "64", "s2c_n_correct": "54", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "11670", "lr": "7.78061e-05", "gnorm": 
"10.376", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "2481"} 2023-01-29 16:53:06 | INFO | train_inner | {"epoch": 6, "update": 5.403, "s2c_loss": "0.73", "loss": "0.50596", "s2c_nll_loss": "0.73", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "11680", "lr": "7.78728e-05", "gnorm": "10.727", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "2484"} 2023-01-29 16:53:09 | INFO | train_inner | {"epoch": 6, "update": 5.407, "s2c_loss": "1.047", "loss": "0.72591", "s2c_nll_loss": "1.047", "s2c_accuracy": "85", "s2c_total": "64", "s2c_n_correct": "54.4", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "11690", "lr": "7.79394e-05", "gnorm": "10.544", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "2487"} 2023-01-29 16:53:11 | INFO | train_inner | {"epoch": 6, "update": 5.412, "s2c_loss": "0.723", "loss": "0.50125", "s2c_nll_loss": "0.723", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "11700", "lr": "7.80061e-05", "gnorm": "8.07", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "2489"} 2023-01-29 16:53:14 | INFO | train_inner | {"epoch": 6, "update": 5.417, "s2c_loss": "0.849", "loss": "0.58842", "s2c_nll_loss": "0.849", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "11710", "lr": "7.80728e-05", "gnorm": "9.541", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "2492"} 2023-01-29 16:53:16 | INFO | train_inner | {"epoch": 6, "update": 5.421, "s2c_loss": "0.652", "loss": "0.45188", "s2c_nll_loss": "0.652", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "11720", "lr": "7.81394e-05", 
"gnorm": "8.921", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "2494"} 2023-01-29 16:53:19 | INFO | train_inner | {"epoch": 6, "update": 5.426, "s2c_loss": "0.853", "loss": "0.59107", "s2c_nll_loss": "0.853", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "11730", "lr": "7.82061e-05", "gnorm": "7.982", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "2497"} 2023-01-29 16:53:21 | INFO | train_inner | {"epoch": 6, "update": 5.431, "s2c_loss": "0.737", "loss": "0.51061", "s2c_nll_loss": "0.737", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "11740", "lr": "7.82728e-05", "gnorm": "9.328", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "2499"} 2023-01-29 16:53:24 | INFO | train_inner | {"epoch": 6, "update": 5.435, "s2c_loss": "0.7", "loss": "0.48536", "s2c_nll_loss": "0.7", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "11750", "lr": "7.83394e-05", "gnorm": "10.175", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "2502"} 2023-01-29 16:53:24 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 1024.0 2023-01-29 16:53:27 | INFO | train_inner | {"epoch": 6, "update": 5.44, "s2c_loss": "0.849", "loss": "0.58882", "s2c_nll_loss": "0.849", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "225.7", "ups": "3.53", "wpb": "64", "bsz": "64", "num_updates": "11760", "lr": "7.84061e-05", "gnorm": "9.539", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2505"} 2023-01-29 16:53:29 | INFO | train_inner | {"epoch": 6, "update": 5.445, "s2c_loss": "0.906", "loss": "0.6278", "s2c_nll_loss": "0.906", "s2c_accuracy": "86.094", 
"s2c_total": "64", "s2c_n_correct": "55.1", "wps": "246.3", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "11770", "lr": "7.84727e-05", "gnorm": "9.852", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2507"} 2023-01-29 16:53:32 | INFO | train_inner | {"epoch": 6, "update": 5.45, "s2c_loss": "0.879", "loss": "0.60925", "s2c_nll_loss": "0.879", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "11780", "lr": "7.85394e-05", "gnorm": "10.018", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2510"} 2023-01-29 16:53:34 | INFO | train_inner | {"epoch": 6, "update": 5.454, "s2c_loss": "0.781", "loss": "0.54132", "s2c_nll_loss": "0.781", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "11790", "lr": "7.86061e-05", "gnorm": "9.133", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2512"} 2023-01-29 16:53:37 | INFO | train_inner | {"epoch": 6, "update": 5.459, "s2c_loss": "0.845", "loss": "0.58582", "s2c_nll_loss": "0.845", "s2c_accuracy": "85.312", "s2c_total": "64", "s2c_n_correct": "54.6", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "11800", "lr": "7.86727e-05", "gnorm": "10.037", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2515"} 2023-01-29 16:53:39 | INFO | train_inner | {"epoch": 6, "update": 5.463, "s2c_loss": "0.952", "loss": "0.65993", "s2c_nll_loss": "0.952", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "11810", "lr": "7.87394e-05", "gnorm": "9.796", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2517"} 2023-01-29 16:53:42 | INFO | train_inner | {"epoch": 6, "update": 5.468, "s2c_loss": "0.944", "loss": "0.65451", "s2c_nll_loss": "0.944", 
"s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "11820", "lr": "7.88061e-05", "gnorm": "9.457", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2520"} 2023-01-29 16:53:45 | INFO | train_inner | {"epoch": 6, "update": 5.473, "s2c_loss": "0.791", "loss": "0.54834", "s2c_nll_loss": "0.791", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "11830", "lr": "7.88727e-05", "gnorm": "9.015", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2522"} 2023-01-29 16:53:47 | INFO | train_inner | {"epoch": 6, "update": 5.477, "s2c_loss": "0.721", "loss": "0.50001", "s2c_nll_loss": "0.721", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "11840", "lr": "7.89394e-05", "gnorm": "8.256", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2525"} 2023-01-29 16:53:50 | INFO | train_inner | {"epoch": 6, "update": 5.482, "s2c_loss": "0.827", "loss": "0.57312", "s2c_nll_loss": "0.827", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "11850", "lr": "7.9006e-05", "gnorm": "8.171", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2527"} 2023-01-29 16:53:52 | INFO | train_inner | {"epoch": 6, "update": 5.487, "s2c_loss": "0.68", "loss": "0.47164", "s2c_nll_loss": "0.68", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "11860", "lr": "7.90727e-05", "gnorm": "8.401", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2530"} 2023-01-29 16:53:55 | INFO | train_inner | {"epoch": 6, "update": 5.491, "s2c_loss": "0.633", "loss": "0.43856", "s2c_nll_loss": 
"0.633", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "11870", "lr": "7.91394e-05", "gnorm": "9.139", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2533"} 2023-01-29 16:53:57 | INFO | train_inner | {"epoch": 6, "update": 5.496, "s2c_loss": "0.735", "loss": "0.50961", "s2c_nll_loss": "0.735", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "11880", "lr": "7.9206e-05", "gnorm": "10.112", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2535"} 2023-01-29 16:54:00 | INFO | train_inner | {"epoch": 6, "update": 5.5, "s2c_loss": "0.763", "loss": "0.52875", "s2c_nll_loss": "0.763", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "11890", "lr": "7.92727e-05", "gnorm": "8.719", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "2538"} 2023-01-29 16:54:02 | INFO | train_inner | {"epoch": 6, "update": 5.505, "s2c_loss": "0.929", "loss": "0.6442", "s2c_nll_loss": "0.929", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "258.7", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "11900", "lr": "7.93394e-05", "gnorm": "9.032", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2540"} 2023-01-29 16:54:05 | INFO | train_inner | {"epoch": 6, "update": 5.51, "s2c_loss": "0.825", "loss": "0.57154", "s2c_nll_loss": "0.825", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "11910", "lr": "7.9406e-05", "gnorm": "9.232", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2543"} 2023-01-29 16:54:07 | INFO | train_inner | {"epoch": 6, "update": 5.514, "s2c_loss": "0.806", "loss": "0.55863", 
"s2c_nll_loss": "0.806", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "11920", "lr": "7.94727e-05", "gnorm": "8.536", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2545"} 2023-01-29 16:54:10 | INFO | train_inner | {"epoch": 6, "update": 5.519, "s2c_loss": "0.992", "loss": "0.68752", "s2c_nll_loss": "0.992", "s2c_accuracy": "85", "s2c_total": "64", "s2c_n_correct": "54.4", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "11930", "lr": "7.95394e-05", "gnorm": "9.851", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2548"} 2023-01-29 16:54:12 | INFO | train_inner | {"epoch": 6, "update": 5.524, "s2c_loss": "0.779", "loss": "0.53982", "s2c_nll_loss": "0.779", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "11940", "lr": "7.9606e-05", "gnorm": "8.963", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2550"} 2023-01-29 16:54:15 | INFO | train_inner | {"epoch": 6, "update": 5.528, "s2c_loss": "0.758", "loss": "0.52571", "s2c_nll_loss": "0.758", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "11950", "lr": "7.96727e-05", "gnorm": "8.487", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2553"} 2023-01-29 16:54:17 | INFO | train_inner | {"epoch": 6, "update": 5.533, "s2c_loss": "0.662", "loss": "0.45879", "s2c_nll_loss": "0.662", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "11960", "lr": "7.97393e-05", "gnorm": "8.573", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2555"} 2023-01-29 16:54:20 | INFO | train_inner | {"epoch": 6, "update": 5.537, "s2c_loss": "0.821", "loss": 
"0.56912", "s2c_nll_loss": "0.821", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "11970", "lr": "7.9806e-05", "gnorm": "9.14", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2558"} 2023-01-29 16:54:22 | INFO | train_inner | {"epoch": 6, "update": 5.542, "s2c_loss": "1.015", "loss": "0.70369", "s2c_nll_loss": "1.015", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "11980", "lr": "7.98727e-05", "gnorm": "9.707", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2560"} 2023-01-29 16:54:25 | INFO | train_inner | {"epoch": 6, "update": 5.547, "s2c_loss": "0.743", "loss": "0.51499", "s2c_nll_loss": "0.743", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "11990", "lr": "7.99393e-05", "gnorm": "9.562", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "2563"} 2023-01-29 16:54:28 | INFO | train_inner | {"epoch": 6, "update": 5.551, "s2c_loss": "0.777", "loss": "0.53834", "s2c_nll_loss": "0.777", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "12000", "lr": "8.0006e-05", "gnorm": "10.275", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2565"} 2023-01-29 16:54:30 | INFO | train_inner | {"epoch": 6, "update": 5.556, "s2c_loss": "0.716", "loss": "0.49613", "s2c_nll_loss": "0.716", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "12010", "lr": "8.00727e-05", "gnorm": "10.039", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2568"} 2023-01-29 16:54:33 | INFO | train_inner | {"epoch": 6, "update": 5.561, "s2c_loss": 
"0.761", "loss": "0.52774", "s2c_nll_loss": "0.761", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "12020", "lr": "8.01393e-05", "gnorm": "9.948", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2571"} 2023-01-29 16:54:35 | INFO | train_inner | {"epoch": 6, "update": 5.565, "s2c_loss": "0.891", "loss": "0.61781", "s2c_nll_loss": "0.891", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "258.4", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "12030", "lr": "8.0206e-05", "gnorm": "9.106", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2573"} 2023-01-29 16:54:38 | INFO | train_inner | {"epoch": 6, "update": 5.57, "s2c_loss": "0.907", "loss": "0.62876", "s2c_nll_loss": "0.907", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "12040", "lr": "8.02727e-05", "gnorm": "10.161", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2576"} 2023-01-29 16:54:40 | INFO | train_inner | {"epoch": 6, "update": 5.574, "s2c_loss": "0.731", "loss": "0.50646", "s2c_nll_loss": "0.731", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "12050", "lr": "8.03393e-05", "gnorm": "8.967", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2578"} 2023-01-29 16:54:43 | INFO | train_inner | {"epoch": 6, "update": 5.579, "s2c_loss": "0.815", "loss": "0.56498", "s2c_nll_loss": "0.815", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "12060", "lr": "8.0406e-05", "gnorm": "9.129", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2581"} 2023-01-29 16:54:45 | INFO | train_inner | {"epoch": 6, "update": 5.584, 
"s2c_loss": "0.806", "loss": "0.55838", "s2c_nll_loss": "0.806", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "12070", "lr": "8.04726e-05", "gnorm": "9.47", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2583"} 2023-01-29 16:54:48 | INFO | train_inner | {"epoch": 6, "update": 5.588, "s2c_loss": "0.818", "loss": "0.56674", "s2c_nll_loss": "0.818", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "12080", "lr": "8.05393e-05", "gnorm": "9.664", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2586"} 2023-01-29 16:54:50 | INFO | train_inner | {"epoch": 6, "update": 5.593, "s2c_loss": "0.588", "loss": "0.40736", "s2c_nll_loss": "0.588", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "12090", "lr": "8.0606e-05", "gnorm": "8.363", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2588"} 2023-01-29 16:54:53 | INFO | train_inner | {"epoch": 6, "update": 5.598, "s2c_loss": "0.832", "loss": "0.57676", "s2c_nll_loss": "0.832", "s2c_accuracy": "86.094", "s2c_total": "64", "s2c_n_correct": "55.1", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "12100", "lr": "8.06726e-05", "gnorm": "11.347", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2591"} 2023-01-29 16:54:56 | INFO | train_inner | {"epoch": 6, "update": 5.602, "s2c_loss": "0.807", "loss": "0.55938", "s2c_nll_loss": "0.807", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "12110", "lr": "8.07393e-05", "gnorm": "10.096", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2593"} 2023-01-29 16:54:58 | INFO | train_inner | {"epoch": 6, 
"update": 5.607, "s2c_loss": "0.786", "loss": "0.54462", "s2c_nll_loss": "0.786", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "259.2", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "12120", "lr": "8.0806e-05", "gnorm": "8.743", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2596"} 2023-01-29 16:55:01 | INFO | train_inner | {"epoch": 6, "update": 5.611, "s2c_loss": "0.878", "loss": "0.6083", "s2c_nll_loss": "0.878", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "12130", "lr": "8.08726e-05", "gnorm": "10.013", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "2598"} 2023-01-29 16:55:03 | INFO | train_inner | {"epoch": 6, "update": 5.616, "s2c_loss": "0.848", "loss": "0.58785", "s2c_nll_loss": "0.848", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "12140", "lr": "8.09393e-05", "gnorm": "8.314", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2601"} 2023-01-29 16:55:06 | INFO | train_inner | {"epoch": 6, "update": 5.621, "s2c_loss": "0.705", "loss": "0.48882", "s2c_nll_loss": "0.705", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "12150", "lr": "8.10059e-05", "gnorm": "9.224", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2604"} 2023-01-29 16:55:08 | INFO | train_inner | {"epoch": 6, "update": 5.625, "s2c_loss": "0.687", "loss": "0.47645", "s2c_nll_loss": "0.687", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "12160", "lr": "8.10726e-05", "gnorm": "10.398", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "2606"} 2023-01-29 16:55:11 | INFO | train_inner | 
{"epoch": 6, "update": 5.63, "s2c_loss": "0.895", "loss": "0.6207", "s2c_nll_loss": "0.895", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "251.8", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "12170", "lr": "8.11393e-05", "gnorm": "9.458", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "2609"} 2023-01-29 16:55:13 | INFO | train_inner | {"epoch": 6, "update": 5.635, "s2c_loss": "0.649", "loss": "0.44962", "s2c_nll_loss": "0.649", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "12180", "lr": "8.12059e-05", "gnorm": "8.572", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "2611"} 2023-01-29 16:55:16 | INFO | train_inner | {"epoch": 6, "update": 5.639, "s2c_loss": "0.769", "loss": "0.53297", "s2c_nll_loss": "0.769", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "241.8", "ups": "3.78", "wpb": "64", "bsz": "64", "num_updates": "12190", "lr": "8.12726e-05", "gnorm": "9.905", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "2614"} 2023-01-29 16:55:18 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 512.0 2023-01-29 16:55:19 | INFO | train_inner | {"epoch": 6, "update": 5.644, "s2c_loss": "0.811", "loss": "0.56226", "s2c_nll_loss": "0.811", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "227.4", "ups": "3.55", "wpb": "64", "bsz": "64", "num_updates": "12200", "lr": "8.13393e-05", "gnorm": "9.23", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2617"} 2023-01-29 16:55:21 | INFO | train_inner | {"epoch": 6, "update": 5.649, "s2c_loss": "0.598", "loss": "0.41442", "s2c_nll_loss": "0.598", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "12210", "lr": 
"8.14059e-05", "gnorm": "8.265", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2619"} 2023-01-29 16:55:24 | INFO | train_inner | {"epoch": 6, "update": 5.654, "s2c_loss": "0.765", "loss": "0.53043", "s2c_nll_loss": "0.765", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "12220", "lr": "8.14726e-05", "gnorm": "9.446", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "2622"} 2023-01-29 16:55:26 | INFO | train_inner | {"epoch": 6, "update": 5.658, "s2c_loss": "0.797", "loss": "0.55249", "s2c_nll_loss": "0.797", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "12230", "lr": "8.15393e-05", "gnorm": "9.711", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2624"} 2023-01-29 16:55:29 | INFO | train_inner | {"epoch": 6, "update": 5.663, "s2c_loss": "0.773", "loss": "0.53563", "s2c_nll_loss": "0.773", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "12240", "lr": "8.16059e-05", "gnorm": "9.821", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2627"} 2023-01-29 16:55:31 | INFO | train_inner | {"epoch": 6, "update": 5.667, "s2c_loss": "0.775", "loss": "0.53731", "s2c_nll_loss": "0.775", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "12250", "lr": "8.16726e-05", "gnorm": "10.315", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2629"} 2023-01-29 16:55:34 | INFO | train_inner | {"epoch": 6, "update": 5.672, "s2c_loss": "0.672", "loss": "0.46582", "s2c_nll_loss": "0.672", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": 
"12260", "lr": "8.17392e-05", "gnorm": "9.225", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2632"} 2023-01-29 16:55:36 | INFO | train_inner | {"epoch": 6, "update": 5.677, "s2c_loss": "0.718", "loss": "0.49752", "s2c_nll_loss": "0.718", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "12270", "lr": "8.18059e-05", "gnorm": "8.322", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2634"} 2023-01-29 16:55:39 | INFO | train_inner | {"epoch": 6, "update": 5.681, "s2c_loss": "0.654", "loss": "0.4534", "s2c_nll_loss": "0.654", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "12280", "lr": "8.18726e-05", "gnorm": "8.966", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2637"} 2023-01-29 16:55:42 | INFO | train_inner | {"epoch": 6, "update": 5.686, "s2c_loss": "0.667", "loss": "0.4625", "s2c_nll_loss": "0.667", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "12290", "lr": "8.19392e-05", "gnorm": "8.763", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2640"} 2023-01-29 16:55:44 | INFO | train_inner | {"epoch": 6, "update": 5.691, "s2c_loss": "0.814", "loss": "0.56401", "s2c_nll_loss": "0.814", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "12300", "lr": "8.20059e-05", "gnorm": "10.241", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2642"} 2023-01-29 16:55:47 | INFO | train_inner | {"epoch": 6, "update": 5.695, "s2c_loss": "1.017", "loss": "0.70514", "s2c_nll_loss": "1.017", "s2c_accuracy": "83.438", "s2c_total": "64", "s2c_n_correct": "53.4", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", 
"num_updates": "12310", "lr": "8.20726e-05", "gnorm": "10.165", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2645"} 2023-01-29 16:55:49 | INFO | train_inner | {"epoch": 6, "update": 5.7, "s2c_loss": "0.829", "loss": "0.57485", "s2c_nll_loss": "0.829", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "12320", "lr": "8.21392e-05", "gnorm": "8.734", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2647"} 2023-01-29 16:55:52 | INFO | train_inner | {"epoch": 6, "update": 5.704, "s2c_loss": "0.748", "loss": "0.51871", "s2c_nll_loss": "0.748", "s2c_accuracy": "85.938", "s2c_total": "64", "s2c_n_correct": "55", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "12330", "lr": "8.22059e-05", "gnorm": "9.653", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2650"} 2023-01-29 16:55:54 | INFO | train_inner | {"epoch": 6, "update": 5.709, "s2c_loss": "0.803", "loss": "0.55626", "s2c_nll_loss": "0.803", "s2c_accuracy": "86.562", "s2c_total": "64", "s2c_n_correct": "55.4", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "12340", "lr": "8.22726e-05", "gnorm": "9.131", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2652"} 2023-01-29 16:55:57 | INFO | train_inner | {"epoch": 6, "update": 5.714, "s2c_loss": "0.745", "loss": "0.51644", "s2c_nll_loss": "0.745", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "12350", "lr": "8.23392e-05", "gnorm": "9.053", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2655"} 2023-01-29 16:56:00 | INFO | train_inner | {"epoch": 6, "update": 5.718, "s2c_loss": "0.797", "loss": "0.55233", "s2c_nll_loss": "0.797", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "246.7", "ups": "3.86", "wpb": "64", "bsz": 
"64", "num_updates": "12360", "lr": "8.24059e-05", "gnorm": "9.08", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2657"} 2023-01-29 16:56:02 | INFO | train_inner | {"epoch": 6, "update": 5.723, "s2c_loss": "0.526", "loss": "0.36455", "s2c_nll_loss": "0.526", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "12370", "lr": "8.24725e-05", "gnorm": "8.399", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2660"} 2023-01-29 16:56:05 | INFO | train_inner | {"epoch": 6, "update": 5.728, "s2c_loss": "0.689", "loss": "0.47743", "s2c_nll_loss": "0.689", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "12380", "lr": "8.25392e-05", "gnorm": "10.571", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2663"} 2023-01-29 16:56:07 | INFO | train_inner | {"epoch": 6, "update": 5.732, "s2c_loss": "0.781", "loss": "0.54143", "s2c_nll_loss": "0.781", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "12390", "lr": "8.26059e-05", "gnorm": "8.335", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2665"} 2023-01-29 16:56:10 | INFO | train_inner | {"epoch": 6, "update": 5.737, "s2c_loss": "0.806", "loss": "0.5574", "s2c_nll_loss": "0.806", "s2c_accuracy": "89.482", "s2c_total": "63.7", "s2c_n_correct": "57", "wps": "248", "ups": "3.89", "wpb": "63.7", "bsz": "63.7", "num_updates": "12400", "lr": "8.26725e-05", "gnorm": "9.378", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2668"} 2023-01-29 16:56:12 | INFO | train_inner | {"epoch": 6, "update": 5.741, "s2c_loss": "0.863", "loss": "0.59844", "s2c_nll_loss": "0.863", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "253.6", "ups": "3.96", "wpb": 
"64", "bsz": "64", "num_updates": "12410", "lr": "8.27392e-05", "gnorm": "8.654", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2670"} 2023-01-29 16:56:15 | INFO | train_inner | {"epoch": 6, "update": 5.746, "s2c_loss": "0.818", "loss": "0.56711", "s2c_nll_loss": "0.818", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "12420", "lr": "8.28059e-05", "gnorm": "10.471", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2673"} 2023-01-29 16:56:18 | INFO | train_inner | {"epoch": 6, "update": 5.751, "s2c_loss": "0.795", "loss": "0.55111", "s2c_nll_loss": "0.795", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "12430", "lr": "8.28725e-05", "gnorm": "9.547", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2675"} 2023-01-29 16:56:20 | INFO | train_inner | {"epoch": 6, "update": 5.755, "s2c_loss": "0.893", "loss": "0.61925", "s2c_nll_loss": "0.893", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "12440", "lr": "8.29392e-05", "gnorm": "9.604", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2678"} 2023-01-29 16:56:23 | INFO | train_inner | {"epoch": 6, "update": 5.76, "s2c_loss": "0.836", "loss": "0.57935", "s2c_nll_loss": "0.836", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "12450", "lr": "8.30059e-05", "gnorm": "11.681", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2681"} 2023-01-29 16:56:25 | INFO | train_inner | {"epoch": 6, "update": 5.765, "s2c_loss": "1.024", "loss": "0.70959", "s2c_nll_loss": "1.024", "s2c_accuracy": "84.688", "s2c_total": "64", "s2c_n_correct": "54.2", "wps": "250.7", "ups": 
"3.92", "wpb": "64", "bsz": "64", "num_updates": "12460", "lr": "8.30725e-05", "gnorm": "11.499", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2683"} 2023-01-29 16:56:28 | INFO | train_inner | {"epoch": 6, "update": 5.769, "s2c_loss": "0.758", "loss": "0.52541", "s2c_nll_loss": "0.758", "s2c_accuracy": "85.469", "s2c_total": "64", "s2c_n_correct": "54.7", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "12470", "lr": "8.31392e-05", "gnorm": "11.062", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2686"} 2023-01-29 16:56:30 | INFO | train_inner | {"epoch": 6, "update": 5.774, "s2c_loss": "0.747", "loss": "0.5181", "s2c_nll_loss": "0.747", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "12480", "lr": "8.32058e-05", "gnorm": "9.521", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2688"} 2023-01-29 16:56:33 | INFO | train_inner | {"epoch": 6, "update": 5.778, "s2c_loss": "0.798", "loss": "0.55294", "s2c_nll_loss": "0.798", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "12490", "lr": "8.32725e-05", "gnorm": "10.38", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2691"} 2023-01-29 16:56:35 | INFO | train_inner | {"epoch": 6, "update": 5.783, "s2c_loss": "0.939", "loss": "0.65107", "s2c_nll_loss": "0.939", "s2c_accuracy": "86.25", "s2c_total": "64", "s2c_n_correct": "55.2", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "12500", "lr": "8.33392e-05", "gnorm": "10.114", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2693"} 2023-01-29 16:56:38 | INFO | train_inner | {"epoch": 6, "update": 5.788, "s2c_loss": "0.753", "loss": "0.52172", "s2c_nll_loss": "0.753", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "248.7", 
"ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "12510", "lr": "8.34058e-05", "gnorm": "10.452", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2696"} 2023-01-29 16:56:40 | INFO | train_inner | {"epoch": 6, "update": 5.792, "s2c_loss": "0.87", "loss": "0.6031", "s2c_nll_loss": "0.87", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "12520", "lr": "8.34725e-05", "gnorm": "10.542", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2698"} 2023-01-29 16:56:43 | INFO | train_inner | {"epoch": 6, "update": 5.797, "s2c_loss": "0.717", "loss": "0.49722", "s2c_nll_loss": "0.717", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "12530", "lr": "8.35392e-05", "gnorm": "8.868", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2701"} 2023-01-29 16:56:45 | INFO | train_inner | {"epoch": 6, "update": 5.802, "s2c_loss": "0.87", "loss": "0.60333", "s2c_nll_loss": "0.87", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "12540", "lr": "8.36058e-05", "gnorm": "9.159", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2703"} 2023-01-29 16:56:48 | INFO | train_inner | {"epoch": 6, "update": 5.806, "s2c_loss": "0.64", "loss": "0.44361", "s2c_nll_loss": "0.64", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "12550", "lr": "8.36725e-05", "gnorm": "9.211", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2706"} 2023-01-29 16:56:51 | INFO | train_inner | {"epoch": 6, "update": 5.811, "s2c_loss": "0.773", "loss": "0.53547", "s2c_nll_loss": "0.773", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "253.1", 
"ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "12560", "lr": "8.37391e-05", "gnorm": "9.534", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2708"} 2023-01-29 16:56:53 | INFO | train_inner | {"epoch": 6, "update": 5.815, "s2c_loss": "0.937", "loss": "0.64926", "s2c_nll_loss": "0.937", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "12570", "lr": "8.38058e-05", "gnorm": "9.579", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2711"} 2023-01-29 16:56:56 | INFO | train_inner | {"epoch": 6, "update": 5.82, "s2c_loss": "0.866", "loss": "0.59993", "s2c_nll_loss": "0.866", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "12580", "lr": "8.38725e-05", "gnorm": "9.325", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2714"} 2023-01-29 16:56:58 | INFO | train_inner | {"epoch": 6, "update": 5.825, "s2c_loss": "0.897", "loss": "0.62189", "s2c_nll_loss": "0.897", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "12590", "lr": "8.39391e-05", "gnorm": "9.761", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2716"} 2023-01-29 16:57:01 | INFO | train_inner | {"epoch": 6, "update": 5.829, "s2c_loss": "0.677", "loss": "0.46916", "s2c_nll_loss": "0.677", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "12600", "lr": "8.40058e-05", "gnorm": "8.543", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2719"} 2023-01-29 16:57:03 | INFO | train_inner | {"epoch": 6, "update": 5.834, "s2c_loss": "0.849", "loss": "0.58842", "s2c_nll_loss": "0.849", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": 
"247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "12610", "lr": "8.40725e-05", "gnorm": "9.662", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2721"} 2023-01-29 16:57:06 | INFO | train_inner | {"epoch": 6, "update": 5.839, "s2c_loss": "0.595", "loss": "0.41211", "s2c_nll_loss": "0.595", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "246.2", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "12620", "lr": "8.41391e-05", "gnorm": "8.027", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2724"} 2023-01-29 16:57:08 | INFO | train_inner | {"epoch": 6, "update": 5.843, "s2c_loss": "0.849", "loss": "0.58815", "s2c_nll_loss": "0.849", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "12630", "lr": "8.42058e-05", "gnorm": "9.677", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2726"} 2023-01-29 16:57:11 | INFO | train_inner | {"epoch": 6, "update": 5.848, "s2c_loss": "0.898", "loss": "0.62227", "s2c_nll_loss": "0.898", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "12640", "lr": "8.42725e-05", "gnorm": "9.305", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2729"} 2023-01-29 16:57:13 | INFO | train_inner | {"epoch": 6, "update": 5.852, "s2c_loss": "0.751", "loss": "0.52051", "s2c_nll_loss": "0.751", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "12650", "lr": "8.43391e-05", "gnorm": "9.016", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2731"} 2023-01-29 16:57:16 | INFO | train_inner | {"epoch": 6, "update": 5.857, "s2c_loss": "0.829", "loss": "0.57493", "s2c_nll_loss": "0.829", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": 
"57.2", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "12660", "lr": "8.44058e-05", "gnorm": "8.428", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2734"} 2023-01-29 16:57:18 | INFO | train_inner | {"epoch": 6, "update": 5.862, "s2c_loss": "0.741", "loss": "0.51338", "s2c_nll_loss": "0.741", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "12670", "lr": "8.44724e-05", "gnorm": "9.194", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2736"} 2023-01-29 16:57:21 | INFO | train_inner | {"epoch": 6, "update": 5.866, "s2c_loss": "0.73", "loss": "0.50571", "s2c_nll_loss": "0.73", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "12680", "lr": "8.45391e-05", "gnorm": "9.399", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2739"} 2023-01-29 16:57:24 | INFO | train_inner | {"epoch": 6, "update": 5.871, "s2c_loss": "0.744", "loss": "0.51569", "s2c_nll_loss": "0.744", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "12690", "lr": "8.46058e-05", "gnorm": "11.549", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2741"} 2023-01-29 16:57:26 | INFO | train_inner | {"epoch": 6, "update": 5.876, "s2c_loss": "0.699", "loss": "0.48483", "s2c_nll_loss": "0.699", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "12700", "lr": "8.46724e-05", "gnorm": "9.868", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "2744"} 2023-01-29 16:57:29 | INFO | train_inner | {"epoch": 6, "update": 5.88, "s2c_loss": "0.791", "loss": "0.54806", "s2c_nll_loss": "0.791", "s2c_accuracy": "88.281", "s2c_total": "64", 
"s2c_n_correct": "56.5", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "12710", "lr": "8.47391e-05", "gnorm": "10.362", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2747"} 2023-01-29 16:57:31 | INFO | train_inner | {"epoch": 6, "update": 5.885, "s2c_loss": "0.742", "loss": "0.51455", "s2c_nll_loss": "0.742", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "12720", "lr": "8.48058e-05", "gnorm": "9.993", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2749"} 2023-01-29 16:57:34 | INFO | train_inner | {"epoch": 6, "update": 5.889, "s2c_loss": "0.707", "loss": "0.49028", "s2c_nll_loss": "0.707", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "12730", "lr": "8.48724e-05", "gnorm": "9.151", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2752"} 2023-01-29 16:57:36 | INFO | train_inner | {"epoch": 6, "update": 5.894, "s2c_loss": "0.795", "loss": "0.55077", "s2c_nll_loss": "0.795", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "12740", "lr": "8.49391e-05", "gnorm": "9.561", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2754"} 2023-01-29 16:57:39 | INFO | train_inner | {"epoch": 6, "update": 5.899, "s2c_loss": "0.892", "loss": "0.61839", "s2c_nll_loss": "0.892", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "12750", "lr": "8.50058e-05", "gnorm": "10.413", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2757"} 2023-01-29 16:57:41 | INFO | train_inner | {"epoch": 6, "update": 5.903, "s2c_loss": "0.707", "loss": "0.49011", "s2c_nll_loss": "0.707", "s2c_accuracy": "88.75", 
"s2c_total": "64", "s2c_n_correct": "56.8", "wps": "245.9", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "12760", "lr": "8.50724e-05", "gnorm": "9.261", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2759"} 2023-01-29 16:57:44 | INFO | train_inner | {"epoch": 6, "update": 5.908, "s2c_loss": "0.627", "loss": "0.43466", "s2c_nll_loss": "0.627", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "12770", "lr": "8.51391e-05", "gnorm": "8.915", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2762"} 2023-01-29 16:57:46 | INFO | train_inner | {"epoch": 6, "update": 5.913, "s2c_loss": "0.773", "loss": "0.53549", "s2c_nll_loss": "0.773", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "12780", "lr": "8.52057e-05", "gnorm": "10.058", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2764"} 2023-01-29 16:57:49 | INFO | train_inner | {"epoch": 6, "update": 5.917, "s2c_loss": "0.936", "loss": "0.6485", "s2c_nll_loss": "0.936", "s2c_accuracy": "85.312", "s2c_total": "64", "s2c_n_correct": "54.6", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "12790", "lr": "8.52724e-05", "gnorm": "12.101", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2767"} 2023-01-29 16:57:52 | INFO | train_inner | {"epoch": 6, "update": 5.922, "s2c_loss": "0.909", "loss": "0.63034", "s2c_nll_loss": "0.909", "s2c_accuracy": "84.688", "s2c_total": "64", "s2c_n_correct": "54.2", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "12800", "lr": "8.53391e-05", "gnorm": "10.534", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2769"} 2023-01-29 16:57:54 | INFO | train_inner | {"epoch": 6, "update": 5.926, "s2c_loss": "0.906", "loss": "0.62778", "s2c_nll_loss": "0.906", "s2c_accuracy": 
"88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "246.8", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "12810", "lr": "8.54057e-05", "gnorm": "9.013", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2772"} 2023-01-29 16:57:57 | INFO | train_inner | {"epoch": 6, "update": 5.931, "s2c_loss": "0.848", "loss": "0.58802", "s2c_nll_loss": "0.848", "s2c_accuracy": "85.625", "s2c_total": "64", "s2c_n_correct": "54.8", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "12820", "lr": "8.54724e-05", "gnorm": "9.907", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2775"} 2023-01-29 16:57:59 | INFO | train_inner | {"epoch": 6, "update": 5.936, "s2c_loss": "0.813", "loss": "0.56342", "s2c_nll_loss": "0.813", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "12830", "lr": "8.55391e-05", "gnorm": "10.264", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2777"} 2023-01-29 16:58:02 | INFO | train_inner | {"epoch": 6, "update": 5.94, "s2c_loss": "0.881", "loss": "0.61062", "s2c_nll_loss": "0.881", "s2c_accuracy": "85.938", "s2c_total": "64", "s2c_n_correct": "55", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "12840", "lr": "8.56057e-05", "gnorm": "10.28", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2780"} 2023-01-29 16:58:04 | INFO | train_inner | {"epoch": 6, "update": 5.945, "s2c_loss": "1.104", "loss": "0.76536", "s2c_nll_loss": "1.104", "s2c_accuracy": "85", "s2c_total": "64", "s2c_n_correct": "54.4", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "12850", "lr": "8.56724e-05", "gnorm": "10.392", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2782"} 2023-01-29 16:58:07 | INFO | train_inner | {"epoch": 6, "update": 5.95, "s2c_loss": "0.888", "loss": "0.61531", "s2c_nll_loss": "0.888", "s2c_accuracy": 
"84.062", "s2c_total": "64", "s2c_n_correct": "53.8", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "12860", "lr": "8.5739e-05", "gnorm": "10.668", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2785"} 2023-01-29 16:58:09 | INFO | train_inner | {"epoch": 6, "update": 5.954, "s2c_loss": "0.751", "loss": "0.52025", "s2c_nll_loss": "0.751", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "12870", "lr": "8.58057e-05", "gnorm": "11.398", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2787"} 2023-01-29 16:58:12 | INFO | train_inner | {"epoch": 6, "update": 5.959, "s2c_loss": "0.94", "loss": "0.65163", "s2c_nll_loss": "0.94", "s2c_accuracy": "85.312", "s2c_total": "64", "s2c_n_correct": "54.6", "wps": "258.3", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "12880", "lr": "8.58724e-05", "gnorm": "11.096", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2790"} 2023-01-29 16:58:14 | INFO | train_inner | {"epoch": 6, "update": 5.963, "s2c_loss": "0.678", "loss": "0.4702", "s2c_nll_loss": "0.678", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "12890", "lr": "8.5939e-05", "gnorm": "10.366", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2792"} 2023-01-29 16:58:17 | INFO | train_inner | {"epoch": 6, "update": 5.968, "s2c_loss": "0.917", "loss": "0.63555", "s2c_nll_loss": "0.917", "s2c_accuracy": "86.094", "s2c_total": "64", "s2c_n_correct": "55.1", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "12900", "lr": "8.60057e-05", "gnorm": "9.303", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2795"} 2023-01-29 16:58:19 | INFO | train_inner | {"epoch": 6, "update": 5.973, "s2c_loss": "0.737", "loss": "0.51103", "s2c_nll_loss": "0.737", 
"s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "12910", "lr": "8.60724e-05", "gnorm": "10.366", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2797"} 2023-01-29 16:58:22 | INFO | train_inner | {"epoch": 6, "update": 5.977, "s2c_loss": "0.918", "loss": "0.63605", "s2c_nll_loss": "0.918", "s2c_accuracy": "85.312", "s2c_total": "64", "s2c_n_correct": "54.6", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "12920", "lr": "8.6139e-05", "gnorm": "9.429", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2800"} 2023-01-29 16:58:24 | INFO | train_inner | {"epoch": 6, "update": 5.982, "s2c_loss": "0.862", "loss": "0.59765", "s2c_nll_loss": "0.862", "s2c_accuracy": "84.219", "s2c_total": "64", "s2c_n_correct": "53.9", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "12930", "lr": "8.62057e-05", "gnorm": "11.299", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2802"} 2023-01-29 16:58:27 | INFO | train_inner | {"epoch": 6, "update": 5.987, "s2c_loss": "1.036", "loss": "0.71814", "s2c_nll_loss": "1.036", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "12940", "lr": "8.62724e-05", "gnorm": "9.844", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "2805"} 2023-01-29 16:58:29 | INFO | train_inner | {"epoch": 6, "update": 5.991, "s2c_loss": "0.735", "loss": "0.50943", "s2c_nll_loss": "0.735", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "12950", "lr": "8.6339e-05", "gnorm": "9.198", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2807"} 2023-01-29 16:58:32 | INFO | train_inner | {"epoch": 6, "update": 5.996, "s2c_loss": "0.836", "loss": "0.57959", 
"s2c_nll_loss": "0.836", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "12960", "lr": "8.64057e-05", "gnorm": "9.707", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2810"} 2023-01-29 16:58:34 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 6 @ 12969 updates 2023-01-29 16:58:34 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 16:58:41 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 16:58:41 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 6 @ 12969 updates, score None) (writing took 6.885793385095894 seconds) 2023-01-29 16:58:41 | INFO | fairseq_cli.train | end of epoch 6 (average epoch stats below) 2023-01-29 16:58:41 | INFO | train | {"epoch": 6, "train_s2c_loss": "0.8", "train_loss": "0.55449", "train_s2c_nll_loss": "0.8", "train_s2c_accuracy": "88.062", "train_s2c_total": "63.9838", "train_s2c_n_correct": "56.3454", "train_wps": "245.3", "train_ups": "3.83", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "12969", "train_lr": "8.64657e-05", "train_gnorm": "9.487", "train_loss_scale": "512", "train_train_wall": "543", "train_gb_free": "7.5", "train_wall": "2819"} 2023-01-29 16:58:48 | INFO | fairseq.trainer | begin training epoch 7 2023-01-29 16:58:48 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 16:58:48 | INFO | train_inner | {"epoch": 7, "update": 6.0, "s2c_loss": "0.753", "loss": "0.52228", "s2c_nll_loss": "0.753", "s2c_accuracy": "87.993", "s2c_total": "60.8", "s2c_n_correct": "53.5", "wps": "38.5", "ups": "0.63", "wpb": "60.8", "bsz": "60.8", "num_updates": "12970", "lr": "8.64723e-05", "gnorm": "9.181", "loss_scale": "512", "train_wall": 
"2", "gb_free": "7.2", "wall": "2826"} 2023-01-29 16:58:50 | INFO | train_inner | {"epoch": 7, "update": 6.005, "s2c_loss": "0.611", "loss": "0.42346", "s2c_nll_loss": "0.611", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "244.3", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "12980", "lr": "8.6539e-05", "gnorm": "9.412", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2828"} 2023-01-29 16:58:53 | INFO | train_inner | {"epoch": 7, "update": 6.01, "s2c_loss": "0.683", "loss": "0.47366", "s2c_nll_loss": "0.683", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "12990", "lr": "8.66057e-05", "gnorm": "8.789", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "2831"} 2023-01-29 16:58:56 | INFO | train_inner | {"epoch": 7, "update": 6.014, "s2c_loss": "0.698", "loss": "0.48402", "s2c_nll_loss": "0.698", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "13000", "lr": "8.66723e-05", "gnorm": "9.388", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2833"} 2023-01-29 16:58:58 | INFO | train_inner | {"epoch": 7, "update": 6.019, "s2c_loss": "0.65", "loss": "0.45032", "s2c_nll_loss": "0.65", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "13010", "lr": "8.6739e-05", "gnorm": "10.305", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2836"} 2023-01-29 16:59:01 | INFO | train_inner | {"epoch": 7, "update": 6.024, "s2c_loss": "0.675", "loss": "0.46782", "s2c_nll_loss": "0.675", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "256.3", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "13020", "lr": "8.68057e-05", "gnorm": "8.891", "loss_scale": "512", "train_wall": 
"2", "gb_free": "7.3", "wall": "2839"} 2023-01-29 16:59:03 | INFO | train_inner | {"epoch": 7, "update": 6.028, "s2c_loss": "0.709", "loss": "0.49155", "s2c_nll_loss": "0.709", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "13030", "lr": "8.68723e-05", "gnorm": "8.566", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2841"} 2023-01-29 16:59:06 | INFO | train_inner | {"epoch": 7, "update": 6.033, "s2c_loss": "0.769", "loss": "0.53306", "s2c_nll_loss": "0.769", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "13040", "lr": "8.6939e-05", "gnorm": "8.612", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2844"} 2023-01-29 16:59:08 | INFO | train_inner | {"epoch": 7, "update": 6.037, "s2c_loss": "0.458", "loss": "0.31732", "s2c_nll_loss": "0.458", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "13050", "lr": "8.70057e-05", "gnorm": "7.249", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2846"} 2023-01-29 16:59:11 | INFO | train_inner | {"epoch": 7, "update": 6.042, "s2c_loss": "0.67", "loss": "0.4641", "s2c_nll_loss": "0.67", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "13060", "lr": "8.70723e-05", "gnorm": "8.972", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2849"} 2023-01-29 16:59:13 | INFO | train_inner | {"epoch": 7, "update": 6.047, "s2c_loss": "0.832", "loss": "0.577", "s2c_nll_loss": "0.832", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "13070", "lr": "8.7139e-05", "gnorm": "7.764", "loss_scale": "512", 
"train_wall": "3", "gb_free": "7.3", "wall": "2851"} 2023-01-29 16:59:16 | INFO | train_inner | {"epoch": 7, "update": 6.051, "s2c_loss": "0.688", "loss": "0.47663", "s2c_nll_loss": "0.688", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "258.1", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "13080", "lr": "8.72056e-05", "gnorm": "8.32", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2854"} 2023-01-29 16:59:18 | INFO | train_inner | {"epoch": 7, "update": 6.056, "s2c_loss": "0.704", "loss": "0.48773", "s2c_nll_loss": "0.704", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "13090", "lr": "8.72723e-05", "gnorm": "9.521", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2856"} 2023-01-29 16:59:21 | INFO | train_inner | {"epoch": 7, "update": 6.061, "s2c_loss": "0.617", "loss": "0.42798", "s2c_nll_loss": "0.617", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "13100", "lr": "8.7339e-05", "gnorm": "8.985", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2859"} 2023-01-29 16:59:24 | INFO | train_inner | {"epoch": 7, "update": 6.065, "s2c_loss": "0.585", "loss": "0.40524", "s2c_nll_loss": "0.585", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "241.3", "ups": "3.77", "wpb": "64", "bsz": "64", "num_updates": "13110", "lr": "8.74056e-05", "gnorm": "8.099", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2861"} 2023-01-29 16:59:26 | INFO | train_inner | {"epoch": 7, "update": 6.07, "s2c_loss": "0.591", "loss": "0.40964", "s2c_nll_loss": "0.591", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "13120", "lr": "8.74723e-05", "gnorm": "8.901", "loss_scale": "512", 
"train_wall": "3", "gb_free": "7.2", "wall": "2864"} 2023-01-29 16:59:29 | INFO | train_inner | {"epoch": 7, "update": 6.074, "s2c_loss": "0.847", "loss": "0.58703", "s2c_nll_loss": "0.847", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "13130", "lr": "8.7539e-05", "gnorm": "8.861", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2867"} 2023-01-29 16:59:31 | INFO | train_inner | {"epoch": 7, "update": 6.079, "s2c_loss": "0.707", "loss": "0.48863", "s2c_nll_loss": "0.707", "s2c_accuracy": "87.755", "s2c_total": "63.7", "s2c_n_correct": "55.9", "wps": "250.7", "ups": "3.94", "wpb": "63.7", "bsz": "63.7", "num_updates": "13140", "lr": "8.76056e-05", "gnorm": "8.31", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2869"} 2023-01-29 16:59:34 | INFO | train_inner | {"epoch": 7, "update": 6.084, "s2c_loss": "0.561", "loss": "0.38919", "s2c_nll_loss": "0.561", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "13150", "lr": "8.76723e-05", "gnorm": "8.282", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "2872"} 2023-01-29 16:59:36 | INFO | train_inner | {"epoch": 7, "update": 6.088, "s2c_loss": "0.705", "loss": "0.48872", "s2c_nll_loss": "0.705", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "13160", "lr": "8.77389e-05", "gnorm": "8.294", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2874"} 2023-01-29 16:59:39 | INFO | train_inner | {"epoch": 7, "update": 6.093, "s2c_loss": "0.786", "loss": "0.54501", "s2c_nll_loss": "0.786", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "13170", "lr": "8.78056e-05", "gnorm": "8.477", 
"loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2877"} 2023-01-29 16:59:41 | INFO | train_inner | {"epoch": 7, "update": 6.098, "s2c_loss": "0.558", "loss": "0.38702", "s2c_nll_loss": "0.558", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "13180", "lr": "8.78723e-05", "gnorm": "9.563", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2879"} 2023-01-29 16:59:44 | INFO | train_inner | {"epoch": 7, "update": 6.102, "s2c_loss": "0.581", "loss": "0.4029", "s2c_nll_loss": "0.581", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "13190", "lr": "8.79389e-05", "gnorm": "8.146", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2882"} 2023-01-29 16:59:46 | INFO | train_inner | {"epoch": 7, "update": 6.107, "s2c_loss": "0.683", "loss": "0.47318", "s2c_nll_loss": "0.683", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "13200", "lr": "8.80056e-05", "gnorm": "9.862", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2884"} 2023-01-29 16:59:49 | INFO | train_inner | {"epoch": 7, "update": 6.111, "s2c_loss": "0.64", "loss": "0.4439", "s2c_nll_loss": "0.64", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "13210", "lr": "8.80723e-05", "gnorm": "9.518", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2887"} 2023-01-29 16:59:51 | INFO | train_inner | {"epoch": 7, "update": 6.116, "s2c_loss": "0.637", "loss": "0.44172", "s2c_nll_loss": "0.637", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "13220", "lr": "8.81389e-05", "gnorm": 
"7.934", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2889"} 2023-01-29 16:59:54 | INFO | train_inner | {"epoch": 7, "update": 6.121, "s2c_loss": "0.849", "loss": "0.58829", "s2c_nll_loss": "0.849", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "247", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "13230", "lr": "8.82056e-05", "gnorm": "9.523", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2892"} 2023-01-29 16:59:57 | INFO | train_inner | {"epoch": 7, "update": 6.125, "s2c_loss": "0.545", "loss": "0.37762", "s2c_nll_loss": "0.545", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "13240", "lr": "8.82723e-05", "gnorm": "8.778", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2895"} 2023-01-29 16:59:59 | INFO | train_inner | {"epoch": 7, "update": 6.13, "s2c_loss": "0.751", "loss": "0.52029", "s2c_nll_loss": "0.751", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "13250", "lr": "8.83389e-05", "gnorm": "9.055", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2897"} 2023-01-29 17:00:02 | INFO | train_inner | {"epoch": 7, "update": 6.135, "s2c_loss": "0.647", "loss": "0.44871", "s2c_nll_loss": "0.647", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "13260", "lr": "8.84056e-05", "gnorm": "8.888", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2900"} 2023-01-29 17:00:04 | INFO | train_inner | {"epoch": 7, "update": 6.139, "s2c_loss": "0.604", "loss": "0.41838", "s2c_nll_loss": "0.604", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "13270", "lr": "8.84722e-05", 
"gnorm": "8.749", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2902"} 2023-01-29 17:00:07 | INFO | train_inner | {"epoch": 7, "update": 6.144, "s2c_loss": "0.664", "loss": "0.46036", "s2c_nll_loss": "0.664", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "13280", "lr": "8.85389e-05", "gnorm": "8.134", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2905"} 2023-01-29 17:00:09 | INFO | train_inner | {"epoch": 7, "update": 6.148, "s2c_loss": "0.789", "loss": "0.54705", "s2c_nll_loss": "0.789", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "13290", "lr": "8.86056e-05", "gnorm": "9.896", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2907"} 2023-01-29 17:00:12 | INFO | train_inner | {"epoch": 7, "update": 6.153, "s2c_loss": "0.622", "loss": "0.43081", "s2c_nll_loss": "0.622", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "13300", "lr": "8.86722e-05", "gnorm": "8.046", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2910"} 2023-01-29 17:00:14 | INFO | train_inner | {"epoch": 7, "update": 6.158, "s2c_loss": "0.803", "loss": "0.55642", "s2c_nll_loss": "0.803", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "13310", "lr": "8.87389e-05", "gnorm": "8.581", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2912"} 2023-01-29 17:00:17 | INFO | train_inner | {"epoch": 7, "update": 6.162, "s2c_loss": "0.579", "loss": "0.401", "s2c_nll_loss": "0.579", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "13320", "lr": 
"8.88056e-05", "gnorm": "8.514", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2915"} 2023-01-29 17:00:19 | INFO | train_inner | {"epoch": 7, "update": 6.167, "s2c_loss": "0.713", "loss": "0.49408", "s2c_nll_loss": "0.713", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "13330", "lr": "8.88722e-05", "gnorm": "9.113", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2917"} 2023-01-29 17:00:22 | INFO | train_inner | {"epoch": 7, "update": 6.172, "s2c_loss": "0.829", "loss": "0.57471", "s2c_nll_loss": "0.829", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "13340", "lr": "8.89389e-05", "gnorm": "10.381", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2920"} 2023-01-29 17:00:25 | INFO | train_inner | {"epoch": 7, "update": 6.176, "s2c_loss": "0.734", "loss": "0.50896", "s2c_nll_loss": "0.734", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "13350", "lr": "8.90056e-05", "gnorm": "9.561", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2922"} 2023-01-29 17:00:27 | INFO | train_inner | {"epoch": 7, "update": 6.181, "s2c_loss": "0.81", "loss": "0.56127", "s2c_nll_loss": "0.81", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "13360", "lr": "8.90722e-05", "gnorm": "8.773", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2925"} 2023-01-29 17:00:30 | INFO | train_inner | {"epoch": 7, "update": 6.185, "s2c_loss": "0.614", "loss": "0.42562", "s2c_nll_loss": "0.614", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "13370", "lr": 
"8.91389e-05", "gnorm": "8.607", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2928"} 2023-01-29 17:00:32 | INFO | train_inner | {"epoch": 7, "update": 6.19, "s2c_loss": "0.848", "loss": "0.58759", "s2c_nll_loss": "0.848", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "13380", "lr": "8.92055e-05", "gnorm": "9.173", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2930"} 2023-01-29 17:00:35 | INFO | train_inner | {"epoch": 7, "update": 6.195, "s2c_loss": "0.652", "loss": "0.4522", "s2c_nll_loss": "0.652", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "13390", "lr": "8.92722e-05", "gnorm": "8.128", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2933"} 2023-01-29 17:00:37 | INFO | train_inner | {"epoch": 7, "update": 6.199, "s2c_loss": "0.646", "loss": "0.448", "s2c_nll_loss": "0.646", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "246.2", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "13400", "lr": "8.93389e-05", "gnorm": "9.137", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2935"} 2023-01-29 17:00:40 | INFO | train_inner | {"epoch": 7, "update": 6.204, "s2c_loss": "0.719", "loss": "0.49827", "s2c_nll_loss": "0.719", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "13410", "lr": "8.94055e-05", "gnorm": "9.161", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2938"} 2023-01-29 17:00:42 | INFO | train_inner | {"epoch": 7, "update": 6.209, "s2c_loss": "0.717", "loss": "0.49669", "s2c_nll_loss": "0.717", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "13420", 
"lr": "8.94722e-05", "gnorm": "8.997", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2940"} 2023-01-29 17:00:45 | INFO | train_inner | {"epoch": 7, "update": 6.213, "s2c_loss": "0.739", "loss": "0.51199", "s2c_nll_loss": "0.739", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "13430", "lr": "8.95389e-05", "gnorm": "8.089", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2943"} 2023-01-29 17:00:47 | INFO | train_inner | {"epoch": 7, "update": 6.218, "s2c_loss": "0.831", "loss": "0.57574", "s2c_nll_loss": "0.831", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "13440", "lr": "8.96055e-05", "gnorm": "8.885", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2945"} 2023-01-29 17:00:50 | INFO | train_inner | {"epoch": 7, "update": 6.222, "s2c_loss": "0.822", "loss": "0.56983", "s2c_nll_loss": "0.822", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "13450", "lr": "8.96722e-05", "gnorm": "9.241", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2948"} 2023-01-29 17:00:52 | INFO | train_inner | {"epoch": 7, "update": 6.227, "s2c_loss": "0.682", "loss": "0.47298", "s2c_nll_loss": "0.682", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "13460", "lr": "8.97388e-05", "gnorm": "10.512", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "2950"} 2023-01-29 17:00:55 | INFO | train_inner | {"epoch": 7, "update": 6.232, "s2c_loss": "0.688", "loss": "0.47677", "s2c_nll_loss": "0.688", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "246", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": 
"13470", "lr": "8.98055e-05", "gnorm": "10.779", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2953"} 2023-01-29 17:00:57 | INFO | train_inner | {"epoch": 7, "update": 6.236, "s2c_loss": "0.733", "loss": "0.50811", "s2c_nll_loss": "0.733", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "13480", "lr": "8.98722e-05", "gnorm": "9.859", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2955"} 2023-01-29 17:01:00 | INFO | train_inner | {"epoch": 7, "update": 6.241, "s2c_loss": "0.709", "loss": "0.49155", "s2c_nll_loss": "0.709", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "13490", "lr": "8.99388e-05", "gnorm": "11.422", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "2958"} 2023-01-29 17:01:03 | INFO | train_inner | {"epoch": 7, "update": 6.246, "s2c_loss": "0.801", "loss": "0.55493", "s2c_nll_loss": "0.801", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "13500", "lr": "9.00055e-05", "gnorm": "8.784", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "2960"} 2023-01-29 17:01:05 | INFO | train_inner | {"epoch": 7, "update": 6.25, "s2c_loss": "0.594", "loss": "0.41196", "s2c_nll_loss": "0.594", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "13510", "lr": "9.00722e-05", "gnorm": "7.693", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2963"} 2023-01-29 17:01:08 | INFO | train_inner | {"epoch": 7, "update": 6.255, "s2c_loss": "0.633", "loss": "0.43846", "s2c_nll_loss": "0.633", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", 
"num_updates": "13520", "lr": "9.01388e-05", "gnorm": "8.46", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2965"} 2023-01-29 17:01:10 | INFO | train_inner | {"epoch": 7, "update": 6.259, "s2c_loss": "0.665", "loss": "0.46102", "s2c_nll_loss": "0.665", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "13530", "lr": "9.02055e-05", "gnorm": "8.665", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2968"} 2023-01-29 17:01:13 | INFO | train_inner | {"epoch": 7, "update": 6.264, "s2c_loss": "0.637", "loss": "0.44128", "s2c_nll_loss": "0.637", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "13540", "lr": "9.02722e-05", "gnorm": "9.25", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2970"} 2023-01-29 17:01:15 | INFO | train_inner | {"epoch": 7, "update": 6.269, "s2c_loss": "0.658", "loss": "0.45586", "s2c_nll_loss": "0.658", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "13550", "lr": "9.03388e-05", "gnorm": "8.575", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "2973"} 2023-01-29 17:01:18 | INFO | train_inner | {"epoch": 7, "update": 6.273, "s2c_loss": "0.754", "loss": "0.52244", "s2c_nll_loss": "0.754", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "13560", "lr": "9.04055e-05", "gnorm": "8.741", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2976"} 2023-01-29 17:01:20 | INFO | train_inner | {"epoch": 7, "update": 6.278, "s2c_loss": "0.671", "loss": "0.46497", "s2c_nll_loss": "0.671", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": 
"64", "num_updates": "13570", "lr": "9.04721e-05", "gnorm": "8.445", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2978"} 2023-01-29 17:01:23 | INFO | train_inner | {"epoch": 7, "update": 6.283, "s2c_loss": "0.69", "loss": "0.4781", "s2c_nll_loss": "0.69", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "13580", "lr": "9.05388e-05", "gnorm": "8.501", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2981"} 2023-01-29 17:01:25 | INFO | train_inner | {"epoch": 7, "update": 6.287, "s2c_loss": "0.619", "loss": "0.42926", "s2c_nll_loss": "0.619", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "13590", "lr": "9.06055e-05", "gnorm": "7.865", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "2983"} 2023-01-29 17:01:28 | INFO | train_inner | {"epoch": 7, "update": 6.292, "s2c_loss": "0.611", "loss": "0.42342", "s2c_nll_loss": "0.611", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "13600", "lr": "9.06721e-05", "gnorm": "8.69", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2986"} 2023-01-29 17:01:30 | INFO | train_inner | {"epoch": 7, "update": 6.296, "s2c_loss": "0.808", "loss": "0.56013", "s2c_nll_loss": "0.808", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "13610", "lr": "9.07388e-05", "gnorm": "8.624", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2988"} 2023-01-29 17:01:33 | INFO | train_inner | {"epoch": 7, "update": 6.301, "s2c_loss": "0.661", "loss": "0.45826", "s2c_nll_loss": "0.661", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": 
"64", "num_updates": "13620", "lr": "9.08055e-05", "gnorm": "10.183", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2991"} 2023-01-29 17:01:35 | INFO | train_inner | {"epoch": 7, "update": 6.306, "s2c_loss": "0.757", "loss": "0.52455", "s2c_nll_loss": "0.757", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "13630", "lr": "9.08721e-05", "gnorm": "10.661", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "2993"} 2023-01-29 17:01:38 | INFO | train_inner | {"epoch": 7, "update": 6.31, "s2c_loss": "0.729", "loss": "0.50552", "s2c_nll_loss": "0.729", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "13640", "lr": "9.09388e-05", "gnorm": "10.662", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "2996"} 2023-01-29 17:01:40 | INFO | train_inner | {"epoch": 7, "update": 6.315, "s2c_loss": "0.772", "loss": "0.53499", "s2c_nll_loss": "0.772", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "13650", "lr": "9.10055e-05", "gnorm": "9.375", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "2998"} 2023-01-29 17:01:43 | INFO | train_inner | {"epoch": 7, "update": 6.32, "s2c_loss": "0.642", "loss": "0.4447", "s2c_nll_loss": "0.642", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "13660", "lr": "9.10721e-05", "gnorm": "8.327", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3001"} 2023-01-29 17:01:46 | INFO | train_inner | {"epoch": 7, "update": 6.324, "s2c_loss": "0.677", "loss": "0.46948", "s2c_nll_loss": "0.677", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "250.1", "ups": "3.91", "wpb": "64", 
"bsz": "64", "num_updates": "13670", "lr": "9.11388e-05", "gnorm": "10.442", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "3003"} 2023-01-29 17:01:48 | INFO | train_inner | {"epoch": 7, "update": 6.329, "s2c_loss": "0.803", "loss": "0.55648", "s2c_nll_loss": "0.803", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "13680", "lr": "9.12054e-05", "gnorm": "9.21", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "3006"} 2023-01-29 17:01:51 | INFO | train_inner | {"epoch": 7, "update": 6.333, "s2c_loss": "0.684", "loss": "0.4741", "s2c_nll_loss": "0.684", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "13690", "lr": "9.12721e-05", "gnorm": "8.639", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3009"} 2023-01-29 17:01:53 | INFO | train_inner | {"epoch": 7, "update": 6.338, "s2c_loss": "0.606", "loss": "0.42018", "s2c_nll_loss": "0.606", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "13700", "lr": "9.13388e-05", "gnorm": "9.211", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3011"} 2023-01-29 17:01:56 | INFO | train_inner | {"epoch": 7, "update": 6.343, "s2c_loss": "0.885", "loss": "0.61313", "s2c_nll_loss": "0.885", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "13710", "lr": "9.14054e-05", "gnorm": "9.366", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "3014"} 2023-01-29 17:01:58 | INFO | train_inner | {"epoch": 7, "update": 6.347, "s2c_loss": "0.664", "loss": "0.46033", "s2c_nll_loss": "0.664", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "250.3", "ups": "3.91", "wpb": 
"64", "bsz": "64", "num_updates": "13720", "lr": "9.14721e-05", "gnorm": "8.976", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3016"} 2023-01-29 17:02:01 | INFO | train_inner | {"epoch": 7, "update": 6.352, "s2c_loss": "0.732", "loss": "0.50722", "s2c_nll_loss": "0.732", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "13730", "lr": "9.15388e-05", "gnorm": "10.111", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "3019"} 2023-01-29 17:02:03 | INFO | train_inner | {"epoch": 7, "update": 6.357, "s2c_loss": "0.813", "loss": "0.5635", "s2c_nll_loss": "0.813", "s2c_accuracy": "86.094", "s2c_total": "64", "s2c_n_correct": "55.1", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "13740", "lr": "9.16054e-05", "gnorm": "10.842", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "3021"} 2023-01-29 17:02:06 | INFO | train_inner | {"epoch": 7, "update": 6.361, "s2c_loss": "0.506", "loss": "0.35044", "s2c_nll_loss": "0.506", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "13750", "lr": "9.16721e-05", "gnorm": "10.324", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "3024"} 2023-01-29 17:02:08 | INFO | train_inner | {"epoch": 7, "update": 6.366, "s2c_loss": "0.788", "loss": "0.54615", "s2c_nll_loss": "0.788", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "13760", "lr": "9.17387e-05", "gnorm": "10.588", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "3026"} 2023-01-29 17:02:11 | INFO | train_inner | {"epoch": 7, "update": 6.37, "s2c_loss": "0.887", "loss": "0.61503", "s2c_nll_loss": "0.887", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "260.1", "ups": 
"4.06", "wpb": "64", "bsz": "64", "num_updates": "13770", "lr": "9.18054e-05", "gnorm": "9.094", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "3029"} 2023-01-29 17:02:13 | INFO | train_inner | {"epoch": 7, "update": 6.375, "s2c_loss": "0.612", "loss": "0.42416", "s2c_nll_loss": "0.612", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "13780", "lr": "9.18721e-05", "gnorm": "8.847", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "3031"} 2023-01-29 17:02:16 | INFO | train_inner | {"epoch": 7, "update": 6.38, "s2c_loss": "0.699", "loss": "0.48446", "s2c_nll_loss": "0.699", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "13790", "lr": "9.19387e-05", "gnorm": "10.331", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "3034"} 2023-01-29 17:02:18 | INFO | train_inner | {"epoch": 7, "update": 6.384, "s2c_loss": "0.75", "loss": "0.51999", "s2c_nll_loss": "0.75", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "13800", "lr": "9.20054e-05", "gnorm": "9.277", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3036"} 2023-01-29 17:02:21 | INFO | train_inner | {"epoch": 7, "update": 6.389, "s2c_loss": "0.679", "loss": "0.47081", "s2c_nll_loss": "0.679", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "13810", "lr": "9.20721e-05", "gnorm": "9.45", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "3039"} 2023-01-29 17:02:24 | INFO | train_inner | {"epoch": 7, "update": 6.394, "s2c_loss": "0.536", "loss": "0.37175", "s2c_nll_loss": "0.536", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "252", "ups": 
"3.94", "wpb": "64", "bsz": "64", "num_updates": "13820", "lr": "9.21387e-05", "gnorm": "7.858", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3041"} 2023-01-29 17:02:26 | INFO | train_inner | {"epoch": 7, "update": 6.398, "s2c_loss": "0.741", "loss": "0.51379", "s2c_nll_loss": "0.741", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "13830", "lr": "9.22054e-05", "gnorm": "9.863", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "3044"} 2023-01-29 17:02:29 | INFO | train_inner | {"epoch": 7, "update": 6.403, "s2c_loss": "0.849", "loss": "0.58872", "s2c_nll_loss": "0.849", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "13840", "lr": "9.22721e-05", "gnorm": "8.426", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "3047"} 2023-01-29 17:02:31 | INFO | train_inner | {"epoch": 7, "update": 6.407, "s2c_loss": "0.654", "loss": "0.45322", "s2c_nll_loss": "0.654", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "13850", "lr": "9.23387e-05", "gnorm": "8.408", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "3049"} 2023-01-29 17:02:34 | INFO | train_inner | {"epoch": 7, "update": 6.412, "s2c_loss": "0.717", "loss": "0.49668", "s2c_nll_loss": "0.717", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "258.6", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "13860", "lr": "9.24054e-05", "gnorm": "7.81", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "3052"} 2023-01-29 17:02:36 | INFO | train_inner | {"epoch": 7, "update": 6.417, "s2c_loss": "0.873", "loss": "0.60505", "s2c_nll_loss": "0.873", "s2c_accuracy": "86.25", "s2c_total": "64", "s2c_n_correct": "55.2", "wps": "256", 
"ups": "4", "wpb": "64", "bsz": "64", "num_updates": "13870", "lr": "9.2472e-05", "gnorm": "9.522", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "3054"} 2023-01-29 17:02:39 | INFO | train_inner | {"epoch": 7, "update": 6.421, "s2c_loss": "0.644", "loss": "0.44611", "s2c_nll_loss": "0.644", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "246", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "13880", "lr": "9.25387e-05", "gnorm": "8.279", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "3057"} 2023-01-29 17:02:41 | INFO | train_inner | {"epoch": 7, "update": 6.426, "s2c_loss": "0.732", "loss": "0.50717", "s2c_nll_loss": "0.732", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "13890", "lr": "9.26054e-05", "gnorm": "9.897", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "3059"} 2023-01-29 17:02:44 | INFO | train_inner | {"epoch": 7, "update": 6.431, "s2c_loss": "0.757", "loss": "0.52438", "s2c_nll_loss": "0.757", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "257.6", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "13900", "lr": "9.2672e-05", "gnorm": "8.868", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "3062"} 2023-01-29 17:02:46 | INFO | train_inner | {"epoch": 7, "update": 6.435, "s2c_loss": "0.669", "loss": "0.46351", "s2c_nll_loss": "0.669", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "13910", "lr": "9.27387e-05", "gnorm": "10.03", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "3064"} 2023-01-29 17:02:49 | INFO | train_inner | {"epoch": 7, "update": 6.44, "s2c_loss": "0.606", "loss": "0.42033", "s2c_nll_loss": "0.606", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": 
"251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "13920", "lr": "9.28054e-05", "gnorm": "8.172", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "3067"} 2023-01-29 17:02:51 | INFO | train_inner | {"epoch": 7, "update": 6.444, "s2c_loss": "0.613", "loss": "0.42468", "s2c_nll_loss": "0.613", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "243.4", "ups": "3.8", "wpb": "64", "bsz": "64", "num_updates": "13930", "lr": "9.2872e-05", "gnorm": "9.278", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3069"} 2023-01-29 17:02:54 | INFO | train_inner | {"epoch": 7, "update": 6.449, "s2c_loss": "0.56", "loss": "0.38847", "s2c_nll_loss": "0.56", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "13940", "lr": "9.29387e-05", "gnorm": "8.333", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "3072"} 2023-01-29 17:02:57 | INFO | train_inner | {"epoch": 7, "update": 6.454, "s2c_loss": "0.603", "loss": "0.41815", "s2c_nll_loss": "0.603", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "13950", "lr": "9.30054e-05", "gnorm": "9.937", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "3075"} 2023-01-29 17:02:59 | INFO | train_inner | {"epoch": 7, "update": 6.458, "s2c_loss": "0.713", "loss": "0.494", "s2c_nll_loss": "0.713", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "13960", "lr": "9.3072e-05", "gnorm": "9.335", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "3077"} 2023-01-29 17:03:02 | INFO | train_inner | {"epoch": 7, "update": 6.463, "s2c_loss": "0.556", "loss": "0.38554", "s2c_nll_loss": "0.556", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", 
"wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "13970", "lr": "9.31387e-05", "gnorm": "8.867", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3080"} 2023-01-29 17:03:04 | INFO | train_inner | {"epoch": 7, "update": 6.468, "s2c_loss": "0.683", "loss": "0.47331", "s2c_nll_loss": "0.683", "s2c_accuracy": "86.25", "s2c_total": "64", "s2c_n_correct": "55.2", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "13980", "lr": "9.32053e-05", "gnorm": "9.724", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3082"} 2023-01-29 17:03:07 | INFO | train_inner | {"epoch": 7, "update": 6.472, "s2c_loss": "0.663", "loss": "0.45973", "s2c_nll_loss": "0.663", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "13990", "lr": "9.3272e-05", "gnorm": "9.199", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "3085"} 2023-01-29 17:03:09 | INFO | train_inner | {"epoch": 7, "update": 6.477, "s2c_loss": "0.779", "loss": "0.53972", "s2c_nll_loss": "0.779", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "14000", "lr": "9.33387e-05", "gnorm": "8.978", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "3087"} 2023-01-29 17:03:12 | INFO | train_inner | {"epoch": 7, "update": 6.481, "s2c_loss": "0.724", "loss": "0.50155", "s2c_nll_loss": "0.724", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "14010", "lr": "9.34053e-05", "gnorm": "10.159", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3090"} 2023-01-29 17:03:14 | INFO | train_inner | {"epoch": 7, "update": 6.486, "s2c_loss": "0.632", "loss": "0.43794", "s2c_nll_loss": "0.632", "s2c_accuracy": "90.938", "s2c_total": "64", 
"s2c_n_correct": "58.2", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "14020", "lr": "9.3472e-05", "gnorm": "8.452", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "3092"} 2023-01-29 17:03:17 | INFO | train_inner | {"epoch": 7, "update": 6.491, "s2c_loss": "0.674", "loss": "0.467", "s2c_nll_loss": "0.674", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "14030", "lr": "9.35387e-05", "gnorm": "9.231", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "3095"} 2023-01-29 17:03:19 | INFO | train_inner | {"epoch": 7, "update": 6.495, "s2c_loss": "0.728", "loss": "0.50442", "s2c_nll_loss": "0.728", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "14040", "lr": "9.36053e-05", "gnorm": "9.991", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "3097"} 2023-01-29 17:03:22 | INFO | train_inner | {"epoch": 7, "update": 6.5, "s2c_loss": "0.682", "loss": "0.47257", "s2c_nll_loss": "0.682", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "14050", "lr": "9.3672e-05", "gnorm": "8.509", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3100"} 2023-01-29 17:03:24 | INFO | train_inner | {"epoch": 7, "update": 6.505, "s2c_loss": "0.783", "loss": "0.54273", "s2c_nll_loss": "0.783", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "14060", "lr": "9.37386e-05", "gnorm": "9.093", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "3102"} 2023-01-29 17:03:27 | INFO | train_inner | {"epoch": 7, "update": 6.509, "s2c_loss": "0.848", "loss": "0.58809", "s2c_nll_loss": "0.848", "s2c_accuracy": "85.781", "s2c_total": 
"64", "s2c_n_correct": "54.9", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "14070", "lr": "9.38053e-05", "gnorm": "9.929", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "3105"} 2023-01-29 17:03:30 | INFO | train_inner | {"epoch": 7, "update": 6.514, "s2c_loss": "0.546", "loss": "0.37821", "s2c_nll_loss": "0.546", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "14080", "lr": "9.3872e-05", "gnorm": "8.801", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "3107"} 2023-01-29 17:03:32 | INFO | train_inner | {"epoch": 7, "update": 6.519, "s2c_loss": "0.823", "loss": "0.57073", "s2c_nll_loss": "0.823", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "14090", "lr": "9.39386e-05", "gnorm": "9.44", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "3110"} 2023-01-29 17:03:35 | INFO | train_inner | {"epoch": 7, "update": 6.523, "s2c_loss": "0.663", "loss": "0.45974", "s2c_nll_loss": "0.663", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "247", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "14100", "lr": "9.40053e-05", "gnorm": "9.237", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "3113"} 2023-01-29 17:03:37 | INFO | train_inner | {"epoch": 7, "update": 6.528, "s2c_loss": "0.784", "loss": "0.54324", "s2c_nll_loss": "0.784", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "14110", "lr": "9.4072e-05", "gnorm": "9.685", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3115"} 2023-01-29 17:03:40 | INFO | train_inner | {"epoch": 7, "update": 6.532, "s2c_loss": "0.706", "loss": "0.48947", "s2c_nll_loss": "0.706", "s2c_accuracy": "87.188", 
"s2c_total": "64", "s2c_n_correct": "55.8", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "14120", "lr": "9.41386e-05", "gnorm": "9.819", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3118"} 2023-01-29 17:03:42 | INFO | train_inner | {"epoch": 7, "update": 6.537, "s2c_loss": "0.647", "loss": "0.44841", "s2c_nll_loss": "0.647", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "14130", "lr": "9.42053e-05", "gnorm": "8.658", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "3120"} 2023-01-29 17:03:45 | INFO | train_inner | {"epoch": 7, "update": 6.542, "s2c_loss": "0.589", "loss": "0.40813", "s2c_nll_loss": "0.589", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "14140", "lr": "9.4272e-05", "gnorm": "8.334", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3123"} 2023-01-29 17:03:48 | INFO | train_inner | {"epoch": 7, "update": 6.546, "s2c_loss": "0.675", "loss": "0.46796", "s2c_nll_loss": "0.675", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "246.1", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "14150", "lr": "9.43386e-05", "gnorm": "8.925", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3125"} 2023-01-29 17:03:50 | INFO | train_inner | {"epoch": 7, "update": 6.551, "s2c_loss": "0.703", "loss": "0.48762", "s2c_nll_loss": "0.703", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "249.3", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "14160", "lr": "9.44053e-05", "gnorm": "9.611", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "3128"} 2023-01-29 17:03:53 | INFO | train_inner | {"epoch": 7, "update": 6.556, "s2c_loss": "0.742", "loss": "0.51466", "s2c_nll_loss": "0.742", "s2c_accuracy": 
"87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "14170", "lr": "9.44719e-05", "gnorm": "9.726", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "3131"} 2023-01-29 17:03:55 | INFO | train_inner | {"epoch": 7, "update": 6.56, "s2c_loss": "0.638", "loss": "0.4419", "s2c_nll_loss": "0.638", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "244.6", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "14180", "lr": "9.45386e-05", "gnorm": "8.142", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "3133"} 2023-01-29 17:03:58 | INFO | train_inner | {"epoch": 7, "update": 6.565, "s2c_loss": "0.593", "loss": "0.41083", "s2c_nll_loss": "0.593", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "14190", "lr": "9.46053e-05", "gnorm": "8.387", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "3136"} 2023-01-29 17:04:00 | INFO | train_inner | {"epoch": 7, "update": 6.569, "s2c_loss": "0.819", "loss": "0.56799", "s2c_nll_loss": "0.819", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "14200", "lr": "9.46719e-05", "gnorm": "9.935", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "3138"} 2023-01-29 17:04:03 | INFO | train_inner | {"epoch": 7, "update": 6.574, "s2c_loss": "0.783", "loss": "0.54253", "s2c_nll_loss": "0.783", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "14210", "lr": "9.47386e-05", "gnorm": "9.112", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "3141"} 2023-01-29 17:04:05 | INFO | train_inner | {"epoch": 7, "update": 6.579, "s2c_loss": "0.649", "loss": "0.44974", "s2c_nll_loss": "0.649", 
"s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "258.1", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "14220", "lr": "9.48053e-05", "gnorm": "9.094", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "3143"} 2023-01-29 17:04:08 | INFO | train_inner | {"epoch": 7, "update": 6.583, "s2c_loss": "0.592", "loss": "0.41004", "s2c_nll_loss": "0.592", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "14230", "lr": "9.48719e-05", "gnorm": "8.607", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "3146"} 2023-01-29 17:04:10 | INFO | train_inner | {"epoch": 7, "update": 6.588, "s2c_loss": "0.68", "loss": "0.47107", "s2c_nll_loss": "0.68", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "14240", "lr": "9.49386e-05", "gnorm": "8.816", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "3148"} 2023-01-29 17:04:13 | INFO | train_inner | {"epoch": 7, "update": 6.593, "s2c_loss": "0.708", "loss": "0.49061", "s2c_nll_loss": "0.708", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "14250", "lr": "9.50053e-05", "gnorm": "9.439", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3151"} 2023-01-29 17:04:15 | INFO | train_inner | {"epoch": 7, "update": 6.597, "s2c_loss": "0.699", "loss": "0.48478", "s2c_nll_loss": "0.699", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "14260", "lr": "9.50719e-05", "gnorm": "9.623", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3153"} 2023-01-29 17:04:18 | INFO | train_inner | {"epoch": 7, "update": 6.602, "s2c_loss": "0.634", "loss": "0.43939", "s2c_nll_loss": 
"0.634", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "14270", "lr": "9.51386e-05", "gnorm": "8.431", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3156"} 2023-01-29 17:04:20 | INFO | train_inner | {"epoch": 7, "update": 6.606, "s2c_loss": "0.72", "loss": "0.49929", "s2c_nll_loss": "0.72", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "14280", "lr": "9.52052e-05", "gnorm": "9.896", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3158"} 2023-01-29 17:04:23 | INFO | train_inner | {"epoch": 7, "update": 6.611, "s2c_loss": "0.707", "loss": "0.4901", "s2c_nll_loss": "0.707", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "14290", "lr": "9.52719e-05", "gnorm": "9.05", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3161"} 2023-01-29 17:04:26 | INFO | train_inner | {"epoch": 7, "update": 6.616, "s2c_loss": "0.561", "loss": "0.3889", "s2c_nll_loss": "0.561", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "244.1", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "14300", "lr": "9.53386e-05", "gnorm": "8.462", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3164"} 2023-01-29 17:04:28 | INFO | train_inner | {"epoch": 7, "update": 6.62, "s2c_loss": "0.499", "loss": "0.34606", "s2c_nll_loss": "0.499", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "14310", "lr": "9.54052e-05", "gnorm": "7.886", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3166"} 2023-01-29 17:04:31 | INFO | train_inner | {"epoch": 7, "update": 6.625, "s2c_loss": "0.634", "loss": "0.43944", 
"s2c_nll_loss": "0.634", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "14320", "lr": "9.54719e-05", "gnorm": "8.937", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3169"} 2023-01-29 17:04:33 | INFO | train_inner | {"epoch": 7, "update": 6.63, "s2c_loss": "0.716", "loss": "0.4961", "s2c_nll_loss": "0.716", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "14330", "lr": "9.55386e-05", "gnorm": "9.202", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3171"} 2023-01-29 17:04:36 | INFO | train_inner | {"epoch": 7, "update": 6.634, "s2c_loss": "0.794", "loss": "0.5504", "s2c_nll_loss": "0.794", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "14340", "lr": "9.56052e-05", "gnorm": "10.88", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3174"} 2023-01-29 17:04:38 | INFO | train_inner | {"epoch": 7, "update": 6.639, "s2c_loss": "0.714", "loss": "0.49477", "s2c_nll_loss": "0.714", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "14350", "lr": "9.56719e-05", "gnorm": "8.799", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3176"} 2023-01-29 17:04:41 | INFO | train_inner | {"epoch": 7, "update": 6.643, "s2c_loss": "0.604", "loss": "0.41893", "s2c_nll_loss": "0.604", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "14360", "lr": "9.57385e-05", "gnorm": "9.138", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3179"} 2023-01-29 17:04:43 | INFO | train_inner | {"epoch": 7, "update": 6.648, "s2c_loss": "0.678", "loss": 
"0.47021", "s2c_nll_loss": "0.678", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "14370", "lr": "9.58052e-05", "gnorm": "7.967", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3181"} 2023-01-29 17:04:46 | INFO | train_inner | {"epoch": 7, "update": 6.653, "s2c_loss": "0.546", "loss": "0.37863", "s2c_nll_loss": "0.546", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "14380", "lr": "9.58719e-05", "gnorm": "8.177", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3184"} 2023-01-29 17:04:48 | INFO | train_inner | {"epoch": 7, "update": 6.657, "s2c_loss": "0.838", "loss": "0.58105", "s2c_nll_loss": "0.838", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "14390", "lr": "9.59385e-05", "gnorm": "9.066", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3186"} 2023-01-29 17:04:51 | INFO | train_inner | {"epoch": 7, "update": 6.662, "s2c_loss": "0.538", "loss": "0.37316", "s2c_nll_loss": "0.538", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "14400", "lr": "9.60052e-05", "gnorm": "8.644", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3189"} 2023-01-29 17:04:53 | INFO | train_inner | {"epoch": 7, "update": 6.667, "s2c_loss": "0.667", "loss": "0.46248", "s2c_nll_loss": "0.667", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "14410", "lr": "9.60719e-05", "gnorm": "8.469", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3191"} 2023-01-29 17:04:56 | INFO | train_inner | {"epoch": 7, "update": 6.671, "s2c_loss": 
"0.649", "loss": "0.44987", "s2c_nll_loss": "0.649", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "14420", "lr": "9.61385e-05", "gnorm": "8.577", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3194"} 2023-01-29 17:04:59 | INFO | train_inner | {"epoch": 7, "update": 6.676, "s2c_loss": "0.684", "loss": "0.47404", "s2c_nll_loss": "0.684", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "249.3", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "14430", "lr": "9.62052e-05", "gnorm": "8.684", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3197"} 2023-01-29 17:05:01 | INFO | train_inner | {"epoch": 7, "update": 6.68, "s2c_loss": "0.645", "loss": "0.44684", "s2c_nll_loss": "0.645", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "14440", "lr": "9.62719e-05", "gnorm": "8.089", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3199"} 2023-01-29 17:05:04 | INFO | train_inner | {"epoch": 7, "update": 6.685, "s2c_loss": "0.706", "loss": "0.48906", "s2c_nll_loss": "0.706", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "14450", "lr": "9.63385e-05", "gnorm": "9.964", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3202"} 2023-01-29 17:05:06 | INFO | train_inner | {"epoch": 7, "update": 6.69, "s2c_loss": "0.684", "loss": "0.47435", "s2c_nll_loss": "0.684", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "14460", "lr": "9.64052e-05", "gnorm": "8.866", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3204"} 2023-01-29 17:05:09 | INFO | train_inner | {"epoch": 7, "update": 6.694, 
"s2c_loss": "0.968", "loss": "0.67123", "s2c_nll_loss": "0.968", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "14470", "lr": "9.64718e-05", "gnorm": "10.149", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3207"} 2023-01-29 17:05:11 | INFO | train_inner | {"epoch": 7, "update": 6.699, "s2c_loss": "0.625", "loss": "0.4335", "s2c_nll_loss": "0.625", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "14480", "lr": "9.65385e-05", "gnorm": "9.466", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3209"} 2023-01-29 17:05:14 | INFO | train_inner | {"epoch": 7, "update": 6.704, "s2c_loss": "0.648", "loss": "0.44918", "s2c_nll_loss": "0.648", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "14490", "lr": "9.66052e-05", "gnorm": "9.41", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3212"} 2023-01-29 17:05:16 | INFO | train_inner | {"epoch": 7, "update": 6.708, "s2c_loss": "0.868", "loss": "0.60168", "s2c_nll_loss": "0.868", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "14500", "lr": "9.66718e-05", "gnorm": "9.539", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3214"} 2023-01-29 17:05:19 | INFO | train_inner | {"epoch": 7, "update": 6.713, "s2c_loss": "0.675", "loss": "0.46765", "s2c_nll_loss": "0.675", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "14510", "lr": "9.67385e-05", "gnorm": "9.233", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3217"} 2023-01-29 17:05:21 | INFO | train_inner | {"epoch": 7, 
"update": 6.717, "s2c_loss": "0.787", "loss": "0.54554", "s2c_nll_loss": "0.787", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "14520", "lr": "9.68052e-05", "gnorm": "9.34", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3219"} 2023-01-29 17:05:24 | INFO | train_inner | {"epoch": 7, "update": 6.722, "s2c_loss": "0.632", "loss": "0.43772", "s2c_nll_loss": "0.632", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "247.5", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "14530", "lr": "9.68718e-05", "gnorm": "9.621", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3222"} 2023-01-29 17:05:26 | INFO | train_inner | {"epoch": 7, "update": 6.727, "s2c_loss": "0.821", "loss": "0.56897", "s2c_nll_loss": "0.821", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "14540", "lr": "9.69385e-05", "gnorm": "8.875", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3224"} 2023-01-29 17:05:29 | INFO | train_inner | {"epoch": 7, "update": 6.731, "s2c_loss": "0.955", "loss": "0.66167", "s2c_nll_loss": "0.955", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "14550", "lr": "9.70051e-05", "gnorm": "8.525", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3227"} 2023-01-29 17:05:32 | INFO | train_inner | {"epoch": 7, "update": 6.736, "s2c_loss": "0.6", "loss": "0.4158", "s2c_nll_loss": "0.6", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "14560", "lr": "9.70718e-05", "gnorm": "7.958", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3230"} 2023-01-29 17:05:34 | INFO | train_inner | {"epoch": 
7, "update": 6.741, "s2c_loss": "0.594", "loss": "0.41166", "s2c_nll_loss": "0.594", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "14570", "lr": "9.71385e-05", "gnorm": "7.735", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3232"} 2023-01-29 17:05:37 | INFO | train_inner | {"epoch": 7, "update": 6.745, "s2c_loss": "0.476", "loss": "0.33021", "s2c_nll_loss": "0.476", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "14580", "lr": "9.72051e-05", "gnorm": "7.636", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3235"} 2023-01-29 17:05:39 | INFO | train_inner | {"epoch": 7, "update": 6.75, "s2c_loss": "0.557", "loss": "0.38602", "s2c_nll_loss": "0.557", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "14590", "lr": "9.72718e-05", "gnorm": "8.657", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3237"} 2023-01-29 17:05:42 | INFO | train_inner | {"epoch": 7, "update": 6.754, "s2c_loss": "0.602", "loss": "0.41738", "s2c_nll_loss": "0.602", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "14600", "lr": "9.73385e-05", "gnorm": "10.358", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3240"} 2023-01-29 17:05:44 | INFO | train_inner | {"epoch": 7, "update": 6.759, "s2c_loss": "0.721", "loss": "0.49965", "s2c_nll_loss": "0.721", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "14610", "lr": "9.74051e-05", "gnorm": "8.102", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3242"} 2023-01-29 17:05:47 | INFO | train_inner | 
{"epoch": 7, "update": 6.764, "s2c_loss": "0.775", "loss": "0.53724", "s2c_nll_loss": "0.775", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "14620", "lr": "9.74718e-05", "gnorm": "8.846", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3245"} 2023-01-29 17:05:49 | INFO | train_inner | {"epoch": 7, "update": 6.768, "s2c_loss": "0.505", "loss": "0.3502", "s2c_nll_loss": "0.505", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "14630", "lr": "9.75385e-05", "gnorm": "8.133", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3247"} 2023-01-29 17:05:52 | INFO | train_inner | {"epoch": 7, "update": 6.773, "s2c_loss": "0.698", "loss": "0.48357", "s2c_nll_loss": "0.698", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "14640", "lr": "9.76051e-05", "gnorm": "10.305", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3250"} 2023-01-29 17:05:54 | INFO | train_inner | {"epoch": 7, "update": 6.778, "s2c_loss": "0.632", "loss": "0.43789", "s2c_nll_loss": "0.632", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "14650", "lr": "9.76718e-05", "gnorm": "9.031", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3252"} 2023-01-29 17:05:57 | INFO | train_inner | {"epoch": 7, "update": 6.782, "s2c_loss": "0.699", "loss": "0.48418", "s2c_nll_loss": "0.699", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "14660", "lr": "9.77384e-05", "gnorm": "10.111", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3255"} 2023-01-29 17:06:00 | INFO | 
train_inner | {"epoch": 7, "update": 6.787, "s2c_loss": "0.781", "loss": "0.54107", "s2c_nll_loss": "0.781", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "14670", "lr": "9.78051e-05", "gnorm": "8.498", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3257"} 2023-01-29 17:06:02 | INFO | train_inner | {"epoch": 7, "update": 6.791, "s2c_loss": "0.65", "loss": "0.45064", "s2c_nll_loss": "0.65", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "14680", "lr": "9.78718e-05", "gnorm": "8.884", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3260"} 2023-01-29 17:06:05 | INFO | train_inner | {"epoch": 7, "update": 6.796, "s2c_loss": "0.886", "loss": "0.61426", "s2c_nll_loss": "0.886", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "14690", "lr": "9.79384e-05", "gnorm": "9.935", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3263"} 2023-01-29 17:06:07 | INFO | train_inner | {"epoch": 7, "update": 6.801, "s2c_loss": "0.743", "loss": "0.5151", "s2c_nll_loss": "0.743", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "14700", "lr": "9.80051e-05", "gnorm": "8.511", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3265"} 2023-01-29 17:06:10 | INFO | train_inner | {"epoch": 7, "update": 6.805, "s2c_loss": "0.781", "loss": "0.54143", "s2c_nll_loss": "0.781", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "14710", "lr": "9.80718e-05", "gnorm": "9.686", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3268"} 2023-01-29 17:06:12 
| INFO | train_inner | {"epoch": 7, "update": 6.81, "s2c_loss": "0.632", "loss": "0.43804", "s2c_nll_loss": "0.632", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "14720", "lr": "9.81384e-05", "gnorm": "8.863", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3270"} 2023-01-29 17:06:15 | INFO | train_inner | {"epoch": 7, "update": 6.815, "s2c_loss": "0.655", "loss": "0.45422", "s2c_nll_loss": "0.655", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "14730", "lr": "9.82051e-05", "gnorm": "8.543", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3273"} 2023-01-29 17:06:17 | INFO | train_inner | {"epoch": 7, "update": 6.819, "s2c_loss": "0.744", "loss": "0.51541", "s2c_nll_loss": "0.744", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "14740", "lr": "9.82718e-05", "gnorm": "9.278", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3275"} 2023-01-29 17:06:20 | INFO | train_inner | {"epoch": 7, "update": 6.824, "s2c_loss": "0.592", "loss": "0.41024", "s2c_nll_loss": "0.592", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "14750", "lr": "9.83384e-05", "gnorm": "7.885", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3278"} 2023-01-29 17:06:22 | INFO | train_inner | {"epoch": 7, "update": 6.828, "s2c_loss": "0.994", "loss": "0.68897", "s2c_nll_loss": "0.994", "s2c_accuracy": "86.25", "s2c_total": "64", "s2c_n_correct": "55.2", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "14760", "lr": "9.84051e-05", "gnorm": "8.24", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3280"} 2023-01-29 
17:06:25 | INFO | train_inner | {"epoch": 7, "update": 6.833, "s2c_loss": "0.699", "loss": "0.4844", "s2c_nll_loss": "0.699", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "14770", "lr": "9.84717e-05", "gnorm": "8.323", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3283"} 2023-01-29 17:06:27 | INFO | train_inner | {"epoch": 7, "update": 6.838, "s2c_loss": "0.713", "loss": "0.49443", "s2c_nll_loss": "0.713", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "14780", "lr": "9.85384e-05", "gnorm": "8.884", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3285"} 2023-01-29 17:06:30 | INFO | train_inner | {"epoch": 7, "update": 6.842, "s2c_loss": "0.541", "loss": "0.37518", "s2c_nll_loss": "0.541", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "14790", "lr": "9.86051e-05", "gnorm": "8.918", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3288"} 2023-01-29 17:06:32 | INFO | train_inner | {"epoch": 7, "update": 6.847, "s2c_loss": "0.639", "loss": "0.44302", "s2c_nll_loss": "0.639", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "14800", "lr": "9.86717e-05", "gnorm": "9.318", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3290"} 2023-01-29 17:06:35 | INFO | train_inner | {"epoch": 7, "update": 6.852, "s2c_loss": "0.677", "loss": "0.46905", "s2c_nll_loss": "0.677", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "14810", "lr": "9.87384e-05", "gnorm": "9.053", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3293"} 
2023-01-29 17:06:38 | INFO | train_inner | {"epoch": 7, "update": 6.856, "s2c_loss": "0.651", "loss": "0.45094", "s2c_nll_loss": "0.651", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "14820", "lr": "9.88051e-05", "gnorm": "9.472", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3296"} 2023-01-29 17:06:40 | INFO | train_inner | {"epoch": 7, "update": 6.861, "s2c_loss": "0.668", "loss": "0.46292", "s2c_nll_loss": "0.668", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "14830", "lr": "9.88717e-05", "gnorm": "8.294", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3298"} 2023-01-29 17:06:43 | INFO | train_inner | {"epoch": 7, "update": 6.865, "s2c_loss": "0.622", "loss": "0.43133", "s2c_nll_loss": "0.622", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "14840", "lr": "9.89384e-05", "gnorm": "8.826", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3301"} 2023-01-29 17:06:45 | INFO | train_inner | {"epoch": 7, "update": 6.87, "s2c_loss": "0.573", "loss": "0.39686", "s2c_nll_loss": "0.573", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "14850", "lr": "9.90051e-05", "gnorm": "8.403", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3303"} 2023-01-29 17:06:48 | INFO | train_inner | {"epoch": 7, "update": 6.875, "s2c_loss": "0.751", "loss": "0.52039", "s2c_nll_loss": "0.751", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "14860", "lr": "9.90717e-05", "gnorm": "8.304", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": 
"3306"} 2023-01-29 17:06:50 | INFO | train_inner | {"epoch": 7, "update": 6.879, "s2c_loss": "0.764", "loss": "0.52948", "s2c_nll_loss": "0.764", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "14870", "lr": "9.91384e-05", "gnorm": "9.907", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3308"} 2023-01-29 17:06:53 | INFO | train_inner | {"epoch": 7, "update": 6.884, "s2c_loss": "0.771", "loss": "0.53468", "s2c_nll_loss": "0.771", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "14880", "lr": "9.9205e-05", "gnorm": "9.137", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3311"} 2023-01-29 17:06:55 | INFO | train_inner | {"epoch": 7, "update": 6.889, "s2c_loss": "0.688", "loss": "0.47657", "s2c_nll_loss": "0.688", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "14890", "lr": "9.92717e-05", "gnorm": "7.96", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3313"} 2023-01-29 17:06:58 | INFO | train_inner | {"epoch": 7, "update": 6.893, "s2c_loss": "0.611", "loss": "0.42326", "s2c_nll_loss": "0.611", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "14900", "lr": "9.93384e-05", "gnorm": "8.412", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3316"} 2023-01-29 17:07:00 | INFO | train_inner | {"epoch": 7, "update": 6.898, "s2c_loss": "0.683", "loss": "0.47375", "s2c_nll_loss": "0.683", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "14910", "lr": "9.9405e-05", "gnorm": "8.351", "loss_scale": "1024", "train_wall": "3", "gb_free": 
"7.4", "wall": "3318"} 2023-01-29 17:07:03 | INFO | train_inner | {"epoch": 7, "update": 6.902, "s2c_loss": "0.597", "loss": "0.41356", "s2c_nll_loss": "0.597", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "14920", "lr": "9.94717e-05", "gnorm": "8.517", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3321"} 2023-01-29 17:07:06 | INFO | train_inner | {"epoch": 7, "update": 6.907, "s2c_loss": "0.503", "loss": "0.34868", "s2c_nll_loss": "0.503", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "14930", "lr": "9.95384e-05", "gnorm": "8.47", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3324"} 2023-01-29 17:07:08 | INFO | train_inner | {"epoch": 7, "update": 6.912, "s2c_loss": "0.813", "loss": "0.56339", "s2c_nll_loss": "0.813", "s2c_accuracy": "85.156", "s2c_total": "64", "s2c_n_correct": "54.5", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "14940", "lr": "9.9605e-05", "gnorm": "10.517", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3326"} 2023-01-29 17:07:11 | INFO | train_inner | {"epoch": 7, "update": 6.916, "s2c_loss": "0.722", "loss": "0.5007", "s2c_nll_loss": "0.722", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "14950", "lr": "9.96717e-05", "gnorm": "9.642", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3329"} 2023-01-29 17:07:13 | INFO | train_inner | {"epoch": 7, "update": 6.921, "s2c_loss": "0.722", "loss": "0.50047", "s2c_nll_loss": "0.722", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "14960", "lr": "9.97383e-05", "gnorm": "9.382", "loss_scale": "1024", "train_wall": "2", 
"gb_free": "7.3", "wall": "3331"} 2023-01-29 17:07:16 | INFO | train_inner | {"epoch": 7, "update": 6.926, "s2c_loss": "0.791", "loss": "0.54808", "s2c_nll_loss": "0.791", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "14970", "lr": "9.9805e-05", "gnorm": "8.117", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3334"} 2023-01-29 17:07:18 | INFO | train_inner | {"epoch": 7, "update": 6.93, "s2c_loss": "0.603", "loss": "0.41775", "s2c_nll_loss": "0.603", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "14980", "lr": "9.98717e-05", "gnorm": "8.205", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3336"} 2023-01-29 17:07:21 | INFO | train_inner | {"epoch": 7, "update": 6.935, "s2c_loss": "0.517", "loss": "0.35841", "s2c_nll_loss": "0.517", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "249.3", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "14990", "lr": "9.99383e-05", "gnorm": "8.681", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3339"} 2023-01-29 17:07:23 | INFO | train_inner | {"epoch": 7, "update": 6.939, "s2c_loss": "0.672", "loss": "0.4656", "s2c_nll_loss": "0.672", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "15000", "lr": "0.000100005", "gnorm": "10.842", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3341"} 2023-01-29 17:07:26 | INFO | train_inner | {"epoch": 7, "update": 6.944, "s2c_loss": "0.553", "loss": "0.38354", "s2c_nll_loss": "0.553", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "15010", "lr": "0.000100072", "gnorm": "9.358", "loss_scale": "1024", 
"train_wall": "3", "gb_free": "7.2", "wall": "3344"} 2023-01-29 17:07:28 | INFO | train_inner | {"epoch": 7, "update": 6.949, "s2c_loss": "0.603", "loss": "0.41769", "s2c_nll_loss": "0.603", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "15020", "lr": "0.000100138", "gnorm": "9.656", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3346"} 2023-01-29 17:07:31 | INFO | train_inner | {"epoch": 7, "update": 6.953, "s2c_loss": "0.75", "loss": "0.52012", "s2c_nll_loss": "0.75", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "15030", "lr": "0.000100205", "gnorm": "9.692", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3349"} 2023-01-29 17:07:33 | INFO | train_inner | {"epoch": 7, "update": 6.958, "s2c_loss": "0.81", "loss": "0.56167", "s2c_nll_loss": "0.81", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "15040", "lr": "0.000100272", "gnorm": "9.953", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3351"} 2023-01-29 17:07:36 | INFO | train_inner | {"epoch": 7, "update": 6.963, "s2c_loss": "1.009", "loss": "0.69954", "s2c_nll_loss": "1.009", "s2c_accuracy": "84.844", "s2c_total": "64", "s2c_n_correct": "54.3", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "15050", "lr": "0.000100338", "gnorm": "9.289", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3354"} 2023-01-29 17:07:39 | INFO | train_inner | {"epoch": 7, "update": 6.967, "s2c_loss": "0.69", "loss": "0.47825", "s2c_nll_loss": "0.69", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "15060", "lr": "0.000100405", "gnorm": "9.018", "loss_scale": 
"1024", "train_wall": "2", "gb_free": "7.2", "wall": "3356"} 2023-01-29 17:07:41 | INFO | train_inner | {"epoch": 7, "update": 6.972, "s2c_loss": "0.723", "loss": "0.50108", "s2c_nll_loss": "0.723", "s2c_accuracy": "85.312", "s2c_total": "64", "s2c_n_correct": "54.6", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "15070", "lr": "0.000100472", "gnorm": "11.678", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3359"} 2023-01-29 17:07:44 | INFO | train_inner | {"epoch": 7, "update": 6.976, "s2c_loss": "0.818", "loss": "0.56667", "s2c_nll_loss": "0.818", "s2c_accuracy": "85.625", "s2c_total": "64", "s2c_n_correct": "54.8", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "15080", "lr": "0.000100538", "gnorm": "9.694", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3362"} 2023-01-29 17:07:46 | INFO | train_inner | {"epoch": 7, "update": 6.981, "s2c_loss": "0.881", "loss": "0.61054", "s2c_nll_loss": "0.881", "s2c_accuracy": "85.625", "s2c_total": "64", "s2c_n_correct": "54.8", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "15090", "lr": "0.000100605", "gnorm": "10.375", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3364"} 2023-01-29 17:07:49 | INFO | train_inner | {"epoch": 7, "update": 6.986, "s2c_loss": "0.524", "loss": "0.36345", "s2c_nll_loss": "0.524", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "15100", "lr": "0.000100672", "gnorm": "8.323", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3367"} 2023-01-29 17:07:51 | INFO | train_inner | {"epoch": 7, "update": 6.99, "s2c_loss": "0.646", "loss": "0.44748", "s2c_nll_loss": "0.646", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "15110", "lr": "0.000100738", "gnorm": "8.932", 
"loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3369"} 2023-01-29 17:07:54 | INFO | train_inner | {"epoch": 7, "update": 6.995, "s2c_loss": "0.776", "loss": "0.53804", "s2c_nll_loss": "0.776", "s2c_accuracy": "85", "s2c_total": "64", "s2c_n_correct": "54.4", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "15120", "lr": "0.000100805", "gnorm": "10.742", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3372"} 2023-01-29 17:07:56 | INFO | train_inner | {"epoch": 7, "update": 7.0, "s2c_loss": "0.831", "loss": "0.5759", "s2c_nll_loss": "0.831", "s2c_accuracy": "85.469", "s2c_total": "64", "s2c_n_correct": "54.7", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "15130", "lr": "0.000100872", "gnorm": "9.596", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3374"} 2023-01-29 17:07:57 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 7 @ 15131 updates 2023-01-29 17:07:57 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 17:08:03 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 17:08:03 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 7 @ 15131 updates, score None) (writing took 6.866972866933793 seconds) 2023-01-29 17:08:03 | INFO | fairseq_cli.train | end of epoch 7 (average epoch stats below) 2023-01-29 17:08:03 | INFO | train | {"epoch": 7, "train_s2c_loss": "0.693", "train_loss": "0.48017", "train_s2c_nll_loss": "0.693", "train_s2c_accuracy": "88.964", "train_s2c_total": "63.9838", "train_s2c_n_correct": "56.9223", "train_wps": "246", "train_ups": "3.84", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "15131", "train_lr": "0.000100878", "train_gnorm": "9.074", "train_loss_scale": "1024", 
"train_train_wall": "541", "train_gb_free": "7.4", "train_wall": "3381"} 2023-01-29 17:08:10 | INFO | fairseq.trainer | begin training epoch 8 2023-01-29 17:08:10 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 17:08:12 | INFO | train_inner | {"epoch": 8, "update": 7.004, "s2c_loss": "0.766", "loss": "0.5309", "s2c_nll_loss": "0.766", "s2c_accuracy": "86.349", "s2c_total": "60.8", "s2c_n_correct": "52.5", "wps": "38.2", "ups": "0.63", "wpb": "60.8", "bsz": "60.8", "num_updates": "15140", "lr": "0.000100938", "gnorm": "9.968", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3390"} 2023-01-29 17:08:15 | INFO | train_inner | {"epoch": 8, "update": 7.009, "s2c_loss": "0.735", "loss": "0.50958", "s2c_nll_loss": "0.735", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "15150", "lr": "0.000101005", "gnorm": "9.704", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3393"} 2023-01-29 17:08:17 | INFO | train_inner | {"epoch": 8, "update": 7.013, "s2c_loss": "0.752", "loss": "0.52093", "s2c_nll_loss": "0.752", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "15160", "lr": "0.000101072", "gnorm": "9.099", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3395"} 2023-01-29 17:08:20 | INFO | train_inner | {"epoch": 8, "update": 7.018, "s2c_loss": "0.512", "loss": "0.35463", "s2c_nll_loss": "0.512", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "15170", "lr": "0.000101138", "gnorm": "8.403", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3398"} 2023-01-29 17:08:22 | INFO | train_inner | {"epoch": 8, "update": 7.023, "s2c_loss": "0.646", "loss": "0.44787", "s2c_nll_loss": "0.646", "s2c_accuracy": "88.906", 
"s2c_total": "64", "s2c_n_correct": "56.9", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "15180", "lr": "0.000101205", "gnorm": "8.517", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3400"} 2023-01-29 17:08:25 | INFO | train_inner | {"epoch": 8, "update": 7.027, "s2c_loss": "0.581", "loss": "0.40286", "s2c_nll_loss": "0.581", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "248", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "15190", "lr": "0.000101272", "gnorm": "7.868", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3403"} 2023-01-29 17:08:28 | INFO | train_inner | {"epoch": 8, "update": 7.032, "s2c_loss": "0.523", "loss": "0.36227", "s2c_nll_loss": "0.523", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "15200", "lr": "0.000101338", "gnorm": "7.812", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3405"} 2023-01-29 17:08:30 | INFO | train_inner | {"epoch": 8, "update": 7.037, "s2c_loss": "0.648", "loss": "0.44905", "s2c_nll_loss": "0.648", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "15210", "lr": "0.000101405", "gnorm": "9.168", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3408"} 2023-01-29 17:08:33 | INFO | train_inner | {"epoch": 8, "update": 7.041, "s2c_loss": "0.466", "loss": "0.32285", "s2c_nll_loss": "0.466", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "15220", "lr": "0.000101472", "gnorm": "8.4", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3411"} 2023-01-29 17:08:35 | INFO | train_inner | {"epoch": 8, "update": 7.046, "s2c_loss": "0.571", "loss": "0.39557", "s2c_nll_loss": "0.571", "s2c_accuracy": 
"89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "15230", "lr": "0.000101538", "gnorm": "10.388", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3413"} 2023-01-29 17:08:38 | INFO | train_inner | {"epoch": 8, "update": 7.05, "s2c_loss": "0.533", "loss": "0.36969", "s2c_nll_loss": "0.533", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "15240", "lr": "0.000101605", "gnorm": "9.073", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3416"} 2023-01-29 17:08:40 | INFO | train_inner | {"epoch": 8, "update": 7.055, "s2c_loss": "0.574", "loss": "0.39758", "s2c_nll_loss": "0.574", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "258.2", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "15250", "lr": "0.000101672", "gnorm": "8.033", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3418"} 2023-01-29 17:08:43 | INFO | train_inner | {"epoch": 8, "update": 7.06, "s2c_loss": "0.612", "loss": "0.42418", "s2c_nll_loss": "0.612", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "15260", "lr": "0.000101738", "gnorm": "7.156", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3421"} 2023-01-29 17:08:45 | INFO | train_inner | {"epoch": 8, "update": 7.064, "s2c_loss": "0.687", "loss": "0.47605", "s2c_nll_loss": "0.687", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "15270", "lr": "0.000101805", "gnorm": "8.716", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3423"} 2023-01-29 17:08:48 | INFO | train_inner | {"epoch": 8, "update": 7.069, "s2c_loss": "0.792", "loss": "0.54888", "s2c_nll_loss": "0.792", 
"s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "246.3", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "15280", "lr": "0.000101872", "gnorm": "9.376", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3426"} 2023-01-29 17:08:50 | INFO | train_inner | {"epoch": 8, "update": 7.074, "s2c_loss": "0.47", "loss": "0.32587", "s2c_nll_loss": "0.47", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "246.2", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "15290", "lr": "0.000101938", "gnorm": "8.575", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3428"} 2023-01-29 17:08:53 | INFO | train_inner | {"epoch": 8, "update": 7.078, "s2c_loss": "0.525", "loss": "0.36415", "s2c_nll_loss": "0.525", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "15300", "lr": "0.000102005", "gnorm": "7.749", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3431"} 2023-01-29 17:08:55 | INFO | train_inner | {"epoch": 8, "update": 7.083, "s2c_loss": "0.564", "loss": "0.39106", "s2c_nll_loss": "0.564", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "15310", "lr": "0.000102072", "gnorm": "9.079", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3433"} 2023-01-29 17:08:58 | INFO | train_inner | {"epoch": 8, "update": 7.087, "s2c_loss": "0.703", "loss": "0.48716", "s2c_nll_loss": "0.703", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "15320", "lr": "0.000102138", "gnorm": "9.474", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3436"} 2023-01-29 17:09:00 | INFO | train_inner | {"epoch": 8, "update": 7.092, "s2c_loss": "0.55", "loss": "0.38147", "s2c_nll_loss": 
"0.55", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "15330", "lr": "0.000102205", "gnorm": "8.415", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3438"} 2023-01-29 17:09:03 | INFO | train_inner | {"epoch": 8, "update": 7.097, "s2c_loss": "0.753", "loss": "0.52204", "s2c_nll_loss": "0.753", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "250.6", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "15340", "lr": "0.000102272", "gnorm": "8.571", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3441"} 2023-01-29 17:09:06 | INFO | train_inner | {"epoch": 8, "update": 7.101, "s2c_loss": "0.776", "loss": "0.53774", "s2c_nll_loss": "0.776", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "15350", "lr": "0.000102338", "gnorm": "8.344", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3444"} 2023-01-29 17:09:08 | INFO | train_inner | {"epoch": 8, "update": 7.106, "s2c_loss": "0.56", "loss": "0.38819", "s2c_nll_loss": "0.56", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "15360", "lr": "0.000102405", "gnorm": "8.008", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3446"} 2023-01-29 17:09:11 | INFO | train_inner | {"epoch": 8, "update": 7.111, "s2c_loss": "0.582", "loss": "0.40312", "s2c_nll_loss": "0.582", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "15370", "lr": "0.000102472", "gnorm": "8.232", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3449"} 2023-01-29 17:09:13 | INFO | train_inner | {"epoch": 8, "update": 7.115, "s2c_loss": "0.462", "loss": "0.32016", 
"s2c_nll_loss": "0.462", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "15380", "lr": "0.000102538", "gnorm": "6.818", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3451"} 2023-01-29 17:09:16 | INFO | train_inner | {"epoch": 8, "update": 7.12, "s2c_loss": "0.536", "loss": "0.37137", "s2c_nll_loss": "0.536", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "245.3", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "15390", "lr": "0.000102605", "gnorm": "7.531", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3454"} 2023-01-29 17:09:18 | INFO | train_inner | {"epoch": 8, "update": 7.124, "s2c_loss": "0.599", "loss": "0.41496", "s2c_nll_loss": "0.599", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "15400", "lr": "0.000102672", "gnorm": "7.213", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3456"} 2023-01-29 17:09:21 | INFO | train_inner | {"epoch": 8, "update": 7.129, "s2c_loss": "0.762", "loss": "0.52812", "s2c_nll_loss": "0.762", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "15410", "lr": "0.000102738", "gnorm": "8.012", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3459"} 2023-01-29 17:09:24 | INFO | train_inner | {"epoch": 8, "update": 7.134, "s2c_loss": "0.64", "loss": "0.44357", "s2c_nll_loss": "0.64", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "15420", "lr": "0.000102805", "gnorm": "8.145", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3461"} 2023-01-29 17:09:26 | INFO | train_inner | {"epoch": 8, "update": 7.138, "s2c_loss": "0.509", "loss": 
"0.35299", "s2c_nll_loss": "0.509", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "15430", "lr": "0.000102872", "gnorm": "7.889", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3464"} 2023-01-29 17:09:29 | INFO | train_inner | {"epoch": 8, "update": 7.143, "s2c_loss": "0.624", "loss": "0.43273", "s2c_nll_loss": "0.624", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "15440", "lr": "0.000102938", "gnorm": "8.341", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3467"} 2023-01-29 17:09:31 | INFO | train_inner | {"epoch": 8, "update": 7.148, "s2c_loss": "0.469", "loss": "0.32514", "s2c_nll_loss": "0.469", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "15450", "lr": "0.000103005", "gnorm": "8.125", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3469"} 2023-01-29 17:09:34 | INFO | train_inner | {"epoch": 8, "update": 7.152, "s2c_loss": "0.685", "loss": "0.475", "s2c_nll_loss": "0.685", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "15460", "lr": "0.000103072", "gnorm": "8.047", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3472"} 2023-01-29 17:09:36 | INFO | train_inner | {"epoch": 8, "update": 7.157, "s2c_loss": "0.606", "loss": "0.42035", "s2c_nll_loss": "0.606", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "15470", "lr": "0.000103138", "gnorm": "8.15", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3474"} 2023-01-29 17:09:39 | INFO | train_inner | {"epoch": 8, "update": 7.161, "s2c_loss": "0.562", 
"loss": "0.38935", "s2c_nll_loss": "0.562", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "15480", "lr": "0.000103205", "gnorm": "8.102", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3477"} 2023-01-29 17:09:41 | INFO | train_inner | {"epoch": 8, "update": 7.166, "s2c_loss": "0.646", "loss": "0.44789", "s2c_nll_loss": "0.646", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "15490", "lr": "0.000103272", "gnorm": "8.075", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3479"} 2023-01-29 17:09:44 | INFO | train_inner | {"epoch": 8, "update": 7.171, "s2c_loss": "0.578", "loss": "0.40074", "s2c_nll_loss": "0.578", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "15500", "lr": "0.000103338", "gnorm": "7.862", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3482"} 2023-01-29 17:09:47 | INFO | train_inner | {"epoch": 8, "update": 7.175, "s2c_loss": "0.517", "loss": "0.35861", "s2c_nll_loss": "0.517", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "15510", "lr": "0.000103405", "gnorm": "7.754", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3484"} 2023-01-29 17:09:49 | INFO | train_inner | {"epoch": 8, "update": 7.18, "s2c_loss": "0.527", "loss": "0.3652", "s2c_nll_loss": "0.527", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "15520", "lr": "0.000103471", "gnorm": "7.709", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3487"} 2023-01-29 17:09:52 | INFO | train_inner | {"epoch": 8, "update": 7.185, "s2c_loss": 
"0.584", "loss": "0.40467", "s2c_nll_loss": "0.584", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "15530", "lr": "0.000103538", "gnorm": "8.139", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3490"} 2023-01-29 17:09:54 | INFO | train_inner | {"epoch": 8, "update": 7.189, "s2c_loss": "0.513", "loss": "0.35558", "s2c_nll_loss": "0.513", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "15540", "lr": "0.000103605", "gnorm": "8.236", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3492"} 2023-01-29 17:09:57 | INFO | train_inner | {"epoch": 8, "update": 7.194, "s2c_loss": "0.621", "loss": "0.43068", "s2c_nll_loss": "0.621", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "15550", "lr": "0.000103671", "gnorm": "8.575", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3495"} 2023-01-29 17:09:59 | INFO | train_inner | {"epoch": 8, "update": 7.198, "s2c_loss": "0.634", "loss": "0.43957", "s2c_nll_loss": "0.634", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "15560", "lr": "0.000103738", "gnorm": "8.94", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3497"} 2023-01-29 17:10:02 | INFO | train_inner | {"epoch": 8, "update": 7.203, "s2c_loss": "0.666", "loss": "0.46133", "s2c_nll_loss": "0.666", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "15570", "lr": "0.000103805", "gnorm": "8.133", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3500"} 2023-01-29 17:10:04 | INFO | train_inner | {"epoch": 8, "update": 7.208, 
"s2c_loss": "0.454", "loss": "0.31485", "s2c_nll_loss": "0.454", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "15580", "lr": "0.000103871", "gnorm": "7.874", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3502"} 2023-01-29 17:10:07 | INFO | train_inner | {"epoch": 8, "update": 7.212, "s2c_loss": "0.716", "loss": "0.49597", "s2c_nll_loss": "0.716", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "15590", "lr": "0.000103938", "gnorm": "8.511", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3505"} 2023-01-29 17:10:09 | INFO | train_inner | {"epoch": 8, "update": 7.217, "s2c_loss": "0.57", "loss": "0.39524", "s2c_nll_loss": "0.57", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "15600", "lr": "0.000104005", "gnorm": "8.143", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3507"} 2023-01-29 17:10:12 | INFO | train_inner | {"epoch": 8, "update": 7.222, "s2c_loss": "0.563", "loss": "0.38992", "s2c_nll_loss": "0.563", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "15610", "lr": "0.000104071", "gnorm": "7.573", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3510"} 2023-01-29 17:10:14 | INFO | train_inner | {"epoch": 8, "update": 7.226, "s2c_loss": "0.512", "loss": "0.35516", "s2c_nll_loss": "0.512", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "15620", "lr": "0.000104138", "gnorm": "7.436", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3512"} 2023-01-29 17:10:17 | INFO | train_inner | {"epoch": 8, 
"update": 7.231, "s2c_loss": "0.51", "loss": "0.35337", "s2c_nll_loss": "0.51", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "15630", "lr": "0.000104205", "gnorm": "8.236", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3515"} 2023-01-29 17:10:20 | INFO | train_inner | {"epoch": 8, "update": 7.235, "s2c_loss": "0.602", "loss": "0.41699", "s2c_nll_loss": "0.602", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "15640", "lr": "0.000104271", "gnorm": "8.696", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3517"} 2023-01-29 17:10:22 | INFO | train_inner | {"epoch": 8, "update": 7.24, "s2c_loss": "0.536", "loss": "0.37141", "s2c_nll_loss": "0.536", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "15650", "lr": "0.000104338", "gnorm": "8.406", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3520"} 2023-01-29 17:10:25 | INFO | train_inner | {"epoch": 8, "update": 7.245, "s2c_loss": "0.511", "loss": "0.35438", "s2c_nll_loss": "0.511", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "15660", "lr": "0.000104405", "gnorm": "8.646", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3523"} 2023-01-29 17:10:27 | INFO | train_inner | {"epoch": 8, "update": 7.249, "s2c_loss": "0.628", "loss": "0.43546", "s2c_nll_loss": "0.628", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "15670", "lr": "0.000104471", "gnorm": "8.093", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3525"} 2023-01-29 17:10:30 | INFO | train_inner | 
{"epoch": 8, "update": 7.254, "s2c_loss": "0.681", "loss": "0.4723", "s2c_nll_loss": "0.681", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "15680", "lr": "0.000104538", "gnorm": "7.863", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3528"} 2023-01-29 17:10:32 | INFO | train_inner | {"epoch": 8, "update": 7.259, "s2c_loss": "0.53", "loss": "0.36741", "s2c_nll_loss": "0.53", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "15690", "lr": "0.000104605", "gnorm": "7.61", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3530"} 2023-01-29 17:10:35 | INFO | train_inner | {"epoch": 8, "update": 7.263, "s2c_loss": "0.515", "loss": "0.35665", "s2c_nll_loss": "0.515", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "15700", "lr": "0.000104671", "gnorm": "7.959", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3533"} 2023-01-29 17:10:37 | INFO | train_inner | {"epoch": 8, "update": 7.268, "s2c_loss": "0.447", "loss": "0.3101", "s2c_nll_loss": "0.447", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "15710", "lr": "0.000104738", "gnorm": "7.03", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3535"} 2023-01-29 17:10:40 | INFO | train_inner | {"epoch": 8, "update": 7.272, "s2c_loss": "0.445", "loss": "0.30876", "s2c_nll_loss": "0.445", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "15720", "lr": "0.000104805", "gnorm": "7.657", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3538"} 2023-01-29 17:10:42 | INFO | 
train_inner | {"epoch": 8, "update": 7.277, "s2c_loss": "0.6", "loss": "0.41592", "s2c_nll_loss": "0.6", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "15730", "lr": "0.000104871", "gnorm": "8.018", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3540"} 2023-01-29 17:10:45 | INFO | train_inner | {"epoch": 8, "update": 7.282, "s2c_loss": "0.444", "loss": "0.30754", "s2c_nll_loss": "0.444", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "15740", "lr": "0.000104938", "gnorm": "6.641", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3543"} 2023-01-29 17:10:47 | INFO | train_inner | {"epoch": 8, "update": 7.286, "s2c_loss": "0.634", "loss": "0.43971", "s2c_nll_loss": "0.634", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "15750", "lr": "0.000105005", "gnorm": "7.991", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3545"} 2023-01-29 17:10:50 | INFO | train_inner | {"epoch": 8, "update": 7.291, "s2c_loss": "0.488", "loss": "0.33843", "s2c_nll_loss": "0.488", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "15760", "lr": "0.000105071", "gnorm": "7.965", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3548"} 2023-01-29 17:10:52 | INFO | train_inner | {"epoch": 8, "update": 7.296, "s2c_loss": "0.486", "loss": "0.33654", "s2c_nll_loss": "0.486", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "15770", "lr": "0.000105138", "gnorm": "7.501", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3550"} 2023-01-29 17:10:55 | 
INFO | train_inner | {"epoch": 8, "update": 7.3, "s2c_loss": "0.517", "loss": "0.35836", "s2c_nll_loss": "0.517", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "15780", "lr": "0.000105205", "gnorm": "7.889", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3553"} 2023-01-29 17:10:58 | INFO | train_inner | {"epoch": 8, "update": 7.305, "s2c_loss": "0.557", "loss": "0.38615", "s2c_nll_loss": "0.557", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "15790", "lr": "0.000105271", "gnorm": "8.586", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3555"} 2023-01-29 17:11:00 | INFO | train_inner | {"epoch": 8, "update": 7.309, "s2c_loss": "0.55", "loss": "0.38101", "s2c_nll_loss": "0.55", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "15800", "lr": "0.000105338", "gnorm": "8.172", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3558"} 2023-01-29 17:11:03 | INFO | train_inner | {"epoch": 8, "update": 7.314, "s2c_loss": "0.484", "loss": "0.33551", "s2c_nll_loss": "0.484", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "15810", "lr": "0.000105405", "gnorm": "7.535", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3561"} 2023-01-29 17:11:05 | INFO | train_inner | {"epoch": 8, "update": 7.319, "s2c_loss": "0.659", "loss": "0.45701", "s2c_nll_loss": "0.659", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "15820", "lr": "0.000105471", "gnorm": "9.591", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3563"} 2023-01-29 
17:11:08 | INFO | train_inner | {"epoch": 8, "update": 7.323, "s2c_loss": "0.662", "loss": "0.45908", "s2c_nll_loss": "0.662", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "15830", "lr": "0.000105538", "gnorm": "8.576", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3566"} 2023-01-29 17:11:10 | INFO | train_inner | {"epoch": 8, "update": 7.328, "s2c_loss": "0.727", "loss": "0.50423", "s2c_nll_loss": "0.727", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "15840", "lr": "0.000105605", "gnorm": "9.676", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3568"} 2023-01-29 17:11:13 | INFO | train_inner | {"epoch": 8, "update": 7.333, "s2c_loss": "0.733", "loss": "0.50811", "s2c_nll_loss": "0.733", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "15850", "lr": "0.000105671", "gnorm": "11.241", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3571"} 2023-01-29 17:11:15 | INFO | train_inner | {"epoch": 8, "update": 7.337, "s2c_loss": "0.647", "loss": "0.44853", "s2c_nll_loss": "0.647", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "15860", "lr": "0.000105738", "gnorm": "8.275", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3573"} 2023-01-29 17:11:18 | INFO | train_inner | {"epoch": 8, "update": 7.342, "s2c_loss": "0.605", "loss": "0.41967", "s2c_nll_loss": "0.605", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "15870", "lr": "0.000105805", "gnorm": "8.054", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3576"} 
2023-01-29 17:11:20 | INFO | train_inner | {"epoch": 8, "update": 7.346, "s2c_loss": "0.689", "loss": "0.47768", "s2c_nll_loss": "0.689", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "15880", "lr": "0.000105871", "gnorm": "7.999", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3578"} 2023-01-29 17:11:23 | INFO | train_inner | {"epoch": 8, "update": 7.351, "s2c_loss": "0.795", "loss": "0.55115", "s2c_nll_loss": "0.795", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "15890", "lr": "0.000105938", "gnorm": "8.73", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3581"} 2023-01-29 17:11:25 | INFO | train_inner | {"epoch": 8, "update": 7.356, "s2c_loss": "0.729", "loss": "0.50503", "s2c_nll_loss": "0.729", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "15900", "lr": "0.000106005", "gnorm": "8.499", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3583"} 2023-01-29 17:11:28 | INFO | train_inner | {"epoch": 8, "update": 7.36, "s2c_loss": "0.508", "loss": "0.3523", "s2c_nll_loss": "0.508", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "15910", "lr": "0.000106071", "gnorm": "7.046", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3586"} 2023-01-29 17:11:30 | INFO | train_inner | {"epoch": 8, "update": 7.365, "s2c_loss": "0.641", "loss": "0.44404", "s2c_nll_loss": "0.641", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "15920", "lr": "0.000106138", "gnorm": "7.992", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", 
"wall": "3588"} 2023-01-29 17:11:33 | INFO | train_inner | {"epoch": 8, "update": 7.37, "s2c_loss": "0.563", "loss": "0.39056", "s2c_nll_loss": "0.563", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "15930", "lr": "0.000106205", "gnorm": "8.14", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3591"} 2023-01-29 17:11:36 | INFO | train_inner | {"epoch": 8, "update": 7.374, "s2c_loss": "0.571", "loss": "0.39567", "s2c_nll_loss": "0.571", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "15940", "lr": "0.000106271", "gnorm": "8.411", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3594"} 2023-01-29 17:11:38 | INFO | train_inner | {"epoch": 8, "update": 7.379, "s2c_loss": "0.684", "loss": "0.47386", "s2c_nll_loss": "0.684", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "15950", "lr": "0.000106338", "gnorm": "8.501", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3596"} 2023-01-29 17:11:41 | INFO | train_inner | {"epoch": 8, "update": 7.383, "s2c_loss": "0.533", "loss": "0.36953", "s2c_nll_loss": "0.533", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "15960", "lr": "0.000106405", "gnorm": "7.552", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3599"} 2023-01-29 17:11:43 | INFO | train_inner | {"epoch": 8, "update": 7.388, "s2c_loss": "0.623", "loss": "0.43188", "s2c_nll_loss": "0.623", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "15970", "lr": "0.000106471", "gnorm": "7.503", "loss_scale": "1024", "train_wall": "2", 
"gb_free": "7.3", "wall": "3601"} 2023-01-29 17:11:46 | INFO | train_inner | {"epoch": 8, "update": 7.393, "s2c_loss": "0.566", "loss": "0.39242", "s2c_nll_loss": "0.566", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "15980", "lr": "0.000106538", "gnorm": "7.808", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3604"} 2023-01-29 17:11:48 | INFO | train_inner | {"epoch": 8, "update": 7.397, "s2c_loss": "0.604", "loss": "0.41883", "s2c_nll_loss": "0.604", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "15990", "lr": "0.000106605", "gnorm": "8.254", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3606"} 2023-01-29 17:11:51 | INFO | train_inner | {"epoch": 8, "update": 7.402, "s2c_loss": "0.701", "loss": "0.48612", "s2c_nll_loss": "0.701", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "16000", "lr": "0.000106671", "gnorm": "8.974", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3609"} 2023-01-29 17:11:53 | INFO | train_inner | {"epoch": 8, "update": 7.407, "s2c_loss": "0.611", "loss": "0.42327", "s2c_nll_loss": "0.611", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "16010", "lr": "0.000106738", "gnorm": "8.831", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3611"} 2023-01-29 17:11:56 | INFO | train_inner | {"epoch": 8, "update": 7.411, "s2c_loss": "0.725", "loss": "0.50249", "s2c_nll_loss": "0.725", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "16020", "lr": "0.000106805", "gnorm": "9.066", "loss_scale": "1024", 
"train_wall": "2", "gb_free": "7.2", "wall": "3614"} 2023-01-29 17:11:58 | INFO | train_inner | {"epoch": 8, "update": 7.416, "s2c_loss": "0.653", "loss": "0.45244", "s2c_nll_loss": "0.653", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "259.3", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "16030", "lr": "0.000106871", "gnorm": "9.533", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3616"} 2023-01-29 17:12:01 | INFO | train_inner | {"epoch": 8, "update": 7.42, "s2c_loss": "0.699", "loss": "0.48445", "s2c_nll_loss": "0.699", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "16040", "lr": "0.000106938", "gnorm": "9.504", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3619"} 2023-01-29 17:12:03 | INFO | train_inner | {"epoch": 8, "update": 7.425, "s2c_loss": "0.705", "loss": "0.48855", "s2c_nll_loss": "0.705", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "16050", "lr": "0.000107005", "gnorm": "9.747", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3621"} 2023-01-29 17:12:06 | INFO | train_inner | {"epoch": 8, "update": 7.43, "s2c_loss": "0.618", "loss": "0.42829", "s2c_nll_loss": "0.618", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "16060", "lr": "0.000107071", "gnorm": "9.113", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3624"} 2023-01-29 17:12:08 | INFO | train_inner | {"epoch": 8, "update": 7.434, "s2c_loss": "0.645", "loss": "0.44723", "s2c_nll_loss": "0.645", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "16070", "lr": "0.000107138", "gnorm": "9.708", "loss_scale": 
"1024", "train_wall": "2", "gb_free": "7.3", "wall": "3626"} 2023-01-29 17:12:11 | INFO | train_inner | {"epoch": 8, "update": 7.439, "s2c_loss": "0.769", "loss": "0.53293", "s2c_nll_loss": "0.769", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "16080", "lr": "0.000107205", "gnorm": "10.205", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3629"} 2023-01-29 17:12:13 | INFO | train_inner | {"epoch": 8, "update": 7.444, "s2c_loss": "0.656", "loss": "0.45469", "s2c_nll_loss": "0.656", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "16090", "lr": "0.000107271", "gnorm": "8.601", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3631"} 2023-01-29 17:12:16 | INFO | train_inner | {"epoch": 8, "update": 7.448, "s2c_loss": "0.627", "loss": "0.43427", "s2c_nll_loss": "0.627", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "16100", "lr": "0.000107338", "gnorm": "9.872", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3634"} 2023-01-29 17:12:19 | INFO | train_inner | {"epoch": 8, "update": 7.453, "s2c_loss": "0.823", "loss": "0.57062", "s2c_nll_loss": "0.823", "s2c_accuracy": "85.469", "s2c_total": "64", "s2c_n_correct": "54.7", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "16110", "lr": "0.000107405", "gnorm": "9.623", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3636"} 2023-01-29 17:12:21 | INFO | train_inner | {"epoch": 8, "update": 7.457, "s2c_loss": "0.648", "loss": "0.44903", "s2c_nll_loss": "0.648", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "16120", "lr": "0.000107471", "gnorm": 
"8.474", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3639"} 2023-01-29 17:12:24 | INFO | train_inner | {"epoch": 8, "update": 7.462, "s2c_loss": "0.589", "loss": "0.4084", "s2c_nll_loss": "0.589", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "16130", "lr": "0.000107538", "gnorm": "9.362", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3642"} 2023-01-29 17:12:26 | INFO | train_inner | {"epoch": 8, "update": 7.467, "s2c_loss": "0.706", "loss": "0.48913", "s2c_nll_loss": "0.706", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "16140", "lr": "0.000107605", "gnorm": "8.231", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3644"} 2023-01-29 17:12:29 | INFO | train_inner | {"epoch": 8, "update": 7.471, "s2c_loss": "0.62", "loss": "0.42962", "s2c_nll_loss": "0.62", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "16150", "lr": "0.000107671", "gnorm": "8.339", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3647"} 2023-01-29 17:12:31 | INFO | train_inner | {"epoch": 8, "update": 7.476, "s2c_loss": "0.684", "loss": "0.47437", "s2c_nll_loss": "0.684", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "16160", "lr": "0.000107738", "gnorm": "8.724", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3649"} 2023-01-29 17:12:34 | INFO | train_inner | {"epoch": 8, "update": 7.481, "s2c_loss": "0.579", "loss": "0.40162", "s2c_nll_loss": "0.579", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "16170", "lr": 
"0.000107805", "gnorm": "8.094", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3652"} 2023-01-29 17:12:36 | INFO | train_inner | {"epoch": 8, "update": 7.485, "s2c_loss": "0.573", "loss": "0.39697", "s2c_nll_loss": "0.573", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "16180", "lr": "0.000107871", "gnorm": "8.367", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3654"} 2023-01-29 17:12:39 | INFO | train_inner | {"epoch": 8, "update": 7.49, "s2c_loss": "0.629", "loss": "0.43614", "s2c_nll_loss": "0.629", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "16190", "lr": "0.000107938", "gnorm": "9.25", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3657"} 2023-01-29 17:12:41 | INFO | train_inner | {"epoch": 8, "update": 7.494, "s2c_loss": "0.652", "loss": "0.45203", "s2c_nll_loss": "0.652", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "16200", "lr": "0.000108005", "gnorm": "9.932", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3659"} 2023-01-29 17:12:44 | INFO | train_inner | {"epoch": 8, "update": 7.499, "s2c_loss": "0.661", "loss": "0.4581", "s2c_nll_loss": "0.661", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "16210", "lr": "0.000108071", "gnorm": "8.09", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3662"} 2023-01-29 17:12:47 | INFO | train_inner | {"epoch": 8, "update": 7.504, "s2c_loss": "0.623", "loss": "0.43184", "s2c_nll_loss": "0.623", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": 
"16220", "lr": "0.000108138", "gnorm": "8.424", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3664"} 2023-01-29 17:12:49 | INFO | train_inner | {"epoch": 8, "update": 7.508, "s2c_loss": "0.806", "loss": "0.55876", "s2c_nll_loss": "0.806", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "16230", "lr": "0.000108205", "gnorm": "8.314", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3667"} 2023-01-29 17:12:52 | INFO | train_inner | {"epoch": 8, "update": 7.513, "s2c_loss": "0.672", "loss": "0.46563", "s2c_nll_loss": "0.672", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "16240", "lr": "0.000108271", "gnorm": "9.603", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3670"} 2023-01-29 17:12:54 | INFO | train_inner | {"epoch": 8, "update": 7.518, "s2c_loss": "0.522", "loss": "0.36195", "s2c_nll_loss": "0.522", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "16250", "lr": "0.000108338", "gnorm": "7.967", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3672"} 2023-01-29 17:12:57 | INFO | train_inner | {"epoch": 8, "update": 7.522, "s2c_loss": "0.552", "loss": "0.38251", "s2c_nll_loss": "0.552", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "16260", "lr": "0.000108405", "gnorm": "8.515", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3675"} 2023-01-29 17:12:59 | INFO | train_inner | {"epoch": 8, "update": 7.527, "s2c_loss": "0.696", "loss": "0.48257", "s2c_nll_loss": "0.696", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "241.3", "ups": "3.77", "wpb": "64", "bsz": "64", 
"num_updates": "16270", "lr": "0.000108471", "gnorm": "8.59", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3677"} 2023-01-29 17:13:02 | INFO | train_inner | {"epoch": 8, "update": 7.531, "s2c_loss": "0.743", "loss": "0.51516", "s2c_nll_loss": "0.743", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "16280", "lr": "0.000108538", "gnorm": "9.052", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3680"} 2023-01-29 17:13:04 | INFO | train_inner | {"epoch": 8, "update": 7.536, "s2c_loss": "0.669", "loss": "0.46424", "s2c_nll_loss": "0.669", "s2c_accuracy": "89.168", "s2c_total": "63.7", "s2c_n_correct": "56.8", "wps": "251.1", "ups": "3.94", "wpb": "63.7", "bsz": "63.7", "num_updates": "16290", "lr": "0.000108605", "gnorm": "8.404", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3682"} 2023-01-29 17:13:07 | INFO | train_inner | {"epoch": 8, "update": 7.541, "s2c_loss": "0.526", "loss": "0.36469", "s2c_nll_loss": "0.526", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "247.9", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "16300", "lr": "0.000108671", "gnorm": "7.611", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "3685"} 2023-01-29 17:13:10 | INFO | train_inner | {"epoch": 8, "update": 7.545, "s2c_loss": "0.482", "loss": "0.33414", "s2c_nll_loss": "0.482", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "16310", "lr": "0.000108738", "gnorm": "8.206", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "3687"} 2023-01-29 17:13:12 | INFO | train_inner | {"epoch": 8, "update": 7.55, "s2c_loss": "0.523", "loss": "0.36256", "s2c_nll_loss": "0.523", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "255", "ups": "3.98", "wpb": 
"64", "bsz": "64", "num_updates": "16320", "lr": "0.000108805", "gnorm": "7.481", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "3690"} 2023-01-29 17:13:15 | INFO | train_inner | {"epoch": 8, "update": 7.555, "s2c_loss": "0.649", "loss": "0.44995", "s2c_nll_loss": "0.649", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "16330", "lr": "0.000108871", "gnorm": "8.467", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "3693"} 2023-01-29 17:13:15 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 1024.0 2023-01-29 17:13:17 | INFO | train_inner | {"epoch": 8, "update": 7.56, "s2c_loss": "0.632", "loss": "0.43773", "s2c_nll_loss": "0.632", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "226", "ups": "3.53", "wpb": "64", "bsz": "64", "num_updates": "16340", "lr": "0.000108938", "gnorm": "8.545", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3695"} 2023-01-29 17:13:20 | INFO | train_inner | {"epoch": 8, "update": 7.564, "s2c_loss": "0.748", "loss": "0.51868", "s2c_nll_loss": "0.748", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "16350", "lr": "0.000109005", "gnorm": "8.623", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3698"} 2023-01-29 17:13:22 | INFO | train_inner | {"epoch": 8, "update": 7.569, "s2c_loss": "0.841", "loss": "0.58314", "s2c_nll_loss": "0.841", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "16360", "lr": "0.000109071", "gnorm": "9.937", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3700"} 2023-01-29 17:13:25 | INFO | train_inner | {"epoch": 8, "update": 7.574, "s2c_loss": "0.611", "loss": 
"0.4235", "s2c_nll_loss": "0.611", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "16370", "lr": "0.000109138", "gnorm": "8.307", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3703"} 2023-01-29 17:13:28 | INFO | train_inner | {"epoch": 8, "update": 7.578, "s2c_loss": "0.603", "loss": "0.41764", "s2c_nll_loss": "0.603", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "16380", "lr": "0.000109205", "gnorm": "8.629", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3705"} 2023-01-29 17:13:30 | INFO | train_inner | {"epoch": 8, "update": 7.583, "s2c_loss": "0.586", "loss": "0.40613", "s2c_nll_loss": "0.586", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "16390", "lr": "0.000109271", "gnorm": "8.767", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3708"} 2023-01-29 17:13:33 | INFO | train_inner | {"epoch": 8, "update": 7.587, "s2c_loss": "0.624", "loss": "0.43223", "s2c_nll_loss": "0.624", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "16400", "lr": "0.000109338", "gnorm": "9.407", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3711"} 2023-01-29 17:13:35 | INFO | train_inner | {"epoch": 8, "update": 7.592, "s2c_loss": "0.562", "loss": "0.38988", "s2c_nll_loss": "0.562", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "16410", "lr": "0.000109405", "gnorm": "7.734", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3713"} 2023-01-29 17:13:38 | INFO | train_inner | {"epoch": 8, "update": 7.597, "s2c_loss": "0.58", 
"loss": "0.40181", "s2c_nll_loss": "0.58", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "16420", "lr": "0.000109471", "gnorm": "7.708", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3716"} 2023-01-29 17:13:40 | INFO | train_inner | {"epoch": 8, "update": 7.601, "s2c_loss": "0.578", "loss": "0.4008", "s2c_nll_loss": "0.578", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "16430", "lr": "0.000109538", "gnorm": "8.405", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3718"} 2023-01-29 17:13:43 | INFO | train_inner | {"epoch": 8, "update": 7.606, "s2c_loss": "0.546", "loss": "0.37831", "s2c_nll_loss": "0.546", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "16440", "lr": "0.000109605", "gnorm": "8.303", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3721"} 2023-01-29 17:13:45 | INFO | train_inner | {"epoch": 8, "update": 7.611, "s2c_loss": "0.763", "loss": "0.52881", "s2c_nll_loss": "0.763", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "16450", "lr": "0.000109671", "gnorm": "10.888", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3723"} 2023-01-29 17:13:48 | INFO | train_inner | {"epoch": 8, "update": 7.615, "s2c_loss": "0.786", "loss": "0.54481", "s2c_nll_loss": "0.786", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "16460", "lr": "0.000109738", "gnorm": "9.761", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3726"} 2023-01-29 17:13:50 | INFO | train_inner | {"epoch": 8, "update": 7.62, "s2c_loss": 
"0.697", "loss": "0.48319", "s2c_nll_loss": "0.697", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "16470", "lr": "0.000109805", "gnorm": "8.528", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3728"} 2023-01-29 17:13:53 | INFO | train_inner | {"epoch": 8, "update": 7.624, "s2c_loss": "0.665", "loss": "0.46075", "s2c_nll_loss": "0.665", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "16480", "lr": "0.000109871", "gnorm": "9.509", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3731"} 2023-01-29 17:13:55 | INFO | train_inner | {"epoch": 8, "update": 7.629, "s2c_loss": "0.605", "loss": "0.4192", "s2c_nll_loss": "0.605", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "16490", "lr": "0.000109938", "gnorm": "8.269", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3733"} 2023-01-29 17:13:58 | INFO | train_inner | {"epoch": 8, "update": 7.634, "s2c_loss": "0.568", "loss": "0.39355", "s2c_nll_loss": "0.568", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "16500", "lr": "0.000110005", "gnorm": "8.668", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3736"} 2023-01-29 17:14:00 | INFO | train_inner | {"epoch": 8, "update": 7.638, "s2c_loss": "1.097", "loss": "0.76018", "s2c_nll_loss": "1.097", "s2c_accuracy": "83.594", "s2c_total": "64", "s2c_n_correct": "53.5", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "16510", "lr": "0.000110071", "gnorm": "10.835", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3738"} 2023-01-29 17:14:03 | INFO | train_inner | {"epoch": 8, "update": 7.643, 
"s2c_loss": "0.729", "loss": "0.50546", "s2c_nll_loss": "0.729", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "16520", "lr": "0.000110138", "gnorm": "10.02", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3741"} 2023-01-29 17:14:05 | INFO | train_inner | {"epoch": 8, "update": 7.648, "s2c_loss": "0.72", "loss": "0.49897", "s2c_nll_loss": "0.72", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "16530", "lr": "0.000110204", "gnorm": "10.002", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3743"} 2023-01-29 17:14:08 | INFO | train_inner | {"epoch": 8, "update": 7.652, "s2c_loss": "0.69", "loss": "0.47842", "s2c_nll_loss": "0.69", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "16540", "lr": "0.000110271", "gnorm": "9.575", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3746"} 2023-01-29 17:14:11 | INFO | train_inner | {"epoch": 8, "update": 7.657, "s2c_loss": "0.699", "loss": "0.4844", "s2c_nll_loss": "0.699", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "16550", "lr": "0.000110338", "gnorm": "8.941", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3748"} 2023-01-29 17:14:13 | INFO | train_inner | {"epoch": 8, "update": 7.661, "s2c_loss": "0.582", "loss": "0.40314", "s2c_nll_loss": "0.582", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "16560", "lr": "0.000110404", "gnorm": "9.556", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3751"} 2023-01-29 17:14:16 | INFO | train_inner | {"epoch": 8, 
"update": 7.666, "s2c_loss": "0.474", "loss": "0.32883", "s2c_nll_loss": "0.474", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "16570", "lr": "0.000110471", "gnorm": "7.222", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3754"} 2023-01-29 17:14:18 | INFO | train_inner | {"epoch": 8, "update": 7.671, "s2c_loss": "0.608", "loss": "0.42149", "s2c_nll_loss": "0.608", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "16580", "lr": "0.000110538", "gnorm": "9.255", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3756"} 2023-01-29 17:14:21 | INFO | train_inner | {"epoch": 8, "update": 7.675, "s2c_loss": "0.547", "loss": "0.37911", "s2c_nll_loss": "0.547", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "16590", "lr": "0.000110604", "gnorm": "8.02", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3759"} 2023-01-29 17:14:23 | INFO | train_inner | {"epoch": 8, "update": 7.68, "s2c_loss": "0.58", "loss": "0.40174", "s2c_nll_loss": "0.58", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "246.2", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "16600", "lr": "0.000110671", "gnorm": "7.963", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3761"} 2023-01-29 17:14:26 | INFO | train_inner | {"epoch": 8, "update": 7.685, "s2c_loss": "0.68", "loss": "0.47123", "s2c_nll_loss": "0.68", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "257.6", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "16610", "lr": "0.000110738", "gnorm": "8.895", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3764"} 2023-01-29 17:14:28 | INFO | train_inner | 
{"epoch": 8, "update": 7.689, "s2c_loss": "0.611", "loss": "0.42369", "s2c_nll_loss": "0.611", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "16620", "lr": "0.000110804", "gnorm": "7.662", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3766"} 2023-01-29 17:14:31 | INFO | train_inner | {"epoch": 8, "update": 7.694, "s2c_loss": "0.62", "loss": "0.42948", "s2c_nll_loss": "0.62", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "16630", "lr": "0.000110871", "gnorm": "8.654", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3769"} 2023-01-29 17:14:33 | INFO | train_inner | {"epoch": 8, "update": 7.698, "s2c_loss": "0.623", "loss": "0.43211", "s2c_nll_loss": "0.623", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "16640", "lr": "0.000110938", "gnorm": "7.629", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3771"} 2023-01-29 17:14:36 | INFO | train_inner | {"epoch": 8, "update": 7.703, "s2c_loss": "0.594", "loss": "0.41178", "s2c_nll_loss": "0.594", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "16650", "lr": "0.000111004", "gnorm": "8.226", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3774"} 2023-01-29 17:14:38 | INFO | train_inner | {"epoch": 8, "update": 7.708, "s2c_loss": "0.649", "loss": "0.45012", "s2c_nll_loss": "0.649", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "258.2", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "16660", "lr": "0.000111071", "gnorm": "8.485", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3776"} 2023-01-29 17:14:41 | INFO | 
train_inner | {"epoch": 8, "update": 7.712, "s2c_loss": "0.652", "loss": "0.45186", "s2c_nll_loss": "0.652", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "16670", "lr": "0.000111138", "gnorm": "8.862", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3779"} 2023-01-29 17:14:43 | INFO | train_inner | {"epoch": 8, "update": 7.717, "s2c_loss": "0.597", "loss": "0.4136", "s2c_nll_loss": "0.597", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "16680", "lr": "0.000111204", "gnorm": "8.381", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3781"} 2023-01-29 17:14:46 | INFO | train_inner | {"epoch": 8, "update": 7.722, "s2c_loss": "0.545", "loss": "0.37801", "s2c_nll_loss": "0.545", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "16690", "lr": "0.000111271", "gnorm": "8.766", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3784"} 2023-01-29 17:14:49 | INFO | train_inner | {"epoch": 8, "update": 7.726, "s2c_loss": "0.662", "loss": "0.45907", "s2c_nll_loss": "0.662", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "16700", "lr": "0.000111338", "gnorm": "9.513", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3786"} 2023-01-29 17:14:51 | INFO | train_inner | {"epoch": 8, "update": 7.731, "s2c_loss": "0.566", "loss": "0.39234", "s2c_nll_loss": "0.566", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "16710", "lr": "0.000111404", "gnorm": "8.161", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3789"} 2023-01-29 
17:14:54 | INFO | train_inner | {"epoch": 8, "update": 7.735, "s2c_loss": "0.555", "loss": "0.38439", "s2c_nll_loss": "0.555", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "16720", "lr": "0.000111471", "gnorm": "8.183", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3792"} 2023-01-29 17:14:56 | INFO | train_inner | {"epoch": 8, "update": 7.74, "s2c_loss": "0.66", "loss": "0.45773", "s2c_nll_loss": "0.66", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "16730", "lr": "0.000111538", "gnorm": "8.777", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3794"} 2023-01-29 17:14:59 | INFO | train_inner | {"epoch": 8, "update": 7.745, "s2c_loss": "0.635", "loss": "0.44021", "s2c_nll_loss": "0.635", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "243.5", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "16740", "lr": "0.000111604", "gnorm": "8.2", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3797"} 2023-01-29 17:15:01 | INFO | train_inner | {"epoch": 8, "update": 7.749, "s2c_loss": "0.618", "loss": "0.42811", "s2c_nll_loss": "0.618", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "259.2", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "16750", "lr": "0.000111671", "gnorm": "7.92", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3799"} 2023-01-29 17:15:04 | INFO | train_inner | {"epoch": 8, "update": 7.754, "s2c_loss": "0.719", "loss": "0.49865", "s2c_nll_loss": "0.719", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "16760", "lr": "0.000111738", "gnorm": "9.513", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3802"} 
2023-01-29 17:15:06 | INFO | train_inner | {"epoch": 8, "update": 7.759, "s2c_loss": "0.854", "loss": "0.5917", "s2c_nll_loss": "0.854", "s2c_accuracy": "85.625", "s2c_total": "64", "s2c_n_correct": "54.8", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "16770", "lr": "0.000111804", "gnorm": "9.139", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3804"} 2023-01-29 17:15:09 | INFO | train_inner | {"epoch": 8, "update": 7.763, "s2c_loss": "0.618", "loss": "0.42862", "s2c_nll_loss": "0.618", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "16780", "lr": "0.000111871", "gnorm": "8.931", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3807"} 2023-01-29 17:15:11 | INFO | train_inner | {"epoch": 8, "update": 7.768, "s2c_loss": "0.781", "loss": "0.54145", "s2c_nll_loss": "0.781", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "16790", "lr": "0.000111938", "gnorm": "8.225", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3809"} 2023-01-29 17:15:14 | INFO | train_inner | {"epoch": 8, "update": 7.772, "s2c_loss": "0.557", "loss": "0.38594", "s2c_nll_loss": "0.557", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "16800", "lr": "0.000112004", "gnorm": "7.675", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3812"} 2023-01-29 17:15:16 | INFO | train_inner | {"epoch": 8, "update": 7.777, "s2c_loss": "0.658", "loss": "0.45642", "s2c_nll_loss": "0.658", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "16810", "lr": "0.000112071", "gnorm": "9.677", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", 
"wall": "3814"} 2023-01-29 17:15:19 | INFO | train_inner | {"epoch": 8, "update": 7.782, "s2c_loss": "0.833", "loss": "0.57706", "s2c_nll_loss": "0.833", "s2c_accuracy": "86.25", "s2c_total": "64", "s2c_n_correct": "55.2", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "16820", "lr": "0.000112138", "gnorm": "10.51", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3817"} 2023-01-29 17:15:22 | INFO | train_inner | {"epoch": 8, "update": 7.786, "s2c_loss": "0.637", "loss": "0.44149", "s2c_nll_loss": "0.637", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "16830", "lr": "0.000112204", "gnorm": "9.494", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3819"} 2023-01-29 17:15:24 | INFO | train_inner | {"epoch": 8, "update": 7.791, "s2c_loss": "0.785", "loss": "0.54419", "s2c_nll_loss": "0.785", "s2c_accuracy": "85.469", "s2c_total": "64", "s2c_n_correct": "54.7", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "16840", "lr": "0.000112271", "gnorm": "11.585", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3822"} 2023-01-29 17:15:27 | INFO | train_inner | {"epoch": 8, "update": 7.796, "s2c_loss": "0.59", "loss": "0.40867", "s2c_nll_loss": "0.59", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "16850", "lr": "0.000112338", "gnorm": "8.725", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3825"} 2023-01-29 17:15:29 | INFO | train_inner | {"epoch": 8, "update": 7.8, "s2c_loss": "0.576", "loss": "0.39934", "s2c_nll_loss": "0.576", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "16860", "lr": "0.000112404", "gnorm": "8.75", "loss_scale": "1024", "train_wall": "2", "gb_free": 
"7.2", "wall": "3827"} 2023-01-29 17:15:32 | INFO | train_inner | {"epoch": 8, "update": 7.805, "s2c_loss": "0.775", "loss": "0.53731", "s2c_nll_loss": "0.775", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "16870", "lr": "0.000112471", "gnorm": "9.142", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3830"} 2023-01-29 17:15:34 | INFO | train_inner | {"epoch": 8, "update": 7.809, "s2c_loss": "0.624", "loss": "0.43229", "s2c_nll_loss": "0.624", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "16880", "lr": "0.000112538", "gnorm": "7.698", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3832"} 2023-01-29 17:15:37 | INFO | train_inner | {"epoch": 8, "update": 7.814, "s2c_loss": "0.558", "loss": "0.38649", "s2c_nll_loss": "0.558", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "16890", "lr": "0.000112604", "gnorm": "8.057", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3835"} 2023-01-29 17:15:39 | INFO | train_inner | {"epoch": 8, "update": 7.819, "s2c_loss": "0.594", "loss": "0.41177", "s2c_nll_loss": "0.594", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "16900", "lr": "0.000112671", "gnorm": "7.872", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3837"} 2023-01-29 17:15:42 | INFO | train_inner | {"epoch": 8, "update": 7.823, "s2c_loss": "0.545", "loss": "0.37769", "s2c_nll_loss": "0.545", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "16910", "lr": "0.000112738", "gnorm": "7.634", "loss_scale": "1024", "train_wall": 
"3", "gb_free": "7.3", "wall": "3840"} 2023-01-29 17:15:44 | INFO | train_inner | {"epoch": 8, "update": 7.828, "s2c_loss": "0.614", "loss": "0.42554", "s2c_nll_loss": "0.614", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "16920", "lr": "0.000112804", "gnorm": "8.012", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3842"} 2023-01-29 17:15:47 | INFO | train_inner | {"epoch": 8, "update": 7.833, "s2c_loss": "0.677", "loss": "0.46951", "s2c_nll_loss": "0.677", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "16930", "lr": "0.000112871", "gnorm": "10.113", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3845"} 2023-01-29 17:15:49 | INFO | train_inner | {"epoch": 8, "update": 7.837, "s2c_loss": "0.677", "loss": "0.46935", "s2c_nll_loss": "0.677", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "16940", "lr": "0.000112938", "gnorm": "9.163", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3847"} 2023-01-29 17:15:52 | INFO | train_inner | {"epoch": 8, "update": 7.842, "s2c_loss": "0.626", "loss": "0.43396", "s2c_nll_loss": "0.626", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "248", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "16950", "lr": "0.000113004", "gnorm": "8.485", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3850"} 2023-01-29 17:15:55 | INFO | train_inner | {"epoch": 8, "update": 7.846, "s2c_loss": "0.619", "loss": "0.4294", "s2c_nll_loss": "0.619", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "16960", "lr": "0.000113071", "gnorm": "8.192", "loss_scale": "1024", 
"train_wall": "3", "gb_free": "7.2", "wall": "3853"} 2023-01-29 17:15:57 | INFO | train_inner | {"epoch": 8, "update": 7.851, "s2c_loss": "0.588", "loss": "0.40781", "s2c_nll_loss": "0.588", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "16970", "lr": "0.000113138", "gnorm": "8.424", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3855"} 2023-01-29 17:16:00 | INFO | train_inner | {"epoch": 8, "update": 7.856, "s2c_loss": "0.638", "loss": "0.44216", "s2c_nll_loss": "0.638", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "16980", "lr": "0.000113204", "gnorm": "8.491", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3858"} 2023-01-29 17:16:02 | INFO | train_inner | {"epoch": 8, "update": 7.86, "s2c_loss": "0.621", "loss": "0.43038", "s2c_nll_loss": "0.621", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "246.6", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "16990", "lr": "0.000113271", "gnorm": "7.984", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3860"} 2023-01-29 17:16:05 | INFO | train_inner | {"epoch": 8, "update": 7.865, "s2c_loss": "0.713", "loss": "0.49437", "s2c_nll_loss": "0.713", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "17000", "lr": "0.000113338", "gnorm": "8.156", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3863"} 2023-01-29 17:16:07 | INFO | train_inner | {"epoch": 8, "update": 7.87, "s2c_loss": "0.667", "loss": "0.46204", "s2c_nll_loss": "0.667", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "17010", "lr": "0.000113404", "gnorm": "8.264", 
"loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3865"} 2023-01-29 17:16:10 | INFO | train_inner | {"epoch": 8, "update": 7.874, "s2c_loss": "0.757", "loss": "0.52496", "s2c_nll_loss": "0.757", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "17020", "lr": "0.000113471", "gnorm": "10.149", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3868"} 2023-01-29 17:16:12 | INFO | train_inner | {"epoch": 8, "update": 7.879, "s2c_loss": "0.708", "loss": "0.49053", "s2c_nll_loss": "0.708", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "17030", "lr": "0.000113538", "gnorm": "10.104", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3870"} 2023-01-29 17:16:15 | INFO | train_inner | {"epoch": 8, "update": 7.883, "s2c_loss": "0.683", "loss": "0.47314", "s2c_nll_loss": "0.683", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "17040", "lr": "0.000113604", "gnorm": "8.041", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3873"} 2023-01-29 17:16:18 | INFO | train_inner | {"epoch": 8, "update": 7.888, "s2c_loss": "0.713", "loss": "0.49414", "s2c_nll_loss": "0.713", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "17050", "lr": "0.000113671", "gnorm": "8.952", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3875"} 2023-01-29 17:16:20 | INFO | train_inner | {"epoch": 8, "update": 7.893, "s2c_loss": "0.537", "loss": "0.37238", "s2c_nll_loss": "0.537", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "17060", "lr": "0.000113738", 
"gnorm": "8.249", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3878"} 2023-01-29 17:16:23 | INFO | train_inner | {"epoch": 8, "update": 7.897, "s2c_loss": "0.488", "loss": "0.33808", "s2c_nll_loss": "0.488", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "246.9", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "17070", "lr": "0.000113804", "gnorm": "7.561", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3881"} 2023-01-29 17:16:25 | INFO | train_inner | {"epoch": 8, "update": 7.902, "s2c_loss": "0.593", "loss": "0.41085", "s2c_nll_loss": "0.593", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "17080", "lr": "0.000113871", "gnorm": "9.375", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3883"} 2023-01-29 17:16:28 | INFO | train_inner | {"epoch": 8, "update": 7.907, "s2c_loss": "0.697", "loss": "0.48313", "s2c_nll_loss": "0.697", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "17090", "lr": "0.000113938", "gnorm": "8.737", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3886"} 2023-01-29 17:16:30 | INFO | train_inner | {"epoch": 8, "update": 7.911, "s2c_loss": "0.733", "loss": "0.50777", "s2c_nll_loss": "0.733", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "242.6", "ups": "3.79", "wpb": "64", "bsz": "64", "num_updates": "17100", "lr": "0.000114004", "gnorm": "9.179", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3888"} 2023-01-29 17:16:33 | INFO | train_inner | {"epoch": 8, "update": 7.916, "s2c_loss": "0.738", "loss": "0.51166", "s2c_nll_loss": "0.738", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "17110", "lr": 
"0.000114071", "gnorm": "8.848", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3891"} 2023-01-29 17:16:36 | INFO | train_inner | {"epoch": 8, "update": 7.92, "s2c_loss": "0.638", "loss": "0.44191", "s2c_nll_loss": "0.638", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "17120", "lr": "0.000114138", "gnorm": "8.248", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3893"} 2023-01-29 17:16:38 | INFO | train_inner | {"epoch": 8, "update": 7.925, "s2c_loss": "0.743", "loss": "0.51526", "s2c_nll_loss": "0.743", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "17130", "lr": "0.000114204", "gnorm": "8.279", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3896"} 2023-01-29 17:16:41 | INFO | train_inner | {"epoch": 8, "update": 7.93, "s2c_loss": "0.491", "loss": "0.3404", "s2c_nll_loss": "0.491", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "248", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "17140", "lr": "0.000114271", "gnorm": "7.364", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3899"} 2023-01-29 17:16:43 | INFO | train_inner | {"epoch": 8, "update": 7.934, "s2c_loss": "0.643", "loss": "0.44594", "s2c_nll_loss": "0.643", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "17150", "lr": "0.000114338", "gnorm": "7.619", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3901"} 2023-01-29 17:16:46 | INFO | train_inner | {"epoch": 8, "update": 7.939, "s2c_loss": "0.569", "loss": "0.39467", "s2c_nll_loss": "0.569", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": 
"17160", "lr": "0.000114404", "gnorm": "7.953", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3904"} 2023-01-29 17:16:48 | INFO | train_inner | {"epoch": 8, "update": 7.944, "s2c_loss": "0.687", "loss": "0.47651", "s2c_nll_loss": "0.687", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "17170", "lr": "0.000114471", "gnorm": "8.833", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3906"} 2023-01-29 17:16:51 | INFO | train_inner | {"epoch": 8, "update": 7.948, "s2c_loss": "0.602", "loss": "0.41711", "s2c_nll_loss": "0.602", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "17180", "lr": "0.000114538", "gnorm": "7.991", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3909"} 2023-01-29 17:16:53 | INFO | train_inner | {"epoch": 8, "update": 7.953, "s2c_loss": "0.775", "loss": "0.53738", "s2c_nll_loss": "0.775", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "17190", "lr": "0.000114604", "gnorm": "9.364", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3911"} 2023-01-29 17:16:56 | INFO | train_inner | {"epoch": 8, "update": 7.957, "s2c_loss": "0.574", "loss": "0.39779", "s2c_nll_loss": "0.574", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "17200", "lr": "0.000114671", "gnorm": "8.398", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3914"} 2023-01-29 17:16:58 | INFO | train_inner | {"epoch": 8, "update": 7.962, "s2c_loss": "0.729", "loss": "0.50517", "s2c_nll_loss": "0.729", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", 
"num_updates": "17210", "lr": "0.000114738", "gnorm": "8.643", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3916"} 2023-01-29 17:17:01 | INFO | train_inner | {"epoch": 8, "update": 7.967, "s2c_loss": "0.655", "loss": "0.45394", "s2c_nll_loss": "0.655", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "17220", "lr": "0.000114804", "gnorm": "8.124", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3919"} 2023-01-29 17:17:04 | INFO | train_inner | {"epoch": 8, "update": 7.971, "s2c_loss": "0.656", "loss": "0.45495", "s2c_nll_loss": "0.656", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "17230", "lr": "0.000114871", "gnorm": "8.989", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3921"} 2023-01-29 17:17:06 | INFO | train_inner | {"epoch": 8, "update": 7.976, "s2c_loss": "0.541", "loss": "0.37521", "s2c_nll_loss": "0.541", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "17240", "lr": "0.000114938", "gnorm": "8.241", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3924"} 2023-01-29 17:17:09 | INFO | train_inner | {"epoch": 8, "update": 7.981, "s2c_loss": "0.541", "loss": "0.37521", "s2c_nll_loss": "0.541", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "17250", "lr": "0.000115004", "gnorm": "8.927", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3927"} 2023-01-29 17:17:11 | INFO | train_inner | {"epoch": 8, "update": 7.985, "s2c_loss": "0.726", "loss": "0.503", "s2c_nll_loss": "0.726", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "252.1", "ups": "3.94", "wpb": "64", 
"bsz": "64", "num_updates": "17260", "lr": "0.000115071", "gnorm": "8.174", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3929"} 2023-01-29 17:17:14 | INFO | train_inner | {"epoch": 8, "update": 7.99, "s2c_loss": "0.547", "loss": "0.37921", "s2c_nll_loss": "0.547", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "17270", "lr": "0.000115138", "gnorm": "8.003", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3932"} 2023-01-29 17:17:16 | INFO | train_inner | {"epoch": 8, "update": 7.994, "s2c_loss": "0.591", "loss": "0.40991", "s2c_nll_loss": "0.591", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "17280", "lr": "0.000115204", "gnorm": "8.382", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3934"} 2023-01-29 17:17:19 | INFO | train_inner | {"epoch": 8, "update": 7.999, "s2c_loss": "0.746", "loss": "0.51703", "s2c_nll_loss": "0.746", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "17290", "lr": "0.000115271", "gnorm": "9.922", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3937"} 2023-01-29 17:17:19 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 8 @ 17292 updates 2023-01-29 17:17:19 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 17:17:26 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 17:17:26 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 8 @ 17292 updates, score None) (writing took 6.905499142128974 seconds) 2023-01-29 17:17:26 | INFO | 
fairseq_cli.train | end of epoch 8 (average epoch stats below) 2023-01-29 17:17:26 | INFO | train | {"epoch": 8, "train_s2c_loss": "0.627", "train_loss": "0.43452", "train_s2c_nll_loss": "0.627", "train_s2c_accuracy": "89.628", "train_s2c_total": "63.9838", "train_s2c_n_correct": "57.3475", "train_wps": "245.7", "train_ups": "3.84", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "17292", "train_lr": "0.000115284", "train_gnorm": "8.546", "train_loss_scale": "1024", "train_train_wall": "542", "train_gb_free": "7.5", "train_wall": "3944"} 2023-01-29 17:17:32 | INFO | fairseq.trainer | begin training epoch 9 2023-01-29 17:17:32 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 17:17:35 | INFO | train_inner | {"epoch": 9, "update": 8.004, "s2c_loss": "0.587", "loss": "0.40676", "s2c_nll_loss": "0.587", "s2c_accuracy": "89.967", "s2c_total": "60.8", "s2c_n_correct": "54.7", "wps": "38.7", "ups": "0.64", "wpb": "60.8", "bsz": "60.8", "num_updates": "17300", "lr": "0.000115338", "gnorm": "8.933", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3952"} 2023-01-29 17:17:37 | INFO | train_inner | {"epoch": 9, "update": 8.008, "s2c_loss": "0.709", "loss": "0.49116", "s2c_nll_loss": "0.709", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "17310", "lr": "0.000115404", "gnorm": "9.258", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "3955"} 2023-01-29 17:17:40 | INFO | train_inner | {"epoch": 9, "update": 8.013, "s2c_loss": "0.619", "loss": "0.42876", "s2c_nll_loss": "0.619", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "245", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "17320", "lr": "0.000115471", "gnorm": "8.433", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3958"} 2023-01-29 17:17:42 | INFO | train_inner | {"epoch": 9, "update": 8.018, "s2c_loss": 
"0.514", "loss": "0.35642", "s2c_nll_loss": "0.514", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "17330", "lr": "0.000115538", "gnorm": "7.938", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3960"} 2023-01-29 17:17:45 | INFO | train_inner | {"epoch": 9, "update": 8.022, "s2c_loss": "0.43", "loss": "0.29771", "s2c_nll_loss": "0.43", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "17340", "lr": "0.000115604", "gnorm": "7.054", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3963"} 2023-01-29 17:17:47 | INFO | train_inner | {"epoch": 9, "update": 8.027, "s2c_loss": "0.723", "loss": "0.50137", "s2c_nll_loss": "0.723", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "17350", "lr": "0.000115671", "gnorm": "7.728", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3965"} 2023-01-29 17:17:50 | INFO | train_inner | {"epoch": 9, "update": 8.031, "s2c_loss": "0.543", "loss": "0.37654", "s2c_nll_loss": "0.543", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "258.5", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "17360", "lr": "0.000115738", "gnorm": "7.411", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3968"} 2023-01-29 17:17:52 | INFO | train_inner | {"epoch": 9, "update": 8.036, "s2c_loss": "0.55", "loss": "0.38125", "s2c_nll_loss": "0.55", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "17370", "lr": "0.000115804", "gnorm": "8.355", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3970"} 2023-01-29 17:17:55 | INFO | train_inner | {"epoch": 9, "update": 8.041, 
"s2c_loss": "0.461", "loss": "0.31928", "s2c_nll_loss": "0.461", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "17380", "lr": "0.000115871", "gnorm": "7.565", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "3973"} 2023-01-29 17:17:57 | INFO | train_inner | {"epoch": 9, "update": 8.045, "s2c_loss": "0.604", "loss": "0.41895", "s2c_nll_loss": "0.604", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "17390", "lr": "0.000115938", "gnorm": "8.649", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "3975"} 2023-01-29 17:18:00 | INFO | train_inner | {"epoch": 9, "update": 8.05, "s2c_loss": "0.461", "loss": "0.31949", "s2c_nll_loss": "0.461", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "17400", "lr": "0.000116004", "gnorm": "7.787", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3978"} 2023-01-29 17:18:03 | INFO | train_inner | {"epoch": 9, "update": 8.055, "s2c_loss": "0.665", "loss": "0.46087", "s2c_nll_loss": "0.665", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "17410", "lr": "0.000116071", "gnorm": "7.812", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3981"} 2023-01-29 17:18:05 | INFO | train_inner | {"epoch": 9, "update": 8.059, "s2c_loss": "0.433", "loss": "0.30009", "s2c_nll_loss": "0.433", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "243.3", "ups": "3.8", "wpb": "64", "bsz": "64", "num_updates": "17420", "lr": "0.000116138", "gnorm": "7.87", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "3983"} 2023-01-29 17:18:08 | INFO | train_inner | {"epoch": 9, 
"update": 8.064, "s2c_loss": "0.84", "loss": "0.58242", "s2c_nll_loss": "0.84", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "17430", "lr": "0.000116204", "gnorm": "8.319", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3986"} 2023-01-29 17:18:10 | INFO | train_inner | {"epoch": 9, "update": 8.068, "s2c_loss": "0.78", "loss": "0.5405", "s2c_nll_loss": "0.78", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "17440", "lr": "0.000116271", "gnorm": "8.779", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3988"} 2023-01-29 17:18:13 | INFO | train_inner | {"epoch": 9, "update": 8.073, "s2c_loss": "0.531", "loss": "0.36786", "s2c_nll_loss": "0.531", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "17450", "lr": "0.000116338", "gnorm": "7.414", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.5", "wall": "3991"} 2023-01-29 17:18:15 | INFO | train_inner | {"epoch": 9, "update": 8.078, "s2c_loss": "0.552", "loss": "0.38273", "s2c_nll_loss": "0.552", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "17460", "lr": "0.000116404", "gnorm": "7.818", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3993"} 2023-01-29 17:18:18 | INFO | train_inner | {"epoch": 9, "update": 8.082, "s2c_loss": "0.481", "loss": "0.33359", "s2c_nll_loss": "0.481", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "17470", "lr": "0.000116471", "gnorm": "7.415", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "3996"} 2023-01-29 17:18:21 | INFO | train_inner | 
{"epoch": 9, "update": 8.087, "s2c_loss": "0.444", "loss": "0.3079", "s2c_nll_loss": "0.444", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "244.9", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "17480", "lr": "0.000116538", "gnorm": "7.91", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "3998"} 2023-01-29 17:18:23 | INFO | train_inner | {"epoch": 9, "update": 8.092, "s2c_loss": "0.602", "loss": "0.41699", "s2c_nll_loss": "0.602", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "241.7", "ups": "3.78", "wpb": "64", "bsz": "64", "num_updates": "17490", "lr": "0.000116604", "gnorm": "8.321", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4001"} 2023-01-29 17:18:26 | INFO | train_inner | {"epoch": 9, "update": 8.096, "s2c_loss": "0.587", "loss": "0.40699", "s2c_nll_loss": "0.587", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "17500", "lr": "0.000116671", "gnorm": "8.394", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "4004"} 2023-01-29 17:18:28 | INFO | train_inner | {"epoch": 9, "update": 8.101, "s2c_loss": "0.47", "loss": "0.32571", "s2c_nll_loss": "0.47", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "17510", "lr": "0.000116737", "gnorm": "8.426", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4006"} 2023-01-29 17:18:31 | INFO | train_inner | {"epoch": 9, "update": 8.105, "s2c_loss": "0.581", "loss": "0.40302", "s2c_nll_loss": "0.581", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "17520", "lr": "0.000116804", "gnorm": "7.57", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4009"} 2023-01-29 17:18:33 | INFO | train_inner 
| {"epoch": 9, "update": 8.11, "s2c_loss": "0.557", "loss": "0.3864", "s2c_nll_loss": "0.557", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "17530", "lr": "0.000116871", "gnorm": "8.466", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4011"} 2023-01-29 17:18:36 | INFO | train_inner | {"epoch": 9, "update": 8.115, "s2c_loss": "0.467", "loss": "0.32357", "s2c_nll_loss": "0.467", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "17540", "lr": "0.000116937", "gnorm": "7.442", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4014"} 2023-01-29 17:18:38 | INFO | train_inner | {"epoch": 9, "update": 8.119, "s2c_loss": "0.573", "loss": "0.39737", "s2c_nll_loss": "0.573", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "17550", "lr": "0.000117004", "gnorm": "7.92", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4016"} 2023-01-29 17:18:41 | INFO | train_inner | {"epoch": 9, "update": 8.124, "s2c_loss": "0.465", "loss": "0.32204", "s2c_nll_loss": "0.465", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "17560", "lr": "0.000117071", "gnorm": "7.188", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4019"} 2023-01-29 17:18:43 | INFO | train_inner | {"epoch": 9, "update": 8.129, "s2c_loss": "0.6", "loss": "0.41574", "s2c_nll_loss": "0.6", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "17570", "lr": "0.000117137", "gnorm": "8.213", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4021"} 2023-01-29 17:18:46 | INFO | 
train_inner | {"epoch": 9, "update": 8.133, "s2c_loss": "0.662", "loss": "0.45898", "s2c_nll_loss": "0.662", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "17580", "lr": "0.000117204", "gnorm": "7.831", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4024"} 2023-01-29 17:18:49 | INFO | train_inner | {"epoch": 9, "update": 8.138, "s2c_loss": "0.469", "loss": "0.32535", "s2c_nll_loss": "0.469", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "17590", "lr": "0.000117271", "gnorm": "7.335", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "4026"} 2023-01-29 17:18:51 | INFO | train_inner | {"epoch": 9, "update": 8.142, "s2c_loss": "0.714", "loss": "0.49522", "s2c_nll_loss": "0.714", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "17600", "lr": "0.000117337", "gnorm": "8.815", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4029"} 2023-01-29 17:18:54 | INFO | train_inner | {"epoch": 9, "update": 8.147, "s2c_loss": "0.542", "loss": "0.3754", "s2c_nll_loss": "0.542", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "17610", "lr": "0.000117404", "gnorm": "7.626", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4032"} 2023-01-29 17:18:56 | INFO | train_inner | {"epoch": 9, "update": 8.152, "s2c_loss": "0.767", "loss": "0.53176", "s2c_nll_loss": "0.767", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "17620", "lr": "0.000117471", "gnorm": "9.7", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4034"} 2023-01-29 
17:18:59 | INFO | train_inner | {"epoch": 9, "update": 8.156, "s2c_loss": "0.645", "loss": "0.44697", "s2c_nll_loss": "0.645", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "17630", "lr": "0.000117537", "gnorm": "8.68", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4037"} 2023-01-29 17:19:01 | INFO | train_inner | {"epoch": 9, "update": 8.161, "s2c_loss": "0.564", "loss": "0.39101", "s2c_nll_loss": "0.564", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "17640", "lr": "0.000117604", "gnorm": "8.101", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4039"} 2023-01-29 17:19:04 | INFO | train_inner | {"epoch": 9, "update": 8.166, "s2c_loss": "0.644", "loss": "0.4464", "s2c_nll_loss": "0.644", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "17650", "lr": "0.000117671", "gnorm": "8.141", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4042"} 2023-01-29 17:19:06 | INFO | train_inner | {"epoch": 9, "update": 8.17, "s2c_loss": "0.498", "loss": "0.34488", "s2c_nll_loss": "0.498", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "17660", "lr": "0.000117737", "gnorm": "7.811", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4044"} 2023-01-29 17:19:09 | INFO | train_inner | {"epoch": 9, "update": 8.175, "s2c_loss": "0.736", "loss": "0.51029", "s2c_nll_loss": "0.736", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "17670", "lr": "0.000117804", "gnorm": "9.2", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4047"} 2023-01-29 
17:19:11 | INFO | train_inner | {"epoch": 9, "update": 8.179, "s2c_loss": "0.654", "loss": "0.45339", "s2c_nll_loss": "0.654", "s2c_accuracy": "89.639", "s2c_total": "63.7", "s2c_n_correct": "57.1", "wps": "255", "ups": "4", "wpb": "63.7", "bsz": "63.7", "num_updates": "17680", "lr": "0.000117871", "gnorm": "8.641", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4049"} 2023-01-29 17:19:14 | INFO | train_inner | {"epoch": 9, "update": 8.184, "s2c_loss": "0.525", "loss": "0.36357", "s2c_nll_loss": "0.525", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "17690", "lr": "0.000117937", "gnorm": "7.982", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4052"} 2023-01-29 17:19:16 | INFO | train_inner | {"epoch": 9, "update": 8.189, "s2c_loss": "0.505", "loss": "0.35011", "s2c_nll_loss": "0.505", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "17700", "lr": "0.000118004", "gnorm": "8.521", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4054"} 2023-01-29 17:19:19 | INFO | train_inner | {"epoch": 9, "update": 8.193, "s2c_loss": "0.536", "loss": "0.37158", "s2c_nll_loss": "0.536", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "17710", "lr": "0.000118071", "gnorm": "9.169", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4057"} 2023-01-29 17:19:21 | INFO | train_inner | {"epoch": 9, "update": 8.198, "s2c_loss": "0.643", "loss": "0.44536", "s2c_nll_loss": "0.643", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "17720", "lr": "0.000118137", "gnorm": "8.892", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": 
"4059"} 2023-01-29 17:19:24 | INFO | train_inner | {"epoch": 9, "update": 8.203, "s2c_loss": "0.661", "loss": "0.45822", "s2c_nll_loss": "0.661", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "247.9", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "17730", "lr": "0.000118204", "gnorm": "9.474", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4062"} 2023-01-29 17:19:27 | INFO | train_inner | {"epoch": 9, "update": 8.207, "s2c_loss": "0.631", "loss": "0.43707", "s2c_nll_loss": "0.631", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "17740", "lr": "0.000118271", "gnorm": "8.002", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4065"} 2023-01-29 17:19:29 | INFO | train_inner | {"epoch": 9, "update": 8.212, "s2c_loss": "0.489", "loss": "0.33917", "s2c_nll_loss": "0.489", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "17750", "lr": "0.000118337", "gnorm": "6.799", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4067"} 2023-01-29 17:19:32 | INFO | train_inner | {"epoch": 9, "update": 8.216, "s2c_loss": "0.576", "loss": "0.39937", "s2c_nll_loss": "0.576", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "17760", "lr": "0.000118404", "gnorm": "7.703", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4070"} 2023-01-29 17:19:34 | INFO | train_inner | {"epoch": 9, "update": 8.221, "s2c_loss": "0.719", "loss": "0.49812", "s2c_nll_loss": "0.719", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "17770", "lr": "0.000118471", "gnorm": "8.43", "loss_scale": "1024", "train_wall": "3", "gb_free": 
"7.3", "wall": "4072"} 2023-01-29 17:19:37 | INFO | train_inner | {"epoch": 9, "update": 8.226, "s2c_loss": "0.456", "loss": "0.31631", "s2c_nll_loss": "0.456", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "17780", "lr": "0.000118537", "gnorm": "7.671", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4075"} 2023-01-29 17:19:39 | INFO | train_inner | {"epoch": 9, "update": 8.23, "s2c_loss": "0.531", "loss": "0.36828", "s2c_nll_loss": "0.531", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "17790", "lr": "0.000118604", "gnorm": "7.787", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4077"} 2023-01-29 17:19:42 | INFO | train_inner | {"epoch": 9, "update": 8.235, "s2c_loss": "0.526", "loss": "0.36462", "s2c_nll_loss": "0.526", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "242.9", "ups": "3.8", "wpb": "64", "bsz": "64", "num_updates": "17800", "lr": "0.000118671", "gnorm": "6.706", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4080"} 2023-01-29 17:19:44 | INFO | train_inner | {"epoch": 9, "update": 8.24, "s2c_loss": "0.44", "loss": "0.30498", "s2c_nll_loss": "0.44", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "17810", "lr": "0.000118737", "gnorm": "6.803", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4082"} 2023-01-29 17:19:47 | INFO | train_inner | {"epoch": 9, "update": 8.244, "s2c_loss": "0.514", "loss": "0.35599", "s2c_nll_loss": "0.514", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "17820", "lr": "0.000118804", "gnorm": "8.55", "loss_scale": "1024", "train_wall": "3", 
"gb_free": "7.2", "wall": "4085"} 2023-01-29 17:19:50 | INFO | train_inner | {"epoch": 9, "update": 8.249, "s2c_loss": "0.559", "loss": "0.38748", "s2c_nll_loss": "0.559", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "17830", "lr": "0.000118871", "gnorm": "8.569", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4087"} 2023-01-29 17:19:52 | INFO | train_inner | {"epoch": 9, "update": 8.253, "s2c_loss": "0.629", "loss": "0.43585", "s2c_nll_loss": "0.629", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "17840", "lr": "0.000118937", "gnorm": "7.946", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4090"} 2023-01-29 17:19:55 | INFO | train_inner | {"epoch": 9, "update": 8.258, "s2c_loss": "0.518", "loss": "0.3592", "s2c_nll_loss": "0.518", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "17850", "lr": "0.000119004", "gnorm": "7.799", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4093"} 2023-01-29 17:19:57 | INFO | train_inner | {"epoch": 9, "update": 8.263, "s2c_loss": "0.748", "loss": "0.51869", "s2c_nll_loss": "0.748", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "17860", "lr": "0.000119071", "gnorm": "7.89", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4095"} 2023-01-29 17:20:00 | INFO | train_inner | {"epoch": 9, "update": 8.267, "s2c_loss": "0.494", "loss": "0.34217", "s2c_nll_loss": "0.494", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "17870", "lr": "0.000119137", "gnorm": "7.537", "loss_scale": "1024", 
"train_wall": "3", "gb_free": "7.2", "wall": "4098"} 2023-01-29 17:20:02 | INFO | train_inner | {"epoch": 9, "update": 8.272, "s2c_loss": "0.644", "loss": "0.44615", "s2c_nll_loss": "0.644", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "17880", "lr": "0.000119204", "gnorm": "8.277", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "4100"} 2023-01-29 17:20:05 | INFO | train_inner | {"epoch": 9, "update": 8.277, "s2c_loss": "0.608", "loss": "0.42114", "s2c_nll_loss": "0.608", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "17890", "lr": "0.000119271", "gnorm": "8.438", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4103"} 2023-01-29 17:20:07 | INFO | train_inner | {"epoch": 9, "update": 8.281, "s2c_loss": "0.45", "loss": "0.31212", "s2c_nll_loss": "0.45", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "17900", "lr": "0.000119337", "gnorm": "7.918", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4105"} 2023-01-29 17:20:10 | INFO | train_inner | {"epoch": 9, "update": 8.286, "s2c_loss": "0.586", "loss": "0.40612", "s2c_nll_loss": "0.586", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "17910", "lr": "0.000119404", "gnorm": "8.034", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4108"} 2023-01-29 17:20:12 | INFO | train_inner | {"epoch": 9, "update": 8.29, "s2c_loss": "0.498", "loss": "0.34523", "s2c_nll_loss": "0.498", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "17920", "lr": "0.000119471", "gnorm": "7.375", "loss_scale": 
"1024", "train_wall": "2", "gb_free": "7.2", "wall": "4110"} 2023-01-29 17:20:15 | INFO | train_inner | {"epoch": 9, "update": 8.295, "s2c_loss": "0.417", "loss": "0.28932", "s2c_nll_loss": "0.417", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "17930", "lr": "0.000119537", "gnorm": "7.685", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4113"} 2023-01-29 17:20:18 | INFO | train_inner | {"epoch": 9, "update": 8.3, "s2c_loss": "0.556", "loss": "0.38546", "s2c_nll_loss": "0.556", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "17940", "lr": "0.000119604", "gnorm": "9.361", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4115"} 2023-01-29 17:20:20 | INFO | train_inner | {"epoch": 9, "update": 8.304, "s2c_loss": "0.462", "loss": "0.31989", "s2c_nll_loss": "0.462", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "246.8", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "17950", "lr": "0.000119671", "gnorm": "7.668", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4118"} 2023-01-29 17:20:23 | INFO | train_inner | {"epoch": 9, "update": 8.309, "s2c_loss": "0.535", "loss": "0.37072", "s2c_nll_loss": "0.535", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "257.8", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "17960", "lr": "0.000119737", "gnorm": "7.778", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4121"} 2023-01-29 17:20:25 | INFO | train_inner | {"epoch": 9, "update": 8.314, "s2c_loss": "0.483", "loss": "0.33509", "s2c_nll_loss": "0.483", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "17970", "lr": "0.000119804", "gnorm": "7.381", 
"loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4123"} 2023-01-29 17:20:28 | INFO | train_inner | {"epoch": 9, "update": 8.318, "s2c_loss": "0.601", "loss": "0.41644", "s2c_nll_loss": "0.601", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "17980", "lr": "0.000119871", "gnorm": "8.463", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4126"} 2023-01-29 17:20:30 | INFO | train_inner | {"epoch": 9, "update": 8.323, "s2c_loss": "0.581", "loss": "0.40242", "s2c_nll_loss": "0.581", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "17990", "lr": "0.000119937", "gnorm": "8.205", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4128"} 2023-01-29 17:20:33 | INFO | train_inner | {"epoch": 9, "update": 8.327, "s2c_loss": "0.653", "loss": "0.4529", "s2c_nll_loss": "0.653", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "18000", "lr": "0.000120004", "gnorm": "8.885", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "4131"} 2023-01-29 17:20:35 | INFO | train_inner | {"epoch": 9, "update": 8.332, "s2c_loss": "0.593", "loss": "0.41113", "s2c_nll_loss": "0.593", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "18010", "lr": "0.000120071", "gnorm": "8.75", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4133"} 2023-01-29 17:20:38 | INFO | train_inner | {"epoch": 9, "update": 8.337, "s2c_loss": "0.515", "loss": "0.35688", "s2c_nll_loss": "0.515", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "18020", "lr": "0.000120137", 
"gnorm": "7.923", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4136"} 2023-01-29 17:20:40 | INFO | train_inner | {"epoch": 9, "update": 8.341, "s2c_loss": "0.722", "loss": "0.50042", "s2c_nll_loss": "0.722", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "18030", "lr": "0.000120204", "gnorm": "7.759", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4138"} 2023-01-29 17:20:43 | INFO | train_inner | {"epoch": 9, "update": 8.346, "s2c_loss": "0.55", "loss": "0.3813", "s2c_nll_loss": "0.55", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "18040", "lr": "0.000120271", "gnorm": "9.595", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "4141"} 2023-01-29 17:20:45 | INFO | train_inner | {"epoch": 9, "update": 8.351, "s2c_loss": "0.683", "loss": "0.47343", "s2c_nll_loss": "0.683", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "18050", "lr": "0.000120337", "gnorm": "8.556", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4143"} 2023-01-29 17:20:48 | INFO | train_inner | {"epoch": 9, "update": 8.355, "s2c_loss": "0.556", "loss": "0.38558", "s2c_nll_loss": "0.556", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "18060", "lr": "0.000120404", "gnorm": "7.9", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4146"} 2023-01-29 17:20:50 | INFO | train_inner | {"epoch": 9, "update": 8.36, "s2c_loss": "0.583", "loss": "0.40404", "s2c_nll_loss": "0.583", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "18070", "lr": 
"0.000120471", "gnorm": "8.321", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4148"} 2023-01-29 17:20:53 | INFO | train_inner | {"epoch": 9, "update": 8.364, "s2c_loss": "0.603", "loss": "0.41774", "s2c_nll_loss": "0.603", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "18080", "lr": "0.000120537", "gnorm": "7.488", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4151"} 2023-01-29 17:20:55 | INFO | train_inner | {"epoch": 9, "update": 8.369, "s2c_loss": "0.679", "loss": "0.47074", "s2c_nll_loss": "0.679", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "18090", "lr": "0.000120604", "gnorm": "8.453", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "4153"} 2023-01-29 17:20:58 | INFO | train_inner | {"epoch": 9, "update": 8.374, "s2c_loss": "0.535", "loss": "0.371", "s2c_nll_loss": "0.535", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "18100", "lr": "0.000120671", "gnorm": "7.956", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4156"} 2023-01-29 17:21:01 | INFO | train_inner | {"epoch": 9, "update": 8.378, "s2c_loss": "0.597", "loss": "0.41407", "s2c_nll_loss": "0.597", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "18110", "lr": "0.000120737", "gnorm": "8.988", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4158"} 2023-01-29 17:21:03 | INFO | train_inner | {"epoch": 9, "update": 8.383, "s2c_loss": "0.794", "loss": "0.55026", "s2c_nll_loss": "0.794", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": 
"18120", "lr": "0.000120804", "gnorm": "9.944", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4161"} 2023-01-29 17:21:06 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 512.0 2023-01-29 17:21:06 | INFO | train_inner | {"epoch": 9, "update": 8.388, "s2c_loss": "0.658", "loss": "0.45614", "s2c_nll_loss": "0.658", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "231.4", "ups": "3.62", "wpb": "64", "bsz": "64", "num_updates": "18130", "lr": "0.000120871", "gnorm": "9.46", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4164"} 2023-01-29 17:21:08 | INFO | train_inner | {"epoch": 9, "update": 8.393, "s2c_loss": "0.665", "loss": "0.46112", "s2c_nll_loss": "0.665", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "245.9", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "18140", "lr": "0.000120937", "gnorm": "10.653", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4166"} 2023-01-29 17:21:11 | INFO | train_inner | {"epoch": 9, "update": 8.397, "s2c_loss": "0.612", "loss": "0.4244", "s2c_nll_loss": "0.612", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "18150", "lr": "0.000121004", "gnorm": "8.766", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4169"} 2023-01-29 17:21:14 | INFO | train_inner | {"epoch": 9, "update": 8.402, "s2c_loss": "0.653", "loss": "0.45248", "s2c_nll_loss": "0.653", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "18160", "lr": "0.000121071", "gnorm": "9.136", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4171"} 2023-01-29 17:21:16 | INFO | train_inner | {"epoch": 9, "update": 8.407, "s2c_loss": "0.492", "loss": "0.34089", "s2c_nll_loss": "0.492", 
"s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "18170", "lr": "0.000121137", "gnorm": "7.581", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4174"} 2023-01-29 17:21:19 | INFO | train_inner | {"epoch": 9, "update": 8.411, "s2c_loss": "0.557", "loss": "0.38597", "s2c_nll_loss": "0.557", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "18180", "lr": "0.000121204", "gnorm": "8.95", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4177"} 2023-01-29 17:21:21 | INFO | train_inner | {"epoch": 9, "update": 8.416, "s2c_loss": "0.622", "loss": "0.43108", "s2c_nll_loss": "0.622", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "18190", "lr": "0.000121271", "gnorm": "7.916", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4179"} 2023-01-29 17:21:24 | INFO | train_inner | {"epoch": 9, "update": 8.42, "s2c_loss": "0.502", "loss": "0.3479", "s2c_nll_loss": "0.502", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "18200", "lr": "0.000121337", "gnorm": "7.853", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "4182"} 2023-01-29 17:21:26 | INFO | train_inner | {"epoch": 9, "update": 8.425, "s2c_loss": "0.433", "loss": "0.29997", "s2c_nll_loss": "0.433", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "18210", "lr": "0.000121404", "gnorm": "7.007", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4184"} 2023-01-29 17:21:29 | INFO | train_inner | {"epoch": 9, "update": 8.43, "s2c_loss": "0.499", "loss": "0.34603", "s2c_nll_loss": "0.499", 
"s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "18220", "lr": "0.000121471", "gnorm": "7.612", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4187"} 2023-01-29 17:21:31 | INFO | train_inner | {"epoch": 9, "update": 8.434, "s2c_loss": "0.547", "loss": "0.37912", "s2c_nll_loss": "0.547", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "18230", "lr": "0.000121537", "gnorm": "9.142", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "4189"} 2023-01-29 17:21:34 | INFO | train_inner | {"epoch": 9, "update": 8.439, "s2c_loss": "0.567", "loss": "0.3933", "s2c_nll_loss": "0.567", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "18240", "lr": "0.000121604", "gnorm": "7.985", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4192"} 2023-01-29 17:21:36 | INFO | train_inner | {"epoch": 9, "update": 8.444, "s2c_loss": "0.614", "loss": "0.42535", "s2c_nll_loss": "0.614", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "18250", "lr": "0.000121671", "gnorm": "8.072", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4194"} 2023-01-29 17:21:39 | INFO | train_inner | {"epoch": 9, "update": 8.448, "s2c_loss": "0.561", "loss": "0.389", "s2c_nll_loss": "0.561", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "18260", "lr": "0.000121737", "gnorm": "8.022", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "4197"} 2023-01-29 17:21:41 | INFO | train_inner | {"epoch": 9, "update": 8.453, "s2c_loss": "0.709", "loss": "0.49143", "s2c_nll_loss": 
"0.709", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "18270", "lr": "0.000121804", "gnorm": "9.67", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "4199"} 2023-01-29 17:21:44 | INFO | train_inner | {"epoch": 9, "update": 8.457, "s2c_loss": "0.528", "loss": "0.36625", "s2c_nll_loss": "0.528", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "18280", "lr": "0.000121871", "gnorm": "7.796", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4202"} 2023-01-29 17:21:46 | INFO | train_inner | {"epoch": 9, "update": 8.462, "s2c_loss": "0.659", "loss": "0.45668", "s2c_nll_loss": "0.659", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "259.1", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "18290", "lr": "0.000121937", "gnorm": "8.177", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4204"} 2023-01-29 17:21:49 | INFO | train_inner | {"epoch": 9, "update": 8.467, "s2c_loss": "0.593", "loss": "0.41117", "s2c_nll_loss": "0.593", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "18300", "lr": "0.000122004", "gnorm": "8.21", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4207"} 2023-01-29 17:21:51 | INFO | train_inner | {"epoch": 9, "update": 8.471, "s2c_loss": "0.502", "loss": "0.34768", "s2c_nll_loss": "0.502", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "18310", "lr": "0.000122071", "gnorm": "8.188", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4209"} 2023-01-29 17:21:54 | INFO | train_inner | {"epoch": 9, "update": 8.476, "s2c_loss": "0.617", "loss": "0.42743", 
"s2c_nll_loss": "0.617", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "246.5", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "18320", "lr": "0.000122137", "gnorm": "8.113", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4212"} 2023-01-29 17:21:57 | INFO | train_inner | {"epoch": 9, "update": 8.481, "s2c_loss": "0.675", "loss": "0.46755", "s2c_nll_loss": "0.675", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "18330", "lr": "0.000122204", "gnorm": "7.966", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4214"} 2023-01-29 17:21:59 | INFO | train_inner | {"epoch": 9, "update": 8.485, "s2c_loss": "0.504", "loss": "0.34919", "s2c_nll_loss": "0.504", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "18340", "lr": "0.000122271", "gnorm": "7.921", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4217"} 2023-01-29 17:22:02 | INFO | train_inner | {"epoch": 9, "update": 8.49, "s2c_loss": "0.521", "loss": "0.36144", "s2c_nll_loss": "0.521", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "18350", "lr": "0.000122337", "gnorm": "8.019", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4220"} 2023-01-29 17:22:04 | INFO | train_inner | {"epoch": 9, "update": 8.494, "s2c_loss": "0.629", "loss": "0.43605", "s2c_nll_loss": "0.629", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "18360", "lr": "0.000122404", "gnorm": "7.857", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4222"} 2023-01-29 17:22:07 | INFO | train_inner | {"epoch": 9, "update": 8.499, "s2c_loss": "0.64", "loss": 
"0.44375", "s2c_nll_loss": "0.64", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "18370", "lr": "0.000122471", "gnorm": "7.577", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4225"} 2023-01-29 17:22:09 | INFO | train_inner | {"epoch": 9, "update": 8.504, "s2c_loss": "0.636", "loss": "0.44114", "s2c_nll_loss": "0.636", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "259.8", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "18380", "lr": "0.000122537", "gnorm": "8.77", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4227"} 2023-01-29 17:22:12 | INFO | train_inner | {"epoch": 9, "update": 8.508, "s2c_loss": "0.69", "loss": "0.47855", "s2c_nll_loss": "0.69", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "18390", "lr": "0.000122604", "gnorm": "9.218", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4230"} 2023-01-29 17:22:14 | INFO | train_inner | {"epoch": 9, "update": 8.513, "s2c_loss": "0.651", "loss": "0.45158", "s2c_nll_loss": "0.651", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "18400", "lr": "0.000122671", "gnorm": "9.052", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4232"} 2023-01-29 17:22:17 | INFO | train_inner | {"epoch": 9, "update": 8.518, "s2c_loss": "0.544", "loss": "0.37708", "s2c_nll_loss": "0.544", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "18410", "lr": "0.000122737", "gnorm": "8.206", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4235"} 2023-01-29 17:22:19 | INFO | train_inner | {"epoch": 9, "update": 8.522, "s2c_loss": "0.528", "loss": 
"0.36597", "s2c_nll_loss": "0.528", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "18420", "lr": "0.000122804", "gnorm": "7.509", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4237"} 2023-01-29 17:22:22 | INFO | train_inner | {"epoch": 9, "update": 8.527, "s2c_loss": "0.605", "loss": "0.41942", "s2c_nll_loss": "0.605", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "18430", "lr": "0.000122871", "gnorm": "8.287", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4240"} 2023-01-29 17:22:24 | INFO | train_inner | {"epoch": 9, "update": 8.531, "s2c_loss": "0.56", "loss": "0.38831", "s2c_nll_loss": "0.56", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "18440", "lr": "0.000122937", "gnorm": "7.288", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4242"} 2023-01-29 17:22:27 | INFO | train_inner | {"epoch": 9, "update": 8.536, "s2c_loss": "0.526", "loss": "0.3649", "s2c_nll_loss": "0.526", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "18450", "lr": "0.000123004", "gnorm": "8.175", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4245"} 2023-01-29 17:22:30 | INFO | train_inner | {"epoch": 9, "update": 8.541, "s2c_loss": "0.519", "loss": "0.35952", "s2c_nll_loss": "0.519", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "18460", "lr": "0.000123071", "gnorm": "8.266", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4247"} 2023-01-29 17:22:32 | INFO | train_inner | {"epoch": 9, "update": 8.545, "s2c_loss": "0.734", 
"loss": "0.50867", "s2c_nll_loss": "0.734", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "18470", "lr": "0.000123137", "gnorm": "8.718", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4250"} 2023-01-29 17:22:35 | INFO | train_inner | {"epoch": 9, "update": 8.55, "s2c_loss": "0.542", "loss": "0.376", "s2c_nll_loss": "0.542", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "18480", "lr": "0.000123204", "gnorm": "8.532", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4252"} 2023-01-29 17:22:37 | INFO | train_inner | {"epoch": 9, "update": 8.555, "s2c_loss": "0.493", "loss": "0.34174", "s2c_nll_loss": "0.493", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "18490", "lr": "0.000123271", "gnorm": "7.253", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4255"} 2023-01-29 17:22:40 | INFO | train_inner | {"epoch": 9, "update": 8.559, "s2c_loss": "0.555", "loss": "0.38502", "s2c_nll_loss": "0.555", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "18500", "lr": "0.000123337", "gnorm": "8.31", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4258"} 2023-01-29 17:22:42 | INFO | train_inner | {"epoch": 9, "update": 8.564, "s2c_loss": "0.62", "loss": "0.42984", "s2c_nll_loss": "0.62", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "18510", "lr": "0.000123404", "gnorm": "9.134", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4260"} 2023-01-29 17:22:45 | INFO | train_inner | {"epoch": 9, "update": 8.568, "s2c_loss": 
"0.615", "loss": "0.42625", "s2c_nll_loss": "0.615", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "18520", "lr": "0.00012347", "gnorm": "9.283", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4263"} 2023-01-29 17:22:47 | INFO | train_inner | {"epoch": 9, "update": 8.573, "s2c_loss": "0.666", "loss": "0.46142", "s2c_nll_loss": "0.666", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "18530", "lr": "0.000123537", "gnorm": "9.4", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4265"} 2023-01-29 17:22:50 | INFO | train_inner | {"epoch": 9, "update": 8.578, "s2c_loss": "0.534", "loss": "0.37014", "s2c_nll_loss": "0.534", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "18540", "lr": "0.000123604", "gnorm": "7.608", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4268"} 2023-01-29 17:22:52 | INFO | train_inner | {"epoch": 9, "update": 8.582, "s2c_loss": "0.568", "loss": "0.39385", "s2c_nll_loss": "0.568", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "18550", "lr": "0.00012367", "gnorm": "7.424", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4270"} 2023-01-29 17:22:55 | INFO | train_inner | {"epoch": 9, "update": 8.587, "s2c_loss": "0.682", "loss": "0.47297", "s2c_nll_loss": "0.682", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "18560", "lr": "0.000123737", "gnorm": "9.599", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4273"} 2023-01-29 17:22:57 | INFO | train_inner | {"epoch": 9, "update": 8.592, "s2c_loss": 
"0.565", "loss": "0.39171", "s2c_nll_loss": "0.565", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "18570", "lr": "0.000123804", "gnorm": "8.09", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4275"} 2023-01-29 17:23:00 | INFO | train_inner | {"epoch": 9, "update": 8.596, "s2c_loss": "0.568", "loss": "0.39378", "s2c_nll_loss": "0.568", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "18580", "lr": "0.00012387", "gnorm": "8.188", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4278"} 2023-01-29 17:23:03 | INFO | train_inner | {"epoch": 9, "update": 8.601, "s2c_loss": "0.61", "loss": "0.42267", "s2c_nll_loss": "0.61", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "18590", "lr": "0.000123937", "gnorm": "7.811", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4280"} 2023-01-29 17:23:05 | INFO | train_inner | {"epoch": 9, "update": 8.605, "s2c_loss": "0.452", "loss": "0.31324", "s2c_nll_loss": "0.452", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "18600", "lr": "0.000124004", "gnorm": "6.249", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4283"} 2023-01-29 17:23:08 | INFO | train_inner | {"epoch": 9, "update": 8.61, "s2c_loss": "0.627", "loss": "0.43458", "s2c_nll_loss": "0.627", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "18610", "lr": "0.00012407", "gnorm": "8.418", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4286"} 2023-01-29 17:23:10 | INFO | train_inner | {"epoch": 9, "update": 8.615, "s2c_loss": 
"0.497", "loss": "0.34471", "s2c_nll_loss": "0.497", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "18620", "lr": "0.000124137", "gnorm": "8.127", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4288"} 2023-01-29 17:23:13 | INFO | train_inner | {"epoch": 9, "update": 8.619, "s2c_loss": "0.666", "loss": "0.46187", "s2c_nll_loss": "0.666", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "18630", "lr": "0.000124204", "gnorm": "8.432", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4291"} 2023-01-29 17:23:15 | INFO | train_inner | {"epoch": 9, "update": 8.624, "s2c_loss": "0.582", "loss": "0.40326", "s2c_nll_loss": "0.582", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "18640", "lr": "0.00012427", "gnorm": "9.046", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4293"} 2023-01-29 17:23:18 | INFO | train_inner | {"epoch": 9, "update": 8.629, "s2c_loss": "0.651", "loss": "0.45154", "s2c_nll_loss": "0.651", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "18650", "lr": "0.000124337", "gnorm": "9.178", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4296"} 2023-01-29 17:23:20 | INFO | train_inner | {"epoch": 9, "update": 8.633, "s2c_loss": "0.561", "loss": "0.38862", "s2c_nll_loss": "0.561", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "18660", "lr": "0.000124404", "gnorm": "7.852", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4298"} 2023-01-29 17:23:23 | INFO | train_inner | {"epoch": 9, "update": 8.638, 
"s2c_loss": "0.581", "loss": "0.40249", "s2c_nll_loss": "0.581", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "18670", "lr": "0.00012447", "gnorm": "7.944", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4301"} 2023-01-29 17:23:25 | INFO | train_inner | {"epoch": 9, "update": 8.642, "s2c_loss": "0.556", "loss": "0.38548", "s2c_nll_loss": "0.556", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "18680", "lr": "0.000124537", "gnorm": "8.449", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4303"} 2023-01-29 17:23:28 | INFO | train_inner | {"epoch": 9, "update": 8.647, "s2c_loss": "0.535", "loss": "0.37091", "s2c_nll_loss": "0.535", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "18690", "lr": "0.000124604", "gnorm": "7.361", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4306"} 2023-01-29 17:23:31 | INFO | train_inner | {"epoch": 9, "update": 8.652, "s2c_loss": "0.655", "loss": "0.45401", "s2c_nll_loss": "0.655", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "18700", "lr": "0.00012467", "gnorm": "8.753", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4308"} 2023-01-29 17:23:33 | INFO | train_inner | {"epoch": 9, "update": 8.656, "s2c_loss": "0.686", "loss": "0.47526", "s2c_nll_loss": "0.686", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "251.8", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "18710", "lr": "0.000124737", "gnorm": "7.808", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4311"} 2023-01-29 17:23:36 | INFO | train_inner | {"epoch": 9, "update": 
8.661, "s2c_loss": "0.648", "loss": "0.44907", "s2c_nll_loss": "0.648", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "18720", "lr": "0.000124804", "gnorm": "8.315", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "4314"} 2023-01-29 17:23:38 | INFO | train_inner | {"epoch": 9, "update": 8.666, "s2c_loss": "0.681", "loss": "0.47174", "s2c_nll_loss": "0.681", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "18730", "lr": "0.00012487", "gnorm": "8.062", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4316"} 2023-01-29 17:23:41 | INFO | train_inner | {"epoch": 9, "update": 8.67, "s2c_loss": "0.458", "loss": "0.31741", "s2c_nll_loss": "0.458", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "18740", "lr": "0.000124937", "gnorm": "8.295", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4319"} 2023-01-29 17:23:43 | INFO | train_inner | {"epoch": 9, "update": 8.675, "s2c_loss": "0.529", "loss": "0.36671", "s2c_nll_loss": "0.529", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "18750", "lr": "0.000125004", "gnorm": "7.692", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4321"} 2023-01-29 17:23:46 | INFO | train_inner | {"epoch": 9, "update": 8.679, "s2c_loss": "0.63", "loss": "0.43666", "s2c_nll_loss": "0.63", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "18760", "lr": "0.00012507", "gnorm": "7.069", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4324"} 2023-01-29 17:23:48 | INFO | train_inner | {"epoch": 9, "update": 
8.684, "s2c_loss": "0.565", "loss": "0.3915", "s2c_nll_loss": "0.565", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "18770", "lr": "0.000125137", "gnorm": "8.173", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4326"} 2023-01-29 17:23:51 | INFO | train_inner | {"epoch": 9, "update": 8.689, "s2c_loss": "0.676", "loss": "0.46843", "s2c_nll_loss": "0.676", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "18780", "lr": "0.000125204", "gnorm": "7.535", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4329"} 2023-01-29 17:23:53 | INFO | train_inner | {"epoch": 9, "update": 8.693, "s2c_loss": "0.651", "loss": "0.45128", "s2c_nll_loss": "0.651", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "18790", "lr": "0.00012527", "gnorm": "8.429", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4331"} 2023-01-29 17:23:56 | INFO | train_inner | {"epoch": 9, "update": 8.698, "s2c_loss": "0.517", "loss": "0.35834", "s2c_nll_loss": "0.517", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "18800", "lr": "0.000125337", "gnorm": "7.114", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4334"} 2023-01-29 17:23:59 | INFO | train_inner | {"epoch": 9, "update": 8.703, "s2c_loss": "0.67", "loss": "0.46421", "s2c_nll_loss": "0.67", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "18810", "lr": "0.000125404", "gnorm": "8.045", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "4336"} 2023-01-29 17:24:01 | INFO | train_inner | {"epoch": 9, 
"update": 8.707, "s2c_loss": "0.585", "loss": "0.40576", "s2c_nll_loss": "0.585", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "18820", "lr": "0.00012547", "gnorm": "7.268", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4339"} 2023-01-29 17:24:04 | INFO | train_inner | {"epoch": 9, "update": 8.712, "s2c_loss": "0.552", "loss": "0.38295", "s2c_nll_loss": "0.552", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "18830", "lr": "0.000125537", "gnorm": "8.158", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4342"} 2023-01-29 17:24:06 | INFO | train_inner | {"epoch": 9, "update": 8.716, "s2c_loss": "0.521", "loss": "0.36129", "s2c_nll_loss": "0.521", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "18840", "lr": "0.000125604", "gnorm": "7.932", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4344"} 2023-01-29 17:24:09 | INFO | train_inner | {"epoch": 9, "update": 8.721, "s2c_loss": "0.535", "loss": "0.37091", "s2c_nll_loss": "0.535", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "18850", "lr": "0.00012567", "gnorm": "7.134", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4347"} 2023-01-29 17:24:11 | INFO | train_inner | {"epoch": 9, "update": 8.726, "s2c_loss": "0.875", "loss": "0.60623", "s2c_nll_loss": "0.875", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "18860", "lr": "0.000125737", "gnorm": "8.004", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4349"} 2023-01-29 17:24:14 | INFO | train_inner | 
{"epoch": 9, "update": 8.73, "s2c_loss": "0.59", "loss": "0.40865", "s2c_nll_loss": "0.59", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "18870", "lr": "0.000125804", "gnorm": "8.184", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4352"} 2023-01-29 17:24:16 | INFO | train_inner | {"epoch": 9, "update": 8.735, "s2c_loss": "0.599", "loss": "0.41485", "s2c_nll_loss": "0.599", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "18880", "lr": "0.00012587", "gnorm": "8.504", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4354"} 2023-01-29 17:24:19 | INFO | train_inner | {"epoch": 9, "update": 8.74, "s2c_loss": "0.521", "loss": "0.36097", "s2c_nll_loss": "0.521", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "18890", "lr": "0.000125937", "gnorm": "6.656", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4357"} 2023-01-29 17:24:21 | INFO | train_inner | {"epoch": 9, "update": 8.744, "s2c_loss": "0.512", "loss": "0.3547", "s2c_nll_loss": "0.512", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "18900", "lr": "0.000126004", "gnorm": "7.143", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4359"} 2023-01-29 17:24:24 | INFO | train_inner | {"epoch": 9, "update": 8.749, "s2c_loss": "0.576", "loss": "0.39913", "s2c_nll_loss": "0.576", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "18910", "lr": "0.00012607", "gnorm": "8.095", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4362"} 2023-01-29 17:24:26 | INFO | train_inner | 
{"epoch": 9, "update": 8.753, "s2c_loss": "0.463", "loss": "0.32098", "s2c_nll_loss": "0.463", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "18920", "lr": "0.000126137", "gnorm": "6.855", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4364"} 2023-01-29 17:24:29 | INFO | train_inner | {"epoch": 9, "update": 8.758, "s2c_loss": "0.491", "loss": "0.34013", "s2c_nll_loss": "0.491", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "18930", "lr": "0.000126204", "gnorm": "7.315", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4367"} 2023-01-29 17:24:31 | INFO | train_inner | {"epoch": 9, "update": 8.763, "s2c_loss": "0.574", "loss": "0.39812", "s2c_nll_loss": "0.574", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "18940", "lr": "0.00012627", "gnorm": "7.568", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4369"} 2023-01-29 17:24:34 | INFO | train_inner | {"epoch": 9, "update": 8.767, "s2c_loss": "0.564", "loss": "0.39076", "s2c_nll_loss": "0.564", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "18950", "lr": "0.000126337", "gnorm": "7.911", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "4372"} 2023-01-29 17:24:36 | INFO | train_inner | {"epoch": 9, "update": 8.772, "s2c_loss": "0.772", "loss": "0.53486", "s2c_nll_loss": "0.772", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "18960", "lr": "0.000126404", "gnorm": "8.544", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4374"} 2023-01-29 17:24:39 | INFO | 
train_inner | {"epoch": 9, "update": 8.777, "s2c_loss": "0.479", "loss": "0.33235", "s2c_nll_loss": "0.479", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "18970", "lr": "0.00012647", "gnorm": "7.523", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4377"} 2023-01-29 17:24:42 | INFO | train_inner | {"epoch": 9, "update": 8.781, "s2c_loss": "0.472", "loss": "0.32691", "s2c_nll_loss": "0.472", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "18980", "lr": "0.000126537", "gnorm": "7.55", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4379"} 2023-01-29 17:24:44 | INFO | train_inner | {"epoch": 9, "update": 8.786, "s2c_loss": "0.572", "loss": "0.39623", "s2c_nll_loss": "0.572", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "18990", "lr": "0.000126604", "gnorm": "7.434", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4382"} 2023-01-29 17:24:47 | INFO | train_inner | {"epoch": 9, "update": 8.79, "s2c_loss": "0.732", "loss": "0.50751", "s2c_nll_loss": "0.732", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "19000", "lr": "0.00012667", "gnorm": "8.57", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4385"} 2023-01-29 17:24:49 | INFO | train_inner | {"epoch": 9, "update": 8.795, "s2c_loss": "0.724", "loss": "0.50174", "s2c_nll_loss": "0.724", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "19010", "lr": "0.000126737", "gnorm": "8.762", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4387"} 2023-01-29 17:24:52 | 
INFO | train_inner | {"epoch": 9, "update": 8.8, "s2c_loss": "0.589", "loss": "0.40824", "s2c_nll_loss": "0.589", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "19020", "lr": "0.000126804", "gnorm": "8.359", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4390"} 2023-01-29 17:24:54 | INFO | train_inner | {"epoch": 9, "update": 8.804, "s2c_loss": "0.652", "loss": "0.45209", "s2c_nll_loss": "0.652", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "19030", "lr": "0.00012687", "gnorm": "10.183", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4392"} 2023-01-29 17:24:57 | INFO | train_inner | {"epoch": 9, "update": 8.809, "s2c_loss": "0.586", "loss": "0.4062", "s2c_nll_loss": "0.586", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "19040", "lr": "0.000126937", "gnorm": "8.811", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4395"} 2023-01-29 17:24:59 | INFO | train_inner | {"epoch": 9, "update": 8.814, "s2c_loss": "0.561", "loss": "0.38852", "s2c_nll_loss": "0.561", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "19050", "lr": "0.000127004", "gnorm": "7.946", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4397"} 2023-01-29 17:25:02 | INFO | train_inner | {"epoch": 9, "update": 8.818, "s2c_loss": "0.535", "loss": "0.37098", "s2c_nll_loss": "0.535", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "19060", "lr": "0.00012707", "gnorm": "7.484", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4400"} 2023-01-29 17:25:04 | 
INFO | train_inner | {"epoch": 9, "update": 8.823, "s2c_loss": "0.844", "loss": "0.58489", "s2c_nll_loss": "0.844", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "19070", "lr": "0.000127137", "gnorm": "7.638", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4402"} 2023-01-29 17:25:07 | INFO | train_inner | {"epoch": 9, "update": 8.827, "s2c_loss": "0.677", "loss": "0.46935", "s2c_nll_loss": "0.677", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "19080", "lr": "0.000127204", "gnorm": "9.453", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4405"} 2023-01-29 17:25:09 | INFO | train_inner | {"epoch": 9, "update": 8.832, "s2c_loss": "0.673", "loss": "0.46619", "s2c_nll_loss": "0.673", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "19090", "lr": "0.00012727", "gnorm": "9.187", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4407"} 2023-01-29 17:25:12 | INFO | train_inner | {"epoch": 9, "update": 8.837, "s2c_loss": "0.515", "loss": "0.35726", "s2c_nll_loss": "0.515", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "19100", "lr": "0.000127337", "gnorm": "7.951", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4410"} 2023-01-29 17:25:15 | INFO | train_inner | {"epoch": 9, "update": 8.841, "s2c_loss": "0.929", "loss": "0.64411", "s2c_nll_loss": "0.929", "s2c_accuracy": "85.938", "s2c_total": "64", "s2c_n_correct": "55", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "19110", "lr": "0.000127404", "gnorm": "7.91", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "4412"} 2023-01-29 17:25:17 | 
INFO | train_inner | {"epoch": 9, "update": 8.846, "s2c_loss": "0.572", "loss": "0.39653", "s2c_nll_loss": "0.572", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "19120", "lr": "0.00012747", "gnorm": "7.362", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4415"} 2023-01-29 17:25:20 | INFO | train_inner | {"epoch": 9, "update": 8.851, "s2c_loss": "0.608", "loss": "0.4216", "s2c_nll_loss": "0.608", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "19130", "lr": "0.000127537", "gnorm": "8.164", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4418"} 2023-01-29 17:25:22 | INFO | train_inner | {"epoch": 9, "update": 8.855, "s2c_loss": "0.716", "loss": "0.49627", "s2c_nll_loss": "0.716", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "19140", "lr": "0.000127604", "gnorm": "8.148", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4420"} 2023-01-29 17:25:25 | INFO | train_inner | {"epoch": 9, "update": 8.86, "s2c_loss": "0.773", "loss": "0.53598", "s2c_nll_loss": "0.773", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "19150", "lr": "0.00012767", "gnorm": "8.96", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4423"} 2023-01-29 17:25:27 | INFO | train_inner | {"epoch": 9, "update": 8.864, "s2c_loss": "0.573", "loss": "0.39714", "s2c_nll_loss": "0.573", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "19160", "lr": "0.000127737", "gnorm": "8.461", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4425"} 2023-01-29 17:25:30 | 
INFO | train_inner | {"epoch": 9, "update": 8.869, "s2c_loss": "0.686", "loss": "0.47582", "s2c_nll_loss": "0.686", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "19170", "lr": "0.000127804", "gnorm": "7.532", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4428"} 2023-01-29 17:25:32 | INFO | train_inner | {"epoch": 9, "update": 8.874, "s2c_loss": "0.66", "loss": "0.45716", "s2c_nll_loss": "0.66", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "19180", "lr": "0.00012787", "gnorm": "9.513", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4430"} 2023-01-29 17:25:35 | INFO | train_inner | {"epoch": 9, "update": 8.878, "s2c_loss": "0.728", "loss": "0.50439", "s2c_nll_loss": "0.728", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "19190", "lr": "0.000127937", "gnorm": "9.333", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4433"} 2023-01-29 17:25:37 | INFO | train_inner | {"epoch": 9, "update": 8.883, "s2c_loss": "0.692", "loss": "0.47952", "s2c_nll_loss": "0.692", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "19200", "lr": "0.000128004", "gnorm": "8.13", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4435"} 2023-01-29 17:25:40 | INFO | train_inner | {"epoch": 9, "update": 8.888, "s2c_loss": "0.699", "loss": "0.4846", "s2c_nll_loss": "0.699", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "19210", "lr": "0.00012807", "gnorm": "7.373", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4438"} 2023-01-29 17:25:42 
| INFO | train_inner | {"epoch": 9, "update": 8.892, "s2c_loss": "0.64", "loss": "0.44362", "s2c_nll_loss": "0.64", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "19220", "lr": "0.000128137", "gnorm": "8.483", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "4440"} 2023-01-29 17:25:45 | INFO | train_inner | {"epoch": 9, "update": 8.897, "s2c_loss": "0.689", "loss": "0.47773", "s2c_nll_loss": "0.689", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "19230", "lr": "0.000128204", "gnorm": "9.09", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "4443"} 2023-01-29 17:25:47 | INFO | train_inner | {"epoch": 9, "update": 8.901, "s2c_loss": "0.733", "loss": "0.50796", "s2c_nll_loss": "0.733", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "19240", "lr": "0.00012827", "gnorm": "8.284", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4445"} 2023-01-29 17:25:50 | INFO | train_inner | {"epoch": 9, "update": 8.906, "s2c_loss": "0.675", "loss": "0.4679", "s2c_nll_loss": "0.675", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "19250", "lr": "0.000128337", "gnorm": "9.299", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4448"} 2023-01-29 17:25:52 | INFO | train_inner | {"epoch": 9, "update": 8.911, "s2c_loss": "0.683", "loss": "0.47356", "s2c_nll_loss": "0.683", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "260.3", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "19260", "lr": "0.000128404", "gnorm": "7.936", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4450"} 2023-01-29 17:25:55 
| INFO | train_inner | {"epoch": 9, "update": 8.915, "s2c_loss": "0.543", "loss": "0.37628", "s2c_nll_loss": "0.543", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "259.3", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "19270", "lr": "0.00012847", "gnorm": "7.166", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4453"} 2023-01-29 17:25:57 | INFO | train_inner | {"epoch": 9, "update": 8.92, "s2c_loss": "0.51", "loss": "0.35382", "s2c_nll_loss": "0.51", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "19280", "lr": "0.000128537", "gnorm": "7.534", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4455"} 2023-01-29 17:26:00 | INFO | train_inner | {"epoch": 9, "update": 8.925, "s2c_loss": "0.719", "loss": "0.4985", "s2c_nll_loss": "0.719", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "19290", "lr": "0.000128604", "gnorm": "7.503", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "4458"} 2023-01-29 17:26:02 | INFO | train_inner | {"epoch": 9, "update": 8.929, "s2c_loss": "0.614", "loss": "0.42559", "s2c_nll_loss": "0.614", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "19300", "lr": "0.00012867", "gnorm": "8.856", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4460"} 2023-01-29 17:26:05 | INFO | train_inner | {"epoch": 9, "update": 8.934, "s2c_loss": "0.599", "loss": "0.41522", "s2c_nll_loss": "0.599", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "19310", "lr": "0.000128737", "gnorm": "7.764", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4463"} 2023-01-29 17:26:08 
| INFO | train_inner | {"epoch": 9, "update": 8.938, "s2c_loss": "0.449", "loss": "0.31151", "s2c_nll_loss": "0.449", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "19320", "lr": "0.000128804", "gnorm": "7.206", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4465"} 2023-01-29 17:26:10 | INFO | train_inner | {"epoch": 9, "update": 8.943, "s2c_loss": "0.578", "loss": "0.40043", "s2c_nll_loss": "0.578", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "19330", "lr": "0.00012887", "gnorm": "7.406", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4468"} 2023-01-29 17:26:13 | INFO | train_inner | {"epoch": 9, "update": 8.948, "s2c_loss": "0.431", "loss": "0.29864", "s2c_nll_loss": "0.431", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "19340", "lr": "0.000128937", "gnorm": "7.571", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4470"} 2023-01-29 17:26:15 | INFO | train_inner | {"epoch": 9, "update": 8.952, "s2c_loss": "0.558", "loss": "0.38653", "s2c_nll_loss": "0.558", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "19350", "lr": "0.000129004", "gnorm": "8.224", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4473"} 2023-01-29 17:26:18 | INFO | train_inner | {"epoch": 9, "update": 8.957, "s2c_loss": "0.629", "loss": "0.43598", "s2c_nll_loss": "0.629", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "19360", "lr": "0.00012907", "gnorm": "7.99", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4476"} 2023-01-29 17:26:20 | 
INFO | train_inner | {"epoch": 9, "update": 8.962, "s2c_loss": "0.491", "loss": "0.34051", "s2c_nll_loss": "0.491", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "19370", "lr": "0.000129137", "gnorm": "7.363", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4478"} 2023-01-29 17:26:23 | INFO | train_inner | {"epoch": 9, "update": 8.966, "s2c_loss": "0.643", "loss": "0.44549", "s2c_nll_loss": "0.643", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "19380", "lr": "0.000129204", "gnorm": "7.851", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4481"} 2023-01-29 17:26:25 | INFO | train_inner | {"epoch": 9, "update": 8.971, "s2c_loss": "0.59", "loss": "0.40892", "s2c_nll_loss": "0.59", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "19390", "lr": "0.00012927", "gnorm": "8.16", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4483"} 2023-01-29 17:26:28 | INFO | train_inner | {"epoch": 9, "update": 8.975, "s2c_loss": "0.702", "loss": "0.48624", "s2c_nll_loss": "0.702", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "19400", "lr": "0.000129337", "gnorm": "8.803", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4486"} 2023-01-29 17:26:30 | INFO | train_inner | {"epoch": 9, "update": 8.98, "s2c_loss": "0.514", "loss": "0.35625", "s2c_nll_loss": "0.514", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "19410", "lr": "0.000129404", "gnorm": "6.787", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4488"} 2023-01-29 
17:26:33 | INFO | train_inner | {"epoch": 9, "update": 8.985, "s2c_loss": "0.523", "loss": "0.36279", "s2c_nll_loss": "0.523", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "19420", "lr": "0.00012947", "gnorm": "6.985", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4491"} 2023-01-29 17:26:35 | INFO | train_inner | {"epoch": 9, "update": 8.989, "s2c_loss": "0.515", "loss": "0.35706", "s2c_nll_loss": "0.515", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "19430", "lr": "0.000129537", "gnorm": "6.348", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4493"} 2023-01-29 17:26:38 | INFO | train_inner | {"epoch": 9, "update": 8.994, "s2c_loss": "0.526", "loss": "0.36428", "s2c_nll_loss": "0.526", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "19440", "lr": "0.000129604", "gnorm": "6.994", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4496"} 2023-01-29 17:26:41 | INFO | train_inner | {"epoch": 9, "update": 8.999, "s2c_loss": "0.54", "loss": "0.37407", "s2c_nll_loss": "0.54", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "245.6", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "19450", "lr": "0.00012967", "gnorm": "8.388", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4499"} 2023-01-29 17:26:41 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 9 @ 19453 updates 2023-01-29 17:26:41 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 17:26:48 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 
17:26:48 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 9 @ 19453 updates, score None) (writing took 7.05967228487134 seconds) 2023-01-29 17:26:48 | INFO | fairseq_cli.train | end of epoch 9 (average epoch stats below) 2023-01-29 17:26:48 | INFO | train | {"epoch": 9, "train_s2c_loss": "0.591", "train_loss": "0.40954", "train_s2c_nll_loss": "0.591", "train_s2c_accuracy": "89.951", "train_s2c_total": "63.9838", "train_s2c_n_correct": "57.5539", "train_wps": "246", "train_ups": "3.84", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "19453", "train_lr": "0.00012969", "train_gnorm": "8.124", "train_loss_scale": "512", "train_train_wall": "541", "train_gb_free": "7.5", "train_wall": "4506"} 2023-01-29 17:26:55 | INFO | fairseq.trainer | begin training epoch 10 2023-01-29 17:26:55 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 17:26:57 | INFO | train_inner | {"epoch": 10, "update": 9.003, "s2c_loss": "0.644", "loss": "0.44671", "s2c_nll_loss": "0.644", "s2c_accuracy": "87.5", "s2c_total": "60.8", "s2c_n_correct": "53.2", "wps": "38.2", "ups": "0.63", "wpb": "60.8", "bsz": "60.8", "num_updates": "19460", "lr": "0.000129737", "gnorm": "8.665", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4514"} 2023-01-29 17:26:59 | INFO | train_inner | {"epoch": 10, "update": 9.008, "s2c_loss": "0.552", "loss": "0.38288", "s2c_nll_loss": "0.552", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "19470", "lr": "0.000129804", "gnorm": "7.499", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4517"} 2023-01-29 17:27:02 | INFO | train_inner | {"epoch": 10, "update": 9.012, "s2c_loss": "0.524", "loss": "0.36294", "s2c_nll_loss": "0.524", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "255.4", "ups": "3.99", "wpb": "64", 
"bsz": "64", "num_updates": "19480", "lr": "0.00012987", "gnorm": "8.816", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4519"} 2023-01-29 17:27:04 | INFO | train_inner | {"epoch": 10, "update": 9.017, "s2c_loss": "0.609", "loss": "0.42216", "s2c_nll_loss": "0.609", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "19490", "lr": "0.000129937", "gnorm": "9.542", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4522"} 2023-01-29 17:27:07 | INFO | train_inner | {"epoch": 10, "update": 9.022, "s2c_loss": "0.554", "loss": "0.3841", "s2c_nll_loss": "0.554", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "19500", "lr": "0.000130003", "gnorm": "8.15", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4524"} 2023-01-29 17:27:09 | INFO | train_inner | {"epoch": 10, "update": 9.026, "s2c_loss": "0.498", "loss": "0.34493", "s2c_nll_loss": "0.498", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "19510", "lr": "0.00013007", "gnorm": "6.6", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4527"} 2023-01-29 17:27:12 | INFO | train_inner | {"epoch": 10, "update": 9.031, "s2c_loss": "0.452", "loss": "0.31326", "s2c_nll_loss": "0.452", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "19520", "lr": "0.000130137", "gnorm": "7.169", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4530"} 2023-01-29 17:27:14 | INFO | train_inner | {"epoch": 10, "update": 9.036, "s2c_loss": "0.439", "loss": "0.3044", "s2c_nll_loss": "0.439", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "251.2", "ups": "3.92", "wpb": 
"64", "bsz": "64", "num_updates": "19530", "lr": "0.000130203", "gnorm": "7.322", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4532"} 2023-01-29 17:27:17 | INFO | train_inner | {"epoch": 10, "update": 9.04, "s2c_loss": "0.45", "loss": "0.31217", "s2c_nll_loss": "0.45", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "19540", "lr": "0.00013027", "gnorm": "7.425", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "4535"} 2023-01-29 17:27:19 | INFO | train_inner | {"epoch": 10, "update": 9.045, "s2c_loss": "0.597", "loss": "0.41369", "s2c_nll_loss": "0.597", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "19550", "lr": "0.000130337", "gnorm": "7.765", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4537"} 2023-01-29 17:27:22 | INFO | train_inner | {"epoch": 10, "update": 9.049, "s2c_loss": "0.528", "loss": "0.366", "s2c_nll_loss": "0.528", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "260.3", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "19560", "lr": "0.000130403", "gnorm": "7.605", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4540"} 2023-01-29 17:27:24 | INFO | train_inner | {"epoch": 10, "update": 9.054, "s2c_loss": "0.437", "loss": "0.3027", "s2c_nll_loss": "0.437", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "19570", "lr": "0.00013047", "gnorm": "6.612", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4542"} 2023-01-29 17:27:27 | INFO | train_inner | {"epoch": 10, "update": 9.059, "s2c_loss": "0.464", "loss": "0.32192", "s2c_nll_loss": "0.464", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "250", "ups": "3.91", 
"wpb": "64", "bsz": "64", "num_updates": "19580", "lr": "0.000130537", "gnorm": "7.552", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "4545"} 2023-01-29 17:27:29 | INFO | train_inner | {"epoch": 10, "update": 9.063, "s2c_loss": "0.531", "loss": "0.36773", "s2c_nll_loss": "0.531", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "19590", "lr": "0.000130603", "gnorm": "8.139", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4547"} 2023-01-29 17:27:32 | INFO | train_inner | {"epoch": 10, "update": 9.068, "s2c_loss": "0.459", "loss": "0.31817", "s2c_nll_loss": "0.459", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "19600", "lr": "0.00013067", "gnorm": "7.528", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4550"} 2023-01-29 17:27:35 | INFO | train_inner | {"epoch": 10, "update": 9.073, "s2c_loss": "0.601", "loss": "0.41656", "s2c_nll_loss": "0.601", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "244.9", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "19610", "lr": "0.000130737", "gnorm": "8.292", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4553"} 2023-01-29 17:27:37 | INFO | train_inner | {"epoch": 10, "update": 9.077, "s2c_loss": "0.381", "loss": "0.26387", "s2c_nll_loss": "0.381", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "19620", "lr": "0.000130803", "gnorm": "6.718", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4555"} 2023-01-29 17:27:40 | INFO | train_inner | {"epoch": 10, "update": 9.082, "s2c_loss": "0.479", "loss": "0.3318", "s2c_nll_loss": "0.479", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "251.5", 
"ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "19630", "lr": "0.00013087", "gnorm": "7.902", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4558"} 2023-01-29 17:27:42 | INFO | train_inner | {"epoch": 10, "update": 9.086, "s2c_loss": "0.411", "loss": "0.28483", "s2c_nll_loss": "0.411", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "19640", "lr": "0.000130937", "gnorm": "7.304", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4560"} 2023-01-29 17:27:45 | INFO | train_inner | {"epoch": 10, "update": 9.091, "s2c_loss": "0.553", "loss": "0.38343", "s2c_nll_loss": "0.553", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "19650", "lr": "0.000131003", "gnorm": "9.068", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4563"} 2023-01-29 17:27:47 | INFO | train_inner | {"epoch": 10, "update": 9.096, "s2c_loss": "0.721", "loss": "0.49957", "s2c_nll_loss": "0.721", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "19660", "lr": "0.00013107", "gnorm": "9.429", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4565"} 2023-01-29 17:27:50 | INFO | train_inner | {"epoch": 10, "update": 9.1, "s2c_loss": "0.589", "loss": "0.40795", "s2c_nll_loss": "0.589", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "19670", "lr": "0.000131137", "gnorm": "8.966", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4568"} 2023-01-29 17:27:53 | INFO | train_inner | {"epoch": 10, "update": 9.105, "s2c_loss": "0.444", "loss": "0.30747", "s2c_nll_loss": "0.444", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "251.8", 
"ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "19680", "lr": "0.000131203", "gnorm": "8.12", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4570"} 2023-01-29 17:27:55 | INFO | train_inner | {"epoch": 10, "update": 9.11, "s2c_loss": "0.537", "loss": "0.3722", "s2c_nll_loss": "0.537", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "19690", "lr": "0.00013127", "gnorm": "7.35", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4573"} 2023-01-29 17:27:58 | INFO | train_inner | {"epoch": 10, "update": 9.114, "s2c_loss": "0.525", "loss": "0.36393", "s2c_nll_loss": "0.525", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "258.6", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "19700", "lr": "0.000131337", "gnorm": "8.019", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4575"} 2023-01-29 17:28:00 | INFO | train_inner | {"epoch": 10, "update": 9.119, "s2c_loss": "0.687", "loss": "0.4762", "s2c_nll_loss": "0.687", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "19710", "lr": "0.000131403", "gnorm": "9.713", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4578"} 2023-01-29 17:28:03 | INFO | train_inner | {"epoch": 10, "update": 9.123, "s2c_loss": "0.558", "loss": "0.38663", "s2c_nll_loss": "0.558", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "19720", "lr": "0.00013147", "gnorm": "7.179", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "4581"} 2023-01-29 17:28:05 | INFO | train_inner | {"epoch": 10, "update": 9.128, "s2c_loss": "0.666", "loss": "0.46131", "s2c_nll_loss": "0.666", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": 
"252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "19730", "lr": "0.000131537", "gnorm": "6.383", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4583"} 2023-01-29 17:28:08 | INFO | train_inner | {"epoch": 10, "update": 9.133, "s2c_loss": "0.447", "loss": "0.30985", "s2c_nll_loss": "0.447", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "19740", "lr": "0.000131603", "gnorm": "7.166", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4586"} 2023-01-29 17:28:10 | INFO | train_inner | {"epoch": 10, "update": 9.137, "s2c_loss": "0.471", "loss": "0.32641", "s2c_nll_loss": "0.471", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "19750", "lr": "0.00013167", "gnorm": "7.474", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4588"} 2023-01-29 17:28:13 | INFO | train_inner | {"epoch": 10, "update": 9.142, "s2c_loss": "0.513", "loss": "0.35528", "s2c_nll_loss": "0.513", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "19760", "lr": "0.000131737", "gnorm": "8.786", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4591"} 2023-01-29 17:28:15 | INFO | train_inner | {"epoch": 10, "update": 9.147, "s2c_loss": "0.444", "loss": "0.30781", "s2c_nll_loss": "0.444", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "19770", "lr": "0.000131803", "gnorm": "7.375", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4593"} 2023-01-29 17:28:18 | INFO | train_inner | {"epoch": 10, "update": 9.151, "s2c_loss": "0.442", "loss": "0.30666", "s2c_nll_loss": "0.442", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": 
"59.2", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "19780", "lr": "0.00013187", "gnorm": "6.401", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4596"} 2023-01-29 17:28:20 | INFO | train_inner | {"epoch": 10, "update": 9.156, "s2c_loss": "0.497", "loss": "0.34442", "s2c_nll_loss": "0.497", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "19790", "lr": "0.000131937", "gnorm": "7.79", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4598"} 2023-01-29 17:28:23 | INFO | train_inner | {"epoch": 10, "update": 9.16, "s2c_loss": "0.436", "loss": "0.30245", "s2c_nll_loss": "0.436", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "19800", "lr": "0.000132003", "gnorm": "6.632", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4601"} 2023-01-29 17:28:26 | INFO | train_inner | {"epoch": 10, "update": 9.165, "s2c_loss": "0.443", "loss": "0.30679", "s2c_nll_loss": "0.443", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "19810", "lr": "0.00013207", "gnorm": "6.768", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4603"} 2023-01-29 17:28:28 | INFO | train_inner | {"epoch": 10, "update": 9.17, "s2c_loss": "0.587", "loss": "0.40668", "s2c_nll_loss": "0.587", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "19820", "lr": "0.000132137", "gnorm": "7.806", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4606"} 2023-01-29 17:28:31 | INFO | train_inner | {"epoch": 10, "update": 9.174, "s2c_loss": "0.498", "loss": "0.34546", "s2c_nll_loss": "0.498", "s2c_accuracy": "92.344", "s2c_total": "64", 
"s2c_n_correct": "59.1", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "19830", "lr": "0.000132203", "gnorm": "6.93", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4609"} 2023-01-29 17:28:33 | INFO | train_inner | {"epoch": 10, "update": 9.179, "s2c_loss": "0.446", "loss": "0.30928", "s2c_nll_loss": "0.446", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "19840", "lr": "0.00013227", "gnorm": "6.804", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4611"} 2023-01-29 17:28:36 | INFO | train_inner | {"epoch": 10, "update": 9.184, "s2c_loss": "0.553", "loss": "0.3833", "s2c_nll_loss": "0.553", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "19850", "lr": "0.000132337", "gnorm": "8.145", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4614"} 2023-01-29 17:28:38 | INFO | train_inner | {"epoch": 10, "update": 9.188, "s2c_loss": "0.597", "loss": "0.41414", "s2c_nll_loss": "0.597", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "19860", "lr": "0.000132403", "gnorm": "8.395", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4616"} 2023-01-29 17:28:41 | INFO | train_inner | {"epoch": 10, "update": 9.193, "s2c_loss": "0.57", "loss": "0.39506", "s2c_nll_loss": "0.57", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "19870", "lr": "0.00013247", "gnorm": "7.721", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4619"} 2023-01-29 17:28:43 | INFO | train_inner | {"epoch": 10, "update": 9.198, "s2c_loss": "0.576", "loss": "0.39941", "s2c_nll_loss": "0.576", "s2c_accuracy": "91.719", "s2c_total": 
"64", "s2c_n_correct": "58.7", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "19880", "lr": "0.000132537", "gnorm": "7.168", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4621"} 2023-01-29 17:28:46 | INFO | train_inner | {"epoch": 10, "update": 9.202, "s2c_loss": "0.494", "loss": "0.34265", "s2c_nll_loss": "0.494", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "19890", "lr": "0.000132603", "gnorm": "7.861", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4624"} 2023-01-29 17:28:48 | INFO | train_inner | {"epoch": 10, "update": 9.207, "s2c_loss": "0.525", "loss": "0.36356", "s2c_nll_loss": "0.525", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "19900", "lr": "0.00013267", "gnorm": "7.595", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4626"} 2023-01-29 17:28:51 | INFO | train_inner | {"epoch": 10, "update": 9.211, "s2c_loss": "0.431", "loss": "0.29899", "s2c_nll_loss": "0.431", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "19910", "lr": "0.000132737", "gnorm": "6.839", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4629"} 2023-01-29 17:28:53 | INFO | train_inner | {"epoch": 10, "update": 9.216, "s2c_loss": "0.54", "loss": "0.37409", "s2c_nll_loss": "0.54", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "19920", "lr": "0.000132803", "gnorm": "8.099", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "4631"} 2023-01-29 17:28:56 | INFO | train_inner | {"epoch": 10, "update": 9.221, "s2c_loss": "0.645", "loss": "0.44677", "s2c_nll_loss": "0.645", "s2c_accuracy": "90.938", 
"s2c_total": "64", "s2c_n_correct": "58.2", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "19930", "lr": "0.00013287", "gnorm": "7.194", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4634"} 2023-01-29 17:28:58 | INFO | train_inner | {"epoch": 10, "update": 9.225, "s2c_loss": "0.608", "loss": "0.42176", "s2c_nll_loss": "0.608", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "259.2", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "19940", "lr": "0.000132937", "gnorm": "7.996", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4636"} 2023-01-29 17:29:01 | INFO | train_inner | {"epoch": 10, "update": 9.23, "s2c_loss": "0.467", "loss": "0.32367", "s2c_nll_loss": "0.467", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "19950", "lr": "0.000133003", "gnorm": "6.923", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4639"} 2023-01-29 17:29:04 | INFO | train_inner | {"epoch": 10, "update": 9.235, "s2c_loss": "0.574", "loss": "0.39761", "s2c_nll_loss": "0.574", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "19960", "lr": "0.00013307", "gnorm": "7.848", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4641"} 2023-01-29 17:29:06 | INFO | train_inner | {"epoch": 10, "update": 9.239, "s2c_loss": "0.55", "loss": "0.38119", "s2c_nll_loss": "0.55", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "19970", "lr": "0.000133137", "gnorm": "8.194", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4644"} 2023-01-29 17:29:09 | INFO | train_inner | {"epoch": 10, "update": 9.244, "s2c_loss": "0.555", "loss": "0.38437", "s2c_nll_loss": "0.555", "s2c_accuracy": 
"88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "19980", "lr": "0.000133203", "gnorm": "7.433", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4647"} 2023-01-29 17:29:11 | INFO | train_inner | {"epoch": 10, "update": 9.248, "s2c_loss": "0.627", "loss": "0.43441", "s2c_nll_loss": "0.627", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "19990", "lr": "0.00013327", "gnorm": "9.069", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4649"} 2023-01-29 17:29:14 | INFO | train_inner | {"epoch": 10, "update": 9.253, "s2c_loss": "0.55", "loss": "0.38098", "s2c_nll_loss": "0.55", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "246", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "20000", "lr": "0.000133337", "gnorm": "8.151", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4652"} 2023-01-29 17:29:14 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 17:29:28 | INFO | valid | {"epoch": 10, "valid_s2c_loss": "1.169", "valid_loss": "0.81009", "valid_s2c_nll_loss": "1.169", "valid_s2c_accuracy": "79.473", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "25.3981", "valid_num_updates": "20000"} 2023-01-29 17:29:28 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 10 @ 20000 updates 2023-01-29 17:29:28 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_10_20000.pt 2023-01-29 17:29:31 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_10_20000.pt 2023-01-29 17:29:37 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_10_20000.pt (epoch 10 @ 20000 updates, score 79.473) (writing took 
9.000956175848842 seconds) 2023-01-29 17:29:40 | INFO | train_inner | {"epoch": 10, "update": 9.258, "s2c_loss": "0.601", "loss": "0.41649", "s2c_nll_loss": "0.601", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "24.7", "ups": "0.39", "wpb": "64", "bsz": "64", "num_updates": "20010", "lr": "0.000133403", "gnorm": "8.904", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4678"} 2023-01-29 17:29:42 | INFO | train_inner | {"epoch": 10, "update": 9.262, "s2c_loss": "0.432", "loss": "0.29928", "s2c_nll_loss": "0.432", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "20020", "lr": "0.00013347", "gnorm": "7.466", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4680"} 2023-01-29 17:29:45 | INFO | train_inner | {"epoch": 10, "update": 9.267, "s2c_loss": "0.579", "loss": "0.40134", "s2c_nll_loss": "0.579", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "20030", "lr": "0.000133537", "gnorm": "8.113", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4683"} 2023-01-29 17:29:47 | INFO | train_inner | {"epoch": 10, "update": 9.272, "s2c_loss": "0.655", "loss": "0.45414", "s2c_nll_loss": "0.655", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "20040", "lr": "0.000133603", "gnorm": "7.994", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4685"} 2023-01-29 17:29:50 | INFO | train_inner | {"epoch": 10, "update": 9.276, "s2c_loss": "0.472", "loss": "0.327", "s2c_nll_loss": "0.472", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "20050", "lr": "0.00013367", "gnorm": "7.434", "loss_scale": "512", "train_wall": "2", 
"gb_free": "7.2", "wall": "4688"} 2023-01-29 17:29:52 | INFO | train_inner | {"epoch": 10, "update": 9.281, "s2c_loss": "0.549", "loss": "0.38066", "s2c_nll_loss": "0.549", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "20060", "lr": "0.000133737", "gnorm": "8.65", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4690"} 2023-01-29 17:29:55 | INFO | train_inner | {"epoch": 10, "update": 9.285, "s2c_loss": "0.465", "loss": "0.32212", "s2c_nll_loss": "0.465", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "20070", "lr": "0.000133803", "gnorm": "7.256", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "4693"} 2023-01-29 17:29:57 | INFO | train_inner | {"epoch": 10, "update": 9.29, "s2c_loss": "0.6", "loss": "0.41596", "s2c_nll_loss": "0.6", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "20080", "lr": "0.00013387", "gnorm": "7.813", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4695"} 2023-01-29 17:30:00 | INFO | train_inner | {"epoch": 10, "update": 9.295, "s2c_loss": "0.567", "loss": "0.39329", "s2c_nll_loss": "0.567", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "20090", "lr": "0.000133937", "gnorm": "8.119", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4698"} 2023-01-29 17:30:02 | INFO | train_inner | {"epoch": 10, "update": 9.299, "s2c_loss": "0.559", "loss": "0.38738", "s2c_nll_loss": "0.559", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "20100", "lr": "0.000134003", "gnorm": "9.132", "loss_scale": "512", 
"train_wall": "2", "gb_free": "7.3", "wall": "4700"} 2023-01-29 17:30:05 | INFO | train_inner | {"epoch": 10, "update": 9.304, "s2c_loss": "0.551", "loss": "0.38211", "s2c_nll_loss": "0.551", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "20110", "lr": "0.00013407", "gnorm": "7.787", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "4703"} 2023-01-29 17:30:08 | INFO | train_inner | {"epoch": 10, "update": 9.309, "s2c_loss": "0.619", "loss": "0.42903", "s2c_nll_loss": "0.619", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "20120", "lr": "0.000134137", "gnorm": "7.536", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4705"} 2023-01-29 17:30:10 | INFO | train_inner | {"epoch": 10, "update": 9.313, "s2c_loss": "0.44", "loss": "0.30512", "s2c_nll_loss": "0.44", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "20130", "lr": "0.000134203", "gnorm": "7.433", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "4708"} 2023-01-29 17:30:13 | INFO | train_inner | {"epoch": 10, "update": 9.318, "s2c_loss": "0.757", "loss": "0.52467", "s2c_nll_loss": "0.757", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "20140", "lr": "0.00013427", "gnorm": "8.287", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "4711"} 2023-01-29 17:30:15 | INFO | train_inner | {"epoch": 10, "update": 9.322, "s2c_loss": "0.604", "loss": "0.41884", "s2c_nll_loss": "0.604", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "20150", "lr": "0.000134337", "gnorm": "7.897", "loss_scale": 
"512", "train_wall": "3", "gb_free": "7.2", "wall": "4713"} 2023-01-29 17:30:18 | INFO | train_inner | {"epoch": 10, "update": 9.327, "s2c_loss": "0.504", "loss": "0.34917", "s2c_nll_loss": "0.504", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "245.4", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "20160", "lr": "0.000134403", "gnorm": "7.756", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4716"} 2023-01-29 17:30:20 | INFO | train_inner | {"epoch": 10, "update": 9.332, "s2c_loss": "0.51", "loss": "0.35319", "s2c_nll_loss": "0.51", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "20170", "lr": "0.00013447", "gnorm": "7.357", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "4718"} 2023-01-29 17:30:23 | INFO | train_inner | {"epoch": 10, "update": 9.336, "s2c_loss": "0.436", "loss": "0.30195", "s2c_nll_loss": "0.436", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "248", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "20180", "lr": "0.000134537", "gnorm": "6.877", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4721"} 2023-01-29 17:30:25 | INFO | train_inner | {"epoch": 10, "update": 9.341, "s2c_loss": "0.52", "loss": "0.36038", "s2c_nll_loss": "0.52", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "20190", "lr": "0.000134603", "gnorm": "7.098", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4723"} 2023-01-29 17:30:28 | INFO | train_inner | {"epoch": 10, "update": 9.346, "s2c_loss": "0.781", "loss": "0.54133", "s2c_nll_loss": "0.781", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "20200", "lr": "0.00013467", "gnorm": "8.207", 
"loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4726"} 2023-01-29 17:30:31 | INFO | train_inner | {"epoch": 10, "update": 9.35, "s2c_loss": "0.582", "loss": "0.40323", "s2c_nll_loss": "0.582", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "20210", "lr": "0.000134737", "gnorm": "7.968", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4728"} 2023-01-29 17:30:33 | INFO | train_inner | {"epoch": 10, "update": 9.355, "s2c_loss": "0.649", "loss": "0.44986", "s2c_nll_loss": "0.649", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "20220", "lr": "0.000134803", "gnorm": "8.266", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4731"} 2023-01-29 17:30:36 | INFO | train_inner | {"epoch": 10, "update": 9.359, "s2c_loss": "0.553", "loss": "0.38321", "s2c_nll_loss": "0.553", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "20230", "lr": "0.00013487", "gnorm": "6.865", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4734"} 2023-01-29 17:30:38 | INFO | train_inner | {"epoch": 10, "update": 9.364, "s2c_loss": "0.551", "loss": "0.38189", "s2c_nll_loss": "0.551", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "20240", "lr": "0.000134937", "gnorm": "7.162", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4736"} 2023-01-29 17:30:41 | INFO | train_inner | {"epoch": 10, "update": 9.369, "s2c_loss": "0.494", "loss": "0.34239", "s2c_nll_loss": "0.494", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "244.9", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "20250", "lr": "0.000135003", 
"gnorm": "7.883", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4739"} 2023-01-29 17:30:43 | INFO | train_inner | {"epoch": 10, "update": 9.373, "s2c_loss": "0.507", "loss": "0.35117", "s2c_nll_loss": "0.507", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "20260", "lr": "0.00013507", "gnorm": "7.375", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4741"} 2023-01-29 17:30:46 | INFO | train_inner | {"epoch": 10, "update": 9.378, "s2c_loss": "0.668", "loss": "0.46295", "s2c_nll_loss": "0.668", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "20270", "lr": "0.000135137", "gnorm": "7.052", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4744"} 2023-01-29 17:30:48 | INFO | train_inner | {"epoch": 10, "update": 9.383, "s2c_loss": "0.488", "loss": "0.3386", "s2c_nll_loss": "0.488", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "20280", "lr": "0.000135203", "gnorm": "7.803", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4746"} 2023-01-29 17:30:51 | INFO | train_inner | {"epoch": 10, "update": 9.387, "s2c_loss": "0.553", "loss": "0.38307", "s2c_nll_loss": "0.553", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "20290", "lr": "0.00013527", "gnorm": "8.544", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4749"} 2023-01-29 17:30:54 | INFO | train_inner | {"epoch": 10, "update": 9.392, "s2c_loss": "0.586", "loss": "0.4064", "s2c_nll_loss": "0.586", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "20300", "lr": 
"0.000135337", "gnorm": "9.043", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4751"} 2023-01-29 17:30:56 | INFO | train_inner | {"epoch": 10, "update": 9.396, "s2c_loss": "0.599", "loss": "0.41511", "s2c_nll_loss": "0.599", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "20310", "lr": "0.000135403", "gnorm": "7.811", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4754"} 2023-01-29 17:30:59 | INFO | train_inner | {"epoch": 10, "update": 9.401, "s2c_loss": "0.491", "loss": "0.34004", "s2c_nll_loss": "0.491", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "20320", "lr": "0.00013547", "gnorm": "7.24", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4757"} 2023-01-29 17:31:01 | INFO | train_inner | {"epoch": 10, "update": 9.406, "s2c_loss": "0.677", "loss": "0.46919", "s2c_nll_loss": "0.677", "s2c_accuracy": "86.562", "s2c_total": "64", "s2c_n_correct": "55.4", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "20330", "lr": "0.000135537", "gnorm": "8.924", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4759"} 2023-01-29 17:31:04 | INFO | train_inner | {"epoch": 10, "update": 9.41, "s2c_loss": "0.587", "loss": "0.40702", "s2c_nll_loss": "0.587", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "20340", "lr": "0.000135603", "gnorm": "8.093", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4762"} 2023-01-29 17:31:06 | INFO | train_inner | {"epoch": 10, "update": 9.415, "s2c_loss": "0.524", "loss": "0.36311", "s2c_nll_loss": "0.524", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": 
"20350", "lr": "0.00013567", "gnorm": "9.63", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "4764"} 2023-01-29 17:31:09 | INFO | train_inner | {"epoch": 10, "update": 9.42, "s2c_loss": "0.667", "loss": "0.46202", "s2c_nll_loss": "0.667", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "20360", "lr": "0.000135737", "gnorm": "9.378", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4767"} 2023-01-29 17:31:11 | INFO | train_inner | {"epoch": 10, "update": 9.424, "s2c_loss": "0.84", "loss": "0.58239", "s2c_nll_loss": "0.84", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "20370", "lr": "0.000135803", "gnorm": "9.168", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "4769"} 2023-01-29 17:31:14 | INFO | train_inner | {"epoch": 10, "update": 9.429, "s2c_loss": "0.726", "loss": "0.50347", "s2c_nll_loss": "0.726", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "20380", "lr": "0.00013587", "gnorm": "8.961", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4772"} 2023-01-29 17:31:16 | INFO | train_inner | {"epoch": 10, "update": 9.433, "s2c_loss": "0.632", "loss": "0.43816", "s2c_nll_loss": "0.632", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "20390", "lr": "0.000135937", "gnorm": "7.869", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4774"} 2023-01-29 17:31:19 | INFO | train_inner | {"epoch": 10, "update": 9.438, "s2c_loss": "0.559", "loss": "0.38729", "s2c_nll_loss": "0.559", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", 
"num_updates": "20400", "lr": "0.000136003", "gnorm": "7.729", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4777"} 2023-01-29 17:31:21 | INFO | train_inner | {"epoch": 10, "update": 9.443, "s2c_loss": "0.46", "loss": "0.31871", "s2c_nll_loss": "0.46", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "20410", "lr": "0.00013607", "gnorm": "8.175", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "4779"} 2023-01-29 17:31:24 | INFO | train_inner | {"epoch": 10, "update": 9.447, "s2c_loss": "0.55", "loss": "0.38155", "s2c_nll_loss": "0.55", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "20420", "lr": "0.000136137", "gnorm": "7.648", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4782"} 2023-01-29 17:31:26 | INFO | train_inner | {"epoch": 10, "update": 9.452, "s2c_loss": "0.681", "loss": "0.47231", "s2c_nll_loss": "0.681", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "20430", "lr": "0.000136203", "gnorm": "8.537", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4784"} 2023-01-29 17:31:29 | INFO | train_inner | {"epoch": 10, "update": 9.457, "s2c_loss": "0.564", "loss": "0.39103", "s2c_nll_loss": "0.564", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "20440", "lr": "0.00013627", "gnorm": "9.506", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4787"} 2023-01-29 17:31:31 | INFO | train_inner | {"epoch": 10, "update": 9.461, "s2c_loss": "0.534", "loss": "0.37017", "s2c_nll_loss": "0.534", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "253.9", "ups": "3.97", "wpb": "64", 
"bsz": "64", "num_updates": "20450", "lr": "0.000136337", "gnorm": "7.799", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4789"} 2023-01-29 17:31:34 | INFO | train_inner | {"epoch": 10, "update": 9.466, "s2c_loss": "0.551", "loss": "0.38222", "s2c_nll_loss": "0.551", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "20460", "lr": "0.000136403", "gnorm": "6.999", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4792"} 2023-01-29 17:31:37 | INFO | train_inner | {"epoch": 10, "update": 9.47, "s2c_loss": "0.446", "loss": "0.30916", "s2c_nll_loss": "0.446", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "20470", "lr": "0.00013647", "gnorm": "6.178", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4794"} 2023-01-29 17:31:39 | INFO | train_inner | {"epoch": 10, "update": 9.475, "s2c_loss": "0.723", "loss": "0.50106", "s2c_nll_loss": "0.723", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "20480", "lr": "0.000136537", "gnorm": "8.384", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4797"} 2023-01-29 17:31:42 | INFO | train_inner | {"epoch": 10, "update": 9.48, "s2c_loss": "0.545", "loss": "0.37743", "s2c_nll_loss": "0.545", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "20490", "lr": "0.000136603", "gnorm": "7.915", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4799"} 2023-01-29 17:31:44 | INFO | train_inner | {"epoch": 10, "update": 9.484, "s2c_loss": "0.591", "loss": "0.40954", "s2c_nll_loss": "0.591", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "256.7", "ups": 
"4.01", "wpb": "64", "bsz": "64", "num_updates": "20500", "lr": "0.00013667", "gnorm": "8.025", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4802"} 2023-01-29 17:31:47 | INFO | train_inner | {"epoch": 10, "update": 9.489, "s2c_loss": "0.732", "loss": "0.50753", "s2c_nll_loss": "0.732", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "20510", "lr": "0.000136736", "gnorm": "7.832", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4805"} 2023-01-29 17:31:49 | INFO | train_inner | {"epoch": 10, "update": 9.494, "s2c_loss": "0.549", "loss": "0.38063", "s2c_nll_loss": "0.549", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "20520", "lr": "0.000136803", "gnorm": "7.89", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4807"} 2023-01-29 17:31:52 | INFO | train_inner | {"epoch": 10, "update": 9.498, "s2c_loss": "0.585", "loss": "0.40553", "s2c_nll_loss": "0.585", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "20530", "lr": "0.00013687", "gnorm": "8.477", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4810"} 2023-01-29 17:31:54 | INFO | train_inner | {"epoch": 10, "update": 9.503, "s2c_loss": "0.677", "loss": "0.46944", "s2c_nll_loss": "0.677", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "20540", "lr": "0.000136936", "gnorm": "8.91", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4812"} 2023-01-29 17:31:57 | INFO | train_inner | {"epoch": 10, "update": 9.507, "s2c_loss": "0.701", "loss": "0.48574", "s2c_nll_loss": "0.701", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": 
"250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "20550", "lr": "0.000137003", "gnorm": "7.637", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4815"} 2023-01-29 17:31:59 | INFO | train_inner | {"epoch": 10, "update": 9.512, "s2c_loss": "0.673", "loss": "0.46668", "s2c_nll_loss": "0.673", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "20560", "lr": "0.00013707", "gnorm": "8.188", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4817"} 2023-01-29 17:32:02 | INFO | train_inner | {"epoch": 10, "update": 9.517, "s2c_loss": "0.597", "loss": "0.414", "s2c_nll_loss": "0.597", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "20570", "lr": "0.000137136", "gnorm": "8.184", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4820"} 2023-01-29 17:32:04 | INFO | train_inner | {"epoch": 10, "update": 9.521, "s2c_loss": "0.589", "loss": "0.40796", "s2c_nll_loss": "0.589", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "20580", "lr": "0.000137203", "gnorm": "7.782", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4822"} 2023-01-29 17:32:07 | INFO | train_inner | {"epoch": 10, "update": 9.526, "s2c_loss": "0.744", "loss": "0.5156", "s2c_nll_loss": "0.744", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "20590", "lr": "0.00013727", "gnorm": "8.525", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4825"} 2023-01-29 17:32:09 | INFO | train_inner | {"epoch": 10, "update": 9.531, "s2c_loss": "0.514", "loss": "0.35631", "s2c_nll_loss": "0.514", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", 
"wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "20600", "lr": "0.000137336", "gnorm": "6.96", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4827"} 2023-01-29 17:32:12 | INFO | train_inner | {"epoch": 10, "update": 9.535, "s2c_loss": "0.6", "loss": "0.41555", "s2c_nll_loss": "0.6", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "20610", "lr": "0.000137403", "gnorm": "7.408", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4830"} 2023-01-29 17:32:15 | INFO | train_inner | {"epoch": 10, "update": 9.54, "s2c_loss": "0.451", "loss": "0.3129", "s2c_nll_loss": "0.451", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "20620", "lr": "0.00013747", "gnorm": "7.002", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4832"} 2023-01-29 17:32:17 | INFO | train_inner | {"epoch": 10, "update": 9.544, "s2c_loss": "0.553", "loss": "0.38331", "s2c_nll_loss": "0.553", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "20630", "lr": "0.000137536", "gnorm": "7.814", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4835"} 2023-01-29 17:32:20 | INFO | train_inner | {"epoch": 10, "update": 9.549, "s2c_loss": "0.66", "loss": "0.45779", "s2c_nll_loss": "0.66", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "20640", "lr": "0.000137603", "gnorm": "8.551", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4838"} 2023-01-29 17:32:22 | INFO | train_inner | {"epoch": 10, "update": 9.554, "s2c_loss": "0.524", "loss": "0.36303", "s2c_nll_loss": "0.524", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": 
"58.1", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "20650", "lr": "0.00013767", "gnorm": "8.378", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4840"} 2023-01-29 17:32:25 | INFO | train_inner | {"epoch": 10, "update": 9.558, "s2c_loss": "0.671", "loss": "0.46489", "s2c_nll_loss": "0.671", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "20660", "lr": "0.000137736", "gnorm": "8.023", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4843"} 2023-01-29 17:32:27 | INFO | train_inner | {"epoch": 10, "update": 9.563, "s2c_loss": "0.646", "loss": "0.44772", "s2c_nll_loss": "0.646", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "20670", "lr": "0.000137803", "gnorm": "7.855", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4845"} 2023-01-29 17:32:30 | INFO | train_inner | {"epoch": 10, "update": 9.568, "s2c_loss": "0.499", "loss": "0.34606", "s2c_nll_loss": "0.499", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "20680", "lr": "0.00013787", "gnorm": "7.101", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4848"} 2023-01-29 17:32:32 | INFO | train_inner | {"epoch": 10, "update": 9.572, "s2c_loss": "0.747", "loss": "0.51808", "s2c_nll_loss": "0.747", "s2c_accuracy": "86.25", "s2c_total": "64", "s2c_n_correct": "55.2", "wps": "257.8", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "20690", "lr": "0.000137936", "gnorm": "8.098", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "4850"} 2023-01-29 17:32:35 | INFO | train_inner | {"epoch": 10, "update": 9.577, "s2c_loss": "0.5", "loss": "0.3466", "s2c_nll_loss": "0.5", "s2c_accuracy": "91.094", "s2c_total": "64", 
"s2c_n_correct": "58.3", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "20700", "lr": "0.000138003", "gnorm": "7.198", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4853"} 2023-01-29 17:32:37 | INFO | train_inner | {"epoch": 10, "update": 9.581, "s2c_loss": "0.636", "loss": "0.44062", "s2c_nll_loss": "0.636", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "246.3", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "20710", "lr": "0.00013807", "gnorm": "8.42", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4855"} 2023-01-29 17:32:40 | INFO | train_inner | {"epoch": 10, "update": 9.586, "s2c_loss": "0.571", "loss": "0.39612", "s2c_nll_loss": "0.571", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "20720", "lr": "0.000138136", "gnorm": "7.337", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4858"} 2023-01-29 17:32:43 | INFO | train_inner | {"epoch": 10, "update": 9.591, "s2c_loss": "0.667", "loss": "0.46228", "s2c_nll_loss": "0.667", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "20730", "lr": "0.000138203", "gnorm": "9.552", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4860"} 2023-01-29 17:32:45 | INFO | train_inner | {"epoch": 10, "update": 9.595, "s2c_loss": "0.656", "loss": "0.45466", "s2c_nll_loss": "0.656", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "20740", "lr": "0.00013827", "gnorm": "8.705", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4863"} 2023-01-29 17:32:48 | INFO | train_inner | {"epoch": 10, "update": 9.6, "s2c_loss": "0.758", "loss": "0.52547", "s2c_nll_loss": "0.758", "s2c_accuracy": "87.5", 
"s2c_total": "64", "s2c_n_correct": "56", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "20750", "lr": "0.000138336", "gnorm": "9.948", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4866"} 2023-01-29 17:32:50 | INFO | train_inner | {"epoch": 10, "update": 9.605, "s2c_loss": "0.853", "loss": "0.59107", "s2c_nll_loss": "0.853", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "20760", "lr": "0.000138403", "gnorm": "10.022", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4868"} 2023-01-29 17:32:53 | INFO | train_inner | {"epoch": 10, "update": 9.609, "s2c_loss": "0.664", "loss": "0.46005", "s2c_nll_loss": "0.664", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "20770", "lr": "0.00013847", "gnorm": "8.618", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4871"} 2023-01-29 17:32:55 | INFO | train_inner | {"epoch": 10, "update": 9.614, "s2c_loss": "0.655", "loss": "0.45381", "s2c_nll_loss": "0.655", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "20780", "lr": "0.000138536", "gnorm": "8.838", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4873"} 2023-01-29 17:32:58 | INFO | train_inner | {"epoch": 10, "update": 9.618, "s2c_loss": "0.465", "loss": "0.32262", "s2c_nll_loss": "0.465", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "20790", "lr": "0.000138603", "gnorm": "6.775", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4876"} 2023-01-29 17:33:00 | INFO | train_inner | {"epoch": 10, "update": 9.623, "s2c_loss": "0.729", "loss": "0.50556", "s2c_nll_loss": "0.729", 
"s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "259.9", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "20800", "lr": "0.00013867", "gnorm": "7.732", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "4878"} 2023-01-29 17:33:03 | INFO | train_inner | {"epoch": 10, "update": 9.628, "s2c_loss": "0.514", "loss": "0.35603", "s2c_nll_loss": "0.514", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "20810", "lr": "0.000138736", "gnorm": "7.11", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4881"} 2023-01-29 17:33:05 | INFO | train_inner | {"epoch": 10, "update": 9.632, "s2c_loss": "0.438", "loss": "0.30368", "s2c_nll_loss": "0.438", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "20820", "lr": "0.000138803", "gnorm": "6.496", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4883"} 2023-01-29 17:33:08 | INFO | train_inner | {"epoch": 10, "update": 9.637, "s2c_loss": "0.601", "loss": "0.41653", "s2c_nll_loss": "0.601", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "20830", "lr": "0.00013887", "gnorm": "7.457", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "4886"} 2023-01-29 17:33:10 | INFO | train_inner | {"epoch": 10, "update": 9.642, "s2c_loss": "0.436", "loss": "0.30231", "s2c_nll_loss": "0.436", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "20840", "lr": "0.000138936", "gnorm": "7.116", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4888"} 2023-01-29 17:33:13 | INFO | train_inner | {"epoch": 10, "update": 9.646, "s2c_loss": "0.64", "loss": "0.4435", "s2c_nll_loss": 
"0.64", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "20850", "lr": "0.000139003", "gnorm": "8.09", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4891"} 2023-01-29 17:33:15 | INFO | train_inner | {"epoch": 10, "update": 9.651, "s2c_loss": "0.562", "loss": "0.38921", "s2c_nll_loss": "0.562", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "20860", "lr": "0.00013907", "gnorm": "8.176", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4893"} 2023-01-29 17:33:18 | INFO | train_inner | {"epoch": 10, "update": 9.655, "s2c_loss": "0.684", "loss": "0.47383", "s2c_nll_loss": "0.684", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "20870", "lr": "0.000139136", "gnorm": "7.999", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4896"} 2023-01-29 17:33:20 | INFO | train_inner | {"epoch": 10, "update": 9.66, "s2c_loss": "0.615", "loss": "0.42652", "s2c_nll_loss": "0.615", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "20880", "lr": "0.000139203", "gnorm": "8.446", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4898"} 2023-01-29 17:33:23 | INFO | train_inner | {"epoch": 10, "update": 9.665, "s2c_loss": "0.668", "loss": "0.46315", "s2c_nll_loss": "0.668", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "20890", "lr": "0.00013927", "gnorm": "8.202", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4901"} 2023-01-29 17:33:26 | INFO | train_inner | {"epoch": 10, "update": 9.669, "s2c_loss": "0.532", "loss": "0.36851", 
"s2c_nll_loss": "0.532", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "20900", "lr": "0.000139336", "gnorm": "7.378", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4903"} 2023-01-29 17:33:28 | INFO | train_inner | {"epoch": 10, "update": 9.674, "s2c_loss": "0.515", "loss": "0.35729", "s2c_nll_loss": "0.515", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "20910", "lr": "0.000139403", "gnorm": "7.506", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4906"} 2023-01-29 17:33:31 | INFO | train_inner | {"epoch": 10, "update": 9.679, "s2c_loss": "0.572", "loss": "0.39629", "s2c_nll_loss": "0.572", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "20920", "lr": "0.00013947", "gnorm": "7.458", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4909"} 2023-01-29 17:33:33 | INFO | train_inner | {"epoch": 10, "update": 9.683, "s2c_loss": "0.653", "loss": "0.45235", "s2c_nll_loss": "0.653", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "20930", "lr": "0.000139536", "gnorm": "8.386", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4911"} 2023-01-29 17:33:36 | INFO | train_inner | {"epoch": 10, "update": 9.688, "s2c_loss": "0.589", "loss": "0.40831", "s2c_nll_loss": "0.589", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "20940", "lr": "0.000139603", "gnorm": "7.421", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4914"} 2023-01-29 17:33:38 | INFO | train_inner | {"epoch": 10, "update": 9.692, "s2c_loss": "0.571", 
"loss": "0.39579", "s2c_nll_loss": "0.571", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "20950", "lr": "0.00013967", "gnorm": "7.3", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4916"} 2023-01-29 17:33:41 | INFO | train_inner | {"epoch": 10, "update": 9.697, "s2c_loss": "0.592", "loss": "0.41041", "s2c_nll_loss": "0.592", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "20960", "lr": "0.000139736", "gnorm": "8.382", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4919"} 2023-01-29 17:33:43 | INFO | train_inner | {"epoch": 10, "update": 9.702, "s2c_loss": "0.619", "loss": "0.42895", "s2c_nll_loss": "0.619", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "20970", "lr": "0.000139803", "gnorm": "8.754", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "4921"} 2023-01-29 17:33:46 | INFO | train_inner | {"epoch": 10, "update": 9.706, "s2c_loss": "0.587", "loss": "0.40712", "s2c_nll_loss": "0.587", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "20980", "lr": "0.00013987", "gnorm": "8.057", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4924"} 2023-01-29 17:33:48 | INFO | train_inner | {"epoch": 10, "update": 9.711, "s2c_loss": "0.554", "loss": "0.38419", "s2c_nll_loss": "0.554", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "20990", "lr": "0.000139936", "gnorm": "8.116", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4926"} 2023-01-29 17:33:51 | INFO | train_inner | {"epoch": 10, "update": 9.716, 
"s2c_loss": "0.599", "loss": "0.41566", "s2c_nll_loss": "0.599", "s2c_accuracy": "89.011", "s2c_total": "63.7", "s2c_n_correct": "56.7", "wps": "250.6", "ups": "3.93", "wpb": "63.7", "bsz": "63.7", "num_updates": "21000", "lr": "0.000140003", "gnorm": "7.989", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4929"} 2023-01-29 17:33:54 | INFO | train_inner | {"epoch": 10, "update": 9.72, "s2c_loss": "0.59", "loss": "0.4088", "s2c_nll_loss": "0.59", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "21010", "lr": "0.00014007", "gnorm": "8.173", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4931"} 2023-01-29 17:33:56 | INFO | train_inner | {"epoch": 10, "update": 9.725, "s2c_loss": "0.555", "loss": "0.38494", "s2c_nll_loss": "0.555", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "21020", "lr": "0.000140136", "gnorm": "7.446", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4934"} 2023-01-29 17:33:59 | INFO | train_inner | {"epoch": 10, "update": 9.729, "s2c_loss": "0.547", "loss": "0.37928", "s2c_nll_loss": "0.547", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "21030", "lr": "0.000140203", "gnorm": "7.63", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4937"} 2023-01-29 17:34:01 | INFO | train_inner | {"epoch": 10, "update": 9.734, "s2c_loss": "0.675", "loss": "0.46803", "s2c_nll_loss": "0.675", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "21040", "lr": "0.00014027", "gnorm": "7.764", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "4939"} 2023-01-29 17:34:04 | INFO | train_inner | {"epoch": 10, 
"update": 9.739, "s2c_loss": "0.526", "loss": "0.36465", "s2c_nll_loss": "0.526", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "21050", "lr": "0.000140336", "gnorm": "8.102", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4942"} 2023-01-29 17:34:06 | INFO | train_inner | {"epoch": 10, "update": 9.743, "s2c_loss": "0.609", "loss": "0.42243", "s2c_nll_loss": "0.609", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "21060", "lr": "0.000140403", "gnorm": "8.901", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4944"} 2023-01-29 17:34:09 | INFO | train_inner | {"epoch": 10, "update": 9.748, "s2c_loss": "0.533", "loss": "0.36939", "s2c_nll_loss": "0.533", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "21070", "lr": "0.00014047", "gnorm": "7.795", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4947"} 2023-01-29 17:34:11 | INFO | train_inner | {"epoch": 10, "update": 9.753, "s2c_loss": "0.625", "loss": "0.43321", "s2c_nll_loss": "0.625", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "21080", "lr": "0.000140536", "gnorm": "7.824", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "4949"} 2023-01-29 17:34:14 | INFO | train_inner | {"epoch": 10, "update": 9.757, "s2c_loss": "0.563", "loss": "0.39046", "s2c_nll_loss": "0.563", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "21090", "lr": "0.000140603", "gnorm": "8.171", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4952"} 2023-01-29 17:34:16 | INFO | train_inner | 
{"epoch": 10, "update": 9.762, "s2c_loss": "0.646", "loss": "0.44801", "s2c_nll_loss": "0.646", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "21100", "lr": "0.00014067", "gnorm": "8.424", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4954"} 2023-01-29 17:34:19 | INFO | train_inner | {"epoch": 10, "update": 9.766, "s2c_loss": "0.541", "loss": "0.37533", "s2c_nll_loss": "0.541", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "21110", "lr": "0.000140736", "gnorm": "8.216", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "4957"} 2023-01-29 17:34:21 | INFO | train_inner | {"epoch": 10, "update": 9.771, "s2c_loss": "0.539", "loss": "0.37359", "s2c_nll_loss": "0.539", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "21120", "lr": "0.000140803", "gnorm": "7.869", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4959"} 2023-01-29 17:34:24 | INFO | train_inner | {"epoch": 10, "update": 9.776, "s2c_loss": "0.613", "loss": "0.42498", "s2c_nll_loss": "0.613", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "21130", "lr": "0.00014087", "gnorm": "8.199", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4962"} 2023-01-29 17:34:26 | INFO | train_inner | {"epoch": 10, "update": 9.78, "s2c_loss": "0.479", "loss": "0.33176", "s2c_nll_loss": "0.479", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "21140", "lr": "0.000140936", "gnorm": "6.875", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4964"} 2023-01-29 17:34:29 | INFO 
| train_inner | {"epoch": 10, "update": 9.785, "s2c_loss": "0.535", "loss": "0.37091", "s2c_nll_loss": "0.535", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "21150", "lr": "0.000141003", "gnorm": "7.601", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4967"} 2023-01-29 17:34:32 | INFO | train_inner | {"epoch": 10, "update": 9.79, "s2c_loss": "0.604", "loss": "0.41831", "s2c_nll_loss": "0.604", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "21160", "lr": "0.00014107", "gnorm": "8.02", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4970"} 2023-01-29 17:34:34 | INFO | train_inner | {"epoch": 10, "update": 9.794, "s2c_loss": "0.486", "loss": "0.33707", "s2c_nll_loss": "0.486", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "21170", "lr": "0.000141136", "gnorm": "7.656", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4972"} 2023-01-29 17:34:37 | INFO | train_inner | {"epoch": 10, "update": 9.799, "s2c_loss": "0.499", "loss": "0.34556", "s2c_nll_loss": "0.499", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "21180", "lr": "0.000141203", "gnorm": "7.895", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4975"} 2023-01-29 17:34:39 | INFO | train_inner | {"epoch": 10, "update": 9.803, "s2c_loss": "0.679", "loss": "0.47039", "s2c_nll_loss": "0.679", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "21190", "lr": "0.00014127", "gnorm": "8.532", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4977"} 2023-01-29 
17:34:42 | INFO | train_inner | {"epoch": 10, "update": 9.808, "s2c_loss": "0.448", "loss": "0.3104", "s2c_nll_loss": "0.448", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "21200", "lr": "0.000141336", "gnorm": "7.525", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4980"} 2023-01-29 17:34:44 | INFO | train_inner | {"epoch": 10, "update": 9.813, "s2c_loss": "0.517", "loss": "0.35851", "s2c_nll_loss": "0.517", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "21210", "lr": "0.000141403", "gnorm": "7.879", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4982"} 2023-01-29 17:34:47 | INFO | train_inner | {"epoch": 10, "update": 9.817, "s2c_loss": "0.606", "loss": "0.42031", "s2c_nll_loss": "0.606", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "21220", "lr": "0.00014147", "gnorm": "9.448", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4985"} 2023-01-29 17:34:49 | INFO | train_inner | {"epoch": 10, "update": 9.822, "s2c_loss": "0.542", "loss": "0.37579", "s2c_nll_loss": "0.542", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "21230", "lr": "0.000141536", "gnorm": "7.902", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4987"} 2023-01-29 17:34:52 | INFO | train_inner | {"epoch": 10, "update": 9.827, "s2c_loss": "0.512", "loss": "0.35487", "s2c_nll_loss": "0.512", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "21240", "lr": "0.000141603", "gnorm": "7.279", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": 
"4990"} 2023-01-29 17:34:54 | INFO | train_inner | {"epoch": 10, "update": 9.831, "s2c_loss": "0.625", "loss": "0.43337", "s2c_nll_loss": "0.625", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "21250", "lr": "0.00014167", "gnorm": "7.757", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "4992"} 2023-01-29 17:34:57 | INFO | train_inner | {"epoch": 10, "update": 9.836, "s2c_loss": "0.577", "loss": "0.39963", "s2c_nll_loss": "0.577", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "21260", "lr": "0.000141736", "gnorm": "7.663", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "4995"} 2023-01-29 17:34:59 | INFO | train_inner | {"epoch": 10, "update": 9.84, "s2c_loss": "0.46", "loss": "0.31876", "s2c_nll_loss": "0.46", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "246.1", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "21270", "lr": "0.000141803", "gnorm": "6.366", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "4997"} 2023-01-29 17:35:02 | INFO | train_inner | {"epoch": 10, "update": 9.845, "s2c_loss": "0.617", "loss": "0.42778", "s2c_nll_loss": "0.617", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "21280", "lr": "0.00014187", "gnorm": "7.572", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5000"} 2023-01-29 17:35:05 | INFO | train_inner | {"epoch": 10, "update": 9.85, "s2c_loss": "0.495", "loss": "0.34326", "s2c_nll_loss": "0.495", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "21290", "lr": "0.000141936", "gnorm": "7.692", "loss_scale": "1024", "train_wall": "3", "gb_free": 
"7.3", "wall": "5002"} 2023-01-29 17:35:07 | INFO | train_inner | {"epoch": 10, "update": 9.854, "s2c_loss": "0.674", "loss": "0.46723", "s2c_nll_loss": "0.674", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "21300", "lr": "0.000142003", "gnorm": "9.116", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5005"} 2023-01-29 17:35:10 | INFO | train_inner | {"epoch": 10, "update": 9.859, "s2c_loss": "0.586", "loss": "0.40639", "s2c_nll_loss": "0.586", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "21310", "lr": "0.00014207", "gnorm": "8.256", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5008"} 2023-01-29 17:35:12 | INFO | train_inner | {"epoch": 10, "update": 9.864, "s2c_loss": "0.695", "loss": "0.48165", "s2c_nll_loss": "0.695", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "21320", "lr": "0.000142136", "gnorm": "7.839", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5010"} 2023-01-29 17:35:15 | INFO | train_inner | {"epoch": 10, "update": 9.868, "s2c_loss": "0.569", "loss": "0.39448", "s2c_nll_loss": "0.569", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "21330", "lr": "0.000142203", "gnorm": "8.738", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5013"} 2023-01-29 17:35:17 | INFO | train_inner | {"epoch": 10, "update": 9.873, "s2c_loss": "0.626", "loss": "0.4341", "s2c_nll_loss": "0.626", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "21340", "lr": "0.00014227", "gnorm": "7.716", "loss_scale": "1024", "train_wall": 
"3", "gb_free": "7.3", "wall": "5015"} 2023-01-29 17:35:20 | INFO | train_inner | {"epoch": 10, "update": 9.877, "s2c_loss": "0.582", "loss": "0.40357", "s2c_nll_loss": "0.582", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "21350", "lr": "0.000142336", "gnorm": "6.922", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5018"} 2023-01-29 17:35:22 | INFO | train_inner | {"epoch": 10, "update": 9.882, "s2c_loss": "0.476", "loss": "0.33004", "s2c_nll_loss": "0.476", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "247", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "21360", "lr": "0.000142403", "gnorm": "7.552", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5020"} 2023-01-29 17:35:25 | INFO | train_inner | {"epoch": 10, "update": 9.887, "s2c_loss": "0.78", "loss": "0.54053", "s2c_nll_loss": "0.78", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "21370", "lr": "0.00014247", "gnorm": "8.216", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5023"} 2023-01-29 17:35:28 | INFO | train_inner | {"epoch": 10, "update": 9.891, "s2c_loss": "0.553", "loss": "0.3833", "s2c_nll_loss": "0.553", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "21380", "lr": "0.000142536", "gnorm": "6.975", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "5025"} 2023-01-29 17:35:30 | INFO | train_inner | {"epoch": 10, "update": 9.896, "s2c_loss": "0.64", "loss": "0.4433", "s2c_nll_loss": "0.64", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "21390", "lr": "0.000142603", "gnorm": "7.751", "loss_scale": "1024", 
"train_wall": "3", "gb_free": "7.2", "wall": "5028"} 2023-01-29 17:35:33 | INFO | train_inner | {"epoch": 10, "update": 9.901, "s2c_loss": "0.514", "loss": "0.35643", "s2c_nll_loss": "0.514", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "21400", "lr": "0.00014267", "gnorm": "7.119", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5031"} 2023-01-29 17:35:35 | INFO | train_inner | {"epoch": 10, "update": 9.905, "s2c_loss": "0.788", "loss": "0.54614", "s2c_nll_loss": "0.788", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "21410", "lr": "0.000142736", "gnorm": "7.516", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5033"} 2023-01-29 17:35:38 | INFO | train_inner | {"epoch": 10, "update": 9.91, "s2c_loss": "0.513", "loss": "0.35559", "s2c_nll_loss": "0.513", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "21420", "lr": "0.000142803", "gnorm": "6.891", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5036"} 2023-01-29 17:35:40 | INFO | train_inner | {"epoch": 10, "update": 9.914, "s2c_loss": "0.471", "loss": "0.32616", "s2c_nll_loss": "0.471", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "21430", "lr": "0.00014287", "gnorm": "7.189", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5038"} 2023-01-29 17:35:43 | INFO | train_inner | {"epoch": 10, "update": 9.919, "s2c_loss": "0.627", "loss": "0.4343", "s2c_nll_loss": "0.627", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "258.5", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "21440", "lr": "0.000142936", "gnorm": "7.843", "loss_scale": 
"1024", "train_wall": "2", "gb_free": "7.4", "wall": "5041"} 2023-01-29 17:35:45 | INFO | train_inner | {"epoch": 10, "update": 9.924, "s2c_loss": "0.462", "loss": "0.32003", "s2c_nll_loss": "0.462", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "21450", "lr": "0.000143003", "gnorm": "6.47", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5043"} 2023-01-29 17:35:48 | INFO | train_inner | {"epoch": 10, "update": 9.928, "s2c_loss": "0.527", "loss": "0.36546", "s2c_nll_loss": "0.527", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "21460", "lr": "0.00014307", "gnorm": "7.146", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5046"} 2023-01-29 17:35:50 | INFO | train_inner | {"epoch": 10, "update": 9.933, "s2c_loss": "0.358", "loss": "0.24822", "s2c_nll_loss": "0.358", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "21470", "lr": "0.000143136", "gnorm": "5.864", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "5048"} 2023-01-29 17:35:53 | INFO | train_inner | {"epoch": 10, "update": 9.938, "s2c_loss": "0.565", "loss": "0.39169", "s2c_nll_loss": "0.565", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "21480", "lr": "0.000143203", "gnorm": "8.364", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5051"} 2023-01-29 17:35:55 | INFO | train_inner | {"epoch": 10, "update": 9.942, "s2c_loss": "0.879", "loss": "0.60911", "s2c_nll_loss": "0.879", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "21490", "lr": "0.00014327", "gnorm": "8.301", 
"loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "5053"} 2023-01-29 17:35:58 | INFO | train_inner | {"epoch": 10, "update": 9.947, "s2c_loss": "0.564", "loss": "0.39123", "s2c_nll_loss": "0.564", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "21500", "lr": "0.000143336", "gnorm": "7.803", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5056"} 2023-01-29 17:36:00 | INFO | train_inner | {"epoch": 10, "update": 9.951, "s2c_loss": "0.578", "loss": "0.40081", "s2c_nll_loss": "0.578", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "21510", "lr": "0.000143403", "gnorm": "7.668", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5058"} 2023-01-29 17:36:03 | INFO | train_inner | {"epoch": 10, "update": 9.956, "s2c_loss": "0.544", "loss": "0.37711", "s2c_nll_loss": "0.544", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "21520", "lr": "0.000143469", "gnorm": "7.708", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5061"} 2023-01-29 17:36:06 | INFO | train_inner | {"epoch": 10, "update": 9.961, "s2c_loss": "0.507", "loss": "0.35142", "s2c_nll_loss": "0.507", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "21530", "lr": "0.000143536", "gnorm": "7.587", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5063"} 2023-01-29 17:36:08 | INFO | train_inner | {"epoch": 10, "update": 9.965, "s2c_loss": "0.551", "loss": "0.38172", "s2c_nll_loss": "0.551", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "21540", "lr": "0.000143603", 
"gnorm": "8.204", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5066"} 2023-01-29 17:36:11 | INFO | train_inner | {"epoch": 10, "update": 9.97, "s2c_loss": "0.526", "loss": "0.36481", "s2c_nll_loss": "0.526", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "21550", "lr": "0.000143669", "gnorm": "6.845", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5069"} 2023-01-29 17:36:13 | INFO | train_inner | {"epoch": 10, "update": 9.975, "s2c_loss": "0.639", "loss": "0.44264", "s2c_nll_loss": "0.639", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "21560", "lr": "0.000143736", "gnorm": "7.796", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "5071"} 2023-01-29 17:36:16 | INFO | train_inner | {"epoch": 10, "update": 9.979, "s2c_loss": "0.597", "loss": "0.41367", "s2c_nll_loss": "0.597", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "21570", "lr": "0.000143803", "gnorm": "7.352", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5074"} 2023-01-29 17:36:18 | INFO | train_inner | {"epoch": 10, "update": 9.984, "s2c_loss": "0.477", "loss": "0.33097", "s2c_nll_loss": "0.477", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "21580", "lr": "0.000143869", "gnorm": "6.534", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5076"} 2023-01-29 17:36:21 | INFO | train_inner | {"epoch": 10, "update": 9.988, "s2c_loss": "0.631", "loss": "0.43738", "s2c_nll_loss": "0.631", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "21590", "lr": 
"0.000143936", "gnorm": "7.557", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5079"} 2023-01-29 17:36:23 | INFO | train_inner | {"epoch": 10, "update": 9.993, "s2c_loss": "0.655", "loss": "0.45405", "s2c_nll_loss": "0.655", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "21600", "lr": "0.000144003", "gnorm": "7.892", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "5081"} 2023-01-29 17:36:26 | INFO | train_inner | {"epoch": 10, "update": 9.998, "s2c_loss": "0.736", "loss": "0.50983", "s2c_nll_loss": "0.736", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "21610", "lr": "0.000144069", "gnorm": "8.231", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "5084"} 2023-01-29 17:36:27 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 17:36:41 | INFO | valid | {"epoch": 10, "valid_s2c_loss": "1.171", "valid_loss": "0.81153", "valid_s2c_nll_loss": "1.171", "valid_s2c_accuracy": "79.226", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "25.3194", "valid_num_updates": "21615", "valid_best_s2c_accuracy": "79.473"} 2023-01-29 17:36:41 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 10 @ 21615 updates 2023-01-29 17:36:41 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 17:36:48 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 17:36:48 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 10 @ 21615 updates, score 79.226) (writing took 7.114160731900483 seconds) 2023-01-29 17:36:48 | INFO | fairseq_cli.train | end of epoch 10 (average epoch stats 
below) 2023-01-29 17:36:49 | INFO | train | {"epoch": 10, "train_s2c_loss": "0.571", "train_loss": "0.39587", "train_s2c_nll_loss": "0.571", "train_s2c_accuracy": "90.133", "train_s2c_total": "63.9838", "train_s2c_n_correct": "57.6707", "train_wps": "230.5", "train_ups": "3.6", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "21615", "train_lr": "0.000144103", "train_gnorm": "7.847", "train_loss_scale": "1024", "train_train_wall": "541", "train_gb_free": "7.4", "train_wall": "5106"} 2023-01-29 17:36:55 | INFO | fairseq.trainer | begin training epoch 11 2023-01-29 17:36:55 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 17:36:56 | INFO | train_inner | {"epoch": 11, "update": 10.002, "s2c_loss": "0.631", "loss": "0.43716", "s2c_nll_loss": "0.631", "s2c_accuracy": "88.322", "s2c_total": "60.8", "s2c_n_correct": "53.7", "wps": "20.1", "ups": "0.33", "wpb": "60.8", "bsz": "60.8", "num_updates": "21620", "lr": "0.000144136", "gnorm": "9.182", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "5114"} 2023-01-29 17:36:59 | INFO | train_inner | {"epoch": 11, "update": 10.007, "s2c_loss": "0.464", "loss": "0.32149", "s2c_nll_loss": "0.464", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "21630", "lr": "0.000144203", "gnorm": "7.451", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5117"} 2023-01-29 17:37:01 | INFO | train_inner | {"epoch": 11, "update": 10.012, "s2c_loss": "0.467", "loss": "0.32347", "s2c_nll_loss": "0.467", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "21640", "lr": "0.000144269", "gnorm": "7.649", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "5119"} 2023-01-29 17:37:04 | INFO | train_inner | {"epoch": 11, "update": 10.016, "s2c_loss": "0.569", "loss": "0.39419", "s2c_nll_loss": 
"0.569", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "21650", "lr": "0.000144336", "gnorm": "8.455", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5122"} 2023-01-29 17:37:06 | INFO | train_inner | {"epoch": 11, "update": 10.021, "s2c_loss": "0.585", "loss": "0.40577", "s2c_nll_loss": "0.585", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "21660", "lr": "0.000144403", "gnorm": "8.152", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5124"} 2023-01-29 17:37:09 | INFO | train_inner | {"epoch": 11, "update": 10.025, "s2c_loss": "0.534", "loss": "0.37037", "s2c_nll_loss": "0.534", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "21670", "lr": "0.000144469", "gnorm": "7.054", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5127"} 2023-01-29 17:37:12 | INFO | train_inner | {"epoch": 11, "update": 10.03, "s2c_loss": "0.42", "loss": "0.29092", "s2c_nll_loss": "0.42", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "21680", "lr": "0.000144536", "gnorm": "6.123", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5129"} 2023-01-29 17:37:14 | INFO | train_inner | {"epoch": 11, "update": 10.035, "s2c_loss": "0.49", "loss": "0.33935", "s2c_nll_loss": "0.49", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "21690", "lr": "0.000144603", "gnorm": "7.089", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5132"} 2023-01-29 17:37:17 | INFO | train_inner | {"epoch": 11, "update": 10.039, "s2c_loss": "0.593", "loss": 
"0.41086", "s2c_nll_loss": "0.593", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "21700", "lr": "0.000144669", "gnorm": "6.781", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "5135"} 2023-01-29 17:37:19 | INFO | train_inner | {"epoch": 11, "update": 10.044, "s2c_loss": "0.461", "loss": "0.31961", "s2c_nll_loss": "0.461", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "21710", "lr": "0.000144736", "gnorm": "7.28", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5137"} 2023-01-29 17:37:22 | INFO | train_inner | {"epoch": 11, "update": 10.049, "s2c_loss": "0.45", "loss": "0.31172", "s2c_nll_loss": "0.45", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "21720", "lr": "0.000144803", "gnorm": "7.295", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5140"} 2023-01-29 17:37:24 | INFO | train_inner | {"epoch": 11, "update": 10.053, "s2c_loss": "0.493", "loss": "0.3417", "s2c_nll_loss": "0.493", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "21730", "lr": "0.000144869", "gnorm": "6.809", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5142"} 2023-01-29 17:37:27 | INFO | train_inner | {"epoch": 11, "update": 10.058, "s2c_loss": "0.491", "loss": "0.34013", "s2c_nll_loss": "0.491", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "21740", "lr": "0.000144936", "gnorm": "7.14", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5145"} 2023-01-29 17:37:29 | INFO | train_inner | {"epoch": 11, "update": 10.062, 
"s2c_loss": "0.476", "loss": "0.32999", "s2c_nll_loss": "0.476", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "21750", "lr": "0.000145003", "gnorm": "6.988", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5147"} 2023-01-29 17:37:32 | INFO | train_inner | {"epoch": 11, "update": 10.067, "s2c_loss": "0.382", "loss": "0.26448", "s2c_nll_loss": "0.382", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "21760", "lr": "0.000145069", "gnorm": "6.255", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5150"} 2023-01-29 17:37:35 | INFO | train_inner | {"epoch": 11, "update": 10.072, "s2c_loss": "0.427", "loss": "0.29571", "s2c_nll_loss": "0.427", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "21770", "lr": "0.000145136", "gnorm": "6.39", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "5153"} 2023-01-29 17:37:37 | INFO | train_inner | {"epoch": 11, "update": 10.076, "s2c_loss": "0.563", "loss": "0.39058", "s2c_nll_loss": "0.563", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "21780", "lr": "0.000145203", "gnorm": "7.365", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5155"} 2023-01-29 17:37:40 | INFO | train_inner | {"epoch": 11, "update": 10.081, "s2c_loss": "0.404", "loss": "0.28031", "s2c_nll_loss": "0.404", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "21790", "lr": "0.000145269", "gnorm": "7.198", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5158"} 2023-01-29 17:37:42 | INFO | train_inner | {"epoch": 
11, "update": 10.086, "s2c_loss": "0.413", "loss": "0.28616", "s2c_nll_loss": "0.413", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "245", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "21800", "lr": "0.000145336", "gnorm": "7.239", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5160"} 2023-01-29 17:37:45 | INFO | train_inner | {"epoch": 11, "update": 10.09, "s2c_loss": "0.459", "loss": "0.31846", "s2c_nll_loss": "0.459", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "243.3", "ups": "3.8", "wpb": "64", "bsz": "64", "num_updates": "21810", "lr": "0.000145403", "gnorm": "6.738", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5163"} 2023-01-29 17:37:47 | INFO | train_inner | {"epoch": 11, "update": 10.095, "s2c_loss": "0.567", "loss": "0.39285", "s2c_nll_loss": "0.567", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "21820", "lr": "0.000145469", "gnorm": "8.306", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5165"} 2023-01-29 17:37:50 | INFO | train_inner | {"epoch": 11, "update": 10.099, "s2c_loss": "0.415", "loss": "0.28749", "s2c_nll_loss": "0.415", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "21830", "lr": "0.000145536", "gnorm": "7.903", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5168"} 2023-01-29 17:37:53 | INFO | train_inner | {"epoch": 11, "update": 10.104, "s2c_loss": "0.868", "loss": "0.60152", "s2c_nll_loss": "0.868", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "21840", "lr": "0.000145603", "gnorm": "7.957", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5171"} 2023-01-29 17:37:55 | INFO | 
train_inner | {"epoch": 11, "update": 10.109, "s2c_loss": "0.529", "loss": "0.36663", "s2c_nll_loss": "0.529", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "21850", "lr": "0.000145669", "gnorm": "8.021", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5173"} 2023-01-29 17:37:58 | INFO | train_inner | {"epoch": 11, "update": 10.113, "s2c_loss": "0.461", "loss": "0.31955", "s2c_nll_loss": "0.461", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "21860", "lr": "0.000145736", "gnorm": "6.857", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5176"} 2023-01-29 17:38:00 | INFO | train_inner | {"epoch": 11, "update": 10.118, "s2c_loss": "0.473", "loss": "0.32779", "s2c_nll_loss": "0.473", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "21870", "lr": "0.000145803", "gnorm": "7.421", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5178"} 2023-01-29 17:38:03 | INFO | train_inner | {"epoch": 11, "update": 10.123, "s2c_loss": "0.474", "loss": "0.32853", "s2c_nll_loss": "0.474", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "21880", "lr": "0.000145869", "gnorm": "7.29", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5181"} 2023-01-29 17:38:05 | INFO | train_inner | {"epoch": 11, "update": 10.127, "s2c_loss": "0.57", "loss": "0.39501", "s2c_nll_loss": "0.57", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "21890", "lr": "0.000145936", "gnorm": "6.859", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "5183"} 
2023-01-29 17:38:08 | INFO | train_inner | {"epoch": 11, "update": 10.132, "s2c_loss": "0.361", "loss": "0.25023", "s2c_nll_loss": "0.361", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "21900", "lr": "0.000146003", "gnorm": "6.874", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5186"} 2023-01-29 17:38:10 | INFO | train_inner | {"epoch": 11, "update": 10.136, "s2c_loss": "0.553", "loss": "0.38349", "s2c_nll_loss": "0.553", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "21910", "lr": "0.000146069", "gnorm": "6.715", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "5188"} 2023-01-29 17:38:13 | INFO | train_inner | {"epoch": 11, "update": 10.141, "s2c_loss": "0.452", "loss": "0.31322", "s2c_nll_loss": "0.452", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "21920", "lr": "0.000146136", "gnorm": "6.928", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5191"} 2023-01-29 17:38:15 | INFO | train_inner | {"epoch": 11, "update": 10.146, "s2c_loss": "0.459", "loss": "0.31838", "s2c_nll_loss": "0.459", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "21930", "lr": "0.000146203", "gnorm": "6.588", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5193"} 2023-01-29 17:38:18 | INFO | train_inner | {"epoch": 11, "update": 10.15, "s2c_loss": "0.413", "loss": "0.2864", "s2c_nll_loss": "0.413", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "21940", "lr": "0.000146269", "gnorm": "7.127", "loss_scale": "1024", "train_wall": "3", "gb_free": 
"7.2", "wall": "5196"} 2023-01-29 17:38:21 | INFO | train_inner | {"epoch": 11, "update": 10.155, "s2c_loss": "0.46", "loss": "0.3188", "s2c_nll_loss": "0.46", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "21950", "lr": "0.000146336", "gnorm": "7.06", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "5198"} 2023-01-29 17:38:23 | INFO | train_inner | {"epoch": 11, "update": 10.16, "s2c_loss": "0.389", "loss": "0.26972", "s2c_nll_loss": "0.389", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "21960", "lr": "0.000146403", "gnorm": "6.275", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "5201"} 2023-01-29 17:38:26 | INFO | train_inner | {"epoch": 11, "update": 10.164, "s2c_loss": "0.519", "loss": "0.36", "s2c_nll_loss": "0.519", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "21970", "lr": "0.000146469", "gnorm": "7.563", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "5203"} 2023-01-29 17:38:28 | INFO | train_inner | {"epoch": 11, "update": 10.169, "s2c_loss": "0.576", "loss": "0.39953", "s2c_nll_loss": "0.576", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "21980", "lr": "0.000146536", "gnorm": "7.808", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5206"} 2023-01-29 17:38:31 | INFO | train_inner | {"epoch": 11, "update": 10.173, "s2c_loss": "0.645", "loss": "0.44739", "s2c_nll_loss": "0.645", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "21990", "lr": "0.000146603", "gnorm": "8.606", "loss_scale": "1024", "train_wall": "2", 
"gb_free": "7.2", "wall": "5209"} 2023-01-29 17:38:33 | INFO | train_inner | {"epoch": 11, "update": 10.178, "s2c_loss": "0.47", "loss": "0.32575", "s2c_nll_loss": "0.47", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "22000", "lr": "0.000146669", "gnorm": "6.844", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5211"} 2023-01-29 17:38:36 | INFO | train_inner | {"epoch": 11, "update": 10.183, "s2c_loss": "0.456", "loss": "0.31627", "s2c_nll_loss": "0.456", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "22010", "lr": "0.000146736", "gnorm": "6.881", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5214"} 2023-01-29 17:38:38 | INFO | train_inner | {"epoch": 11, "update": 10.187, "s2c_loss": "0.546", "loss": "0.37875", "s2c_nll_loss": "0.546", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "22020", "lr": "0.000146803", "gnorm": "6.824", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5216"} 2023-01-29 17:38:41 | INFO | train_inner | {"epoch": 11, "update": 10.192, "s2c_loss": "0.541", "loss": "0.3753", "s2c_nll_loss": "0.541", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "22030", "lr": "0.000146869", "gnorm": "6.964", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5219"} 2023-01-29 17:38:43 | INFO | train_inner | {"epoch": 11, "update": 10.197, "s2c_loss": "0.568", "loss": "0.39396", "s2c_nll_loss": "0.568", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "22040", "lr": "0.000146936", "gnorm": "8.049", "loss_scale": "1024", 
"train_wall": "2", "gb_free": "7.3", "wall": "5221"} 2023-01-29 17:38:46 | INFO | train_inner | {"epoch": 11, "update": 10.201, "s2c_loss": "0.477", "loss": "0.33069", "s2c_nll_loss": "0.477", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "22050", "lr": "0.000147003", "gnorm": "7.78", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5224"} 2023-01-29 17:38:48 | INFO | train_inner | {"epoch": 11, "update": 10.206, "s2c_loss": "0.627", "loss": "0.43433", "s2c_nll_loss": "0.627", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "245.9", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "22060", "lr": "0.000147069", "gnorm": "8.205", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5226"} 2023-01-29 17:38:51 | INFO | train_inner | {"epoch": 11, "update": 10.21, "s2c_loss": "0.611", "loss": "0.42355", "s2c_nll_loss": "0.611", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "22070", "lr": "0.000147136", "gnorm": "8.013", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5229"} 2023-01-29 17:38:54 | INFO | train_inner | {"epoch": 11, "update": 10.215, "s2c_loss": "0.516", "loss": "0.35745", "s2c_nll_loss": "0.516", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "22080", "lr": "0.000147203", "gnorm": "6.459", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5231"} 2023-01-29 17:38:56 | INFO | train_inner | {"epoch": 11, "update": 10.22, "s2c_loss": "0.458", "loss": "0.31734", "s2c_nll_loss": "0.458", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "22090", "lr": "0.000147269", "gnorm": "7.255", 
"loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5234"} 2023-01-29 17:38:59 | INFO | train_inner | {"epoch": 11, "update": 10.224, "s2c_loss": "0.551", "loss": "0.38168", "s2c_nll_loss": "0.551", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "22100", "lr": "0.000147336", "gnorm": "7.407", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5236"} 2023-01-29 17:39:01 | INFO | train_inner | {"epoch": 11, "update": 10.229, "s2c_loss": "0.441", "loss": "0.30577", "s2c_nll_loss": "0.441", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "22110", "lr": "0.000147403", "gnorm": "6.843", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5239"} 2023-01-29 17:39:04 | INFO | train_inner | {"epoch": 11, "update": 10.234, "s2c_loss": "0.538", "loss": "0.3731", "s2c_nll_loss": "0.538", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "22120", "lr": "0.000147469", "gnorm": "7.972", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5242"} 2023-01-29 17:39:06 | INFO | train_inner | {"epoch": 11, "update": 10.238, "s2c_loss": "0.604", "loss": "0.41845", "s2c_nll_loss": "0.604", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "22130", "lr": "0.000147536", "gnorm": "7.576", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5244"} 2023-01-29 17:39:09 | INFO | train_inner | {"epoch": 11, "update": 10.243, "s2c_loss": "0.655", "loss": "0.45421", "s2c_nll_loss": "0.655", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "22140", "lr": "0.000147603", 
"gnorm": "7.922", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "5247"} 2023-01-29 17:39:11 | INFO | train_inner | {"epoch": 11, "update": 10.247, "s2c_loss": "0.579", "loss": "0.40134", "s2c_nll_loss": "0.579", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "22150", "lr": "0.000147669", "gnorm": "8.236", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5249"} 2023-01-29 17:39:14 | INFO | train_inner | {"epoch": 11, "update": 10.252, "s2c_loss": "0.524", "loss": "0.36311", "s2c_nll_loss": "0.524", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "22160", "lr": "0.000147736", "gnorm": "7.784", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5252"} 2023-01-29 17:39:16 | INFO | train_inner | {"epoch": 11, "update": 10.257, "s2c_loss": "0.554", "loss": "0.38376", "s2c_nll_loss": "0.554", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "22170", "lr": "0.000147803", "gnorm": "8.263", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "5254"} 2023-01-29 17:39:19 | INFO | train_inner | {"epoch": 11, "update": 10.261, "s2c_loss": "0.411", "loss": "0.28469", "s2c_nll_loss": "0.411", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "22180", "lr": "0.000147869", "gnorm": "6.348", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5257"} 2023-01-29 17:39:21 | INFO | train_inner | {"epoch": 11, "update": 10.266, "s2c_loss": "0.444", "loss": "0.3075", "s2c_nll_loss": "0.444", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "22190", 
"lr": "0.000147936", "gnorm": "7.675", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "5259"} 2023-01-29 17:39:24 | INFO | train_inner | {"epoch": 11, "update": 10.271, "s2c_loss": "0.488", "loss": "0.33792", "s2c_nll_loss": "0.488", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "22200", "lr": "0.000148003", "gnorm": "6.935", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5262"} 2023-01-29 17:39:27 | INFO | train_inner | {"epoch": 11, "update": 10.275, "s2c_loss": "0.528", "loss": "0.36625", "s2c_nll_loss": "0.528", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "22210", "lr": "0.000148069", "gnorm": "7.453", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "5264"} 2023-01-29 17:39:29 | INFO | train_inner | {"epoch": 11, "update": 10.28, "s2c_loss": "0.672", "loss": "0.46568", "s2c_nll_loss": "0.672", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "22220", "lr": "0.000148136", "gnorm": "10.744", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "5267"} 2023-01-29 17:39:32 | INFO | train_inner | {"epoch": 11, "update": 10.284, "s2c_loss": "0.548", "loss": "0.38011", "s2c_nll_loss": "0.548", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "22230", "lr": "0.000148203", "gnorm": "8.502", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5269"} 2023-01-29 17:39:34 | INFO | train_inner | {"epoch": 11, "update": 10.289, "s2c_loss": "0.451", "loss": "0.31269", "s2c_nll_loss": "0.451", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", 
"num_updates": "22240", "lr": "0.000148269", "gnorm": "7.134", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5272"} 2023-01-29 17:39:37 | INFO | train_inner | {"epoch": 11, "update": 10.294, "s2c_loss": "0.613", "loss": "0.42505", "s2c_nll_loss": "0.613", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "245.6", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "22250", "lr": "0.000148336", "gnorm": "8.288", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5275"} 2023-01-29 17:39:39 | INFO | train_inner | {"epoch": 11, "update": 10.298, "s2c_loss": "0.504", "loss": "0.34949", "s2c_nll_loss": "0.504", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "22260", "lr": "0.000148403", "gnorm": "7.114", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5277"} 2023-01-29 17:39:42 | INFO | train_inner | {"epoch": 11, "update": 10.303, "s2c_loss": "0.614", "loss": "0.42543", "s2c_nll_loss": "0.614", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "22270", "lr": "0.000148469", "gnorm": "7.411", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5280"} 2023-01-29 17:39:44 | INFO | train_inner | {"epoch": 11, "update": 10.308, "s2c_loss": "0.422", "loss": "0.29245", "s2c_nll_loss": "0.422", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "255", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "22280", "lr": "0.000148536", "gnorm": "6.717", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5282"} 2023-01-29 17:39:47 | INFO | train_inner | {"epoch": 11, "update": 10.312, "s2c_loss": "0.413", "loss": "0.28657", "s2c_nll_loss": "0.413", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "250.2", "ups": "3.91", 
"wpb": "64", "bsz": "64", "num_updates": "22290", "lr": "0.000148603", "gnorm": "6.502", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "5285"} 2023-01-29 17:39:49 | INFO | train_inner | {"epoch": 11, "update": 10.317, "s2c_loss": "0.592", "loss": "0.41017", "s2c_nll_loss": "0.592", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "22300", "lr": "0.000148669", "gnorm": "7.651", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "5287"} 2023-01-29 17:39:52 | INFO | train_inner | {"epoch": 11, "update": 10.321, "s2c_loss": "0.475", "loss": "0.32932", "s2c_nll_loss": "0.475", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "22310", "lr": "0.000148736", "gnorm": "7.145", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5290"} 2023-01-29 17:39:54 | INFO | train_inner | {"epoch": 11, "update": 10.326, "s2c_loss": "0.485", "loss": "0.33652", "s2c_nll_loss": "0.485", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "22320", "lr": "0.000148803", "gnorm": "7.207", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5292"} 2023-01-29 17:39:57 | INFO | train_inner | {"epoch": 11, "update": 10.331, "s2c_loss": "0.484", "loss": "0.33546", "s2c_nll_loss": "0.484", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "22330", "lr": "0.000148869", "gnorm": "7.316", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5295"} 2023-01-29 17:40:00 | INFO | train_inner | {"epoch": 11, "update": 10.335, "s2c_loss": "0.494", "loss": "0.34259", "s2c_nll_loss": "0.494", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": 
"245.4", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "22340", "lr": "0.000148936", "gnorm": "8.877", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5298"} 2023-01-29 17:40:02 | INFO | train_inner | {"epoch": 11, "update": 10.34, "s2c_loss": "0.53", "loss": "0.36745", "s2c_nll_loss": "0.53", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "22350", "lr": "0.000149003", "gnorm": "7.992", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "5300"} 2023-01-29 17:40:05 | INFO | train_inner | {"epoch": 11, "update": 10.345, "s2c_loss": "0.526", "loss": "0.36428", "s2c_nll_loss": "0.526", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "22360", "lr": "0.000149069", "gnorm": "7.648", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5303"} 2023-01-29 17:40:07 | INFO | train_inner | {"epoch": 11, "update": 10.349, "s2c_loss": "0.63", "loss": "0.43653", "s2c_nll_loss": "0.63", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "22370", "lr": "0.000149136", "gnorm": "8.861", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5305"} 2023-01-29 17:40:10 | INFO | train_inner | {"epoch": 11, "update": 10.354, "s2c_loss": "0.577", "loss": "0.40029", "s2c_nll_loss": "0.577", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "22380", "lr": "0.000149203", "gnorm": "8.217", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5308"} 2023-01-29 17:40:12 | INFO | train_inner | {"epoch": 11, "update": 10.358, "s2c_loss": "0.658", "loss": "0.45636", "s2c_nll_loss": "0.658", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": 
"56.2", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "22390", "lr": "0.000149269", "gnorm": "8.395", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5310"} 2023-01-29 17:40:15 | INFO | train_inner | {"epoch": 11, "update": 10.363, "s2c_loss": "0.499", "loss": "0.346", "s2c_nll_loss": "0.499", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "22400", "lr": "0.000149336", "gnorm": "7.335", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5313"} 2023-01-29 17:40:17 | INFO | train_inner | {"epoch": 11, "update": 10.368, "s2c_loss": "0.57", "loss": "0.39507", "s2c_nll_loss": "0.57", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "22410", "lr": "0.000149403", "gnorm": "6.733", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5315"} 2023-01-29 17:40:20 | INFO | train_inner | {"epoch": 11, "update": 10.372, "s2c_loss": "0.535", "loss": "0.37106", "s2c_nll_loss": "0.535", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "22420", "lr": "0.000149469", "gnorm": "6.472", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5318"} 2023-01-29 17:40:22 | INFO | train_inner | {"epoch": 11, "update": 10.377, "s2c_loss": "0.46", "loss": "0.31879", "s2c_nll_loss": "0.46", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "242.6", "ups": "3.79", "wpb": "64", "bsz": "64", "num_updates": "22430", "lr": "0.000149536", "gnorm": "6.958", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5320"} 2023-01-29 17:40:25 | INFO | train_inner | {"epoch": 11, "update": 10.382, "s2c_loss": "0.511", "loss": "0.35451", "s2c_nll_loss": "0.511", "s2c_accuracy": "90.938", "s2c_total": 
"64", "s2c_n_correct": "58.2", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "22440", "lr": "0.000149603", "gnorm": "7.58", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5323"} 2023-01-29 17:40:28 | INFO | train_inner | {"epoch": 11, "update": 10.386, "s2c_loss": "0.4", "loss": "0.27692", "s2c_nll_loss": "0.4", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "22450", "lr": "0.000149669", "gnorm": "6.751", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5325"} 2023-01-29 17:40:30 | INFO | train_inner | {"epoch": 11, "update": 10.391, "s2c_loss": "0.782", "loss": "0.54209", "s2c_nll_loss": "0.782", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "22460", "lr": "0.000149736", "gnorm": "8.334", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5328"} 2023-01-29 17:40:33 | INFO | train_inner | {"epoch": 11, "update": 10.395, "s2c_loss": "0.55", "loss": "0.38109", "s2c_nll_loss": "0.55", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "22470", "lr": "0.000149803", "gnorm": "7.508", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5331"} 2023-01-29 17:40:35 | INFO | train_inner | {"epoch": 11, "update": 10.4, "s2c_loss": "0.593", "loss": "0.4111", "s2c_nll_loss": "0.593", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "22480", "lr": "0.000149869", "gnorm": "7.778", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5333"} 2023-01-29 17:40:38 | INFO | train_inner | {"epoch": 11, "update": 10.405, "s2c_loss": "0.62", "loss": "0.42941", "s2c_nll_loss": "0.62", "s2c_accuracy": "89.844", 
"s2c_total": "64", "s2c_n_correct": "57.5", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "22490", "lr": "0.000149936", "gnorm": "8.531", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5336"} 2023-01-29 17:40:40 | INFO | train_inner | {"epoch": 11, "update": 10.409, "s2c_loss": "0.667", "loss": "0.46249", "s2c_nll_loss": "0.667", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "260", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "22500", "lr": "0.000150002", "gnorm": "7.706", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5338"} 2023-01-29 17:40:43 | INFO | train_inner | {"epoch": 11, "update": 10.414, "s2c_loss": "0.626", "loss": "0.43405", "s2c_nll_loss": "0.626", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "22510", "lr": "0.000150069", "gnorm": "9.096", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5341"} 2023-01-29 17:40:45 | INFO | train_inner | {"epoch": 11, "update": 10.419, "s2c_loss": "0.539", "loss": "0.37329", "s2c_nll_loss": "0.539", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "22520", "lr": "0.000150136", "gnorm": "7.547", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5343"} 2023-01-29 17:40:48 | INFO | train_inner | {"epoch": 11, "update": 10.423, "s2c_loss": "0.617", "loss": "0.42793", "s2c_nll_loss": "0.617", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "22530", "lr": "0.000150202", "gnorm": "7.499", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5346"} 2023-01-29 17:40:50 | INFO | train_inner | {"epoch": 11, "update": 10.428, "s2c_loss": "0.445", "loss": "0.30878", "s2c_nll_loss": "0.445", 
"s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "22540", "lr": "0.000150269", "gnorm": "7.119", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5348"} 2023-01-29 17:40:53 | INFO | train_inner | {"epoch": 11, "update": 10.432, "s2c_loss": "0.574", "loss": "0.39782", "s2c_nll_loss": "0.574", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "22550", "lr": "0.000150336", "gnorm": "7.914", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5351"} 2023-01-29 17:40:55 | INFO | train_inner | {"epoch": 11, "update": 10.437, "s2c_loss": "0.555", "loss": "0.38489", "s2c_nll_loss": "0.555", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "22560", "lr": "0.000150402", "gnorm": "7.714", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "5353"} 2023-01-29 17:40:58 | INFO | train_inner | {"epoch": 11, "update": 10.442, "s2c_loss": "0.603", "loss": "0.41763", "s2c_nll_loss": "0.603", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "22570", "lr": "0.000150469", "gnorm": "7.508", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5356"} 2023-01-29 17:41:00 | INFO | train_inner | {"epoch": 11, "update": 10.446, "s2c_loss": "0.477", "loss": "0.33064", "s2c_nll_loss": "0.477", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "22580", "lr": "0.000150536", "gnorm": "7.749", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "5358"} 2023-01-29 17:41:03 | INFO | train_inner | {"epoch": 11, "update": 10.451, "s2c_loss": "0.619", "loss": "0.42884", 
"s2c_nll_loss": "0.619", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "22590", "lr": "0.000150602", "gnorm": "7.086", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5361"} 2023-01-29 17:41:06 | INFO | train_inner | {"epoch": 11, "update": 10.456, "s2c_loss": "0.446", "loss": "0.30913", "s2c_nll_loss": "0.446", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "22600", "lr": "0.000150669", "gnorm": "7.109", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5363"} 2023-01-29 17:41:08 | INFO | train_inner | {"epoch": 11, "update": 10.46, "s2c_loss": "0.579", "loss": "0.40118", "s2c_nll_loss": "0.579", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "22610", "lr": "0.000150736", "gnorm": "7.229", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5366"} 2023-01-29 17:41:11 | INFO | train_inner | {"epoch": 11, "update": 10.465, "s2c_loss": "0.571", "loss": "0.39611", "s2c_nll_loss": "0.571", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "22620", "lr": "0.000150802", "gnorm": "7.854", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5369"} 2023-01-29 17:41:13 | INFO | train_inner | {"epoch": 11, "update": 10.469, "s2c_loss": "0.548", "loss": "0.38014", "s2c_nll_loss": "0.548", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "22630", "lr": "0.000150869", "gnorm": "7.765", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5371"} 2023-01-29 17:41:16 | INFO | train_inner | {"epoch": 11, "update": 10.474, "s2c_loss": 
"0.532", "loss": "0.36845", "s2c_nll_loss": "0.532", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "22640", "lr": "0.000150936", "gnorm": "7.242", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5374"} 2023-01-29 17:41:18 | INFO | train_inner | {"epoch": 11, "update": 10.479, "s2c_loss": "0.498", "loss": "0.34513", "s2c_nll_loss": "0.498", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "22650", "lr": "0.000151002", "gnorm": "6.255", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5376"} 2023-01-29 17:41:21 | INFO | train_inner | {"epoch": 11, "update": 10.483, "s2c_loss": "0.588", "loss": "0.40777", "s2c_nll_loss": "0.588", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "22660", "lr": "0.000151069", "gnorm": "7.011", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5379"} 2023-01-29 17:41:23 | INFO | train_inner | {"epoch": 11, "update": 10.488, "s2c_loss": "0.455", "loss": "0.31535", "s2c_nll_loss": "0.455", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "22670", "lr": "0.000151136", "gnorm": "6.415", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5381"} 2023-01-29 17:41:26 | INFO | train_inner | {"epoch": 11, "update": 10.493, "s2c_loss": "0.49", "loss": "0.33933", "s2c_nll_loss": "0.49", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "22680", "lr": "0.000151202", "gnorm": "6.503", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5384"} 2023-01-29 17:41:29 | INFO | train_inner | {"epoch": 11, "update": 
10.497, "s2c_loss": "0.488", "loss": "0.33856", "s2c_nll_loss": "0.488", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "22690", "lr": "0.000151269", "gnorm": "6.894", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5387"} 2023-01-29 17:41:31 | INFO | train_inner | {"epoch": 11, "update": 10.502, "s2c_loss": "0.637", "loss": "0.44187", "s2c_nll_loss": "0.637", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "22700", "lr": "0.000151336", "gnorm": "6.683", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5389"} 2023-01-29 17:41:34 | INFO | train_inner | {"epoch": 11, "update": 10.506, "s2c_loss": "0.521", "loss": "0.36147", "s2c_nll_loss": "0.521", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "22710", "lr": "0.000151402", "gnorm": "7.326", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5392"} 2023-01-29 17:41:36 | INFO | train_inner | {"epoch": 11, "update": 10.511, "s2c_loss": "0.59", "loss": "0.40927", "s2c_nll_loss": "0.59", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "22720", "lr": "0.000151469", "gnorm": "7.478", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5394"} 2023-01-29 17:41:39 | INFO | train_inner | {"epoch": 11, "update": 10.516, "s2c_loss": "0.716", "loss": "0.49644", "s2c_nll_loss": "0.716", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "22730", "lr": "0.000151536", "gnorm": "8.527", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5397"} 2023-01-29 17:41:41 | INFO | train_inner | 
{"epoch": 11, "update": 10.52, "s2c_loss": "0.614", "loss": "0.42548", "s2c_nll_loss": "0.614", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "22740", "lr": "0.000151602", "gnorm": "8.987", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5399"} 2023-01-29 17:41:44 | INFO | train_inner | {"epoch": 11, "update": 10.525, "s2c_loss": "0.634", "loss": "0.43924", "s2c_nll_loss": "0.634", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "22750", "lr": "0.000151669", "gnorm": "8.222", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5402"} 2023-01-29 17:41:46 | INFO | train_inner | {"epoch": 11, "update": 10.53, "s2c_loss": "0.577", "loss": "0.39979", "s2c_nll_loss": "0.577", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "22760", "lr": "0.000151736", "gnorm": "7.79", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5404"} 2023-01-29 17:41:49 | INFO | train_inner | {"epoch": 11, "update": 10.534, "s2c_loss": "0.575", "loss": "0.3987", "s2c_nll_loss": "0.575", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "22770", "lr": "0.000151802", "gnorm": "7.181", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5407"} 2023-01-29 17:41:51 | INFO | train_inner | {"epoch": 11, "update": 10.539, "s2c_loss": "0.53", "loss": "0.36759", "s2c_nll_loss": "0.53", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "22780", "lr": "0.000151869", "gnorm": "7.528", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5409"} 2023-01-29 17:41:54 | INFO | 
train_inner | {"epoch": 11, "update": 10.543, "s2c_loss": "0.56", "loss": "0.38805", "s2c_nll_loss": "0.56", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "22790", "lr": "0.000151936", "gnorm": "7.811", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5412"} 2023-01-29 17:41:56 | INFO | train_inner | {"epoch": 11, "update": 10.548, "s2c_loss": "0.538", "loss": "0.37313", "s2c_nll_loss": "0.538", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "22800", "lr": "0.000152002", "gnorm": "7.13", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5414"} 2023-01-29 17:41:59 | INFO | train_inner | {"epoch": 11, "update": 10.553, "s2c_loss": "0.573", "loss": "0.39716", "s2c_nll_loss": "0.573", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "22810", "lr": "0.000152069", "gnorm": "8.548", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "5417"} 2023-01-29 17:42:02 | INFO | train_inner | {"epoch": 11, "update": 10.557, "s2c_loss": "0.484", "loss": "0.33543", "s2c_nll_loss": "0.484", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "22820", "lr": "0.000152136", "gnorm": "7.476", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5419"} 2023-01-29 17:42:04 | INFO | train_inner | {"epoch": 11, "update": 10.562, "s2c_loss": "0.458", "loss": "0.31767", "s2c_nll_loss": "0.458", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "22830", "lr": "0.000152202", "gnorm": "7.234", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5422"} 
2023-01-29 17:42:07 | INFO | train_inner | {"epoch": 11, "update": 10.567, "s2c_loss": "0.57", "loss": "0.39531", "s2c_nll_loss": "0.57", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "22840", "lr": "0.000152269", "gnorm": "8.405", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5425"} 2023-01-29 17:42:09 | INFO | train_inner | {"epoch": 11, "update": 10.571, "s2c_loss": "0.605", "loss": "0.41923", "s2c_nll_loss": "0.605", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "22850", "lr": "0.000152336", "gnorm": "7.331", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5427"} 2023-01-29 17:42:12 | INFO | train_inner | {"epoch": 11, "update": 10.576, "s2c_loss": "0.523", "loss": "0.36247", "s2c_nll_loss": "0.523", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "22860", "lr": "0.000152402", "gnorm": "7.128", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5430"} 2023-01-29 17:42:14 | INFO | train_inner | {"epoch": 11, "update": 10.58, "s2c_loss": "0.529", "loss": "0.36696", "s2c_nll_loss": "0.529", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "22870", "lr": "0.000152469", "gnorm": "6.924", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5432"} 2023-01-29 17:42:17 | INFO | train_inner | {"epoch": 11, "update": 10.585, "s2c_loss": "0.543", "loss": "0.37665", "s2c_nll_loss": "0.543", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "22880", "lr": "0.000152536", "gnorm": "7.638", "loss_scale": "2048", "train_wall": "3", "gb_free": 
"7.2", "wall": "5435"} 2023-01-29 17:42:19 | INFO | train_inner | {"epoch": 11, "update": 10.59, "s2c_loss": "0.523", "loss": "0.36225", "s2c_nll_loss": "0.523", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "22890", "lr": "0.000152602", "gnorm": "7.436", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5437"} 2023-01-29 17:42:22 | INFO | train_inner | {"epoch": 11, "update": 10.594, "s2c_loss": "0.457", "loss": "0.317", "s2c_nll_loss": "0.457", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "22900", "lr": "0.000152669", "gnorm": "6.754", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5440"} 2023-01-29 17:42:25 | INFO | train_inner | {"epoch": 11, "update": 10.599, "s2c_loss": "0.559", "loss": "0.38721", "s2c_nll_loss": "0.559", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "22910", "lr": "0.000152736", "gnorm": "7.851", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "5442"} 2023-01-29 17:42:27 | INFO | train_inner | {"epoch": 11, "update": 10.604, "s2c_loss": "0.596", "loss": "0.4131", "s2c_nll_loss": "0.596", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "22920", "lr": "0.000152802", "gnorm": "8.024", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5445"} 2023-01-29 17:42:30 | INFO | train_inner | {"epoch": 11, "update": 10.608, "s2c_loss": "0.507", "loss": "0.35129", "s2c_nll_loss": "0.507", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "22930", "lr": "0.000152869", "gnorm": "8.738", "loss_scale": "2048", "train_wall": 
"3", "gb_free": "7.3", "wall": "5447"} 2023-01-29 17:42:32 | INFO | train_inner | {"epoch": 11, "update": 10.613, "s2c_loss": "0.692", "loss": "0.47953", "s2c_nll_loss": "0.692", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "22940", "lr": "0.000152936", "gnorm": "7.47", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5450"} 2023-01-29 17:42:35 | INFO | train_inner | {"epoch": 11, "update": 10.617, "s2c_loss": "0.6", "loss": "0.41619", "s2c_nll_loss": "0.6", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "22950", "lr": "0.000153002", "gnorm": "8.569", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5453"} 2023-01-29 17:42:37 | INFO | train_inner | {"epoch": 11, "update": 10.622, "s2c_loss": "0.522", "loss": "0.36189", "s2c_nll_loss": "0.522", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "22960", "lr": "0.000153069", "gnorm": "7.659", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5455"} 2023-01-29 17:42:40 | INFO | train_inner | {"epoch": 11, "update": 10.627, "s2c_loss": "0.562", "loss": "0.38926", "s2c_nll_loss": "0.562", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "22970", "lr": "0.000153136", "gnorm": "7.934", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5458"} 2023-01-29 17:42:42 | INFO | train_inner | {"epoch": 11, "update": 10.631, "s2c_loss": "0.584", "loss": "0.40447", "s2c_nll_loss": "0.584", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "22980", "lr": "0.000153202", "gnorm": "8.112", "loss_scale": "2048", 
"train_wall": "2", "gb_free": "7.2", "wall": "5460"} 2023-01-29 17:42:45 | INFO | train_inner | {"epoch": 11, "update": 10.636, "s2c_loss": "0.59", "loss": "0.4092", "s2c_nll_loss": "0.59", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "22990", "lr": "0.000153269", "gnorm": "7.174", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5463"} 2023-01-29 17:42:47 | INFO | train_inner | {"epoch": 11, "update": 10.641, "s2c_loss": "0.524", "loss": "0.36292", "s2c_nll_loss": "0.524", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "23000", "lr": "0.000153336", "gnorm": "8.222", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5465"} 2023-01-29 17:42:50 | INFO | train_inner | {"epoch": 11, "update": 10.645, "s2c_loss": "0.771", "loss": "0.5342", "s2c_nll_loss": "0.771", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "23010", "lr": "0.000153402", "gnorm": "9.179", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5468"} 2023-01-29 17:42:52 | INFO | train_inner | {"epoch": 11, "update": 10.65, "s2c_loss": "0.453", "loss": "0.31389", "s2c_nll_loss": "0.453", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "23020", "lr": "0.000153469", "gnorm": "7.876", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5470"} 2023-01-29 17:42:55 | INFO | train_inner | {"epoch": 11, "update": 10.654, "s2c_loss": "0.503", "loss": "0.34862", "s2c_nll_loss": "0.503", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "23030", "lr": "0.000153536", "gnorm": "7.813", 
"loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5473"} 2023-01-29 17:42:58 | INFO | train_inner | {"epoch": 11, "update": 10.659, "s2c_loss": "0.636", "loss": "0.44108", "s2c_nll_loss": "0.636", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "23040", "lr": "0.000153602", "gnorm": "8.061", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "5475"} 2023-01-29 17:43:00 | INFO | train_inner | {"epoch": 11, "update": 10.664, "s2c_loss": "0.701", "loss": "0.48558", "s2c_nll_loss": "0.701", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "23050", "lr": "0.000153669", "gnorm": "9.299", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5478"} 2023-01-29 17:43:03 | INFO | train_inner | {"epoch": 11, "update": 10.668, "s2c_loss": "0.606", "loss": "0.42014", "s2c_nll_loss": "0.606", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "23060", "lr": "0.000153736", "gnorm": "7.525", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "5480"} 2023-01-29 17:43:05 | INFO | train_inner | {"epoch": 11, "update": 10.673, "s2c_loss": "0.573", "loss": "0.39749", "s2c_nll_loss": "0.573", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "23070", "lr": "0.000153802", "gnorm": "7.768", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5483"} 2023-01-29 17:43:08 | INFO | train_inner | {"epoch": 11, "update": 10.678, "s2c_loss": "0.659", "loss": "0.45683", "s2c_nll_loss": "0.659", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "23080", "lr": 
"0.000153869", "gnorm": "7.28", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5486"} 2023-01-29 17:43:10 | INFO | train_inner | {"epoch": 11, "update": 10.682, "s2c_loss": "0.471", "loss": "0.32649", "s2c_nll_loss": "0.471", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "23090", "lr": "0.000153936", "gnorm": "7.065", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "5488"} 2023-01-29 17:43:13 | INFO | train_inner | {"epoch": 11, "update": 10.687, "s2c_loss": "0.703", "loss": "0.48722", "s2c_nll_loss": "0.703", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "244", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "23100", "lr": "0.000154002", "gnorm": "6.631", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5491"} 2023-01-29 17:43:15 | INFO | train_inner | {"epoch": 11, "update": 10.691, "s2c_loss": "0.577", "loss": "0.40004", "s2c_nll_loss": "0.577", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "242", "ups": "3.78", "wpb": "64", "bsz": "64", "num_updates": "23110", "lr": "0.000154069", "gnorm": "7.228", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "5493"} 2023-01-29 17:43:18 | INFO | train_inner | {"epoch": 11, "update": 10.696, "s2c_loss": "0.516", "loss": "0.35734", "s2c_nll_loss": "0.516", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "243.8", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "23120", "lr": "0.000154136", "gnorm": "6.464", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5496"} 2023-01-29 17:43:21 | INFO | train_inner | {"epoch": 11, "update": 10.701, "s2c_loss": "0.569", "loss": "0.39449", "s2c_nll_loss": "0.569", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", 
"num_updates": "23130", "lr": "0.000154202", "gnorm": "7.291", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "5499"} 2023-01-29 17:43:23 | INFO | train_inner | {"epoch": 11, "update": 10.705, "s2c_loss": "0.58", "loss": "0.40193", "s2c_nll_loss": "0.58", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "23140", "lr": "0.000154269", "gnorm": "7.018", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5501"} 2023-01-29 17:43:26 | INFO | train_inner | {"epoch": 11, "update": 10.71, "s2c_loss": "0.447", "loss": "0.31001", "s2c_nll_loss": "0.447", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "23150", "lr": "0.000154336", "gnorm": "6.764", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5504"} 2023-01-29 17:43:28 | INFO | train_inner | {"epoch": 11, "update": 10.715, "s2c_loss": "0.623", "loss": "0.43172", "s2c_nll_loss": "0.623", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "23160", "lr": "0.000154402", "gnorm": "6.987", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "5506"} 2023-01-29 17:43:31 | INFO | train_inner | {"epoch": 11, "update": 10.719, "s2c_loss": "0.456", "loss": "0.31576", "s2c_nll_loss": "0.456", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "23170", "lr": "0.000154469", "gnorm": "7.125", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5509"} 2023-01-29 17:43:33 | INFO | train_inner | {"epoch": 11, "update": 10.724, "s2c_loss": "0.481", "loss": "0.33324", "s2c_nll_loss": "0.481", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "253.1", "ups": "3.95", 
"wpb": "64", "bsz": "64", "num_updates": "23180", "lr": "0.000154536", "gnorm": "7.139", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5511"} 2023-01-29 17:43:36 | INFO | train_inner | {"epoch": 11, "update": 10.728, "s2c_loss": "0.497", "loss": "0.34463", "s2c_nll_loss": "0.497", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "23190", "lr": "0.000154602", "gnorm": "7.116", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5514"} 2023-01-29 17:43:38 | INFO | train_inner | {"epoch": 11, "update": 10.733, "s2c_loss": "0.446", "loss": "0.30905", "s2c_nll_loss": "0.446", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "247.5", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "23200", "lr": "0.000154669", "gnorm": "6.833", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5516"} 2023-01-29 17:43:41 | INFO | train_inner | {"epoch": 11, "update": 10.738, "s2c_loss": "0.53", "loss": "0.36726", "s2c_nll_loss": "0.53", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "23210", "lr": "0.000154736", "gnorm": "7.777", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5519"} 2023-01-29 17:43:44 | INFO | train_inner | {"epoch": 11, "update": 10.742, "s2c_loss": "0.548", "loss": "0.37964", "s2c_nll_loss": "0.548", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "23220", "lr": "0.000154802", "gnorm": "7.058", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5521"} 2023-01-29 17:43:46 | INFO | train_inner | {"epoch": 11, "update": 10.747, "s2c_loss": "0.431", "loss": "0.29846", "s2c_nll_loss": "0.431", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "255", 
"ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "23230", "lr": "0.000154869", "gnorm": "6.395", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5524"} 2023-01-29 17:43:49 | INFO | train_inner | {"epoch": 11, "update": 10.752, "s2c_loss": "0.669", "loss": "0.46392", "s2c_nll_loss": "0.669", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "23240", "lr": "0.000154936", "gnorm": "7.538", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5527"} 2023-01-29 17:43:51 | INFO | train_inner | {"epoch": 11, "update": 10.756, "s2c_loss": "0.522", "loss": "0.36196", "s2c_nll_loss": "0.522", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "23250", "lr": "0.000155002", "gnorm": "7.504", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5529"} 2023-01-29 17:43:54 | INFO | train_inner | {"epoch": 11, "update": 10.761, "s2c_loss": "0.435", "loss": "0.30157", "s2c_nll_loss": "0.435", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "23260", "lr": "0.000155069", "gnorm": "6.23", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5532"} 2023-01-29 17:43:56 | INFO | train_inner | {"epoch": 11, "update": 10.765, "s2c_loss": "0.519", "loss": "0.35982", "s2c_nll_loss": "0.519", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "23270", "lr": "0.000155136", "gnorm": "7.16", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5534"} 2023-01-29 17:43:59 | INFO | train_inner | {"epoch": 11, "update": 10.77, "s2c_loss": "0.542", "loss": "0.37591", "s2c_nll_loss": "0.542", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": 
"58", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "23280", "lr": "0.000155202", "gnorm": "7.031", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5537"} 2023-01-29 17:44:01 | INFO | train_inner | {"epoch": 11, "update": 10.775, "s2c_loss": "0.5", "loss": "0.34677", "s2c_nll_loss": "0.5", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "23290", "lr": "0.000155269", "gnorm": "6.91", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5539"} 2023-01-29 17:44:04 | INFO | train_inner | {"epoch": 11, "update": 10.779, "s2c_loss": "0.48", "loss": "0.33279", "s2c_nll_loss": "0.48", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "23300", "lr": "0.000155336", "gnorm": "7.022", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5542"} 2023-01-29 17:44:06 | INFO | train_inner | {"epoch": 11, "update": 10.784, "s2c_loss": "0.635", "loss": "0.44018", "s2c_nll_loss": "0.635", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "23310", "lr": "0.000155402", "gnorm": "7.836", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5544"} 2023-01-29 17:44:09 | INFO | train_inner | {"epoch": 11, "update": 10.789, "s2c_loss": "0.539", "loss": "0.37394", "s2c_nll_loss": "0.539", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "243.8", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "23320", "lr": "0.000155469", "gnorm": "7.432", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5547"} 2023-01-29 17:44:12 | INFO | train_inner | {"epoch": 11, "update": 10.793, "s2c_loss": "0.567", "loss": "0.39279", "s2c_nll_loss": "0.567", "s2c_accuracy": "89.531", "s2c_total": "64", 
"s2c_n_correct": "57.3", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "23330", "lr": "0.000155536", "gnorm": "7.17", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5550"} 2023-01-29 17:44:14 | INFO | train_inner | {"epoch": 11, "update": 10.798, "s2c_loss": "0.586", "loss": "0.40646", "s2c_nll_loss": "0.586", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "23340", "lr": "0.000155602", "gnorm": "7.77", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5552"} 2023-01-29 17:44:17 | INFO | train_inner | {"epoch": 11, "update": 10.802, "s2c_loss": "0.475", "loss": "0.32956", "s2c_nll_loss": "0.475", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "23350", "lr": "0.000155669", "gnorm": "6.66", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5555"} 2023-01-29 17:44:19 | INFO | train_inner | {"epoch": 11, "update": 10.807, "s2c_loss": "0.533", "loss": "0.36934", "s2c_nll_loss": "0.533", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "23360", "lr": "0.000155736", "gnorm": "6.92", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5557"} 2023-01-29 17:44:22 | INFO | train_inner | {"epoch": 11, "update": 10.812, "s2c_loss": "0.508", "loss": "0.35229", "s2c_nll_loss": "0.508", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "23370", "lr": "0.000155802", "gnorm": "6.366", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5560"} 2023-01-29 17:44:24 | INFO | train_inner | {"epoch": 11, "update": 10.816, "s2c_loss": "0.523", "loss": "0.36248", "s2c_nll_loss": "0.523", "s2c_accuracy": "89.219", 
"s2c_total": "64", "s2c_n_correct": "57.1", "wps": "246.7", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "23380", "lr": "0.000155869", "gnorm": "6.993", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5562"} 2023-01-29 17:44:27 | INFO | train_inner | {"epoch": 11, "update": 10.821, "s2c_loss": "0.524", "loss": "0.36321", "s2c_nll_loss": "0.524", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "23390", "lr": "0.000155936", "gnorm": "7.627", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5565"} 2023-01-29 17:44:30 | INFO | train_inner | {"epoch": 11, "update": 10.826, "s2c_loss": "0.573", "loss": "0.39732", "s2c_nll_loss": "0.573", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "23400", "lr": "0.000156002", "gnorm": "6.622", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5567"} 2023-01-29 17:44:32 | INFO | train_inner | {"epoch": 11, "update": 10.83, "s2c_loss": "0.581", "loss": "0.40238", "s2c_nll_loss": "0.581", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "23410", "lr": "0.000156069", "gnorm": "7.485", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5570"} 2023-01-29 17:44:35 | INFO | train_inner | {"epoch": 11, "update": 10.835, "s2c_loss": "0.542", "loss": "0.37546", "s2c_nll_loss": "0.542", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "23420", "lr": "0.000156136", "gnorm": "7.163", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5573"} 2023-01-29 17:44:37 | INFO | train_inner | {"epoch": 11, "update": 10.84, "s2c_loss": "0.474", "loss": "0.32863", "s2c_nll_loss": "0.474", 
"s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "23430", "lr": "0.000156202", "gnorm": "6.857", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5575"} 2023-01-29 17:44:40 | INFO | train_inner | {"epoch": 11, "update": 10.844, "s2c_loss": "0.698", "loss": "0.48408", "s2c_nll_loss": "0.698", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "23440", "lr": "0.000156269", "gnorm": "6.858", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5578"} 2023-01-29 17:44:42 | INFO | train_inner | {"epoch": 11, "update": 10.849, "s2c_loss": "0.563", "loss": "0.39032", "s2c_nll_loss": "0.563", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "23450", "lr": "0.000156336", "gnorm": "7.412", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5580"} 2023-01-29 17:44:45 | INFO | train_inner | {"epoch": 11, "update": 10.853, "s2c_loss": "0.539", "loss": "0.37387", "s2c_nll_loss": "0.539", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "23460", "lr": "0.000156402", "gnorm": "7.453", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5583"} 2023-01-29 17:44:47 | INFO | train_inner | {"epoch": 11, "update": 10.858, "s2c_loss": "0.46", "loss": "0.3191", "s2c_nll_loss": "0.46", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "244.5", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "23470", "lr": "0.000156469", "gnorm": "6.648", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5585"} 2023-01-29 17:44:50 | INFO | train_inner | {"epoch": 11, "update": 10.863, "s2c_loss": "0.498", "loss": "0.34525", 
"s2c_nll_loss": "0.498", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "23480", "lr": "0.000156536", "gnorm": "7.105", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5588"} 2023-01-29 17:44:53 | INFO | train_inner | {"epoch": 11, "update": 10.867, "s2c_loss": "0.529", "loss": "0.36666", "s2c_nll_loss": "0.529", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "23490", "lr": "0.000156602", "gnorm": "7.86", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5591"} 2023-01-29 17:44:55 | INFO | train_inner | {"epoch": 11, "update": 10.872, "s2c_loss": "0.626", "loss": "0.43373", "s2c_nll_loss": "0.626", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "261.4", "ups": "4.08", "wpb": "64", "bsz": "64", "num_updates": "23500", "lr": "0.000156669", "gnorm": "7.845", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5593"} 2023-01-29 17:44:58 | INFO | train_inner | {"epoch": 11, "update": 10.877, "s2c_loss": "0.571", "loss": "0.39583", "s2c_nll_loss": "0.571", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "23510", "lr": "0.000156735", "gnorm": "7.996", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5596"} 2023-01-29 17:45:00 | INFO | train_inner | {"epoch": 11, "update": 10.881, "s2c_loss": "0.605", "loss": "0.41945", "s2c_nll_loss": "0.605", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "23520", "lr": "0.000156802", "gnorm": "7.573", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5598"} 2023-01-29 17:45:03 | INFO | train_inner | {"epoch": 11, "update": 10.886, "s2c_loss": "0.563", 
"loss": "0.38994", "s2c_nll_loss": "0.563", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "23530", "lr": "0.000156869", "gnorm": "7.322", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5601"} 2023-01-29 17:45:05 | INFO | train_inner | {"epoch": 11, "update": 10.89, "s2c_loss": "0.656", "loss": "0.455", "s2c_nll_loss": "0.656", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "23540", "lr": "0.000156935", "gnorm": "7.639", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5603"} 2023-01-29 17:45:08 | INFO | train_inner | {"epoch": 11, "update": 10.895, "s2c_loss": "0.631", "loss": "0.43758", "s2c_nll_loss": "0.631", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "23550", "lr": "0.000157002", "gnorm": "7.4", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5606"} 2023-01-29 17:45:10 | INFO | train_inner | {"epoch": 11, "update": 10.9, "s2c_loss": "0.626", "loss": "0.43363", "s2c_nll_loss": "0.626", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "23560", "lr": "0.000157069", "gnorm": "7.362", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5608"} 2023-01-29 17:45:13 | INFO | train_inner | {"epoch": 11, "update": 10.904, "s2c_loss": "0.493", "loss": "0.34173", "s2c_nll_loss": "0.493", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "23570", "lr": "0.000157135", "gnorm": "7.223", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "5611"} 2023-01-29 17:45:15 | INFO | train_inner | {"epoch": 11, "update": 10.909, 
"s2c_loss": "0.625", "loss": "0.43337", "s2c_nll_loss": "0.625", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "23580", "lr": "0.000157202", "gnorm": "7.672", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5613"} 2023-01-29 17:45:18 | INFO | train_inner | {"epoch": 11, "update": 10.914, "s2c_loss": "0.709", "loss": "0.49111", "s2c_nll_loss": "0.709", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "23590", "lr": "0.000157269", "gnorm": "7.556", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5616"} 2023-01-29 17:45:20 | INFO | train_inner | {"epoch": 11, "update": 10.918, "s2c_loss": "0.643", "loss": "0.44545", "s2c_nll_loss": "0.643", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "23600", "lr": "0.000157335", "gnorm": "7.105", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5618"} 2023-01-29 17:45:23 | INFO | train_inner | {"epoch": 11, "update": 10.923, "s2c_loss": "0.627", "loss": "0.43473", "s2c_nll_loss": "0.627", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "23610", "lr": "0.000157402", "gnorm": "8.613", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5621"} 2023-01-29 17:45:25 | INFO | train_inner | {"epoch": 11, "update": 10.927, "s2c_loss": "0.702", "loss": "0.48511", "s2c_nll_loss": "0.702", "s2c_accuracy": "87.912", "s2c_total": "63.7", "s2c_n_correct": "56", "wps": "255.6", "ups": "4.01", "wpb": "63.7", "bsz": "63.7", "num_updates": "23620", "lr": "0.000157469", "gnorm": "8.707", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5623"} 2023-01-29 17:45:28 | INFO | train_inner | 
{"epoch": 11, "update": 10.932, "s2c_loss": "0.636", "loss": "0.44112", "s2c_nll_loss": "0.636", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "247.5", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "23630", "lr": "0.000157535", "gnorm": "8.858", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5626"} 2023-01-29 17:45:31 | INFO | train_inner | {"epoch": 11, "update": 10.937, "s2c_loss": "0.527", "loss": "0.36558", "s2c_nll_loss": "0.527", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "23640", "lr": "0.000157602", "gnorm": "7.956", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5628"} 2023-01-29 17:45:33 | INFO | train_inner | {"epoch": 11, "update": 10.941, "s2c_loss": "0.515", "loss": "0.35677", "s2c_nll_loss": "0.515", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "23650", "lr": "0.000157669", "gnorm": "7.905", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5631"} 2023-01-29 17:45:36 | INFO | train_inner | {"epoch": 11, "update": 10.946, "s2c_loss": "0.656", "loss": "0.45438", "s2c_nll_loss": "0.656", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "23660", "lr": "0.000157735", "gnorm": "7.409", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "5634"} 2023-01-29 17:45:38 | INFO | train_inner | {"epoch": 11, "update": 10.951, "s2c_loss": "0.441", "loss": "0.30574", "s2c_nll_loss": "0.441", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "23670", "lr": "0.000157802", "gnorm": "6.64", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5636"} 2023-01-29 17:45:41 | 
INFO | train_inner | {"epoch": 11, "update": 10.955, "s2c_loss": "0.845", "loss": "0.58546", "s2c_nll_loss": "0.845", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "23680", "lr": "0.000157869", "gnorm": "7.059", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5639"} 2023-01-29 17:45:43 | INFO | train_inner | {"epoch": 11, "update": 10.96, "s2c_loss": "0.504", "loss": "0.34914", "s2c_nll_loss": "0.504", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "23690", "lr": "0.000157935", "gnorm": "7.13", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5641"} 2023-01-29 17:45:46 | INFO | train_inner | {"epoch": 11, "update": 10.964, "s2c_loss": "0.593", "loss": "0.41115", "s2c_nll_loss": "0.593", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "23700", "lr": "0.000158002", "gnorm": "6.958", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5644"} 2023-01-29 17:45:48 | INFO | train_inner | {"epoch": 11, "update": 10.969, "s2c_loss": "0.484", "loss": "0.33533", "s2c_nll_loss": "0.484", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "247.9", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "23710", "lr": "0.000158069", "gnorm": "6.754", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5646"} 2023-01-29 17:45:51 | INFO | train_inner | {"epoch": 11, "update": 10.974, "s2c_loss": "0.631", "loss": "0.43717", "s2c_nll_loss": "0.631", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "23720", "lr": "0.000158135", "gnorm": "7.905", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5649"} 
2023-01-29 17:45:54 | INFO | train_inner | {"epoch": 11, "update": 10.978, "s2c_loss": "0.661", "loss": "0.45805", "s2c_nll_loss": "0.661", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "23730", "lr": "0.000158202", "gnorm": "7.314", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5651"} 2023-01-29 17:45:56 | INFO | train_inner | {"epoch": 11, "update": 10.983, "s2c_loss": "0.626", "loss": "0.43394", "s2c_nll_loss": "0.626", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "23740", "lr": "0.000158269", "gnorm": "8.627", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5654"} 2023-01-29 17:45:59 | INFO | train_inner | {"epoch": 11, "update": 10.988, "s2c_loss": "0.51", "loss": "0.35369", "s2c_nll_loss": "0.51", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "23750", "lr": "0.000158335", "gnorm": "6.936", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5656"} 2023-01-29 17:46:01 | INFO | train_inner | {"epoch": 11, "update": 10.992, "s2c_loss": "0.63", "loss": "0.43668", "s2c_nll_loss": "0.63", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "23760", "lr": "0.000158402", "gnorm": "7.22", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5659"} 2023-01-29 17:46:04 | INFO | train_inner | {"epoch": 11, "update": 10.997, "s2c_loss": "0.493", "loss": "0.34175", "s2c_nll_loss": "0.493", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "23770", "lr": "0.000158469", "gnorm": "6.521", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", 
"wall": "5662"}
2023-01-29 17:46:05 | INFO | fairseq_cli.train | begin validation on "valid" subset
2023-01-29 17:46:20 | INFO | valid | {"epoch": 11, "valid_s2c_loss": "1.076", "valid_loss": "0.74605", "valid_s2c_nll_loss": "1.076", "valid_s2c_accuracy": "81.037", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "25.8981", "valid_num_updates": "23777", "valid_best_s2c_accuracy": "81.037"}
2023-01-29 17:46:20 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 11 @ 23777 updates
2023-01-29 17:46:20 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt
2023-01-29 17:46:27 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt
2023-01-29 17:46:31 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt (epoch 11 @ 23777 updates, score 81.037) (writing took 11.540756722912192 seconds)
2023-01-29 17:46:31 | INFO | fairseq_cli.train | end of epoch 11 (average epoch stats below)
2023-01-29 17:46:31 | INFO | train | {"epoch": 11, "train_s2c_loss": "0.545", "train_loss": "0.37758", "train_s2c_nll_loss": "0.545", "train_s2c_accuracy": "90.33", "train_s2c_total": "63.9838", "train_s2c_n_correct": "57.7965", "train_wps": "237.3", "train_ups": "3.71", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "23777", "train_lr": "0.000158515", "train_gnorm": "7.447", "train_loss_scale": "2048", "train_train_wall": "543", "train_gb_free": "7.5", "train_wall": "5689"}
2023-01-29 17:46:38 | INFO | fairseq.trainer | begin training epoch 12
2023-01-29 17:46:38 | INFO | fairseq_cli.train | Start iterating over samples
2023-01-29 17:46:39 | INFO | train_inner | {"epoch": 12, "update": 11.001, "s2c_loss": "0.491", "loss": "0.34037", "s2c_nll_loss": "0.491", "s2c_accuracy": "91.776", "s2c_total": "60.8", "s2c_n_correct": "55.8", "wps": "17.3", "ups": "0.29", "wpb": 
"60.8", "bsz": "60.8", "num_updates": "23780", "lr": "0.000158535", "gnorm": "7.057", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5697"} 2023-01-29 17:46:41 | INFO | train_inner | {"epoch": 12, "update": 11.006, "s2c_loss": "0.48", "loss": "0.33281", "s2c_nll_loss": "0.48", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "23790", "lr": "0.000158602", "gnorm": "7.52", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5699"} 2023-01-29 17:46:44 | INFO | train_inner | {"epoch": 12, "update": 11.011, "s2c_loss": "0.465", "loss": "0.32206", "s2c_nll_loss": "0.465", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "246", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "23800", "lr": "0.000158669", "gnorm": "6.239", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5702"} 2023-01-29 17:46:46 | INFO | train_inner | {"epoch": 12, "update": 11.015, "s2c_loss": "0.412", "loss": "0.28584", "s2c_nll_loss": "0.412", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "23810", "lr": "0.000158735", "gnorm": "6.09", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "5704"} 2023-01-29 17:46:49 | INFO | train_inner | {"epoch": 12, "update": 11.02, "s2c_loss": "0.397", "loss": "0.27521", "s2c_nll_loss": "0.397", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "23820", "lr": "0.000158802", "gnorm": "5.889", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5707"} 2023-01-29 17:46:52 | INFO | train_inner | {"epoch": 12, "update": 11.025, "s2c_loss": "0.468", "loss": "0.32445", "s2c_nll_loss": "0.468", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "248.3", 
"ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "23830", "lr": "0.000158869", "gnorm": "6.312", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5710"} 2023-01-29 17:46:54 | INFO | train_inner | {"epoch": 12, "update": 11.029, "s2c_loss": "0.406", "loss": "0.28149", "s2c_nll_loss": "0.406", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "247.4", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "23840", "lr": "0.000158935", "gnorm": "6.551", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5712"} 2023-01-29 17:46:57 | INFO | train_inner | {"epoch": 12, "update": 11.034, "s2c_loss": "0.471", "loss": "0.32676", "s2c_nll_loss": "0.471", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "23850", "lr": "0.000159002", "gnorm": "6.942", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5715"} 2023-01-29 17:46:59 | INFO | train_inner | {"epoch": 12, "update": 11.038, "s2c_loss": "0.385", "loss": "0.26662", "s2c_nll_loss": "0.385", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "23860", "lr": "0.000159069", "gnorm": "6.461", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5717"} 2023-01-29 17:47:02 | INFO | train_inner | {"epoch": 12, "update": 11.043, "s2c_loss": "0.435", "loss": "0.3013", "s2c_nll_loss": "0.435", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "23870", "lr": "0.000159135", "gnorm": "7.37", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5720"} 2023-01-29 17:47:04 | INFO | train_inner | {"epoch": 12, "update": 11.048, "s2c_loss": "0.509", "loss": "0.35309", "s2c_nll_loss": "0.509", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": 
"57.5", "wps": "243.4", "ups": "3.8", "wpb": "64", "bsz": "64", "num_updates": "23880", "lr": "0.000159202", "gnorm": "7.287", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5722"} 2023-01-29 17:47:07 | INFO | train_inner | {"epoch": 12, "update": 11.052, "s2c_loss": "0.37", "loss": "0.25649", "s2c_nll_loss": "0.37", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "23890", "lr": "0.000159269", "gnorm": "5.804", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "5725"} 2023-01-29 17:47:09 | INFO | train_inner | {"epoch": 12, "update": 11.057, "s2c_loss": "0.613", "loss": "0.42481", "s2c_nll_loss": "0.613", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "256.3", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "23900", "lr": "0.000159335", "gnorm": "7.34", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5727"} 2023-01-29 17:47:12 | INFO | train_inner | {"epoch": 12, "update": 11.062, "s2c_loss": "0.579", "loss": "0.40165", "s2c_nll_loss": "0.579", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "246.8", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "23910", "lr": "0.000159402", "gnorm": "7.392", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5730"} 2023-01-29 17:47:15 | INFO | train_inner | {"epoch": 12, "update": 11.066, "s2c_loss": "0.492", "loss": "0.34076", "s2c_nll_loss": "0.492", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "23920", "lr": "0.000159469", "gnorm": "7.691", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5732"} 2023-01-29 17:47:17 | INFO | train_inner | {"epoch": 12, "update": 11.071, "s2c_loss": "0.629", "loss": "0.43601", "s2c_nll_loss": "0.629", "s2c_accuracy": "89.688", "s2c_total": "64", 
"s2c_n_correct": "57.4", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "23930", "lr": "0.000159535", "gnorm": "7.494", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5735"} 2023-01-29 17:47:20 | INFO | train_inner | {"epoch": 12, "update": 11.075, "s2c_loss": "0.574", "loss": "0.39814", "s2c_nll_loss": "0.574", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "23940", "lr": "0.000159602", "gnorm": "9.164", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5738"} 2023-01-29 17:47:22 | INFO | train_inner | {"epoch": 12, "update": 11.08, "s2c_loss": "0.599", "loss": "0.4151", "s2c_nll_loss": "0.599", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "23950", "lr": "0.000159669", "gnorm": "8.373", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5740"} 2023-01-29 17:47:25 | INFO | train_inner | {"epoch": 12, "update": 11.085, "s2c_loss": "0.407", "loss": "0.28185", "s2c_nll_loss": "0.407", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "23960", "lr": "0.000159735", "gnorm": "6.5", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5743"} 2023-01-29 17:47:27 | INFO | train_inner | {"epoch": 12, "update": 11.089, "s2c_loss": "0.327", "loss": "0.2265", "s2c_nll_loss": "0.327", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "23970", "lr": "0.000159802", "gnorm": "5.71", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5745"} 2023-01-29 17:47:30 | INFO | train_inner | {"epoch": 12, "update": 11.094, "s2c_loss": "0.344", "loss": "0.23862", "s2c_nll_loss": "0.344", "s2c_accuracy": "94.219", 
"s2c_total": "64", "s2c_n_correct": "60.3", "wps": "242", "ups": "3.78", "wpb": "64", "bsz": "64", "num_updates": "23980", "lr": "0.000159869", "gnorm": "5.857", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5748"} 2023-01-29 17:47:32 | INFO | train_inner | {"epoch": 12, "update": 11.099, "s2c_loss": "0.499", "loss": "0.34602", "s2c_nll_loss": "0.499", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "23990", "lr": "0.000159935", "gnorm": "7.268", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "5750"} 2023-01-29 17:47:35 | INFO | train_inner | {"epoch": 12, "update": 11.103, "s2c_loss": "0.396", "loss": "0.27439", "s2c_nll_loss": "0.396", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "24000", "lr": "0.000160002", "gnorm": "6.498", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5753"} 2023-01-29 17:47:38 | INFO | train_inner | {"epoch": 12, "update": 11.108, "s2c_loss": "0.516", "loss": "0.35781", "s2c_nll_loss": "0.516", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "244.5", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "24010", "lr": "0.000160069", "gnorm": "6.022", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5755"} 2023-01-29 17:47:40 | INFO | train_inner | {"epoch": 12, "update": 11.112, "s2c_loss": "0.382", "loss": "0.26501", "s2c_nll_loss": "0.382", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "24020", "lr": "0.000160135", "gnorm": "5.685", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5758"} 2023-01-29 17:47:43 | INFO | train_inner | {"epoch": 12, "update": 11.117, "s2c_loss": "0.504", "loss": "0.34939", "s2c_nll_loss": "0.504", 
"s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "24030", "lr": "0.000160202", "gnorm": "7.095", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5761"} 2023-01-29 17:47:45 | INFO | train_inner | {"epoch": 12, "update": 11.122, "s2c_loss": "0.426", "loss": "0.29556", "s2c_nll_loss": "0.426", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "24040", "lr": "0.000160269", "gnorm": "6.722", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5763"} 2023-01-29 17:47:48 | INFO | train_inner | {"epoch": 12, "update": 11.126, "s2c_loss": "0.49", "loss": "0.33964", "s2c_nll_loss": "0.49", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "24050", "lr": "0.000160335", "gnorm": "6.825", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5766"} 2023-01-29 17:47:50 | INFO | train_inner | {"epoch": 12, "update": 11.131, "s2c_loss": "0.44", "loss": "0.3049", "s2c_nll_loss": "0.44", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "24060", "lr": "0.000160402", "gnorm": "6.445", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5768"} 2023-01-29 17:47:53 | INFO | train_inner | {"epoch": 12, "update": 11.136, "s2c_loss": "0.491", "loss": "0.34011", "s2c_nll_loss": "0.491", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "24070", "lr": "0.000160469", "gnorm": "7.096", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5771"} 2023-01-29 17:47:55 | INFO | train_inner | {"epoch": 12, "update": 11.14, "s2c_loss": "0.529", "loss": "0.36666", 
"s2c_nll_loss": "0.529", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "24080", "lr": "0.000160535", "gnorm": "6.453", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5773"} 2023-01-29 17:47:58 | INFO | train_inner | {"epoch": 12, "update": 11.145, "s2c_loss": "0.564", "loss": "0.39106", "s2c_nll_loss": "0.564", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "24090", "lr": "0.000160602", "gnorm": "7.279", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5776"} 2023-01-29 17:48:00 | INFO | train_inner | {"epoch": 12, "update": 11.149, "s2c_loss": "0.553", "loss": "0.38334", "s2c_nll_loss": "0.553", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "24100", "lr": "0.000160669", "gnorm": "7.273", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5778"} 2023-01-29 17:48:03 | INFO | train_inner | {"epoch": 12, "update": 11.154, "s2c_loss": "0.622", "loss": "0.43121", "s2c_nll_loss": "0.622", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "24110", "lr": "0.000160735", "gnorm": "7.863", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "5781"} 2023-01-29 17:48:05 | INFO | train_inner | {"epoch": 12, "update": 11.159, "s2c_loss": "0.631", "loss": "0.43746", "s2c_nll_loss": "0.631", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "24120", "lr": "0.000160802", "gnorm": "8.506", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5783"} 2023-01-29 17:48:08 | INFO | train_inner | {"epoch": 12, "update": 11.163, "s2c_loss": "0.489", 
"loss": "0.33898", "s2c_nll_loss": "0.489", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "24130", "lr": "0.000160869", "gnorm": "7.411", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5786"} 2023-01-29 17:48:10 | INFO | train_inner | {"epoch": 12, "update": 11.168, "s2c_loss": "0.649", "loss": "0.44998", "s2c_nll_loss": "0.649", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "24140", "lr": "0.000160935", "gnorm": "8.638", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5788"} 2023-01-29 17:48:13 | INFO | train_inner | {"epoch": 12, "update": 11.173, "s2c_loss": "0.618", "loss": "0.42815", "s2c_nll_loss": "0.618", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "24150", "lr": "0.000161002", "gnorm": "7.093", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5791"} 2023-01-29 17:48:15 | INFO | train_inner | {"epoch": 12, "update": 11.177, "s2c_loss": "0.415", "loss": "0.28765", "s2c_nll_loss": "0.415", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "256.3", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "24160", "lr": "0.000161069", "gnorm": "6.587", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5793"} 2023-01-29 17:48:18 | INFO | train_inner | {"epoch": 12, "update": 11.182, "s2c_loss": "0.384", "loss": "0.26626", "s2c_nll_loss": "0.384", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "24170", "lr": "0.000161135", "gnorm": "6.555", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5796"} 2023-01-29 17:48:20 | INFO | train_inner | {"epoch": 12, "update": 
11.186, "s2c_loss": "0.583", "loss": "0.40438", "s2c_nll_loss": "0.583", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "24180", "lr": "0.000161202", "gnorm": "6.629", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5798"} 2023-01-29 17:48:23 | INFO | train_inner | {"epoch": 12, "update": 11.191, "s2c_loss": "0.487", "loss": "0.33775", "s2c_nll_loss": "0.487", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "24190", "lr": "0.000161269", "gnorm": "7.072", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5801"} 2023-01-29 17:48:26 | INFO | train_inner | {"epoch": 12, "update": 11.196, "s2c_loss": "0.414", "loss": "0.28705", "s2c_nll_loss": "0.414", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "24200", "lr": "0.000161335", "gnorm": "7.67", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5804"} 2023-01-29 17:48:28 | INFO | train_inner | {"epoch": 12, "update": 11.2, "s2c_loss": "0.439", "loss": "0.30444", "s2c_nll_loss": "0.439", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "24210", "lr": "0.000161402", "gnorm": "6.637", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5806"} 2023-01-29 17:48:31 | INFO | train_inner | {"epoch": 12, "update": 11.205, "s2c_loss": "0.381", "loss": "0.26425", "s2c_nll_loss": "0.381", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "24220", "lr": "0.000161469", "gnorm": "5.934", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5809"} 2023-01-29 17:48:33 | INFO | train_inner | 
{"epoch": 12, "update": 11.21, "s2c_loss": "0.521", "loss": "0.36126", "s2c_nll_loss": "0.521", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "24230", "lr": "0.000161535", "gnorm": "7.227", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5811"} 2023-01-29 17:48:36 | INFO | train_inner | {"epoch": 12, "update": 11.214, "s2c_loss": "0.446", "loss": "0.30938", "s2c_nll_loss": "0.446", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "24240", "lr": "0.000161602", "gnorm": "6.049", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5814"} 2023-01-29 17:48:38 | INFO | train_inner | {"epoch": 12, "update": 11.219, "s2c_loss": "0.563", "loss": "0.39027", "s2c_nll_loss": "0.563", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "24250", "lr": "0.000161669", "gnorm": "7.267", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5816"} 2023-01-29 17:48:41 | INFO | train_inner | {"epoch": 12, "update": 11.223, "s2c_loss": "0.558", "loss": "0.38683", "s2c_nll_loss": "0.558", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "260.1", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "24260", "lr": "0.000161735", "gnorm": "7.417", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5819"} 2023-01-29 17:48:43 | INFO | train_inner | {"epoch": 12, "update": 11.228, "s2c_loss": "0.438", "loss": "0.30365", "s2c_nll_loss": "0.438", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "24270", "lr": "0.000161802", "gnorm": "6.384", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5821"} 2023-01-29 17:48:46 | INFO | 
train_inner | {"epoch": 12, "update": 11.233, "s2c_loss": "0.407", "loss": "0.28239", "s2c_nll_loss": "0.407", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "24280", "lr": "0.000161869", "gnorm": "6.006", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "5824"} 2023-01-29 17:48:48 | INFO | train_inner | {"epoch": 12, "update": 11.237, "s2c_loss": "0.504", "loss": "0.3493", "s2c_nll_loss": "0.504", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "24290", "lr": "0.000161935", "gnorm": "6.336", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "5826"} 2023-01-29 17:48:51 | INFO | train_inner | {"epoch": 12, "update": 11.242, "s2c_loss": "0.511", "loss": "0.35452", "s2c_nll_loss": "0.511", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "24300", "lr": "0.000162002", "gnorm": "6.867", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "5829"} 2023-01-29 17:48:53 | INFO | train_inner | {"epoch": 12, "update": 11.247, "s2c_loss": "0.485", "loss": "0.33638", "s2c_nll_loss": "0.485", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "24310", "lr": "0.000162069", "gnorm": "6.309", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "5831"} 2023-01-29 17:48:56 | INFO | train_inner | {"epoch": 12, "update": 11.251, "s2c_loss": "0.583", "loss": "0.40415", "s2c_nll_loss": "0.583", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "259.4", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "24320", "lr": "0.000162135", "gnorm": "7.132", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "5834"} 
2023-01-29 17:48:58 | INFO | train_inner | {"epoch": 12, "update": 11.256, "s2c_loss": "0.599", "loss": "0.41499", "s2c_nll_loss": "0.599", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "24330", "lr": "0.000162202", "gnorm": "7.247", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "5836"} 2023-01-29 17:49:01 | INFO | train_inner | {"epoch": 12, "update": 11.26, "s2c_loss": "0.464", "loss": "0.32127", "s2c_nll_loss": "0.464", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "24340", "lr": "0.000162269", "gnorm": "6.762", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "5839"} 2023-01-29 17:49:03 | INFO | train_inner | {"epoch": 12, "update": 11.265, "s2c_loss": "0.623", "loss": "0.43213", "s2c_nll_loss": "0.623", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "258.5", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "24350", "lr": "0.000162335", "gnorm": "7.163", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "5841"} 2023-01-29 17:49:06 | INFO | train_inner | {"epoch": 12, "update": 11.27, "s2c_loss": "0.595", "loss": "0.41275", "s2c_nll_loss": "0.595", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "24360", "lr": "0.000162402", "gnorm": "6.931", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "5844"} 2023-01-29 17:49:08 | INFO | train_inner | {"epoch": 12, "update": 11.274, "s2c_loss": "0.482", "loss": "0.33435", "s2c_nll_loss": "0.482", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "257.6", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "24370", "lr": "0.000162469", "gnorm": "7.057", "loss_scale": "4096", "train_wall": "2", "gb_free": 
"7.3", "wall": "5846"} 2023-01-29 17:49:11 | INFO | train_inner | {"epoch": 12, "update": 11.279, "s2c_loss": "0.466", "loss": "0.32322", "s2c_nll_loss": "0.466", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "24380", "lr": "0.000162535", "gnorm": "6.699", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "5849"} 2023-01-29 17:49:14 | INFO | train_inner | {"epoch": 12, "update": 11.284, "s2c_loss": "0.735", "loss": "0.50939", "s2c_nll_loss": "0.735", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "24390", "lr": "0.000162602", "gnorm": "6.844", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "5851"} 2023-01-29 17:49:16 | INFO | train_inner | {"epoch": 12, "update": 11.288, "s2c_loss": "0.435", "loss": "0.30177", "s2c_nll_loss": "0.435", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "24400", "lr": "0.000162669", "gnorm": "6.214", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "5854"} 2023-01-29 17:49:19 | INFO | train_inner | {"epoch": 12, "update": 11.293, "s2c_loss": "0.467", "loss": "0.32389", "s2c_nll_loss": "0.467", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "24410", "lr": "0.000162735", "gnorm": "6.337", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "5857"} 2023-01-29 17:49:21 | INFO | train_inner | {"epoch": 12, "update": 11.297, "s2c_loss": "0.418", "loss": "0.29", "s2c_nll_loss": "0.418", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "24420", "lr": "0.000162802", "gnorm": "6.026", "loss_scale": "4096", 
"train_wall": "2", "gb_free": "7.3", "wall": "5859"} 2023-01-29 17:49:24 | INFO | train_inner | {"epoch": 12, "update": 11.302, "s2c_loss": "0.48", "loss": "0.3325", "s2c_nll_loss": "0.48", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "24430", "lr": "0.000162869", "gnorm": "6.294", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "5862"} 2023-01-29 17:49:26 | INFO | train_inner | {"epoch": 12, "update": 11.307, "s2c_loss": "0.398", "loss": "0.27605", "s2c_nll_loss": "0.398", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "24440", "lr": "0.000162935", "gnorm": "5.768", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "5864"} 2023-01-29 17:49:29 | INFO | train_inner | {"epoch": 12, "update": 11.311, "s2c_loss": "0.429", "loss": "0.29761", "s2c_nll_loss": "0.429", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "244.8", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "24450", "lr": "0.000163002", "gnorm": "5.974", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "5867"} 2023-01-29 17:49:31 | INFO | train_inner | {"epoch": 12, "update": 11.316, "s2c_loss": "0.418", "loss": "0.29002", "s2c_nll_loss": "0.418", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "24460", "lr": "0.000163069", "gnorm": "5.655", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "5869"} 2023-01-29 17:49:34 | INFO | train_inner | {"epoch": 12, "update": 11.321, "s2c_loss": "0.489", "loss": "0.33911", "s2c_nll_loss": "0.489", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "24470", "lr": "0.000163135", "gnorm": "6.073", 
"loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "5872"} 2023-01-29 17:49:36 | INFO | train_inner | {"epoch": 12, "update": 11.325, "s2c_loss": "0.407", "loss": "0.28238", "s2c_nll_loss": "0.407", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "24480", "lr": "0.000163202", "gnorm": "5.83", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "5874"} 2023-01-29 17:49:39 | INFO | train_inner | {"epoch": 12, "update": 11.33, "s2c_loss": "0.414", "loss": "0.28709", "s2c_nll_loss": "0.414", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "24490", "lr": "0.000163269", "gnorm": "6.579", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "5877"} 2023-01-29 17:49:40 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 2048.0 2023-01-29 17:49:42 | INFO | train_inner | {"epoch": 12, "update": 11.335, "s2c_loss": "0.57", "loss": "0.39514", "s2c_nll_loss": "0.57", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "238.2", "ups": "3.72", "wpb": "64", "bsz": "64", "num_updates": "24500", "lr": "0.000163335", "gnorm": "7.677", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5880"} 2023-01-29 17:49:44 | INFO | train_inner | {"epoch": 12, "update": 11.34, "s2c_loss": "0.523", "loss": "0.36253", "s2c_nll_loss": "0.523", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "24510", "lr": "0.000163402", "gnorm": "7.057", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5882"} 2023-01-29 17:49:47 | INFO | train_inner | {"epoch": 12, "update": 11.344, "s2c_loss": "0.508", "loss": "0.35219", "s2c_nll_loss": "0.508", "s2c_accuracy": "91.094", "s2c_total": "64", 
"s2c_n_correct": "58.3", "wps": "260.4", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "24520", "lr": "0.000163468", "gnorm": "6.925", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5885"} 2023-01-29 17:49:49 | INFO | train_inner | {"epoch": 12, "update": 11.349, "s2c_loss": "0.572", "loss": "0.39655", "s2c_nll_loss": "0.572", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "24530", "lr": "0.000163535", "gnorm": "7.385", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5887"} 2023-01-29 17:49:52 | INFO | train_inner | {"epoch": 12, "update": 11.353, "s2c_loss": "0.484", "loss": "0.33563", "s2c_nll_loss": "0.484", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "24540", "lr": "0.000163602", "gnorm": "6.684", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5890"} 2023-01-29 17:49:54 | INFO | train_inner | {"epoch": 12, "update": 11.358, "s2c_loss": "0.406", "loss": "0.28124", "s2c_nll_loss": "0.406", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "24550", "lr": "0.000163668", "gnorm": "6.592", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5892"} 2023-01-29 17:49:57 | INFO | train_inner | {"epoch": 12, "update": 11.363, "s2c_loss": "0.448", "loss": "0.31055", "s2c_nll_loss": "0.448", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "24560", "lr": "0.000163735", "gnorm": "6.85", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5895"} 2023-01-29 17:49:59 | INFO | train_inner | {"epoch": 12, "update": 11.367, "s2c_loss": "0.439", "loss": "0.30421", "s2c_nll_loss": "0.439", "s2c_accuracy": 
"92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "24570", "lr": "0.000163802", "gnorm": "6.809", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5897"} 2023-01-29 17:50:02 | INFO | train_inner | {"epoch": 12, "update": 11.372, "s2c_loss": "0.548", "loss": "0.37969", "s2c_nll_loss": "0.548", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "24580", "lr": "0.000163868", "gnorm": "6.977", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5900"} 2023-01-29 17:50:04 | INFO | train_inner | {"epoch": 12, "update": 11.377, "s2c_loss": "0.382", "loss": "0.26505", "s2c_nll_loss": "0.382", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "24590", "lr": "0.000163935", "gnorm": "6.168", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5902"} 2023-01-29 17:50:07 | INFO | train_inner | {"epoch": 12, "update": 11.381, "s2c_loss": "0.543", "loss": "0.37622", "s2c_nll_loss": "0.543", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "24600", "lr": "0.000164002", "gnorm": "6.89", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5905"} 2023-01-29 17:50:09 | INFO | train_inner | {"epoch": 12, "update": 11.386, "s2c_loss": "0.513", "loss": "0.35588", "s2c_nll_loss": "0.513", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "24610", "lr": "0.000164068", "gnorm": "6.762", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "5907"} 2023-01-29 17:50:12 | INFO | train_inner | {"epoch": 12, "update": 11.39, "s2c_loss": "0.553", "loss": "0.38343", "s2c_nll_loss": 
"0.553", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "260.6", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "24620", "lr": "0.000164135", "gnorm": "6.766", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5910"} 2023-01-29 17:50:14 | INFO | train_inner | {"epoch": 12, "update": 11.395, "s2c_loss": "0.466", "loss": "0.32298", "s2c_nll_loss": "0.466", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "24630", "lr": "0.000164202", "gnorm": "6.266", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5912"} 2023-01-29 17:50:17 | INFO | train_inner | {"epoch": 12, "update": 11.4, "s2c_loss": "0.591", "loss": "0.40945", "s2c_nll_loss": "0.591", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "24640", "lr": "0.000164268", "gnorm": "7.518", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5915"} 2023-01-29 17:50:20 | INFO | train_inner | {"epoch": 12, "update": 11.404, "s2c_loss": "0.594", "loss": "0.41143", "s2c_nll_loss": "0.594", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "24650", "lr": "0.000164335", "gnorm": "7.965", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5917"} 2023-01-29 17:50:22 | INFO | train_inner | {"epoch": 12, "update": 11.409, "s2c_loss": "0.546", "loss": "0.37856", "s2c_nll_loss": "0.546", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "24660", "lr": "0.000164402", "gnorm": "8.173", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5920"} 2023-01-29 17:50:25 | INFO | train_inner | {"epoch": 12, "update": 11.414, "s2c_loss": "0.588", "loss": 
"0.40728", "s2c_nll_loss": "0.588", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "260.7", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "24670", "lr": "0.000164468", "gnorm": "7.047", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5922"} 2023-01-29 17:50:27 | INFO | train_inner | {"epoch": 12, "update": 11.418, "s2c_loss": "0.61", "loss": "0.42259", "s2c_nll_loss": "0.61", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "24680", "lr": "0.000164535", "gnorm": "8.212", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5925"} 2023-01-29 17:50:30 | INFO | train_inner | {"epoch": 12, "update": 11.423, "s2c_loss": "0.58", "loss": "0.40174", "s2c_nll_loss": "0.58", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "259.7", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "24690", "lr": "0.000164602", "gnorm": "7.021", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5927"} 2023-01-29 17:50:32 | INFO | train_inner | {"epoch": 12, "update": 11.427, "s2c_loss": "0.563", "loss": "0.39004", "s2c_nll_loss": "0.563", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "24700", "lr": "0.000164668", "gnorm": "6.947", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5930"} 2023-01-29 17:50:35 | INFO | train_inner | {"epoch": 12, "update": 11.432, "s2c_loss": "0.651", "loss": "0.45152", "s2c_nll_loss": "0.651", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "24710", "lr": "0.000164735", "gnorm": "7.771", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5933"} 2023-01-29 17:50:37 | INFO | train_inner | {"epoch": 12, "update": 11.437, "s2c_loss": 
"0.57", "loss": "0.39529", "s2c_nll_loss": "0.57", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "24720", "lr": "0.000164802", "gnorm": "9.051", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5935"} 2023-01-29 17:50:40 | INFO | train_inner | {"epoch": 12, "update": 11.441, "s2c_loss": "0.499", "loss": "0.3456", "s2c_nll_loss": "0.499", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "24730", "lr": "0.000164868", "gnorm": "7.137", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5938"} 2023-01-29 17:50:42 | INFO | train_inner | {"epoch": 12, "update": 11.446, "s2c_loss": "0.671", "loss": "0.4648", "s2c_nll_loss": "0.671", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "24740", "lr": "0.000164935", "gnorm": "8.147", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5940"} 2023-01-29 17:50:45 | INFO | train_inner | {"epoch": 12, "update": 11.451, "s2c_loss": "0.6", "loss": "0.41612", "s2c_nll_loss": "0.6", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "258.4", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "24750", "lr": "0.000165002", "gnorm": "7.753", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5943"} 2023-01-29 17:50:47 | INFO | train_inner | {"epoch": 12, "update": 11.455, "s2c_loss": "0.498", "loss": "0.3454", "s2c_nll_loss": "0.498", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "24760", "lr": "0.000165068", "gnorm": "6.931", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5945"} 2023-01-29 17:50:50 | INFO | train_inner | {"epoch": 12, "update": 11.46, 
"s2c_loss": "0.745", "loss": "0.51631", "s2c_nll_loss": "0.745", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "24770", "lr": "0.000165135", "gnorm": "7.386", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5948"} 2023-01-29 17:50:52 | INFO | train_inner | {"epoch": 12, "update": 11.464, "s2c_loss": "0.451", "loss": "0.31293", "s2c_nll_loss": "0.451", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "24780", "lr": "0.000165202", "gnorm": "6.052", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5950"} 2023-01-29 17:50:55 | INFO | train_inner | {"epoch": 12, "update": 11.469, "s2c_loss": "0.572", "loss": "0.39675", "s2c_nll_loss": "0.572", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "258.3", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "24790", "lr": "0.000165268", "gnorm": "7.781", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5953"} 2023-01-29 17:50:57 | INFO | train_inner | {"epoch": 12, "update": 11.474, "s2c_loss": "0.456", "loss": "0.31578", "s2c_nll_loss": "0.456", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "24800", "lr": "0.000165335", "gnorm": "6.324", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5955"} 2023-01-29 17:51:00 | INFO | train_inner | {"epoch": 12, "update": 11.478, "s2c_loss": "0.807", "loss": "0.55941", "s2c_nll_loss": "0.807", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "24810", "lr": "0.000165402", "gnorm": "7.453", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "5958"} 2023-01-29 17:51:02 | INFO | train_inner | {"epoch": 
12, "update": 11.483, "s2c_loss": "0.545", "loss": "0.37785", "s2c_nll_loss": "0.545", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "24820", "lr": "0.000165468", "gnorm": "7.494", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5960"} 2023-01-29 17:51:05 | INFO | train_inner | {"epoch": 12, "update": 11.488, "s2c_loss": "0.676", "loss": "0.46835", "s2c_nll_loss": "0.676", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "24830", "lr": "0.000165535", "gnorm": "7.189", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5963"} 2023-01-29 17:51:07 | INFO | train_inner | {"epoch": 12, "update": 11.492, "s2c_loss": "0.626", "loss": "0.43422", "s2c_nll_loss": "0.626", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "24840", "lr": "0.000165602", "gnorm": "7.689", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5965"} 2023-01-29 17:51:10 | INFO | train_inner | {"epoch": 12, "update": 11.497, "s2c_loss": "0.514", "loss": "0.35651", "s2c_nll_loss": "0.514", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "24850", "lr": "0.000165668", "gnorm": "8.764", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5968"} 2023-01-29 17:51:12 | INFO | train_inner | {"epoch": 12, "update": 11.501, "s2c_loss": "0.619", "loss": "0.42881", "s2c_nll_loss": "0.619", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "24860", "lr": "0.000165735", "gnorm": "7.648", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5970"} 2023-01-29 17:51:15 | INFO | 
train_inner | {"epoch": 12, "update": 11.506, "s2c_loss": "0.569", "loss": "0.39406", "s2c_nll_loss": "0.569", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "24870", "lr": "0.000165802", "gnorm": "7.272", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5973"} 2023-01-29 17:51:17 | INFO | train_inner | {"epoch": 12, "update": 11.511, "s2c_loss": "0.54", "loss": "0.37421", "s2c_nll_loss": "0.54", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "24880", "lr": "0.000165868", "gnorm": "7.499", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5975"} 2023-01-29 17:51:20 | INFO | train_inner | {"epoch": 12, "update": 11.515, "s2c_loss": "0.559", "loss": "0.38724", "s2c_nll_loss": "0.559", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "261", "ups": "4.08", "wpb": "64", "bsz": "64", "num_updates": "24890", "lr": "0.000165935", "gnorm": "6.98", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5978"} 2023-01-29 17:51:22 | INFO | train_inner | {"epoch": 12, "update": 11.52, "s2c_loss": "0.535", "loss": "0.37063", "s2c_nll_loss": "0.535", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "24900", "lr": "0.000166002", "gnorm": "6.93", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "5980"} 2023-01-29 17:51:25 | INFO | train_inner | {"epoch": 12, "update": 11.525, "s2c_loss": "0.577", "loss": "0.39987", "s2c_nll_loss": "0.577", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "258.2", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "24910", "lr": "0.000166068", "gnorm": "7.935", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5983"} 2023-01-29 
17:51:27 | INFO | train_inner | {"epoch": 12, "update": 11.529, "s2c_loss": "0.532", "loss": "0.36907", "s2c_nll_loss": "0.532", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "24920", "lr": "0.000166135", "gnorm": "6.73", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5985"} 2023-01-29 17:51:30 | INFO | train_inner | {"epoch": 12, "update": 11.534, "s2c_loss": "0.532", "loss": "0.36902", "s2c_nll_loss": "0.532", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "24930", "lr": "0.000166202", "gnorm": "6.833", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5988"} 2023-01-29 17:51:32 | INFO | train_inner | {"epoch": 12, "update": 11.538, "s2c_loss": "0.516", "loss": "0.35748", "s2c_nll_loss": "0.516", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "24940", "lr": "0.000166268", "gnorm": "6.476", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "5990"} 2023-01-29 17:51:35 | INFO | train_inner | {"epoch": 12, "update": 11.543, "s2c_loss": "0.499", "loss": "0.34573", "s2c_nll_loss": "0.499", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "24950", "lr": "0.000166335", "gnorm": "6.078", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "5993"} 2023-01-29 17:51:37 | INFO | train_inner | {"epoch": 12, "update": 11.548, "s2c_loss": "0.68", "loss": "0.47123", "s2c_nll_loss": "0.68", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "24960", "lr": "0.000166402", "gnorm": "6.937", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": 
"5995"} 2023-01-29 17:51:40 | INFO | train_inner | {"epoch": 12, "update": 11.552, "s2c_loss": "0.528", "loss": "0.36584", "s2c_nll_loss": "0.528", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "24970", "lr": "0.000166468", "gnorm": "7.983", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "5998"} 2023-01-29 17:51:43 | INFO | train_inner | {"epoch": 12, "update": 11.557, "s2c_loss": "0.457", "loss": "0.31658", "s2c_nll_loss": "0.457", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "24980", "lr": "0.000166535", "gnorm": "7.319", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6000"} 2023-01-29 17:51:45 | INFO | train_inner | {"epoch": 12, "update": 11.562, "s2c_loss": "0.52", "loss": "0.36054", "s2c_nll_loss": "0.52", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "24990", "lr": "0.000166602", "gnorm": "8.518", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6003"} 2023-01-29 17:51:48 | INFO | train_inner | {"epoch": 12, "update": 11.566, "s2c_loss": "0.606", "loss": "0.41981", "s2c_nll_loss": "0.606", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "25000", "lr": "0.000166668", "gnorm": "8.203", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6005"} 2023-01-29 17:51:50 | INFO | train_inner | {"epoch": 12, "update": 11.571, "s2c_loss": "0.542", "loss": "0.37535", "s2c_nll_loss": "0.542", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "25010", "lr": "0.000166735", "gnorm": "7.612", "loss_scale": "2048", "train_wall": "2", 
"gb_free": "7.3", "wall": "6008"} 2023-01-29 17:51:53 | INFO | train_inner | {"epoch": 12, "update": 11.575, "s2c_loss": "0.706", "loss": "0.48936", "s2c_nll_loss": "0.706", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "25020", "lr": "0.000166802", "gnorm": "7.796", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6011"} 2023-01-29 17:51:55 | INFO | train_inner | {"epoch": 12, "update": 11.58, "s2c_loss": "0.602", "loss": "0.41727", "s2c_nll_loss": "0.602", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "25030", "lr": "0.000166868", "gnorm": "7.976", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6013"} 2023-01-29 17:51:58 | INFO | train_inner | {"epoch": 12, "update": 11.585, "s2c_loss": "0.604", "loss": "0.41889", "s2c_nll_loss": "0.604", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "25040", "lr": "0.000166935", "gnorm": "7.711", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6016"} 2023-01-29 17:52:00 | INFO | train_inner | {"epoch": 12, "update": 11.589, "s2c_loss": "0.508", "loss": "0.3524", "s2c_nll_loss": "0.508", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "25050", "lr": "0.000167002", "gnorm": "6.939", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6018"} 2023-01-29 17:52:03 | INFO | train_inner | {"epoch": 12, "update": 11.594, "s2c_loss": "0.429", "loss": "0.29737", "s2c_nll_loss": "0.429", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "25060", "lr": "0.000167068", "gnorm": "6.322", "loss_scale": 
"2048", "train_wall": "2", "gb_free": "7.3", "wall": "6021"} 2023-01-29 17:52:05 | INFO | train_inner | {"epoch": 12, "update": 11.599, "s2c_loss": "0.566", "loss": "0.39252", "s2c_nll_loss": "0.566", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "25070", "lr": "0.000167135", "gnorm": "6.904", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6023"} 2023-01-29 17:52:08 | INFO | train_inner | {"epoch": 12, "update": 11.603, "s2c_loss": "0.609", "loss": "0.42239", "s2c_nll_loss": "0.609", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "25080", "lr": "0.000167202", "gnorm": "7.952", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6026"} 2023-01-29 17:52:10 | INFO | train_inner | {"epoch": 12, "update": 11.608, "s2c_loss": "0.571", "loss": "0.39587", "s2c_nll_loss": "0.571", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "25090", "lr": "0.000167268", "gnorm": "6.786", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6028"} 2023-01-29 17:52:13 | INFO | train_inner | {"epoch": 12, "update": 11.612, "s2c_loss": "0.507", "loss": "0.35168", "s2c_nll_loss": "0.507", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "25100", "lr": "0.000167335", "gnorm": "5.96", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6031"} 2023-01-29 17:52:15 | INFO | train_inner | {"epoch": 12, "update": 11.617, "s2c_loss": "0.693", "loss": "0.48019", "s2c_nll_loss": "0.693", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "25110", "lr": "0.000167402", "gnorm": 
"7.788", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "6033"} 2023-01-29 17:52:18 | INFO | train_inner | {"epoch": 12, "update": 11.622, "s2c_loss": "0.573", "loss": "0.39729", "s2c_nll_loss": "0.573", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "25120", "lr": "0.000167468", "gnorm": "9.014", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6036"} 2023-01-29 17:52:20 | INFO | train_inner | {"epoch": 12, "update": 11.626, "s2c_loss": "0.609", "loss": "0.42207", "s2c_nll_loss": "0.609", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "25130", "lr": "0.000167535", "gnorm": "7.013", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6038"} 2023-01-29 17:52:23 | INFO | train_inner | {"epoch": 12, "update": 11.631, "s2c_loss": "0.607", "loss": "0.42106", "s2c_nll_loss": "0.607", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "258.8", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "25140", "lr": "0.000167602", "gnorm": "7.524", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6041"} 2023-01-29 17:52:25 | INFO | train_inner | {"epoch": 12, "update": 11.636, "s2c_loss": "0.637", "loss": "0.44153", "s2c_nll_loss": "0.637", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "259", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "25150", "lr": "0.000167668", "gnorm": "8.714", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6043"} 2023-01-29 17:52:28 | INFO | train_inner | {"epoch": 12, "update": 11.64, "s2c_loss": "0.577", "loss": "0.39968", "s2c_nll_loss": "0.577", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "25160", "lr": 
"0.000167735", "gnorm": "6.732", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6046"} 2023-01-29 17:52:30 | INFO | train_inner | {"epoch": 12, "update": 11.645, "s2c_loss": "0.648", "loss": "0.44903", "s2c_nll_loss": "0.648", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "260.1", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "25170", "lr": "0.000167802", "gnorm": "8.16", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6048"} 2023-01-29 17:52:33 | INFO | train_inner | {"epoch": 12, "update": 11.649, "s2c_loss": "0.585", "loss": "0.40563", "s2c_nll_loss": "0.585", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "25180", "lr": "0.000167868", "gnorm": "7.438", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6051"} 2023-01-29 17:52:35 | INFO | train_inner | {"epoch": 12, "update": 11.654, "s2c_loss": "0.646", "loss": "0.44747", "s2c_nll_loss": "0.646", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "25190", "lr": "0.000167935", "gnorm": "7.271", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6053"} 2023-01-29 17:52:38 | INFO | train_inner | {"epoch": 12, "update": 11.659, "s2c_loss": "0.631", "loss": "0.43763", "s2c_nll_loss": "0.631", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "25200", "lr": "0.000168002", "gnorm": "8.116", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6056"} 2023-01-29 17:52:41 | INFO | train_inner | {"epoch": 12, "update": 11.663, "s2c_loss": "0.465", "loss": "0.32206", "s2c_nll_loss": "0.465", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", 
"num_updates": "25210", "lr": "0.000168068", "gnorm": "6.94", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6058"} 2023-01-29 17:52:43 | INFO | train_inner | {"epoch": 12, "update": 11.668, "s2c_loss": "0.63", "loss": "0.43666", "s2c_nll_loss": "0.63", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "25220", "lr": "0.000168135", "gnorm": "7.904", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6061"} 2023-01-29 17:52:46 | INFO | train_inner | {"epoch": 12, "update": 11.673, "s2c_loss": "0.523", "loss": "0.36247", "s2c_nll_loss": "0.523", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "258.4", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "25230", "lr": "0.000168202", "gnorm": "7.321", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6063"} 2023-01-29 17:52:48 | INFO | train_inner | {"epoch": 12, "update": 11.677, "s2c_loss": "0.541", "loss": "0.37494", "s2c_nll_loss": "0.541", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "25240", "lr": "0.000168268", "gnorm": "7.957", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6066"} 2023-01-29 17:52:51 | INFO | train_inner | {"epoch": 12, "update": 11.682, "s2c_loss": "0.626", "loss": "0.43387", "s2c_nll_loss": "0.626", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "258.4", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "25250", "lr": "0.000168335", "gnorm": "7.797", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6068"} 2023-01-29 17:52:53 | INFO | train_inner | {"epoch": 12, "update": 11.686, "s2c_loss": "0.466", "loss": "0.32303", "s2c_nll_loss": "0.466", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "253.6", "ups": "3.96", "wpb": 
"64", "bsz": "64", "num_updates": "25260", "lr": "0.000168402", "gnorm": "6.736", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6071"} 2023-01-29 17:52:56 | INFO | train_inner | {"epoch": 12, "update": 11.691, "s2c_loss": "0.415", "loss": "0.2879", "s2c_nll_loss": "0.415", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "25270", "lr": "0.000168468", "gnorm": "6.692", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6073"} 2023-01-29 17:52:58 | INFO | train_inner | {"epoch": 12, "update": 11.696, "s2c_loss": "0.556", "loss": "0.38562", "s2c_nll_loss": "0.556", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "257.8", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "25280", "lr": "0.000168535", "gnorm": "6.915", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6076"} 2023-01-29 17:53:01 | INFO | train_inner | {"epoch": 12, "update": 11.7, "s2c_loss": "0.52", "loss": "0.36014", "s2c_nll_loss": "0.52", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "25290", "lr": "0.000168602", "gnorm": "7.277", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6078"} 2023-01-29 17:53:03 | INFO | train_inner | {"epoch": 12, "update": 11.705, "s2c_loss": "0.554", "loss": "0.38371", "s2c_nll_loss": "0.554", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "25300", "lr": "0.000168668", "gnorm": "7.857", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6081"} 2023-01-29 17:53:06 | INFO | train_inner | {"epoch": 12, "update": 11.71, "s2c_loss": "0.614", "loss": "0.42556", "s2c_nll_loss": "0.614", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "256.3", "ups": 
"4", "wpb": "64", "bsz": "64", "num_updates": "25310", "lr": "0.000168735", "gnorm": "6.998", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6083"} 2023-01-29 17:53:08 | INFO | train_inner | {"epoch": 12, "update": 11.714, "s2c_loss": "0.473", "loss": "0.32803", "s2c_nll_loss": "0.473", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "25320", "lr": "0.000168802", "gnorm": "6.796", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6086"} 2023-01-29 17:53:11 | INFO | train_inner | {"epoch": 12, "update": 11.719, "s2c_loss": "0.586", "loss": "0.40606", "s2c_nll_loss": "0.586", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "25330", "lr": "0.000168868", "gnorm": "8.171", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6089"} 2023-01-29 17:53:13 | INFO | train_inner | {"epoch": 12, "update": 11.723, "s2c_loss": "0.523", "loss": "0.36243", "s2c_nll_loss": "0.523", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "25340", "lr": "0.000168935", "gnorm": "6.803", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6091"} 2023-01-29 17:53:16 | INFO | train_inner | {"epoch": 12, "update": 11.728, "s2c_loss": "0.533", "loss": "0.36917", "s2c_nll_loss": "0.533", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "245.7", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "25350", "lr": "0.000169002", "gnorm": "6.849", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6094"} 2023-01-29 17:53:18 | INFO | train_inner | {"epoch": 12, "update": 11.733, "s2c_loss": "0.494", "loss": "0.34259", "s2c_nll_loss": "0.494", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": 
"241.5", "ups": "3.77", "wpb": "64", "bsz": "64", "num_updates": "25360", "lr": "0.000169068", "gnorm": "6.973", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6096"} 2023-01-29 17:53:21 | INFO | train_inner | {"epoch": 12, "update": 11.737, "s2c_loss": "0.511", "loss": "0.35435", "s2c_nll_loss": "0.511", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "242.6", "ups": "3.79", "wpb": "64", "bsz": "64", "num_updates": "25370", "lr": "0.000169135", "gnorm": "7.377", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6099"} 2023-01-29 17:53:24 | INFO | train_inner | {"epoch": 12, "update": 11.742, "s2c_loss": "0.554", "loss": "0.38434", "s2c_nll_loss": "0.554", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "25380", "lr": "0.000169202", "gnorm": "7.597", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6102"} 2023-01-29 17:53:26 | INFO | train_inner | {"epoch": 12, "update": 11.747, "s2c_loss": "0.452", "loss": "0.31304", "s2c_nll_loss": "0.452", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "25390", "lr": "0.000169268", "gnorm": "6.594", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6104"} 2023-01-29 17:53:29 | INFO | train_inner | {"epoch": 12, "update": 11.751, "s2c_loss": "0.565", "loss": "0.39147", "s2c_nll_loss": "0.565", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "25400", "lr": "0.000169335", "gnorm": "7.991", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6107"} 2023-01-29 17:53:31 | INFO | train_inner | {"epoch": 12, "update": 11.756, "s2c_loss": "0.556", "loss": "0.38531", "s2c_nll_loss": "0.556", "s2c_accuracy": "91.406", "s2c_total": "64", 
"s2c_n_correct": "58.5", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "25410", "lr": "0.000169402", "gnorm": "6.598", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6109"} 2023-01-29 17:53:34 | INFO | train_inner | {"epoch": 12, "update": 11.76, "s2c_loss": "0.375", "loss": "0.2598", "s2c_nll_loss": "0.375", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "245.2", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "25420", "lr": "0.000169468", "gnorm": "6.295", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6112"} 2023-01-29 17:53:36 | INFO | train_inner | {"epoch": 12, "update": 11.765, "s2c_loss": "0.484", "loss": "0.33531", "s2c_nll_loss": "0.484", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "25430", "lr": "0.000169535", "gnorm": "8.423", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6114"} 2023-01-29 17:53:39 | INFO | train_inner | {"epoch": 12, "update": 11.77, "s2c_loss": "0.497", "loss": "0.34421", "s2c_nll_loss": "0.497", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "25440", "lr": "0.000169602", "gnorm": "6.919", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6117"} 2023-01-29 17:53:42 | INFO | train_inner | {"epoch": 12, "update": 11.774, "s2c_loss": "0.454", "loss": "0.31482", "s2c_nll_loss": "0.454", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "259.1", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "25450", "lr": "0.000169668", "gnorm": "7.098", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6119"} 2023-01-29 17:53:44 | INFO | train_inner | {"epoch": 12, "update": 11.779, "s2c_loss": "0.495", "loss": "0.34314", "s2c_nll_loss": "0.495", "s2c_accuracy": "89.844", 
"s2c_total": "64", "s2c_n_correct": "57.5", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "25460", "lr": "0.000169735", "gnorm": "7.27", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6122"} 2023-01-29 17:53:46 | INFO | train_inner | {"epoch": 12, "update": 11.784, "s2c_loss": "0.494", "loss": "0.34227", "s2c_nll_loss": "0.494", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "25470", "lr": "0.000169802", "gnorm": "6.744", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6124"} 2023-01-29 17:53:49 | INFO | train_inner | {"epoch": 12, "update": 11.788, "s2c_loss": "0.496", "loss": "0.34358", "s2c_nll_loss": "0.496", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "25480", "lr": "0.000169868", "gnorm": "6.24", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6127"} 2023-01-29 17:53:52 | INFO | train_inner | {"epoch": 12, "update": 11.793, "s2c_loss": "0.476", "loss": "0.32967", "s2c_nll_loss": "0.476", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "25490", "lr": "0.000169935", "gnorm": "7.18", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6129"} 2023-01-29 17:53:54 | INFO | train_inner | {"epoch": 12, "update": 11.797, "s2c_loss": "0.564", "loss": "0.39096", "s2c_nll_loss": "0.564", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "259.5", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "25500", "lr": "0.000170002", "gnorm": "7.504", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6132"} 2023-01-29 17:53:57 | INFO | train_inner | {"epoch": 12, "update": 11.802, "s2c_loss": "0.608", "loss": "0.42159", "s2c_nll_loss": "0.608", 
"s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "25510", "lr": "0.000170068", "gnorm": "7.918", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6134"} 2023-01-29 17:53:59 | INFO | train_inner | {"epoch": 12, "update": 11.807, "s2c_loss": "0.441", "loss": "0.30575", "s2c_nll_loss": "0.441", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "25520", "lr": "0.000170135", "gnorm": "6.876", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6137"} 2023-01-29 17:54:02 | INFO | train_inner | {"epoch": 12, "update": 11.811, "s2c_loss": "0.477", "loss": "0.33033", "s2c_nll_loss": "0.477", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "25530", "lr": "0.000170201", "gnorm": "6.242", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6139"} 2023-01-29 17:54:04 | INFO | train_inner | {"epoch": 12, "update": 11.816, "s2c_loss": "0.445", "loss": "0.30824", "s2c_nll_loss": "0.445", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "25540", "lr": "0.000170268", "gnorm": "6.525", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6142"} 2023-01-29 17:54:07 | INFO | train_inner | {"epoch": 12, "update": 11.821, "s2c_loss": "0.646", "loss": "0.44763", "s2c_nll_loss": "0.646", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "25550", "lr": "0.000170335", "gnorm": "8.449", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6145"} 2023-01-29 17:54:09 | INFO | train_inner | {"epoch": 12, "update": 11.825, "s2c_loss": "0.569", "loss": "0.39474", 
"s2c_nll_loss": "0.569", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "25560", "lr": "0.000170401", "gnorm": "7.375", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6147"} 2023-01-29 17:54:12 | INFO | train_inner | {"epoch": 12, "update": 11.83, "s2c_loss": "0.567", "loss": "0.39301", "s2c_nll_loss": "0.567", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "25570", "lr": "0.000170468", "gnorm": "7.351", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6150"} 2023-01-29 17:54:14 | INFO | train_inner | {"epoch": 12, "update": 11.834, "s2c_loss": "0.567", "loss": "0.39304", "s2c_nll_loss": "0.567", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "25580", "lr": "0.000170535", "gnorm": "7.89", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "6152"} 2023-01-29 17:54:17 | INFO | train_inner | {"epoch": 12, "update": 11.839, "s2c_loss": "0.553", "loss": "0.38334", "s2c_nll_loss": "0.553", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "25590", "lr": "0.000170601", "gnorm": "7.2", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6155"} 2023-01-29 17:54:19 | INFO | train_inner | {"epoch": 12, "update": 11.844, "s2c_loss": "0.561", "loss": "0.38851", "s2c_nll_loss": "0.561", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "25600", "lr": "0.000170668", "gnorm": "7.647", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6157"} 2023-01-29 17:54:22 | INFO | train_inner | {"epoch": 12, "update": 11.848, "s2c_loss": 
"0.438", "loss": "0.30369", "s2c_nll_loss": "0.438", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "25610", "lr": "0.000170735", "gnorm": "6.476", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6160"} 2023-01-29 17:54:24 | INFO | train_inner | {"epoch": 12, "update": 11.853, "s2c_loss": "0.555", "loss": "0.38499", "s2c_nll_loss": "0.555", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "25620", "lr": "0.000170801", "gnorm": "7.536", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6162"} 2023-01-29 17:54:27 | INFO | train_inner | {"epoch": 12, "update": 11.858, "s2c_loss": "0.493", "loss": "0.34186", "s2c_nll_loss": "0.493", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "25630", "lr": "0.000170868", "gnorm": "6.995", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6165"} 2023-01-29 17:54:30 | INFO | train_inner | {"epoch": 12, "update": 11.862, "s2c_loss": "0.517", "loss": "0.35837", "s2c_nll_loss": "0.517", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "25640", "lr": "0.000170935", "gnorm": "6.749", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6167"} 2023-01-29 17:54:32 | INFO | train_inner | {"epoch": 12, "update": 11.867, "s2c_loss": "0.644", "loss": "0.4464", "s2c_nll_loss": "0.644", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "25650", "lr": "0.000171001", "gnorm": "8.129", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6170"} 2023-01-29 17:54:35 | INFO | train_inner | {"epoch": 12, "update": 
11.871, "s2c_loss": "0.634", "loss": "0.43914", "s2c_nll_loss": "0.634", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "25660", "lr": "0.000171068", "gnorm": "7.748", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6173"} 2023-01-29 17:54:37 | INFO | train_inner | {"epoch": 12, "update": 11.876, "s2c_loss": "0.603", "loss": "0.41808", "s2c_nll_loss": "0.603", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "25670", "lr": "0.000171135", "gnorm": "8.092", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6175"} 2023-01-29 17:54:40 | INFO | train_inner | {"epoch": 12, "update": 11.881, "s2c_loss": "0.469", "loss": "0.32475", "s2c_nll_loss": "0.469", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "25680", "lr": "0.000171201", "gnorm": "6.184", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6178"} 2023-01-29 17:54:42 | INFO | train_inner | {"epoch": 12, "update": 11.885, "s2c_loss": "0.609", "loss": "0.42191", "s2c_nll_loss": "0.609", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "25690", "lr": "0.000171268", "gnorm": "8.775", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6180"} 2023-01-29 17:54:45 | INFO | train_inner | {"epoch": 12, "update": 11.89, "s2c_loss": "0.513", "loss": "0.35541", "s2c_nll_loss": "0.513", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "25700", "lr": "0.000171335", "gnorm": "7.717", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6183"} 2023-01-29 17:54:47 | INFO | train_inner | 
{"epoch": 12, "update": 11.895, "s2c_loss": "0.501", "loss": "0.34749", "s2c_nll_loss": "0.501", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "25710", "lr": "0.000171401", "gnorm": "6.279", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6185"} 2023-01-29 17:54:50 | INFO | train_inner | {"epoch": 12, "update": 11.899, "s2c_loss": "0.501", "loss": "0.34703", "s2c_nll_loss": "0.501", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "25720", "lr": "0.000171468", "gnorm": "7.216", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6188"} 2023-01-29 17:54:52 | INFO | train_inner | {"epoch": 12, "update": 11.904, "s2c_loss": "0.433", "loss": "0.30006", "s2c_nll_loss": "0.433", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "25730", "lr": "0.000171535", "gnorm": "6.713", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6190"} 2023-01-29 17:54:55 | INFO | train_inner | {"epoch": 12, "update": 11.908, "s2c_loss": "0.619", "loss": "0.42905", "s2c_nll_loss": "0.619", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "246.9", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "25740", "lr": "0.000171601", "gnorm": "8.195", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6193"} 2023-01-29 17:54:58 | INFO | train_inner | {"epoch": 12, "update": 11.913, "s2c_loss": "0.516", "loss": "0.35788", "s2c_nll_loss": "0.516", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "25750", "lr": "0.000171668", "gnorm": "7.057", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6195"} 2023-01-29 17:55:00 
| INFO | train_inner | {"epoch": 12, "update": 11.918, "s2c_loss": "0.553", "loss": "0.38303", "s2c_nll_loss": "0.553", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "25760", "lr": "0.000171735", "gnorm": "6.96", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6198"} 2023-01-29 17:55:03 | INFO | train_inner | {"epoch": 12, "update": 11.922, "s2c_loss": "0.542", "loss": "0.37595", "s2c_nll_loss": "0.542", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "25770", "lr": "0.000171801", "gnorm": "7.764", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6201"} 2023-01-29 17:55:05 | INFO | train_inner | {"epoch": 12, "update": 11.927, "s2c_loss": "0.623", "loss": "0.4317", "s2c_nll_loss": "0.623", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "25780", "lr": "0.000171868", "gnorm": "7.925", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6203"} 2023-01-29 17:55:08 | INFO | train_inner | {"epoch": 12, "update": 11.932, "s2c_loss": "0.694", "loss": "0.48129", "s2c_nll_loss": "0.694", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "25790", "lr": "0.000171935", "gnorm": "7.59", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6206"} 2023-01-29 17:55:10 | INFO | train_inner | {"epoch": 12, "update": 11.936, "s2c_loss": "0.49", "loss": "0.33942", "s2c_nll_loss": "0.49", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "25800", "lr": "0.000172001", "gnorm": "6.883", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6208"} 
2023-01-29 17:55:13 | INFO | train_inner | {"epoch": 12, "update": 11.941, "s2c_loss": "0.536", "loss": "0.37157", "s2c_nll_loss": "0.536", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "25810", "lr": "0.000172068", "gnorm": "6.829", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6211"} 2023-01-29 17:55:15 | INFO | train_inner | {"epoch": 12, "update": 11.945, "s2c_loss": "0.441", "loss": "0.30562", "s2c_nll_loss": "0.441", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "25820", "lr": "0.000172135", "gnorm": "6.88", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "6213"} 2023-01-29 17:55:18 | INFO | train_inner | {"epoch": 12, "update": 11.95, "s2c_loss": "0.572", "loss": "0.3962", "s2c_nll_loss": "0.572", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "258.9", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "25830", "lr": "0.000172201", "gnorm": "7.478", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6216"} 2023-01-29 17:55:20 | INFO | train_inner | {"epoch": 12, "update": 11.955, "s2c_loss": "0.598", "loss": "0.41561", "s2c_nll_loss": "0.598", "s2c_accuracy": "89.011", "s2c_total": "63.7", "s2c_n_correct": "56.7", "wps": "254.9", "ups": "4", "wpb": "63.7", "bsz": "63.7", "num_updates": "25840", "lr": "0.000172268", "gnorm": "7.243", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "6218"} 2023-01-29 17:55:23 | INFO | train_inner | {"epoch": 12, "update": 11.959, "s2c_loss": "0.559", "loss": "0.38753", "s2c_nll_loss": "0.559", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "25850", "lr": "0.000172335", "gnorm": "6.94", "loss_scale": "2048", "train_wall": "3", 
"gb_free": "7.2", "wall": "6221"} 2023-01-29 17:55:25 | INFO | train_inner | {"epoch": 12, "update": 11.964, "s2c_loss": "0.473", "loss": "0.32803", "s2c_nll_loss": "0.473", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "25860", "lr": "0.000172401", "gnorm": "6.298", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6223"} 2023-01-29 17:55:28 | INFO | train_inner | {"epoch": 12, "update": 11.969, "s2c_loss": "0.53", "loss": "0.36747", "s2c_nll_loss": "0.53", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "25870", "lr": "0.000172468", "gnorm": "6.909", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6226"} 2023-01-29 17:55:31 | INFO | train_inner | {"epoch": 12, "update": 11.973, "s2c_loss": "0.489", "loss": "0.3387", "s2c_nll_loss": "0.489", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "25880", "lr": "0.000172535", "gnorm": "6.442", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6229"} 2023-01-29 17:55:33 | INFO | train_inner | {"epoch": 12, "update": 11.978, "s2c_loss": "0.539", "loss": "0.37393", "s2c_nll_loss": "0.539", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "25890", "lr": "0.000172601", "gnorm": "7.191", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6231"} 2023-01-29 17:55:36 | INFO | train_inner | {"epoch": 12, "update": 11.982, "s2c_loss": "0.642", "loss": "0.44467", "s2c_nll_loss": "0.642", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "25900", "lr": "0.000172668", "gnorm": "7.242", "loss_scale": 
"2048", "train_wall": "2", "gb_free": "7.3", "wall": "6234"} 2023-01-29 17:55:38 | INFO | train_inner | {"epoch": 12, "update": 11.987, "s2c_loss": "0.499", "loss": "0.3456", "s2c_nll_loss": "0.499", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "25910", "lr": "0.000172735", "gnorm": "6.286", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6236"} 2023-01-29 17:55:41 | INFO | train_inner | {"epoch": 12, "update": 11.992, "s2c_loss": "0.467", "loss": "0.32339", "s2c_nll_loss": "0.467", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "25920", "lr": "0.000172801", "gnorm": "7.326", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6239"} 2023-01-29 17:55:43 | INFO | train_inner | {"epoch": 12, "update": 11.996, "s2c_loss": "0.578", "loss": "0.40079", "s2c_nll_loss": "0.578", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "25930", "lr": "0.000172868", "gnorm": "7.582", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6241"} 2023-01-29 17:55:45 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 17:56:00 | INFO | valid | {"epoch": 12, "valid_s2c_loss": "1.134", "valid_loss": "0.78598", "valid_s2c_nll_loss": "1.134", "valid_s2c_accuracy": "80.24", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "25.6435", "valid_num_updates": "25938", "valid_best_s2c_accuracy": "81.037"} 2023-01-29 17:56:00 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 12 @ 25938 updates 2023-01-29 17:56:00 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 17:56:07 | INFO | fairseq.trainer | Finished saving checkpoint to 
/home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 17:56:07 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 12 @ 25938 updates, score 80.24) (writing took 6.953057246282697 seconds) 2023-01-29 17:56:07 | INFO | fairseq_cli.train | end of epoch 12 (average epoch stats below) 2023-01-29 17:56:07 | INFO | train | {"epoch": 12, "train_s2c_loss": "0.528", "train_loss": "0.36612", "train_s2c_nll_loss": "0.528", "train_s2c_accuracy": "90.579", "train_s2c_total": "63.9838", "train_s2c_n_correct": "57.956", "train_wps": "240.3", "train_ups": "3.76", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "25938", "train_lr": "0.000172921", "train_gnorm": "7.115", "train_loss_scale": "2048", "train_train_wall": "540", "train_gb_free": "7.4", "train_wall": "6265"} 2023-01-29 17:56:13 | INFO | fairseq.trainer | begin training epoch 13 2023-01-29 17:56:13 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 17:56:14 | INFO | train_inner | {"epoch": 13, "update": 12.001, "s2c_loss": "0.679", "loss": "0.4706", "s2c_nll_loss": "0.679", "s2c_accuracy": "87.5", "s2c_total": "60.8", "s2c_n_correct": "53.2", "wps": "19.9", "ups": "0.33", "wpb": "60.8", "bsz": "60.8", "num_updates": "25940", "lr": "0.000172935", "gnorm": "7.83", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6272"} 2023-01-29 17:56:16 | INFO | train_inner | {"epoch": 13, "update": 12.006, "s2c_loss": "0.494", "loss": "0.34247", "s2c_nll_loss": "0.494", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "245.4", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "25950", "lr": "0.000173001", "gnorm": "6.495", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6274"} 2023-01-29 17:56:19 | INFO | train_inner | {"epoch": 13, "update": 12.01, "s2c_loss": "0.72", "loss": "0.4988", "s2c_nll_loss": "0.72", "s2c_accuracy": "88.125", 
"s2c_total": "64", "s2c_n_correct": "56.4", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "25960", "lr": "0.000173068", "gnorm": "7.153", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6277"} 2023-01-29 17:56:22 | INFO | train_inner | {"epoch": 13, "update": 12.015, "s2c_loss": "0.574", "loss": "0.39781", "s2c_nll_loss": "0.574", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "25970", "lr": "0.000173135", "gnorm": "5.741", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6279"} 2023-01-29 17:56:24 | INFO | train_inner | {"epoch": 13, "update": 12.019, "s2c_loss": "0.333", "loss": "0.23101", "s2c_nll_loss": "0.333", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "25980", "lr": "0.000173201", "gnorm": "5.412", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6282"} 2023-01-29 17:56:27 | INFO | train_inner | {"epoch": 13, "update": 12.024, "s2c_loss": "0.474", "loss": "0.32824", "s2c_nll_loss": "0.474", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "25990", "lr": "0.000173268", "gnorm": "6.919", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6285"} 2023-01-29 17:56:29 | INFO | train_inner | {"epoch": 13, "update": 12.029, "s2c_loss": "0.456", "loss": "0.31619", "s2c_nll_loss": "0.456", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "26000", "lr": "0.000173335", "gnorm": "7.019", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6287"} 2023-01-29 17:56:32 | INFO | train_inner | {"epoch": 13, "update": 12.033, "s2c_loss": "0.587", "loss": "0.40684", "s2c_nll_loss": "0.587", 
"s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "26010", "lr": "0.000173401", "gnorm": "6.87", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6290"} 2023-01-29 17:56:34 | INFO | train_inner | {"epoch": 13, "update": 12.038, "s2c_loss": "0.365", "loss": "0.25334", "s2c_nll_loss": "0.365", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "26020", "lr": "0.000173468", "gnorm": "6.021", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6292"} 2023-01-29 17:56:37 | INFO | train_inner | {"epoch": 13, "update": 12.043, "s2c_loss": "0.547", "loss": "0.37949", "s2c_nll_loss": "0.547", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "26030", "lr": "0.000173535", "gnorm": "7.031", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6295"} 2023-01-29 17:56:39 | INFO | train_inner | {"epoch": 13, "update": 12.047, "s2c_loss": "0.463", "loss": "0.32107", "s2c_nll_loss": "0.463", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "26040", "lr": "0.000173601", "gnorm": "6.307", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6297"} 2023-01-29 17:56:42 | INFO | train_inner | {"epoch": 13, "update": 12.052, "s2c_loss": "0.501", "loss": "0.34738", "s2c_nll_loss": "0.501", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "26050", "lr": "0.000173668", "gnorm": "6.71", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6300"} 2023-01-29 17:56:45 | INFO | train_inner | {"epoch": 13, "update": 12.056, "s2c_loss": "0.447", "loss": "0.30978", 
"s2c_nll_loss": "0.447", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "26060", "lr": "0.000173735", "gnorm": "6.328", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6302"} 2023-01-29 17:56:47 | INFO | train_inner | {"epoch": 13, "update": 12.061, "s2c_loss": "0.387", "loss": "0.26852", "s2c_nll_loss": "0.387", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "26070", "lr": "0.000173801", "gnorm": "5.65", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6305"} 2023-01-29 17:56:50 | INFO | train_inner | {"epoch": 13, "update": 12.066, "s2c_loss": "0.492", "loss": "0.34094", "s2c_nll_loss": "0.492", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "26080", "lr": "0.000173868", "gnorm": "7.204", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6308"} 2023-01-29 17:56:52 | INFO | train_inner | {"epoch": 13, "update": 12.07, "s2c_loss": "0.532", "loss": "0.36904", "s2c_nll_loss": "0.532", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "26090", "lr": "0.000173935", "gnorm": "7.215", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6310"} 2023-01-29 17:56:55 | INFO | train_inner | {"epoch": 13, "update": 12.075, "s2c_loss": "0.514", "loss": "0.35655", "s2c_nll_loss": "0.514", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "248", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "26100", "lr": "0.000174001", "gnorm": "7.565", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6313"} 2023-01-29 17:56:57 | INFO | train_inner | {"epoch": 13, "update": 12.08, "s2c_loss": "0.399", 
"loss": "0.2768", "s2c_nll_loss": "0.399", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "26110", "lr": "0.000174068", "gnorm": "6.303", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6315"} 2023-01-29 17:57:00 | INFO | train_inner | {"epoch": 13, "update": 12.084, "s2c_loss": "0.393", "loss": "0.27259", "s2c_nll_loss": "0.393", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "26120", "lr": "0.000174135", "gnorm": "5.791", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6318"} 2023-01-29 17:57:02 | INFO | train_inner | {"epoch": 13, "update": 12.089, "s2c_loss": "0.322", "loss": "0.22345", "s2c_nll_loss": "0.322", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "26130", "lr": "0.000174201", "gnorm": "5.605", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6320"} 2023-01-29 17:57:05 | INFO | train_inner | {"epoch": 13, "update": 12.093, "s2c_loss": "0.502", "loss": "0.34809", "s2c_nll_loss": "0.502", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "26140", "lr": "0.000174268", "gnorm": "6.767", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6323"} 2023-01-29 17:57:07 | INFO | train_inner | {"epoch": 13, "update": 12.098, "s2c_loss": "0.576", "loss": "0.399", "s2c_nll_loss": "0.576", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "26150", "lr": "0.000174335", "gnorm": "6.397", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6325"} 2023-01-29 17:57:10 | INFO | train_inner | {"epoch": 13, "update": 12.103, 
"s2c_loss": "0.431", "loss": "0.29871", "s2c_nll_loss": "0.431", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "26160", "lr": "0.000174401", "gnorm": "6.091", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6328"} 2023-01-29 17:57:12 | INFO | train_inner | {"epoch": 13, "update": 12.107, "s2c_loss": "0.372", "loss": "0.25811", "s2c_nll_loss": "0.372", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "26170", "lr": "0.000174468", "gnorm": "5.864", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6330"} 2023-01-29 17:57:15 | INFO | train_inner | {"epoch": 13, "update": 12.112, "s2c_loss": "0.536", "loss": "0.37174", "s2c_nll_loss": "0.536", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "26180", "lr": "0.000174535", "gnorm": "6.958", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6333"} 2023-01-29 17:57:17 | INFO | train_inner | {"epoch": 13, "update": 12.117, "s2c_loss": "0.4", "loss": "0.27754", "s2c_nll_loss": "0.4", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "26190", "lr": "0.000174601", "gnorm": "6.164", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6335"} 2023-01-29 17:57:20 | INFO | train_inner | {"epoch": 13, "update": 12.121, "s2c_loss": "0.626", "loss": "0.43421", "s2c_nll_loss": "0.626", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "26200", "lr": "0.000174668", "gnorm": "6.375", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6338"} 2023-01-29 17:57:23 | INFO | train_inner | {"epoch": 
13, "update": 12.126, "s2c_loss": "0.34", "loss": "0.23559", "s2c_nll_loss": "0.34", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "26210", "lr": "0.000174735", "gnorm": "5.197", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6340"} 2023-01-29 17:57:25 | INFO | train_inner | {"epoch": 13, "update": 12.13, "s2c_loss": "0.481", "loss": "0.33356", "s2c_nll_loss": "0.481", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "26220", "lr": "0.000174801", "gnorm": "6.411", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6343"} 2023-01-29 17:57:28 | INFO | train_inner | {"epoch": 13, "update": 12.135, "s2c_loss": "0.382", "loss": "0.26458", "s2c_nll_loss": "0.382", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "26230", "lr": "0.000174868", "gnorm": "5.674", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6346"} 2023-01-29 17:57:30 | INFO | train_inner | {"epoch": 13, "update": 12.14, "s2c_loss": "0.411", "loss": "0.28514", "s2c_nll_loss": "0.411", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "26240", "lr": "0.000174935", "gnorm": "6.791", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6348"} 2023-01-29 17:57:33 | INFO | train_inner | {"epoch": 13, "update": 12.144, "s2c_loss": "0.479", "loss": "0.33202", "s2c_nll_loss": "0.479", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "26250", "lr": "0.000175001", "gnorm": "7.305", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "6351"} 2023-01-29 17:57:35 | INFO | 
train_inner | {"epoch": 13, "update": 12.149, "s2c_loss": "0.368", "loss": "0.25515", "s2c_nll_loss": "0.368", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "26260", "lr": "0.000175068", "gnorm": "5.345", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6353"} 2023-01-29 17:57:38 | INFO | train_inner | {"epoch": 13, "update": 12.154, "s2c_loss": "0.383", "loss": "0.26555", "s2c_nll_loss": "0.383", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "26270", "lr": "0.000175135", "gnorm": "5.93", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6356"} 2023-01-29 17:57:40 | INFO | train_inner | {"epoch": 13, "update": 12.158, "s2c_loss": "0.579", "loss": "0.40151", "s2c_nll_loss": "0.579", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "26280", "lr": "0.000175201", "gnorm": "7.507", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6358"} 2023-01-29 17:57:43 | INFO | train_inner | {"epoch": 13, "update": 12.163, "s2c_loss": "0.643", "loss": "0.44551", "s2c_nll_loss": "0.643", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "26290", "lr": "0.000175268", "gnorm": "7.367", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "6361"} 2023-01-29 17:57:45 | INFO | train_inner | {"epoch": 13, "update": 12.167, "s2c_loss": "0.439", "loss": "0.30431", "s2c_nll_loss": "0.439", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "26300", "lr": "0.000175335", "gnorm": "6.284", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6363"} 2023-01-29 
17:57:48 | INFO | train_inner | {"epoch": 13, "update": 12.172, "s2c_loss": "0.544", "loss": "0.37698", "s2c_nll_loss": "0.544", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "26310", "lr": "0.000175401", "gnorm": "7.27", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6366"} 2023-01-29 17:57:50 | INFO | train_inner | {"epoch": 13, "update": 12.177, "s2c_loss": "0.485", "loss": "0.33618", "s2c_nll_loss": "0.485", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "26320", "lr": "0.000175468", "gnorm": "7.215", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6368"} 2023-01-29 17:57:53 | INFO | train_inner | {"epoch": 13, "update": 12.181, "s2c_loss": "0.41", "loss": "0.28398", "s2c_nll_loss": "0.41", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "26330", "lr": "0.000175535", "gnorm": "6.759", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6371"} 2023-01-29 17:57:55 | INFO | train_inner | {"epoch": 13, "update": 12.186, "s2c_loss": "0.455", "loss": "0.31528", "s2c_nll_loss": "0.455", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "26340", "lr": "0.000175601", "gnorm": "7.737", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6373"} 2023-01-29 17:57:58 | INFO | train_inner | {"epoch": 13, "update": 12.191, "s2c_loss": "0.651", "loss": "0.45149", "s2c_nll_loss": "0.651", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "26350", "lr": "0.000175668", "gnorm": "8.333", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", 
"wall": "6376"} 2023-01-29 17:58:01 | INFO | train_inner | {"epoch": 13, "update": 12.195, "s2c_loss": "0.357", "loss": "0.24751", "s2c_nll_loss": "0.357", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "26360", "lr": "0.000175735", "gnorm": "6.066", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6378"} 2023-01-29 17:58:03 | INFO | train_inner | {"epoch": 13, "update": 12.2, "s2c_loss": "0.5", "loss": "0.34686", "s2c_nll_loss": "0.5", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "26370", "lr": "0.000175801", "gnorm": "6.535", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6381"} 2023-01-29 17:58:06 | INFO | train_inner | {"epoch": 13, "update": 12.204, "s2c_loss": "0.502", "loss": "0.34777", "s2c_nll_loss": "0.502", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "26380", "lr": "0.000175868", "gnorm": "6.807", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6383"} 2023-01-29 17:58:08 | INFO | train_inner | {"epoch": 13, "update": 12.209, "s2c_loss": "0.573", "loss": "0.39696", "s2c_nll_loss": "0.573", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "26390", "lr": "0.000175935", "gnorm": "6.519", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6386"} 2023-01-29 17:58:11 | INFO | train_inner | {"epoch": 13, "update": 12.214, "s2c_loss": "0.481", "loss": "0.33345", "s2c_nll_loss": "0.481", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "26400", "lr": "0.000176001", "gnorm": "6.644", "loss_scale": "2048", "train_wall": "2", 
"gb_free": "7.2", "wall": "6389"} 2023-01-29 17:58:13 | INFO | train_inner | {"epoch": 13, "update": 12.218, "s2c_loss": "0.369", "loss": "0.25589", "s2c_nll_loss": "0.369", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "26410", "lr": "0.000176068", "gnorm": "5.608", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6391"} 2023-01-29 17:58:16 | INFO | train_inner | {"epoch": 13, "update": 12.223, "s2c_loss": "0.409", "loss": "0.28384", "s2c_nll_loss": "0.409", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "26420", "lr": "0.000176135", "gnorm": "5.485", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6394"} 2023-01-29 17:58:18 | INFO | train_inner | {"epoch": 13, "update": 12.228, "s2c_loss": "0.415", "loss": "0.2875", "s2c_nll_loss": "0.415", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "26430", "lr": "0.000176201", "gnorm": "6.546", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6396"} 2023-01-29 17:58:21 | INFO | train_inner | {"epoch": 13, "update": 12.232, "s2c_loss": "0.39", "loss": "0.27049", "s2c_nll_loss": "0.39", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "26440", "lr": "0.000176268", "gnorm": "5.794", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6399"} 2023-01-29 17:58:23 | INFO | train_inner | {"epoch": 13, "update": 12.237, "s2c_loss": "0.664", "loss": "0.45992", "s2c_nll_loss": "0.664", "s2c_accuracy": "86.875", "s2c_total": "64", "s2c_n_correct": "55.6", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "26450", "lr": "0.000176335", "gnorm": "7.785", "loss_scale": 
"2048", "train_wall": "2", "gb_free": "7.2", "wall": "6401"} 2023-01-29 17:58:26 | INFO | train_inner | {"epoch": 13, "update": 12.241, "s2c_loss": "0.638", "loss": "0.44193", "s2c_nll_loss": "0.638", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "26460", "lr": "0.000176401", "gnorm": "8.47", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6404"} 2023-01-29 17:58:28 | INFO | train_inner | {"epoch": 13, "update": 12.246, "s2c_loss": "0.482", "loss": "0.33289", "s2c_nll_loss": "0.482", "s2c_accuracy": "91.994", "s2c_total": "63.7", "s2c_n_correct": "58.6", "wps": "253.6", "ups": "3.98", "wpb": "63.7", "bsz": "63.7", "num_updates": "26470", "lr": "0.000176468", "gnorm": "7.124", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6406"} 2023-01-29 17:58:31 | INFO | train_inner | {"epoch": 13, "update": 12.251, "s2c_loss": "0.437", "loss": "0.30278", "s2c_nll_loss": "0.437", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "26480", "lr": "0.000176535", "gnorm": "6.127", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6409"} 2023-01-29 17:58:33 | INFO | train_inner | {"epoch": 13, "update": 12.255, "s2c_loss": "0.405", "loss": "0.28057", "s2c_nll_loss": "0.405", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "26490", "lr": "0.000176601", "gnorm": "5.935", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6411"} 2023-01-29 17:58:36 | INFO | train_inner | {"epoch": 13, "update": 12.26, "s2c_loss": "0.582", "loss": "0.40346", "s2c_nll_loss": "0.582", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "26500", "lr": "0.000176668", 
"gnorm": "7.017", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6414"} 2023-01-29 17:58:39 | INFO | train_inner | {"epoch": 13, "update": 12.265, "s2c_loss": "0.403", "loss": "0.27909", "s2c_nll_loss": "0.403", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "26510", "lr": "0.000176734", "gnorm": "6.892", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6416"} 2023-01-29 17:58:41 | INFO | train_inner | {"epoch": 13, "update": 12.269, "s2c_loss": "0.434", "loss": "0.30103", "s2c_nll_loss": "0.434", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "26520", "lr": "0.000176801", "gnorm": "6.789", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6419"} 2023-01-29 17:58:44 | INFO | train_inner | {"epoch": 13, "update": 12.274, "s2c_loss": "0.569", "loss": "0.39406", "s2c_nll_loss": "0.569", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "26530", "lr": "0.000176868", "gnorm": "7.298", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6422"} 2023-01-29 17:58:46 | INFO | train_inner | {"epoch": 13, "update": 12.278, "s2c_loss": "0.724", "loss": "0.50187", "s2c_nll_loss": "0.724", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "26540", "lr": "0.000176934", "gnorm": "7.642", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6424"} 2023-01-29 17:58:49 | INFO | train_inner | {"epoch": 13, "update": 12.283, "s2c_loss": "0.436", "loss": "0.30209", "s2c_nll_loss": "0.436", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": 
"26550", "lr": "0.000177001", "gnorm": "6.57", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "6427"} 2023-01-29 17:58:51 | INFO | train_inner | {"epoch": 13, "update": 12.288, "s2c_loss": "0.584", "loss": "0.40473", "s2c_nll_loss": "0.584", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "26560", "lr": "0.000177068", "gnorm": "7.15", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "6429"} 2023-01-29 17:58:54 | INFO | train_inner | {"epoch": 13, "update": 12.292, "s2c_loss": "0.736", "loss": "0.51035", "s2c_nll_loss": "0.736", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "26570", "lr": "0.000177134", "gnorm": "7.035", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "6432"} 2023-01-29 17:58:56 | INFO | train_inner | {"epoch": 13, "update": 12.297, "s2c_loss": "0.428", "loss": "0.29683", "s2c_nll_loss": "0.428", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "257", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "26580", "lr": "0.000177201", "gnorm": "6.321", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "6434"} 2023-01-29 17:58:59 | INFO | train_inner | {"epoch": 13, "update": 12.302, "s2c_loss": "0.463", "loss": "0.3211", "s2c_nll_loss": "0.463", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "26590", "lr": "0.000177268", "gnorm": "6.299", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "6437"} 2023-01-29 17:59:01 | INFO | train_inner | {"epoch": 13, "update": 12.306, "s2c_loss": "0.45", "loss": "0.31158", "s2c_nll_loss": "0.45", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", 
"num_updates": "26600", "lr": "0.000177334", "gnorm": "6.621", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "6439"} 2023-01-29 17:59:03 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 2048.0 2023-01-29 17:59:04 | INFO | train_inner | {"epoch": 13, "update": 12.311, "s2c_loss": "0.496", "loss": "0.3437", "s2c_nll_loss": "0.496", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "230.9", "ups": "3.61", "wpb": "64", "bsz": "64", "num_updates": "26610", "lr": "0.000177401", "gnorm": "8.356", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6442"} 2023-01-29 17:59:07 | INFO | train_inner | {"epoch": 13, "update": 12.316, "s2c_loss": "0.548", "loss": "0.3799", "s2c_nll_loss": "0.548", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "26620", "lr": "0.000177468", "gnorm": "6.598", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "6445"} 2023-01-29 17:59:09 | INFO | train_inner | {"epoch": 13, "update": 12.321, "s2c_loss": "0.538", "loss": "0.37289", "s2c_nll_loss": "0.538", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "26630", "lr": "0.000177534", "gnorm": "7.105", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6447"} 2023-01-29 17:59:12 | INFO | train_inner | {"epoch": 13, "update": 12.325, "s2c_loss": "0.489", "loss": "0.3388", "s2c_nll_loss": "0.489", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "26640", "lr": "0.000177601", "gnorm": "6.293", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6450"} 2023-01-29 17:59:14 | INFO | train_inner | {"epoch": 13, "update": 12.33, "s2c_loss": "0.474", "loss": "0.32858", 
"s2c_nll_loss": "0.474", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "26650", "lr": "0.000177668", "gnorm": "6.389", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6452"} 2023-01-29 17:59:17 | INFO | train_inner | {"epoch": 13, "update": 12.334, "s2c_loss": "0.574", "loss": "0.39805", "s2c_nll_loss": "0.574", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "26660", "lr": "0.000177734", "gnorm": "7.122", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "6455"} 2023-01-29 17:59:19 | INFO | train_inner | {"epoch": 13, "update": 12.339, "s2c_loss": "0.457", "loss": "0.3165", "s2c_nll_loss": "0.457", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "26670", "lr": "0.000177801", "gnorm": "8.222", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6457"} 2023-01-29 17:59:22 | INFO | train_inner | {"epoch": 13, "update": 12.344, "s2c_loss": "0.495", "loss": "0.34324", "s2c_nll_loss": "0.495", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "26680", "lr": "0.000177868", "gnorm": "7.08", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "6460"} 2023-01-29 17:59:24 | INFO | train_inner | {"epoch": 13, "update": 12.348, "s2c_loss": "0.522", "loss": "0.36209", "s2c_nll_loss": "0.522", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "26690", "lr": "0.000177934", "gnorm": "7.13", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6462"} 2023-01-29 17:59:27 | INFO | train_inner | {"epoch": 13, "update": 12.353, "s2c_loss": "0.564", 
"loss": "0.39075", "s2c_nll_loss": "0.564", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "26700", "lr": "0.000178001", "gnorm": "6.581", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6465"} 2023-01-29 17:59:29 | INFO | train_inner | {"epoch": 13, "update": 12.358, "s2c_loss": "0.462", "loss": "0.31997", "s2c_nll_loss": "0.462", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "26710", "lr": "0.000178068", "gnorm": "7.036", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6467"} 2023-01-29 17:59:32 | INFO | train_inner | {"epoch": 13, "update": 12.362, "s2c_loss": "0.591", "loss": "0.40987", "s2c_nll_loss": "0.591", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "26720", "lr": "0.000178134", "gnorm": "7.737", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "6470"} 2023-01-29 17:59:35 | INFO | train_inner | {"epoch": 13, "update": 12.367, "s2c_loss": "0.584", "loss": "0.40492", "s2c_nll_loss": "0.584", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "26730", "lr": "0.000178201", "gnorm": "7.72", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6473"} 2023-01-29 17:59:37 | INFO | train_inner | {"epoch": 13, "update": 12.371, "s2c_loss": "0.692", "loss": "0.47975", "s2c_nll_loss": "0.692", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "26740", "lr": "0.000178268", "gnorm": "9.164", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "6475"} 2023-01-29 17:59:40 | INFO | train_inner | {"epoch": 13, "update": 12.376, 
"s2c_loss": "0.714", "loss": "0.49478", "s2c_nll_loss": "0.714", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "26750", "lr": "0.000178334", "gnorm": "7.927", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6478"} 2023-01-29 17:59:42 | INFO | train_inner | {"epoch": 13, "update": 12.381, "s2c_loss": "0.528", "loss": "0.36579", "s2c_nll_loss": "0.528", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "26760", "lr": "0.000178401", "gnorm": "7.015", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6480"} 2023-01-29 17:59:45 | INFO | train_inner | {"epoch": 13, "update": 12.385, "s2c_loss": "0.596", "loss": "0.41341", "s2c_nll_loss": "0.596", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "256.3", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "26770", "lr": "0.000178468", "gnorm": "7.366", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6483"} 2023-01-29 17:59:47 | INFO | train_inner | {"epoch": 13, "update": 12.39, "s2c_loss": "0.484", "loss": "0.3353", "s2c_nll_loss": "0.484", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "26780", "lr": "0.000178534", "gnorm": "6.43", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6485"} 2023-01-29 17:59:50 | INFO | train_inner | {"epoch": 13, "update": 12.395, "s2c_loss": "0.534", "loss": "0.37045", "s2c_nll_loss": "0.534", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "26790", "lr": "0.000178601", "gnorm": "7.046", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6488"} 2023-01-29 17:59:52 | INFO | train_inner | {"epoch": 13, 
"update": 12.399, "s2c_loss": "0.598", "loss": "0.41432", "s2c_nll_loss": "0.598", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "26800", "lr": "0.000178668", "gnorm": "7.504", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6490"} 2023-01-29 17:59:55 | INFO | train_inner | {"epoch": 13, "update": 12.404, "s2c_loss": "0.63", "loss": "0.4366", "s2c_nll_loss": "0.63", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "26810", "lr": "0.000178734", "gnorm": "6.561", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "6493"} 2023-01-29 17:59:57 | INFO | train_inner | {"epoch": 13, "update": 12.408, "s2c_loss": "0.563", "loss": "0.39057", "s2c_nll_loss": "0.563", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "26820", "lr": "0.000178801", "gnorm": "7.382", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6495"} 2023-01-29 18:00:00 | INFO | train_inner | {"epoch": 13, "update": 12.413, "s2c_loss": "0.573", "loss": "0.39684", "s2c_nll_loss": "0.573", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "26830", "lr": "0.000178868", "gnorm": "7.133", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6498"} 2023-01-29 18:00:02 | INFO | train_inner | {"epoch": 13, "update": 12.418, "s2c_loss": "0.546", "loss": "0.37828", "s2c_nll_loss": "0.546", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "26840", "lr": "0.000178934", "gnorm": "6.775", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6500"} 2023-01-29 18:00:05 | INFO | train_inner 
| {"epoch": 13, "update": 12.422, "s2c_loss": "0.612", "loss": "0.42452", "s2c_nll_loss": "0.612", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "26850", "lr": "0.000179001", "gnorm": "7.052", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6503"} 2023-01-29 18:00:08 | INFO | train_inner | {"epoch": 13, "update": 12.427, "s2c_loss": "0.597", "loss": "0.41371", "s2c_nll_loss": "0.597", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "26860", "lr": "0.000179068", "gnorm": "6.813", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6505"} 2023-01-29 18:00:10 | INFO | train_inner | {"epoch": 13, "update": 12.432, "s2c_loss": "0.435", "loss": "0.30164", "s2c_nll_loss": "0.435", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "26870", "lr": "0.000179134", "gnorm": "6.196", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6508"} 2023-01-29 18:00:13 | INFO | train_inner | {"epoch": 13, "update": 12.436, "s2c_loss": "0.44", "loss": "0.30528", "s2c_nll_loss": "0.44", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "26880", "lr": "0.000179201", "gnorm": "5.922", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6511"} 2023-01-29 18:00:15 | INFO | train_inner | {"epoch": 13, "update": 12.441, "s2c_loss": "0.617", "loss": "0.42753", "s2c_nll_loss": "0.617", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "26890", "lr": "0.000179268", "gnorm": "7.809", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6513"} 2023-01-29 18:00:18 | 
INFO | train_inner | {"epoch": 13, "update": 12.445, "s2c_loss": "0.493", "loss": "0.34174", "s2c_nll_loss": "0.493", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "26900", "lr": "0.000179334", "gnorm": "7.302", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6516"} 2023-01-29 18:00:20 | INFO | train_inner | {"epoch": 13, "update": 12.45, "s2c_loss": "0.552", "loss": "0.38265", "s2c_nll_loss": "0.552", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "26910", "lr": "0.000179401", "gnorm": "6.923", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6518"} 2023-01-29 18:00:23 | INFO | train_inner | {"epoch": 13, "update": 12.455, "s2c_loss": "0.488", "loss": "0.33844", "s2c_nll_loss": "0.488", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "26920", "lr": "0.000179468", "gnorm": "5.91", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "6521"} 2023-01-29 18:00:25 | INFO | train_inner | {"epoch": 13, "update": 12.459, "s2c_loss": "0.573", "loss": "0.39745", "s2c_nll_loss": "0.573", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "26930", "lr": "0.000179534", "gnorm": "8.06", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6523"} 2023-01-29 18:00:28 | INFO | train_inner | {"epoch": 13, "update": 12.464, "s2c_loss": "0.569", "loss": "0.3946", "s2c_nll_loss": "0.569", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "26940", "lr": "0.000179601", "gnorm": "6.983", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6526"} 
2023-01-29 18:00:30 | INFO | train_inner | {"epoch": 13, "update": 12.469, "s2c_loss": "0.487", "loss": "0.33752", "s2c_nll_loss": "0.487", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "26950", "lr": "0.000179668", "gnorm": "7.078", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "6528"} 2023-01-29 18:00:33 | INFO | train_inner | {"epoch": 13, "update": 12.473, "s2c_loss": "0.533", "loss": "0.36969", "s2c_nll_loss": "0.533", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "26960", "lr": "0.000179734", "gnorm": "7.285", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "6531"} 2023-01-29 18:00:36 | INFO | train_inner | {"epoch": 13, "update": 12.478, "s2c_loss": "0.603", "loss": "0.41786", "s2c_nll_loss": "0.603", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "246.6", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "26970", "lr": "0.000179801", "gnorm": "7.565", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6533"} 2023-01-29 18:00:38 | INFO | train_inner | {"epoch": 13, "update": 12.482, "s2c_loss": "0.665", "loss": "0.46113", "s2c_nll_loss": "0.665", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "246.8", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "26980", "lr": "0.000179868", "gnorm": "7.717", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6536"} 2023-01-29 18:00:41 | INFO | train_inner | {"epoch": 13, "update": 12.487, "s2c_loss": "0.599", "loss": "0.41544", "s2c_nll_loss": "0.599", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "26990", "lr": "0.000179934", "gnorm": "7.811", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", 
"wall": "6539"} 2023-01-29 18:00:43 | INFO | train_inner | {"epoch": 13, "update": 12.492, "s2c_loss": "0.696", "loss": "0.48214", "s2c_nll_loss": "0.696", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "27000", "lr": "0.000180001", "gnorm": "9.104", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6541"} 2023-01-29 18:00:46 | INFO | train_inner | {"epoch": 13, "update": 12.496, "s2c_loss": "0.574", "loss": "0.3978", "s2c_nll_loss": "0.574", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "27010", "lr": "0.000180068", "gnorm": "7.297", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "6544"} 2023-01-29 18:00:48 | INFO | train_inner | {"epoch": 13, "update": 12.501, "s2c_loss": "0.696", "loss": "0.48241", "s2c_nll_loss": "0.696", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "246.3", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "27020", "lr": "0.000180134", "gnorm": "7.373", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "6546"} 2023-01-29 18:00:51 | INFO | train_inner | {"epoch": 13, "update": 12.506, "s2c_loss": "0.57", "loss": "0.39527", "s2c_nll_loss": "0.57", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "27030", "lr": "0.000180201", "gnorm": "7.03", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "6549"} 2023-01-29 18:00:52 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 1024.0 2023-01-29 18:00:54 | INFO | train_inner | {"epoch": 13, "update": 12.511, "s2c_loss": "0.46", "loss": "0.31916", "s2c_nll_loss": "0.46", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "230.8", "ups": 
"3.61", "wpb": "64", "bsz": "64", "num_updates": "27040", "lr": "0.000180268", "gnorm": "6.9", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "6552"} 2023-01-29 18:00:56 | INFO | train_inner | {"epoch": 13, "update": 12.515, "s2c_loss": "0.469", "loss": "0.32534", "s2c_nll_loss": "0.469", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "262.2", "ups": "4.1", "wpb": "64", "bsz": "64", "num_updates": "27050", "lr": "0.000180334", "gnorm": "6.93", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "6554"} 2023-01-29 18:00:59 | INFO | train_inner | {"epoch": 13, "update": 12.52, "s2c_loss": "0.465", "loss": "0.32231", "s2c_nll_loss": "0.465", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "27060", "lr": "0.000180401", "gnorm": "7.114", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "6557"} 2023-01-29 18:01:01 | INFO | train_inner | {"epoch": 13, "update": 12.525, "s2c_loss": "0.559", "loss": "0.38768", "s2c_nll_loss": "0.559", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "27070", "lr": "0.000180468", "gnorm": "7.984", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "6559"} 2023-01-29 18:01:04 | INFO | train_inner | {"epoch": 13, "update": 12.529, "s2c_loss": "0.608", "loss": "0.42136", "s2c_nll_loss": "0.608", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "27080", "lr": "0.000180534", "gnorm": "7.922", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "6562"} 2023-01-29 18:01:06 | INFO | train_inner | {"epoch": 13, "update": 12.534, "s2c_loss": "0.637", "loss": "0.44146", "s2c_nll_loss": "0.637", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": 
"257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "27090", "lr": "0.000180601", "gnorm": "7.321", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "6564"} 2023-01-29 18:01:09 | INFO | train_inner | {"epoch": 13, "update": 12.538, "s2c_loss": "0.564", "loss": "0.39115", "s2c_nll_loss": "0.564", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "27100", "lr": "0.000180668", "gnorm": "7.278", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "6567"} 2023-01-29 18:01:11 | INFO | train_inner | {"epoch": 13, "update": 12.543, "s2c_loss": "0.68", "loss": "0.47135", "s2c_nll_loss": "0.68", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "27110", "lr": "0.000180734", "gnorm": "7.915", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "6569"} 2023-01-29 18:01:14 | INFO | train_inner | {"epoch": 13, "update": 12.548, "s2c_loss": "0.613", "loss": "0.42519", "s2c_nll_loss": "0.613", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "27120", "lr": "0.000180801", "gnorm": "6.83", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "6572"} 2023-01-29 18:01:16 | INFO | train_inner | {"epoch": 13, "update": 12.552, "s2c_loss": "0.594", "loss": "0.41162", "s2c_nll_loss": "0.594", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "27130", "lr": "0.000180868", "gnorm": "7.389", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "6574"} 2023-01-29 18:01:19 | INFO | train_inner | {"epoch": 13, "update": 12.557, "s2c_loss": "0.66", "loss": "0.45743", "s2c_nll_loss": "0.66", "s2c_accuracy": "88.438", "s2c_total": "64", 
"s2c_n_correct": "56.6", "wps": "258.5", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "27140", "lr": "0.000180934", "gnorm": "7.326", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "6577"} 2023-01-29 18:01:21 | INFO | train_inner | {"epoch": 13, "update": 12.562, "s2c_loss": "0.568", "loss": "0.39341", "s2c_nll_loss": "0.568", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "27150", "lr": "0.000181001", "gnorm": "8.085", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "6579"} 2023-01-29 18:01:24 | INFO | train_inner | {"epoch": 13, "update": 12.566, "s2c_loss": "0.451", "loss": "0.31237", "s2c_nll_loss": "0.451", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "27160", "lr": "0.000181068", "gnorm": "6.158", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "6582"} 2023-01-29 18:01:26 | INFO | train_inner | {"epoch": 13, "update": 12.571, "s2c_loss": "0.549", "loss": "0.38081", "s2c_nll_loss": "0.549", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "27170", "lr": "0.000181134", "gnorm": "7.252", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "6584"} 2023-01-29 18:01:29 | INFO | train_inner | {"epoch": 13, "update": 12.575, "s2c_loss": "0.425", "loss": "0.29484", "s2c_nll_loss": "0.425", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "27180", "lr": "0.000181201", "gnorm": "6.309", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "6587"} 2023-01-29 18:01:31 | INFO | train_inner | {"epoch": 13, "update": 12.58, "s2c_loss": "0.546", "loss": "0.3782", "s2c_nll_loss": "0.546", "s2c_accuracy": 
"90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "27190", "lr": "0.000181268", "gnorm": "6.289", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "6589"} 2023-01-29 18:01:34 | INFO | train_inner | {"epoch": 13, "update": 12.585, "s2c_loss": "0.488", "loss": "0.33823", "s2c_nll_loss": "0.488", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "262.4", "ups": "4.1", "wpb": "64", "bsz": "64", "num_updates": "27200", "lr": "0.000181334", "gnorm": "5.986", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "6592"} 2023-01-29 18:01:36 | INFO | train_inner | {"epoch": 13, "update": 12.589, "s2c_loss": "0.511", "loss": "0.35448", "s2c_nll_loss": "0.511", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "27210", "lr": "0.000181401", "gnorm": "6.847", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "6594"} 2023-01-29 18:01:39 | INFO | train_inner | {"epoch": 13, "update": 12.594, "s2c_loss": "0.516", "loss": "0.35741", "s2c_nll_loss": "0.516", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "27220", "lr": "0.000181468", "gnorm": "6.555", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "6597"} 2023-01-29 18:01:41 | INFO | train_inner | {"epoch": 13, "update": 12.599, "s2c_loss": "0.513", "loss": "0.35569", "s2c_nll_loss": "0.513", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "27230", "lr": "0.000181534", "gnorm": "7.064", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "6599"} 2023-01-29 18:01:44 | INFO | train_inner | {"epoch": 13, "update": 12.603, "s2c_loss": "0.506", "loss": "0.35101", "s2c_nll_loss": 
"0.506", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "27240", "lr": "0.000181601", "gnorm": "7.63", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "6602"} 2023-01-29 18:01:46 | INFO | train_inner | {"epoch": 13, "update": 12.608, "s2c_loss": "0.578", "loss": "0.40082", "s2c_nll_loss": "0.578", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "27250", "lr": "0.000181668", "gnorm": "6.957", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "6604"} 2023-01-29 18:01:49 | INFO | train_inner | {"epoch": 13, "update": 12.612, "s2c_loss": "0.497", "loss": "0.34417", "s2c_nll_loss": "0.497", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "243.5", "ups": "3.8", "wpb": "64", "bsz": "64", "num_updates": "27260", "lr": "0.000181734", "gnorm": "7.479", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "6607"} 2023-01-29 18:01:52 | INFO | train_inner | {"epoch": 13, "update": 12.617, "s2c_loss": "0.7", "loss": "0.48496", "s2c_nll_loss": "0.7", "s2c_accuracy": "86.406", "s2c_total": "64", "s2c_n_correct": "55.3", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "27270", "lr": "0.000181801", "gnorm": "8.449", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "6609"} 2023-01-29 18:01:54 | INFO | train_inner | {"epoch": 13, "update": 12.622, "s2c_loss": "0.468", "loss": "0.32417", "s2c_nll_loss": "0.468", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "27280", "lr": "0.000181868", "gnorm": "7.752", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "6612"} 2023-01-29 18:01:57 | INFO | train_inner | {"epoch": 13, "update": 12.626, "s2c_loss": "0.483", "loss": "0.3349", 
"s2c_nll_loss": "0.483", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "27290", "lr": "0.000181934", "gnorm": "6.903", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "6615"} 2023-01-29 18:01:59 | INFO | train_inner | {"epoch": 13, "update": 12.631, "s2c_loss": "0.399", "loss": "0.27643", "s2c_nll_loss": "0.399", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "27300", "lr": "0.000182001", "gnorm": "5.383", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "6617"} 2023-01-29 18:02:02 | INFO | train_inner | {"epoch": 13, "update": 12.636, "s2c_loss": "0.542", "loss": "0.37603", "s2c_nll_loss": "0.542", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "247", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "27310", "lr": "0.000182068", "gnorm": "7.127", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "6620"} 2023-01-29 18:02:04 | INFO | train_inner | {"epoch": 13, "update": 12.64, "s2c_loss": "0.515", "loss": "0.35724", "s2c_nll_loss": "0.515", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "27320", "lr": "0.000182134", "gnorm": "6.745", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "6622"} 2023-01-29 18:02:07 | INFO | train_inner | {"epoch": 13, "update": 12.645, "s2c_loss": "0.607", "loss": "0.42077", "s2c_nll_loss": "0.607", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "27330", "lr": "0.000182201", "gnorm": "9.425", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "6625"} 2023-01-29 18:02:09 | INFO | train_inner | {"epoch": 13, "update": 12.649, "s2c_loss": 
"0.505", "loss": "0.34988", "s2c_nll_loss": "0.505", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "27340", "lr": "0.000182268", "gnorm": "6.623", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "6627"} 2023-01-29 18:02:12 | INFO | train_inner | {"epoch": 13, "update": 12.654, "s2c_loss": "0.496", "loss": "0.3435", "s2c_nll_loss": "0.496", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "27350", "lr": "0.000182334", "gnorm": "7.424", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "6630"} 2023-01-29 18:02:14 | INFO | train_inner | {"epoch": 13, "update": 12.659, "s2c_loss": "0.53", "loss": "0.36771", "s2c_nll_loss": "0.53", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "27360", "lr": "0.000182401", "gnorm": "7.888", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "6632"} 2023-01-29 18:02:17 | INFO | train_inner | {"epoch": 13, "update": 12.663, "s2c_loss": "0.498", "loss": "0.34547", "s2c_nll_loss": "0.498", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "27370", "lr": "0.000182468", "gnorm": "8.744", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "6635"} 2023-01-29 18:02:20 | INFO | train_inner | {"epoch": 13, "update": 12.668, "s2c_loss": "0.403", "loss": "0.27961", "s2c_nll_loss": "0.403", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "27380", "lr": "0.000182534", "gnorm": "5.463", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "6637"} 2023-01-29 18:02:22 | INFO | train_inner | {"epoch": 13, 
"update": 12.673, "s2c_loss": "0.571", "loss": "0.39613", "s2c_nll_loss": "0.571", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "27390", "lr": "0.000182601", "gnorm": "6.677", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "6640"} 2023-01-29 18:02:25 | INFO | train_inner | {"epoch": 13, "update": 12.677, "s2c_loss": "0.474", "loss": "0.32858", "s2c_nll_loss": "0.474", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "27400", "lr": "0.000182668", "gnorm": "6.996", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "6643"} 2023-01-29 18:02:27 | INFO | train_inner | {"epoch": 13, "update": 12.682, "s2c_loss": "0.589", "loss": "0.40823", "s2c_nll_loss": "0.589", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "27410", "lr": "0.000182734", "gnorm": "7.826", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "6645"} 2023-01-29 18:02:28 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 512.0 2023-01-29 18:02:30 | INFO | train_inner | {"epoch": 13, "update": 12.687, "s2c_loss": "0.554", "loss": "0.38434", "s2c_nll_loss": "0.554", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "233.1", "ups": "3.64", "wpb": "64", "bsz": "64", "num_updates": "27420", "lr": "0.000182801", "gnorm": "7.96", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6648"} 2023-01-29 18:02:32 | INFO | train_inner | {"epoch": 13, "update": 12.691, "s2c_loss": "0.526", "loss": "0.36483", "s2c_nll_loss": "0.526", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "27430", "lr": "0.000182868", 
"gnorm": "7.051", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6650"} 2023-01-29 18:02:35 | INFO | train_inner | {"epoch": 13, "update": 12.696, "s2c_loss": "0.528", "loss": "0.36565", "s2c_nll_loss": "0.528", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "255", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "27440", "lr": "0.000182934", "gnorm": "6.751", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6653"} 2023-01-29 18:02:37 | INFO | train_inner | {"epoch": 13, "update": 12.701, "s2c_loss": "0.449", "loss": "0.31094", "s2c_nll_loss": "0.449", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "27450", "lr": "0.000183001", "gnorm": "7.897", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "6655"} 2023-01-29 18:02:40 | INFO | train_inner | {"epoch": 13, "update": 12.705, "s2c_loss": "0.516", "loss": "0.35748", "s2c_nll_loss": "0.516", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "27460", "lr": "0.000183068", "gnorm": "6.356", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6658"} 2023-01-29 18:02:43 | INFO | train_inner | {"epoch": 13, "update": 12.71, "s2c_loss": "0.55", "loss": "0.38148", "s2c_nll_loss": "0.55", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "27470", "lr": "0.000183134", "gnorm": "7.61", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6660"} 2023-01-29 18:02:45 | INFO | train_inner | {"epoch": 13, "update": 12.715, "s2c_loss": "0.495", "loss": "0.34282", "s2c_nll_loss": "0.495", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "27480", "lr": 
"0.000183201", "gnorm": "7.59", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6663"} 2023-01-29 18:02:48 | INFO | train_inner | {"epoch": 13, "update": 12.719, "s2c_loss": "0.477", "loss": "0.33045", "s2c_nll_loss": "0.477", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "27490", "lr": "0.000183268", "gnorm": "6.766", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "6665"} 2023-01-29 18:02:50 | INFO | train_inner | {"epoch": 13, "update": 12.724, "s2c_loss": "0.546", "loss": "0.3787", "s2c_nll_loss": "0.546", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "255", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "27500", "lr": "0.000183334", "gnorm": "7.906", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6668"} 2023-01-29 18:02:53 | INFO | train_inner | {"epoch": 13, "update": 12.728, "s2c_loss": "0.438", "loss": "0.30333", "s2c_nll_loss": "0.438", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "27510", "lr": "0.000183401", "gnorm": "6.263", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6670"} 2023-01-29 18:02:55 | INFO | train_inner | {"epoch": 13, "update": 12.733, "s2c_loss": "0.507", "loss": "0.35157", "s2c_nll_loss": "0.507", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "262.4", "ups": "4.1", "wpb": "64", "bsz": "64", "num_updates": "27520", "lr": "0.000183467", "gnorm": "6.318", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "6673"} 2023-01-29 18:02:57 | INFO | train_inner | {"epoch": 13, "update": 12.738, "s2c_loss": "0.527", "loss": "0.36501", "s2c_nll_loss": "0.527", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": 
"27530", "lr": "0.000183534", "gnorm": "6.812", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6675"} 2023-01-29 18:03:00 | INFO | train_inner | {"epoch": 13, "update": 12.742, "s2c_loss": "0.446", "loss": "0.309", "s2c_nll_loss": "0.446", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "27540", "lr": "0.000183601", "gnorm": "6.565", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6678"} 2023-01-29 18:03:02 | INFO | train_inner | {"epoch": 13, "update": 12.747, "s2c_loss": "0.525", "loss": "0.36413", "s2c_nll_loss": "0.525", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "27550", "lr": "0.000183667", "gnorm": "7.393", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6680"} 2023-01-29 18:03:05 | INFO | train_inner | {"epoch": 13, "update": 12.752, "s2c_loss": "0.68", "loss": "0.4712", "s2c_nll_loss": "0.68", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "27560", "lr": "0.000183734", "gnorm": "7.276", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6683"} 2023-01-29 18:03:08 | INFO | train_inner | {"epoch": 13, "update": 12.756, "s2c_loss": "0.457", "loss": "0.31674", "s2c_nll_loss": "0.457", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "27570", "lr": "0.000183801", "gnorm": "5.856", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6685"} 2023-01-29 18:03:10 | INFO | train_inner | {"epoch": 13, "update": 12.761, "s2c_loss": "0.53", "loss": "0.3677", "s2c_nll_loss": "0.53", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "247.5", "ups": "3.87", "wpb": "64", "bsz": "64", 
"num_updates": "27580", "lr": "0.000183867", "gnorm": "7.158", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6688"} 2023-01-29 18:03:13 | INFO | train_inner | {"epoch": 13, "update": 12.765, "s2c_loss": "0.84", "loss": "0.58198", "s2c_nll_loss": "0.84", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "27590", "lr": "0.000183934", "gnorm": "6.135", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "6691"} 2023-01-29 18:03:15 | INFO | train_inner | {"epoch": 13, "update": 12.77, "s2c_loss": "0.447", "loss": "0.31008", "s2c_nll_loss": "0.447", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "256.3", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "27600", "lr": "0.000184001", "gnorm": "6.334", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6693"} 2023-01-29 18:03:18 | INFO | train_inner | {"epoch": 13, "update": 12.775, "s2c_loss": "0.665", "loss": "0.46111", "s2c_nll_loss": "0.665", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "27610", "lr": "0.000184067", "gnorm": "7.494", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6696"} 2023-01-29 18:03:20 | INFO | train_inner | {"epoch": 13, "update": 12.779, "s2c_loss": "0.43", "loss": "0.29796", "s2c_nll_loss": "0.43", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "27620", "lr": "0.000184134", "gnorm": "6.826", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6698"} 2023-01-29 18:03:23 | INFO | train_inner | {"epoch": 13, "update": 12.784, "s2c_loss": "0.534", "loss": "0.37033", "s2c_nll_loss": "0.534", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "254.7", "ups": "3.98", "wpb": "64", 
"bsz": "64", "num_updates": "27630", "lr": "0.000184201", "gnorm": "6.664", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6701"} 2023-01-29 18:03:25 | INFO | train_inner | {"epoch": 13, "update": 12.789, "s2c_loss": "0.577", "loss": "0.39979", "s2c_nll_loss": "0.577", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "27640", "lr": "0.000184267", "gnorm": "6.934", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6703"} 2023-01-29 18:03:28 | INFO | train_inner | {"epoch": 13, "update": 12.793, "s2c_loss": "0.804", "loss": "0.55756", "s2c_nll_loss": "0.804", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "27650", "lr": "0.000184334", "gnorm": "6.599", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6706"} 2023-01-29 18:03:30 | INFO | train_inner | {"epoch": 13, "update": 12.798, "s2c_loss": "0.525", "loss": "0.36409", "s2c_nll_loss": "0.525", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "27660", "lr": "0.000184401", "gnorm": "7.814", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6708"} 2023-01-29 18:03:33 | INFO | train_inner | {"epoch": 13, "update": 12.802, "s2c_loss": "0.425", "loss": "0.29436", "s2c_nll_loss": "0.425", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "27670", "lr": "0.000184467", "gnorm": "5.998", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6711"} 2023-01-29 18:03:35 | INFO | train_inner | {"epoch": 13, "update": 12.807, "s2c_loss": "0.533", "loss": "0.36931", "s2c_nll_loss": "0.533", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "254.5", "ups": "3.98", 
"wpb": "64", "bsz": "64", "num_updates": "27680", "lr": "0.000184534", "gnorm": "7.322", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6713"} 2023-01-29 18:03:38 | INFO | train_inner | {"epoch": 13, "update": 12.812, "s2c_loss": "0.649", "loss": "0.44976", "s2c_nll_loss": "0.649", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "27690", "lr": "0.000184601", "gnorm": "7.359", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6716"} 2023-01-29 18:03:40 | INFO | train_inner | {"epoch": 13, "update": 12.816, "s2c_loss": "0.685", "loss": "0.47496", "s2c_nll_loss": "0.685", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "27700", "lr": "0.000184667", "gnorm": "6.878", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6718"} 2023-01-29 18:03:43 | INFO | train_inner | {"epoch": 13, "update": 12.821, "s2c_loss": "0.641", "loss": "0.44397", "s2c_nll_loss": "0.641", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "27710", "lr": "0.000184734", "gnorm": "7.218", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6721"} 2023-01-29 18:03:46 | INFO | train_inner | {"epoch": 13, "update": 12.826, "s2c_loss": "0.453", "loss": "0.31393", "s2c_nll_loss": "0.453", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "27720", "lr": "0.000184801", "gnorm": "6.667", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6723"} 2023-01-29 18:03:48 | INFO | train_inner | {"epoch": 13, "update": 12.83, "s2c_loss": "0.569", "loss": "0.39462", "s2c_nll_loss": "0.569", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": 
"257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "27730", "lr": "0.000184867", "gnorm": "6.397", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "6726"} 2023-01-29 18:03:51 | INFO | train_inner | {"epoch": 13, "update": 12.835, "s2c_loss": "0.69", "loss": "0.47858", "s2c_nll_loss": "0.69", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "27740", "lr": "0.000184934", "gnorm": "6.627", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6728"} 2023-01-29 18:03:53 | INFO | train_inner | {"epoch": 13, "update": 12.84, "s2c_loss": "0.572", "loss": "0.39618", "s2c_nll_loss": "0.572", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "27750", "lr": "0.000185001", "gnorm": "6.604", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6731"} 2023-01-29 18:03:56 | INFO | train_inner | {"epoch": 13, "update": 12.844, "s2c_loss": "0.486", "loss": "0.33664", "s2c_nll_loss": "0.486", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "27760", "lr": "0.000185067", "gnorm": "6.374", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6733"} 2023-01-29 18:03:58 | INFO | train_inner | {"epoch": 13, "update": 12.849, "s2c_loss": "0.557", "loss": "0.3859", "s2c_nll_loss": "0.557", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "27770", "lr": "0.000185134", "gnorm": "8.189", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6736"} 2023-01-29 18:04:01 | INFO | train_inner | {"epoch": 13, "update": 12.853, "s2c_loss": "0.574", "loss": "0.39783", "s2c_nll_loss": "0.574", "s2c_accuracy": "89.844", "s2c_total": "64", 
"s2c_n_correct": "57.5", "wps": "259.8", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "27780", "lr": "0.000185201", "gnorm": "7.13", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6738"} 2023-01-29 18:04:03 | INFO | train_inner | {"epoch": 13, "update": 12.858, "s2c_loss": "0.449", "loss": "0.31155", "s2c_nll_loss": "0.449", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "27790", "lr": "0.000185267", "gnorm": "6.376", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6741"} 2023-01-29 18:04:06 | INFO | train_inner | {"epoch": 13, "update": 12.863, "s2c_loss": "0.608", "loss": "0.42139", "s2c_nll_loss": "0.608", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "27800", "lr": "0.000185334", "gnorm": "7.455", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6744"} 2023-01-29 18:04:08 | INFO | train_inner | {"epoch": 13, "update": 12.867, "s2c_loss": "0.569", "loss": "0.39455", "s2c_nll_loss": "0.569", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "27810", "lr": "0.000185401", "gnorm": "7.782", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6746"} 2023-01-29 18:04:11 | INFO | train_inner | {"epoch": 13, "update": 12.872, "s2c_loss": "0.53", "loss": "0.36742", "s2c_nll_loss": "0.53", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "27820", "lr": "0.000185467", "gnorm": "8.033", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "6749"} 2023-01-29 18:04:13 | INFO | train_inner | {"epoch": 13, "update": 12.877, "s2c_loss": "0.513", "loss": "0.35572", "s2c_nll_loss": "0.513", "s2c_accuracy": "90.625", 
"s2c_total": "64", "s2c_n_correct": "58", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "27830", "lr": "0.000185534", "gnorm": "7.535", "loss_scale": "512", "train_wall": "3", "gb_free": "7.5", "wall": "6751"} 2023-01-29 18:04:16 | INFO | train_inner | {"epoch": 13, "update": 12.881, "s2c_loss": "0.616", "loss": "0.42728", "s2c_nll_loss": "0.616", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "27840", "lr": "0.000185601", "gnorm": "7.527", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6754"} 2023-01-29 18:04:18 | INFO | train_inner | {"epoch": 13, "update": 12.886, "s2c_loss": "0.445", "loss": "0.30865", "s2c_nll_loss": "0.445", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "27850", "lr": "0.000185667", "gnorm": "6.105", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "6756"} 2023-01-29 18:04:21 | INFO | train_inner | {"epoch": 13, "update": 12.89, "s2c_loss": "0.425", "loss": "0.29426", "s2c_nll_loss": "0.425", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "27860", "lr": "0.000185734", "gnorm": "6.152", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "6759"} 2023-01-29 18:04:23 | INFO | train_inner | {"epoch": 13, "update": 12.895, "s2c_loss": "0.427", "loss": "0.29605", "s2c_nll_loss": "0.427", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "256.3", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "27870", "lr": "0.000185801", "gnorm": "5.586", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6761"} 2023-01-29 18:04:26 | INFO | train_inner | {"epoch": 13, "update": 12.9, "s2c_loss": "0.457", "loss": "0.31669", "s2c_nll_loss": "0.457", 
"s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "255", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "27880", "lr": "0.000185867", "gnorm": "6.319", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6764"} 2023-01-29 18:04:28 | INFO | train_inner | {"epoch": 13, "update": 12.904, "s2c_loss": "0.399", "loss": "0.27659", "s2c_nll_loss": "0.399", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "27890", "lr": "0.000185934", "gnorm": "6.049", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6766"} 2023-01-29 18:04:31 | INFO | train_inner | {"epoch": 13, "update": 12.909, "s2c_loss": "0.832", "loss": "0.57669", "s2c_nll_loss": "0.832", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "262", "ups": "4.09", "wpb": "64", "bsz": "64", "num_updates": "27900", "lr": "0.000186001", "gnorm": "7.1", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6769"} 2023-01-29 18:04:33 | INFO | train_inner | {"epoch": 13, "update": 12.914, "s2c_loss": "0.534", "loss": "0.37042", "s2c_nll_loss": "0.534", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "27910", "lr": "0.000186067", "gnorm": "6.089", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6771"} 2023-01-29 18:04:36 | INFO | train_inner | {"epoch": 13, "update": 12.918, "s2c_loss": "0.488", "loss": "0.33821", "s2c_nll_loss": "0.488", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "27920", "lr": "0.000186134", "gnorm": "6.686", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6774"} 2023-01-29 18:04:39 | INFO | train_inner | {"epoch": 13, "update": 12.923, "s2c_loss": "0.425", "loss": "0.29477", 
"s2c_nll_loss": "0.425", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "27930", "lr": "0.000186201", "gnorm": "7.307", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6776"} 2023-01-29 18:04:41 | INFO | train_inner | {"epoch": 13, "update": 12.927, "s2c_loss": "0.363", "loss": "0.25144", "s2c_nll_loss": "0.363", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "27940", "lr": "0.000186267", "gnorm": "5.731", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6779"} 2023-01-29 18:04:44 | INFO | train_inner | {"epoch": 13, "update": 12.932, "s2c_loss": "0.555", "loss": "0.38435", "s2c_nll_loss": "0.555", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "27950", "lr": "0.000186334", "gnorm": "6.899", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6782"} 2023-01-29 18:04:46 | INFO | train_inner | {"epoch": 13, "update": 12.937, "s2c_loss": "0.65", "loss": "0.45047", "s2c_nll_loss": "0.65", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "27960", "lr": "0.000186401", "gnorm": "6.774", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6784"} 2023-01-29 18:04:49 | INFO | train_inner | {"epoch": 13, "update": 12.941, "s2c_loss": "0.603", "loss": "0.41795", "s2c_nll_loss": "0.603", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "27970", "lr": "0.000186467", "gnorm": "8.054", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6787"} 2023-01-29 18:04:51 | INFO | train_inner | {"epoch": 13, "update": 12.946, "s2c_loss": "0.705", 
"loss": "0.48869", "s2c_nll_loss": "0.705", "s2c_accuracy": "86.719", "s2c_total": "64", "s2c_n_correct": "55.5", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "27980", "lr": "0.000186534", "gnorm": "8.236", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6789"} 2023-01-29 18:04:54 | INFO | train_inner | {"epoch": 13, "update": 12.951, "s2c_loss": "0.683", "loss": "0.47362", "s2c_nll_loss": "0.683", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "27990", "lr": "0.000186601", "gnorm": "7.249", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6792"} 2023-01-29 18:04:56 | INFO | train_inner | {"epoch": 13, "update": 12.955, "s2c_loss": "0.582", "loss": "0.40345", "s2c_nll_loss": "0.582", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "28000", "lr": "0.000186667", "gnorm": "6.972", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6794"} 2023-01-29 18:04:59 | INFO | train_inner | {"epoch": 13, "update": 12.96, "s2c_loss": "0.599", "loss": "0.41533", "s2c_nll_loss": "0.599", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "28010", "lr": "0.000186734", "gnorm": "7.249", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6797"} 2023-01-29 18:05:01 | INFO | train_inner | {"epoch": 13, "update": 12.964, "s2c_loss": "0.596", "loss": "0.41287", "s2c_nll_loss": "0.596", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "28020", "lr": "0.000186801", "gnorm": "7.607", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6799"} 2023-01-29 18:05:04 | INFO | train_inner | {"epoch": 13, "update": 12.969, 
"s2c_loss": "0.534", "loss": "0.37004", "s2c_nll_loss": "0.534", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "28030", "lr": "0.000186867", "gnorm": "7.079", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6802"} 2023-01-29 18:05:07 | INFO | train_inner | {"epoch": 13, "update": 12.974, "s2c_loss": "0.667", "loss": "0.46262", "s2c_nll_loss": "0.667", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "28040", "lr": "0.000186934", "gnorm": "7.017", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6804"} 2023-01-29 18:05:09 | INFO | train_inner | {"epoch": 13, "update": 12.978, "s2c_loss": "0.461", "loss": "0.31988", "s2c_nll_loss": "0.461", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "28050", "lr": "0.000187001", "gnorm": "6.682", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6807"} 2023-01-29 18:05:12 | INFO | train_inner | {"epoch": 13, "update": 12.983, "s2c_loss": "0.506", "loss": "0.35055", "s2c_nll_loss": "0.506", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "28060", "lr": "0.000187067", "gnorm": "6.44", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6809"} 2023-01-29 18:05:14 | INFO | train_inner | {"epoch": 13, "update": 12.988, "s2c_loss": "0.567", "loss": "0.39325", "s2c_nll_loss": "0.567", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "28070", "lr": "0.000187134", "gnorm": "7.782", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6812"} 2023-01-29 18:05:17 | INFO | train_inner | {"epoch": 13, 
"update": 12.992, "s2c_loss": "0.474", "loss": "0.32854", "s2c_nll_loss": "0.474", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "28080", "lr": "0.000187201", "gnorm": "8.052", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6815"} 2023-01-29 18:05:19 | INFO | train_inner | {"epoch": 13, "update": 12.997, "s2c_loss": "0.411", "loss": "0.28489", "s2c_nll_loss": "0.411", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "28090", "lr": "0.000187267", "gnorm": "8.086", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6817"} 2023-01-29 18:05:21 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 18:05:35 | INFO | valid | {"epoch": 13, "valid_s2c_loss": "1.249", "valid_loss": "0.86549", "valid_s2c_nll_loss": "1.249", "valid_s2c_accuracy": "78.386", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "25.0509", "valid_num_updates": "28097", "valid_best_s2c_accuracy": "81.037"} 2023-01-29 18:05:35 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 13 @ 28097 updates 2023-01-29 18:05:35 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 18:05:43 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 18:05:43 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 13 @ 28097 updates, score 78.386) (writing took 7.509505259804428 seconds) 2023-01-29 18:05:43 | INFO | fairseq_cli.train | end of epoch 13 (average epoch stats below) 2023-01-29 18:05:43 | INFO | train | {"epoch": 13, "train_s2c_loss": "0.527", "train_loss": "0.36517", "train_s2c_nll_loss": "0.527", "train_s2c_accuracy": 
"90.547", "train_s2c_total": "63.9838", "train_s2c_n_correct": "57.9356", "train_wps": "239.8", "train_ups": "3.75", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "28097", "train_lr": "0.000187314", "train_gnorm": "6.963", "train_loss_scale": "512", "train_train_wall": "540", "train_gb_free": "7.5", "train_wall": "6841"} 2023-01-29 18:05:49 | INFO | fairseq.trainer | begin training epoch 14 2023-01-29 18:05:49 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 18:05:50 | INFO | train_inner | {"epoch": 14, "update": 13.001, "s2c_loss": "0.544", "loss": "0.3772", "s2c_nll_loss": "0.544", "s2c_accuracy": "88.816", "s2c_total": "60.8", "s2c_n_correct": "54", "wps": "19.7", "ups": "0.32", "wpb": "60.8", "bsz": "60.8", "num_updates": "28100", "lr": "0.000187334", "gnorm": "10.487", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "6848"} 2023-01-29 18:05:52 | INFO | train_inner | {"epoch": 14, "update": 13.006, "s2c_loss": "0.619", "loss": "0.42913", "s2c_nll_loss": "0.619", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "28110", "lr": "0.000187401", "gnorm": "8.094", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6850"} 2023-01-29 18:05:55 | INFO | train_inner | {"epoch": 14, "update": 13.011, "s2c_loss": "0.426", "loss": "0.29519", "s2c_nll_loss": "0.426", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "28120", "lr": "0.000187467", "gnorm": "8.294", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6853"} 2023-01-29 18:05:58 | INFO | train_inner | {"epoch": 14, "update": 13.015, "s2c_loss": "0.498", "loss": "0.34515", "s2c_nll_loss": "0.498", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "28130", "lr": 
"0.000187534", "gnorm": "7.722", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6855"} 2023-01-29 18:06:00 | INFO | train_inner | {"epoch": 14, "update": 13.02, "s2c_loss": "0.566", "loss": "0.39243", "s2c_nll_loss": "0.566", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "247", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "28140", "lr": "0.000187601", "gnorm": "6.117", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6858"} 2023-01-29 18:06:03 | INFO | train_inner | {"epoch": 14, "update": 13.025, "s2c_loss": "0.423", "loss": "0.29333", "s2c_nll_loss": "0.423", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "28150", "lr": "0.000187667", "gnorm": "6.248", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6861"} 2023-01-29 18:06:05 | INFO | train_inner | {"epoch": 14, "update": 13.029, "s2c_loss": "0.502", "loss": "0.34764", "s2c_nll_loss": "0.502", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "258.6", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "28160", "lr": "0.000187734", "gnorm": "6.416", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6863"} 2023-01-29 18:06:08 | INFO | train_inner | {"epoch": 14, "update": 13.034, "s2c_loss": "0.438", "loss": "0.30383", "s2c_nll_loss": "0.438", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "28170", "lr": "0.000187801", "gnorm": "7.033", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6866"} 2023-01-29 18:06:10 | INFO | train_inner | {"epoch": 14, "update": 13.038, "s2c_loss": "0.371", "loss": "0.25685", "s2c_nll_loss": "0.371", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", 
"num_updates": "28180", "lr": "0.000187867", "gnorm": "6.26", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6868"} 2023-01-29 18:06:13 | INFO | train_inner | {"epoch": 14, "update": 13.043, "s2c_loss": "0.446", "loss": "0.30881", "s2c_nll_loss": "0.446", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "28190", "lr": "0.000187934", "gnorm": "7.509", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "6871"} 2023-01-29 18:06:15 | INFO | train_inner | {"epoch": 14, "update": 13.048, "s2c_loss": "0.618", "loss": "0.42858", "s2c_nll_loss": "0.618", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "28200", "lr": "0.000188001", "gnorm": "7.03", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6873"} 2023-01-29 18:06:18 | INFO | train_inner | {"epoch": 14, "update": 13.052, "s2c_loss": "0.419", "loss": "0.29035", "s2c_nll_loss": "0.419", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "28210", "lr": "0.000188067", "gnorm": "6.137", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6876"} 2023-01-29 18:06:21 | INFO | train_inner | {"epoch": 14, "update": 13.057, "s2c_loss": "0.435", "loss": "0.3012", "s2c_nll_loss": "0.435", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "244.3", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "28220", "lr": "0.000188134", "gnorm": "6.713", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6878"} 2023-01-29 18:06:23 | INFO | train_inner | {"epoch": 14, "update": 13.062, "s2c_loss": "0.473", "loss": "0.3276", "s2c_nll_loss": "0.473", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "249", "ups": "3.89", "wpb": "64", 
"bsz": "64", "num_updates": "28230", "lr": "0.000188201", "gnorm": "6.584", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6881"} 2023-01-29 18:06:26 | INFO | train_inner | {"epoch": 14, "update": 13.066, "s2c_loss": "0.591", "loss": "0.40935", "s2c_nll_loss": "0.591", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "28240", "lr": "0.000188267", "gnorm": "6.824", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6884"} 2023-01-29 18:06:28 | INFO | train_inner | {"epoch": 14, "update": 13.071, "s2c_loss": "0.649", "loss": "0.44967", "s2c_nll_loss": "0.649", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "28250", "lr": "0.000188334", "gnorm": "6.409", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6886"} 2023-01-29 18:06:31 | INFO | train_inner | {"epoch": 14, "update": 13.075, "s2c_loss": "0.379", "loss": "0.26258", "s2c_nll_loss": "0.379", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "246.2", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "28260", "lr": "0.000188401", "gnorm": "5.519", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6889"} 2023-01-29 18:06:33 | INFO | train_inner | {"epoch": 14, "update": 13.08, "s2c_loss": "0.463", "loss": "0.32071", "s2c_nll_loss": "0.463", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "245.9", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "28270", "lr": "0.000188467", "gnorm": "5.692", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6891"} 2023-01-29 18:06:36 | INFO | train_inner | {"epoch": 14, "update": 13.085, "s2c_loss": "0.509", "loss": "0.35271", "s2c_nll_loss": "0.509", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "248.1", "ups": 
"3.88", "wpb": "64", "bsz": "64", "num_updates": "28280", "lr": "0.000188534", "gnorm": "7.534", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6894"} 2023-01-29 18:06:38 | INFO | train_inner | {"epoch": 14, "update": 13.089, "s2c_loss": "0.395", "loss": "0.27351", "s2c_nll_loss": "0.395", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "28290", "lr": "0.000188601", "gnorm": "5.995", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6896"} 2023-01-29 18:06:41 | INFO | train_inner | {"epoch": 14, "update": 13.094, "s2c_loss": "0.348", "loss": "0.24141", "s2c_nll_loss": "0.348", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "28300", "lr": "0.000188667", "gnorm": "5.539", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6899"} 2023-01-29 18:06:44 | INFO | train_inner | {"epoch": 14, "update": 13.099, "s2c_loss": "0.41", "loss": "0.28431", "s2c_nll_loss": "0.41", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "28310", "lr": "0.000188734", "gnorm": "5.708", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6901"} 2023-01-29 18:06:46 | INFO | train_inner | {"epoch": 14, "update": 13.103, "s2c_loss": "0.445", "loss": "0.30847", "s2c_nll_loss": "0.445", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "28320", "lr": "0.000188801", "gnorm": "6.336", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6904"} 2023-01-29 18:06:49 | INFO | train_inner | {"epoch": 14, "update": 13.108, "s2c_loss": "0.392", "loss": "0.27146", "s2c_nll_loss": "0.392", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": 
"252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "28330", "lr": "0.000188867", "gnorm": "5.287", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6907"} 2023-01-29 18:06:51 | INFO | train_inner | {"epoch": 14, "update": 13.112, "s2c_loss": "0.465", "loss": "0.32259", "s2c_nll_loss": "0.465", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "28340", "lr": "0.000188934", "gnorm": "6.06", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6909"} 2023-01-29 18:06:54 | INFO | train_inner | {"epoch": 14, "update": 13.117, "s2c_loss": "0.555", "loss": "0.38454", "s2c_nll_loss": "0.555", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "28350", "lr": "0.000189001", "gnorm": "6.395", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6912"} 2023-01-29 18:06:56 | INFO | train_inner | {"epoch": 14, "update": 13.122, "s2c_loss": "0.386", "loss": "0.26744", "s2c_nll_loss": "0.386", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "28360", "lr": "0.000189067", "gnorm": "5.991", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6914"} 2023-01-29 18:06:59 | INFO | train_inner | {"epoch": 14, "update": 13.126, "s2c_loss": "0.463", "loss": "0.32058", "s2c_nll_loss": "0.463", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "244.4", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "28370", "lr": "0.000189134", "gnorm": "6.438", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "6917"} 2023-01-29 18:07:01 | INFO | train_inner | {"epoch": 14, "update": 13.131, "s2c_loss": "0.385", "loss": "0.26684", "s2c_nll_loss": "0.385", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": 
"59.5", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "28380", "lr": "0.000189201", "gnorm": "7.185", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6919"} 2023-01-29 18:07:04 | INFO | train_inner | {"epoch": 14, "update": 13.136, "s2c_loss": "0.478", "loss": "0.33162", "s2c_nll_loss": "0.478", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "28390", "lr": "0.000189267", "gnorm": "6.15", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "6922"} 2023-01-29 18:07:06 | INFO | train_inner | {"epoch": 14, "update": 13.14, "s2c_loss": "0.416", "loss": "0.28852", "s2c_nll_loss": "0.416", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "28400", "lr": "0.000189334", "gnorm": "6.213", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6924"} 2023-01-29 18:07:09 | INFO | train_inner | {"epoch": 14, "update": 13.145, "s2c_loss": "0.476", "loss": "0.33028", "s2c_nll_loss": "0.476", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "28410", "lr": "0.000189401", "gnorm": "6.013", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6927"} 2023-01-29 18:07:12 | INFO | train_inner | {"epoch": 14, "update": 13.149, "s2c_loss": "0.524", "loss": "0.36299", "s2c_nll_loss": "0.524", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "28420", "lr": "0.000189467", "gnorm": "6.268", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6930"} 2023-01-29 18:07:14 | INFO | train_inner | {"epoch": 14, "update": 13.154, "s2c_loss": "0.654", "loss": "0.45356", "s2c_nll_loss": "0.654", "s2c_accuracy": "87.656", "s2c_total": "64", 
"s2c_n_correct": "56.1", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "28430", "lr": "0.000189534", "gnorm": "7.806", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6932"} 2023-01-29 18:07:17 | INFO | train_inner | {"epoch": 14, "update": 13.159, "s2c_loss": "0.367", "loss": "0.25452", "s2c_nll_loss": "0.367", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "28440", "lr": "0.000189601", "gnorm": "5.993", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6935"} 2023-01-29 18:07:19 | INFO | train_inner | {"epoch": 14, "update": 13.163, "s2c_loss": "0.527", "loss": "0.36536", "s2c_nll_loss": "0.527", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "28450", "lr": "0.000189667", "gnorm": "7.708", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6937"} 2023-01-29 18:07:22 | INFO | train_inner | {"epoch": 14, "update": 13.168, "s2c_loss": "0.557", "loss": "0.38628", "s2c_nll_loss": "0.557", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "28460", "lr": "0.000189734", "gnorm": "7.027", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6940"} 2023-01-29 18:07:24 | INFO | train_inner | {"epoch": 14, "update": 13.173, "s2c_loss": "0.543", "loss": "0.37636", "s2c_nll_loss": "0.543", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "28470", "lr": "0.000189801", "gnorm": "7.158", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6942"} 2023-01-29 18:07:27 | INFO | train_inner | {"epoch": 14, "update": 13.177, "s2c_loss": "0.51", "loss": "0.3537", "s2c_nll_loss": "0.51", "s2c_accuracy": "90.625", 
"s2c_total": "64", "s2c_n_correct": "58", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "28480", "lr": "0.000189867", "gnorm": "8.402", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "6945"} 2023-01-29 18:07:29 | INFO | train_inner | {"epoch": 14, "update": 13.182, "s2c_loss": "0.459", "loss": "0.31827", "s2c_nll_loss": "0.459", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "28490", "lr": "0.000189934", "gnorm": "7.527", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "6947"} 2023-01-29 18:07:32 | INFO | train_inner | {"epoch": 14, "update": 13.186, "s2c_loss": "0.602", "loss": "0.4176", "s2c_nll_loss": "0.602", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "28500", "lr": "0.00019", "gnorm": "8.248", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6950"} 2023-01-29 18:07:34 | INFO | train_inner | {"epoch": 14, "update": 13.191, "s2c_loss": "0.466", "loss": "0.32295", "s2c_nll_loss": "0.466", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "28510", "lr": "0.000190067", "gnorm": "7.393", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6952"} 2023-01-29 18:07:37 | INFO | train_inner | {"epoch": 14, "update": 13.196, "s2c_loss": "0.341", "loss": "0.23613", "s2c_nll_loss": "0.341", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "245.3", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "28520", "lr": "0.000190134", "gnorm": "5.444", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6955"} 2023-01-29 18:07:40 | INFO | train_inner | {"epoch": 14, "update": 13.2, "s2c_loss": "0.9", "loss": "0.62413", "s2c_nll_loss": "0.9", "s2c_accuracy": 
"85.469", "s2c_total": "64", "s2c_n_correct": "54.7", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "28530", "lr": "0.0001902", "gnorm": "7.312", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "6958"} 2023-01-29 18:07:42 | INFO | train_inner | {"epoch": 14, "update": 13.205, "s2c_loss": "0.482", "loss": "0.334", "s2c_nll_loss": "0.482", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "28540", "lr": "0.000190267", "gnorm": "7.679", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6960"} 2023-01-29 18:07:45 | INFO | train_inner | {"epoch": 14, "update": 13.21, "s2c_loss": "0.409", "loss": "0.28367", "s2c_nll_loss": "0.409", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "28550", "lr": "0.000190334", "gnorm": "6.438", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6963"} 2023-01-29 18:07:47 | INFO | train_inner | {"epoch": 14, "update": 13.214, "s2c_loss": "0.461", "loss": "0.31962", "s2c_nll_loss": "0.461", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "28560", "lr": "0.0001904", "gnorm": "6.97", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6965"} 2023-01-29 18:07:50 | INFO | train_inner | {"epoch": 14, "update": 13.219, "s2c_loss": "0.437", "loss": "0.30275", "s2c_nll_loss": "0.437", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "28570", "lr": "0.000190467", "gnorm": "6.62", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6968"} 2023-01-29 18:07:52 | INFO | train_inner | {"epoch": 14, "update": 13.223, "s2c_loss": "0.527", "loss": "0.36544", "s2c_nll_loss": "0.527", 
"s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "28580", "lr": "0.000190534", "gnorm": "6.374", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6970"} 2023-01-29 18:07:55 | INFO | train_inner | {"epoch": 14, "update": 13.228, "s2c_loss": "0.461", "loss": "0.31949", "s2c_nll_loss": "0.461", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "28590", "lr": "0.0001906", "gnorm": "5.822", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6973"} 2023-01-29 18:07:58 | INFO | train_inner | {"epoch": 14, "update": 13.233, "s2c_loss": "0.427", "loss": "0.29573", "s2c_nll_loss": "0.427", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "28600", "lr": "0.000190667", "gnorm": "5.659", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6975"} 2023-01-29 18:08:00 | INFO | train_inner | {"epoch": 14, "update": 13.237, "s2c_loss": "0.574", "loss": "0.39804", "s2c_nll_loss": "0.574", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "28610", "lr": "0.000190734", "gnorm": "6.226", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6978"} 2023-01-29 18:08:03 | INFO | train_inner | {"epoch": 14, "update": 13.242, "s2c_loss": "0.541", "loss": "0.37493", "s2c_nll_loss": "0.541", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "28620", "lr": "0.0001908", "gnorm": "6.248", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6980"} 2023-01-29 18:08:05 | INFO | train_inner | {"epoch": 14, "update": 13.247, "s2c_loss": "0.54", "loss": "0.37418", 
"s2c_nll_loss": "0.54", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "28630", "lr": "0.000190867", "gnorm": "6.773", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6983"} 2023-01-29 18:08:08 | INFO | train_inner | {"epoch": 14, "update": 13.251, "s2c_loss": "0.491", "loss": "0.34027", "s2c_nll_loss": "0.491", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "28640", "lr": "0.000190934", "gnorm": "6.875", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "6986"} 2023-01-29 18:08:10 | INFO | train_inner | {"epoch": 14, "update": 13.256, "s2c_loss": "0.623", "loss": "0.43173", "s2c_nll_loss": "0.623", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "28650", "lr": "0.000191", "gnorm": "7.417", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "6988"} 2023-01-29 18:08:13 | INFO | train_inner | {"epoch": 14, "update": 13.26, "s2c_loss": "0.441", "loss": "0.30566", "s2c_nll_loss": "0.441", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "28660", "lr": "0.000191067", "gnorm": "6.079", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "6991"} 2023-01-29 18:08:15 | INFO | train_inner | {"epoch": 14, "update": 13.265, "s2c_loss": "0.472", "loss": "0.32749", "s2c_nll_loss": "0.472", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "28670", "lr": "0.000191134", "gnorm": "7.568", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6993"} 2023-01-29 18:08:18 | INFO | train_inner | {"epoch": 14, "update": 13.27, "s2c_loss": "0.608", 
"loss": "0.42133", "s2c_nll_loss": "0.608", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "28680", "lr": "0.0001912", "gnorm": "6.453", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6996"} 2023-01-29 18:08:20 | INFO | train_inner | {"epoch": 14, "update": 13.274, "s2c_loss": "0.431", "loss": "0.29893", "s2c_nll_loss": "0.431", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "28690", "lr": "0.000191267", "gnorm": "6.283", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "6998"} 2023-01-29 18:08:23 | INFO | train_inner | {"epoch": 14, "update": 13.279, "s2c_loss": "0.488", "loss": "0.33818", "s2c_nll_loss": "0.488", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "28700", "lr": "0.000191334", "gnorm": "6.95", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7001"} 2023-01-29 18:08:25 | INFO | train_inner | {"epoch": 14, "update": 13.284, "s2c_loss": "0.498", "loss": "0.34551", "s2c_nll_loss": "0.498", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "28710", "lr": "0.0001914", "gnorm": "8.99", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7003"} 2023-01-29 18:08:28 | INFO | train_inner | {"epoch": 14, "update": 13.288, "s2c_loss": "0.472", "loss": "0.32751", "s2c_nll_loss": "0.472", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "28720", "lr": "0.000191467", "gnorm": "6.163", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "7006"} 2023-01-29 18:08:31 | INFO | train_inner | {"epoch": 14, "update": 13.293, "s2c_loss": 
"0.448", "loss": "0.31084", "s2c_nll_loss": "0.448", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "28730", "lr": "0.000191534", "gnorm": "6.849", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7008"} 2023-01-29 18:08:33 | INFO | train_inner | {"epoch": 14, "update": 13.297, "s2c_loss": "0.408", "loss": "0.28274", "s2c_nll_loss": "0.408", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "246.2", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "28740", "lr": "0.0001916", "gnorm": "5.958", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7011"} 2023-01-29 18:08:36 | INFO | train_inner | {"epoch": 14, "update": 13.302, "s2c_loss": "0.497", "loss": "0.34459", "s2c_nll_loss": "0.497", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "248", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "28750", "lr": "0.000191667", "gnorm": "6.526", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7014"} 2023-01-29 18:08:38 | INFO | train_inner | {"epoch": 14, "update": 13.307, "s2c_loss": "0.55", "loss": "0.38107", "s2c_nll_loss": "0.55", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "28760", "lr": "0.000191734", "gnorm": "6.966", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7016"} 2023-01-29 18:08:41 | INFO | train_inner | {"epoch": 14, "update": 13.311, "s2c_loss": "0.545", "loss": "0.37799", "s2c_nll_loss": "0.545", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "28770", "lr": "0.0001918", "gnorm": "6.972", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7019"} 2023-01-29 18:08:43 | INFO | train_inner | {"epoch": 14, "update": 13.316, 
"s2c_loss": "0.506", "loss": "0.35052", "s2c_nll_loss": "0.506", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "28780", "lr": "0.000191867", "gnorm": "6.845", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7021"} 2023-01-29 18:08:46 | INFO | train_inner | {"epoch": 14, "update": 13.321, "s2c_loss": "0.469", "loss": "0.32501", "s2c_nll_loss": "0.469", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "28790", "lr": "0.000191934", "gnorm": "6.32", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7024"} 2023-01-29 18:08:48 | INFO | train_inner | {"epoch": 14, "update": 13.325, "s2c_loss": "0.429", "loss": "0.29729", "s2c_nll_loss": "0.429", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "28800", "lr": "0.000192", "gnorm": "6.458", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7026"} 2023-01-29 18:08:51 | INFO | train_inner | {"epoch": 14, "update": 13.33, "s2c_loss": "0.41", "loss": "0.28403", "s2c_nll_loss": "0.41", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "28810", "lr": "0.000192067", "gnorm": "5.849", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7029"} 2023-01-29 18:08:54 | INFO | train_inner | {"epoch": 14, "update": 13.334, "s2c_loss": "0.554", "loss": "0.38425", "s2c_nll_loss": "0.554", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "28820", "lr": "0.000192134", "gnorm": "6.837", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7031"} 2023-01-29 18:08:56 | INFO | train_inner | {"epoch": 14, 
"update": 13.339, "s2c_loss": "0.415", "loss": "0.28763", "s2c_nll_loss": "0.415", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "245.6", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "28830", "lr": "0.0001922", "gnorm": "7.032", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7034"} 2023-01-29 18:08:59 | INFO | train_inner | {"epoch": 14, "update": 13.344, "s2c_loss": "0.465", "loss": "0.32216", "s2c_nll_loss": "0.465", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "28840", "lr": "0.000192267", "gnorm": "6.483", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7037"} 2023-01-29 18:09:01 | INFO | train_inner | {"epoch": 14, "update": 13.348, "s2c_loss": "0.484", "loss": "0.33565", "s2c_nll_loss": "0.484", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "28850", "lr": "0.000192334", "gnorm": "6.357", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7039"} 2023-01-29 18:09:04 | INFO | train_inner | {"epoch": 14, "update": 13.353, "s2c_loss": "0.477", "loss": "0.33045", "s2c_nll_loss": "0.477", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "28860", "lr": "0.0001924", "gnorm": "5.899", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7042"} 2023-01-29 18:09:06 | INFO | train_inner | {"epoch": 14, "update": 13.358, "s2c_loss": "0.526", "loss": "0.36469", "s2c_nll_loss": "0.526", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "28870", "lr": "0.000192467", "gnorm": "6.769", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7044"} 2023-01-29 18:09:09 | INFO | train_inner | 
{"epoch": 14, "update": 13.362, "s2c_loss": "0.568", "loss": "0.39376", "s2c_nll_loss": "0.568", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "28880", "lr": "0.000192534", "gnorm": "6.531", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7047"} 2023-01-29 18:09:11 | INFO | train_inner | {"epoch": 14, "update": 13.367, "s2c_loss": "0.54", "loss": "0.3741", "s2c_nll_loss": "0.54", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "28890", "lr": "0.0001926", "gnorm": "6.148", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7049"} 2023-01-29 18:09:14 | INFO | train_inner | {"epoch": 14, "update": 13.371, "s2c_loss": "0.543", "loss": "0.37638", "s2c_nll_loss": "0.543", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "28900", "lr": "0.000192667", "gnorm": "6.348", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7052"} 2023-01-29 18:09:17 | INFO | train_inner | {"epoch": 14, "update": 13.376, "s2c_loss": "0.532", "loss": "0.36881", "s2c_nll_loss": "0.532", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "28910", "lr": "0.000192734", "gnorm": "7.894", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7054"} 2023-01-29 18:09:19 | INFO | train_inner | {"epoch": 14, "update": 13.381, "s2c_loss": "0.463", "loss": "0.32126", "s2c_nll_loss": "0.463", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "246.7", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "28920", "lr": "0.0001928", "gnorm": "6.576", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7057"} 2023-01-29 18:09:22 | INFO | 
train_inner | {"epoch": 14, "update": 13.385, "s2c_loss": "0.59", "loss": "0.40915", "s2c_nll_loss": "0.59", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "28930", "lr": "0.000192867", "gnorm": "8.11", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7060"} 2023-01-29 18:09:24 | INFO | train_inner | {"epoch": 14, "update": 13.39, "s2c_loss": "0.487", "loss": "0.33754", "s2c_nll_loss": "0.487", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "251.8", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "28940", "lr": "0.000192934", "gnorm": "6.173", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7062"} 2023-01-29 18:09:27 | INFO | train_inner | {"epoch": 14, "update": 13.395, "s2c_loss": "0.505", "loss": "0.34978", "s2c_nll_loss": "0.505", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "28950", "lr": "0.000193", "gnorm": "6.035", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7065"} 2023-01-29 18:09:29 | INFO | train_inner | {"epoch": 14, "update": 13.399, "s2c_loss": "0.438", "loss": "0.30361", "s2c_nll_loss": "0.438", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "28960", "lr": "0.000193067", "gnorm": "6.205", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7067"} 2023-01-29 18:09:32 | INFO | train_inner | {"epoch": 14, "update": 13.404, "s2c_loss": "0.539", "loss": "0.37337", "s2c_nll_loss": "0.539", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "243.9", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "28970", "lr": "0.000193134", "gnorm": "7.187", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7070"} 2023-01-29 18:09:35 | 
INFO | train_inner | {"epoch": 14, "update": 13.408, "s2c_loss": "0.457", "loss": "0.31685", "s2c_nll_loss": "0.457", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "246", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "28980", "lr": "0.0001932", "gnorm": "6.925", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7072"} 2023-01-29 18:09:37 | INFO | train_inner | {"epoch": 14, "update": 13.413, "s2c_loss": "0.547", "loss": "0.37888", "s2c_nll_loss": "0.547", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "28990", "lr": "0.000193267", "gnorm": "6.865", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7075"} 2023-01-29 18:09:39 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 256.0 2023-01-29 18:09:40 | INFO | train_inner | {"epoch": 14, "update": 13.418, "s2c_loss": "0.563", "loss": "0.3901", "s2c_nll_loss": "0.563", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "235.4", "ups": "3.68", "wpb": "64", "bsz": "64", "num_updates": "29000", "lr": "0.000193334", "gnorm": "7.019", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7078"} 2023-01-29 18:09:42 | INFO | train_inner | {"epoch": 14, "update": 13.423, "s2c_loss": "0.483", "loss": "0.33503", "s2c_nll_loss": "0.483", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "246.2", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "29010", "lr": "0.0001934", "gnorm": "6.558", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7080"} 2023-01-29 18:09:45 | INFO | train_inner | {"epoch": 14, "update": 13.427, "s2c_loss": "0.478", "loss": "0.33119", "s2c_nll_loss": "0.478", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": 
"29020", "lr": "0.000193467", "gnorm": "6.968", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7083"} 2023-01-29 18:09:47 | INFO | train_inner | {"epoch": 14, "update": 13.432, "s2c_loss": "0.589", "loss": "0.40808", "s2c_nll_loss": "0.589", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "245.6", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "29030", "lr": "0.000193534", "gnorm": "7.117", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7085"} 2023-01-29 18:09:50 | INFO | train_inner | {"epoch": 14, "update": 13.437, "s2c_loss": "0.676", "loss": "0.46873", "s2c_nll_loss": "0.676", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "29040", "lr": "0.0001936", "gnorm": "9.424", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7088"} 2023-01-29 18:09:53 | INFO | train_inner | {"epoch": 14, "update": 13.441, "s2c_loss": "0.569", "loss": "0.39473", "s2c_nll_loss": "0.569", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "248", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "29050", "lr": "0.000193667", "gnorm": "6.752", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7091"} 2023-01-29 18:09:55 | INFO | train_inner | {"epoch": 14, "update": 13.446, "s2c_loss": "0.663", "loss": "0.45974", "s2c_nll_loss": "0.663", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "29060", "lr": "0.000193734", "gnorm": "7.492", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7093"} 2023-01-29 18:09:58 | INFO | train_inner | {"epoch": 14, "update": 13.451, "s2c_loss": "0.556", "loss": "0.38542", "s2c_nll_loss": "0.556", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "247.9", "ups": "3.87", "wpb": "64", "bsz": 
"64", "num_updates": "29070", "lr": "0.0001938", "gnorm": "7.821", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7096"} 2023-01-29 18:10:00 | INFO | train_inner | {"epoch": 14, "update": 13.455, "s2c_loss": "0.635", "loss": "0.44024", "s2c_nll_loss": "0.635", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "29080", "lr": "0.000193867", "gnorm": "8.456", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7098"} 2023-01-29 18:10:03 | INFO | train_inner | {"epoch": 14, "update": 13.46, "s2c_loss": "0.481", "loss": "0.3332", "s2c_nll_loss": "0.481", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "29090", "lr": "0.000193934", "gnorm": "7.808", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7101"} 2023-01-29 18:10:05 | INFO | train_inner | {"epoch": 14, "update": 13.464, "s2c_loss": "0.602", "loss": "0.41696", "s2c_nll_loss": "0.602", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "248", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "29100", "lr": "0.000194", "gnorm": "7.207", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7103"} 2023-01-29 18:10:08 | INFO | train_inner | {"epoch": 14, "update": 13.469, "s2c_loss": "0.537", "loss": "0.37235", "s2c_nll_loss": "0.537", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "246.5", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "29110", "lr": "0.000194067", "gnorm": "6.853", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7106"} 2023-01-29 18:10:11 | INFO | train_inner | {"epoch": 14, "update": 13.474, "s2c_loss": "0.485", "loss": "0.33588", "s2c_nll_loss": "0.485", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "251", "ups": "3.92", "wpb": "64", 
"bsz": "64", "num_updates": "29120", "lr": "0.000194134", "gnorm": "6.031", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7108"} 2023-01-29 18:10:13 | INFO | train_inner | {"epoch": 14, "update": 13.478, "s2c_loss": "0.618", "loss": "0.42813", "s2c_nll_loss": "0.618", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "29130", "lr": "0.0001942", "gnorm": "7.393", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7111"} 2023-01-29 18:10:16 | INFO | train_inner | {"epoch": 14, "update": 13.483, "s2c_loss": "0.413", "loss": "0.28625", "s2c_nll_loss": "0.413", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "29140", "lr": "0.000194267", "gnorm": "5.839", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7114"} 2023-01-29 18:10:18 | INFO | train_inner | {"epoch": 14, "update": 13.488, "s2c_loss": "0.704", "loss": "0.48799", "s2c_nll_loss": "0.704", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "246.1", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "29150", "lr": "0.000194334", "gnorm": "9.482", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7116"} 2023-01-29 18:10:21 | INFO | train_inner | {"epoch": 14, "update": 13.492, "s2c_loss": "0.636", "loss": "0.44063", "s2c_nll_loss": "0.636", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "29160", "lr": "0.0001944", "gnorm": "7.39", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7119"} 2023-01-29 18:10:23 | INFO | train_inner | {"epoch": 14, "update": 13.497, "s2c_loss": "0.661", "loss": "0.45846", "s2c_nll_loss": "0.661", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "253.1", "ups": "3.95", 
"wpb": "64", "bsz": "64", "num_updates": "29170", "lr": "0.000194467", "gnorm": "8.754", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7121"} 2023-01-29 18:10:26 | INFO | train_inner | {"epoch": 14, "update": 13.501, "s2c_loss": "0.639", "loss": "0.44304", "s2c_nll_loss": "0.639", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "29180", "lr": "0.000194534", "gnorm": "7.209", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7124"} 2023-01-29 18:10:28 | INFO | train_inner | {"epoch": 14, "update": 13.506, "s2c_loss": "0.54", "loss": "0.37433", "s2c_nll_loss": "0.54", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "29190", "lr": "0.0001946", "gnorm": "7.441", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7126"} 2023-01-29 18:10:31 | INFO | train_inner | {"epoch": 14, "update": 13.511, "s2c_loss": "0.488", "loss": "0.33822", "s2c_nll_loss": "0.488", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "29200", "lr": "0.000194667", "gnorm": "6.633", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7129"} 2023-01-29 18:10:33 | INFO | train_inner | {"epoch": 14, "update": 13.515, "s2c_loss": "0.375", "loss": "0.25974", "s2c_nll_loss": "0.375", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "29210", "lr": "0.000194734", "gnorm": "5.968", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7131"} 2023-01-29 18:10:36 | INFO | train_inner | {"epoch": 14, "update": 13.52, "s2c_loss": "0.47", "loss": "0.32581", "s2c_nll_loss": "0.47", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "252.6", 
"ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "29220", "lr": "0.0001948", "gnorm": "6.458", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7134"} 2023-01-29 18:10:38 | INFO | train_inner | {"epoch": 14, "update": 13.525, "s2c_loss": "0.49", "loss": "0.33957", "s2c_nll_loss": "0.49", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "29230", "lr": "0.000194867", "gnorm": "6.411", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7136"} 2023-01-29 18:10:41 | INFO | train_inner | {"epoch": 14, "update": 13.529, "s2c_loss": "0.517", "loss": "0.35821", "s2c_nll_loss": "0.517", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "29240", "lr": "0.000194934", "gnorm": "6.937", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7139"} 2023-01-29 18:10:44 | INFO | train_inner | {"epoch": 14, "update": 13.534, "s2c_loss": "0.534", "loss": "0.37025", "s2c_nll_loss": "0.534", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "29250", "lr": "0.000195", "gnorm": "6.435", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7141"} 2023-01-29 18:10:46 | INFO | train_inner | {"epoch": 14, "update": 13.538, "s2c_loss": "0.458", "loss": "0.31717", "s2c_nll_loss": "0.458", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "29260", "lr": "0.000195067", "gnorm": "6.687", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7144"} 2023-01-29 18:10:49 | INFO | train_inner | {"epoch": 14, "update": 13.543, "s2c_loss": "0.518", "loss": "0.35882", "s2c_nll_loss": "0.518", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": 
"251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "29270", "lr": "0.000195134", "gnorm": "7.091", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7147"} 2023-01-29 18:10:51 | INFO | train_inner | {"epoch": 14, "update": 13.548, "s2c_loss": "0.519", "loss": "0.35942", "s2c_nll_loss": "0.519", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "29280", "lr": "0.0001952", "gnorm": "7.114", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7149"} 2023-01-29 18:10:54 | INFO | train_inner | {"epoch": 14, "update": 13.552, "s2c_loss": "0.437", "loss": "0.3026", "s2c_nll_loss": "0.437", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "29290", "lr": "0.000195267", "gnorm": "7.728", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7152"} 2023-01-29 18:10:56 | INFO | train_inner | {"epoch": 14, "update": 13.557, "s2c_loss": "0.604", "loss": "0.4185", "s2c_nll_loss": "0.604", "s2c_accuracy": "87.969", "s2c_total": "64", "s2c_n_correct": "56.3", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "29300", "lr": "0.000195334", "gnorm": "7.321", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7154"} 2023-01-29 18:10:59 | INFO | train_inner | {"epoch": 14, "update": 13.562, "s2c_loss": "0.516", "loss": "0.35752", "s2c_nll_loss": "0.516", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "29310", "lr": "0.0001954", "gnorm": "6.838", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7157"} 2023-01-29 18:11:01 | INFO | train_inner | {"epoch": 14, "update": 13.566, "s2c_loss": "0.576", "loss": "0.39893", "s2c_nll_loss": "0.576", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": 
"56.6", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "29320", "lr": "0.000195467", "gnorm": "7.221", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7159"} 2023-01-29 18:11:04 | INFO | train_inner | {"epoch": 14, "update": 13.571, "s2c_loss": "0.512", "loss": "0.35463", "s2c_nll_loss": "0.512", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "29330", "lr": "0.000195534", "gnorm": "7.691", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7162"} 2023-01-29 18:11:06 | INFO | train_inner | {"epoch": 14, "update": 13.575, "s2c_loss": "0.698", "loss": "0.4839", "s2c_nll_loss": "0.698", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "29340", "lr": "0.0001956", "gnorm": "8.44", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7164"} 2023-01-29 18:11:09 | INFO | train_inner | {"epoch": 14, "update": 13.58, "s2c_loss": "0.621", "loss": "0.43037", "s2c_nll_loss": "0.621", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "29350", "lr": "0.000195667", "gnorm": "8.265", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7167"} 2023-01-29 18:11:12 | INFO | train_inner | {"epoch": 14, "update": 13.585, "s2c_loss": "0.48", "loss": "0.33245", "s2c_nll_loss": "0.48", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "29360", "lr": "0.000195734", "gnorm": "6.497", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7169"} 2023-01-29 18:11:14 | INFO | train_inner | {"epoch": 14, "update": 13.589, "s2c_loss": "0.562", "loss": "0.38922", "s2c_nll_loss": "0.562", "s2c_accuracy": "89.062", "s2c_total": "64", 
"s2c_n_correct": "57", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "29370", "lr": "0.0001958", "gnorm": "7.216", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7172"} 2023-01-29 18:11:17 | INFO | train_inner | {"epoch": 14, "update": 13.594, "s2c_loss": "0.668", "loss": "0.46284", "s2c_nll_loss": "0.668", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "29380", "lr": "0.000195867", "gnorm": "7.052", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7175"} 2023-01-29 18:11:19 | INFO | train_inner | {"epoch": 14, "update": 13.599, "s2c_loss": "0.57", "loss": "0.39514", "s2c_nll_loss": "0.57", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "29390", "lr": "0.000195934", "gnorm": "5.881", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7177"} 2023-01-29 18:11:22 | INFO | train_inner | {"epoch": 14, "update": 13.603, "s2c_loss": "0.492", "loss": "0.34131", "s2c_nll_loss": "0.492", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "244.7", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "29400", "lr": "0.000196", "gnorm": "6.396", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "7180"} 2023-01-29 18:11:24 | INFO | train_inner | {"epoch": 14, "update": 13.608, "s2c_loss": "0.445", "loss": "0.30849", "s2c_nll_loss": "0.445", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "29410", "lr": "0.000196067", "gnorm": "6.903", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7182"} 2023-01-29 18:11:27 | INFO | train_inner | {"epoch": 14, "update": 13.612, "s2c_loss": "0.577", "loss": "0.40026", "s2c_nll_loss": "0.577", "s2c_accuracy": "90.312", "s2c_total": 
"64", "s2c_n_correct": "57.8", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "29420", "lr": "0.000196134", "gnorm": "6.785", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7185"} 2023-01-29 18:11:29 | INFO | train_inner | {"epoch": 14, "update": 13.617, "s2c_loss": "0.639", "loss": "0.44283", "s2c_nll_loss": "0.639", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "29430", "lr": "0.0001962", "gnorm": "7.436", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7187"} 2023-01-29 18:11:32 | INFO | train_inner | {"epoch": 14, "update": 13.622, "s2c_loss": "0.475", "loss": "0.32896", "s2c_nll_loss": "0.475", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "29440", "lr": "0.000196267", "gnorm": "5.862", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7190"} 2023-01-29 18:11:35 | INFO | train_inner | {"epoch": 14, "update": 13.626, "s2c_loss": "0.452", "loss": "0.3131", "s2c_nll_loss": "0.452", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "246", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "29450", "lr": "0.000196334", "gnorm": "6.027", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7193"} 2023-01-29 18:11:37 | INFO | train_inner | {"epoch": 14, "update": 13.631, "s2c_loss": "0.528", "loss": "0.36609", "s2c_nll_loss": "0.528", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "245.8", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "29460", "lr": "0.0001964", "gnorm": "6.726", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7195"} 2023-01-29 18:11:40 | INFO | train_inner | {"epoch": 14, "update": 13.636, "s2c_loss": "0.492", "loss": "0.34104", "s2c_nll_loss": "0.492", "s2c_accuracy": "90.312", 
"s2c_total": "64", "s2c_n_correct": "57.8", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "29470", "lr": "0.000196467", "gnorm": "6.47", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7198"} 2023-01-29 18:11:42 | INFO | train_inner | {"epoch": 14, "update": 13.64, "s2c_loss": "0.757", "loss": "0.52487", "s2c_nll_loss": "0.757", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "245.6", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "29480", "lr": "0.000196534", "gnorm": "6.727", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7200"} 2023-01-29 18:11:45 | INFO | train_inner | {"epoch": 14, "update": 13.645, "s2c_loss": "0.399", "loss": "0.27661", "s2c_nll_loss": "0.399", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "29490", "lr": "0.0001966", "gnorm": "5.284", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7203"} 2023-01-29 18:11:47 | INFO | train_inner | {"epoch": 14, "update": 13.649, "s2c_loss": "0.551", "loss": "0.38218", "s2c_nll_loss": "0.551", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "29500", "lr": "0.000196667", "gnorm": "6.267", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7205"} 2023-01-29 18:11:50 | INFO | train_inner | {"epoch": 14, "update": 13.654, "s2c_loss": "0.439", "loss": "0.30418", "s2c_nll_loss": "0.439", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "245.9", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "29510", "lr": "0.000196733", "gnorm": "6.193", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7208"} 2023-01-29 18:11:53 | INFO | train_inner | {"epoch": 14, "update": 13.659, "s2c_loss": "0.431", "loss": "0.29889", "s2c_nll_loss": "0.431", 
"s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "29520", "lr": "0.0001968", "gnorm": "6.52", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7211"} 2023-01-29 18:11:55 | INFO | train_inner | {"epoch": 14, "update": 13.663, "s2c_loss": "0.429", "loss": "0.29757", "s2c_nll_loss": "0.429", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "244", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "29530", "lr": "0.000196867", "gnorm": "6.396", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7213"} 2023-01-29 18:11:58 | INFO | train_inner | {"epoch": 14, "update": 13.668, "s2c_loss": "0.434", "loss": "0.30099", "s2c_nll_loss": "0.434", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "246.2", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "29540", "lr": "0.000196933", "gnorm": "7.357", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7216"} 2023-01-29 18:12:00 | INFO | train_inner | {"epoch": 14, "update": 13.673, "s2c_loss": "0.589", "loss": "0.40856", "s2c_nll_loss": "0.589", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "245.9", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "29550", "lr": "0.000197", "gnorm": "7.256", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7218"} 2023-01-29 18:12:03 | INFO | train_inner | {"epoch": 14, "update": 13.677, "s2c_loss": "0.493", "loss": "0.34158", "s2c_nll_loss": "0.493", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "29560", "lr": "0.000197067", "gnorm": "7.311", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7221"} 2023-01-29 18:12:06 | INFO | train_inner | {"epoch": 14, "update": 13.682, "s2c_loss": "0.517", "loss": "0.35837", "s2c_nll_loss": 
"0.517", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "29570", "lr": "0.000197133", "gnorm": "5.633", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7223"} 2023-01-29 18:12:08 | INFO | train_inner | {"epoch": 14, "update": 13.686, "s2c_loss": "0.466", "loss": "0.32325", "s2c_nll_loss": "0.466", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "29580", "lr": "0.0001972", "gnorm": "7.337", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7226"} 2023-01-29 18:12:11 | INFO | train_inner | {"epoch": 14, "update": 13.691, "s2c_loss": "0.463", "loss": "0.32096", "s2c_nll_loss": "0.463", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "245.3", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "29590", "lr": "0.000197267", "gnorm": "6.981", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7229"} 2023-01-29 18:12:13 | INFO | train_inner | {"epoch": 14, "update": 13.696, "s2c_loss": "0.656", "loss": "0.45451", "s2c_nll_loss": "0.656", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "29600", "lr": "0.000197333", "gnorm": "7.506", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7231"} 2023-01-29 18:12:16 | INFO | train_inner | {"epoch": 14, "update": 13.7, "s2c_loss": "0.467", "loss": "0.32378", "s2c_nll_loss": "0.467", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "29610", "lr": "0.0001974", "gnorm": "8.399", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7234"} 2023-01-29 18:12:18 | INFO | train_inner | {"epoch": 14, "update": 13.705, "s2c_loss": "0.493", "loss": "0.34161", 
"s2c_nll_loss": "0.493", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "29620", "lr": "0.000197467", "gnorm": "7.651", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7236"} 2023-01-29 18:12:21 | INFO | train_inner | {"epoch": 14, "update": 13.71, "s2c_loss": "0.501", "loss": "0.34725", "s2c_nll_loss": "0.501", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "29630", "lr": "0.000197533", "gnorm": "6.313", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7239"} 2023-01-29 18:12:23 | INFO | train_inner | {"epoch": 14, "update": 13.714, "s2c_loss": "0.504", "loss": "0.34926", "s2c_nll_loss": "0.504", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "29640", "lr": "0.0001976", "gnorm": "8.03", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "7241"} 2023-01-29 18:12:26 | INFO | train_inner | {"epoch": 14, "update": 13.719, "s2c_loss": "0.675", "loss": "0.46762", "s2c_nll_loss": "0.675", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "29650", "lr": "0.000197667", "gnorm": "7.647", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7244"} 2023-01-29 18:12:29 | INFO | train_inner | {"epoch": 14, "update": 13.723, "s2c_loss": "0.559", "loss": "0.38728", "s2c_nll_loss": "0.559", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "29660", "lr": "0.000197733", "gnorm": "7.753", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7246"} 2023-01-29 18:12:31 | INFO | train_inner | {"epoch": 14, "update": 13.728, "s2c_loss": "0.57", "loss": 
"0.39506", "s2c_nll_loss": "0.57", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "29670", "lr": "0.0001978", "gnorm": "6.915", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7249"} 2023-01-29 18:12:34 | INFO | train_inner | {"epoch": 14, "update": 13.733, "s2c_loss": "0.582", "loss": "0.40326", "s2c_nll_loss": "0.582", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "29680", "lr": "0.000197867", "gnorm": "7.367", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7252"} 2023-01-29 18:12:36 | INFO | train_inner | {"epoch": 14, "update": 13.737, "s2c_loss": "0.621", "loss": "0.43013", "s2c_nll_loss": "0.621", "s2c_accuracy": "89.062", "s2c_total": "64", "s2c_n_correct": "57", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "29690", "lr": "0.000197933", "gnorm": "8.331", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7254"} 2023-01-29 18:12:39 | INFO | train_inner | {"epoch": 14, "update": 13.742, "s2c_loss": "0.732", "loss": "0.50713", "s2c_nll_loss": "0.732", "s2c_accuracy": "88.281", "s2c_total": "64", "s2c_n_correct": "56.5", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "29700", "lr": "0.000198", "gnorm": "8.161", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7257"} 2023-01-29 18:12:41 | INFO | train_inner | {"epoch": 14, "update": 13.747, "s2c_loss": "0.463", "loss": "0.32098", "s2c_nll_loss": "0.463", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "29710", "lr": "0.000198067", "gnorm": "5.731", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7259"} 2023-01-29 18:12:44 | INFO | train_inner | {"epoch": 14, "update": 13.751, "s2c_loss": "0.505", "loss": 
"0.35005", "s2c_nll_loss": "0.505", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "29720", "lr": "0.000198133", "gnorm": "6.742", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7262"} 2023-01-29 18:12:46 | INFO | train_inner | {"epoch": 14, "update": 13.756, "s2c_loss": "0.48", "loss": "0.33285", "s2c_nll_loss": "0.48", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "29730", "lr": "0.0001982", "gnorm": "7.085", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7264"} 2023-01-29 18:12:49 | INFO | train_inner | {"epoch": 14, "update": 13.76, "s2c_loss": "0.466", "loss": "0.32268", "s2c_nll_loss": "0.466", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "29740", "lr": "0.000198267", "gnorm": "7.155", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7267"} 2023-01-29 18:12:51 | INFO | train_inner | {"epoch": 14, "update": 13.765, "s2c_loss": "0.455", "loss": "0.31511", "s2c_nll_loss": "0.455", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "29750", "lr": "0.000198333", "gnorm": "6.166", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7269"} 2023-01-29 18:12:54 | INFO | train_inner | {"epoch": 14, "update": 13.77, "s2c_loss": "0.556", "loss": "0.38539", "s2c_nll_loss": "0.556", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "29760", "lr": "0.0001984", "gnorm": "6.952", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7272"} 2023-01-29 18:12:56 | INFO | train_inner | {"epoch": 14, "update": 13.774, "s2c_loss": "0.526", 
"loss": "0.36453", "s2c_nll_loss": "0.526", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "29770", "lr": "0.000198467", "gnorm": "6.087", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7274"} 2023-01-29 18:12:59 | INFO | train_inner | {"epoch": 14, "update": 13.779, "s2c_loss": "0.54", "loss": "0.37433", "s2c_nll_loss": "0.54", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "29780", "lr": "0.000198533", "gnorm": "6.972", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7277"} 2023-01-29 18:13:01 | INFO | train_inner | {"epoch": 14, "update": 13.784, "s2c_loss": "0.537", "loss": "0.37398", "s2c_nll_loss": "0.537", "s2c_accuracy": "90.738", "s2c_total": "63.7", "s2c_n_correct": "57.8", "wps": "254.2", "ups": "3.99", "wpb": "63.7", "bsz": "63.7", "num_updates": "29790", "lr": "0.0001986", "gnorm": "7.207", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7279"} 2023-01-29 18:13:04 | INFO | train_inner | {"epoch": 14, "update": 13.788, "s2c_loss": "0.693", "loss": "0.48042", "s2c_nll_loss": "0.693", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "257.6", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "29800", "lr": "0.000198667", "gnorm": "8.314", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "7282"} 2023-01-29 18:13:06 | INFO | train_inner | {"epoch": 14, "update": 13.793, "s2c_loss": "0.452", "loss": "0.31355", "s2c_nll_loss": "0.452", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "29810", "lr": "0.000198733", "gnorm": "5.986", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7284"} 2023-01-29 18:13:09 | INFO | train_inner | {"epoch": 14, "update": 13.797, 
"s2c_loss": "0.562", "loss": "0.38974", "s2c_nll_loss": "0.562", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "29820", "lr": "0.0001988", "gnorm": "7.375", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7287"} 2023-01-29 18:13:11 | INFO | train_inner | {"epoch": 14, "update": 13.802, "s2c_loss": "0.555", "loss": "0.38461", "s2c_nll_loss": "0.555", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "29830", "lr": "0.000198867", "gnorm": "6.758", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7289"} 2023-01-29 18:13:14 | INFO | train_inner | {"epoch": 14, "update": 13.807, "s2c_loss": "0.548", "loss": "0.38005", "s2c_nll_loss": "0.548", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "257", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "29840", "lr": "0.000198933", "gnorm": "8.086", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7292"} 2023-01-29 18:13:17 | INFO | train_inner | {"epoch": 14, "update": 13.811, "s2c_loss": "0.552", "loss": "0.38255", "s2c_nll_loss": "0.552", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "29850", "lr": "0.000199", "gnorm": "6.26", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "7294"} 2023-01-29 18:13:19 | INFO | train_inner | {"epoch": 14, "update": 13.816, "s2c_loss": "0.498", "loss": "0.34489", "s2c_nll_loss": "0.498", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "29860", "lr": "0.000199067", "gnorm": "6.919", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7297"} 2023-01-29 18:13:22 | INFO | train_inner | {"epoch": 14, "update": 
13.821, "s2c_loss": "0.593", "loss": "0.41081", "s2c_nll_loss": "0.593", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "29870", "lr": "0.000199133", "gnorm": "8.168", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7300"} 2023-01-29 18:13:24 | INFO | train_inner | {"epoch": 14, "update": 13.825, "s2c_loss": "0.53", "loss": "0.3672", "s2c_nll_loss": "0.53", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "29880", "lr": "0.0001992", "gnorm": "6.926", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7302"} 2023-01-29 18:13:27 | INFO | train_inner | {"epoch": 14, "update": 13.83, "s2c_loss": "0.569", "loss": "0.39415", "s2c_nll_loss": "0.569", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "29890", "lr": "0.000199267", "gnorm": "6.517", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7305"} 2023-01-29 18:13:29 | INFO | train_inner | {"epoch": 14, "update": 13.834, "s2c_loss": "0.546", "loss": "0.3783", "s2c_nll_loss": "0.546", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "29900", "lr": "0.000199333", "gnorm": "6.423", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7307"} 2023-01-29 18:13:32 | INFO | train_inner | {"epoch": 14, "update": 13.839, "s2c_loss": "0.528", "loss": "0.36622", "s2c_nll_loss": "0.528", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "29910", "lr": "0.0001994", "gnorm": "6.115", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7310"} 2023-01-29 18:13:34 | INFO | train_inner | {"epoch": 14, 
"update": 13.844, "s2c_loss": "0.43", "loss": "0.29791", "s2c_nll_loss": "0.43", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "29920", "lr": "0.000199467", "gnorm": "6.047", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7312"} 2023-01-29 18:13:37 | INFO | train_inner | {"epoch": 14, "update": 13.848, "s2c_loss": "0.585", "loss": "0.40541", "s2c_nll_loss": "0.585", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "29930", "lr": "0.000199533", "gnorm": "6.929", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7315"} 2023-01-29 18:13:39 | INFO | train_inner | {"epoch": 14, "update": 13.853, "s2c_loss": "0.855", "loss": "0.59239", "s2c_nll_loss": "0.855", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "29940", "lr": "0.0001996", "gnorm": "6.871", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7317"} 2023-01-29 18:13:42 | INFO | train_inner | {"epoch": 14, "update": 13.858, "s2c_loss": "0.408", "loss": "0.28286", "s2c_nll_loss": "0.408", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "29950", "lr": "0.000199667", "gnorm": "5.816", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7320"} 2023-01-29 18:13:45 | INFO | train_inner | {"epoch": 14, "update": 13.862, "s2c_loss": "0.462", "loss": "0.3199", "s2c_nll_loss": "0.462", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "29960", "lr": "0.000199733", "gnorm": "5.964", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "7322"} 2023-01-29 18:13:47 | INFO | train_inner | 
{"epoch": 14, "update": 13.867, "s2c_loss": "0.509", "loss": "0.35274", "s2c_nll_loss": "0.509", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "242.9", "ups": "3.8", "wpb": "64", "bsz": "64", "num_updates": "29970", "lr": "0.0001998", "gnorm": "6.931", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7325"} 2023-01-29 18:13:50 | INFO | train_inner | {"epoch": 14, "update": 13.871, "s2c_loss": "0.466", "loss": "0.32301", "s2c_nll_loss": "0.466", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "246.6", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "29980", "lr": "0.000199867", "gnorm": "6.7", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7328"} 2023-01-29 18:13:52 | INFO | train_inner | {"epoch": 14, "update": 13.876, "s2c_loss": "0.56", "loss": "0.38835", "s2c_nll_loss": "0.56", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "29990", "lr": "0.000199933", "gnorm": "7.92", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7330"} 2023-01-29 18:13:55 | INFO | train_inner | {"epoch": 14, "update": 13.881, "s2c_loss": "0.482", "loss": "0.33433", "s2c_nll_loss": "0.482", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "244.4", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "30000", "lr": "0.0002", "gnorm": "6.871", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7333"} 2023-01-29 18:13:55 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 18:14:10 | INFO | valid | {"epoch": 14, "valid_s2c_loss": "1.176", "valid_loss": "0.8149", "valid_s2c_nll_loss": "1.176", "valid_s2c_accuracy": "78.922", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "25.2222", "valid_num_updates": "30000", "valid_best_s2c_accuracy": "81.037"} 2023-01-29 18:14:10 | INFO | fairseq.checkpoint_utils | 
Preparing to save checkpoint for epoch 14 @ 30000 updates 2023-01-29 18:14:10 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_14_30000.pt 2023-01-29 18:14:13 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_14_30000.pt 2023-01-29 18:14:18 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_14_30000.pt (epoch 14 @ 30000 updates, score 78.922) (writing took 8.41398049890995 seconds) 2023-01-29 18:14:21 | INFO | train_inner | {"epoch": 14, "update": 13.885, "s2c_loss": "0.519", "loss": "0.36009", "s2c_nll_loss": "0.519", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "25", "ups": "0.39", "wpb": "64", "bsz": "64", "num_updates": "30010", "lr": "0.000199933", "gnorm": "6.437", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7358"} 2023-01-29 18:14:23 | INFO | train_inner | {"epoch": 14, "update": 13.89, "s2c_loss": "0.486", "loss": "0.3368", "s2c_nll_loss": "0.486", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "241.6", "ups": "3.78", "wpb": "64", "bsz": "64", "num_updates": "30020", "lr": "0.000199867", "gnorm": "7.953", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7361"} 2023-01-29 18:14:26 | INFO | train_inner | {"epoch": 14, "update": 13.895, "s2c_loss": "0.588", "loss": "0.4079", "s2c_nll_loss": "0.588", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "30030", "lr": "0.0001998", "gnorm": "6.756", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7364"} 2023-01-29 18:14:28 | INFO | train_inner | {"epoch": 14, "update": 13.899, "s2c_loss": "0.601", "loss": "0.41663", "s2c_nll_loss": "0.601", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": 
"249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "30040", "lr": "0.000199733", "gnorm": "7.353", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7366"} 2023-01-29 18:14:31 | INFO | train_inner | {"epoch": 14, "update": 13.904, "s2c_loss": "0.476", "loss": "0.32984", "s2c_nll_loss": "0.476", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "246.5", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "30050", "lr": "0.000199667", "gnorm": "6.746", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "7369"} 2023-01-29 18:14:34 | INFO | train_inner | {"epoch": 14, "update": 13.908, "s2c_loss": "0.474", "loss": "0.32863", "s2c_nll_loss": "0.474", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "245", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "30060", "lr": "0.0001996", "gnorm": "6.366", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7371"} 2023-01-29 18:14:36 | INFO | train_inner | {"epoch": 14, "update": 13.913, "s2c_loss": "0.421", "loss": "0.2915", "s2c_nll_loss": "0.421", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "30070", "lr": "0.000199533", "gnorm": "6.231", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7374"} 2023-01-29 18:14:39 | INFO | train_inner | {"epoch": 14, "update": 13.918, "s2c_loss": "0.46", "loss": "0.31877", "s2c_nll_loss": "0.46", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "30080", "lr": "0.000199467", "gnorm": "6.059", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7377"} 2023-01-29 18:14:41 | INFO | train_inner | {"epoch": 14, "update": 13.922, "s2c_loss": "0.439", "loss": "0.30429", "s2c_nll_loss": "0.439", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": 
"58.5", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "30090", "lr": "0.0001994", "gnorm": "6.093", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7379"} 2023-01-29 18:14:44 | INFO | train_inner | {"epoch": 14, "update": 13.927, "s2c_loss": "0.472", "loss": "0.32741", "s2c_nll_loss": "0.472", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "30100", "lr": "0.000199333", "gnorm": "6.879", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7382"} 2023-01-29 18:14:46 | INFO | train_inner | {"epoch": 14, "update": 13.932, "s2c_loss": "0.406", "loss": "0.28156", "s2c_nll_loss": "0.406", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "30110", "lr": "0.000199267", "gnorm": "6.246", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7384"} 2023-01-29 18:14:49 | INFO | train_inner | {"epoch": 14, "update": 13.936, "s2c_loss": "0.569", "loss": "0.39423", "s2c_nll_loss": "0.569", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "30120", "lr": "0.0001992", "gnorm": "7.053", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7387"} 2023-01-29 18:14:51 | INFO | train_inner | {"epoch": 14, "update": 13.941, "s2c_loss": "0.558", "loss": "0.38659", "s2c_nll_loss": "0.558", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "30130", "lr": "0.000199133", "gnorm": "7.53", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7389"} 2023-01-29 18:14:54 | INFO | train_inner | {"epoch": 14, "update": 13.945, "s2c_loss": "0.618", "loss": "0.42836", "s2c_nll_loss": "0.618", "s2c_accuracy": "89.375", "s2c_total": "64", 
"s2c_n_correct": "57.2", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "30140", "lr": "0.000199067", "gnorm": "8.524", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7392"} 2023-01-29 18:14:56 | INFO | train_inner | {"epoch": 14, "update": 13.95, "s2c_loss": "0.473", "loss": "0.32783", "s2c_nll_loss": "0.473", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "30150", "lr": "0.000199", "gnorm": "7.462", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7394"} 2023-01-29 18:14:59 | INFO | train_inner | {"epoch": 14, "update": 13.955, "s2c_loss": "0.471", "loss": "0.32679", "s2c_nll_loss": "0.471", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "30160", "lr": "0.000198933", "gnorm": "7.927", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7397"} 2023-01-29 18:15:02 | INFO | train_inner | {"epoch": 14, "update": 13.959, "s2c_loss": "0.573", "loss": "0.39694", "s2c_nll_loss": "0.573", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "30170", "lr": "0.000198867", "gnorm": "6.444", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7399"} 2023-01-29 18:15:04 | INFO | train_inner | {"epoch": 14, "update": 13.964, "s2c_loss": "0.489", "loss": "0.33918", "s2c_nll_loss": "0.489", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "30180", "lr": "0.0001988", "gnorm": "7.767", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7402"} 2023-01-29 18:15:07 | INFO | train_inner | {"epoch": 14, "update": 13.969, "s2c_loss": "0.49", "loss": "0.33964", "s2c_nll_loss": "0.49", "s2c_accuracy": "90.781", 
"s2c_total": "64", "s2c_n_correct": "58.1", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "30190", "lr": "0.000198733", "gnorm": "6.554", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7405"} 2023-01-29 18:15:09 | INFO | train_inner | {"epoch": 14, "update": 13.973, "s2c_loss": "0.552", "loss": "0.38289", "s2c_nll_loss": "0.552", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "30200", "lr": "0.000198667", "gnorm": "7.728", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7407"} 2023-01-29 18:15:12 | INFO | train_inner | {"epoch": 14, "update": 13.978, "s2c_loss": "0.563", "loss": "0.39053", "s2c_nll_loss": "0.563", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "30210", "lr": "0.0001986", "gnorm": "7.517", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7410"} 2023-01-29 18:15:14 | INFO | train_inner | {"epoch": 14, "update": 13.982, "s2c_loss": "0.75", "loss": "0.52013", "s2c_nll_loss": "0.75", "s2c_accuracy": "86.562", "s2c_total": "64", "s2c_n_correct": "55.4", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "30220", "lr": "0.000198533", "gnorm": "9.069", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7412"} 2023-01-29 18:15:17 | INFO | train_inner | {"epoch": 14, "update": 13.987, "s2c_loss": "0.483", "loss": "0.33507", "s2c_nll_loss": "0.483", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "30230", "lr": "0.000198467", "gnorm": "6.747", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "7415"} 2023-01-29 18:15:19 | INFO | train_inner | {"epoch": 14, "update": 13.992, "s2c_loss": "0.553", "loss": "0.38342", "s2c_nll_loss": "0.553", 
"s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "30240", "lr": "0.0001984", "gnorm": "7.682", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "7417"} 2023-01-29 18:15:22 | INFO | train_inner | {"epoch": 14, "update": 13.996, "s2c_loss": "0.567", "loss": "0.39314", "s2c_nll_loss": "0.567", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "30250", "lr": "0.000198333", "gnorm": "6.61", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7420"} 2023-01-29 18:15:24 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 18:15:39 | INFO | valid | {"epoch": 14, "valid_s2c_loss": "0.86", "valid_loss": "0.59612", "valid_s2c_nll_loss": "0.86", "valid_s2c_accuracy": "84.427", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "26.9815", "valid_num_updates": "30258", "valid_best_s2c_accuracy": "84.427"} 2023-01-29 18:15:39 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 14 @ 30258 updates 2023-01-29 18:15:39 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 18:15:46 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 18:15:50 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt (epoch 14 @ 30258 updates, score 84.427) (writing took 11.687433236278594 seconds) 2023-01-29 18:15:50 | INFO | fairseq_cli.train | end of epoch 14 (average epoch stats below) 2023-01-29 18:15:50 | INFO | train | {"epoch": 14, "train_s2c_loss": "0.517", "train_loss": "0.35864", "train_s2c_nll_loss": "0.517", "train_s2c_accuracy": "90.636", "train_s2c_total": "63.9838", "train_s2c_n_correct": "57.9926", "train_wps": 
"227.7", "train_ups": "3.56", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "30258", "train_lr": "0.00019828", "train_gnorm": "6.893", "train_loss_scale": "256", "train_train_wall": "544", "train_gb_free": "7.5", "train_wall": "7448"} 2023-01-29 18:15:57 | INFO | fairseq.trainer | begin training epoch 15 2023-01-29 18:15:57 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 18:15:57 | INFO | train_inner | {"epoch": 15, "update": 14.001, "s2c_loss": "0.51", "loss": "0.35337", "s2c_nll_loss": "0.51", "s2c_accuracy": "91.941", "s2c_total": "60.8", "s2c_n_correct": "55.9", "wps": "17.3", "ups": "0.28", "wpb": "60.8", "bsz": "60.8", "num_updates": "30260", "lr": "0.000198267", "gnorm": "6.16", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7455"} 2023-01-29 18:16:00 | INFO | train_inner | {"epoch": 15, "update": 14.006, "s2c_loss": "0.575", "loss": "0.39871", "s2c_nll_loss": "0.575", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "30270", "lr": "0.0001982", "gnorm": "7.406", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7458"} 2023-01-29 18:16:02 | INFO | train_inner | {"epoch": 15, "update": 14.01, "s2c_loss": "0.578", "loss": "0.40061", "s2c_nll_loss": "0.578", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "30280", "lr": "0.000198133", "gnorm": "7.805", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "7460"} 2023-01-29 18:16:05 | INFO | train_inner | {"epoch": 15, "update": 14.015, "s2c_loss": "0.487", "loss": "0.33731", "s2c_nll_loss": "0.487", "s2c_accuracy": "90.625", "s2c_total": "64", "s2c_n_correct": "58", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "30290", "lr": "0.000198067", "gnorm": "8.002", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", 
"wall": "7463"} 2023-01-29 18:16:07 | INFO | train_inner | {"epoch": 15, "update": 14.019, "s2c_loss": "0.421", "loss": "0.29161", "s2c_nll_loss": "0.421", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "30300", "lr": "0.000198", "gnorm": "5.709", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7465"} 2023-01-29 18:16:10 | INFO | train_inner | {"epoch": 15, "update": 14.024, "s2c_loss": "0.751", "loss": "0.52044", "s2c_nll_loss": "0.751", "s2c_accuracy": "87.188", "s2c_total": "64", "s2c_n_correct": "55.8", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "30310", "lr": "0.000197933", "gnorm": "6.877", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7468"} 2023-01-29 18:16:12 | INFO | train_inner | {"epoch": 15, "update": 14.029, "s2c_loss": "0.334", "loss": "0.2312", "s2c_nll_loss": "0.334", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "245.7", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "30320", "lr": "0.000197867", "gnorm": "5.527", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7470"} 2023-01-29 18:16:15 | INFO | train_inner | {"epoch": 15, "update": 14.033, "s2c_loss": "0.472", "loss": "0.32749", "s2c_nll_loss": "0.472", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "246", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "30330", "lr": "0.0001978", "gnorm": "6.941", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7473"} 2023-01-29 18:16:18 | INFO | train_inner | {"epoch": 15, "update": 14.038, "s2c_loss": "0.486", "loss": "0.33662", "s2c_nll_loss": "0.486", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "30340", "lr": "0.000197733", "gnorm": "6.671", "loss_scale": "256", "train_wall": "2", 
"gb_free": "7.2", "wall": "7475"} 2023-01-29 18:16:20 | INFO | train_inner | {"epoch": 15, "update": 14.043, "s2c_loss": "0.446", "loss": "0.30918", "s2c_nll_loss": "0.446", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "30350", "lr": "0.000197667", "gnorm": "6.392", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7478"} 2023-01-29 18:16:23 | INFO | train_inner | {"epoch": 15, "update": 14.047, "s2c_loss": "0.409", "loss": "0.2838", "s2c_nll_loss": "0.409", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "30360", "lr": "0.0001976", "gnorm": "5.778", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7481"} 2023-01-29 18:16:25 | INFO | train_inner | {"epoch": 15, "update": 14.052, "s2c_loss": "0.502", "loss": "0.34782", "s2c_nll_loss": "0.502", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "30370", "lr": "0.000197533", "gnorm": "6.949", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7483"} 2023-01-29 18:16:28 | INFO | train_inner | {"epoch": 15, "update": 14.056, "s2c_loss": "0.491", "loss": "0.3404", "s2c_nll_loss": "0.491", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "30380", "lr": "0.000197467", "gnorm": "6.059", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7486"} 2023-01-29 18:16:30 | INFO | train_inner | {"epoch": 15, "update": 14.061, "s2c_loss": "0.403", "loss": "0.27968", "s2c_nll_loss": "0.403", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "30390", "lr": "0.0001974", "gnorm": "6.386", "loss_scale": "256", 
"train_wall": "3", "gb_free": "7.3", "wall": "7488"} 2023-01-29 18:16:33 | INFO | train_inner | {"epoch": 15, "update": 14.066, "s2c_loss": "0.452", "loss": "0.31334", "s2c_nll_loss": "0.452", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "30400", "lr": "0.000197333", "gnorm": "7.559", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7491"} 2023-01-29 18:16:35 | INFO | train_inner | {"epoch": 15, "update": 14.07, "s2c_loss": "0.412", "loss": "0.28572", "s2c_nll_loss": "0.412", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "30410", "lr": "0.000197267", "gnorm": "6.454", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7493"} 2023-01-29 18:16:38 | INFO | train_inner | {"epoch": 15, "update": 14.075, "s2c_loss": "0.412", "loss": "0.28532", "s2c_nll_loss": "0.412", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "30420", "lr": "0.0001972", "gnorm": "6.103", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7496"} 2023-01-29 18:16:40 | INFO | train_inner | {"epoch": 15, "update": 14.08, "s2c_loss": "0.521", "loss": "0.36087", "s2c_nll_loss": "0.521", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "30430", "lr": "0.000197133", "gnorm": "6.674", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7498"} 2023-01-29 18:16:43 | INFO | train_inner | {"epoch": 15, "update": 14.084, "s2c_loss": "0.462", "loss": "0.31992", "s2c_nll_loss": "0.462", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "30440", "lr": "0.000197067", "gnorm": "6.862", 
"loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7501"} 2023-01-29 18:16:46 | INFO | train_inner | {"epoch": 15, "update": 14.089, "s2c_loss": "0.483", "loss": "0.33474", "s2c_nll_loss": "0.483", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "30450", "lr": "0.000197", "gnorm": "5.686", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7504"} 2023-01-29 18:16:48 | INFO | train_inner | {"epoch": 15, "update": 14.093, "s2c_loss": "0.368", "loss": "0.25513", "s2c_nll_loss": "0.368", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "246.6", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "30460", "lr": "0.000196933", "gnorm": "6.091", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7506"} 2023-01-29 18:16:51 | INFO | train_inner | {"epoch": 15, "update": 14.098, "s2c_loss": "0.486", "loss": "0.33701", "s2c_nll_loss": "0.486", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "30470", "lr": "0.000196867", "gnorm": "7.138", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7509"} 2023-01-29 18:16:53 | INFO | train_inner | {"epoch": 15, "update": 14.103, "s2c_loss": "0.322", "loss": "0.22303", "s2c_nll_loss": "0.322", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "244.8", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "30480", "lr": "0.0001968", "gnorm": "5.461", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7511"} 2023-01-29 18:16:56 | INFO | train_inner | {"epoch": 15, "update": 14.107, "s2c_loss": "0.586", "loss": "0.40642", "s2c_nll_loss": "0.586", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "244.8", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "30490", "lr": "0.000196733", 
"gnorm": "5.618", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7514"} 2023-01-29 18:16:59 | INFO | train_inner | {"epoch": 15, "update": 14.112, "s2c_loss": "0.379", "loss": "0.26244", "s2c_nll_loss": "0.379", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "30500", "lr": "0.000196667", "gnorm": "6.245", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7517"} 2023-01-29 18:17:01 | INFO | train_inner | {"epoch": 15, "update": 14.117, "s2c_loss": "0.462", "loss": "0.32013", "s2c_nll_loss": "0.462", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "30510", "lr": "0.0001966", "gnorm": "5.639", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7519"} 2023-01-29 18:17:04 | INFO | train_inner | {"epoch": 15, "update": 14.121, "s2c_loss": "0.469", "loss": "0.3248", "s2c_nll_loss": "0.469", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "30520", "lr": "0.000196534", "gnorm": "7.233", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7522"} 2023-01-29 18:17:06 | INFO | train_inner | {"epoch": 15, "update": 14.126, "s2c_loss": "0.406", "loss": "0.28126", "s2c_nll_loss": "0.406", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "30530", "lr": "0.000196467", "gnorm": "6.034", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7524"} 2023-01-29 18:17:09 | INFO | train_inner | {"epoch": 15, "update": 14.13, "s2c_loss": "0.383", "loss": "0.26546", "s2c_nll_loss": "0.383", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "30540", "lr": 
"0.0001964", "gnorm": "5.973", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7527"} 2023-01-29 18:17:11 | INFO | train_inner | {"epoch": 15, "update": 14.135, "s2c_loss": "0.348", "loss": "0.24116", "s2c_nll_loss": "0.348", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "30550", "lr": "0.000196334", "gnorm": "5.509", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7529"} 2023-01-29 18:17:14 | INFO | train_inner | {"epoch": 15, "update": 14.14, "s2c_loss": "0.624", "loss": "0.43285", "s2c_nll_loss": "0.624", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "30560", "lr": "0.000196267", "gnorm": "5.608", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7532"} 2023-01-29 18:17:16 | INFO | train_inner | {"epoch": 15, "update": 14.144, "s2c_loss": "0.327", "loss": "0.22662", "s2c_nll_loss": "0.327", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "30570", "lr": "0.0001962", "gnorm": "5.614", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7534"} 2023-01-29 18:17:19 | INFO | train_inner | {"epoch": 15, "update": 14.149, "s2c_loss": "0.46", "loss": "0.31879", "s2c_nll_loss": "0.46", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "30580", "lr": "0.000196134", "gnorm": "5.659", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7537"} 2023-01-29 18:17:22 | INFO | train_inner | {"epoch": 15, "update": 14.154, "s2c_loss": "0.386", "loss": "0.26749", "s2c_nll_loss": "0.386", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "244.3", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": 
"30590", "lr": "0.000196067", "gnorm": "5.592", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "7539"} 2023-01-29 18:17:24 | INFO | train_inner | {"epoch": 15, "update": 14.158, "s2c_loss": "0.429", "loss": "0.29758", "s2c_nll_loss": "0.429", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "30600", "lr": "0.000196", "gnorm": "6.184", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7542"} 2023-01-29 18:17:27 | INFO | train_inner | {"epoch": 15, "update": 14.163, "s2c_loss": "0.394", "loss": "0.27305", "s2c_nll_loss": "0.394", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "30610", "lr": "0.000195934", "gnorm": "5.989", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7545"} 2023-01-29 18:17:29 | INFO | train_inner | {"epoch": 15, "update": 14.167, "s2c_loss": "0.446", "loss": "0.30932", "s2c_nll_loss": "0.446", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "30620", "lr": "0.000195867", "gnorm": "6.976", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "7547"} 2023-01-29 18:17:32 | INFO | train_inner | {"epoch": 15, "update": 14.172, "s2c_loss": "0.543", "loss": "0.37634", "s2c_nll_loss": "0.543", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "30630", "lr": "0.0001958", "gnorm": "6.406", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7550"} 2023-01-29 18:17:34 | INFO | train_inner | {"epoch": 15, "update": 14.177, "s2c_loss": "0.443", "loss": "0.30717", "s2c_nll_loss": "0.443", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "244.2", "ups": "3.81", "wpb": "64", "bsz": "64", 
"num_updates": "30640", "lr": "0.000195734", "gnorm": "6.62", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7552"} 2023-01-29 18:17:37 | INFO | train_inner | {"epoch": 15, "update": 14.181, "s2c_loss": "0.501", "loss": "0.34737", "s2c_nll_loss": "0.501", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "30650", "lr": "0.000195667", "gnorm": "9.206", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7555"} 2023-01-29 18:17:39 | INFO | train_inner | {"epoch": 15, "update": 14.186, "s2c_loss": "0.45", "loss": "0.31213", "s2c_nll_loss": "0.45", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "30660", "lr": "0.0001956", "gnorm": "7.66", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7557"} 2023-01-29 18:17:42 | INFO | train_inner | {"epoch": 15, "update": 14.191, "s2c_loss": "0.414", "loss": "0.2869", "s2c_nll_loss": "0.414", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "30670", "lr": "0.000195534", "gnorm": "7.31", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7560"} 2023-01-29 18:17:44 | INFO | train_inner | {"epoch": 15, "update": 14.195, "s2c_loss": "0.605", "loss": "0.41959", "s2c_nll_loss": "0.605", "s2c_accuracy": "88.906", "s2c_total": "64", "s2c_n_correct": "56.9", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "30680", "lr": "0.000195467", "gnorm": "6.791", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7562"} 2023-01-29 18:17:47 | INFO | train_inner | {"epoch": 15, "update": 14.2, "s2c_loss": "0.549", "loss": "0.38022", "s2c_nll_loss": "0.549", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": 
"64", "num_updates": "30690", "lr": "0.0001954", "gnorm": "7.763", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7565"} 2023-01-29 18:17:49 | INFO | train_inner | {"epoch": 15, "update": 14.204, "s2c_loss": "0.452", "loss": "0.31334", "s2c_nll_loss": "0.452", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "30700", "lr": "0.000195334", "gnorm": "8.226", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7567"} 2023-01-29 18:17:52 | INFO | train_inner | {"epoch": 15, "update": 14.209, "s2c_loss": "0.442", "loss": "0.30617", "s2c_nll_loss": "0.442", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "30710", "lr": "0.000195267", "gnorm": "7.269", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7570"} 2023-01-29 18:17:55 | INFO | train_inner | {"epoch": 15, "update": 14.214, "s2c_loss": "0.452", "loss": "0.31349", "s2c_nll_loss": "0.452", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "30720", "lr": "0.0001952", "gnorm": "6.832", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7573"} 2023-01-29 18:17:57 | INFO | train_inner | {"epoch": 15, "update": 14.218, "s2c_loss": "0.456", "loss": "0.31626", "s2c_nll_loss": "0.456", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "30730", "lr": "0.000195134", "gnorm": "6.007", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "7575"} 2023-01-29 18:18:00 | INFO | train_inner | {"epoch": 15, "update": 14.223, "s2c_loss": "0.435", "loss": "0.30153", "s2c_nll_loss": "0.435", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "250", "ups": "3.91", "wpb": 
"64", "bsz": "64", "num_updates": "30740", "lr": "0.000195067", "gnorm": "6.979", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7578"} 2023-01-29 18:18:02 | INFO | train_inner | {"epoch": 15, "update": 14.228, "s2c_loss": "0.61", "loss": "0.4226", "s2c_nll_loss": "0.61", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "30750", "lr": "0.000195", "gnorm": "6.63", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7580"} 2023-01-29 18:18:05 | INFO | train_inner | {"epoch": 15, "update": 14.232, "s2c_loss": "0.477", "loss": "0.33063", "s2c_nll_loss": "0.477", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "30760", "lr": "0.000194934", "gnorm": "7.487", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7583"} 2023-01-29 18:18:07 | INFO | train_inner | {"epoch": 15, "update": 14.237, "s2c_loss": "0.928", "loss": "0.64306", "s2c_nll_loss": "0.928", "s2c_accuracy": "85.781", "s2c_total": "64", "s2c_n_correct": "54.9", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "30770", "lr": "0.000194867", "gnorm": "7.362", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7585"} 2023-01-29 18:18:10 | INFO | train_inner | {"epoch": 15, "update": 14.241, "s2c_loss": "0.417", "loss": "0.28936", "s2c_nll_loss": "0.417", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "30780", "lr": "0.0001948", "gnorm": "5.52", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7588"} 2023-01-29 18:18:12 | INFO | train_inner | {"epoch": 15, "update": 14.246, "s2c_loss": "0.367", "loss": "0.25456", "s2c_nll_loss": "0.367", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "254.1", "ups": "3.97", 
"wpb": "64", "bsz": "64", "num_updates": "30790", "lr": "0.000194734", "gnorm": "6.718", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "7590"} 2023-01-29 18:18:15 | INFO | train_inner | {"epoch": 15, "update": 14.251, "s2c_loss": "0.481", "loss": "0.3335", "s2c_nll_loss": "0.481", "s2c_accuracy": "90.156", "s2c_total": "64", "s2c_n_correct": "57.7", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "30800", "lr": "0.000194667", "gnorm": "6.51", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7593"} 2023-01-29 18:18:18 | INFO | train_inner | {"epoch": 15, "update": 14.255, "s2c_loss": "0.446", "loss": "0.30887", "s2c_nll_loss": "0.446", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "30810", "lr": "0.0001946", "gnorm": "5.968", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "7595"} 2023-01-29 18:18:20 | INFO | train_inner | {"epoch": 15, "update": 14.26, "s2c_loss": "0.49", "loss": "0.33941", "s2c_nll_loss": "0.49", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "30820", "lr": "0.000194534", "gnorm": "6.066", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7598"} 2023-01-29 18:18:23 | INFO | train_inner | {"epoch": 15, "update": 14.265, "s2c_loss": "0.47", "loss": "0.32575", "s2c_nll_loss": "0.47", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "247.4", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "30830", "lr": "0.000194467", "gnorm": "6.899", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7601"} 2023-01-29 18:18:25 | INFO | train_inner | {"epoch": 15, "update": 14.269, "s2c_loss": "0.557", "loss": "0.38629", "s2c_nll_loss": "0.557", "s2c_accuracy": "88.125", "s2c_total": "64", "s2c_n_correct": "56.4", "wps": "251.5", 
"ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "30840", "lr": "0.0001944", "gnorm": "7.11", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7603"} 2023-01-29 18:18:28 | INFO | train_inner | {"epoch": 15, "update": 14.274, "s2c_loss": "0.451", "loss": "0.31251", "s2c_nll_loss": "0.451", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "30850", "lr": "0.000194334", "gnorm": "6.05", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7606"} 2023-01-29 18:18:30 | INFO | train_inner | {"epoch": 15, "update": 14.278, "s2c_loss": "0.555", "loss": "0.38454", "s2c_nll_loss": "0.555", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "30860", "lr": "0.000194267", "gnorm": "7.2", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7608"} 2023-01-29 18:18:33 | INFO | train_inner | {"epoch": 15, "update": 14.283, "s2c_loss": "0.759", "loss": "0.52612", "s2c_nll_loss": "0.759", "s2c_accuracy": "87.656", "s2c_total": "64", "s2c_n_correct": "56.1", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "30870", "lr": "0.0001942", "gnorm": "6.306", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7611"} 2023-01-29 18:18:35 | INFO | train_inner | {"epoch": 15, "update": 14.288, "s2c_loss": "0.369", "loss": "0.25575", "s2c_nll_loss": "0.369", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "30880", "lr": "0.000194134", "gnorm": "6.023", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7613"} 2023-01-29 18:18:38 | INFO | train_inner | {"epoch": 15, "update": 14.292, "s2c_loss": "0.344", "loss": "0.23863", "s2c_nll_loss": "0.344", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", 
"wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "30890", "lr": "0.000194067", "gnorm": "5.716", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7616"} 2023-01-29 18:18:40 | INFO | train_inner | {"epoch": 15, "update": 14.297, "s2c_loss": "0.36", "loss": "0.24939", "s2c_nll_loss": "0.36", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "30900", "lr": "0.000194", "gnorm": "5.607", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7618"} 2023-01-29 18:18:43 | INFO | train_inner | {"epoch": 15, "update": 14.302, "s2c_loss": "0.347", "loss": "0.24027", "s2c_nll_loss": "0.347", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "30910", "lr": "0.000193934", "gnorm": "6.834", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7621"} 2023-01-29 18:18:46 | INFO | train_inner | {"epoch": 15, "update": 14.306, "s2c_loss": "0.402", "loss": "0.27883", "s2c_nll_loss": "0.402", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "30920", "lr": "0.000193867", "gnorm": "6.149", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7623"} 2023-01-29 18:18:48 | INFO | train_inner | {"epoch": 15, "update": 14.311, "s2c_loss": "0.343", "loss": "0.23744", "s2c_nll_loss": "0.343", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "30930", "lr": "0.0001938", "gnorm": "5.422", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7626"} 2023-01-29 18:18:51 | INFO | train_inner | {"epoch": 15, "update": 14.315, "s2c_loss": "0.398", "loss": "0.27617", "s2c_nll_loss": "0.398", "s2c_accuracy": "92.656", "s2c_total": "64", 
"s2c_n_correct": "59.3", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "30940", "lr": "0.000193734", "gnorm": "5.849", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7629"} 2023-01-29 18:18:53 | INFO | train_inner | {"epoch": 15, "update": 14.32, "s2c_loss": "0.337", "loss": "0.23336", "s2c_nll_loss": "0.337", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "30950", "lr": "0.000193667", "gnorm": "5.896", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7631"} 2023-01-29 18:18:56 | INFO | train_inner | {"epoch": 15, "update": 14.325, "s2c_loss": "0.483", "loss": "0.33493", "s2c_nll_loss": "0.483", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "30960", "lr": "0.0001936", "gnorm": "5.959", "loss_scale": "256", "train_wall": "2", "gb_free": "7.4", "wall": "7634"} 2023-01-29 18:18:58 | INFO | train_inner | {"epoch": 15, "update": 14.329, "s2c_loss": "0.502", "loss": "0.34789", "s2c_nll_loss": "0.502", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "30970", "lr": "0.000193534", "gnorm": "6.262", "loss_scale": "256", "train_wall": "2", "gb_free": "7.3", "wall": "7636"} 2023-01-29 18:19:01 | INFO | train_inner | {"epoch": 15, "update": 14.334, "s2c_loss": "0.437", "loss": "0.30265", "s2c_nll_loss": "0.437", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "30980", "lr": "0.000193467", "gnorm": "6.84", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7639"} 2023-01-29 18:19:03 | INFO | train_inner | {"epoch": 15, "update": 14.339, "s2c_loss": "0.288", "loss": "0.19969", "s2c_nll_loss": "0.288", "s2c_accuracy": "94.375", 
"s2c_total": "64", "s2c_n_correct": "60.4", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "30990", "lr": "0.0001934", "gnorm": "5.47", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7641"} 2023-01-29 18:19:06 | INFO | train_inner | {"epoch": 15, "update": 14.343, "s2c_loss": "0.457", "loss": "0.31685", "s2c_nll_loss": "0.457", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "31000", "lr": "0.000193334", "gnorm": "5.773", "loss_scale": "256", "train_wall": "3", "gb_free": "7.2", "wall": "7644"} 2023-01-29 18:19:09 | INFO | train_inner | {"epoch": 15, "update": 14.348, "s2c_loss": "0.489", "loss": "0.33917", "s2c_nll_loss": "0.489", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "31010", "lr": "0.000193267", "gnorm": "6.236", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7646"} 2023-01-29 18:19:11 | INFO | train_inner | {"epoch": 15, "update": 14.352, "s2c_loss": "0.396", "loss": "0.27472", "s2c_nll_loss": "0.396", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "31020", "lr": "0.0001932", "gnorm": "5.572", "loss_scale": "256", "train_wall": "3", "gb_free": "7.3", "wall": "7649"} 2023-01-29 18:19:14 | INFO | train_inner | {"epoch": 15, "update": 14.357, "s2c_loss": "0.436", "loss": "0.30187", "s2c_nll_loss": "0.436", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "31030", "lr": "0.000193134", "gnorm": "6.177", "loss_scale": "256", "train_wall": "3", "gb_free": "7.4", "wall": "7652"} 2023-01-29 18:19:16 | INFO | train_inner | {"epoch": 15, "update": 14.362, "s2c_loss": "0.4", "loss": "0.27698", "s2c_nll_loss": "0.4", "s2c_accuracy": 
"92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "31040", "lr": "0.000193067", "gnorm": "5.841", "loss_scale": "256", "train_wall": "2", "gb_free": "7.2", "wall": "7654"} 2023-01-29 18:19:19 | INFO | train_inner | {"epoch": 15, "update": 14.366, "s2c_loss": "0.303", "loss": "0.2102", "s2c_nll_loss": "0.303", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "31050", "lr": "0.000193", "gnorm": "4.713", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7657"} 2023-01-29 18:19:21 | INFO | train_inner | {"epoch": 15, "update": 14.371, "s2c_loss": "0.622", "loss": "0.43081", "s2c_nll_loss": "0.622", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "31060", "lr": "0.000192934", "gnorm": "5.114", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7659"} 2023-01-29 18:19:24 | INFO | train_inner | {"epoch": 15, "update": 14.376, "s2c_loss": "0.329", "loss": "0.22826", "s2c_nll_loss": "0.329", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "31070", "lr": "0.000192867", "gnorm": "5.238", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7662"} 2023-01-29 18:19:26 | INFO | train_inner | {"epoch": 15, "update": 14.38, "s2c_loss": "0.401", "loss": "0.27816", "s2c_nll_loss": "0.401", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "31080", "lr": "0.0001928", "gnorm": "5.471", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7664"} 2023-01-29 18:19:29 | INFO | train_inner | {"epoch": 15, "update": 14.385, "s2c_loss": "0.493", "loss": "0.34204", "s2c_nll_loss": "0.493", 
"s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "31090", "lr": "0.000192734", "gnorm": "5.268", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7667"} 2023-01-29 18:19:31 | INFO | train_inner | {"epoch": 15, "update": 14.389, "s2c_loss": "0.391", "loss": "0.27068", "s2c_nll_loss": "0.391", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "258.3", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "31100", "lr": "0.000192667", "gnorm": "6.121", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7669"} 2023-01-29 18:19:34 | INFO | train_inner | {"epoch": 15, "update": 14.394, "s2c_loss": "0.479", "loss": "0.33216", "s2c_nll_loss": "0.479", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "31110", "lr": "0.0001926", "gnorm": "6.215", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7672"} 2023-01-29 18:19:36 | INFO | train_inner | {"epoch": 15, "update": 14.399, "s2c_loss": "0.319", "loss": "0.2213", "s2c_nll_loss": "0.319", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "31120", "lr": "0.000192534", "gnorm": "4.898", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7674"} 2023-01-29 18:19:39 | INFO | train_inner | {"epoch": 15, "update": 14.403, "s2c_loss": "0.441", "loss": "0.3054", "s2c_nll_loss": "0.441", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "261.1", "ups": "4.08", "wpb": "64", "bsz": "64", "num_updates": "31130", "lr": "0.000192467", "gnorm": "6.679", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7677"} 2023-01-29 18:19:41 | INFO | train_inner | {"epoch": 15, "update": 14.408, "s2c_loss": "0.445", "loss": "0.30873", 
"s2c_nll_loss": "0.445", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "31140", "lr": "0.0001924", "gnorm": "6.212", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7679"} 2023-01-29 18:19:44 | INFO | train_inner | {"epoch": 15, "update": 14.413, "s2c_loss": "0.347", "loss": "0.24034", "s2c_nll_loss": "0.347", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "31150", "lr": "0.000192334", "gnorm": "5.665", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7682"} 2023-01-29 18:19:46 | INFO | train_inner | {"epoch": 15, "update": 14.417, "s2c_loss": "0.515", "loss": "0.35717", "s2c_nll_loss": "0.515", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "31160", "lr": "0.000192267", "gnorm": "5.444", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7684"} 2023-01-29 18:19:49 | INFO | train_inner | {"epoch": 15, "update": 14.422, "s2c_loss": "0.399", "loss": "0.27669", "s2c_nll_loss": "0.399", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "31170", "lr": "0.0001922", "gnorm": "6.086", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7687"} 2023-01-29 18:19:52 | INFO | train_inner | {"epoch": 15, "update": 14.426, "s2c_loss": "0.472", "loss": "0.32742", "s2c_nll_loss": "0.472", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "31180", "lr": "0.000192134", "gnorm": "7.116", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7689"} 2023-01-29 18:19:54 | INFO | train_inner | {"epoch": 15, "update": 14.431, "s2c_loss": "0.477", 
"loss": "0.33066", "s2c_nll_loss": "0.477", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "31190", "lr": "0.000192067", "gnorm": "7.244", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7692"} 2023-01-29 18:19:57 | INFO | train_inner | {"epoch": 15, "update": 14.436, "s2c_loss": "0.561", "loss": "0.38889", "s2c_nll_loss": "0.561", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "31200", "lr": "0.000192", "gnorm": "7.395", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7695"} 2023-01-29 18:19:59 | INFO | train_inner | {"epoch": 15, "update": 14.44, "s2c_loss": "0.447", "loss": "0.30962", "s2c_nll_loss": "0.447", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "31210", "lr": "0.000191934", "gnorm": "6.654", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7697"} 2023-01-29 18:20:02 | INFO | train_inner | {"epoch": 15, "update": 14.445, "s2c_loss": "0.611", "loss": "0.42353", "s2c_nll_loss": "0.611", "s2c_accuracy": "87.344", "s2c_total": "64", "s2c_n_correct": "55.9", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "31220", "lr": "0.000191867", "gnorm": "7.502", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7700"} 2023-01-29 18:20:04 | INFO | train_inner | {"epoch": 15, "update": 14.45, "s2c_loss": "0.558", "loss": "0.38697", "s2c_nll_loss": "0.558", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "31230", "lr": "0.0001918", "gnorm": "5.553", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7702"} 2023-01-29 18:20:07 | INFO | train_inner | {"epoch": 15, "update": 14.454, "s2c_loss": 
"0.405", "loss": "0.28042", "s2c_nll_loss": "0.405", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "31240", "lr": "0.000191734", "gnorm": "6.008", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7705"} 2023-01-29 18:20:09 | INFO | train_inner | {"epoch": 15, "update": 14.459, "s2c_loss": "1.152", "loss": "0.79826", "s2c_nll_loss": "1.152", "s2c_accuracy": "79.688", "s2c_total": "64", "s2c_n_correct": "51", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "31250", "lr": "0.000191667", "gnorm": "19.915", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7707"} 2023-01-29 18:20:12 | INFO | train_inner | {"epoch": 15, "update": 14.463, "s2c_loss": "1.102", "loss": "0.76405", "s2c_nll_loss": "1.102", "s2c_accuracy": "80.938", "s2c_total": "64", "s2c_n_correct": "51.8", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "31260", "lr": "0.0001916", "gnorm": "12.602", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "7710"} 2023-01-29 18:20:14 | INFO | train_inner | {"epoch": 15, "update": 14.468, "s2c_loss": "1.36", "loss": "0.94286", "s2c_nll_loss": "1.36", "s2c_accuracy": "78.281", "s2c_total": "64", "s2c_n_correct": "50.1", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "31270", "lr": "0.000191534", "gnorm": "10.715", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7712"} 2023-01-29 18:20:17 | INFO | train_inner | {"epoch": 15, "update": 14.473, "s2c_loss": "1.094", "loss": "0.75799", "s2c_nll_loss": "1.094", "s2c_accuracy": "80.625", "s2c_total": "64", "s2c_n_correct": "51.6", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "31280", "lr": "0.000191467", "gnorm": "16.021", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7715"} 2023-01-29 18:20:20 | INFO | train_inner | {"epoch": 15, "update": 
14.477, "s2c_loss": "1.028", "loss": "0.7124", "s2c_nll_loss": "1.028", "s2c_accuracy": "81.25", "s2c_total": "64", "s2c_n_correct": "52", "wps": "251.8", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "31290", "lr": "0.0001914", "gnorm": "11.145", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7717"} 2023-01-29 18:20:22 | INFO | train_inner | {"epoch": 15, "update": 14.482, "s2c_loss": "1.002", "loss": "0.6948", "s2c_nll_loss": "1.002", "s2c_accuracy": "81.25", "s2c_total": "64", "s2c_n_correct": "52", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "31300", "lr": "0.000191334", "gnorm": "9.588", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7720"} 2023-01-29 18:20:25 | INFO | train_inner | {"epoch": 15, "update": 14.487, "s2c_loss": "0.833", "loss": "0.57762", "s2c_nll_loss": "0.833", "s2c_accuracy": "84.531", "s2c_total": "64", "s2c_n_correct": "54.1", "wps": "249.3", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "31310", "lr": "0.000191267", "gnorm": "8.384", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7723"} 2023-01-29 18:20:27 | INFO | train_inner | {"epoch": 15, "update": 14.491, "s2c_loss": "0.807", "loss": "0.55968", "s2c_nll_loss": "0.807", "s2c_accuracy": "85", "s2c_total": "64", "s2c_n_correct": "54.4", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "31320", "lr": "0.0001912", "gnorm": "8.633", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7725"} 2023-01-29 18:20:30 | INFO | train_inner | {"epoch": 15, "update": 14.496, "s2c_loss": "0.63", "loss": "0.43666", "s2c_nll_loss": "0.63", "s2c_accuracy": "87.031", "s2c_total": "64", "s2c_n_correct": "55.7", "wps": "244.9", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "31330", "lr": "0.000191134", "gnorm": "8.586", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7728"} 2023-01-29 18:20:32 | INFO | train_inner | {"epoch": 15, 
"update": 14.5, "s2c_loss": "0.637", "loss": "0.44144", "s2c_nll_loss": "0.637", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "31340", "lr": "0.000191067", "gnorm": "7.968", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7730"} 2023-01-29 18:20:35 | INFO | train_inner | {"epoch": 15, "update": 14.505, "s2c_loss": "0.555", "loss": "0.38477", "s2c_nll_loss": "0.555", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "31350", "lr": "0.000191", "gnorm": "7.101", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7733"} 2023-01-29 18:20:37 | INFO | train_inner | {"epoch": 15, "update": 14.51, "s2c_loss": "0.566", "loss": "0.39218", "s2c_nll_loss": "0.566", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "31360", "lr": "0.000190934", "gnorm": "7.464", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7735"} 2023-01-29 18:20:40 | INFO | train_inner | {"epoch": 15, "update": 14.514, "s2c_loss": "0.546", "loss": "0.37836", "s2c_nll_loss": "0.546", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "31370", "lr": "0.000190867", "gnorm": "6.94", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7738"} 2023-01-29 18:20:43 | INFO | train_inner | {"epoch": 15, "update": 14.519, "s2c_loss": "0.632", "loss": "0.43808", "s2c_nll_loss": "0.632", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "243.3", "ups": "3.8", "wpb": "64", "bsz": "64", "num_updates": "31380", "lr": "0.0001908", "gnorm": "8.297", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7741"} 2023-01-29 18:20:45 | INFO | train_inner | {"epoch": 15, 
"update": 14.524, "s2c_loss": "0.603", "loss": "0.41787", "s2c_nll_loss": "0.603", "s2c_accuracy": "88.438", "s2c_total": "64", "s2c_n_correct": "56.6", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "31390", "lr": "0.000190734", "gnorm": "6.938", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7743"} 2023-01-29 18:20:48 | INFO | train_inner | {"epoch": 15, "update": 14.528, "s2c_loss": "0.531", "loss": "0.36823", "s2c_nll_loss": "0.531", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "31400", "lr": "0.000190667", "gnorm": "7.077", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7746"} 2023-01-29 18:20:50 | INFO | train_inner | {"epoch": 15, "update": 14.533, "s2c_loss": "0.453", "loss": "0.31368", "s2c_nll_loss": "0.453", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "31410", "lr": "0.0001906", "gnorm": "6.873", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7748"} 2023-01-29 18:20:53 | INFO | train_inner | {"epoch": 15, "update": 14.537, "s2c_loss": "0.504", "loss": "0.34946", "s2c_nll_loss": "0.504", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "31420", "lr": "0.000190534", "gnorm": "6.623", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7751"} 2023-01-29 18:20:55 | INFO | train_inner | {"epoch": 15, "update": 14.542, "s2c_loss": "0.447", "loss": "0.31013", "s2c_nll_loss": "0.447", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "31430", "lr": "0.000190467", "gnorm": "6.173", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7753"} 2023-01-29 18:20:58 | INFO | 
train_inner | {"epoch": 15, "update": 14.547, "s2c_loss": "0.503", "loss": "0.34871", "s2c_nll_loss": "0.503", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "31440", "lr": "0.0001904", "gnorm": "6.555", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7756"} 2023-01-29 18:21:00 | INFO | train_inner | {"epoch": 15, "update": 14.551, "s2c_loss": "0.561", "loss": "0.38887", "s2c_nll_loss": "0.561", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "31450", "lr": "0.000190334", "gnorm": "6.902", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7758"} 2023-01-29 18:21:03 | INFO | train_inner | {"epoch": 15, "update": 14.556, "s2c_loss": "0.319", "loss": "0.22146", "s2c_nll_loss": "0.319", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "31460", "lr": "0.000190267", "gnorm": "5.327", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7761"} 2023-01-29 18:21:05 | INFO | train_inner | {"epoch": 15, "update": 14.561, "s2c_loss": "0.508", "loss": "0.352", "s2c_nll_loss": "0.508", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "259.1", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "31470", "lr": "0.0001902", "gnorm": "6.002", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7763"} 2023-01-29 18:21:08 | INFO | train_inner | {"epoch": 15, "update": 14.565, "s2c_loss": "0.625", "loss": "0.43298", "s2c_nll_loss": "0.625", "s2c_accuracy": "88.75", "s2c_total": "64", "s2c_n_correct": "56.8", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "31480", "lr": "0.000190134", "gnorm": "6.691", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "7766"} 2023-01-29 
18:21:10 | INFO | train_inner | {"epoch": 15, "update": 14.57, "s2c_loss": "0.31", "loss": "0.21494", "s2c_nll_loss": "0.31", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "31490", "lr": "0.000190067", "gnorm": "5.791", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "7768"} 2023-01-29 18:21:13 | INFO | train_inner | {"epoch": 15, "update": 14.574, "s2c_loss": "0.371", "loss": "0.25715", "s2c_nll_loss": "0.371", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "31500", "lr": "0.00019", "gnorm": "5.606", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7771"} 2023-01-29 18:21:16 | INFO | train_inner | {"epoch": 15, "update": 14.579, "s2c_loss": "0.393", "loss": "0.27224", "s2c_nll_loss": "0.393", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "31510", "lr": "0.000189934", "gnorm": "5.966", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7773"} 2023-01-29 18:21:18 | INFO | train_inner | {"epoch": 15, "update": 14.584, "s2c_loss": "0.344", "loss": "0.23826", "s2c_nll_loss": "0.344", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "31520", "lr": "0.000189867", "gnorm": "5.28", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7776"} 2023-01-29 18:21:21 | INFO | train_inner | {"epoch": 15, "update": 14.588, "s2c_loss": "0.505", "loss": "0.34993", "s2c_nll_loss": "0.505", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "31530", "lr": "0.000189801", "gnorm": "6.5", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "7779"} 
2023-01-29 18:21:23 | INFO | train_inner | {"epoch": 15, "update": 14.593, "s2c_loss": "0.367", "loss": "0.25432", "s2c_nll_loss": "0.367", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "31540", "lr": "0.000189734", "gnorm": "6.325", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7781"} 2023-01-29 18:21:26 | INFO | train_inner | {"epoch": 15, "update": 14.598, "s2c_loss": "0.468", "loss": "0.32434", "s2c_nll_loss": "0.468", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "31550", "lr": "0.000189667", "gnorm": "8.104", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "7784"} 2023-01-29 18:21:28 | INFO | train_inner | {"epoch": 15, "update": 14.602, "s2c_loss": "0.471", "loss": "0.32652", "s2c_nll_loss": "0.471", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "31560", "lr": "0.000189601", "gnorm": "7.824", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7786"} 2023-01-29 18:21:31 | INFO | train_inner | {"epoch": 15, "update": 14.607, "s2c_loss": "0.465", "loss": "0.32256", "s2c_nll_loss": "0.465", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "31570", "lr": "0.000189534", "gnorm": "8.971", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7789"} 2023-01-29 18:21:33 | INFO | train_inner | {"epoch": 15, "update": 14.611, "s2c_loss": "0.52", "loss": "0.36021", "s2c_nll_loss": "0.52", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "31580", "lr": "0.000189467", "gnorm": "8.539", "loss_scale": "512", "train_wall": "3", "gb_free": 
"7.2", "wall": "7791"} 2023-01-29 18:21:36 | INFO | train_inner | {"epoch": 15, "update": 14.616, "s2c_loss": "0.655", "loss": "0.45407", "s2c_nll_loss": "0.655", "s2c_accuracy": "87.5", "s2c_total": "64", "s2c_n_correct": "56", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "31590", "lr": "0.000189401", "gnorm": "10.431", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7794"} 2023-01-29 18:21:38 | INFO | train_inner | {"epoch": 15, "update": 14.621, "s2c_loss": "0.682", "loss": "0.47266", "s2c_nll_loss": "0.682", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "242.9", "ups": "3.79", "wpb": "64", "bsz": "64", "num_updates": "31600", "lr": "0.000189334", "gnorm": "7.222", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7796"} 2023-01-29 18:21:41 | INFO | train_inner | {"epoch": 15, "update": 14.625, "s2c_loss": "0.461", "loss": "0.31933", "s2c_nll_loss": "0.461", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "31610", "lr": "0.000189267", "gnorm": "6.363", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7799"} 2023-01-29 18:21:44 | INFO | train_inner | {"epoch": 15, "update": 14.63, "s2c_loss": "0.458", "loss": "0.31749", "s2c_nll_loss": "0.458", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "244.4", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "31620", "lr": "0.000189201", "gnorm": "6.096", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7802"} 2023-01-29 18:21:46 | INFO | train_inner | {"epoch": 15, "update": 14.635, "s2c_loss": "0.673", "loss": "0.46671", "s2c_nll_loss": "0.673", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "248", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "31630", "lr": "0.000189134", "gnorm": "7.058", "loss_scale": "512", "train_wall": 
"3", "gb_free": "7.2", "wall": "7804"} 2023-01-29 18:21:49 | INFO | train_inner | {"epoch": 15, "update": 14.639, "s2c_loss": "0.508", "loss": "0.3521", "s2c_nll_loss": "0.508", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "31640", "lr": "0.000189067", "gnorm": "5.328", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7807"} 2023-01-29 18:21:51 | INFO | train_inner | {"epoch": 15, "update": 14.644, "s2c_loss": "0.509", "loss": "0.35304", "s2c_nll_loss": "0.509", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "31650", "lr": "0.000189001", "gnorm": "6.527", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "7809"} 2023-01-29 18:21:54 | INFO | train_inner | {"epoch": 15, "update": 14.648, "s2c_loss": "0.518", "loss": "0.35897", "s2c_nll_loss": "0.518", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "31660", "lr": "0.000188934", "gnorm": "6.235", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7812"} 2023-01-29 18:21:56 | INFO | train_inner | {"epoch": 15, "update": 14.653, "s2c_loss": "0.385", "loss": "0.26658", "s2c_nll_loss": "0.385", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "31670", "lr": "0.000188867", "gnorm": "6.044", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7814"} 2023-01-29 18:21:59 | INFO | train_inner | {"epoch": 15, "update": 14.658, "s2c_loss": "0.522", "loss": "0.36191", "s2c_nll_loss": "0.522", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "31680", "lr": "0.000188801", "gnorm": "6.546", "loss_scale": "512", 
"train_wall": "2", "gb_free": "7.3", "wall": "7817"} 2023-01-29 18:22:01 | INFO | train_inner | {"epoch": 15, "update": 14.662, "s2c_loss": "0.405", "loss": "0.28074", "s2c_nll_loss": "0.405", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "31690", "lr": "0.000188734", "gnorm": "5.953", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7819"} 2023-01-29 18:22:04 | INFO | train_inner | {"epoch": 15, "update": 14.667, "s2c_loss": "0.395", "loss": "0.27406", "s2c_nll_loss": "0.395", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "31700", "lr": "0.000188667", "gnorm": "5.804", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7822"} 2023-01-29 18:22:07 | INFO | train_inner | {"epoch": 15, "update": 14.672, "s2c_loss": "0.317", "loss": "0.22", "s2c_nll_loss": "0.317", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "31710", "lr": "0.000188601", "gnorm": "5.035", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7824"} 2023-01-29 18:22:09 | INFO | train_inner | {"epoch": 15, "update": 14.676, "s2c_loss": "0.498", "loss": "0.34541", "s2c_nll_loss": "0.498", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "31720", "lr": "0.000188534", "gnorm": "5.491", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7827"} 2023-01-29 18:22:12 | INFO | train_inner | {"epoch": 15, "update": 14.681, "s2c_loss": "0.398", "loss": "0.27584", "s2c_nll_loss": "0.398", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "31730", "lr": "0.000188467", "gnorm": "5.58", 
"loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7829"} 2023-01-29 18:22:14 | INFO | train_inner | {"epoch": 15, "update": 14.685, "s2c_loss": "0.514", "loss": "0.35604", "s2c_nll_loss": "0.514", "s2c_accuracy": "89.844", "s2c_total": "64", "s2c_n_correct": "57.5", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "31740", "lr": "0.000188401", "gnorm": "6.753", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7832"} 2023-01-29 18:22:17 | INFO | train_inner | {"epoch": 15, "update": 14.69, "s2c_loss": "0.417", "loss": "0.2891", "s2c_nll_loss": "0.417", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "31750", "lr": "0.000188334", "gnorm": "5.571", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7835"} 2023-01-29 18:22:19 | INFO | train_inner | {"epoch": 15, "update": 14.695, "s2c_loss": "0.467", "loss": "0.32341", "s2c_nll_loss": "0.467", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "31760", "lr": "0.000188267", "gnorm": "6.017", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7837"} 2023-01-29 18:22:22 | INFO | train_inner | {"epoch": 15, "update": 14.699, "s2c_loss": "0.388", "loss": "0.26919", "s2c_nll_loss": "0.388", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "31770", "lr": "0.000188201", "gnorm": "5.38", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7840"} 2023-01-29 18:22:24 | INFO | train_inner | {"epoch": 15, "update": 14.704, "s2c_loss": "0.493", "loss": "0.34139", "s2c_nll_loss": "0.493", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "31780", "lr": "0.000188134", 
"gnorm": "7.099", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7842"} 2023-01-29 18:22:27 | INFO | train_inner | {"epoch": 15, "update": 14.709, "s2c_loss": "0.424", "loss": "0.29408", "s2c_nll_loss": "0.424", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "31790", "lr": "0.000188067", "gnorm": "6.086", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7845"} 2023-01-29 18:22:29 | INFO | train_inner | {"epoch": 15, "update": 14.713, "s2c_loss": "0.503", "loss": "0.34842", "s2c_nll_loss": "0.503", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "31800", "lr": "0.000188001", "gnorm": "6.386", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7847"} 2023-01-29 18:22:32 | INFO | train_inner | {"epoch": 15, "update": 14.718, "s2c_loss": "0.403", "loss": "0.27956", "s2c_nll_loss": "0.403", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "31810", "lr": "0.000187934", "gnorm": "5.657", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7850"} 2023-01-29 18:22:35 | INFO | train_inner | {"epoch": 15, "update": 14.722, "s2c_loss": "0.372", "loss": "0.25754", "s2c_nll_loss": "0.372", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "31820", "lr": "0.000187867", "gnorm": "4.743", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "7852"} 2023-01-29 18:22:37 | INFO | train_inner | {"epoch": 15, "update": 14.727, "s2c_loss": "0.406", "loss": "0.28173", "s2c_nll_loss": "0.406", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "31830", 
"lr": "0.000187801", "gnorm": "5.932", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7855"} 2023-01-29 18:22:40 | INFO | train_inner | {"epoch": 15, "update": 14.732, "s2c_loss": "0.349", "loss": "0.24211", "s2c_nll_loss": "0.349", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "31840", "lr": "0.000187734", "gnorm": "5.226", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7857"} 2023-01-29 18:22:42 | INFO | train_inner | {"epoch": 15, "update": 14.736, "s2c_loss": "0.439", "loss": "0.30454", "s2c_nll_loss": "0.439", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "31850", "lr": "0.000187667", "gnorm": "5.701", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7860"} 2023-01-29 18:22:45 | INFO | train_inner | {"epoch": 15, "update": 14.741, "s2c_loss": "0.508", "loss": "0.35187", "s2c_nll_loss": "0.508", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "31860", "lr": "0.000187601", "gnorm": "5.917", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7863"} 2023-01-29 18:22:47 | INFO | train_inner | {"epoch": 15, "update": 14.746, "s2c_loss": "0.507", "loss": "0.35127", "s2c_nll_loss": "0.507", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "258.5", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "31870", "lr": "0.000187534", "gnorm": "6.834", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7865"} 2023-01-29 18:22:50 | INFO | train_inner | {"epoch": 15, "update": 14.75, "s2c_loss": "0.383", "loss": "0.26579", "s2c_nll_loss": "0.383", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", 
"num_updates": "31880", "lr": "0.000187467", "gnorm": "5.114", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7868"} 2023-01-29 18:22:52 | INFO | train_inner | {"epoch": 15, "update": 14.755, "s2c_loss": "0.39", "loss": "0.27048", "s2c_nll_loss": "0.39", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "31890", "lr": "0.000187401", "gnorm": "6.732", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7870"} 2023-01-29 18:22:55 | INFO | train_inner | {"epoch": 15, "update": 14.759, "s2c_loss": "0.539", "loss": "0.37379", "s2c_nll_loss": "0.539", "s2c_accuracy": "90.469", "s2c_total": "64", "s2c_n_correct": "57.9", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "31900", "lr": "0.000187334", "gnorm": "6.481", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7873"} 2023-01-29 18:22:57 | INFO | train_inner | {"epoch": 15, "update": 14.764, "s2c_loss": "0.587", "loss": "0.40674", "s2c_nll_loss": "0.587", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "31910", "lr": "0.000187267", "gnorm": "6.459", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7875"} 2023-01-29 18:23:00 | INFO | train_inner | {"epoch": 15, "update": 14.769, "s2c_loss": "0.478", "loss": "0.33115", "s2c_nll_loss": "0.478", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "31920", "lr": "0.000187201", "gnorm": "6.19", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7878"} 2023-01-29 18:23:02 | INFO | train_inner | {"epoch": 15, "update": 14.773, "s2c_loss": "0.459", "loss": "0.31821", "s2c_nll_loss": "0.459", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "249.9", "ups": "3.9", "wpb": "64", 
"bsz": "64", "num_updates": "31930", "lr": "0.000187134", "gnorm": "5.992", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7880"} 2023-01-29 18:23:05 | INFO | train_inner | {"epoch": 15, "update": 14.778, "s2c_loss": "0.383", "loss": "0.26566", "s2c_nll_loss": "0.383", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "31940", "lr": "0.000187067", "gnorm": "5.467", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7883"} 2023-01-29 18:23:07 | INFO | train_inner | {"epoch": 15, "update": 14.783, "s2c_loss": "0.36", "loss": "0.24964", "s2c_nll_loss": "0.36", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "31950", "lr": "0.000187001", "gnorm": "4.802", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7885"} 2023-01-29 18:23:10 | INFO | train_inner | {"epoch": 15, "update": 14.787, "s2c_loss": "0.313", "loss": "0.21672", "s2c_nll_loss": "0.313", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "31960", "lr": "0.000186934", "gnorm": "5.189", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7888"} 2023-01-29 18:23:12 | INFO | train_inner | {"epoch": 15, "update": 14.792, "s2c_loss": "0.444", "loss": "0.30766", "s2c_nll_loss": "0.444", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "31970", "lr": "0.000186867", "gnorm": "5.871", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7890"} 2023-01-29 18:23:15 | INFO | train_inner | {"epoch": 15, "update": 14.796, "s2c_loss": "0.42", "loss": "0.2911", "s2c_nll_loss": "0.42", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "257.5", "ups": "4.02", 
"wpb": "64", "bsz": "64", "num_updates": "31980", "lr": "0.000186801", "gnorm": "6.563", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7893"} 2023-01-29 18:23:17 | INFO | train_inner | {"epoch": 15, "update": 14.801, "s2c_loss": "0.415", "loss": "0.28772", "s2c_nll_loss": "0.415", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "31990", "lr": "0.000186734", "gnorm": "5.894", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7895"} 2023-01-29 18:23:20 | INFO | train_inner | {"epoch": 15, "update": 14.806, "s2c_loss": "0.483", "loss": "0.33505", "s2c_nll_loss": "0.483", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32000", "lr": "0.000186667", "gnorm": "6.722", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "7898"} 2023-01-29 18:23:22 | INFO | train_inner | {"epoch": 15, "update": 14.81, "s2c_loss": "0.493", "loss": "0.34202", "s2c_nll_loss": "0.493", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "32010", "lr": "0.000186601", "gnorm": "6.397", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "7900"} 2023-01-29 18:23:25 | INFO | train_inner | {"epoch": 15, "update": 14.815, "s2c_loss": "0.429", "loss": "0.29753", "s2c_nll_loss": "0.429", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "32020", "lr": "0.000186534", "gnorm": "7.154", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7903"} 2023-01-29 18:23:28 | INFO | train_inner | {"epoch": 15, "update": 14.82, "s2c_loss": "0.443", "loss": "0.30729", "s2c_nll_loss": "0.443", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": 
"251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "32030", "lr": "0.000186467", "gnorm": "5.799", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7905"} 2023-01-29 18:23:30 | INFO | train_inner | {"epoch": 15, "update": 14.824, "s2c_loss": "0.404", "loss": "0.28002", "s2c_nll_loss": "0.404", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "32040", "lr": "0.000186401", "gnorm": "6.073", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "7908"} 2023-01-29 18:23:33 | INFO | train_inner | {"epoch": 15, "update": 14.829, "s2c_loss": "0.364", "loss": "0.25212", "s2c_nll_loss": "0.364", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "32050", "lr": "0.000186334", "gnorm": "5.856", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7911"} 2023-01-29 18:23:35 | INFO | train_inner | {"epoch": 15, "update": 14.833, "s2c_loss": "0.587", "loss": "0.40662", "s2c_nll_loss": "0.587", "s2c_accuracy": "89.375", "s2c_total": "64", "s2c_n_correct": "57.2", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "32060", "lr": "0.000186267", "gnorm": "6.997", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7913"} 2023-01-29 18:23:38 | INFO | train_inner | {"epoch": 15, "update": 14.838, "s2c_loss": "0.415", "loss": "0.28737", "s2c_nll_loss": "0.415", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32070", "lr": "0.000186201", "gnorm": "6.241", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7916"} 2023-01-29 18:23:40 | INFO | train_inner | {"epoch": 15, "update": 14.843, "s2c_loss": "0.425", "loss": "0.29445", "s2c_nll_loss": "0.425", "s2c_accuracy": "92.969", "s2c_total": "64", 
"s2c_n_correct": "59.5", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32080", "lr": "0.000186134", "gnorm": "5.993", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7918"} 2023-01-29 18:23:43 | INFO | train_inner | {"epoch": 15, "update": 14.847, "s2c_loss": "0.4", "loss": "0.27757", "s2c_nll_loss": "0.4", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "32090", "lr": "0.000186067", "gnorm": "6.002", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7921"} 2023-01-29 18:23:45 | INFO | train_inner | {"epoch": 15, "update": 14.852, "s2c_loss": "0.388", "loss": "0.26892", "s2c_nll_loss": "0.388", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "258.5", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "32100", "lr": "0.000186001", "gnorm": "6.067", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7923"} 2023-01-29 18:23:48 | INFO | train_inner | {"epoch": 15, "update": 14.857, "s2c_loss": "0.399", "loss": "0.27586", "s2c_nll_loss": "0.399", "s2c_accuracy": "92.622", "s2c_total": "63.7", "s2c_n_correct": "59", "wps": "250.2", "ups": "3.93", "wpb": "63.7", "bsz": "63.7", "num_updates": "32110", "lr": "0.000185934", "gnorm": "5.93", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7926"} 2023-01-29 18:23:50 | INFO | train_inner | {"epoch": 15, "update": 14.861, "s2c_loss": "0.353", "loss": "0.24467", "s2c_nll_loss": "0.353", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "32120", "lr": "0.000185867", "gnorm": "5.41", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7928"} 2023-01-29 18:23:53 | INFO | train_inner | {"epoch": 15, "update": 14.866, "s2c_loss": "0.42", "loss": "0.29109", "s2c_nll_loss": "0.42", "s2c_accuracy": "92.188", 
"s2c_total": "64", "s2c_n_correct": "59", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "32130", "lr": "0.000185801", "gnorm": "6.108", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7931"} 2023-01-29 18:23:55 | INFO | train_inner | {"epoch": 15, "update": 14.87, "s2c_loss": "0.366", "loss": "0.25336", "s2c_nll_loss": "0.366", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "32140", "lr": "0.000185734", "gnorm": "5.155", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7933"} 2023-01-29 18:23:58 | INFO | train_inner | {"epoch": 15, "update": 14.875, "s2c_loss": "0.422", "loss": "0.29229", "s2c_nll_loss": "0.422", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "245.9", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "32150", "lr": "0.000185667", "gnorm": "4.787", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7936"} 2023-01-29 18:24:01 | INFO | train_inner | {"epoch": 15, "update": 14.88, "s2c_loss": "0.336", "loss": "0.23261", "s2c_nll_loss": "0.336", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "32160", "lr": "0.000185601", "gnorm": "6.183", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7939"} 2023-01-29 18:24:03 | INFO | train_inner | {"epoch": 15, "update": 14.884, "s2c_loss": "0.507", "loss": "0.35119", "s2c_nll_loss": "0.507", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "260.2", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "32170", "lr": "0.000185534", "gnorm": "6.157", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7941"} 2023-01-29 18:24:06 | INFO | train_inner | {"epoch": 15, "update": 14.889, "s2c_loss": "0.363", "loss": "0.25168", "s2c_nll_loss": "0.363", 
"s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32180", "lr": "0.000185467", "gnorm": "5.777", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7943"} 2023-01-29 18:24:08 | INFO | train_inner | {"epoch": 15, "update": 14.894, "s2c_loss": "0.356", "loss": "0.2467", "s2c_nll_loss": "0.356", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "32190", "lr": "0.000185401", "gnorm": "5.278", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7946"} 2023-01-29 18:24:11 | INFO | train_inner | {"epoch": 15, "update": 14.898, "s2c_loss": "0.329", "loss": "0.22821", "s2c_nll_loss": "0.329", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "32200", "lr": "0.000185334", "gnorm": "4.796", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7949"} 2023-01-29 18:24:13 | INFO | train_inner | {"epoch": 15, "update": 14.903, "s2c_loss": "0.429", "loss": "0.29718", "s2c_nll_loss": "0.429", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "244.1", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "32210", "lr": "0.000185267", "gnorm": "5.036", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7951"} 2023-01-29 18:24:16 | INFO | train_inner | {"epoch": 15, "update": 14.907, "s2c_loss": "0.379", "loss": "0.26275", "s2c_nll_loss": "0.379", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "32220", "lr": "0.000185201", "gnorm": "4.866", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7954"} 2023-01-29 18:24:18 | INFO | train_inner | {"epoch": 15, "update": 14.912, "s2c_loss": "0.365", "loss": "0.25294", 
"s2c_nll_loss": "0.365", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "32230", "lr": "0.000185134", "gnorm": "4.914", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7956"} 2023-01-29 18:24:21 | INFO | train_inner | {"epoch": 15, "update": 14.917, "s2c_loss": "0.319", "loss": "0.22095", "s2c_nll_loss": "0.319", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "32240", "lr": "0.000185067", "gnorm": "5.34", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7959"} 2023-01-29 18:24:23 | INFO | train_inner | {"epoch": 15, "update": 14.921, "s2c_loss": "0.281", "loss": "0.19499", "s2c_nll_loss": "0.281", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "247", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "32250", "lr": "0.000185001", "gnorm": "5.222", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7961"} 2023-01-29 18:24:26 | INFO | train_inner | {"epoch": 15, "update": 14.926, "s2c_loss": "0.32", "loss": "0.22203", "s2c_nll_loss": "0.32", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "234.9", "ups": "3.67", "wpb": "64", "bsz": "64", "num_updates": "32260", "lr": "0.000184934", "gnorm": "6.273", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7964"} 2023-01-29 18:24:29 | INFO | train_inner | {"epoch": 15, "update": 14.931, "s2c_loss": "0.357", "loss": "0.24764", "s2c_nll_loss": "0.357", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "247.9", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "32270", "lr": "0.000184867", "gnorm": "5.844", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7967"} 2023-01-29 18:24:31 | INFO | train_inner | {"epoch": 15, "update": 14.935, "s2c_loss": "0.391", 
"loss": "0.27091", "s2c_nll_loss": "0.391", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "32280", "lr": "0.000184801", "gnorm": "6.442", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7969"} 2023-01-29 18:24:34 | INFO | train_inner | {"epoch": 15, "update": 14.94, "s2c_loss": "0.429", "loss": "0.29714", "s2c_nll_loss": "0.429", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "32290", "lr": "0.000184734", "gnorm": "6.179", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "7972"} 2023-01-29 18:24:36 | INFO | train_inner | {"epoch": 15, "update": 14.944, "s2c_loss": "0.31", "loss": "0.21489", "s2c_nll_loss": "0.31", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "32300", "lr": "0.000184667", "gnorm": "5.083", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7974"} 2023-01-29 18:24:39 | INFO | train_inner | {"epoch": 15, "update": 14.949, "s2c_loss": "0.518", "loss": "0.35875", "s2c_nll_loss": "0.518", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "32310", "lr": "0.000184601", "gnorm": "6.508", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "7977"} 2023-01-29 18:24:41 | INFO | train_inner | {"epoch": 15, "update": 14.954, "s2c_loss": "0.436", "loss": "0.3022", "s2c_nll_loss": "0.436", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "32320", "lr": "0.000184534", "gnorm": "6.933", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7979"} 2023-01-29 18:24:44 | INFO | train_inner | {"epoch": 15, "update": 14.958, 
"s2c_loss": "0.398", "loss": "0.27558", "s2c_nll_loss": "0.398", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32330", "lr": "0.000184467", "gnorm": "6.56", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7982"} 2023-01-29 18:24:47 | INFO | train_inner | {"epoch": 15, "update": 14.963, "s2c_loss": "0.371", "loss": "0.25748", "s2c_nll_loss": "0.371", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "32340", "lr": "0.000184401", "gnorm": "5.955", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7984"} 2023-01-29 18:24:49 | INFO | train_inner | {"epoch": 15, "update": 14.968, "s2c_loss": "0.384", "loss": "0.26623", "s2c_nll_loss": "0.384", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32350", "lr": "0.000184334", "gnorm": "5.85", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "7987"} 2023-01-29 18:24:52 | INFO | train_inner | {"epoch": 15, "update": 14.972, "s2c_loss": "0.324", "loss": "0.22448", "s2c_nll_loss": "0.324", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "32360", "lr": "0.000184267", "gnorm": "6.925", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7989"} 2023-01-29 18:24:54 | INFO | train_inner | {"epoch": 15, "update": 14.977, "s2c_loss": "0.468", "loss": "0.32412", "s2c_nll_loss": "0.468", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "32370", "lr": "0.000184201", "gnorm": "5.749", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "7992"} 2023-01-29 18:24:56 | INFO | train_inner | {"epoch": 15, 
"update": 14.981, "s2c_loss": "0.461", "loss": "0.31934", "s2c_nll_loss": "0.461", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "262.1", "ups": "4.09", "wpb": "64", "bsz": "64", "num_updates": "32380", "lr": "0.000184134", "gnorm": "6.423", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7994"} 2023-01-29 18:24:59 | INFO | train_inner | {"epoch": 15, "update": 14.986, "s2c_loss": "0.421", "loss": "0.29168", "s2c_nll_loss": "0.421", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32390", "lr": "0.000184067", "gnorm": "5.612", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "7997"} 2023-01-29 18:25:02 | INFO | train_inner | {"epoch": 15, "update": 14.991, "s2c_loss": "0.429", "loss": "0.29707", "s2c_nll_loss": "0.429", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "247.9", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "32400", "lr": "0.000184001", "gnorm": "6.809", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "8000"} 2023-01-29 18:25:04 | INFO | train_inner | {"epoch": 15, "update": 14.995, "s2c_loss": "0.424", "loss": "0.2939", "s2c_nll_loss": "0.424", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "32410", "lr": "0.000183934", "gnorm": "5.165", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "8002"} 2023-01-29 18:25:07 | INFO | train_inner | {"epoch": 15, "update": 15.0, "s2c_loss": "0.406", "loss": "0.28148", "s2c_nll_loss": "0.406", "s2c_accuracy": "92.434", "s2c_total": "60.8", "s2c_n_correct": "56.2", "wps": "252.9", "ups": "4.16", "wpb": "60.8", "bsz": "60.8", "num_updates": "32420", "lr": "0.000183867", "gnorm": "5.448", "loss_scale": "512", "train_wall": "2", "gb_free": "7.5", "wall": "8004"} 2023-01-29 18:25:07 | INFO | 
fairseq_cli.train | begin validation on "valid" subset 2023-01-29 18:25:21 | INFO | valid | {"epoch": 15, "valid_s2c_loss": "0.798", "valid_loss": "0.55338", "valid_s2c_nll_loss": "0.798", "valid_s2c_accuracy": "85.745", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "27.4028", "valid_num_updates": "32420", "valid_best_s2c_accuracy": "85.745"} 2023-01-29 18:25:21 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 15 @ 32420 updates 2023-01-29 18:25:21 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 18:25:28 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 18:25:33 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt (epoch 15 @ 32420 updates, score 85.745) (writing took 11.846482349094003 seconds) 2023-01-29 18:25:33 | INFO | fairseq_cli.train | end of epoch 15 (average epoch stats below) 2023-01-29 18:25:33 | INFO | train | {"epoch": 15, "train_s2c_loss": "0.473", "train_loss": "0.32785", "train_s2c_nll_loss": "0.473", "train_s2c_accuracy": "91.425", "train_s2c_total": "63.9838", "train_s2c_n_correct": "58.4972", "train_wps": "237.3", "train_ups": "3.71", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "32420", "train_lr": "0.000183867", "train_gnorm": "6.503", "train_loss_scale": "512", "train_train_wall": "542", "train_gb_free": "7.5", "train_wall": "8031"} 2023-01-29 18:25:40 | INFO | fairseq.trainer | begin training epoch 16 2023-01-29 18:25:40 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 18:25:42 | INFO | train_inner | {"epoch": 16, "update": 15.005, "s2c_loss": "0.853", "loss": "0.59116", "s2c_nll_loss": "0.853", "s2c_accuracy": "87.812", "s2c_total": "64", "s2c_n_correct": "56.2", "wps": "17.9", "ups": "0.28", "wpb": "64", "bsz": "64", "num_updates": "32430", 
"lr": "0.000183801", "gnorm": "5.609", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "8040"} 2023-01-29 18:25:45 | INFO | train_inner | {"epoch": 16, "update": 15.009, "s2c_loss": "0.39", "loss": "0.27063", "s2c_nll_loss": "0.39", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "32440", "lr": "0.000183734", "gnorm": "5.815", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8043"} 2023-01-29 18:25:47 | INFO | train_inner | {"epoch": 16, "update": 15.014, "s2c_loss": "0.281", "loss": "0.19449", "s2c_nll_loss": "0.281", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "247.5", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "32450", "lr": "0.000183667", "gnorm": "4.77", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "8045"} 2023-01-29 18:25:50 | INFO | train_inner | {"epoch": 16, "update": 15.019, "s2c_loss": "0.32", "loss": "0.22174", "s2c_nll_loss": "0.32", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "32460", "lr": "0.000183601", "gnorm": "5.958", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "8048"} 2023-01-29 18:25:53 | INFO | train_inner | {"epoch": 16, "update": 15.023, "s2c_loss": "0.283", "loss": "0.19608", "s2c_nll_loss": "0.283", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32470", "lr": "0.000183534", "gnorm": "5.959", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "8050"} 2023-01-29 18:25:55 | INFO | train_inner | {"epoch": 16, "update": 15.028, "s2c_loss": "0.356", "loss": "0.24694", "s2c_nll_loss": "0.356", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "260.1", "ups": "4.06", "wpb": "64", "bsz": "64", 
"num_updates": "32480", "lr": "0.000183467", "gnorm": "5.08", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8053"} 2023-01-29 18:25:58 | INFO | train_inner | {"epoch": 16, "update": 15.032, "s2c_loss": "0.368", "loss": "0.25541", "s2c_nll_loss": "0.368", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "32490", "lr": "0.000183401", "gnorm": "5.432", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "8055"} 2023-01-29 18:26:00 | INFO | train_inner | {"epoch": 16, "update": 15.037, "s2c_loss": "0.288", "loss": "0.19983", "s2c_nll_loss": "0.288", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "32500", "lr": "0.000183334", "gnorm": "5.482", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "8058"} 2023-01-29 18:26:03 | INFO | train_inner | {"epoch": 16, "update": 15.042, "s2c_loss": "0.306", "loss": "0.21194", "s2c_nll_loss": "0.306", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "32510", "lr": "0.000183268", "gnorm": "4.802", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "8061"} 2023-01-29 18:26:05 | INFO | train_inner | {"epoch": 16, "update": 15.046, "s2c_loss": "0.294", "loss": "0.20357", "s2c_nll_loss": "0.294", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "32520", "lr": "0.000183201", "gnorm": "4.79", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8063"} 2023-01-29 18:26:08 | INFO | train_inner | {"epoch": 16, "update": 15.051, "s2c_loss": "0.406", "loss": "0.28122", "s2c_nll_loss": "0.406", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "257.9", "ups": "4.03", "wpb": 
"64", "bsz": "64", "num_updates": "32530", "lr": "0.000183134", "gnorm": "4.904", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8066"} 2023-01-29 18:26:10 | INFO | train_inner | {"epoch": 16, "update": 15.056, "s2c_loss": "0.323", "loss": "0.22404", "s2c_nll_loss": "0.323", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "32540", "lr": "0.000183068", "gnorm": "5.338", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "8068"} 2023-01-29 18:26:13 | INFO | train_inner | {"epoch": 16, "update": 15.06, "s2c_loss": "0.226", "loss": "0.15656", "s2c_nll_loss": "0.226", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "246", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "32550", "lr": "0.000183001", "gnorm": "4.492", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "8071"} 2023-01-29 18:26:15 | INFO | train_inner | {"epoch": 16, "update": 15.065, "s2c_loss": "0.214", "loss": "0.14855", "s2c_nll_loss": "0.214", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "32560", "lr": "0.000182934", "gnorm": "4.639", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8073"} 2023-01-29 18:26:18 | INFO | train_inner | {"epoch": 16, "update": 15.069, "s2c_loss": "0.309", "loss": "0.21432", "s2c_nll_loss": "0.309", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "32570", "lr": "0.000182868", "gnorm": "4.27", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8076"} 2023-01-29 18:26:20 | INFO | train_inner | {"epoch": 16, "update": 15.074, "s2c_loss": "0.257", "loss": "0.17816", "s2c_nll_loss": "0.257", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "253.6", "ups": 
"3.96", "wpb": "64", "bsz": "64", "num_updates": "32580", "lr": "0.000182801", "gnorm": "4.236", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8078"} 2023-01-29 18:26:23 | INFO | train_inner | {"epoch": 16, "update": 15.079, "s2c_loss": "0.204", "loss": "0.14144", "s2c_nll_loss": "0.204", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "32590", "lr": "0.000182734", "gnorm": "4.126", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "8081"} 2023-01-29 18:26:25 | INFO | train_inner | {"epoch": 16, "update": 15.083, "s2c_loss": "0.341", "loss": "0.23626", "s2c_nll_loss": "0.341", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "32600", "lr": "0.000182668", "gnorm": "5.286", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8083"} 2023-01-29 18:26:28 | INFO | train_inner | {"epoch": 16, "update": 15.088, "s2c_loss": "0.447", "loss": "0.30979", "s2c_nll_loss": "0.447", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32610", "lr": "0.000182601", "gnorm": "5.673", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8086"} 2023-01-29 18:26:30 | INFO | train_inner | {"epoch": 16, "update": 15.093, "s2c_loss": "0.299", "loss": "0.20712", "s2c_nll_loss": "0.299", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "32620", "lr": "0.000182534", "gnorm": "4.451", "loss_scale": "512", "train_wall": "2", "gb_free": "7.5", "wall": "8088"} 2023-01-29 18:26:33 | INFO | train_inner | {"epoch": 16, "update": 15.097, "s2c_loss": "0.45", "loss": "0.31224", "s2c_nll_loss": "0.45", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": 
"254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32630", "lr": "0.000182468", "gnorm": "5.221", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8091"} 2023-01-29 18:26:35 | INFO | train_inner | {"epoch": 16, "update": 15.102, "s2c_loss": "0.728", "loss": "0.50484", "s2c_nll_loss": "0.728", "s2c_accuracy": "89.219", "s2c_total": "64", "s2c_n_correct": "57.1", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "32640", "lr": "0.000182401", "gnorm": "6.325", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8093"} 2023-01-29 18:26:38 | INFO | train_inner | {"epoch": 16, "update": 15.106, "s2c_loss": "0.344", "loss": "0.23848", "s2c_nll_loss": "0.344", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "32650", "lr": "0.000182334", "gnorm": "6.021", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8096"} 2023-01-29 18:26:41 | INFO | train_inner | {"epoch": 16, "update": 15.111, "s2c_loss": "0.316", "loss": "0.21931", "s2c_nll_loss": "0.316", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32660", "lr": "0.000182268", "gnorm": "5.315", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8098"} 2023-01-29 18:26:43 | INFO | train_inner | {"epoch": 16, "update": 15.116, "s2c_loss": "0.342", "loss": "0.23712", "s2c_nll_loss": "0.342", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "32670", "lr": "0.000182201", "gnorm": "5.337", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "8101"} 2023-01-29 18:26:46 | INFO | train_inner | {"epoch": 16, "update": 15.12, "s2c_loss": "0.29", "loss": "0.20115", "s2c_nll_loss": "0.29", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": 
"61.1", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "32680", "lr": "0.000182134", "gnorm": "4.215", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "8104"} 2023-01-29 18:26:48 | INFO | train_inner | {"epoch": 16, "update": 15.125, "s2c_loss": "0.242", "loss": "0.16809", "s2c_nll_loss": "0.242", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "32690", "lr": "0.000182068", "gnorm": "4.689", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "8106"} 2023-01-29 18:26:51 | INFO | train_inner | {"epoch": 16, "update": 15.13, "s2c_loss": "0.155", "loss": "0.10768", "s2c_nll_loss": "0.155", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "32700", "lr": "0.000182001", "gnorm": "3.506", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8109"} 2023-01-29 18:26:53 | INFO | train_inner | {"epoch": 16, "update": 15.134, "s2c_loss": "0.251", "loss": "0.17402", "s2c_nll_loss": "0.251", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "32710", "lr": "0.000181934", "gnorm": "4.511", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "8111"} 2023-01-29 18:26:56 | INFO | train_inner | {"epoch": 16, "update": 15.139, "s2c_loss": "0.407", "loss": "0.28179", "s2c_nll_loss": "0.407", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32720", "lr": "0.000181868", "gnorm": "5.695", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "8114"} 2023-01-29 18:26:58 | INFO | train_inner | {"epoch": 16, "update": 15.143, "s2c_loss": "0.367", "loss": "0.2542", "s2c_nll_loss": "0.367", "s2c_accuracy": "92.5", "s2c_total": "64", 
"s2c_n_correct": "59.2", "wps": "258.2", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "32730", "lr": "0.000181801", "gnorm": "5.191", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8116"} 2023-01-29 18:27:01 | INFO | train_inner | {"epoch": 16, "update": 15.148, "s2c_loss": "0.392", "loss": "0.27205", "s2c_nll_loss": "0.392", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "32740", "lr": "0.000181734", "gnorm": "5.685", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8119"} 2023-01-29 18:27:03 | INFO | train_inner | {"epoch": 16, "update": 15.153, "s2c_loss": "0.486", "loss": "0.33678", "s2c_nll_loss": "0.486", "s2c_accuracy": "90.938", "s2c_total": "64", "s2c_n_correct": "58.2", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32750", "lr": "0.000181668", "gnorm": "6.591", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8121"} 2023-01-29 18:27:06 | INFO | train_inner | {"epoch": 16, "update": 15.157, "s2c_loss": "0.298", "loss": "0.20632", "s2c_nll_loss": "0.298", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "32760", "lr": "0.000181601", "gnorm": "5.414", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "8124"} 2023-01-29 18:27:08 | INFO | train_inner | {"epoch": 16, "update": 15.162, "s2c_loss": "0.302", "loss": "0.20927", "s2c_nll_loss": "0.302", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "32770", "lr": "0.000181534", "gnorm": "5.454", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8126"} 2023-01-29 18:27:11 | INFO | train_inner | {"epoch": 16, "update": 15.167, "s2c_loss": "0.312", "loss": "0.21598", "s2c_nll_loss": "0.312", "s2c_accuracy": "95.156", 
"s2c_total": "64", "s2c_n_correct": "60.9", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "32780", "lr": "0.000181468", "gnorm": "4.713", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8129"} 2023-01-29 18:27:13 | INFO | train_inner | {"epoch": 16, "update": 15.171, "s2c_loss": "0.347", "loss": "0.24083", "s2c_nll_loss": "0.347", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "32790", "lr": "0.000181401", "gnorm": "5.806", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "8131"} 2023-01-29 18:27:16 | INFO | train_inner | {"epoch": 16, "update": 15.176, "s2c_loss": "0.408", "loss": "0.28287", "s2c_nll_loss": "0.408", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32800", "lr": "0.000181334", "gnorm": "5.368", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8134"} 2023-01-29 18:27:18 | INFO | train_inner | {"epoch": 16, "update": 15.18, "s2c_loss": "0.303", "loss": "0.2099", "s2c_nll_loss": "0.303", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32810", "lr": "0.000181268", "gnorm": "5.31", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8136"} 2023-01-29 18:27:21 | INFO | train_inner | {"epoch": 16, "update": 15.185, "s2c_loss": "0.305", "loss": "0.21151", "s2c_nll_loss": "0.305", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "32820", "lr": "0.000181201", "gnorm": "5.507", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8139"} 2023-01-29 18:27:23 | INFO | train_inner | {"epoch": 16, "update": 15.19, "s2c_loss": "0.27", "loss": "0.18725", "s2c_nll_loss": "0.27", "s2c_accuracy": 
"95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "32830", "lr": "0.000181134", "gnorm": "5.262", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8141"} 2023-01-29 18:27:26 | INFO | train_inner | {"epoch": 16, "update": 15.194, "s2c_loss": "0.305", "loss": "0.21166", "s2c_nll_loss": "0.305", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "32840", "lr": "0.000181068", "gnorm": "5.273", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8144"} 2023-01-29 18:27:28 | INFO | train_inner | {"epoch": 16, "update": 15.199, "s2c_loss": "0.296", "loss": "0.20487", "s2c_nll_loss": "0.296", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "32850", "lr": "0.000181001", "gnorm": "4.728", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8146"} 2023-01-29 18:27:31 | INFO | train_inner | {"epoch": 16, "update": 15.204, "s2c_loss": "0.267", "loss": "0.18486", "s2c_nll_loss": "0.267", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "32860", "lr": "0.000180934", "gnorm": "4.476", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "8149"} 2023-01-29 18:27:33 | INFO | train_inner | {"epoch": 16, "update": 15.208, "s2c_loss": "0.28", "loss": "0.19406", "s2c_nll_loss": "0.28", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "32870", "lr": "0.000180868", "gnorm": "4.832", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8151"} 2023-01-29 18:27:36 | INFO | train_inner | {"epoch": 16, "update": 15.213, "s2c_loss": "0.289", "loss": "0.20001", "s2c_nll_loss": "0.289", 
"s2c_accuracy": "94.819", "s2c_total": "63.7", "s2c_n_correct": "60.4", "wps": "253.5", "ups": "3.98", "wpb": "63.7", "bsz": "63.7", "num_updates": "32880", "lr": "0.000180801", "gnorm": "5.213", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8154"} 2023-01-29 18:27:38 | INFO | train_inner | {"epoch": 16, "update": 15.217, "s2c_loss": "0.313", "loss": "0.21663", "s2c_nll_loss": "0.313", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "32890", "lr": "0.000180734", "gnorm": "5.175", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8156"} 2023-01-29 18:27:41 | INFO | train_inner | {"epoch": 16, "update": 15.222, "s2c_loss": "0.347", "loss": "0.24024", "s2c_nll_loss": "0.347", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "32900", "lr": "0.000180668", "gnorm": "5.676", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8159"} 2023-01-29 18:27:43 | INFO | train_inner | {"epoch": 16, "update": 15.227, "s2c_loss": "0.322", "loss": "0.22302", "s2c_nll_loss": "0.322", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "32910", "lr": "0.000180601", "gnorm": "5.181", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8161"} 2023-01-29 18:27:46 | INFO | train_inner | {"epoch": 16, "update": 15.231, "s2c_loss": "0.32", "loss": "0.22196", "s2c_nll_loss": "0.32", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "32920", "lr": "0.000180534", "gnorm": "5.712", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8164"} 2023-01-29 18:27:49 | INFO | train_inner | {"epoch": 16, "update": 15.236, "s2c_loss": "0.347", "loss": "0.24031", 
"s2c_nll_loss": "0.347", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "32930", "lr": "0.000180468", "gnorm": "5.799", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8166"} 2023-01-29 18:27:51 | INFO | train_inner | {"epoch": 16, "update": 15.241, "s2c_loss": "0.305", "loss": "0.21115", "s2c_nll_loss": "0.305", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "257.8", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "32940", "lr": "0.000180401", "gnorm": "5.483", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8169"} 2023-01-29 18:27:54 | INFO | train_inner | {"epoch": 16, "update": 15.245, "s2c_loss": "0.363", "loss": "0.25173", "s2c_nll_loss": "0.363", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "32950", "lr": "0.000180334", "gnorm": "5.137", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8171"} 2023-01-29 18:27:56 | INFO | train_inner | {"epoch": 16, "update": 15.25, "s2c_loss": "0.454", "loss": "0.31482", "s2c_nll_loss": "0.454", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "32960", "lr": "0.000180268", "gnorm": "5.657", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "8174"} 2023-01-29 18:27:59 | INFO | train_inner | {"epoch": 16, "update": 15.254, "s2c_loss": "0.368", "loss": "0.25486", "s2c_nll_loss": "0.368", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "32970", "lr": "0.000180201", "gnorm": "4.97", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8177"} 2023-01-29 18:28:01 | INFO | train_inner | {"epoch": 16, "update": 15.259, "s2c_loss": "0.284", "loss": 
"0.19719", "s2c_nll_loss": "0.284", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "32980", "lr": "0.000180134", "gnorm": "4.54", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "8179"} 2023-01-29 18:28:04 | INFO | train_inner | {"epoch": 16, "update": 15.264, "s2c_loss": "0.308", "loss": "0.21367", "s2c_nll_loss": "0.308", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "32990", "lr": "0.000180068", "gnorm": "4.704", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8181"} 2023-01-29 18:28:06 | INFO | train_inner | {"epoch": 16, "update": 15.268, "s2c_loss": "0.37", "loss": "0.25643", "s2c_nll_loss": "0.37", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "33000", "lr": "0.000180001", "gnorm": "5.19", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "8184"} 2023-01-29 18:28:09 | INFO | train_inner | {"epoch": 16, "update": 15.273, "s2c_loss": "0.298", "loss": "0.20646", "s2c_nll_loss": "0.298", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "33010", "lr": "0.000179934", "gnorm": "5.188", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "8187"} 2023-01-29 18:28:11 | INFO | train_inner | {"epoch": 16, "update": 15.278, "s2c_loss": "0.384", "loss": "0.26607", "s2c_nll_loss": "0.384", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "33020", "lr": "0.000179868", "gnorm": "8.953", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "8189"} 2023-01-29 18:28:14 | INFO | train_inner | {"epoch": 16, "update": 15.282, "s2c_loss": "0.331", 
"loss": "0.22972", "s2c_nll_loss": "0.331", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "258.2", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "33030", "lr": "0.000179801", "gnorm": "5.629", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8192"} 2023-01-29 18:28:16 | INFO | train_inner | {"epoch": 16, "update": 15.287, "s2c_loss": "0.292", "loss": "0.20248", "s2c_nll_loss": "0.292", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "33040", "lr": "0.000179734", "gnorm": "4.568", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "8194"} 2023-01-29 18:28:19 | INFO | train_inner | {"epoch": 16, "update": 15.291, "s2c_loss": "0.278", "loss": "0.19272", "s2c_nll_loss": "0.278", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "247", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "33050", "lr": "0.000179668", "gnorm": "5.342", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "8197"} 2023-01-29 18:28:21 | INFO | train_inner | {"epoch": 16, "update": 15.296, "s2c_loss": "0.346", "loss": "0.23984", "s2c_nll_loss": "0.346", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "33060", "lr": "0.000179601", "gnorm": "5.168", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8199"} 2023-01-29 18:28:24 | INFO | train_inner | {"epoch": 16, "update": 15.301, "s2c_loss": "0.302", "loss": "0.20916", "s2c_nll_loss": "0.302", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "33070", "lr": "0.000179534", "gnorm": "5.331", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8202"} 2023-01-29 18:28:26 | INFO | train_inner | {"epoch": 16, "update": 15.305, 
"s2c_loss": "0.394", "loss": "0.2733", "s2c_nll_loss": "0.394", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "33080", "lr": "0.000179468", "gnorm": "6.291", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "8204"} 2023-01-29 18:28:29 | INFO | train_inner | {"epoch": 16, "update": 15.31, "s2c_loss": "0.343", "loss": "0.23792", "s2c_nll_loss": "0.343", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "259.3", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "33090", "lr": "0.000179401", "gnorm": "5.534", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "8207"} 2023-01-29 18:28:31 | INFO | train_inner | {"epoch": 16, "update": 15.315, "s2c_loss": "0.379", "loss": "0.26245", "s2c_nll_loss": "0.379", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "33100", "lr": "0.000179334", "gnorm": "6.11", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8209"} 2023-01-29 18:28:34 | INFO | train_inner | {"epoch": 16, "update": 15.319, "s2c_loss": "0.315", "loss": "0.21863", "s2c_nll_loss": "0.315", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "33110", "lr": "0.000179268", "gnorm": "5.469", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8212"} 2023-01-29 18:28:36 | INFO | train_inner | {"epoch": 16, "update": 15.324, "s2c_loss": "0.312", "loss": "0.21601", "s2c_nll_loss": "0.312", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "33120", "lr": "0.000179201", "gnorm": "5.023", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8214"} 2023-01-29 18:28:39 | INFO | train_inner | {"epoch": 16, 
"update": 15.328, "s2c_loss": "0.417", "loss": "0.28876", "s2c_nll_loss": "0.417", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "33130", "lr": "0.000179134", "gnorm": "5.208", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8217"} 2023-01-29 18:28:41 | INFO | train_inner | {"epoch": 16, "update": 15.333, "s2c_loss": "0.412", "loss": "0.28574", "s2c_nll_loss": "0.412", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "33140", "lr": "0.000179068", "gnorm": "5.56", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8219"} 2023-01-29 18:28:44 | INFO | train_inner | {"epoch": 16, "update": 15.338, "s2c_loss": "0.68", "loss": "0.47147", "s2c_nll_loss": "0.68", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "33150", "lr": "0.000179001", "gnorm": "5.783", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8222"} 2023-01-29 18:28:46 | INFO | train_inner | {"epoch": 16, "update": 15.342, "s2c_loss": "0.346", "loss": "0.23984", "s2c_nll_loss": "0.346", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "33160", "lr": "0.000178934", "gnorm": "5.951", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8224"} 2023-01-29 18:28:49 | INFO | train_inner | {"epoch": 16, "update": 15.347, "s2c_loss": "0.369", "loss": "0.25574", "s2c_nll_loss": "0.369", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "33170", "lr": "0.000178868", "gnorm": "6.684", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8227"} 2023-01-29 18:28:51 | INFO | 
train_inner | {"epoch": 16, "update": 15.352, "s2c_loss": "0.46", "loss": "0.31889", "s2c_nll_loss": "0.46", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "33180", "lr": "0.000178801", "gnorm": "6.422", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8229"} 2023-01-29 18:28:54 | INFO | train_inner | {"epoch": 16, "update": 15.356, "s2c_loss": "0.631", "loss": "0.43713", "s2c_nll_loss": "0.631", "s2c_accuracy": "89.531", "s2c_total": "64", "s2c_n_correct": "57.3", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "33190", "lr": "0.000178734", "gnorm": "5.673", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8232"} 2023-01-29 18:28:56 | INFO | train_inner | {"epoch": 16, "update": 15.361, "s2c_loss": "0.33", "loss": "0.22898", "s2c_nll_loss": "0.33", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "33200", "lr": "0.000178668", "gnorm": "4.756", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8234"} 2023-01-29 18:28:59 | INFO | train_inner | {"epoch": 16, "update": 15.365, "s2c_loss": "0.343", "loss": "0.23809", "s2c_nll_loss": "0.343", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "33210", "lr": "0.000178601", "gnorm": "5.018", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8237"} 2023-01-29 18:29:02 | INFO | train_inner | {"epoch": 16, "update": 15.37, "s2c_loss": "0.286", "loss": "0.19807", "s2c_nll_loss": "0.286", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "33220", "lr": "0.000178534", "gnorm": "4.936", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8240"} 2023-01-29 
18:29:04 | INFO | train_inner | {"epoch": 16, "update": 15.375, "s2c_loss": "0.34", "loss": "0.23534", "s2c_nll_loss": "0.34", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "33230", "lr": "0.000178468", "gnorm": "5.541", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8242"} 2023-01-29 18:29:07 | INFO | train_inner | {"epoch": 16, "update": 15.379, "s2c_loss": "0.368", "loss": "0.25495", "s2c_nll_loss": "0.368", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "33240", "lr": "0.000178401", "gnorm": "5.514", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8244"} 2023-01-29 18:29:09 | INFO | train_inner | {"epoch": 16, "update": 15.384, "s2c_loss": "0.354", "loss": "0.24509", "s2c_nll_loss": "0.354", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "33250", "lr": "0.000178334", "gnorm": "5.874", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8247"} 2023-01-29 18:29:12 | INFO | train_inner | {"epoch": 16, "update": 15.389, "s2c_loss": "0.357", "loss": "0.24713", "s2c_nll_loss": "0.357", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "33260", "lr": "0.000178268", "gnorm": "5.849", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8250"} 2023-01-29 18:29:14 | INFO | train_inner | {"epoch": 16, "update": 15.393, "s2c_loss": "0.331", "loss": "0.22976", "s2c_nll_loss": "0.331", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "33270", "lr": "0.000178201", "gnorm": "6.85", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": 
"8252"} 2023-01-29 18:29:17 | INFO | train_inner | {"epoch": 16, "update": 15.398, "s2c_loss": "0.353", "loss": "0.24501", "s2c_nll_loss": "0.353", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "262.6", "ups": "4.1", "wpb": "64", "bsz": "64", "num_updates": "33280", "lr": "0.000178134", "gnorm": "5.836", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8254"} 2023-01-29 18:29:19 | INFO | train_inner | {"epoch": 16, "update": 15.402, "s2c_loss": "0.352", "loss": "0.24374", "s2c_nll_loss": "0.352", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "33290", "lr": "0.000178068", "gnorm": "6.733", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8257"} 2023-01-29 18:29:22 | INFO | train_inner | {"epoch": 16, "update": 15.407, "s2c_loss": "0.336", "loss": "0.23323", "s2c_nll_loss": "0.336", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "33300", "lr": "0.000178001", "gnorm": "5.634", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8259"} 2023-01-29 18:29:24 | INFO | train_inner | {"epoch": 16, "update": 15.412, "s2c_loss": "0.316", "loss": "0.21928", "s2c_nll_loss": "0.316", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "33310", "lr": "0.000177934", "gnorm": "5.375", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8262"} 2023-01-29 18:29:27 | INFO | train_inner | {"epoch": 16, "update": 15.416, "s2c_loss": "0.288", "loss": "0.19959", "s2c_nll_loss": "0.288", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "262.7", "ups": "4.1", "wpb": "64", "bsz": "64", "num_updates": "33320", "lr": "0.000177868", "gnorm": "4.724", "loss_scale": "1024", "train_wall": "2", "gb_free": 
"7.2", "wall": "8264"} 2023-01-29 18:29:29 | INFO | train_inner | {"epoch": 16, "update": 15.421, "s2c_loss": "0.311", "loss": "0.21526", "s2c_nll_loss": "0.311", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "33330", "lr": "0.000177801", "gnorm": "5.325", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8267"} 2023-01-29 18:29:32 | INFO | train_inner | {"epoch": 16, "update": 15.426, "s2c_loss": "0.353", "loss": "0.2449", "s2c_nll_loss": "0.353", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "33340", "lr": "0.000177734", "gnorm": "5.225", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8269"} 2023-01-29 18:29:34 | INFO | train_inner | {"epoch": 16, "update": 15.43, "s2c_loss": "0.285", "loss": "0.19765", "s2c_nll_loss": "0.285", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "253.8", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "33350", "lr": "0.000177668", "gnorm": "4.92", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8272"} 2023-01-29 18:29:37 | INFO | train_inner | {"epoch": 16, "update": 15.435, "s2c_loss": "0.319", "loss": "0.22143", "s2c_nll_loss": "0.319", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "33360", "lr": "0.000177601", "gnorm": "5.386", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8274"} 2023-01-29 18:29:39 | INFO | train_inner | {"epoch": 16, "update": 15.439, "s2c_loss": "0.276", "loss": "0.19129", "s2c_nll_loss": "0.276", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "33370", "lr": "0.000177534", "gnorm": "4.863", "loss_scale": "1024", "train_wall": 
"2", "gb_free": "7.2", "wall": "8277"} 2023-01-29 18:29:42 | INFO | train_inner | {"epoch": 16, "update": 15.444, "s2c_loss": "0.342", "loss": "0.23697", "s2c_nll_loss": "0.342", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "33380", "lr": "0.000177468", "gnorm": "5.657", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "8280"} 2023-01-29 18:29:44 | INFO | train_inner | {"epoch": 16, "update": 15.449, "s2c_loss": "0.403", "loss": "0.27931", "s2c_nll_loss": "0.403", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "33390", "lr": "0.000177401", "gnorm": "6.446", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.5", "wall": "8282"} 2023-01-29 18:29:47 | INFO | train_inner | {"epoch": 16, "update": 15.453, "s2c_loss": "0.336", "loss": "0.23307", "s2c_nll_loss": "0.336", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "33400", "lr": "0.000177334", "gnorm": "5.893", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8285"} 2023-01-29 18:29:49 | INFO | train_inner | {"epoch": 16, "update": 15.458, "s2c_loss": "0.281", "loss": "0.19474", "s2c_nll_loss": "0.281", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "33410", "lr": "0.000177268", "gnorm": "4.858", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8287"} 2023-01-29 18:29:52 | INFO | train_inner | {"epoch": 16, "update": 15.463, "s2c_loss": "0.312", "loss": "0.2161", "s2c_nll_loss": "0.312", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "33420", "lr": "0.000177201", "gnorm": "6.014", "loss_scale": 
"1024", "train_wall": "2", "gb_free": "7.2", "wall": "8290"} 2023-01-29 18:29:54 | INFO | train_inner | {"epoch": 16, "update": 15.467, "s2c_loss": "0.29", "loss": "0.20112", "s2c_nll_loss": "0.29", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "33430", "lr": "0.000177134", "gnorm": "5.563", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8292"} 2023-01-29 18:29:57 | INFO | train_inner | {"epoch": 16, "update": 15.472, "s2c_loss": "0.34", "loss": "0.23591", "s2c_nll_loss": "0.34", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "33440", "lr": "0.000177068", "gnorm": "6.567", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8295"} 2023-01-29 18:29:59 | INFO | train_inner | {"epoch": 16, "update": 15.476, "s2c_loss": "0.331", "loss": "0.22935", "s2c_nll_loss": "0.331", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "33450", "lr": "0.000177001", "gnorm": "5.118", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8297"} 2023-01-29 18:30:02 | INFO | train_inner | {"epoch": 16, "update": 15.481, "s2c_loss": "0.7", "loss": "0.48488", "s2c_nll_loss": "0.7", "s2c_accuracy": "90", "s2c_total": "64", "s2c_n_correct": "57.6", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "33460", "lr": "0.000176934", "gnorm": "5.734", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8300"} 2023-01-29 18:30:04 | INFO | train_inner | {"epoch": 16, "update": 15.486, "s2c_loss": "0.364", "loss": "0.25258", "s2c_nll_loss": "0.364", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "33470", "lr": "0.000176868", "gnorm": "5.073", 
"loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8302"} 2023-01-29 18:30:07 | INFO | train_inner | {"epoch": 16, "update": 15.49, "s2c_loss": "0.395", "loss": "0.27384", "s2c_nll_loss": "0.395", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "33480", "lr": "0.000176801", "gnorm": "5.633", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8305"} 2023-01-29 18:30:09 | INFO | train_inner | {"epoch": 16, "update": 15.495, "s2c_loss": "0.395", "loss": "0.27406", "s2c_nll_loss": "0.395", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "33490", "lr": "0.000176734", "gnorm": "5.965", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8307"} 2023-01-29 18:30:12 | INFO | train_inner | {"epoch": 16, "update": 15.5, "s2c_loss": "0.374", "loss": "0.25902", "s2c_nll_loss": "0.374", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "33500", "lr": "0.000176668", "gnorm": "6.349", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8310"} 2023-01-29 18:30:14 | INFO | train_inner | {"epoch": 16, "update": 15.504, "s2c_loss": "0.373", "loss": "0.25886", "s2c_nll_loss": "0.373", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "33510", "lr": "0.000176601", "gnorm": "6.118", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8312"} 2023-01-29 18:30:17 | INFO | train_inner | {"epoch": 16, "update": 15.509, "s2c_loss": "0.362", "loss": "0.25123", "s2c_nll_loss": "0.362", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "33520", "lr": "0.000176535", 
"gnorm": "5.323", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8315"} 2023-01-29 18:30:19 | INFO | train_inner | {"epoch": 16, "update": 15.513, "s2c_loss": "0.37", "loss": "0.25641", "s2c_nll_loss": "0.37", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "33530", "lr": "0.000176468", "gnorm": "5.745", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8317"} 2023-01-29 18:30:22 | INFO | train_inner | {"epoch": 16, "update": 15.518, "s2c_loss": "0.333", "loss": "0.2309", "s2c_nll_loss": "0.333", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "33540", "lr": "0.000176401", "gnorm": "4.939", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8320"} 2023-01-29 18:30:24 | INFO | train_inner | {"epoch": 16, "update": 15.523, "s2c_loss": "0.34", "loss": "0.23545", "s2c_nll_loss": "0.34", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "33550", "lr": "0.000176335", "gnorm": "5.164", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8322"} 2023-01-29 18:30:27 | INFO | train_inner | {"epoch": 16, "update": 15.527, "s2c_loss": "0.352", "loss": "0.24431", "s2c_nll_loss": "0.352", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "33560", "lr": "0.000176268", "gnorm": "6.336", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8325"} 2023-01-29 18:30:29 | INFO | train_inner | {"epoch": 16, "update": 15.532, "s2c_loss": "0.35", "loss": "0.24264", "s2c_nll_loss": "0.35", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "33570", "lr": 
"0.000176201", "gnorm": "5.793", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8327"} 2023-01-29 18:30:32 | INFO | train_inner | {"epoch": 16, "update": 15.537, "s2c_loss": "0.361", "loss": "0.24988", "s2c_nll_loss": "0.361", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "258.1", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "33580", "lr": "0.000176135", "gnorm": "5.964", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8330"} 2023-01-29 18:30:34 | INFO | train_inner | {"epoch": 16, "update": 15.541, "s2c_loss": "0.296", "loss": "0.20492", "s2c_nll_loss": "0.296", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "33590", "lr": "0.000176068", "gnorm": "4.918", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8332"} 2023-01-29 18:30:37 | INFO | train_inner | {"epoch": 16, "update": 15.546, "s2c_loss": "0.292", "loss": "0.20245", "s2c_nll_loss": "0.292", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "33600", "lr": "0.000176001", "gnorm": "5.053", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8335"} 2023-01-29 18:30:39 | INFO | train_inner | {"epoch": 16, "update": 15.55, "s2c_loss": "0.406", "loss": "0.28168", "s2c_nll_loss": "0.406", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "256.3", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "33610", "lr": "0.000175935", "gnorm": "4.86", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8337"} 2023-01-29 18:30:42 | INFO | train_inner | {"epoch": 16, "update": 15.555, "s2c_loss": "0.461", "loss": "0.31971", "s2c_nll_loss": "0.461", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", 
"num_updates": "33620", "lr": "0.000175868", "gnorm": "6.103", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8340"} 2023-01-29 18:30:44 | INFO | train_inner | {"epoch": 16, "update": 15.56, "s2c_loss": "0.337", "loss": "0.23335", "s2c_nll_loss": "0.337", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "33630", "lr": "0.000175801", "gnorm": "5.215", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8342"} 2023-01-29 18:30:47 | INFO | train_inner | {"epoch": 16, "update": 15.564, "s2c_loss": "0.375", "loss": "0.25969", "s2c_nll_loss": "0.375", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "33640", "lr": "0.000175735", "gnorm": "5.007", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8345"} 2023-01-29 18:30:50 | INFO | train_inner | {"epoch": 16, "update": 15.569, "s2c_loss": "0.289", "loss": "0.2005", "s2c_nll_loss": "0.289", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "33650", "lr": "0.000175668", "gnorm": "4.915", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8347"} 2023-01-29 18:30:52 | INFO | train_inner | {"epoch": 16, "update": 15.574, "s2c_loss": "0.304", "loss": "0.2104", "s2c_nll_loss": "0.304", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "259.4", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "33660", "lr": "0.000175601", "gnorm": "6.344", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8350"} 2023-01-29 18:30:54 | INFO | train_inner | {"epoch": 16, "update": 15.578, "s2c_loss": "0.376", "loss": "0.26043", "s2c_nll_loss": "0.376", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "262", "ups": "4.09", "wpb": 
"64", "bsz": "64", "num_updates": "33670", "lr": "0.000175535", "gnorm": "5.245", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8352"} 2023-01-29 18:30:57 | INFO | train_inner | {"epoch": 16, "update": 15.583, "s2c_loss": "0.253", "loss": "0.17528", "s2c_nll_loss": "0.253", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "33680", "lr": "0.000175468", "gnorm": "4.512", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8355"} 2023-01-29 18:31:00 | INFO | train_inner | {"epoch": 16, "update": 15.587, "s2c_loss": "0.261", "loss": "0.18068", "s2c_nll_loss": "0.261", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "33690", "lr": "0.000175401", "gnorm": "4.419", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8357"} 2023-01-29 18:31:02 | INFO | train_inner | {"epoch": 16, "update": 15.592, "s2c_loss": "0.215", "loss": "0.14929", "s2c_nll_loss": "0.215", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "33700", "lr": "0.000175335", "gnorm": "4.233", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8360"} 2023-01-29 18:31:05 | INFO | train_inner | {"epoch": 16, "update": 15.597, "s2c_loss": "0.37", "loss": "0.25659", "s2c_nll_loss": "0.37", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "33710", "lr": "0.000175268", "gnorm": "4.919", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8363"} 2023-01-29 18:31:07 | INFO | train_inner | {"epoch": 16, "update": 15.601, "s2c_loss": "0.325", "loss": "0.22556", "s2c_nll_loss": "0.325", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "254.8", "ups": 
"3.98", "wpb": "64", "bsz": "64", "num_updates": "33720", "lr": "0.000175201", "gnorm": "6.618", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8365"} 2023-01-29 18:31:10 | INFO | train_inner | {"epoch": 16, "update": 15.606, "s2c_loss": "0.391", "loss": "0.27073", "s2c_nll_loss": "0.391", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "33730", "lr": "0.000175135", "gnorm": "6.502", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8368"} 2023-01-29 18:31:12 | INFO | train_inner | {"epoch": 16, "update": 15.611, "s2c_loss": "0.252", "loss": "0.17458", "s2c_nll_loss": "0.252", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "33740", "lr": "0.000175068", "gnorm": "5.404", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8370"} 2023-01-29 18:31:15 | INFO | train_inner | {"epoch": 16, "update": 15.615, "s2c_loss": "0.322", "loss": "0.22331", "s2c_nll_loss": "0.322", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "33750", "lr": "0.000175001", "gnorm": "4.924", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8373"} 2023-01-29 18:31:17 | INFO | train_inner | {"epoch": 16, "update": 15.62, "s2c_loss": "0.333", "loss": "0.23115", "s2c_nll_loss": "0.333", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "33760", "lr": "0.000174935", "gnorm": "5.281", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8375"} 2023-01-29 18:31:20 | INFO | train_inner | {"epoch": 16, "update": 15.624, "s2c_loss": "0.313", "loss": "0.21665", "s2c_nll_loss": "0.313", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": 
"60.5", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "33770", "lr": "0.000174868", "gnorm": "4.861", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8378"} 2023-01-29 18:31:22 | INFO | train_inner | {"epoch": 16, "update": 15.629, "s2c_loss": "0.284", "loss": "0.19707", "s2c_nll_loss": "0.284", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "33780", "lr": "0.000174801", "gnorm": "4.698", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8380"} 2023-01-29 18:31:25 | INFO | train_inner | {"epoch": 16, "update": 15.634, "s2c_loss": "0.303", "loss": "0.21001", "s2c_nll_loss": "0.303", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "262", "ups": "4.09", "wpb": "64", "bsz": "64", "num_updates": "33790", "lr": "0.000174735", "gnorm": "5.184", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8383"} 2023-01-29 18:31:27 | INFO | train_inner | {"epoch": 16, "update": 15.638, "s2c_loss": "0.389", "loss": "0.26975", "s2c_nll_loss": "0.389", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "33800", "lr": "0.000174668", "gnorm": "5.749", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8385"} 2023-01-29 18:31:30 | INFO | train_inner | {"epoch": 16, "update": 15.643, "s2c_loss": "0.304", "loss": "0.21101", "s2c_nll_loss": "0.304", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "33810", "lr": "0.000174601", "gnorm": "5.592", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8388"} 2023-01-29 18:31:32 | INFO | train_inner | {"epoch": 16, "update": 15.648, "s2c_loss": "0.358", "loss": "0.24797", "s2c_nll_loss": "0.358", "s2c_accuracy": "92.969", "s2c_total": "64", 
"s2c_n_correct": "59.5", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "33820", "lr": "0.000174535", "gnorm": "5.547", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8390"} 2023-01-29 18:31:35 | INFO | train_inner | {"epoch": 16, "update": 15.652, "s2c_loss": "0.448", "loss": "0.3107", "s2c_nll_loss": "0.448", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "33830", "lr": "0.000174468", "gnorm": "5.793", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8393"} 2023-01-29 18:31:37 | INFO | train_inner | {"epoch": 16, "update": 15.657, "s2c_loss": "0.42", "loss": "0.29093", "s2c_nll_loss": "0.42", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "33840", "lr": "0.000174401", "gnorm": "5.573", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8395"} 2023-01-29 18:31:40 | INFO | train_inner | {"epoch": 16, "update": 15.661, "s2c_loss": "0.406", "loss": "0.28129", "s2c_nll_loss": "0.406", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "256.3", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "33850", "lr": "0.000174335", "gnorm": "5.621", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8398"} 2023-01-29 18:31:42 | INFO | train_inner | {"epoch": 16, "update": 15.666, "s2c_loss": "0.395", "loss": "0.27359", "s2c_nll_loss": "0.395", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "33860", "lr": "0.000174268", "gnorm": "5.154", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8400"} 2023-01-29 18:31:45 | INFO | train_inner | {"epoch": 16, "update": 15.671, "s2c_loss": "0.305", "loss": "0.21129", "s2c_nll_loss": "0.305", "s2c_accuracy": 
"95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "33870", "lr": "0.000174201", "gnorm": "4.827", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8403"} 2023-01-29 18:31:47 | INFO | train_inner | {"epoch": 16, "update": 15.675, "s2c_loss": "0.383", "loss": "0.26521", "s2c_nll_loss": "0.383", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "33880", "lr": "0.000174135", "gnorm": "5.678", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8405"} 2023-01-29 18:31:50 | INFO | train_inner | {"epoch": 16, "update": 15.68, "s2c_loss": "0.401", "loss": "0.27782", "s2c_nll_loss": "0.401", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "33890", "lr": "0.000174068", "gnorm": "5.826", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8408"} 2023-01-29 18:31:52 | INFO | train_inner | {"epoch": 16, "update": 15.685, "s2c_loss": "0.43", "loss": "0.29834", "s2c_nll_loss": "0.43", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "33900", "lr": "0.000174001", "gnorm": "5.377", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8410"} 2023-01-29 18:31:55 | INFO | train_inner | {"epoch": 16, "update": 15.689, "s2c_loss": "0.455", "loss": "0.31544", "s2c_nll_loss": "0.455", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "33910", "lr": "0.000173935", "gnorm": "5.716", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8413"} 2023-01-29 18:31:57 | INFO | train_inner | {"epoch": 16, "update": 15.694, "s2c_loss": "0.325", "loss": "0.22517", "s2c_nll_loss": 
"0.325", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "33920", "lr": "0.000173868", "gnorm": "4.996", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8415"} 2023-01-29 18:32:00 | INFO | train_inner | {"epoch": 16, "update": 15.698, "s2c_loss": "0.38", "loss": "0.26333", "s2c_nll_loss": "0.38", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "33930", "lr": "0.000173801", "gnorm": "5.25", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8418"} 2023-01-29 18:32:02 | INFO | train_inner | {"epoch": 16, "update": 15.703, "s2c_loss": "0.324", "loss": "0.22474", "s2c_nll_loss": "0.324", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "33940", "lr": "0.000173735", "gnorm": "4.652", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8420"} 2023-01-29 18:32:05 | INFO | train_inner | {"epoch": 16, "update": 15.708, "s2c_loss": "0.309", "loss": "0.21404", "s2c_nll_loss": "0.309", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "33950", "lr": "0.000173668", "gnorm": "5.019", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8423"} 2023-01-29 18:32:08 | INFO | train_inner | {"epoch": 16, "update": 15.712, "s2c_loss": "0.279", "loss": "0.1936", "s2c_nll_loss": "0.279", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "259.6", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "33960", "lr": "0.000173601", "gnorm": "5.056", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8425"} 2023-01-29 18:32:10 | INFO | train_inner | {"epoch": 16, "update": 15.717, "s2c_loss": "0.335", "loss": "0.23218", 
"s2c_nll_loss": "0.335", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "33970", "lr": "0.000173535", "gnorm": "4.827", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8428"} 2023-01-29 18:32:13 | INFO | train_inner | {"epoch": 16, "update": 15.722, "s2c_loss": "0.367", "loss": "0.25453", "s2c_nll_loss": "0.367", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "33980", "lr": "0.000173468", "gnorm": "5.314", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8430"} 2023-01-29 18:32:15 | INFO | train_inner | {"epoch": 16, "update": 15.726, "s2c_loss": "0.353", "loss": "0.24469", "s2c_nll_loss": "0.353", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "33990", "lr": "0.000173401", "gnorm": "5.588", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8433"} 2023-01-29 18:32:18 | INFO | train_inner | {"epoch": 16, "update": 15.731, "s2c_loss": "0.363", "loss": "0.25154", "s2c_nll_loss": "0.363", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "34000", "lr": "0.000173335", "gnorm": "5.656", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8436"} 2023-01-29 18:32:20 | INFO | train_inner | {"epoch": 16, "update": 15.735, "s2c_loss": "0.33", "loss": "0.22877", "s2c_nll_loss": "0.33", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34010", "lr": "0.000173268", "gnorm": "5.093", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8438"} 2023-01-29 18:32:23 | INFO | train_inner | {"epoch": 16, "update": 15.74, "s2c_loss": "0.222", 
"loss": "0.15359", "s2c_nll_loss": "0.222", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34020", "lr": "0.000173201", "gnorm": "4.463", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8441"} 2023-01-29 18:32:25 | INFO | train_inner | {"epoch": 16, "update": 15.745, "s2c_loss": "0.382", "loss": "0.2648", "s2c_nll_loss": "0.382", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "34030", "lr": "0.000173135", "gnorm": "7.277", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8443"} 2023-01-29 18:32:28 | INFO | train_inner | {"epoch": 16, "update": 15.749, "s2c_loss": "0.36", "loss": "0.24972", "s2c_nll_loss": "0.36", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "258.5", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "34040", "lr": "0.000173068", "gnorm": "6.778", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8446"} 2023-01-29 18:32:30 | INFO | train_inner | {"epoch": 16, "update": 15.754, "s2c_loss": "0.571", "loss": "0.39582", "s2c_nll_loss": "0.571", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "258.4", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "34050", "lr": "0.000173001", "gnorm": "6.57", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8448"} 2023-01-29 18:32:33 | INFO | train_inner | {"epoch": 16, "update": 15.759, "s2c_loss": "0.361", "loss": "0.25008", "s2c_nll_loss": "0.361", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34060", "lr": "0.000172935", "gnorm": "5.305", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8451"} 2023-01-29 18:32:35 | INFO | train_inner | {"epoch": 16, "update": 15.763, 
"s2c_loss": "0.325", "loss": "0.22495", "s2c_nll_loss": "0.325", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34070", "lr": "0.000172868", "gnorm": "4.672", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8453"} 2023-01-29 18:32:38 | INFO | train_inner | {"epoch": 16, "update": 15.768, "s2c_loss": "0.298", "loss": "0.20655", "s2c_nll_loss": "0.298", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "34080", "lr": "0.000172801", "gnorm": "5.221", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8456"} 2023-01-29 18:32:40 | INFO | train_inner | {"epoch": 16, "update": 15.772, "s2c_loss": "0.342", "loss": "0.23688", "s2c_nll_loss": "0.342", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "34090", "lr": "0.000172735", "gnorm": "5.717", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8458"} 2023-01-29 18:32:43 | INFO | train_inner | {"epoch": 16, "update": 15.777, "s2c_loss": "0.368", "loss": "0.25496", "s2c_nll_loss": "0.368", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "34100", "lr": "0.000172668", "gnorm": "6.22", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8461"} 2023-01-29 18:32:45 | INFO | train_inner | {"epoch": 16, "update": 15.782, "s2c_loss": "0.376", "loss": "0.26076", "s2c_nll_loss": "0.376", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "34110", "lr": "0.000172601", "gnorm": "6.844", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8463"} 2023-01-29 18:32:48 | INFO | train_inner | 
{"epoch": 16, "update": 15.786, "s2c_loss": "0.498", "loss": "0.345", "s2c_nll_loss": "0.498", "s2c_accuracy": "90.781", "s2c_total": "64", "s2c_n_correct": "58.1", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "34120", "lr": "0.000172535", "gnorm": "5.837", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8466"} 2023-01-29 18:32:50 | INFO | train_inner | {"epoch": 16, "update": 15.791, "s2c_loss": "0.385", "loss": "0.26653", "s2c_nll_loss": "0.385", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "34130", "lr": "0.000172468", "gnorm": "5.565", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8468"} 2023-01-29 18:32:53 | INFO | train_inner | {"epoch": 16, "update": 15.796, "s2c_loss": "0.277", "loss": "0.19225", "s2c_nll_loss": "0.277", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "34140", "lr": "0.000172401", "gnorm": "5.466", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8471"} 2023-01-29 18:32:55 | INFO | train_inner | {"epoch": 16, "update": 15.8, "s2c_loss": "0.364", "loss": "0.25237", "s2c_nll_loss": "0.364", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "34150", "lr": "0.000172335", "gnorm": "5.672", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8473"} 2023-01-29 18:32:58 | INFO | train_inner | {"epoch": 16, "update": 15.805, "s2c_loss": "0.299", "loss": "0.207", "s2c_nll_loss": "0.299", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "34160", "lr": "0.000172268", "gnorm": "5.148", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8476"} 2023-01-29 18:33:00 | INFO | 
train_inner | {"epoch": 16, "update": 15.809, "s2c_loss": "0.41", "loss": "0.2841", "s2c_nll_loss": "0.41", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "259.4", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "34170", "lr": "0.000172201", "gnorm": "7.685", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8478"} 2023-01-29 18:33:03 | INFO | train_inner | {"epoch": 16, "update": 15.814, "s2c_loss": "0.36", "loss": "0.24949", "s2c_nll_loss": "0.36", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "34180", "lr": "0.000172135", "gnorm": "6.104", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8481"} 2023-01-29 18:33:05 | INFO | train_inner | {"epoch": 16, "update": 15.819, "s2c_loss": "0.442", "loss": "0.30621", "s2c_nll_loss": "0.442", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "34190", "lr": "0.000172068", "gnorm": "6.158", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8483"} 2023-01-29 18:33:08 | INFO | train_inner | {"epoch": 16, "update": 15.823, "s2c_loss": "0.34", "loss": "0.23594", "s2c_nll_loss": "0.34", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "34200", "lr": "0.000172001", "gnorm": "5.617", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8486"} 2023-01-29 18:33:10 | INFO | train_inner | {"epoch": 16, "update": 15.828, "s2c_loss": "0.337", "loss": "0.23329", "s2c_nll_loss": "0.337", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34210", "lr": "0.000171935", "gnorm": "4.986", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8488"} 2023-01-29 
18:33:13 | INFO | train_inner | {"epoch": 16, "update": 15.833, "s2c_loss": "0.374", "loss": "0.25947", "s2c_nll_loss": "0.374", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "260.3", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "34220", "lr": "0.000171868", "gnorm": "6.208", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8491"} 2023-01-29 18:33:15 | INFO | train_inner | {"epoch": 16, "update": 15.837, "s2c_loss": "0.33", "loss": "0.2288", "s2c_nll_loss": "0.33", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34230", "lr": "0.000171801", "gnorm": "5.834", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8493"} 2023-01-29 18:33:18 | INFO | train_inner | {"epoch": 16, "update": 15.842, "s2c_loss": "0.388", "loss": "0.26918", "s2c_nll_loss": "0.388", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "34240", "lr": "0.000171735", "gnorm": "5.745", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8496"} 2023-01-29 18:33:20 | INFO | train_inner | {"epoch": 16, "update": 15.846, "s2c_loss": "0.409", "loss": "0.28343", "s2c_nll_loss": "0.409", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "34250", "lr": "0.000171668", "gnorm": "6.814", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8498"} 2023-01-29 18:33:23 | INFO | train_inner | {"epoch": 16, "update": 15.851, "s2c_loss": "0.52", "loss": "0.36061", "s2c_nll_loss": "0.52", "s2c_accuracy": "91.562", "s2c_total": "64", "s2c_n_correct": "58.6", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "34260", "lr": "0.000171601", "gnorm": "6.568", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": 
"8501"} 2023-01-29 18:33:25 | INFO | train_inner | {"epoch": 16, "update": 15.856, "s2c_loss": "0.421", "loss": "0.29179", "s2c_nll_loss": "0.421", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "34270", "lr": "0.000171535", "gnorm": "7.751", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8503"} 2023-01-29 18:33:28 | INFO | train_inner | {"epoch": 16, "update": 15.86, "s2c_loss": "0.336", "loss": "0.23299", "s2c_nll_loss": "0.336", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "34280", "lr": "0.000171468", "gnorm": "5.716", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8506"} 2023-01-29 18:33:31 | INFO | train_inner | {"epoch": 16, "update": 15.865, "s2c_loss": "0.412", "loss": "0.2854", "s2c_nll_loss": "0.412", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "34290", "lr": "0.000171401", "gnorm": "7.496", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "8508"} 2023-01-29 18:33:33 | INFO | train_inner | {"epoch": 16, "update": 15.87, "s2c_loss": "0.302", "loss": "0.20915", "s2c_nll_loss": "0.302", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "34300", "lr": "0.000171335", "gnorm": "5.138", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8511"} 2023-01-29 18:33:36 | INFO | train_inner | {"epoch": 16, "update": 15.874, "s2c_loss": "0.352", "loss": "0.24421", "s2c_nll_loss": "0.352", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "34310", "lr": "0.000171268", "gnorm": "5.539", "loss_scale": "1024", "train_wall": "3", 
"gb_free": "7.2", "wall": "8514"} 2023-01-29 18:33:38 | INFO | train_inner | {"epoch": 16, "update": 15.879, "s2c_loss": "0.334", "loss": "0.23163", "s2c_nll_loss": "0.334", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "34320", "lr": "0.000171201", "gnorm": "5.26", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8516"} 2023-01-29 18:33:41 | INFO | train_inner | {"epoch": 16, "update": 15.883, "s2c_loss": "0.601", "loss": "0.41634", "s2c_nll_loss": "0.601", "s2c_accuracy": "88.594", "s2c_total": "64", "s2c_n_correct": "56.7", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "34330", "lr": "0.000171135", "gnorm": "6.911", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8519"} 2023-01-29 18:33:43 | INFO | train_inner | {"epoch": 16, "update": 15.888, "s2c_loss": "0.305", "loss": "0.21111", "s2c_nll_loss": "0.305", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34340", "lr": "0.000171068", "gnorm": "5.665", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8521"} 2023-01-29 18:33:46 | INFO | train_inner | {"epoch": 16, "update": 15.893, "s2c_loss": "0.366", "loss": "0.25396", "s2c_nll_loss": "0.366", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "258.5", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "34350", "lr": "0.000171001", "gnorm": "5.289", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8524"} 2023-01-29 18:33:48 | INFO | train_inner | {"epoch": 16, "update": 15.897, "s2c_loss": "0.422", "loss": "0.29242", "s2c_nll_loss": "0.422", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "34360", "lr": "0.000170935", "gnorm": "5.44", "loss_scale": 
"1024", "train_wall": "2", "gb_free": "7.2", "wall": "8526"} 2023-01-29 18:33:51 | INFO | train_inner | {"epoch": 16, "update": 15.902, "s2c_loss": "0.35", "loss": "0.24269", "s2c_nll_loss": "0.35", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "34370", "lr": "0.000170868", "gnorm": "5.43", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8529"} 2023-01-29 18:33:53 | INFO | train_inner | {"epoch": 16, "update": 15.907, "s2c_loss": "0.374", "loss": "0.25911", "s2c_nll_loss": "0.374", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "34380", "lr": "0.000170801", "gnorm": "5.112", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "8531"} 2023-01-29 18:33:56 | INFO | train_inner | {"epoch": 16, "update": 15.911, "s2c_loss": "0.373", "loss": "0.25827", "s2c_nll_loss": "0.373", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "34390", "lr": "0.000170735", "gnorm": "6.199", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8534"} 2023-01-29 18:33:58 | INFO | train_inner | {"epoch": 16, "update": 15.916, "s2c_loss": "0.487", "loss": "0.33748", "s2c_nll_loss": "0.487", "s2c_accuracy": "90.312", "s2c_total": "64", "s2c_n_correct": "57.8", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "34400", "lr": "0.000170668", "gnorm": "6.598", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8536"} 2023-01-29 18:34:01 | INFO | train_inner | {"epoch": 16, "update": 15.92, "s2c_loss": "0.369", "loss": "0.25557", "s2c_nll_loss": "0.369", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34410", "lr": "0.000170601", "gnorm": 
"4.883", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8539"} 2023-01-29 18:34:03 | INFO | train_inner | {"epoch": 16, "update": 15.925, "s2c_loss": "0.473", "loss": "0.32798", "s2c_nll_loss": "0.473", "s2c_accuracy": "91.25", "s2c_total": "64", "s2c_n_correct": "58.4", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34420", "lr": "0.000170535", "gnorm": "6.634", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8541"} 2023-01-29 18:34:06 | INFO | train_inner | {"epoch": 16, "update": 15.93, "s2c_loss": "0.328", "loss": "0.22702", "s2c_nll_loss": "0.328", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34430", "lr": "0.000170468", "gnorm": "5.031", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8544"} 2023-01-29 18:34:08 | INFO | train_inner | {"epoch": 16, "update": 15.934, "s2c_loss": "0.336", "loss": "0.23294", "s2c_nll_loss": "0.336", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "34440", "lr": "0.000170401", "gnorm": "5.369", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8546"} 2023-01-29 18:34:11 | INFO | train_inner | {"epoch": 16, "update": 15.939, "s2c_loss": "0.382", "loss": "0.26487", "s2c_nll_loss": "0.382", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "34450", "lr": "0.000170335", "gnorm": "5.976", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8549"} 2023-01-29 18:34:13 | INFO | train_inner | {"epoch": 16, "update": 15.944, "s2c_loss": "0.366", "loss": "0.25344", "s2c_nll_loss": "0.366", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "34460", "lr": 
"0.000170268", "gnorm": "5.752", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8551"} 2023-01-29 18:34:16 | INFO | train_inner | {"epoch": 16, "update": 15.948, "s2c_loss": "0.354", "loss": "0.24566", "s2c_nll_loss": "0.354", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "34470", "lr": "0.000170201", "gnorm": "5.765", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8554"} 2023-01-29 18:34:18 | INFO | train_inner | {"epoch": 16, "update": 15.953, "s2c_loss": "0.419", "loss": "0.29073", "s2c_nll_loss": "0.419", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "34480", "lr": "0.000170135", "gnorm": "5.679", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8556"} 2023-01-29 18:34:21 | INFO | train_inner | {"epoch": 16, "update": 15.957, "s2c_loss": "0.383", "loss": "0.26513", "s2c_nll_loss": "0.383", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "34490", "lr": "0.000170068", "gnorm": "6.126", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8559"} 2023-01-29 18:34:24 | INFO | train_inner | {"epoch": 16, "update": 15.962, "s2c_loss": "0.424", "loss": "0.2936", "s2c_nll_loss": "0.424", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "34500", "lr": "0.000170002", "gnorm": "5.297", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8561"} 2023-01-29 18:34:26 | INFO | train_inner | {"epoch": 16, "update": 15.967, "s2c_loss": "0.276", "loss": "0.19116", "s2c_nll_loss": "0.276", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": 
"34510", "lr": "0.000169935", "gnorm": "5.08", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8564"} 2023-01-29 18:34:29 | INFO | train_inner | {"epoch": 16, "update": 15.971, "s2c_loss": "0.383", "loss": "0.26574", "s2c_nll_loss": "0.383", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "34520", "lr": "0.000169868", "gnorm": "5.692", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8567"} 2023-01-29 18:34:31 | INFO | train_inner | {"epoch": 16, "update": 15.976, "s2c_loss": "0.483", "loss": "0.33472", "s2c_nll_loss": "0.483", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "34530", "lr": "0.000169802", "gnorm": "6.251", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8569"} 2023-01-29 18:34:34 | INFO | train_inner | {"epoch": 16, "update": 15.981, "s2c_loss": "0.443", "loss": "0.30705", "s2c_nll_loss": "0.443", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34540", "lr": "0.000169735", "gnorm": "6.029", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8572"} 2023-01-29 18:34:36 | INFO | train_inner | {"epoch": 16, "update": 15.985, "s2c_loss": "0.414", "loss": "0.28675", "s2c_nll_loss": "0.414", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "34550", "lr": "0.000169668", "gnorm": "5.607", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8574"} 2023-01-29 18:34:39 | INFO | train_inner | {"epoch": 16, "update": 15.99, "s2c_loss": "0.383", "loss": "0.26522", "s2c_nll_loss": "0.383", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": 
"64", "num_updates": "34560", "lr": "0.000169602", "gnorm": "5.208", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8577"} 2023-01-29 18:34:41 | INFO | train_inner | {"epoch": 16, "update": 15.994, "s2c_loss": "0.372", "loss": "0.25789", "s2c_nll_loss": "0.372", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "34570", "lr": "0.000169535", "gnorm": "5.749", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8579"} 2023-01-29 18:34:44 | INFO | train_inner | {"epoch": 16, "update": 15.999, "s2c_loss": "0.388", "loss": "0.26901", "s2c_nll_loss": "0.388", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "34580", "lr": "0.000169468", "gnorm": "5.69", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8582"} 2023-01-29 18:34:44 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 18:34:59 | INFO | valid | {"epoch": 16, "valid_s2c_loss": "0.824", "valid_loss": "0.57106", "valid_s2c_nll_loss": "0.824", "valid_s2c_accuracy": "85.427", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "27.3009", "valid_num_updates": "34582", "valid_best_s2c_accuracy": "85.745"} 2023-01-29 18:34:59 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 16 @ 34582 updates 2023-01-29 18:34:59 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 18:35:06 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 18:35:06 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 16 @ 34582 updates, score 85.427) (writing took 7.10306018171832 seconds) 2023-01-29 18:35:06 | INFO | fairseq_cli.train | 
end of epoch 16 (average epoch stats below) 2023-01-29 18:35:06 | INFO | train | {"epoch": 16, "train_s2c_loss": "0.357", "train_loss": "0.24735", "train_s2c_nll_loss": "0.357", "train_s2c_accuracy": "93.62", "train_s2c_total": "63.9838", "train_s2c_n_correct": "59.9019", "train_wps": "241.6", "train_ups": "3.78", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "34582", "train_lr": "0.000169455", "train_gnorm": "5.51", "train_loss_scale": "1024", "train_train_wall": "538", "train_gb_free": "7.5", "train_wall": "8604"} 2023-01-29 18:35:12 | INFO | fairseq.trainer | begin training epoch 17 2023-01-29 18:35:12 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 18:35:14 | INFO | train_inner | {"epoch": 17, "update": 16.004, "s2c_loss": "0.374", "loss": "0.25902", "s2c_nll_loss": "0.374", "s2c_accuracy": "93.421", "s2c_total": "60.8", "s2c_n_correct": "56.8", "wps": "20", "ups": "0.33", "wpb": "60.8", "bsz": "60.8", "num_updates": "34590", "lr": "0.000169402", "gnorm": "6.191", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8612"} 2023-01-29 18:35:17 | INFO | train_inner | {"epoch": 17, "update": 16.008, "s2c_loss": "0.312", "loss": "0.21652", "s2c_nll_loss": "0.312", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "34600", "lr": "0.000169335", "gnorm": "4.895", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8615"} 2023-01-29 18:35:19 | INFO | train_inner | {"epoch": 17, "update": 16.013, "s2c_loss": "0.288", "loss": "0.19937", "s2c_nll_loss": "0.288", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "34610", "lr": "0.000169268", "gnorm": "4.779", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8617"} 2023-01-29 18:35:22 | INFO | train_inner | {"epoch": 17, "update": 16.018, "s2c_loss": "0.237", 
"loss": "0.16405", "s2c_nll_loss": "0.237", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34620", "lr": "0.000169202", "gnorm": "4.268", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8620"} 2023-01-29 18:35:24 | INFO | train_inner | {"epoch": 17, "update": 16.022, "s2c_loss": "0.335", "loss": "0.23227", "s2c_nll_loss": "0.335", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "34630", "lr": "0.000169135", "gnorm": "4.813", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8622"} 2023-01-29 18:35:27 | INFO | train_inner | {"epoch": 17, "update": 16.027, "s2c_loss": "0.313", "loss": "0.21673", "s2c_nll_loss": "0.313", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "34640", "lr": "0.000169068", "gnorm": "4.698", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8625"} 2023-01-29 18:35:30 | INFO | train_inner | {"epoch": 17, "update": 16.031, "s2c_loss": "0.306", "loss": "0.21214", "s2c_nll_loss": "0.306", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "34650", "lr": "0.000169002", "gnorm": "5.263", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8627"} 2023-01-29 18:35:32 | INFO | train_inner | {"epoch": 17, "update": 16.036, "s2c_loss": "0.379", "loss": "0.26283", "s2c_nll_loss": "0.379", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "34660", "lr": "0.000168935", "gnorm": "6.032", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8630"} 2023-01-29 18:35:35 | INFO | train_inner | {"epoch": 17, "update": 16.041, 
"s2c_loss": "0.373", "loss": "0.25864", "s2c_nll_loss": "0.373", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "34670", "lr": "0.000168868", "gnorm": "5.424", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8632"} 2023-01-29 18:35:37 | INFO | train_inner | {"epoch": 17, "update": 16.045, "s2c_loss": "0.306", "loss": "0.21241", "s2c_nll_loss": "0.306", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "34680", "lr": "0.000168802", "gnorm": "4.672", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8635"} 2023-01-29 18:35:40 | INFO | train_inner | {"epoch": 17, "update": 16.05, "s2c_loss": "0.344", "loss": "0.2382", "s2c_nll_loss": "0.344", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "34690", "lr": "0.000168735", "gnorm": "5.955", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8638"} 2023-01-29 18:35:42 | INFO | train_inner | {"epoch": 17, "update": 16.055, "s2c_loss": "0.369", "loss": "0.25601", "s2c_nll_loss": "0.369", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "34700", "lr": "0.000168668", "gnorm": "5.498", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.5", "wall": "8640"} 2023-01-29 18:35:45 | INFO | train_inner | {"epoch": 17, "update": 16.059, "s2c_loss": "0.33", "loss": "0.22865", "s2c_nll_loss": "0.33", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34710", "lr": "0.000168602", "gnorm": "4.63", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8643"} 2023-01-29 18:35:47 | INFO | train_inner | {"epoch": 17, 
"update": 16.064, "s2c_loss": "0.299", "loss": "0.20751", "s2c_nll_loss": "0.299", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "34720", "lr": "0.000168535", "gnorm": "4.773", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8645"} 2023-01-29 18:35:50 | INFO | train_inner | {"epoch": 17, "update": 16.068, "s2c_loss": "0.241", "loss": "0.167", "s2c_nll_loss": "0.241", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "34730", "lr": "0.000168468", "gnorm": "4.823", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8648"} 2023-01-29 18:35:52 | INFO | train_inner | {"epoch": 17, "update": 16.073, "s2c_loss": "0.307", "loss": "0.21293", "s2c_nll_loss": "0.307", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34740", "lr": "0.000168402", "gnorm": "4.967", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8650"} 2023-01-29 18:35:55 | INFO | train_inner | {"epoch": 17, "update": 16.078, "s2c_loss": "0.35", "loss": "0.24239", "s2c_nll_loss": "0.35", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "34750", "lr": "0.000168335", "gnorm": "5.764", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8653"} 2023-01-29 18:35:57 | INFO | train_inner | {"epoch": 17, "update": 16.082, "s2c_loss": "0.392", "loss": "0.27146", "s2c_nll_loss": "0.392", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "34760", "lr": "0.000168268", "gnorm": "6.3", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8655"} 2023-01-29 18:36:00 | INFO | 
train_inner | {"epoch": 17, "update": 16.087, "s2c_loss": "0.268", "loss": "0.18579", "s2c_nll_loss": "0.268", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "34770", "lr": "0.000168202", "gnorm": "4.309", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8658"} 2023-01-29 18:36:02 | INFO | train_inner | {"epoch": 17, "update": 16.092, "s2c_loss": "0.268", "loss": "0.18569", "s2c_nll_loss": "0.268", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "34780", "lr": "0.000168135", "gnorm": "4.996", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8660"} 2023-01-29 18:36:05 | INFO | train_inner | {"epoch": 17, "update": 16.096, "s2c_loss": "0.282", "loss": "0.19546", "s2c_nll_loss": "0.282", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "34790", "lr": "0.000168068", "gnorm": "4.681", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8663"} 2023-01-29 18:36:07 | INFO | train_inner | {"epoch": 17, "update": 16.101, "s2c_loss": "0.69", "loss": "0.47802", "s2c_nll_loss": "0.69", "s2c_accuracy": "91.094", "s2c_total": "64", "s2c_n_correct": "58.3", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "34800", "lr": "0.000168002", "gnorm": "4.98", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8665"} 2023-01-29 18:36:10 | INFO | train_inner | {"epoch": 17, "update": 16.105, "s2c_loss": "0.422", "loss": "0.29227", "s2c_nll_loss": "0.422", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "34810", "lr": "0.000167935", "gnorm": "4.724", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8668"} 2023-01-29 
18:36:12 | INFO | train_inner | {"epoch": 17, "update": 16.11, "s2c_loss": "0.291", "loss": "0.20196", "s2c_nll_loss": "0.291", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "34820", "lr": "0.000167868", "gnorm": "5.021", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8670"} 2023-01-29 18:36:15 | INFO | train_inner | {"epoch": 17, "update": 16.115, "s2c_loss": "0.219", "loss": "0.15165", "s2c_nll_loss": "0.219", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "34830", "lr": "0.000167802", "gnorm": "4.485", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8673"} 2023-01-29 18:36:18 | INFO | train_inner | {"epoch": 17, "update": 16.119, "s2c_loss": "0.294", "loss": "0.20357", "s2c_nll_loss": "0.294", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "34840", "lr": "0.000167735", "gnorm": "5.268", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8675"} 2023-01-29 18:36:20 | INFO | train_inner | {"epoch": 17, "update": 16.124, "s2c_loss": "0.371", "loss": "0.25713", "s2c_nll_loss": "0.371", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "34850", "lr": "0.000167668", "gnorm": "4.832", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8678"} 2023-01-29 18:36:23 | INFO | train_inner | {"epoch": 17, "update": 16.129, "s2c_loss": "0.369", "loss": "0.25588", "s2c_nll_loss": "0.369", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "34860", "lr": "0.000167602", "gnorm": "5.498", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": 
"8680"} 2023-01-29 18:36:25 | INFO | train_inner | {"epoch": 17, "update": 16.133, "s2c_loss": "0.351", "loss": "0.24298", "s2c_nll_loss": "0.351", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "34870", "lr": "0.000167535", "gnorm": "5.947", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8683"} 2023-01-29 18:36:28 | INFO | train_inner | {"epoch": 17, "update": 16.138, "s2c_loss": "0.459", "loss": "0.31806", "s2c_nll_loss": "0.459", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "34880", "lr": "0.000167468", "gnorm": "6.289", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8686"} 2023-01-29 18:36:30 | INFO | train_inner | {"epoch": 17, "update": 16.142, "s2c_loss": "0.317", "loss": "0.21966", "s2c_nll_loss": "0.317", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "255", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "34890", "lr": "0.000167402", "gnorm": "4.924", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8688"} 2023-01-29 18:36:33 | INFO | train_inner | {"epoch": 17, "update": 16.147, "s2c_loss": "0.316", "loss": "0.21895", "s2c_nll_loss": "0.316", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "34900", "lr": "0.000167335", "gnorm": "5.152", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8691"} 2023-01-29 18:36:35 | INFO | train_inner | {"epoch": 17, "update": 16.152, "s2c_loss": "0.341", "loss": "0.23627", "s2c_nll_loss": "0.341", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "34910", "lr": "0.000167268", "gnorm": "6.043", "loss_scale": "1024", "train_wall": "2", 
"gb_free": "7.3", "wall": "8693"} 2023-01-29 18:36:38 | INFO | train_inner | {"epoch": 17, "update": 16.156, "s2c_loss": "0.331", "loss": "0.22917", "s2c_nll_loss": "0.331", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "34920", "lr": "0.000167202", "gnorm": "5.781", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8696"} 2023-01-29 18:36:40 | INFO | train_inner | {"epoch": 17, "update": 16.161, "s2c_loss": "0.299", "loss": "0.20739", "s2c_nll_loss": "0.299", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "34930", "lr": "0.000167135", "gnorm": "5.006", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8698"} 2023-01-29 18:36:43 | INFO | train_inner | {"epoch": 17, "update": 16.166, "s2c_loss": "0.239", "loss": "0.16558", "s2c_nll_loss": "0.239", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "34940", "lr": "0.000167068", "gnorm": "4.807", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8701"} 2023-01-29 18:36:45 | INFO | train_inner | {"epoch": 17, "update": 16.17, "s2c_loss": "0.568", "loss": "0.3937", "s2c_nll_loss": "0.568", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "34950", "lr": "0.000167002", "gnorm": "4.941", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8703"} 2023-01-29 18:36:48 | INFO | train_inner | {"epoch": 17, "update": 16.175, "s2c_loss": "0.273", "loss": "0.1892", "s2c_nll_loss": "0.273", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "34960", "lr": "0.000166935", "gnorm": "4.469", "loss_scale": "1024", 
"train_wall": "2", "gb_free": "7.3", "wall": "8706"} 2023-01-29 18:36:50 | INFO | train_inner | {"epoch": 17, "update": 16.179, "s2c_loss": "0.284", "loss": "0.19652", "s2c_nll_loss": "0.284", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "34970", "lr": "0.000166868", "gnorm": "5.13", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8708"} 2023-01-29 18:36:53 | INFO | train_inner | {"epoch": 17, "update": 16.184, "s2c_loss": "0.438", "loss": "0.30331", "s2c_nll_loss": "0.438", "s2c_accuracy": "92.656", "s2c_total": "64", "s2c_n_correct": "59.3", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "34980", "lr": "0.000166802", "gnorm": "4.522", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8711"} 2023-01-29 18:36:55 | INFO | train_inner | {"epoch": 17, "update": 16.189, "s2c_loss": "0.243", "loss": "0.16846", "s2c_nll_loss": "0.243", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "34990", "lr": "0.000166735", "gnorm": "4.516", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8713"} 2023-01-29 18:36:58 | INFO | train_inner | {"epoch": 17, "update": 16.193, "s2c_loss": "0.287", "loss": "0.19864", "s2c_nll_loss": "0.287", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "247.5", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "35000", "lr": "0.000166668", "gnorm": "4.366", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8716"} 2023-01-29 18:37:01 | INFO | train_inner | {"epoch": 17, "update": 16.198, "s2c_loss": "0.309", "loss": "0.21433", "s2c_nll_loss": "0.309", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "245.7", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "35010", "lr": "0.000166602", "gnorm": "5.138", 
"loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "8718"} 2023-01-29 18:37:03 | INFO | train_inner | {"epoch": 17, "update": 16.203, "s2c_loss": "0.368", "loss": "0.25483", "s2c_nll_loss": "0.368", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "35020", "lr": "0.000166535", "gnorm": "5.139", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8721"} 2023-01-29 18:37:06 | INFO | train_inner | {"epoch": 17, "update": 16.207, "s2c_loss": "0.339", "loss": "0.23516", "s2c_nll_loss": "0.339", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "35030", "lr": "0.000166468", "gnorm": "4.85", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8724"} 2023-01-29 18:37:08 | INFO | train_inner | {"epoch": 17, "update": 16.212, "s2c_loss": "0.322", "loss": "0.22302", "s2c_nll_loss": "0.322", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "35040", "lr": "0.000166402", "gnorm": "6.108", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8726"} 2023-01-29 18:37:11 | INFO | train_inner | {"epoch": 17, "update": 16.216, "s2c_loss": "0.251", "loss": "0.17388", "s2c_nll_loss": "0.251", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "35050", "lr": "0.000166335", "gnorm": "4.673", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8729"} 2023-01-29 18:37:13 | INFO | train_inner | {"epoch": 17, "update": 16.221, "s2c_loss": "0.288", "loss": "0.19987", "s2c_nll_loss": "0.288", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "259.1", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "35060", "lr": 
"0.000166268", "gnorm": "4.43", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8731"} 2023-01-29 18:37:16 | INFO | train_inner | {"epoch": 17, "update": 16.226, "s2c_loss": "0.274", "loss": "0.18968", "s2c_nll_loss": "0.274", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "35070", "lr": "0.000166202", "gnorm": "4.239", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8734"} 2023-01-29 18:37:18 | INFO | train_inner | {"epoch": 17, "update": 16.23, "s2c_loss": "0.306", "loss": "0.21233", "s2c_nll_loss": "0.306", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "35080", "lr": "0.000166135", "gnorm": "4.936", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8736"} 2023-01-29 18:37:21 | INFO | train_inner | {"epoch": 17, "update": 16.235, "s2c_loss": "0.407", "loss": "0.28218", "s2c_nll_loss": "0.407", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "35090", "lr": "0.000166068", "gnorm": "5.041", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8739"} 2023-01-29 18:37:23 | INFO | train_inner | {"epoch": 17, "update": 16.24, "s2c_loss": "0.267", "loss": "0.185", "s2c_nll_loss": "0.267", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "35100", "lr": "0.000166002", "gnorm": "5.104", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8741"} 2023-01-29 18:37:26 | INFO | train_inner | {"epoch": 17, "update": 16.244, "s2c_loss": "0.545", "loss": "0.37769", "s2c_nll_loss": "0.545", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": 
"35110", "lr": "0.000165935", "gnorm": "5.315", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8744"} 2023-01-29 18:37:28 | INFO | train_inner | {"epoch": 17, "update": 16.249, "s2c_loss": "0.269", "loss": "0.18659", "s2c_nll_loss": "0.269", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "35120", "lr": "0.000165868", "gnorm": "5.184", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8746"} 2023-01-29 18:37:31 | INFO | train_inner | {"epoch": 17, "update": 16.253, "s2c_loss": "0.356", "loss": "0.24659", "s2c_nll_loss": "0.356", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "35130", "lr": "0.000165802", "gnorm": "6.019", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8749"} 2023-01-29 18:37:33 | INFO | train_inner | {"epoch": 17, "update": 16.258, "s2c_loss": "0.321", "loss": "0.22283", "s2c_nll_loss": "0.321", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "35140", "lr": "0.000165735", "gnorm": "5.723", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8751"} 2023-01-29 18:37:36 | INFO | train_inner | {"epoch": 17, "update": 16.263, "s2c_loss": "0.242", "loss": "0.168", "s2c_nll_loss": "0.242", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "35150", "lr": "0.000165668", "gnorm": "4.701", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "8754"} 2023-01-29 18:37:38 | INFO | train_inner | {"epoch": 17, "update": 16.267, "s2c_loss": "0.326", "loss": "0.22631", "s2c_nll_loss": "0.326", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "258.7", "ups": "4.04", "wpb": "64", "bsz": "64", 
"num_updates": "35160", "lr": "0.000165602", "gnorm": "5.489", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "8756"} 2023-01-29 18:37:41 | INFO | train_inner | {"epoch": 17, "update": 16.272, "s2c_loss": "0.258", "loss": "0.17864", "s2c_nll_loss": "0.258", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "35170", "lr": "0.000165535", "gnorm": "4.96", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "8759"} 2023-01-29 18:37:43 | INFO | train_inner | {"epoch": 17, "update": 16.277, "s2c_loss": "0.263", "loss": "0.1824", "s2c_nll_loss": "0.263", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "35180", "lr": "0.000165468", "gnorm": "5.066", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "8761"} 2023-01-29 18:37:46 | INFO | train_inner | {"epoch": 17, "update": 16.281, "s2c_loss": "0.327", "loss": "0.22642", "s2c_nll_loss": "0.327", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "35190", "lr": "0.000165402", "gnorm": "4.975", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "8764"} 2023-01-29 18:37:49 | INFO | train_inner | {"epoch": 17, "update": 16.286, "s2c_loss": "0.266", "loss": "0.18412", "s2c_nll_loss": "0.266", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "35200", "lr": "0.000165335", "gnorm": "4.591", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "8766"} 2023-01-29 18:37:51 | INFO | train_inner | {"epoch": 17, "update": 16.29, "s2c_loss": "0.286", "loss": "0.19846", "s2c_nll_loss": "0.286", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "252.8", "ups": "3.95", 
"wpb": "64", "bsz": "64", "num_updates": "35210", "lr": "0.000165268", "gnorm": "4.399", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "8769"} 2023-01-29 18:37:54 | INFO | train_inner | {"epoch": 17, "update": 16.295, "s2c_loss": "0.32", "loss": "0.22207", "s2c_nll_loss": "0.32", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "35220", "lr": "0.000165202", "gnorm": "6.053", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "8772"} 2023-01-29 18:37:56 | INFO | train_inner | {"epoch": 17, "update": 16.3, "s2c_loss": "0.352", "loss": "0.24366", "s2c_nll_loss": "0.352", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "35230", "lr": "0.000165135", "gnorm": "4.963", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "8774"} 2023-01-29 18:37:59 | INFO | train_inner | {"epoch": 17, "update": 16.304, "s2c_loss": "0.255", "loss": "0.17683", "s2c_nll_loss": "0.255", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "35240", "lr": "0.000165068", "gnorm": "4.409", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "8777"} 2023-01-29 18:38:01 | INFO | train_inner | {"epoch": 17, "update": 16.309, "s2c_loss": "0.335", "loss": "0.23198", "s2c_nll_loss": "0.335", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "35250", "lr": "0.000165002", "gnorm": "5.228", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "8779"} 2023-01-29 18:38:04 | INFO | train_inner | {"epoch": 17, "update": 16.314, "s2c_loss": "0.235", "loss": "0.16288", "s2c_nll_loss": "0.235", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "253.2", 
"ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "35260", "lr": "0.000164935", "gnorm": "4.274", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "8782"} 2023-01-29 18:38:06 | INFO | train_inner | {"epoch": 17, "update": 16.318, "s2c_loss": "0.312", "loss": "0.21617", "s2c_nll_loss": "0.312", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "35270", "lr": "0.000164868", "gnorm": "4.91", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "8784"} 2023-01-29 18:38:09 | INFO | train_inner | {"epoch": 17, "update": 16.323, "s2c_loss": "0.398", "loss": "0.27604", "s2c_nll_loss": "0.398", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "35280", "lr": "0.000164802", "gnorm": "5.208", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "8787"} 2023-01-29 18:38:11 | INFO | train_inner | {"epoch": 17, "update": 16.327, "s2c_loss": "0.257", "loss": "0.17828", "s2c_nll_loss": "0.257", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "35290", "lr": "0.000164735", "gnorm": "4.576", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "8789"} 2023-01-29 18:38:14 | INFO | train_inner | {"epoch": 17, "update": 16.332, "s2c_loss": "0.184", "loss": "0.12783", "s2c_nll_loss": "0.184", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "35300", "lr": "0.000164668", "gnorm": "4.387", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "8792"} 2023-01-29 18:38:15 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 1024.0 2023-01-29 18:38:17 | INFO | train_inner | {"epoch": 17, "update": 16.337, 
"s2c_loss": "0.287", "loss": "0.19914", "s2c_nll_loss": "0.287", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "230.4", "ups": "3.6", "wpb": "64", "bsz": "64", "num_updates": "35310", "lr": "0.000164602", "gnorm": "5.149", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8794"} 2023-01-29 18:38:19 | INFO | train_inner | {"epoch": 17, "update": 16.342, "s2c_loss": "0.291", "loss": "0.20204", "s2c_nll_loss": "0.291", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "35320", "lr": "0.000164535", "gnorm": "4.976", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8797"} 2023-01-29 18:38:21 | INFO | train_inner | {"epoch": 17, "update": 16.346, "s2c_loss": "0.413", "loss": "0.2865", "s2c_nll_loss": "0.413", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "261.4", "ups": "4.08", "wpb": "64", "bsz": "64", "num_updates": "35330", "lr": "0.000164468", "gnorm": "5.282", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8799"} 2023-01-29 18:38:24 | INFO | train_inner | {"epoch": 17, "update": 16.351, "s2c_loss": "0.264", "loss": "0.183", "s2c_nll_loss": "0.264", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "35340", "lr": "0.000164402", "gnorm": "4.275", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8802"} 2023-01-29 18:38:27 | INFO | train_inner | {"epoch": 17, "update": 16.356, "s2c_loss": "0.271", "loss": "0.18805", "s2c_nll_loss": "0.271", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "35350", "lr": "0.000164335", "gnorm": "4.491", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8804"} 2023-01-29 18:38:29 | INFO | train_inner | {"epoch": 17, 
"update": 16.36, "s2c_loss": "0.294", "loss": "0.20392", "s2c_nll_loss": "0.294", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "35360", "lr": "0.000164268", "gnorm": "4.347", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8807"} 2023-01-29 18:38:32 | INFO | train_inner | {"epoch": 17, "update": 16.365, "s2c_loss": "0.289", "loss": "0.20006", "s2c_nll_loss": "0.289", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "35370", "lr": "0.000164202", "gnorm": "5.294", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8809"} 2023-01-29 18:38:34 | INFO | train_inner | {"epoch": 17, "update": 16.37, "s2c_loss": "0.266", "loss": "0.18406", "s2c_nll_loss": "0.266", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "35380", "lr": "0.000164135", "gnorm": "4.419", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8812"} 2023-01-29 18:38:37 | INFO | train_inner | {"epoch": 17, "update": 16.374, "s2c_loss": "0.264", "loss": "0.18277", "s2c_nll_loss": "0.264", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "35390", "lr": "0.000164068", "gnorm": "4.537", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "8815"} 2023-01-29 18:38:39 | INFO | train_inner | {"epoch": 17, "update": 16.379, "s2c_loss": "0.319", "loss": "0.22109", "s2c_nll_loss": "0.319", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "35400", "lr": "0.000164002", "gnorm": "5.572", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8817"} 2023-01-29 18:38:42 | INFO | train_inner 
| {"epoch": 17, "update": 16.383, "s2c_loss": "0.348", "loss": "0.24094", "s2c_nll_loss": "0.348", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "35410", "lr": "0.000163935", "gnorm": "4.888", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8820"} 2023-01-29 18:38:44 | INFO | train_inner | {"epoch": 17, "update": 16.388, "s2c_loss": "0.259", "loss": "0.17971", "s2c_nll_loss": "0.259", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "35420", "lr": "0.000163868", "gnorm": "5.173", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8822"} 2023-01-29 18:38:47 | INFO | train_inner | {"epoch": 17, "update": 16.393, "s2c_loss": "0.286", "loss": "0.19834", "s2c_nll_loss": "0.286", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "35430", "lr": "0.000163802", "gnorm": "5.117", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8825"} 2023-01-29 18:38:49 | INFO | train_inner | {"epoch": 17, "update": 16.397, "s2c_loss": "0.34", "loss": "0.23595", "s2c_nll_loss": "0.34", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "35440", "lr": "0.000163735", "gnorm": "5.523", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "8827"} 2023-01-29 18:38:52 | INFO | train_inner | {"epoch": 17, "update": 16.402, "s2c_loss": "0.243", "loss": "0.16825", "s2c_nll_loss": "0.243", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "35450", "lr": "0.000163668", "gnorm": "4.47", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "8830"} 2023-01-29 18:38:54 | 
INFO | train_inner | {"epoch": 17, "update": 16.407, "s2c_loss": "0.3", "loss": "0.20773", "s2c_nll_loss": "0.3", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "35460", "lr": "0.000163602", "gnorm": "5.739", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8832"} 2023-01-29 18:38:57 | INFO | train_inner | {"epoch": 17, "update": 16.411, "s2c_loss": "0.258", "loss": "0.17887", "s2c_nll_loss": "0.258", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "35470", "lr": "0.000163535", "gnorm": "5.499", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8835"} 2023-01-29 18:38:59 | INFO | train_inner | {"epoch": 17, "update": 16.416, "s2c_loss": "0.286", "loss": "0.19833", "s2c_nll_loss": "0.286", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "35480", "lr": "0.000163468", "gnorm": "5.923", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8837"} 2023-01-29 18:39:02 | INFO | train_inner | {"epoch": 17, "update": 16.42, "s2c_loss": "0.33", "loss": "0.22872", "s2c_nll_loss": "0.33", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "35490", "lr": "0.000163402", "gnorm": "5.477", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8840"} 2023-01-29 18:39:04 | INFO | train_inner | {"epoch": 17, "update": 16.425, "s2c_loss": "0.345", "loss": "0.2394", "s2c_nll_loss": "0.345", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "35500", "lr": "0.000163335", "gnorm": "5.215", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8842"} 2023-01-29 
18:39:07 | INFO | train_inner | {"epoch": 17, "update": 16.43, "s2c_loss": "0.282", "loss": "0.19519", "s2c_nll_loss": "0.282", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "35510", "lr": "0.000163269", "gnorm": "4.833", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8845"} 2023-01-29 18:39:10 | INFO | train_inner | {"epoch": 17, "update": 16.434, "s2c_loss": "0.249", "loss": "0.17284", "s2c_nll_loss": "0.249", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "35520", "lr": "0.000163202", "gnorm": "4.888", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8847"} 2023-01-29 18:39:12 | INFO | train_inner | {"epoch": 17, "update": 16.439, "s2c_loss": "0.298", "loss": "0.2068", "s2c_nll_loss": "0.298", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "35530", "lr": "0.000163135", "gnorm": "4.983", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8850"} 2023-01-29 18:39:15 | INFO | train_inner | {"epoch": 17, "update": 16.444, "s2c_loss": "0.249", "loss": "0.17236", "s2c_nll_loss": "0.249", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "35540", "lr": "0.000163069", "gnorm": "4.558", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8853"} 2023-01-29 18:39:17 | INFO | train_inner | {"epoch": 17, "update": 16.448, "s2c_loss": "0.233", "loss": "0.16176", "s2c_nll_loss": "0.233", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "258.4", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "35550", "lr": "0.000163002", "gnorm": "5.087", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": 
"8855"} 2023-01-29 18:39:20 | INFO | train_inner | {"epoch": 17, "update": 16.453, "s2c_loss": "0.552", "loss": "0.38268", "s2c_nll_loss": "0.552", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "35560", "lr": "0.000162935", "gnorm": "4.404", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8858"} 2023-01-29 18:39:22 | INFO | train_inner | {"epoch": 17, "update": 16.457, "s2c_loss": "0.282", "loss": "0.19535", "s2c_nll_loss": "0.282", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "35570", "lr": "0.000162869", "gnorm": "4.836", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8860"} 2023-01-29 18:39:25 | INFO | train_inner | {"epoch": 17, "update": 16.462, "s2c_loss": "0.342", "loss": "0.23706", "s2c_nll_loss": "0.342", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "35580", "lr": "0.000162802", "gnorm": "5.334", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8863"} 2023-01-29 18:39:27 | INFO | train_inner | {"epoch": 17, "update": 16.467, "s2c_loss": "0.332", "loss": "0.23008", "s2c_nll_loss": "0.332", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "35590", "lr": "0.000162735", "gnorm": "5.576", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8865"} 2023-01-29 18:39:30 | INFO | train_inner | {"epoch": 17, "update": 16.471, "s2c_loss": "0.287", "loss": "0.19877", "s2c_nll_loss": "0.287", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "35600", "lr": "0.000162669", "gnorm": "5.35", "loss_scale": "1024", "train_wall": "3", 
"gb_free": "7.2", "wall": "8868"} 2023-01-29 18:39:32 | INFO | train_inner | {"epoch": 17, "update": 16.476, "s2c_loss": "0.296", "loss": "0.20522", "s2c_nll_loss": "0.296", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "35610", "lr": "0.000162602", "gnorm": "4.653", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8870"} 2023-01-29 18:39:35 | INFO | train_inner | {"epoch": 17, "update": 16.481, "s2c_loss": "0.22", "loss": "0.15243", "s2c_nll_loss": "0.22", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "35620", "lr": "0.000162535", "gnorm": "4.218", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8873"} 2023-01-29 18:39:37 | INFO | train_inner | {"epoch": 17, "update": 16.485, "s2c_loss": "0.218", "loss": "0.15103", "s2c_nll_loss": "0.218", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "35630", "lr": "0.000162469", "gnorm": "4.511", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8875"} 2023-01-29 18:39:40 | INFO | train_inner | {"epoch": 17, "update": 16.49, "s2c_loss": "0.303", "loss": "0.21007", "s2c_nll_loss": "0.303", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "35640", "lr": "0.000162402", "gnorm": "4.385", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8878"} 2023-01-29 18:39:42 | INFO | train_inner | {"epoch": 17, "update": 16.494, "s2c_loss": "0.195", "loss": "0.1353", "s2c_nll_loss": "0.195", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "35650", "lr": "0.000162335", "gnorm": "4.008", "loss_scale": 
"1024", "train_wall": "2", "gb_free": "7.2", "wall": "8880"} 2023-01-29 18:39:45 | INFO | train_inner | {"epoch": 17, "update": 16.499, "s2c_loss": "0.246", "loss": "0.17083", "s2c_nll_loss": "0.246", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "35660", "lr": "0.000162269", "gnorm": "4.408", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8883"} 2023-01-29 18:39:47 | INFO | train_inner | {"epoch": 17, "update": 16.504, "s2c_loss": "0.235", "loss": "0.16306", "s2c_nll_loss": "0.235", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "35670", "lr": "0.000162202", "gnorm": "4.44", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8885"} 2023-01-29 18:39:50 | INFO | train_inner | {"epoch": 17, "update": 16.508, "s2c_loss": "0.293", "loss": "0.2034", "s2c_nll_loss": "0.293", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "35680", "lr": "0.000162135", "gnorm": "4.588", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8888"} 2023-01-29 18:39:53 | INFO | train_inner | {"epoch": 17, "update": 16.513, "s2c_loss": "0.212", "loss": "0.14707", "s2c_nll_loss": "0.212", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "35690", "lr": "0.000162069", "gnorm": "4.324", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8891"} 2023-01-29 18:39:55 | INFO | train_inner | {"epoch": 17, "update": 16.518, "s2c_loss": "0.222", "loss": "0.15358", "s2c_nll_loss": "0.222", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "35700", "lr": "0.000162002", "gnorm": 
"4.487", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8893"} 2023-01-29 18:39:58 | INFO | train_inner | {"epoch": 17, "update": 16.522, "s2c_loss": "0.24", "loss": "0.16613", "s2c_nll_loss": "0.24", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "35710", "lr": "0.000161935", "gnorm": "5.016", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "8896"} 2023-01-29 18:40:00 | INFO | train_inner | {"epoch": 17, "update": 16.527, "s2c_loss": "0.401", "loss": "0.27802", "s2c_nll_loss": "0.401", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "35720", "lr": "0.000161869", "gnorm": "6.441", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8898"} 2023-01-29 18:40:03 | INFO | train_inner | {"epoch": 17, "update": 16.531, "s2c_loss": "0.346", "loss": "0.24017", "s2c_nll_loss": "0.346", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "246.3", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "35730", "lr": "0.000161802", "gnorm": "5.581", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8901"} 2023-01-29 18:40:05 | INFO | train_inner | {"epoch": 17, "update": 16.536, "s2c_loss": "0.25", "loss": "0.17298", "s2c_nll_loss": "0.25", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "35740", "lr": "0.000161735", "gnorm": "4.834", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8903"} 2023-01-29 18:40:08 | INFO | train_inner | {"epoch": 17, "update": 16.541, "s2c_loss": "0.211", "loss": "0.14612", "s2c_nll_loss": "0.211", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "35750", "lr": 
"0.000161669", "gnorm": "4.174", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8906"} 2023-01-29 18:40:11 | INFO | train_inner | {"epoch": 17, "update": 16.545, "s2c_loss": "0.358", "loss": "0.24813", "s2c_nll_loss": "0.358", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "35760", "lr": "0.000161602", "gnorm": "5.627", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8908"} 2023-01-29 18:40:13 | INFO | train_inner | {"epoch": 17, "update": 16.55, "s2c_loss": "0.372", "loss": "0.25761", "s2c_nll_loss": "0.372", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "35770", "lr": "0.000161535", "gnorm": "6.454", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8911"} 2023-01-29 18:40:16 | INFO | train_inner | {"epoch": 17, "update": 16.555, "s2c_loss": "0.266", "loss": "0.18412", "s2c_nll_loss": "0.266", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "245.9", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "35780", "lr": "0.000161469", "gnorm": "5.675", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8914"} 2023-01-29 18:40:18 | INFO | train_inner | {"epoch": 17, "update": 16.559, "s2c_loss": "0.316", "loss": "0.21886", "s2c_nll_loss": "0.316", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "245.6", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "35790", "lr": "0.000161402", "gnorm": "5.407", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8916"} 2023-01-29 18:40:21 | INFO | train_inner | {"epoch": 17, "update": 16.564, "s2c_loss": "0.344", "loss": "0.23852", "s2c_nll_loss": "0.344", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", 
"num_updates": "35800", "lr": "0.000161335", "gnorm": "4.595", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8919"} 2023-01-29 18:40:23 | INFO | train_inner | {"epoch": 17, "update": 16.568, "s2c_loss": "0.288", "loss": "0.19933", "s2c_nll_loss": "0.288", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "35810", "lr": "0.000161269", "gnorm": "4.734", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8921"} 2023-01-29 18:40:26 | INFO | train_inner | {"epoch": 17, "update": 16.573, "s2c_loss": "0.287", "loss": "0.19892", "s2c_nll_loss": "0.287", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "35820", "lr": "0.000161202", "gnorm": "4.564", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8924"} 2023-01-29 18:40:28 | INFO | train_inner | {"epoch": 17, "update": 16.578, "s2c_loss": "0.28", "loss": "0.19416", "s2c_nll_loss": "0.28", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "35830", "lr": "0.000161135", "gnorm": "4.441", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8926"} 2023-01-29 18:40:31 | INFO | train_inner | {"epoch": 17, "update": 16.582, "s2c_loss": "0.259", "loss": "0.17969", "s2c_nll_loss": "0.259", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "35840", "lr": "0.000161069", "gnorm": "4.223", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8929"} 2023-01-29 18:40:33 | INFO | train_inner | {"epoch": 17, "update": 16.587, "s2c_loss": "0.345", "loss": "0.23947", "s2c_nll_loss": "0.345", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "256.3", "ups": "4", "wpb": 
"64", "bsz": "64", "num_updates": "35850", "lr": "0.000161002", "gnorm": "5.474", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8931"} 2023-01-29 18:40:36 | INFO | train_inner | {"epoch": 17, "update": 16.592, "s2c_loss": "0.213", "loss": "0.14731", "s2c_nll_loss": "0.213", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "35860", "lr": "0.000160935", "gnorm": "4.452", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8934"} 2023-01-29 18:40:38 | INFO | train_inner | {"epoch": 17, "update": 16.596, "s2c_loss": "0.309", "loss": "0.21417", "s2c_nll_loss": "0.309", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "35870", "lr": "0.000160869", "gnorm": "6.331", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8936"} 2023-01-29 18:40:41 | INFO | train_inner | {"epoch": 17, "update": 16.601, "s2c_loss": "0.262", "loss": "0.1813", "s2c_nll_loss": "0.262", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "35880", "lr": "0.000160802", "gnorm": "5.294", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8939"} 2023-01-29 18:40:43 | INFO | train_inner | {"epoch": 17, "update": 16.605, "s2c_loss": "0.302", "loss": "0.20905", "s2c_nll_loss": "0.302", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "35890", "lr": "0.000160735", "gnorm": "5.393", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8941"} 2023-01-29 18:40:46 | INFO | train_inner | {"epoch": 17, "update": 16.61, "s2c_loss": "0.348", "loss": "0.24108", "s2c_nll_loss": "0.348", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": 
"253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "35900", "lr": "0.000160669", "gnorm": "5.114", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8944"} 2023-01-29 18:40:48 | INFO | train_inner | {"epoch": 17, "update": 16.615, "s2c_loss": "0.224", "loss": "0.15542", "s2c_nll_loss": "0.224", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "258.6", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "35910", "lr": "0.000160602", "gnorm": "4.798", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8946"} 2023-01-29 18:40:51 | INFO | train_inner | {"epoch": 17, "update": 16.619, "s2c_loss": "0.329", "loss": "0.22824", "s2c_nll_loss": "0.329", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "35920", "lr": "0.000160535", "gnorm": "4.937", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8949"} 2023-01-29 18:40:54 | INFO | train_inner | {"epoch": 17, "update": 16.624, "s2c_loss": "0.355", "loss": "0.24574", "s2c_nll_loss": "0.355", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "253.8", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "35930", "lr": "0.000160469", "gnorm": "5.3", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8951"} 2023-01-29 18:40:56 | INFO | train_inner | {"epoch": 17, "update": 16.629, "s2c_loss": "0.322", "loss": "0.22302", "s2c_nll_loss": "0.322", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "35940", "lr": "0.000160402", "gnorm": "5.011", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8954"} 2023-01-29 18:40:59 | INFO | train_inner | {"epoch": 17, "update": 16.633, "s2c_loss": "0.394", "loss": "0.2732", "s2c_nll_loss": "0.394", "s2c_accuracy": "93.125", "s2c_total": "64", 
"s2c_n_correct": "59.6", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "35950", "lr": "0.000160335", "gnorm": "5.561", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8957"} 2023-01-29 18:41:01 | INFO | train_inner | {"epoch": 17, "update": 16.638, "s2c_loss": "0.443", "loss": "0.30711", "s2c_nll_loss": "0.443", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "35960", "lr": "0.000160269", "gnorm": "5.567", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8959"} 2023-01-29 18:41:04 | INFO | train_inner | {"epoch": 17, "update": 16.642, "s2c_loss": "0.389", "loss": "0.26961", "s2c_nll_loss": "0.389", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "35970", "lr": "0.000160202", "gnorm": "4.526", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8962"} 2023-01-29 18:41:06 | INFO | train_inner | {"epoch": 17, "update": 16.647, "s2c_loss": "0.299", "loss": "0.20716", "s2c_nll_loss": "0.299", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "35980", "lr": "0.000160135", "gnorm": "4.786", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8964"} 2023-01-29 18:41:09 | INFO | train_inner | {"epoch": 17, "update": 16.652, "s2c_loss": "0.381", "loss": "0.26396", "s2c_nll_loss": "0.381", "s2c_accuracy": "91.719", "s2c_total": "64", "s2c_n_correct": "58.7", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "35990", "lr": "0.000160069", "gnorm": "6.947", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8967"} 2023-01-29 18:41:11 | INFO | train_inner | {"epoch": 17, "update": 16.656, "s2c_loss": "0.229", "loss": "0.15897", "s2c_nll_loss": "0.229", "s2c_accuracy": 
"95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "36000", "lr": "0.000160002", "gnorm": "4.159", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8969"} 2023-01-29 18:41:14 | INFO | train_inner | {"epoch": 17, "update": 16.661, "s2c_loss": "0.247", "loss": "0.17102", "s2c_nll_loss": "0.247", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36010", "lr": "0.000159935", "gnorm": "4.785", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8972"} 2023-01-29 18:41:16 | INFO | train_inner | {"epoch": 17, "update": 16.666, "s2c_loss": "0.281", "loss": "0.19452", "s2c_nll_loss": "0.281", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "36020", "lr": "0.000159869", "gnorm": "6.114", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8974"} 2023-01-29 18:41:19 | INFO | train_inner | {"epoch": 17, "update": 16.67, "s2c_loss": "0.314", "loss": "0.21737", "s2c_nll_loss": "0.314", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "36030", "lr": "0.000159802", "gnorm": "4.567", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8977"} 2023-01-29 18:41:21 | INFO | train_inner | {"epoch": 17, "update": 16.675, "s2c_loss": "0.364", "loss": "0.25213", "s2c_nll_loss": "0.364", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36040", "lr": "0.000159735", "gnorm": "4.898", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8979"} 2023-01-29 18:41:24 | INFO | train_inner | {"epoch": 17, "update": 16.679, "s2c_loss": "0.27", "loss": "0.18733", "s2c_nll_loss": 
"0.27", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "36050", "lr": "0.000159669", "gnorm": "4.727", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8982"} 2023-01-29 18:41:26 | INFO | train_inner | {"epoch": 17, "update": 16.684, "s2c_loss": "0.296", "loss": "0.2052", "s2c_nll_loss": "0.296", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "259.4", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "36060", "lr": "0.000159602", "gnorm": "4.379", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8984"} 2023-01-29 18:41:29 | INFO | train_inner | {"epoch": 17, "update": 16.689, "s2c_loss": "0.264", "loss": "0.18302", "s2c_nll_loss": "0.264", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "36070", "lr": "0.000159535", "gnorm": "5.228", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "8987"} 2023-01-29 18:41:31 | INFO | train_inner | {"epoch": 17, "update": 16.693, "s2c_loss": "0.307", "loss": "0.213", "s2c_nll_loss": "0.307", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "36080", "lr": "0.000159469", "gnorm": "5.156", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "8989"} 2023-01-29 18:41:34 | INFO | train_inner | {"epoch": 17, "update": 16.698, "s2c_loss": "0.398", "loss": "0.27575", "s2c_nll_loss": "0.398", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "36090", "lr": "0.000159402", "gnorm": "5.91", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8992"} 2023-01-29 18:41:36 | INFO | train_inner | {"epoch": 17, "update": 16.703, "s2c_loss": "0.371", "loss": 
"0.25693", "s2c_nll_loss": "0.371", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36100", "lr": "0.000159335", "gnorm": "5.348", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8994"} 2023-01-29 18:41:39 | INFO | train_inner | {"epoch": 17, "update": 16.707, "s2c_loss": "0.378", "loss": "0.26184", "s2c_nll_loss": "0.378", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36110", "lr": "0.000159269", "gnorm": "5.962", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "8997"} 2023-01-29 18:41:41 | INFO | train_inner | {"epoch": 17, "update": 16.712, "s2c_loss": "0.34", "loss": "0.23541", "s2c_nll_loss": "0.34", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "36120", "lr": "0.000159202", "gnorm": "5.141", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "8999"} 2023-01-29 18:41:44 | INFO | train_inner | {"epoch": 17, "update": 16.716, "s2c_loss": "0.314", "loss": "0.21796", "s2c_nll_loss": "0.314", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "36130", "lr": "0.000159135", "gnorm": "4.815", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9002"} 2023-01-29 18:41:47 | INFO | train_inner | {"epoch": 17, "update": 16.721, "s2c_loss": "0.327", "loss": "0.22657", "s2c_nll_loss": "0.327", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "36140", "lr": "0.000159069", "gnorm": "5.567", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9004"} 2023-01-29 18:41:49 | INFO | train_inner | {"epoch": 17, "update": 16.726, 
"s2c_loss": "0.335", "loss": "0.23241", "s2c_nll_loss": "0.335", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36150", "lr": "0.000159002", "gnorm": "5.254", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9007"} 2023-01-29 18:41:52 | INFO | train_inner | {"epoch": 17, "update": 16.73, "s2c_loss": "0.237", "loss": "0.16452", "s2c_nll_loss": "0.237", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "36160", "lr": "0.000158935", "gnorm": "4.175", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9009"} 2023-01-29 18:41:54 | INFO | train_inner | {"epoch": 17, "update": 16.735, "s2c_loss": "0.304", "loss": "0.21038", "s2c_nll_loss": "0.304", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "36170", "lr": "0.000158869", "gnorm": "4.846", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9012"} 2023-01-29 18:41:57 | INFO | train_inner | {"epoch": 17, "update": 16.74, "s2c_loss": "0.255", "loss": "0.17683", "s2c_nll_loss": "0.255", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "36180", "lr": "0.000158802", "gnorm": "4.971", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9014"} 2023-01-29 18:41:59 | INFO | train_inner | {"epoch": 17, "update": 16.744, "s2c_loss": "0.498", "loss": "0.34487", "s2c_nll_loss": "0.498", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "36190", "lr": "0.000158735", "gnorm": "5.615", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9017"} 2023-01-29 18:42:02 | INFO | train_inner | {"epoch": 17, 
"update": 16.749, "s2c_loss": "0.353", "loss": "0.24437", "s2c_nll_loss": "0.353", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "258.5", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "36200", "lr": "0.000158669", "gnorm": "5.285", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9019"} 2023-01-29 18:42:04 | INFO | train_inner | {"epoch": 17, "update": 16.753, "s2c_loss": "0.275", "loss": "0.19096", "s2c_nll_loss": "0.275", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "36210", "lr": "0.000158602", "gnorm": "4.44", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9022"} 2023-01-29 18:42:07 | INFO | train_inner | {"epoch": 17, "update": 16.758, "s2c_loss": "0.233", "loss": "0.16121", "s2c_nll_loss": "0.233", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "258.8", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "36220", "lr": "0.000158535", "gnorm": "4.706", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9024"} 2023-01-29 18:42:09 | INFO | train_inner | {"epoch": 17, "update": 16.763, "s2c_loss": "0.319", "loss": "0.22105", "s2c_nll_loss": "0.319", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "36230", "lr": "0.000158469", "gnorm": "5.282", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.5", "wall": "9027"} 2023-01-29 18:42:12 | INFO | train_inner | {"epoch": 17, "update": 16.767, "s2c_loss": "0.369", "loss": "0.25558", "s2c_nll_loss": "0.369", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "36240", "lr": "0.000158402", "gnorm": "5.904", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9029"} 2023-01-29 18:42:14 | INFO | train_inner 
| {"epoch": 17, "update": 16.772, "s2c_loss": "0.351", "loss": "0.24359", "s2c_nll_loss": "0.351", "s2c_accuracy": "92.812", "s2c_total": "64", "s2c_n_correct": "59.4", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36250", "lr": "0.000158335", "gnorm": "5.794", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9032"} 2023-01-29 18:42:17 | INFO | train_inner | {"epoch": 17, "update": 16.777, "s2c_loss": "0.41", "loss": "0.28424", "s2c_nll_loss": "0.41", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36260", "lr": "0.000158269", "gnorm": "5.479", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9034"} 2023-01-29 18:42:19 | INFO | train_inner | {"epoch": 17, "update": 16.781, "s2c_loss": "0.39", "loss": "0.27059", "s2c_nll_loss": "0.39", "s2c_accuracy": "92.188", "s2c_total": "64", "s2c_n_correct": "59", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "36270", "lr": "0.000158202", "gnorm": "6.444", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9037"} 2023-01-29 18:42:22 | INFO | train_inner | {"epoch": 17, "update": 16.786, "s2c_loss": "0.357", "loss": "0.24764", "s2c_nll_loss": "0.357", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "36280", "lr": "0.000158135", "gnorm": "5.901", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9040"} 2023-01-29 18:42:24 | INFO | train_inner | {"epoch": 17, "update": 16.79, "s2c_loss": "0.309", "loss": "0.21441", "s2c_nll_loss": "0.309", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "36290", "lr": "0.000158069", "gnorm": "5.45", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9042"} 2023-01-29 18:42:27 | INFO 
| train_inner | {"epoch": 17, "update": 16.795, "s2c_loss": "0.315", "loss": "0.21806", "s2c_nll_loss": "0.315", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "36300", "lr": "0.000158002", "gnorm": "6.33", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9045"} 2023-01-29 18:42:29 | INFO | train_inner | {"epoch": 17, "update": 16.8, "s2c_loss": "0.267", "loss": "0.18478", "s2c_nll_loss": "0.267", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "36310", "lr": "0.000157935", "gnorm": "5.383", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9047"} 2023-01-29 18:42:32 | INFO | train_inner | {"epoch": 17, "update": 16.804, "s2c_loss": "0.359", "loss": "0.24834", "s2c_nll_loss": "0.359", "s2c_accuracy": "93.878", "s2c_total": "63.7", "s2c_n_correct": "59.8", "wps": "252.1", "ups": "3.96", "wpb": "63.7", "bsz": "63.7", "num_updates": "36320", "lr": "0.000157869", "gnorm": "5.516", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9050"} 2023-01-29 18:42:34 | INFO | train_inner | {"epoch": 17, "update": 16.809, "s2c_loss": "0.269", "loss": "0.1868", "s2c_nll_loss": "0.269", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "36330", "lr": "0.000157802", "gnorm": "4.409", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9052"} 2023-01-29 18:42:37 | INFO | train_inner | {"epoch": 17, "update": 16.814, "s2c_loss": "0.358", "loss": "0.24827", "s2c_nll_loss": "0.358", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "36340", "lr": "0.000157735", "gnorm": "6.174", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9055"} 2023-01-29 
18:42:39 | INFO | train_inner | {"epoch": 17, "update": 16.818, "s2c_loss": "0.253", "loss": "0.17561", "s2c_nll_loss": "0.253", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "36350", "lr": "0.000157669", "gnorm": "4.56", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9057"} 2023-01-29 18:42:42 | INFO | train_inner | {"epoch": 17, "update": 16.823, "s2c_loss": "0.304", "loss": "0.21096", "s2c_nll_loss": "0.304", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "36360", "lr": "0.000157602", "gnorm": "4.942", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9060"} 2023-01-29 18:42:44 | INFO | train_inner | {"epoch": 17, "update": 16.827, "s2c_loss": "0.225", "loss": "0.15616", "s2c_nll_loss": "0.225", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "36370", "lr": "0.000157535", "gnorm": "5.682", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9062"} 2023-01-29 18:42:47 | INFO | train_inner | {"epoch": 17, "update": 16.832, "s2c_loss": "0.239", "loss": "0.166", "s2c_nll_loss": "0.239", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "36380", "lr": "0.000157469", "gnorm": "4.416", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9065"} 2023-01-29 18:42:50 | INFO | train_inner | {"epoch": 17, "update": 16.837, "s2c_loss": "0.283", "loss": "0.19593", "s2c_nll_loss": "0.283", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "36390", "lr": "0.000157402", "gnorm": "4.109", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", 
"wall": "9067"} 2023-01-29 18:42:52 | INFO | train_inner | {"epoch": 17, "update": 16.841, "s2c_loss": "0.295", "loss": "0.20469", "s2c_nll_loss": "0.295", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "36400", "lr": "0.000157335", "gnorm": "4.909", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9070"} 2023-01-29 18:42:55 | INFO | train_inner | {"epoch": 17, "update": 16.846, "s2c_loss": "0.192", "loss": "0.13318", "s2c_nll_loss": "0.192", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "36410", "lr": "0.000157269", "gnorm": "3.689", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9072"} 2023-01-29 18:42:57 | INFO | train_inner | {"epoch": 17, "update": 16.851, "s2c_loss": "0.234", "loss": "0.16214", "s2c_nll_loss": "0.234", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "36420", "lr": "0.000157202", "gnorm": "4.524", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.5", "wall": "9075"} 2023-01-29 18:43:00 | INFO | train_inner | {"epoch": 17, "update": 16.855, "s2c_loss": "0.573", "loss": "0.39734", "s2c_nll_loss": "0.573", "s2c_accuracy": "91.875", "s2c_total": "64", "s2c_n_correct": "58.8", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "36430", "lr": "0.000157135", "gnorm": "4.781", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9078"} 2023-01-29 18:43:02 | INFO | train_inner | {"epoch": 17, "update": 16.86, "s2c_loss": "0.328", "loss": "0.22719", "s2c_nll_loss": "0.328", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "36440", "lr": "0.000157069", "gnorm": "5.442", "loss_scale": "1024", "train_wall": "2", 
"gb_free": "7.3", "wall": "9080"} 2023-01-29 18:43:05 | INFO | train_inner | {"epoch": 17, "update": 16.864, "s2c_loss": "0.296", "loss": "0.20505", "s2c_nll_loss": "0.296", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "36450", "lr": "0.000157002", "gnorm": "5.051", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9083"} 2023-01-29 18:43:07 | INFO | train_inner | {"epoch": 17, "update": 16.869, "s2c_loss": "0.325", "loss": "0.22561", "s2c_nll_loss": "0.325", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "36460", "lr": "0.000156935", "gnorm": "4.865", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9085"} 2023-01-29 18:43:10 | INFO | train_inner | {"epoch": 17, "update": 16.874, "s2c_loss": "0.236", "loss": "0.16386", "s2c_nll_loss": "0.236", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "36470", "lr": "0.000156869", "gnorm": "4.234", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9088"} 2023-01-29 18:43:12 | INFO | train_inner | {"epoch": 17, "update": 16.878, "s2c_loss": "0.291", "loss": "0.2016", "s2c_nll_loss": "0.291", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "36480", "lr": "0.000156802", "gnorm": "4.886", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9090"} 2023-01-29 18:43:15 | INFO | train_inner | {"epoch": 17, "update": 16.883, "s2c_loss": "0.253", "loss": "0.17516", "s2c_nll_loss": "0.253", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "36490", "lr": "0.000156735", "gnorm": "5.008", "loss_scale": 
"1024", "train_wall": "3", "gb_free": "7.2", "wall": "9093"} 2023-01-29 18:43:17 | INFO | train_inner | {"epoch": 17, "update": 16.888, "s2c_loss": "0.347", "loss": "0.24056", "s2c_nll_loss": "0.347", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "36500", "lr": "0.000156669", "gnorm": "4.912", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9095"} 2023-01-29 18:43:20 | INFO | train_inner | {"epoch": 17, "update": 16.892, "s2c_loss": "0.228", "loss": "0.1583", "s2c_nll_loss": "0.228", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "36510", "lr": "0.000156602", "gnorm": "4.411", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9098"} 2023-01-29 18:43:22 | INFO | train_inner | {"epoch": 17, "update": 16.897, "s2c_loss": "0.277", "loss": "0.19172", "s2c_nll_loss": "0.277", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36520", "lr": "0.000156536", "gnorm": "4.756", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9100"} 2023-01-29 18:43:25 | INFO | train_inner | {"epoch": 17, "update": 16.901, "s2c_loss": "0.165", "loss": "0.11467", "s2c_nll_loss": "0.165", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "36530", "lr": "0.000156469", "gnorm": "3.73", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9103"} 2023-01-29 18:43:27 | INFO | train_inner | {"epoch": 17, "update": 16.906, "s2c_loss": "0.198", "loss": "0.13705", "s2c_nll_loss": "0.198", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "36540", "lr": "0.000156402", "gnorm": 
"4.698", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "9105"} 2023-01-29 18:43:30 | INFO | train_inner | {"epoch": 17, "update": 16.911, "s2c_loss": "0.325", "loss": "0.22527", "s2c_nll_loss": "0.325", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "36550", "lr": "0.000156336", "gnorm": "4.8", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9108"} 2023-01-29 18:43:33 | INFO | train_inner | {"epoch": 17, "update": 16.915, "s2c_loss": "0.246", "loss": "0.1707", "s2c_nll_loss": "0.246", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "248", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "36560", "lr": "0.000156269", "gnorm": "4.587", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9110"} 2023-01-29 18:43:35 | INFO | train_inner | {"epoch": 17, "update": 16.92, "s2c_loss": "0.206", "loss": "0.14249", "s2c_nll_loss": "0.206", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "36570", "lr": "0.000156202", "gnorm": "4.292", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9113"} 2023-01-29 18:43:38 | INFO | train_inner | {"epoch": 17, "update": 16.925, "s2c_loss": "0.299", "loss": "0.20706", "s2c_nll_loss": "0.299", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "36580", "lr": "0.000156136", "gnorm": "5.292", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9116"} 2023-01-29 18:43:40 | INFO | train_inner | {"epoch": 17, "update": 16.929, "s2c_loss": "0.331", "loss": "0.22933", "s2c_nll_loss": "0.331", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36590", "lr": 
"0.000156069", "gnorm": "5.911", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9118"} 2023-01-29 18:43:43 | INFO | train_inner | {"epoch": 17, "update": 16.934, "s2c_loss": "0.253", "loss": "0.17503", "s2c_nll_loss": "0.253", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "36600", "lr": "0.000156002", "gnorm": "4.675", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9121"} 2023-01-29 18:43:45 | INFO | train_inner | {"epoch": 17, "update": 16.938, "s2c_loss": "0.277", "loss": "0.19174", "s2c_nll_loss": "0.277", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "36610", "lr": "0.000155936", "gnorm": "4.472", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9123"} 2023-01-29 18:43:48 | INFO | train_inner | {"epoch": 17, "update": 16.943, "s2c_loss": "0.259", "loss": "0.17953", "s2c_nll_loss": "0.259", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36620", "lr": "0.000155869", "gnorm": "5.905", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9126"} 2023-01-29 18:43:50 | INFO | train_inner | {"epoch": 17, "update": 16.948, "s2c_loss": "0.199", "loss": "0.13766", "s2c_nll_loss": "0.199", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "36630", "lr": "0.000155802", "gnorm": "4.423", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9128"} 2023-01-29 18:43:53 | INFO | train_inner | {"epoch": 17, "update": 16.952, "s2c_loss": "0.28", "loss": "0.19434", "s2c_nll_loss": "0.28", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", 
"num_updates": "36640", "lr": "0.000155736", "gnorm": "4.507", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "9131"} 2023-01-29 18:43:55 | INFO | train_inner | {"epoch": 17, "update": 16.957, "s2c_loss": "0.27", "loss": "0.18687", "s2c_nll_loss": "0.27", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36650", "lr": "0.000155669", "gnorm": "4.295", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9133"} 2023-01-29 18:43:58 | INFO | train_inner | {"epoch": 17, "update": 16.962, "s2c_loss": "0.278", "loss": "0.19286", "s2c_nll_loss": "0.278", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "36660", "lr": "0.000155602", "gnorm": "5.531", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9136"} 2023-01-29 18:44:00 | INFO | train_inner | {"epoch": 17, "update": 16.966, "s2c_loss": "0.299", "loss": "0.20693", "s2c_nll_loss": "0.299", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36670", "lr": "0.000155536", "gnorm": "5.44", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9138"} 2023-01-29 18:44:03 | INFO | train_inner | {"epoch": 17, "update": 16.971, "s2c_loss": "0.261", "loss": "0.1808", "s2c_nll_loss": "0.261", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36680", "lr": "0.000155469", "gnorm": "4.351", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9141"} 2023-01-29 18:44:05 | INFO | train_inner | {"epoch": 17, "update": 16.975, "s2c_loss": "0.27", "loss": "0.18713", "s2c_nll_loss": "0.27", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "256.2", "ups": "4", "wpb": "64", 
"bsz": "64", "num_updates": "36690", "lr": "0.000155402", "gnorm": "5.078", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "9143"} 2023-01-29 18:44:08 | INFO | train_inner | {"epoch": 17, "update": 16.98, "s2c_loss": "0.435", "loss": "0.30131", "s2c_nll_loss": "0.435", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "36700", "lr": "0.000155336", "gnorm": "5.39", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9146"} 2023-01-29 18:44:10 | INFO | train_inner | {"epoch": 17, "update": 16.985, "s2c_loss": "0.28", "loss": "0.1942", "s2c_nll_loss": "0.28", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "36710", "lr": "0.000155269", "gnorm": "5.458", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9148"} 2023-01-29 18:44:13 | INFO | train_inner | {"epoch": 17, "update": 16.989, "s2c_loss": "0.524", "loss": "0.36349", "s2c_nll_loss": "0.524", "s2c_accuracy": "92.5", "s2c_total": "64", "s2c_n_correct": "59.2", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36720", "lr": "0.000155202", "gnorm": "6.565", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9151"} 2023-01-29 18:44:15 | INFO | train_inner | {"epoch": 17, "update": 16.994, "s2c_loss": "0.26", "loss": "0.1805", "s2c_nll_loss": "0.26", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "36730", "lr": "0.000155136", "gnorm": "5.971", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9153"} 2023-01-29 18:44:18 | INFO | train_inner | {"epoch": 17, "update": 16.999, "s2c_loss": "0.299", "loss": "0.20703", "s2c_nll_loss": "0.299", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "255.7", "ups": 
"3.99", "wpb": "64", "bsz": "64", "num_updates": "36740", "lr": "0.000155069", "gnorm": "5.687", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9156"} 2023-01-29 18:44:19 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 18:44:33 | INFO | valid | {"epoch": 17, "valid_s2c_loss": "0.839", "valid_loss": "0.58127", "valid_s2c_nll_loss": "0.839", "valid_s2c_accuracy": "85.311", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "27.2639", "valid_num_updates": "36743", "valid_best_s2c_accuracy": "85.745"} 2023-01-29 18:44:33 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 17 @ 36743 updates 2023-01-29 18:44:33 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 18:44:40 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 18:44:40 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 17 @ 36743 updates, score 85.311) (writing took 6.66662970604375 seconds) 2023-01-29 18:44:40 | INFO | fairseq_cli.train | end of epoch 17 (average epoch stats below) 2023-01-29 18:44:40 | INFO | train | {"epoch": 17, "train_s2c_loss": "0.308", "train_loss": "0.2136", "train_s2c_nll_loss": "0.308", "train_s2c_accuracy": "94.464", "train_s2c_total": "63.9838", "train_s2c_n_correct": "60.4419", "train_wps": "240.9", "train_ups": "3.76", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "36743", "train_lr": "0.000155049", "train_gnorm": "5.051", "train_loss_scale": "1024", "train_train_wall": "539", "train_gb_free": "7.5", "train_wall": "9178"} 2023-01-29 18:44:46 | INFO | fairseq.trainer | begin training epoch 18 2023-01-29 18:44:46 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 18:44:48 | INFO | train_inner | {"epoch": 18, "update": 17.003, "s2c_loss": "0.296", 
"loss": "0.20522", "s2c_nll_loss": "0.296", "s2c_accuracy": "93.586", "s2c_total": "60.8", "s2c_n_correct": "56.9", "wps": "20.3", "ups": "0.33", "wpb": "60.8", "bsz": "60.8", "num_updates": "36750", "lr": "0.000155002", "gnorm": "8.084", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9186"} 2023-01-29 18:44:50 | INFO | train_inner | {"epoch": 18, "update": 17.008, "s2c_loss": "0.285", "loss": "0.19727", "s2c_nll_loss": "0.285", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "36760", "lr": "0.000154936", "gnorm": "5.331", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9188"} 2023-01-29 18:44:53 | INFO | train_inner | {"epoch": 18, "update": 17.012, "s2c_loss": "0.287", "loss": "0.19894", "s2c_nll_loss": "0.287", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "36770", "lr": "0.000154869", "gnorm": "6.026", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9191"} 2023-01-29 18:44:56 | INFO | train_inner | {"epoch": 18, "update": 17.017, "s2c_loss": "0.284", "loss": "0.19711", "s2c_nll_loss": "0.284", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "36780", "lr": "0.000154802", "gnorm": "4.747", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9193"} 2023-01-29 18:44:58 | INFO | train_inner | {"epoch": 18, "update": 17.022, "s2c_loss": "0.302", "loss": "0.20916", "s2c_nll_loss": "0.302", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "36790", "lr": "0.000154736", "gnorm": "4.592", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9196"} 2023-01-29 18:45:01 | INFO | train_inner | {"epoch": 18, "update": 
17.026, "s2c_loss": "0.249", "loss": "0.17245", "s2c_nll_loss": "0.249", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "36800", "lr": "0.000154669", "gnorm": "5.065", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9199"} 2023-01-29 18:45:03 | INFO | train_inner | {"epoch": 18, "update": 17.031, "s2c_loss": "0.269", "loss": "0.18639", "s2c_nll_loss": "0.269", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "36810", "lr": "0.000154602", "gnorm": "4.045", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9201"} 2023-01-29 18:45:06 | INFO | train_inner | {"epoch": 18, "update": 17.036, "s2c_loss": "0.242", "loss": "0.16771", "s2c_nll_loss": "0.242", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "242.7", "ups": "3.79", "wpb": "64", "bsz": "64", "num_updates": "36820", "lr": "0.000154536", "gnorm": "5.86", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9204"} 2023-01-29 18:45:08 | INFO | train_inner | {"epoch": 18, "update": 17.04, "s2c_loss": "0.234", "loss": "0.16253", "s2c_nll_loss": "0.234", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "36830", "lr": "0.000154469", "gnorm": "5.03", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9206"} 2023-01-29 18:45:11 | INFO | train_inner | {"epoch": 18, "update": 17.045, "s2c_loss": "0.204", "loss": "0.14122", "s2c_nll_loss": "0.204", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "258.5", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "36840", "lr": "0.000154402", "gnorm": "5.054", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9209"} 2023-01-29 18:45:13 | INFO | train_inner | 
{"epoch": 18, "update": 17.049, "s2c_loss": "0.217", "loss": "0.15061", "s2c_nll_loss": "0.217", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "36850", "lr": "0.000154336", "gnorm": "4.372", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9211"} 2023-01-29 18:45:16 | INFO | train_inner | {"epoch": 18, "update": 17.054, "s2c_loss": "0.177", "loss": "0.123", "s2c_nll_loss": "0.177", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "36860", "lr": "0.000154269", "gnorm": "3.603", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9214"} 2023-01-29 18:45:18 | INFO | train_inner | {"epoch": 18, "update": 17.059, "s2c_loss": "0.275", "loss": "0.19039", "s2c_nll_loss": "0.275", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "36870", "lr": "0.000154202", "gnorm": "4.835", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9216"} 2023-01-29 18:45:21 | INFO | train_inner | {"epoch": 18, "update": 17.063, "s2c_loss": "0.178", "loss": "0.12318", "s2c_nll_loss": "0.178", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "36880", "lr": "0.000154136", "gnorm": "4.325", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9219"} 2023-01-29 18:45:24 | INFO | train_inner | {"epoch": 18, "update": 17.068, "s2c_loss": "0.271", "loss": "0.18779", "s2c_nll_loss": "0.271", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "36890", "lr": "0.000154069", "gnorm": "4.767", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9221"} 2023-01-29 18:45:26 | 
INFO | train_inner | {"epoch": 18, "update": 17.073, "s2c_loss": "0.355", "loss": "0.24635", "s2c_nll_loss": "0.355", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "36900", "lr": "0.000154002", "gnorm": "4.241", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9224"} 2023-01-29 18:45:29 | INFO | train_inner | {"epoch": 18, "update": 17.077, "s2c_loss": "0.201", "loss": "0.13926", "s2c_nll_loss": "0.201", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "36910", "lr": "0.000153936", "gnorm": "4.642", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9226"} 2023-01-29 18:45:31 | INFO | train_inner | {"epoch": 18, "update": 17.082, "s2c_loss": "0.249", "loss": "0.17254", "s2c_nll_loss": "0.249", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "36920", "lr": "0.000153869", "gnorm": "4.311", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9229"} 2023-01-29 18:45:34 | INFO | train_inner | {"epoch": 18, "update": 17.086, "s2c_loss": "0.221", "loss": "0.15312", "s2c_nll_loss": "0.221", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "36930", "lr": "0.000153802", "gnorm": "4.451", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9232"} 2023-01-29 18:45:36 | INFO | train_inner | {"epoch": 18, "update": 17.091, "s2c_loss": "0.177", "loss": "0.12277", "s2c_nll_loss": "0.177", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "36940", "lr": "0.000153736", "gnorm": "4.116", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9234"} 
2023-01-29 18:45:39 | INFO | train_inner | {"epoch": 18, "update": 17.096, "s2c_loss": "0.257", "loss": "0.17813", "s2c_nll_loss": "0.257", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "36950", "lr": "0.000153669", "gnorm": "4.109", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9237"} 2023-01-29 18:45:41 | INFO | train_inner | {"epoch": 18, "update": 17.1, "s2c_loss": "0.201", "loss": "0.13919", "s2c_nll_loss": "0.201", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "36960", "lr": "0.000153602", "gnorm": "3.872", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9239"} 2023-01-29 18:45:44 | INFO | train_inner | {"epoch": 18, "update": 17.105, "s2c_loss": "0.239", "loss": "0.16596", "s2c_nll_loss": "0.239", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "36970", "lr": "0.000153536", "gnorm": "4.101", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9242"} 2023-01-29 18:45:46 | INFO | train_inner | {"epoch": 18, "update": 17.11, "s2c_loss": "0.405", "loss": "0.28071", "s2c_nll_loss": "0.405", "s2c_accuracy": "92.969", "s2c_total": "64", "s2c_n_correct": "59.5", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "36980", "lr": "0.000153469", "gnorm": "5.106", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9244"} 2023-01-29 18:45:49 | INFO | train_inner | {"epoch": 18, "update": 17.114, "s2c_loss": "0.294", "loss": "0.20365", "s2c_nll_loss": "0.294", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "36990", "lr": "0.000153402", "gnorm": "4.812", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", 
"wall": "9247"} 2023-01-29 18:45:51 | INFO | train_inner | {"epoch": 18, "update": 17.119, "s2c_loss": "0.327", "loss": "0.22699", "s2c_nll_loss": "0.327", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "37000", "lr": "0.000153336", "gnorm": "4.955", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9249"} 2023-01-29 18:45:54 | INFO | train_inner | {"epoch": 18, "update": 17.123, "s2c_loss": "0.216", "loss": "0.14961", "s2c_nll_loss": "0.216", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "37010", "lr": "0.000153269", "gnorm": "4.018", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9252"} 2023-01-29 18:45:56 | INFO | train_inner | {"epoch": 18, "update": 17.128, "s2c_loss": "0.243", "loss": "0.1687", "s2c_nll_loss": "0.243", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "245.2", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "37020", "lr": "0.000153202", "gnorm": "5.284", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9254"} 2023-01-29 18:45:59 | INFO | train_inner | {"epoch": 18, "update": 17.133, "s2c_loss": "0.247", "loss": "0.17115", "s2c_nll_loss": "0.247", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "37030", "lr": "0.000153136", "gnorm": "5.4", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9257"} 2023-01-29 18:46:01 | INFO | train_inner | {"epoch": 18, "update": 17.137, "s2c_loss": "0.205", "loss": "0.14234", "s2c_nll_loss": "0.205", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "37040", "lr": "0.000153069", "gnorm": "5.359", "loss_scale": "1024", "train_wall": "3", 
"gb_free": "7.4", "wall": "9259"} 2023-01-29 18:46:04 | INFO | train_inner | {"epoch": 18, "update": 17.142, "s2c_loss": "0.275", "loss": "0.1903", "s2c_nll_loss": "0.275", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "242.1", "ups": "3.78", "wpb": "64", "bsz": "64", "num_updates": "37050", "lr": "0.000153002", "gnorm": "5.471", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9262"} 2023-01-29 18:46:07 | INFO | train_inner | {"epoch": 18, "update": 17.147, "s2c_loss": "0.363", "loss": "0.25169", "s2c_nll_loss": "0.363", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "259.8", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "37060", "lr": "0.000152936", "gnorm": "5.35", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9264"} 2023-01-29 18:46:09 | INFO | train_inner | {"epoch": 18, "update": 17.151, "s2c_loss": "0.248", "loss": "0.1722", "s2c_nll_loss": "0.248", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "37070", "lr": "0.000152869", "gnorm": "4.255", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9267"} 2023-01-29 18:46:12 | INFO | train_inner | {"epoch": 18, "update": 17.156, "s2c_loss": "0.293", "loss": "0.20324", "s2c_nll_loss": "0.293", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "37080", "lr": "0.000152802", "gnorm": "4.486", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9270"} 2023-01-29 18:46:14 | INFO | train_inner | {"epoch": 18, "update": 17.16, "s2c_loss": "0.287", "loss": "0.19869", "s2c_nll_loss": "0.287", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "258.1", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "37090", "lr": "0.000152736", "gnorm": "5.248", "loss_scale": "1024", 
"train_wall": "2", "gb_free": "7.2", "wall": "9272"} 2023-01-29 18:46:17 | INFO | train_inner | {"epoch": 18, "update": 17.165, "s2c_loss": "0.271", "loss": "0.18794", "s2c_nll_loss": "0.271", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "37100", "lr": "0.000152669", "gnorm": "4.394", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9275"} 2023-01-29 18:46:19 | INFO | train_inner | {"epoch": 18, "update": 17.17, "s2c_loss": "0.269", "loss": "0.18649", "s2c_nll_loss": "0.269", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "37110", "lr": "0.000152602", "gnorm": "4.233", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "9277"} 2023-01-29 18:46:22 | INFO | train_inner | {"epoch": 18, "update": 17.174, "s2c_loss": "0.267", "loss": "0.18489", "s2c_nll_loss": "0.267", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "37120", "lr": "0.000152536", "gnorm": "4.6", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9280"} 2023-01-29 18:46:24 | INFO | train_inner | {"epoch": 18, "update": 17.179, "s2c_loss": "0.135", "loss": "0.09388", "s2c_nll_loss": "0.135", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "37130", "lr": "0.000152469", "gnorm": "3.356", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9282"} 2023-01-29 18:46:27 | INFO | train_inner | {"epoch": 18, "update": 17.184, "s2c_loss": "0.169", "loss": "0.11694", "s2c_nll_loss": "0.169", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "37140", "lr": "0.000152402", "gnorm": "3.656", 
"loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "9285"} 2023-01-29 18:46:29 | INFO | train_inner | {"epoch": 18, "update": 17.188, "s2c_loss": "0.217", "loss": "0.15025", "s2c_nll_loss": "0.217", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "246.5", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "37150", "lr": "0.000152336", "gnorm": "4.746", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9287"} 2023-01-29 18:46:32 | INFO | train_inner | {"epoch": 18, "update": 17.193, "s2c_loss": "0.281", "loss": "0.19499", "s2c_nll_loss": "0.281", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "37160", "lr": "0.000152269", "gnorm": "4.731", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9290"} 2023-01-29 18:46:34 | INFO | train_inner | {"epoch": 18, "update": 17.198, "s2c_loss": "0.222", "loss": "0.15395", "s2c_nll_loss": "0.222", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "37170", "lr": "0.000152202", "gnorm": "4.522", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9292"} 2023-01-29 18:46:37 | INFO | train_inner | {"epoch": 18, "update": 17.202, "s2c_loss": "0.249", "loss": "0.17272", "s2c_nll_loss": "0.249", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "37180", "lr": "0.000152136", "gnorm": "4.663", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9295"} 2023-01-29 18:46:39 | INFO | train_inner | {"epoch": 18, "update": 17.207, "s2c_loss": "0.186", "loss": "0.12892", "s2c_nll_loss": "0.186", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "37190", "lr": 
"0.000152069", "gnorm": "4.055", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9297"} 2023-01-29 18:46:42 | INFO | train_inner | {"epoch": 18, "update": 17.211, "s2c_loss": "0.189", "loss": "0.13085", "s2c_nll_loss": "0.189", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "37200", "lr": "0.000152002", "gnorm": "4.004", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9300"} 2023-01-29 18:46:44 | INFO | train_inner | {"epoch": 18, "update": 17.216, "s2c_loss": "0.257", "loss": "0.17808", "s2c_nll_loss": "0.257", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "37210", "lr": "0.000151936", "gnorm": "5.299", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9302"} 2023-01-29 18:46:47 | INFO | train_inner | {"epoch": 18, "update": 17.221, "s2c_loss": "0.355", "loss": "0.2462", "s2c_nll_loss": "0.355", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "37220", "lr": "0.000151869", "gnorm": "4.875", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9305"} 2023-01-29 18:46:49 | INFO | train_inner | {"epoch": 18, "update": 17.225, "s2c_loss": "0.261", "loss": "0.18111", "s2c_nll_loss": "0.261", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "37230", "lr": "0.000151802", "gnorm": "4.543", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9307"} 2023-01-29 18:46:52 | INFO | train_inner | {"epoch": 18, "update": 17.23, "s2c_loss": "0.33", "loss": "0.22884", "s2c_nll_loss": "0.33", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", 
"num_updates": "37240", "lr": "0.000151736", "gnorm": "5.094", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9310"} 2023-01-29 18:46:54 | INFO | train_inner | {"epoch": 18, "update": 17.235, "s2c_loss": "0.216", "loss": "0.14984", "s2c_nll_loss": "0.216", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "37250", "lr": "0.000151669", "gnorm": "4.782", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9312"} 2023-01-29 18:46:57 | INFO | train_inner | {"epoch": 18, "update": 17.239, "s2c_loss": "0.209", "loss": "0.14481", "s2c_nll_loss": "0.209", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "37260", "lr": "0.000151602", "gnorm": "4.574", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "9315"} 2023-01-29 18:46:59 | INFO | train_inner | {"epoch": 18, "update": 17.244, "s2c_loss": "0.196", "loss": "0.13567", "s2c_nll_loss": "0.196", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "37270", "lr": "0.000151536", "gnorm": "4.048", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9317"} 2023-01-29 18:47:02 | INFO | train_inner | {"epoch": 18, "update": 17.248, "s2c_loss": "0.249", "loss": "0.17277", "s2c_nll_loss": "0.249", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "37280", "lr": "0.000151469", "gnorm": "4.49", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9320"} 2023-01-29 18:47:04 | INFO | train_inner | {"epoch": 18, "update": 17.253, "s2c_loss": "0.3", "loss": "0.20811", "s2c_nll_loss": "0.3", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "259", "ups": "4.05", "wpb": 
"64", "bsz": "64", "num_updates": "37290", "lr": "0.000151402", "gnorm": "4.196", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9322"} 2023-01-29 18:47:07 | INFO | train_inner | {"epoch": 18, "update": 17.258, "s2c_loss": "0.211", "loss": "0.14595", "s2c_nll_loss": "0.211", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "37300", "lr": "0.000151336", "gnorm": "4.239", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "9325"} 2023-01-29 18:47:10 | INFO | train_inner | {"epoch": 18, "update": 17.262, "s2c_loss": "0.266", "loss": "0.18467", "s2c_nll_loss": "0.266", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "37310", "lr": "0.000151269", "gnorm": "5.632", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9327"} 2023-01-29 18:47:12 | INFO | train_inner | {"epoch": 18, "update": 17.267, "s2c_loss": "0.376", "loss": "0.26076", "s2c_nll_loss": "0.376", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "37320", "lr": "0.000151202", "gnorm": "6.737", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9330"} 2023-01-29 18:47:15 | INFO | train_inner | {"epoch": 18, "update": 17.272, "s2c_loss": "0.226", "loss": "0.15657", "s2c_nll_loss": "0.226", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "37330", "lr": "0.000151136", "gnorm": "4.293", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9332"} 2023-01-29 18:47:17 | INFO | train_inner | {"epoch": 18, "update": 17.276, "s2c_loss": "0.184", "loss": "0.12743", "s2c_nll_loss": "0.184", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": 
"251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "37340", "lr": "0.000151069", "gnorm": "3.593", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9335"} 2023-01-29 18:47:20 | INFO | train_inner | {"epoch": 18, "update": 17.281, "s2c_loss": "0.204", "loss": "0.14129", "s2c_nll_loss": "0.204", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "37350", "lr": "0.000151002", "gnorm": "3.899", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9338"} 2023-01-29 18:47:22 | INFO | train_inner | {"epoch": 18, "update": 17.285, "s2c_loss": "0.166", "loss": "0.11512", "s2c_nll_loss": "0.166", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "37360", "lr": "0.000150936", "gnorm": "4.078", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9340"} 2023-01-29 18:47:25 | INFO | train_inner | {"epoch": 18, "update": 17.29, "s2c_loss": "0.23", "loss": "0.15909", "s2c_nll_loss": "0.23", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "37370", "lr": "0.000150869", "gnorm": "4.238", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "9343"} 2023-01-29 18:47:27 | INFO | train_inner | {"epoch": 18, "update": 17.295, "s2c_loss": "0.203", "loss": "0.1405", "s2c_nll_loss": "0.203", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "37380", "lr": "0.000150802", "gnorm": "3.64", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9345"} 2023-01-29 18:47:30 | INFO | train_inner | {"epoch": 18, "update": 17.299, "s2c_loss": "0.223", "loss": "0.15479", "s2c_nll_loss": "0.223", "s2c_accuracy": "96.562", "s2c_total": "64", 
"s2c_n_correct": "61.8", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "37390", "lr": "0.000150736", "gnorm": "4.002", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "9348"} 2023-01-29 18:47:32 | INFO | train_inner | {"epoch": 18, "update": 17.304, "s2c_loss": "0.253", "loss": "0.17511", "s2c_nll_loss": "0.253", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "37400", "lr": "0.000150669", "gnorm": "4.213", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "9350"} 2023-01-29 18:47:35 | INFO | train_inner | {"epoch": 18, "update": 17.309, "s2c_loss": "0.259", "loss": "0.17921", "s2c_nll_loss": "0.259", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "37410", "lr": "0.000150602", "gnorm": "4.532", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "9353"} 2023-01-29 18:47:37 | INFO | train_inner | {"epoch": 18, "update": 17.313, "s2c_loss": "0.666", "loss": "0.46131", "s2c_nll_loss": "0.666", "s2c_accuracy": "92.031", "s2c_total": "64", "s2c_n_correct": "58.9", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "37420", "lr": "0.000150536", "gnorm": "4.121", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9355"} 2023-01-29 18:47:40 | INFO | train_inner | {"epoch": 18, "update": 17.318, "s2c_loss": "0.195", "loss": "0.135", "s2c_nll_loss": "0.195", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "37430", "lr": "0.000150469", "gnorm": "3.785", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9358"} 2023-01-29 18:47:42 | INFO | train_inner | {"epoch": 18, "update": 17.322, "s2c_loss": "0.241", "loss": "0.16734", "s2c_nll_loss": "0.241", "s2c_accuracy": "95.625", "s2c_total": 
"64", "s2c_n_correct": "61.2", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "37440", "lr": "0.000150402", "gnorm": "4.313", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9360"} 2023-01-29 18:47:45 | INFO | train_inner | {"epoch": 18, "update": 17.327, "s2c_loss": "0.207", "loss": "0.14331", "s2c_nll_loss": "0.207", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "242.1", "ups": "3.78", "wpb": "64", "bsz": "64", "num_updates": "37450", "lr": "0.000150336", "gnorm": "3.789", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9363"} 2023-01-29 18:47:48 | INFO | train_inner | {"epoch": 18, "update": 17.332, "s2c_loss": "0.254", "loss": "0.17619", "s2c_nll_loss": "0.254", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "37460", "lr": "0.000150269", "gnorm": "4.621", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9365"} 2023-01-29 18:47:50 | INFO | train_inner | {"epoch": 18, "update": 17.336, "s2c_loss": "0.221", "loss": "0.15309", "s2c_nll_loss": "0.221", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "37470", "lr": "0.000150202", "gnorm": "4.904", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9368"} 2023-01-29 18:47:53 | INFO | train_inner | {"epoch": 18, "update": 17.341, "s2c_loss": "0.185", "loss": "0.12842", "s2c_nll_loss": "0.185", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "37480", "lr": "0.000150136", "gnorm": "4.537", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9371"} 2023-01-29 18:47:55 | INFO | train_inner | {"epoch": 18, "update": 17.346, "s2c_loss": "0.154", "loss": "0.10696", "s2c_nll_loss": "0.154", "s2c_accuracy": 
"98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "37490", "lr": "0.000150069", "gnorm": "3.67", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9373"} 2023-01-29 18:47:58 | INFO | train_inner | {"epoch": 18, "update": 17.35, "s2c_loss": "0.176", "loss": "0.12233", "s2c_nll_loss": "0.176", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "37500", "lr": "0.000150002", "gnorm": "4.569", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9376"} 2023-01-29 18:48:00 | INFO | train_inner | {"epoch": 18, "update": 17.355, "s2c_loss": "0.173", "loss": "0.1202", "s2c_nll_loss": "0.173", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "37510", "lr": "0.000149936", "gnorm": "4.926", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9378"} 2023-01-29 18:48:03 | INFO | train_inner | {"epoch": 18, "update": 17.359, "s2c_loss": "0.182", "loss": "0.12623", "s2c_nll_loss": "0.182", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "37520", "lr": "0.000149869", "gnorm": "3.7", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9381"} 2023-01-29 18:48:05 | INFO | train_inner | {"epoch": 18, "update": 17.364, "s2c_loss": "0.211", "loss": "0.14609", "s2c_nll_loss": "0.211", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "37530", "lr": "0.000149803", "gnorm": "4.6", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "9383"} 2023-01-29 18:48:08 | INFO | train_inner | {"epoch": 18, "update": 17.369, "s2c_loss": "0.237", "loss": "0.16429", "s2c_nll_loss": "0.237", 
"s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "37540", "lr": "0.000149736", "gnorm": "3.965", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9386"} 2023-01-29 18:48:11 | INFO | train_inner | {"epoch": 18, "update": 17.373, "s2c_loss": "0.271", "loss": "0.1879", "s2c_nll_loss": "0.271", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "37550", "lr": "0.000149669", "gnorm": "4.258", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9388"} 2023-01-29 18:48:13 | INFO | train_inner | {"epoch": 18, "update": 17.378, "s2c_loss": "0.182", "loss": "0.1261", "s2c_nll_loss": "0.182", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "246.1", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "37560", "lr": "0.000149603", "gnorm": "4.407", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9391"} 2023-01-29 18:48:16 | INFO | train_inner | {"epoch": 18, "update": 17.383, "s2c_loss": "0.256", "loss": "0.17763", "s2c_nll_loss": "0.256", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "37570", "lr": "0.000149536", "gnorm": "5.971", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "9394"} 2023-01-29 18:48:18 | INFO | train_inner | {"epoch": 18, "update": 17.387, "s2c_loss": "0.249", "loss": "0.17278", "s2c_nll_loss": "0.249", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "37580", "lr": "0.000149469", "gnorm": "4.788", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "9396"} 2023-01-29 18:48:21 | INFO | train_inner | {"epoch": 18, "update": 17.392, "s2c_loss": "0.242", "loss": "0.16786", "s2c_nll_loss": 
"0.242", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "37590", "lr": "0.000149403", "gnorm": "5.089", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9399"} 2023-01-29 18:48:23 | INFO | train_inner | {"epoch": 18, "update": 17.396, "s2c_loss": "0.19", "loss": "0.13173", "s2c_nll_loss": "0.19", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "245", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "37600", "lr": "0.000149336", "gnorm": "4.6", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9401"} 2023-01-29 18:48:26 | INFO | train_inner | {"epoch": 18, "update": 17.401, "s2c_loss": "0.227", "loss": "0.15729", "s2c_nll_loss": "0.227", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "239.9", "ups": "3.75", "wpb": "64", "bsz": "64", "num_updates": "37610", "lr": "0.000149269", "gnorm": "4.589", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9404"} 2023-01-29 18:48:29 | INFO | train_inner | {"epoch": 18, "update": 17.406, "s2c_loss": "0.208", "loss": "0.1445", "s2c_nll_loss": "0.208", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "37620", "lr": "0.000149203", "gnorm": "4.038", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9406"} 2023-01-29 18:48:31 | INFO | train_inner | {"epoch": 18, "update": 17.41, "s2c_loss": "0.289", "loss": "0.20043", "s2c_nll_loss": "0.289", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "37630", "lr": "0.000149136", "gnorm": "5.493", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9409"} 2023-01-29 18:48:34 | INFO | train_inner | {"epoch": 18, "update": 17.415, "s2c_loss": "0.248", "loss": "0.17157", 
"s2c_nll_loss": "0.248", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "37640", "lr": "0.000149069", "gnorm": "5.357", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9411"} 2023-01-29 18:48:36 | INFO | train_inner | {"epoch": 18, "update": 17.42, "s2c_loss": "0.298", "loss": "0.20677", "s2c_nll_loss": "0.298", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "37650", "lr": "0.000149003", "gnorm": "4.979", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9414"} 2023-01-29 18:48:39 | INFO | train_inner | {"epoch": 18, "update": 17.424, "s2c_loss": "0.264", "loss": "0.18289", "s2c_nll_loss": "0.264", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "37660", "lr": "0.000148936", "gnorm": "4.899", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9417"} 2023-01-29 18:48:41 | INFO | train_inner | {"epoch": 18, "update": 17.429, "s2c_loss": "0.276", "loss": "0.19182", "s2c_nll_loss": "0.276", "s2c_accuracy": "94.349", "s2c_total": "63.7", "s2c_n_correct": "60.1", "wps": "252.5", "ups": "3.96", "wpb": "63.7", "bsz": "63.7", "num_updates": "37670", "lr": "0.000148869", "gnorm": "5.246", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9419"} 2023-01-29 18:48:44 | INFO | train_inner | {"epoch": 18, "update": 17.433, "s2c_loss": "0.282", "loss": "0.19551", "s2c_nll_loss": "0.282", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "37680", "lr": "0.000148803", "gnorm": "5.059", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9422"} 2023-01-29 18:48:46 | INFO | train_inner | {"epoch": 18, "update": 17.438, 
"s2c_loss": "0.345", "loss": "0.23909", "s2c_nll_loss": "0.345", "s2c_accuracy": "93.438", "s2c_total": "64", "s2c_n_correct": "59.8", "wps": "259.1", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "37690", "lr": "0.000148736", "gnorm": "5.389", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9424"} 2023-01-29 18:48:49 | INFO | train_inner | {"epoch": 18, "update": 17.443, "s2c_loss": "0.285", "loss": "0.19788", "s2c_nll_loss": "0.285", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "37700", "lr": "0.000148669", "gnorm": "5.669", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9427"} 2023-01-29 18:48:51 | INFO | train_inner | {"epoch": 18, "update": 17.447, "s2c_loss": "0.219", "loss": "0.15161", "s2c_nll_loss": "0.219", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "37710", "lr": "0.000148603", "gnorm": "3.876", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "9429"} 2023-01-29 18:48:54 | INFO | train_inner | {"epoch": 18, "update": 17.452, "s2c_loss": "0.242", "loss": "0.16768", "s2c_nll_loss": "0.242", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "37720", "lr": "0.000148536", "gnorm": "4.269", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9432"} 2023-01-29 18:48:56 | INFO | train_inner | {"epoch": 18, "update": 17.457, "s2c_loss": "0.133", "loss": "0.09206", "s2c_nll_loss": "0.133", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "37730", "lr": "0.000148469", "gnorm": "3.271", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9434"} 2023-01-29 18:48:59 | INFO | train_inner | {"epoch": 
18, "update": 17.461, "s2c_loss": "0.174", "loss": "0.12041", "s2c_nll_loss": "0.174", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "246.4", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "37740", "lr": "0.000148403", "gnorm": "4.155", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9437"} 2023-01-29 18:49:01 | INFO | train_inner | {"epoch": 18, "update": 17.466, "s2c_loss": "0.18", "loss": "0.12456", "s2c_nll_loss": "0.18", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "37750", "lr": "0.000148336", "gnorm": "3.983", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9439"} 2023-01-29 18:49:04 | INFO | train_inner | {"epoch": 18, "update": 17.47, "s2c_loss": "0.232", "loss": "0.16075", "s2c_nll_loss": "0.232", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "37760", "lr": "0.000148269", "gnorm": "3.931", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9442"} 2023-01-29 18:49:06 | INFO | train_inner | {"epoch": 18, "update": 17.475, "s2c_loss": "0.24", "loss": "0.16637", "s2c_nll_loss": "0.24", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "37770", "lr": "0.000148203", "gnorm": "5.068", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9444"} 2023-01-29 18:49:09 | INFO | train_inner | {"epoch": 18, "update": 17.48, "s2c_loss": "0.238", "loss": "0.1651", "s2c_nll_loss": "0.238", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "253.8", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "37780", "lr": "0.000148136", "gnorm": "4.325", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9447"} 2023-01-29 18:49:11 | INFO | 
train_inner | {"epoch": 18, "update": 17.484, "s2c_loss": "0.214", "loss": "0.148", "s2c_nll_loss": "0.214", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "37790", "lr": "0.000148069", "gnorm": "5.757", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9449"} 2023-01-29 18:49:14 | INFO | train_inner | {"epoch": 18, "update": 17.489, "s2c_loss": "0.163", "loss": "0.11275", "s2c_nll_loss": "0.163", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "37800", "lr": "0.000148003", "gnorm": "3.203", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9452"} 2023-01-29 18:49:16 | INFO | train_inner | {"epoch": 18, "update": 17.494, "s2c_loss": "0.243", "loss": "0.16836", "s2c_nll_loss": "0.243", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "37810", "lr": "0.000147936", "gnorm": "4.156", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9454"} 2023-01-29 18:49:19 | INFO | train_inner | {"epoch": 18, "update": 17.498, "s2c_loss": "0.27", "loss": "0.18733", "s2c_nll_loss": "0.27", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "37820", "lr": "0.000147869", "gnorm": "4.348", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9457"} 2023-01-29 18:49:21 | INFO | train_inner | {"epoch": 18, "update": 17.503, "s2c_loss": "0.24", "loss": "0.16651", "s2c_nll_loss": "0.24", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "37830", "lr": "0.000147803", "gnorm": "4.295", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9459"} 2023-01-29 
18:49:24 | INFO | train_inner | {"epoch": 18, "update": 17.507, "s2c_loss": "0.252", "loss": "0.17469", "s2c_nll_loss": "0.252", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "37840", "lr": "0.000147736", "gnorm": "4.347", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9462"} 2023-01-29 18:49:26 | INFO | train_inner | {"epoch": 18, "update": 17.512, "s2c_loss": "0.305", "loss": "0.21111", "s2c_nll_loss": "0.305", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "37850", "lr": "0.000147669", "gnorm": "5.17", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9464"} 2023-01-29 18:49:29 | INFO | train_inner | {"epoch": 18, "update": 17.517, "s2c_loss": "0.226", "loss": "0.15686", "s2c_nll_loss": "0.226", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "37860", "lr": "0.000147603", "gnorm": "3.991", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9467"} 2023-01-29 18:49:31 | INFO | train_inner | {"epoch": 18, "update": 17.521, "s2c_loss": "0.198", "loss": "0.13703", "s2c_nll_loss": "0.198", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "37870", "lr": "0.000147536", "gnorm": "4.038", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9469"} 2023-01-29 18:49:34 | INFO | train_inner | {"epoch": 18, "update": 17.526, "s2c_loss": "0.187", "loss": "0.12965", "s2c_nll_loss": "0.187", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "37880", "lr": "0.000147469", "gnorm": "3.654", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", 
"wall": "9472"} 2023-01-29 18:49:36 | INFO | train_inner | {"epoch": 18, "update": 17.531, "s2c_loss": "0.18", "loss": "0.12509", "s2c_nll_loss": "0.18", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "37890", "lr": "0.000147403", "gnorm": "3.804", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9474"} 2023-01-29 18:49:39 | INFO | train_inner | {"epoch": 18, "update": 17.535, "s2c_loss": "0.162", "loss": "0.11209", "s2c_nll_loss": "0.162", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "37900", "lr": "0.000147336", "gnorm": "3.667", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9477"} 2023-01-29 18:49:41 | INFO | train_inner | {"epoch": 18, "update": 17.54, "s2c_loss": "0.177", "loss": "0.1228", "s2c_nll_loss": "0.177", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "37910", "lr": "0.000147269", "gnorm": "3.595", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9479"} 2023-01-29 18:49:44 | INFO | train_inner | {"epoch": 18, "update": 17.544, "s2c_loss": "0.242", "loss": "0.16778", "s2c_nll_loss": "0.242", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "259.5", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "37920", "lr": "0.000147203", "gnorm": "4.347", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9482"} 2023-01-29 18:49:46 | INFO | train_inner | {"epoch": 18, "update": 17.549, "s2c_loss": "0.236", "loss": "0.16357", "s2c_nll_loss": "0.236", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "37930", "lr": "0.000147136", "gnorm": "4.703", "loss_scale": "2048", "train_wall": "2", 
"gb_free": "7.3", "wall": "9484"} 2023-01-29 18:49:49 | INFO | train_inner | {"epoch": 18, "update": 17.554, "s2c_loss": "0.305", "loss": "0.2111", "s2c_nll_loss": "0.305", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "37940", "lr": "0.000147069", "gnorm": "5.224", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9487"} 2023-01-29 18:49:52 | INFO | train_inner | {"epoch": 18, "update": 17.558, "s2c_loss": "0.215", "loss": "0.1493", "s2c_nll_loss": "0.215", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "245.8", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "37950", "lr": "0.000147003", "gnorm": "5.054", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9490"} 2023-01-29 18:49:54 | INFO | train_inner | {"epoch": 18, "update": 17.563, "s2c_loss": "0.28", "loss": "0.19389", "s2c_nll_loss": "0.28", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "37960", "lr": "0.000146936", "gnorm": "4.664", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9492"} 2023-01-29 18:49:57 | INFO | train_inner | {"epoch": 18, "update": 17.568, "s2c_loss": "0.256", "loss": "0.17771", "s2c_nll_loss": "0.256", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "37970", "lr": "0.000146869", "gnorm": "5.132", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9495"} 2023-01-29 18:49:59 | INFO | train_inner | {"epoch": 18, "update": 17.572, "s2c_loss": "0.262", "loss": "0.18157", "s2c_nll_loss": "0.262", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "37980", "lr": "0.000146803", "gnorm": "5.243", "loss_scale": "2048", 
"train_wall": "2", "gb_free": "7.3", "wall": "9497"} 2023-01-29 18:50:02 | INFO | train_inner | {"epoch": 18, "update": 17.577, "s2c_loss": "0.248", "loss": "0.17167", "s2c_nll_loss": "0.248", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "37990", "lr": "0.000146736", "gnorm": "4.87", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9500"} 2023-01-29 18:50:04 | INFO | train_inner | {"epoch": 18, "update": 17.581, "s2c_loss": "0.2", "loss": "0.13877", "s2c_nll_loss": "0.2", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "38000", "lr": "0.000146669", "gnorm": "5.706", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9502"} 2023-01-29 18:50:07 | INFO | train_inner | {"epoch": 18, "update": 17.586, "s2c_loss": "0.287", "loss": "0.19895", "s2c_nll_loss": "0.287", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "38010", "lr": "0.000146603", "gnorm": "6.234", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9505"} 2023-01-29 18:50:09 | INFO | train_inner | {"epoch": 18, "update": 17.591, "s2c_loss": "0.315", "loss": "0.21863", "s2c_nll_loss": "0.315", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "38020", "lr": "0.000146536", "gnorm": "4.85", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9507"} 2023-01-29 18:50:12 | INFO | train_inner | {"epoch": 18, "update": 17.595, "s2c_loss": "0.234", "loss": "0.16246", "s2c_nll_loss": "0.234", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "38030", "lr": "0.000146469", "gnorm": "5.161", "loss_scale": 
"2048", "train_wall": "2", "gb_free": "7.2", "wall": "9510"} 2023-01-29 18:50:14 | INFO | train_inner | {"epoch": 18, "update": 17.6, "s2c_loss": "0.213", "loss": "0.14773", "s2c_nll_loss": "0.213", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "38040", "lr": "0.000146403", "gnorm": "4.442", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9512"} 2023-01-29 18:50:17 | INFO | train_inner | {"epoch": 18, "update": 17.605, "s2c_loss": "0.305", "loss": "0.21161", "s2c_nll_loss": "0.305", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "262.2", "ups": "4.1", "wpb": "64", "bsz": "64", "num_updates": "38050", "lr": "0.000146336", "gnorm": "5.33", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "9515"} 2023-01-29 18:50:19 | INFO | train_inner | {"epoch": 18, "update": 17.609, "s2c_loss": "0.248", "loss": "0.17205", "s2c_nll_loss": "0.248", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "245.8", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "38060", "lr": "0.000146269", "gnorm": "4.501", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9517"} 2023-01-29 18:50:22 | INFO | train_inner | {"epoch": 18, "update": 17.614, "s2c_loss": "0.275", "loss": "0.19069", "s2c_nll_loss": "0.275", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "38070", "lr": "0.000146203", "gnorm": "5.367", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9520"} 2023-01-29 18:50:24 | INFO | train_inner | {"epoch": 18, "update": 17.618, "s2c_loss": "0.195", "loss": "0.13542", "s2c_nll_loss": "0.195", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "38080", "lr": "0.000146136", "gnorm": 
"4.025", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "9522"} 2023-01-29 18:50:27 | INFO | train_inner | {"epoch": 18, "update": 17.623, "s2c_loss": "0.223", "loss": "0.1548", "s2c_nll_loss": "0.223", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "38090", "lr": "0.000146069", "gnorm": "4.375", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9525"} 2023-01-29 18:50:30 | INFO | train_inner | {"epoch": 18, "update": 17.628, "s2c_loss": "0.226", "loss": "0.15676", "s2c_nll_loss": "0.226", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "38100", "lr": "0.000146003", "gnorm": "4.923", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "9527"} 2023-01-29 18:50:32 | INFO | train_inner | {"epoch": 18, "update": 17.632, "s2c_loss": "0.297", "loss": "0.20552", "s2c_nll_loss": "0.297", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "38110", "lr": "0.000145936", "gnorm": "5.548", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9530"} 2023-01-29 18:50:35 | INFO | train_inner | {"epoch": 18, "update": 17.637, "s2c_loss": "0.241", "loss": "0.16712", "s2c_nll_loss": "0.241", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "38120", "lr": "0.000145869", "gnorm": "6.064", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9532"} 2023-01-29 18:50:37 | INFO | train_inner | {"epoch": 18, "update": 17.642, "s2c_loss": "0.247", "loss": "0.17101", "s2c_nll_loss": "0.247", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "38130", "lr": 
"0.000145803", "gnorm": "5.802", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9535"} 2023-01-29 18:50:40 | INFO | train_inner | {"epoch": 18, "update": 17.646, "s2c_loss": "0.274", "loss": "0.18985", "s2c_nll_loss": "0.274", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "259.5", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "38140", "lr": "0.000145736", "gnorm": "5.359", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9537"} 2023-01-29 18:50:42 | INFO | train_inner | {"epoch": 18, "update": 17.651, "s2c_loss": "0.263", "loss": "0.18232", "s2c_nll_loss": "0.263", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "38150", "lr": "0.000145669", "gnorm": "5.26", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9540"} 2023-01-29 18:50:45 | INFO | train_inner | {"epoch": 18, "update": 17.655, "s2c_loss": "0.297", "loss": "0.20581", "s2c_nll_loss": "0.297", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "38160", "lr": "0.000145603", "gnorm": "4.845", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9543"} 2023-01-29 18:50:47 | INFO | train_inner | {"epoch": 18, "update": 17.66, "s2c_loss": "0.246", "loss": "0.17071", "s2c_nll_loss": "0.246", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "38170", "lr": "0.000145536", "gnorm": "4.36", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9545"} 2023-01-29 18:50:50 | INFO | train_inner | {"epoch": 18, "update": 17.665, "s2c_loss": "0.254", "loss": "0.17579", "s2c_nll_loss": "0.254", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": 
"38180", "lr": "0.000145469", "gnorm": "4.271", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9548"} 2023-01-29 18:50:52 | INFO | train_inner | {"epoch": 18, "update": 17.669, "s2c_loss": "0.233", "loss": "0.16156", "s2c_nll_loss": "0.233", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "38190", "lr": "0.000145403", "gnorm": "4.818", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9550"} 2023-01-29 18:50:55 | INFO | train_inner | {"epoch": 18, "update": 17.674, "s2c_loss": "0.18", "loss": "0.12461", "s2c_nll_loss": "0.18", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "38200", "lr": "0.000145336", "gnorm": "3.838", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9553"} 2023-01-29 18:50:57 | INFO | train_inner | {"epoch": 18, "update": 17.679, "s2c_loss": "0.228", "loss": "0.15797", "s2c_nll_loss": "0.228", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "38210", "lr": "0.000145269", "gnorm": "4.68", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9555"} 2023-01-29 18:51:00 | INFO | train_inner | {"epoch": 18, "update": 17.683, "s2c_loss": "0.192", "loss": "0.13277", "s2c_nll_loss": "0.192", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "38220", "lr": "0.000145203", "gnorm": "3.773", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9558"} 2023-01-29 18:51:02 | INFO | train_inner | {"epoch": 18, "update": 17.688, "s2c_loss": "0.258", "loss": "0.17901", "s2c_nll_loss": "0.258", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": 
"64", "num_updates": "38230", "lr": "0.000145136", "gnorm": "5.828", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9560"} 2023-01-29 18:51:05 | INFO | train_inner | {"epoch": 18, "update": 17.692, "s2c_loss": "0.373", "loss": "0.25847", "s2c_nll_loss": "0.373", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "38240", "lr": "0.000145069", "gnorm": "5.229", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "9563"} 2023-01-29 18:51:07 | INFO | train_inner | {"epoch": 18, "update": 17.697, "s2c_loss": "0.28", "loss": "0.19382", "s2c_nll_loss": "0.28", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "38250", "lr": "0.000145003", "gnorm": "5.458", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9565"} 2023-01-29 18:51:10 | INFO | train_inner | {"epoch": 18, "update": 17.702, "s2c_loss": "0.259", "loss": "0.1792", "s2c_nll_loss": "0.259", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "38260", "lr": "0.000144936", "gnorm": "4.215", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9568"} 2023-01-29 18:51:12 | INFO | train_inner | {"epoch": 18, "update": 17.706, "s2c_loss": "0.349", "loss": "0.2417", "s2c_nll_loss": "0.349", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "38270", "lr": "0.000144869", "gnorm": "5.621", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9570"} 2023-01-29 18:51:15 | INFO | train_inner | {"epoch": 18, "update": 17.711, "s2c_loss": "0.184", "loss": "0.12779", "s2c_nll_loss": "0.184", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "251.2", "ups": "3.92", 
"wpb": "64", "bsz": "64", "num_updates": "38280", "lr": "0.000144803", "gnorm": "4.03", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9573"} 2023-01-29 18:51:18 | INFO | train_inner | {"epoch": 18, "update": 17.716, "s2c_loss": "0.242", "loss": "0.16767", "s2c_nll_loss": "0.242", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "38290", "lr": "0.000144736", "gnorm": "5.228", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "9575"} 2023-01-29 18:51:20 | INFO | train_inner | {"epoch": 18, "update": 17.72, "s2c_loss": "0.235", "loss": "0.16291", "s2c_nll_loss": "0.235", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "38300", "lr": "0.000144669", "gnorm": "5.487", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9578"} 2023-01-29 18:51:23 | INFO | train_inner | {"epoch": 18, "update": 17.725, "s2c_loss": "0.23", "loss": "0.15956", "s2c_nll_loss": "0.23", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "38310", "lr": "0.000144603", "gnorm": "5.202", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9581"} 2023-01-29 18:51:25 | INFO | train_inner | {"epoch": 18, "update": 17.729, "s2c_loss": "0.3", "loss": "0.20792", "s2c_nll_loss": "0.3", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "38320", "lr": "0.000144536", "gnorm": "5.193", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9583"} 2023-01-29 18:51:28 | INFO | train_inner | {"epoch": 18, "update": 17.734, "s2c_loss": "0.241", "loss": "0.16728", "s2c_nll_loss": "0.241", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "251", 
"ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "38330", "lr": "0.000144469", "gnorm": "4.597", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9586"} 2023-01-29 18:51:30 | INFO | train_inner | {"epoch": 18, "update": 17.739, "s2c_loss": "0.222", "loss": "0.15418", "s2c_nll_loss": "0.222", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "38340", "lr": "0.000144403", "gnorm": "4.709", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9588"} 2023-01-29 18:51:33 | INFO | train_inner | {"epoch": 18, "update": 17.743, "s2c_loss": "0.273", "loss": "0.18905", "s2c_nll_loss": "0.273", "s2c_accuracy": "94.219", "s2c_total": "64", "s2c_n_correct": "60.3", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "38350", "lr": "0.000144336", "gnorm": "4.861", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9591"} 2023-01-29 18:51:35 | INFO | train_inner | {"epoch": 18, "update": 17.748, "s2c_loss": "0.261", "loss": "0.18071", "s2c_nll_loss": "0.261", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "38360", "lr": "0.000144269", "gnorm": "4.326", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9593"} 2023-01-29 18:51:38 | INFO | train_inner | {"epoch": 18, "update": 17.753, "s2c_loss": "0.245", "loss": "0.17002", "s2c_nll_loss": "0.245", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "38370", "lr": "0.000144203", "gnorm": "4.287", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9596"} 2023-01-29 18:51:40 | INFO | train_inner | {"epoch": 18, "update": 17.757, "s2c_loss": "0.199", "loss": "0.1376", "s2c_nll_loss": "0.199", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": 
"61.6", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "38380", "lr": "0.000144136", "gnorm": "3.934", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9598"} 2023-01-29 18:51:43 | INFO | train_inner | {"epoch": 18, "update": 17.762, "s2c_loss": "0.227", "loss": "0.15746", "s2c_nll_loss": "0.227", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "38390", "lr": "0.000144069", "gnorm": "4.143", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9601"} 2023-01-29 18:51:45 | INFO | train_inner | {"epoch": 18, "update": 17.766, "s2c_loss": "0.206", "loss": "0.14309", "s2c_nll_loss": "0.206", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "38400", "lr": "0.000144003", "gnorm": "3.902", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9603"} 2023-01-29 18:51:48 | INFO | train_inner | {"epoch": 18, "update": 17.771, "s2c_loss": "0.257", "loss": "0.17806", "s2c_nll_loss": "0.257", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "38410", "lr": "0.000143936", "gnorm": "4.568", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9606"} 2023-01-29 18:51:50 | INFO | train_inner | {"epoch": 18, "update": 17.776, "s2c_loss": "0.152", "loss": "0.10563", "s2c_nll_loss": "0.152", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "38420", "lr": "0.000143869", "gnorm": "3.562", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9608"} 2023-01-29 18:51:53 | INFO | train_inner | {"epoch": 18, "update": 17.78, "s2c_loss": "0.196", "loss": "0.13602", "s2c_nll_loss": "0.196", "s2c_accuracy": "96.406", "s2c_total": 
"64", "s2c_n_correct": "61.7", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "38430", "lr": "0.000143803", "gnorm": "4.161", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9611"} 2023-01-29 18:51:56 | INFO | train_inner | {"epoch": 18, "update": 17.785, "s2c_loss": "0.196", "loss": "0.13615", "s2c_nll_loss": "0.196", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "38440", "lr": "0.000143736", "gnorm": "4.21", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9613"} 2023-01-29 18:51:58 | INFO | train_inner | {"epoch": 18, "update": 17.79, "s2c_loss": "0.222", "loss": "0.15408", "s2c_nll_loss": "0.222", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "38450", "lr": "0.000143669", "gnorm": "3.928", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9616"} 2023-01-29 18:52:01 | INFO | train_inner | {"epoch": 18, "update": 17.794, "s2c_loss": "0.263", "loss": "0.18243", "s2c_nll_loss": "0.263", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "38460", "lr": "0.000143603", "gnorm": "5.241", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9619"} 2023-01-29 18:52:03 | INFO | train_inner | {"epoch": 18, "update": 17.799, "s2c_loss": "0.232", "loss": "0.16052", "s2c_nll_loss": "0.232", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "38470", "lr": "0.000143536", "gnorm": "4.756", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9621"} 2023-01-29 18:52:06 | INFO | train_inner | {"epoch": 18, "update": 17.803, "s2c_loss": "0.351", "loss": "0.24334", "s2c_nll_loss": "0.351", "s2c_accuracy": 
"93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "38480", "lr": "0.000143469", "gnorm": "5.91", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "9624"} 2023-01-29 18:52:08 | INFO | train_inner | {"epoch": 18, "update": 17.808, "s2c_loss": "0.313", "loss": "0.21675", "s2c_nll_loss": "0.313", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "38490", "lr": "0.000143403", "gnorm": "5.794", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9626"} 2023-01-29 18:52:11 | INFO | train_inner | {"epoch": 18, "update": 17.813, "s2c_loss": "0.274", "loss": "0.18992", "s2c_nll_loss": "0.274", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "38500", "lr": "0.000143336", "gnorm": "4.886", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9629"} 2023-01-29 18:52:13 | INFO | train_inner | {"epoch": 18, "update": 17.817, "s2c_loss": "0.219", "loss": "0.15184", "s2c_nll_loss": "0.219", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "258.2", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "38510", "lr": "0.00014327", "gnorm": "4.486", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9631"} 2023-01-29 18:52:16 | INFO | train_inner | {"epoch": 18, "update": 17.822, "s2c_loss": "0.303", "loss": "0.20974", "s2c_nll_loss": "0.303", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "38520", "lr": "0.000143203", "gnorm": "5.249", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9634"} 2023-01-29 18:52:18 | INFO | train_inner | {"epoch": 18, "update": 17.827, "s2c_loss": "0.243", "loss": "0.16838", "s2c_nll_loss": 
"0.243", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "38530", "lr": "0.000143136", "gnorm": "4.725", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9636"} 2023-01-29 18:52:21 | INFO | train_inner | {"epoch": 18, "update": 17.831, "s2c_loss": "0.245", "loss": "0.16963", "s2c_nll_loss": "0.245", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "38540", "lr": "0.00014307", "gnorm": "5.081", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9639"} 2023-01-29 18:52:23 | INFO | train_inner | {"epoch": 18, "update": 17.836, "s2c_loss": "0.215", "loss": "0.14891", "s2c_nll_loss": "0.215", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "38550", "lr": "0.000143003", "gnorm": "3.815", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9641"} 2023-01-29 18:52:26 | INFO | train_inner | {"epoch": 18, "update": 17.84, "s2c_loss": "0.239", "loss": "0.16598", "s2c_nll_loss": "0.239", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "38560", "lr": "0.000142936", "gnorm": "4.581", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9644"} 2023-01-29 18:52:29 | INFO | train_inner | {"epoch": 18, "update": 17.845, "s2c_loss": "0.22", "loss": "0.15283", "s2c_nll_loss": "0.22", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "38570", "lr": "0.00014287", "gnorm": "4.426", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9646"} 2023-01-29 18:52:31 | INFO | train_inner | {"epoch": 18, "update": 17.85, "s2c_loss": "0.309", "loss": "0.21405", 
"s2c_nll_loss": "0.309", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "38580", "lr": "0.000142803", "gnorm": "5.035", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9649"} 2023-01-29 18:52:34 | INFO | train_inner | {"epoch": 18, "update": 17.854, "s2c_loss": "0.178", "loss": "0.12319", "s2c_nll_loss": "0.178", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "38590", "lr": "0.000142736", "gnorm": "4.236", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "9651"} 2023-01-29 18:52:36 | INFO | train_inner | {"epoch": 18, "update": 17.859, "s2c_loss": "0.209", "loss": "0.14454", "s2c_nll_loss": "0.209", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "258.2", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "38600", "lr": "0.00014267", "gnorm": "4.686", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9654"} 2023-01-29 18:52:39 | INFO | train_inner | {"epoch": 18, "update": 17.864, "s2c_loss": "0.238", "loss": "0.16489", "s2c_nll_loss": "0.238", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "38610", "lr": "0.000142603", "gnorm": "5.061", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9656"} 2023-01-29 18:52:41 | INFO | train_inner | {"epoch": 18, "update": 17.868, "s2c_loss": "0.294", "loss": "0.20388", "s2c_nll_loss": "0.294", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "38620", "lr": "0.000142536", "gnorm": "5.313", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9659"} 2023-01-29 18:52:44 | INFO | train_inner | {"epoch": 18, "update": 17.873, "s2c_loss": "0.632", 
"loss": "0.43825", "s2c_nll_loss": "0.632", "s2c_accuracy": "89.688", "s2c_total": "64", "s2c_n_correct": "57.4", "wps": "257.6", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "38630", "lr": "0.00014247", "gnorm": "6.083", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9661"} 2023-01-29 18:52:46 | INFO | train_inner | {"epoch": 18, "update": 17.877, "s2c_loss": "0.262", "loss": "0.1815", "s2c_nll_loss": "0.262", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "38640", "lr": "0.000142403", "gnorm": "4.65", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9664"} 2023-01-29 18:52:49 | INFO | train_inner | {"epoch": 18, "update": 17.882, "s2c_loss": "0.298", "loss": "0.20681", "s2c_nll_loss": "0.298", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "259.1", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "38650", "lr": "0.000142336", "gnorm": "4.586", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9666"} 2023-01-29 18:52:51 | INFO | train_inner | {"epoch": 18, "update": 17.887, "s2c_loss": "0.187", "loss": "0.12991", "s2c_nll_loss": "0.187", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "38660", "lr": "0.00014227", "gnorm": "4.318", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9669"} 2023-01-29 18:52:54 | INFO | train_inner | {"epoch": 18, "update": 17.891, "s2c_loss": "0.217", "loss": "0.15022", "s2c_nll_loss": "0.217", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "38670", "lr": "0.000142203", "gnorm": "4.417", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9672"} 2023-01-29 18:52:56 | INFO | train_inner | {"epoch": 18, "update": 17.896, 
"s2c_loss": "0.186", "loss": "0.12884", "s2c_nll_loss": "0.186", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "38680", "lr": "0.000142136", "gnorm": "4.524", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "9674"} 2023-01-29 18:52:59 | INFO | train_inner | {"epoch": 18, "update": 17.901, "s2c_loss": "0.269", "loss": "0.18645", "s2c_nll_loss": "0.269", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "38690", "lr": "0.00014207", "gnorm": "4.437", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9677"} 2023-01-29 18:53:01 | INFO | train_inner | {"epoch": 18, "update": 17.905, "s2c_loss": "0.241", "loss": "0.16728", "s2c_nll_loss": "0.241", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "38700", "lr": "0.000142003", "gnorm": "4.505", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9679"} 2023-01-29 18:53:04 | INFO | train_inner | {"epoch": 18, "update": 17.91, "s2c_loss": "0.291", "loss": "0.20189", "s2c_nll_loss": "0.291", "s2c_accuracy": "94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "38710", "lr": "0.000141936", "gnorm": "5.037", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9682"} 2023-01-29 18:53:06 | INFO | train_inner | {"epoch": 18, "update": 17.914, "s2c_loss": "0.272", "loss": "0.18857", "s2c_nll_loss": "0.272", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "38720", "lr": "0.00014187", "gnorm": "5.268", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "9684"} 2023-01-29 18:53:09 | INFO | train_inner | {"epoch": 18, 
"update": 17.919, "s2c_loss": "0.303", "loss": "0.21005", "s2c_nll_loss": "0.303", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "38730", "lr": "0.000141803", "gnorm": "5.148", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9687"} 2023-01-29 18:53:11 | INFO | train_inner | {"epoch": 18, "update": 17.924, "s2c_loss": "0.251", "loss": "0.17392", "s2c_nll_loss": "0.251", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "38740", "lr": "0.000141736", "gnorm": "4.68", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9689"} 2023-01-29 18:53:14 | INFO | train_inner | {"epoch": 18, "update": 17.928, "s2c_loss": "0.156", "loss": "0.10827", "s2c_nll_loss": "0.156", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "38750", "lr": "0.00014167", "gnorm": "3.545", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "9692"} 2023-01-29 18:53:16 | INFO | train_inner | {"epoch": 18, "update": 17.933, "s2c_loss": "0.251", "loss": "0.17408", "s2c_nll_loss": "0.251", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "38760", "lr": "0.000141603", "gnorm": "4.159", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "9694"} 2023-01-29 18:53:19 | INFO | train_inner | {"epoch": 18, "update": 17.938, "s2c_loss": "0.383", "loss": "0.26567", "s2c_nll_loss": "0.383", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "38770", "lr": "0.000141536", "gnorm": "3.583", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9697"} 2023-01-29 18:53:22 | INFO | train_inner 
| {"epoch": 18, "update": 17.942, "s2c_loss": "0.258", "loss": "0.17868", "s2c_nll_loss": "0.258", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "38780", "lr": "0.00014147", "gnorm": "4.747", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "9699"} 2023-01-29 18:53:24 | INFO | train_inner | {"epoch": 18, "update": 17.947, "s2c_loss": "0.341", "loss": "0.23664", "s2c_nll_loss": "0.341", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "38790", "lr": "0.000141403", "gnorm": "4.883", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9702"} 2023-01-29 18:53:26 | INFO | train_inner | {"epoch": 18, "update": 17.951, "s2c_loss": "0.294", "loss": "0.20409", "s2c_nll_loss": "0.294", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "259.4", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "38800", "lr": "0.000141336", "gnorm": "4.136", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9704"} 2023-01-29 18:53:29 | INFO | train_inner | {"epoch": 18, "update": 17.956, "s2c_loss": "0.209", "loss": "0.14474", "s2c_nll_loss": "0.209", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "38810", "lr": "0.00014127", "gnorm": "3.921", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9707"} 2023-01-29 18:53:32 | INFO | train_inner | {"epoch": 18, "update": 17.961, "s2c_loss": "0.227", "loss": "0.15748", "s2c_nll_loss": "0.227", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "38820", "lr": "0.000141203", "gnorm": "4.111", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9709"} 2023-01-29 
18:53:34 | INFO | train_inner | {"epoch": 18, "update": 17.965, "s2c_loss": "0.394", "loss": "0.27306", "s2c_nll_loss": "0.394", "s2c_accuracy": "92.344", "s2c_total": "64", "s2c_n_correct": "59.1", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "38830", "lr": "0.000141136", "gnorm": "6.797", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9712"} 2023-01-29 18:53:37 | INFO | train_inner | {"epoch": 18, "update": 17.97, "s2c_loss": "0.234", "loss": "0.16193", "s2c_nll_loss": "0.234", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "38840", "lr": "0.00014107", "gnorm": "5.144", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9714"} 2023-01-29 18:53:39 | INFO | train_inner | {"epoch": 18, "update": 17.975, "s2c_loss": "0.184", "loss": "0.12756", "s2c_nll_loss": "0.184", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "38850", "lr": "0.000141003", "gnorm": "4.179", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "9717"} 2023-01-29 18:53:42 | INFO | train_inner | {"epoch": 18, "update": 17.979, "s2c_loss": "0.224", "loss": "0.15513", "s2c_nll_loss": "0.224", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "255", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "38860", "lr": "0.000140936", "gnorm": "5.494", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "9719"} 2023-01-29 18:53:43 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 1024.0 2023-01-29 18:53:44 | INFO | train_inner | {"epoch": 18, "update": 17.984, "s2c_loss": "0.256", "loss": "0.1774", "s2c_nll_loss": "0.256", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "229.1", "ups": "3.58", "wpb": "64", "bsz": "64", 
"num_updates": "38870", "lr": "0.00014087", "gnorm": "4.036", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9722"} 2023-01-29 18:53:47 | INFO | train_inner | {"epoch": 18, "update": 17.989, "s2c_loss": "0.205", "loss": "0.14177", "s2c_nll_loss": "0.205", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "38880", "lr": "0.000140803", "gnorm": "4.229", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9725"} 2023-01-29 18:53:49 | INFO | train_inner | {"epoch": 18, "update": 17.994, "s2c_loss": "0.195", "loss": "0.13509", "s2c_nll_loss": "0.195", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "38890", "lr": "0.000140736", "gnorm": "4.171", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9727"} 2023-01-29 18:53:52 | INFO | train_inner | {"epoch": 18, "update": 17.998, "s2c_loss": "0.23", "loss": "0.15949", "s2c_nll_loss": "0.23", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "38900", "lr": "0.00014067", "gnorm": "4.558", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9730"} 2023-01-29 18:53:53 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 18:54:07 | INFO | valid | {"epoch": 18, "valid_s2c_loss": "0.754", "valid_loss": "0.52233", "valid_s2c_nll_loss": "0.754", "valid_s2c_accuracy": "86.774", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "27.7315", "valid_num_updates": "38904", "valid_best_s2c_accuracy": "86.774"} 2023-01-29 18:54:07 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 18 @ 38904 updates 2023-01-29 18:54:07 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 18:54:14 | 
INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 18:54:19 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt (epoch 18 @ 38904 updates, score 86.774) (writing took 11.75264347018674 seconds) 2023-01-29 18:54:19 | INFO | fairseq_cli.train | end of epoch 18 (average epoch stats below) 2023-01-29 18:54:19 | INFO | train | {"epoch": 18, "train_s2c_loss": "0.248", "train_loss": "0.17168", "train_s2c_nll_loss": "0.248", "train_s2c_accuracy": "95.566", "train_s2c_total": "63.9838", "train_s2c_n_correct": "61.1467", "train_wps": "238.7", "train_ups": "3.73", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "38904", "train_lr": "0.000140643", "train_gnorm": "4.651", "train_loss_scale": "1024", "train_train_wall": "540", "train_gb_free": "7.5", "train_wall": "9757"} 2023-01-29 18:54:25 | INFO | fairseq.trainer | begin training epoch 19 2023-01-29 18:54:25 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 18:54:27 | INFO | train_inner | {"epoch": 19, "update": 18.003, "s2c_loss": "0.157", "loss": "0.10914", "s2c_nll_loss": "0.157", "s2c_accuracy": "97.862", "s2c_total": "60.8", "s2c_n_correct": "59.5", "wps": "17.3", "ups": "0.28", "wpb": "60.8", "bsz": "60.8", "num_updates": "38910", "lr": "0.000140603", "gnorm": "3.168", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9765"} 2023-01-29 18:54:30 | INFO | train_inner | {"epoch": 19, "update": 18.007, "s2c_loss": "0.149", "loss": "0.10341", "s2c_nll_loss": "0.149", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "38920", "lr": "0.000140536", "gnorm": "3.706", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9767"} 2023-01-29 18:54:32 | INFO | train_inner | {"epoch": 19, "update": 18.012, "s2c_loss": "0.171", "loss": 
"0.1188", "s2c_nll_loss": "0.171", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "38930", "lr": "0.00014047", "gnorm": "4.304", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "9770"} 2023-01-29 18:54:35 | INFO | train_inner | {"epoch": 19, "update": 18.017, "s2c_loss": "0.226", "loss": "0.15684", "s2c_nll_loss": "0.226", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "38940", "lr": "0.000140403", "gnorm": "4.467", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9773"} 2023-01-29 18:54:37 | INFO | train_inner | {"epoch": 19, "update": 18.021, "s2c_loss": "0.177", "loss": "0.12279", "s2c_nll_loss": "0.177", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "38950", "lr": "0.000140336", "gnorm": "3.799", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9775"} 2023-01-29 18:54:40 | INFO | train_inner | {"epoch": 19, "update": 18.026, "s2c_loss": "0.152", "loss": "0.10552", "s2c_nll_loss": "0.152", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "38960", "lr": "0.00014027", "gnorm": "3.908", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9778"} 2023-01-29 18:54:42 | INFO | train_inner | {"epoch": 19, "update": 18.031, "s2c_loss": "0.14", "loss": "0.0967", "s2c_nll_loss": "0.14", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "38970", "lr": "0.000140203", "gnorm": "3.393", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9780"} 2023-01-29 18:54:45 | INFO | train_inner | {"epoch": 19, "update": 18.035, "s2c_loss": 
"0.157", "loss": "0.10869", "s2c_nll_loss": "0.157", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "38980", "lr": "0.000140136", "gnorm": "3.482", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9783"} 2023-01-29 18:54:47 | INFO | train_inner | {"epoch": 19, "update": 18.04, "s2c_loss": "0.152", "loss": "0.1056", "s2c_nll_loss": "0.152", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "38990", "lr": "0.00014007", "gnorm": "3.433", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9785"} 2023-01-29 18:54:50 | INFO | train_inner | {"epoch": 19, "update": 18.044, "s2c_loss": "0.133", "loss": "0.09237", "s2c_nll_loss": "0.133", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "39000", "lr": "0.000140003", "gnorm": "3.363", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9788"} 2023-01-29 18:54:52 | INFO | train_inner | {"epoch": 19, "update": 18.049, "s2c_loss": "0.172", "loss": "0.11939", "s2c_nll_loss": "0.172", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "246.6", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "39010", "lr": "0.000139936", "gnorm": "3.536", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9790"} 2023-01-29 18:54:55 | INFO | train_inner | {"epoch": 19, "update": 18.054, "s2c_loss": "0.22", "loss": "0.15236", "s2c_nll_loss": "0.22", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "39020", "lr": "0.00013987", "gnorm": "4.752", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9793"} 2023-01-29 18:54:57 | INFO | train_inner | {"epoch": 19, "update": 
18.058, "s2c_loss": "0.124", "loss": "0.08566", "s2c_nll_loss": "0.124", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "39030", "lr": "0.000139803", "gnorm": "3.426", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9795"} 2023-01-29 18:55:00 | INFO | train_inner | {"epoch": 19, "update": 18.063, "s2c_loss": "0.229", "loss": "0.15866", "s2c_nll_loss": "0.229", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "39040", "lr": "0.000139736", "gnorm": "4.301", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9798"} 2023-01-29 18:55:03 | INFO | train_inner | {"epoch": 19, "update": 18.068, "s2c_loss": "0.185", "loss": "0.12806", "s2c_nll_loss": "0.185", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "39050", "lr": "0.00013967", "gnorm": "3.808", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9800"} 2023-01-29 18:55:05 | INFO | train_inner | {"epoch": 19, "update": 18.072, "s2c_loss": "0.193", "loss": "0.13397", "s2c_nll_loss": "0.193", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "39060", "lr": "0.000139603", "gnorm": "3.867", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9803"} 2023-01-29 18:55:08 | INFO | train_inner | {"epoch": 19, "update": 18.077, "s2c_loss": "0.181", "loss": "0.12516", "s2c_nll_loss": "0.181", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "39070", "lr": "0.000139536", "gnorm": "4.08", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9805"} 2023-01-29 18:55:10 | INFO | train_inner | 
{"epoch": 19, "update": 18.081, "s2c_loss": "0.417", "loss": "0.28899", "s2c_nll_loss": "0.417", "s2c_accuracy": "93.75", "s2c_total": "64", "s2c_n_correct": "60", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "39080", "lr": "0.00013947", "gnorm": "4.094", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9808"} 2023-01-29 18:55:13 | INFO | train_inner | {"epoch": 19, "update": 18.086, "s2c_loss": "0.132", "loss": "0.09175", "s2c_nll_loss": "0.132", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "39090", "lr": "0.000139403", "gnorm": "3.06", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9811"} 2023-01-29 18:55:15 | INFO | train_inner | {"epoch": 19, "update": 18.091, "s2c_loss": "0.228", "loss": "0.15778", "s2c_nll_loss": "0.228", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "39100", "lr": "0.000139336", "gnorm": "4.558", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9813"} 2023-01-29 18:55:18 | INFO | train_inner | {"epoch": 19, "update": 18.095, "s2c_loss": "0.172", "loss": "0.11923", "s2c_nll_loss": "0.172", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "39110", "lr": "0.00013927", "gnorm": "4.211", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9815"} 2023-01-29 18:55:20 | INFO | train_inner | {"epoch": 19, "update": 18.1, "s2c_loss": "0.187", "loss": "0.12932", "s2c_nll_loss": "0.187", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "39120", "lr": "0.000139203", "gnorm": "3.775", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9818"} 2023-01-29 18:55:23 | INFO | 
train_inner | {"epoch": 19, "update": 18.105, "s2c_loss": "0.168", "loss": "0.11679", "s2c_nll_loss": "0.168", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "39130", "lr": "0.000139136", "gnorm": "3.69", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9821"} 2023-01-29 18:55:25 | INFO | train_inner | {"epoch": 19, "update": 18.109, "s2c_loss": "0.181", "loss": "0.12515", "s2c_nll_loss": "0.181", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "39140", "lr": "0.00013907", "gnorm": "3.863", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9823"} 2023-01-29 18:55:28 | INFO | train_inner | {"epoch": 19, "update": 18.114, "s2c_loss": "0.185", "loss": "0.12852", "s2c_nll_loss": "0.185", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "39150", "lr": "0.000139003", "gnorm": "4.343", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9826"} 2023-01-29 18:55:30 | INFO | train_inner | {"epoch": 19, "update": 18.118, "s2c_loss": "0.245", "loss": "0.16953", "s2c_nll_loss": "0.245", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "39160", "lr": "0.000138936", "gnorm": "3.999", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9828"} 2023-01-29 18:55:33 | INFO | train_inner | {"epoch": 19, "update": 18.123, "s2c_loss": "0.18", "loss": "0.12462", "s2c_nll_loss": "0.18", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "39170", "lr": "0.00013887", "gnorm": "4.099", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9831"} 2023-01-29 
18:55:35 | INFO | train_inner | {"epoch": 19, "update": 18.128, "s2c_loss": "0.187", "loss": "0.12941", "s2c_nll_loss": "0.187", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "39180", "lr": "0.000138803", "gnorm": "5.278", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "9833"} 2023-01-29 18:55:38 | INFO | train_inner | {"epoch": 19, "update": 18.132, "s2c_loss": "0.179", "loss": "0.12413", "s2c_nll_loss": "0.179", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "39190", "lr": "0.000138736", "gnorm": "3.39", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9836"} 2023-01-29 18:55:40 | INFO | train_inner | {"epoch": 19, "update": 18.137, "s2c_loss": "0.227", "loss": "0.15747", "s2c_nll_loss": "0.227", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "39200", "lr": "0.00013867", "gnorm": "4.955", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9838"} 2023-01-29 18:55:43 | INFO | train_inner | {"epoch": 19, "update": 18.142, "s2c_loss": "0.198", "loss": "0.13728", "s2c_nll_loss": "0.198", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "39210", "lr": "0.000138603", "gnorm": "4.371", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9841"} 2023-01-29 18:55:45 | INFO | train_inner | {"epoch": 19, "update": 18.146, "s2c_loss": "0.178", "loss": "0.12372", "s2c_nll_loss": "0.178", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "39220", "lr": "0.000138536", "gnorm": "4.163", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", 
"wall": "9843"} 2023-01-29 18:55:48 | INFO | train_inner | {"epoch": 19, "update": 18.151, "s2c_loss": "0.192", "loss": "0.13286", "s2c_nll_loss": "0.192", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "39230", "lr": "0.00013847", "gnorm": "3.778", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9846"} 2023-01-29 18:55:50 | INFO | train_inner | {"epoch": 19, "update": 18.155, "s2c_loss": "0.227", "loss": "0.15722", "s2c_nll_loss": "0.227", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "39240", "lr": "0.000138403", "gnorm": "4.364", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9848"} 2023-01-29 18:55:53 | INFO | train_inner | {"epoch": 19, "update": 18.16, "s2c_loss": "0.186", "loss": "0.12918", "s2c_nll_loss": "0.186", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "39250", "lr": "0.000138336", "gnorm": "4.104", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9851"} 2023-01-29 18:55:56 | INFO | train_inner | {"epoch": 19, "update": 18.165, "s2c_loss": "0.253", "loss": "0.17531", "s2c_nll_loss": "0.253", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "39260", "lr": "0.00013827", "gnorm": "3.807", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9853"} 2023-01-29 18:55:58 | INFO | train_inner | {"epoch": 19, "update": 18.169, "s2c_loss": "0.181", "loss": "0.12505", "s2c_nll_loss": "0.181", "s2c_accuracy": "96.703", "s2c_total": "63.7", "s2c_n_correct": "61.6", "wps": "253.9", "ups": "3.99", "wpb": "63.7", "bsz": "63.7", "num_updates": "39270", "lr": "0.000138203", "gnorm": "4.079", "loss_scale": "1024", 
"train_wall": "2", "gb_free": "7.3", "wall": "9856"} 2023-01-29 18:56:01 | INFO | train_inner | {"epoch": 19, "update": 18.174, "s2c_loss": "0.142", "loss": "0.09865", "s2c_nll_loss": "0.142", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "39280", "lr": "0.000138136", "gnorm": "3.333", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9858"} 2023-01-29 18:56:03 | INFO | train_inner | {"epoch": 19, "update": 18.179, "s2c_loss": "0.193", "loss": "0.13346", "s2c_nll_loss": "0.193", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "39290", "lr": "0.00013807", "gnorm": "4.14", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9861"} 2023-01-29 18:56:06 | INFO | train_inner | {"epoch": 19, "update": 18.183, "s2c_loss": "0.146", "loss": "0.10134", "s2c_nll_loss": "0.146", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "39300", "lr": "0.000138003", "gnorm": "3.162", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9864"} 2023-01-29 18:56:08 | INFO | train_inner | {"epoch": 19, "update": 18.188, "s2c_loss": "0.236", "loss": "0.16383", "s2c_nll_loss": "0.236", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "39310", "lr": "0.000137936", "gnorm": "3.697", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9866"} 2023-01-29 18:56:11 | INFO | train_inner | {"epoch": 19, "update": 18.192, "s2c_loss": "0.162", "loss": "0.11262", "s2c_nll_loss": "0.162", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "39320", "lr": "0.00013787", "gnorm": "3.335", 
"loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9869"} 2023-01-29 18:56:13 | INFO | train_inner | {"epoch": 19, "update": 18.197, "s2c_loss": "0.181", "loss": "0.12538", "s2c_nll_loss": "0.181", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "39330", "lr": "0.000137803", "gnorm": "3.921", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9871"} 2023-01-29 18:56:16 | INFO | train_inner | {"epoch": 19, "update": 18.202, "s2c_loss": "0.205", "loss": "0.14213", "s2c_nll_loss": "0.205", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "39340", "lr": "0.000137736", "gnorm": "3.347", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9874"} 2023-01-29 18:56:18 | INFO | train_inner | {"epoch": 19, "update": 18.206, "s2c_loss": "0.163", "loss": "0.11296", "s2c_nll_loss": "0.163", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "39350", "lr": "0.00013767", "gnorm": "3.995", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9876"} 2023-01-29 18:56:21 | INFO | train_inner | {"epoch": 19, "update": 18.211, "s2c_loss": "0.18", "loss": "0.12478", "s2c_nll_loss": "0.18", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "39360", "lr": "0.000137603", "gnorm": "3.516", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9879"} 2023-01-29 18:56:23 | INFO | train_inner | {"epoch": 19, "update": 18.216, "s2c_loss": "0.185", "loss": "0.12848", "s2c_nll_loss": "0.185", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "39370", "lr": 
"0.000137536", "gnorm": "4.172", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9881"} 2023-01-29 18:56:26 | INFO | train_inner | {"epoch": 19, "update": 18.22, "s2c_loss": "0.164", "loss": "0.11347", "s2c_nll_loss": "0.164", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "39380", "lr": "0.00013747", "gnorm": "3.908", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9884"} 2023-01-29 18:56:28 | INFO | train_inner | {"epoch": 19, "update": 18.225, "s2c_loss": "0.198", "loss": "0.13742", "s2c_nll_loss": "0.198", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "39390", "lr": "0.000137403", "gnorm": "4.37", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9886"} 2023-01-29 18:56:31 | INFO | train_inner | {"epoch": 19, "update": 18.229, "s2c_loss": "0.135", "loss": "0.09384", "s2c_nll_loss": "0.135", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "39400", "lr": "0.000137336", "gnorm": "3.627", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9889"} 2023-01-29 18:56:34 | INFO | train_inner | {"epoch": 19, "update": 18.234, "s2c_loss": "0.211", "loss": "0.14607", "s2c_nll_loss": "0.211", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "39410", "lr": "0.00013727", "gnorm": "4.154", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9891"} 2023-01-29 18:56:36 | INFO | train_inner | {"epoch": 19, "update": 18.239, "s2c_loss": "0.25", "loss": "0.17327", "s2c_nll_loss": "0.25", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": 
"39420", "lr": "0.000137203", "gnorm": "4.244", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9894"} 2023-01-29 18:56:39 | INFO | train_inner | {"epoch": 19, "update": 18.243, "s2c_loss": "0.181", "loss": "0.1252", "s2c_nll_loss": "0.181", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "39430", "lr": "0.000137136", "gnorm": "4.045", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9897"} 2023-01-29 18:56:41 | INFO | train_inner | {"epoch": 19, "update": 18.248, "s2c_loss": "0.234", "loss": "0.16207", "s2c_nll_loss": "0.234", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "39440", "lr": "0.00013707", "gnorm": "4.415", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9899"} 2023-01-29 18:56:44 | INFO | train_inner | {"epoch": 19, "update": 18.253, "s2c_loss": "0.166", "loss": "0.11514", "s2c_nll_loss": "0.166", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "39450", "lr": "0.000137003", "gnorm": "4.103", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9902"} 2023-01-29 18:56:46 | INFO | train_inner | {"epoch": 19, "update": 18.257, "s2c_loss": "0.161", "loss": "0.11164", "s2c_nll_loss": "0.161", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "39460", "lr": "0.000136936", "gnorm": "4.267", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9904"} 2023-01-29 18:56:49 | INFO | train_inner | {"epoch": 19, "update": 18.262, "s2c_loss": "0.142", "loss": "0.0981", "s2c_nll_loss": "0.142", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": 
"64", "num_updates": "39470", "lr": "0.00013687", "gnorm": "3.701", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9907"} 2023-01-29 18:56:51 | INFO | train_inner | {"epoch": 19, "update": 18.266, "s2c_loss": "0.194", "loss": "0.13464", "s2c_nll_loss": "0.194", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "39480", "lr": "0.000136803", "gnorm": "4.698", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "9909"} 2023-01-29 18:56:54 | INFO | train_inner | {"epoch": 19, "update": 18.271, "s2c_loss": "0.168", "loss": "0.11615", "s2c_nll_loss": "0.168", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "259.8", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "39490", "lr": "0.000136736", "gnorm": "3.965", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9912"} 2023-01-29 18:56:56 | INFO | train_inner | {"epoch": 19, "update": 18.276, "s2c_loss": "0.183", "loss": "0.1267", "s2c_nll_loss": "0.183", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "39500", "lr": "0.00013667", "gnorm": "3.881", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "9914"} 2023-01-29 18:56:59 | INFO | train_inner | {"epoch": 19, "update": 18.28, "s2c_loss": "0.166", "loss": "0.11489", "s2c_nll_loss": "0.166", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "39510", "lr": "0.000136603", "gnorm": "3.945", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9917"} 2023-01-29 18:57:01 | INFO | train_inner | {"epoch": 19, "update": 18.285, "s2c_loss": "0.181", "loss": "0.12534", "s2c_nll_loss": "0.181", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "245.7", "ups": "3.84", 
"wpb": "64", "bsz": "64", "num_updates": "39520", "lr": "0.000136537", "gnorm": "3.748", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9919"} 2023-01-29 18:57:04 | INFO | train_inner | {"epoch": 19, "update": 18.29, "s2c_loss": "0.15", "loss": "0.10429", "s2c_nll_loss": "0.15", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "39530", "lr": "0.00013647", "gnorm": "3.487", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9922"} 2023-01-29 18:57:06 | INFO | train_inner | {"epoch": 19, "update": 18.294, "s2c_loss": "0.17", "loss": "0.1179", "s2c_nll_loss": "0.17", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "248", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "39540", "lr": "0.000136403", "gnorm": "4.119", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9924"} 2023-01-29 18:57:09 | INFO | train_inner | {"epoch": 19, "update": 18.299, "s2c_loss": "0.242", "loss": "0.16797", "s2c_nll_loss": "0.242", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "243.8", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "39550", "lr": "0.000136337", "gnorm": "4.323", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9927"} 2023-01-29 18:57:12 | INFO | train_inner | {"epoch": 19, "update": 18.303, "s2c_loss": "0.187", "loss": "0.12934", "s2c_nll_loss": "0.187", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "39560", "lr": "0.00013627", "gnorm": "3.781", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9930"} 2023-01-29 18:57:14 | INFO | train_inner | {"epoch": 19, "update": 18.308, "s2c_loss": "0.177", "loss": "0.12284", "s2c_nll_loss": "0.177", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "249.6", 
"ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "39570", "lr": "0.000136203", "gnorm": "3.429", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "9932"} 2023-01-29 18:57:17 | INFO | train_inner | {"epoch": 19, "update": 18.313, "s2c_loss": "0.265", "loss": "0.18353", "s2c_nll_loss": "0.265", "s2c_accuracy": "94.375", "s2c_total": "64", "s2c_n_correct": "60.4", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "39580", "lr": "0.000136137", "gnorm": "4.195", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9935"} 2023-01-29 18:57:19 | INFO | train_inner | {"epoch": 19, "update": 18.317, "s2c_loss": "0.222", "loss": "0.15413", "s2c_nll_loss": "0.222", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "39590", "lr": "0.00013607", "gnorm": "3.897", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9937"} 2023-01-29 18:57:22 | INFO | train_inner | {"epoch": 19, "update": 18.322, "s2c_loss": "0.188", "loss": "0.13053", "s2c_nll_loss": "0.188", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "39600", "lr": "0.000136003", "gnorm": "4.708", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9940"} 2023-01-29 18:57:24 | INFO | train_inner | {"epoch": 19, "update": 18.327, "s2c_loss": "0.181", "loss": "0.12552", "s2c_nll_loss": "0.181", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "39610", "lr": "0.000135937", "gnorm": "4.091", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9942"} 2023-01-29 18:57:27 | INFO | train_inner | {"epoch": 19, "update": 18.331, "s2c_loss": "0.187", "loss": "0.12992", "s2c_nll_loss": "0.187", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": 
"61.9", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "39620", "lr": "0.00013587", "gnorm": "3.076", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9945"} 2023-01-29 18:57:29 | INFO | train_inner | {"epoch": 19, "update": 18.336, "s2c_loss": "0.207", "loss": "0.1437", "s2c_nll_loss": "0.207", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "39630", "lr": "0.000135803", "gnorm": "4.503", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "9947"} 2023-01-29 18:57:32 | INFO | train_inner | {"epoch": 19, "update": 18.34, "s2c_loss": "0.258", "loss": "0.1787", "s2c_nll_loss": "0.258", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "39640", "lr": "0.000135737", "gnorm": "3.805", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "9950"} 2023-01-29 18:57:33 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 512.0 2023-01-29 18:57:35 | INFO | train_inner | {"epoch": 19, "update": 18.346, "s2c_loss": "0.265", "loss": "0.18399", "s2c_nll_loss": "0.265", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "234.6", "ups": "3.67", "wpb": "64", "bsz": "64", "num_updates": "39650", "lr": "0.00013567", "gnorm": "4.876", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "9953"} 2023-01-29 18:57:37 | INFO | train_inner | {"epoch": 19, "update": 18.35, "s2c_loss": "0.142", "loss": "0.09877", "s2c_nll_loss": "0.142", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "39660", "lr": "0.000135603", "gnorm": "3.52", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "9955"} 2023-01-29 18:57:40 | INFO | train_inner | {"epoch": 19, 
"update": 18.355, "s2c_loss": "0.222", "loss": "0.15392", "s2c_nll_loss": "0.222", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "39670", "lr": "0.000135537", "gnorm": "4.769", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "9958"} 2023-01-29 18:57:42 | INFO | train_inner | {"epoch": 19, "update": 18.359, "s2c_loss": "0.161", "loss": "0.11144", "s2c_nll_loss": "0.161", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "39680", "lr": "0.00013547", "gnorm": "3.655", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "9960"} 2023-01-29 18:57:45 | INFO | train_inner | {"epoch": 19, "update": 18.364, "s2c_loss": "0.237", "loss": "0.16413", "s2c_nll_loss": "0.237", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "39690", "lr": "0.000135403", "gnorm": "4.567", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "9963"} 2023-01-29 18:57:47 | INFO | train_inner | {"epoch": 19, "update": 18.369, "s2c_loss": "0.23", "loss": "0.15969", "s2c_nll_loss": "0.23", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "39700", "lr": "0.000135337", "gnorm": "5.546", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "9965"} 2023-01-29 18:57:50 | INFO | train_inner | {"epoch": 19, "update": 18.373, "s2c_loss": "0.204", "loss": "0.14165", "s2c_nll_loss": "0.204", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "39710", "lr": "0.00013527", "gnorm": "4.422", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "9968"} 2023-01-29 18:57:52 | INFO | train_inner | 
{"epoch": 19, "update": 18.378, "s2c_loss": "0.21", "loss": "0.14575", "s2c_nll_loss": "0.21", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "39720", "lr": "0.000135203", "gnorm": "4.428", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "9970"} 2023-01-29 18:57:55 | INFO | train_inner | {"epoch": 19, "update": 18.383, "s2c_loss": "0.182", "loss": "0.12645", "s2c_nll_loss": "0.182", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "39730", "lr": "0.000135137", "gnorm": "4.49", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "9973"} 2023-01-29 18:57:57 | INFO | train_inner | {"epoch": 19, "update": 18.387, "s2c_loss": "0.2", "loss": "0.13887", "s2c_nll_loss": "0.2", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "39740", "lr": "0.00013507", "gnorm": "3.655", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "9975"} 2023-01-29 18:58:00 | INFO | train_inner | {"epoch": 19, "update": 18.392, "s2c_loss": "0.136", "loss": "0.09443", "s2c_nll_loss": "0.136", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "39750", "lr": "0.000135003", "gnorm": "3.481", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "9978"} 2023-01-29 18:58:02 | INFO | train_inner | {"epoch": 19, "update": 18.396, "s2c_loss": "0.163", "loss": "0.1129", "s2c_nll_loss": "0.163", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "39760", "lr": "0.000134937", "gnorm": "3.433", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "9980"} 2023-01-29 18:58:05 | INFO | 
train_inner | {"epoch": 19, "update": 18.401, "s2c_loss": "0.249", "loss": "0.17263", "s2c_nll_loss": "0.249", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "39770", "lr": "0.00013487", "gnorm": "5.109", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "9983"} 2023-01-29 18:58:07 | INFO | train_inner | {"epoch": 19, "update": 18.406, "s2c_loss": "0.182", "loss": "0.12581", "s2c_nll_loss": "0.182", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "39780", "lr": "0.000134803", "gnorm": "3.403", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "9985"} 2023-01-29 18:58:10 | INFO | train_inner | {"epoch": 19, "update": 18.41, "s2c_loss": "0.238", "loss": "0.16482", "s2c_nll_loss": "0.238", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "39790", "lr": "0.000134737", "gnorm": "4.171", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "9988"} 2023-01-29 18:58:13 | INFO | train_inner | {"epoch": 19, "update": 18.415, "s2c_loss": "0.199", "loss": "0.13778", "s2c_nll_loss": "0.199", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "39800", "lr": "0.00013467", "gnorm": "3.705", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "9990"} 2023-01-29 18:58:15 | INFO | train_inner | {"epoch": 19, "update": 18.42, "s2c_loss": "0.133", "loss": "0.09228", "s2c_nll_loss": "0.133", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "39810", "lr": "0.000134603", "gnorm": "3.661", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "9993"} 2023-01-29 18:58:18 
| INFO | train_inner | {"epoch": 19, "update": 18.424, "s2c_loss": "0.151", "loss": "0.10445", "s2c_nll_loss": "0.151", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "39820", "lr": "0.000134537", "gnorm": "3.697", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "9996"} 2023-01-29 18:58:20 | INFO | train_inner | {"epoch": 19, "update": 18.429, "s2c_loss": "0.188", "loss": "0.13001", "s2c_nll_loss": "0.188", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "39830", "lr": "0.00013447", "gnorm": "4.669", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "9998"} 2023-01-29 18:58:23 | INFO | train_inner | {"epoch": 19, "update": 18.433, "s2c_loss": "0.218", "loss": "0.15114", "s2c_nll_loss": "0.218", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "39840", "lr": "0.000134403", "gnorm": "5.618", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10001"} 2023-01-29 18:58:25 | INFO | train_inner | {"epoch": 19, "update": 18.438, "s2c_loss": "0.126", "loss": "0.08742", "s2c_nll_loss": "0.126", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "39850", "lr": "0.000134337", "gnorm": "4.3", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10003"} 2023-01-29 18:58:28 | INFO | train_inner | {"epoch": 19, "update": 18.443, "s2c_loss": "0.143", "loss": "0.0994", "s2c_nll_loss": "0.143", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "39860", "lr": "0.00013427", "gnorm": "3.426", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10006"} 
2023-01-29 18:58:30 | INFO | train_inner | {"epoch": 19, "update": 18.447, "s2c_loss": "0.232", "loss": "0.16088", "s2c_nll_loss": "0.232", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "39870", "lr": "0.000134203", "gnorm": "4.571", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "10008"} 2023-01-29 18:58:33 | INFO | train_inner | {"epoch": 19, "update": 18.452, "s2c_loss": "0.266", "loss": "0.18472", "s2c_nll_loss": "0.266", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "39880", "lr": "0.000134137", "gnorm": "4.409", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10011"} 2023-01-29 18:58:35 | INFO | train_inner | {"epoch": 19, "update": 18.457, "s2c_loss": "0.173", "loss": "0.11961", "s2c_nll_loss": "0.173", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "39890", "lr": "0.00013407", "gnorm": "4.165", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10013"} 2023-01-29 18:58:38 | INFO | train_inner | {"epoch": 19, "update": 18.461, "s2c_loss": "0.257", "loss": "0.17843", "s2c_nll_loss": "0.257", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "39900", "lr": "0.000134003", "gnorm": "4.584", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "10016"} 2023-01-29 18:58:41 | INFO | train_inner | {"epoch": 19, "update": 18.466, "s2c_loss": "0.185", "loss": "0.12815", "s2c_nll_loss": "0.185", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "39910", "lr": "0.000133937", "gnorm": "4.09", "loss_scale": "512", "train_wall": "2", "gb_free": 
"7.3", "wall": "10018"} 2023-01-29 18:58:43 | INFO | train_inner | {"epoch": 19, "update": 18.47, "s2c_loss": "0.169", "loss": "0.11722", "s2c_nll_loss": "0.169", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "39920", "lr": "0.00013387", "gnorm": "4.11", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10021"} 2023-01-29 18:58:46 | INFO | train_inner | {"epoch": 19, "update": 18.475, "s2c_loss": "0.203", "loss": "0.14085", "s2c_nll_loss": "0.203", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "39930", "lr": "0.000133803", "gnorm": "4.218", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10024"} 2023-01-29 18:58:48 | INFO | train_inner | {"epoch": 19, "update": 18.48, "s2c_loss": "0.256", "loss": "0.17741", "s2c_nll_loss": "0.256", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "39940", "lr": "0.000133737", "gnorm": "4.452", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10026"} 2023-01-29 18:58:51 | INFO | train_inner | {"epoch": 19, "update": 18.484, "s2c_loss": "0.25", "loss": "0.17314", "s2c_nll_loss": "0.25", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "39950", "lr": "0.00013367", "gnorm": "4.523", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10029"} 2023-01-29 18:58:53 | INFO | train_inner | {"epoch": 19, "update": 18.489, "s2c_loss": "0.217", "loss": "0.15066", "s2c_nll_loss": "0.217", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "39960", "lr": "0.000133603", "gnorm": "4.339", "loss_scale": "512", 
"train_wall": "2", "gb_free": "7.3", "wall": "10031"} 2023-01-29 18:58:56 | INFO | train_inner | {"epoch": 19, "update": 18.494, "s2c_loss": "0.259", "loss": "0.17928", "s2c_nll_loss": "0.259", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "39970", "lr": "0.000133537", "gnorm": "4.88", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10034"} 2023-01-29 18:58:58 | INFO | train_inner | {"epoch": 19, "update": 18.498, "s2c_loss": "0.16", "loss": "0.11095", "s2c_nll_loss": "0.16", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "39980", "lr": "0.00013347", "gnorm": "3.608", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10036"} 2023-01-29 18:59:01 | INFO | train_inner | {"epoch": 19, "update": 18.503, "s2c_loss": "0.211", "loss": "0.14639", "s2c_nll_loss": "0.211", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "39990", "lr": "0.000133403", "gnorm": "4.167", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10039"} 2023-01-29 18:59:03 | INFO | train_inner | {"epoch": 19, "update": 18.507, "s2c_loss": "0.229", "loss": "0.15851", "s2c_nll_loss": "0.229", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "40000", "lr": "0.000133337", "gnorm": "5.266", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10041"} 2023-01-29 18:59:03 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 18:59:18 | INFO | valid | {"epoch": 19, "valid_s2c_loss": "0.682", "valid_loss": "0.47289", "valid_s2c_nll_loss": "0.682", "valid_s2c_accuracy": "87.962", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "28.1111", 
"valid_num_updates": "40000", "valid_best_s2c_accuracy": "87.962"} 2023-01-29 18:59:18 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 19 @ 40000 updates 2023-01-29 18:59:18 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_19_40000.pt 2023-01-29 18:59:21 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_19_40000.pt 2023-01-29 18:59:31 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_19_40000.pt (epoch 19 @ 40000 updates, score 87.962) (writing took 12.844550338108093 seconds) 2023-01-29 18:59:33 | INFO | train_inner | {"epoch": 19, "update": 18.512, "s2c_loss": "0.169", "loss": "0.11693", "s2c_nll_loss": "0.169", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "21.3", "ups": "0.33", "wpb": "64", "bsz": "64", "num_updates": "40010", "lr": "0.00013327", "gnorm": "3.888", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10071"} 2023-01-29 18:59:36 | INFO | train_inner | {"epoch": 19, "update": 18.517, "s2c_loss": "0.208", "loss": "0.14424", "s2c_nll_loss": "0.208", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "40020", "lr": "0.000133203", "gnorm": "4.244", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10074"} 2023-01-29 18:59:38 | INFO | train_inner | {"epoch": 19, "update": 18.521, "s2c_loss": "0.227", "loss": "0.15725", "s2c_nll_loss": "0.227", "s2c_accuracy": "95.469", "s2c_total": "64", "s2c_n_correct": "61.1", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "40030", "lr": "0.000133137", "gnorm": "4.24", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10076"} 2023-01-29 18:59:41 | INFO | train_inner | {"epoch": 19, "update": 18.526, "s2c_loss": 
"0.211", "loss": "0.14626", "s2c_nll_loss": "0.211", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "40040", "lr": "0.00013307", "gnorm": "4.268", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10079"} 2023-01-29 18:59:43 | INFO | train_inner | {"epoch": 19, "update": 18.531, "s2c_loss": "0.233", "loss": "0.16178", "s2c_nll_loss": "0.233", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "40050", "lr": "0.000133003", "gnorm": "4.796", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10081"} 2023-01-29 18:59:46 | INFO | train_inner | {"epoch": 19, "update": 18.535, "s2c_loss": "0.207", "loss": "0.14333", "s2c_nll_loss": "0.207", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "40060", "lr": "0.000132937", "gnorm": "5.748", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10084"} 2023-01-29 18:59:49 | INFO | train_inner | {"epoch": 19, "update": 18.54, "s2c_loss": "0.159", "loss": "0.10987", "s2c_nll_loss": "0.159", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "40070", "lr": "0.00013287", "gnorm": "3.806", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10087"} 2023-01-29 18:59:51 | INFO | train_inner | {"epoch": 19, "update": 18.544, "s2c_loss": "0.226", "loss": "0.15679", "s2c_nll_loss": "0.226", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "40080", "lr": "0.000132803", "gnorm": "4.655", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10089"} 2023-01-29 18:59:54 | INFO | train_inner | {"epoch": 19, "update": 
18.549, "s2c_loss": "0.213", "loss": "0.1479", "s2c_nll_loss": "0.213", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "40090", "lr": "0.000132737", "gnorm": "4.195", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10092"} 2023-01-29 18:59:56 | INFO | train_inner | {"epoch": 19, "update": 18.554, "s2c_loss": "0.168", "loss": "0.11617", "s2c_nll_loss": "0.168", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "246.2", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "40100", "lr": "0.00013267", "gnorm": "3.952", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10094"} 2023-01-29 18:59:59 | INFO | train_inner | {"epoch": 19, "update": 18.558, "s2c_loss": "0.255", "loss": "0.17676", "s2c_nll_loss": "0.255", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "40110", "lr": "0.000132603", "gnorm": "4.017", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10097"} 2023-01-29 19:00:01 | INFO | train_inner | {"epoch": 19, "update": 18.563, "s2c_loss": "0.191", "loss": "0.13238", "s2c_nll_loss": "0.191", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "40120", "lr": "0.000132537", "gnorm": "4.329", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10099"} 2023-01-29 19:00:04 | INFO | train_inner | {"epoch": 19, "update": 18.568, "s2c_loss": "0.303", "loss": "0.20982", "s2c_nll_loss": "0.303", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "258.4", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "40130", "lr": "0.00013247", "gnorm": "4.092", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10102"} 2023-01-29 19:00:06 | INFO | train_inner | 
{"epoch": 19, "update": 18.572, "s2c_loss": "0.17", "loss": "0.11787", "s2c_nll_loss": "0.17", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "40140", "lr": "0.000132403", "gnorm": "3.797", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10104"} 2023-01-29 19:00:09 | INFO | train_inner | {"epoch": 19, "update": 18.577, "s2c_loss": "0.137", "loss": "0.09526", "s2c_nll_loss": "0.137", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "40150", "lr": "0.000132337", "gnorm": "3.658", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10107"} 2023-01-29 19:00:11 | INFO | train_inner | {"epoch": 19, "update": 18.581, "s2c_loss": "0.16", "loss": "0.11098", "s2c_nll_loss": "0.16", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "40160", "lr": "0.00013227", "gnorm": "3.447", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10109"} 2023-01-29 19:00:14 | INFO | train_inner | {"epoch": 19, "update": 18.586, "s2c_loss": "0.452", "loss": "0.31333", "s2c_nll_loss": "0.452", "s2c_accuracy": "93.594", "s2c_total": "64", "s2c_n_correct": "59.9", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "40170", "lr": "0.000132203", "gnorm": "4.265", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10112"} 2023-01-29 19:00:16 | INFO | train_inner | {"epoch": 19, "update": 18.591, "s2c_loss": "0.18", "loss": "0.12492", "s2c_nll_loss": "0.18", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "40180", "lr": "0.000132137", "gnorm": "3.492", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10114"} 2023-01-29 19:00:19 | INFO | 
train_inner | {"epoch": 19, "update": 18.595, "s2c_loss": "0.247", "loss": "0.17155", "s2c_nll_loss": "0.247", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "40190", "lr": "0.00013207", "gnorm": "4.508", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10117"} 2023-01-29 19:00:22 | INFO | train_inner | {"epoch": 19, "update": 18.6, "s2c_loss": "0.194", "loss": "0.13452", "s2c_nll_loss": "0.194", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "40200", "lr": "0.000132003", "gnorm": "4.372", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10119"} 2023-01-29 19:00:24 | INFO | train_inner | {"epoch": 19, "update": 18.605, "s2c_loss": "0.186", "loss": "0.12901", "s2c_nll_loss": "0.186", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "40210", "lr": "0.000131937", "gnorm": "5.087", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10122"} 2023-01-29 19:00:27 | INFO | train_inner | {"epoch": 19, "update": 18.609, "s2c_loss": "0.16", "loss": "0.11108", "s2c_nll_loss": "0.16", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "40220", "lr": "0.00013187", "gnorm": "3.363", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10124"} 2023-01-29 19:00:29 | INFO | train_inner | {"epoch": 19, "update": 18.614, "s2c_loss": "0.167", "loss": "0.11546", "s2c_nll_loss": "0.167", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "40230", "lr": "0.000131803", "gnorm": "3.669", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10127"} 2023-01-29 
19:00:32 | INFO | train_inner | {"epoch": 19, "update": 18.618, "s2c_loss": "0.284", "loss": "0.19696", "s2c_nll_loss": "0.284", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "40240", "lr": "0.000131737", "gnorm": "4.371", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10130"} 2023-01-29 19:00:34 | INFO | train_inner | {"epoch": 19, "update": 18.623, "s2c_loss": "0.17", "loss": "0.11779", "s2c_nll_loss": "0.17", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "40250", "lr": "0.00013167", "gnorm": "4.003", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10132"} 2023-01-29 19:00:37 | INFO | train_inner | {"epoch": 19, "update": 18.628, "s2c_loss": "0.201", "loss": "0.1391", "s2c_nll_loss": "0.201", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "259.6", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "40260", "lr": "0.000131603", "gnorm": "4.45", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10134"} 2023-01-29 19:00:39 | INFO | train_inner | {"epoch": 19, "update": 18.632, "s2c_loss": "0.15", "loss": "0.10398", "s2c_nll_loss": "0.15", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "40270", "lr": "0.000131537", "gnorm": "3.375", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "10137"} 2023-01-29 19:00:42 | INFO | train_inner | {"epoch": 19, "update": 18.637, "s2c_loss": "0.166", "loss": "0.11494", "s2c_nll_loss": "0.166", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "40280", "lr": "0.00013147", "gnorm": "4.133", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10140"} 
2023-01-29 19:00:44 | INFO | train_inner | {"epoch": 19, "update": 18.642, "s2c_loss": "0.257", "loss": "0.17824", "s2c_nll_loss": "0.257", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "40290", "lr": "0.000131403", "gnorm": "3.922", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10142"} 2023-01-29 19:00:47 | INFO | train_inner | {"epoch": 19, "update": 18.646, "s2c_loss": "0.214", "loss": "0.14809", "s2c_nll_loss": "0.214", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "40300", "lr": "0.000131337", "gnorm": "3.822", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10145"} 2023-01-29 19:00:49 | INFO | train_inner | {"epoch": 19, "update": 18.651, "s2c_loss": "0.234", "loss": "0.16213", "s2c_nll_loss": "0.234", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "40310", "lr": "0.00013127", "gnorm": "4.6", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10147"} 2023-01-29 19:00:52 | INFO | train_inner | {"epoch": 19, "update": 18.655, "s2c_loss": "0.155", "loss": "0.1073", "s2c_nll_loss": "0.155", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "40320", "lr": "0.000131203", "gnorm": "3.718", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10150"} 2023-01-29 19:00:54 | INFO | train_inner | {"epoch": 19, "update": 18.66, "s2c_loss": "0.169", "loss": "0.11709", "s2c_nll_loss": "0.169", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "40330", "lr": "0.000131137", "gnorm": "3.668", "loss_scale": "512", "train_wall": "3", "gb_free": 
"7.3", "wall": "10152"} 2023-01-29 19:00:57 | INFO | train_inner | {"epoch": 19, "update": 18.665, "s2c_loss": "0.209", "loss": "0.14497", "s2c_nll_loss": "0.209", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "40340", "lr": "0.00013107", "gnorm": "4.01", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10155"} 2023-01-29 19:00:59 | INFO | train_inner | {"epoch": 19, "update": 18.669, "s2c_loss": "0.183", "loss": "0.12707", "s2c_nll_loss": "0.183", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "40350", "lr": "0.000131003", "gnorm": "4.443", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10157"} 2023-01-29 19:01:02 | INFO | train_inner | {"epoch": 19, "update": 18.674, "s2c_loss": "0.143", "loss": "0.0988", "s2c_nll_loss": "0.143", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "40360", "lr": "0.000130937", "gnorm": "3.83", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10160"} 2023-01-29 19:01:04 | INFO | train_inner | {"epoch": 19, "update": 18.679, "s2c_loss": "0.278", "loss": "0.19255", "s2c_nll_loss": "0.278", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "40370", "lr": "0.00013087", "gnorm": "4.226", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10162"} 2023-01-29 19:01:07 | INFO | train_inner | {"epoch": 19, "update": 18.683, "s2c_loss": "0.196", "loss": "0.13565", "s2c_nll_loss": "0.196", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "40380", "lr": "0.000130803", "gnorm": "3.931", "loss_scale": "512", "train_wall": "3", 
"gb_free": "7.2", "wall": "10165"} 2023-01-29 19:01:10 | INFO | train_inner | {"epoch": 19, "update": 18.688, "s2c_loss": "0.189", "loss": "0.13076", "s2c_nll_loss": "0.189", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "40390", "lr": "0.000130737", "gnorm": "4.972", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10167"} 2023-01-29 19:01:12 | INFO | train_inner | {"epoch": 19, "update": 18.692, "s2c_loss": "0.198", "loss": "0.13697", "s2c_nll_loss": "0.198", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "40400", "lr": "0.00013067", "gnorm": "3.813", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10170"} 2023-01-29 19:01:15 | INFO | train_inner | {"epoch": 19, "update": 18.697, "s2c_loss": "0.15", "loss": "0.10372", "s2c_nll_loss": "0.15", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "250.6", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "40410", "lr": "0.000130603", "gnorm": "3.723", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10173"} 2023-01-29 19:01:17 | INFO | train_inner | {"epoch": 19, "update": 18.702, "s2c_loss": "0.152", "loss": "0.10526", "s2c_nll_loss": "0.152", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "40420", "lr": "0.000130537", "gnorm": "3.582", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10175"} 2023-01-29 19:01:20 | INFO | train_inner | {"epoch": 19, "update": 18.706, "s2c_loss": "0.204", "loss": "0.14112", "s2c_nll_loss": "0.204", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "40430", "lr": "0.00013047", "gnorm": "4.306", "loss_scale": "512", 
"train_wall": "2", "gb_free": "7.2", "wall": "10178"} 2023-01-29 19:01:22 | INFO | train_inner | {"epoch": 19, "update": 18.711, "s2c_loss": "0.176", "loss": "0.12232", "s2c_nll_loss": "0.176", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "40440", "lr": "0.000130403", "gnorm": "3.596", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10180"} 2023-01-29 19:01:25 | INFO | train_inner | {"epoch": 19, "update": 18.716, "s2c_loss": "0.142", "loss": "0.09809", "s2c_nll_loss": "0.142", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "40450", "lr": "0.000130337", "gnorm": "3.614", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10183"} 2023-01-29 19:01:27 | INFO | train_inner | {"epoch": 19, "update": 18.72, "s2c_loss": "0.208", "loss": "0.14402", "s2c_nll_loss": "0.208", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "40460", "lr": "0.00013027", "gnorm": "3.682", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "10185"} 2023-01-29 19:01:30 | INFO | train_inner | {"epoch": 19, "update": 18.725, "s2c_loss": "0.167", "loss": "0.11555", "s2c_nll_loss": "0.167", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "40470", "lr": "0.000130203", "gnorm": "3.45", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10188"} 2023-01-29 19:01:33 | INFO | train_inner | {"epoch": 19, "update": 18.729, "s2c_loss": "0.137", "loss": "0.09464", "s2c_nll_loss": "0.137", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "40480", "lr": "0.000130137", "gnorm": "3.628", 
"loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10190"} 2023-01-29 19:01:35 | INFO | train_inner | {"epoch": 19, "update": 18.734, "s2c_loss": "0.172", "loss": "0.11948", "s2c_nll_loss": "0.172", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "40490", "lr": "0.00013007", "gnorm": "3.458", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10193"} 2023-01-29 19:01:38 | INFO | train_inner | {"epoch": 19, "update": 18.739, "s2c_loss": "0.171", "loss": "0.11838", "s2c_nll_loss": "0.171", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "40500", "lr": "0.000130003", "gnorm": "3.133", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10196"} 2023-01-29 19:01:40 | INFO | train_inner | {"epoch": 19, "update": 18.743, "s2c_loss": "0.114", "loss": "0.07874", "s2c_nll_loss": "0.114", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "246.5", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "40510", "lr": "0.000129937", "gnorm": "3.102", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10198"} 2023-01-29 19:01:43 | INFO | train_inner | {"epoch": 19, "update": 18.748, "s2c_loss": "0.188", "loss": "0.1306", "s2c_nll_loss": "0.188", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "40520", "lr": "0.00012987", "gnorm": "3.794", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10201"} 2023-01-29 19:01:45 | INFO | train_inner | {"epoch": 19, "update": 18.753, "s2c_loss": "0.103", "loss": "0.07152", "s2c_nll_loss": "0.103", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "40530", "lr": 
"0.000129804", "gnorm": "3.09", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10203"} 2023-01-29 19:01:48 | INFO | train_inner | {"epoch": 19, "update": 18.757, "s2c_loss": "0.23", "loss": "0.15976", "s2c_nll_loss": "0.23", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "40540", "lr": "0.000129737", "gnorm": "4.386", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10206"} 2023-01-29 19:01:50 | INFO | train_inner | {"epoch": 19, "update": 18.762, "s2c_loss": "0.198", "loss": "0.13715", "s2c_nll_loss": "0.198", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "40550", "lr": "0.00012967", "gnorm": "5.679", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10208"} 2023-01-29 19:01:53 | INFO | train_inner | {"epoch": 19, "update": 18.766, "s2c_loss": "0.211", "loss": "0.14647", "s2c_nll_loss": "0.211", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "40560", "lr": "0.000129604", "gnorm": "5.137", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10211"} 2023-01-29 19:01:55 | INFO | train_inner | {"epoch": 19, "update": 18.771, "s2c_loss": "0.169", "loss": "0.11703", "s2c_nll_loss": "0.169", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "40570", "lr": "0.000129537", "gnorm": "4.42", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10213"} 2023-01-29 19:01:58 | INFO | train_inner | {"epoch": 19, "update": 18.776, "s2c_loss": "0.168", "loss": "0.11611", "s2c_nll_loss": "0.168", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", 
"num_updates": "40580", "lr": "0.00012947", "gnorm": "3.788", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10216"} 2023-01-29 19:02:00 | INFO | train_inner | {"epoch": 19, "update": 18.78, "s2c_loss": "0.177", "loss": "0.12236", "s2c_nll_loss": "0.177", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "40590", "lr": "0.000129404", "gnorm": "3.941", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10218"} 2023-01-29 19:02:03 | INFO | train_inner | {"epoch": 19, "update": 18.785, "s2c_loss": "0.172", "loss": "0.11935", "s2c_nll_loss": "0.172", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "261.4", "ups": "4.08", "wpb": "64", "bsz": "64", "num_updates": "40600", "lr": "0.000129337", "gnorm": "3.829", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10221"} 2023-01-29 19:02:05 | INFO | train_inner | {"epoch": 19, "update": 18.79, "s2c_loss": "0.25", "loss": "0.17347", "s2c_nll_loss": "0.25", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "40610", "lr": "0.00012927", "gnorm": "4.119", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10223"} 2023-01-29 19:02:08 | INFO | train_inner | {"epoch": 19, "update": 18.794, "s2c_loss": "0.463", "loss": "0.32102", "s2c_nll_loss": "0.463", "s2c_accuracy": "93.281", "s2c_total": "64", "s2c_n_correct": "59.7", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "40620", "lr": "0.000129204", "gnorm": "5.354", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10226"} 2023-01-29 19:02:10 | INFO | train_inner | {"epoch": 19, "update": 18.799, "s2c_loss": "0.225", "loss": "0.15607", "s2c_nll_loss": "0.225", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "256", "ups": "4", "wpb": "64", "bsz": 
"64", "num_updates": "40630", "lr": "0.000129137", "gnorm": "4.812", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10228"} 2023-01-29 19:02:13 | INFO | train_inner | {"epoch": 19, "update": 18.803, "s2c_loss": "0.156", "loss": "0.10791", "s2c_nll_loss": "0.156", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "40640", "lr": "0.00012907", "gnorm": "3.115", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10231"} 2023-01-29 19:02:15 | INFO | train_inner | {"epoch": 19, "update": 18.808, "s2c_loss": "0.17", "loss": "0.11818", "s2c_nll_loss": "0.17", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "40650", "lr": "0.000129004", "gnorm": "3.924", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10233"} 2023-01-29 19:02:18 | INFO | train_inner | {"epoch": 19, "update": 18.813, "s2c_loss": "0.596", "loss": "0.41299", "s2c_nll_loss": "0.596", "s2c_accuracy": "91.406", "s2c_total": "64", "s2c_n_correct": "58.5", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "40660", "lr": "0.000128937", "gnorm": "5.991", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10236"} 2023-01-29 19:02:21 | INFO | train_inner | {"epoch": 19, "update": 18.817, "s2c_loss": "0.262", "loss": "0.18143", "s2c_nll_loss": "0.262", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "40670", "lr": "0.00012887", "gnorm": "4.907", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "10238"} 2023-01-29 19:02:23 | INFO | train_inner | {"epoch": 19, "update": 18.822, "s2c_loss": "0.248", "loss": "0.17179", "s2c_nll_loss": "0.248", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "258.6", "ups": "4.04", 
"wpb": "64", "bsz": "64", "num_updates": "40680", "lr": "0.000128804", "gnorm": "4.703", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10241"} 2023-01-29 19:02:26 | INFO | train_inner | {"epoch": 19, "update": 18.827, "s2c_loss": "0.188", "loss": "0.13027", "s2c_nll_loss": "0.188", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "246.7", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "40690", "lr": "0.000128737", "gnorm": "3.806", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10244"} 2023-01-29 19:02:28 | INFO | train_inner | {"epoch": 19, "update": 18.831, "s2c_loss": "0.245", "loss": "0.17011", "s2c_nll_loss": "0.245", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "40700", "lr": "0.00012867", "gnorm": "3.815", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10246"} 2023-01-29 19:02:31 | INFO | train_inner | {"epoch": 19, "update": 18.836, "s2c_loss": "0.202", "loss": "0.14035", "s2c_nll_loss": "0.202", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "40710", "lr": "0.000128604", "gnorm": "4.17", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10249"} 2023-01-29 19:02:33 | INFO | train_inner | {"epoch": 19, "update": 18.84, "s2c_loss": "0.26", "loss": "0.1802", "s2c_nll_loss": "0.26", "s2c_accuracy": "95.312", "s2c_total": "64", "s2c_n_correct": "61", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "40720", "lr": "0.000128537", "gnorm": "4.614", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10251"} 2023-01-29 19:02:36 | INFO | train_inner | {"epoch": 19, "update": 18.845, "s2c_loss": "0.333", "loss": "0.23085", "s2c_nll_loss": "0.333", "s2c_accuracy": "93.125", "s2c_total": "64", "s2c_n_correct": "59.6", "wps": "257.1", 
"ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "40730", "lr": "0.00012847", "gnorm": "5.571", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10254"} 2023-01-29 19:02:38 | INFO | train_inner | {"epoch": 19, "update": 18.85, "s2c_loss": "0.282", "loss": "0.19554", "s2c_nll_loss": "0.282", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "40740", "lr": "0.000128404", "gnorm": "5.055", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10256"} 2023-01-29 19:02:41 | INFO | train_inner | {"epoch": 19, "update": 18.854, "s2c_loss": "0.258", "loss": "0.17899", "s2c_nll_loss": "0.258", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "40750", "lr": "0.000128337", "gnorm": "5.179", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10259"} 2023-01-29 19:02:43 | INFO | train_inner | {"epoch": 19, "update": 18.859, "s2c_loss": "0.238", "loss": "0.16528", "s2c_nll_loss": "0.238", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "40760", "lr": "0.00012827", "gnorm": "5.191", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10261"} 2023-01-29 19:02:46 | INFO | train_inner | {"epoch": 19, "update": 18.864, "s2c_loss": "0.199", "loss": "0.13823", "s2c_nll_loss": "0.199", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "40770", "lr": "0.000128204", "gnorm": "4.381", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10264"} 2023-01-29 19:02:48 | INFO | train_inner | {"epoch": 19, "update": 18.868, "s2c_loss": "0.186", "loss": "0.12905", "s2c_nll_loss": "0.186", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": 
"61.5", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "40780", "lr": "0.000128137", "gnorm": "3.811", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10266"} 2023-01-29 19:02:51 | INFO | train_inner | {"epoch": 19, "update": 18.873, "s2c_loss": "0.238", "loss": "0.16487", "s2c_nll_loss": "0.238", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "246.1", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "40790", "lr": "0.00012807", "gnorm": "4.11", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10269"} 2023-01-29 19:02:53 | INFO | train_inner | {"epoch": 19, "update": 18.877, "s2c_loss": "0.399", "loss": "0.27658", "s2c_nll_loss": "0.399", "s2c_accuracy": "93.906", "s2c_total": "64", "s2c_n_correct": "60.1", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "40800", "lr": "0.000128004", "gnorm": "3.846", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10271"} 2023-01-29 19:02:56 | INFO | train_inner | {"epoch": 19, "update": 18.882, "s2c_loss": "0.266", "loss": "0.18457", "s2c_nll_loss": "0.266", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "40810", "lr": "0.000127937", "gnorm": "4.112", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10274"} 2023-01-29 19:02:58 | INFO | train_inner | {"epoch": 19, "update": 18.887, "s2c_loss": "0.162", "loss": "0.11212", "s2c_nll_loss": "0.162", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "40820", "lr": "0.00012787", "gnorm": "3.675", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10276"} 2023-01-29 19:03:01 | INFO | train_inner | {"epoch": 19, "update": 18.891, "s2c_loss": "0.16", "loss": "0.11122", "s2c_nll_loss": "0.16", "s2c_accuracy": "97.344", "s2c_total": 
"64", "s2c_n_correct": "62.3", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "40830", "lr": "0.000127804", "gnorm": "3.438", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10279"} 2023-01-29 19:03:04 | INFO | train_inner | {"epoch": 19, "update": 18.896, "s2c_loss": "0.204", "loss": "0.14154", "s2c_nll_loss": "0.204", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "40840", "lr": "0.000127737", "gnorm": "3.804", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10281"} 2023-01-29 19:03:06 | INFO | train_inner | {"epoch": 19, "update": 18.901, "s2c_loss": "0.218", "loss": "0.15145", "s2c_nll_loss": "0.218", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "40850", "lr": "0.00012767", "gnorm": "4.106", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10284"} 2023-01-29 19:03:09 | INFO | train_inner | {"epoch": 19, "update": 18.905, "s2c_loss": "0.159", "loss": "0.11023", "s2c_nll_loss": "0.159", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "40860", "lr": "0.000127604", "gnorm": "3.721", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10287"} 2023-01-29 19:03:11 | INFO | train_inner | {"epoch": 19, "update": 18.91, "s2c_loss": "0.206", "loss": "0.14279", "s2c_nll_loss": "0.206", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "40870", "lr": "0.000127537", "gnorm": "3.997", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "10289"} 2023-01-29 19:03:14 | INFO | train_inner | {"epoch": 19, "update": 18.914, "s2c_loss": "0.332", "loss": "0.2302", "s2c_nll_loss": "0.332", "s2c_accuracy": 
"94.844", "s2c_total": "64", "s2c_n_correct": "60.7", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "40880", "lr": "0.00012747", "gnorm": "4.904", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10292"} 2023-01-29 19:03:16 | INFO | train_inner | {"epoch": 19, "update": 18.919, "s2c_loss": "0.142", "loss": "0.09814", "s2c_nll_loss": "0.142", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "40890", "lr": "0.000127404", "gnorm": "3.575", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10294"} 2023-01-29 19:03:19 | INFO | train_inner | {"epoch": 19, "update": 18.924, "s2c_loss": "0.199", "loss": "0.13777", "s2c_nll_loss": "0.199", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "40900", "lr": "0.000127337", "gnorm": "4.332", "loss_scale": "512", "train_wall": "3", "gb_free": "7.4", "wall": "10297"} 2023-01-29 19:03:21 | INFO | train_inner | {"epoch": 19, "update": 18.928, "s2c_loss": "0.192", "loss": "0.13326", "s2c_nll_loss": "0.192", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "40910", "lr": "0.00012727", "gnorm": "3.581", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10299"} 2023-01-29 19:03:24 | INFO | train_inner | {"epoch": 19, "update": 18.933, "s2c_loss": "0.151", "loss": "0.1049", "s2c_nll_loss": "0.151", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "40920", "lr": "0.000127204", "gnorm": "3.808", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10302"} 2023-01-29 19:03:26 | INFO | train_inner | {"epoch": 19, "update": 18.938, "s2c_loss": "0.204", "loss": "0.14151", "s2c_nll_loss": 
"0.204", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "40930", "lr": "0.000127137", "gnorm": "4.138", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10304"} 2023-01-29 19:03:29 | INFO | train_inner | {"epoch": 19, "update": 18.942, "s2c_loss": "0.233", "loss": "0.16147", "s2c_nll_loss": "0.233", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "260.6", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "40940", "lr": "0.00012707", "gnorm": "3.883", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10307"} 2023-01-29 19:03:32 | INFO | train_inner | {"epoch": 19, "update": 18.947, "s2c_loss": "0.17", "loss": "0.11754", "s2c_nll_loss": "0.17", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "40950", "lr": "0.000127004", "gnorm": "4.298", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10309"} 2023-01-29 19:03:34 | INFO | train_inner | {"epoch": 19, "update": 18.951, "s2c_loss": "0.198", "loss": "0.13733", "s2c_nll_loss": "0.198", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "40960", "lr": "0.000126937", "gnorm": "5.05", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10312"} 2023-01-29 19:03:37 | INFO | train_inner | {"epoch": 19, "update": 18.956, "s2c_loss": "0.166", "loss": "0.11532", "s2c_nll_loss": "0.166", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "40970", "lr": "0.00012687", "gnorm": "3.788", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10315"} 2023-01-29 19:03:39 | INFO | train_inner | {"epoch": 19, "update": 18.961, "s2c_loss": "0.164", "loss": 
"0.11386", "s2c_nll_loss": "0.164", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "40980", "lr": "0.000126804", "gnorm": "3.53", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10317"} 2023-01-29 19:03:42 | INFO | train_inner | {"epoch": 19, "update": 18.965, "s2c_loss": "0.119", "loss": "0.08282", "s2c_nll_loss": "0.119", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "40990", "lr": "0.000126737", "gnorm": "3.17", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10320"} 2023-01-29 19:03:44 | INFO | train_inner | {"epoch": 19, "update": 18.97, "s2c_loss": "0.146", "loss": "0.10127", "s2c_nll_loss": "0.146", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "41000", "lr": "0.00012667", "gnorm": "3.834", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10322"} 2023-01-29 19:03:47 | INFO | train_inner | {"epoch": 19, "update": 18.975, "s2c_loss": "0.146", "loss": "0.10096", "s2c_nll_loss": "0.146", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "41010", "lr": "0.000126604", "gnorm": "3.795", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10325"} 2023-01-29 19:03:49 | INFO | train_inner | {"epoch": 19, "update": 18.979, "s2c_loss": "0.192", "loss": "0.13283", "s2c_nll_loss": "0.192", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "258.7", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "41020", "lr": "0.000126537", "gnorm": "4.954", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10327"} 2023-01-29 19:03:52 | INFO | train_inner | {"epoch": 19, "update": 18.984, 
"s2c_loss": "0.196", "loss": "0.13564", "s2c_nll_loss": "0.196", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "41030", "lr": "0.00012647", "gnorm": "3.709", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10330"} 2023-01-29 19:03:54 | INFO | train_inner | {"epoch": 19, "update": 18.988, "s2c_loss": "0.198", "loss": "0.13737", "s2c_nll_loss": "0.198", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "41040", "lr": "0.000126404", "gnorm": "4.224", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "10332"} 2023-01-29 19:03:57 | INFO | train_inner | {"epoch": 19, "update": 18.993, "s2c_loss": "0.176", "loss": "0.12232", "s2c_nll_loss": "0.176", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "41050", "lr": "0.000126337", "gnorm": "3.742", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10335"} 2023-01-29 19:03:59 | INFO | train_inner | {"epoch": 19, "update": 18.998, "s2c_loss": "0.298", "loss": "0.20635", "s2c_nll_loss": "0.298", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "258.9", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "41060", "lr": "0.00012627", "gnorm": "4.176", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10337"} 2023-01-29 19:04:00 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 19:04:15 | INFO | valid | {"epoch": 19, "valid_s2c_loss": "0.831", "valid_loss": "0.57618", "valid_s2c_nll_loss": "0.831", "valid_s2c_accuracy": "85.905", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "27.4537", "valid_num_updates": "41065", "valid_best_s2c_accuracy": "87.962"} 2023-01-29 19:04:15 | INFO | fairseq.checkpoint_utils | Preparing to save 
checkpoint for epoch 19 @ 41065 updates 2023-01-29 19:04:15 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 19:04:22 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 19:04:22 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 19 @ 41065 updates, score 85.905) (writing took 7.004900731146336 seconds) 2023-01-29 19:04:22 | INFO | fairseq_cli.train | end of epoch 19 (average epoch stats below) 2023-01-29 19:04:22 | INFO | train | {"epoch": 19, "train_s2c_loss": "0.2", "train_loss": "0.13881", "train_s2c_nll_loss": "0.2", "train_s2c_accuracy": "96.45", "train_s2c_total": "63.9838", "train_s2c_n_correct": "61.7122", "train_wps": "229.3", "train_ups": "3.58", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "41065", "train_lr": "0.000126237", "train_gnorm": "4.088", "train_loss_scale": "512", "train_train_wall": "540", "train_gb_free": "7.5", "train_wall": "10360"} 2023-01-29 19:04:28 | INFO | fairseq.trainer | begin training epoch 20 2023-01-29 19:04:28 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 19:04:30 | INFO | train_inner | {"epoch": 20, "update": 19.002, "s2c_loss": "0.277", "loss": "0.19231", "s2c_nll_loss": "0.277", "s2c_accuracy": "94.901", "s2c_total": "60.8", "s2c_n_correct": "57.7", "wps": "19.9", "ups": "0.33", "wpb": "60.8", "bsz": "60.8", "num_updates": "41070", "lr": "0.000126204", "gnorm": "5.402", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10368"} 2023-01-29 19:04:32 | INFO | train_inner | {"epoch": 20, "update": 19.007, "s2c_loss": "0.204", "loss": "0.14171", "s2c_nll_loss": "0.204", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "41080", "lr": "0.000126137", 
"gnorm": "3.945", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10370"} 2023-01-29 19:04:35 | INFO | train_inner | {"epoch": 20, "update": 19.012, "s2c_loss": "0.222", "loss": "0.15367", "s2c_nll_loss": "0.222", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "41090", "lr": "0.00012607", "gnorm": "5.081", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10373"} 2023-01-29 19:04:37 | INFO | train_inner | {"epoch": 20, "update": 19.016, "s2c_loss": "0.142", "loss": "0.09857", "s2c_nll_loss": "0.142", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "41100", "lr": "0.000126004", "gnorm": "3.5", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10375"} 2023-01-29 19:04:40 | INFO | train_inner | {"epoch": 20, "update": 19.021, "s2c_loss": "0.184", "loss": "0.12756", "s2c_nll_loss": "0.184", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "41110", "lr": "0.000125937", "gnorm": "4.384", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10378"} 2023-01-29 19:04:42 | INFO | train_inner | {"epoch": 20, "update": 19.025, "s2c_loss": "0.201", "loss": "0.13948", "s2c_nll_loss": "0.201", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "41120", "lr": "0.00012587", "gnorm": "4.419", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10380"} 2023-01-29 19:04:45 | INFO | train_inner | {"epoch": 20, "update": 19.03, "s2c_loss": "0.261", "loss": "0.18084", "s2c_nll_loss": "0.261", "s2c_accuracy": "94.688", "s2c_total": "64", "s2c_n_correct": "60.6", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "41130", 
"lr": "0.000125804", "gnorm": "5.231", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10383"} 2023-01-29 19:04:47 | INFO | train_inner | {"epoch": 20, "update": 19.035, "s2c_loss": "0.152", "loss": "0.10551", "s2c_nll_loss": "0.152", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "41140", "lr": "0.000125737", "gnorm": "3.274", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10385"} 2023-01-29 19:04:50 | INFO | train_inner | {"epoch": 20, "update": 19.039, "s2c_loss": "0.401", "loss": "0.27802", "s2c_nll_loss": "0.401", "s2c_accuracy": "94.062", "s2c_total": "64", "s2c_n_correct": "60.2", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "41150", "lr": "0.00012567", "gnorm": "4.87", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10388"} 2023-01-29 19:04:53 | INFO | train_inner | {"epoch": 20, "update": 19.044, "s2c_loss": "0.175", "loss": "0.1214", "s2c_nll_loss": "0.175", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "41160", "lr": "0.000125604", "gnorm": "4.052", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10390"} 2023-01-29 19:04:55 | INFO | train_inner | {"epoch": 20, "update": 19.049, "s2c_loss": "0.182", "loss": "0.12614", "s2c_nll_loss": "0.182", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "41170", "lr": "0.000125537", "gnorm": "3.726", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10393"} 2023-01-29 19:04:58 | INFO | train_inner | {"epoch": 20, "update": 19.053, "s2c_loss": "0.176", "loss": "0.12211", "s2c_nll_loss": "0.176", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", 
"num_updates": "41180", "lr": "0.00012547", "gnorm": "3.254", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10396"} 2023-01-29 19:05:00 | INFO | train_inner | {"epoch": 20, "update": 19.058, "s2c_loss": "0.267", "loss": "0.18525", "s2c_nll_loss": "0.267", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "258.1", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "41190", "lr": "0.000125404", "gnorm": "4.103", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "10398"} 2023-01-29 19:05:03 | INFO | train_inner | {"epoch": 20, "update": 19.062, "s2c_loss": "0.125", "loss": "0.08667", "s2c_nll_loss": "0.125", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "41200", "lr": "0.000125337", "gnorm": "2.962", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10401"} 2023-01-29 19:05:05 | INFO | train_inner | {"epoch": 20, "update": 19.067, "s2c_loss": "0.176", "loss": "0.1218", "s2c_nll_loss": "0.176", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "41210", "lr": "0.00012527", "gnorm": "4.396", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10403"} 2023-01-29 19:05:08 | INFO | train_inner | {"epoch": 20, "update": 19.072, "s2c_loss": "0.17", "loss": "0.11785", "s2c_nll_loss": "0.17", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "41220", "lr": "0.000125204", "gnorm": "3.493", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10406"} 2023-01-29 19:05:10 | INFO | train_inner | {"epoch": 20, "update": 19.076, "s2c_loss": "0.211", "loss": "0.14598", "s2c_nll_loss": "0.211", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "253.3", "ups": "3.96", "wpb": 
"64", "bsz": "64", "num_updates": "41230", "lr": "0.000125137", "gnorm": "4.291", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10408"} 2023-01-29 19:05:13 | INFO | train_inner | {"epoch": 20, "update": 19.081, "s2c_loss": "0.132", "loss": "0.09163", "s2c_nll_loss": "0.132", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "41240", "lr": "0.00012507", "gnorm": "3.245", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10411"} 2023-01-29 19:05:15 | INFO | train_inner | {"epoch": 20, "update": 19.086, "s2c_loss": "0.142", "loss": "0.09847", "s2c_nll_loss": "0.142", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "41250", "lr": "0.000125004", "gnorm": "3.383", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10413"} 2023-01-29 19:05:18 | INFO | train_inner | {"epoch": 20, "update": 19.09, "s2c_loss": "0.131", "loss": "0.09046", "s2c_nll_loss": "0.131", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "259.9", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "41260", "lr": "0.000124937", "gnorm": "3.304", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "10416"} 2023-01-29 19:05:20 | INFO | train_inner | {"epoch": 20, "update": 19.095, "s2c_loss": "0.206", "loss": "0.14261", "s2c_nll_loss": "0.206", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "262.5", "ups": "4.1", "wpb": "64", "bsz": "64", "num_updates": "41270", "lr": "0.00012487", "gnorm": "3.596", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10418"} 2023-01-29 19:05:23 | INFO | train_inner | {"epoch": 20, "update": 19.099, "s2c_loss": "0.231", "loss": "0.15995", "s2c_nll_loss": "0.231", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "261.8", 
"ups": "4.09", "wpb": "64", "bsz": "64", "num_updates": "41280", "lr": "0.000124804", "gnorm": "3.859", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10421"} 2023-01-29 19:05:25 | INFO | train_inner | {"epoch": 20, "update": 19.104, "s2c_loss": "0.229", "loss": "0.15881", "s2c_nll_loss": "0.229", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "41290", "lr": "0.000124737", "gnorm": "3.625", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10423"} 2023-01-29 19:05:28 | INFO | train_inner | {"epoch": 20, "update": 19.109, "s2c_loss": "0.12", "loss": "0.08316", "s2c_nll_loss": "0.12", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "41300", "lr": "0.00012467", "gnorm": "3.854", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10426"} 2023-01-29 19:05:30 | INFO | train_inner | {"epoch": 20, "update": 19.113, "s2c_loss": "0.141", "loss": "0.09799", "s2c_nll_loss": "0.141", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "41310", "lr": "0.000124604", "gnorm": "3.69", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10428"} 2023-01-29 19:05:33 | INFO | train_inner | {"epoch": 20, "update": 19.118, "s2c_loss": "0.12", "loss": "0.0835", "s2c_nll_loss": "0.12", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "41320", "lr": "0.000124537", "gnorm": "3.36", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10431"} 2023-01-29 19:05:35 | INFO | train_inner | {"epoch": 20, "update": 19.123, "s2c_loss": "0.214", "loss": "0.14841", "s2c_nll_loss": "0.214", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", 
"wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "41330", "lr": "0.00012447", "gnorm": "3.415", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10433"} 2023-01-29 19:05:38 | INFO | train_inner | {"epoch": 20, "update": 19.127, "s2c_loss": "0.17", "loss": "0.11759", "s2c_nll_loss": "0.17", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "41340", "lr": "0.000124404", "gnorm": "3.728", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10436"} 2023-01-29 19:05:40 | INFO | train_inner | {"epoch": 20, "update": 19.132, "s2c_loss": "0.17", "loss": "0.11787", "s2c_nll_loss": "0.17", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "41350", "lr": "0.000124337", "gnorm": "3.731", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10438"} 2023-01-29 19:05:43 | INFO | train_inner | {"epoch": 20, "update": 19.136, "s2c_loss": "0.182", "loss": "0.1264", "s2c_nll_loss": "0.182", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "41360", "lr": "0.00012427", "gnorm": "4.058", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10441"} 2023-01-29 19:05:45 | INFO | train_inner | {"epoch": 20, "update": 19.141, "s2c_loss": "0.18", "loss": "0.12463", "s2c_nll_loss": "0.18", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "41370", "lr": "0.000124204", "gnorm": "3.693", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10443"} 2023-01-29 19:05:48 | INFO | train_inner | {"epoch": 20, "update": 19.146, "s2c_loss": "0.158", "loss": "0.10936", "s2c_nll_loss": "0.158", "s2c_accuracy": "97.344", "s2c_total": "64", 
"s2c_n_correct": "62.3", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "41380", "lr": "0.000124137", "gnorm": "3.223", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10446"} 2023-01-29 19:05:50 | INFO | train_inner | {"epoch": 20, "update": 19.15, "s2c_loss": "0.134", "loss": "0.09259", "s2c_nll_loss": "0.134", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "41390", "lr": "0.00012407", "gnorm": "2.861", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10448"} 2023-01-29 19:05:53 | INFO | train_inner | {"epoch": 20, "update": 19.155, "s2c_loss": "0.105", "loss": "0.07288", "s2c_nll_loss": "0.105", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "41400", "lr": "0.000124004", "gnorm": "3.039", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10451"} 2023-01-29 19:05:55 | INFO | train_inner | {"epoch": 20, "update": 19.16, "s2c_loss": "0.175", "loss": "0.12154", "s2c_nll_loss": "0.175", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "41410", "lr": "0.000123937", "gnorm": "3.426", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "10453"} 2023-01-29 19:05:58 | INFO | train_inner | {"epoch": 20, "update": 19.164, "s2c_loss": "0.22", "loss": "0.15275", "s2c_nll_loss": "0.22", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "41420", "lr": "0.00012387", "gnorm": "3.536", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10456"} 2023-01-29 19:06:01 | INFO | train_inner | {"epoch": 20, "update": 19.169, "s2c_loss": "0.12", "loss": "0.08297", "s2c_nll_loss": "0.12", "s2c_accuracy": "97.812", 
"s2c_total": "64", "s2c_n_correct": "62.6", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "41430", "lr": "0.000123804", "gnorm": "3.059", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10458"} 2023-01-29 19:06:03 | INFO | train_inner | {"epoch": 20, "update": 19.173, "s2c_loss": "0.166", "loss": "0.11481", "s2c_nll_loss": "0.166", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "41440", "lr": "0.000123737", "gnorm": "3.681", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10461"} 2023-01-29 19:06:06 | INFO | train_inner | {"epoch": 20, "update": 19.178, "s2c_loss": "0.12", "loss": "0.08311", "s2c_nll_loss": "0.12", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "41450", "lr": "0.00012367", "gnorm": "3.991", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10463"} 2023-01-29 19:06:08 | INFO | train_inner | {"epoch": 20, "update": 19.183, "s2c_loss": "0.145", "loss": "0.1004", "s2c_nll_loss": "0.145", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "41460", "lr": "0.000123604", "gnorm": "2.844", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10466"} 2023-01-29 19:06:11 | INFO | train_inner | {"epoch": 20, "update": 19.187, "s2c_loss": "0.125", "loss": "0.08698", "s2c_nll_loss": "0.125", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "41470", "lr": "0.000123537", "gnorm": "2.948", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10469"} 2023-01-29 19:06:13 | INFO | train_inner | {"epoch": 20, "update": 19.192, "s2c_loss": "0.248", "loss": "0.17168", "s2c_nll_loss": "0.248", 
"s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "41480", "lr": "0.00012347", "gnorm": "4.289", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10471"} 2023-01-29 19:06:16 | INFO | train_inner | {"epoch": 20, "update": 19.197, "s2c_loss": "0.148", "loss": "0.10257", "s2c_nll_loss": "0.148", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "41490", "lr": "0.000123404", "gnorm": "3.339", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10474"} 2023-01-29 19:06:18 | INFO | train_inner | {"epoch": 20, "update": 19.201, "s2c_loss": "0.145", "loss": "0.10038", "s2c_nll_loss": "0.145", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "41500", "lr": "0.000123337", "gnorm": "3.437", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10476"} 2023-01-29 19:06:21 | INFO | train_inner | {"epoch": 20, "update": 19.206, "s2c_loss": "0.321", "loss": "0.22243", "s2c_nll_loss": "0.321", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "41510", "lr": "0.000123271", "gnorm": "3.028", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10479"} 2023-01-29 19:06:23 | INFO | train_inner | {"epoch": 20, "update": 19.21, "s2c_loss": "0.122", "loss": "0.08424", "s2c_nll_loss": "0.122", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "41520", "lr": "0.000123204", "gnorm": "3.029", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10481"} 2023-01-29 19:06:26 | INFO | train_inner | {"epoch": 20, "update": 19.215, "s2c_loss": "0.115", "loss": "0.07944", 
"s2c_nll_loss": "0.115", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "41530", "lr": "0.000123137", "gnorm": "2.731", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10484"} 2023-01-29 19:06:28 | INFO | train_inner | {"epoch": 20, "update": 19.22, "s2c_loss": "0.166", "loss": "0.11511", "s2c_nll_loss": "0.166", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "41540", "lr": "0.000123071", "gnorm": "3.428", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10486"} 2023-01-29 19:06:31 | INFO | train_inner | {"epoch": 20, "update": 19.224, "s2c_loss": "0.085", "loss": "0.0588", "s2c_nll_loss": "0.085", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "259.8", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "41550", "lr": "0.000123004", "gnorm": "2.889", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10489"} 2023-01-29 19:06:33 | INFO | train_inner | {"epoch": 20, "update": 19.229, "s2c_loss": "0.166", "loss": "0.11505", "s2c_nll_loss": "0.166", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "248", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "41560", "lr": "0.000122937", "gnorm": "3.696", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10491"} 2023-01-29 19:06:36 | INFO | train_inner | {"epoch": 20, "update": 19.234, "s2c_loss": "0.192", "loss": "0.13335", "s2c_nll_loss": "0.192", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "41570", "lr": "0.000122871", "gnorm": "3.789", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10494"} 2023-01-29 19:06:38 | INFO | train_inner | {"epoch": 20, "update": 19.238, "s2c_loss": "0.159", 
"loss": "0.11038", "s2c_nll_loss": "0.159", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "41580", "lr": "0.000122804", "gnorm": "3.572", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10496"} 2023-01-29 19:06:41 | INFO | train_inner | {"epoch": 20, "update": 19.243, "s2c_loss": "0.145", "loss": "0.10038", "s2c_nll_loss": "0.145", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "41590", "lr": "0.000122737", "gnorm": "3.369", "loss_scale": "512", "train_wall": "2", "gb_free": "7.4", "wall": "10499"} 2023-01-29 19:06:43 | INFO | train_inner | {"epoch": 20, "update": 19.247, "s2c_loss": "0.276", "loss": "0.19101", "s2c_nll_loss": "0.276", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "41600", "lr": "0.000122671", "gnorm": "4.516", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10501"} 2023-01-29 19:06:46 | INFO | train_inner | {"epoch": 20, "update": 19.252, "s2c_loss": "0.155", "loss": "0.1075", "s2c_nll_loss": "0.155", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "41610", "lr": "0.000122604", "gnorm": "3.682", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10504"} 2023-01-29 19:06:48 | INFO | train_inner | {"epoch": 20, "update": 19.257, "s2c_loss": "0.211", "loss": "0.14592", "s2c_nll_loss": "0.211", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "41620", "lr": "0.000122537", "gnorm": "4.69", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10506"} 2023-01-29 19:06:51 | INFO | train_inner | {"epoch": 20, "update": 19.261, 
"s2c_loss": "0.125", "loss": "0.08662", "s2c_nll_loss": "0.125", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "41630", "lr": "0.000122471", "gnorm": "3.502", "loss_scale": "512", "train_wall": "2", "gb_free": "7.2", "wall": "10509"} 2023-01-29 19:06:54 | INFO | train_inner | {"epoch": 20, "update": 19.266, "s2c_loss": "0.249", "loss": "0.17271", "s2c_nll_loss": "0.249", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "41640", "lr": "0.000122404", "gnorm": "5.109", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10511"} 2023-01-29 19:06:56 | INFO | train_inner | {"epoch": 20, "update": 19.271, "s2c_loss": "0.162", "loss": "0.11195", "s2c_nll_loss": "0.162", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "41650", "lr": "0.000122337", "gnorm": "3.65", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10514"} 2023-01-29 19:06:59 | INFO | train_inner | {"epoch": 20, "update": 19.275, "s2c_loss": "0.135", "loss": "0.09371", "s2c_nll_loss": "0.135", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "41660", "lr": "0.000122271", "gnorm": "2.924", "loss_scale": "512", "train_wall": "3", "gb_free": "7.3", "wall": "10517"} 2023-01-29 19:07:01 | INFO | train_inner | {"epoch": 20, "update": 19.28, "s2c_loss": "0.178", "loss": "0.12359", "s2c_nll_loss": "0.178", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "41670", "lr": "0.000122204", "gnorm": "3.083", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10519"} 2023-01-29 19:07:04 | INFO | train_inner | {"epoch": 20, 
"update": 19.284, "s2c_loss": "0.134", "loss": "0.0932", "s2c_nll_loss": "0.134", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "41680", "lr": "0.000122137", "gnorm": "3.594", "loss_scale": "512", "train_wall": "2", "gb_free": "7.3", "wall": "10522"} 2023-01-29 19:07:06 | INFO | train_inner | {"epoch": 20, "update": 19.289, "s2c_loss": "0.18", "loss": "0.12474", "s2c_nll_loss": "0.18", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "41690", "lr": "0.000122071", "gnorm": "3.711", "loss_scale": "512", "train_wall": "3", "gb_free": "7.2", "wall": "10524"} 2023-01-29 19:07:09 | INFO | train_inner | {"epoch": 20, "update": 19.294, "s2c_loss": "0.166", "loss": "0.11534", "s2c_nll_loss": "0.166", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "247.5", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "41700", "lr": "0.000122004", "gnorm": "3.68", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10527"} 2023-01-29 19:07:11 | INFO | train_inner | {"epoch": 20, "update": 19.298, "s2c_loss": "0.116", "loss": "0.08009", "s2c_nll_loss": "0.116", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "41710", "lr": "0.000121937", "gnorm": "3.297", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10529"} 2023-01-29 19:07:14 | INFO | train_inner | {"epoch": 20, "update": 19.303, "s2c_loss": "0.138", "loss": "0.09558", "s2c_nll_loss": "0.138", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "41720", "lr": "0.000121871", "gnorm": "3.194", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10532"} 2023-01-29 19:07:16 | INFO | 
train_inner | {"epoch": 20, "update": 19.308, "s2c_loss": "0.138", "loss": "0.09597", "s2c_nll_loss": "0.138", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "41730", "lr": "0.000121804", "gnorm": "3.585", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10534"} 2023-01-29 19:07:19 | INFO | train_inner | {"epoch": 20, "update": 19.312, "s2c_loss": "0.144", "loss": "0.09999", "s2c_nll_loss": "0.144", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "41740", "lr": "0.000121737", "gnorm": "3.217", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10537"} 2023-01-29 19:07:22 | INFO | train_inner | {"epoch": 20, "update": 19.317, "s2c_loss": "0.124", "loss": "0.08579", "s2c_nll_loss": "0.124", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "41750", "lr": "0.000121671", "gnorm": "3.256", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10539"} 2023-01-29 19:07:24 | INFO | train_inner | {"epoch": 20, "update": 19.321, "s2c_loss": "0.158", "loss": "0.10917", "s2c_nll_loss": "0.158", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "41760", "lr": "0.000121604", "gnorm": "4.244", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10542"} 2023-01-29 19:07:27 | INFO | train_inner | {"epoch": 20, "update": 19.326, "s2c_loss": "0.146", "loss": "0.10126", "s2c_nll_loss": "0.146", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "41770", "lr": "0.000121537", "gnorm": "3.883", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10544"} 
2023-01-29 19:07:29 | INFO | train_inner | {"epoch": 20, "update": 19.331, "s2c_loss": "0.144", "loss": "0.09959", "s2c_nll_loss": "0.144", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "245.3", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "41780", "lr": "0.000121471", "gnorm": "3.525", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "10547"} 2023-01-29 19:07:32 | INFO | train_inner | {"epoch": 20, "update": 19.335, "s2c_loss": "0.136", "loss": "0.09403", "s2c_nll_loss": "0.136", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "41790", "lr": "0.000121404", "gnorm": "3.745", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10550"} 2023-01-29 19:07:34 | INFO | train_inner | {"epoch": 20, "update": 19.34, "s2c_loss": "0.168", "loss": "0.11625", "s2c_nll_loss": "0.168", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "41800", "lr": "0.000121337", "gnorm": "4.017", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10552"} 2023-01-29 19:07:37 | INFO | train_inner | {"epoch": 20, "update": 19.345, "s2c_loss": "0.148", "loss": "0.10239", "s2c_nll_loss": "0.148", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "41810", "lr": "0.000121271", "gnorm": "3.392", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10555"} 2023-01-29 19:07:39 | INFO | train_inner | {"epoch": 20, "update": 19.349, "s2c_loss": "0.143", "loss": "0.09914", "s2c_nll_loss": "0.143", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "41820", "lr": "0.000121204", "gnorm": "3.425", "loss_scale": "1024", "train_wall": "2", 
"gb_free": "7.2", "wall": "10557"} 2023-01-29 19:07:42 | INFO | train_inner | {"epoch": 20, "update": 19.354, "s2c_loss": "0.213", "loss": "0.1474", "s2c_nll_loss": "0.213", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "41830", "lr": "0.000121137", "gnorm": "4.176", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10560"} 2023-01-29 19:07:44 | INFO | train_inner | {"epoch": 20, "update": 19.358, "s2c_loss": "0.149", "loss": "0.10339", "s2c_nll_loss": "0.149", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "41840", "lr": "0.000121071", "gnorm": "3.057", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10562"} 2023-01-29 19:07:47 | INFO | train_inner | {"epoch": 20, "update": 19.363, "s2c_loss": "0.158", "loss": "0.10959", "s2c_nll_loss": "0.158", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "41850", "lr": "0.000121004", "gnorm": "3.49", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10565"} 2023-01-29 19:07:49 | INFO | train_inner | {"epoch": 20, "update": 19.368, "s2c_loss": "0.176", "loss": "0.12219", "s2c_nll_loss": "0.176", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "41860", "lr": "0.000120937", "gnorm": "3.722", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10567"} 2023-01-29 19:07:52 | INFO | train_inner | {"epoch": 20, "update": 19.372, "s2c_loss": "0.106", "loss": "0.07325", "s2c_nll_loss": "0.106", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "41870", "lr": "0.000120871", "gnorm": "2.706", "loss_scale": 
"1024", "train_wall": "3", "gb_free": "7.3", "wall": "10570"} 2023-01-29 19:07:54 | INFO | train_inner | {"epoch": 20, "update": 19.377, "s2c_loss": "0.198", "loss": "0.13717", "s2c_nll_loss": "0.198", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "41880", "lr": "0.000120804", "gnorm": "3.986", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10572"} 2023-01-29 19:07:57 | INFO | train_inner | {"epoch": 20, "update": 19.382, "s2c_loss": "0.132", "loss": "0.09178", "s2c_nll_loss": "0.132", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "41890", "lr": "0.000120737", "gnorm": "3.656", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10575"} 2023-01-29 19:08:00 | INFO | train_inner | {"epoch": 20, "update": 19.386, "s2c_loss": "0.137", "loss": "0.09488", "s2c_nll_loss": "0.137", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "41900", "lr": "0.000120671", "gnorm": "3.272", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10577"} 2023-01-29 19:08:02 | INFO | train_inner | {"epoch": 20, "update": 19.391, "s2c_loss": "0.187", "loss": "0.12992", "s2c_nll_loss": "0.187", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "41910", "lr": "0.000120604", "gnorm": "4.897", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10580"} 2023-01-29 19:08:05 | INFO | train_inner | {"epoch": 20, "update": 19.395, "s2c_loss": "0.148", "loss": "0.10291", "s2c_nll_loss": "0.148", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "41920", "lr": "0.000120537", 
"gnorm": "3.352", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10583"} 2023-01-29 19:08:07 | INFO | train_inner | {"epoch": 20, "update": 19.4, "s2c_loss": "0.213", "loss": "0.14795", "s2c_nll_loss": "0.213", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "41930", "lr": "0.000120471", "gnorm": "3.703", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10585"} 2023-01-29 19:08:10 | INFO | train_inner | {"epoch": 20, "update": 19.405, "s2c_loss": "0.154", "loss": "0.10665", "s2c_nll_loss": "0.154", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "41940", "lr": "0.000120404", "gnorm": "3.255", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10588"} 2023-01-29 19:08:12 | INFO | train_inner | {"epoch": 20, "update": 19.409, "s2c_loss": "0.199", "loss": "0.13775", "s2c_nll_loss": "0.199", "s2c_accuracy": "95.781", "s2c_total": "64", "s2c_n_correct": "61.3", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "41950", "lr": "0.000120337", "gnorm": "3.905", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10590"} 2023-01-29 19:08:15 | INFO | train_inner | {"epoch": 20, "update": 19.414, "s2c_loss": "0.134", "loss": "0.09266", "s2c_nll_loss": "0.134", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "41960", "lr": "0.000120271", "gnorm": "3.763", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10593"} 2023-01-29 19:08:17 | INFO | train_inner | {"epoch": 20, "update": 19.419, "s2c_loss": "0.164", "loss": "0.1134", "s2c_nll_loss": "0.164", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "258.9", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": 
"41970", "lr": "0.000120204", "gnorm": "3.153", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10595"} 2023-01-29 19:08:20 | INFO | train_inner | {"epoch": 20, "update": 19.423, "s2c_loss": "0.142", "loss": "0.09835", "s2c_nll_loss": "0.142", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "41980", "lr": "0.000120137", "gnorm": "3.4", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10598"} 2023-01-29 19:08:22 | INFO | train_inner | {"epoch": 20, "update": 19.428, "s2c_loss": "0.129", "loss": "0.0896", "s2c_nll_loss": "0.129", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "41990", "lr": "0.000120071", "gnorm": "3.178", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10600"} 2023-01-29 19:08:25 | INFO | train_inner | {"epoch": 20, "update": 19.432, "s2c_loss": "0.126", "loss": "0.08719", "s2c_nll_loss": "0.126", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "42000", "lr": "0.000120004", "gnorm": "3.587", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10603"} 2023-01-29 19:08:27 | INFO | train_inner | {"epoch": 20, "update": 19.437, "s2c_loss": "0.164", "loss": "0.11353", "s2c_nll_loss": "0.164", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "42010", "lr": "0.000119937", "gnorm": "3.823", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10605"} 2023-01-29 19:08:30 | INFO | train_inner | {"epoch": 20, "update": 19.442, "s2c_loss": "0.115", "loss": "0.08001", "s2c_nll_loss": "0.115", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "254.6", "ups": "3.98", "wpb": "64", 
"bsz": "64", "num_updates": "42020", "lr": "0.000119871", "gnorm": "3.456", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10608"} 2023-01-29 19:08:32 | INFO | train_inner | {"epoch": 20, "update": 19.446, "s2c_loss": "0.141", "loss": "0.0974", "s2c_nll_loss": "0.141", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "42030", "lr": "0.000119804", "gnorm": "3.266", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10610"} 2023-01-29 19:08:35 | INFO | train_inner | {"epoch": 20, "update": 19.451, "s2c_loss": "0.159", "loss": "0.11024", "s2c_nll_loss": "0.159", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "42040", "lr": "0.000119737", "gnorm": "4.086", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "10613"} 2023-01-29 19:08:37 | INFO | train_inner | {"epoch": 20, "update": 19.456, "s2c_loss": "0.159", "loss": "0.11051", "s2c_nll_loss": "0.159", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "42050", "lr": "0.000119671", "gnorm": "3.542", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10615"} 2023-01-29 19:08:40 | INFO | train_inner | {"epoch": 20, "update": 19.46, "s2c_loss": "0.153", "loss": "0.10615", "s2c_nll_loss": "0.153", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "42060", "lr": "0.000119604", "gnorm": "3.292", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10618"} 2023-01-29 19:08:43 | INFO | train_inner | {"epoch": 20, "update": 19.465, "s2c_loss": "0.138", "loss": "0.09597", "s2c_nll_loss": "0.138", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "250.4", 
"ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "42070", "lr": "0.000119537", "gnorm": "3.452", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10620"} 2023-01-29 19:08:45 | INFO | train_inner | {"epoch": 20, "update": 19.469, "s2c_loss": "0.184", "loss": "0.12757", "s2c_nll_loss": "0.184", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "42080", "lr": "0.000119471", "gnorm": "3.983", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10623"} 2023-01-29 19:08:48 | INFO | train_inner | {"epoch": 20, "update": 19.474, "s2c_loss": "0.2", "loss": "0.13828", "s2c_nll_loss": "0.2", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "42090", "lr": "0.000119404", "gnorm": "4.058", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10626"} 2023-01-29 19:08:50 | INFO | train_inner | {"epoch": 20, "update": 19.479, "s2c_loss": "0.165", "loss": "0.11419", "s2c_nll_loss": "0.165", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "42100", "lr": "0.000119337", "gnorm": "3.785", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10628"} 2023-01-29 19:08:53 | INFO | train_inner | {"epoch": 20, "update": 19.483, "s2c_loss": "0.166", "loss": "0.11488", "s2c_nll_loss": "0.166", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "42110", "lr": "0.000119271", "gnorm": "3.776", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10631"} 2023-01-29 19:08:55 | INFO | train_inner | {"epoch": 20, "update": 19.488, "s2c_loss": "0.218", "loss": "0.15098", "s2c_nll_loss": "0.218", "s2c_accuracy": "95.625", "s2c_total": "64", 
"s2c_n_correct": "61.2", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "42120", "lr": "0.000119204", "gnorm": "4.161", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10633"} 2023-01-29 19:08:58 | INFO | train_inner | {"epoch": 20, "update": 19.493, "s2c_loss": "0.166", "loss": "0.11526", "s2c_nll_loss": "0.166", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "259.7", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "42130", "lr": "0.000119137", "gnorm": "3.107", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10636"} 2023-01-29 19:09:00 | INFO | train_inner | {"epoch": 20, "update": 19.497, "s2c_loss": "0.134", "loss": "0.09267", "s2c_nll_loss": "0.134", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "42140", "lr": "0.000119071", "gnorm": "3.317", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10638"} 2023-01-29 19:09:03 | INFO | train_inner | {"epoch": 20, "update": 19.502, "s2c_loss": "0.191", "loss": "0.13221", "s2c_nll_loss": "0.191", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "42150", "lr": "0.000119004", "gnorm": "3.961", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10641"} 2023-01-29 19:09:05 | INFO | train_inner | {"epoch": 20, "update": 19.506, "s2c_loss": "0.152", "loss": "0.10529", "s2c_nll_loss": "0.152", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "42160", "lr": "0.000118937", "gnorm": "3.329", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10643"} 2023-01-29 19:09:08 | INFO | train_inner | {"epoch": 20, "update": 19.511, "s2c_loss": "0.191", "loss": "0.13213", "s2c_nll_loss": "0.191", "s2c_accuracy": 
"97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "42170", "lr": "0.000118871", "gnorm": "3.15", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10646"} 2023-01-29 19:09:10 | INFO | train_inner | {"epoch": 20, "update": 19.516, "s2c_loss": "0.095", "loss": "0.0661", "s2c_nll_loss": "0.095", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "42180", "lr": "0.000118804", "gnorm": "3", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10648"} 2023-01-29 19:09:13 | INFO | train_inner | {"epoch": 20, "update": 19.52, "s2c_loss": "0.181", "loss": "0.12531", "s2c_nll_loss": "0.181", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "42190", "lr": "0.000118737", "gnorm": "3.767", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10651"} 2023-01-29 19:09:15 | INFO | train_inner | {"epoch": 20, "update": 19.525, "s2c_loss": "0.106", "loss": "0.07356", "s2c_nll_loss": "0.106", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "42200", "lr": "0.000118671", "gnorm": "2.716", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10653"} 2023-01-29 19:09:18 | INFO | train_inner | {"epoch": 20, "update": 19.53, "s2c_loss": "0.145", "loss": "0.10016", "s2c_nll_loss": "0.145", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "42210", "lr": "0.000118604", "gnorm": "3.163", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10656"} 2023-01-29 19:09:20 | INFO | train_inner | {"epoch": 20, "update": 19.534, "s2c_loss": "0.102", "loss": "0.07063", "s2c_nll_loss": 
"0.102", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "258.3", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "42220", "lr": "0.000118537", "gnorm": "2.586", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10658"} 2023-01-29 19:09:23 | INFO | train_inner | {"epoch": 20, "update": 19.539, "s2c_loss": "0.101", "loss": "0.07033", "s2c_nll_loss": "0.101", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "260.4", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "42230", "lr": "0.000118471", "gnorm": "2.987", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10661"} 2023-01-29 19:09:25 | INFO | train_inner | {"epoch": 20, "update": 19.543, "s2c_loss": "0.154", "loss": "0.10665", "s2c_nll_loss": "0.154", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "42240", "lr": "0.000118404", "gnorm": "3.265", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10663"} 2023-01-29 19:09:28 | INFO | train_inner | {"epoch": 20, "update": 19.548, "s2c_loss": "0.13", "loss": "0.08988", "s2c_nll_loss": "0.13", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "42250", "lr": "0.000118337", "gnorm": "2.766", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10666"} 2023-01-29 19:09:30 | INFO | train_inner | {"epoch": 20, "update": 19.553, "s2c_loss": "0.102", "loss": "0.07053", "s2c_nll_loss": "0.102", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "42260", "lr": "0.000118271", "gnorm": "2.674", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10668"} 2023-01-29 19:09:33 | INFO | train_inner | {"epoch": 20, "update": 19.557, "s2c_loss": "0.151", "loss": 
"0.10487", "s2c_nll_loss": "0.151", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "42270", "lr": "0.000118204", "gnorm": "3.66", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "10671"} 2023-01-29 19:09:35 | INFO | train_inner | {"epoch": 20, "update": 19.562, "s2c_loss": "0.152", "loss": "0.10506", "s2c_nll_loss": "0.152", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "253.8", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "42280", "lr": "0.000118137", "gnorm": "3.271", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10673"} 2023-01-29 19:09:38 | INFO | train_inner | {"epoch": 20, "update": 19.567, "s2c_loss": "0.102", "loss": "0.07072", "s2c_nll_loss": "0.102", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "42290", "lr": "0.000118071", "gnorm": "3.133", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10676"} 2023-01-29 19:09:40 | INFO | train_inner | {"epoch": 20, "update": 19.571, "s2c_loss": "0.173", "loss": "0.12021", "s2c_nll_loss": "0.173", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "42300", "lr": "0.000118004", "gnorm": "3.354", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10678"} 2023-01-29 19:09:43 | INFO | train_inner | {"epoch": 20, "update": 19.576, "s2c_loss": "0.105", "loss": "0.07294", "s2c_nll_loss": "0.105", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "42310", "lr": "0.000117937", "gnorm": "3.033", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10681"} 2023-01-29 19:09:46 | INFO | train_inner | {"epoch": 20, "update": 19.58, 
"s2c_loss": "0.114", "loss": "0.07892", "s2c_nll_loss": "0.114", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "42320", "lr": "0.000117871", "gnorm": "3.122", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10684"} 2023-01-29 19:09:48 | INFO | train_inner | {"epoch": 20, "update": 19.585, "s2c_loss": "0.154", "loss": "0.10664", "s2c_nll_loss": "0.154", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "42330", "lr": "0.000117804", "gnorm": "3.18", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10686"} 2023-01-29 19:09:51 | INFO | train_inner | {"epoch": 20, "update": 19.59, "s2c_loss": "0.159", "loss": "0.11037", "s2c_nll_loss": "0.159", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "42340", "lr": "0.000117737", "gnorm": "3.586", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10689"} 2023-01-29 19:09:53 | INFO | train_inner | {"epoch": 20, "update": 19.594, "s2c_loss": "0.182", "loss": "0.12599", "s2c_nll_loss": "0.182", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "42350", "lr": "0.000117671", "gnorm": "3.793", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10691"} 2023-01-29 19:09:56 | INFO | train_inner | {"epoch": 20, "update": 19.599, "s2c_loss": "0.162", "loss": "0.11252", "s2c_nll_loss": "0.162", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "42360", "lr": "0.000117604", "gnorm": "3.981", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10694"} 2023-01-29 19:09:58 | INFO | train_inner | {"epoch": 
20, "update": 19.604, "s2c_loss": "0.195", "loss": "0.13539", "s2c_nll_loss": "0.195", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "259.6", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "42370", "lr": "0.000117537", "gnorm": "4.063", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10696"} 2023-01-29 19:10:01 | INFO | train_inner | {"epoch": 20, "update": 19.608, "s2c_loss": "0.176", "loss": "0.12227", "s2c_nll_loss": "0.176", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "42380", "lr": "0.000117471", "gnorm": "4.006", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10699"} 2023-01-29 19:10:03 | INFO | train_inner | {"epoch": 20, "update": 19.613, "s2c_loss": "0.224", "loss": "0.15539", "s2c_nll_loss": "0.224", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "42390", "lr": "0.000117404", "gnorm": "3.793", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10701"} 2023-01-29 19:10:06 | INFO | train_inner | {"epoch": 20, "update": 19.617, "s2c_loss": "0.221", "loss": "0.15311", "s2c_nll_loss": "0.221", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "42400", "lr": "0.000117337", "gnorm": "3.933", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10704"} 2023-01-29 19:10:08 | INFO | train_inner | {"epoch": 20, "update": 19.622, "s2c_loss": "0.156", "loss": "0.10846", "s2c_nll_loss": "0.156", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "42410", "lr": "0.000117271", "gnorm": "3.933", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10706"} 2023-01-29 19:10:11 | 
INFO | train_inner | {"epoch": 20, "update": 19.627, "s2c_loss": "0.176", "loss": "0.12185", "s2c_nll_loss": "0.176", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "42420", "lr": "0.000117204", "gnorm": "3.63", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10709"} 2023-01-29 19:10:13 | INFO | train_inner | {"epoch": 20, "update": 19.631, "s2c_loss": "0.141", "loss": "0.09773", "s2c_nll_loss": "0.141", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "42430", "lr": "0.000117137", "gnorm": "4.194", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10711"} 2023-01-29 19:10:16 | INFO | train_inner | {"epoch": 20, "update": 19.636, "s2c_loss": "0.204", "loss": "0.14118", "s2c_nll_loss": "0.204", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "42440", "lr": "0.000117071", "gnorm": "3.654", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10714"} 2023-01-29 19:10:18 | INFO | train_inner | {"epoch": 20, "update": 19.641, "s2c_loss": "0.247", "loss": "0.1715", "s2c_nll_loss": "0.247", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "42450", "lr": "0.000117004", "gnorm": "4.845", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10716"} 2023-01-29 19:10:21 | INFO | train_inner | {"epoch": 20, "update": 19.645, "s2c_loss": "0.143", "loss": "0.09925", "s2c_nll_loss": "0.143", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "42460", "lr": "0.000116937", "gnorm": "4.23", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10719"} 
2023-01-29 19:10:23 | INFO | train_inner | {"epoch": 20, "update": 19.65, "s2c_loss": "0.183", "loss": "0.12685", "s2c_nll_loss": "0.183", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "42470", "lr": "0.000116871", "gnorm": "3.873", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10721"} 2023-01-29 19:10:26 | INFO | train_inner | {"epoch": 20, "update": 19.654, "s2c_loss": "0.178", "loss": "0.12365", "s2c_nll_loss": "0.178", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "258.2", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "42480", "lr": "0.000116804", "gnorm": "3.446", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10724"} 2023-01-29 19:10:28 | INFO | train_inner | {"epoch": 20, "update": 19.659, "s2c_loss": "0.135", "loss": "0.09373", "s2c_nll_loss": "0.135", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "42490", "lr": "0.000116737", "gnorm": "3.809", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10726"} 2023-01-29 19:10:31 | INFO | train_inner | {"epoch": 20, "update": 19.664, "s2c_loss": "0.151", "loss": "0.10461", "s2c_nll_loss": "0.151", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "42500", "lr": "0.000116671", "gnorm": "3.832", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10729"} 2023-01-29 19:10:34 | INFO | train_inner | {"epoch": 20, "update": 19.668, "s2c_loss": "0.106", "loss": "0.07375", "s2c_nll_loss": "0.106", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "42510", "lr": "0.000116604", "gnorm": "3.19", "loss_scale": "1024", "train_wall": "3", 
"gb_free": "7.3", "wall": "10731"} 2023-01-29 19:10:36 | INFO | train_inner | {"epoch": 20, "update": 19.673, "s2c_loss": "0.138", "loss": "0.09572", "s2c_nll_loss": "0.138", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "42520", "lr": "0.000116538", "gnorm": "3.836", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "10734"} 2023-01-29 19:10:39 | INFO | train_inner | {"epoch": 20, "update": 19.678, "s2c_loss": "0.142", "loss": "0.09836", "s2c_nll_loss": "0.142", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "42530", "lr": "0.000116471", "gnorm": "3.328", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10736"} 2023-01-29 19:10:41 | INFO | train_inner | {"epoch": 20, "update": 19.682, "s2c_loss": "0.107", "loss": "0.07444", "s2c_nll_loss": "0.107", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "42540", "lr": "0.000116404", "gnorm": "3.096", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10739"} 2023-01-29 19:10:44 | INFO | train_inner | {"epoch": 20, "update": 19.687, "s2c_loss": "0.167", "loss": "0.11565", "s2c_nll_loss": "0.167", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "42550", "lr": "0.000116338", "gnorm": "3.534", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10742"} 2023-01-29 19:10:46 | INFO | train_inner | {"epoch": 20, "update": 19.691, "s2c_loss": "0.14", "loss": "0.09721", "s2c_nll_loss": "0.14", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "42560", "lr": "0.000116271", "gnorm": "3.119", "loss_scale": 
"1024", "train_wall": "2", "gb_free": "7.2", "wall": "10744"} 2023-01-29 19:10:49 | INFO | train_inner | {"epoch": 20, "update": 19.696, "s2c_loss": "0.178", "loss": "0.1232", "s2c_nll_loss": "0.178", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "42570", "lr": "0.000116204", "gnorm": "4.324", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10747"} 2023-01-29 19:10:51 | INFO | train_inner | {"epoch": 20, "update": 19.701, "s2c_loss": "0.145", "loss": "0.10023", "s2c_nll_loss": "0.145", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "42580", "lr": "0.000116138", "gnorm": "4.025", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10749"} 2023-01-29 19:10:54 | INFO | train_inner | {"epoch": 20, "update": 19.705, "s2c_loss": "0.184", "loss": "0.12766", "s2c_nll_loss": "0.184", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "244.6", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "42590", "lr": "0.000116071", "gnorm": "4.841", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10752"} 2023-01-29 19:10:56 | INFO | train_inner | {"epoch": 20, "update": 19.71, "s2c_loss": "0.228", "loss": "0.15823", "s2c_nll_loss": "0.228", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "247.9", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "42600", "lr": "0.000116004", "gnorm": "5.384", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10754"} 2023-01-29 19:10:59 | INFO | train_inner | {"epoch": 20, "update": 19.715, "s2c_loss": "0.132", "loss": "0.09149", "s2c_nll_loss": "0.132", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "42610", "lr": "0.000115938", 
"gnorm": "4.172", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10757"} 2023-01-29 19:11:02 | INFO | train_inner | {"epoch": 20, "update": 19.719, "s2c_loss": "0.256", "loss": "0.17734", "s2c_nll_loss": "0.256", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "42620", "lr": "0.000115871", "gnorm": "6.339", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10759"} 2023-01-29 19:11:04 | INFO | train_inner | {"epoch": 20, "update": 19.724, "s2c_loss": "0.189", "loss": "0.13101", "s2c_nll_loss": "0.189", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "42630", "lr": "0.000115804", "gnorm": "5.177", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10762"} 2023-01-29 19:11:07 | INFO | train_inner | {"epoch": 20, "update": 19.728, "s2c_loss": "0.228", "loss": "0.15716", "s2c_nll_loss": "0.228", "s2c_accuracy": "96.075", "s2c_total": "63.7", "s2c_n_correct": "61.2", "wps": "254", "ups": "3.99", "wpb": "63.7", "bsz": "63.7", "num_updates": "42640", "lr": "0.000115738", "gnorm": "5.044", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10765"} 2023-01-29 19:11:09 | INFO | train_inner | {"epoch": 20, "update": 19.733, "s2c_loss": "0.205", "loss": "0.14219", "s2c_nll_loss": "0.205", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "42650", "lr": "0.000115671", "gnorm": "4.091", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10767"} 2023-01-29 19:11:12 | INFO | train_inner | {"epoch": 20, "update": 19.738, "s2c_loss": "0.188", "loss": "0.1306", "s2c_nll_loss": "0.188", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", 
"num_updates": "42660", "lr": "0.000115604", "gnorm": "4.43", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10770"} 2023-01-29 19:11:14 | INFO | train_inner | {"epoch": 20, "update": 19.742, "s2c_loss": "0.148", "loss": "0.1025", "s2c_nll_loss": "0.148", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "42670", "lr": "0.000115538", "gnorm": "3.597", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10772"} 2023-01-29 19:11:17 | INFO | train_inner | {"epoch": 20, "update": 19.747, "s2c_loss": "0.162", "loss": "0.11238", "s2c_nll_loss": "0.162", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "258.2", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "42680", "lr": "0.000115471", "gnorm": "4.45", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10775"} 2023-01-29 19:11:19 | INFO | train_inner | {"epoch": 20, "update": 19.752, "s2c_loss": "0.176", "loss": "0.12189", "s2c_nll_loss": "0.176", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "42690", "lr": "0.000115404", "gnorm": "3.76", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10777"} 2023-01-29 19:11:22 | INFO | train_inner | {"epoch": 20, "update": 19.756, "s2c_loss": "0.148", "loss": "0.10256", "s2c_nll_loss": "0.148", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "42700", "lr": "0.000115338", "gnorm": "3.605", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10780"} 2023-01-29 19:11:24 | INFO | train_inner | {"epoch": 20, "update": 19.761, "s2c_loss": "0.185", "loss": "0.12806", "s2c_nll_loss": "0.185", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "253.1", "ups": "3.96", 
"wpb": "64", "bsz": "64", "num_updates": "42710", "lr": "0.000115271", "gnorm": "3.855", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10782"} 2023-01-29 19:11:27 | INFO | train_inner | {"epoch": 20, "update": 19.765, "s2c_loss": "0.191", "loss": "0.13272", "s2c_nll_loss": "0.191", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "42720", "lr": "0.000115204", "gnorm": "4.459", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10785"} 2023-01-29 19:11:29 | INFO | train_inner | {"epoch": 20, "update": 19.77, "s2c_loss": "0.171", "loss": "0.11885", "s2c_nll_loss": "0.171", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "42730", "lr": "0.000115138", "gnorm": "3.039", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10787"} 2023-01-29 19:11:32 | INFO | train_inner | {"epoch": 20, "update": 19.775, "s2c_loss": "0.126", "loss": "0.08733", "s2c_nll_loss": "0.126", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "42740", "lr": "0.000115071", "gnorm": "3.15", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10790"} 2023-01-29 19:11:34 | INFO | train_inner | {"epoch": 20, "update": 19.779, "s2c_loss": "0.096", "loss": "0.06647", "s2c_nll_loss": "0.096", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "259.6", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "42750", "lr": "0.000115004", "gnorm": "2.537", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10792"} 2023-01-29 19:11:37 | INFO | train_inner | {"epoch": 20, "update": 19.784, "s2c_loss": "0.155", "loss": "0.10773", "s2c_nll_loss": "0.155", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", 
"wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "42760", "lr": "0.000114938", "gnorm": "3.127", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10795"} 2023-01-29 19:11:39 | INFO | train_inner | {"epoch": 20, "update": 19.789, "s2c_loss": "0.174", "loss": "0.12058", "s2c_nll_loss": "0.174", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "42770", "lr": "0.000114871", "gnorm": "3.377", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10797"} 2023-01-29 19:11:42 | INFO | train_inner | {"epoch": 20, "update": 19.793, "s2c_loss": "0.151", "loss": "0.10437", "s2c_nll_loss": "0.151", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "42780", "lr": "0.000114804", "gnorm": "3.334", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10800"} 2023-01-29 19:11:44 | INFO | train_inner | {"epoch": 20, "update": 19.798, "s2c_loss": "0.126", "loss": "0.08761", "s2c_nll_loss": "0.126", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "42790", "lr": "0.000114738", "gnorm": "3.09", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10802"} 2023-01-29 19:11:47 | INFO | train_inner | {"epoch": 20, "update": 19.802, "s2c_loss": "0.109", "loss": "0.07578", "s2c_nll_loss": "0.109", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "42800", "lr": "0.000114671", "gnorm": "3.102", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10805"} 2023-01-29 19:11:49 | INFO | train_inner | {"epoch": 20, "update": 19.807, "s2c_loss": "0.187", "loss": "0.12964", "s2c_nll_loss": "0.187", "s2c_accuracy": "97.344", "s2c_total": "64", 
"s2c_n_correct": "62.3", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "42810", "lr": "0.000114604", "gnorm": "3.521", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10807"} 2023-01-29 19:11:52 | INFO | train_inner | {"epoch": 20, "update": 19.812, "s2c_loss": "0.16", "loss": "0.11069", "s2c_nll_loss": "0.16", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "42820", "lr": "0.000114538", "gnorm": "3.045", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10810"} 2023-01-29 19:11:54 | INFO | train_inner | {"epoch": 20, "update": 19.816, "s2c_loss": "0.201", "loss": "0.13964", "s2c_nll_loss": "0.201", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "42830", "lr": "0.000114471", "gnorm": "4.02", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10812"} 2023-01-29 19:11:57 | INFO | train_inner | {"epoch": 20, "update": 19.821, "s2c_loss": "0.206", "loss": "0.14263", "s2c_nll_loss": "0.206", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "42840", "lr": "0.000114404", "gnorm": "4.547", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10815"} 2023-01-29 19:12:00 | INFO | train_inner | {"epoch": 20, "update": 19.826, "s2c_loss": "0.202", "loss": "0.13995", "s2c_nll_loss": "0.202", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "42850", "lr": "0.000114338", "gnorm": "3.981", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10817"} 2023-01-29 19:12:02 | INFO | train_inner | {"epoch": 20, "update": 19.83, "s2c_loss": "0.183", "loss": "0.12697", "s2c_nll_loss": "0.183", "s2c_accuracy": 
"96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "42860", "lr": "0.000114271", "gnorm": "3.529", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10820"} 2023-01-29 19:12:05 | INFO | train_inner | {"epoch": 20, "update": 19.835, "s2c_loss": "0.159", "loss": "0.1103", "s2c_nll_loss": "0.159", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "42870", "lr": "0.000114204", "gnorm": "3.719", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10823"} 2023-01-29 19:12:07 | INFO | train_inner | {"epoch": 20, "update": 19.84, "s2c_loss": "0.16", "loss": "0.11065", "s2c_nll_loss": "0.16", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "248", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "42880", "lr": "0.000114138", "gnorm": "3.736", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10825"} 2023-01-29 19:12:10 | INFO | train_inner | {"epoch": 20, "update": 19.844, "s2c_loss": "0.083", "loss": "0.05761", "s2c_nll_loss": "0.083", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "42890", "lr": "0.000114071", "gnorm": "2.779", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10828"} 2023-01-29 19:12:12 | INFO | train_inner | {"epoch": 20, "update": 19.849, "s2c_loss": "0.15", "loss": "0.10366", "s2c_nll_loss": "0.15", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "42900", "lr": "0.000114004", "gnorm": "3.984", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10830"} 2023-01-29 19:12:15 | INFO | train_inner | {"epoch": 20, "update": 19.853, "s2c_loss": "0.132", "loss": "0.09135", "s2c_nll_loss": 
"0.132", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "42910", "lr": "0.000113938", "gnorm": "3.067", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10833"} 2023-01-29 19:12:17 | INFO | train_inner | {"epoch": 20, "update": 19.858, "s2c_loss": "0.123", "loss": "0.08536", "s2c_nll_loss": "0.123", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "42920", "lr": "0.000113871", "gnorm": "3.963", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10835"} 2023-01-29 19:12:20 | INFO | train_inner | {"epoch": 20, "update": 19.863, "s2c_loss": "0.111", "loss": "0.07693", "s2c_nll_loss": "0.111", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "42930", "lr": "0.000113804", "gnorm": "2.813", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10838"} 2023-01-29 19:12:22 | INFO | train_inner | {"epoch": 20, "update": 19.867, "s2c_loss": "0.179", "loss": "0.12441", "s2c_nll_loss": "0.179", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "42940", "lr": "0.000113738", "gnorm": "3.996", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10840"} 2023-01-29 19:12:25 | INFO | train_inner | {"epoch": 20, "update": 19.872, "s2c_loss": "0.172", "loss": "0.11894", "s2c_nll_loss": "0.172", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "42950", "lr": "0.000113671", "gnorm": "3.827", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10843"} 2023-01-29 19:12:27 | INFO | train_inner | {"epoch": 20, "update": 19.877, "s2c_loss": "0.118", "loss": 
"0.08173", "s2c_nll_loss": "0.118", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "42960", "lr": "0.000113604", "gnorm": "3.45", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10845"} 2023-01-29 19:12:30 | INFO | train_inner | {"epoch": 20, "update": 19.881, "s2c_loss": "0.167", "loss": "0.11591", "s2c_nll_loss": "0.167", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "42970", "lr": "0.000113538", "gnorm": "3.494", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10848"} 2023-01-29 19:12:32 | INFO | train_inner | {"epoch": 20, "update": 19.886, "s2c_loss": "0.164", "loss": "0.11393", "s2c_nll_loss": "0.164", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "42980", "lr": "0.000113471", "gnorm": "3.5", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10850"} 2023-01-29 19:12:35 | INFO | train_inner | {"epoch": 20, "update": 19.89, "s2c_loss": "0.14", "loss": "0.09707", "s2c_nll_loss": "0.14", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "42990", "lr": "0.000113404", "gnorm": "3.698", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10853"} 2023-01-29 19:12:37 | INFO | train_inner | {"epoch": 20, "update": 19.895, "s2c_loss": "0.227", "loss": "0.15703", "s2c_nll_loss": "0.227", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "43000", "lr": "0.000113338", "gnorm": "4.223", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10855"} 2023-01-29 19:12:40 | INFO | train_inner | {"epoch": 20, "update": 19.9, 
"s2c_loss": "0.198", "loss": "0.13704", "s2c_nll_loss": "0.198", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "43010", "lr": "0.000113271", "gnorm": "3.53", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10858"} 2023-01-29 19:12:42 | INFO | train_inner | {"epoch": 20, "update": 19.904, "s2c_loss": "0.142", "loss": "0.09874", "s2c_nll_loss": "0.142", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "43020", "lr": "0.000113204", "gnorm": "3.182", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10860"} 2023-01-29 19:12:45 | INFO | train_inner | {"epoch": 20, "update": 19.909, "s2c_loss": "0.169", "loss": "0.11745", "s2c_nll_loss": "0.169", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "43030", "lr": "0.000113138", "gnorm": "2.618", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10863"} 2023-01-29 19:12:47 | INFO | train_inner | {"epoch": 20, "update": 19.914, "s2c_loss": "0.104", "loss": "0.07197", "s2c_nll_loss": "0.104", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "43040", "lr": "0.000113071", "gnorm": "2.87", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10865"} 2023-01-29 19:12:50 | INFO | train_inner | {"epoch": 20, "update": 19.918, "s2c_loss": "0.134", "loss": "0.09262", "s2c_nll_loss": "0.134", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "43050", "lr": "0.000113004", "gnorm": "2.933", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10868"} 2023-01-29 19:12:53 | INFO | train_inner | {"epoch": 20, 
"update": 19.923, "s2c_loss": "0.16", "loss": "0.11061", "s2c_nll_loss": "0.16", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "43060", "lr": "0.000112938", "gnorm": "3.2", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.4", "wall": "10871"} 2023-01-29 19:12:55 | INFO | train_inner | {"epoch": 20, "update": 19.927, "s2c_loss": "0.179", "loss": "0.12398", "s2c_nll_loss": "0.179", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "43070", "lr": "0.000112871", "gnorm": "3.036", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10873"} 2023-01-29 19:12:58 | INFO | train_inner | {"epoch": 20, "update": 19.932, "s2c_loss": "0.119", "loss": "0.08235", "s2c_nll_loss": "0.119", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "43080", "lr": "0.000112804", "gnorm": "3.283", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10876"} 2023-01-29 19:13:00 | INFO | train_inner | {"epoch": 20, "update": 19.937, "s2c_loss": "0.379", "loss": "0.26305", "s2c_nll_loss": "0.379", "s2c_accuracy": "95", "s2c_total": "64", "s2c_n_correct": "60.8", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "43090", "lr": "0.000112738", "gnorm": "3.852", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10878"} 2023-01-29 19:13:03 | INFO | train_inner | {"epoch": 20, "update": 19.941, "s2c_loss": "0.105", "loss": "0.07291", "s2c_nll_loss": "0.105", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "43100", "lr": "0.000112671", "gnorm": "3.093", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10881"} 2023-01-29 19:13:05 | INFO | train_inner | 
{"epoch": 20, "update": 19.946, "s2c_loss": "0.117", "loss": "0.08112", "s2c_nll_loss": "0.117", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "43110", "lr": "0.000112604", "gnorm": "3.067", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10883"} 2023-01-29 19:13:08 | INFO | train_inner | {"epoch": 20, "update": 19.951, "s2c_loss": "0.197", "loss": "0.13652", "s2c_nll_loss": "0.197", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "43120", "lr": "0.000112538", "gnorm": "3.571", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10886"} 2023-01-29 19:13:10 | INFO | train_inner | {"epoch": 20, "update": 19.955, "s2c_loss": "0.153", "loss": "0.10618", "s2c_nll_loss": "0.153", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "43130", "lr": "0.000112471", "gnorm": "3.931", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10888"} 2023-01-29 19:13:13 | INFO | train_inner | {"epoch": 20, "update": 19.96, "s2c_loss": "0.224", "loss": "0.15514", "s2c_nll_loss": "0.224", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "43140", "lr": "0.000112404", "gnorm": "4.662", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10891"} 2023-01-29 19:13:15 | INFO | train_inner | {"epoch": 20, "update": 19.964, "s2c_loss": "0.196", "loss": "0.13557", "s2c_nll_loss": "0.196", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "43150", "lr": "0.000112338", "gnorm": "3.942", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10893"} 2023-01-29 
19:13:18 | INFO | train_inner | {"epoch": 20, "update": 19.969, "s2c_loss": "0.177", "loss": "0.12273", "s2c_nll_loss": "0.177", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "43160", "lr": "0.000112271", "gnorm": "3.931", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10896"} 2023-01-29 19:13:20 | INFO | train_inner | {"epoch": 20, "update": 19.974, "s2c_loss": "0.118", "loss": "0.08159", "s2c_nll_loss": "0.118", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "43170", "lr": "0.000112204", "gnorm": "3.291", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10898"} 2023-01-29 19:13:23 | INFO | train_inner | {"epoch": 20, "update": 19.978, "s2c_loss": "0.106", "loss": "0.07371", "s2c_nll_loss": "0.106", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "246.7", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "43180", "lr": "0.000112138", "gnorm": "2.701", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10901"} 2023-01-29 19:13:25 | INFO | train_inner | {"epoch": 20, "update": 19.983, "s2c_loss": "0.199", "loss": "0.13803", "s2c_nll_loss": "0.199", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "43190", "lr": "0.000112071", "gnorm": "3.841", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10903"} 2023-01-29 19:13:28 | INFO | train_inner | {"epoch": 20, "update": 19.988, "s2c_loss": "0.238", "loss": "0.1652", "s2c_nll_loss": "0.238", "s2c_accuracy": "95.156", "s2c_total": "64", "s2c_n_correct": "60.9", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "43200", "lr": "0.000112004", "gnorm": "4.032", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", 
"wall": "10906"} 2023-01-29 19:13:31 | INFO | train_inner | {"epoch": 20, "update": 19.992, "s2c_loss": "0.155", "loss": "0.1071", "s2c_nll_loss": "0.155", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "255", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "43210", "lr": "0.000111938", "gnorm": "3.332", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "10908"} 2023-01-29 19:13:33 | INFO | train_inner | {"epoch": 20, "update": 19.997, "s2c_loss": "0.175", "loss": "0.12163", "s2c_nll_loss": "0.175", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "43220", "lr": "0.000111871", "gnorm": "3.563", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "10911"} 2023-01-29 19:13:35 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 19:13:49 | INFO | valid | {"epoch": 20, "valid_s2c_loss": "0.669", "valid_loss": "0.46378", "valid_s2c_nll_loss": "0.669", "valid_s2c_accuracy": "88.614", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "28.3194", "valid_num_updates": "43227", "valid_best_s2c_accuracy": "88.614"} 2023-01-29 19:13:49 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 20 @ 43227 updates 2023-01-29 19:13:49 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 19:13:56 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 19:14:01 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt (epoch 20 @ 43227 updates, score 88.614) (writing took 11.73061581607908 seconds) 2023-01-29 19:14:01 | INFO | fairseq_cli.train | end of epoch 20 (average epoch stats below) 2023-01-29 19:14:01 | INFO | train | {"epoch": 20, "train_s2c_loss": "0.165", 
"train_loss": "0.11433", "train_s2c_nll_loss": "0.165", "train_s2c_accuracy": "97.143", "train_s2c_total": "63.9838", "train_s2c_n_correct": "62.1559", "train_wps": "239", "train_ups": "3.73", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "43227", "train_lr": "0.000111824", "train_gnorm": "3.638", "train_loss_scale": "1024", "train_train_wall": "539", "train_gb_free": "7.5", "train_wall": "10939"} 2023-01-29 19:14:07 | INFO | fairseq.trainer | begin training epoch 21 2023-01-29 19:14:07 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 19:14:08 | INFO | train_inner | {"epoch": 21, "update": 20.001, "s2c_loss": "0.19", "loss": "0.13167", "s2c_nll_loss": "0.19", "s2c_accuracy": "97.368", "s2c_total": "60.8", "s2c_n_correct": "59.2", "wps": "17.3", "ups": "0.28", "wpb": "60.8", "bsz": "60.8", "num_updates": "43230", "lr": "0.000111804", "gnorm": "4.083", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10946"} 2023-01-29 19:14:11 | INFO | train_inner | {"epoch": 21, "update": 20.006, "s2c_loss": "0.121", "loss": "0.08413", "s2c_nll_loss": "0.121", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "43240", "lr": "0.000111738", "gnorm": "2.813", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10949"} 2023-01-29 19:14:13 | INFO | train_inner | {"epoch": 21, "update": 20.011, "s2c_loss": "0.133", "loss": "0.09209", "s2c_nll_loss": "0.133", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "43250", "lr": "0.000111671", "gnorm": "2.919", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10951"} 2023-01-29 19:14:16 | INFO | train_inner | {"epoch": 21, "update": 20.015, "s2c_loss": "0.069", "loss": "0.04787", "s2c_nll_loss": "0.069", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": 
"250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "43260", "lr": "0.000111604", "gnorm": "2.215", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10954"} 2023-01-29 19:14:18 | INFO | train_inner | {"epoch": 21, "update": 20.02, "s2c_loss": "0.08", "loss": "0.0556", "s2c_nll_loss": "0.08", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "43270", "lr": "0.000111538", "gnorm": "2.557", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10956"} 2023-01-29 19:14:21 | INFO | train_inner | {"epoch": 21, "update": 20.025, "s2c_loss": "0.11", "loss": "0.07655", "s2c_nll_loss": "0.11", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "43280", "lr": "0.000111471", "gnorm": "3.098", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10959"} 2023-01-29 19:14:24 | INFO | train_inner | {"epoch": 21, "update": 20.029, "s2c_loss": "0.143", "loss": "0.09894", "s2c_nll_loss": "0.143", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "244.6", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "43290", "lr": "0.000111404", "gnorm": "3.012", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10961"} 2023-01-29 19:14:26 | INFO | train_inner | {"epoch": 21, "update": 20.034, "s2c_loss": "0.154", "loss": "0.10653", "s2c_nll_loss": "0.154", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "43300", "lr": "0.000111338", "gnorm": "3.611", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10964"} 2023-01-29 19:14:29 | INFO | train_inner | {"epoch": 21, "update": 20.038, "s2c_loss": "0.117", "loss": "0.08118", "s2c_nll_loss": "0.117", "s2c_accuracy": "97.188", "s2c_total": "64", 
"s2c_n_correct": "62.2", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "43310", "lr": "0.000111271", "gnorm": "3.896", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10967"} 2023-01-29 19:14:31 | INFO | train_inner | {"epoch": 21, "update": 20.043, "s2c_loss": "0.194", "loss": "0.1344", "s2c_nll_loss": "0.194", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "43320", "lr": "0.000111204", "gnorm": "4.194", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10969"} 2023-01-29 19:14:34 | INFO | train_inner | {"epoch": 21, "update": 20.048, "s2c_loss": "0.273", "loss": "0.18952", "s2c_nll_loss": "0.273", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "43330", "lr": "0.000111138", "gnorm": "4.078", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10972"} 2023-01-29 19:14:36 | INFO | train_inner | {"epoch": 21, "update": 20.052, "s2c_loss": "0.188", "loss": "0.13047", "s2c_nll_loss": "0.188", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "43340", "lr": "0.000111071", "gnorm": "3.8", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10974"} 2023-01-29 19:14:39 | INFO | train_inner | {"epoch": 21, "update": 20.057, "s2c_loss": "0.157", "loss": "0.10897", "s2c_nll_loss": "0.157", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "246.9", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "43350", "lr": "0.000111004", "gnorm": "3.774", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10977"} 2023-01-29 19:14:41 | INFO | train_inner | {"epoch": 21, "update": 20.062, "s2c_loss": "0.116", "loss": "0.08043", "s2c_nll_loss": "0.116", "s2c_accuracy": 
"97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "43360", "lr": "0.000110938", "gnorm": "3.166", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10979"} 2023-01-29 19:14:44 | INFO | train_inner | {"epoch": 21, "update": 20.066, "s2c_loss": "0.185", "loss": "0.12807", "s2c_nll_loss": "0.185", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "43370", "lr": "0.000110871", "gnorm": "3.172", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10982"} 2023-01-29 19:14:46 | INFO | train_inner | {"epoch": 21, "update": 20.071, "s2c_loss": "0.131", "loss": "0.09109", "s2c_nll_loss": "0.131", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "43380", "lr": "0.000110804", "gnorm": "3.775", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "10984"} 2023-01-29 19:14:49 | INFO | train_inner | {"epoch": 21, "update": 20.075, "s2c_loss": "0.099", "loss": "0.06882", "s2c_nll_loss": "0.099", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "241.9", "ups": "3.78", "wpb": "64", "bsz": "64", "num_updates": "43390", "lr": "0.000110738", "gnorm": "2.658", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10987"} 2023-01-29 19:14:52 | INFO | train_inner | {"epoch": 21, "update": 20.08, "s2c_loss": "0.142", "loss": "0.0983", "s2c_nll_loss": "0.142", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "43400", "lr": "0.000110671", "gnorm": "3.513", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10990"} 2023-01-29 19:14:54 | INFO | train_inner | {"epoch": 21, "update": 20.085, "s2c_loss": "0.109", "loss": "0.07554", 
"s2c_nll_loss": "0.109", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "43410", "lr": "0.000110604", "gnorm": "3.613", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10992"} 2023-01-29 19:14:57 | INFO | train_inner | {"epoch": 21, "update": 20.089, "s2c_loss": "0.146", "loss": "0.10149", "s2c_nll_loss": "0.146", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "43420", "lr": "0.000110538", "gnorm": "3.854", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "10995"} 2023-01-29 19:14:59 | INFO | train_inner | {"epoch": 21, "update": 20.094, "s2c_loss": "0.087", "loss": "0.06024", "s2c_nll_loss": "0.087", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "43430", "lr": "0.000110471", "gnorm": "2.47", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "10997"} 2023-01-29 19:15:02 | INFO | train_inner | {"epoch": 21, "update": 20.099, "s2c_loss": "0.243", "loss": "0.16837", "s2c_nll_loss": "0.243", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "261.9", "ups": "4.09", "wpb": "64", "bsz": "64", "num_updates": "43440", "lr": "0.000110404", "gnorm": "3.82", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "11000"} 2023-01-29 19:15:04 | INFO | train_inner | {"epoch": 21, "update": 20.103, "s2c_loss": "0.084", "loss": "0.05818", "s2c_nll_loss": "0.084", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "43450", "lr": "0.000110338", "gnorm": "2.535", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "11002"} 2023-01-29 19:15:07 | INFO | train_inner | {"epoch": 21, "update": 20.108, "s2c_loss": 
"0.121", "loss": "0.08363", "s2c_nll_loss": "0.121", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "43460", "lr": "0.000110271", "gnorm": "3.29", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "11005"} 2023-01-29 19:15:09 | INFO | train_inner | {"epoch": 21, "update": 20.112, "s2c_loss": "0.091", "loss": "0.06316", "s2c_nll_loss": "0.091", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "43470", "lr": "0.000110204", "gnorm": "2.619", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "11007"} 2023-01-29 19:15:12 | INFO | train_inner | {"epoch": 21, "update": 20.117, "s2c_loss": "0.096", "loss": "0.06683", "s2c_nll_loss": "0.096", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "43480", "lr": "0.000110138", "gnorm": "3.033", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "11010"} 2023-01-29 19:15:14 | INFO | train_inner | {"epoch": 21, "update": 20.122, "s2c_loss": "0.074", "loss": "0.05155", "s2c_nll_loss": "0.074", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "43490", "lr": "0.000110071", "gnorm": "2.182", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "11012"} 2023-01-29 19:15:17 | INFO | train_inner | {"epoch": 21, "update": 20.126, "s2c_loss": "0.087", "loss": "0.0605", "s2c_nll_loss": "0.087", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "43500", "lr": "0.000110005", "gnorm": "2.663", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "11015"} 2023-01-29 19:15:20 | INFO | train_inner | {"epoch": 21, 
"update": 20.131, "s2c_loss": "0.101", "loss": "0.06998", "s2c_nll_loss": "0.101", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "43510", "lr": "0.000109938", "gnorm": "2.796", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "11018"} 2023-01-29 19:15:22 | INFO | train_inner | {"epoch": 21, "update": 20.136, "s2c_loss": "0.149", "loss": "0.10336", "s2c_nll_loss": "0.149", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "43520", "lr": "0.000109871", "gnorm": "3.091", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "11020"} 2023-01-29 19:15:25 | INFO | train_inner | {"epoch": 21, "update": 20.14, "s2c_loss": "0.101", "loss": "0.07034", "s2c_nll_loss": "0.101", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "43530", "lr": "0.000109805", "gnorm": "2.962", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "11023"} 2023-01-29 19:15:27 | INFO | train_inner | {"epoch": 21, "update": 20.145, "s2c_loss": "0.11", "loss": "0.076", "s2c_nll_loss": "0.11", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "43540", "lr": "0.000109738", "gnorm": "3.869", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "11025"} 2023-01-29 19:15:30 | INFO | train_inner | {"epoch": 21, "update": 20.149, "s2c_loss": "0.166", "loss": "0.11516", "s2c_nll_loss": "0.166", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "43550", "lr": "0.000109671", "gnorm": "4.018", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "11028"} 2023-01-29 19:15:32 | INFO | 
train_inner | {"epoch": 21, "update": 20.154, "s2c_loss": "0.108", "loss": "0.07516", "s2c_nll_loss": "0.108", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "43560", "lr": "0.000109605", "gnorm": "3.928", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "11030"} 2023-01-29 19:15:35 | INFO | train_inner | {"epoch": 21, "update": 20.159, "s2c_loss": "0.101", "loss": "0.07027", "s2c_nll_loss": "0.101", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "43570", "lr": "0.000109538", "gnorm": "3.116", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "11033"} 2023-01-29 19:15:37 | INFO | train_inner | {"epoch": 21, "update": 20.163, "s2c_loss": "0.1", "loss": "0.06964", "s2c_nll_loss": "0.1", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "43580", "lr": "0.000109471", "gnorm": "2.636", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.4", "wall": "11035"} 2023-01-29 19:15:40 | INFO | train_inner | {"epoch": 21, "update": 20.168, "s2c_loss": "0.324", "loss": "0.22427", "s2c_nll_loss": "0.324", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "43590", "lr": "0.000109405", "gnorm": "2.817", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "11038"} 2023-01-29 19:15:42 | INFO | train_inner | {"epoch": 21, "update": 20.173, "s2c_loss": "0.094", "loss": "0.06486", "s2c_nll_loss": "0.094", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "43600", "lr": "0.000109338", "gnorm": "2.472", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "11040"} 
2023-01-29 19:15:45 | INFO | train_inner | {"epoch": 21, "update": 20.177, "s2c_loss": "0.076", "loss": "0.05279", "s2c_nll_loss": "0.076", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "43610", "lr": "0.000109271", "gnorm": "2.523", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "11043"} 2023-01-29 19:15:47 | INFO | train_inner | {"epoch": 21, "update": 20.182, "s2c_loss": "0.105", "loss": "0.07275", "s2c_nll_loss": "0.105", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "43620", "lr": "0.000109205", "gnorm": "2.914", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "11045"} 2023-01-29 19:15:50 | INFO | train_inner | {"epoch": 21, "update": 20.186, "s2c_loss": "0.112", "loss": "0.07775", "s2c_nll_loss": "0.112", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "43630", "lr": "0.000109138", "gnorm": "3.086", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "11048"} 2023-01-29 19:15:52 | INFO | train_inner | {"epoch": 21, "update": 20.191, "s2c_loss": "0.097", "loss": "0.0669", "s2c_nll_loss": "0.097", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "43640", "lr": "0.000109071", "gnorm": "2.382", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "11050"} 2023-01-29 19:15:55 | INFO | train_inner | {"epoch": 21, "update": 20.196, "s2c_loss": "0.151", "loss": "0.10462", "s2c_nll_loss": "0.151", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "43650", "lr": "0.000109005", "gnorm": "3.02", "loss_scale": "1024", "train_wall": "2", "gb_free": 
"7.3", "wall": "11053"} 2023-01-29 19:15:58 | INFO | train_inner | {"epoch": 21, "update": 20.2, "s2c_loss": "0.138", "loss": "0.09572", "s2c_nll_loss": "0.138", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "43660", "lr": "0.000108938", "gnorm": "3.444", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "11055"} 2023-01-29 19:16:00 | INFO | train_inner | {"epoch": 21, "update": 20.205, "s2c_loss": "0.069", "loss": "0.04764", "s2c_nll_loss": "0.069", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "43670", "lr": "0.000108871", "gnorm": "2.015", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "11058"} 2023-01-29 19:16:03 | INFO | train_inner | {"epoch": 21, "update": 20.21, "s2c_loss": "0.094", "loss": "0.06538", "s2c_nll_loss": "0.094", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "43680", "lr": "0.000108805", "gnorm": "2.459", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.3", "wall": "11060"} 2023-01-29 19:16:05 | INFO | train_inner | {"epoch": 21, "update": 20.214, "s2c_loss": "0.121", "loss": "0.08401", "s2c_nll_loss": "0.121", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "43690", "lr": "0.000108738", "gnorm": "2.75", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "11063"} 2023-01-29 19:16:08 | INFO | train_inner | {"epoch": 21, "update": 20.219, "s2c_loss": "0.143", "loss": "0.09935", "s2c_nll_loss": "0.143", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "43700", "lr": "0.000108671", "gnorm": "3.301", "loss_scale": "1024", 
"train_wall": "2", "gb_free": "7.2", "wall": "11066"} 2023-01-29 19:16:10 | INFO | train_inner | {"epoch": 21, "update": 20.223, "s2c_loss": "0.189", "loss": "0.13106", "s2c_nll_loss": "0.189", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "43710", "lr": "0.000108605", "gnorm": "3.858", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.3", "wall": "11068"} 2023-01-29 19:16:13 | INFO | train_inner | {"epoch": 21, "update": 20.228, "s2c_loss": "0.161", "loss": "0.11141", "s2c_nll_loss": "0.161", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "43720", "lr": "0.000108538", "gnorm": "3.97", "loss_scale": "1024", "train_wall": "3", "gb_free": "7.2", "wall": "11071"} 2023-01-29 19:16:15 | INFO | train_inner | {"epoch": 21, "update": 20.233, "s2c_loss": "0.191", "loss": "0.13208", "s2c_nll_loss": "0.191", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "43730", "lr": "0.000108471", "gnorm": "3.83", "loss_scale": "1024", "train_wall": "2", "gb_free": "7.2", "wall": "11073"} 2023-01-29 19:16:18 | INFO | train_inner | {"epoch": 21, "update": 20.237, "s2c_loss": "0.191", "loss": "0.13224", "s2c_nll_loss": "0.191", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "43740", "lr": "0.000108405", "gnorm": "3.805", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11076"} 2023-01-29 19:16:20 | INFO | train_inner | {"epoch": 21, "update": 20.242, "s2c_loss": "0.13", "loss": "0.09", "s2c_nll_loss": "0.13", "s2c_accuracy": "98.273", "s2c_total": "63.7", "s2c_n_correct": "62.6", "wps": "247.2", "ups": "3.88", "wpb": "63.7", "bsz": "63.7", "num_updates": "43750", "lr": "0.000108338", "gnorm": 
"2.91", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11078"} 2023-01-29 19:16:23 | INFO | train_inner | {"epoch": 21, "update": 20.247, "s2c_loss": "0.141", "loss": "0.09795", "s2c_nll_loss": "0.141", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "43760", "lr": "0.000108271", "gnorm": "2.899", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11081"} 2023-01-29 19:16:25 | INFO | train_inner | {"epoch": 21, "update": 20.251, "s2c_loss": "0.105", "loss": "0.07296", "s2c_nll_loss": "0.105", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "43770", "lr": "0.000108205", "gnorm": "3.114", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11083"} 2023-01-29 19:16:28 | INFO | train_inner | {"epoch": 21, "update": 20.256, "s2c_loss": "0.11", "loss": "0.07629", "s2c_nll_loss": "0.11", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "258.7", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "43780", "lr": "0.000108138", "gnorm": "3.213", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "11086"} 2023-01-29 19:16:30 | INFO | train_inner | {"epoch": 21, "update": 20.26, "s2c_loss": "0.133", "loss": "0.09239", "s2c_nll_loss": "0.133", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "43790", "lr": "0.000108071", "gnorm": "2.958", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11088"} 2023-01-29 19:16:33 | INFO | train_inner | {"epoch": 21, "update": 20.265, "s2c_loss": "0.193", "loss": "0.13404", "s2c_nll_loss": "0.193", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "43800", "lr": 
"0.000108005", "gnorm": "3.778", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11091"} 2023-01-29 19:16:36 | INFO | train_inner | {"epoch": 21, "update": 20.27, "s2c_loss": "0.134", "loss": "0.09314", "s2c_nll_loss": "0.134", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "43810", "lr": "0.000107938", "gnorm": "3.547", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11093"} 2023-01-29 19:16:38 | INFO | train_inner | {"epoch": 21, "update": 20.274, "s2c_loss": "0.137", "loss": "0.09498", "s2c_nll_loss": "0.137", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "43820", "lr": "0.000107871", "gnorm": "3.915", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11096"} 2023-01-29 19:16:41 | INFO | train_inner | {"epoch": 21, "update": 20.279, "s2c_loss": "0.138", "loss": "0.09542", "s2c_nll_loss": "0.138", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "43830", "lr": "0.000107805", "gnorm": "2.858", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11098"} 2023-01-29 19:16:43 | INFO | train_inner | {"epoch": 21, "update": 20.284, "s2c_loss": "0.126", "loss": "0.087", "s2c_nll_loss": "0.126", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "43840", "lr": "0.000107738", "gnorm": "2.957", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11101"} 2023-01-29 19:16:46 | INFO | train_inner | {"epoch": 21, "update": 20.288, "s2c_loss": "0.083", "loss": "0.05758", "s2c_nll_loss": "0.083", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", 
"num_updates": "43850", "lr": "0.000107671", "gnorm": "2.033", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "11104"} 2023-01-29 19:16:48 | INFO | train_inner | {"epoch": 21, "update": 20.293, "s2c_loss": "0.159", "loss": "0.10987", "s2c_nll_loss": "0.159", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "259", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "43860", "lr": "0.000107605", "gnorm": "3.216", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11106"} 2023-01-29 19:16:51 | INFO | train_inner | {"epoch": 21, "update": 20.297, "s2c_loss": "0.08", "loss": "0.0556", "s2c_nll_loss": "0.08", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "43870", "lr": "0.000107538", "gnorm": "2.52", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11109"} 2023-01-29 19:16:53 | INFO | train_inner | {"epoch": 21, "update": 20.302, "s2c_loss": "0.08", "loss": "0.05576", "s2c_nll_loss": "0.08", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "43880", "lr": "0.000107471", "gnorm": "2.625", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11111"} 2023-01-29 19:16:56 | INFO | train_inner | {"epoch": 21, "update": 20.307, "s2c_loss": "0.158", "loss": "0.10925", "s2c_nll_loss": "0.158", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "43890", "lr": "0.000107405", "gnorm": "4.031", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "11114"} 2023-01-29 19:16:58 | INFO | train_inner | {"epoch": 21, "update": 20.311, "s2c_loss": "0.091", "loss": "0.06339", "s2c_nll_loss": "0.091", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "253", "ups": "3.95", "wpb": 
"64", "bsz": "64", "num_updates": "43900", "lr": "0.000107338", "gnorm": "2.724", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11116"} 2023-01-29 19:17:01 | INFO | train_inner | {"epoch": 21, "update": 20.316, "s2c_loss": "0.106", "loss": "0.07364", "s2c_nll_loss": "0.106", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "43910", "lr": "0.000107271", "gnorm": "2.523", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11119"} 2023-01-29 19:17:03 | INFO | train_inner | {"epoch": 21, "update": 20.321, "s2c_loss": "0.099", "loss": "0.06891", "s2c_nll_loss": "0.099", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "43920", "lr": "0.000107205", "gnorm": "3.523", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11121"} 2023-01-29 19:17:06 | INFO | train_inner | {"epoch": 21, "update": 20.325, "s2c_loss": "0.098", "loss": "0.06789", "s2c_nll_loss": "0.098", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "43930", "lr": "0.000107138", "gnorm": "2.417", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11124"} 2023-01-29 19:17:08 | INFO | train_inner | {"epoch": 21, "update": 20.33, "s2c_loss": "0.09", "loss": "0.06205", "s2c_nll_loss": "0.09", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "43940", "lr": "0.000107071", "gnorm": "2.514", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11126"} 2023-01-29 19:17:11 | INFO | train_inner | {"epoch": 21, "update": 20.334, "s2c_loss": "0.141", "loss": "0.09743", "s2c_nll_loss": "0.141", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": 
"256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "43950", "lr": "0.000107005", "gnorm": "3.906", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11129"} 2023-01-29 19:17:13 | INFO | train_inner | {"epoch": 21, "update": 20.339, "s2c_loss": "0.151", "loss": "0.10481", "s2c_nll_loss": "0.151", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "43960", "lr": "0.000106938", "gnorm": "3.384", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11131"} 2023-01-29 19:17:16 | INFO | train_inner | {"epoch": 21, "update": 20.344, "s2c_loss": "0.192", "loss": "0.13285", "s2c_nll_loss": "0.192", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "43970", "lr": "0.000106871", "gnorm": "3.024", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11134"} 2023-01-29 19:17:19 | INFO | train_inner | {"epoch": 21, "update": 20.348, "s2c_loss": "0.118", "loss": "0.08212", "s2c_nll_loss": "0.118", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "43980", "lr": "0.000106805", "gnorm": "3.118", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11136"} 2023-01-29 19:17:21 | INFO | train_inner | {"epoch": 21, "update": 20.353, "s2c_loss": "0.107", "loss": "0.07408", "s2c_nll_loss": "0.107", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "43990", "lr": "0.000106738", "gnorm": "2.882", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11139"} 2023-01-29 19:17:24 | INFO | train_inner | {"epoch": 21, "update": 20.358, "s2c_loss": "0.125", "loss": "0.08677", "s2c_nll_loss": "0.125", "s2c_accuracy": "97.812", "s2c_total": "64", 
"s2c_n_correct": "62.6", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "44000", "lr": "0.000106671", "gnorm": "3.388", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "11141"} 2023-01-29 19:17:26 | INFO | train_inner | {"epoch": 21, "update": 20.362, "s2c_loss": "0.195", "loss": "0.13501", "s2c_nll_loss": "0.195", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "44010", "lr": "0.000106605", "gnorm": "2.909", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11144"} 2023-01-29 19:17:29 | INFO | train_inner | {"epoch": 21, "update": 20.367, "s2c_loss": "0.134", "loss": "0.09305", "s2c_nll_loss": "0.134", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "244.1", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "44020", "lr": "0.000106538", "gnorm": "3.015", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11147"} 2023-01-29 19:17:31 | INFO | train_inner | {"epoch": 21, "update": 20.371, "s2c_loss": "0.111", "loss": "0.07677", "s2c_nll_loss": "0.111", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "44030", "lr": "0.000106471", "gnorm": "3.355", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11149"} 2023-01-29 19:17:34 | INFO | train_inner | {"epoch": 21, "update": 20.376, "s2c_loss": "0.125", "loss": "0.08652", "s2c_nll_loss": "0.125", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "44040", "lr": "0.000106405", "gnorm": "3.234", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11152"} 2023-01-29 19:17:36 | INFO | train_inner | {"epoch": 21, "update": 20.381, "s2c_loss": "0.138", "loss": "0.09592", "s2c_nll_loss": "0.138", "s2c_accuracy": 
"97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "44050", "lr": "0.000106338", "gnorm": "2.943", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11154"} 2023-01-29 19:17:39 | INFO | train_inner | {"epoch": 21, "update": 20.385, "s2c_loss": "0.111", "loss": "0.07707", "s2c_nll_loss": "0.111", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "44060", "lr": "0.000106271", "gnorm": "3.004", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11157"} 2023-01-29 19:17:41 | INFO | train_inner | {"epoch": 21, "update": 20.39, "s2c_loss": "0.215", "loss": "0.1491", "s2c_nll_loss": "0.215", "s2c_accuracy": "95.938", "s2c_total": "64", "s2c_n_correct": "61.4", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "44070", "lr": "0.000106205", "gnorm": "4.304", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11159"} 2023-01-29 19:17:44 | INFO | train_inner | {"epoch": 21, "update": 20.395, "s2c_loss": "0.154", "loss": "0.10662", "s2c_nll_loss": "0.154", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "44080", "lr": "0.000106138", "gnorm": "3.121", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11162"} 2023-01-29 19:17:46 | INFO | train_inner | {"epoch": 21, "update": 20.399, "s2c_loss": "0.126", "loss": "0.08736", "s2c_nll_loss": "0.126", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "44090", "lr": "0.000106071", "gnorm": "3.104", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.5", "wall": "11164"} 2023-01-29 19:17:49 | INFO | train_inner | {"epoch": 21, "update": 20.404, "s2c_loss": "0.129", "loss": "0.08962", 
"s2c_nll_loss": "0.129", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "44100", "lr": "0.000106005", "gnorm": "3.825", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11167"} 2023-01-29 19:17:51 | INFO | train_inner | {"epoch": 21, "update": 20.408, "s2c_loss": "0.129", "loss": "0.08933", "s2c_nll_loss": "0.129", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "44110", "lr": "0.000105938", "gnorm": "3.248", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11169"} 2023-01-29 19:17:54 | INFO | train_inner | {"epoch": 21, "update": 20.413, "s2c_loss": "0.149", "loss": "0.10301", "s2c_nll_loss": "0.149", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "44120", "lr": "0.000105871", "gnorm": "3.38", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11172"} 2023-01-29 19:17:57 | INFO | train_inner | {"epoch": 21, "update": 20.418, "s2c_loss": "0.084", "loss": "0.05789", "s2c_nll_loss": "0.084", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "44130", "lr": "0.000105805", "gnorm": "2.56", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11174"} 2023-01-29 19:17:59 | INFO | train_inner | {"epoch": 21, "update": 20.422, "s2c_loss": "0.101", "loss": "0.07015", "s2c_nll_loss": "0.101", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "44140", "lr": "0.000105738", "gnorm": "2.939", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "11177"} 2023-01-29 19:18:02 | INFO | train_inner | {"epoch": 21, "update": 20.427, "s2c_loss": 
"0.136", "loss": "0.09439", "s2c_nll_loss": "0.136", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "44150", "lr": "0.000105671", "gnorm": "3.282", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11179"} 2023-01-29 19:18:04 | INFO | train_inner | {"epoch": 21, "update": 20.432, "s2c_loss": "0.094", "loss": "0.06537", "s2c_nll_loss": "0.094", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "44160", "lr": "0.000105605", "gnorm": "2.563", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11182"} 2023-01-29 19:18:07 | INFO | train_inner | {"epoch": 21, "update": 20.436, "s2c_loss": "0.136", "loss": "0.09407", "s2c_nll_loss": "0.136", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "44170", "lr": "0.000105538", "gnorm": "3.606", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "11184"} 2023-01-29 19:18:09 | INFO | train_inner | {"epoch": 21, "update": 20.441, "s2c_loss": "0.214", "loss": "0.14848", "s2c_nll_loss": "0.214", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "44180", "lr": "0.000105471", "gnorm": "4.065", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.5", "wall": "11187"} 2023-01-29 19:18:12 | INFO | train_inner | {"epoch": 21, "update": 20.445, "s2c_loss": "0.122", "loss": "0.08479", "s2c_nll_loss": "0.122", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "44190", "lr": "0.000105405", "gnorm": "3.052", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11190"} 2023-01-29 19:18:14 | INFO | train_inner | {"epoch": 21, 
"update": 20.45, "s2c_loss": "0.122", "loss": "0.08481", "s2c_nll_loss": "0.122", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "44200", "lr": "0.000105338", "gnorm": "2.955", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11192"} 2023-01-29 19:18:17 | INFO | train_inner | {"epoch": 21, "update": 20.455, "s2c_loss": "0.164", "loss": "0.1139", "s2c_nll_loss": "0.164", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "44210", "lr": "0.000105271", "gnorm": "4.186", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11195"} 2023-01-29 19:18:19 | INFO | train_inner | {"epoch": 21, "update": 20.459, "s2c_loss": "0.142", "loss": "0.09833", "s2c_nll_loss": "0.142", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "44220", "lr": "0.000105205", "gnorm": "3.502", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11197"} 2023-01-29 19:18:22 | INFO | train_inner | {"epoch": 21, "update": 20.464, "s2c_loss": "0.163", "loss": "0.11305", "s2c_nll_loss": "0.163", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "44230", "lr": "0.000105138", "gnorm": "2.85", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11200"} 2023-01-29 19:18:24 | INFO | train_inner | {"epoch": 21, "update": 20.469, "s2c_loss": "0.154", "loss": "0.10664", "s2c_nll_loss": "0.154", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "44240", "lr": "0.000105071", "gnorm": "3.347", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11202"} 2023-01-29 19:18:27 | INFO | 
train_inner | {"epoch": 21, "update": 20.473, "s2c_loss": "0.159", "loss": "0.1105", "s2c_nll_loss": "0.159", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "44250", "lr": "0.000105005", "gnorm": "2.899", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11205"} 2023-01-29 19:18:29 | INFO | train_inner | {"epoch": 21, "update": 20.478, "s2c_loss": "0.089", "loss": "0.06168", "s2c_nll_loss": "0.089", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "44260", "lr": "0.000104938", "gnorm": "3.285", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11207"} 2023-01-29 19:18:32 | INFO | train_inner | {"epoch": 21, "update": 20.482, "s2c_loss": "0.133", "loss": "0.09216", "s2c_nll_loss": "0.133", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "44270", "lr": "0.000104871", "gnorm": "3.207", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11210"} 2023-01-29 19:18:34 | INFO | train_inner | {"epoch": 21, "update": 20.487, "s2c_loss": "0.13", "loss": "0.09", "s2c_nll_loss": "0.13", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "44280", "lr": "0.000104805", "gnorm": "3.277", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11212"} 2023-01-29 19:18:37 | INFO | train_inner | {"epoch": 21, "update": 20.492, "s2c_loss": "0.124", "loss": "0.08601", "s2c_nll_loss": "0.124", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "44290", "lr": "0.000104738", "gnorm": "2.302", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11215"} 
2023-01-29 19:18:39 | INFO | train_inner | {"epoch": 21, "update": 20.496, "s2c_loss": "0.109", "loss": "0.07554", "s2c_nll_loss": "0.109", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "246.9", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "44300", "lr": "0.000104671", "gnorm": "2.791", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11217"} 2023-01-29 19:18:42 | INFO | train_inner | {"epoch": 21, "update": 20.501, "s2c_loss": "0.108", "loss": "0.07507", "s2c_nll_loss": "0.108", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "44310", "lr": "0.000104605", "gnorm": "3.179", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11220"} 2023-01-29 19:18:44 | INFO | train_inner | {"epoch": 21, "update": 20.506, "s2c_loss": "0.148", "loss": "0.10278", "s2c_nll_loss": "0.148", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "44320", "lr": "0.000104538", "gnorm": "3.679", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11222"} 2023-01-29 19:18:47 | INFO | train_inner | {"epoch": 21, "update": 20.51, "s2c_loss": "0.138", "loss": "0.09575", "s2c_nll_loss": "0.138", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "44330", "lr": "0.000104471", "gnorm": "3.61", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11225"} 2023-01-29 19:18:50 | INFO | train_inner | {"epoch": 21, "update": 20.515, "s2c_loss": "0.14", "loss": "0.09718", "s2c_nll_loss": "0.14", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "44340", "lr": "0.000104405", "gnorm": "3.37", "loss_scale": "2048", "train_wall": "3", "gb_free": 
"7.2", "wall": "11227"} 2023-01-29 19:18:52 | INFO | train_inner | {"epoch": 21, "update": 20.519, "s2c_loss": "0.133", "loss": "0.09215", "s2c_nll_loss": "0.133", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "44350", "lr": "0.000104338", "gnorm": "3.116", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11230"} 2023-01-29 19:18:55 | INFO | train_inner | {"epoch": 21, "update": 20.524, "s2c_loss": "0.132", "loss": "0.09177", "s2c_nll_loss": "0.132", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "44360", "lr": "0.000104271", "gnorm": "3.503", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11233"} 2023-01-29 19:18:57 | INFO | train_inner | {"epoch": 21, "update": 20.529, "s2c_loss": "0.117", "loss": "0.0809", "s2c_nll_loss": "0.117", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "44370", "lr": "0.000104205", "gnorm": "2.87", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11235"} 2023-01-29 19:19:00 | INFO | train_inner | {"epoch": 21, "update": 20.533, "s2c_loss": "0.2", "loss": "0.13865", "s2c_nll_loss": "0.2", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "44380", "lr": "0.000104138", "gnorm": "3.343", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11238"} 2023-01-29 19:19:02 | INFO | train_inner | {"epoch": 21, "update": 20.538, "s2c_loss": "0.154", "loss": "0.10685", "s2c_nll_loss": "0.154", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "44390", "lr": "0.000104071", "gnorm": "4.076", "loss_scale": "2048", 
"train_wall": "3", "gb_free": "7.2", "wall": "11240"} 2023-01-29 19:19:05 | INFO | train_inner | {"epoch": 21, "update": 20.543, "s2c_loss": "0.177", "loss": "0.12247", "s2c_nll_loss": "0.177", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "44400", "lr": "0.000104005", "gnorm": "4.262", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11243"} 2023-01-29 19:19:07 | INFO | train_inner | {"epoch": 21, "update": 20.547, "s2c_loss": "0.184", "loss": "0.1274", "s2c_nll_loss": "0.184", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "44410", "lr": "0.000103938", "gnorm": "4.558", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11245"} 2023-01-29 19:19:10 | INFO | train_inner | {"epoch": 21, "update": 20.552, "s2c_loss": "0.142", "loss": "0.09874", "s2c_nll_loss": "0.142", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "44420", "lr": "0.000103871", "gnorm": "3.328", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "11248"} 2023-01-29 19:19:12 | INFO | train_inner | {"epoch": 21, "update": 20.556, "s2c_loss": "0.106", "loss": "0.07371", "s2c_nll_loss": "0.106", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "44430", "lr": "0.000103805", "gnorm": "3.421", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11250"} 2023-01-29 19:19:15 | INFO | train_inner | {"epoch": 21, "update": 20.561, "s2c_loss": "0.15", "loss": "0.10411", "s2c_nll_loss": "0.15", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "44440", "lr": "0.000103738", "gnorm": 
"3.682", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11253"} 2023-01-29 19:19:17 | INFO | train_inner | {"epoch": 21, "update": 20.566, "s2c_loss": "0.118", "loss": "0.08174", "s2c_nll_loss": "0.118", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "44450", "lr": "0.000103671", "gnorm": "3.475", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11255"} 2023-01-29 19:19:20 | INFO | train_inner | {"epoch": 21, "update": 20.57, "s2c_loss": "0.15", "loss": "0.10408", "s2c_nll_loss": "0.15", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "44460", "lr": "0.000103605", "gnorm": "3.923", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11258"} 2023-01-29 19:19:22 | INFO | train_inner | {"epoch": 21, "update": 20.575, "s2c_loss": "0.121", "loss": "0.08365", "s2c_nll_loss": "0.121", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "44470", "lr": "0.000103538", "gnorm": "3.222", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11260"} 2023-01-29 19:19:25 | INFO | train_inner | {"epoch": 21, "update": 20.58, "s2c_loss": "0.14", "loss": "0.09713", "s2c_nll_loss": "0.14", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "44480", "lr": "0.000103471", "gnorm": "4.077", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11263"} 2023-01-29 19:19:28 | INFO | train_inner | {"epoch": 21, "update": 20.584, "s2c_loss": "0.139", "loss": "0.09615", "s2c_nll_loss": "0.139", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "258.8", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "44490", "lr": 
"0.000103405", "gnorm": "3.093", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11265"} 2023-01-29 19:19:30 | INFO | train_inner | {"epoch": 21, "update": 20.589, "s2c_loss": "0.107", "loss": "0.07407", "s2c_nll_loss": "0.107", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "44500", "lr": "0.000103338", "gnorm": "2.946", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11268"} 2023-01-29 19:19:33 | INFO | train_inner | {"epoch": 21, "update": 20.593, "s2c_loss": "0.124", "loss": "0.0862", "s2c_nll_loss": "0.124", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "44510", "lr": "0.000103272", "gnorm": "3.202", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11271"} 2023-01-29 19:19:35 | INFO | train_inner | {"epoch": 21, "update": 20.598, "s2c_loss": "0.161", "loss": "0.11147", "s2c_nll_loss": "0.161", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "44520", "lr": "0.000103205", "gnorm": "3.632", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11273"} 2023-01-29 19:19:38 | INFO | train_inner | {"epoch": 21, "update": 20.603, "s2c_loss": "0.139", "loss": "0.0963", "s2c_nll_loss": "0.139", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "44530", "lr": "0.000103138", "gnorm": "4.093", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11276"} 2023-01-29 19:19:40 | INFO | train_inner | {"epoch": 21, "update": 20.607, "s2c_loss": "0.119", "loss": "0.08274", "s2c_nll_loss": "0.119", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", 
"num_updates": "44540", "lr": "0.000103072", "gnorm": "3.805", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11278"} 2023-01-29 19:19:43 | INFO | train_inner | {"epoch": 21, "update": 20.612, "s2c_loss": "0.085", "loss": "0.05861", "s2c_nll_loss": "0.085", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "44550", "lr": "0.000103005", "gnorm": "2.61", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11281"} 2023-01-29 19:19:45 | INFO | train_inner | {"epoch": 21, "update": 20.617, "s2c_loss": "0.185", "loss": "0.12802", "s2c_nll_loss": "0.185", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "44560", "lr": "0.000102938", "gnorm": "4.073", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11283"} 2023-01-29 19:19:48 | INFO | train_inner | {"epoch": 21, "update": 20.621, "s2c_loss": "0.141", "loss": "0.09769", "s2c_nll_loss": "0.141", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "44570", "lr": "0.000102872", "gnorm": "3.031", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11286"} 2023-01-29 19:19:50 | INFO | train_inner | {"epoch": 21, "update": 20.626, "s2c_loss": "0.13", "loss": "0.09028", "s2c_nll_loss": "0.13", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "44580", "lr": "0.000102805", "gnorm": "3.18", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "11288"} 2023-01-29 19:19:53 | INFO | train_inner | {"epoch": 21, "update": 20.63, "s2c_loss": "0.344", "loss": "0.2387", "s2c_nll_loss": "0.344", "s2c_accuracy": "95.625", "s2c_total": "64", "s2c_n_correct": "61.2", "wps": "255.4", "ups": "3.99", 
"wpb": "64", "bsz": "64", "num_updates": "44590", "lr": "0.000102738", "gnorm": "3.604", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11291"} 2023-01-29 19:19:55 | INFO | train_inner | {"epoch": 21, "update": 20.635, "s2c_loss": "0.123", "loss": "0.08512", "s2c_nll_loss": "0.123", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "44600", "lr": "0.000102672", "gnorm": "3.213", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11293"} 2023-01-29 19:19:58 | INFO | train_inner | {"epoch": 21, "update": 20.64, "s2c_loss": "0.1", "loss": "0.06924", "s2c_nll_loss": "0.1", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "44610", "lr": "0.000102605", "gnorm": "2.991", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11296"} 2023-01-29 19:20:00 | INFO | train_inner | {"epoch": 21, "update": 20.644, "s2c_loss": "0.124", "loss": "0.08578", "s2c_nll_loss": "0.124", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "44620", "lr": "0.000102538", "gnorm": "3.01", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11298"} 2023-01-29 19:20:03 | INFO | train_inner | {"epoch": 21, "update": 20.649, "s2c_loss": "0.101", "loss": "0.07002", "s2c_nll_loss": "0.101", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "44630", "lr": "0.000102472", "gnorm": "2.984", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11301"} 2023-01-29 19:20:06 | INFO | train_inner | {"epoch": 21, "update": 20.654, "s2c_loss": "0.085", "loss": "0.05875", "s2c_nll_loss": "0.085", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": 
"251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "44640", "lr": "0.000102405", "gnorm": "2.369", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11303"} 2023-01-29 19:20:08 | INFO | train_inner | {"epoch": 21, "update": 20.658, "s2c_loss": "0.097", "loss": "0.06693", "s2c_nll_loss": "0.097", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "44650", "lr": "0.000102338", "gnorm": "2.806", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11306"} 2023-01-29 19:20:11 | INFO | train_inner | {"epoch": 21, "update": 20.663, "s2c_loss": "0.108", "loss": "0.07454", "s2c_nll_loss": "0.108", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "44660", "lr": "0.000102272", "gnorm": "3.354", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11308"} 2023-01-29 19:20:13 | INFO | train_inner | {"epoch": 21, "update": 20.667, "s2c_loss": "0.103", "loss": "0.07116", "s2c_nll_loss": "0.103", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "44670", "lr": "0.000102205", "gnorm": "3.331", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11311"} 2023-01-29 19:20:16 | INFO | train_inner | {"epoch": 21, "update": 20.672, "s2c_loss": "0.085", "loss": "0.05914", "s2c_nll_loss": "0.085", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "44680", "lr": "0.000102138", "gnorm": "2.366", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11314"} 2023-01-29 19:20:18 | INFO | train_inner | {"epoch": 21, "update": 20.677, "s2c_loss": "0.107", "loss": "0.07411", "s2c_nll_loss": "0.107", "s2c_accuracy": "98.438", "s2c_total": "64", 
"s2c_n_correct": "63", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "44690", "lr": "0.000102072", "gnorm": "2.574", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11316"} 2023-01-29 19:20:21 | INFO | train_inner | {"epoch": 21, "update": 20.681, "s2c_loss": "0.143", "loss": "0.09896", "s2c_nll_loss": "0.143", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "257.6", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "44700", "lr": "0.000102005", "gnorm": "2.625", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11319"} 2023-01-29 19:20:23 | INFO | train_inner | {"epoch": 21, "update": 20.686, "s2c_loss": "0.113", "loss": "0.07815", "s2c_nll_loss": "0.113", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "44710", "lr": "0.000101938", "gnorm": "3.643", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11321"} 2023-01-29 19:20:26 | INFO | train_inner | {"epoch": 21, "update": 20.691, "s2c_loss": "0.127", "loss": "0.08808", "s2c_nll_loss": "0.127", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "44720", "lr": "0.000101872", "gnorm": "3.413", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11324"} 2023-01-29 19:20:28 | INFO | train_inner | {"epoch": 21, "update": 20.695, "s2c_loss": "0.172", "loss": "0.11917", "s2c_nll_loss": "0.172", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "44730", "lr": "0.000101805", "gnorm": "2.909", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11326"} 2023-01-29 19:20:31 | INFO | train_inner | {"epoch": 21, "update": 20.7, "s2c_loss": "0.133", "loss": "0.09243", "s2c_nll_loss": "0.133", "s2c_accuracy": 
"97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "44740", "lr": "0.000101738", "gnorm": "3.13", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11329"} 2023-01-29 19:20:33 | INFO | train_inner | {"epoch": 21, "update": 20.704, "s2c_loss": "0.132", "loss": "0.09116", "s2c_nll_loss": "0.132", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "44750", "lr": "0.000101672", "gnorm": "2.925", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11331"} 2023-01-29 19:20:36 | INFO | train_inner | {"epoch": 21, "update": 20.709, "s2c_loss": "0.132", "loss": "0.09129", "s2c_nll_loss": "0.132", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "44760", "lr": "0.000101605", "gnorm": "3.223", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11334"} 2023-01-29 19:20:38 | INFO | train_inner | {"epoch": 21, "update": 20.714, "s2c_loss": "0.102", "loss": "0.07049", "s2c_nll_loss": "0.102", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "44770", "lr": "0.000101538", "gnorm": "2.393", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11336"} 2023-01-29 19:20:41 | INFO | train_inner | {"epoch": 21, "update": 20.718, "s2c_loss": "0.115", "loss": "0.08004", "s2c_nll_loss": "0.115", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "44780", "lr": "0.000101472", "gnorm": "2.575", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11339"} 2023-01-29 19:20:43 | INFO | train_inner | {"epoch": 21, "update": 20.723, "s2c_loss": "0.083", "loss": "0.05753", 
"s2c_nll_loss": "0.083", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "244.9", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "44790", "lr": "0.000101405", "gnorm": "2.13", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11341"} 2023-01-29 19:20:46 | INFO | train_inner | {"epoch": 21, "update": 20.728, "s2c_loss": "0.095", "loss": "0.066", "s2c_nll_loss": "0.095", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "44800", "lr": "0.000101338", "gnorm": "2.508", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11344"} 2023-01-29 19:20:49 | INFO | train_inner | {"epoch": 21, "update": 20.732, "s2c_loss": "0.118", "loss": "0.08205", "s2c_nll_loss": "0.118", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "44810", "lr": "0.000101272", "gnorm": "2.572", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11346"} 2023-01-29 19:20:51 | INFO | train_inner | {"epoch": 21, "update": 20.737, "s2c_loss": "0.108", "loss": "0.0752", "s2c_nll_loss": "0.108", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "44820", "lr": "0.000101205", "gnorm": "2.887", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11349"} 2023-01-29 19:20:54 | INFO | train_inner | {"epoch": 21, "update": 20.741, "s2c_loss": "0.059", "loss": "0.04117", "s2c_nll_loss": "0.059", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "44830", "lr": "0.000101138", "gnorm": "1.998", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11352"} 2023-01-29 19:20:56 | INFO | train_inner | {"epoch": 21, "update": 20.746, "s2c_loss": 
"0.087", "loss": "0.06062", "s2c_nll_loss": "0.087", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "44840", "lr": "0.000101072", "gnorm": "2.393", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11354"} 2023-01-29 19:20:59 | INFO | train_inner | {"epoch": 21, "update": 20.751, "s2c_loss": "0.104", "loss": "0.07217", "s2c_nll_loss": "0.104", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "44850", "lr": "0.000101005", "gnorm": "2.975", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11357"} 2023-01-29 19:21:01 | INFO | train_inner | {"epoch": 21, "update": 20.755, "s2c_loss": "0.106", "loss": "0.07342", "s2c_nll_loss": "0.106", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "44860", "lr": "0.000100938", "gnorm": "3.097", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11359"} 2023-01-29 19:21:04 | INFO | train_inner | {"epoch": 21, "update": 20.76, "s2c_loss": "0.123", "loss": "0.08502", "s2c_nll_loss": "0.123", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "44870", "lr": "0.000100872", "gnorm": "3.114", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11362"} 2023-01-29 19:21:06 | INFO | train_inner | {"epoch": 21, "update": 20.765, "s2c_loss": "0.137", "loss": "0.09529", "s2c_nll_loss": "0.137", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "44880", "lr": "0.000100805", "gnorm": "3.332", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11364"} 2023-01-29 19:21:09 | INFO | train_inner | {"epoch": 21, 
"update": 20.769, "s2c_loss": "0.123", "loss": "0.08523", "s2c_nll_loss": "0.123", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "44890", "lr": "0.000100738", "gnorm": "3.599", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11367"} 2023-01-29 19:21:11 | INFO | train_inner | {"epoch": 21, "update": 20.774, "s2c_loss": "0.108", "loss": "0.07457", "s2c_nll_loss": "0.108", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "44900", "lr": "0.000100672", "gnorm": "2.722", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11369"} 2023-01-29 19:21:14 | INFO | train_inner | {"epoch": 21, "update": 20.778, "s2c_loss": "0.148", "loss": "0.1027", "s2c_nll_loss": "0.148", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "44910", "lr": "0.000100605", "gnorm": "4.007", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "11372"} 2023-01-29 19:21:16 | INFO | train_inner | {"epoch": 21, "update": 20.783, "s2c_loss": "0.076", "loss": "0.05274", "s2c_nll_loss": "0.076", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "44920", "lr": "0.000100538", "gnorm": "2.038", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11374"} 2023-01-29 19:21:19 | INFO | train_inner | {"epoch": 21, "update": 20.788, "s2c_loss": "0.159", "loss": "0.11054", "s2c_nll_loss": "0.159", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "44930", "lr": "0.000100472", "gnorm": "3.593", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11377"} 2023-01-29 19:21:21 | INFO | 
train_inner | {"epoch": 21, "update": 20.792, "s2c_loss": "0.122", "loss": "0.08437", "s2c_nll_loss": "0.122", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "44940", "lr": "0.000100405", "gnorm": "3.659", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11379"} 2023-01-29 19:21:24 | INFO | train_inner | {"epoch": 21, "update": 20.797, "s2c_loss": "0.168", "loss": "0.11613", "s2c_nll_loss": "0.168", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "44950", "lr": "0.000100338", "gnorm": "4.593", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11382"} 2023-01-29 19:21:26 | INFO | train_inner | {"epoch": 21, "update": 20.802, "s2c_loss": "0.149", "loss": "0.10326", "s2c_nll_loss": "0.149", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "258.4", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "44960", "lr": "0.000100272", "gnorm": "3.733", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11384"} 2023-01-29 19:21:29 | INFO | train_inner | {"epoch": 21, "update": 20.806, "s2c_loss": "0.166", "loss": "0.11529", "s2c_nll_loss": "0.166", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "44970", "lr": "0.000100205", "gnorm": "3.712", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11387"} 2023-01-29 19:21:31 | INFO | train_inner | {"epoch": 21, "update": 20.811, "s2c_loss": "0.097", "loss": "0.06702", "s2c_nll_loss": "0.097", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "44980", "lr": "0.000100138", "gnorm": "2.748", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11389"} 
2023-01-29 19:21:34 | INFO | train_inner | {"epoch": 21, "update": 20.815, "s2c_loss": "0.125", "loss": "0.08677", "s2c_nll_loss": "0.125", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "44990", "lr": "0.000100072", "gnorm": "2.849", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11392"} 2023-01-29 19:21:36 | INFO | train_inner | {"epoch": 21, "update": 20.82, "s2c_loss": "0.115", "loss": "0.07999", "s2c_nll_loss": "0.115", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "45000", "lr": "0.000100005", "gnorm": "3.16", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11394"} 2023-01-29 19:21:39 | INFO | train_inner | {"epoch": 21, "update": 20.825, "s2c_loss": "0.193", "loss": "0.13397", "s2c_nll_loss": "0.193", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "45010", "lr": "9.99383e-05", "gnorm": "3.541", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11397"} 2023-01-29 19:21:41 | INFO | train_inner | {"epoch": 21, "update": 20.829, "s2c_loss": "0.199", "loss": "0.13783", "s2c_nll_loss": "0.199", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "45020", "lr": "9.98717e-05", "gnorm": "4.219", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11399"} 2023-01-29 19:21:44 | INFO | train_inner | {"epoch": 21, "update": 20.834, "s2c_loss": "0.24", "loss": "0.16649", "s2c_nll_loss": "0.24", "s2c_accuracy": "96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "45030", "lr": "9.9805e-05", "gnorm": "4.399", "loss_scale": "2048", "train_wall": "2", "gb_free": 
"7.2", "wall": "11402"} 2023-01-29 19:21:46 | INFO | train_inner | {"epoch": 21, "update": 20.839, "s2c_loss": "0.109", "loss": "0.07567", "s2c_nll_loss": "0.109", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "45040", "lr": "9.97383e-05", "gnorm": "2.776", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11404"} 2023-01-29 19:21:49 | INFO | train_inner | {"epoch": 21, "update": 20.843, "s2c_loss": "0.13", "loss": "0.08997", "s2c_nll_loss": "0.13", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "45050", "lr": "9.96717e-05", "gnorm": "4.051", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11407"} 2023-01-29 19:21:51 | INFO | train_inner | {"epoch": 21, "update": 20.848, "s2c_loss": "0.116", "loss": "0.08061", "s2c_nll_loss": "0.116", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "45060", "lr": "9.9605e-05", "gnorm": "3.665", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11409"} 2023-01-29 19:21:54 | INFO | train_inner | {"epoch": 21, "update": 20.852, "s2c_loss": "0.151", "loss": "0.10496", "s2c_nll_loss": "0.151", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "45070", "lr": "9.95384e-05", "gnorm": "3.109", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11412"} 2023-01-29 19:21:56 | INFO | train_inner | {"epoch": 21, "update": 20.857, "s2c_loss": "0.229", "loss": "0.15872", "s2c_nll_loss": "0.229", "s2c_accuracy": "96.094", "s2c_total": "64", "s2c_n_correct": "61.5", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "45080", "lr": "9.94717e-05", "gnorm": "3.616", "loss_scale": "2048", 
"train_wall": "2", "gb_free": "7.3", "wall": "11414"} 2023-01-29 19:21:59 | INFO | train_inner | {"epoch": 21, "update": 20.862, "s2c_loss": "0.122", "loss": "0.08456", "s2c_nll_loss": "0.122", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "45090", "lr": "9.9405e-05", "gnorm": "3.08", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11417"} 2023-01-29 19:22:01 | INFO | train_inner | {"epoch": 21, "update": 20.866, "s2c_loss": "0.125", "loss": "0.08671", "s2c_nll_loss": "0.125", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "45100", "lr": "9.93384e-05", "gnorm": "3.236", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11419"} 2023-01-29 19:22:04 | INFO | train_inner | {"epoch": 21, "update": 20.871, "s2c_loss": "0.152", "loss": "0.10517", "s2c_nll_loss": "0.152", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "45110", "lr": "9.92717e-05", "gnorm": "3.954", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11422"} 2023-01-29 19:22:07 | INFO | train_inner | {"epoch": 21, "update": 20.876, "s2c_loss": "0.163", "loss": "0.11305", "s2c_nll_loss": "0.163", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "45120", "lr": "9.9205e-05", "gnorm": "3.01", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11424"} 2023-01-29 19:22:09 | INFO | train_inner | {"epoch": 21, "update": 20.88, "s2c_loss": "0.162", "loss": "0.11263", "s2c_nll_loss": "0.162", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "45130", "lr": "9.91384e-05", "gnorm": "4.317", 
"loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11427"} 2023-01-29 19:22:12 | INFO | train_inner | {"epoch": 21, "update": 20.885, "s2c_loss": "0.182", "loss": "0.12624", "s2c_nll_loss": "0.182", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "45140", "lr": "9.90717e-05", "gnorm": "3.566", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11430"} 2023-01-29 19:22:14 | INFO | train_inner | {"epoch": 21, "update": 20.889, "s2c_loss": "0.14", "loss": "0.09692", "s2c_nll_loss": "0.14", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "45150", "lr": "9.90051e-05", "gnorm": "3.614", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11432"} 2023-01-29 19:22:17 | INFO | train_inner | {"epoch": 21, "update": 20.894, "s2c_loss": "0.18", "loss": "0.12503", "s2c_nll_loss": "0.18", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "260.5", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "45160", "lr": "9.89384e-05", "gnorm": "3.233", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11435"} 2023-01-29 19:22:19 | INFO | train_inner | {"epoch": 21, "update": 20.899, "s2c_loss": "0.122", "loss": "0.08432", "s2c_nll_loss": "0.122", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "45170", "lr": "9.88717e-05", "gnorm": "3.474", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11437"} 2023-01-29 19:22:22 | INFO | train_inner | {"epoch": 21, "update": 20.903, "s2c_loss": "0.11", "loss": "0.07597", "s2c_nll_loss": "0.11", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "45180", "lr": 
"9.88051e-05", "gnorm": "3.012", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11440"} 2023-01-29 19:22:24 | INFO | train_inner | {"epoch": 21, "update": 20.908, "s2c_loss": "0.096", "loss": "0.06635", "s2c_nll_loss": "0.096", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "45190", "lr": "9.87384e-05", "gnorm": "2.631", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11442"} 2023-01-29 19:22:27 | INFO | train_inner | {"epoch": 21, "update": 20.913, "s2c_loss": "0.07", "loss": "0.04868", "s2c_nll_loss": "0.07", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "45200", "lr": "9.86717e-05", "gnorm": "2.212", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11445"} 2023-01-29 19:22:29 | INFO | train_inner | {"epoch": 21, "update": 20.917, "s2c_loss": "0.107", "loss": "0.0743", "s2c_nll_loss": "0.107", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "259.3", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "45210", "lr": "9.86051e-05", "gnorm": "2.973", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11447"} 2023-01-29 19:22:32 | INFO | train_inner | {"epoch": 21, "update": 20.922, "s2c_loss": "0.106", "loss": "0.07341", "s2c_nll_loss": "0.106", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "45220", "lr": "9.85384e-05", "gnorm": "2.9", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11450"} 2023-01-29 19:22:34 | INFO | train_inner | {"epoch": 21, "update": 20.926, "s2c_loss": "0.13", "loss": "0.09043", "s2c_nll_loss": "0.13", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", 
"num_updates": "45230", "lr": "9.84717e-05", "gnorm": "3.164", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11452"} 2023-01-29 19:22:37 | INFO | train_inner | {"epoch": 21, "update": 20.931, "s2c_loss": "0.11", "loss": "0.07626", "s2c_nll_loss": "0.11", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "45240", "lr": "9.84051e-05", "gnorm": "3.171", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11455"} 2023-01-29 19:22:39 | INFO | train_inner | {"epoch": 21, "update": 20.936, "s2c_loss": "0.163", "loss": "0.11307", "s2c_nll_loss": "0.163", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "45250", "lr": "9.83384e-05", "gnorm": "4.318", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11457"} 2023-01-29 19:22:42 | INFO | train_inner | {"epoch": 21, "update": 20.94, "s2c_loss": "0.128", "loss": "0.089", "s2c_nll_loss": "0.128", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "45260", "lr": "9.82718e-05", "gnorm": "3.196", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11460"} 2023-01-29 19:22:44 | INFO | train_inner | {"epoch": 21, "update": 20.945, "s2c_loss": "0.106", "loss": "0.07348", "s2c_nll_loss": "0.106", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "45270", "lr": "9.82051e-05", "gnorm": "2.715", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11462"} 2023-01-29 19:22:47 | INFO | train_inner | {"epoch": 21, "update": 20.95, "s2c_loss": "0.185", "loss": "0.12804", "s2c_nll_loss": "0.185", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "253.1", "ups": "3.95", 
"wpb": "64", "bsz": "64", "num_updates": "45280", "lr": "9.81384e-05", "gnorm": "4.047", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11465"} 2023-01-29 19:22:49 | INFO | train_inner | {"epoch": 21, "update": 20.954, "s2c_loss": "0.202", "loss": "0.14008", "s2c_nll_loss": "0.202", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "259.2", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "45290", "lr": "9.80718e-05", "gnorm": "4.125", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11467"} 2023-01-29 19:22:52 | INFO | train_inner | {"epoch": 21, "update": 20.959, "s2c_loss": "0.114", "loss": "0.07896", "s2c_nll_loss": "0.114", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "45300", "lr": "9.80051e-05", "gnorm": "2.75", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11470"} 2023-01-29 19:22:55 | INFO | train_inner | {"epoch": 21, "update": 20.963, "s2c_loss": "0.082", "loss": "0.05687", "s2c_nll_loss": "0.082", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "45310", "lr": "9.79384e-05", "gnorm": "2.481", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11472"} 2023-01-29 19:22:57 | INFO | train_inner | {"epoch": 21, "update": 20.968, "s2c_loss": "0.102", "loss": "0.07036", "s2c_nll_loss": "0.102", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "45320", "lr": "9.78718e-05", "gnorm": "3.774", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11475"} 2023-01-29 19:23:00 | INFO | train_inner | {"epoch": 21, "update": 20.973, "s2c_loss": "0.17", "loss": "0.1176", "s2c_nll_loss": "0.17", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": 
"247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "45330", "lr": "9.78051e-05", "gnorm": "3.829", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11478"} 2023-01-29 19:23:02 | INFO | train_inner | {"epoch": 21, "update": 20.977, "s2c_loss": "0.124", "loss": "0.08569", "s2c_nll_loss": "0.124", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "45340", "lr": "9.77384e-05", "gnorm": "3.203", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "11480"} 2023-01-29 19:23:05 | INFO | train_inner | {"epoch": 21, "update": 20.982, "s2c_loss": "0.133", "loss": "0.09239", "s2c_nll_loss": "0.133", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "45350", "lr": "9.76718e-05", "gnorm": "2.892", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11483"} 2023-01-29 19:23:07 | INFO | train_inner | {"epoch": 21, "update": 20.987, "s2c_loss": "0.149", "loss": "0.10325", "s2c_nll_loss": "0.149", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "45360", "lr": "9.76051e-05", "gnorm": "3.603", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11485"} 2023-01-29 19:23:10 | INFO | train_inner | {"epoch": 21, "update": 20.991, "s2c_loss": "0.111", "loss": "0.07662", "s2c_nll_loss": "0.111", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "45370", "lr": "9.75385e-05", "gnorm": "3.126", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11488"} 2023-01-29 19:23:12 | INFO | train_inner | {"epoch": 21, "update": 20.996, "s2c_loss": "0.152", "loss": "0.10546", "s2c_nll_loss": "0.152", "s2c_accuracy": "96.719", "s2c_total": "64", 
"s2c_n_correct": "61.9", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "45380", "lr": "9.74718e-05", "gnorm": "3.466", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11490"} 2023-01-29 19:23:15 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 19:23:29 | INFO | valid | {"epoch": 21, "valid_s2c_loss": "0.551", "valid_loss": "0.38198", "valid_s2c_nll_loss": "0.551", "valid_s2c_accuracy": "90.019", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "28.7685", "valid_num_updates": "45389", "valid_best_s2c_accuracy": "90.019"} 2023-01-29 19:23:29 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 21 @ 45389 updates 2023-01-29 19:23:29 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 19:23:36 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 19:23:41 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt (epoch 21 @ 45389 updates, score 90.019) (writing took 11.87856262922287 seconds) 2023-01-29 19:23:41 | INFO | fairseq_cli.train | end of epoch 21 (average epoch stats below) 2023-01-29 19:23:41 | INFO | train | {"epoch": 21, "train_s2c_loss": "0.132", "train_loss": "0.09172", "train_s2c_nll_loss": "0.132", "train_s2c_accuracy": "97.74", "train_s2c_total": "63.9838", "train_s2c_n_correct": "62.5375", "train_wps": "238.5", "train_ups": "3.73", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "45389", "train_lr": "9.74118e-05", "train_gnorm": "3.213", "train_loss_scale": "2048", "train_train_wall": "540", "train_gb_free": "7.5", "train_wall": "11519"} 2023-01-29 19:23:48 | INFO | fairseq.trainer | begin training epoch 22 2023-01-29 19:23:48 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 19:23:48 | INFO | train_inner | 
{"epoch": 22, "update": 21.0, "s2c_loss": "0.133", "loss": "0.09206", "s2c_nll_loss": "0.133", "s2c_accuracy": "98.026", "s2c_total": "60.8", "s2c_n_correct": "59.6", "wps": "17.1", "ups": "0.28", "wpb": "60.8", "bsz": "60.8", "num_updates": "45390", "lr": "9.74051e-05", "gnorm": "3.058", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11526"} 2023-01-29 19:23:50 | INFO | train_inner | {"epoch": 22, "update": 21.005, "s2c_loss": "0.073", "loss": "0.05083", "s2c_nll_loss": "0.073", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "45400", "lr": "9.73385e-05", "gnorm": "1.996", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11528"} 2023-01-29 19:23:53 | INFO | train_inner | {"epoch": 22, "update": 21.01, "s2c_loss": "0.116", "loss": "0.08069", "s2c_nll_loss": "0.116", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "45410", "lr": "9.72718e-05", "gnorm": "2.469", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11531"} 2023-01-29 19:23:55 | INFO | train_inner | {"epoch": 22, "update": 21.014, "s2c_loss": "0.077", "loss": "0.05324", "s2c_nll_loss": "0.077", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "45420", "lr": "9.72051e-05", "gnorm": "1.923", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11533"} 2023-01-29 19:23:58 | INFO | train_inner | {"epoch": 22, "update": 21.019, "s2c_loss": "0.065", "loss": "0.0448", "s2c_nll_loss": "0.065", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "45430", "lr": "9.71385e-05", "gnorm": "1.956", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11536"} 2023-01-29 
19:24:00 | INFO | train_inner | {"epoch": 22, "update": 21.024, "s2c_loss": "0.1", "loss": "0.06959", "s2c_nll_loss": "0.1", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "45440", "lr": "9.70718e-05", "gnorm": "2.695", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11538"} 2023-01-29 19:24:03 | INFO | train_inner | {"epoch": 22, "update": 21.028, "s2c_loss": "0.138", "loss": "0.09536", "s2c_nll_loss": "0.138", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "245.6", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "45450", "lr": "9.70052e-05", "gnorm": "2.661", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11541"} 2023-01-29 19:24:06 | INFO | train_inner | {"epoch": 22, "update": 21.033, "s2c_loss": "0.082", "loss": "0.05653", "s2c_nll_loss": "0.082", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "45460", "lr": "9.69385e-05", "gnorm": "2.389", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11544"} 2023-01-29 19:24:08 | INFO | train_inner | {"epoch": 22, "update": 21.037, "s2c_loss": "0.08", "loss": "0.05545", "s2c_nll_loss": "0.08", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "45470", "lr": "9.68718e-05", "gnorm": "1.857", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11546"} 2023-01-29 19:24:11 | INFO | train_inner | {"epoch": 22, "update": 21.042, "s2c_loss": "0.114", "loss": "0.07902", "s2c_nll_loss": "0.114", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "246.3", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "45480", "lr": "9.68052e-05", "gnorm": "2.271", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", 
"wall": "11549"} 2023-01-29 19:24:13 | INFO | train_inner | {"epoch": 22, "update": 21.047, "s2c_loss": "0.073", "loss": "0.05053", "s2c_nll_loss": "0.073", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "45490", "lr": "9.67385e-05", "gnorm": "2.481", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11551"} 2023-01-29 19:24:16 | INFO | train_inner | {"epoch": 22, "update": 21.051, "s2c_loss": "0.056", "loss": "0.03887", "s2c_nll_loss": "0.056", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "45500", "lr": "9.66718e-05", "gnorm": "2.171", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11554"} 2023-01-29 19:24:18 | INFO | train_inner | {"epoch": 22, "update": 21.056, "s2c_loss": "0.067", "loss": "0.04639", "s2c_nll_loss": "0.067", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "45510", "lr": "9.66052e-05", "gnorm": "2.223", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11556"} 2023-01-29 19:24:21 | INFO | train_inner | {"epoch": 22, "update": 21.061, "s2c_loss": "0.06", "loss": "0.04143", "s2c_nll_loss": "0.06", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "45520", "lr": "9.65385e-05", "gnorm": "1.691", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "11559"} 2023-01-29 19:24:23 | INFO | train_inner | {"epoch": 22, "update": 21.065, "s2c_loss": "0.071", "loss": "0.04938", "s2c_nll_loss": "0.071", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "45530", "lr": "9.64718e-05", "gnorm": "2.349", "loss_scale": "2048", 
"train_wall": "2", "gb_free": "7.3", "wall": "11561"} 2023-01-29 19:24:26 | INFO | train_inner | {"epoch": 22, "update": 21.07, "s2c_loss": "0.116", "loss": "0.08027", "s2c_nll_loss": "0.116", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "45540", "lr": "9.64052e-05", "gnorm": "2.808", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11564"} 2023-01-29 19:24:28 | INFO | train_inner | {"epoch": 22, "update": 21.074, "s2c_loss": "0.085", "loss": "0.0589", "s2c_nll_loss": "0.085", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "45550", "lr": "9.63385e-05", "gnorm": "2.173", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11566"} 2023-01-29 19:24:31 | INFO | train_inner | {"epoch": 22, "update": 21.079, "s2c_loss": "0.096", "loss": "0.06671", "s2c_nll_loss": "0.096", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "45560", "lr": "9.62719e-05", "gnorm": "2.821", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11569"} 2023-01-29 19:24:33 | INFO | train_inner | {"epoch": 22, "update": 21.084, "s2c_loss": "0.101", "loss": "0.07026", "s2c_nll_loss": "0.101", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "45570", "lr": "9.62052e-05", "gnorm": "3.252", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11571"} 2023-01-29 19:24:36 | INFO | train_inner | {"epoch": 22, "update": 21.088, "s2c_loss": "0.067", "loss": "0.04671", "s2c_nll_loss": "0.067", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "45580", "lr": "9.61385e-05", "gnorm": 
"2.378", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11574"} 2023-01-29 19:24:39 | INFO | train_inner | {"epoch": 22, "update": 21.093, "s2c_loss": "0.073", "loss": "0.05076", "s2c_nll_loss": "0.073", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "45590", "lr": "9.60719e-05", "gnorm": "2.6", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11576"} 2023-01-29 19:24:41 | INFO | train_inner | {"epoch": 22, "update": 21.098, "s2c_loss": "0.095", "loss": "0.06601", "s2c_nll_loss": "0.095", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "259.6", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "45600", "lr": "9.60052e-05", "gnorm": "2.947", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11579"} 2023-01-29 19:24:43 | INFO | train_inner | {"epoch": 22, "update": 21.102, "s2c_loss": "0.094", "loss": "0.06518", "s2c_nll_loss": "0.094", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "258.1", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "45610", "lr": "9.59385e-05", "gnorm": "2.98", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11581"} 2023-01-29 19:24:46 | INFO | train_inner | {"epoch": 22, "update": 21.107, "s2c_loss": "0.089", "loss": "0.06141", "s2c_nll_loss": "0.089", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "45620", "lr": "9.58719e-05", "gnorm": "3.744", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11584"} 2023-01-29 19:24:49 | INFO | train_inner | {"epoch": 22, "update": 21.111, "s2c_loss": "0.109", "loss": "0.0755", "s2c_nll_loss": "0.109", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "45630", "lr": 
"9.58052e-05", "gnorm": "2.681", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11586"} 2023-01-29 19:24:51 | INFO | train_inner | {"epoch": 22, "update": 21.116, "s2c_loss": "0.141", "loss": "0.09796", "s2c_nll_loss": "0.141", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "45640", "lr": "9.57385e-05", "gnorm": "3.207", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11589"} 2023-01-29 19:24:54 | INFO | train_inner | {"epoch": 22, "update": 21.121, "s2c_loss": "0.082", "loss": "0.0569", "s2c_nll_loss": "0.082", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "45650", "lr": "9.56719e-05", "gnorm": "2.399", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11591"} 2023-01-29 19:24:56 | INFO | train_inner | {"epoch": 22, "update": 21.125, "s2c_loss": "0.109", "loss": "0.0756", "s2c_nll_loss": "0.109", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "45660", "lr": "9.56052e-05", "gnorm": "2.44", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11594"} 2023-01-29 19:24:59 | INFO | train_inner | {"epoch": 22, "update": 21.13, "s2c_loss": "0.061", "loss": "0.04249", "s2c_nll_loss": "0.061", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "258.4", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "45670", "lr": "9.55386e-05", "gnorm": "1.698", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11596"} 2023-01-29 19:25:01 | INFO | train_inner | {"epoch": 22, "update": 21.135, "s2c_loss": "0.112", "loss": "0.07737", "s2c_nll_loss": "0.112", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", 
"num_updates": "45680", "lr": "9.54719e-05", "gnorm": "2.489", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11599"} 2023-01-29 19:25:04 | INFO | train_inner | {"epoch": 22, "update": 21.139, "s2c_loss": "0.047", "loss": "0.0324", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "45690", "lr": "9.54052e-05", "gnorm": "1.807", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11602"} 2023-01-29 19:25:06 | INFO | train_inner | {"epoch": 22, "update": 21.144, "s2c_loss": "0.09", "loss": "0.06189", "s2c_nll_loss": "0.09", "s2c_accuracy": "98.116", "s2c_total": "63.7", "s2c_n_correct": "62.5", "wps": "252.9", "ups": "3.97", "wpb": "63.7", "bsz": "63.7", "num_updates": "45700", "lr": "9.53386e-05", "gnorm": "2.57", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11604"} 2023-01-29 19:25:09 | INFO | train_inner | {"epoch": 22, "update": 21.148, "s2c_loss": "0.112", "loss": "0.0774", "s2c_nll_loss": "0.112", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "45710", "lr": "9.52719e-05", "gnorm": "2.257", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11607"} 2023-01-29 19:25:11 | INFO | train_inner | {"epoch": 22, "update": 21.153, "s2c_loss": "0.066", "loss": "0.04599", "s2c_nll_loss": "0.066", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "45720", "lr": "9.52052e-05", "gnorm": "2.508", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11609"} 2023-01-29 19:25:14 | INFO | train_inner | {"epoch": 22, "update": 21.158, "s2c_loss": "0.066", "loss": "0.04568", "s2c_nll_loss": "0.066", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "252.7", "ups": "3.95", 
"wpb": "64", "bsz": "64", "num_updates": "45730", "lr": "9.51386e-05", "gnorm": "2.549", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11612"} 2023-01-29 19:25:16 | INFO | train_inner | {"epoch": 22, "update": 21.162, "s2c_loss": "0.088", "loss": "0.0609", "s2c_nll_loss": "0.088", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "45740", "lr": "9.50719e-05", "gnorm": "2.44", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11614"} 2023-01-29 19:25:19 | INFO | train_inner | {"epoch": 22, "update": 21.167, "s2c_loss": "0.095", "loss": "0.06592", "s2c_nll_loss": "0.095", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "45750", "lr": "9.50053e-05", "gnorm": "2.605", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11617"} 2023-01-29 19:25:21 | INFO | train_inner | {"epoch": 22, "update": 21.172, "s2c_loss": "0.064", "loss": "0.0447", "s2c_nll_loss": "0.064", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "45760", "lr": "9.49386e-05", "gnorm": "2.536", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11619"} 2023-01-29 19:25:24 | INFO | train_inner | {"epoch": 22, "update": 21.176, "s2c_loss": "0.128", "loss": "0.08907", "s2c_nll_loss": "0.128", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "45770", "lr": "9.48719e-05", "gnorm": "3.962", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11622"} 2023-01-29 19:25:26 | INFO | train_inner | {"epoch": 22, "update": 21.181, "s2c_loss": "0.118", "loss": "0.08176", "s2c_nll_loss": "0.118", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": 
"250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "45780", "lr": "9.48053e-05", "gnorm": "4.342", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11624"} 2023-01-29 19:25:29 | INFO | train_inner | {"epoch": 22, "update": 21.185, "s2c_loss": "0.089", "loss": "0.06184", "s2c_nll_loss": "0.089", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "45790", "lr": "9.47386e-05", "gnorm": "2.57", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "11627"} 2023-01-29 19:25:30 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 2048.0 2023-01-29 19:25:32 | INFO | train_inner | {"epoch": 22, "update": 21.191, "s2c_loss": "0.084", "loss": "0.0581", "s2c_nll_loss": "0.084", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "234.8", "ups": "3.67", "wpb": "64", "bsz": "64", "num_updates": "45800", "lr": "9.46719e-05", "gnorm": "2.628", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11630"} 2023-01-29 19:25:34 | INFO | train_inner | {"epoch": 22, "update": 21.195, "s2c_loss": "0.067", "loss": "0.04659", "s2c_nll_loss": "0.067", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "45810", "lr": "9.46053e-05", "gnorm": "1.913", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11632"} 2023-01-29 19:25:37 | INFO | train_inner | {"epoch": 22, "update": 21.2, "s2c_loss": "0.119", "loss": "0.08278", "s2c_nll_loss": "0.119", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "45820", "lr": "9.45386e-05", "gnorm": "2.757", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11635"} 2023-01-29 19:25:39 | INFO | train_inner | {"epoch": 22, 
"update": 21.204, "s2c_loss": "0.132", "loss": "0.09171", "s2c_nll_loss": "0.132", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "45830", "lr": "9.44719e-05", "gnorm": "2.725", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11637"} 2023-01-29 19:25:42 | INFO | train_inner | {"epoch": 22, "update": 21.209, "s2c_loss": "0.105", "loss": "0.07309", "s2c_nll_loss": "0.105", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "258.2", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "45840", "lr": "9.44053e-05", "gnorm": "2.641", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11640"} 2023-01-29 19:25:44 | INFO | train_inner | {"epoch": 22, "update": 21.214, "s2c_loss": "0.156", "loss": "0.10822", "s2c_nll_loss": "0.156", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "45850", "lr": "9.43386e-05", "gnorm": "3.722", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11642"} 2023-01-29 19:25:47 | INFO | train_inner | {"epoch": 22, "update": 21.218, "s2c_loss": "0.085", "loss": "0.05872", "s2c_nll_loss": "0.085", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "45860", "lr": "9.4272e-05", "gnorm": "2.803", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11645"} 2023-01-29 19:25:49 | INFO | train_inner | {"epoch": 22, "update": 21.223, "s2c_loss": "0.132", "loss": "0.09169", "s2c_nll_loss": "0.132", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "45870", "lr": "9.42053e-05", "gnorm": "4.883", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "11647"} 2023-01-29 19:25:52 | INFO | 
train_inner | {"epoch": 22, "update": 21.228, "s2c_loss": "0.129", "loss": "0.0896", "s2c_nll_loss": "0.129", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "45880", "lr": "9.41386e-05", "gnorm": "3.417", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "11650"} 2023-01-29 19:25:54 | INFO | train_inner | {"epoch": 22, "update": 21.232, "s2c_loss": "0.104", "loss": "0.07175", "s2c_nll_loss": "0.104", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "45890", "lr": "9.4072e-05", "gnorm": "3.442", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11652"} 2023-01-29 19:25:57 | INFO | train_inner | {"epoch": 22, "update": 21.237, "s2c_loss": "0.14", "loss": "0.09693", "s2c_nll_loss": "0.14", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "45900", "lr": "9.40053e-05", "gnorm": "2.864", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11655"} 2023-01-29 19:25:59 | INFO | train_inner | {"epoch": 22, "update": 21.241, "s2c_loss": "0.14", "loss": "0.09705", "s2c_nll_loss": "0.14", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "45910", "lr": "9.39386e-05", "gnorm": "3.167", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11657"} 2023-01-29 19:26:02 | INFO | train_inner | {"epoch": 22, "update": 21.246, "s2c_loss": "0.139", "loss": "0.09624", "s2c_nll_loss": "0.139", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "258.1", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "45920", "lr": "9.3872e-05", "gnorm": "3.584", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11660"} 
2023-01-29 19:26:04 | INFO | train_inner | {"epoch": 22, "update": 21.251, "s2c_loss": "0.088", "loss": "0.06085", "s2c_nll_loss": "0.088", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "45930", "lr": "9.38053e-05", "gnorm": "2.45", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11662"} 2023-01-29 19:26:07 | INFO | train_inner | {"epoch": 22, "update": 21.255, "s2c_loss": "0.121", "loss": "0.08371", "s2c_nll_loss": "0.121", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "45940", "lr": "9.37386e-05", "gnorm": "2.808", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11665"} 2023-01-29 19:26:09 | INFO | train_inner | {"epoch": 22, "update": 21.26, "s2c_loss": "0.07", "loss": "0.0487", "s2c_nll_loss": "0.07", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "45950", "lr": "9.3672e-05", "gnorm": "1.976", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11667"} 2023-01-29 19:26:12 | INFO | train_inner | {"epoch": 22, "update": 21.265, "s2c_loss": "0.064", "loss": "0.04437", "s2c_nll_loss": "0.064", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "45960", "lr": "9.36053e-05", "gnorm": "2.021", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11670"} 2023-01-29 19:26:14 | INFO | train_inner | {"epoch": 22, "update": 21.269, "s2c_loss": "0.088", "loss": "0.0608", "s2c_nll_loss": "0.088", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "45970", "lr": "9.35387e-05", "gnorm": "2.385", "loss_scale": "2048", "train_wall": "2", "gb_free": 
"7.3", "wall": "11672"} 2023-01-29 19:26:17 | INFO | train_inner | {"epoch": 22, "update": 21.274, "s2c_loss": "0.131", "loss": "0.09056", "s2c_nll_loss": "0.131", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "45980", "lr": "9.3472e-05", "gnorm": "3.092", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11675"} 2023-01-29 19:26:19 | INFO | train_inner | {"epoch": 22, "update": 21.278, "s2c_loss": "0.11", "loss": "0.07651", "s2c_nll_loss": "0.11", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "45990", "lr": "9.34053e-05", "gnorm": "3.006", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11677"} 2023-01-29 19:26:22 | INFO | train_inner | {"epoch": 22, "update": 21.283, "s2c_loss": "0.083", "loss": "0.05773", "s2c_nll_loss": "0.083", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "46000", "lr": "9.33387e-05", "gnorm": "2.808", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11680"} 2023-01-29 19:26:24 | INFO | train_inner | {"epoch": 22, "update": 21.288, "s2c_loss": "0.102", "loss": "0.07084", "s2c_nll_loss": "0.102", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "46010", "lr": "9.3272e-05", "gnorm": "2.172", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11682"} 2023-01-29 19:26:27 | INFO | train_inner | {"epoch": 22, "update": 21.292, "s2c_loss": "0.087", "loss": "0.06053", "s2c_nll_loss": "0.087", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "46020", "lr": "9.32053e-05", "gnorm": "3.598", "loss_scale": "2048", 
"train_wall": "3", "gb_free": "7.3", "wall": "11685"} 2023-01-29 19:26:30 | INFO | train_inner | {"epoch": 22, "update": 21.297, "s2c_loss": "0.083", "loss": "0.05757", "s2c_nll_loss": "0.083", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "46030", "lr": "9.31387e-05", "gnorm": "2.589", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11688"} 2023-01-29 19:26:32 | INFO | train_inner | {"epoch": 22, "update": 21.302, "s2c_loss": "0.114", "loss": "0.07905", "s2c_nll_loss": "0.114", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "46040", "lr": "9.3072e-05", "gnorm": "3.027", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11690"} 2023-01-29 19:26:35 | INFO | train_inner | {"epoch": 22, "update": 21.306, "s2c_loss": "0.185", "loss": "0.12853", "s2c_nll_loss": "0.185", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "261.1", "ups": "4.08", "wpb": "64", "bsz": "64", "num_updates": "46050", "lr": "9.30054e-05", "gnorm": "3.34", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11692"} 2023-01-29 19:26:37 | INFO | train_inner | {"epoch": 22, "update": 21.311, "s2c_loss": "0.11", "loss": "0.07623", "s2c_nll_loss": "0.11", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "46060", "lr": "9.29387e-05", "gnorm": "2.64", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11695"} 2023-01-29 19:26:40 | INFO | train_inner | {"epoch": 22, "update": 21.315, "s2c_loss": "0.101", "loss": "0.07032", "s2c_nll_loss": "0.101", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "46070", "lr": "9.2872e-05", "gnorm": "2.79", 
"loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11698"} 2023-01-29 19:26:42 | INFO | train_inner | {"epoch": 22, "update": 21.32, "s2c_loss": "0.304", "loss": "0.21046", "s2c_nll_loss": "0.304", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "46080", "lr": "9.28054e-05", "gnorm": "2.983", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11700"} 2023-01-29 19:26:45 | INFO | train_inner | {"epoch": 22, "update": 21.325, "s2c_loss": "0.109", "loss": "0.07554", "s2c_nll_loss": "0.109", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "246.8", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "46090", "lr": "9.27387e-05", "gnorm": "2.76", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11703"} 2023-01-29 19:26:47 | INFO | train_inner | {"epoch": 22, "update": 21.329, "s2c_loss": "0.091", "loss": "0.0633", "s2c_nll_loss": "0.091", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "46100", "lr": "9.2672e-05", "gnorm": "2.482", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11705"} 2023-01-29 19:26:50 | INFO | train_inner | {"epoch": 22, "update": 21.334, "s2c_loss": "0.1", "loss": "0.06945", "s2c_nll_loss": "0.1", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "46110", "lr": "9.26054e-05", "gnorm": "2.697", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11708"} 2023-01-29 19:26:52 | INFO | train_inner | {"epoch": 22, "update": 21.339, "s2c_loss": "0.074", "loss": "0.05134", "s2c_nll_loss": "0.074", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "46120", "lr": "9.25387e-05", 
"gnorm": "2.078", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11710"} 2023-01-29 19:26:55 | INFO | train_inner | {"epoch": 22, "update": 21.343, "s2c_loss": "0.113", "loss": "0.0786", "s2c_nll_loss": "0.113", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "46130", "lr": "9.2472e-05", "gnorm": "2.598", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11713"} 2023-01-29 19:26:57 | INFO | train_inner | {"epoch": 22, "update": 21.348, "s2c_loss": "0.075", "loss": "0.05221", "s2c_nll_loss": "0.075", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "46140", "lr": "9.24054e-05", "gnorm": "2.214", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11715"} 2023-01-29 19:27:00 | INFO | train_inner | {"epoch": 22, "update": 21.352, "s2c_loss": "0.168", "loss": "0.11617", "s2c_nll_loss": "0.168", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "46150", "lr": "9.23387e-05", "gnorm": "3.015", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11718"} 2023-01-29 19:27:02 | INFO | train_inner | {"epoch": 22, "update": 21.357, "s2c_loss": "0.089", "loss": "0.06185", "s2c_nll_loss": "0.089", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "46160", "lr": "9.22721e-05", "gnorm": "2.514", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11720"} 2023-01-29 19:27:05 | INFO | train_inner | {"epoch": 22, "update": 21.362, "s2c_loss": "0.307", "loss": "0.21309", "s2c_nll_loss": "0.307", "s2c_accuracy": "96.562", "s2c_total": "64", "s2c_n_correct": "61.8", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "46170", 
"lr": "9.22054e-05", "gnorm": "2.901", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11723"} 2023-01-29 19:27:07 | INFO | train_inner | {"epoch": 22, "update": 21.366, "s2c_loss": "0.081", "loss": "0.05636", "s2c_nll_loss": "0.081", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "259", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "46180", "lr": "9.21387e-05", "gnorm": "2.336", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11725"} 2023-01-29 19:27:10 | INFO | train_inner | {"epoch": 22, "update": 21.371, "s2c_loss": "0.077", "loss": "0.05368", "s2c_nll_loss": "0.077", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "46190", "lr": "9.20721e-05", "gnorm": "2.677", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11728"} 2023-01-29 19:27:12 | INFO | train_inner | {"epoch": 22, "update": 21.376, "s2c_loss": "0.117", "loss": "0.08114", "s2c_nll_loss": "0.117", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "46200", "lr": "9.20054e-05", "gnorm": "3.143", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11730"} 2023-01-29 19:27:15 | INFO | train_inner | {"epoch": 22, "update": 21.38, "s2c_loss": "0.086", "loss": "0.05979", "s2c_nll_loss": "0.086", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "46210", "lr": "9.19387e-05", "gnorm": "2.631", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11733"} 2023-01-29 19:27:17 | INFO | train_inner | {"epoch": 22, "update": 21.385, "s2c_loss": "0.314", "loss": "0.21787", "s2c_nll_loss": "0.314", "s2c_accuracy": "96.406", "s2c_total": "64", "s2c_n_correct": "61.7", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", 
"num_updates": "46220", "lr": "9.18721e-05", "gnorm": "2.962", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11735"} 2023-01-29 19:27:20 | INFO | train_inner | {"epoch": 22, "update": 21.389, "s2c_loss": "0.113", "loss": "0.07856", "s2c_nll_loss": "0.113", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "46230", "lr": "9.18054e-05", "gnorm": "3.592", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11738"} 2023-01-29 19:27:22 | INFO | train_inner | {"epoch": 22, "update": 21.394, "s2c_loss": "0.076", "loss": "0.05245", "s2c_nll_loss": "0.076", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "46240", "lr": "9.17387e-05", "gnorm": "2.599", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11740"} 2023-01-29 19:27:25 | INFO | train_inner | {"epoch": 22, "update": 21.399, "s2c_loss": "0.115", "loss": "0.08003", "s2c_nll_loss": "0.115", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "46250", "lr": "9.16721e-05", "gnorm": "3.42", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11743"} 2023-01-29 19:27:28 | INFO | train_inner | {"epoch": 22, "update": 21.403, "s2c_loss": "0.051", "loss": "0.03569", "s2c_nll_loss": "0.051", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "46260", "lr": "9.16054e-05", "gnorm": "1.762", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11745"} 2023-01-29 19:27:30 | INFO | train_inner | {"epoch": 22, "update": 21.408, "s2c_loss": "0.056", "loss": "0.03866", "s2c_nll_loss": "0.056", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.1", "ups": "3.95", 
"wpb": "64", "bsz": "64", "num_updates": "46270", "lr": "9.15388e-05", "gnorm": "1.603", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11748"} 2023-01-29 19:27:33 | INFO | train_inner | {"epoch": 22, "update": 21.413, "s2c_loss": "0.083", "loss": "0.05771", "s2c_nll_loss": "0.083", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "46280", "lr": "9.14721e-05", "gnorm": "2.69", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11751"} 2023-01-29 19:27:35 | INFO | train_inner | {"epoch": 22, "update": 21.417, "s2c_loss": "0.098", "loss": "0.06797", "s2c_nll_loss": "0.098", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "46290", "lr": "9.14054e-05", "gnorm": "2.233", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11753"} 2023-01-29 19:27:38 | INFO | train_inner | {"epoch": 22, "update": 21.422, "s2c_loss": "0.046", "loss": "0.03198", "s2c_nll_loss": "0.046", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "46300", "lr": "9.13388e-05", "gnorm": "1.786", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11756"} 2023-01-29 19:27:40 | INFO | train_inner | {"epoch": 22, "update": 21.426, "s2c_loss": "0.093", "loss": "0.06466", "s2c_nll_loss": "0.093", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "46310", "lr": "9.12721e-05", "gnorm": "2.646", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11758"} 2023-01-29 19:27:43 | INFO | train_inner | {"epoch": 22, "update": 21.431, "s2c_loss": "0.126", "loss": "0.08701", "s2c_nll_loss": "0.126", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": 
"256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "46320", "lr": "9.12054e-05", "gnorm": "2.983", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11761"} 2023-01-29 19:27:45 | INFO | train_inner | {"epoch": 22, "update": 21.436, "s2c_loss": "0.104", "loss": "0.0719", "s2c_nll_loss": "0.104", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "46330", "lr": "9.11388e-05", "gnorm": "2.714", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11763"} 2023-01-29 19:27:48 | INFO | train_inner | {"epoch": 22, "update": 21.44, "s2c_loss": "0.079", "loss": "0.05462", "s2c_nll_loss": "0.079", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "46340", "lr": "9.10721e-05", "gnorm": "2.499", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11766"} 2023-01-29 19:27:50 | INFO | train_inner | {"epoch": 22, "update": 21.445, "s2c_loss": "0.086", "loss": "0.05994", "s2c_nll_loss": "0.086", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "46350", "lr": "9.10055e-05", "gnorm": "2.377", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11768"} 2023-01-29 19:27:53 | INFO | train_inner | {"epoch": 22, "update": 21.45, "s2c_loss": "0.074", "loss": "0.05114", "s2c_nll_loss": "0.074", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "46360", "lr": "9.09388e-05", "gnorm": "1.911", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11771"} 2023-01-29 19:27:55 | INFO | train_inner | {"epoch": 22, "update": 21.454, "s2c_loss": "0.072", "loss": "0.04996", "s2c_nll_loss": "0.072", "s2c_accuracy": "98.906", "s2c_total": "64", 
"s2c_n_correct": "63.3", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "46370", "lr": "9.08721e-05", "gnorm": "2.251", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11773"} 2023-01-29 19:27:58 | INFO | train_inner | {"epoch": 22, "update": 21.459, "s2c_loss": "0.115", "loss": "0.0797", "s2c_nll_loss": "0.115", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "46380", "lr": "9.08055e-05", "gnorm": "2.944", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11776"} 2023-01-29 19:28:00 | INFO | train_inner | {"epoch": 22, "update": 21.463, "s2c_loss": "0.109", "loss": "0.0756", "s2c_nll_loss": "0.109", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "46390", "lr": "9.07388e-05", "gnorm": "3.232", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "11778"} 2023-01-29 19:28:03 | INFO | train_inner | {"epoch": 22, "update": 21.468, "s2c_loss": "0.123", "loss": "0.08499", "s2c_nll_loss": "0.123", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "46400", "lr": "9.06721e-05", "gnorm": "2.291", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11781"} 2023-01-29 19:28:06 | INFO | train_inner | {"epoch": 22, "update": 21.473, "s2c_loss": "0.085", "loss": "0.05905", "s2c_nll_loss": "0.085", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "46410", "lr": "9.06055e-05", "gnorm": "2.995", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11783"} 2023-01-29 19:28:08 | INFO | train_inner | {"epoch": 22, "update": 21.477, "s2c_loss": "0.116", "loss": "0.08041", "s2c_nll_loss": "0.116", "s2c_accuracy": "97.969", 
"s2c_total": "64", "s2c_n_correct": "62.7", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "46420", "lr": "9.05388e-05", "gnorm": "3.252", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11786"} 2023-01-29 19:28:11 | INFO | train_inner | {"epoch": 22, "update": 21.482, "s2c_loss": "0.107", "loss": "0.07396", "s2c_nll_loss": "0.107", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "258.1", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "46430", "lr": "9.04721e-05", "gnorm": "3.278", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11788"} 2023-01-29 19:28:13 | INFO | train_inner | {"epoch": 22, "update": 21.487, "s2c_loss": "0.079", "loss": "0.05456", "s2c_nll_loss": "0.079", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "46440", "lr": "9.04055e-05", "gnorm": "2.392", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11791"} 2023-01-29 19:28:16 | INFO | train_inner | {"epoch": 22, "update": 21.491, "s2c_loss": "0.094", "loss": "0.06549", "s2c_nll_loss": "0.094", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "46450", "lr": "9.03388e-05", "gnorm": "2.739", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11794"} 2023-01-29 19:28:18 | INFO | train_inner | {"epoch": 22, "update": 21.496, "s2c_loss": "0.116", "loss": "0.08073", "s2c_nll_loss": "0.116", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "46460", "lr": "9.02722e-05", "gnorm": "2.819", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11796"} 2023-01-29 19:28:21 | INFO | train_inner | {"epoch": 22, "update": 21.5, "s2c_loss": "0.097", "loss": "0.06728", "s2c_nll_loss": 
"0.097", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "46470", "lr": "9.02055e-05", "gnorm": "2.928", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11799"} 2023-01-29 19:28:23 | INFO | train_inner | {"epoch": 22, "update": 21.505, "s2c_loss": "0.107", "loss": "0.07444", "s2c_nll_loss": "0.107", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "46480", "lr": "9.01388e-05", "gnorm": "2.58", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11801"} 2023-01-29 19:28:26 | INFO | train_inner | {"epoch": 22, "update": 21.51, "s2c_loss": "0.08", "loss": "0.05511", "s2c_nll_loss": "0.08", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "46490", "lr": "9.00722e-05", "gnorm": "2.266", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11804"} 2023-01-29 19:28:28 | INFO | train_inner | {"epoch": 22, "update": 21.514, "s2c_loss": "0.097", "loss": "0.06717", "s2c_nll_loss": "0.097", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "46500", "lr": "9.00055e-05", "gnorm": "2.873", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11806"} 2023-01-29 19:28:31 | INFO | train_inner | {"epoch": 22, "update": 21.519, "s2c_loss": "0.093", "loss": "0.06433", "s2c_nll_loss": "0.093", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "46510", "lr": "8.99388e-05", "gnorm": "2.488", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11809"} 2023-01-29 19:28:33 | INFO | train_inner | {"epoch": 22, "update": 21.524, "s2c_loss": "0.099", "loss": 
"0.06883", "s2c_nll_loss": "0.099", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "46520", "lr": "8.98722e-05", "gnorm": "2.721", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11811"} 2023-01-29 19:28:36 | INFO | train_inner | {"epoch": 22, "update": 21.528, "s2c_loss": "0.125", "loss": "0.08631", "s2c_nll_loss": "0.125", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "46530", "lr": "8.98055e-05", "gnorm": "2.462", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11814"} 2023-01-29 19:28:38 | INFO | train_inner | {"epoch": 22, "update": 21.533, "s2c_loss": "0.088", "loss": "0.06067", "s2c_nll_loss": "0.088", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "46540", "lr": "8.97388e-05", "gnorm": "2.323", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "11816"} 2023-01-29 19:28:41 | INFO | train_inner | {"epoch": 22, "update": 21.537, "s2c_loss": "0.207", "loss": "0.14365", "s2c_nll_loss": "0.207", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "260.9", "ups": "4.08", "wpb": "64", "bsz": "64", "num_updates": "46550", "lr": "8.96722e-05", "gnorm": "3.164", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11819"} 2023-01-29 19:28:43 | INFO | train_inner | {"epoch": 22, "update": 21.542, "s2c_loss": "0.075", "loss": "0.05221", "s2c_nll_loss": "0.075", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "46560", "lr": "8.96055e-05", "gnorm": "2.714", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11821"} 2023-01-29 19:28:46 | INFO | train_inner | {"epoch": 22, "update": 21.547, 
"s2c_loss": "0.064", "loss": "0.04438", "s2c_nll_loss": "0.064", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "46570", "lr": "8.95389e-05", "gnorm": "1.898", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11824"} 2023-01-29 19:28:48 | INFO | train_inner | {"epoch": 22, "update": 21.551, "s2c_loss": "0.095", "loss": "0.06593", "s2c_nll_loss": "0.095", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "46580", "lr": "8.94722e-05", "gnorm": "2.88", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11826"} 2023-01-29 19:28:51 | INFO | train_inner | {"epoch": 22, "update": 21.556, "s2c_loss": "0.045", "loss": "0.03144", "s2c_nll_loss": "0.045", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "46590", "lr": "8.94055e-05", "gnorm": "1.99", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11829"} 2023-01-29 19:28:53 | INFO | train_inner | {"epoch": 22, "update": 21.561, "s2c_loss": "0.104", "loss": "0.0718", "s2c_nll_loss": "0.104", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "46600", "lr": "8.93389e-05", "gnorm": "2.33", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11831"} 2023-01-29 19:28:56 | INFO | train_inner | {"epoch": 22, "update": 21.565, "s2c_loss": "0.07", "loss": "0.04852", "s2c_nll_loss": "0.07", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "46610", "lr": "8.92722e-05", "gnorm": "2.253", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11834"} 2023-01-29 19:28:58 | INFO | train_inner | {"epoch": 
22, "update": 21.57, "s2c_loss": "0.079", "loss": "0.05493", "s2c_nll_loss": "0.079", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "46620", "lr": "8.92055e-05", "gnorm": "1.994", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11836"} 2023-01-29 19:29:01 | INFO | train_inner | {"epoch": 22, "update": 21.574, "s2c_loss": "0.053", "loss": "0.03675", "s2c_nll_loss": "0.053", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "46630", "lr": "8.91389e-05", "gnorm": "2.137", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11839"} 2023-01-29 19:29:04 | INFO | train_inner | {"epoch": 22, "update": 21.579, "s2c_loss": "0.076", "loss": "0.05264", "s2c_nll_loss": "0.076", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "46640", "lr": "8.90722e-05", "gnorm": "2.181", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11841"} 2023-01-29 19:29:06 | INFO | train_inner | {"epoch": 22, "update": 21.584, "s2c_loss": "0.08", "loss": "0.05578", "s2c_nll_loss": "0.08", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "46650", "lr": "8.90056e-05", "gnorm": "2.11", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11844"} 2023-01-29 19:29:09 | INFO | train_inner | {"epoch": 22, "update": 21.588, "s2c_loss": "0.081", "loss": "0.05602", "s2c_nll_loss": "0.081", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "46660", "lr": "8.89389e-05", "gnorm": "2.695", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11846"} 2023-01-29 19:29:11 | INFO | 
train_inner | {"epoch": 22, "update": 21.593, "s2c_loss": "0.11", "loss": "0.07612", "s2c_nll_loss": "0.11", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "46670", "lr": "8.88722e-05", "gnorm": "2.891", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11849"} 2023-01-29 19:29:14 | INFO | train_inner | {"epoch": 22, "update": 21.598, "s2c_loss": "0.062", "loss": "0.04315", "s2c_nll_loss": "0.062", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "46680", "lr": "8.88056e-05", "gnorm": "2.299", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11852"} 2023-01-29 19:29:16 | INFO | train_inner | {"epoch": 22, "update": 21.602, "s2c_loss": "0.115", "loss": "0.07989", "s2c_nll_loss": "0.115", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "46690", "lr": "8.87389e-05", "gnorm": "2.674", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11854"} 2023-01-29 19:29:19 | INFO | train_inner | {"epoch": 22, "update": 21.607, "s2c_loss": "0.114", "loss": "0.07916", "s2c_nll_loss": "0.114", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "244.3", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "46700", "lr": "8.86722e-05", "gnorm": "2.45", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11857"} 2023-01-29 19:29:21 | INFO | train_inner | {"epoch": 22, "update": 21.611, "s2c_loss": "0.084", "loss": "0.05815", "s2c_nll_loss": "0.084", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "46710", "lr": "8.86056e-05", "gnorm": "2.623", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11859"} 
2023-01-29 19:29:24 | INFO | train_inner | {"epoch": 22, "update": 21.616, "s2c_loss": "0.104", "loss": "0.07215", "s2c_nll_loss": "0.104", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "46720", "lr": "8.85389e-05", "gnorm": "2.904", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11862"} 2023-01-29 19:29:26 | INFO | train_inner | {"epoch": 22, "update": 21.621, "s2c_loss": "0.096", "loss": "0.06679", "s2c_nll_loss": "0.096", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "46730", "lr": "8.84722e-05", "gnorm": "2.728", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11864"} 2023-01-29 19:29:29 | INFO | train_inner | {"epoch": 22, "update": 21.625, "s2c_loss": "0.097", "loss": "0.06743", "s2c_nll_loss": "0.097", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "46740", "lr": "8.84056e-05", "gnorm": "2.291", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11867"} 2023-01-29 19:29:32 | INFO | train_inner | {"epoch": 22, "update": 21.63, "s2c_loss": "0.1", "loss": "0.06908", "s2c_nll_loss": "0.1", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "46750", "lr": "8.83389e-05", "gnorm": "3.148", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11869"} 2023-01-29 19:29:34 | INFO | train_inner | {"epoch": 22, "update": 21.635, "s2c_loss": "0.124", "loss": "0.08607", "s2c_nll_loss": "0.124", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "46760", "lr": "8.82723e-05", "gnorm": "3.753", "loss_scale": "2048", "train_wall": "3", "gb_free": 
"7.4", "wall": "11872"} 2023-01-29 19:29:37 | INFO | train_inner | {"epoch": 22, "update": 21.639, "s2c_loss": "0.11", "loss": "0.07607", "s2c_nll_loss": "0.11", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "46770", "lr": "8.82056e-05", "gnorm": "3.032", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11875"} 2023-01-29 19:29:39 | INFO | train_inner | {"epoch": 22, "update": 21.644, "s2c_loss": "0.129", "loss": "0.08919", "s2c_nll_loss": "0.129", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "258.8", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "46780", "lr": "8.81389e-05", "gnorm": "3.249", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11877"} 2023-01-29 19:29:42 | INFO | train_inner | {"epoch": 22, "update": 21.648, "s2c_loss": "0.124", "loss": "0.08574", "s2c_nll_loss": "0.124", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "46790", "lr": "8.80723e-05", "gnorm": "2.555", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11880"} 2023-01-29 19:29:44 | INFO | train_inner | {"epoch": 22, "update": 21.653, "s2c_loss": "0.088", "loss": "0.06114", "s2c_nll_loss": "0.088", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "46800", "lr": "8.80056e-05", "gnorm": "2.73", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11882"} 2023-01-29 19:29:47 | INFO | train_inner | {"epoch": 22, "update": 21.658, "s2c_loss": "0.112", "loss": "0.07732", "s2c_nll_loss": "0.112", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "46810", "lr": "8.79389e-05", "gnorm": "3.236", "loss_scale": "2048", 
"train_wall": "2", "gb_free": "7.2", "wall": "11885"} 2023-01-29 19:29:49 | INFO | train_inner | {"epoch": 22, "update": 21.662, "s2c_loss": "0.089", "loss": "0.06164", "s2c_nll_loss": "0.089", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "46820", "lr": "8.78723e-05", "gnorm": "2.462", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11887"} 2023-01-29 19:29:52 | INFO | train_inner | {"epoch": 22, "update": 21.667, "s2c_loss": "0.124", "loss": "0.08593", "s2c_nll_loss": "0.124", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "46830", "lr": "8.78056e-05", "gnorm": "3.58", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11890"} 2023-01-29 19:29:54 | INFO | train_inner | {"epoch": 22, "update": 21.672, "s2c_loss": "0.087", "loss": "0.0601", "s2c_nll_loss": "0.087", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "46840", "lr": "8.77389e-05", "gnorm": "2.885", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11892"} 2023-01-29 19:29:57 | INFO | train_inner | {"epoch": 22, "update": 21.676, "s2c_loss": "0.087", "loss": "0.06014", "s2c_nll_loss": "0.087", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "46850", "lr": "8.76723e-05", "gnorm": "3.802", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11895"} 2023-01-29 19:29:59 | INFO | train_inner | {"epoch": 22, "update": 21.681, "s2c_loss": "0.089", "loss": "0.06182", "s2c_nll_loss": "0.089", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "46860", "lr": "8.76056e-05", "gnorm": "2.745", 
"loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11897"} 2023-01-29 19:30:02 | INFO | train_inner | {"epoch": 22, "update": 21.685, "s2c_loss": "0.166", "loss": "0.11495", "s2c_nll_loss": "0.166", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "46870", "lr": "8.7539e-05", "gnorm": "4.248", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11900"} 2023-01-29 19:30:04 | INFO | train_inner | {"epoch": 22, "update": 21.69, "s2c_loss": "0.115", "loss": "0.07961", "s2c_nll_loss": "0.115", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "46880", "lr": "8.74723e-05", "gnorm": "2.744", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11902"} 2023-01-29 19:30:07 | INFO | train_inner | {"epoch": 22, "update": 21.695, "s2c_loss": "0.131", "loss": "0.09074", "s2c_nll_loss": "0.131", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "46890", "lr": "8.74056e-05", "gnorm": "3.327", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11905"} 2023-01-29 19:30:10 | INFO | train_inner | {"epoch": 22, "update": 21.699, "s2c_loss": "0.128", "loss": "0.08881", "s2c_nll_loss": "0.128", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "46900", "lr": "8.7339e-05", "gnorm": "3.014", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.5", "wall": "11907"} 2023-01-29 19:30:12 | INFO | train_inner | {"epoch": 22, "update": 21.704, "s2c_loss": "0.137", "loss": "0.09484", "s2c_nll_loss": "0.137", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "46910", "lr": 
"8.72723e-05", "gnorm": "2.673", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11910"} 2023-01-29 19:30:15 | INFO | train_inner | {"epoch": 22, "update": 21.709, "s2c_loss": "0.085", "loss": "0.05925", "s2c_nll_loss": "0.085", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "46920", "lr": "8.72056e-05", "gnorm": "2.299", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11913"} 2023-01-29 19:30:17 | INFO | train_inner | {"epoch": 22, "update": 21.713, "s2c_loss": "0.092", "loss": "0.06366", "s2c_nll_loss": "0.092", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "46930", "lr": "8.7139e-05", "gnorm": "3.619", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11915"} 2023-01-29 19:30:20 | INFO | train_inner | {"epoch": 22, "update": 21.718, "s2c_loss": "0.106", "loss": "0.07336", "s2c_nll_loss": "0.106", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "46940", "lr": "8.70723e-05", "gnorm": "2.694", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11918"} 2023-01-29 19:30:22 | INFO | train_inner | {"epoch": 22, "update": 21.722, "s2c_loss": "0.102", "loss": "0.07103", "s2c_nll_loss": "0.102", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "46950", "lr": "8.70057e-05", "gnorm": "3.153", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11920"} 2023-01-29 19:30:25 | INFO | train_inner | {"epoch": 22, "update": 21.727, "s2c_loss": "0.132", "loss": "0.09143", "s2c_nll_loss": "0.132", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", 
"num_updates": "46960", "lr": "8.6939e-05", "gnorm": "3.181", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11923"} 2023-01-29 19:30:27 | INFO | train_inner | {"epoch": 22, "update": 21.732, "s2c_loss": "0.111", "loss": "0.07721", "s2c_nll_loss": "0.111", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "46970", "lr": "8.68723e-05", "gnorm": "3.162", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11925"} 2023-01-29 19:30:30 | INFO | train_inner | {"epoch": 22, "update": 21.736, "s2c_loss": "0.073", "loss": "0.05047", "s2c_nll_loss": "0.073", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "46980", "lr": "8.68057e-05", "gnorm": "2.078", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11928"} 2023-01-29 19:30:32 | INFO | train_inner | {"epoch": 22, "update": 21.741, "s2c_loss": "0.075", "loss": "0.05233", "s2c_nll_loss": "0.075", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "46990", "lr": "8.6739e-05", "gnorm": "2.566", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11930"} 2023-01-29 19:30:35 | INFO | train_inner | {"epoch": 22, "update": 21.746, "s2c_loss": "0.066", "loss": "0.04597", "s2c_nll_loss": "0.066", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "47000", "lr": "8.66723e-05", "gnorm": "2.072", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11933"} 2023-01-29 19:30:37 | INFO | train_inner | {"epoch": 22, "update": 21.75, "s2c_loss": "0.131", "loss": "0.09061", "s2c_nll_loss": "0.131", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "256.3", "ups": "4", 
"wpb": "64", "bsz": "64", "num_updates": "47010", "lr": "8.66057e-05", "gnorm": "2.596", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11935"} 2023-01-29 19:30:40 | INFO | train_inner | {"epoch": 22, "update": 21.755, "s2c_loss": "0.103", "loss": "0.07106", "s2c_nll_loss": "0.103", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "47020", "lr": "8.6539e-05", "gnorm": "2.924", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11938"} 2023-01-29 19:30:42 | INFO | train_inner | {"epoch": 22, "update": 21.759, "s2c_loss": "0.184", "loss": "0.12749", "s2c_nll_loss": "0.184", "s2c_accuracy": "96.875", "s2c_total": "64", "s2c_n_correct": "62", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "47030", "lr": "8.64723e-05", "gnorm": "3.411", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11940"} 2023-01-29 19:30:45 | INFO | train_inner | {"epoch": 22, "update": 21.764, "s2c_loss": "0.078", "loss": "0.05425", "s2c_nll_loss": "0.078", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "47040", "lr": "8.64057e-05", "gnorm": "2.305", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11943"} 2023-01-29 19:30:47 | INFO | train_inner | {"epoch": 22, "update": 21.769, "s2c_loss": "0.085", "loss": "0.05892", "s2c_nll_loss": "0.085", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "47050", "lr": "8.6339e-05", "gnorm": "2.51", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11945"} 2023-01-29 19:30:50 | INFO | train_inner | {"epoch": 22, "update": 21.773, "s2c_loss": "0.082", "loss": "0.05706", "s2c_nll_loss": "0.082", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": 
"254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "47060", "lr": "8.62724e-05", "gnorm": "2.532", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11948"} 2023-01-29 19:30:52 | INFO | train_inner | {"epoch": 22, "update": 21.778, "s2c_loss": "0.055", "loss": "0.0384", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "47070", "lr": "8.62057e-05", "gnorm": "1.865", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11950"} 2023-01-29 19:30:55 | INFO | train_inner | {"epoch": 22, "update": 21.783, "s2c_loss": "0.067", "loss": "0.04654", "s2c_nll_loss": "0.067", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "47080", "lr": "8.6139e-05", "gnorm": "2.515", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11953"} 2023-01-29 19:30:57 | INFO | train_inner | {"epoch": 22, "update": 21.787, "s2c_loss": "0.26", "loss": "0.18039", "s2c_nll_loss": "0.26", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "47090", "lr": "8.60724e-05", "gnorm": "2.775", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11955"} 2023-01-29 19:31:00 | INFO | train_inner | {"epoch": 22, "update": 21.792, "s2c_loss": "0.087", "loss": "0.06015", "s2c_nll_loss": "0.087", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "47100", "lr": "8.60057e-05", "gnorm": "2.646", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11958"} 2023-01-29 19:31:02 | INFO | train_inner | {"epoch": 22, "update": 21.796, "s2c_loss": "0.095", "loss": "0.066", "s2c_nll_loss": "0.095", "s2c_accuracy": "98.906", "s2c_total": "64", 
"s2c_n_correct": "63.3", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "47110", "lr": "8.5939e-05", "gnorm": "2.511", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11960"} 2023-01-29 19:31:05 | INFO | train_inner | {"epoch": 22, "update": 21.801, "s2c_loss": "0.111", "loss": "0.07672", "s2c_nll_loss": "0.111", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "47120", "lr": "8.58724e-05", "gnorm": "2.947", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "11963"} 2023-01-29 19:31:08 | INFO | train_inner | {"epoch": 22, "update": 21.806, "s2c_loss": "0.069", "loss": "0.04798", "s2c_nll_loss": "0.069", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "47130", "lr": "8.58057e-05", "gnorm": "2.52", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11965"} 2023-01-29 19:31:10 | INFO | train_inner | {"epoch": 22, "update": 21.81, "s2c_loss": "0.074", "loss": "0.05099", "s2c_nll_loss": "0.074", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "47140", "lr": "8.5739e-05", "gnorm": "2.69", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11968"} 2023-01-29 19:31:13 | INFO | train_inner | {"epoch": 22, "update": 21.815, "s2c_loss": "0.101", "loss": "0.07002", "s2c_nll_loss": "0.101", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "47150", "lr": "8.56724e-05", "gnorm": "2.603", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11970"} 2023-01-29 19:31:15 | INFO | train_inner | {"epoch": 22, "update": 21.82, "s2c_loss": "0.131", "loss": "0.09107", "s2c_nll_loss": "0.131", "s2c_accuracy": 
"97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "47160", "lr": "8.56057e-05", "gnorm": "3.067", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11973"} 2023-01-29 19:31:18 | INFO | train_inner | {"epoch": 22, "update": 21.824, "s2c_loss": "0.13", "loss": "0.09045", "s2c_nll_loss": "0.13", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "246.3", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "47170", "lr": "8.55391e-05", "gnorm": "3.397", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11976"} 2023-01-29 19:31:20 | INFO | train_inner | {"epoch": 22, "update": 21.829, "s2c_loss": "0.084", "loss": "0.05812", "s2c_nll_loss": "0.084", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "245.7", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "47180", "lr": "8.54724e-05", "gnorm": "2.563", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "11978"} 2023-01-29 19:31:23 | INFO | train_inner | {"epoch": 22, "update": 21.833, "s2c_loss": "0.076", "loss": "0.053", "s2c_nll_loss": "0.076", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "47190", "lr": "8.54057e-05", "gnorm": "2.372", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11981"} 2023-01-29 19:31:26 | INFO | train_inner | {"epoch": 22, "update": 21.838, "s2c_loss": "0.102", "loss": "0.07053", "s2c_nll_loss": "0.102", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "47200", "lr": "8.53391e-05", "gnorm": "2.463", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "11983"} 2023-01-29 19:31:28 | INFO | train_inner | {"epoch": 22, "update": 21.843, "s2c_loss": "0.133", "loss": "0.0924", "s2c_nll_loss": 
"0.133", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "47210", "lr": "8.52724e-05", "gnorm": "3.112", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11986"} 2023-01-29 19:31:31 | INFO | train_inner | {"epoch": 22, "update": 21.847, "s2c_loss": "0.08", "loss": "0.05554", "s2c_nll_loss": "0.08", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "47220", "lr": "8.52057e-05", "gnorm": "2.516", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "11988"} 2023-01-29 19:31:33 | INFO | train_inner | {"epoch": 22, "update": 21.852, "s2c_loss": "0.084", "loss": "0.05843", "s2c_nll_loss": "0.084", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "258.8", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "47230", "lr": "8.51391e-05", "gnorm": "2.746", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11991"} 2023-01-29 19:31:36 | INFO | train_inner | {"epoch": 22, "update": 21.857, "s2c_loss": "0.139", "loss": "0.09648", "s2c_nll_loss": "0.139", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "47240", "lr": "8.50724e-05", "gnorm": "2.792", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "11994"} 2023-01-29 19:31:38 | INFO | train_inner | {"epoch": 22, "update": 21.861, "s2c_loss": "0.094", "loss": "0.06492", "s2c_nll_loss": "0.094", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "47250", "lr": "8.50058e-05", "gnorm": "3.152", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "11996"} 2023-01-29 19:31:41 | INFO | train_inner | {"epoch": 22, "update": 21.866, "s2c_loss": "0.137", "loss": 
"0.09512", "s2c_nll_loss": "0.137", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "249.9", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "47260", "lr": "8.49391e-05", "gnorm": "3.915", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "11999"} 2023-01-29 19:31:43 | INFO | train_inner | {"epoch": 22, "update": 21.87, "s2c_loss": "0.097", "loss": "0.06722", "s2c_nll_loss": "0.097", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "47270", "lr": "8.48724e-05", "gnorm": "3.233", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12001"} 2023-01-29 19:31:46 | INFO | train_inner | {"epoch": 22, "update": 21.875, "s2c_loss": "0.121", "loss": "0.08397", "s2c_nll_loss": "0.121", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "47280", "lr": "8.48058e-05", "gnorm": "5.029", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12004"} 2023-01-29 19:31:48 | INFO | train_inner | {"epoch": 22, "update": 21.88, "s2c_loss": "0.109", "loss": "0.07584", "s2c_nll_loss": "0.109", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "47290", "lr": "8.47391e-05", "gnorm": "3.184", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12006"} 2023-01-29 19:31:51 | INFO | train_inner | {"epoch": 22, "update": 21.884, "s2c_loss": "0.125", "loss": "0.08648", "s2c_nll_loss": "0.125", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "47300", "lr": "8.46724e-05", "gnorm": "4.08", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12009"} 2023-01-29 19:31:53 | INFO | train_inner | {"epoch": 22, "update": 21.889, 
"s2c_loss": "0.128", "loss": "0.08873", "s2c_nll_loss": "0.128", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "47310", "lr": "8.46058e-05", "gnorm": "3.988", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12011"} 2023-01-29 19:31:56 | INFO | train_inner | {"epoch": 22, "update": 21.894, "s2c_loss": "0.077", "loss": "0.05347", "s2c_nll_loss": "0.077", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "47320", "lr": "8.45391e-05", "gnorm": "2.573", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12014"} 2023-01-29 19:31:58 | INFO | train_inner | {"epoch": 22, "update": 21.898, "s2c_loss": "0.097", "loss": "0.06721", "s2c_nll_loss": "0.097", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "47330", "lr": "8.44724e-05", "gnorm": "2.882", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12016"} 2023-01-29 19:32:01 | INFO | train_inner | {"epoch": 22, "update": 21.903, "s2c_loss": "0.122", "loss": "0.08442", "s2c_nll_loss": "0.122", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "245.7", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "47340", "lr": "8.44058e-05", "gnorm": "2.685", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12019"} 2023-01-29 19:32:04 | INFO | train_inner | {"epoch": 22, "update": 21.907, "s2c_loss": "0.113", "loss": "0.07858", "s2c_nll_loss": "0.113", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "47350", "lr": "8.43391e-05", "gnorm": "3.083", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12021"} 2023-01-29 19:32:06 | INFO | train_inner | 
{"epoch": 22, "update": 21.912, "s2c_loss": "0.079", "loss": "0.0549", "s2c_nll_loss": "0.079", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "47360", "lr": "8.42725e-05", "gnorm": "2.653", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12024"} 2023-01-29 19:32:09 | INFO | train_inner | {"epoch": 22, "update": 21.917, "s2c_loss": "0.101", "loss": "0.07022", "s2c_nll_loss": "0.101", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "47370", "lr": "8.42058e-05", "gnorm": "2.531", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12027"} 2023-01-29 19:32:11 | INFO | train_inner | {"epoch": 22, "update": 21.921, "s2c_loss": "0.118", "loss": "0.08167", "s2c_nll_loss": "0.118", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "47380", "lr": "8.41391e-05", "gnorm": "3.055", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12029"} 2023-01-29 19:32:14 | INFO | train_inner | {"epoch": 22, "update": 21.926, "s2c_loss": "0.094", "loss": "0.06516", "s2c_nll_loss": "0.094", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "47390", "lr": "8.40725e-05", "gnorm": "2.343", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12032"} 2023-01-29 19:32:16 | INFO | train_inner | {"epoch": 22, "update": 21.931, "s2c_loss": "0.11", "loss": "0.07636", "s2c_nll_loss": "0.11", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "47400", "lr": "8.40058e-05", "gnorm": "3.072", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12034"} 2023-01-29 19:32:19 | 
INFO | train_inner | {"epoch": 22, "update": 21.935, "s2c_loss": "0.075", "loss": "0.05195", "s2c_nll_loss": "0.075", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "47410", "lr": "8.39391e-05", "gnorm": "2.673", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12037"} 2023-01-29 19:32:21 | INFO | train_inner | {"epoch": 22, "update": 21.94, "s2c_loss": "0.077", "loss": "0.0537", "s2c_nll_loss": "0.077", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "47420", "lr": "8.38725e-05", "gnorm": "2.671", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12039"} 2023-01-29 19:32:24 | INFO | train_inner | {"epoch": 22, "update": 21.944, "s2c_loss": "0.091", "loss": "0.06294", "s2c_nll_loss": "0.091", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "47430", "lr": "8.38058e-05", "gnorm": "2.994", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12042"} 2023-01-29 19:32:26 | INFO | train_inner | {"epoch": 22, "update": 21.949, "s2c_loss": "0.099", "loss": "0.0683", "s2c_nll_loss": "0.099", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "47440", "lr": "8.37391e-05", "gnorm": "2.685", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12044"} 2023-01-29 19:32:29 | INFO | train_inner | {"epoch": 22, "update": 21.954, "s2c_loss": "0.101", "loss": "0.06984", "s2c_nll_loss": "0.101", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "47450", "lr": "8.36725e-05", "gnorm": "2.771", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12047"} 
2023-01-29 19:32:31 | INFO | train_inner | {"epoch": 22, "update": 21.958, "s2c_loss": "0.139", "loss": "0.09666", "s2c_nll_loss": "0.139", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "47460", "lr": "8.36058e-05", "gnorm": "2.989", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "12049"} 2023-01-29 19:32:34 | INFO | train_inner | {"epoch": 22, "update": 21.963, "s2c_loss": "0.074", "loss": "0.05133", "s2c_nll_loss": "0.074", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "47470", "lr": "8.35392e-05", "gnorm": "2.351", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12052"} 2023-01-29 19:32:36 | INFO | train_inner | {"epoch": 22, "update": 21.968, "s2c_loss": "0.117", "loss": "0.08093", "s2c_nll_loss": "0.117", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "47480", "lr": "8.34725e-05", "gnorm": "3.03", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12054"} 2023-01-29 19:32:39 | INFO | train_inner | {"epoch": 22, "update": 21.972, "s2c_loss": "0.12", "loss": "0.08307", "s2c_nll_loss": "0.12", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "47490", "lr": "8.34058e-05", "gnorm": "3.175", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12057"} 2023-01-29 19:32:42 | INFO | train_inner | {"epoch": 22, "update": 21.977, "s2c_loss": "0.172", "loss": "0.11936", "s2c_nll_loss": "0.172", "s2c_accuracy": "97.188", "s2c_total": "64", "s2c_n_correct": "62.2", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "47500", "lr": "8.33392e-05", "gnorm": "3.912", "loss_scale": "2048", "train_wall": "3", "gb_free": 
"7.4", "wall": "12059"} 2023-01-29 19:32:44 | INFO | train_inner | {"epoch": 22, "update": 21.981, "s2c_loss": "0.104", "loss": "0.07239", "s2c_nll_loss": "0.104", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "47510", "lr": "8.32725e-05", "gnorm": "3.56", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12062"} 2023-01-29 19:32:47 | INFO | train_inner | {"epoch": 22, "update": 21.986, "s2c_loss": "0.159", "loss": "0.11021", "s2c_nll_loss": "0.159", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "47520", "lr": "8.32058e-05", "gnorm": "2.78", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12065"} 2023-01-29 19:32:49 | INFO | train_inner | {"epoch": 22, "update": 21.991, "s2c_loss": "0.151", "loss": "0.10487", "s2c_nll_loss": "0.151", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "47530", "lr": "8.31392e-05", "gnorm": "3.109", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12067"} 2023-01-29 19:32:52 | INFO | train_inner | {"epoch": 22, "update": 21.995, "s2c_loss": "0.127", "loss": "0.08781", "s2c_nll_loss": "0.127", "s2c_accuracy": "97.656", "s2c_total": "64", "s2c_n_correct": "62.5", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "47540", "lr": "8.30725e-05", "gnorm": "3.469", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12070"} 2023-01-29 19:32:54 | INFO | train_inner | {"epoch": 22, "update": 22.0, "s2c_loss": "0.072", "loss": "0.05014", "s2c_nll_loss": "0.072", "s2c_accuracy": "99.178", "s2c_total": "60.8", "s2c_n_correct": "60.3", "wps": "249.5", "ups": "4.1", "wpb": "60.8", "bsz": "60.8", "num_updates": "47550", "lr": "8.30059e-05", "gnorm": "2.277", "loss_scale": "2048", 
"train_wall": "2", "gb_free": "7.4", "wall": "12072"} 2023-01-29 19:32:54 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 19:33:09 | INFO | valid | {"epoch": 22, "valid_s2c_loss": "0.593", "valid_loss": "0.41123", "valid_s2c_nll_loss": "0.593", "valid_s2c_accuracy": "89.845", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "28.713", "valid_num_updates": "47550", "valid_best_s2c_accuracy": "90.019"} 2023-01-29 19:33:09 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 22 @ 47550 updates 2023-01-29 19:33:09 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 19:33:16 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt 2023-01-29 19:33:16 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_last.pt (epoch 22 @ 47550 updates, score 89.845) (writing took 7.029357954859734 seconds) 2023-01-29 19:33:16 | INFO | fairseq_cli.train | end of epoch 22 (average epoch stats below) 2023-01-29 19:33:16 | INFO | train | {"epoch": 22, "train_s2c_loss": "0.103", "train_loss": "0.07153", "train_s2c_nll_loss": "0.103", "train_s2c_accuracy": "98.365", "train_s2c_total": "63.9838", "train_s2c_n_correct": "62.9375", "train_wps": "240.6", "train_ups": "3.76", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "47550", "train_lr": "8.30059e-05", "train_gnorm": "2.752", "train_loss_scale": "2048", "train_train_wall": "539", "train_gb_free": "7.4", "train_wall": "12094"} 2023-01-29 19:33:22 | INFO | fairseq.trainer | begin training epoch 23 2023-01-29 19:33:22 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 19:33:25 | INFO | train_inner | {"epoch": 23, "update": 22.005, "s2c_loss": "0.108", "loss": "0.07477", "s2c_nll_loss": "0.108", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", 
"wps": "20.9", "ups": "0.33", "wpb": "64", "bsz": "64", "num_updates": "47560", "lr": "8.29392e-05", "gnorm": "3.297", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12103"} 2023-01-29 19:33:27 | INFO | train_inner | {"epoch": 23, "update": 22.009, "s2c_loss": "0.099", "loss": "0.06876", "s2c_nll_loss": "0.099", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "246", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "47570", "lr": "8.28725e-05", "gnorm": "2.672", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12105"} 2023-01-29 19:33:30 | INFO | train_inner | {"epoch": 23, "update": 22.014, "s2c_loss": "0.061", "loss": "0.04207", "s2c_nll_loss": "0.061", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "47580", "lr": "8.28059e-05", "gnorm": "2.024", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12108"} 2023-01-29 19:33:32 | INFO | train_inner | {"epoch": 23, "update": 22.019, "s2c_loss": "0.047", "loss": "0.03283", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "47590", "lr": "8.27392e-05", "gnorm": "1.795", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12110"} 2023-01-29 19:33:35 | INFO | train_inner | {"epoch": 23, "update": 22.023, "s2c_loss": "0.073", "loss": "0.05077", "s2c_nll_loss": "0.073", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "47600", "lr": "8.26725e-05", "gnorm": "2.766", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12113"} 2023-01-29 19:33:37 | INFO | train_inner | {"epoch": 23, "update": 22.028, "s2c_loss": "0.064", "loss": "0.04433", "s2c_nll_loss": "0.064", "s2c_accuracy": "98.75", "s2c_total": "64", 
"s2c_n_correct": "63.2", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "47610", "lr": "8.26059e-05", "gnorm": "2.046", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12115"} 2023-01-29 19:33:40 | INFO | train_inner | {"epoch": 23, "update": 22.032, "s2c_loss": "0.088", "loss": "0.06113", "s2c_nll_loss": "0.088", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "47620", "lr": "8.25392e-05", "gnorm": "2.216", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12118"} 2023-01-29 19:33:42 | INFO | train_inner | {"epoch": 23, "update": 22.037, "s2c_loss": "0.087", "loss": "0.06056", "s2c_nll_loss": "0.087", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "261.2", "ups": "4.08", "wpb": "64", "bsz": "64", "num_updates": "47630", "lr": "8.24725e-05", "gnorm": "2.422", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12120"} 2023-01-29 19:33:45 | INFO | train_inner | {"epoch": 23, "update": 22.042, "s2c_loss": "0.069", "loss": "0.04772", "s2c_nll_loss": "0.069", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "47640", "lr": "8.24059e-05", "gnorm": "2.083", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12123"} 2023-01-29 19:33:47 | INFO | train_inner | {"epoch": 23, "update": 22.046, "s2c_loss": "0.076", "loss": "0.05235", "s2c_nll_loss": "0.076", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "47650", "lr": "8.23392e-05", "gnorm": "1.853", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12125"} 2023-01-29 19:33:50 | INFO | train_inner | {"epoch": 23, "update": 22.051, "s2c_loss": "0.2", "loss": "0.13833", "s2c_nll_loss": "0.2", "s2c_accuracy": 
"96.719", "s2c_total": "64", "s2c_n_correct": "61.9", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "47660", "lr": "8.22726e-05", "gnorm": "3.935", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12128"} 2023-01-29 19:33:52 | INFO | train_inner | {"epoch": 23, "update": 22.056, "s2c_loss": "0.11", "loss": "0.07601", "s2c_nll_loss": "0.11", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "47670", "lr": "8.22059e-05", "gnorm": "2.965", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12130"} 2023-01-29 19:33:55 | INFO | train_inner | {"epoch": 23, "update": 22.06, "s2c_loss": "0.057", "loss": "0.03932", "s2c_nll_loss": "0.057", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "47680", "lr": "8.21392e-05", "gnorm": "1.933", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12133"} 2023-01-29 19:33:57 | INFO | train_inner | {"epoch": 23, "update": 22.065, "s2c_loss": "0.056", "loss": "0.03887", "s2c_nll_loss": "0.056", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "47690", "lr": "8.20726e-05", "gnorm": "2.144", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12135"} 2023-01-29 19:34:00 | INFO | train_inner | {"epoch": 23, "update": 22.069, "s2c_loss": "0.052", "loss": "0.03631", "s2c_nll_loss": "0.052", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "47700", "lr": "8.20059e-05", "gnorm": "2.07", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12138"} 2023-01-29 19:34:02 | INFO | train_inner | {"epoch": 23, "update": 22.074, "s2c_loss": "0.047", "loss": "0.03254", "s2c_nll_loss": 
"0.047", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "260.7", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "47710", "lr": "8.19392e-05", "gnorm": "2.609", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12140"} 2023-01-29 19:34:05 | INFO | train_inner | {"epoch": 23, "update": 22.079, "s2c_loss": "0.074", "loss": "0.05102", "s2c_nll_loss": "0.074", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "258.4", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "47720", "lr": "8.18726e-05", "gnorm": "2.215", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12143"} 2023-01-29 19:34:07 | INFO | train_inner | {"epoch": 23, "update": 22.083, "s2c_loss": "0.053", "loss": "0.03642", "s2c_nll_loss": "0.053", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "47730", "lr": "8.18059e-05", "gnorm": "2.193", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12145"} 2023-01-29 19:34:10 | INFO | train_inner | {"epoch": 23, "update": 22.088, "s2c_loss": "0.052", "loss": "0.03595", "s2c_nll_loss": "0.052", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "47740", "lr": "8.17392e-05", "gnorm": "1.969", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12148"} 2023-01-29 19:34:12 | INFO | train_inner | {"epoch": 23, "update": 22.093, "s2c_loss": "0.029", "loss": "0.02029", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "47750", "lr": "8.16726e-05", "gnorm": "1.168", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12150"} 2023-01-29 19:34:15 | INFO | train_inner | {"epoch": 23, "update": 22.097, "s2c_loss": "0.065", 
"loss": "0.04482", "s2c_nll_loss": "0.065", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "47760", "lr": "8.16059e-05", "gnorm": "2.14", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12153"} 2023-01-29 19:34:17 | INFO | train_inner | {"epoch": 23, "update": 22.102, "s2c_loss": "0.076", "loss": "0.0529", "s2c_nll_loss": "0.076", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "47770", "lr": "8.15393e-05", "gnorm": "2.569", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12155"} 2023-01-29 19:34:20 | INFO | train_inner | {"epoch": 23, "update": 22.106, "s2c_loss": "0.058", "loss": "0.03994", "s2c_nll_loss": "0.058", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "47780", "lr": "8.14726e-05", "gnorm": "2.182", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12158"} 2023-01-29 19:34:23 | INFO | train_inner | {"epoch": 23, "update": 22.111, "s2c_loss": "0.063", "loss": "0.04385", "s2c_nll_loss": "0.063", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "47790", "lr": "8.14059e-05", "gnorm": "2.38", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12160"} 2023-01-29 19:34:25 | INFO | train_inner | {"epoch": 23, "update": 22.116, "s2c_loss": "0.09", "loss": "0.06267", "s2c_nll_loss": "0.09", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "47800", "lr": "8.13393e-05", "gnorm": "2.261", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12163"} 2023-01-29 19:34:28 | INFO | train_inner | {"epoch": 23, "update": 22.12, 
"s2c_loss": "0.067", "loss": "0.04665", "s2c_nll_loss": "0.067", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "47810", "lr": "8.12726e-05", "gnorm": "2.488", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12166"} 2023-01-29 19:34:30 | INFO | train_inner | {"epoch": 23, "update": 22.125, "s2c_loss": "0.109", "loss": "0.07547", "s2c_nll_loss": "0.109", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "47820", "lr": "8.12059e-05", "gnorm": "2.775", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12168"} 2023-01-29 19:34:33 | INFO | train_inner | {"epoch": 23, "update": 22.13, "s2c_loss": "0.064", "loss": "0.04441", "s2c_nll_loss": "0.064", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "47830", "lr": "8.11393e-05", "gnorm": "2.356", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12171"} 2023-01-29 19:34:35 | INFO | train_inner | {"epoch": 23, "update": 22.134, "s2c_loss": "0.057", "loss": "0.03961", "s2c_nll_loss": "0.057", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "47840", "lr": "8.10726e-05", "gnorm": "2.108", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12173"} 2023-01-29 19:34:38 | INFO | train_inner | {"epoch": 23, "update": 22.139, "s2c_loss": "0.059", "loss": "0.04063", "s2c_nll_loss": "0.059", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "47850", "lr": "8.10059e-05", "gnorm": "1.793", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "12176"} 2023-01-29 19:34:40 | INFO | train_inner | 
{"epoch": 23, "update": 22.143, "s2c_loss": "0.106", "loss": "0.07365", "s2c_nll_loss": "0.106", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "47860", "lr": "8.09393e-05", "gnorm": "2.758", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "12178"} 2023-01-29 19:34:43 | INFO | train_inner | {"epoch": 23, "update": 22.148, "s2c_loss": "0.104", "loss": "0.07203", "s2c_nll_loss": "0.104", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "47870", "lr": "8.08726e-05", "gnorm": "2.444", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "12181"} 2023-01-29 19:34:45 | INFO | train_inner | {"epoch": 23, "update": 22.153, "s2c_loss": "0.06", "loss": "0.04169", "s2c_nll_loss": "0.06", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "47880", "lr": "8.0806e-05", "gnorm": "2.313", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "12183"} 2023-01-29 19:34:48 | INFO | train_inner | {"epoch": 23, "update": 22.157, "s2c_loss": "0.058", "loss": "0.04029", "s2c_nll_loss": "0.058", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "47890", "lr": "8.07393e-05", "gnorm": "2.065", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "12186"} 2023-01-29 19:34:50 | INFO | train_inner | {"epoch": 23, "update": 22.162, "s2c_loss": "0.093", "loss": "0.06433", "s2c_nll_loss": "0.093", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "47900", "lr": "8.06726e-05", "gnorm": "2.823", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "12188"} 2023-01-29 19:34:53 | 
INFO | train_inner | {"epoch": 23, "update": 22.167, "s2c_loss": "0.076", "loss": "0.05286", "s2c_nll_loss": "0.076", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "47910", "lr": "8.0606e-05", "gnorm": "2.341", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "12191"} 2023-01-29 19:34:55 | INFO | train_inner | {"epoch": 23, "update": 22.171, "s2c_loss": "0.07", "loss": "0.04874", "s2c_nll_loss": "0.07", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "47920", "lr": "8.05393e-05", "gnorm": "2.757", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "12193"} 2023-01-29 19:34:58 | INFO | train_inner | {"epoch": 23, "update": 22.176, "s2c_loss": "0.086", "loss": "0.05964", "s2c_nll_loss": "0.086", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "47930", "lr": "8.04726e-05", "gnorm": "2.827", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "12196"} 2023-01-29 19:35:00 | INFO | train_inner | {"epoch": 23, "update": 22.18, "s2c_loss": "0.08", "loss": "0.05563", "s2c_nll_loss": "0.08", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "47940", "lr": "8.0406e-05", "gnorm": "2.472", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "12198"} 2023-01-29 19:35:03 | INFO | train_inner | {"epoch": 23, "update": 22.185, "s2c_loss": "0.073", "loss": "0.05048", "s2c_nll_loss": "0.073", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "47950", "lr": "8.03393e-05", "gnorm": "2.61", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "12201"} 
2023-01-29 19:35:05 | INFO | train_inner | {"epoch": 23, "update": 22.19, "s2c_loss": "0.076", "loss": "0.05244", "s2c_nll_loss": "0.076", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "47960", "lr": "8.02727e-05", "gnorm": "2.289", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "12203"} 2023-01-29 19:35:08 | INFO | train_inner | {"epoch": 23, "update": 22.194, "s2c_loss": "0.041", "loss": "0.02826", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "47970", "lr": "8.0206e-05", "gnorm": "1.336", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "12206"} 2023-01-29 19:35:10 | INFO | train_inner | {"epoch": 23, "update": 22.199, "s2c_loss": "0.042", "loss": "0.02929", "s2c_nll_loss": "0.042", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "47980", "lr": "8.01393e-05", "gnorm": "1.786", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "12208"} 2023-01-29 19:35:13 | INFO | train_inner | {"epoch": 23, "update": 22.204, "s2c_loss": "0.077", "loss": "0.05341", "s2c_nll_loss": "0.077", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "47990", "lr": "8.00727e-05", "gnorm": "2.515", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "12211"} 2023-01-29 19:35:15 | INFO | train_inner | {"epoch": 23, "update": 22.208, "s2c_loss": "0.123", "loss": "0.08539", "s2c_nll_loss": "0.123", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "48000", "lr": "8.0006e-05", "gnorm": "2.8", "loss_scale": "4096", "train_wall": "2", "gb_free": 
"7.2", "wall": "12213"} 2023-01-29 19:35:18 | INFO | train_inner | {"epoch": 23, "update": 22.213, "s2c_loss": "0.068", "loss": "0.04729", "s2c_nll_loss": "0.068", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "48010", "lr": "7.99393e-05", "gnorm": "2.305", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "12216"} 2023-01-29 19:35:18 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 2048.0 2023-01-29 19:35:21 | INFO | train_inner | {"epoch": 23, "update": 22.218, "s2c_loss": "0.065", "loss": "0.04475", "s2c_nll_loss": "0.065", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "231.5", "ups": "3.62", "wpb": "64", "bsz": "64", "num_updates": "48020", "lr": "7.98727e-05", "gnorm": "2.398", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12219"} 2023-01-29 19:35:23 | INFO | train_inner | {"epoch": 23, "update": 22.222, "s2c_loss": "0.066", "loss": "0.04572", "s2c_nll_loss": "0.066", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "48030", "lr": "7.9806e-05", "gnorm": "2.29", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12221"} 2023-01-29 19:35:26 | INFO | train_inner | {"epoch": 23, "update": 22.227, "s2c_loss": "0.063", "loss": "0.04398", "s2c_nll_loss": "0.063", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "48040", "lr": "7.97393e-05", "gnorm": "2.297", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12224"} 2023-01-29 19:35:28 | INFO | train_inner | {"epoch": 23, "update": 22.232, "s2c_loss": "0.104", "loss": "0.07218", "s2c_nll_loss": "0.104", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": 
"256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "48050", "lr": "7.96727e-05", "gnorm": "2.397", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12226"} 2023-01-29 19:35:31 | INFO | train_inner | {"epoch": 23, "update": 22.236, "s2c_loss": "0.058", "loss": "0.04032", "s2c_nll_loss": "0.058", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "48060", "lr": "7.9606e-05", "gnorm": "2.46", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12229"} 2023-01-29 19:35:33 | INFO | train_inner | {"epoch": 23, "update": 22.241, "s2c_loss": "0.042", "loss": "0.02892", "s2c_nll_loss": "0.042", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "48070", "lr": "7.95394e-05", "gnorm": "1.802", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12231"} 2023-01-29 19:35:36 | INFO | train_inner | {"epoch": 23, "update": 22.246, "s2c_loss": "0.037", "loss": "0.0256", "s2c_nll_loss": "0.037", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "48080", "lr": "7.94727e-05", "gnorm": "1.519", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12234"} 2023-01-29 19:35:38 | INFO | train_inner | {"epoch": 23, "update": 22.25, "s2c_loss": "0.055", "loss": "0.03844", "s2c_nll_loss": "0.055", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "48090", "lr": "7.9406e-05", "gnorm": "2.077", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12236"} 2023-01-29 19:35:41 | INFO | train_inner | {"epoch": 23, "update": 22.255, "s2c_loss": "0.081", "loss": "0.05588", "s2c_nll_loss": "0.081", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": 
"62.9", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "48100", "lr": "7.93394e-05", "gnorm": "2.442", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12239"} 2023-01-29 19:35:44 | INFO | train_inner | {"epoch": 23, "update": 22.259, "s2c_loss": "0.06", "loss": "0.04146", "s2c_nll_loss": "0.06", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "48110", "lr": "7.92727e-05", "gnorm": "2.207", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12241"} 2023-01-29 19:35:46 | INFO | train_inner | {"epoch": 23, "update": 22.264, "s2c_loss": "0.123", "loss": "0.08511", "s2c_nll_loss": "0.123", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "48120", "lr": "7.9206e-05", "gnorm": "2.732", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12244"} 2023-01-29 19:35:49 | INFO | train_inner | {"epoch": 23, "update": 22.269, "s2c_loss": "0.08", "loss": "0.05525", "s2c_nll_loss": "0.08", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "48130", "lr": "7.91394e-05", "gnorm": "2.521", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12247"} 2023-01-29 19:35:51 | INFO | train_inner | {"epoch": 23, "update": 22.273, "s2c_loss": "0.079", "loss": "0.05469", "s2c_nll_loss": "0.079", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "258.7", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "48140", "lr": "7.90727e-05", "gnorm": "2.324", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12249"} 2023-01-29 19:35:54 | INFO | train_inner | {"epoch": 23, "update": 22.278, "s2c_loss": "0.051", "loss": "0.03524", "s2c_nll_loss": "0.051", "s2c_accuracy": "99.219", 
"s2c_total": "64", "s2c_n_correct": "63.5", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "48150", "lr": "7.9006e-05", "gnorm": "1.897", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12252"} 2023-01-29 19:35:56 | INFO | train_inner | {"epoch": 23, "update": 22.283, "s2c_loss": "0.062", "loss": "0.04298", "s2c_nll_loss": "0.062", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "48160", "lr": "7.89394e-05", "gnorm": "1.895", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12254"} 2023-01-29 19:35:59 | INFO | train_inner | {"epoch": 23, "update": 22.287, "s2c_loss": "0.074", "loss": "0.05159", "s2c_nll_loss": "0.074", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "48170", "lr": "7.88727e-05", "gnorm": "2.435", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12257"} 2023-01-29 19:36:01 | INFO | train_inner | {"epoch": 23, "update": 22.292, "s2c_loss": "0.073", "loss": "0.0504", "s2c_nll_loss": "0.073", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "259.5", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "48180", "lr": "7.88061e-05", "gnorm": "2.69", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12259"} 2023-01-29 19:36:04 | INFO | train_inner | {"epoch": 23, "update": 22.296, "s2c_loss": "0.046", "loss": "0.03199", "s2c_nll_loss": "0.046", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "48190", "lr": "7.87394e-05", "gnorm": "1.954", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12261"} 2023-01-29 19:36:06 | INFO | train_inner | {"epoch": 23, "update": 22.301, "s2c_loss": "0.052", "loss": "0.03626", "s2c_nll_loss": "0.052", 
"s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "48200", "lr": "7.86727e-05", "gnorm": "1.716", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12264"} 2023-01-29 19:36:09 | INFO | train_inner | {"epoch": 23, "update": 22.306, "s2c_loss": "0.089", "loss": "0.06182", "s2c_nll_loss": "0.089", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "258.8", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "48210", "lr": "7.86061e-05", "gnorm": "2.511", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12266"} 2023-01-29 19:36:11 | INFO | train_inner | {"epoch": 23, "update": 22.31, "s2c_loss": "0.06", "loss": "0.04161", "s2c_nll_loss": "0.06", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "259.4", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "48220", "lr": "7.85394e-05", "gnorm": "2.502", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12269"} 2023-01-29 19:36:13 | INFO | train_inner | {"epoch": 23, "update": 22.315, "s2c_loss": "0.093", "loss": "0.06429", "s2c_nll_loss": "0.093", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "48230", "lr": "7.84727e-05", "gnorm": "2.499", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12271"} 2023-01-29 19:36:16 | INFO | train_inner | {"epoch": 23, "update": 22.32, "s2c_loss": "0.074", "loss": "0.05125", "s2c_nll_loss": "0.074", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "48240", "lr": "7.84061e-05", "gnorm": "2.432", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12274"} 2023-01-29 19:36:18 | INFO | train_inner | {"epoch": 23, "update": 22.324, "s2c_loss": "0.065", "loss": "0.0454", 
"s2c_nll_loss": "0.065", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "48250", "lr": "7.83394e-05", "gnorm": "2.178", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12276"} 2023-01-29 19:36:21 | INFO | train_inner | {"epoch": 23, "update": 22.329, "s2c_loss": "0.05", "loss": "0.03436", "s2c_nll_loss": "0.05", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "257.8", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "48260", "lr": "7.82728e-05", "gnorm": "2.123", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12279"} 2023-01-29 19:36:23 | INFO | train_inner | {"epoch": 23, "update": 22.333, "s2c_loss": "0.067", "loss": "0.04621", "s2c_nll_loss": "0.067", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "48270", "lr": "7.82061e-05", "gnorm": "2.162", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12281"} 2023-01-29 19:36:26 | INFO | train_inner | {"epoch": 23, "update": 22.338, "s2c_loss": "0.065", "loss": "0.04497", "s2c_nll_loss": "0.065", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "48280", "lr": "7.81394e-05", "gnorm": "2.128", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12284"} 2023-01-29 19:36:29 | INFO | train_inner | {"epoch": 23, "update": 22.343, "s2c_loss": "0.052", "loss": "0.03614", "s2c_nll_loss": "0.052", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "48290", "lr": "7.80728e-05", "gnorm": "2.038", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12286"} 2023-01-29 19:36:31 | INFO | train_inner | {"epoch": 23, "update": 22.347, "s2c_loss": 
"0.065", "loss": "0.04518", "s2c_nll_loss": "0.065", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "48300", "lr": "7.80061e-05", "gnorm": "2.102", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12289"} 2023-01-29 19:36:34 | INFO | train_inner | {"epoch": 23, "update": 22.352, "s2c_loss": "0.061", "loss": "0.04204", "s2c_nll_loss": "0.061", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "48310", "lr": "7.79394e-05", "gnorm": "2.362", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12292"} 2023-01-29 19:36:36 | INFO | train_inner | {"epoch": 23, "update": 22.357, "s2c_loss": "0.105", "loss": "0.07287", "s2c_nll_loss": "0.105", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "48320", "lr": "7.78728e-05", "gnorm": "2.328", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12294"} 2023-01-29 19:36:39 | INFO | train_inner | {"epoch": 23, "update": 22.361, "s2c_loss": "0.044", "loss": "0.03052", "s2c_nll_loss": "0.044", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "48330", "lr": "7.78061e-05", "gnorm": "1.604", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12297"} 2023-01-29 19:36:41 | INFO | train_inner | {"epoch": 23, "update": 22.366, "s2c_loss": "0.07", "loss": "0.04841", "s2c_nll_loss": "0.07", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "259.7", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "48340", "lr": "7.77394e-05", "gnorm": "2.231", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12299"} 2023-01-29 19:36:44 | INFO | train_inner | {"epoch": 23, 
"update": 22.37, "s2c_loss": "0.07", "loss": "0.04871", "s2c_nll_loss": "0.07", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "48350", "lr": "7.76728e-05", "gnorm": "2.168", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12302"} 2023-01-29 19:36:46 | INFO | train_inner | {"epoch": 23, "update": 22.375, "s2c_loss": "0.055", "loss": "0.03842", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "48360", "lr": "7.76061e-05", "gnorm": "1.934", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12304"} 2023-01-29 19:36:49 | INFO | train_inner | {"epoch": 23, "update": 22.38, "s2c_loss": "0.081", "loss": "0.05633", "s2c_nll_loss": "0.081", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "48370", "lr": "7.75395e-05", "gnorm": "2.402", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12307"} 2023-01-29 19:36:51 | INFO | train_inner | {"epoch": 23, "update": 22.384, "s2c_loss": "0.089", "loss": "0.06141", "s2c_nll_loss": "0.089", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "48380", "lr": "7.74728e-05", "gnorm": "2.582", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12309"} 2023-01-29 19:36:54 | INFO | train_inner | {"epoch": 23, "update": 22.389, "s2c_loss": "0.093", "loss": "0.06436", "s2c_nll_loss": "0.093", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "48390", "lr": "7.74061e-05", "gnorm": "2.75", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12312"} 2023-01-29 19:36:56 | INFO | 
train_inner | {"epoch": 23, "update": 22.394, "s2c_loss": "0.058", "loss": "0.04047", "s2c_nll_loss": "0.058", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "48400", "lr": "7.73395e-05", "gnorm": "2.051", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12314"} 2023-01-29 19:36:59 | INFO | train_inner | {"epoch": 23, "update": 22.398, "s2c_loss": "0.068", "loss": "0.04739", "s2c_nll_loss": "0.068", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "247", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "48410", "lr": "7.72728e-05", "gnorm": "2.104", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "12317"} 2023-01-29 19:37:01 | INFO | train_inner | {"epoch": 23, "update": 22.403, "s2c_loss": "0.056", "loss": "0.03911", "s2c_nll_loss": "0.056", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "48420", "lr": "7.72061e-05", "gnorm": "1.499", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12319"} 2023-01-29 19:37:04 | INFO | train_inner | {"epoch": 23, "update": 22.407, "s2c_loss": "0.111", "loss": "0.07708", "s2c_nll_loss": "0.111", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "48430", "lr": "7.71395e-05", "gnorm": "2.509", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12322"} 2023-01-29 19:37:06 | INFO | train_inner | {"epoch": 23, "update": 22.412, "s2c_loss": "0.084", "loss": "0.05848", "s2c_nll_loss": "0.084", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "48440", "lr": "7.70728e-05", "gnorm": "2.819", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12324"} 
2023-01-29 19:37:09 | INFO | train_inner | {"epoch": 23, "update": 22.417, "s2c_loss": "0.064", "loss": "0.0446", "s2c_nll_loss": "0.064", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "48450", "lr": "7.70061e-05", "gnorm": "1.966", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "12327"} 2023-01-29 19:37:11 | INFO | train_inner | {"epoch": 23, "update": 22.421, "s2c_loss": "0.082", "loss": "0.05681", "s2c_nll_loss": "0.082", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "48460", "lr": "7.69395e-05", "gnorm": "2.546", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12329"} 2023-01-29 19:37:14 | INFO | train_inner | {"epoch": 23, "update": 22.426, "s2c_loss": "0.089", "loss": "0.06174", "s2c_nll_loss": "0.089", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "48470", "lr": "7.68728e-05", "gnorm": "2.657", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12332"} 2023-01-29 19:37:17 | INFO | train_inner | {"epoch": 23, "update": 22.431, "s2c_loss": "0.106", "loss": "0.07344", "s2c_nll_loss": "0.106", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "48480", "lr": "7.68062e-05", "gnorm": "3.169", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12334"} 2023-01-29 19:37:19 | INFO | train_inner | {"epoch": 23, "update": 22.435, "s2c_loss": "0.075", "loss": "0.05225", "s2c_nll_loss": "0.075", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "48490", "lr": "7.67395e-05", "gnorm": "2.119", "loss_scale": "2048", "train_wall": "2", 
"gb_free": "7.2", "wall": "12337"} 2023-01-29 19:37:22 | INFO | train_inner | {"epoch": 23, "update": 22.44, "s2c_loss": "0.086", "loss": "0.05956", "s2c_nll_loss": "0.086", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "48500", "lr": "7.66728e-05", "gnorm": "2.558", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12340"} 2023-01-29 19:37:24 | INFO | train_inner | {"epoch": 23, "update": 22.444, "s2c_loss": "0.085", "loss": "0.0592", "s2c_nll_loss": "0.085", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "48510", "lr": "7.66062e-05", "gnorm": "2.421", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12342"} 2023-01-29 19:37:27 | INFO | train_inner | {"epoch": 23, "update": 22.449, "s2c_loss": "0.098", "loss": "0.06812", "s2c_nll_loss": "0.098", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "48520", "lr": "7.65395e-05", "gnorm": "2.268", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12345"} 2023-01-29 19:37:29 | INFO | train_inner | {"epoch": 23, "update": 22.454, "s2c_loss": "0.072", "loss": "0.04965", "s2c_nll_loss": "0.072", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "48530", "lr": "7.64728e-05", "gnorm": "2.28", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12347"} 2023-01-29 19:37:32 | INFO | train_inner | {"epoch": 23, "update": 22.458, "s2c_loss": "0.07", "loss": "0.04872", "s2c_nll_loss": "0.07", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "48540", "lr": "7.64062e-05", "gnorm": "2.149", "loss_scale": 
"2048", "train_wall": "3", "gb_free": "7.3", "wall": "12350"} 2023-01-29 19:37:34 | INFO | train_inner | {"epoch": 23, "update": 22.463, "s2c_loss": "0.052", "loss": "0.0359", "s2c_nll_loss": "0.052", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "48550", "lr": "7.63395e-05", "gnorm": "1.849", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12352"} 2023-01-29 19:37:37 | INFO | train_inner | {"epoch": 23, "update": 22.468, "s2c_loss": "0.069", "loss": "0.04765", "s2c_nll_loss": "0.069", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "259.8", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "48560", "lr": "7.62729e-05", "gnorm": "2.004", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12355"} 2023-01-29 19:37:39 | INFO | train_inner | {"epoch": 23, "update": 22.472, "s2c_loss": "0.056", "loss": "0.03888", "s2c_nll_loss": "0.056", "s2c_accuracy": "99.372", "s2c_total": "63.7", "s2c_n_correct": "63.3", "wps": "251.8", "ups": "3.95", "wpb": "63.7", "bsz": "63.7", "num_updates": "48570", "lr": "7.62062e-05", "gnorm": "1.967", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12357"} 2023-01-29 19:37:42 | INFO | train_inner | {"epoch": 23, "update": 22.477, "s2c_loss": "0.067", "loss": "0.04663", "s2c_nll_loss": "0.067", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "48580", "lr": "7.61395e-05", "gnorm": "2.287", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12360"} 2023-01-29 19:37:44 | INFO | train_inner | {"epoch": 23, "update": 22.481, "s2c_loss": "0.084", "loss": "0.05831", "s2c_nll_loss": "0.084", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "48590", "lr": 
"7.60729e-05", "gnorm": "2.696", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12362"} 2023-01-29 19:37:47 | INFO | train_inner | {"epoch": 23, "update": 22.486, "s2c_loss": "0.057", "loss": "0.03962", "s2c_nll_loss": "0.057", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "261.1", "ups": "4.08", "wpb": "64", "bsz": "64", "num_updates": "48600", "lr": "7.60062e-05", "gnorm": "2.058", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12365"} 2023-01-29 19:37:49 | INFO | train_inner | {"epoch": 23, "update": 22.491, "s2c_loss": "0.077", "loss": "0.05317", "s2c_nll_loss": "0.077", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "48610", "lr": "7.59395e-05", "gnorm": "2.814", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12367"} 2023-01-29 19:37:52 | INFO | train_inner | {"epoch": 23, "update": 22.495, "s2c_loss": "0.07", "loss": "0.04821", "s2c_nll_loss": "0.07", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "48620", "lr": "7.58729e-05", "gnorm": "2.096", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12370"} 2023-01-29 19:37:54 | INFO | train_inner | {"epoch": 23, "update": 22.5, "s2c_loss": "0.05", "loss": "0.03495", "s2c_nll_loss": "0.05", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "48630", "lr": "7.58062e-05", "gnorm": "1.806", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12372"} 2023-01-29 19:37:57 | INFO | train_inner | {"epoch": 23, "update": 22.505, "s2c_loss": "0.058", "loss": "0.04012", "s2c_nll_loss": "0.058", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", 
"num_updates": "48640", "lr": "7.57395e-05", "gnorm": "1.733", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12375"} 2023-01-29 19:37:59 | INFO | train_inner | {"epoch": 23, "update": 22.509, "s2c_loss": "0.067", "loss": "0.04652", "s2c_nll_loss": "0.067", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "48650", "lr": "7.56729e-05", "gnorm": "1.74", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12377"} 2023-01-29 19:38:02 | INFO | train_inner | {"epoch": 23, "update": 22.514, "s2c_loss": "0.046", "loss": "0.03186", "s2c_nll_loss": "0.046", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "48660", "lr": "7.56062e-05", "gnorm": "2.251", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12380"} 2023-01-29 19:38:04 | INFO | train_inner | {"epoch": 23, "update": 22.519, "s2c_loss": "0.072", "loss": "0.04999", "s2c_nll_loss": "0.072", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "48670", "lr": "7.55396e-05", "gnorm": "1.604", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12382"} 2023-01-29 19:38:07 | INFO | train_inner | {"epoch": 23, "update": 22.523, "s2c_loss": "0.091", "loss": "0.06322", "s2c_nll_loss": "0.091", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "48680", "lr": "7.54729e-05", "gnorm": "2.333", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12385"} 2023-01-29 19:38:09 | INFO | train_inner | {"epoch": 23, "update": 22.528, "s2c_loss": "0.047", "loss": "0.03234", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "255", "ups": "3.98", 
"wpb": "64", "bsz": "64", "num_updates": "48690", "lr": "7.54062e-05", "gnorm": "1.445", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12387"} 2023-01-29 19:38:12 | INFO | train_inner | {"epoch": 23, "update": 22.532, "s2c_loss": "0.304", "loss": "0.21046", "s2c_nll_loss": "0.304", "s2c_accuracy": "96.25", "s2c_total": "64", "s2c_n_correct": "61.6", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "48700", "lr": "7.53396e-05", "gnorm": "2.767", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12390"} 2023-01-29 19:38:14 | INFO | train_inner | {"epoch": 23, "update": 22.537, "s2c_loss": "0.075", "loss": "0.05214", "s2c_nll_loss": "0.075", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "48710", "lr": "7.52729e-05", "gnorm": "2.215", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12392"} 2023-01-29 19:38:17 | INFO | train_inner | {"epoch": 23, "update": 22.542, "s2c_loss": "0.075", "loss": "0.05198", "s2c_nll_loss": "0.075", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "48720", "lr": "7.52062e-05", "gnorm": "2.064", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12395"} 2023-01-29 19:38:19 | INFO | train_inner | {"epoch": 23, "update": 22.546, "s2c_loss": "0.055", "loss": "0.03813", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "48730", "lr": "7.51396e-05", "gnorm": "1.911", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12397"} 2023-01-29 19:38:22 | INFO | train_inner | {"epoch": 23, "update": 22.551, "s2c_loss": "0.074", "loss": "0.05114", "s2c_nll_loss": "0.074", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", 
"wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "48740", "lr": "7.50729e-05", "gnorm": "2.277", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12400"} 2023-01-29 19:38:24 | INFO | train_inner | {"epoch": 23, "update": 22.556, "s2c_loss": "0.096", "loss": "0.06656", "s2c_nll_loss": "0.096", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "48750", "lr": "7.50062e-05", "gnorm": "2.659", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12402"} 2023-01-29 19:38:27 | INFO | train_inner | {"epoch": 23, "update": 22.56, "s2c_loss": "0.077", "loss": "0.05353", "s2c_nll_loss": "0.077", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "48760", "lr": "7.49396e-05", "gnorm": "2.37", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12405"} 2023-01-29 19:38:29 | INFO | train_inner | {"epoch": 23, "update": 22.565, "s2c_loss": "0.073", "loss": "0.05046", "s2c_nll_loss": "0.073", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "257.6", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "48770", "lr": "7.48729e-05", "gnorm": "2.27", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12407"} 2023-01-29 19:38:32 | INFO | train_inner | {"epoch": 23, "update": 22.569, "s2c_loss": "0.072", "loss": "0.05022", "s2c_nll_loss": "0.072", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "48780", "lr": "7.48063e-05", "gnorm": "2.755", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12410"} 2023-01-29 19:38:34 | INFO | train_inner | {"epoch": 23, "update": 22.574, "s2c_loss": "0.056", "loss": "0.039", "s2c_nll_loss": "0.056", "s2c_accuracy": "99.219", "s2c_total": "64", 
"s2c_n_correct": "63.5", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "48790", "lr": "7.47396e-05", "gnorm": "1.594", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12412"} 2023-01-29 19:38:37 | INFO | train_inner | {"epoch": 23, "update": 22.579, "s2c_loss": "0.103", "loss": "0.07152", "s2c_nll_loss": "0.103", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "48800", "lr": "7.46729e-05", "gnorm": "2.864", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12415"} 2023-01-29 19:38:39 | INFO | train_inner | {"epoch": 23, "update": 22.583, "s2c_loss": "0.06", "loss": "0.04126", "s2c_nll_loss": "0.06", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "48810", "lr": "7.46063e-05", "gnorm": "2.146", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12417"} 2023-01-29 19:38:42 | INFO | train_inner | {"epoch": 23, "update": 22.588, "s2c_loss": "0.073", "loss": "0.05065", "s2c_nll_loss": "0.073", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "259.3", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "48820", "lr": "7.45396e-05", "gnorm": "3.112", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12420"} 2023-01-29 19:38:44 | INFO | train_inner | {"epoch": 23, "update": 22.593, "s2c_loss": "0.086", "loss": "0.05981", "s2c_nll_loss": "0.086", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "48830", "lr": "7.44729e-05", "gnorm": "2.43", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12422"} 2023-01-29 19:38:47 | INFO | train_inner | {"epoch": 23, "update": 22.597, "s2c_loss": "0.084", "loss": "0.05853", "s2c_nll_loss": "0.084", "s2c_accuracy": "98.906", 
"s2c_total": "64", "s2c_n_correct": "63.3", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "48840", "lr": "7.44063e-05", "gnorm": "2.204", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12425"} 2023-01-29 19:38:49 | INFO | train_inner | {"epoch": 23, "update": 22.602, "s2c_loss": "0.051", "loss": "0.03551", "s2c_nll_loss": "0.051", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "260.1", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "48850", "lr": "7.43396e-05", "gnorm": "2.22", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12427"} 2023-01-29 19:38:52 | INFO | train_inner | {"epoch": 23, "update": 22.606, "s2c_loss": "0.085", "loss": "0.05916", "s2c_nll_loss": "0.085", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "258.6", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "48860", "lr": "7.4273e-05", "gnorm": "2.835", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12430"} 2023-01-29 19:38:54 | INFO | train_inner | {"epoch": 23, "update": 22.611, "s2c_loss": "0.07", "loss": "0.04871", "s2c_nll_loss": "0.07", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "48870", "lr": "7.42063e-05", "gnorm": "2.001", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12432"} 2023-01-29 19:38:57 | INFO | train_inner | {"epoch": 23, "update": 22.616, "s2c_loss": "0.05", "loss": "0.03491", "s2c_nll_loss": "0.05", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "48880", "lr": "7.41396e-05", "gnorm": "1.939", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12435"} 2023-01-29 19:38:59 | INFO | train_inner | {"epoch": 23, "update": 22.62, "s2c_loss": "0.083", "loss": "0.05772", "s2c_nll_loss": "0.083", 
"s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "48890", "lr": "7.4073e-05", "gnorm": "1.694", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12437"} 2023-01-29 19:39:02 | INFO | train_inner | {"epoch": 23, "update": 22.625, "s2c_loss": "0.086", "loss": "0.05943", "s2c_nll_loss": "0.086", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "48900", "lr": "7.40063e-05", "gnorm": "2.007", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12440"} 2023-01-29 19:39:04 | INFO | train_inner | {"epoch": 23, "update": 22.63, "s2c_loss": "0.054", "loss": "0.03717", "s2c_nll_loss": "0.054", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "48910", "lr": "7.39396e-05", "gnorm": "2.067", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "12442"} 2023-01-29 19:39:07 | INFO | train_inner | {"epoch": 23, "update": 22.634, "s2c_loss": "0.098", "loss": "0.0682", "s2c_nll_loss": "0.098", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "48920", "lr": "7.3873e-05", "gnorm": "2.083", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12445"} 2023-01-29 19:39:10 | INFO | train_inner | {"epoch": 23, "update": 22.639, "s2c_loss": "0.06", "loss": "0.04149", "s2c_nll_loss": "0.06", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "48930", "lr": "7.38063e-05", "gnorm": "1.9", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "12448"} 2023-01-29 19:39:12 | INFO | train_inner | {"epoch": 23, "update": 22.643, "s2c_loss": "0.104", "loss": "0.0723", 
"s2c_nll_loss": "0.104", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "48940", "lr": "7.37396e-05", "gnorm": "2.073", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12450"} 2023-01-29 19:39:15 | INFO | train_inner | {"epoch": 23, "update": 22.648, "s2c_loss": "0.036", "loss": "0.02476", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "48950", "lr": "7.3673e-05", "gnorm": "1.762", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12453"} 2023-01-29 19:39:17 | INFO | train_inner | {"epoch": 23, "update": 22.653, "s2c_loss": "0.056", "loss": "0.03852", "s2c_nll_loss": "0.056", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "48960", "lr": "7.36063e-05", "gnorm": "2.373", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12455"} 2023-01-29 19:39:20 | INFO | train_inner | {"epoch": 23, "update": 22.657, "s2c_loss": "0.065", "loss": "0.0452", "s2c_nll_loss": "0.065", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "48970", "lr": "7.35397e-05", "gnorm": "2.161", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12458"} 2023-01-29 19:39:22 | INFO | train_inner | {"epoch": 23, "update": 22.662, "s2c_loss": "0.052", "loss": "0.03614", "s2c_nll_loss": "0.052", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "48980", "lr": "7.3473e-05", "gnorm": "1.822", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12460"} 2023-01-29 19:39:25 | INFO | train_inner | {"epoch": 23, "update": 22.667, "s2c_loss": 
"0.063", "loss": "0.04363", "s2c_nll_loss": "0.063", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "48990", "lr": "7.34063e-05", "gnorm": "2.035", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12463"} 2023-01-29 19:39:27 | INFO | train_inner | {"epoch": 23, "update": 22.671, "s2c_loss": "0.055", "loss": "0.03798", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "49000", "lr": "7.33397e-05", "gnorm": "1.808", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12465"} 2023-01-29 19:39:30 | INFO | train_inner | {"epoch": 23, "update": 22.676, "s2c_loss": "0.037", "loss": "0.02534", "s2c_nll_loss": "0.037", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "49010", "lr": "7.3273e-05", "gnorm": "1.3", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "12468"} 2023-01-29 19:39:32 | INFO | train_inner | {"epoch": 23, "update": 22.68, "s2c_loss": "0.053", "loss": "0.03687", "s2c_nll_loss": "0.053", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "49020", "lr": "7.32063e-05", "gnorm": "1.686", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12470"} 2023-01-29 19:39:35 | INFO | train_inner | {"epoch": 23, "update": 22.685, "s2c_loss": "0.099", "loss": "0.06865", "s2c_nll_loss": "0.099", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "49030", "lr": "7.31397e-05", "gnorm": "2.611", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12473"} 2023-01-29 19:39:37 | INFO | train_inner | {"epoch": 23, 
"update": 22.69, "s2c_loss": "0.095", "loss": "0.06558", "s2c_nll_loss": "0.095", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "49040", "lr": "7.3073e-05", "gnorm": "2.526", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12475"} 2023-01-29 19:39:40 | INFO | train_inner | {"epoch": 23, "update": 22.694, "s2c_loss": "0.07", "loss": "0.04834", "s2c_nll_loss": "0.07", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "49050", "lr": "7.30063e-05", "gnorm": "2.492", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12478"} 2023-01-29 19:39:42 | INFO | train_inner | {"epoch": 23, "update": 22.699, "s2c_loss": "0.069", "loss": "0.0479", "s2c_nll_loss": "0.069", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "49060", "lr": "7.29397e-05", "gnorm": "2.441", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12480"} 2023-01-29 19:39:45 | INFO | train_inner | {"epoch": 23, "update": 22.704, "s2c_loss": "0.073", "loss": "0.05034", "s2c_nll_loss": "0.073", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "49070", "lr": "7.2873e-05", "gnorm": "2.114", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12483"} 2023-01-29 19:39:48 | INFO | train_inner | {"epoch": 23, "update": 22.708, "s2c_loss": "0.098", "loss": "0.06786", "s2c_nll_loss": "0.098", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "49080", "lr": "7.28064e-05", "gnorm": "2.962", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12485"} 2023-01-29 19:39:50 | INFO | 
train_inner | {"epoch": 23, "update": 22.713, "s2c_loss": "0.053", "loss": "0.03698", "s2c_nll_loss": "0.053", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "49090", "lr": "7.27397e-05", "gnorm": "2.618", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12488"} 2023-01-29 19:39:53 | INFO | train_inner | {"epoch": 23, "update": 22.717, "s2c_loss": "0.054", "loss": "0.03713", "s2c_nll_loss": "0.054", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "49100", "lr": "7.2673e-05", "gnorm": "2.265", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12491"} 2023-01-29 19:39:55 | INFO | train_inner | {"epoch": 23, "update": 22.722, "s2c_loss": "0.082", "loss": "0.05695", "s2c_nll_loss": "0.082", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "49110", "lr": "7.26064e-05", "gnorm": "3.024", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12493"} 2023-01-29 19:39:58 | INFO | train_inner | {"epoch": 23, "update": 22.727, "s2c_loss": "0.104", "loss": "0.07177", "s2c_nll_loss": "0.104", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "49120", "lr": "7.25397e-05", "gnorm": "3.29", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12496"} 2023-01-29 19:40:00 | INFO | train_inner | {"epoch": 23, "update": 22.731, "s2c_loss": "0.095", "loss": "0.06597", "s2c_nll_loss": "0.095", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "243.8", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "49130", "lr": "7.2473e-05", "gnorm": "2.873", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12498"} 
2023-01-29 19:40:03 | INFO | train_inner | {"epoch": 23, "update": 22.736, "s2c_loss": "0.068", "loss": "0.04684", "s2c_nll_loss": "0.068", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "49140", "lr": "7.24064e-05", "gnorm": "2.088", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12501"} 2023-01-29 19:40:05 | INFO | train_inner | {"epoch": 23, "update": 22.741, "s2c_loss": "0.073", "loss": "0.05055", "s2c_nll_loss": "0.073", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "49150", "lr": "7.23397e-05", "gnorm": "2.33", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12503"} 2023-01-29 19:40:08 | INFO | train_inner | {"epoch": 23, "update": 22.745, "s2c_loss": "0.072", "loss": "0.04991", "s2c_nll_loss": "0.072", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "256.3", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "49160", "lr": "7.22731e-05", "gnorm": "2.289", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12506"} 2023-01-29 19:40:10 | INFO | train_inner | {"epoch": 23, "update": 22.75, "s2c_loss": "0.099", "loss": "0.06844", "s2c_nll_loss": "0.099", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "49170", "lr": "7.22064e-05", "gnorm": "2.263", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12508"} 2023-01-29 19:40:13 | INFO | train_inner | {"epoch": 23, "update": 22.754, "s2c_loss": "0.044", "loss": "0.03044", "s2c_nll_loss": "0.044", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "257.8", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "49180", "lr": "7.21397e-05", "gnorm": "1.467", "loss_scale": "2048", "train_wall": "2", 
"gb_free": "7.3", "wall": "12511"} 2023-01-29 19:40:15 | INFO | train_inner | {"epoch": 23, "update": 22.759, "s2c_loss": "0.119", "loss": "0.0822", "s2c_nll_loss": "0.119", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "49190", "lr": "7.20731e-05", "gnorm": "2.988", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12513"} 2023-01-29 19:40:18 | INFO | train_inner | {"epoch": 23, "update": 22.764, "s2c_loss": "0.065", "loss": "0.04484", "s2c_nll_loss": "0.065", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "49200", "lr": "7.20064e-05", "gnorm": "1.995", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12516"} 2023-01-29 19:40:20 | INFO | train_inner | {"epoch": 23, "update": 22.768, "s2c_loss": "0.063", "loss": "0.04401", "s2c_nll_loss": "0.063", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "49210", "lr": "7.19397e-05", "gnorm": "1.787", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12518"} 2023-01-29 19:40:23 | INFO | train_inner | {"epoch": 23, "update": 22.773, "s2c_loss": "0.04", "loss": "0.02785", "s2c_nll_loss": "0.04", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "49220", "lr": "7.18731e-05", "gnorm": "1.642", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12521"} 2023-01-29 19:40:25 | INFO | train_inner | {"epoch": 23, "update": 22.778, "s2c_loss": "0.099", "loss": "0.06896", "s2c_nll_loss": "0.099", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "49230", "lr": "7.18064e-05", "gnorm": "1.883", "loss_scale": "2048", 
"train_wall": "2", "gb_free": "7.2", "wall": "12523"} 2023-01-29 19:40:28 | INFO | train_inner | {"epoch": 23, "update": 22.782, "s2c_loss": "0.079", "loss": "0.05485", "s2c_nll_loss": "0.079", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "49240", "lr": "7.17397e-05", "gnorm": "1.847", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12526"} 2023-01-29 19:40:31 | INFO | train_inner | {"epoch": 23, "update": 22.787, "s2c_loss": "0.026", "loss": "0.01782", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "49250", "lr": "7.16731e-05", "gnorm": "1.382", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12528"} 2023-01-29 19:40:33 | INFO | train_inner | {"epoch": 23, "update": 22.791, "s2c_loss": "0.04", "loss": "0.0276", "s2c_nll_loss": "0.04", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "49260", "lr": "7.16064e-05", "gnorm": "1.715", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12531"} 2023-01-29 19:40:35 | INFO | train_inner | {"epoch": 23, "update": 22.796, "s2c_loss": "0.091", "loss": "0.06338", "s2c_nll_loss": "0.091", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "49270", "lr": "7.15398e-05", "gnorm": "2.204", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12533"} 2023-01-29 19:40:38 | INFO | train_inner | {"epoch": 23, "update": 22.801, "s2c_loss": "0.08", "loss": "0.05528", "s2c_nll_loss": "0.08", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "49280", "lr": "7.14731e-05", "gnorm": "2.273", 
"loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12536"} 2023-01-29 19:40:40 | INFO | train_inner | {"epoch": 23, "update": 22.805, "s2c_loss": "0.076", "loss": "0.05256", "s2c_nll_loss": "0.076", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "49290", "lr": "7.14064e-05", "gnorm": "2.18", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12538"} 2023-01-29 19:40:43 | INFO | train_inner | {"epoch": 23, "update": 22.81, "s2c_loss": "0.058", "loss": "0.04035", "s2c_nll_loss": "0.058", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "49300", "lr": "7.13398e-05", "gnorm": "2.11", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12541"} 2023-01-29 19:40:46 | INFO | train_inner | {"epoch": 23, "update": 22.815, "s2c_loss": "0.065", "loss": "0.0448", "s2c_nll_loss": "0.065", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "49310", "lr": "7.12731e-05", "gnorm": "2.461", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12543"} 2023-01-29 19:40:48 | INFO | train_inner | {"epoch": 23, "update": 22.819, "s2c_loss": "0.052", "loss": "0.0358", "s2c_nll_loss": "0.052", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "49320", "lr": "7.12064e-05", "gnorm": "2.566", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12546"} 2023-01-29 19:40:51 | INFO | train_inner | {"epoch": 23, "update": 22.824, "s2c_loss": "0.065", "loss": "0.04532", "s2c_nll_loss": "0.065", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "49330", "lr": 
"7.11398e-05", "gnorm": "2.128", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12549"} 2023-01-29 19:40:53 | INFO | train_inner | {"epoch": 23, "update": 22.828, "s2c_loss": "0.055", "loss": "0.03824", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "49340", "lr": "7.10731e-05", "gnorm": "2.06", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12551"} 2023-01-29 19:40:56 | INFO | train_inner | {"epoch": 23, "update": 22.833, "s2c_loss": "0.046", "loss": "0.03174", "s2c_nll_loss": "0.046", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "49350", "lr": "7.10064e-05", "gnorm": "1.907", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12554"} 2023-01-29 19:40:58 | INFO | train_inner | {"epoch": 23, "update": 22.838, "s2c_loss": "0.054", "loss": "0.03718", "s2c_nll_loss": "0.054", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "49360", "lr": "7.09398e-05", "gnorm": "2.905", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12556"} 2023-01-29 19:41:01 | INFO | train_inner | {"epoch": 23, "update": 22.842, "s2c_loss": "0.072", "loss": "0.05005", "s2c_nll_loss": "0.072", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "49370", "lr": "7.08731e-05", "gnorm": "2.242", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12559"} 2023-01-29 19:41:03 | INFO | train_inner | {"epoch": 23, "update": 22.847, "s2c_loss": "0.075", "loss": "0.05205", "s2c_nll_loss": "0.075", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", 
"num_updates": "49380", "lr": "7.08065e-05", "gnorm": "2.446", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12561"} 2023-01-29 19:41:06 | INFO | train_inner | {"epoch": 23, "update": 22.852, "s2c_loss": "0.054", "loss": "0.03754", "s2c_nll_loss": "0.054", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "49390", "lr": "7.07398e-05", "gnorm": "1.924", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12564"} 2023-01-29 19:41:08 | INFO | train_inner | {"epoch": 23, "update": 22.856, "s2c_loss": "0.063", "loss": "0.04367", "s2c_nll_loss": "0.063", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "49400", "lr": "7.06731e-05", "gnorm": "2.67", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12566"} 2023-01-29 19:41:11 | INFO | train_inner | {"epoch": 23, "update": 22.861, "s2c_loss": "0.072", "loss": "0.04992", "s2c_nll_loss": "0.072", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "258.7", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "49410", "lr": "7.06065e-05", "gnorm": "2.326", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12569"} 2023-01-29 19:41:13 | INFO | train_inner | {"epoch": 23, "update": 22.865, "s2c_loss": "0.062", "loss": "0.04305", "s2c_nll_loss": "0.062", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "251.8", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "49420", "lr": "7.05398e-05", "gnorm": "2.076", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12571"} 2023-01-29 19:41:16 | INFO | train_inner | {"epoch": 23, "update": 22.87, "s2c_loss": "0.071", "loss": "0.049", "s2c_nll_loss": "0.071", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "252", "ups": "3.94", 
"wpb": "64", "bsz": "64", "num_updates": "49430", "lr": "7.04731e-05", "gnorm": "2.16", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12574"} 2023-01-29 19:41:18 | INFO | train_inner | {"epoch": 23, "update": 22.875, "s2c_loss": "0.052", "loss": "0.03572", "s2c_nll_loss": "0.052", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "49440", "lr": "7.04065e-05", "gnorm": "1.983", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12576"} 2023-01-29 19:41:21 | INFO | train_inner | {"epoch": 23, "update": 22.879, "s2c_loss": "0.105", "loss": "0.07261", "s2c_nll_loss": "0.105", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "49450", "lr": "7.03398e-05", "gnorm": "2.669", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12579"} 2023-01-29 19:41:23 | INFO | train_inner | {"epoch": 23, "update": 22.884, "s2c_loss": "0.059", "loss": "0.04067", "s2c_nll_loss": "0.059", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "49460", "lr": "7.02732e-05", "gnorm": "2.122", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12581"} 2023-01-29 19:41:26 | INFO | train_inner | {"epoch": 23, "update": 22.889, "s2c_loss": "0.108", "loss": "0.07493", "s2c_nll_loss": "0.108", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "49470", "lr": "7.02065e-05", "gnorm": "2.856", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12584"} 2023-01-29 19:41:28 | INFO | train_inner | {"epoch": 23, "update": 22.893, "s2c_loss": "0.052", "loss": "0.03627", "s2c_nll_loss": "0.052", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": 
"253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "49480", "lr": "7.01398e-05", "gnorm": "1.964", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12586"} 2023-01-29 19:41:31 | INFO | train_inner | {"epoch": 23, "update": 22.898, "s2c_loss": "0.063", "loss": "0.04341", "s2c_nll_loss": "0.063", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "49490", "lr": "7.00732e-05", "gnorm": "2.112", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12589"} 2023-01-29 19:41:34 | INFO | train_inner | {"epoch": 23, "update": 22.902, "s2c_loss": "0.064", "loss": "0.04438", "s2c_nll_loss": "0.064", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "49500", "lr": "7.00065e-05", "gnorm": "2.053", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12591"} 2023-01-29 19:41:36 | INFO | train_inner | {"epoch": 23, "update": 22.907, "s2c_loss": "0.083", "loss": "0.05779", "s2c_nll_loss": "0.083", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "245.7", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "49510", "lr": "6.99398e-05", "gnorm": "2.722", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12594"} 2023-01-29 19:41:39 | INFO | train_inner | {"epoch": 23, "update": 22.912, "s2c_loss": "0.101", "loss": "0.06969", "s2c_nll_loss": "0.101", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "257.8", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "49520", "lr": "6.98732e-05", "gnorm": "2.209", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12597"} 2023-01-29 19:41:41 | INFO | train_inner | {"epoch": 23, "update": 22.916, "s2c_loss": "0.06", "loss": "0.04135", "s2c_nll_loss": "0.06", "s2c_accuracy": "99.219", "s2c_total": "64", 
"s2c_n_correct": "63.5", "wps": "260.9", "ups": "4.08", "wpb": "64", "bsz": "64", "num_updates": "49530", "lr": "6.98065e-05", "gnorm": "2.509", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12599"} 2023-01-29 19:41:44 | INFO | train_inner | {"epoch": 23, "update": 22.921, "s2c_loss": "0.07", "loss": "0.04827", "s2c_nll_loss": "0.07", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "49540", "lr": "6.97398e-05", "gnorm": "2.49", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12602"} 2023-01-29 19:41:46 | INFO | train_inner | {"epoch": 23, "update": 22.926, "s2c_loss": "0.058", "loss": "0.0399", "s2c_nll_loss": "0.058", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "49550", "lr": "6.96732e-05", "gnorm": "2.147", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12604"} 2023-01-29 19:41:49 | INFO | train_inner | {"epoch": 23, "update": 22.93, "s2c_loss": "0.167", "loss": "0.11565", "s2c_nll_loss": "0.167", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "258.9", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "49560", "lr": "6.96065e-05", "gnorm": "3.306", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12607"} 2023-01-29 19:41:51 | INFO | train_inner | {"epoch": 23, "update": 22.935, "s2c_loss": "0.059", "loss": "0.04056", "s2c_nll_loss": "0.059", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "261.2", "ups": "4.08", "wpb": "64", "bsz": "64", "num_updates": "49570", "lr": "6.95399e-05", "gnorm": "1.835", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12609"} 2023-01-29 19:41:54 | INFO | train_inner | {"epoch": 23, "update": 22.939, "s2c_loss": "0.087", "loss": "0.06007", "s2c_nll_loss": "0.087", "s2c_accuracy": 
"98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "49580", "lr": "6.94732e-05", "gnorm": "2.58", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12612"} 2023-01-29 19:41:56 | INFO | train_inner | {"epoch": 23, "update": 22.944, "s2c_loss": "0.097", "loss": "0.06689", "s2c_nll_loss": "0.097", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "49590", "lr": "6.94065e-05", "gnorm": "2.753", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12614"} 2023-01-29 19:41:59 | INFO | train_inner | {"epoch": 23, "update": 22.949, "s2c_loss": "0.081", "loss": "0.05591", "s2c_nll_loss": "0.081", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "49600", "lr": "6.93399e-05", "gnorm": "2.611", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12617"} 2023-01-29 19:42:01 | INFO | train_inner | {"epoch": 23, "update": 22.953, "s2c_loss": "0.048", "loss": "0.03361", "s2c_nll_loss": "0.048", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "260.2", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "49610", "lr": "6.92732e-05", "gnorm": "1.692", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12619"} 2023-01-29 19:42:04 | INFO | train_inner | {"epoch": 23, "update": 22.958, "s2c_loss": "0.076", "loss": "0.0526", "s2c_nll_loss": "0.076", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "49620", "lr": "6.92065e-05", "gnorm": "2.256", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12622"} 2023-01-29 19:42:06 | INFO | train_inner | {"epoch": 23, "update": 22.963, "s2c_loss": "0.078", "loss": "0.05382", "s2c_nll_loss": 
"0.078", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "49630", "lr": "6.91399e-05", "gnorm": "2.005", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12624"} 2023-01-29 19:42:09 | INFO | train_inner | {"epoch": 23, "update": 22.967, "s2c_loss": "0.08", "loss": "0.0557", "s2c_nll_loss": "0.08", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "49640", "lr": "6.90732e-05", "gnorm": "2.733", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12627"} 2023-01-29 19:42:11 | INFO | train_inner | {"epoch": 23, "update": 22.972, "s2c_loss": "0.097", "loss": "0.06748", "s2c_nll_loss": "0.097", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "49650", "lr": "6.90065e-05", "gnorm": "2.423", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12629"} 2023-01-29 19:42:14 | INFO | train_inner | {"epoch": 23, "update": 22.976, "s2c_loss": "0.053", "loss": "0.03698", "s2c_nll_loss": "0.053", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "49660", "lr": "6.89399e-05", "gnorm": "1.631", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12632"} 2023-01-29 19:42:16 | INFO | train_inner | {"epoch": 23, "update": 22.981, "s2c_loss": "0.068", "loss": "0.04719", "s2c_nll_loss": "0.068", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "49670", "lr": "6.88732e-05", "gnorm": "2.044", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12634"} 2023-01-29 19:42:19 | INFO | train_inner | {"epoch": 23, "update": 22.986, "s2c_loss": "0.075", "loss": 
"0.0519", "s2c_nll_loss": "0.075", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "258.5", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "49680", "lr": "6.88066e-05", "gnorm": "2", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12637"} 2023-01-29 19:42:21 | INFO | train_inner | {"epoch": 23, "update": 22.99, "s2c_loss": "0.331", "loss": "0.22943", "s2c_nll_loss": "0.331", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "259.3", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "49690", "lr": "6.87399e-05", "gnorm": "2.214", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12639"} 2023-01-29 19:42:24 | INFO | train_inner | {"epoch": 23, "update": 22.995, "s2c_loss": "0.112", "loss": "0.07771", "s2c_nll_loss": "0.112", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "49700", "lr": "6.86732e-05", "gnorm": "2.75", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12642"} 2023-01-29 19:42:26 | INFO | train_inner | {"epoch": 23, "update": 23.0, "s2c_loss": "0.149", "loss": "0.10358", "s2c_nll_loss": "0.149", "s2c_accuracy": "97.031", "s2c_total": "64", "s2c_n_correct": "62.1", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "49710", "lr": "6.86066e-05", "gnorm": "4.028", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12644"} 2023-01-29 19:42:26 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 19:42:41 | INFO | valid | {"epoch": 23, "valid_s2c_loss": "0.474", "valid_loss": "0.32819", "valid_s2c_nll_loss": "0.474", "valid_s2c_accuracy": "91.525", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "29.25", "valid_num_updates": "49711", "valid_best_s2c_accuracy": "91.525"} 2023-01-29 19:42:41 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 23 @ 49711 
updates 2023-01-29 19:42:41 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 19:42:48 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 19:42:53 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt (epoch 23 @ 49711 updates, score 91.525) (writing took 11.531933432910591 seconds) 2023-01-29 19:42:53 | INFO | fairseq_cli.train | end of epoch 23 (average epoch stats below) 2023-01-29 19:42:53 | INFO | train | {"epoch": 23, "train_s2c_loss": "0.075", "train_loss": "0.05189", "train_s2c_nll_loss": "0.075", "train_s2c_accuracy": "98.826", "train_s2c_total": "63.9838", "train_s2c_n_correct": "63.2328", "train_wps": "239.7", "train_ups": "3.75", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "49711", "train_lr": "6.85999e-05", "train_gnorm": "2.27", "train_loss_scale": "2048", "train_train_wall": "537", "train_gb_free": "7.4", "train_wall": "12670"} 2023-01-29 19:42:59 | INFO | fairseq.trainer | begin training epoch 24 2023-01-29 19:42:59 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 19:43:01 | INFO | train_inner | {"epoch": 24, "update": 23.004, "s2c_loss": "0.073", "loss": "0.05056", "s2c_nll_loss": "0.073", "s2c_accuracy": "99.178", "s2c_total": "60.8", "s2c_n_correct": "60.3", "wps": "17.3", "ups": "0.29", "wpb": "60.8", "bsz": "60.8", "num_updates": "49720", "lr": "6.85399e-05", "gnorm": "2.275", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12679"} 2023-01-29 19:43:04 | INFO | train_inner | {"epoch": 24, "update": 23.009, "s2c_loss": "0.1", "loss": "0.06949", "s2c_nll_loss": "0.1", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "49730", "lr": "6.84732e-05", "gnorm": "2.091", "loss_scale": 
"2048", "train_wall": "2", "gb_free": "7.2", "wall": "12682"} 2023-01-29 19:43:06 | INFO | train_inner | {"epoch": 24, "update": 23.013, "s2c_loss": "0.049", "loss": "0.03394", "s2c_nll_loss": "0.049", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "49740", "lr": "6.84066e-05", "gnorm": "2.062", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12684"} 2023-01-29 19:43:09 | INFO | train_inner | {"epoch": 24, "update": 23.018, "s2c_loss": "0.039", "loss": "0.0272", "s2c_nll_loss": "0.039", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "49750", "lr": "6.83399e-05", "gnorm": "1.443", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12687"} 2023-01-29 19:43:11 | INFO | train_inner | {"epoch": 24, "update": 23.023, "s2c_loss": "0.069", "loss": "0.04756", "s2c_nll_loss": "0.069", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "49760", "lr": "6.82733e-05", "gnorm": "2.104", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12689"} 2023-01-29 19:43:14 | INFO | train_inner | {"epoch": 24, "update": 23.027, "s2c_loss": "0.05", "loss": "0.03436", "s2c_nll_loss": "0.05", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "49770", "lr": "6.82066e-05", "gnorm": "1.369", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12692"} 2023-01-29 19:43:16 | INFO | train_inner | {"epoch": 24, "update": 23.032, "s2c_loss": "0.064", "loss": "0.04439", "s2c_nll_loss": "0.064", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "49780", "lr": "6.81399e-05", 
"gnorm": "1.35", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12694"} 2023-01-29 19:43:19 | INFO | train_inner | {"epoch": 24, "update": 23.037, "s2c_loss": "0.042", "loss": "0.02941", "s2c_nll_loss": "0.042", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "49790", "lr": "6.80733e-05", "gnorm": "1.749", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12697"} 2023-01-29 19:43:22 | INFO | train_inner | {"epoch": 24, "update": 23.041, "s2c_loss": "0.075", "loss": "0.05185", "s2c_nll_loss": "0.075", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "244", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "49800", "lr": "6.80066e-05", "gnorm": "2.349", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12700"} 2023-01-29 19:43:24 | INFO | train_inner | {"epoch": 24, "update": 23.046, "s2c_loss": "0.048", "loss": "0.03313", "s2c_nll_loss": "0.048", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "49810", "lr": "6.79399e-05", "gnorm": "1.793", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "12702"} 2023-01-29 19:43:27 | INFO | train_inner | {"epoch": 24, "update": 23.05, "s2c_loss": "0.053", "loss": "0.03674", "s2c_nll_loss": "0.053", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "49820", "lr": "6.78733e-05", "gnorm": "2.144", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12705"} 2023-01-29 19:43:29 | INFO | train_inner | {"epoch": 24, "update": 23.055, "s2c_loss": "0.07", "loss": "0.04819", "s2c_nll_loss": "0.07", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "49830", 
"lr": "6.78066e-05", "gnorm": "2.626", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12707"} 2023-01-29 19:43:32 | INFO | train_inner | {"epoch": 24, "update": 23.06, "s2c_loss": "0.053", "loss": "0.03663", "s2c_nll_loss": "0.053", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "49840", "lr": "6.77399e-05", "gnorm": "1.95", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12710"} 2023-01-29 19:43:34 | INFO | train_inner | {"epoch": 24, "update": 23.064, "s2c_loss": "0.055", "loss": "0.03822", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "49850", "lr": "6.76733e-05", "gnorm": "2.127", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12712"} 2023-01-29 19:43:37 | INFO | train_inner | {"epoch": 24, "update": 23.069, "s2c_loss": "0.055", "loss": "0.03787", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "49860", "lr": "6.76066e-05", "gnorm": "1.998", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12715"} 2023-01-29 19:43:39 | INFO | train_inner | {"epoch": 24, "update": 23.074, "s2c_loss": "0.03", "loss": "0.02047", "s2c_nll_loss": "0.03", "s2c_accuracy": "99.843", "s2c_total": "63.7", "s2c_n_correct": "63.6", "wps": "251.6", "ups": "3.95", "wpb": "63.7", "bsz": "63.7", "num_updates": "49870", "lr": "6.754e-05", "gnorm": "1.167", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12717"} 2023-01-29 19:43:42 | INFO | train_inner | {"epoch": 24, "update": 23.078, "s2c_loss": "0.028", "loss": "0.0192", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": 
"64", "num_updates": "49880", "lr": "6.74733e-05", "gnorm": "1.217", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12720"} 2023-01-29 19:43:45 | INFO | train_inner | {"epoch": 24, "update": 23.083, "s2c_loss": "0.039", "loss": "0.02701", "s2c_nll_loss": "0.039", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "243.3", "ups": "3.8", "wpb": "64", "bsz": "64", "num_updates": "49890", "lr": "6.74066e-05", "gnorm": "1.254", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12722"} 2023-01-29 19:43:47 | INFO | train_inner | {"epoch": 24, "update": 23.087, "s2c_loss": "0.077", "loss": "0.05363", "s2c_nll_loss": "0.077", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "49900", "lr": "6.734e-05", "gnorm": "1.94", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12725"} 2023-01-29 19:43:50 | INFO | train_inner | {"epoch": 24, "update": 23.092, "s2c_loss": "0.029", "loss": "0.02042", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "49910", "lr": "6.72733e-05", "gnorm": "1.299", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12727"} 2023-01-29 19:43:52 | INFO | train_inner | {"epoch": 24, "update": 23.097, "s2c_loss": "0.03", "loss": "0.02109", "s2c_nll_loss": "0.03", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "49920", "lr": "6.72066e-05", "gnorm": "1.333", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12730"} 2023-01-29 19:43:55 | INFO | train_inner | {"epoch": 24, "update": 23.101, "s2c_loss": "0.053", "loss": "0.03697", "s2c_nll_loss": "0.053", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251.6", "ups": 
"3.93", "wpb": "64", "bsz": "64", "num_updates": "49930", "lr": "6.714e-05", "gnorm": "1.478", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12733"} 2023-01-29 19:43:57 | INFO | train_inner | {"epoch": 24, "update": 23.106, "s2c_loss": "0.07", "loss": "0.04859", "s2c_nll_loss": "0.07", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "49940", "lr": "6.70733e-05", "gnorm": "1.632", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12735"} 2023-01-29 19:44:00 | INFO | train_inner | {"epoch": 24, "update": 23.111, "s2c_loss": "0.018", "loss": "0.01273", "s2c_nll_loss": "0.018", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "243.4", "ups": "3.8", "wpb": "64", "bsz": "64", "num_updates": "49950", "lr": "6.70066e-05", "gnorm": "0.852", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12738"} 2023-01-29 19:44:02 | INFO | train_inner | {"epoch": 24, "update": 23.115, "s2c_loss": "0.045", "loss": "0.03094", "s2c_nll_loss": "0.045", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "49960", "lr": "6.694e-05", "gnorm": "1.294", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12740"} 2023-01-29 19:44:05 | INFO | train_inner | {"epoch": 24, "update": 23.12, "s2c_loss": "0.032", "loss": "0.02225", "s2c_nll_loss": "0.032", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "49970", "lr": "6.68733e-05", "gnorm": "1.51", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12743"} 2023-01-29 19:44:07 | INFO | train_inner | {"epoch": 24, "update": 23.124, "s2c_loss": "0.031", "loss": "0.02141", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": 
"244", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "49980", "lr": "6.68067e-05", "gnorm": "1.333", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12745"} 2023-01-29 19:44:10 | INFO | train_inner | {"epoch": 24, "update": 23.129, "s2c_loss": "0.067", "loss": "0.04669", "s2c_nll_loss": "0.067", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "250.6", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "49990", "lr": "6.674e-05", "gnorm": "2.413", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12748"} 2023-01-29 19:44:13 | INFO | train_inner | {"epoch": 24, "update": 23.134, "s2c_loss": "0.067", "loss": "0.04637", "s2c_nll_loss": "0.067", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "50000", "lr": "6.66733e-05", "gnorm": "2.528", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12750"} 2023-01-29 19:44:13 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 19:44:27 | INFO | valid | {"epoch": 24, "valid_s2c_loss": "0.488", "valid_loss": "0.33796", "valid_s2c_nll_loss": "0.488", "valid_s2c_accuracy": "91.439", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "29.2222", "valid_num_updates": "50000", "valid_best_s2c_accuracy": "91.525"} 2023-01-29 19:44:27 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 24 @ 50000 updates 2023-01-29 19:44:27 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_24_50000.pt 2023-01-29 19:44:31 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_24_50000.pt 2023-01-29 19:44:35 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_24_50000.pt (epoch 24 @ 50000 updates, score 91.439) (writing took 8.275092105846852 
seconds) 2023-01-29 19:44:38 | INFO | train_inner | {"epoch": 24, "update": 23.138, "s2c_loss": "0.061", "loss": "0.0423", "s2c_nll_loss": "0.061", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "25.1", "ups": "0.39", "wpb": "64", "bsz": "64", "num_updates": "50010", "lr": "6.66067e-05", "gnorm": "2.5", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12776"} 2023-01-29 19:44:41 | INFO | train_inner | {"epoch": 24, "update": 23.143, "s2c_loss": "0.04", "loss": "0.02805", "s2c_nll_loss": "0.04", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "50020", "lr": "6.654e-05", "gnorm": "1.526", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12778"} 2023-01-29 19:44:43 | INFO | train_inner | {"epoch": 24, "update": 23.148, "s2c_loss": "0.064", "loss": "0.04422", "s2c_nll_loss": "0.064", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "50030", "lr": "6.64733e-05", "gnorm": "1.972", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12781"} 2023-01-29 19:44:46 | INFO | train_inner | {"epoch": 24, "update": 23.152, "s2c_loss": "0.071", "loss": "0.04923", "s2c_nll_loss": "0.071", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "50040", "lr": "6.64067e-05", "gnorm": "2.021", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "12784"} 2023-01-29 19:44:48 | INFO | train_inner | {"epoch": 24, "update": 23.157, "s2c_loss": "0.067", "loss": "0.04643", "s2c_nll_loss": "0.067", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "242.9", "ups": "3.8", "wpb": "64", "bsz": "64", "num_updates": "50050", "lr": "6.634e-05", "gnorm": "1.588", "loss_scale": "2048", "train_wall": "3", 
"gb_free": "7.4", "wall": "12786"} 2023-01-29 19:44:51 | INFO | train_inner | {"epoch": 24, "update": 23.161, "s2c_loss": "0.057", "loss": "0.03972", "s2c_nll_loss": "0.057", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "50060", "lr": "6.62734e-05", "gnorm": "1.881", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "12789"} 2023-01-29 19:44:53 | INFO | train_inner | {"epoch": 24, "update": 23.166, "s2c_loss": "0.066", "loss": "0.04595", "s2c_nll_loss": "0.066", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "50070", "lr": "6.62067e-05", "gnorm": "1.996", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "12791"} 2023-01-29 19:44:56 | INFO | train_inner | {"epoch": 24, "update": 23.171, "s2c_loss": "0.034", "loss": "0.02352", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "50080", "lr": "6.614e-05", "gnorm": "1.386", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "12794"} 2023-01-29 19:44:59 | INFO | train_inner | {"epoch": 24, "update": 23.175, "s2c_loss": "0.099", "loss": "0.06833", "s2c_nll_loss": "0.099", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "238.6", "ups": "3.73", "wpb": "64", "bsz": "64", "num_updates": "50090", "lr": "6.60734e-05", "gnorm": "2.388", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "12796"} 2023-01-29 19:45:01 | INFO | train_inner | {"epoch": 24, "update": 23.18, "s2c_loss": "0.066", "loss": "0.04541", "s2c_nll_loss": "0.066", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "240.2", "ups": "3.75", "wpb": "64", "bsz": "64", "num_updates": "50100", "lr": "6.60067e-05", "gnorm": "2.036", "loss_scale": 
"4096", "train_wall": "3", "gb_free": "7.3", "wall": "12799"} 2023-01-29 19:45:04 | INFO | train_inner | {"epoch": 24, "update": 23.185, "s2c_loss": "0.064", "loss": "0.04419", "s2c_nll_loss": "0.064", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "50110", "lr": "6.594e-05", "gnorm": "2.356", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "12802"} 2023-01-29 19:45:06 | INFO | train_inner | {"epoch": 24, "update": 23.189, "s2c_loss": "0.062", "loss": "0.04265", "s2c_nll_loss": "0.062", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "50120", "lr": "6.58734e-05", "gnorm": "2.298", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "12804"} 2023-01-29 19:45:09 | INFO | train_inner | {"epoch": 24, "update": 23.194, "s2c_loss": "0.051", "loss": "0.03533", "s2c_nll_loss": "0.051", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "245.2", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "50130", "lr": "6.58067e-05", "gnorm": "2.038", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "12807"} 2023-01-29 19:45:12 | INFO | train_inner | {"epoch": 24, "update": 23.198, "s2c_loss": "0.068", "loss": "0.04691", "s2c_nll_loss": "0.068", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "50140", "lr": "6.574e-05", "gnorm": "2.43", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "12809"} 2023-01-29 19:45:14 | INFO | train_inner | {"epoch": 24, "update": 23.203, "s2c_loss": "0.045", "loss": "0.03101", "s2c_nll_loss": "0.045", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "50150", "lr": "6.56734e-05", 
"gnorm": "1.256", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "12812"} 2023-01-29 19:45:17 | INFO | train_inner | {"epoch": 24, "update": 23.208, "s2c_loss": "0.081", "loss": "0.05613", "s2c_nll_loss": "0.081", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "50160", "lr": "6.56067e-05", "gnorm": "2.67", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "12815"} 2023-01-29 19:45:18 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 2048.0 2023-01-29 19:45:19 | INFO | train_inner | {"epoch": 24, "update": 23.213, "s2c_loss": "0.051", "loss": "0.03503", "s2c_nll_loss": "0.051", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "226.4", "ups": "3.54", "wpb": "64", "bsz": "64", "num_updates": "50170", "lr": "6.55401e-05", "gnorm": "2.039", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12817"} 2023-01-29 19:45:22 | INFO | train_inner | {"epoch": 24, "update": 23.217, "s2c_loss": "0.071", "loss": "0.04913", "s2c_nll_loss": "0.071", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "50180", "lr": "6.54734e-05", "gnorm": "2.266", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12820"} 2023-01-29 19:45:25 | INFO | train_inner | {"epoch": 24, "update": 23.222, "s2c_loss": "0.159", "loss": "0.11039", "s2c_nll_loss": "0.159", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "50190", "lr": "6.54067e-05", "gnorm": "3.153", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12822"} 2023-01-29 19:45:27 | INFO | train_inner | {"epoch": 24, "update": 23.227, "s2c_loss": "0.032", "loss": "0.02228", "s2c_nll_loss": "0.032", 
"s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "50200", "lr": "6.53401e-05", "gnorm": "1.448", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12825"} 2023-01-29 19:45:30 | INFO | train_inner | {"epoch": 24, "update": 23.231, "s2c_loss": "0.039", "loss": "0.02726", "s2c_nll_loss": "0.039", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "50210", "lr": "6.52734e-05", "gnorm": "1.563", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12828"} 2023-01-29 19:45:32 | INFO | train_inner | {"epoch": 24, "update": 23.236, "s2c_loss": "0.037", "loss": "0.02563", "s2c_nll_loss": "0.037", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "50220", "lr": "6.52067e-05", "gnorm": "1.301", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12830"} 2023-01-29 19:45:35 | INFO | train_inner | {"epoch": 24, "update": 23.241, "s2c_loss": "0.036", "loss": "0.02498", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "50230", "lr": "6.51401e-05", "gnorm": "1.803", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12833"} 2023-01-29 19:45:37 | INFO | train_inner | {"epoch": 24, "update": 23.245, "s2c_loss": "0.041", "loss": "0.02865", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "50240", "lr": "6.50734e-05", "gnorm": "1.903", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12835"} 2023-01-29 19:45:40 | INFO | train_inner | {"epoch": 24, "update": 23.25, "s2c_loss": "0.04", "loss": 
"0.02764", "s2c_nll_loss": "0.04", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "50250", "lr": "6.50067e-05", "gnorm": "1.651", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12838"} 2023-01-29 19:45:42 | INFO | train_inner | {"epoch": 24, "update": 23.254, "s2c_loss": "0.076", "loss": "0.05295", "s2c_nll_loss": "0.076", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "50260", "lr": "6.49401e-05", "gnorm": "3.257", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12840"} 2023-01-29 19:45:45 | INFO | train_inner | {"epoch": 24, "update": 23.259, "s2c_loss": "0.045", "loss": "0.03117", "s2c_nll_loss": "0.045", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "50270", "lr": "6.48734e-05", "gnorm": "2.03", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12843"} 2023-01-29 19:45:48 | INFO | train_inner | {"epoch": 24, "update": 23.264, "s2c_loss": "0.09", "loss": "0.06263", "s2c_nll_loss": "0.09", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "50280", "lr": "6.48068e-05", "gnorm": "2.58", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12845"} 2023-01-29 19:45:50 | INFO | train_inner | {"epoch": 24, "update": 23.268, "s2c_loss": "0.047", "loss": "0.03286", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "50290", "lr": "6.47401e-05", "gnorm": "1.786", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12848"} 2023-01-29 19:45:53 | INFO | train_inner | {"epoch": 24, "update": 23.273, 
"s2c_loss": "0.07", "loss": "0.04883", "s2c_nll_loss": "0.07", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "50300", "lr": "6.46734e-05", "gnorm": "1.999", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12851"} 2023-01-29 19:45:55 | INFO | train_inner | {"epoch": 24, "update": 23.278, "s2c_loss": "0.038", "loss": "0.02648", "s2c_nll_loss": "0.038", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "50310", "lr": "6.46068e-05", "gnorm": "1.582", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12853"} 2023-01-29 19:45:58 | INFO | train_inner | {"epoch": 24, "update": 23.282, "s2c_loss": "0.087", "loss": "0.06015", "s2c_nll_loss": "0.087", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "259.3", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "50320", "lr": "6.45401e-05", "gnorm": "2.42", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12856"} 2023-01-29 19:46:00 | INFO | train_inner | {"epoch": 24, "update": 23.287, "s2c_loss": "0.039", "loss": "0.02721", "s2c_nll_loss": "0.039", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "50330", "lr": "6.44734e-05", "gnorm": "1.75", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "12858"} 2023-01-29 19:46:03 | INFO | train_inner | {"epoch": 24, "update": 23.291, "s2c_loss": "0.061", "loss": "0.04222", "s2c_nll_loss": "0.061", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "50340", "lr": "6.44068e-05", "gnorm": "1.583", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12861"} 2023-01-29 19:46:05 | INFO | train_inner | {"epoch": 
24, "update": 23.296, "s2c_loss": "0.058", "loss": "0.03988", "s2c_nll_loss": "0.058", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "50350", "lr": "6.43401e-05", "gnorm": "1.66", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12863"} 2023-01-29 19:46:08 | INFO | train_inner | {"epoch": 24, "update": 23.301, "s2c_loss": "0.024", "loss": "0.0164", "s2c_nll_loss": "0.024", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "50360", "lr": "6.42735e-05", "gnorm": "0.868", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12866"} 2023-01-29 19:46:10 | INFO | train_inner | {"epoch": 24, "update": 23.305, "s2c_loss": "0.051", "loss": "0.03513", "s2c_nll_loss": "0.051", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "50370", "lr": "6.42068e-05", "gnorm": "2.03", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12868"} 2023-01-29 19:46:13 | INFO | train_inner | {"epoch": 24, "update": 23.31, "s2c_loss": "0.027", "loss": "0.01867", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "50380", "lr": "6.41401e-05", "gnorm": "0.996", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12871"} 2023-01-29 19:46:15 | INFO | train_inner | {"epoch": 24, "update": 23.315, "s2c_loss": "0.051", "loss": "0.03529", "s2c_nll_loss": "0.051", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "50390", "lr": "6.40735e-05", "gnorm": "1.866", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12873"} 2023-01-29 19:46:18 | INFO | 
train_inner | {"epoch": 24, "update": 23.319, "s2c_loss": "0.078", "loss": "0.05423", "s2c_nll_loss": "0.078", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "50400", "lr": "6.40068e-05", "gnorm": "2.359", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "12876"} 2023-01-29 19:46:21 | INFO | train_inner | {"epoch": 24, "update": 23.324, "s2c_loss": "0.081", "loss": "0.0562", "s2c_nll_loss": "0.081", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "246.5", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "50410", "lr": "6.39401e-05", "gnorm": "2.459", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12879"} 2023-01-29 19:46:23 | INFO | train_inner | {"epoch": 24, "update": 23.328, "s2c_loss": "0.053", "loss": "0.03666", "s2c_nll_loss": "0.053", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "50420", "lr": "6.38735e-05", "gnorm": "1.902", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12881"} 2023-01-29 19:46:26 | INFO | train_inner | {"epoch": 24, "update": 23.333, "s2c_loss": "0.034", "loss": "0.02378", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "50430", "lr": "6.38068e-05", "gnorm": "1.31", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12884"} 2023-01-29 19:46:28 | INFO | train_inner | {"epoch": 24, "update": 23.338, "s2c_loss": "0.083", "loss": "0.05742", "s2c_nll_loss": "0.083", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "50440", "lr": "6.37401e-05", "gnorm": "1.996", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12886"} 
2023-01-29 19:46:31 | INFO | train_inner | {"epoch": 24, "update": 23.342, "s2c_loss": "0.073", "loss": "0.05054", "s2c_nll_loss": "0.073", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "50450", "lr": "6.36735e-05", "gnorm": "1.705", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12889"} 2023-01-29 19:46:33 | INFO | train_inner | {"epoch": 24, "update": 23.347, "s2c_loss": "0.085", "loss": "0.05903", "s2c_nll_loss": "0.085", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "50460", "lr": "6.36068e-05", "gnorm": "2.213", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "12891"} 2023-01-29 19:46:36 | INFO | train_inner | {"epoch": 24, "update": 23.352, "s2c_loss": "0.074", "loss": "0.05111", "s2c_nll_loss": "0.074", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "50470", "lr": "6.35402e-05", "gnorm": "2.619", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12894"} 2023-01-29 19:46:38 | INFO | train_inner | {"epoch": 24, "update": 23.356, "s2c_loss": "0.065", "loss": "0.04514", "s2c_nll_loss": "0.065", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "50480", "lr": "6.34735e-05", "gnorm": "2.443", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12896"} 2023-01-29 19:46:41 | INFO | train_inner | {"epoch": 24, "update": 23.361, "s2c_loss": "0.062", "loss": "0.04317", "s2c_nll_loss": "0.062", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "50490", "lr": "6.34068e-05", "gnorm": "2.294", "loss_scale": "2048", "train_wall": "3", "gb_free": 
"7.2", "wall": "12899"} 2023-01-29 19:46:43 | INFO | train_inner | {"epoch": 24, "update": 23.365, "s2c_loss": "0.091", "loss": "0.06281", "s2c_nll_loss": "0.091", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "246.3", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "50500", "lr": "6.33402e-05", "gnorm": "2.838", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "12901"} 2023-01-29 19:46:46 | INFO | train_inner | {"epoch": 24, "update": 23.37, "s2c_loss": "0.075", "loss": "0.0519", "s2c_nll_loss": "0.075", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "50510", "lr": "6.32735e-05", "gnorm": "2.581", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12904"} 2023-01-29 19:46:49 | INFO | train_inner | {"epoch": 24, "update": 23.375, "s2c_loss": "0.07", "loss": "0.04821", "s2c_nll_loss": "0.07", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "50520", "lr": "6.32068e-05", "gnorm": "1.799", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12907"} 2023-01-29 19:46:51 | INFO | train_inner | {"epoch": 24, "update": 23.379, "s2c_loss": "0.065", "loss": "0.04495", "s2c_nll_loss": "0.065", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "50530", "lr": "6.31402e-05", "gnorm": "2.414", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12909"} 2023-01-29 19:46:54 | INFO | train_inner | {"epoch": 24, "update": 23.384, "s2c_loss": "0.068", "loss": "0.04747", "s2c_nll_loss": "0.068", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "50540", "lr": "6.30735e-05", "gnorm": "2.087", "loss_scale": "2048", 
"train_wall": "3", "gb_free": "7.3", "wall": "12912"} 2023-01-29 19:46:56 | INFO | train_inner | {"epoch": 24, "update": 23.389, "s2c_loss": "0.047", "loss": "0.03276", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "50550", "lr": "6.30068e-05", "gnorm": "1.678", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12914"} 2023-01-29 19:46:59 | INFO | train_inner | {"epoch": 24, "update": 23.393, "s2c_loss": "0.064", "loss": "0.04449", "s2c_nll_loss": "0.064", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "50560", "lr": "6.29402e-05", "gnorm": "2.289", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12917"} 2023-01-29 19:47:01 | INFO | train_inner | {"epoch": 24, "update": 23.398, "s2c_loss": "0.048", "loss": "0.03339", "s2c_nll_loss": "0.048", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "50570", "lr": "6.28735e-05", "gnorm": "1.753", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12919"} 2023-01-29 19:47:04 | INFO | train_inner | {"epoch": 24, "update": 23.402, "s2c_loss": "0.094", "loss": "0.06494", "s2c_nll_loss": "0.094", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "50580", "lr": "6.28069e-05", "gnorm": "2.199", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12922"} 2023-01-29 19:47:06 | INFO | train_inner | {"epoch": 24, "update": 23.407, "s2c_loss": "0.027", "loss": "0.01883", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "50590", "lr": "6.27402e-05", "gnorm": 
"1.016", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12924"} 2023-01-29 19:47:09 | INFO | train_inner | {"epoch": 24, "update": 23.412, "s2c_loss": "0.075", "loss": "0.05205", "s2c_nll_loss": "0.075", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "50600", "lr": "6.26735e-05", "gnorm": "1.753", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12927"} 2023-01-29 19:47:11 | INFO | train_inner | {"epoch": 24, "update": 23.416, "s2c_loss": "0.039", "loss": "0.02676", "s2c_nll_loss": "0.039", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "50610", "lr": "6.26069e-05", "gnorm": "1.943", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12929"} 2023-01-29 19:47:14 | INFO | train_inner | {"epoch": 24, "update": 23.421, "s2c_loss": "0.042", "loss": "0.02902", "s2c_nll_loss": "0.042", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "258.7", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "50620", "lr": "6.25402e-05", "gnorm": "1.585", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12932"} 2023-01-29 19:47:17 | INFO | train_inner | {"epoch": 24, "update": 23.426, "s2c_loss": "0.039", "loss": "0.0268", "s2c_nll_loss": "0.039", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "50630", "lr": "6.24735e-05", "gnorm": "1.315", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12934"} 2023-01-29 19:47:19 | INFO | train_inner | {"epoch": 24, "update": 23.43, "s2c_loss": "0.058", "loss": "0.04022", "s2c_nll_loss": "0.058", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "50640", 
"lr": "6.24069e-05", "gnorm": "1.989", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12937"} 2023-01-29 19:47:21 | INFO | train_inner | {"epoch": 24, "update": 23.435, "s2c_loss": "0.033", "loss": "0.02282", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "257.6", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "50650", "lr": "6.23402e-05", "gnorm": "1.314", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12939"} 2023-01-29 19:47:24 | INFO | train_inner | {"epoch": 24, "update": 23.439, "s2c_loss": "0.032", "loss": "0.02233", "s2c_nll_loss": "0.032", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "50660", "lr": "6.22736e-05", "gnorm": "1.183", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12942"} 2023-01-29 19:47:27 | INFO | train_inner | {"epoch": 24, "update": 23.444, "s2c_loss": "0.036", "loss": "0.02504", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "50670", "lr": "6.22069e-05", "gnorm": "1.345", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12944"} 2023-01-29 19:47:29 | INFO | train_inner | {"epoch": 24, "update": 23.449, "s2c_loss": "0.028", "loss": "0.01922", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "50680", "lr": "6.21402e-05", "gnorm": "1.318", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12947"} 2023-01-29 19:47:32 | INFO | train_inner | {"epoch": 24, "update": 23.453, "s2c_loss": "0.051", "loss": "0.03541", "s2c_nll_loss": "0.051", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": 
"64", "num_updates": "50690", "lr": "6.20736e-05", "gnorm": "1.927", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "12950"} 2023-01-29 19:47:34 | INFO | train_inner | {"epoch": 24, "update": 23.458, "s2c_loss": "0.055", "loss": "0.03832", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "50700", "lr": "6.20069e-05", "gnorm": "1.787", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12952"} 2023-01-29 19:47:37 | INFO | train_inner | {"epoch": 24, "update": 23.463, "s2c_loss": "0.06", "loss": "0.04127", "s2c_nll_loss": "0.06", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "258.7", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "50710", "lr": "6.19402e-05", "gnorm": "1.941", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12955"} 2023-01-29 19:47:39 | INFO | train_inner | {"epoch": 24, "update": 23.467, "s2c_loss": "0.051", "loss": "0.03503", "s2c_nll_loss": "0.051", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "50720", "lr": "6.18736e-05", "gnorm": "1.946", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12957"} 2023-01-29 19:47:42 | INFO | train_inner | {"epoch": 24, "update": 23.472, "s2c_loss": "0.042", "loss": "0.02943", "s2c_nll_loss": "0.042", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "50730", "lr": "6.18069e-05", "gnorm": "1.593", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12960"} 2023-01-29 19:47:44 | INFO | train_inner | {"epoch": 24, "update": 23.476, "s2c_loss": "0.234", "loss": "0.16233", "s2c_nll_loss": "0.234", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "254.4", "ups": 
"3.98", "wpb": "64", "bsz": "64", "num_updates": "50740", "lr": "6.17402e-05", "gnorm": "1.64", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12962"} 2023-01-29 19:47:47 | INFO | train_inner | {"epoch": 24, "update": 23.481, "s2c_loss": "0.273", "loss": "0.18919", "s2c_nll_loss": "0.273", "s2c_accuracy": "97.812", "s2c_total": "64", "s2c_n_correct": "62.6", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "50750", "lr": "6.16736e-05", "gnorm": "2.279", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "12965"} 2023-01-29 19:47:49 | INFO | train_inner | {"epoch": 24, "update": 23.486, "s2c_loss": "0.056", "loss": "0.03893", "s2c_nll_loss": "0.056", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "50760", "lr": "6.16069e-05", "gnorm": "1.425", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12967"} 2023-01-29 19:47:52 | INFO | train_inner | {"epoch": 24, "update": 23.49, "s2c_loss": "0.079", "loss": "0.05472", "s2c_nll_loss": "0.079", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "50770", "lr": "6.15403e-05", "gnorm": "2.181", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12970"} 2023-01-29 19:47:54 | INFO | train_inner | {"epoch": 24, "update": 23.495, "s2c_loss": "0.066", "loss": "0.0455", "s2c_nll_loss": "0.066", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "50780", "lr": "6.14736e-05", "gnorm": "1.741", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12972"} 2023-01-29 19:47:57 | INFO | train_inner | {"epoch": 24, "update": 23.5, "s2c_loss": "0.036", "loss": "0.02488", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", 
"wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "50790", "lr": "6.14069e-05", "gnorm": "1.348", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12975"} 2023-01-29 19:47:59 | INFO | train_inner | {"epoch": 24, "update": 23.504, "s2c_loss": "0.063", "loss": "0.04391", "s2c_nll_loss": "0.063", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "50800", "lr": "6.13403e-05", "gnorm": "2.191", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.5", "wall": "12977"} 2023-01-29 19:48:02 | INFO | train_inner | {"epoch": 24, "update": 23.509, "s2c_loss": "0.059", "loss": "0.04063", "s2c_nll_loss": "0.059", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "50810", "lr": "6.12736e-05", "gnorm": "1.858", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12980"} 2023-01-29 19:48:04 | INFO | train_inner | {"epoch": 24, "update": 23.513, "s2c_loss": "0.019", "loss": "0.01333", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "50820", "lr": "6.12069e-05", "gnorm": "1.04", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12982"} 2023-01-29 19:48:07 | INFO | train_inner | {"epoch": 24, "update": 23.518, "s2c_loss": "0.028", "loss": "0.01916", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "50830", "lr": "6.11403e-05", "gnorm": "1.123", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12985"} 2023-01-29 19:48:10 | INFO | train_inner | {"epoch": 24, "update": 23.523, "s2c_loss": "0.054", "loss": "0.03755", "s2c_nll_loss": "0.054", "s2c_accuracy": "99.219", "s2c_total": 
"64", "s2c_n_correct": "63.5", "wps": "248", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "50840", "lr": "6.10736e-05", "gnorm": "1.587", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12987"} 2023-01-29 19:48:12 | INFO | train_inner | {"epoch": 24, "update": 23.527, "s2c_loss": "0.071", "loss": "0.04933", "s2c_nll_loss": "0.071", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "50850", "lr": "6.10069e-05", "gnorm": "1.929", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "12990"} 2023-01-29 19:48:15 | INFO | train_inner | {"epoch": 24, "update": 23.532, "s2c_loss": "0.043", "loss": "0.03004", "s2c_nll_loss": "0.043", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "50860", "lr": "6.09403e-05", "gnorm": "1.671", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "12993"} 2023-01-29 19:48:17 | INFO | train_inner | {"epoch": 24, "update": 23.537, "s2c_loss": "0.076", "loss": "0.05267", "s2c_nll_loss": "0.076", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "50870", "lr": "6.08736e-05", "gnorm": "2.211", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "12995"} 2023-01-29 19:48:20 | INFO | train_inner | {"epoch": 24, "update": 23.541, "s2c_loss": "0.069", "loss": "0.04805", "s2c_nll_loss": "0.069", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "50880", "lr": "6.0807e-05", "gnorm": "2.263", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "12998"} 2023-01-29 19:48:22 | INFO | train_inner | {"epoch": 24, "update": 23.546, "s2c_loss": "0.104", "loss": "0.07196", "s2c_nll_loss": "0.104", "s2c_accuracy": 
"98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "50890", "lr": "6.07403e-05", "gnorm": "2.552", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13000"} 2023-01-29 19:48:25 | INFO | train_inner | {"epoch": 24, "update": 23.55, "s2c_loss": "0.065", "loss": "0.04505", "s2c_nll_loss": "0.065", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "50900", "lr": "6.06736e-05", "gnorm": "2.417", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13003"} 2023-01-29 19:48:27 | INFO | train_inner | {"epoch": 24, "update": 23.555, "s2c_loss": "0.049", "loss": "0.03399", "s2c_nll_loss": "0.049", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "50910", "lr": "6.0607e-05", "gnorm": "1.699", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13005"} 2023-01-29 19:48:30 | INFO | train_inner | {"epoch": 24, "update": 23.56, "s2c_loss": "0.091", "loss": "0.06335", "s2c_nll_loss": "0.091", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "257.6", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "50920", "lr": "6.05403e-05", "gnorm": "2.835", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13008"} 2023-01-29 19:48:32 | INFO | train_inner | {"epoch": 24, "update": 23.564, "s2c_loss": "0.053", "loss": "0.03659", "s2c_nll_loss": "0.053", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "248", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "50930", "lr": "6.04736e-05", "gnorm": "1.991", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13010"} 2023-01-29 19:48:35 | INFO | train_inner | {"epoch": 24, "update": 23.569, "s2c_loss": "0.076", "loss": "0.05256", "s2c_nll_loss": 
"0.076", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "50940", "lr": "6.0407e-05", "gnorm": "2.717", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13013"} 2023-01-29 19:48:37 | INFO | train_inner | {"epoch": 24, "update": 23.574, "s2c_loss": "0.044", "loss": "0.03027", "s2c_nll_loss": "0.044", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "50950", "lr": "6.03403e-05", "gnorm": "1.862", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13015"} 2023-01-29 19:48:40 | INFO | train_inner | {"epoch": 24, "update": 23.578, "s2c_loss": "0.064", "loss": "0.0445", "s2c_nll_loss": "0.064", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "50960", "lr": "6.02737e-05", "gnorm": "1.921", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13018"} 2023-01-29 19:48:42 | INFO | train_inner | {"epoch": 24, "update": 23.583, "s2c_loss": "0.089", "loss": "0.06158", "s2c_nll_loss": "0.089", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "50970", "lr": "6.0207e-05", "gnorm": "2.294", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13020"} 2023-01-29 19:48:45 | INFO | train_inner | {"epoch": 24, "update": 23.587, "s2c_loss": "0.058", "loss": "0.03988", "s2c_nll_loss": "0.058", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "50980", "lr": "6.01403e-05", "gnorm": "1.561", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13023"} 2023-01-29 19:48:48 | INFO | train_inner | {"epoch": 24, "update": 23.592, "s2c_loss": "0.034", "loss": 
"0.02381", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "50990", "lr": "6.00737e-05", "gnorm": "1.895", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13025"} 2023-01-29 19:48:50 | INFO | train_inner | {"epoch": 24, "update": 23.597, "s2c_loss": "0.041", "loss": "0.02829", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "51000", "lr": "6.0007e-05", "gnorm": "1.322", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13028"} 2023-01-29 19:48:53 | INFO | train_inner | {"epoch": 24, "update": 23.601, "s2c_loss": "0.058", "loss": "0.03988", "s2c_nll_loss": "0.058", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "51010", "lr": "5.99403e-05", "gnorm": "2.012", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13030"} 2023-01-29 19:48:55 | INFO | train_inner | {"epoch": 24, "update": 23.606, "s2c_loss": "0.066", "loss": "0.04594", "s2c_nll_loss": "0.066", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "51020", "lr": "5.98737e-05", "gnorm": "1.671", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13033"} 2023-01-29 19:48:58 | INFO | train_inner | {"epoch": 24, "update": 23.611, "s2c_loss": "0.032", "loss": "0.02233", "s2c_nll_loss": "0.032", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "51030", "lr": "5.9807e-05", "gnorm": "1.597", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13036"} 2023-01-29 19:49:00 | INFO | train_inner | {"epoch": 24, "update": 23.615, 
"s2c_loss": "0.036", "loss": "0.025", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "51040", "lr": "5.97403e-05", "gnorm": "1.203", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13038"} 2023-01-29 19:49:03 | INFO | train_inner | {"epoch": 24, "update": 23.62, "s2c_loss": "0.044", "loss": "0.03046", "s2c_nll_loss": "0.044", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "51050", "lr": "5.96737e-05", "gnorm": "1.856", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13041"} 2023-01-29 19:49:05 | INFO | train_inner | {"epoch": 24, "update": 23.624, "s2c_loss": "0.042", "loss": "0.02895", "s2c_nll_loss": "0.042", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "51060", "lr": "5.9607e-05", "gnorm": "1.681", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13043"} 2023-01-29 19:49:08 | INFO | train_inner | {"epoch": 24, "update": 23.629, "s2c_loss": "0.03", "loss": "0.02064", "s2c_nll_loss": "0.03", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "51070", "lr": "5.95404e-05", "gnorm": "1.265", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13046"} 2023-01-29 19:49:10 | INFO | train_inner | {"epoch": 24, "update": 23.634, "s2c_loss": "0.055", "loss": "0.03841", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "258.5", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "51080", "lr": "5.94737e-05", "gnorm": "1.605", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13048"} 2023-01-29 19:49:13 | INFO | train_inner | 
{"epoch": 24, "update": 23.638, "s2c_loss": "0.052", "loss": "0.03607", "s2c_nll_loss": "0.052", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "51090", "lr": "5.9407e-05", "gnorm": "1.873", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13051"} 2023-01-29 19:49:15 | INFO | train_inner | {"epoch": 24, "update": 23.643, "s2c_loss": "0.034", "loss": "0.02379", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "51100", "lr": "5.93404e-05", "gnorm": "1.88", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13053"} 2023-01-29 19:49:18 | INFO | train_inner | {"epoch": 24, "update": 23.648, "s2c_loss": "0.047", "loss": "0.03239", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "51110", "lr": "5.92737e-05", "gnorm": "1.57", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13056"} 2023-01-29 19:49:20 | INFO | train_inner | {"epoch": 24, "update": 23.652, "s2c_loss": "0.027", "loss": "0.01883", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "51120", "lr": "5.9207e-05", "gnorm": "1.255", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13058"} 2023-01-29 19:49:23 | INFO | train_inner | {"epoch": 24, "update": 23.657, "s2c_loss": "0.055", "loss": "0.03808", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "51130", "lr": "5.91404e-05", "gnorm": "1.982", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13061"} 2023-01-29 
19:49:25 | INFO | train_inner | {"epoch": 24, "update": 23.661, "s2c_loss": "0.024", "loss": "0.01686", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "51140", "lr": "5.90737e-05", "gnorm": "1.163", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13063"} 2023-01-29 19:49:28 | INFO | train_inner | {"epoch": 24, "update": 23.666, "s2c_loss": "0.049", "loss": "0.03374", "s2c_nll_loss": "0.049", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "51150", "lr": "5.9007e-05", "gnorm": "1.872", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13066"} 2023-01-29 19:49:30 | INFO | train_inner | {"epoch": 24, "update": 23.671, "s2c_loss": "0.035", "loss": "0.02453", "s2c_nll_loss": "0.035", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "51160", "lr": "5.89404e-05", "gnorm": "1.551", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13068"} 2023-01-29 19:49:33 | INFO | train_inner | {"epoch": 24, "update": 23.675, "s2c_loss": "0.034", "loss": "0.02345", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "51170", "lr": "5.88737e-05", "gnorm": "1.197", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13071"} 2023-01-29 19:49:36 | INFO | train_inner | {"epoch": 24, "update": 23.68, "s2c_loss": "0.029", "loss": "0.01997", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "51180", "lr": "5.88071e-05", "gnorm": "1.653", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", 
"wall": "13073"} 2023-01-29 19:49:38 | INFO | train_inner | {"epoch": 24, "update": 23.685, "s2c_loss": "0.1", "loss": "0.06916", "s2c_nll_loss": "0.1", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "51190", "lr": "5.87404e-05", "gnorm": "2.858", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13076"} 2023-01-29 19:49:41 | INFO | train_inner | {"epoch": 24, "update": 23.689, "s2c_loss": "0.092", "loss": "0.06347", "s2c_nll_loss": "0.092", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "51200", "lr": "5.86737e-05", "gnorm": "3.162", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13079"} 2023-01-29 19:49:43 | INFO | train_inner | {"epoch": 24, "update": 23.694, "s2c_loss": "0.05", "loss": "0.03478", "s2c_nll_loss": "0.05", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "51210", "lr": "5.86071e-05", "gnorm": "2.545", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13081"} 2023-01-29 19:49:46 | INFO | train_inner | {"epoch": 24, "update": 23.698, "s2c_loss": "0.125", "loss": "0.0865", "s2c_nll_loss": "0.125", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "51220", "lr": "5.85404e-05", "gnorm": "2.389", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13084"} 2023-01-29 19:49:48 | INFO | train_inner | {"epoch": 24, "update": 23.703, "s2c_loss": "0.052", "loss": "0.036", "s2c_nll_loss": "0.052", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "51230", "lr": "5.84737e-05", "gnorm": "1.89", "loss_scale": "2048", "train_wall": "3", 
"gb_free": "7.2", "wall": "13086"} 2023-01-29 19:49:51 | INFO | train_inner | {"epoch": 24, "update": 23.708, "s2c_loss": "0.071", "loss": "0.04935", "s2c_nll_loss": "0.071", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "51240", "lr": "5.84071e-05", "gnorm": "2.168", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13089"} 2023-01-29 19:49:53 | INFO | train_inner | {"epoch": 24, "update": 23.712, "s2c_loss": "0.188", "loss": "0.13043", "s2c_nll_loss": "0.188", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "249.6", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "51250", "lr": "5.83404e-05", "gnorm": "1.328", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13091"} 2023-01-29 19:49:56 | INFO | train_inner | {"epoch": 24, "update": 23.717, "s2c_loss": "0.049", "loss": "0.03423", "s2c_nll_loss": "0.049", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "51260", "lr": "5.82738e-05", "gnorm": "1.623", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13094"} 2023-01-29 19:49:58 | INFO | train_inner | {"epoch": 24, "update": 23.722, "s2c_loss": "0.043", "loss": "0.02946", "s2c_nll_loss": "0.043", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "51270", "lr": "5.82071e-05", "gnorm": "1.92", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13096"} 2023-01-29 19:50:01 | INFO | train_inner | {"epoch": 24, "update": 23.726, "s2c_loss": "0.063", "loss": "0.04342", "s2c_nll_loss": "0.063", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "245.2", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "51280", "lr": "5.81404e-05", "gnorm": "2.343", 
"loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13099"} 2023-01-29 19:50:04 | INFO | train_inner | {"epoch": 24, "update": 23.731, "s2c_loss": "0.055", "loss": "0.0384", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "51290", "lr": "5.80738e-05", "gnorm": "1.877", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13102"} 2023-01-29 19:50:06 | INFO | train_inner | {"epoch": 24, "update": 23.735, "s2c_loss": "0.055", "loss": "0.0382", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "51300", "lr": "5.80071e-05", "gnorm": "1.858", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13104"} 2023-01-29 19:50:09 | INFO | train_inner | {"epoch": 24, "update": 23.74, "s2c_loss": "0.05", "loss": "0.03493", "s2c_nll_loss": "0.05", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "51310", "lr": "5.79404e-05", "gnorm": "1.805", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13107"} 2023-01-29 19:50:11 | INFO | train_inner | {"epoch": 24, "update": 23.745, "s2c_loss": "0.058", "loss": "0.04009", "s2c_nll_loss": "0.058", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "51320", "lr": "5.78738e-05", "gnorm": "1.561", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13109"} 2023-01-29 19:50:14 | INFO | train_inner | {"epoch": 24, "update": 23.749, "s2c_loss": "0.028", "loss": "0.01922", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "51330", "lr": 
"5.78071e-05", "gnorm": "1.154", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13112"} 2023-01-29 19:50:16 | INFO | train_inner | {"epoch": 24, "update": 23.754, "s2c_loss": "0.015", "loss": "0.01026", "s2c_nll_loss": "0.015", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "51340", "lr": "5.77404e-05", "gnorm": "0.705", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13114"} 2023-01-29 19:50:19 | INFO | train_inner | {"epoch": 24, "update": 23.759, "s2c_loss": "0.036", "loss": "0.02462", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "51350", "lr": "5.76738e-05", "gnorm": "1.329", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13117"} 2023-01-29 19:50:21 | INFO | train_inner | {"epoch": 24, "update": 23.763, "s2c_loss": "0.025", "loss": "0.01751", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "51360", "lr": "5.76071e-05", "gnorm": "1.106", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13119"} 2023-01-29 19:50:24 | INFO | train_inner | {"epoch": 24, "update": 23.768, "s2c_loss": "0.046", "loss": "0.03182", "s2c_nll_loss": "0.046", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "51370", "lr": "5.75405e-05", "gnorm": "1.919", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13122"} 2023-01-29 19:50:26 | INFO | train_inner | {"epoch": 24, "update": 23.772, "s2c_loss": "0.047", "loss": "0.03266", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", 
"num_updates": "51380", "lr": "5.74738e-05", "gnorm": "1.552", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13124"} 2023-01-29 19:50:29 | INFO | train_inner | {"epoch": 24, "update": 23.777, "s2c_loss": "0.071", "loss": "0.0494", "s2c_nll_loss": "0.071", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "51390", "lr": "5.74071e-05", "gnorm": "2.423", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13127"} 2023-01-29 19:50:31 | INFO | train_inner | {"epoch": 24, "update": 23.782, "s2c_loss": "0.084", "loss": "0.05817", "s2c_nll_loss": "0.084", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "51400", "lr": "5.73405e-05", "gnorm": "2.135", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13129"} 2023-01-29 19:50:34 | INFO | train_inner | {"epoch": 24, "update": 23.786, "s2c_loss": "0.039", "loss": "0.02676", "s2c_nll_loss": "0.039", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "51410", "lr": "5.72738e-05", "gnorm": "1.493", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13132"} 2023-01-29 19:50:37 | INFO | train_inner | {"epoch": 24, "update": 23.791, "s2c_loss": "0.033", "loss": "0.0226", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "51420", "lr": "5.72071e-05", "gnorm": "1.316", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13134"} 2023-01-29 19:50:39 | INFO | train_inner | {"epoch": 24, "update": 23.796, "s2c_loss": "0.079", "loss": "0.05464", "s2c_nll_loss": "0.079", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.4", "ups": 
"3.96", "wpb": "64", "bsz": "64", "num_updates": "51430", "lr": "5.71405e-05", "gnorm": "1.801", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13137"} 2023-01-29 19:50:42 | INFO | train_inner | {"epoch": 24, "update": 23.8, "s2c_loss": "0.043", "loss": "0.02997", "s2c_nll_loss": "0.043", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "51440", "lr": "5.70738e-05", "gnorm": "1.894", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13140"} 2023-01-29 19:50:44 | INFO | train_inner | {"epoch": 24, "update": 23.805, "s2c_loss": "0.075", "loss": "0.0518", "s2c_nll_loss": "0.075", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "51450", "lr": "5.70071e-05", "gnorm": "2.357", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13142"} 2023-01-29 19:50:47 | INFO | train_inner | {"epoch": 24, "update": 23.809, "s2c_loss": "0.092", "loss": "0.06378", "s2c_nll_loss": "0.092", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "51460", "lr": "5.69405e-05", "gnorm": "3.319", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13145"} 2023-01-29 19:50:49 | INFO | train_inner | {"epoch": 24, "update": 23.814, "s2c_loss": "0.069", "loss": "0.04778", "s2c_nll_loss": "0.069", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "51470", "lr": "5.68738e-05", "gnorm": "1.907", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13147"} 2023-01-29 19:50:52 | INFO | train_inner | {"epoch": 24, "update": 23.819, "s2c_loss": "0.041", "loss": "0.02815", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", 
"wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "51480", "lr": "5.68072e-05", "gnorm": "1.761", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13150"} 2023-01-29 19:50:54 | INFO | train_inner | {"epoch": 24, "update": 23.823, "s2c_loss": "0.061", "loss": "0.04259", "s2c_nll_loss": "0.061", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "51490", "lr": "5.67405e-05", "gnorm": "2.052", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13152"} 2023-01-29 19:50:57 | INFO | train_inner | {"epoch": 24, "update": 23.828, "s2c_loss": "0.048", "loss": "0.03329", "s2c_nll_loss": "0.048", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "51500", "lr": "5.66738e-05", "gnorm": "1.563", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13155"} 2023-01-29 19:50:59 | INFO | train_inner | {"epoch": 24, "update": 23.833, "s2c_loss": "0.045", "loss": "0.03135", "s2c_nll_loss": "0.045", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "51510", "lr": "5.66072e-05", "gnorm": "2.085", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13157"} 2023-01-29 19:51:02 | INFO | train_inner | {"epoch": 24, "update": 23.837, "s2c_loss": "0.064", "loss": "0.04455", "s2c_nll_loss": "0.064", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "246.6", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "51520", "lr": "5.65405e-05", "gnorm": "2.169", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13160"} 2023-01-29 19:51:04 | INFO | train_inner | {"epoch": 24, "update": 23.842, "s2c_loss": "0.057", "loss": "0.03955", "s2c_nll_loss": "0.057", "s2c_accuracy": "99.219", "s2c_total": 
"64", "s2c_n_correct": "63.5", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "51530", "lr": "5.64738e-05", "gnorm": "2.327", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "13162"} 2023-01-29 19:51:07 | INFO | train_inner | {"epoch": 24, "update": 23.846, "s2c_loss": "0.043", "loss": "0.03014", "s2c_nll_loss": "0.043", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "51540", "lr": "5.64072e-05", "gnorm": "1.483", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13165"} 2023-01-29 19:51:10 | INFO | train_inner | {"epoch": 24, "update": 23.851, "s2c_loss": "0.077", "loss": "0.05312", "s2c_nll_loss": "0.077", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "51550", "lr": "5.63405e-05", "gnorm": "2.63", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13167"} 2023-01-29 19:51:12 | INFO | train_inner | {"epoch": 24, "update": 23.856, "s2c_loss": "0.039", "loss": "0.02717", "s2c_nll_loss": "0.039", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "51560", "lr": "5.62739e-05", "gnorm": "1.676", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13170"} 2023-01-29 19:51:15 | INFO | train_inner | {"epoch": 24, "update": 23.86, "s2c_loss": "0.046", "loss": "0.03158", "s2c_nll_loss": "0.046", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "51570", "lr": "5.62072e-05", "gnorm": "1.702", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13173"} 2023-01-29 19:51:17 | INFO | train_inner | {"epoch": 24, "update": 23.865, "s2c_loss": "0.052", "loss": "0.03599", "s2c_nll_loss": "0.052", 
"s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "51580", "lr": "5.61405e-05", "gnorm": "1.722", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13175"} 2023-01-29 19:51:20 | INFO | train_inner | {"epoch": 24, "update": 23.87, "s2c_loss": "0.028", "loss": "0.0197", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "51590", "lr": "5.60739e-05", "gnorm": "1.211", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13178"} 2023-01-29 19:51:22 | INFO | train_inner | {"epoch": 24, "update": 23.874, "s2c_loss": "0.046", "loss": "0.03176", "s2c_nll_loss": "0.046", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "51600", "lr": "5.60072e-05", "gnorm": "1.936", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13180"} 2023-01-29 19:51:25 | INFO | train_inner | {"epoch": 24, "update": 23.879, "s2c_loss": "0.026", "loss": "0.01816", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "246.1", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "51610", "lr": "5.59405e-05", "gnorm": "1.301", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13183"} 2023-01-29 19:51:27 | INFO | train_inner | {"epoch": 24, "update": 23.883, "s2c_loss": "0.037", "loss": "0.0258", "s2c_nll_loss": "0.037", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "51620", "lr": "5.58739e-05", "gnorm": "1.416", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13185"} 2023-01-29 19:51:30 | INFO | train_inner | {"epoch": 24, "update": 23.888, "s2c_loss": "0.065", "loss": "0.04487", 
"s2c_nll_loss": "0.065", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "51630", "lr": "5.58072e-05", "gnorm": "2.435", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13188"} 2023-01-29 19:51:33 | INFO | train_inner | {"epoch": 24, "update": 23.893, "s2c_loss": "0.041", "loss": "0.02839", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "246.1", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "51640", "lr": "5.57405e-05", "gnorm": "1.397", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13190"} 2023-01-29 19:51:35 | INFO | train_inner | {"epoch": 24, "update": 23.897, "s2c_loss": "0.041", "loss": "0.02851", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "51650", "lr": "5.56739e-05", "gnorm": "1.522", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13193"} 2023-01-29 19:51:38 | INFO | train_inner | {"epoch": 24, "update": 23.902, "s2c_loss": "0.046", "loss": "0.03185", "s2c_nll_loss": "0.046", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "51660", "lr": "5.56072e-05", "gnorm": "1.898", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13196"} 2023-01-29 19:51:40 | INFO | train_inner | {"epoch": 24, "update": 23.907, "s2c_loss": "0.033", "loss": "0.02253", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "51670", "lr": "5.55406e-05", "gnorm": "1.506", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13198"} 2023-01-29 19:51:43 | INFO | train_inner | {"epoch": 24, "update": 23.911, 
"s2c_loss": "0.055", "loss": "0.03845", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "51680", "lr": "5.54739e-05", "gnorm": "2.056", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13201"} 2023-01-29 19:51:45 | INFO | train_inner | {"epoch": 24, "update": 23.916, "s2c_loss": "0.046", "loss": "0.03203", "s2c_nll_loss": "0.046", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "51690", "lr": "5.54072e-05", "gnorm": "1.573", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13203"} 2023-01-29 19:51:48 | INFO | train_inner | {"epoch": 24, "update": 23.92, "s2c_loss": "0.045", "loss": "0.03135", "s2c_nll_loss": "0.045", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "51700", "lr": "5.53406e-05", "gnorm": "1.67", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13206"} 2023-01-29 19:51:50 | INFO | train_inner | {"epoch": 24, "update": 23.925, "s2c_loss": "0.047", "loss": "0.03269", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "51710", "lr": "5.52739e-05", "gnorm": "2.277", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13208"} 2023-01-29 19:51:53 | INFO | train_inner | {"epoch": 24, "update": 23.93, "s2c_loss": "0.037", "loss": "0.02558", "s2c_nll_loss": "0.037", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "51720", "lr": "5.52072e-05", "gnorm": "1.274", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13211"} 2023-01-29 19:51:55 | INFO | train_inner | 
{"epoch": 24, "update": 23.934, "s2c_loss": "0.037", "loss": "0.0256", "s2c_nll_loss": "0.037", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "51730", "lr": "5.51406e-05", "gnorm": "1.25", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13213"} 2023-01-29 19:51:58 | INFO | train_inner | {"epoch": 24, "update": 23.939, "s2c_loss": "0.038", "loss": "0.02608", "s2c_nll_loss": "0.038", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "51740", "lr": "5.50739e-05", "gnorm": "2.242", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13216"} 2023-01-29 19:52:00 | INFO | train_inner | {"epoch": 24, "update": 23.944, "s2c_loss": "0.038", "loss": "0.02647", "s2c_nll_loss": "0.038", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "51750", "lr": "5.50072e-05", "gnorm": "1.569", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13218"} 2023-01-29 19:52:03 | INFO | train_inner | {"epoch": 24, "update": 23.948, "s2c_loss": "0.034", "loss": "0.02388", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "51760", "lr": "5.49406e-05", "gnorm": "1.317", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13221"} 2023-01-29 19:52:06 | INFO | train_inner | {"epoch": 24, "update": 23.953, "s2c_loss": "0.028", "loss": "0.01964", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "51770", "lr": "5.48739e-05", "gnorm": "1.006", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13223"} 2023-01-29 
19:52:08 | INFO | train_inner | {"epoch": 24, "update": 23.957, "s2c_loss": "0.041", "loss": "0.02866", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "51780", "lr": "5.48073e-05", "gnorm": "1.554", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13226"} 2023-01-29 19:52:11 | INFO | train_inner | {"epoch": 24, "update": 23.962, "s2c_loss": "0.045", "loss": "0.03105", "s2c_nll_loss": "0.045", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "51790", "lr": "5.47406e-05", "gnorm": "1.184", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13229"} 2023-01-29 19:52:13 | INFO | train_inner | {"epoch": 24, "update": 23.967, "s2c_loss": "0.054", "loss": "0.03731", "s2c_nll_loss": "0.054", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "51800", "lr": "5.46739e-05", "gnorm": "1.676", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13231"} 2023-01-29 19:52:16 | INFO | train_inner | {"epoch": 24, "update": 23.971, "s2c_loss": "0.06", "loss": "0.04132", "s2c_nll_loss": "0.06", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "51810", "lr": "5.46073e-05", "gnorm": "1.848", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13234"} 2023-01-29 19:52:18 | INFO | train_inner | {"epoch": 24, "update": 23.976, "s2c_loss": "0.043", "loss": "0.02946", "s2c_nll_loss": "0.043", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "51820", "lr": "5.45406e-05", "gnorm": "1.602", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", 
"wall": "13236"} 2023-01-29 19:52:21 | INFO | train_inner | {"epoch": 24, "update": 23.981, "s2c_loss": "0.034", "loss": "0.02386", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "51830", "lr": "5.44739e-05", "gnorm": "1.52", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13239"} 2023-01-29 19:52:23 | INFO | train_inner | {"epoch": 24, "update": 23.985, "s2c_loss": "0.198", "loss": "0.13758", "s2c_nll_loss": "0.198", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "51840", "lr": "5.44073e-05", "gnorm": "1.673", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13241"} 2023-01-29 19:52:26 | INFO | train_inner | {"epoch": 24, "update": 23.99, "s2c_loss": "0.032", "loss": "0.02241", "s2c_nll_loss": "0.032", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "51850", "lr": "5.43406e-05", "gnorm": "1.651", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13244"} 2023-01-29 19:52:28 | INFO | train_inner | {"epoch": 24, "update": 23.994, "s2c_loss": "0.036", "loss": "0.02462", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "51860", "lr": "5.4274e-05", "gnorm": "1.529", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13246"} 2023-01-29 19:52:31 | INFO | train_inner | {"epoch": 24, "update": 23.999, "s2c_loss": "0.064", "loss": "0.04456", "s2c_nll_loss": "0.064", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "51870", "lr": "5.42073e-05", "gnorm": "2.225", "loss_scale": "2048", "train_wall": 
"2", "gb_free": "7.3", "wall": "13249"} 2023-01-29 19:52:31 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 19:52:46 | INFO | valid | {"epoch": 24, "valid_s2c_loss": "0.443", "valid_loss": "0.30684", "valid_s2c_nll_loss": "0.443", "valid_s2c_accuracy": "92.264", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "29.4861", "valid_num_updates": "51872", "valid_best_s2c_accuracy": "92.264"} 2023-01-29 19:52:46 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 24 @ 51872 updates 2023-01-29 19:52:46 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 19:52:53 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 19:52:57 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt (epoch 24 @ 51872 updates, score 92.264) (writing took 11.512843424919993 seconds) 2023-01-29 19:52:57 | INFO | fairseq_cli.train | end of epoch 24 (average epoch stats below) 2023-01-29 19:52:57 | INFO | train | {"epoch": 24, "train_s2c_loss": "0.057", "train_loss": "0.03942", "train_s2c_nll_loss": "0.057", "train_s2c_accuracy": "99.192", "train_s2c_total": "63.9838", "train_s2c_n_correct": "63.4669", "train_wps": "228.6", "train_ups": "3.57", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "51872", "train_lr": "5.4194e-05", "train_gnorm": "1.835", "train_loss_scale": "2048", "train_train_wall": "542", "train_gb_free": "7.5", "train_wall": "13275"} 2023-01-29 19:53:04 | INFO | fairseq.trainer | begin training epoch 25 2023-01-29 19:53:04 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 19:53:06 | INFO | train_inner | {"epoch": 25, "update": 24.004, "s2c_loss": "0.049", "loss": "0.03404", "s2c_nll_loss": "0.049", "s2c_accuracy": "99.013", "s2c_total": "60.8", "s2c_n_correct": "60.2", "wps": 
"17.3", "ups": "0.29", "wpb": "60.8", "bsz": "60.8", "num_updates": "51880", "lr": "5.41406e-05", "gnorm": "1.781", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13284"} 2023-01-29 19:53:08 | INFO | train_inner | {"epoch": 25, "update": 24.008, "s2c_loss": "0.055", "loss": "0.03825", "s2c_nll_loss": "0.055", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "247.9", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "51890", "lr": "5.4074e-05", "gnorm": "2.281", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13286"} 2023-01-29 19:53:11 | INFO | train_inner | {"epoch": 25, "update": 24.013, "s2c_loss": "0.029", "loss": "0.01978", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "51900", "lr": "5.40073e-05", "gnorm": "1.124", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "13289"} 2023-01-29 19:53:14 | INFO | train_inner | {"epoch": 25, "update": 24.018, "s2c_loss": "0.08", "loss": "0.05515", "s2c_nll_loss": "0.08", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "51910", "lr": "5.39406e-05", "gnorm": "2.672", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13291"} 2023-01-29 19:53:16 | INFO | train_inner | {"epoch": 25, "update": 24.022, "s2c_loss": "0.047", "loss": "0.03285", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "51920", "lr": "5.3874e-05", "gnorm": "1.863", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13294"} 2023-01-29 19:53:19 | INFO | train_inner | {"epoch": 25, "update": 24.027, "s2c_loss": "0.033", "loss": "0.02262", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.375", "s2c_total": "64", 
"s2c_n_correct": "63.6", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "51930", "lr": "5.38073e-05", "gnorm": "1.579", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13296"} 2023-01-29 19:53:21 | INFO | train_inner | {"epoch": 25, "update": 24.031, "s2c_loss": "0.018", "loss": "0.01215", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "51940", "lr": "5.37406e-05", "gnorm": "1.111", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13299"} 2023-01-29 19:53:24 | INFO | train_inner | {"epoch": 25, "update": 24.036, "s2c_loss": "0.041", "loss": "0.02828", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "51950", "lr": "5.3674e-05", "gnorm": "1.474", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13302"} 2023-01-29 19:53:26 | INFO | train_inner | {"epoch": 25, "update": 24.041, "s2c_loss": "0.029", "loss": "0.01994", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "51960", "lr": "5.36073e-05", "gnorm": "1.375", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13304"} 2023-01-29 19:53:29 | INFO | train_inner | {"epoch": 25, "update": 24.045, "s2c_loss": "0.024", "loss": "0.01662", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "51970", "lr": "5.35407e-05", "gnorm": "0.954", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13307"} 2023-01-29 19:53:31 | INFO | train_inner | {"epoch": 25, "update": 24.05, "s2c_loss": "0.019", "loss": "0.01306", "s2c_nll_loss": "0.019", "s2c_accuracy": 
"99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "51980", "lr": "5.3474e-05", "gnorm": "0.77", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13309"} 2023-01-29 19:53:34 | INFO | train_inner | {"epoch": 25, "update": 24.055, "s2c_loss": "0.019", "loss": "0.01344", "s2c_nll_loss": "0.019", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "51990", "lr": "5.34073e-05", "gnorm": "0.811", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13312"} 2023-01-29 19:53:36 | INFO | train_inner | {"epoch": 25, "update": 24.059, "s2c_loss": "0.201", "loss": "0.13938", "s2c_nll_loss": "0.201", "s2c_accuracy": "98.281", "s2c_total": "64", "s2c_n_correct": "62.9", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "52000", "lr": "5.33407e-05", "gnorm": "1.735", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13314"} 2023-01-29 19:53:39 | INFO | train_inner | {"epoch": 25, "update": 24.064, "s2c_loss": "0.204", "loss": "0.14114", "s2c_nll_loss": "0.204", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "52010", "lr": "5.3274e-05", "gnorm": "1.512", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13317"} 2023-01-29 19:53:42 | INFO | train_inner | {"epoch": 25, "update": 24.068, "s2c_loss": "0.02", "loss": "0.01381", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "52020", "lr": "5.32073e-05", "gnorm": "1.237", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13319"} 2023-01-29 19:53:44 | INFO | train_inner | {"epoch": 25, "update": 24.073, "s2c_loss": "0.032", "loss": "0.02226", "s2c_nll_loss": 
"0.032", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52030", "lr": "5.31407e-05", "gnorm": "1.326", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13322"} 2023-01-29 19:53:47 | INFO | train_inner | {"epoch": 25, "update": 24.078, "s2c_loss": "0.035", "loss": "0.02454", "s2c_nll_loss": "0.035", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "52040", "lr": "5.3074e-05", "gnorm": "1.381", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13324"} 2023-01-29 19:53:49 | INFO | train_inner | {"epoch": 25, "update": 24.082, "s2c_loss": "0.038", "loss": "0.02652", "s2c_nll_loss": "0.038", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "52050", "lr": "5.30073e-05", "gnorm": "1.8", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "13327"} 2023-01-29 19:53:52 | INFO | train_inner | {"epoch": 25, "update": 24.087, "s2c_loss": "0.035", "loss": "0.0242", "s2c_nll_loss": "0.035", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "52060", "lr": "5.29407e-05", "gnorm": "1.461", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13330"} 2023-01-29 19:53:54 | INFO | train_inner | {"epoch": 25, "update": 24.092, "s2c_loss": "0.085", "loss": "0.05865", "s2c_nll_loss": "0.085", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52070", "lr": "5.2874e-05", "gnorm": "1.837", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13332"} 2023-01-29 19:53:57 | INFO | train_inner | {"epoch": 25, "update": 24.096, "s2c_loss": "0.044", "loss": 
"0.03026", "s2c_nll_loss": "0.044", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52080", "lr": "5.28074e-05", "gnorm": "1.76", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13335"} 2023-01-29 19:53:59 | INFO | train_inner | {"epoch": 25, "update": 24.101, "s2c_loss": "0.031", "loss": "0.0212", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52090", "lr": "5.27407e-05", "gnorm": "1.435", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13337"} 2023-01-29 19:54:02 | INFO | train_inner | {"epoch": 25, "update": 24.105, "s2c_loss": "0.019", "loss": "0.01291", "s2c_nll_loss": "0.019", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "52100", "lr": "5.2674e-05", "gnorm": "1.286", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13340"} 2023-01-29 19:54:04 | INFO | train_inner | {"epoch": 25, "update": 24.11, "s2c_loss": "0.021", "loss": "0.01455", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "52110", "lr": "5.26074e-05", "gnorm": "1.072", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13342"} 2023-01-29 19:54:07 | INFO | train_inner | {"epoch": 25, "update": 24.115, "s2c_loss": "0.016", "loss": "0.01134", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "52120", "lr": "5.25407e-05", "gnorm": "0.805", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13345"} 2023-01-29 19:54:09 | INFO | train_inner | {"epoch": 25, "update": 24.119, 
"s2c_loss": "0.022", "loss": "0.01522", "s2c_nll_loss": "0.022", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "52130", "lr": "5.2474e-05", "gnorm": "1.06", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13347"} 2023-01-29 19:54:12 | INFO | train_inner | {"epoch": 25, "update": 24.124, "s2c_loss": "0.026", "loss": "0.01809", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "52140", "lr": "5.24074e-05", "gnorm": "1.064", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "13350"} 2023-01-29 19:54:15 | INFO | train_inner | {"epoch": 25, "update": 24.129, "s2c_loss": "0.033", "loss": "0.02274", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "52150", "lr": "5.23407e-05", "gnorm": "1.459", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13352"} 2023-01-29 19:54:17 | INFO | train_inner | {"epoch": 25, "update": 24.133, "s2c_loss": "0.038", "loss": "0.02602", "s2c_nll_loss": "0.038", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "52160", "lr": "5.22741e-05", "gnorm": "1.25", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13355"} 2023-01-29 19:54:20 | INFO | train_inner | {"epoch": 25, "update": 24.138, "s2c_loss": "0.051", "loss": "0.03568", "s2c_nll_loss": "0.051", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52170", "lr": "5.22074e-05", "gnorm": "1.652", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13357"} 2023-01-29 19:54:22 | INFO | train_inner | 
{"epoch": 25, "update": 24.142, "s2c_loss": "0.019", "loss": "0.01323", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "52180", "lr": "5.21407e-05", "gnorm": "0.938", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "13360"} 2023-01-29 19:54:25 | INFO | train_inner | {"epoch": 25, "update": 24.147, "s2c_loss": "0.012", "loss": "0.00846", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "52190", "lr": "5.20741e-05", "gnorm": "0.621", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13363"} 2023-01-29 19:54:27 | INFO | train_inner | {"epoch": 25, "update": 24.152, "s2c_loss": "0.016", "loss": "0.01129", "s2c_nll_loss": "0.016", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "52200", "lr": "5.20074e-05", "gnorm": "0.718", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13365"} 2023-01-29 19:54:30 | INFO | train_inner | {"epoch": 25, "update": 24.156, "s2c_loss": "0.067", "loss": "0.04632", "s2c_nll_loss": "0.067", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "52210", "lr": "5.19407e-05", "gnorm": "1.441", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.5", "wall": "13368"} 2023-01-29 19:54:32 | INFO | train_inner | {"epoch": 25, "update": 24.161, "s2c_loss": "0.029", "loss": "0.02044", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "52220", "lr": "5.18741e-05", "gnorm": "1.346", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "13370"} 2023-01-29 19:54:35 | 
INFO | train_inner | {"epoch": 25, "update": 24.166, "s2c_loss": "0.051", "loss": "0.03543", "s2c_nll_loss": "0.051", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "52230", "lr": "5.18074e-05", "gnorm": "1.272", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "13373"} 2023-01-29 19:54:37 | INFO | train_inner | {"epoch": 25, "update": 24.17, "s2c_loss": "0.021", "loss": "0.01463", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "52240", "lr": "5.17407e-05", "gnorm": "0.871", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "13375"} 2023-01-29 19:54:40 | INFO | train_inner | {"epoch": 25, "update": 24.175, "s2c_loss": "0.04", "loss": "0.02761", "s2c_nll_loss": "0.04", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "52250", "lr": "5.16741e-05", "gnorm": "1.576", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "13378"} 2023-01-29 19:54:42 | INFO | train_inner | {"epoch": 25, "update": 24.179, "s2c_loss": "0.018", "loss": "0.01221", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "52260", "lr": "5.16074e-05", "gnorm": "1.041", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "13380"} 2023-01-29 19:54:45 | INFO | train_inner | {"epoch": 25, "update": 24.184, "s2c_loss": "0.541", "loss": "0.37516", "s2c_nll_loss": "0.541", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "52270", "lr": "5.15408e-05", "gnorm": "1.55", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": 
"13383"} 2023-01-29 19:54:47 | INFO | train_inner | {"epoch": 25, "update": 24.189, "s2c_loss": "0.019", "loss": "0.01283", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "52280", "lr": "5.14741e-05", "gnorm": "0.854", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "13385"} 2023-01-29 19:54:50 | INFO | train_inner | {"epoch": 25, "update": 24.193, "s2c_loss": "0.029", "loss": "0.01976", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "52290", "lr": "5.14074e-05", "gnorm": "1.254", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "13388"} 2023-01-29 19:54:52 | INFO | train_inner | {"epoch": 25, "update": 24.198, "s2c_loss": "0.037", "loss": "0.02534", "s2c_nll_loss": "0.037", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "52300", "lr": "5.13408e-05", "gnorm": "2.14", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "13390"} 2023-01-29 19:54:55 | INFO | train_inner | {"epoch": 25, "update": 24.203, "s2c_loss": "0.033", "loss": "0.02266", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52310", "lr": "5.12741e-05", "gnorm": "1.377", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "13393"} 2023-01-29 19:54:57 | INFO | train_inner | {"epoch": 25, "update": 24.207, "s2c_loss": "0.037", "loss": "0.02531", "s2c_nll_loss": "0.037", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "52320", "lr": "5.12074e-05", "gnorm": "1.07", "loss_scale": "4096", "train_wall": 
"2", "gb_free": "7.3", "wall": "13395"} 2023-01-29 19:55:00 | INFO | train_inner | {"epoch": 25, "update": 24.212, "s2c_loss": "0.036", "loss": "0.02471", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "52330", "lr": "5.11408e-05", "gnorm": "1.475", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "13398"} 2023-01-29 19:55:03 | INFO | train_inner | {"epoch": 25, "update": 24.216, "s2c_loss": "0.041", "loss": "0.02839", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "52340", "lr": "5.10741e-05", "gnorm": "1.459", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "13400"} 2023-01-29 19:55:05 | INFO | train_inner | {"epoch": 25, "update": 24.221, "s2c_loss": "0.036", "loss": "0.02505", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "52350", "lr": "5.10074e-05", "gnorm": "1.459", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "13403"} 2023-01-29 19:55:08 | INFO | train_inner | {"epoch": 25, "update": 24.226, "s2c_loss": "0.036", "loss": "0.02476", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "52360", "lr": "5.09408e-05", "gnorm": "1.624", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "13406"} 2023-01-29 19:55:10 | INFO | train_inner | {"epoch": 25, "update": 24.23, "s2c_loss": "0.033", "loss": "0.02299", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "52370", "lr": "5.08741e-05", "gnorm": "1.535", 
"loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "13408"} 2023-01-29 19:55:13 | INFO | train_inner | {"epoch": 25, "update": 24.235, "s2c_loss": "0.034", "loss": "0.02354", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52380", "lr": "5.08075e-05", "gnorm": "1.54", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "13411"} 2023-01-29 19:55:15 | INFO | train_inner | {"epoch": 25, "update": 24.24, "s2c_loss": "0.026", "loss": "0.01833", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52390", "lr": "5.07408e-05", "gnorm": "0.925", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "13413"} 2023-01-29 19:55:18 | INFO | train_inner | {"epoch": 25, "update": 24.244, "s2c_loss": "0.042", "loss": "0.02926", "s2c_nll_loss": "0.042", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "52400", "lr": "5.06741e-05", "gnorm": "1.548", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "13416"} 2023-01-29 19:55:20 | INFO | train_inner | {"epoch": 25, "update": 24.249, "s2c_loss": "0.045", "loss": "0.03108", "s2c_nll_loss": "0.045", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52410", "lr": "5.06075e-05", "gnorm": "1.091", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "13418"} 2023-01-29 19:55:23 | INFO | train_inner | {"epoch": 25, "update": 24.253, "s2c_loss": "0.024", "loss": "0.01632", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "52420", "lr": 
"5.05408e-05", "gnorm": "1.049", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "13421"} 2023-01-29 19:55:25 | INFO | train_inner | {"epoch": 25, "update": 24.258, "s2c_loss": "0.023", "loss": "0.01572", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "52430", "lr": "5.04741e-05", "gnorm": "1.267", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "13423"} 2023-01-29 19:55:28 | INFO | train_inner | {"epoch": 25, "update": 24.263, "s2c_loss": "0.016", "loss": "0.01132", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "246.8", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "52440", "lr": "5.04075e-05", "gnorm": "0.944", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "13426"} 2023-01-29 19:55:31 | INFO | train_inner | {"epoch": 25, "update": 24.267, "s2c_loss": "0.015", "loss": "0.01049", "s2c_nll_loss": "0.015", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "52450", "lr": "5.03408e-05", "gnorm": "0.829", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "13428"} 2023-01-29 19:55:33 | INFO | train_inner | {"epoch": 25, "update": 24.272, "s2c_loss": "0.033", "loss": "0.0227", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52460", "lr": "5.02742e-05", "gnorm": "1.322", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "13431"} 2023-01-29 19:55:36 | INFO | train_inner | {"epoch": 25, "update": 24.277, "s2c_loss": "0.215", "loss": "0.14899", "s2c_nll_loss": "0.215", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", 
"num_updates": "52470", "lr": "5.02075e-05", "gnorm": "1.742", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "13434"} 2023-01-29 19:55:38 | INFO | train_inner | {"epoch": 25, "update": 24.281, "s2c_loss": "0.038", "loss": "0.0264", "s2c_nll_loss": "0.038", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "52480", "lr": "5.01408e-05", "gnorm": "1.386", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "13436"} 2023-01-29 19:55:41 | INFO | train_inner | {"epoch": 25, "update": 24.286, "s2c_loss": "0.019", "loss": "0.01337", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "52490", "lr": "5.00742e-05", "gnorm": "1.003", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "13439"} 2023-01-29 19:55:43 | INFO | train_inner | {"epoch": 25, "update": 24.29, "s2c_loss": "0.023", "loss": "0.01582", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52500", "lr": "5.00075e-05", "gnorm": "0.847", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "13441"} 2023-01-29 19:55:46 | INFO | train_inner | {"epoch": 25, "update": 24.295, "s2c_loss": "0.041", "loss": "0.02865", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "52510", "lr": "4.99408e-05", "gnorm": "1.697", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "13444"} 2023-01-29 19:55:48 | INFO | train_inner | {"epoch": 25, "update": 24.3, "s2c_loss": "0.075", "loss": "0.05199", "s2c_nll_loss": "0.075", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "256.3", "ups": "4.01", 
"wpb": "64", "bsz": "64", "num_updates": "52520", "lr": "4.98742e-05", "gnorm": "1.864", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "13446"} 2023-01-29 19:55:51 | INFO | train_inner | {"epoch": 25, "update": 24.304, "s2c_loss": "0.014", "loss": "0.00967", "s2c_nll_loss": "0.014", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "52530", "lr": "4.98075e-05", "gnorm": "0.563", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "13449"} 2023-01-29 19:55:53 | INFO | train_inner | {"epoch": 25, "update": 24.309, "s2c_loss": "0.024", "loss": "0.01662", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "52540", "lr": "4.97408e-05", "gnorm": "0.953", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "13451"} 2023-01-29 19:55:56 | INFO | train_inner | {"epoch": 25, "update": 24.314, "s2c_loss": "0.022", "loss": "0.01509", "s2c_nll_loss": "0.022", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52550", "lr": "4.96742e-05", "gnorm": "0.894", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "13454"} 2023-01-29 19:55:58 | INFO | train_inner | {"epoch": 25, "update": 24.318, "s2c_loss": "0.044", "loss": "0.03055", "s2c_nll_loss": "0.044", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "52560", "lr": "4.96075e-05", "gnorm": "1.393", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "13456"} 2023-01-29 19:56:01 | INFO | train_inner | {"epoch": 25, "update": 24.323, "s2c_loss": "0.035", "loss": "0.02428", "s2c_nll_loss": "0.035", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", 
"wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "52570", "lr": "4.95409e-05", "gnorm": "1.606", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "13459"} 2023-01-29 19:56:03 | INFO | train_inner | {"epoch": 25, "update": 24.327, "s2c_loss": "0.069", "loss": "0.04802", "s2c_nll_loss": "0.069", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "52580", "lr": "4.94742e-05", "gnorm": "2.575", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "13461"} 2023-01-29 19:56:06 | INFO | train_inner | {"epoch": 25, "update": 24.332, "s2c_loss": "0.032", "loss": "0.02216", "s2c_nll_loss": "0.032", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "52590", "lr": "4.94075e-05", "gnorm": "1.198", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "13464"} 2023-01-29 19:56:08 | INFO | train_inner | {"epoch": 25, "update": 24.337, "s2c_loss": "0.04", "loss": "0.02783", "s2c_nll_loss": "0.04", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "52600", "lr": "4.93409e-05", "gnorm": "1.747", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "13466"} 2023-01-29 19:56:11 | INFO | train_inner | {"epoch": 25, "update": 24.341, "s2c_loss": "0.037", "loss": "0.02576", "s2c_nll_loss": "0.037", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52610", "lr": "4.92742e-05", "gnorm": "1.721", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "13469"} 2023-01-29 19:56:14 | INFO | train_inner | {"epoch": 25, "update": 24.346, "s2c_loss": "0.04", "loss": "0.02797", "s2c_nll_loss": "0.04", "s2c_accuracy": "99.375", "s2c_total": "64", 
"s2c_n_correct": "63.6", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "52620", "lr": "4.92075e-05", "gnorm": "1.451", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "13471"} 2023-01-29 19:56:14 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 2048.0 2023-01-29 19:56:16 | INFO | train_inner | {"epoch": 25, "update": 24.351, "s2c_loss": "0.05", "loss": "0.03498", "s2c_nll_loss": "0.05", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "230.2", "ups": "3.6", "wpb": "64", "bsz": "64", "num_updates": "52630", "lr": "4.91409e-05", "gnorm": "1.49", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13474"} 2023-01-29 19:56:19 | INFO | train_inner | {"epoch": 25, "update": 24.356, "s2c_loss": "0.044", "loss": "0.03048", "s2c_nll_loss": "0.044", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "52640", "lr": "4.90742e-05", "gnorm": "1.687", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13477"} 2023-01-29 19:56:21 | INFO | train_inner | {"epoch": 25, "update": 24.36, "s2c_loss": "0.041", "loss": "0.02824", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "52650", "lr": "4.90076e-05", "gnorm": "1.427", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "13479"} 2023-01-29 19:56:24 | INFO | train_inner | {"epoch": 25, "update": 24.365, "s2c_loss": "0.028", "loss": "0.01918", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52660", "lr": "4.89409e-05", "gnorm": "1.205", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13482"} 2023-01-29 19:56:27 | INFO | 
train_inner | {"epoch": 25, "update": 24.37, "s2c_loss": "0.049", "loss": "0.03363", "s2c_nll_loss": "0.049", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "52670", "lr": "4.88742e-05", "gnorm": "1.596", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13484"} 2023-01-29 19:56:29 | INFO | train_inner | {"epoch": 25, "update": 24.374, "s2c_loss": "0.038", "loss": "0.02625", "s2c_nll_loss": "0.038", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "52680", "lr": "4.88076e-05", "gnorm": "1.417", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13487"} 2023-01-29 19:56:32 | INFO | train_inner | {"epoch": 25, "update": 24.379, "s2c_loss": "0.034", "loss": "0.02351", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "52690", "lr": "4.87409e-05", "gnorm": "1.203", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13489"} 2023-01-29 19:56:34 | INFO | train_inner | {"epoch": 25, "update": 24.383, "s2c_loss": "0.052", "loss": "0.03593", "s2c_nll_loss": "0.052", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "52700", "lr": "4.86742e-05", "gnorm": "1.386", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13492"} 2023-01-29 19:56:37 | INFO | train_inner | {"epoch": 25, "update": 24.388, "s2c_loss": "0.079", "loss": "0.05508", "s2c_nll_loss": "0.079", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "52710", "lr": "4.86076e-05", "gnorm": "2.568", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13494"} 
2023-01-29 19:56:39 | INFO | train_inner | {"epoch": 25, "update": 24.393, "s2c_loss": "0.056", "loss": "0.03852", "s2c_nll_loss": "0.056", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "52720", "lr": "4.85409e-05", "gnorm": "1.891", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13497"} 2023-01-29 19:56:42 | INFO | train_inner | {"epoch": 25, "update": 24.397, "s2c_loss": "0.055", "loss": "0.03846", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "52730", "lr": "4.84742e-05", "gnorm": "1.984", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13500"} 2023-01-29 19:56:44 | INFO | train_inner | {"epoch": 25, "update": 24.402, "s2c_loss": "0.177", "loss": "0.12275", "s2c_nll_loss": "0.177", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "52740", "lr": "4.84076e-05", "gnorm": "1.359", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13502"} 2023-01-29 19:56:47 | INFO | train_inner | {"epoch": 25, "update": 24.407, "s2c_loss": "0.032", "loss": "0.02251", "s2c_nll_loss": "0.032", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "52750", "lr": "4.83409e-05", "gnorm": "1.216", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13505"} 2023-01-29 19:56:49 | INFO | train_inner | {"epoch": 25, "update": 24.411, "s2c_loss": "0.028", "loss": "0.01942", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "52760", "lr": "4.82743e-05", "gnorm": "1.309", "loss_scale": "2048", "train_wall": "2", 
"gb_free": "7.3", "wall": "13507"} 2023-01-29 19:56:52 | INFO | train_inner | {"epoch": 25, "update": 24.416, "s2c_loss": "0.031", "loss": "0.02175", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "52770", "lr": "4.82076e-05", "gnorm": "1.405", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13510"} 2023-01-29 19:56:54 | INFO | train_inner | {"epoch": 25, "update": 24.42, "s2c_loss": "0.041", "loss": "0.0282", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "52780", "lr": "4.81409e-05", "gnorm": "1.586", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13512"} 2023-01-29 19:56:57 | INFO | train_inner | {"epoch": 25, "update": 24.425, "s2c_loss": "0.033", "loss": "0.02322", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "52790", "lr": "4.80743e-05", "gnorm": "1.343", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13515"} 2023-01-29 19:56:59 | INFO | train_inner | {"epoch": 25, "update": 24.43, "s2c_loss": "0.018", "loss": "0.01271", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "52800", "lr": "4.80076e-05", "gnorm": "0.839", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13517"} 2023-01-29 19:57:02 | INFO | train_inner | {"epoch": 25, "update": 24.434, "s2c_loss": "0.025", "loss": "0.01752", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "52810", "lr": "4.79409e-05", "gnorm": "1.223", "loss_scale": 
"2048", "train_wall": "3", "gb_free": "7.2", "wall": "13520"} 2023-01-29 19:57:04 | INFO | train_inner | {"epoch": 25, "update": 24.439, "s2c_loss": "0.059", "loss": "0.04068", "s2c_nll_loss": "0.059", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "52820", "lr": "4.78743e-05", "gnorm": "2.171", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13522"} 2023-01-29 19:57:07 | INFO | train_inner | {"epoch": 25, "update": 24.444, "s2c_loss": "0.036", "loss": "0.02523", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "52830", "lr": "4.78076e-05", "gnorm": "1.593", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13525"} 2023-01-29 19:57:09 | INFO | train_inner | {"epoch": 25, "update": 24.448, "s2c_loss": "0.197", "loss": "0.13667", "s2c_nll_loss": "0.197", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "258.3", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "52840", "lr": "4.77409e-05", "gnorm": "1.447", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13527"} 2023-01-29 19:57:12 | INFO | train_inner | {"epoch": 25, "update": 24.453, "s2c_loss": "0.048", "loss": "0.03352", "s2c_nll_loss": "0.048", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "52850", "lr": "4.76743e-05", "gnorm": "1.806", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13530"} 2023-01-29 19:57:15 | INFO | train_inner | {"epoch": 25, "update": 24.457, "s2c_loss": "0.062", "loss": "0.04274", "s2c_nll_loss": "0.062", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "52860", "lr": "4.76076e-05", 
"gnorm": "1.68", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13532"} 2023-01-29 19:57:17 | INFO | train_inner | {"epoch": 25, "update": 24.462, "s2c_loss": "0.042", "loss": "0.02883", "s2c_nll_loss": "0.042", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "52870", "lr": "4.7541e-05", "gnorm": "1.611", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13535"} 2023-01-29 19:57:20 | INFO | train_inner | {"epoch": 25, "update": 24.467, "s2c_loss": "0.014", "loss": "0.00954", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "52880", "lr": "4.74743e-05", "gnorm": "0.667", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13538"} 2023-01-29 19:57:22 | INFO | train_inner | {"epoch": 25, "update": 24.471, "s2c_loss": "0.083", "loss": "0.05741", "s2c_nll_loss": "0.083", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52890", "lr": "4.74076e-05", "gnorm": "2.061", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13540"} 2023-01-29 19:57:25 | INFO | train_inner | {"epoch": 25, "update": 24.476, "s2c_loss": "0.028", "loss": "0.0192", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "242.9", "ups": "3.79", "wpb": "64", "bsz": "64", "num_updates": "52900", "lr": "4.7341e-05", "gnorm": "1.302", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13543"} 2023-01-29 19:57:27 | INFO | train_inner | {"epoch": 25, "update": 24.481, "s2c_loss": "0.054", "loss": "0.03738", "s2c_nll_loss": "0.054", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": 
"52910", "lr": "4.72743e-05", "gnorm": "1.939", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13545"} 2023-01-29 19:57:30 | INFO | train_inner | {"epoch": 25, "update": 24.485, "s2c_loss": "0.048", "loss": "0.03324", "s2c_nll_loss": "0.048", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "52920", "lr": "4.72076e-05", "gnorm": "1.489", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13548"} 2023-01-29 19:57:32 | INFO | train_inner | {"epoch": 25, "update": 24.49, "s2c_loss": "0.049", "loss": "0.03385", "s2c_nll_loss": "0.049", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "52930", "lr": "4.7141e-05", "gnorm": "1.627", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13550"} 2023-01-29 19:57:35 | INFO | train_inner | {"epoch": 25, "update": 24.494, "s2c_loss": "0.035", "loss": "0.02415", "s2c_nll_loss": "0.035", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "52940", "lr": "4.70743e-05", "gnorm": "1.325", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13553"} 2023-01-29 19:57:37 | INFO | train_inner | {"epoch": 25, "update": 24.499, "s2c_loss": "0.043", "loss": "0.02971", "s2c_nll_loss": "0.043", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "52950", "lr": "4.70077e-05", "gnorm": "1.976", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13555"} 2023-01-29 19:57:40 | INFO | train_inner | {"epoch": 25, "update": 24.504, "s2c_loss": "0.047", "loss": "0.03235", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "256.5", "ups": "4.01", "wpb": "64", 
"bsz": "64", "num_updates": "52960", "lr": "4.6941e-05", "gnorm": "1.591", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13558"} 2023-01-29 19:57:42 | INFO | train_inner | {"epoch": 25, "update": 24.508, "s2c_loss": "0.024", "loss": "0.01685", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "52970", "lr": "4.68743e-05", "gnorm": "1.181", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13560"} 2023-01-29 19:57:45 | INFO | train_inner | {"epoch": 25, "update": 24.513, "s2c_loss": "0.047", "loss": "0.03265", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "52980", "lr": "4.68077e-05", "gnorm": "1.585", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13563"} 2023-01-29 19:57:47 | INFO | train_inner | {"epoch": 25, "update": 24.518, "s2c_loss": "0.036", "loss": "0.02467", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "52990", "lr": "4.6741e-05", "gnorm": "1.126", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13565"} 2023-01-29 19:57:50 | INFO | train_inner | {"epoch": 25, "update": 24.522, "s2c_loss": "0.023", "loss": "0.01616", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "53000", "lr": "4.66743e-05", "gnorm": "1.43", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13568"} 2023-01-29 19:57:52 | INFO | train_inner | {"epoch": 25, "update": 24.527, "s2c_loss": "0.038", "loss": "0.02632", "s2c_nll_loss": "0.038", "s2c_accuracy": "99.372", "s2c_total": "63.7", "s2c_n_correct": "63.3", "wps": "254.2", 
"ups": "3.99", "wpb": "63.7", "bsz": "63.7", "num_updates": "53010", "lr": "4.66077e-05", "gnorm": "1.639", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13570"} 2023-01-29 19:57:55 | INFO | train_inner | {"epoch": 25, "update": 24.531, "s2c_loss": "0.061", "loss": "0.04198", "s2c_nll_loss": "0.061", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "53020", "lr": "4.6541e-05", "gnorm": "2.092", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13573"} 2023-01-29 19:57:58 | INFO | train_inner | {"epoch": 25, "update": 24.536, "s2c_loss": "0.019", "loss": "0.01288", "s2c_nll_loss": "0.019", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "246.9", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "53030", "lr": "4.64743e-05", "gnorm": "0.893", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13575"} 2023-01-29 19:58:00 | INFO | train_inner | {"epoch": 25, "update": 24.541, "s2c_loss": "0.019", "loss": "0.01295", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "53040", "lr": "4.64077e-05", "gnorm": "0.991", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13578"} 2023-01-29 19:58:03 | INFO | train_inner | {"epoch": 25, "update": 24.545, "s2c_loss": "0.047", "loss": "0.03259", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "53050", "lr": "4.6341e-05", "gnorm": "1.373", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13581"} 2023-01-29 19:58:05 | INFO | train_inner | {"epoch": 25, "update": 24.55, "s2c_loss": "0.035", "loss": "0.02443", "s2c_nll_loss": "0.035", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": 
"63.6", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "53060", "lr": "4.62744e-05", "gnorm": "1.727", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13583"} 2023-01-29 19:58:08 | INFO | train_inner | {"epoch": 25, "update": 24.555, "s2c_loss": "0.036", "loss": "0.02506", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "53070", "lr": "4.62077e-05", "gnorm": "1.479", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13586"} 2023-01-29 19:58:10 | INFO | train_inner | {"epoch": 25, "update": 24.559, "s2c_loss": "0.018", "loss": "0.01268", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "53080", "lr": "4.6141e-05", "gnorm": "0.79", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13588"} 2023-01-29 19:58:13 | INFO | train_inner | {"epoch": 25, "update": 24.564, "s2c_loss": "0.024", "loss": "0.01691", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "53090", "lr": "4.60744e-05", "gnorm": "1.029", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13591"} 2023-01-29 19:58:15 | INFO | train_inner | {"epoch": 25, "update": 24.568, "s2c_loss": "0.017", "loss": "0.01177", "s2c_nll_loss": "0.017", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "53100", "lr": "4.60077e-05", "gnorm": "0.71", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13593"} 2023-01-29 19:58:18 | INFO | train_inner | {"epoch": 25, "update": 24.573, "s2c_loss": "0.061", "loss": "0.04194", "s2c_nll_loss": "0.061", "s2c_accuracy": "98.906", 
"s2c_total": "64", "s2c_n_correct": "63.3", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "53110", "lr": "4.5941e-05", "gnorm": "1.969", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13596"} 2023-01-29 19:58:20 | INFO | train_inner | {"epoch": 25, "update": 24.578, "s2c_loss": "0.021", "loss": "0.01487", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "53120", "lr": "4.58744e-05", "gnorm": "1.036", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13598"} 2023-01-29 19:58:23 | INFO | train_inner | {"epoch": 25, "update": 24.582, "s2c_loss": "0.025", "loss": "0.01701", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "53130", "lr": "4.58077e-05", "gnorm": "1.097", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13601"} 2023-01-29 19:58:25 | INFO | train_inner | {"epoch": 25, "update": 24.587, "s2c_loss": "0.031", "loss": "0.0215", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "53140", "lr": "4.5741e-05", "gnorm": "1.431", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "13603"} 2023-01-29 19:58:28 | INFO | train_inner | {"epoch": 25, "update": 24.592, "s2c_loss": "0.046", "loss": "0.03176", "s2c_nll_loss": "0.046", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "53150", "lr": "4.56744e-05", "gnorm": "2.057", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13606"} 2023-01-29 19:58:31 | INFO | train_inner | {"epoch": 25, "update": 24.596, "s2c_loss": "0.032", "loss": "0.022", "s2c_nll_loss": "0.032", 
"s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "53160", "lr": "4.56077e-05", "gnorm": "0.879", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13608"} 2023-01-29 19:58:33 | INFO | train_inner | {"epoch": 25, "update": 24.601, "s2c_loss": "0.017", "loss": "0.01176", "s2c_nll_loss": "0.017", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "53170", "lr": "4.55411e-05", "gnorm": "0.813", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13611"} 2023-01-29 19:58:36 | INFO | train_inner | {"epoch": 25, "update": 24.605, "s2c_loss": "0.027", "loss": "0.01902", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "53180", "lr": "4.54744e-05", "gnorm": "1.158", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13614"} 2023-01-29 19:58:38 | INFO | train_inner | {"epoch": 25, "update": 24.61, "s2c_loss": "0.041", "loss": "0.02853", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "53190", "lr": "4.54077e-05", "gnorm": "1.726", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13616"} 2023-01-29 19:58:41 | INFO | train_inner | {"epoch": 25, "update": 24.615, "s2c_loss": "0.04", "loss": "0.02747", "s2c_nll_loss": "0.04", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "53200", "lr": "4.53411e-05", "gnorm": "1.959", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "13619"} 2023-01-29 19:58:43 | INFO | train_inner | {"epoch": 25, "update": 24.619, "s2c_loss": "0.025", "loss": "0.01751", 
"s2c_nll_loss": "0.025", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "53210", "lr": "4.52744e-05", "gnorm": "1.255", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13621"} 2023-01-29 19:58:46 | INFO | train_inner | {"epoch": 25, "update": 24.624, "s2c_loss": "0.035", "loss": "0.02419", "s2c_nll_loss": "0.035", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "53220", "lr": "4.52077e-05", "gnorm": "1.524", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13624"} 2023-01-29 19:58:48 | INFO | train_inner | {"epoch": 25, "update": 24.629, "s2c_loss": "0.029", "loss": "0.01988", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "53230", "lr": "4.51411e-05", "gnorm": "1.236", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13626"} 2023-01-29 19:58:51 | INFO | train_inner | {"epoch": 25, "update": 24.633, "s2c_loss": "0.028", "loss": "0.01922", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "53240", "lr": "4.50744e-05", "gnorm": "1.225", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13629"} 2023-01-29 19:58:53 | INFO | train_inner | {"epoch": 25, "update": 24.638, "s2c_loss": "0.034", "loss": "0.02355", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "53250", "lr": "4.50078e-05", "gnorm": "2.098", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13631"} 2023-01-29 19:58:56 | INFO | train_inner | {"epoch": 25, "update": 24.642, "s2c_loss": 
"0.023", "loss": "0.01576", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "53260", "lr": "4.49411e-05", "gnorm": "1.239", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13634"} 2023-01-29 19:58:58 | INFO | train_inner | {"epoch": 25, "update": 24.647, "s2c_loss": "0.047", "loss": "0.03276", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "259.3", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "53270", "lr": "4.48744e-05", "gnorm": "1.406", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13636"} 2023-01-29 19:59:01 | INFO | train_inner | {"epoch": 25, "update": 24.652, "s2c_loss": "0.05", "loss": "0.03431", "s2c_nll_loss": "0.05", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "53280", "lr": "4.48078e-05", "gnorm": "1.259", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13639"} 2023-01-29 19:59:03 | INFO | train_inner | {"epoch": 25, "update": 24.656, "s2c_loss": "0.04", "loss": "0.0277", "s2c_nll_loss": "0.04", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "53290", "lr": "4.47411e-05", "gnorm": "1.706", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13641"} 2023-01-29 19:59:06 | INFO | train_inner | {"epoch": 25, "update": 24.661, "s2c_loss": "0.028", "loss": "0.01921", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "53300", "lr": "4.46744e-05", "gnorm": "1.374", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13644"} 2023-01-29 19:59:08 | INFO | train_inner | {"epoch": 25, 
"update": 24.666, "s2c_loss": "0.031", "loss": "0.02146", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "53310", "lr": "4.46078e-05", "gnorm": "1.344", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13646"} 2023-01-29 19:59:11 | INFO | train_inner | {"epoch": 25, "update": 24.67, "s2c_loss": "0.188", "loss": "0.12997", "s2c_nll_loss": "0.188", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "53320", "lr": "4.45411e-05", "gnorm": "1.462", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13649"} 2023-01-29 19:59:14 | INFO | train_inner | {"epoch": 25, "update": 24.675, "s2c_loss": "0.025", "loss": "0.01727", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "53330", "lr": "4.44744e-05", "gnorm": "1.093", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13651"} 2023-01-29 19:59:16 | INFO | train_inner | {"epoch": 25, "update": 24.679, "s2c_loss": "0.026", "loss": "0.01825", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "53340", "lr": "4.44078e-05", "gnorm": "1.084", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13654"} 2023-01-29 19:59:19 | INFO | train_inner | {"epoch": 25, "update": 24.684, "s2c_loss": "0.043", "loss": "0.03001", "s2c_nll_loss": "0.043", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "53350", "lr": "4.43411e-05", "gnorm": "1.116", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13656"} 2023-01-29 19:59:21 | INFO 
| train_inner | {"epoch": 25, "update": 24.689, "s2c_loss": "0.035", "loss": "0.02457", "s2c_nll_loss": "0.035", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "53360", "lr": "4.42745e-05", "gnorm": "1.683", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13659"} 2023-01-29 19:59:24 | INFO | train_inner | {"epoch": 25, "update": 24.693, "s2c_loss": "0.029", "loss": "0.02013", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "53370", "lr": "4.42078e-05", "gnorm": "1.36", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13662"} 2023-01-29 19:59:26 | INFO | train_inner | {"epoch": 25, "update": 24.698, "s2c_loss": "0.132", "loss": "0.09115", "s2c_nll_loss": "0.132", "s2c_accuracy": "98.125", "s2c_total": "64", "s2c_n_correct": "62.8", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "53380", "lr": "4.41411e-05", "gnorm": "2.601", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13664"} 2023-01-29 19:59:29 | INFO | train_inner | {"epoch": 25, "update": 24.703, "s2c_loss": "0.067", "loss": "0.04622", "s2c_nll_loss": "0.067", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "53390", "lr": "4.40745e-05", "gnorm": "2.287", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13667"} 2023-01-29 19:59:31 | INFO | train_inner | {"epoch": 25, "update": 24.707, "s2c_loss": "0.052", "loss": "0.03594", "s2c_nll_loss": "0.052", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "258.5", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "53400", "lr": "4.40078e-05", "gnorm": "1.816", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13669"} 
2023-01-29 19:59:34 | INFO | train_inner | {"epoch": 25, "update": 24.712, "s2c_loss": "0.057", "loss": "0.03981", "s2c_nll_loss": "0.057", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "53410", "lr": "4.39411e-05", "gnorm": "1.979", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13672"} 2023-01-29 19:59:36 | INFO | train_inner | {"epoch": 25, "update": 24.716, "s2c_loss": "0.061", "loss": "0.04246", "s2c_nll_loss": "0.061", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "53420", "lr": "4.38745e-05", "gnorm": "1.459", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13674"} 2023-01-29 19:59:39 | INFO | train_inner | {"epoch": 25, "update": 24.721, "s2c_loss": "0.047", "loss": "0.03274", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "259.9", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "53430", "lr": "4.38078e-05", "gnorm": "1.439", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13677"} 2023-01-29 19:59:41 | INFO | train_inner | {"epoch": 25, "update": 24.726, "s2c_loss": "0.078", "loss": "0.05393", "s2c_nll_loss": "0.078", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "53440", "lr": "4.37411e-05", "gnorm": "1.898", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13679"} 2023-01-29 19:59:44 | INFO | train_inner | {"epoch": 25, "update": 24.73, "s2c_loss": "0.102", "loss": "0.07061", "s2c_nll_loss": "0.102", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "53450", "lr": "4.36745e-05", "gnorm": "2.369", "loss_scale": "2048", "train_wall": "3", 
"gb_free": "7.3", "wall": "13682"} 2023-01-29 19:59:46 | INFO | train_inner | {"epoch": 25, "update": 24.735, "s2c_loss": "0.033", "loss": "0.02297", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "53460", "lr": "4.36078e-05", "gnorm": "1.592", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13684"} 2023-01-29 19:59:49 | INFO | train_inner | {"epoch": 25, "update": 24.74, "s2c_loss": "0.025", "loss": "0.01754", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "53470", "lr": "4.35412e-05", "gnorm": "0.982", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13687"} 2023-01-29 19:59:51 | INFO | train_inner | {"epoch": 25, "update": 24.744, "s2c_loss": "0.046", "loss": "0.03168", "s2c_nll_loss": "0.046", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "259.1", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "53480", "lr": "4.34745e-05", "gnorm": "1.776", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13689"} 2023-01-29 19:59:54 | INFO | train_inner | {"epoch": 25, "update": 24.749, "s2c_loss": "0.046", "loss": "0.03185", "s2c_nll_loss": "0.046", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "53490", "lr": "4.34078e-05", "gnorm": "1.696", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13692"} 2023-01-29 19:59:56 | INFO | train_inner | {"epoch": 25, "update": 24.753, "s2c_loss": "0.049", "loss": "0.03366", "s2c_nll_loss": "0.049", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "53500", "lr": "4.33412e-05", "gnorm": "1.492", "loss_scale": 
"2048", "train_wall": "2", "gb_free": "7.3", "wall": "13694"} 2023-01-29 19:59:59 | INFO | train_inner | {"epoch": 25, "update": 24.758, "s2c_loss": "0.051", "loss": "0.03521", "s2c_nll_loss": "0.051", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "53510", "lr": "4.32745e-05", "gnorm": "1.975", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13697"} 2023-01-29 20:00:01 | INFO | train_inner | {"epoch": 25, "update": 24.763, "s2c_loss": "0.049", "loss": "0.03403", "s2c_nll_loss": "0.049", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "53520", "lr": "4.32078e-05", "gnorm": "1.847", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13699"} 2023-01-29 20:00:04 | INFO | train_inner | {"epoch": 25, "update": 24.767, "s2c_loss": "0.034", "loss": "0.02361", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "53530", "lr": "4.31412e-05", "gnorm": "1.141", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13702"} 2023-01-29 20:00:07 | INFO | train_inner | {"epoch": 25, "update": 24.772, "s2c_loss": "0.055", "loss": "0.03824", "s2c_nll_loss": "0.055", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "53540", "lr": "4.30745e-05", "gnorm": "1.649", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "13704"} 2023-01-29 20:00:09 | INFO | train_inner | {"epoch": 25, "update": 24.777, "s2c_loss": "0.073", "loss": "0.05055", "s2c_nll_loss": "0.073", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "262.2", "ups": "4.1", "wpb": "64", "bsz": "64", "num_updates": "53550", "lr": "4.30079e-05", 
"gnorm": "1.944", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13707"} 2023-01-29 20:00:11 | INFO | train_inner | {"epoch": 25, "update": 24.781, "s2c_loss": "0.047", "loss": "0.03273", "s2c_nll_loss": "0.047", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "53560", "lr": "4.29412e-05", "gnorm": "1.972", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13709"} 2023-01-29 20:00:14 | INFO | train_inner | {"epoch": 25, "update": 24.786, "s2c_loss": "0.061", "loss": "0.04196", "s2c_nll_loss": "0.061", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "53570", "lr": "4.28745e-05", "gnorm": "2.121", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13712"} 2023-01-29 20:00:16 | INFO | train_inner | {"epoch": 25, "update": 24.79, "s2c_loss": "0.043", "loss": "0.02971", "s2c_nll_loss": "0.043", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "53580", "lr": "4.28079e-05", "gnorm": "1.949", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13714"} 2023-01-29 20:00:19 | INFO | train_inner | {"epoch": 25, "update": 24.795, "s2c_loss": "0.037", "loss": "0.02553", "s2c_nll_loss": "0.037", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "53590", "lr": "4.27412e-05", "gnorm": "1.89", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13717"} 2023-01-29 20:00:21 | INFO | train_inner | {"epoch": 25, "update": 24.8, "s2c_loss": "0.08", "loss": "0.05532", "s2c_nll_loss": "0.08", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": 
"53600", "lr": "4.26745e-05", "gnorm": "1.795", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13719"} 2023-01-29 20:00:24 | INFO | train_inner | {"epoch": 25, "update": 24.804, "s2c_loss": "0.029", "loss": "0.01986", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "246.5", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "53610", "lr": "4.26079e-05", "gnorm": "1.592", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13722"} 2023-01-29 20:00:27 | INFO | train_inner | {"epoch": 25, "update": 24.809, "s2c_loss": "0.03", "loss": "0.02068", "s2c_nll_loss": "0.03", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "53620", "lr": "4.25412e-05", "gnorm": "1.841", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13725"} 2023-01-29 20:00:29 | INFO | train_inner | {"epoch": 25, "update": 24.814, "s2c_loss": "0.023", "loss": "0.01608", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "53630", "lr": "4.24745e-05", "gnorm": "0.691", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13727"} 2023-01-29 20:00:32 | INFO | train_inner | {"epoch": 25, "update": 24.818, "s2c_loss": "0.04", "loss": "0.02768", "s2c_nll_loss": "0.04", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.5", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "53640", "lr": "4.24079e-05", "gnorm": "1.256", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13730"} 2023-01-29 20:00:34 | INFO | train_inner | {"epoch": 25, "update": 24.823, "s2c_loss": "0.067", "loss": "0.04656", "s2c_nll_loss": "0.067", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "258.5", "ups": "4.04", "wpb": "64", 
"bsz": "64", "num_updates": "53650", "lr": "4.23412e-05", "gnorm": "1.248", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13732"} 2023-01-29 20:00:37 | INFO | train_inner | {"epoch": 25, "update": 24.827, "s2c_loss": "0.033", "loss": "0.02282", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "53660", "lr": "4.22746e-05", "gnorm": "1.19", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13735"} 2023-01-29 20:00:39 | INFO | train_inner | {"epoch": 25, "update": 24.832, "s2c_loss": "0.033", "loss": "0.02268", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "53670", "lr": "4.22079e-05", "gnorm": "1.225", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13737"} 2023-01-29 20:00:42 | INFO | train_inner | {"epoch": 25, "update": 24.837, "s2c_loss": "0.028", "loss": "0.01925", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "53680", "lr": "4.21412e-05", "gnorm": "1.189", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13740"} 2023-01-29 20:00:44 | INFO | train_inner | {"epoch": 25, "update": 24.841, "s2c_loss": "0.139", "loss": "0.09658", "s2c_nll_loss": "0.139", "s2c_accuracy": "97.969", "s2c_total": "64", "s2c_n_correct": "62.7", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "53690", "lr": "4.20746e-05", "gnorm": "2.107", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13742"} 2023-01-29 20:00:47 | INFO | train_inner | {"epoch": 25, "update": 24.846, "s2c_loss": "0.067", "loss": "0.04643", "s2c_nll_loss": "0.067", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": 
"251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "53700", "lr": "4.20079e-05", "gnorm": "1.71", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13745"} 2023-01-29 20:00:49 | INFO | train_inner | {"epoch": 25, "update": 24.851, "s2c_loss": "0.038", "loss": "0.02629", "s2c_nll_loss": "0.038", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "53710", "lr": "4.19412e-05", "gnorm": "1.081", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13747"} 2023-01-29 20:00:52 | INFO | train_inner | {"epoch": 25, "update": 24.855, "s2c_loss": "0.035", "loss": "0.02425", "s2c_nll_loss": "0.035", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "53720", "lr": "4.18746e-05", "gnorm": "1.21", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13750"} 2023-01-29 20:00:54 | INFO | train_inner | {"epoch": 25, "update": 24.86, "s2c_loss": "0.045", "loss": "0.03094", "s2c_nll_loss": "0.045", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "53730", "lr": "4.18079e-05", "gnorm": "1.835", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13752"} 2023-01-29 20:00:57 | INFO | train_inner | {"epoch": 25, "update": 24.864, "s2c_loss": "0.043", "loss": "0.02976", "s2c_nll_loss": "0.043", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "53740", "lr": "4.17412e-05", "gnorm": "1.987", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13755"} 2023-01-29 20:00:59 | INFO | train_inner | {"epoch": 25, "update": 24.869, "s2c_loss": "0.027", "loss": "0.01837", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.688", "s2c_total": "64", 
"s2c_n_correct": "63.8", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "53750", "lr": "4.16746e-05", "gnorm": "1.223", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "13757"} 2023-01-29 20:01:02 | INFO | train_inner | {"epoch": 25, "update": 24.874, "s2c_loss": "0.057", "loss": "0.03974", "s2c_nll_loss": "0.057", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "247", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "53760", "lr": "4.16079e-05", "gnorm": "2.335", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13760"} 2023-01-29 20:01:05 | INFO | train_inner | {"epoch": 25, "update": 24.878, "s2c_loss": "0.032", "loss": "0.02239", "s2c_nll_loss": "0.032", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "53770", "lr": "4.15413e-05", "gnorm": "1.081", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13762"} 2023-01-29 20:01:07 | INFO | train_inner | {"epoch": 25, "update": 24.883, "s2c_loss": "0.053", "loss": "0.03644", "s2c_nll_loss": "0.053", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "53780", "lr": "4.14746e-05", "gnorm": "2.079", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13765"} 2023-01-29 20:01:10 | INFO | train_inner | {"epoch": 25, "update": 24.888, "s2c_loss": "0.046", "loss": "0.03223", "s2c_nll_loss": "0.046", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "53790", "lr": "4.14079e-05", "gnorm": "2.062", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13767"} 2023-01-29 20:01:12 | INFO | train_inner | {"epoch": 25, "update": 24.892, "s2c_loss": "0.023", "loss": "0.01593", "s2c_nll_loss": "0.023", "s2c_accuracy": 
"99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "53800", "lr": "4.13413e-05", "gnorm": "1.183", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13770"} 2023-01-29 20:01:15 | INFO | train_inner | {"epoch": 25, "update": 24.897, "s2c_loss": "0.069", "loss": "0.04751", "s2c_nll_loss": "0.069", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "53810", "lr": "4.12746e-05", "gnorm": "1.761", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13773"} 2023-01-29 20:01:17 | INFO | train_inner | {"epoch": 25, "update": 24.901, "s2c_loss": "0.034", "loss": "0.02335", "s2c_nll_loss": "0.034", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "53820", "lr": "4.12079e-05", "gnorm": "0.811", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13775"} 2023-01-29 20:01:20 | INFO | train_inner | {"epoch": 25, "update": 24.906, "s2c_loss": "0.023", "loss": "0.01606", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "53830", "lr": "4.11413e-05", "gnorm": "1.108", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.5", "wall": "13778"} 2023-01-29 20:01:22 | INFO | train_inner | {"epoch": 25, "update": 24.911, "s2c_loss": "0.054", "loss": "0.03714", "s2c_nll_loss": "0.054", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "53840", "lr": "4.10746e-05", "gnorm": "1.409", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13780"} 2023-01-29 20:01:25 | INFO | train_inner | {"epoch": 25, "update": 24.915, "s2c_loss": "0.038", "loss": "0.02633", "s2c_nll_loss": 
"0.038", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "53850", "lr": "4.1008e-05", "gnorm": "1.544", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13783"} 2023-01-29 20:01:27 | INFO | train_inner | {"epoch": 25, "update": 24.92, "s2c_loss": "0.02", "loss": "0.01373", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "53860", "lr": "4.09413e-05", "gnorm": "1.082", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13785"} 2023-01-29 20:01:30 | INFO | train_inner | {"epoch": 25, "update": 24.925, "s2c_loss": "0.187", "loss": "0.12957", "s2c_nll_loss": "0.187", "s2c_accuracy": "97.5", "s2c_total": "64", "s2c_n_correct": "62.4", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "53870", "lr": "4.08746e-05", "gnorm": "3.161", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13788"} 2023-01-29 20:01:32 | INFO | train_inner | {"epoch": 25, "update": 24.929, "s2c_loss": "0.054", "loss": "0.03754", "s2c_nll_loss": "0.054", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "53880", "lr": "4.0808e-05", "gnorm": "1.808", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13790"} 2023-01-29 20:01:35 | INFO | train_inner | {"epoch": 25, "update": 24.934, "s2c_loss": "0.031", "loss": "0.02177", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "53890", "lr": "4.07413e-05", "gnorm": "1.2", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13793"} 2023-01-29 20:01:37 | INFO | train_inner | {"epoch": 25, "update": 24.938, "s2c_loss": "0.054", "loss": 
"0.03769", "s2c_nll_loss": "0.054", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "53900", "lr": "4.06746e-05", "gnorm": "2.08", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13795"} 2023-01-29 20:01:40 | INFO | train_inner | {"epoch": 25, "update": 24.943, "s2c_loss": "0.038", "loss": "0.02621", "s2c_nll_loss": "0.038", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "257.6", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "53910", "lr": "4.0608e-05", "gnorm": "1.313", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13798"} 2023-01-29 20:01:42 | INFO | train_inner | {"epoch": 25, "update": 24.948, "s2c_loss": "0.036", "loss": "0.02502", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "53920", "lr": "4.05413e-05", "gnorm": "1.673", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13800"} 2023-01-29 20:01:45 | INFO | train_inner | {"epoch": 25, "update": 24.952, "s2c_loss": "0.025", "loss": "0.01735", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "247.5", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "53930", "lr": "4.04746e-05", "gnorm": "1.038", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13803"} 2023-01-29 20:01:47 | INFO | train_inner | {"epoch": 25, "update": 24.957, "s2c_loss": "0.021", "loss": "0.01432", "s2c_nll_loss": "0.021", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "53940", "lr": "4.0408e-05", "gnorm": "1.085", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13805"} 2023-01-29 20:01:50 | INFO | train_inner | {"epoch": 25, "update": 24.962, 
"s2c_loss": "0.029", "loss": "0.01978", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "53950", "lr": "4.03413e-05", "gnorm": "1.241", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13808"} 2023-01-29 20:01:52 | INFO | train_inner | {"epoch": 25, "update": 24.966, "s2c_loss": "0.025", "loss": "0.01746", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "53960", "lr": "4.02747e-05", "gnorm": "1.196", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13810"} 2023-01-29 20:01:55 | INFO | train_inner | {"epoch": 25, "update": 24.971, "s2c_loss": "0.023", "loss": "0.01575", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "53970", "lr": "4.0208e-05", "gnorm": "1.098", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13813"} 2023-01-29 20:01:58 | INFO | train_inner | {"epoch": 25, "update": 24.975, "s2c_loss": "0.528", "loss": "0.36578", "s2c_nll_loss": "0.528", "s2c_accuracy": "94.531", "s2c_total": "64", "s2c_n_correct": "60.5", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "53980", "lr": "4.01413e-05", "gnorm": "1.641", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13816"} 2023-01-29 20:02:00 | INFO | train_inner | {"epoch": 25, "update": 24.98, "s2c_loss": "0.039", "loss": "0.02731", "s2c_nll_loss": "0.039", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "53990", "lr": "4.00747e-05", "gnorm": "1.376", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13818"} 2023-01-29 20:02:03 | INFO | train_inner | {"epoch": 
25, "update": 24.985, "s2c_loss": "0.029", "loss": "0.0202", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "54000", "lr": "4.0008e-05", "gnorm": "1.147", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13821"} 2023-01-29 20:02:05 | INFO | train_inner | {"epoch": 25, "update": 24.989, "s2c_loss": "0.063", "loss": "0.04358", "s2c_nll_loss": "0.063", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "54010", "lr": "3.99413e-05", "gnorm": "1.428", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13823"} 2023-01-29 20:02:08 | INFO | train_inner | {"epoch": 25, "update": 24.994, "s2c_loss": "0.021", "loss": "0.01421", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "260.1", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "54020", "lr": "3.98747e-05", "gnorm": "1.085", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13826"} 2023-01-29 20:02:10 | INFO | train_inner | {"epoch": 25, "update": 24.999, "s2c_loss": "0.02", "loss": "0.01369", "s2c_nll_loss": "0.02", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "54030", "lr": "3.9808e-05", "gnorm": "0.9", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13828"} 2023-01-29 20:02:11 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 20:02:26 | INFO | valid | {"epoch": 25, "valid_s2c_loss": "0.377", "valid_loss": "0.26168", "valid_s2c_nll_loss": "0.377", "valid_s2c_accuracy": "93.047", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "29.7361", "valid_num_updates": "54033", "valid_best_s2c_accuracy": "93.047"} 2023-01-29 20:02:26 | INFO | fairseq.checkpoint_utils | 
Preparing to save checkpoint for epoch 25 @ 54033 updates 2023-01-29 20:02:26 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 20:02:33 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 20:02:38 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt (epoch 25 @ 54033 updates, score 93.047) (writing took 12.001213863957673 seconds) 2023-01-29 20:02:38 | INFO | fairseq_cli.train | end of epoch 25 (average epoch stats below) 2023-01-29 20:02:38 | INFO | train | {"epoch": 25, "train_s2c_loss": "0.049", "train_loss": "0.03366", "train_s2c_nll_loss": "0.049", "train_s2c_accuracy": "99.362", "train_s2c_total": "63.9838", "train_s2c_n_correct": "63.5757", "train_wps": "238.2", "train_ups": "3.72", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "54033", "train_lr": "3.9788e-05", "train_gnorm": "1.452", "train_loss_scale": "2048", "train_train_wall": "539", "train_gb_free": "7.4", "train_wall": "13856"} 2023-01-29 20:02:44 | INFO | fairseq.trainer | begin training epoch 26 2023-01-29 20:02:44 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 20:02:46 | INFO | train_inner | {"epoch": 26, "update": 25.003, "s2c_loss": "0.031", "loss": "0.02121", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.671", "s2c_total": "60.8", "s2c_n_correct": "60.6", "wps": "16.9", "ups": "0.28", "wpb": "60.8", "bsz": "60.8", "num_updates": "54040", "lr": "3.97413e-05", "gnorm": "1.176", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13864"} 2023-01-29 20:02:49 | INFO | train_inner | {"epoch": 26, "update": 25.008, "s2c_loss": "0.03", "loss": "0.02092", "s2c_nll_loss": "0.03", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "54050", "lr": 
"3.96747e-05", "gnorm": "0.827", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13867"} 2023-01-29 20:02:51 | INFO | train_inner | {"epoch": 26, "update": 25.012, "s2c_loss": "0.016", "loss": "0.01126", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "54060", "lr": "3.9608e-05", "gnorm": "0.925", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13869"} 2023-01-29 20:02:54 | INFO | train_inner | {"epoch": 26, "update": 25.017, "s2c_loss": "0.026", "loss": "0.01777", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "54070", "lr": "3.95414e-05", "gnorm": "1.074", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13872"} 2023-01-29 20:02:56 | INFO | train_inner | {"epoch": 26, "update": 25.022, "s2c_loss": "0.011", "loss": "0.00769", "s2c_nll_loss": "0.011", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "243.3", "ups": "3.8", "wpb": "64", "bsz": "64", "num_updates": "54080", "lr": "3.94747e-05", "gnorm": "0.656", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13874"} 2023-01-29 20:02:59 | INFO | train_inner | {"epoch": 26, "update": 25.026, "s2c_loss": "0.026", "loss": "0.01809", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "54090", "lr": "3.9408e-05", "gnorm": "1.094", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13877"} 2023-01-29 20:03:02 | INFO | train_inner | {"epoch": 26, "update": 25.031, "s2c_loss": "0.021", "loss": "0.0147", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "246.7", "ups": "3.85", "wpb": "64", "bsz": "64", 
"num_updates": "54100", "lr": "3.93414e-05", "gnorm": "0.925", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13879"} 2023-01-29 20:03:04 | INFO | train_inner | {"epoch": 26, "update": 25.036, "s2c_loss": "0.024", "loss": "0.01675", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "54110", "lr": "3.92747e-05", "gnorm": "1.207", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13882"} 2023-01-29 20:03:07 | INFO | train_inner | {"epoch": 26, "update": 25.04, "s2c_loss": "0.013", "loss": "0.00895", "s2c_nll_loss": "0.013", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "54120", "lr": "3.9208e-05", "gnorm": "0.548", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13885"} 2023-01-29 20:03:09 | INFO | train_inner | {"epoch": 26, "update": 25.045, "s2c_loss": "0.031", "loss": "0.02118", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "245.5", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "54130", "lr": "3.91414e-05", "gnorm": "1.634", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13887"} 2023-01-29 20:03:12 | INFO | train_inner | {"epoch": 26, "update": 25.049, "s2c_loss": "0.041", "loss": "0.02822", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "54140", "lr": "3.90747e-05", "gnorm": "1.343", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13890"} 2023-01-29 20:03:14 | INFO | train_inner | {"epoch": 26, "update": 25.054, "s2c_loss": "0.016", "loss": "0.01124", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "249.8", "ups": "3.9", 
"wpb": "64", "bsz": "64", "num_updates": "54150", "lr": "3.90081e-05", "gnorm": "1.001", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13892"} 2023-01-29 20:03:17 | INFO | train_inner | {"epoch": 26, "update": 25.059, "s2c_loss": "0.016", "loss": "0.01087", "s2c_nll_loss": "0.016", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "54160", "lr": "3.89414e-05", "gnorm": "0.678", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13895"} 2023-01-29 20:03:20 | INFO | train_inner | {"epoch": 26, "update": 25.063, "s2c_loss": "0.041", "loss": "0.02863", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "54170", "lr": "3.88747e-05", "gnorm": "1.354", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13897"} 2023-01-29 20:03:22 | INFO | train_inner | {"epoch": 26, "update": 25.068, "s2c_loss": "0.046", "loss": "0.03163", "s2c_nll_loss": "0.046", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "54180", "lr": "3.88081e-05", "gnorm": "1.116", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13900"} 2023-01-29 20:03:25 | INFO | train_inner | {"epoch": 26, "update": 25.073, "s2c_loss": "0.025", "loss": "0.017", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "54190", "lr": "3.87414e-05", "gnorm": "1.109", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13903"} 2023-01-29 20:03:27 | INFO | train_inner | {"epoch": 26, "update": 25.077, "s2c_loss": "0.021", "loss": "0.01464", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": 
"253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "54200", "lr": "3.86747e-05", "gnorm": "0.949", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13905"} 2023-01-29 20:03:30 | INFO | train_inner | {"epoch": 26, "update": 25.082, "s2c_loss": "0.034", "loss": "0.0233", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "54210", "lr": "3.86081e-05", "gnorm": "1.133", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13908"} 2023-01-29 20:03:32 | INFO | train_inner | {"epoch": 26, "update": 25.086, "s2c_loss": "0.024", "loss": "0.01672", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "54220", "lr": "3.85414e-05", "gnorm": "1.284", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13910"} 2023-01-29 20:03:35 | INFO | train_inner | {"epoch": 26, "update": 25.091, "s2c_loss": "0.191", "loss": "0.13236", "s2c_nll_loss": "0.191", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "54230", "lr": "3.84747e-05", "gnorm": "1.729", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13913"} 2023-01-29 20:03:37 | INFO | train_inner | {"epoch": 26, "update": 25.096, "s2c_loss": "0.039", "loss": "0.02694", "s2c_nll_loss": "0.039", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "54240", "lr": "3.84081e-05", "gnorm": "1.64", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13915"} 2023-01-29 20:03:40 | INFO | train_inner | {"epoch": 26, "update": 25.1, "s2c_loss": "0.043", "loss": "0.02993", "s2c_nll_loss": "0.043", "s2c_accuracy": "99.219", "s2c_total": "64", 
"s2c_n_correct": "63.5", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "54250", "lr": "3.83414e-05", "gnorm": "1.885", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13918"} 2023-01-29 20:03:42 | INFO | train_inner | {"epoch": 26, "update": 25.105, "s2c_loss": "0.014", "loss": "0.00943", "s2c_nll_loss": "0.014", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "54260", "lr": "3.82748e-05", "gnorm": "0.648", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13920"} 2023-01-29 20:03:45 | INFO | train_inner | {"epoch": 26, "update": 25.11, "s2c_loss": "0.025", "loss": "0.017", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "54270", "lr": "3.82081e-05", "gnorm": "1.64", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13923"} 2023-01-29 20:03:47 | INFO | train_inner | {"epoch": 26, "update": 25.114, "s2c_loss": "0.032", "loss": "0.02246", "s2c_nll_loss": "0.032", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "54280", "lr": "3.81414e-05", "gnorm": "1.795", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13925"} 2023-01-29 20:03:50 | INFO | train_inner | {"epoch": 26, "update": 25.119, "s2c_loss": "0.026", "loss": "0.01773", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "54290", "lr": "3.80748e-05", "gnorm": "0.977", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13928"} 2023-01-29 20:03:52 | INFO | train_inner | {"epoch": 26, "update": 25.123, "s2c_loss": "0.041", "loss": "0.02813", "s2c_nll_loss": "0.041", "s2c_accuracy": "100", 
"s2c_total": "64", "s2c_n_correct": "64", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "54300", "lr": "3.80081e-05", "gnorm": "1.35", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13930"} 2023-01-29 20:03:55 | INFO | train_inner | {"epoch": 26, "update": 25.128, "s2c_loss": "0.023", "loss": "0.01587", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "54310", "lr": "3.79414e-05", "gnorm": "0.793", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13933"} 2023-01-29 20:03:57 | INFO | train_inner | {"epoch": 26, "update": 25.133, "s2c_loss": "0.054", "loss": "0.03762", "s2c_nll_loss": "0.054", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "257.2", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "54320", "lr": "3.78748e-05", "gnorm": "1.694", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13935"} 2023-01-29 20:04:00 | INFO | train_inner | {"epoch": 26, "update": 25.137, "s2c_loss": "0.029", "loss": "0.02028", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "54330", "lr": "3.78081e-05", "gnorm": "1.59", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13938"} 2023-01-29 20:04:03 | INFO | train_inner | {"epoch": 26, "update": 25.142, "s2c_loss": "0.015", "loss": "0.01072", "s2c_nll_loss": "0.015", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "54340", "lr": "3.77414e-05", "gnorm": "0.779", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13940"} 2023-01-29 20:04:05 | INFO | train_inner | {"epoch": 26, "update": 25.147, "s2c_loss": "0.037", "loss": "0.0254", "s2c_nll_loss": "0.037", 
"s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "54350", "lr": "3.76748e-05", "gnorm": "1.25", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13943"} 2023-01-29 20:04:08 | INFO | train_inner | {"epoch": 26, "update": 25.151, "s2c_loss": "0.037", "loss": "0.02591", "s2c_nll_loss": "0.037", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "54360", "lr": "3.76081e-05", "gnorm": "1.12", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13946"} 2023-01-29 20:04:10 | INFO | train_inner | {"epoch": 26, "update": 25.156, "s2c_loss": "0.02", "loss": "0.01355", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "54370", "lr": "3.75415e-05", "gnorm": "0.818", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13948"} 2023-01-29 20:04:13 | INFO | train_inner | {"epoch": 26, "update": 25.16, "s2c_loss": "0.027", "loss": "0.01858", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "54380", "lr": "3.74748e-05", "gnorm": "1.235", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13951"} 2023-01-29 20:04:15 | INFO | train_inner | {"epoch": 26, "update": 25.165, "s2c_loss": "0.045", "loss": "0.03101", "s2c_nll_loss": "0.045", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "54390", "lr": "3.74081e-05", "gnorm": "1.134", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13953"} 2023-01-29 20:04:18 | INFO | train_inner | {"epoch": 26, "update": 25.17, "s2c_loss": "0.036", "loss": "0.02501", 
"s2c_nll_loss": "0.036", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "54400", "lr": "3.73415e-05", "gnorm": "1.526", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13956"} 2023-01-29 20:04:20 | INFO | train_inner | {"epoch": 26, "update": 25.174, "s2c_loss": "0.031", "loss": "0.02117", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "54410", "lr": "3.72748e-05", "gnorm": "1.486", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13958"} 2023-01-29 20:04:23 | INFO | train_inner | {"epoch": 26, "update": 25.179, "s2c_loss": "0.061", "loss": "0.04202", "s2c_nll_loss": "0.061", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "54420", "lr": "3.72081e-05", "gnorm": "1.094", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13961"} 2023-01-29 20:04:25 | INFO | train_inner | {"epoch": 26, "update": 25.184, "s2c_loss": "0.025", "loss": "0.01708", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "54430", "lr": "3.71415e-05", "gnorm": "1.31", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13963"} 2023-01-29 20:04:28 | INFO | train_inner | {"epoch": 26, "update": 25.188, "s2c_loss": "0.021", "loss": "0.01426", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "54440", "lr": "3.70748e-05", "gnorm": "0.97", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13966"} 2023-01-29 20:04:30 | INFO | train_inner | {"epoch": 26, "update": 25.193, "s2c_loss": 
"0.036", "loss": "0.02515", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "54450", "lr": "3.70082e-05", "gnorm": "1.149", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13968"} 2023-01-29 20:04:33 | INFO | train_inner | {"epoch": 26, "update": 25.198, "s2c_loss": "0.012", "loss": "0.00844", "s2c_nll_loss": "0.012", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "54460", "lr": "3.69415e-05", "gnorm": "0.65", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13971"} 2023-01-29 20:04:35 | INFO | train_inner | {"epoch": 26, "update": 25.202, "s2c_loss": "0.014", "loss": "0.00949", "s2c_nll_loss": "0.014", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "54470", "lr": "3.68748e-05", "gnorm": "0.723", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13973"} 2023-01-29 20:04:38 | INFO | train_inner | {"epoch": 26, "update": 25.207, "s2c_loss": "0.025", "loss": "0.01762", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "54480", "lr": "3.68082e-05", "gnorm": "1.023", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13976"} 2023-01-29 20:04:40 | INFO | train_inner | {"epoch": 26, "update": 25.211, "s2c_loss": "0.044", "loss": "0.03054", "s2c_nll_loss": "0.044", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "54490", "lr": "3.67415e-05", "gnorm": "1.109", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13978"} 2023-01-29 20:04:43 | INFO | train_inner | {"epoch": 26, "update": 
25.216, "s2c_loss": "0.019", "loss": "0.01297", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "246.6", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "54500", "lr": "3.66748e-05", "gnorm": "1.09", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "13981"} 2023-01-29 20:04:46 | INFO | train_inner | {"epoch": 26, "update": 25.221, "s2c_loss": "0.071", "loss": "0.04907", "s2c_nll_loss": "0.071", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "54510", "lr": "3.66082e-05", "gnorm": "1.565", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "13983"} 2023-01-29 20:04:48 | INFO | train_inner | {"epoch": 26, "update": 25.225, "s2c_loss": "0.029", "loss": "0.01985", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "54520", "lr": "3.65415e-05", "gnorm": "1.289", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13986"} 2023-01-29 20:04:51 | INFO | train_inner | {"epoch": 26, "update": 25.23, "s2c_loss": "0.04", "loss": "0.02772", "s2c_nll_loss": "0.04", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "54530", "lr": "3.64748e-05", "gnorm": "1.334", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "13988"} 2023-01-29 20:04:53 | INFO | train_inner | {"epoch": 26, "update": 25.235, "s2c_loss": "0.028", "loss": "0.01945", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "54540", "lr": "3.64082e-05", "gnorm": "0.833", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13991"} 2023-01-29 20:04:56 | INFO | train_inner | 
{"epoch": 26, "update": 25.239, "s2c_loss": "0.02", "loss": "0.01366", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "54550", "lr": "3.63415e-05", "gnorm": "1.056", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13994"} 2023-01-29 20:04:58 | INFO | train_inner | {"epoch": 26, "update": 25.244, "s2c_loss": "0.027", "loss": "0.01863", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "54560", "lr": "3.62749e-05", "gnorm": "1.127", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "13996"} 2023-01-29 20:05:01 | INFO | train_inner | {"epoch": 26, "update": 25.248, "s2c_loss": "0.031", "loss": "0.02126", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "54570", "lr": "3.62082e-05", "gnorm": "0.671", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "13999"} 2023-01-29 20:05:03 | INFO | train_inner | {"epoch": 26, "update": 25.253, "s2c_loss": "0.015", "loss": "0.01047", "s2c_nll_loss": "0.015", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "54580", "lr": "3.61415e-05", "gnorm": "0.757", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "14001"} 2023-01-29 20:05:06 | INFO | train_inner | {"epoch": 26, "update": 25.258, "s2c_loss": "0.015", "loss": "0.01012", "s2c_nll_loss": "0.015", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "54590", "lr": "3.60749e-05", "gnorm": "0.592", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", "wall": "14004"} 2023-01-29 20:05:08 
| INFO | train_inner | {"epoch": 26, "update": 25.262, "s2c_loss": "0.024", "loss": "0.01695", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "54600", "lr": "3.60082e-05", "gnorm": "1.062", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.3", "wall": "14006"} 2023-01-29 20:05:11 | INFO | train_inner | {"epoch": 26, "update": 25.267, "s2c_loss": "0.033", "loss": "0.02287", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "54610", "lr": "3.59415e-05", "gnorm": "1.137", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.3", "wall": "14009"} 2023-01-29 20:05:13 | INFO | train_inner | {"epoch": 26, "update": 25.272, "s2c_loss": "0.017", "loss": "0.01175", "s2c_nll_loss": "0.017", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "54620", "lr": "3.58749e-05", "gnorm": "0.799", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "14011"} 2023-01-29 20:05:16 | INFO | train_inner | {"epoch": 26, "update": 25.276, "s2c_loss": "0.033", "loss": "0.02262", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "54630", "lr": "3.58082e-05", "gnorm": "1.296", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.2", "wall": "14014"} 2023-01-29 20:05:18 | INFO | train_inner | {"epoch": 26, "update": 25.281, "s2c_loss": "0.036", "loss": "0.02489", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "54640", "lr": "3.57415e-05", "gnorm": "2.071", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.2", 
"wall": "14016"} 2023-01-29 20:05:21 | INFO | train_inner | {"epoch": 26, "update": 25.285, "s2c_loss": "0.038", "loss": "0.0261", "s2c_nll_loss": "0.038", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "54650", "lr": "3.56749e-05", "gnorm": "1.805", "loss_scale": "2048", "train_wall": "3", "gb_free": "7.4", "wall": "14019"} 2023-01-29 20:05:23 | INFO | train_inner | {"epoch": 26, "update": 25.29, "s2c_loss": "0.162", "loss": "0.112", "s2c_nll_loss": "0.162", "s2c_accuracy": "97.344", "s2c_total": "64", "s2c_n_correct": "62.3", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "54660", "lr": "3.56082e-05", "gnorm": "2.293", "loss_scale": "2048", "train_wall": "2", "gb_free": "7.4", "wall": "14021"} 2023-01-29 20:05:26 | INFO | train_inner | {"epoch": 26, "update": 25.295, "s2c_loss": "0.04", "loss": "0.02752", "s2c_nll_loss": "0.04", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "54670", "lr": "3.55416e-05", "gnorm": "1.026", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14024"} 2023-01-29 20:05:29 | INFO | train_inner | {"epoch": 26, "update": 25.299, "s2c_loss": "0.027", "loss": "0.01877", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "54680", "lr": "3.54749e-05", "gnorm": "1.131", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14026"} 2023-01-29 20:05:31 | INFO | train_inner | {"epoch": 26, "update": 25.304, "s2c_loss": "0.023", "loss": "0.0162", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "54690", "lr": "3.54082e-05", "gnorm": "1.076", "loss_scale": "4096", "train_wall": 
"3", "gb_free": "7.2", "wall": "14029"} 2023-01-29 20:05:34 | INFO | train_inner | {"epoch": 26, "update": 25.309, "s2c_loss": "0.026", "loss": "0.01806", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "54700", "lr": "3.53416e-05", "gnorm": "1.165", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14032"} 2023-01-29 20:05:36 | INFO | train_inner | {"epoch": 26, "update": 25.313, "s2c_loss": "0.03", "loss": "0.02097", "s2c_nll_loss": "0.03", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "54710", "lr": "3.52749e-05", "gnorm": "1.124", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14034"} 2023-01-29 20:05:39 | INFO | train_inner | {"epoch": 26, "update": 25.318, "s2c_loss": "0.029", "loss": "0.02037", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "54720", "lr": "3.52082e-05", "gnorm": "0.947", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14037"} 2023-01-29 20:05:41 | INFO | train_inner | {"epoch": 26, "update": 25.322, "s2c_loss": "0.015", "loss": "0.01046", "s2c_nll_loss": "0.015", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "54730", "lr": "3.51416e-05", "gnorm": "0.764", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14039"} 2023-01-29 20:05:44 | INFO | train_inner | {"epoch": 26, "update": 25.327, "s2c_loss": "0.02", "loss": "0.01373", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "54740", "lr": "3.50749e-05", "gnorm": "0.897", 
"loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14042"} 2023-01-29 20:05:46 | INFO | train_inner | {"epoch": 26, "update": 25.332, "s2c_loss": "0.01", "loss": "0.0066", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "54750", "lr": "3.50083e-05", "gnorm": "0.57", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14044"} 2023-01-29 20:05:49 | INFO | train_inner | {"epoch": 26, "update": 25.336, "s2c_loss": "0.015", "loss": "0.0102", "s2c_nll_loss": "0.015", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "54760", "lr": "3.49416e-05", "gnorm": "0.52", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14047"} 2023-01-29 20:05:51 | INFO | train_inner | {"epoch": 26, "update": 25.341, "s2c_loss": "0.026", "loss": "0.01796", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "258", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "54770", "lr": "3.48749e-05", "gnorm": "1.405", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14049"} 2023-01-29 20:05:54 | INFO | train_inner | {"epoch": 26, "update": 25.346, "s2c_loss": "0.017", "loss": "0.01178", "s2c_nll_loss": "0.017", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "54780", "lr": "3.48083e-05", "gnorm": "0.884", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14052"} 2023-01-29 20:05:56 | INFO | train_inner | {"epoch": 26, "update": 25.35, "s2c_loss": "0.02", "loss": "0.01401", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "54790", "lr": "3.47416e-05", "gnorm": 
"0.923", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14054"} 2023-01-29 20:05:59 | INFO | train_inner | {"epoch": 26, "update": 25.355, "s2c_loss": "0.026", "loss": "0.01768", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "54800", "lr": "3.46749e-05", "gnorm": "1.036", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14057"} 2023-01-29 20:06:01 | INFO | train_inner | {"epoch": 26, "update": 25.359, "s2c_loss": "0.024", "loss": "0.0165", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "54810", "lr": "3.46083e-05", "gnorm": "0.881", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14059"} 2023-01-29 20:06:04 | INFO | train_inner | {"epoch": 26, "update": 25.364, "s2c_loss": "0.034", "loss": "0.02361", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "54820", "lr": "3.45416e-05", "gnorm": "1.102", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14062"} 2023-01-29 20:06:06 | INFO | train_inner | {"epoch": 26, "update": 25.369, "s2c_loss": "0.013", "loss": "0.00892", "s2c_nll_loss": "0.013", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "246.6", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "54830", "lr": "3.44749e-05", "gnorm": "0.702", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14064"} 2023-01-29 20:06:09 | INFO | train_inner | {"epoch": 26, "update": 25.373, "s2c_loss": "0.013", "loss": "0.00877", "s2c_nll_loss": "0.013", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "54840", "lr": 
"3.44083e-05", "gnorm": "0.926", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14067"} 2023-01-29 20:06:11 | INFO | train_inner | {"epoch": 26, "update": 25.378, "s2c_loss": "0.027", "loss": "0.01886", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "54850", "lr": "3.43416e-05", "gnorm": "0.59", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14069"} 2023-01-29 20:06:14 | INFO | train_inner | {"epoch": 26, "update": 25.383, "s2c_loss": "0.031", "loss": "0.02181", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "257", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "54860", "lr": "3.4275e-05", "gnorm": "1.343", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14072"} 2023-01-29 20:06:16 | INFO | train_inner | {"epoch": 26, "update": 25.387, "s2c_loss": "0.035", "loss": "0.0246", "s2c_nll_loss": "0.035", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "258.6", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "54870", "lr": "3.42083e-05", "gnorm": "1.257", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14074"} 2023-01-29 20:06:19 | INFO | train_inner | {"epoch": 26, "update": 25.392, "s2c_loss": "0.027", "loss": "0.01878", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "54880", "lr": "3.41416e-05", "gnorm": "1.321", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14077"} 2023-01-29 20:06:22 | INFO | train_inner | {"epoch": 26, "update": 25.396, "s2c_loss": "0.021", "loss": "0.01449", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", 
"num_updates": "54890", "lr": "3.4075e-05", "gnorm": "0.891", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14079"} 2023-01-29 20:06:24 | INFO | train_inner | {"epoch": 26, "update": 25.401, "s2c_loss": "0.019", "loss": "0.01329", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "54900", "lr": "3.40083e-05", "gnorm": "0.89", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14082"} 2023-01-29 20:06:27 | INFO | train_inner | {"epoch": 26, "update": 25.406, "s2c_loss": "0.016", "loss": "0.01122", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "246.9", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "54910", "lr": "3.39416e-05", "gnorm": "0.731", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14085"} 2023-01-29 20:06:29 | INFO | train_inner | {"epoch": 26, "update": 25.41, "s2c_loss": "0.061", "loss": "0.04195", "s2c_nll_loss": "0.061", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "54920", "lr": "3.3875e-05", "gnorm": "1.174", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14087"} 2023-01-29 20:06:32 | INFO | train_inner | {"epoch": 26, "update": 25.415, "s2c_loss": "0.032", "loss": "0.02216", "s2c_nll_loss": "0.032", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "54930", "lr": "3.38083e-05", "gnorm": "0.848", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14090"} 2023-01-29 20:06:34 | INFO | train_inner | {"epoch": 26, "update": 25.42, "s2c_loss": "0.009", "loss": "0.00613", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.5", "ups": "3.99", "wpb": 
"64", "bsz": "64", "num_updates": "54940", "lr": "3.37416e-05", "gnorm": "0.446", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14092"} 2023-01-29 20:06:37 | INFO | train_inner | {"epoch": 26, "update": 25.424, "s2c_loss": "0.034", "loss": "0.02377", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "54950", "lr": "3.3675e-05", "gnorm": "1.494", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14095"} 2023-01-29 20:06:39 | INFO | train_inner | {"epoch": 26, "update": 25.429, "s2c_loss": "0.02", "loss": "0.01378", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "54960", "lr": "3.36083e-05", "gnorm": "1.163", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14097"} 2023-01-29 20:06:42 | INFO | train_inner | {"epoch": 26, "update": 25.433, "s2c_loss": "0.014", "loss": "0.00997", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "54970", "lr": "3.35417e-05", "gnorm": "0.828", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14100"} 2023-01-29 20:06:44 | INFO | train_inner | {"epoch": 26, "update": 25.438, "s2c_loss": "0.012", "loss": "0.00828", "s2c_nll_loss": "0.012", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "257.9", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "54980", "lr": "3.3475e-05", "gnorm": "0.578", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14102"} 2023-01-29 20:06:47 | INFO | train_inner | {"epoch": 26, "update": 25.443, "s2c_loss": "0.043", "loss": "0.02997", "s2c_nll_loss": "0.043", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.9", 
"ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "54990", "lr": "3.34083e-05", "gnorm": "0.847", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14105"} 2023-01-29 20:06:49 | INFO | train_inner | {"epoch": 26, "update": 25.447, "s2c_loss": "0.021", "loss": "0.0145", "s2c_nll_loss": "0.021", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "55000", "lr": "3.33417e-05", "gnorm": "0.926", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14107"} 2023-01-29 20:06:52 | INFO | train_inner | {"epoch": 26, "update": 25.452, "s2c_loss": "0.024", "loss": "0.01632", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "258.1", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "55010", "lr": "3.3275e-05", "gnorm": "1.066", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14110"} 2023-01-29 20:06:54 | INFO | train_inner | {"epoch": 26, "update": 25.457, "s2c_loss": "0.02", "loss": "0.01366", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "55020", "lr": "3.32083e-05", "gnorm": "0.845", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14112"} 2023-01-29 20:06:57 | INFO | train_inner | {"epoch": 26, "update": 25.461, "s2c_loss": "0.036", "loss": "0.02507", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "55030", "lr": "3.31417e-05", "gnorm": "1.064", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14115"} 2023-01-29 20:06:59 | INFO | train_inner | {"epoch": 26, "update": 25.466, "s2c_loss": "0.154", "loss": "0.10698", "s2c_nll_loss": "0.154", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": 
"63.2", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "55040", "lr": "3.3075e-05", "gnorm": "1.164", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14117"} 2023-01-29 20:07:02 | INFO | train_inner | {"epoch": 26, "update": 25.47, "s2c_loss": "0.019", "loss": "0.01313", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "55050", "lr": "3.30084e-05", "gnorm": "0.973", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14120"} 2023-01-29 20:07:04 | INFO | train_inner | {"epoch": 26, "update": 25.475, "s2c_loss": "0.031", "loss": "0.02115", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "55060", "lr": "3.29417e-05", "gnorm": "1.38", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14122"} 2023-01-29 20:07:07 | INFO | train_inner | {"epoch": 26, "update": 25.48, "s2c_loss": "0.036", "loss": "0.0252", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "55070", "lr": "3.2875e-05", "gnorm": "1.234", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14125"} 2023-01-29 20:07:09 | INFO | train_inner | {"epoch": 26, "update": 25.484, "s2c_loss": "0.034", "loss": "0.02356", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "55080", "lr": "3.28084e-05", "gnorm": "1.266", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14127"} 2023-01-29 20:07:12 | INFO | train_inner | {"epoch": 26, "update": 25.489, "s2c_loss": "0.024", "loss": "0.0169", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.531", "s2c_total": 
"64", "s2c_n_correct": "63.7", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "55090", "lr": "3.27417e-05", "gnorm": "1.319", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14130"} 2023-01-29 20:07:14 | INFO | train_inner | {"epoch": 26, "update": 25.494, "s2c_loss": "0.019", "loss": "0.013", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "55100", "lr": "3.2675e-05", "gnorm": "0.68", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14132"} 2023-01-29 20:07:17 | INFO | train_inner | {"epoch": 26, "update": 25.498, "s2c_loss": "0.009", "loss": "0.00623", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.5", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "55110", "lr": "3.26084e-05", "gnorm": "0.456", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14135"} 2023-01-29 20:07:20 | INFO | train_inner | {"epoch": 26, "update": 25.503, "s2c_loss": "0.029", "loss": "0.01989", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "55120", "lr": "3.25417e-05", "gnorm": "0.874", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14137"} 2023-01-29 20:07:22 | INFO | train_inner | {"epoch": 26, "update": 25.507, "s2c_loss": "0.013", "loss": "0.00911", "s2c_nll_loss": "0.013", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "55130", "lr": "3.2475e-05", "gnorm": "0.524", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14140"} 2023-01-29 20:07:25 | INFO | train_inner | {"epoch": 26, "update": 25.512, "s2c_loss": "0.021", "loss": "0.01436", "s2c_nll_loss": "0.021", "s2c_accuracy": 
"99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "55140", "lr": "3.24084e-05", "gnorm": "0.817", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14143"} 2023-01-29 20:07:27 | INFO | train_inner | {"epoch": 26, "update": 25.517, "s2c_loss": "0.01", "loss": "0.00667", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "55150", "lr": "3.23417e-05", "gnorm": "0.489", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14145"} 2023-01-29 20:07:30 | INFO | train_inner | {"epoch": 26, "update": 25.521, "s2c_loss": "0.004", "loss": "0.00259", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "247.6", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "55160", "lr": "3.22751e-05", "gnorm": "0.156", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14148"} 2023-01-29 20:07:32 | INFO | train_inner | {"epoch": 26, "update": 25.526, "s2c_loss": "0.029", "loss": "0.01985", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "55170", "lr": "3.22084e-05", "gnorm": "1.2", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14150"} 2023-01-29 20:07:35 | INFO | train_inner | {"epoch": 26, "update": 25.531, "s2c_loss": "0.013", "loss": "0.00908", "s2c_nll_loss": "0.013", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "55180", "lr": "3.21417e-05", "gnorm": "0.579", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14153"} 2023-01-29 20:07:37 | INFO | train_inner | {"epoch": 26, "update": 25.535, "s2c_loss": "0.033", "loss": "0.0227", "s2c_nll_loss": "0.033", 
"s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "55190", "lr": "3.20751e-05", "gnorm": "1.666", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14155"} 2023-01-29 20:07:40 | INFO | train_inner | {"epoch": 26, "update": 25.54, "s2c_loss": "0.041", "loss": "0.02825", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "55200", "lr": "3.20084e-05", "gnorm": "1.155", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14158"} 2023-01-29 20:07:42 | INFO | train_inner | {"epoch": 26, "update": 25.544, "s2c_loss": "0.029", "loss": "0.02001", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "55210", "lr": "3.19417e-05", "gnorm": "1.28", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14160"} 2023-01-29 20:07:45 | INFO | train_inner | {"epoch": 26, "update": 25.549, "s2c_loss": "0.022", "loss": "0.01521", "s2c_nll_loss": "0.022", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "55220", "lr": "3.18751e-05", "gnorm": "1.037", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14163"} 2023-01-29 20:07:47 | INFO | train_inner | {"epoch": 26, "update": 25.554, "s2c_loss": "0.038", "loss": "0.02645", "s2c_nll_loss": "0.038", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "259.3", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "55230", "lr": "3.18084e-05", "gnorm": "1.441", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14165"} 2023-01-29 20:07:50 | INFO | train_inner | {"epoch": 26, "update": 25.558, "s2c_loss": "0.016", "loss": 
"0.01104", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "55240", "lr": "3.17417e-05", "gnorm": "0.943", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14168"} 2023-01-29 20:07:52 | INFO | train_inner | {"epoch": 26, "update": 25.563, "s2c_loss": "0.028", "loss": "0.01963", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "55250", "lr": "3.16751e-05", "gnorm": "1.323", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14170"} 2023-01-29 20:07:55 | INFO | train_inner | {"epoch": 26, "update": 25.568, "s2c_loss": "0.034", "loss": "0.02351", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "55260", "lr": "3.16084e-05", "gnorm": "1.109", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14173"} 2023-01-29 20:07:57 | INFO | train_inner | {"epoch": 26, "update": 25.572, "s2c_loss": "0.043", "loss": "0.02999", "s2c_nll_loss": "0.043", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "55270", "lr": "3.15418e-05", "gnorm": "1.184", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14175"} 2023-01-29 20:08:00 | INFO | train_inner | {"epoch": 26, "update": 25.577, "s2c_loss": "0.026", "loss": "0.01784", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "55280", "lr": "3.14751e-05", "gnorm": "1.113", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14178"} 2023-01-29 20:08:03 | INFO | train_inner | {"epoch": 26, "update": 25.581, 
"s2c_loss": "0.042", "loss": "0.02943", "s2c_nll_loss": "0.042", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "55290", "lr": "3.14084e-05", "gnorm": "1.332", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14180"} 2023-01-29 20:08:05 | INFO | train_inner | {"epoch": 26, "update": 25.586, "s2c_loss": "0.029", "loss": "0.01977", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.4", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "55300", "lr": "3.13418e-05", "gnorm": "1.311", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14183"} 2023-01-29 20:08:08 | INFO | train_inner | {"epoch": 26, "update": 25.591, "s2c_loss": "0.035", "loss": "0.02429", "s2c_nll_loss": "0.035", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "55310", "lr": "3.12751e-05", "gnorm": "1.505", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14185"} 2023-01-29 20:08:10 | INFO | train_inner | {"epoch": 26, "update": 25.595, "s2c_loss": "0.022", "loss": "0.01532", "s2c_nll_loss": "0.022", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "55320", "lr": "3.12084e-05", "gnorm": "0.841", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14188"} 2023-01-29 20:08:13 | INFO | train_inner | {"epoch": 26, "update": 25.6, "s2c_loss": "0.046", "loss": "0.03164", "s2c_nll_loss": "0.046", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "55330", "lr": "3.11418e-05", "gnorm": "0.992", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14190"} 2023-01-29 20:08:15 | INFO | train_inner | 
{"epoch": 26, "update": 25.605, "s2c_loss": "0.018", "loss": "0.01223", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "55340", "lr": "3.10751e-05", "gnorm": "0.901", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14193"} 2023-01-29 20:08:18 | INFO | train_inner | {"epoch": 26, "update": 25.609, "s2c_loss": "0.013", "loss": "0.00887", "s2c_nll_loss": "0.013", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "258.1", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "55350", "lr": "3.10085e-05", "gnorm": "0.69", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14195"} 2023-01-29 20:08:20 | INFO | train_inner | {"epoch": 26, "update": 25.614, "s2c_loss": "0.015", "loss": "0.01063", "s2c_nll_loss": "0.015", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "55360", "lr": "3.09418e-05", "gnorm": "0.616", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14198"} 2023-01-29 20:08:23 | INFO | train_inner | {"epoch": 26, "update": 25.618, "s2c_loss": "0.014", "loss": "0.01001", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "55370", "lr": "3.08751e-05", "gnorm": "0.695", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14201"} 2023-01-29 20:08:25 | INFO | train_inner | {"epoch": 26, "update": 25.623, "s2c_loss": "0.011", "loss": "0.00777", "s2c_nll_loss": "0.011", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "245.8", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "55380", "lr": "3.08085e-05", "gnorm": "0.597", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14203"} 2023-01-29 20:08:28 | 
INFO | train_inner | {"epoch": 26, "update": 25.628, "s2c_loss": "0.013", "loss": "0.00875", "s2c_nll_loss": "0.013", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "55390", "lr": "3.07418e-05", "gnorm": "0.684", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14206"} 2023-01-29 20:08:30 | INFO | train_inner | {"epoch": 26, "update": 25.632, "s2c_loss": "0.014", "loss": "0.00952", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "55400", "lr": "3.06751e-05", "gnorm": "0.899", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14208"} 2023-01-29 20:08:33 | INFO | train_inner | {"epoch": 26, "update": 25.637, "s2c_loss": "0.017", "loss": "0.01168", "s2c_nll_loss": "0.017", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "245.6", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "55410", "lr": "3.06085e-05", "gnorm": "1.033", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14211"} 2023-01-29 20:08:35 | INFO | train_inner | {"epoch": 26, "update": 25.642, "s2c_loss": "0.036", "loss": "0.02518", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "55420", "lr": "3.05418e-05", "gnorm": "0.925", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "14213"} 2023-01-29 20:08:38 | INFO | train_inner | {"epoch": 26, "update": 25.646, "s2c_loss": "0.019", "loss": "0.01347", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "248.1", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "55430", "lr": "3.04751e-05", "gnorm": "1.072", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": 
"14216"} 2023-01-29 20:08:41 | INFO | train_inner | {"epoch": 26, "update": 25.651, "s2c_loss": "0.022", "loss": "0.01539", "s2c_nll_loss": "0.022", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "55440", "lr": "3.04085e-05", "gnorm": "0.785", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "14219"} 2023-01-29 20:08:43 | INFO | train_inner | {"epoch": 26, "update": 25.655, "s2c_loss": "0.197", "loss": "0.13652", "s2c_nll_loss": "0.197", "s2c_accuracy": "98.438", "s2c_total": "64", "s2c_n_correct": "63", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "55450", "lr": "3.03418e-05", "gnorm": "1.131", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14221"} 2023-01-29 20:08:46 | INFO | train_inner | {"epoch": 26, "update": 25.66, "s2c_loss": "0.172", "loss": "0.11896", "s2c_nll_loss": "0.172", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "245.8", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "55460", "lr": "3.02752e-05", "gnorm": "0.873", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14224"} 2023-01-29 20:08:48 | INFO | train_inner | {"epoch": 26, "update": 25.665, "s2c_loss": "0.038", "loss": "0.02649", "s2c_nll_loss": "0.038", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "55470", "lr": "3.02085e-05", "gnorm": "0.985", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14226"} 2023-01-29 20:08:51 | INFO | train_inner | {"epoch": 26, "update": 25.669, "s2c_loss": "0.062", "loss": "0.04293", "s2c_nll_loss": "0.062", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "55480", "lr": "3.01418e-05", "gnorm": "1.418", "loss_scale": "4096", "train_wall": "3", 
"gb_free": "7.2", "wall": "14229"} 2023-01-29 20:08:53 | INFO | train_inner | {"epoch": 26, "update": 25.674, "s2c_loss": "0.028", "loss": "0.01922", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "55490", "lr": "3.00752e-05", "gnorm": "1.432", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14231"} 2023-01-29 20:08:56 | INFO | train_inner | {"epoch": 26, "update": 25.679, "s2c_loss": "0.022", "loss": "0.01522", "s2c_nll_loss": "0.022", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "55500", "lr": "3.00085e-05", "gnorm": "0.929", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14234"} 2023-01-29 20:08:58 | INFO | train_inner | {"epoch": 26, "update": 25.683, "s2c_loss": "0.011", "loss": "0.00777", "s2c_nll_loss": "0.011", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "55510", "lr": "2.99418e-05", "gnorm": "0.703", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14236"} 2023-01-29 20:09:01 | INFO | train_inner | {"epoch": 26, "update": 25.688, "s2c_loss": "0.013", "loss": "0.00935", "s2c_nll_loss": "0.013", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "55520", "lr": "2.98752e-05", "gnorm": "0.796", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14239"} 2023-01-29 20:09:04 | INFO | train_inner | {"epoch": 26, "update": 25.692, "s2c_loss": "0.014", "loss": "0.00991", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "55530", "lr": "2.98085e-05", "gnorm": "0.993", "loss_scale": 
"4096", "train_wall": "2", "gb_free": "7.3", "wall": "14241"} 2023-01-29 20:09:06 | INFO | train_inner | {"epoch": 26, "update": 25.697, "s2c_loss": "0.021", "loss": "0.01489", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "55540", "lr": "2.97418e-05", "gnorm": "0.908", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14244"} 2023-01-29 20:09:09 | INFO | train_inner | {"epoch": 26, "update": 25.702, "s2c_loss": "0.018", "loss": "0.01234", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "55550", "lr": "2.96752e-05", "gnorm": "0.874", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14246"} 2023-01-29 20:09:11 | INFO | train_inner | {"epoch": 26, "update": 25.706, "s2c_loss": "0.019", "loss": "0.01328", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "257.1", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "55560", "lr": "2.96085e-05", "gnorm": "1.189", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14249"} 2023-01-29 20:09:14 | INFO | train_inner | {"epoch": 26, "update": 25.711, "s2c_loss": "0.025", "loss": "0.017", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "55570", "lr": "2.95419e-05", "gnorm": "1.296", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14252"} 2023-01-29 20:09:16 | INFO | train_inner | {"epoch": 26, "update": 25.716, "s2c_loss": "0.014", "loss": "0.00943", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "55580", "lr": "2.94752e-05", 
"gnorm": "0.699", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14254"} 2023-01-29 20:09:19 | INFO | train_inner | {"epoch": 26, "update": 25.72, "s2c_loss": "0.013", "loss": "0.00912", "s2c_nll_loss": "0.013", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "55590", "lr": "2.94085e-05", "gnorm": "0.862", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14257"} 2023-01-29 20:09:21 | INFO | train_inner | {"epoch": 26, "update": 25.725, "s2c_loss": "0.033", "loss": "0.02301", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "258.9", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "55600", "lr": "2.93419e-05", "gnorm": "1.381", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14259"} 2023-01-29 20:09:24 | INFO | train_inner | {"epoch": 26, "update": 25.729, "s2c_loss": "0.02", "loss": "0.01365", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.529", "s2c_total": "63.7", "s2c_n_correct": "63.4", "wps": "255.2", "ups": "4.01", "wpb": "63.7", "bsz": "63.7", "num_updates": "55610", "lr": "2.92752e-05", "gnorm": "1.172", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14262"} 2023-01-29 20:09:26 | INFO | train_inner | {"epoch": 26, "update": 25.734, "s2c_loss": "0.027", "loss": "0.01838", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "257.6", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "55620", "lr": "2.92085e-05", "gnorm": "1.399", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14264"} 2023-01-29 20:09:29 | INFO | train_inner | {"epoch": 26, "update": 25.739, "s2c_loss": "0.025", "loss": "0.01713", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", 
"num_updates": "55630", "lr": "2.91419e-05", "gnorm": "0.73", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14267"} 2023-01-29 20:09:31 | INFO | train_inner | {"epoch": 26, "update": 25.743, "s2c_loss": "0.041", "loss": "0.02845", "s2c_nll_loss": "0.041", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "55640", "lr": "2.90752e-05", "gnorm": "1.964", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14269"} 2023-01-29 20:09:34 | INFO | train_inner | {"epoch": 26, "update": 25.748, "s2c_loss": "0.025", "loss": "0.01748", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "55650", "lr": "2.90086e-05", "gnorm": "1.116", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14272"} 2023-01-29 20:09:36 | INFO | train_inner | {"epoch": 26, "update": 25.753, "s2c_loss": "0.029", "loss": "0.02003", "s2c_nll_loss": "0.029", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "55660", "lr": "2.89419e-05", "gnorm": "1.236", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14274"} 2023-01-29 20:09:39 | INFO | train_inner | {"epoch": 26, "update": 25.757, "s2c_loss": "0.026", "loss": "0.01833", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "55670", "lr": "2.88752e-05", "gnorm": "1.072", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14277"} 2023-01-29 20:09:41 | INFO | train_inner | {"epoch": 26, "update": 25.762, "s2c_loss": "0.031", "loss": "0.02167", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252.8", "ups": "3.95", 
"wpb": "64", "bsz": "64", "num_updates": "55680", "lr": "2.88086e-05", "gnorm": "1.462", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14279"} 2023-01-29 20:09:44 | INFO | train_inner | {"epoch": 26, "update": 25.766, "s2c_loss": "0.016", "loss": "0.01121", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "55690", "lr": "2.87419e-05", "gnorm": "0.662", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14282"} 2023-01-29 20:09:46 | INFO | train_inner | {"epoch": 26, "update": 25.771, "s2c_loss": "0.039", "loss": "0.02722", "s2c_nll_loss": "0.039", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "55700", "lr": "2.86752e-05", "gnorm": "1.389", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14284"} 2023-01-29 20:09:49 | INFO | train_inner | {"epoch": 26, "update": 25.776, "s2c_loss": "0.028", "loss": "0.01958", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "55710", "lr": "2.86086e-05", "gnorm": "1.065", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14287"} 2023-01-29 20:09:52 | INFO | train_inner | {"epoch": 26, "update": 25.78, "s2c_loss": "0.023", "loss": "0.01605", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "55720", "lr": "2.85419e-05", "gnorm": "1.207", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14289"} 2023-01-29 20:09:54 | INFO | train_inner | {"epoch": 26, "update": 25.785, "s2c_loss": "0.025", "loss": "0.01766", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", 
"wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "55730", "lr": "2.84752e-05", "gnorm": "1.211", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14292"} 2023-01-29 20:09:57 | INFO | train_inner | {"epoch": 26, "update": 25.79, "s2c_loss": "0.011", "loss": "0.00776", "s2c_nll_loss": "0.011", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "55740", "lr": "2.84086e-05", "gnorm": "0.66", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14295"} 2023-01-29 20:09:59 | INFO | train_inner | {"epoch": 26, "update": 25.794, "s2c_loss": "0.009", "loss": "0.00611", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "55750", "lr": "2.83419e-05", "gnorm": "0.532", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14297"} 2023-01-29 20:10:02 | INFO | train_inner | {"epoch": 26, "update": 25.799, "s2c_loss": "0.006", "loss": "0.00442", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "55760", "lr": "2.82753e-05", "gnorm": "0.416", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14300"} 2023-01-29 20:10:04 | INFO | train_inner | {"epoch": 26, "update": 25.803, "s2c_loss": "0.03", "loss": "0.02056", "s2c_nll_loss": "0.03", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "55770", "lr": "2.82086e-05", "gnorm": "0.979", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14302"} 2023-01-29 20:10:07 | INFO | train_inner | {"epoch": 26, "update": 25.808, "s2c_loss": "0.02", "loss": "0.01413", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.531", "s2c_total": "64", 
"s2c_n_correct": "63.7", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "55780", "lr": "2.81419e-05", "gnorm": "1.266", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14305"} 2023-01-29 20:10:09 | INFO | train_inner | {"epoch": 26, "update": 25.813, "s2c_loss": "0.019", "loss": "0.01348", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "55790", "lr": "2.80753e-05", "gnorm": "0.791", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14307"} 2023-01-29 20:10:12 | INFO | train_inner | {"epoch": 26, "update": 25.817, "s2c_loss": "0.024", "loss": "0.01672", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "243.6", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "55800", "lr": "2.80086e-05", "gnorm": "1.544", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14310"} 2023-01-29 20:10:15 | INFO | train_inner | {"epoch": 26, "update": 25.822, "s2c_loss": "0.053", "loss": "0.03662", "s2c_nll_loss": "0.053", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "55810", "lr": "2.79419e-05", "gnorm": "2.048", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14312"} 2023-01-29 20:10:17 | INFO | train_inner | {"epoch": 26, "update": 25.827, "s2c_loss": "0.017", "loss": "0.01159", "s2c_nll_loss": "0.017", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "55820", "lr": "2.78753e-05", "gnorm": "1.444", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14315"} 2023-01-29 20:10:20 | INFO | train_inner | {"epoch": 26, "update": 25.831, "s2c_loss": "0.031", "loss": "0.02141", "s2c_nll_loss": "0.031", "s2c_accuracy": 
"99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "55830", "lr": "2.78086e-05", "gnorm": "1.024", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14318"} 2023-01-29 20:10:22 | INFO | train_inner | {"epoch": 26, "update": 25.836, "s2c_loss": "0.037", "loss": "0.02558", "s2c_nll_loss": "0.037", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "55840", "lr": "2.77419e-05", "gnorm": "1.023", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14320"} 2023-01-29 20:10:25 | INFO | train_inner | {"epoch": 26, "update": 25.84, "s2c_loss": "0.012", "loss": "0.00848", "s2c_nll_loss": "0.012", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "55850", "lr": "2.76753e-05", "gnorm": "0.566", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14323"} 2023-01-29 20:10:27 | INFO | train_inner | {"epoch": 26, "update": 25.845, "s2c_loss": "0.049", "loss": "0.03393", "s2c_nll_loss": "0.049", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "55860", "lr": "2.76086e-05", "gnorm": "1.241", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14325"} 2023-01-29 20:10:30 | INFO | train_inner | {"epoch": 26, "update": 25.85, "s2c_loss": "0.017", "loss": "0.01162", "s2c_nll_loss": "0.017", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "55870", "lr": "2.7542e-05", "gnorm": "0.955", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14328"} 2023-01-29 20:10:32 | INFO | train_inner | {"epoch": 26, "update": 25.854, "s2c_loss": "0.008", "loss": "0.00555", "s2c_nll_loss": 
"0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "55880", "lr": "2.74753e-05", "gnorm": "0.455", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14330"} 2023-01-29 20:10:35 | INFO | train_inner | {"epoch": 26, "update": 25.859, "s2c_loss": "0.024", "loss": "0.01664", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "55890", "lr": "2.74086e-05", "gnorm": "0.847", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14333"} 2023-01-29 20:10:37 | INFO | train_inner | {"epoch": 26, "update": 25.864, "s2c_loss": "0.008", "loss": "0.00546", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "55900", "lr": "2.7342e-05", "gnorm": "0.481", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14335"} 2023-01-29 20:10:40 | INFO | train_inner | {"epoch": 26, "update": 25.868, "s2c_loss": "0.036", "loss": "0.02516", "s2c_nll_loss": "0.036", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "55910", "lr": "2.72753e-05", "gnorm": "1.296", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14338"} 2023-01-29 20:10:42 | INFO | train_inner | {"epoch": 26, "update": 25.873, "s2c_loss": "0.009", "loss": "0.00649", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "55920", "lr": "2.72086e-05", "gnorm": "0.497", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14340"} 2023-01-29 20:10:45 | INFO | train_inner | {"epoch": 26, "update": 25.877, "s2c_loss": "0.023", "loss": "0.01594", 
"s2c_nll_loss": "0.023", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "55930", "lr": "2.7142e-05", "gnorm": "0.802", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14343"} 2023-01-29 20:10:47 | INFO | train_inner | {"epoch": 26, "update": 25.882, "s2c_loss": "0.019", "loss": "0.01329", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "55940", "lr": "2.70753e-05", "gnorm": "1.122", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14345"} 2023-01-29 20:10:50 | INFO | train_inner | {"epoch": 26, "update": 25.887, "s2c_loss": "0.014", "loss": "0.00991", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "55950", "lr": "2.70087e-05", "gnorm": "0.92", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14348"} 2023-01-29 20:10:53 | INFO | train_inner | {"epoch": 26, "update": 25.891, "s2c_loss": "0.01", "loss": "0.00689", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "55960", "lr": "2.6942e-05", "gnorm": "0.547", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14350"} 2023-01-29 20:10:55 | INFO | train_inner | {"epoch": 26, "update": 25.896, "s2c_loss": "0.017", "loss": "0.0117", "s2c_nll_loss": "0.017", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "55970", "lr": "2.68753e-05", "gnorm": "1.004", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14353"} 2023-01-29 20:10:58 | INFO | train_inner | {"epoch": 26, "update": 25.901, "s2c_loss": "0.018", 
"loss": "0.01276", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "55980", "lr": "2.68087e-05", "gnorm": "0.649", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14355"} 2023-01-29 20:11:00 | INFO | train_inner | {"epoch": 26, "update": 25.905, "s2c_loss": "0.031", "loss": "0.02121", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "55990", "lr": "2.6742e-05", "gnorm": "1.28", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14358"} 2023-01-29 20:11:03 | INFO | train_inner | {"epoch": 26, "update": 25.91, "s2c_loss": "0.008", "loss": "0.00576", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "56000", "lr": "2.66753e-05", "gnorm": "0.472", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14361"} 2023-01-29 20:11:05 | INFO | train_inner | {"epoch": 26, "update": 25.914, "s2c_loss": "0.016", "loss": "0.01129", "s2c_nll_loss": "0.016", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "56010", "lr": "2.66087e-05", "gnorm": "0.757", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14363"} 2023-01-29 20:11:08 | INFO | train_inner | {"epoch": 26, "update": 25.919, "s2c_loss": "0.016", "loss": "0.01128", "s2c_nll_loss": "0.016", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "56020", "lr": "2.6542e-05", "gnorm": "0.821", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14366"} 2023-01-29 20:11:10 | INFO | train_inner | {"epoch": 26, "update": 25.924, 
"s2c_loss": "0.02", "loss": "0.01372", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "56030", "lr": "2.64753e-05", "gnorm": "0.815", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14368"} 2023-01-29 20:11:13 | INFO | train_inner | {"epoch": 26, "update": 25.928, "s2c_loss": "0.006", "loss": "0.00431", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "56040", "lr": "2.64087e-05", "gnorm": "0.409", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14371"} 2023-01-29 20:11:15 | INFO | train_inner | {"epoch": 26, "update": 25.933, "s2c_loss": "0.016", "loss": "0.0108", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "56050", "lr": "2.6342e-05", "gnorm": "0.629", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14373"} 2023-01-29 20:11:18 | INFO | train_inner | {"epoch": 26, "update": 25.938, "s2c_loss": "0.018", "loss": "0.01279", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "260.5", "ups": "4.07", "wpb": "64", "bsz": "64", "num_updates": "56060", "lr": "2.62754e-05", "gnorm": "0.983", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14376"} 2023-01-29 20:11:20 | INFO | train_inner | {"epoch": 26, "update": 25.942, "s2c_loss": "0.035", "loss": "0.02392", "s2c_nll_loss": "0.035", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "56070", "lr": "2.62087e-05", "gnorm": "1.265", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14378"} 2023-01-29 20:11:23 | INFO | train_inner | {"epoch": 26, 
"update": 25.947, "s2c_loss": "0.015", "loss": "0.01036", "s2c_nll_loss": "0.015", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "260", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "56080", "lr": "2.6142e-05", "gnorm": "0.749", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14381"} 2023-01-29 20:11:25 | INFO | train_inner | {"epoch": 26, "update": 25.951, "s2c_loss": "0.011", "loss": "0.00765", "s2c_nll_loss": "0.011", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "56090", "lr": "2.60754e-05", "gnorm": "0.82", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14383"} 2023-01-29 20:11:28 | INFO | train_inner | {"epoch": 26, "update": 25.956, "s2c_loss": "0.012", "loss": "0.00843", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "56100", "lr": "2.60087e-05", "gnorm": "1.041", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14386"} 2023-01-29 20:11:30 | INFO | train_inner | {"epoch": 26, "update": 25.961, "s2c_loss": "0.012", "loss": "0.00865", "s2c_nll_loss": "0.012", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "56110", "lr": "2.5942e-05", "gnorm": "0.723", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14388"} 2023-01-29 20:11:33 | INFO | train_inner | {"epoch": 26, "update": 25.965, "s2c_loss": "0.031", "loss": "0.02117", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "56120", "lr": "2.58754e-05", "gnorm": "1.655", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14391"} 2023-01-29 20:11:35 | INFO | train_inner | 
{"epoch": 26, "update": 25.97, "s2c_loss": "0.012", "loss": "0.00857", "s2c_nll_loss": "0.012", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "56130", "lr": "2.58087e-05", "gnorm": "0.622", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14393"} 2023-01-29 20:11:38 | INFO | train_inner | {"epoch": 26, "update": 25.975, "s2c_loss": "0.023", "loss": "0.01572", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "56140", "lr": "2.5742e-05", "gnorm": "1.237", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14396"} 2023-01-29 20:11:41 | INFO | train_inner | {"epoch": 26, "update": 25.979, "s2c_loss": "0.035", "loss": "0.024", "s2c_nll_loss": "0.035", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "247.8", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "56150", "lr": "2.56754e-05", "gnorm": "1.061", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14398"} 2023-01-29 20:11:43 | INFO | train_inner | {"epoch": 26, "update": 25.984, "s2c_loss": "0.034", "loss": "0.02391", "s2c_nll_loss": "0.034", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "256.3", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "56160", "lr": "2.56087e-05", "gnorm": "1.331", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14401"} 2023-01-29 20:11:46 | INFO | train_inner | {"epoch": 26, "update": 25.988, "s2c_loss": "0.03", "loss": "0.02049", "s2c_nll_loss": "0.03", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "56170", "lr": "2.55421e-05", "gnorm": "0.643", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14404"} 2023-01-29 20:11:48 | INFO | 
train_inner | {"epoch": 26, "update": 25.993, "s2c_loss": "0.013", "loss": "0.00871", "s2c_nll_loss": "0.013", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "56180", "lr": "2.54754e-05", "gnorm": "0.713", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14406"} 2023-01-29 20:11:51 | INFO | train_inner | {"epoch": 26, "update": 25.998, "s2c_loss": "0.023", "loss": "0.01602", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "56190", "lr": "2.54087e-05", "gnorm": "1.005", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14409"} 2023-01-29 20:11:52 | INFO | fairseq_cli.train | begin validation on "valid" subset 2023-01-29 20:12:06 | INFO | valid | {"epoch": 26, "valid_s2c_loss": "0.34", "valid_loss": "0.23561", "valid_s2c_nll_loss": "0.34", "valid_s2c_accuracy": "94.147", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "30.088", "valid_num_updates": "56195", "valid_best_s2c_accuracy": "94.147"} 2023-01-29 20:12:06 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 26 @ 56195 updates 2023-01-29 20:12:06 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 20:12:13 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 20:12:18 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt (epoch 26 @ 56195 updates, score 94.147) (writing took 11.605315372813493 seconds) 2023-01-29 20:12:18 | INFO | fairseq_cli.train | end of epoch 26 (average epoch stats below) 2023-01-29 20:12:18 | INFO | train | {"epoch": 26, "train_s2c_loss": "0.029", "train_loss": "0.01977", "train_s2c_nll_loss": 
"0.029", "train_s2c_accuracy": "99.665", "train_s2c_total": "63.9838", "train_s2c_n_correct": "63.7697", "train_wps": "238.4", "train_ups": "3.73", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "56195", "train_lr": "2.53754e-05", "train_gnorm": "1.036", "train_loss_scale": "4096", "train_train_wall": "540", "train_gb_free": "7.6", "train_wall": "14436"} 2023-01-29 20:12:24 | INFO | fairseq.trainer | begin training epoch 27 2023-01-29 20:12:24 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 20:12:26 | INFO | train_inner | {"epoch": 27, "update": 26.002, "s2c_loss": "0.033", "loss": "0.02257", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.836", "s2c_total": "60.8", "s2c_n_correct": "60.7", "wps": "17.3", "ups": "0.29", "wpb": "60.8", "bsz": "60.8", "num_updates": "56200", "lr": "2.53421e-05", "gnorm": "0.84", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14444"} 2023-01-29 20:12:28 | INFO | train_inner | {"epoch": 27, "update": 26.007, "s2c_loss": "0.02", "loss": "0.0137", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "241.5", "ups": "3.77", "wpb": "64", "bsz": "64", "num_updates": "56210", "lr": "2.52754e-05", "gnorm": "0.691", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14446"} 2023-01-29 20:12:31 | INFO | train_inner | {"epoch": 27, "update": 26.012, "s2c_loss": "0.017", "loss": "0.01167", "s2c_nll_loss": "0.017", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "56220", "lr": "2.52087e-05", "gnorm": "0.88", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14449"} 2023-01-29 20:12:33 | INFO | train_inner | {"epoch": 27, "update": 26.016, "s2c_loss": "0.013", "loss": "0.00872", "s2c_nll_loss": "0.013", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", 
"num_updates": "56230", "lr": "2.51421e-05", "gnorm": "0.581", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14451"} 2023-01-29 20:12:36 | INFO | train_inner | {"epoch": 27, "update": 26.021, "s2c_loss": "0.018", "loss": "0.01227", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "56240", "lr": "2.50754e-05", "gnorm": "0.883", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14454"} 2023-01-29 20:12:39 | INFO | train_inner | {"epoch": 27, "update": 26.025, "s2c_loss": "0.009", "loss": "0.00635", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "56250", "lr": "2.50088e-05", "gnorm": "0.508", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14456"} 2023-01-29 20:12:41 | INFO | train_inner | {"epoch": 27, "update": 26.03, "s2c_loss": "0.028", "loss": "0.01953", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "56260", "lr": "2.49421e-05", "gnorm": "0.675", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14459"} 2023-01-29 20:12:44 | INFO | train_inner | {"epoch": 27, "update": 26.035, "s2c_loss": "0.016", "loss": "0.01081", "s2c_nll_loss": "0.016", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "56270", "lr": "2.48754e-05", "gnorm": "0.689", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14462"} 2023-01-29 20:12:46 | INFO | train_inner | {"epoch": 27, "update": 26.039, "s2c_loss": "0.011", "loss": "0.00794", "s2c_nll_loss": "0.011", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.2", "ups": "3.97", "wpb": 
"64", "bsz": "64", "num_updates": "56280", "lr": "2.48088e-05", "gnorm": "0.626", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14464"} 2023-01-29 20:12:49 | INFO | train_inner | {"epoch": 27, "update": 26.044, "s2c_loss": "0.026", "loss": "0.01769", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "56290", "lr": "2.47421e-05", "gnorm": "0.935", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14467"} 2023-01-29 20:12:51 | INFO | train_inner | {"epoch": 27, "update": 26.049, "s2c_loss": "0.008", "loss": "0.00547", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "56300", "lr": "2.46754e-05", "gnorm": "0.391", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14469"} 2023-01-29 20:12:54 | INFO | train_inner | {"epoch": 27, "update": 26.053, "s2c_loss": "0.018", "loss": "0.01252", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "56310", "lr": "2.46088e-05", "gnorm": "0.754", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14472"} 2023-01-29 20:12:56 | INFO | train_inner | {"epoch": 27, "update": 26.058, "s2c_loss": "0.009", "loss": "0.00648", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "56320", "lr": "2.45421e-05", "gnorm": "0.601", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14474"} 2023-01-29 20:12:59 | INFO | train_inner | {"epoch": 27, "update": 26.062, "s2c_loss": "0.017", "loss": "0.01168", "s2c_nll_loss": "0.017", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.6", 
"ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "56330", "lr": "2.44754e-05", "gnorm": "0.883", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14477"} 2023-01-29 20:13:01 | INFO | train_inner | {"epoch": 27, "update": 26.067, "s2c_loss": "0.01", "loss": "0.0071", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "56340", "lr": "2.44088e-05", "gnorm": "0.616", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14479"} 2023-01-29 20:13:04 | INFO | train_inner | {"epoch": 27, "update": 26.072, "s2c_loss": "0.045", "loss": "0.03101", "s2c_nll_loss": "0.045", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "56350", "lr": "2.43421e-05", "gnorm": "1.177", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14482"} 2023-01-29 20:13:06 | INFO | train_inner | {"epoch": 27, "update": 26.076, "s2c_loss": "0.019", "loss": "0.01295", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "56360", "lr": "2.42755e-05", "gnorm": "0.79", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14484"} 2023-01-29 20:13:09 | INFO | train_inner | {"epoch": 27, "update": 26.081, "s2c_loss": "0.011", "loss": "0.00741", "s2c_nll_loss": "0.011", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "56370", "lr": "2.42088e-05", "gnorm": "0.468", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14487"} 2023-01-29 20:13:11 | INFO | train_inner | {"epoch": 27, "update": 26.086, "s2c_loss": "0.014", "loss": "0.00945", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": 
"63.9", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "56380", "lr": "2.41421e-05", "gnorm": "0.544", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14489"} 2023-01-29 20:13:14 | INFO | train_inner | {"epoch": 27, "update": 26.09, "s2c_loss": "0.158", "loss": "0.10947", "s2c_nll_loss": "0.158", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "56390", "lr": "2.40755e-05", "gnorm": "0.996", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14492"} 2023-01-29 20:13:16 | INFO | train_inner | {"epoch": 27, "update": 26.095, "s2c_loss": "0.161", "loss": "0.11144", "s2c_nll_loss": "0.161", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "56400", "lr": "2.40088e-05", "gnorm": "0.98", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14494"} 2023-01-29 20:13:19 | INFO | train_inner | {"epoch": 27, "update": 26.099, "s2c_loss": "0.025", "loss": "0.01754", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "56410", "lr": "2.39421e-05", "gnorm": "0.948", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14497"} 2023-01-29 20:13:22 | INFO | train_inner | {"epoch": 27, "update": 26.104, "s2c_loss": "0.01", "loss": "0.00695", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "56420", "lr": "2.38755e-05", "gnorm": "0.658", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14499"} 2023-01-29 20:13:24 | INFO | train_inner | {"epoch": 27, "update": 26.109, "s2c_loss": "0.013", "loss": "0.00922", "s2c_nll_loss": "0.013", "s2c_accuracy": "99.688", "s2c_total": "64", 
"s2c_n_correct": "63.8", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "56430", "lr": "2.38088e-05", "gnorm": "0.787", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14502"} 2023-01-29 20:13:27 | INFO | train_inner | {"epoch": 27, "update": 26.113, "s2c_loss": "0.018", "loss": "0.01235", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "56440", "lr": "2.37421e-05", "gnorm": "0.925", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14505"} 2023-01-29 20:13:29 | INFO | train_inner | {"epoch": 27, "update": 26.118, "s2c_loss": "0.025", "loss": "0.01727", "s2c_nll_loss": "0.025", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "248", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "56450", "lr": "2.36755e-05", "gnorm": "0.587", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14507"} 2023-01-29 20:13:32 | INFO | train_inner | {"epoch": 27, "update": 26.123, "s2c_loss": "0.031", "loss": "0.0216", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "56460", "lr": "2.36088e-05", "gnorm": "0.907", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14510"} 2023-01-29 20:13:34 | INFO | train_inner | {"epoch": 27, "update": 26.127, "s2c_loss": "0.009", "loss": "0.00618", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "56470", "lr": "2.35422e-05", "gnorm": "0.543", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14512"} 2023-01-29 20:13:37 | INFO | train_inner | {"epoch": 27, "update": 26.132, "s2c_loss": "0.03", "loss": "0.02088", "s2c_nll_loss": "0.03", "s2c_accuracy": "99.844", 
"s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "56480", "lr": "2.34755e-05", "gnorm": "0.812", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "14515"} 2023-01-29 20:13:39 | INFO | train_inner | {"epoch": 27, "update": 26.136, "s2c_loss": "0.02", "loss": "0.01401", "s2c_nll_loss": "0.02", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "245.8", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "56490", "lr": "2.34088e-05", "gnorm": "0.777", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14517"} 2023-01-29 20:13:42 | INFO | train_inner | {"epoch": 27, "update": 26.141, "s2c_loss": "0.01", "loss": "0.00692", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "56500", "lr": "2.33422e-05", "gnorm": "0.525", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14520"} 2023-01-29 20:13:44 | INFO | train_inner | {"epoch": 27, "update": 26.146, "s2c_loss": "0.007", "loss": "0.00464", "s2c_nll_loss": "0.007", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "56510", "lr": "2.32755e-05", "gnorm": "0.448", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14522"} 2023-01-29 20:13:47 | INFO | train_inner | {"epoch": 27, "update": 26.15, "s2c_loss": "0.021", "loss": "0.01485", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "56520", "lr": "2.32088e-05", "gnorm": "0.708", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14525"} 2023-01-29 20:13:50 | INFO | train_inner | {"epoch": 27, "update": 26.155, "s2c_loss": "0.03", "loss": "0.0207", "s2c_nll_loss": "0.03", 
"s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "56530", "lr": "2.31422e-05", "gnorm": "0.861", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14527"} 2023-01-29 20:13:52 | INFO | train_inner | {"epoch": 27, "update": 26.16, "s2c_loss": "0.027", "loss": "0.01845", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "56540", "lr": "2.30755e-05", "gnorm": "0.837", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14530"} 2023-01-29 20:13:55 | INFO | train_inner | {"epoch": 27, "update": 26.164, "s2c_loss": "0.028", "loss": "0.01971", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "56550", "lr": "2.30089e-05", "gnorm": "0.974", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14533"} 2023-01-29 20:13:57 | INFO | train_inner | {"epoch": 27, "update": 26.169, "s2c_loss": "0.012", "loss": "0.00815", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "56560", "lr": "2.29422e-05", "gnorm": "0.459", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14535"} 2023-01-29 20:14:00 | INFO | train_inner | {"epoch": 27, "update": 26.173, "s2c_loss": "0.011", "loss": "0.00752", "s2c_nll_loss": "0.011", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "56570", "lr": "2.28755e-05", "gnorm": "0.561", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14538"} 2023-01-29 20:14:02 | INFO | train_inner | {"epoch": 27, "update": 26.178, "s2c_loss": "0.171", "loss": "0.11837", 
"s2c_nll_loss": "0.171", "s2c_accuracy": "98.906", "s2c_total": "64", "s2c_n_correct": "63.3", "wps": "259.9", "ups": "4.06", "wpb": "64", "bsz": "64", "num_updates": "56580", "lr": "2.28089e-05", "gnorm": "1.017", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14540"} 2023-01-29 20:14:05 | INFO | train_inner | {"epoch": 27, "update": 26.183, "s2c_loss": "0.007", "loss": "0.00476", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "56590", "lr": "2.27422e-05", "gnorm": "0.354", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14543"} 2023-01-29 20:14:07 | INFO | train_inner | {"epoch": 27, "update": 26.187, "s2c_loss": "0.023", "loss": "0.01594", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "56600", "lr": "2.26755e-05", "gnorm": "0.835", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14545"} 2023-01-29 20:14:10 | INFO | train_inner | {"epoch": 27, "update": 26.192, "s2c_loss": "0.027", "loss": "0.01889", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "56610", "lr": "2.26089e-05", "gnorm": "1.359", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14548"} 2023-01-29 20:14:12 | INFO | train_inner | {"epoch": 27, "update": 26.197, "s2c_loss": "0.007", "loss": "0.00476", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "56620", "lr": "2.25422e-05", "gnorm": "0.381", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14550"} 2023-01-29 20:14:15 | INFO | train_inner | {"epoch": 27, "update": 26.201, "s2c_loss": "0.01", 
"loss": "0.00717", "s2c_nll_loss": "0.01", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "56630", "lr": "2.24755e-05", "gnorm": "0.678", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14553"} 2023-01-29 20:14:17 | INFO | train_inner | {"epoch": 27, "update": 26.206, "s2c_loss": "0.008", "loss": "0.00539", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "56640", "lr": "2.24089e-05", "gnorm": "0.313", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14555"} 2023-01-29 20:14:20 | INFO | train_inner | {"epoch": 27, "update": 26.21, "s2c_loss": "0.03", "loss": "0.02101", "s2c_nll_loss": "0.03", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "56650", "lr": "2.23422e-05", "gnorm": "0.9", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "14558"} 2023-01-29 20:14:22 | INFO | train_inner | {"epoch": 27, "update": 26.215, "s2c_loss": "0.026", "loss": "0.01786", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "56660", "lr": "2.22756e-05", "gnorm": "0.913", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14560"} 2023-01-29 20:14:25 | INFO | train_inner | {"epoch": 27, "update": 26.22, "s2c_loss": "0.013", "loss": "0.00917", "s2c_nll_loss": "0.013", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "56670", "lr": "2.22089e-05", "gnorm": "0.729", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14563"} 2023-01-29 20:14:28 | INFO | train_inner | {"epoch": 27, "update": 26.224, "s2c_loss": 
"0.031", "loss": "0.02166", "s2c_nll_loss": "0.031", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "247.3", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "56680", "lr": "2.21422e-05", "gnorm": "0.686", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14566"} 2023-01-29 20:14:30 | INFO | train_inner | {"epoch": 27, "update": 26.229, "s2c_loss": "0.02", "loss": "0.01399", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "56690", "lr": "2.20756e-05", "gnorm": "0.496", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14568"} 2023-01-29 20:14:33 | INFO | train_inner | {"epoch": 27, "update": 26.234, "s2c_loss": "0.009", "loss": "0.00655", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "56700", "lr": "2.20089e-05", "gnorm": "0.426", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14571"} 2023-01-29 20:14:35 | INFO | train_inner | {"epoch": 27, "update": 26.238, "s2c_loss": "0.009", "loss": "0.00626", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "56710", "lr": "2.19422e-05", "gnorm": "0.654", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14573"} 2023-01-29 20:14:38 | INFO | train_inner | {"epoch": 27, "update": 26.243, "s2c_loss": "0.009", "loss": "0.00603", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "248.9", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "56720", "lr": "2.18756e-05", "gnorm": "0.596", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.3", "wall": "14576"} 2023-01-29 20:14:40 | INFO | train_inner | {"epoch": 27, "update": 
26.247, "s2c_loss": "0.027", "loss": "0.01845", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "56730", "lr": "2.18089e-05", "gnorm": "0.659", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "14578"} 2023-01-29 20:14:43 | INFO | train_inner | {"epoch": 27, "update": 26.252, "s2c_loss": "0.021", "loss": "0.01465", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "56740", "lr": "2.17422e-05", "gnorm": "0.611", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "14581"} 2023-01-29 20:14:45 | INFO | train_inner | {"epoch": 27, "update": 26.257, "s2c_loss": "0.01", "loss": "0.0069", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "56750", "lr": "2.16756e-05", "gnorm": "0.539", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "14583"} 2023-01-29 20:14:48 | INFO | train_inner | {"epoch": 27, "update": 26.261, "s2c_loss": "0.012", "loss": "0.00805", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "56760", "lr": "2.16089e-05", "gnorm": "0.596", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "14586"} 2023-01-29 20:14:50 | INFO | train_inner | {"epoch": 27, "update": 26.266, "s2c_loss": "0.019", "loss": "0.01302", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "56770", "lr": "2.15423e-05", "gnorm": "0.484", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "14588"} 2023-01-29 20:14:53 | INFO | train_inner | 
{"epoch": 27, "update": 26.271, "s2c_loss": "0.006", "loss": "0.00446", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "56780", "lr": "2.14756e-05", "gnorm": "0.251", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "14591"} 2023-01-29 20:14:55 | INFO | train_inner | {"epoch": 27, "update": 26.275, "s2c_loss": "0.01", "loss": "0.00698", "s2c_nll_loss": "0.01", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "56790", "lr": "2.14089e-05", "gnorm": "0.551", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "14593"} 2023-01-29 20:14:58 | INFO | train_inner | {"epoch": 27, "update": 26.28, "s2c_loss": "0.012", "loss": "0.00812", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "241.7", "ups": "3.78", "wpb": "64", "bsz": "64", "num_updates": "56800", "lr": "2.13423e-05", "gnorm": "0.642", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.3", "wall": "14596"} 2023-01-29 20:15:01 | INFO | train_inner | {"epoch": 27, "update": 26.284, "s2c_loss": "0.023", "loss": "0.01565", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "56810", "lr": "2.12756e-05", "gnorm": "0.753", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.3", "wall": "14599"} 2023-01-29 20:15:03 | INFO | train_inner | {"epoch": 27, "update": 26.289, "s2c_loss": "0.01", "loss": "0.00684", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "245.2", "ups": "3.83", "wpb": "64", "bsz": "64", "num_updates": "56820", "lr": "2.12089e-05", "gnorm": "0.485", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.4", "wall": "14601"} 2023-01-29 20:15:06 | INFO 
| train_inner | {"epoch": 27, "update": 26.294, "s2c_loss": "0.008", "loss": "0.00535", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "56830", "lr": "2.11423e-05", "gnorm": "0.354", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.4", "wall": "14604"} 2023-01-29 20:15:08 | INFO | train_inner | {"epoch": 27, "update": 26.298, "s2c_loss": "0.015", "loss": "0.01033", "s2c_nll_loss": "0.015", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "56840", "lr": "2.10756e-05", "gnorm": "0.685", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "14606"} 2023-01-29 20:15:11 | INFO | train_inner | {"epoch": 27, "update": 26.303, "s2c_loss": "0.012", "loss": "0.00829", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.2", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "56850", "lr": "2.10089e-05", "gnorm": "0.74", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.4", "wall": "14609"} 2023-01-29 20:15:14 | INFO | train_inner | {"epoch": 27, "update": 26.308, "s2c_loss": "0.013", "loss": "0.00918", "s2c_nll_loss": "0.013", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "56860", "lr": "2.09423e-05", "gnorm": "0.631", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.3", "wall": "14611"} 2023-01-29 20:15:14 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 4096.0 2023-01-29 20:15:16 | INFO | train_inner | {"epoch": 27, "update": 26.313, "s2c_loss": "0.019", "loss": "0.01289", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "233.4", "ups": "3.65", "wpb": "64", "bsz": "64", "num_updates": 
"56870", "lr": "2.08756e-05", "gnorm": "0.723", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14614"} 2023-01-29 20:15:19 | INFO | train_inner | {"epoch": 27, "update": 26.317, "s2c_loss": "0.02", "loss": "0.01389", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "56880", "lr": "2.0809e-05", "gnorm": "1.124", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14617"} 2023-01-29 20:15:21 | INFO | train_inner | {"epoch": 27, "update": 26.322, "s2c_loss": "0.006", "loss": "0.00425", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "56890", "lr": "2.07423e-05", "gnorm": "0.403", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14619"} 2023-01-29 20:15:24 | INFO | train_inner | {"epoch": 27, "update": 26.327, "s2c_loss": "0.025", "loss": "0.01712", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "56900", "lr": "2.06756e-05", "gnorm": "0.913", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14622"} 2023-01-29 20:15:26 | INFO | train_inner | {"epoch": 27, "update": 26.331, "s2c_loss": "0.007", "loss": "0.00519", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "56910", "lr": "2.0609e-05", "gnorm": "0.458", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14624"} 2023-01-29 20:15:29 | INFO | train_inner | {"epoch": 27, "update": 26.336, "s2c_loss": "0.006", "loss": "0.00385", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", 
"num_updates": "56920", "lr": "2.05423e-05", "gnorm": "0.274", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14627"} 2023-01-29 20:15:31 | INFO | train_inner | {"epoch": 27, "update": 26.34, "s2c_loss": "0.015", "loss": "0.01073", "s2c_nll_loss": "0.015", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "56930", "lr": "2.04756e-05", "gnorm": "0.787", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14629"} 2023-01-29 20:15:34 | INFO | train_inner | {"epoch": 27, "update": 26.345, "s2c_loss": "0.019", "loss": "0.01318", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "247", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "56940", "lr": "2.0409e-05", "gnorm": "0.85", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14632"} 2023-01-29 20:15:37 | INFO | train_inner | {"epoch": 27, "update": 26.35, "s2c_loss": "0.025", "loss": "0.01751", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "56950", "lr": "2.03423e-05", "gnorm": "1.099", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14634"} 2023-01-29 20:15:39 | INFO | train_inner | {"epoch": 27, "update": 26.354, "s2c_loss": "0.027", "loss": "0.01905", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "56960", "lr": "2.02757e-05", "gnorm": "0.863", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14637"} 2023-01-29 20:15:41 | INFO | train_inner | {"epoch": 27, "update": 26.359, "s2c_loss": "0.027", "loss": "0.01899", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.2", "ups": "4", "wpb": 
"64", "bsz": "64", "num_updates": "56970", "lr": "2.0209e-05", "gnorm": "0.734", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14639"} 2023-01-29 20:15:44 | INFO | train_inner | {"epoch": 27, "update": 26.364, "s2c_loss": "0.009", "loss": "0.00641", "s2c_nll_loss": "0.009", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "56980", "lr": "2.01423e-05", "gnorm": "0.647", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14642"} 2023-01-29 20:15:47 | INFO | train_inner | {"epoch": 27, "update": 26.368, "s2c_loss": "0.008", "loss": "0.00537", "s2c_nll_loss": "0.008", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "56990", "lr": "2.00757e-05", "gnorm": "0.392", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14644"} 2023-01-29 20:15:49 | INFO | train_inner | {"epoch": 27, "update": 26.373, "s2c_loss": "0.029", "loss": "0.0202", "s2c_nll_loss": "0.029", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "257.5", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "57000", "lr": "2.0009e-05", "gnorm": "0.825", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14647"} 2023-01-29 20:15:52 | INFO | train_inner | {"epoch": 27, "update": 26.377, "s2c_loss": "0.026", "loss": "0.01776", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "57010", "lr": "1.99423e-05", "gnorm": "1.129", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14649"} 2023-01-29 20:15:54 | INFO | train_inner | {"epoch": 27, "update": 26.382, "s2c_loss": "0.012", "loss": "0.00847", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": 
"253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "57020", "lr": "1.98757e-05", "gnorm": "0.661", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14652"} 2023-01-29 20:15:57 | INFO | train_inner | {"epoch": 27, "update": 26.387, "s2c_loss": "0.014", "loss": "0.00944", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "57030", "lr": "1.9809e-05", "gnorm": "0.787", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14655"} 2023-01-29 20:15:59 | INFO | train_inner | {"epoch": 27, "update": 26.391, "s2c_loss": "0.023", "loss": "0.01624", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "57040", "lr": "1.97423e-05", "gnorm": "1.071", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14657"} 2023-01-29 20:16:02 | INFO | train_inner | {"epoch": 27, "update": 26.396, "s2c_loss": "0.023", "loss": "0.01586", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "57050", "lr": "1.96757e-05", "gnorm": "0.591", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "14660"} 2023-01-29 20:16:04 | INFO | train_inner | {"epoch": 27, "update": 26.401, "s2c_loss": "0.016", "loss": "0.01121", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "57060", "lr": "1.9609e-05", "gnorm": "0.613", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14662"} 2023-01-29 20:16:07 | INFO | train_inner | {"epoch": 27, "update": 26.405, "s2c_loss": "0.003", "loss": "0.00234", "s2c_nll_loss": "0.003", "s2c_accuracy": "100", "s2c_total": "64", 
"s2c_n_correct": "64", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "57070", "lr": "1.95424e-05", "gnorm": "0.185", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14665"} 2023-01-29 20:16:09 | INFO | train_inner | {"epoch": 27, "update": 26.41, "s2c_loss": "0.025", "loss": "0.01701", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "57080", "lr": "1.94757e-05", "gnorm": "1.081", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14667"} 2023-01-29 20:16:12 | INFO | train_inner | {"epoch": 27, "update": 26.414, "s2c_loss": "0.013", "loss": "0.00877", "s2c_nll_loss": "0.013", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "57090", "lr": "1.9409e-05", "gnorm": "0.638", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14670"} 2023-01-29 20:16:14 | INFO | train_inner | {"epoch": 27, "update": 26.419, "s2c_loss": "0.021", "loss": "0.01479", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "57100", "lr": "1.93424e-05", "gnorm": "0.684", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14672"} 2023-01-29 20:16:17 | INFO | train_inner | {"epoch": 27, "update": 26.424, "s2c_loss": "0.025", "loss": "0.01728", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "57110", "lr": "1.92757e-05", "gnorm": "0.903", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14675"} 2023-01-29 20:16:19 | INFO | train_inner | {"epoch": 27, "update": 26.428, "s2c_loss": "0.012", "loss": "0.00828", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", 
"s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "57120", "lr": "1.9209e-05", "gnorm": "0.626", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14677"} 2023-01-29 20:16:22 | INFO | train_inner | {"epoch": 27, "update": 26.433, "s2c_loss": "0.015", "loss": "0.01072", "s2c_nll_loss": "0.015", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "57130", "lr": "1.91424e-05", "gnorm": "0.647", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14680"} 2023-01-29 20:16:24 | INFO | train_inner | {"epoch": 27, "update": 26.438, "s2c_loss": "0.026", "loss": "0.01768", "s2c_nll_loss": "0.026", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "248.2", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "57140", "lr": "1.90757e-05", "gnorm": "0.567", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14682"} 2023-01-29 20:16:27 | INFO | train_inner | {"epoch": 27, "update": 26.442, "s2c_loss": "0.007", "loss": "0.00513", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "246.6", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "57150", "lr": "1.9009e-05", "gnorm": "0.508", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14685"} 2023-01-29 20:16:30 | INFO | train_inner | {"epoch": 27, "update": 26.447, "s2c_loss": "0.008", "loss": "0.00544", "s2c_nll_loss": "0.008", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "57160", "lr": "1.89424e-05", "gnorm": "0.559", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14688"} 2023-01-29 20:16:32 | INFO | train_inner | {"epoch": 27, "update": 26.451, "s2c_loss": "0.029", "loss": "0.02006", "s2c_nll_loss": "0.029", 
"s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "57170", "lr": "1.88757e-05", "gnorm": "0.705", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14690"} 2023-01-29 20:16:35 | INFO | train_inner | {"epoch": 27, "update": 26.456, "s2c_loss": "0.012", "loss": "0.00847", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "57180", "lr": "1.88091e-05", "gnorm": "0.702", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14693"} 2023-01-29 20:16:37 | INFO | train_inner | {"epoch": 27, "update": 26.461, "s2c_loss": "0.013", "loss": "0.00885", "s2c_nll_loss": "0.013", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "57190", "lr": "1.87424e-05", "gnorm": "0.621", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14695"} 2023-01-29 20:16:40 | INFO | train_inner | {"epoch": 27, "update": 26.465, "s2c_loss": "0.016", "loss": "0.01117", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "57200", "lr": "1.86757e-05", "gnorm": "0.675", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14698"} 2023-01-29 20:16:42 | INFO | train_inner | {"epoch": 27, "update": 26.47, "s2c_loss": "0.025", "loss": "0.01764", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "57210", "lr": "1.86091e-05", "gnorm": "1.178", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14700"} 2023-01-29 20:16:45 | INFO | train_inner | {"epoch": 27, "update": 26.475, "s2c_loss": "0.007", "loss": "0.00487", 
"s2c_nll_loss": "0.007", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.9", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "57220", "lr": "1.85424e-05", "gnorm": "0.368", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14703"} 2023-01-29 20:16:47 | INFO | train_inner | {"epoch": 27, "update": 26.479, "s2c_loss": "0.035", "loss": "0.0242", "s2c_nll_loss": "0.035", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "57230", "lr": "1.84757e-05", "gnorm": "0.867", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14705"} 2023-01-29 20:16:50 | INFO | train_inner | {"epoch": 27, "update": 26.484, "s2c_loss": "0.009", "loss": "0.00656", "s2c_nll_loss": "0.009", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "249.1", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "57240", "lr": "1.84091e-05", "gnorm": "0.72", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14708"} 2023-01-29 20:16:52 | INFO | train_inner | {"epoch": 27, "update": 26.488, "s2c_loss": "0.009", "loss": "0.00606", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "258.4", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "57250", "lr": "1.83424e-05", "gnorm": "0.49", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14710"} 2023-01-29 20:16:55 | INFO | train_inner | {"epoch": 27, "update": 26.493, "s2c_loss": "0.017", "loss": "0.01206", "s2c_nll_loss": "0.017", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.4", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "57260", "lr": "1.82758e-05", "gnorm": "0.567", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14713"} 2023-01-29 20:16:57 | INFO | train_inner | {"epoch": 27, "update": 26.498, "s2c_loss": 
"0.019", "loss": "0.01295", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "57270", "lr": "1.82091e-05", "gnorm": "0.891", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14715"} 2023-01-29 20:17:00 | INFO | train_inner | {"epoch": 27, "update": 26.502, "s2c_loss": "0.015", "loss": "0.01063", "s2c_nll_loss": "0.015", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "57280", "lr": "1.81424e-05", "gnorm": "0.66", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14718"} 2023-01-29 20:17:03 | INFO | train_inner | {"epoch": 27, "update": 26.507, "s2c_loss": "0.01", "loss": "0.00722", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "57290", "lr": "1.80758e-05", "gnorm": "0.343", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14720"} 2023-01-29 20:17:05 | INFO | train_inner | {"epoch": 27, "update": 26.512, "s2c_loss": "0.016", "loss": "0.01085", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "57300", "lr": "1.80091e-05", "gnorm": "0.679", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14723"} 2023-01-29 20:17:08 | INFO | train_inner | {"epoch": 27, "update": 26.516, "s2c_loss": "0.011", "loss": "0.00765", "s2c_nll_loss": "0.011", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "57310", "lr": "1.79424e-05", "gnorm": "0.638", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14726"} 2023-01-29 20:17:10 | INFO | train_inner | {"epoch": 27, 
"update": 26.521, "s2c_loss": "0.009", "loss": "0.00656", "s2c_nll_loss": "0.009", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "57320", "lr": "1.78758e-05", "gnorm": "0.442", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14728"} 2023-01-29 20:17:13 | INFO | train_inner | {"epoch": 27, "update": 26.525, "s2c_loss": "0.006", "loss": "0.00383", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "57330", "lr": "1.78091e-05", "gnorm": "0.281", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14731"} 2023-01-29 20:17:15 | INFO | train_inner | {"epoch": 27, "update": 26.53, "s2c_loss": "0.018", "loss": "0.01214", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "57340", "lr": "1.77424e-05", "gnorm": "1.045", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14733"} 2023-01-29 20:17:18 | INFO | train_inner | {"epoch": 27, "update": 26.535, "s2c_loss": "0.01", "loss": "0.00677", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "246.1", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "57350", "lr": "1.76758e-05", "gnorm": "0.509", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14736"} 2023-01-29 20:17:20 | INFO | train_inner | {"epoch": 27, "update": 26.539, "s2c_loss": "0.022", "loss": "0.01555", "s2c_nll_loss": "0.022", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "57360", "lr": "1.76091e-05", "gnorm": "0.765", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14738"} 2023-01-29 20:17:23 | INFO | train_inner | 
{"epoch": 27, "update": 26.544, "s2c_loss": "0.017", "loss": "0.01177", "s2c_nll_loss": "0.017", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "57370", "lr": "1.75425e-05", "gnorm": "0.945", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14741"} 2023-01-29 20:17:25 | INFO | train_inner | {"epoch": 27, "update": 26.549, "s2c_loss": "0.008", "loss": "0.00522", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "57380", "lr": "1.74758e-05", "gnorm": "0.47", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14743"} 2023-01-29 20:17:28 | INFO | train_inner | {"epoch": 27, "update": 26.553, "s2c_loss": "0.008", "loss": "0.00578", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "63.7", "s2c_n_correct": "63.7", "wps": "247.3", "ups": "3.88", "wpb": "63.7", "bsz": "63.7", "num_updates": "57390", "lr": "1.74091e-05", "gnorm": "0.542", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14746"} 2023-01-29 20:17:31 | INFO | train_inner | {"epoch": 27, "update": 26.558, "s2c_loss": "0.011", "loss": "0.00751", "s2c_nll_loss": "0.011", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "57400", "lr": "1.73425e-05", "gnorm": "0.441", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14749"} 2023-01-29 20:17:33 | INFO | train_inner | {"epoch": 27, "update": 26.562, "s2c_loss": "0.019", "loss": "0.01302", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "57410", "lr": "1.72758e-05", "gnorm": "0.65", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14751"} 2023-01-29 
20:17:36 | INFO | train_inner | {"epoch": 27, "update": 26.567, "s2c_loss": "0.022", "loss": "0.01495", "s2c_nll_loss": "0.022", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "246.3", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "57420", "lr": "1.72091e-05", "gnorm": "0.584", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14754"} 2023-01-29 20:17:38 | INFO | train_inner | {"epoch": 27, "update": 26.572, "s2c_loss": "0.006", "loss": "0.00391", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "57430", "lr": "1.71425e-05", "gnorm": "0.295", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14756"} 2023-01-29 20:17:41 | INFO | train_inner | {"epoch": 27, "update": 26.576, "s2c_loss": "0.006", "loss": "0.00444", "s2c_nll_loss": "0.006", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "57440", "lr": "1.70758e-05", "gnorm": "0.318", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14759"} 2023-01-29 20:17:43 | INFO | train_inner | {"epoch": 27, "update": 26.581, "s2c_loss": "0.015", "loss": "0.01016", "s2c_nll_loss": "0.015", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "57450", "lr": "1.70091e-05", "gnorm": "0.699", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14761"} 2023-01-29 20:17:46 | INFO | train_inner | {"epoch": 27, "update": 26.586, "s2c_loss": "0.005", "loss": "0.0038", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "57460", "lr": "1.69425e-05", "gnorm": "0.283", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": 
"14764"} 2023-01-29 20:17:49 | INFO | train_inner | {"epoch": 27, "update": 26.59, "s2c_loss": "0.011", "loss": "0.00772", "s2c_nll_loss": "0.011", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "57470", "lr": "1.68758e-05", "gnorm": "0.951", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14766"} 2023-01-29 20:17:51 | INFO | train_inner | {"epoch": 27, "update": 26.595, "s2c_loss": "0.008", "loss": "0.00535", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249.4", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "57480", "lr": "1.68092e-05", "gnorm": "0.428", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14769"} 2023-01-29 20:17:54 | INFO | train_inner | {"epoch": 27, "update": 26.599, "s2c_loss": "0.012", "loss": "0.00846", "s2c_nll_loss": "0.012", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "57490", "lr": "1.67425e-05", "gnorm": "0.719", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14772"} 2023-01-29 20:17:56 | INFO | train_inner | {"epoch": 27, "update": 26.604, "s2c_loss": "0.009", "loss": "0.00637", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "57500", "lr": "1.66758e-05", "gnorm": "0.602", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14774"} 2023-01-29 20:17:59 | INFO | train_inner | {"epoch": 27, "update": 26.609, "s2c_loss": "0.012", "loss": "0.00842", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "57510", "lr": "1.66092e-05", "gnorm": "0.723", "loss_scale": "4096", "train_wall": "3", "gb_free": 
"7.2", "wall": "14777"} 2023-01-29 20:18:01 | INFO | train_inner | {"epoch": 27, "update": 26.613, "s2c_loss": "0.005", "loss": "0.00322", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "57520", "lr": "1.65425e-05", "gnorm": "0.222", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14779"} 2023-01-29 20:18:04 | INFO | train_inner | {"epoch": 27, "update": 26.618, "s2c_loss": "0.008", "loss": "0.00585", "s2c_nll_loss": "0.008", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "57530", "lr": "1.64758e-05", "gnorm": "0.538", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14782"} 2023-01-29 20:18:06 | INFO | train_inner | {"epoch": 27, "update": 26.623, "s2c_loss": "0.007", "loss": "0.00504", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "57540", "lr": "1.64092e-05", "gnorm": "0.491", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14784"} 2023-01-29 20:18:09 | INFO | train_inner | {"epoch": 27, "update": 26.627, "s2c_loss": "0.01", "loss": "0.00691", "s2c_nll_loss": "0.01", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "247.7", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "57550", "lr": "1.63425e-05", "gnorm": "0.579", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14787"} 2023-01-29 20:18:12 | INFO | train_inner | {"epoch": 27, "update": 26.632, "s2c_loss": "0.014", "loss": "0.00969", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "57560", "lr": "1.62759e-05", "gnorm": "0.811", "loss_scale": "4096", "train_wall": 
"3", "gb_free": "7.3", "wall": "14789"} 2023-01-29 20:18:14 | INFO | train_inner | {"epoch": 27, "update": 26.636, "s2c_loss": "0.016", "loss": "0.01131", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "57570", "lr": "1.62092e-05", "gnorm": "0.679", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14792"} 2023-01-29 20:18:17 | INFO | train_inner | {"epoch": 27, "update": 26.641, "s2c_loss": "0.014", "loss": "0.00982", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "57580", "lr": "1.61425e-05", "gnorm": "0.396", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14794"} 2023-01-29 20:18:19 | INFO | train_inner | {"epoch": 27, "update": 26.646, "s2c_loss": "0.012", "loss": "0.00825", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "57590", "lr": "1.60759e-05", "gnorm": "0.632", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14797"} 2023-01-29 20:18:22 | INFO | train_inner | {"epoch": 27, "update": 26.65, "s2c_loss": "0.009", "loss": "0.00602", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "57600", "lr": "1.60092e-05", "gnorm": "0.512", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14800"} 2023-01-29 20:18:24 | INFO | train_inner | {"epoch": 27, "update": 26.655, "s2c_loss": "0.008", "loss": "0.00585", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.5", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "57610", "lr": "1.59425e-05", "gnorm": "0.522", "loss_scale": 
"4096", "train_wall": "3", "gb_free": "7.3", "wall": "14802"} 2023-01-29 20:18:27 | INFO | train_inner | {"epoch": 27, "update": 26.66, "s2c_loss": "0.007", "loss": "0.00464", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "246.6", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "57620", "lr": "1.58759e-05", "gnorm": "0.398", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14805"} 2023-01-29 20:18:29 | INFO | train_inner | {"epoch": 27, "update": 26.664, "s2c_loss": "0.004", "loss": "0.00243", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "57630", "lr": "1.58092e-05", "gnorm": "0.226", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14807"} 2023-01-29 20:18:32 | INFO | train_inner | {"epoch": 27, "update": 26.669, "s2c_loss": "0.012", "loss": "0.00822", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "57640", "lr": "1.57425e-05", "gnorm": "0.543", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14810"} 2023-01-29 20:18:34 | INFO | train_inner | {"epoch": 27, "update": 26.673, "s2c_loss": "0.017", "loss": "0.01157", "s2c_nll_loss": "0.017", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "246.2", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "57650", "lr": "1.56759e-05", "gnorm": "0.779", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14812"} 2023-01-29 20:18:37 | INFO | train_inner | {"epoch": 27, "update": 26.678, "s2c_loss": "0.009", "loss": "0.00658", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "57660", "lr": "1.56092e-05", "gnorm": "0.698", 
"loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14815"} 2023-01-29 20:18:39 | INFO | train_inner | {"epoch": 27, "update": 26.683, "s2c_loss": "0.01", "loss": "0.00665", "s2c_nll_loss": "0.01", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "57670", "lr": "1.55426e-05", "gnorm": "0.498", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14817"} 2023-01-29 20:18:42 | INFO | train_inner | {"epoch": 27, "update": 26.687, "s2c_loss": "0.006", "loss": "0.00384", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "257.8", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "57680", "lr": "1.54759e-05", "gnorm": "0.261", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14820"} 2023-01-29 20:18:45 | INFO | train_inner | {"epoch": 27, "update": 26.692, "s2c_loss": "0.154", "loss": "0.10652", "s2c_nll_loss": "0.154", "s2c_accuracy": "98.75", "s2c_total": "64", "s2c_n_correct": "63.2", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "57690", "lr": "1.54092e-05", "gnorm": "0.847", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14822"} 2023-01-29 20:18:47 | INFO | train_inner | {"epoch": 27, "update": 26.697, "s2c_loss": "0.02", "loss": "0.01405", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "57700", "lr": "1.53426e-05", "gnorm": "0.897", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14825"} 2023-01-29 20:18:50 | INFO | train_inner | {"epoch": 27, "update": 26.701, "s2c_loss": "0.008", "loss": "0.00581", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "57710", "lr": "1.52759e-05", 
"gnorm": "0.593", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14827"} 2023-01-29 20:18:52 | INFO | train_inner | {"epoch": 27, "update": 26.706, "s2c_loss": "0.033", "loss": "0.02317", "s2c_nll_loss": "0.033", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "57720", "lr": "1.52092e-05", "gnorm": "1.618", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.5", "wall": "14830"} 2023-01-29 20:18:55 | INFO | train_inner | {"epoch": 27, "update": 26.71, "s2c_loss": "0.032", "loss": "0.02226", "s2c_nll_loss": "0.032", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "57730", "lr": "1.51426e-05", "gnorm": "1.116", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14833"} 2023-01-29 20:18:57 | INFO | train_inner | {"epoch": 27, "update": 26.715, "s2c_loss": "0.01", "loss": "0.00706", "s2c_nll_loss": "0.01", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "246", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "57740", "lr": "1.50759e-05", "gnorm": "0.674", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14835"} 2023-01-29 20:19:00 | INFO | train_inner | {"epoch": 27, "update": 26.72, "s2c_loss": "0.011", "loss": "0.00764", "s2c_nll_loss": "0.011", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "57750", "lr": "1.50092e-05", "gnorm": "0.391", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14838"} 2023-01-29 20:19:02 | INFO | train_inner | {"epoch": 27, "update": 26.724, "s2c_loss": "0.019", "loss": "0.01285", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "57760", 
"lr": "1.49426e-05", "gnorm": "0.552", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14840"} 2023-01-29 20:19:05 | INFO | train_inner | {"epoch": 27, "update": 26.729, "s2c_loss": "0.14", "loss": "0.09716", "s2c_nll_loss": "0.14", "s2c_accuracy": "99.375", "s2c_total": "64", "s2c_n_correct": "63.6", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "57770", "lr": "1.48759e-05", "gnorm": "0.548", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14843"} 2023-01-29 20:19:07 | INFO | train_inner | {"epoch": 27, "update": 26.734, "s2c_loss": "0.011", "loss": "0.00743", "s2c_nll_loss": "0.011", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "57780", "lr": "1.48093e-05", "gnorm": "0.616", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14845"} 2023-01-29 20:19:10 | INFO | train_inner | {"epoch": 27, "update": 26.738, "s2c_loss": "0.011", "loss": "0.00782", "s2c_nll_loss": "0.011", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "57790", "lr": "1.47426e-05", "gnorm": "0.536", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14848"} 2023-01-29 20:19:12 | INFO | train_inner | {"epoch": 27, "update": 26.743, "s2c_loss": "0.013", "loss": "0.0088", "s2c_nll_loss": "0.013", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "57800", "lr": "1.46759e-05", "gnorm": "0.285", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14850"} 2023-01-29 20:19:15 | INFO | train_inner | {"epoch": 27, "update": 26.747, "s2c_loss": "0.012", "loss": "0.00821", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "258.5", "ups": "4.04", "wpb": "64", "bsz": "64", 
"num_updates": "57810", "lr": "1.46093e-05", "gnorm": "0.461", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14853"} 2023-01-29 20:19:17 | INFO | train_inner | {"epoch": 27, "update": 26.752, "s2c_loss": "0.012", "loss": "0.00854", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "246.1", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "57820", "lr": "1.45426e-05", "gnorm": "0.464", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14855"} 2023-01-29 20:19:20 | INFO | train_inner | {"epoch": 27, "update": 26.757, "s2c_loss": "0.005", "loss": "0.00363", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "57830", "lr": "1.44759e-05", "gnorm": "0.308", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14858"} 2023-01-29 20:19:22 | INFO | train_inner | {"epoch": 27, "update": 26.761, "s2c_loss": "0.007", "loss": "0.00484", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "57840", "lr": "1.44093e-05", "gnorm": "0.349", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14860"} 2023-01-29 20:19:25 | INFO | train_inner | {"epoch": 27, "update": 26.766, "s2c_loss": "0.01", "loss": "0.00678", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "57850", "lr": "1.43426e-05", "gnorm": "0.552", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14863"} 2023-01-29 20:19:28 | INFO | train_inner | {"epoch": 27, "update": 26.771, "s2c_loss": "0.02", "loss": "0.01363", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "247.6", "ups": "3.87", "wpb": "64", 
"bsz": "64", "num_updates": "57860", "lr": "1.4276e-05", "gnorm": "0.888", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14865"} 2023-01-29 20:19:30 | INFO | train_inner | {"epoch": 27, "update": 26.775, "s2c_loss": "0.028", "loss": "0.01934", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "57870", "lr": "1.42093e-05", "gnorm": "0.502", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14868"} 2023-01-29 20:19:33 | INFO | train_inner | {"epoch": 27, "update": 26.78, "s2c_loss": "0.014", "loss": "0.00955", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "57880", "lr": "1.41426e-05", "gnorm": "0.746", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14871"} 2023-01-29 20:19:35 | INFO | train_inner | {"epoch": 27, "update": 26.784, "s2c_loss": "0.012", "loss": "0.00826", "s2c_nll_loss": "0.012", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "57890", "lr": "1.4076e-05", "gnorm": "0.443", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14873"} 2023-01-29 20:19:38 | INFO | train_inner | {"epoch": 27, "update": 26.789, "s2c_loss": "0.008", "loss": "0.00583", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "57900", "lr": "1.40093e-05", "gnorm": "0.569", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14876"} 2023-01-29 20:19:40 | INFO | train_inner | {"epoch": 27, "update": 26.794, "s2c_loss": "0.019", "loss": "0.01284", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.2", "ups": "4", 
"wpb": "64", "bsz": "64", "num_updates": "57910", "lr": "1.39426e-05", "gnorm": "0.897", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14878"} 2023-01-29 20:19:43 | INFO | train_inner | {"epoch": 27, "update": 26.798, "s2c_loss": "0.011", "loss": "0.00789", "s2c_nll_loss": "0.011", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "57920", "lr": "1.3876e-05", "gnorm": "0.527", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14881"} 2023-01-29 20:19:45 | INFO | train_inner | {"epoch": 27, "update": 26.803, "s2c_loss": "0.008", "loss": "0.00571", "s2c_nll_loss": "0.008", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "57930", "lr": "1.38093e-05", "gnorm": "0.453", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14883"} 2023-01-29 20:19:48 | INFO | train_inner | {"epoch": 27, "update": 26.808, "s2c_loss": "0.007", "loss": "0.00495", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "57940", "lr": "1.37426e-05", "gnorm": "0.309", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14886"} 2023-01-29 20:19:50 | INFO | train_inner | {"epoch": 27, "update": 26.812, "s2c_loss": "0.017", "loss": "0.01193", "s2c_nll_loss": "0.017", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "248", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "57950", "lr": "1.3676e-05", "gnorm": "0.464", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14888"} 2023-01-29 20:19:53 | INFO | train_inner | {"epoch": 27, "update": 26.817, "s2c_loss": "0.006", "loss": "0.00406", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249.5", 
"ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "57960", "lr": "1.36093e-05", "gnorm": "0.354", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14891"} 2023-01-29 20:19:55 | INFO | train_inner | {"epoch": 27, "update": 26.821, "s2c_loss": "0.01", "loss": "0.00686", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "57970", "lr": "1.35427e-05", "gnorm": "0.432", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14893"} 2023-01-29 20:19:58 | INFO | train_inner | {"epoch": 27, "update": 26.826, "s2c_loss": "0.016", "loss": "0.011", "s2c_nll_loss": "0.016", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "57980", "lr": "1.3476e-05", "gnorm": "0.389", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14896"} 2023-01-29 20:20:01 | INFO | train_inner | {"epoch": 27, "update": 26.831, "s2c_loss": "0.016", "loss": "0.0109", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.1", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "57990", "lr": "1.34093e-05", "gnorm": "0.804", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14899"} 2023-01-29 20:20:03 | INFO | train_inner | {"epoch": 27, "update": 26.835, "s2c_loss": "0.011", "loss": "0.00781", "s2c_nll_loss": "0.011", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "58000", "lr": "1.33427e-05", "gnorm": "0.469", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14901"} 2023-01-29 20:20:06 | INFO | train_inner | {"epoch": 27, "update": 26.84, "s2c_loss": "0.142", "loss": "0.0981", "s2c_nll_loss": "0.142", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": 
"256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "58010", "lr": "1.3276e-05", "gnorm": "0.538", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14904"} 2023-01-29 20:20:08 | INFO | train_inner | {"epoch": 27, "update": 26.845, "s2c_loss": "0.015", "loss": "0.01056", "s2c_nll_loss": "0.015", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "58020", "lr": "1.32093e-05", "gnorm": "0.751", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14906"} 2023-01-29 20:20:11 | INFO | train_inner | {"epoch": 27, "update": 26.849, "s2c_loss": "0.01", "loss": "0.0071", "s2c_nll_loss": "0.01", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "58030", "lr": "1.31427e-05", "gnorm": "0.624", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14909"} 2023-01-29 20:20:13 | INFO | train_inner | {"epoch": 27, "update": 26.854, "s2c_loss": "0.016", "loss": "0.01093", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "244.2", "ups": "3.82", "wpb": "64", "bsz": "64", "num_updates": "58040", "lr": "1.3076e-05", "gnorm": "0.517", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14911"} 2023-01-29 20:20:16 | INFO | train_inner | {"epoch": 27, "update": 26.858, "s2c_loss": "0.012", "loss": "0.00854", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "58050", "lr": "1.30093e-05", "gnorm": "0.616", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14914"} 2023-01-29 20:20:18 | INFO | train_inner | {"epoch": 27, "update": 26.863, "s2c_loss": "0.02", "loss": "0.01383", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.844", "s2c_total": "64", 
"s2c_n_correct": "63.9", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "58060", "lr": "1.29427e-05", "gnorm": "0.7", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14916"} 2023-01-29 20:20:21 | INFO | train_inner | {"epoch": 27, "update": 26.868, "s2c_loss": "0.018", "loss": "0.01252", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "58070", "lr": "1.2876e-05", "gnorm": "0.792", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14919"} 2023-01-29 20:20:23 | INFO | train_inner | {"epoch": 27, "update": 26.872, "s2c_loss": "0.017", "loss": "0.01193", "s2c_nll_loss": "0.017", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "58080", "lr": "1.28094e-05", "gnorm": "0.472", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14921"} 2023-01-29 20:20:26 | INFO | train_inner | {"epoch": 27, "update": 26.877, "s2c_loss": "0.006", "loss": "0.00445", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "58090", "lr": "1.27427e-05", "gnorm": "0.383", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14924"} 2023-01-29 20:20:29 | INFO | train_inner | {"epoch": 27, "update": 26.882, "s2c_loss": "0.02", "loss": "0.01402", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "246.3", "ups": "3.85", "wpb": "64", "bsz": "64", "num_updates": "58100", "lr": "1.2676e-05", "gnorm": "1.017", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "14927"} 2023-01-29 20:20:31 | INFO | train_inner | {"epoch": 27, "update": 26.886, "s2c_loss": "0.008", "loss": "0.00575", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", 
"s2c_total": "64", "s2c_n_correct": "64", "wps": "251.9", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "58110", "lr": "1.26094e-05", "gnorm": "0.403", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14929"} 2023-01-29 20:20:34 | INFO | train_inner | {"epoch": 27, "update": 26.891, "s2c_loss": "0.013", "loss": "0.00871", "s2c_nll_loss": "0.013", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "58120", "lr": "1.25427e-05", "gnorm": "0.86", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14932"} 2023-01-29 20:20:36 | INFO | train_inner | {"epoch": 27, "update": 26.895, "s2c_loss": "0.008", "loss": "0.00564", "s2c_nll_loss": "0.008", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "58130", "lr": "1.2476e-05", "gnorm": "0.389", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14934"} 2023-01-29 20:20:39 | INFO | train_inner | {"epoch": 27, "update": 26.9, "s2c_loss": "0.012", "loss": "0.00858", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "58140", "lr": "1.24094e-05", "gnorm": "0.517", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14937"} 2023-01-29 20:20:41 | INFO | train_inner | {"epoch": 27, "update": 26.905, "s2c_loss": "0.021", "loss": "0.01464", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.7", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "58150", "lr": "1.23427e-05", "gnorm": "0.773", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14939"} 2023-01-29 20:20:44 | INFO | train_inner | {"epoch": 27, "update": 26.909, "s2c_loss": "0.005", "loss": "0.00354", "s2c_nll_loss": "0.005", 
"s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "58160", "lr": "1.22761e-05", "gnorm": "0.204", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14942"} 2023-01-29 20:20:46 | INFO | train_inner | {"epoch": 27, "update": 26.914, "s2c_loss": "0.175", "loss": "0.12157", "s2c_nll_loss": "0.175", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "58170", "lr": "1.22094e-05", "gnorm": "0.801", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14944"} 2023-01-29 20:20:49 | INFO | train_inner | {"epoch": 27, "update": 26.919, "s2c_loss": "0.008", "loss": "0.00524", "s2c_nll_loss": "0.008", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "58180", "lr": "1.21427e-05", "gnorm": "0.463", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14947"} 2023-01-29 20:20:51 | INFO | train_inner | {"epoch": 27, "update": 26.923, "s2c_loss": "0.013", "loss": "0.0091", "s2c_nll_loss": "0.013", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "58190", "lr": "1.20761e-05", "gnorm": "0.797", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14949"} 2023-01-29 20:20:54 | INFO | train_inner | {"epoch": 27, "update": 26.928, "s2c_loss": "0.008", "loss": "0.00583", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "258.1", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "58200", "lr": "1.20094e-05", "gnorm": "0.392", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14952"} 2023-01-29 20:20:56 | INFO | train_inner | {"epoch": 27, "update": 26.932, "s2c_loss": "0.009", "loss": "0.00604", 
"s2c_nll_loss": "0.009", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "58210", "lr": "1.19427e-05", "gnorm": "0.494", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14954"} 2023-01-29 20:20:59 | INFO | train_inner | {"epoch": 27, "update": 26.937, "s2c_loss": "0.024", "loss": "0.01675", "s2c_nll_loss": "0.024", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "58220", "lr": "1.18761e-05", "gnorm": "1.225", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14957"} 2023-01-29 20:21:02 | INFO | train_inner | {"epoch": 27, "update": 26.942, "s2c_loss": "0.007", "loss": "0.00494", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "58230", "lr": "1.18094e-05", "gnorm": "0.442", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14959"} 2023-01-29 20:21:04 | INFO | train_inner | {"epoch": 27, "update": 26.946, "s2c_loss": "0.03", "loss": "0.02112", "s2c_nll_loss": "0.03", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.1", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "58240", "lr": "1.17427e-05", "gnorm": "0.54", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14962"} 2023-01-29 20:21:07 | INFO | train_inner | {"epoch": 27, "update": 26.951, "s2c_loss": "0.005", "loss": "0.00323", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "58250", "lr": "1.16761e-05", "gnorm": "0.256", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14965"} 2023-01-29 20:21:09 | INFO | train_inner | {"epoch": 27, "update": 26.956, "s2c_loss": "0.008", 
"loss": "0.00544", "s2c_nll_loss": "0.008", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "58260", "lr": "1.16094e-05", "gnorm": "0.395", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14967"} 2023-01-29 20:21:12 | INFO | train_inner | {"epoch": 27, "update": 26.96, "s2c_loss": "0.008", "loss": "0.00523", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "58270", "lr": "1.15428e-05", "gnorm": "0.422", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14970"} 2023-01-29 20:21:14 | INFO | train_inner | {"epoch": 27, "update": 26.965, "s2c_loss": "0.015", "loss": "0.01066", "s2c_nll_loss": "0.015", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "58280", "lr": "1.14761e-05", "gnorm": "0.418", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14972"} 2023-01-29 20:21:17 | INFO | train_inner | {"epoch": 27, "update": 26.969, "s2c_loss": "0.009", "loss": "0.00628", "s2c_nll_loss": "0.009", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "58290", "lr": "1.14094e-05", "gnorm": "0.631", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14975"} 2023-01-29 20:21:19 | INFO | train_inner | {"epoch": 27, "update": 26.974, "s2c_loss": "0.009", "loss": "0.00618", "s2c_nll_loss": "0.009", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "58300", "lr": "1.13428e-05", "gnorm": "0.616", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "14977"} 2023-01-29 20:21:22 | INFO | train_inner | {"epoch": 27, "update": 26.979, 
"s2c_loss": "0.014", "loss": "0.00948", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "58310", "lr": "1.12761e-05", "gnorm": "0.713", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14980"} 2023-01-29 20:21:24 | INFO | train_inner | {"epoch": 27, "update": 26.983, "s2c_loss": "0.009", "loss": "0.00614", "s2c_nll_loss": "0.009", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "58320", "lr": "1.12094e-05", "gnorm": "0.398", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "14982"} 2023-01-29 20:21:27 | INFO | train_inner | {"epoch": 27, "update": 26.988, "s2c_loss": "0.008", "loss": "0.00582", "s2c_nll_loss": "0.008", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "250.9", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "58330", "lr": "1.11428e-05", "gnorm": "0.617", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "14985"} 2023-01-29 20:21:29 | INFO | train_inner | {"epoch": 27, "update": 26.993, "s2c_loss": "0.011", "loss": "0.00767", "s2c_nll_loss": "0.011", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "58340", "lr": "1.10761e-05", "gnorm": "0.506", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "14987"} 2023-01-29 20:21:32 | INFO | train_inner | {"epoch": 27, "update": 26.997, "s2c_loss": "0.027", "loss": "0.01895", "s2c_nll_loss": "0.027", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.6", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "58350", "lr": "1.10094e-05", "gnorm": "0.655", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "14990"} 2023-01-29 20:21:33 | INFO | fairseq_cli.train | 
begin validation on "valid" subset 2023-01-29 20:21:48 | INFO | valid | {"epoch": 27, "valid_s2c_loss": "0.308", "valid_loss": "0.21339", "valid_s2c_nll_loss": "0.308", "valid_s2c_accuracy": "94.727", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "30.2731", "valid_num_updates": "58356", "valid_best_s2c_accuracy": "94.727"} 2023-01-29 20:21:48 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 27 @ 58356 updates 2023-01-29 20:21:48 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 20:21:55 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt 2023-01-29 20:22:00 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_best.pt (epoch 27 @ 58356 updates, score 94.727) (writing took 11.766202256083488 seconds) 2023-01-29 20:22:00 | INFO | fairseq_cli.train | end of epoch 27 (average epoch stats below) 2023-01-29 20:22:00 | INFO | train | {"epoch": 27, "train_s2c_loss": "0.019", "train_loss": "0.01339", "train_s2c_nll_loss": "0.019", "train_s2c_accuracy": "99.826", "train_s2c_total": "63.9838", "train_s2c_n_correct": "63.8723", "train_wps": "237.8", "train_ups": "3.72", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "58356", "train_lr": "1.09695e-05", "train_gnorm": "0.635", "train_loss_scale": "4096", "train_train_wall": "541", "train_gb_free": "7.4", "train_wall": "15018"} 2023-01-29 20:22:06 | INFO | fairseq.trainer | begin training epoch 28 2023-01-29 20:22:06 | INFO | fairseq_cli.train | Start iterating over samples 2023-01-29 20:22:07 | INFO | train_inner | {"epoch": 28, "update": 27.002, "s2c_loss": "0.01", "loss": "0.00688", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "60.8", "s2c_n_correct": "60.8", "wps": "17.2", "ups": "0.28", "wpb": "60.8", "bsz": "60.8", "num_updates": "58360", "lr": "1.09428e-05", 
"gnorm": "0.487", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15025"} 2023-01-29 20:22:10 | INFO | train_inner | {"epoch": 28, "update": 27.006, "s2c_loss": "0.003", "loss": "0.00231", "s2c_nll_loss": "0.003", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "247.4", "ups": "3.87", "wpb": "64", "bsz": "64", "num_updates": "58370", "lr": "1.08761e-05", "gnorm": "0.226", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15028"} 2023-01-29 20:22:12 | INFO | train_inner | {"epoch": 28, "update": 27.011, "s2c_loss": "0.011", "loss": "0.00763", "s2c_nll_loss": "0.011", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "246.9", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "58380", "lr": "1.08095e-05", "gnorm": "0.83", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "15030"} 2023-01-29 20:22:15 | INFO | train_inner | {"epoch": 28, "update": 27.016, "s2c_loss": "0.018", "loss": "0.01256", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "249.9", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "58390", "lr": "1.07428e-05", "gnorm": "0.352", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15033"} 2023-01-29 20:22:18 | INFO | train_inner | {"epoch": 28, "update": 27.02, "s2c_loss": "0.127", "loss": "0.08813", "s2c_nll_loss": "0.127", "s2c_accuracy": "99.062", "s2c_total": "64", "s2c_n_correct": "63.4", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "58400", "lr": "1.06761e-05", "gnorm": "1.037", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15035"} 2023-01-29 20:22:20 | INFO | train_inner | {"epoch": 28, "update": 27.025, "s2c_loss": "0.078", "loss": "0.05387", "s2c_nll_loss": "0.078", "s2c_accuracy": "98.594", "s2c_total": "64", "s2c_n_correct": "63.1", "wps": "249.8", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "58410", 
"lr": "1.06095e-05", "gnorm": "1.106", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15038"} 2023-01-29 20:22:23 | INFO | train_inner | {"epoch": 28, "update": 27.03, "s2c_loss": "0.004", "loss": "0.00266", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.2", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "58420", "lr": "1.05428e-05", "gnorm": "0.253", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15041"} 2023-01-29 20:22:25 | INFO | train_inner | {"epoch": 28, "update": 27.034, "s2c_loss": "0.013", "loss": "0.00877", "s2c_nll_loss": "0.013", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "58430", "lr": "1.04761e-05", "gnorm": "0.591", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15043"} 2023-01-29 20:22:28 | INFO | train_inner | {"epoch": 28, "update": 27.039, "s2c_loss": "0.003", "loss": "0.00237", "s2c_nll_loss": "0.003", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "248", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "58440", "lr": "1.04095e-05", "gnorm": "0.219", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15046"} 2023-01-29 20:22:30 | INFO | train_inner | {"epoch": 28, "update": 27.043, "s2c_loss": "0.006", "loss": "0.00428", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.7", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "58450", "lr": "1.03428e-05", "gnorm": "0.33", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15048"} 2023-01-29 20:22:33 | INFO | train_inner | {"epoch": 28, "update": 27.048, "s2c_loss": "0.016", "loss": "0.01095", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", 
"num_updates": "58460", "lr": "1.02762e-05", "gnorm": "0.787", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "15051"} 2023-01-29 20:22:35 | INFO | train_inner | {"epoch": 28, "update": 27.053, "s2c_loss": "0.013", "loss": "0.00896", "s2c_nll_loss": "0.013", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "58470", "lr": "1.02095e-05", "gnorm": "0.431", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15053"} 2023-01-29 20:22:38 | INFO | train_inner | {"epoch": 28, "update": 27.057, "s2c_loss": "0.004", "loss": "0.00287", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "58480", "lr": "1.01428e-05", "gnorm": "0.237", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "15056"} 2023-01-29 20:22:40 | INFO | train_inner | {"epoch": 28, "update": 27.062, "s2c_loss": "0.011", "loss": "0.00779", "s2c_nll_loss": "0.011", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "58490", "lr": "1.00762e-05", "gnorm": "0.54", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15058"} 2023-01-29 20:22:43 | INFO | train_inner | {"epoch": 28, "update": 27.067, "s2c_loss": "0.03", "loss": "0.02064", "s2c_nll_loss": "0.03", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "58500", "lr": "1.00095e-05", "gnorm": "0.68", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15061"} 2023-01-29 20:22:45 | INFO | train_inner | {"epoch": 28, "update": 27.071, "s2c_loss": "0.028", "loss": "0.01957", "s2c_nll_loss": "0.028", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": 
"64", "num_updates": "58510", "lr": "9.94284e-06", "gnorm": "0.566", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "15063"} 2023-01-29 20:22:48 | INFO | train_inner | {"epoch": 28, "update": 27.076, "s2c_loss": "0.004", "loss": "0.00249", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "58520", "lr": "9.87617e-06", "gnorm": "0.219", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15066"} 2023-01-29 20:22:51 | INFO | train_inner | {"epoch": 28, "update": 27.08, "s2c_loss": "0.003", "loss": "0.00216", "s2c_nll_loss": "0.003", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "58530", "lr": "9.80951e-06", "gnorm": "0.161", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15069"} 2023-01-29 20:22:53 | INFO | train_inner | {"epoch": 28, "update": 27.085, "s2c_loss": "0.007", "loss": "0.00483", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.8", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "58540", "lr": "9.74285e-06", "gnorm": "0.326", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15071"} 2023-01-29 20:22:56 | INFO | train_inner | {"epoch": 28, "update": 27.09, "s2c_loss": "0.008", "loss": "0.00589", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "58550", "lr": "9.67618e-06", "gnorm": "0.348", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15074"} 2023-01-29 20:22:58 | INFO | train_inner | {"epoch": 28, "update": 27.094, "s2c_loss": "0.008", "loss": "0.00572", "s2c_nll_loss": "0.008", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "246.9", "ups": "3.86", "wpb": 
"64", "bsz": "64", "num_updates": "58560", "lr": "9.60952e-06", "gnorm": "0.563", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15076"} 2023-01-29 20:23:01 | INFO | train_inner | {"epoch": 28, "update": 27.099, "s2c_loss": "0.006", "loss": "0.00405", "s2c_nll_loss": "0.006", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "58570", "lr": "9.54286e-06", "gnorm": "0.356", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15079"} 2023-01-29 20:23:03 | INFO | train_inner | {"epoch": 28, "update": 27.104, "s2c_loss": "0.008", "loss": "0.00553", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "58580", "lr": "9.47619e-06", "gnorm": "0.402", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15081"} 2023-01-29 20:23:06 | INFO | train_inner | {"epoch": 28, "update": 27.108, "s2c_loss": "0.003", "loss": "0.00236", "s2c_nll_loss": "0.003", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.2", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "58590", "lr": "9.40953e-06", "gnorm": "0.177", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15084"} 2023-01-29 20:23:08 | INFO | train_inner | {"epoch": 28, "update": 27.113, "s2c_loss": "0.026", "loss": "0.01792", "s2c_nll_loss": "0.026", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "58600", "lr": "9.34287e-06", "gnorm": "1.099", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15086"} 2023-01-29 20:23:11 | INFO | train_inner | {"epoch": 28, "update": 27.117, "s2c_loss": "0.007", "loss": "0.00519", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.3", "ups": 
"3.96", "wpb": "64", "bsz": "64", "num_updates": "58610", "lr": "9.2762e-06", "gnorm": "0.424", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15089"} 2023-01-29 20:23:14 | INFO | train_inner | {"epoch": 28, "update": 27.122, "s2c_loss": "0.005", "loss": "0.00355", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "58620", "lr": "9.20954e-06", "gnorm": "0.273", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15091"} 2023-01-29 20:23:16 | INFO | train_inner | {"epoch": 28, "update": 27.127, "s2c_loss": "0.018", "loss": "0.0124", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "58630", "lr": "9.14288e-06", "gnorm": "0.55", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15094"} 2023-01-29 20:23:19 | INFO | train_inner | {"epoch": 28, "update": 27.131, "s2c_loss": "0.007", "loss": "0.00503", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.9", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "58640", "lr": "9.07621e-06", "gnorm": "0.261", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15096"} 2023-01-29 20:23:21 | INFO | train_inner | {"epoch": 28, "update": 27.136, "s2c_loss": "0.014", "loss": "0.01005", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.7", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "58650", "lr": "9.00955e-06", "gnorm": "0.674", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15099"} 2023-01-29 20:23:24 | INFO | train_inner | {"epoch": 28, "update": 27.141, "s2c_loss": "0.023", "loss": "0.01618", "s2c_nll_loss": "0.023", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": 
"249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "58660", "lr": "8.94289e-06", "gnorm": "0.57", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15102"} 2023-01-29 20:23:26 | INFO | train_inner | {"epoch": 28, "update": 27.145, "s2c_loss": "0.023", "loss": "0.01603", "s2c_nll_loss": "0.023", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.6", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "58670", "lr": "8.87622e-06", "gnorm": "0.531", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15104"} 2023-01-29 20:23:29 | INFO | train_inner | {"epoch": 28, "update": 27.15, "s2c_loss": "0.015", "loss": "0.01041", "s2c_nll_loss": "0.015", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "58680", "lr": "8.80956e-06", "gnorm": "0.351", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "15107"} 2023-01-29 20:23:31 | INFO | train_inner | {"epoch": 28, "update": 27.154, "s2c_loss": "0.005", "loss": "0.00324", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "247.1", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "58690", "lr": "8.7429e-06", "gnorm": "0.313", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15109"} 2023-01-29 20:23:34 | INFO | train_inner | {"epoch": 28, "update": 27.159, "s2c_loss": "0.006", "loss": "0.00398", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "58700", "lr": "8.67623e-06", "gnorm": "0.299", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15112"} 2023-01-29 20:23:36 | INFO | train_inner | {"epoch": 28, "update": 27.164, "s2c_loss": "0.011", "loss": "0.00729", "s2c_nll_loss": "0.011", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", 
"wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "58710", "lr": "8.60957e-06", "gnorm": "0.552", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15114"} 2023-01-29 20:23:39 | INFO | train_inner | {"epoch": 28, "update": 27.168, "s2c_loss": "0.015", "loss": "0.01013", "s2c_nll_loss": "0.015", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "58720", "lr": "8.54291e-06", "gnorm": "0.501", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15117"} 2023-01-29 20:23:41 | INFO | train_inner | {"epoch": 28, "update": 27.173, "s2c_loss": "0.007", "loss": "0.00504", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "58730", "lr": "8.47624e-06", "gnorm": "0.404", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15119"} 2023-01-29 20:23:44 | INFO | train_inner | {"epoch": 28, "update": 27.178, "s2c_loss": "0.016", "loss": "0.01125", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "58740", "lr": "8.40958e-06", "gnorm": "0.619", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15122"} 2023-01-29 20:23:46 | INFO | train_inner | {"epoch": 28, "update": 27.182, "s2c_loss": "0.003", "loss": "0.00227", "s2c_nll_loss": "0.003", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "58750", "lr": "8.34292e-06", "gnorm": "0.152", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15124"} 2023-01-29 20:23:49 | INFO | train_inner | {"epoch": 28, "update": 27.187, "s2c_loss": "0.133", "loss": "0.09187", "s2c_nll_loss": "0.133", "s2c_accuracy": "99.688", "s2c_total": "64", 
"s2c_n_correct": "63.8", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "58760", "lr": "8.27625e-06", "gnorm": "0.321", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15127"} 2023-01-29 20:23:52 | INFO | train_inner | {"epoch": 28, "update": 27.191, "s2c_loss": "0.009", "loss": "0.0059", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "58770", "lr": "8.20959e-06", "gnorm": "0.361", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15129"} 2023-01-29 20:23:54 | INFO | train_inner | {"epoch": 28, "update": 27.196, "s2c_loss": "0.021", "loss": "0.01433", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "58780", "lr": "8.14293e-06", "gnorm": "0.653", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15132"} 2023-01-29 20:23:57 | INFO | train_inner | {"epoch": 28, "update": 27.201, "s2c_loss": "0.004", "loss": "0.00257", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "58790", "lr": "8.07626e-06", "gnorm": "0.159", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15134"} 2023-01-29 20:23:59 | INFO | train_inner | {"epoch": 28, "update": 27.205, "s2c_loss": "0.004", "loss": "0.00261", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "58800", "lr": "8.0096e-06", "gnorm": "0.194", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15137"} 2023-01-29 20:24:02 | INFO | train_inner | {"epoch": 28, "update": 27.21, "s2c_loss": "0.005", "loss": "0.00337", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", 
"s2c_total": "64", "s2c_n_correct": "64", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "58810", "lr": "7.94294e-06", "gnorm": "0.284", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15140"} 2023-01-29 20:24:04 | INFO | train_inner | {"epoch": 28, "update": 27.215, "s2c_loss": "0.006", "loss": "0.00396", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "58820", "lr": "7.87627e-06", "gnorm": "0.208", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15142"} 2023-01-29 20:24:07 | INFO | train_inner | {"epoch": 28, "update": 27.219, "s2c_loss": "0.021", "loss": "0.01463", "s2c_nll_loss": "0.021", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "58830", "lr": "7.80961e-06", "gnorm": "1.064", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15145"} 2023-01-29 20:24:09 | INFO | train_inner | {"epoch": 28, "update": 27.224, "s2c_loss": "0.009", "loss": "0.00653", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "258.1", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "58840", "lr": "7.74295e-06", "gnorm": "0.458", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15147"} 2023-01-29 20:24:12 | INFO | train_inner | {"epoch": 28, "update": 27.228, "s2c_loss": "0.007", "loss": "0.00472", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.8", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "58850", "lr": "7.67628e-06", "gnorm": "0.418", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "15150"} 2023-01-29 20:24:14 | INFO | train_inner | {"epoch": 28, "update": 27.233, "s2c_loss": "0.011", "loss": "0.00733", "s2c_nll_loss": "0.011", "s2c_accuracy": 
"100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "58860", "lr": "7.60962e-06", "gnorm": "0.675", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15152"} 2023-01-29 20:24:17 | INFO | train_inner | {"epoch": 28, "update": 27.238, "s2c_loss": "0.005", "loss": "0.00316", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "58870", "lr": "7.54296e-06", "gnorm": "0.164", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15155"} 2023-01-29 20:24:19 | INFO | train_inner | {"epoch": 28, "update": 27.242, "s2c_loss": "0.01", "loss": "0.00683", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "245.8", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "58880", "lr": "7.47629e-06", "gnorm": "0.62", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15157"} 2023-01-29 20:24:22 | INFO | train_inner | {"epoch": 28, "update": 27.247, "s2c_loss": "0.025", "loss": "0.01745", "s2c_nll_loss": "0.025", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "245.5", "ups": "3.84", "wpb": "64", "bsz": "64", "num_updates": "58890", "lr": "7.40963e-06", "gnorm": "0.711", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15160"} 2023-01-29 20:24:25 | INFO | train_inner | {"epoch": 28, "update": 27.252, "s2c_loss": "0.005", "loss": "0.00366", "s2c_nll_loss": "0.005", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "241.2", "ups": "3.77", "wpb": "64", "bsz": "64", "num_updates": "58900", "lr": "7.34297e-06", "gnorm": "0.29", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15163"} 2023-01-29 20:24:27 | INFO | train_inner | {"epoch": 28, "update": 27.256, "s2c_loss": "0.004", "loss": "0.00281", "s2c_nll_loss": "0.004", 
"s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "243.8", "ups": "3.81", "wpb": "64", "bsz": "64", "num_updates": "58910", "lr": "7.2763e-06", "gnorm": "0.243", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "15165"} 2023-01-29 20:24:30 | INFO | train_inner | {"epoch": 28, "update": 27.261, "s2c_loss": "0.004", "loss": "0.00295", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.5", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "58920", "lr": "7.20964e-06", "gnorm": "0.305", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15168"} 2023-01-29 20:24:32 | INFO | train_inner | {"epoch": 28, "update": 27.265, "s2c_loss": "0.007", "loss": "0.00501", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "58930", "lr": "7.14298e-06", "gnorm": "0.332", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15170"} 2023-01-29 20:24:35 | INFO | train_inner | {"epoch": 28, "update": 27.27, "s2c_loss": "0.006", "loss": "0.00442", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.6", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "58940", "lr": "7.07631e-06", "gnorm": "0.424", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15173"} 2023-01-29 20:24:37 | INFO | train_inner | {"epoch": 28, "update": 27.275, "s2c_loss": "0.008", "loss": "0.00562", "s2c_nll_loss": "0.008", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.2", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "58950", "lr": "7.00965e-06", "gnorm": "0.494", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15175"} 2023-01-29 20:24:40 | INFO | train_inner | {"epoch": 28, "update": 27.279, "s2c_loss": "0.006", "loss": "0.00448", "s2c_nll_loss": 
"0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250.4", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "58960", "lr": "6.94299e-06", "gnorm": "0.303", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "15178"} 2023-01-29 20:24:42 | INFO | train_inner | {"epoch": 28, "update": 27.284, "s2c_loss": "0.004", "loss": "0.00283", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "58970", "lr": "6.87632e-06", "gnorm": "0.211", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15180"} 2023-01-29 20:24:45 | INFO | train_inner | {"epoch": 28, "update": 27.289, "s2c_loss": "0.003", "loss": "0.00214", "s2c_nll_loss": "0.003", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "258.1", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "58980", "lr": "6.80966e-06", "gnorm": "0.139", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15183"} 2023-01-29 20:24:47 | INFO | train_inner | {"epoch": 28, "update": 27.293, "s2c_loss": "0.008", "loss": "0.00524", "s2c_nll_loss": "0.008", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "58990", "lr": "6.743e-06", "gnorm": "0.371", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15185"} 2023-01-29 20:24:50 | INFO | train_inner | {"epoch": 28, "update": 27.298, "s2c_loss": "0.006", "loss": "0.00412", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.3", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "59000", "lr": "6.67633e-06", "gnorm": "0.399", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15188"} 2023-01-29 20:24:52 | INFO | train_inner | {"epoch": 28, "update": 27.302, "s2c_loss": "0.01", "loss": "0.0067", "s2c_nll_loss": 
"0.01", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "59010", "lr": "6.60967e-06", "gnorm": "0.627", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "15190"} 2023-01-29 20:24:55 | INFO | train_inner | {"epoch": 28, "update": 27.307, "s2c_loss": "0.005", "loss": "0.00367", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "59020", "lr": "6.54301e-06", "gnorm": "0.259", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15193"} 2023-01-29 20:24:57 | INFO | train_inner | {"epoch": 28, "update": 27.312, "s2c_loss": "0.013", "loss": "0.00894", "s2c_nll_loss": "0.013", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.3", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "59030", "lr": "6.47634e-06", "gnorm": "0.413", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.4", "wall": "15195"} 2023-01-29 20:25:00 | INFO | train_inner | {"epoch": 28, "update": 27.316, "s2c_loss": "0.002", "loss": "0.00145", "s2c_nll_loss": "0.002", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249.5", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "59040", "lr": "6.40968e-06", "gnorm": "0.125", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.3", "wall": "15198"} 2023-01-29 20:25:03 | INFO | train_inner | {"epoch": 28, "update": 27.321, "s2c_loss": "0.003", "loss": "0.00204", "s2c_nll_loss": "0.003", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "257.4", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "59050", "lr": "6.34302e-06", "gnorm": "0.122", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15200"} 2023-01-29 20:25:05 | INFO | train_inner | {"epoch": 28, "update": 27.326, "s2c_loss": "0.01", "loss": "0.0072", 
"s2c_nll_loss": "0.01", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "59060", "lr": "6.27635e-06", "gnorm": "0.551", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "15203"} 2023-01-29 20:25:08 | INFO | train_inner | {"epoch": 28, "update": 27.33, "s2c_loss": "0.005", "loss": "0.00337", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "59070", "lr": "6.20969e-06", "gnorm": "0.344", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15205"} 2023-01-29 20:25:10 | INFO | train_inner | {"epoch": 28, "update": 27.335, "s2c_loss": "0.005", "loss": "0.00322", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "59080", "lr": "6.14303e-06", "gnorm": "0.215", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "15208"} 2023-01-29 20:25:13 | INFO | train_inner | {"epoch": 28, "update": 27.34, "s2c_loss": "0.008", "loss": "0.00571", "s2c_nll_loss": "0.008", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "59090", "lr": "6.07636e-06", "gnorm": "0.587", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15211"} 2023-01-29 20:25:15 | INFO | train_inner | {"epoch": 28, "update": 27.344, "s2c_loss": "0.012", "loss": "0.00815", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "59100", "lr": "6.0097e-06", "gnorm": "0.444", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15213"} 2023-01-29 20:25:18 | INFO | train_inner | {"epoch": 28, "update": 27.349, "s2c_loss": "0.006", 
"loss": "0.00418", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "59110", "lr": "5.94304e-06", "gnorm": "0.302", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15216"} 2023-01-29 20:25:20 | INFO | train_inner | {"epoch": 28, "update": 27.353, "s2c_loss": "0.01", "loss": "0.0069", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "59120", "lr": "5.87637e-06", "gnorm": "0.55", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15218"} 2023-01-29 20:25:23 | INFO | train_inner | {"epoch": 28, "update": 27.358, "s2c_loss": "0.007", "loss": "0.00462", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "59130", "lr": "5.80971e-06", "gnorm": "0.385", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "15221"} 2023-01-29 20:25:25 | INFO | train_inner | {"epoch": 28, "update": 27.363, "s2c_loss": "0.004", "loss": "0.0026", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "258.9", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "59140", "lr": "5.74305e-06", "gnorm": "0.24", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15223"} 2023-01-29 20:25:28 | INFO | train_inner | {"epoch": 28, "update": 27.367, "s2c_loss": "0.016", "loss": "0.01078", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "59150", "lr": "5.67638e-06", "gnorm": "0.813", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15226"} 2023-01-29 20:25:30 | INFO | train_inner | {"epoch": 28, "update": 27.372, "s2c_loss": 
"0.002", "loss": "0.00134", "s2c_nll_loss": "0.002", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.8", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "59160", "lr": "5.60972e-06", "gnorm": "0.108", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15228"} 2023-01-29 20:25:33 | INFO | train_inner | {"epoch": 28, "update": 27.377, "s2c_loss": "0.01", "loss": "0.00686", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "63.7", "s2c_n_correct": "63.7", "wps": "253.3", "ups": "3.98", "wpb": "63.7", "bsz": "63.7", "num_updates": "59170", "lr": "5.54306e-06", "gnorm": "0.545", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15231"} 2023-01-29 20:25:35 | INFO | train_inner | {"epoch": 28, "update": 27.381, "s2c_loss": "0.008", "loss": "0.00525", "s2c_nll_loss": "0.008", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "59180", "lr": "5.47639e-06", "gnorm": "0.258", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "15233"} 2023-01-29 20:25:38 | INFO | train_inner | {"epoch": 28, "update": 27.386, "s2c_loss": "0.003", "loss": "0.00196", "s2c_nll_loss": "0.003", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "59190", "lr": "5.40973e-06", "gnorm": "0.175", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "15236"} 2023-01-29 20:25:40 | INFO | train_inner | {"epoch": 28, "update": 27.39, "s2c_loss": "0.016", "loss": "0.01093", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "247.2", "ups": "3.86", "wpb": "64", "bsz": "64", "num_updates": "59200", "lr": "5.34307e-06", "gnorm": "0.651", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.3", "wall": "15238"} 2023-01-29 20:25:43 | INFO | train_inner | {"epoch": 28, "update": 
27.395, "s2c_loss": "0.006", "loss": "0.00419", "s2c_nll_loss": "0.006", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "59210", "lr": "5.2764e-06", "gnorm": "0.223", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.3", "wall": "15241"} 2023-01-29 20:25:46 | INFO | train_inner | {"epoch": 28, "update": 27.4, "s2c_loss": "0.004", "loss": "0.00262", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.2", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "59220", "lr": "5.20974e-06", "gnorm": "0.199", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "15243"} 2023-01-29 20:25:48 | INFO | train_inner | {"epoch": 28, "update": 27.404, "s2c_loss": "0.012", "loss": "0.00807", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "59230", "lr": "5.14308e-06", "gnorm": "0.739", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.3", "wall": "15246"} 2023-01-29 20:25:51 | INFO | train_inner | {"epoch": 28, "update": 27.409, "s2c_loss": "0.006", "loss": "0.00405", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.4", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "59240", "lr": "5.07641e-06", "gnorm": "0.446", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15249"} 2023-01-29 20:25:53 | INFO | train_inner | {"epoch": 28, "update": 27.414, "s2c_loss": "0.004", "loss": "0.00286", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "59250", "lr": "5.00975e-06", "gnorm": "0.299", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15251"} 2023-01-29 20:25:56 | INFO | train_inner | {"epoch": 28, 
"update": 27.418, "s2c_loss": "0.005", "loss": "0.0034", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.5", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "59260", "lr": "4.94309e-06", "gnorm": "0.311", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15254"} 2023-01-29 20:25:58 | INFO | train_inner | {"epoch": 28, "update": 27.423, "s2c_loss": "0.014", "loss": "0.00936", "s2c_nll_loss": "0.014", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "59270", "lr": "4.87642e-06", "gnorm": "0.555", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15256"} 2023-01-29 20:26:01 | INFO | train_inner | {"epoch": 28, "update": 27.427, "s2c_loss": "0.019", "loss": "0.01284", "s2c_nll_loss": "0.019", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "59280", "lr": "4.80976e-06", "gnorm": "0.812", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15259"} 2023-01-29 20:26:03 | INFO | train_inner | {"epoch": 28, "update": 27.432, "s2c_loss": "0.006", "loss": "0.00421", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.8", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "59290", "lr": "4.7431e-06", "gnorm": "0.426", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15261"} 2023-01-29 20:26:06 | INFO | train_inner | {"epoch": 28, "update": 27.437, "s2c_loss": "0.007", "loss": "0.00507", "s2c_nll_loss": "0.007", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "59300", "lr": "4.67643e-06", "gnorm": "0.427", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.3", "wall": "15264"} 2023-01-29 20:26:08 | INFO | train_inner 
| {"epoch": 28, "update": 27.441, "s2c_loss": "0.006", "loss": "0.0045", "s2c_nll_loss": "0.006", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.6", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "59310", "lr": "4.60977e-06", "gnorm": "0.299", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "15266"} 2023-01-29 20:26:11 | INFO | train_inner | {"epoch": 28, "update": 27.446, "s2c_loss": "0.004", "loss": "0.00306", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "59320", "lr": "4.54311e-06", "gnorm": "0.294", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15269"} 2023-01-29 20:26:13 | INFO | train_inner | {"epoch": 28, "update": 27.451, "s2c_loss": "0.006", "loss": "0.00447", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.8", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "59330", "lr": "4.47644e-06", "gnorm": "0.352", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15271"} 2023-01-29 20:26:16 | INFO | train_inner | {"epoch": 28, "update": 27.455, "s2c_loss": "0.004", "loss": "0.00305", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.1", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "59340", "lr": "4.40978e-06", "gnorm": "0.241", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15274"} 2023-01-29 20:26:18 | INFO | train_inner | {"epoch": 28, "update": 27.46, "s2c_loss": "0.011", "loss": "0.00783", "s2c_nll_loss": "0.011", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.6", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "59350", "lr": "4.34312e-06", "gnorm": "0.547", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15276"} 2023-01-29 20:26:21 | INFO 
| train_inner | {"epoch": 28, "update": 27.464, "s2c_loss": "0.003", "loss": "0.00213", "s2c_nll_loss": "0.003", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "59360", "lr": "4.27645e-06", "gnorm": "0.132", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "15279"} 2023-01-29 20:26:23 | INFO | train_inner | {"epoch": 28, "update": 27.469, "s2c_loss": "0.005", "loss": "0.00347", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "59370", "lr": "4.20979e-06", "gnorm": "0.178", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15281"} 2023-01-29 20:26:26 | INFO | train_inner | {"epoch": 28, "update": 27.474, "s2c_loss": "0.028", "loss": "0.01917", "s2c_nll_loss": "0.028", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "59380", "lr": "4.14313e-06", "gnorm": "1.091", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.2", "wall": "15284"} 2023-01-29 20:26:28 | INFO | train_inner | {"epoch": 28, "update": 27.478, "s2c_loss": "0.005", "loss": "0.00328", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "248.8", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "59390", "lr": "4.07646e-06", "gnorm": "0.183", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.2", "wall": "15286"} 2023-01-29 20:26:31 | INFO | train_inner | {"epoch": 28, "update": 27.483, "s2c_loss": "0.012", "loss": "0.00824", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "59400", "lr": "4.0098e-06", "gnorm": "0.542", "loss_scale": "8192", "train_wall": "2", "gb_free": "7.3", "wall": "15289"} 2023-01-29 
20:26:34 | INFO | train_inner | {"epoch": 28, "update": 27.488, "s2c_loss": "0.004", "loss": "0.00251", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "59410", "lr": "3.94314e-06", "gnorm": "0.251", "loss_scale": "8192", "train_wall": "3", "gb_free": "7.3", "wall": "15291"} 2023-01-29 20:26:36 | INFO | fairseq.trainer | NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 4096.0 2023-01-29 20:26:36 | INFO | train_inner | {"epoch": 28, "update": 27.493, "s2c_loss": "0.006", "loss": "0.00415", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "230.8", "ups": "3.61", "wpb": "64", "bsz": "64", "num_updates": "59420", "lr": "3.87647e-06", "gnorm": "0.426", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15294"} 2023-01-29 20:26:39 | INFO | train_inner | {"epoch": 28, "update": 27.497, "s2c_loss": "0.015", "loss": "0.01047", "s2c_nll_loss": "0.015", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.9", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "59430", "lr": "3.80981e-06", "gnorm": "0.743", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15297"} 2023-01-29 20:26:41 | INFO | train_inner | {"epoch": 28, "update": 27.502, "s2c_loss": "0.003", "loss": "0.00223", "s2c_nll_loss": "0.003", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "59440", "lr": "3.74315e-06", "gnorm": "0.132", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15299"} 2023-01-29 20:26:44 | INFO | train_inner | {"epoch": 28, "update": 27.506, "s2c_loss": "0.005", "loss": "0.00319", "s2c_nll_loss": "0.005", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.9", "ups": "3.95", "wpb": "64", "bsz": "64", 
"num_updates": "59450", "lr": "3.67648e-06", "gnorm": "0.412", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15302"} 2023-01-29 20:26:46 | INFO | train_inner | {"epoch": 28, "update": 27.511, "s2c_loss": "0.009", "loss": "0.00644", "s2c_nll_loss": "0.009", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.3", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "59460", "lr": "3.60982e-06", "gnorm": "0.454", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15304"} 2023-01-29 20:26:49 | INFO | train_inner | {"epoch": 28, "update": 27.516, "s2c_loss": "0.016", "loss": "0.01113", "s2c_nll_loss": "0.016", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "248.5", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "59470", "lr": "3.54316e-06", "gnorm": "0.707", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15307"} 2023-01-29 20:26:52 | INFO | train_inner | {"epoch": 28, "update": 27.52, "s2c_loss": "0.02", "loss": "0.01371", "s2c_nll_loss": "0.02", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "256", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "59480", "lr": "3.47649e-06", "gnorm": "1.027", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15309"} 2023-01-29 20:26:54 | INFO | train_inner | {"epoch": 28, "update": 27.525, "s2c_loss": "0.005", "loss": "0.00334", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "59490", "lr": "3.40983e-06", "gnorm": "0.389", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15312"} 2023-01-29 20:26:57 | INFO | train_inner | {"epoch": 28, "update": 27.53, "s2c_loss": "0.013", "loss": "0.00888", "s2c_nll_loss": "0.013", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": "254.9", "ups": "3.98", "wpb": 
"64", "bsz": "64", "num_updates": "59500", "lr": "3.34317e-06", "gnorm": "0.829", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15314"} 2023-01-29 20:26:59 | INFO | train_inner | {"epoch": 28, "update": 27.534, "s2c_loss": "0.007", "loss": "0.00507", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249.2", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "59510", "lr": "3.2765e-06", "gnorm": "0.386", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15317"} 2023-01-29 20:27:02 | INFO | train_inner | {"epoch": 28, "update": 27.539, "s2c_loss": "0.015", "loss": "0.01052", "s2c_nll_loss": "0.015", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.4", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "59520", "lr": "3.20984e-06", "gnorm": "0.472", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15320"} 2023-01-29 20:27:04 | INFO | train_inner | {"epoch": 28, "update": 27.543, "s2c_loss": "0.005", "loss": "0.00344", "s2c_nll_loss": "0.005", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.5", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "59530", "lr": "3.14318e-06", "gnorm": "0.349", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15322"} 2023-01-29 20:27:07 | INFO | train_inner | {"epoch": 28, "update": 27.548, "s2c_loss": "0.011", "loss": "0.00729", "s2c_nll_loss": "0.011", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "59540", "lr": "3.07651e-06", "gnorm": "0.629", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15325"} 2023-01-29 20:27:09 | INFO | train_inner | {"epoch": 28, "update": 27.553, "s2c_loss": "0.015", "loss": "0.01072", "s2c_nll_loss": "0.015", "s2c_accuracy": "99.531", "s2c_total": "64", "s2c_n_correct": "63.7", "wps": 
"255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "59550", "lr": "3.00985e-06", "gnorm": "0.894", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15327"} 2023-01-29 20:27:12 | INFO | train_inner | {"epoch": 28, "update": 27.557, "s2c_loss": "0.006", "loss": "0.00406", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "258.3", "ups": "4.04", "wpb": "64", "bsz": "64", "num_updates": "59560", "lr": "2.94319e-06", "gnorm": "0.376", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15330"} 2023-01-29 20:27:14 | INFO | train_inner | {"epoch": 28, "update": 27.562, "s2c_loss": "0.012", "loss": "0.00855", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "259.1", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "59570", "lr": "2.87652e-06", "gnorm": "0.701", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.4", "wall": "15332"} 2023-01-29 20:27:17 | INFO | train_inner | {"epoch": 28, "update": 27.567, "s2c_loss": "0.006", "loss": "0.00404", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.2", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "59580", "lr": "2.80986e-06", "gnorm": "0.223", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15335"} 2023-01-29 20:27:19 | INFO | train_inner | {"epoch": 28, "update": 27.571, "s2c_loss": "0.009", "loss": "0.00601", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "256.1", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "59590", "lr": "2.7432e-06", "gnorm": "0.629", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15337"} 2023-01-29 20:27:22 | INFO | train_inner | {"epoch": 28, "update": 27.576, "s2c_loss": "0.013", "loss": "0.00867", "s2c_nll_loss": "0.013", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": 
"63.8", "wps": "255.2", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "59600", "lr": "2.67653e-06", "gnorm": "0.673", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15340"} 2023-01-29 20:27:24 | INFO | train_inner | {"epoch": 28, "update": 27.58, "s2c_loss": "0.007", "loss": "0.00452", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.4", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "59610", "lr": "2.60987e-06", "gnorm": "0.322", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "15342"} 2023-01-29 20:27:27 | INFO | train_inner | {"epoch": 28, "update": 27.585, "s2c_loss": "0.006", "loss": "0.00417", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "248.3", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "59620", "lr": "2.54321e-06", "gnorm": "0.283", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15345"} 2023-01-29 20:27:29 | INFO | train_inner | {"epoch": 28, "update": 27.59, "s2c_loss": "0.009", "loss": "0.00609", "s2c_nll_loss": "0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "248.6", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "59630", "lr": "2.47654e-06", "gnorm": "0.603", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15347"} 2023-01-29 20:27:32 | INFO | train_inner | {"epoch": 28, "update": 27.594, "s2c_loss": "0.01", "loss": "0.00727", "s2c_nll_loss": "0.01", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "59640", "lr": "2.40988e-06", "gnorm": "0.341", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15350"} 2023-01-29 20:27:34 | INFO | train_inner | {"epoch": 28, "update": 27.599, "s2c_loss": "0.018", "loss": "0.01218", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.844", "s2c_total": "64", 
"s2c_n_correct": "63.9", "wps": "256.2", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "59650", "lr": "2.34322e-06", "gnorm": "0.471", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15352"} 2023-01-29 20:27:37 | INFO | train_inner | {"epoch": 28, "update": 27.604, "s2c_loss": "0.011", "loss": "0.00758", "s2c_nll_loss": "0.011", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "59660", "lr": "2.27655e-06", "gnorm": "0.742", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15355"} 2023-01-29 20:27:40 | INFO | train_inner | {"epoch": 28, "update": 27.608, "s2c_loss": "0.007", "loss": "0.00461", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "249.3", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "59670", "lr": "2.20989e-06", "gnorm": "0.399", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15357"} 2023-01-29 20:27:42 | INFO | train_inner | {"epoch": 28, "update": 27.613, "s2c_loss": "0.032", "loss": "0.02195", "s2c_nll_loss": "0.032", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "59680", "lr": "2.14323e-06", "gnorm": "0.711", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15360"} 2023-01-29 20:27:45 | INFO | train_inner | {"epoch": 28, "update": 27.617, "s2c_loss": "0.005", "loss": "0.00375", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.3", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "59690", "lr": "2.07656e-06", "gnorm": "0.269", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15363"} 2023-01-29 20:27:47 | INFO | train_inner | {"epoch": 28, "update": 27.622, "s2c_loss": "0.007", "loss": "0.00507", "s2c_nll_loss": "0.007", "s2c_accuracy": "99.844", 
"s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "59700", "lr": "2.0099e-06", "gnorm": "0.424", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15365"} 2023-01-29 20:27:50 | INFO | train_inner | {"epoch": 28, "update": 27.627, "s2c_loss": "0.018", "loss": "0.01279", "s2c_nll_loss": "0.018", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "59710", "lr": "1.94324e-06", "gnorm": "0.41", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15368"} 2023-01-29 20:27:52 | INFO | train_inner | {"epoch": 28, "update": 27.631, "s2c_loss": "0.003", "loss": "0.00212", "s2c_nll_loss": "0.003", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.4", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "59720", "lr": "1.87657e-06", "gnorm": "0.136", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15370"} 2023-01-29 20:27:55 | INFO | train_inner | {"epoch": 28, "update": 27.636, "s2c_loss": "0.014", "loss": "0.00948", "s2c_nll_loss": "0.014", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.7", "ups": "4", "wpb": "64", "bsz": "64", "num_updates": "59730", "lr": "1.80991e-06", "gnorm": "0.442", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15373"} 2023-01-29 20:27:57 | INFO | train_inner | {"epoch": 28, "update": 27.641, "s2c_loss": "0.005", "loss": "0.00337", "s2c_nll_loss": "0.005", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "59740", "lr": "1.74325e-06", "gnorm": "0.323", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15375"} 2023-01-29 20:28:00 | INFO | train_inner | {"epoch": 28, "update": 27.645, "s2c_loss": "0.002", "loss": "0.00165", "s2c_nll_loss": "0.002", "s2c_accuracy": 
"100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.6", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "59750", "lr": "1.67658e-06", "gnorm": "0.123", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15378"} 2023-01-29 20:28:02 | INFO | train_inner | {"epoch": 28, "update": 27.65, "s2c_loss": "0.009", "loss": "0.00646", "s2c_nll_loss": "0.009", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "256.9", "ups": "4.01", "wpb": "64", "bsz": "64", "num_updates": "59760", "lr": "1.60992e-06", "gnorm": "0.789", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15380"} 2023-01-29 20:28:05 | INFO | train_inner | {"epoch": 28, "update": 27.654, "s2c_loss": "0.004", "loss": "0.00304", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "257.7", "ups": "4.03", "wpb": "64", "bsz": "64", "num_updates": "59770", "lr": "1.54326e-06", "gnorm": "0.296", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15383"} 2023-01-29 20:28:07 | INFO | train_inner | {"epoch": 28, "update": 27.659, "s2c_loss": "0.159", "loss": "0.11047", "s2c_nll_loss": "0.159", "s2c_accuracy": "99.219", "s2c_total": "64", "s2c_n_correct": "63.5", "wps": "254.1", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "59780", "lr": "1.47659e-06", "gnorm": "0.523", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15385"} 2023-01-29 20:28:10 | INFO | train_inner | {"epoch": 28, "update": 27.664, "s2c_loss": "0.011", "loss": "0.00739", "s2c_nll_loss": "0.011", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "248.4", "ups": "3.88", "wpb": "64", "bsz": "64", "num_updates": "59790", "lr": "1.40993e-06", "gnorm": "0.446", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15388"} 2023-01-29 20:28:12 | INFO | train_inner | {"epoch": 28, "update": 27.668, "s2c_loss": "0.009", "loss": "0.0064", "s2c_nll_loss": 
"0.009", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252.1", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "59800", "lr": "1.34327e-06", "gnorm": "0.517", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15390"} 2023-01-29 20:28:15 | INFO | train_inner | {"epoch": 28, "update": 27.673, "s2c_loss": "0.006", "loss": "0.00443", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250.7", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "59810", "lr": "1.2766e-06", "gnorm": "0.193", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15393"} 2023-01-29 20:28:17 | INFO | train_inner | {"epoch": 28, "update": 27.678, "s2c_loss": "0.017", "loss": "0.01212", "s2c_nll_loss": "0.017", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "252.3", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "59820", "lr": "1.20994e-06", "gnorm": "0.45", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15395"} 2023-01-29 20:28:20 | INFO | train_inner | {"epoch": 28, "update": 27.682, "s2c_loss": "0.007", "loss": "0.00483", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250.5", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "59830", "lr": "1.14328e-06", "gnorm": "0.437", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15398"} 2023-01-29 20:28:22 | INFO | train_inner | {"epoch": 28, "update": 27.687, "s2c_loss": "0.018", "loss": "0.01241", "s2c_nll_loss": "0.018", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "257.3", "ups": "4.02", "wpb": "64", "bsz": "64", "num_updates": "59840", "lr": "1.07661e-06", "gnorm": "0.711", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15400"} 2023-01-29 20:28:25 | INFO | train_inner | {"epoch": 28, "update": 27.691, "s2c_loss": "0.017", "loss": "0.01185", 
"s2c_nll_loss": "0.017", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "253.6", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "59850", "lr": "1.00995e-06", "gnorm": "0.388", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15403"} 2023-01-29 20:28:28 | INFO | train_inner | {"epoch": 28, "update": 27.696, "s2c_loss": "0.006", "loss": "0.00402", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.5", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "59860", "lr": "9.43287e-07", "gnorm": "0.379", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15405"} 2023-01-29 20:28:30 | INFO | train_inner | {"epoch": 28, "update": 27.701, "s2c_loss": "0.021", "loss": "0.01457", "s2c_nll_loss": "0.021", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "254.7", "ups": "3.98", "wpb": "64", "bsz": "64", "num_updates": "59870", "lr": "8.76623e-07", "gnorm": "0.551", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15408"} 2023-01-29 20:28:33 | INFO | train_inner | {"epoch": 28, "update": 27.705, "s2c_loss": "0.004", "loss": "0.0026", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "252", "ups": "3.94", "wpb": "64", "bsz": "64", "num_updates": "59880", "lr": "8.0996e-07", "gnorm": "0.237", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.4", "wall": "15411"} 2023-01-29 20:28:35 | INFO | train_inner | {"epoch": 28, "update": 27.71, "s2c_loss": "0.006", "loss": "0.00446", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "250.8", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "59890", "lr": "7.43297e-07", "gnorm": "0.295", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15413"} 2023-01-29 20:28:38 | INFO | train_inner | {"epoch": 28, "update": 27.715, "s2c_loss": "0.012", "loss": 
"0.00823", "s2c_nll_loss": "0.012", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "254.3", "ups": "3.97", "wpb": "64", "bsz": "64", "num_updates": "59900", "lr": "6.76633e-07", "gnorm": "0.577", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15416"} 2023-01-29 20:28:40 | INFO | train_inner | {"epoch": 28, "update": 27.719, "s2c_loss": "0.006", "loss": "0.00426", "s2c_nll_loss": "0.006", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "248.7", "ups": "3.89", "wpb": "64", "bsz": "64", "num_updates": "59910", "lr": "6.0997e-07", "gnorm": "0.46", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15418"} 2023-01-29 20:28:43 | INFO | train_inner | {"epoch": 28, "update": 27.724, "s2c_loss": "0.007", "loss": "0.00468", "s2c_nll_loss": "0.007", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "255.1", "ups": "3.99", "wpb": "64", "bsz": "64", "num_updates": "59920", "lr": "5.43307e-07", "gnorm": "0.277", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15421"} 2023-01-29 20:28:45 | INFO | train_inner | {"epoch": 28, "update": 27.728, "s2c_loss": "0.017", "loss": "0.0117", "s2c_nll_loss": "0.017", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "252.8", "ups": "3.95", "wpb": "64", "bsz": "64", "num_updates": "59930", "lr": "4.76643e-07", "gnorm": "0.369", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15423"} 2023-01-29 20:28:48 | INFO | train_inner | {"epoch": 28, "update": 27.733, "s2c_loss": "0.003", "loss": "0.00241", "s2c_nll_loss": "0.003", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.7", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "59940", "lr": "4.0998e-07", "gnorm": "0.243", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.3", "wall": "15426"} 2023-01-29 20:28:50 | INFO | train_inner | {"epoch": 28, "update": 27.738, "s2c_loss": 
"0.022", "loss": "0.01507", "s2c_nll_loss": "0.022", "s2c_accuracy": "99.688", "s2c_total": "64", "s2c_n_correct": "63.8", "wps": "251.7", "ups": "3.93", "wpb": "64", "bsz": "64", "num_updates": "59950", "lr": "3.43317e-07", "gnorm": "0.651", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15428"} 2023-01-29 20:28:53 | INFO | train_inner | {"epoch": 28, "update": 27.742, "s2c_loss": "0.003", "loss": "0.00179", "s2c_nll_loss": "0.003", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "251.1", "ups": "3.92", "wpb": "64", "bsz": "64", "num_updates": "59960", "lr": "2.76653e-07", "gnorm": "0.121", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.2", "wall": "15431"} 2023-01-29 20:28:55 | INFO | train_inner | {"epoch": 28, "update": 27.747, "s2c_loss": "0.004", "loss": "0.00243", "s2c_nll_loss": "0.004", "s2c_accuracy": "100", "s2c_total": "64", "s2c_n_correct": "64", "wps": "253.4", "ups": "3.96", "wpb": "64", "bsz": "64", "num_updates": "59970", "lr": "2.0999e-07", "gnorm": "0.241", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15433"} 2023-01-29 20:28:58 | INFO | train_inner | {"epoch": 28, "update": 27.752, "s2c_loss": "0.009", "loss": "0.00597", "s2c_nll_loss": "0.009", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "259", "ups": "4.05", "wpb": "64", "bsz": "64", "num_updates": "59980", "lr": "1.43327e-07", "gnorm": "0.406", "loss_scale": "4096", "train_wall": "2", "gb_free": "7.2", "wall": "15436"} 2023-01-29 20:29:00 | INFO | train_inner | {"epoch": 28, "update": 27.756, "s2c_loss": "0.008", "loss": "0.00585", "s2c_nll_loss": "0.008", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "250.3", "ups": "3.91", "wpb": "64", "bsz": "64", "num_updates": "59990", "lr": "7.66633e-08", "gnorm": "0.574", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15438"} 2023-01-29 20:29:03 | INFO | train_inner | {"epoch": 28, "update": 
27.761, "s2c_loss": "0.013", "loss": "0.00896", "s2c_nll_loss": "0.013", "s2c_accuracy": "99.844", "s2c_total": "64", "s2c_n_correct": "63.9", "wps": "249.7", "ups": "3.9", "wpb": "64", "bsz": "64", "num_updates": "60000", "lr": "1e-08", "gnorm": "0.53", "loss_scale": "4096", "train_wall": "3", "gb_free": "7.3", "wall": "15441"}
2023-01-29 20:29:03 | INFO | fairseq_cli.train | Stopping training due to num_updates: 60000 >= max_update: 60000
2023-01-29 20:29:03 | INFO | fairseq_cli.train | begin validation on "valid" subset
2023-01-29 20:29:18 | INFO | valid | {"epoch": 28, "valid_s2c_loss": "0.28", "valid_loss": "0.19381", "valid_s2c_nll_loss": "0.28", "valid_s2c_accuracy": "95.263", "valid_s2c_total": "31.9583", "valid_s2c_n_correct": "30.4444", "valid_num_updates": "60000", "valid_best_s2c_accuracy": "95.263"}
2023-01-29 20:29:18 | INFO | fairseq.checkpoint_utils | Preparing to save checkpoint for epoch 28 @ 60000 updates
2023-01-29 20:29:18 | INFO | fairseq.trainer | Saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_28_60000.pt
2023-01-29 20:29:21 | INFO | fairseq.trainer | Finished saving checkpoint to /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_28_60000.pt
2023-01-29 20:29:30 | INFO | fairseq.checkpoint_utils | Saved checkpoint /home/wangrui/projects/SpeechT5/experimental/s2c/checkpoint_28_60000.pt (epoch 28 @ 60000 updates, score 95.263) (writing took 12.561591240111738 seconds)
2023-01-29 20:29:30 | INFO | fairseq_cli.train | end of epoch 28 (average epoch stats below)
2023-01-29 20:29:30 | INFO | train | {"epoch": 28, "train_s2c_loss": "0.012", "train_loss": "0.0085", "train_s2c_nll_loss": "0.012", "train_s2c_accuracy": "99.895", "train_s2c_total": "63.9982", "train_s2c_n_correct": "63.9313", "train_wps": "233.4", "train_ups": "3.65", "train_wpb": "64", "train_bsz": "64", "train_num_updates": "60000", "train_lr": "1e-08", "train_gnorm": "0.436", "train_loss_scale": "4096", "train_train_wall": "411", "train_gb_free": "7.3", "train_wall": "15468"}
2023-01-29 20:29:30 | INFO | fairseq_cli.train | done training in 15464.5 seconds