[2024-03-08 16:27:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 232): INFO AMP_ENABLE: true AMP_OPT_LEVEL: '' AUG: AUTO_AUGMENT: rand-m9-mstd0.5-inc1 COLOR_JITTER: 0.4 CUTMIX: 1.0 CUTMIX_MINMAX: null MIXUP: 0.8 MIXUP_MODE: batch MIXUP_PROB: 1.0 MIXUP_SWITCH_PROB: 0.5 RECOUNT: 1 REMODE: pixel REPROB: 0.25 BASE: - '' DATA: BANDS: rgb BATCH_SIZE: 64 CACHE_MODE: part CHANNELS: 3 DATASET: imagenet DATA_PATH: /workspace/storage/data/hydro/images/ IMG_SIZE: 256 INTERPOLATION: bicubic MASK_PATCH_SIZE: 32 MASK_RATIO: 0.6 MEAN: - 340.76769064 - 429.9430203 - 614.21682446 - 590.23569706 - 950.68368468 - 1792.46290469 - 2075.46795189 - 2218.94553375 - 2266.46036911 - 2246.0605464 - 1594.42694882 - 1009.32729131 NUM_WORKERS: 8 PIN_MEMORY: true STD: - 554.81258967 - 572.41639287 - 582.87945694 - 675.88746967 - 729.89827633 - 1096.01480586 - 1273.45393088 - 1365.45589904 - 1356.13789355 - 1302.3292881 - 1079.19066363 - 818.86747235 ZIP_MODE: false ENABLE_AMP: true EVAL_MODE: false FUSED_LAYERNORM: false FUSED_WINDOW_PROCESS: false LOCAL_RANK: 1 MODEL: DROP_PATH_RATE: 0.1 DROP_RATE: 0.0 IN_CHANS: 3 LABEL_SMOOTHING: 0.1 NAME: hydro_rgb_simmim_pretrain NUM_CLASSES: 1000 PRETRAINED: '' RESUME: '' SIMMIM: NORM_TARGET: ENABLE: true PATCH_SIZE: 47 SWIN: APE: false DEPTHS: - 2 - 2 - 6 - 2 EMBED_DIM: 96 IN_CHANS: 3 MLP_RATIO: 4.0 NUM_HEADS: - 3 - 6 - 12 - 24 PATCH_NORM: true PATCH_SIZE: 4 QKV_BIAS: true QK_SCALE: null WINDOW_SIZE: 7 SWINV2: APE: false DEPTHS: - 2 - 2 - 18 - 2 EMBED_DIM: 128 IN_CHANS: 3 MLP_RATIO: 4.0 NUM_HEADS: - 4 - 8 - 16 - 32 PATCH_NORM: true PATCH_SIZE: 4 PRETRAINED_WINDOW_SIZES: - 0 - 0 - 0 - 0 QKV_BIAS: true WINDOW_SIZE: 16 SWIN_MLP: APE: false DEPTHS: - 2 - 2 - 6 - 2 EMBED_DIM: 96 IN_CHANS: 3 MLP_RATIO: 4.0 NUM_HEADS: - 3 - 6 - 12 - 24 PATCH_NORM: true PATCH_SIZE: 4 WINDOW_SIZE: 7 SWIN_MOE: APE: false AUX_LOSS_WEIGHT: 0.01 CAPACITY_FACTOR: 1.25 COSINE_ROUTER: false COSINE_ROUTER_DIM: 256 COSINE_ROUTER_INIT_T: 0.5 DEPTHS: - 2 - 2 - 6 - 2 EMBED_DIM: 96 GATE_NOISE: 
1.0 INIT_STD: 0.02 IN_CHANS: 3 IS_GSHARD_LOSS: false MLP_FC2_BIAS: true MLP_RATIO: 4.0 MOE_BLOCKS: - - -1 - - -1 - - -1 - - -1 MOE_DROP: 0.0 NORMALIZE_GATE: false NUM_HEADS: - 3 - 6 - 12 - 24 NUM_LOCAL_EXPERTS: 1 PATCH_NORM: true PATCH_SIZE: 4 PRETRAINED_WINDOW_SIZES: - 0 - 0 - 0 - 0 QKV_BIAS: true QK_SCALE: null TOP_VALUE: 1 USE_BPR: true WINDOW_SIZE: 7 TYPE: swinv2 OUTPUT: output/hydro_rgb_simmim_pretrain/hydro_rgb_simmim_pretrain_swinv2_base_img256_window16_800ep PRINT_FREQ: 100 SAVE_FREQ: 5 SEED: 0 TAG: hydro_rgb_simmim_pretrain_swinv2_base_img256_window16_800ep TEST: CROP: true SEQUENTIAL: false SHUFFLE: false THROUGHPUT_MODE: false TRAIN: ACCUMULATION_STEPS: 1 AUTO_RESUME: true BASE_LR: 2.5e-05 CLIP_GRAD: 5.0 EPOCHS: 800 LAYER_DECAY: 1.0 LR_SCHEDULER: DECAY_EPOCHS: 30 DECAY_RATE: 0.1 GAMMA: 0.1 MULTISTEPS: - 700 NAME: multistep WARMUP_PREFIX: true MIN_LR: 1.25e-06 MOE: SAVE_MASTER: false OPTIMIZER: BETAS: - 0.9 - 0.999 EPS: 1.0e-08 MOMENTUM: 0.9 NAME: adamw START_EPOCH: 0 USE_CHECKPOINT: false WARMUP_EPOCHS: 10 WARMUP_LR: 1.25e-07 WEIGHT_DECAY: 0.05 [2024-03-08 16:27:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 73): INFO Creating model:swinv2/hydro_rgb_simmim_pretrain [2024-03-08 16:27:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 76): INFO SimMIM( (encoder): SwinTransformerV2ForSimMIM( (patch_embed): PatchEmbed( (proj): Conv2d(3, 128, kernel_size=(4, 4), stride=(4, 4)) (norm): LayerNorm((128,), eps=1e-05, elementwise_affine=True) ) (pos_drop): Dropout(p=0.0, inplace=False) (layers): ModuleList( (0): BasicLayer( dim=128, input_resolution=(64, 64), depth=2 (blocks): ModuleList( (0): SwinTransformerBlock( dim=128, input_resolution=(64, 64), num_heads=4, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((128,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=128, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=4 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): 
ReLU(inplace=True) (2): Linear(in_features=512, out_features=4, bias=False) ) (qkv): Linear(in_features=128, out_features=384, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=128, out_features=128, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): Identity() (norm2): LayerNorm((128,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=128, out_features=512, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=512, out_features=128, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( dim=128, input_resolution=(64, 64), num_heads=4, window_size=16, shift_size=8, mlp_ratio=4.0 (norm1): LayerNorm((128,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=128, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=4 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=4, bias=False) ) (qkv): Linear(in_features=128, out_features=384, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=128, out_features=128, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.004) (norm2): LayerNorm((128,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=128, out_features=512, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=512, out_features=128, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( input_resolution=(64, 64), dim=128 (reduction): Linear(in_features=512, out_features=256, bias=False) (norm): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (1): BasicLayer( dim=256, input_resolution=(32, 32), depth=2 (blocks): ModuleList( (0): SwinTransformerBlock( dim=256, input_resolution=(32, 32), num_heads=8, window_size=16, shift_size=0, mlp_ratio=4.0 
(norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=256, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=8 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=8, bias=False) ) (qkv): Linear(in_features=256, out_features=768, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=256, out_features=256, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.009) (norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=256, out_features=1024, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=1024, out_features=256, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( dim=256, input_resolution=(32, 32), num_heads=8, window_size=16, shift_size=8, mlp_ratio=4.0 (norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=256, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=8 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=8, bias=False) ) (qkv): Linear(in_features=256, out_features=768, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=256, out_features=256, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.013) (norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=256, out_features=1024, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=1024, out_features=256, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( input_resolution=(32, 32), dim=256 (reduction): Linear(in_features=1024, out_features=512, bias=False) 
(norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True) ) ) (2): BasicLayer( dim=512, input_resolution=(16, 16), depth=18 (blocks): ModuleList( (0): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.017) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.022) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, 
bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (2): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.026) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (3): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.030) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, 
bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (4): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.035) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (5): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.039) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, 
bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (6): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.043) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (7): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.048) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, 
bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (8): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.052) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (9): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.057) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, 
bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (10): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.061) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (11): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.065) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, 
bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (12): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.070) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (13): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.074) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, 
bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (14): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.078) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (15): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.083) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, 
bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (16): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.087) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (17): SwinTransformerBlock( dim=512, input_resolution=(16, 16), num_heads=16, window_size=16, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=512, window_size=(16, 16), pretrained_window_size=(0, 0), num_heads=16 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=16, bias=False) ) (qkv): Linear(in_features=512, out_features=1536, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=512, out_features=512, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.091) (norm2): LayerNorm((512,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=512, out_features=2048, 
bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=2048, out_features=512, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( input_resolution=(16, 16), dim=512 (reduction): Linear(in_features=2048, out_features=1024, bias=False) (norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True) ) ) (3): BasicLayer( dim=1024, input_resolution=(8, 8), depth=2 (blocks): ModuleList( (0): SwinTransformerBlock( dim=1024, input_resolution=(8, 8), num_heads=32, window_size=8, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((1024,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=1024, window_size=(8, 8), pretrained_window_size=(0, 0), num_heads=32 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=32, bias=False) ) (qkv): Linear(in_features=1024, out_features=3072, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=1024, out_features=1024, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.096) (norm2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=1024, out_features=4096, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=4096, out_features=1024, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( dim=1024, input_resolution=(8, 8), num_heads=32, window_size=8, shift_size=0, mlp_ratio=4.0 (norm1): LayerNorm((1024,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( dim=1024, window_size=(8, 8), pretrained_window_size=(0, 0), num_heads=32 (cpb_mlp): Sequential( (0): Linear(in_features=2, out_features=512, bias=True) (1): ReLU(inplace=True) (2): Linear(in_features=512, out_features=32, bias=False) ) (qkv): Linear(in_features=1024, out_features=3072, bias=False) (attn_drop): Dropout(p=0.0, inplace=False) (proj): 
Linear(in_features=1024, out_features=1024, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath(drop_prob=0.100) (norm2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=1024, out_features=4096, bias=True) (act): GELU(approximate='none') (fc2): Linear(in_features=4096, out_features=1024, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) ) ) (norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True) (avgpool): AdaptiveAvgPool1d(output_size=1) (head): Identity() ) (decoder): Sequential( (0): Conv2d(1024, 3072, kernel_size=(1, 1), stride=(1, 1)) (1): PixelShuffle(upscale_factor=32) ) )
[2024-03-08 16:27:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 83): INFO number of params: 90042744
[2024-03-08 16:27:05 hydro_rgb_simmim_pretrain] (utils_simmim.py 84): INFO All checkpoints founded in output/hydro_rgb_simmim_pretrain/hydro_rgb_simmim_pretrain_swinv2_base_img256_window16_800ep: []
[2024-03-08 16:27:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 101): INFO no checkpoint found in output/hydro_rgb_simmim_pretrain/hydro_rgb_simmim_pretrain_swinv2_base_img256_window16_800ep, ignoring auto resume
[2024-03-08 16:27:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 106): INFO Start training
[2024-03-08 16:27:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [0/800][0/402] eta 0:36:05 lr 0.000000 time 5.3871 (5.3871) loss 0.8333 (0.8333) grad_norm 0.7783 (0.7783) loss_scale 65536.0000 (65536.0000) mem 27910MB
[2024-03-08 16:28:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [0/800][100/402] eta 0:03:59 lr 0.000001 time 0.7431 (0.7940) loss 0.8173 (0.8340) grad_norm 0.7252 (0.7503) loss_scale 65536.0000 (65536.0000) mem 28964MB
[2024-03-08 16:29:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [0/800][200/402] eta 0:02:35 lr 0.000001 time 0.7440 (0.7690) loss 0.7549 (0.8108) grad_norm 0.4873 (0.6922) loss_scale 65536.0000 (65536.0000) mem 28964MB
[2024-03-08 16:30:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [0/800][300/402] eta 0:01:17 lr 0.000002 time 0.7437 (0.7607) loss 0.7284 (0.7843) grad_norm 0.1505 (0.5584) loss_scale 65536.0000 (65536.0000) mem 28964MB
[2024-03-08 16:32:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [0/800][400/402] eta 0:00:01 lr 0.000003 time 0.7421 (0.7564) loss 0.6768 (0.7642) grad_norm 0.0837 (0.4473) loss_scale 65536.0000 (65536.0000) mem 28964MB
[2024-03-08 16:32:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 0 training takes 0:05:04
[2024-03-08 16:32:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [1/800][0/402] eta 0:32:49 lr 0.000003 time 4.8991 (4.8991) loss 0.6922 (0.6922) grad_norm 0.0877 (0.0877) loss_scale 65536.0000 (65536.0000) mem 28964MB
[2024-03-08 16:33:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [1/800][100/402] eta 0:03:57 lr 0.000003 time 0.7439 (0.7858) loss 0.6798 (0.6887) grad_norm 0.0437 (0.0608) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:34:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [1/800][200/402] eta 0:02:34 lr 0.000004 time 0.7438 (0.7654) loss 0.6741 (0.6891) grad_norm 0.0330 (0.0540) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:35:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [1/800][300/402] eta 0:01:17 lr 0.000004 time 0.7435 (0.7583) loss 0.6798 (0.6889) grad_norm 0.0613 (0.0513) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:37:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [1/800][400/402] eta 0:00:01 lr 0.000005 time 0.7422 (0.7547) loss 0.6908 (0.6881) grad_norm 0.0365 (0.0495) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:37:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 1 training takes 0:05:03
[2024-03-08 16:37:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [2/800][0/402] eta 0:22:44 lr 0.000005 time 3.3941 (3.3941) loss 0.6709 (0.6709) grad_norm 0.0542 (0.0542) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:38:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [2/800][100/402] eta 0:03:52 lr 0.000006 time 0.7437 (0.7704) loss 0.6716 (0.6882) grad_norm 0.0412 (0.0465) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:39:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [2/800][200/402] eta 0:02:32 lr 0.000006 time 0.7442 (0.7574) loss 0.6832 (0.6890) grad_norm 0.0837 (0.0480) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:40:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [2/800][300/402] eta 0:01:16 lr 0.000007 time 0.7449 (0.7530) loss 0.7023 (0.6896) grad_norm 0.0461 (0.0509) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:42:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [2/800][400/402] eta 0:00:01 lr 0.000008 time 0.7427 (0.7508) loss 0.6833 (0.6888) grad_norm 0.1701 (0.0583) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:42:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 2 training takes 0:05:01
[2024-03-08 16:42:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [3/800][0/402] eta 0:22:20 lr 0.000008 time 3.3338 (3.3338) loss 0.7085 (0.7085) grad_norm 0.0918 (0.0918) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:43:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [3/800][100/402] eta 0:03:52 lr 0.000008 time 0.7442 (0.7699) loss 0.6867 (0.6868) grad_norm 0.2891 (0.1345) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:44:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [3/800][200/402] eta 0:02:32 lr 0.000009 time 0.7446 (0.7573) loss 0.6438 (0.6863) grad_norm 0.1415 (0.1891) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:46:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [3/800][300/402] eta 0:01:16 lr 0.000009 time 0.7448 (0.7531) loss 0.6959 (0.6849) grad_norm 0.1026 (0.2249) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:47:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [3/800][400/402] eta 0:00:01 lr 0.000010 time 0.7433 (0.7510) loss 0.6676 (0.6852) grad_norm 0.7555 (0.2978) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:47:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 3 training takes 0:05:01
[2024-03-08 16:47:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [4/800][0/402] eta 0:22:48 lr 0.000010 time 3.4032 (3.4032) loss 0.6814 (0.6814) grad_norm 0.6378 (0.6378) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:48:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [4/800][100/402] eta 0:03:52 lr 0.000011 time 0.7446 (0.7710) loss 0.6793 (0.6797) grad_norm 0.1429 (0.4838) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:49:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [4/800][200/402] eta 0:02:33 lr 0.000011 time 0.7447 (0.7580) loss 0.6766 (0.6813) grad_norm 0.1064 (0.4456) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:51:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [4/800][300/402] eta 0:01:16 lr 0.000012 time 0.7447 (0.7537) loss 0.7042 (0.6823) grad_norm 0.4760 (0.4714) loss_scale 65536.0000 (65536.0000) mem 28968MB
[2024-03-08 16:52:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [4/800][400/402] eta 0:00:01 lr 0.000013 time 0.7432 (0.7514) loss 0.6603 (0.6821) grad_norm 0.5401 (0.5230) loss_scale 131072.0000 (67170.3142) mem 28968MB
[2024-03-08 16:52:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 4 training takes 0:05:02
[2024-03-08 16:52:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [5/800][0/402] eta 0:22:28 lr 0.000013 time 3.3549 (3.3549) loss 0.6828 (0.6828) grad_norm 0.6518 (0.6518) loss_scale 131072.0000 (131072.0000) mem 28968MB
[2024-03-08 16:53:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [5/800][100/402] eta 0:03:52 lr 0.000013 time 0.7446 (0.7707) loss 0.6723 (0.6807) grad_norm 2.0739 (0.4747) loss_scale 131072.0000 (131072.0000) mem 28968MB
[2024-03-08 16:54:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [5/800][200/402] eta 0:02:33 lr 0.000014 time 0.7454 (0.7580) loss 0.7149 (0.6802) grad_norm 0.9014 (0.5115) loss_scale 131072.0000 (131072.0000) mem 28968MB
[2024-03-08 16:56:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [5/800][300/402] eta 0:01:16 lr 0.000014 time 0.7448 (0.7537) loss 0.6883 (0.6805) grad_norm 0.3661 (0.4900) loss_scale 131072.0000 (131072.0000) mem 28968MB
[2024-03-08 16:57:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [5/800][400/402] eta 0:00:01 lr 0.000015 time 0.7434 (0.7514) loss 0.6769 (0.6808) grad_norm 1.2126 (0.5006) loss_scale 131072.0000 (131072.0000) mem 28968MB
[2024-03-08 16:57:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 5 training takes 0:05:02
[2024-03-08 16:57:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [6/800][0/402] eta 0:32:31 lr 0.000015 time 4.8534 (4.8534) loss 0.6713 (0.6713) grad_norm 1.3309 (1.3309) loss_scale 131072.0000 (131072.0000) mem 28968MB
[2024-03-08 16:58:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [6/800][100/402] eta 0:03:57 lr 0.000016 time 0.7447 (0.7857) loss 0.6742 (0.6796) grad_norm 0.4742 (0.6286) loss_scale 131072.0000 (131072.0000) mem 28968MB
[2024-03-08 16:59:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [6/800][200/402] eta 0:02:34 lr 0.000016 time 0.7451 (0.7655) loss 0.6624 (0.6796) grad_norm 0.2228 (0.6585) loss_scale 131072.0000 (131072.0000) mem 28968MB
[2024-03-08 17:01:09 hydro_rgb_simmim_pretrain]
(main_simmim_pt.py 176): INFO Train: [6/800][300/402] eta 0:01:17 lr 0.000017 time 0.7448 (0.7587) loss 0.7033 (0.6796) grad_norm 0.3441 (0.6078) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:02:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [6/800][400/402] eta 0:00:01 lr 0.000018 time 0.7433 (0.7552) loss 0.6940 (0.6798) grad_norm 0.9769 (0.6123) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:02:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 6 training takes 0:05:03 [2024-03-08 17:02:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [7/800][0/402] eta 0:21:52 lr 0.000018 time 3.2661 (3.2661) loss 0.6522 (0.6522) grad_norm 0.4728 (0.4728) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:03:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [7/800][100/402] eta 0:03:52 lr 0.000018 time 0.7446 (0.7698) loss 0.7010 (0.6831) grad_norm 1.0019 (0.6034) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:04:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [7/800][200/402] eta 0:02:32 lr 0.000019 time 0.7449 (0.7574) loss 0.6727 (0.6820) grad_norm 0.4195 (0.6437) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:06:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [7/800][300/402] eta 0:01:16 lr 0.000019 time 0.7448 (0.7532) loss 0.6781 (0.6808) grad_norm 0.5543 (0.6289) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:07:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [7/800][400/402] eta 0:00:01 lr 0.000020 time 0.7436 (0.7511) loss 0.6734 (0.6809) grad_norm 0.6521 (0.5986) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:07:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 7 training takes 0:05:02 [2024-03-08 17:07:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [8/800][0/402] eta 0:22:52 lr 0.000020 time 
3.4142 (3.4142) loss 0.6723 (0.6723) grad_norm 0.1443 (0.1443) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:08:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [8/800][100/402] eta 0:03:52 lr 0.000021 time 0.7451 (0.7713) loss 0.7017 (0.6818) grad_norm 0.4539 (0.6224) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:09:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [8/800][200/402] eta 0:02:33 lr 0.000021 time 0.7446 (0.7581) loss 0.7128 (0.6799) grad_norm 0.6502 (0.5736) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:11:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [8/800][300/402] eta 0:01:16 lr 0.000022 time 0.7448 (0.7537) loss 0.6655 (0.6803) grad_norm 0.2703 (0.5466) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:12:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [8/800][400/402] eta 0:00:01 lr 0.000023 time 0.7430 (0.7515) loss 0.6792 (0.6797) grad_norm 0.5240 (0.5360) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:12:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 8 training takes 0:05:02 [2024-03-08 17:12:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [9/800][0/402] eta 0:22:26 lr 0.000023 time 3.3486 (3.3486) loss 0.7097 (0.7097) grad_norm 0.8847 (0.8847) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:13:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [9/800][100/402] eta 0:03:52 lr 0.000023 time 0.7447 (0.7707) loss 0.6942 (0.6816) grad_norm 0.3377 (0.5706) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:15:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [9/800][200/402] eta 0:02:33 lr 0.000024 time 0.7457 (0.7579) loss 0.6679 (0.6808) grad_norm 0.5947 (0.5459) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:16:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [9/800][300/402] eta 0:01:16 lr 0.000024 time 0.7450 (0.7536) loss 0.6944 (0.6807) grad_norm 0.3927 (0.5256) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:17:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [9/800][400/402] eta 0:00:01 lr 0.000025 time 0.7437 (0.7514) loss 0.6752 (0.6803) grad_norm 1.0855 (0.5394) loss_scale 262144.0000 (137609.2569) mem 28968MB [2024-03-08 17:17:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 9 training takes 0:05:02 [2024-03-08 17:17:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [10/800][0/402] eta 0:22:15 lr 0.000025 time 3.3225 (3.3225) loss 0.6835 (0.6835) grad_norm 0.3593 (0.3593) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 17:18:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [10/800][100/402] eta 0:03:52 lr 0.000025 time 0.7447 (0.7705) loss 0.7110 (0.6766) grad_norm 0.1855 (0.5321) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 17:20:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [10/800][200/402] eta 0:02:33 lr 0.000025 time 0.7446 (0.7578) loss 0.6952 (0.6789) grad_norm 1.0225 (0.5371) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 17:21:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [10/800][300/402] eta 0:01:16 lr 0.000025 time 0.7448 (0.7536) loss 0.6816 (0.6789) grad_norm 0.2263 (0.5668) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 17:22:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [10/800][400/402] eta 0:00:01 lr 0.000025 time 0.7437 (0.7514) loss 0.6835 (0.6795) grad_norm 0.4269 (0.5447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 17:22:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 10 training takes 0:05:02 [2024-03-08 17:22:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [11/800][0/402] eta 0:33:39 lr 0.000025 time 5.0230 (5.0230) loss 
0.6644 (0.6644) grad_norm 0.8129 (0.8129) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 17:23:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [11/800][100/402] eta 0:03:57 lr 0.000025 time 0.7449 (0.7880) loss 0.6809 (0.6811) grad_norm 0.1484 (0.4267) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 17:25:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [11/800][200/402] eta 0:02:34 lr 0.000025 time 0.7449 (0.7668) loss 0.7060 (0.6802) grad_norm 0.5288 (0.4994) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 17:26:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [11/800][300/402] eta 0:01:17 lr 0.000025 time 0.7447 (0.7596) loss 0.6866 (0.6801) grad_norm 0.3067 (0.4885) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 17:27:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [11/800][400/402] eta 0:00:01 lr 0.000025 time 0.7435 (0.7559) loss 0.6923 (0.6803) grad_norm 0.9234 (0.4857) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 17:27:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 11 training takes 0:05:03 [2024-03-08 17:27:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [12/800][0/402] eta 0:22:30 lr 0.000025 time 3.3584 (3.3584) loss 0.6752 (0.6752) grad_norm 0.4947 (0.4947) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 17:28:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [12/800][100/402] eta 0:03:52 lr 0.000025 time 0.7446 (0.7707) loss 0.6906 (0.6796) grad_norm 0.3230 (inf) loss_scale 131072.0000 (208936.5545) mem 28968MB [2024-03-08 17:30:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [12/800][200/402] eta 0:02:33 lr 0.000025 time 0.7448 (0.7579) loss 0.6829 (0.6802) grad_norm 0.6405 (inf) loss_scale 131072.0000 (170197.9701) mem 28968MB [2024-03-08 17:31:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[12/800][300/402] eta 0:01:16 lr 0.000025 time 0.7445 (0.7535) loss 0.6860 (0.6807) grad_norm 0.2510 (inf) loss_scale 131072.0000 (157199.3090) mem 28968MB [2024-03-08 17:32:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [12/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7513) loss 0.6944 (0.6794) grad_norm 0.3737 (inf) loss_scale 131072.0000 (150683.7706) mem 28968MB [2024-03-08 17:32:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 12 training takes 0:05:02 [2024-03-08 17:32:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [13/800][0/402] eta 0:22:11 lr 0.000025 time 3.3134 (3.3134) loss 0.6843 (0.6843) grad_norm 0.1101 (0.1101) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:33:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [13/800][100/402] eta 0:03:52 lr 0.000025 time 0.7448 (0.7702) loss 0.7124 (0.6781) grad_norm 0.1503 (0.3648) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:35:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [13/800][200/402] eta 0:02:33 lr 0.000025 time 0.7444 (0.7577) loss 0.6875 (0.6786) grad_norm 0.1861 (0.4047) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:36:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [13/800][300/402] eta 0:01:16 lr 0.000025 time 0.7447 (0.7534) loss 0.7001 (0.6781) grad_norm 0.3951 (0.4528) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:37:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [13/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7513) loss 0.6950 (0.6782) grad_norm 0.6865 (0.4340) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:37:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 13 training takes 0:05:02 [2024-03-08 17:37:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [14/800][0/402] eta 0:22:16 lr 0.000025 time 3.3256 (3.3256) loss 0.7058 
(0.7058) grad_norm 0.4730 (0.4730) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:38:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [14/800][100/402] eta 0:03:52 lr 0.000025 time 0.7449 (0.7704) loss 0.6685 (0.6783) grad_norm 0.3991 (0.4549) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:40:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [14/800][200/402] eta 0:02:33 lr 0.000025 time 0.7452 (0.7577) loss 0.6749 (0.6794) grad_norm 0.2241 (0.4466) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:41:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [14/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7534) loss 0.6739 (0.6788) grad_norm 0.2981 (0.4491) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:42:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [14/800][400/402] eta 0:00:01 lr 0.000025 time 0.7432 (0.7512) loss 0.7014 (0.6782) grad_norm 0.5961 (0.4456) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:42:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 14 training takes 0:05:02 [2024-03-08 17:42:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [15/800][0/402] eta 0:22:17 lr 0.000025 time 3.3268 (3.3268) loss 0.6823 (0.6823) grad_norm 0.3987 (0.3987) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:44:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [15/800][100/402] eta 0:03:52 lr 0.000025 time 0.7449 (0.7704) loss 0.6806 (0.6810) grad_norm 0.3307 (0.4604) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:45:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [15/800][200/402] eta 0:02:33 lr 0.000025 time 0.7450 (0.7577) loss 0.6770 (0.6816) grad_norm 1.2173 (0.5004) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:46:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[15/800][300/402] eta 0:01:16 lr 0.000025 time 0.7446 (0.7535) loss 0.6707 (0.6799) grad_norm 0.5285 (0.4721) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:47:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [15/800][400/402] eta 0:00:01 lr 0.000025 time 0.7436 (0.7513) loss 0.6854 (0.6794) grad_norm 0.5432 (0.4575) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:47:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 15 training takes 0:05:02 [2024-03-08 17:47:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [16/800][0/402] eta 0:32:11 lr 0.000025 time 4.8058 (4.8058) loss 0.6915 (0.6915) grad_norm 0.5217 (0.5217) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:49:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [16/800][100/402] eta 0:03:57 lr 0.000025 time 0.7449 (0.7860) loss 0.6539 (0.6766) grad_norm 0.1986 (0.4287) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:50:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [16/800][200/402] eta 0:02:34 lr 0.000025 time 0.7464 (0.7658) loss 0.6619 (0.6774) grad_norm 1.6105 (0.4056) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:51:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [16/800][300/402] eta 0:01:17 lr 0.000025 time 0.7450 (0.7591) loss 0.6727 (0.6774) grad_norm 0.5931 (0.4061) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:52:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [16/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7557) loss 0.6864 (0.6777) grad_norm 0.1847 (0.4027) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:52:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 16 training takes 0:05:03 [2024-03-08 17:52:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [17/800][0/402] eta 0:21:44 lr 0.000025 time 3.2441 (3.2441) loss 0.6726 
(0.6726) grad_norm 0.1855 (0.1855) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:54:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [17/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7697) loss 0.6823 (0.6771) grad_norm 0.4972 (0.3925) loss_scale 262144.0000 (197256.8713) mem 28968MB [2024-03-08 17:55:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [17/800][200/402] eta 0:02:32 lr 0.000025 time 0.7447 (0.7573) loss 0.6930 (0.6783) grad_norm 0.1221 (inf) loss_scale 131072.0000 (170197.9701) mem 28968MB [2024-03-08 17:56:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [17/800][300/402] eta 0:01:16 lr 0.000025 time 0.7445 (0.7532) loss 0.6554 (0.6784) grad_norm 0.6538 (inf) loss_scale 131072.0000 (157199.3090) mem 28968MB [2024-03-08 17:57:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [17/800][400/402] eta 0:00:01 lr 0.000025 time 0.7434 (0.7510) loss 0.6740 (0.6778) grad_norm 0.2976 (inf) loss_scale 131072.0000 (150683.7706) mem 28968MB [2024-03-08 17:57:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 17 training takes 0:05:01 [2024-03-08 17:57:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [18/800][0/402] eta 0:21:51 lr 0.000025 time 3.2613 (3.2613) loss 0.6754 (0.6754) grad_norm 0.2354 (0.2354) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 17:59:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [18/800][100/402] eta 0:03:52 lr 0.000025 time 0.7447 (0.7698) loss 0.7000 (0.6796) grad_norm 0.3417 (0.4037) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:00:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [18/800][200/402] eta 0:02:33 lr 0.000025 time 0.7451 (0.7575) loss 0.6834 (0.6806) grad_norm 0.7860 (0.3721) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:01:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [18/800][300/402] eta 
0:01:16 lr 0.000025 time 0.7448 (0.7533) loss 0.6620 (0.6804) grad_norm 0.4909 (0.3656) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:02:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [18/800][400/402] eta 0:00:01 lr 0.000025 time 0.7431 (0.7512) loss 0.6530 (0.6794) grad_norm 0.4592 (0.3673) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:02:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 18 training takes 0:05:02 [2024-03-08 18:02:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [19/800][0/402] eta 0:21:20 lr 0.000025 time 3.1861 (3.1861) loss 0.6810 (0.6810) grad_norm 0.3108 (0.3108) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:04:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [19/800][100/402] eta 0:03:52 lr 0.000025 time 0.7448 (0.7690) loss 0.6623 (0.6776) grad_norm 0.2649 (0.3853) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:05:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [19/800][200/402] eta 0:02:32 lr 0.000025 time 0.7448 (0.7570) loss 0.6623 (0.6789) grad_norm 0.4233 (0.3753) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:06:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [19/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7530) loss 0.6911 (0.6789) grad_norm 0.2872 (0.3571) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:07:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [19/800][400/402] eta 0:00:01 lr 0.000025 time 0.7435 (0.7509) loss 0.6924 (0.6784) grad_norm 0.8216 (0.3610) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:07:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 19 training takes 0:05:01 [2024-03-08 18:07:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [20/800][0/402] eta 0:22:17 lr 0.000025 time 3.3261 (3.3261) loss 0.6595 (0.6595) grad_norm 
0.4106 (0.4106) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:09:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [20/800][100/402] eta 0:03:52 lr 0.000025 time 0.7450 (0.7704) loss 0.6754 (0.6781) grad_norm 0.1930 (0.3303) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:10:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [20/800][200/402] eta 0:02:33 lr 0.000025 time 0.7448 (0.7577) loss 0.6742 (0.6782) grad_norm 0.3953 (0.3535) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:11:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [20/800][300/402] eta 0:01:16 lr 0.000025 time 0.7448 (0.7535) loss 0.6790 (0.6784) grad_norm 0.2855 (0.3455) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:12:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [20/800][400/402] eta 0:00:01 lr 0.000025 time 0.7433 (0.7513) loss 0.6871 (0.6785) grad_norm 0.1702 (0.3431) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:12:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 20 training takes 0:05:02 [2024-03-08 18:13:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [21/800][0/402] eta 0:32:19 lr 0.000025 time 4.8241 (4.8241) loss 0.6613 (0.6613) grad_norm 0.1812 (0.1812) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:14:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [21/800][100/402] eta 0:03:57 lr 0.000025 time 0.7446 (0.7852) loss 0.6771 (0.6782) grad_norm 0.1286 (0.3285) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:15:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [21/800][200/402] eta 0:02:34 lr 0.000025 time 0.7455 (0.7651) loss 0.6630 (0.6760) grad_norm 0.5463 (0.3179) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:16:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [21/800][300/402] eta 0:01:17 lr 
0.000025 time 0.7448 (0.7584) loss 0.6827 (0.6775) grad_norm 0.4981 (0.3211) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:18:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [21/800][400/402] eta 0:00:01 lr 0.000025 time 0.7435 (0.7551) loss 0.6536 (0.6781) grad_norm 0.2526 (0.3261) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:18:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 21 training takes 0:05:03 [2024-03-08 18:18:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [22/800][0/402] eta 0:22:06 lr 0.000025 time 3.2992 (3.2992) loss 0.6843 (0.6843) grad_norm 0.5217 (0.5217) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:19:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [22/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7702) loss 0.6910 (0.6779) grad_norm 0.1630 (0.3574) loss_scale 262144.0000 (132369.7426) mem 28968MB [2024-03-08 18:20:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [22/800][200/402] eta 0:02:33 lr 0.000025 time 0.7450 (0.7577) loss 0.7038 (0.6774) grad_norm 0.3344 (0.3269) loss_scale 262144.0000 (196934.0498) mem 28968MB [2024-03-08 18:21:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [22/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7535) loss 0.6408 (0.6775) grad_norm 2.0848 (inf) loss_scale 131072.0000 (187245.7143) mem 28968MB [2024-03-08 18:23:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [22/800][400/402] eta 0:00:01 lr 0.000025 time 0.7435 (0.7513) loss 0.6831 (0.6783) grad_norm 0.1574 (inf) loss_scale 131072.0000 (173237.3067) mem 28968MB [2024-03-08 18:23:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 22 training takes 0:05:02 [2024-03-08 18:23:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [23/800][0/402] eta 0:21:53 lr 0.000025 time 3.2662 (3.2662) loss 0.6863 (0.6863) grad_norm 0.1414 (0.1414) 
loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:24:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [23/800][100/402] eta 0:03:52 lr 0.000025 time 0.7470 (0.7699) loss 0.7022 (0.6788) grad_norm 0.5092 (0.3394) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:25:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [23/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7575) loss 0.6796 (0.6768) grad_norm 0.4363 (0.3301) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:26:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [23/800][300/402] eta 0:01:16 lr 0.000025 time 0.7447 (0.7534) loss 0.6925 (0.6773) grad_norm 0.4360 (0.3293) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:28:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [23/800][400/402] eta 0:00:01 lr 0.000025 time 0.7437 (0.7513) loss 0.6701 (0.6772) grad_norm 0.8104 (0.3310) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:28:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 23 training takes 0:05:02 [2024-03-08 18:28:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [24/800][0/402] eta 0:22:45 lr 0.000025 time 3.3959 (3.3959) loss 0.7000 (0.7000) grad_norm 0.3200 (0.3200) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:29:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [24/800][100/402] eta 0:03:52 lr 0.000025 time 0.7450 (0.7713) loss 0.6503 (0.6744) grad_norm 0.2538 (0.3613) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:30:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [24/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7583) loss 0.6804 (0.6750) grad_norm 0.1818 (0.3645) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:31:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [24/800][300/402] eta 0:01:16 lr 0.000025 time 
0.7453 (0.7539) loss 0.6622 (0.6754) grad_norm 0.2856 (0.3708) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:33:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [24/800][400/402] eta 0:00:01 lr 0.000025 time 0.7434 (0.7517) loss 0.6861 (0.6756) grad_norm 0.6788 (0.3731) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:33:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 24 training takes 0:05:02 [2024-03-08 18:33:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [25/800][0/402] eta 0:21:59 lr 0.000025 time 3.2834 (3.2834) loss 0.6544 (0.6544) grad_norm 0.7230 (0.7230) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:34:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [25/800][100/402] eta 0:03:52 lr 0.000025 time 0.7449 (0.7703) loss 0.6705 (0.6733) grad_norm 0.7879 (0.4277) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:35:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [25/800][200/402] eta 0:02:33 lr 0.000025 time 0.7449 (0.7578) loss 0.6941 (0.6748) grad_norm 0.3423 (0.4185) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:36:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [25/800][300/402] eta 0:01:16 lr 0.000025 time 0.7465 (0.7536) loss 0.6662 (0.6743) grad_norm 0.2645 (0.4191) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:38:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [25/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7515) loss 0.7023 (0.6738) grad_norm 0.4717 (0.4172) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:38:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 25 training takes 0:05:02 [2024-03-08 18:38:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [26/800][0/402] eta 0:32:03 lr 0.000025 time 4.7854 (4.7854) loss 0.6847 (0.6847) grad_norm 0.4599 (0.4599) loss_scale 
131072.0000 (131072.0000) mem 28968MB [2024-03-08 18:39:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [26/800][100/402] eta 0:03:57 lr 0.000025 time 0.7455 (0.7853) loss 0.6832 (0.6733) grad_norm 0.3262 (inf) loss_scale 65536.0000 (110956.9901) mem 28968MB [2024-03-08 18:40:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [26/800][200/402] eta 0:02:34 lr 0.000025 time 0.7460 (0.7655) loss 0.6386 (0.6736) grad_norm 0.3724 (inf) loss_scale 65536.0000 (88359.4826) mem 28968MB [2024-03-08 18:41:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [26/800][300/402] eta 0:01:17 lr 0.000025 time 0.7446 (0.7591) loss 0.6718 (0.6735) grad_norm 0.5897 (inf) loss_scale 65536.0000 (80776.9302) mem 28968MB [2024-03-08 18:43:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [26/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7556) loss 0.6582 (0.6730) grad_norm 0.5065 (inf) loss_scale 65536.0000 (76976.1995) mem 28968MB [2024-03-08 18:43:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 26 training takes 0:05:03 [2024-03-08 18:43:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [27/800][0/402] eta 0:21:53 lr 0.000025 time 3.2685 (3.2685) loss 0.6735 (0.6735) grad_norm 0.6245 (0.6245) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:44:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [27/800][100/402] eta 0:03:52 lr 0.000025 time 0.7465 (0.7713) loss 0.6623 (0.6700) grad_norm 0.4921 (0.4507) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:45:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [27/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7589) loss 0.6750 (0.6707) grad_norm 0.5612 (0.4517) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:47:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [27/800][300/402] eta 0:01:16 lr 0.000025 time 0.7463 (0.7548) loss 0.6813 (0.6713) 
grad_norm 0.3206 (0.4343) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:48:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [27/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7526) loss 0.6729 (0.6707) grad_norm 0.4722 (0.4291) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:48:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 27 training takes 0:05:02 [2024-03-08 18:48:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [28/800][0/402] eta 0:22:09 lr 0.000025 time 3.3061 (3.3061) loss 0.6486 (0.6486) grad_norm 0.3460 (0.3460) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:49:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [28/800][100/402] eta 0:03:52 lr 0.000025 time 0.7452 (0.7706) loss 0.6100 (0.6672) grad_norm 0.3810 (0.3775) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:50:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [28/800][200/402] eta 0:02:33 lr 0.000025 time 0.7535 (0.7580) loss 0.6802 (0.6686) grad_norm 0.2998 (0.3813) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:52:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [28/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7538) loss 0.6795 (0.6681) grad_norm 0.4202 (0.3946) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:53:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [28/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7516) loss 0.6470 (0.6687) grad_norm 0.4515 (0.3957) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:53:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 28 training takes 0:05:02 [2024-03-08 18:53:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [29/800][0/402] eta 0:22:06 lr 0.000025 time 3.2991 (3.2991) loss 0.6876 (0.6876) grad_norm 0.5239 (0.5239) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 
18:54:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [29/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7706) loss 0.6575 (0.6676) grad_norm 0.3649 (0.3727) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:55:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [29/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7580) loss 0.6860 (0.6663) grad_norm 0.2775 (0.3704) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:57:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [29/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7538) loss 0.6749 (0.6671) grad_norm 0.6458 (0.3845) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:58:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [29/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7517) loss 0.6997 (0.6676) grad_norm 0.5457 (0.3787) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:58:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 29 training takes 0:05:02 [2024-03-08 18:58:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [30/800][0/402] eta 0:21:41 lr 0.000025 time 3.2363 (3.2363) loss 0.6751 (0.6751) grad_norm 0.4656 (0.4656) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 18:59:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [30/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7700) loss 0.6688 (0.6663) grad_norm 0.2889 (0.3546) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:00:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [30/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7577) loss 0.6477 (0.6666) grad_norm 0.4927 (0.3717) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:02:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [30/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7536) loss 0.6737 (0.6673) grad_norm 0.5190 (0.3800) loss_scale 
65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:03:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [30/800][400/402] eta 0:00:01 lr 0.000025 time 0.7436 (0.7515) loss 0.6595 (0.6676) grad_norm 0.2916 (0.3795) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:03:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 30 training takes 0:05:02 [2024-03-08 19:03:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [31/800][0/402] eta 0:31:31 lr 0.000025 time 4.7056 (4.7056) loss 0.6688 (0.6688) grad_norm 0.2480 (0.2480) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:04:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [31/800][100/402] eta 0:03:56 lr 0.000025 time 0.7457 (0.7846) loss 0.6882 (0.6675) grad_norm 0.2896 (0.3738) loss_scale 131072.0000 (92139.7228) mem 28968MB [2024-03-08 19:05:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [31/800][200/402] eta 0:02:34 lr 0.000025 time 0.7456 (0.7651) loss 0.6718 (0.6667) grad_norm 0.3612 (0.3702) loss_scale 131072.0000 (111509.0149) mem 28968MB [2024-03-08 19:07:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [31/800][300/402] eta 0:01:17 lr 0.000025 time 0.7454 (0.7586) loss 0.6365 (0.6662) grad_norm 0.3286 (0.3754) loss_scale 131072.0000 (118008.3455) mem 28968MB [2024-03-08 19:08:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [31/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7553) loss 0.7072 (0.6669) grad_norm 0.4298 (0.3767) loss_scale 131072.0000 (121266.1147) mem 28968MB [2024-03-08 19:08:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 31 training takes 0:05:03 [2024-03-08 19:08:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [32/800][0/402] eta 0:22:59 lr 0.000025 time 3.4322 (3.4322) loss 0.6398 (0.6398) grad_norm 0.3414 (0.3414) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:09:45 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [32/800][100/402] eta 0:03:53 lr 0.000025 time 0.7457 (0.7725) loss 0.6625 (0.6661) grad_norm 0.3754 (0.3931) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:10:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [32/800][200/402] eta 0:02:33 lr 0.000025 time 0.7451 (0.7590) loss 0.6897 (0.6668) grad_norm 0.3048 (0.3762) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:12:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [32/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7545) loss 0.6736 (0.6666) grad_norm 0.2822 (0.3843) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:13:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [32/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7522) loss 0.7016 (0.6672) grad_norm 0.2976 (inf) loss_scale 65536.0000 (118978.0748) mem 28968MB [2024-03-08 19:13:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 32 training takes 0:05:02 [2024-03-08 19:13:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [33/800][0/402] eta 0:22:22 lr 0.000025 time 3.3387 (3.3387) loss 0.6839 (0.6839) grad_norm 0.6103 (0.6103) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:14:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [33/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7710) loss 0.6492 (0.6648) grad_norm 0.3105 (0.3959) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:16:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [33/800][200/402] eta 0:02:33 lr 0.000025 time 0.7450 (0.7582) loss 0.6613 (0.6660) grad_norm 0.2948 (0.3997) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:17:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [33/800][300/402] eta 0:01:16 lr 0.000025 time 0.7451 (0.7539) loss 0.6609 (0.6651) grad_norm 0.3665 (0.3950) loss_scale 
65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:18:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [33/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7518) loss 0.7070 (0.6653) grad_norm 0.2446 (0.3959) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:18:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 33 training takes 0:05:02 [2024-03-08 19:18:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [34/800][0/402] eta 0:21:38 lr 0.000025 time 3.2308 (3.2308) loss 0.6216 (0.6216) grad_norm 0.3565 (0.3565) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:19:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [34/800][100/402] eta 0:03:52 lr 0.000025 time 0.7451 (0.7699) loss 0.6746 (0.6658) grad_norm 0.5027 (0.3940) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:21:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [34/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7578) loss 0.6389 (0.6646) grad_norm 0.6270 (0.3998) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:22:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [34/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7536) loss 0.6620 (0.6653) grad_norm 0.3619 (0.4027) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:23:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [34/800][400/402] eta 0:00:01 lr 0.000025 time 0.7437 (0.7515) loss 0.6741 (0.6643) grad_norm 0.4194 (0.4035) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:23:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 34 training takes 0:05:02 [2024-03-08 19:23:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [35/800][0/402] eta 0:21:59 lr 0.000025 time 3.2825 (3.2825) loss 0.6688 (0.6688) grad_norm 0.5107 (0.5107) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:24:51 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [35/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7706) loss 0.6975 (0.6649) grad_norm 0.3511 (0.4042) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:26:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [35/800][200/402] eta 0:02:33 lr 0.000025 time 0.7450 (0.7580) loss 0.6672 (0.6635) grad_norm 0.4240 (0.4012) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:27:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [35/800][300/402] eta 0:01:16 lr 0.000025 time 0.7464 (0.7539) loss 0.6616 (0.6630) grad_norm 0.4317 (0.4068) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:28:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [35/800][400/402] eta 0:00:01 lr 0.000025 time 0.7437 (0.7517) loss 0.6675 (0.6637) grad_norm 0.4258 (0.4167) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:28:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 35 training takes 0:05:02 [2024-03-08 19:28:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [36/800][0/402] eta 0:31:27 lr 0.000025 time 4.6951 (4.6951) loss 0.6548 (0.6548) grad_norm 0.3470 (0.3470) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:29:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [36/800][100/402] eta 0:03:56 lr 0.000025 time 0.7454 (0.7845) loss 0.6956 (0.6666) grad_norm 0.4872 (0.4111) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:31:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [36/800][200/402] eta 0:02:34 lr 0.000025 time 0.7458 (0.7650) loss 0.6372 (0.6654) grad_norm 0.6281 (0.4227) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:32:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [36/800][300/402] eta 0:01:17 lr 0.000025 time 0.7454 (0.7586) loss 0.6708 (0.6652) grad_norm 0.4452 (0.4281) loss_scale 65536.0000 (65536.0000) mem 28968MB 
[2024-03-08 19:33:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [36/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7553) loss 0.6549 (0.6650) grad_norm 0.3427 (0.4334) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:33:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 36 training takes 0:05:03 [2024-03-08 19:33:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [37/800][0/402] eta 0:22:47 lr 0.000025 time 3.4009 (3.4009) loss 0.6976 (0.6976) grad_norm 0.4960 (0.4960) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:34:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [37/800][100/402] eta 0:03:53 lr 0.000025 time 0.7450 (0.7719) loss 0.6487 (0.6604) grad_norm 0.3516 (0.4234) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:36:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [37/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7590) loss 0.6586 (0.6632) grad_norm 0.3240 (0.4245) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:37:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [37/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7547) loss 0.6343 (0.6638) grad_norm 0.4849 (0.4190) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-08 19:38:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [37/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7524) loss 0.6625 (0.6632) grad_norm 0.3913 (0.4242) loss_scale 131072.0000 (79264.2394) mem 28968MB [2024-03-08 19:38:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 37 training takes 0:05:02 [2024-03-08 19:38:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [38/800][0/402] eta 0:22:01 lr 0.000025 time 3.2873 (3.2873) loss 0.6624 (0.6624) grad_norm 0.3750 (0.3750) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:40:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[38/800][100/402] eta 0:03:52 lr 0.000025 time 0.7463 (0.7706) loss 0.6823 (0.6626) grad_norm 0.3645 (0.4253) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:41:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [38/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7581) loss 0.6330 (0.6634) grad_norm 0.4872 (0.4218) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:42:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [38/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7539) loss 0.6989 (0.6625) grad_norm 0.4140 (0.4225) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:43:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [38/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7518) loss 0.6509 (0.6626) grad_norm 0.4085 (0.4285) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:43:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 38 training takes 0:05:02 [2024-03-08 19:43:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [39/800][0/402] eta 0:22:41 lr 0.000025 time 3.3870 (3.3870) loss 0.6470 (0.6470) grad_norm 0.3247 (0.3247) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:45:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [39/800][100/402] eta 0:03:53 lr 0.000025 time 0.7457 (0.7717) loss 0.6716 (0.6627) grad_norm 0.4248 (0.4387) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:46:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [39/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7587) loss 0.6572 (0.6621) grad_norm 0.3872 (0.4278) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:47:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [39/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7543) loss 0.7099 (0.6618) grad_norm 0.3793 (0.4270) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:48:46 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [39/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7521) loss 0.6299 (0.6619) grad_norm 0.6606 (0.4271) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:48:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 39 training takes 0:05:02 [2024-03-08 19:48:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [40/800][0/402] eta 0:22:31 lr 0.000025 time 3.3619 (3.3619) loss 0.6616 (0.6616) grad_norm 0.3369 (0.3369) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:50:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [40/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7714) loss 0.6494 (0.6578) grad_norm 0.4069 (0.4323) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:51:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [40/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7585) loss 0.6524 (0.6604) grad_norm 0.6360 (0.4357) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:52:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [40/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7542) loss 0.6895 (0.6613) grad_norm 0.3695 (0.4406) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:53:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [40/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7520) loss 0.6956 (0.6612) grad_norm 0.4349 (0.4449) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:53:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 40 training takes 0:05:02 [2024-03-08 19:53:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [41/800][0/402] eta 0:32:49 lr 0.000025 time 4.8999 (4.8999) loss 0.6711 (0.6711) grad_norm 0.4048 (0.4048) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:55:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[41/800][100/402] eta 0:03:57 lr 0.000025 time 0.7448 (0.7867) loss 0.6953 (0.6627) grad_norm 0.3318 (0.4376) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:56:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [41/800][200/402] eta 0:02:34 lr 0.000025 time 0.7467 (0.7663) loss 0.6537 (0.6630) grad_norm 0.3164 (0.4355) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:57:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [41/800][300/402] eta 0:01:17 lr 0.000025 time 0.7453 (0.7595) loss 0.6854 (0.6630) grad_norm 0.6823 (0.4379) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:58:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [41/800][400/402] eta 0:00:01 lr 0.000025 time 0.7456 (0.7560) loss 0.6857 (0.6624) grad_norm 0.3681 (0.4363) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 19:58:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 41 training takes 0:05:04 [2024-03-08 19:58:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [42/800][0/402] eta 0:22:26 lr 0.000025 time 3.3507 (3.3507) loss 0.6306 (0.6306) grad_norm 0.3520 (0.3520) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:00:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [42/800][100/402] eta 0:03:53 lr 0.000025 time 0.7462 (0.7724) loss 0.6566 (0.6574) grad_norm 0.4394 (0.4279) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:01:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [42/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7596) loss 0.6758 (0.6589) grad_norm 0.6823 (0.4405) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:02:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [42/800][300/402] eta 0:01:17 lr 0.000025 time 0.7471 (0.7553) loss 0.6630 (0.6600) grad_norm 0.4941 (0.4421) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:03:55 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [42/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7531) loss 0.6656 (0.6597) grad_norm 0.3839 (inf) loss_scale 131072.0000 (159835.9302) mem 28968MB [2024-03-08 20:03:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 42 training takes 0:05:02 [2024-03-08 20:03:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [43/800][0/402] eta 0:22:06 lr 0.000025 time 3.2988 (3.2988) loss 0.6728 (0.6728) grad_norm 0.4610 (0.4610) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:05:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [43/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7708) loss 0.6627 (0.6613) grad_norm 0.5265 (0.4371) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:06:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [43/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7583) loss 0.6470 (0.6590) grad_norm 0.4289 (0.4519) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:07:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [43/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7541) loss 0.6708 (0.6595) grad_norm 0.3918 (0.4485) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:08:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [43/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7520) loss 0.6941 (0.6597) grad_norm 0.2880 (0.4418) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:08:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 43 training takes 0:05:02 [2024-03-08 20:09:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [44/800][0/402] eta 0:21:47 lr 0.000025 time 3.2521 (3.2521) loss 0.6482 (0.6482) grad_norm 0.3691 (0.3691) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:10:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[44/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7704) loss 0.6610 (0.6575) grad_norm 0.2821 (0.4514) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:11:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [44/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7581) loss 0.6375 (0.6571) grad_norm 0.4006 (0.4335) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:12:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [44/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7540) loss 0.6236 (0.6582) grad_norm 0.5688 (0.4353) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:14:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [44/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7518) loss 0.6491 (0.6584) grad_norm 0.4730 (0.4382) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:14:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 44 training takes 0:05:02 [2024-03-08 20:14:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [45/800][0/402] eta 0:22:21 lr 0.000025 time 3.3362 (3.3362) loss 0.6812 (0.6812) grad_norm 0.3276 (0.3276) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:15:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [45/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7712) loss 0.6896 (0.6604) grad_norm 0.4295 (0.4543) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:16:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [45/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7585) loss 0.6378 (0.6589) grad_norm 0.5348 (0.4435) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:17:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [45/800][300/402] eta 0:01:16 lr 0.000025 time 0.7463 (0.7542) loss 0.6238 (0.6583) grad_norm 0.5690 (0.4445) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:19:02 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [45/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7520) loss 0.6719 (0.6588) grad_norm 0.4038 (0.4463) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:19:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 45 training takes 0:05:02 [2024-03-08 20:19:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [46/800][0/402] eta 0:32:29 lr 0.000025 time 4.8496 (4.8496) loss 0.7006 (0.7006) grad_norm 0.3699 (0.3699) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:20:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [46/800][100/402] eta 0:03:57 lr 0.000025 time 0.7454 (0.7861) loss 0.6471 (0.6592) grad_norm 0.3572 (0.4210) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:21:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [46/800][200/402] eta 0:02:34 lr 0.000025 time 0.7458 (0.7659) loss 0.6686 (0.6588) grad_norm 0.5199 (0.4244) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:22:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [46/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7592) loss 0.6571 (0.6573) grad_norm 0.3727 (0.4402) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:24:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [46/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7557) loss 0.6560 (0.6585) grad_norm 0.4057 (0.4388) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:24:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 46 training takes 0:05:03 [2024-03-08 20:24:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [47/800][0/402] eta 0:22:13 lr 0.000025 time 3.3162 (3.3162) loss 0.6609 (0.6609) grad_norm 0.4310 (0.4310) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:25:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[47/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7709) loss 0.6232 (0.6563) grad_norm 0.3100 (0.4178) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:26:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [47/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7583) loss 0.6670 (0.6572) grad_norm 0.5349 (0.4181) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:27:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [47/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7541) loss 0.6255 (0.6574) grad_norm 0.4141 (0.4178) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:29:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [47/800][400/402] eta 0:00:01 lr 0.000025 time 0.7437 (0.7519) loss 0.6848 (0.6577) grad_norm 0.3523 (0.4145) loss_scale 262144.0000 (136301.8055) mem 28968MB [2024-03-08 20:29:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 47 training takes 0:05:02 [2024-03-08 20:29:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [48/800][0/402] eta 0:21:32 lr 0.000025 time 3.2163 (3.2163) loss 0.6611 (0.6611) grad_norm 0.3394 (0.3394) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 20:30:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [48/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7701) loss 0.7001 (0.6595) grad_norm 0.4442 (0.4222) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 20:31:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [48/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7580) loss 0.6638 (0.6585) grad_norm 0.4460 (0.4197) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 20:32:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [48/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7539) loss 0.6736 (0.6571) grad_norm 0.4686 (0.4223) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 20:34:11 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [48/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7518) loss 0.6754 (0.6570) grad_norm 0.4769 (0.4219) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 20:34:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 48 training takes 0:05:02 [2024-03-08 20:34:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [49/800][0/402] eta 0:22:21 lr 0.000025 time 3.3376 (3.3376) loss 0.6516 (0.6516) grad_norm 0.4082 (0.4082) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 20:35:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [49/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7714) loss 0.6670 (0.6618) grad_norm 0.6100 (0.4108) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 20:36:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [49/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7586) loss 0.6749 (0.6583) grad_norm 0.2961 (0.4250) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 20:37:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [49/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7543) loss 0.6459 (0.6574) grad_norm 0.3641 (0.4215) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 20:39:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [49/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7521) loss 0.6764 (0.6576) grad_norm 0.3905 (0.4167) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 20:39:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 49 training takes 0:05:02 [2024-03-08 20:39:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [50/800][0/402] eta 0:22:23 lr 0.000025 time 3.3414 (3.3414) loss 0.6367 (0.6367) grad_norm 0.4236 (0.4236) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 20:40:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[50/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7712) loss 0.6745 (0.6548) grad_norm 0.5238 (inf) loss_scale 131072.0000 (168706.5347) mem 28968MB [2024-03-08 20:41:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [50/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7585) loss 0.6459 (0.6564) grad_norm 0.3318 (inf) loss_scale 131072.0000 (149982.8856) mem 28968MB [2024-03-08 20:43:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [50/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7543) loss 0.6703 (0.6571) grad_norm 0.3777 (inf) loss_scale 131072.0000 (143700.1993) mem 28968MB [2024-03-08 20:44:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [50/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7522) loss 0.6325 (0.6561) grad_norm 0.3388 (inf) loss_scale 131072.0000 (140551.0224) mem 28968MB [2024-03-08 20:44:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 50 training takes 0:05:02 [2024-03-08 20:44:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [51/800][0/402] eta 0:32:28 lr 0.000025 time 4.8461 (4.8461) loss 0.6703 (0.6703) grad_norm 0.3709 (0.3709) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:45:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [51/800][100/402] eta 0:03:57 lr 0.000025 time 0.7453 (0.7863) loss 0.6428 (0.6533) grad_norm 0.5133 (0.4287) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:46:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [51/800][200/402] eta 0:02:34 lr 0.000025 time 0.7464 (0.7661) loss 0.6738 (0.6557) grad_norm 0.2742 (0.4087) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:48:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [51/800][300/402] eta 0:01:17 lr 0.000025 time 0.7454 (0.7593) loss 0.6518 (0.6560) grad_norm 0.3314 (0.4125) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:49:19 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [51/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7559) loss 0.6578 (0.6558) grad_norm 0.4965 (0.4157) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:49:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 51 training takes 0:05:03 [2024-03-08 20:49:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [52/800][0/402] eta 0:22:16 lr 0.000025 time 3.3259 (3.3259) loss 0.6778 (0.6778) grad_norm 0.4243 (0.4243) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:50:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [52/800][100/402] eta 0:03:53 lr 0.000025 time 0.7463 (0.7722) loss 0.6467 (0.6573) grad_norm 0.4512 (0.3974) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:51:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [52/800][200/402] eta 0:02:33 lr 0.000025 time 0.7467 (0.7595) loss 0.6644 (0.6567) grad_norm 0.3336 (0.4033) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:53:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [52/800][300/402] eta 0:01:17 lr 0.000025 time 0.7470 (0.7554) loss 0.6574 (0.6554) grad_norm 0.3852 (0.4035) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:54:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [52/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7532) loss 0.6645 (0.6550) grad_norm 0.5849 (0.4017) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:54:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 52 training takes 0:05:02 [2024-03-08 20:54:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [53/800][0/402] eta 0:21:44 lr 0.000025 time 3.2455 (3.2455) loss 0.6966 (0.6966) grad_norm 0.3255 (0.3255) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:55:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[53/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7704) loss 0.6382 (0.6565) grad_norm 0.4064 (0.3999) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:56:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [53/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7581) loss 0.6316 (0.6568) grad_norm 0.4823 (0.3951) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:58:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [53/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7540) loss 0.6686 (0.6561) grad_norm 0.3924 (0.3988) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:59:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [53/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7519) loss 0.6680 (0.6554) grad_norm 0.3774 (0.4009) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 20:59:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 53 training takes 0:05:02 [2024-03-08 20:59:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [54/800][0/402] eta 0:22:13 lr 0.000025 time 3.3184 (3.3184) loss 0.6785 (0.6785) grad_norm 0.3925 (0.3925) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 21:00:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [54/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7711) loss 0.6711 (0.6538) grad_norm 0.3178 (0.4073) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 21:01:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [54/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7584) loss 0.6661 (0.6539) grad_norm 0.5042 (0.4011) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 21:03:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [54/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7542) loss 0.6602 (0.6539) grad_norm 0.4346 (0.3947) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 21:04:27 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [54/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7520) loss 0.6428 (0.6538) grad_norm 0.4281 (0.3961) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 21:04:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 54 training takes 0:05:02 [2024-03-08 21:04:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [55/800][0/402] eta 0:21:40 lr 0.000025 time 3.2360 (3.2360) loss 0.6681 (0.6681) grad_norm 0.4213 (0.4213) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 21:05:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [55/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7708) loss 0.6176 (0.6560) grad_norm 0.5320 (0.3789) loss_scale 262144.0000 (237486.8911) mem 28968MB [2024-03-08 21:07:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [55/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7583) loss 0.6698 (0.6536) grad_norm 0.2514 (0.3807) loss_scale 262144.0000 (249754.1095) mem 28968MB [2024-03-08 21:08:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [55/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7541) loss 0.6654 (0.6541) grad_norm 0.4288 (0.3845) loss_scale 262144.0000 (253870.3522) mem 28968MB [2024-03-08 21:09:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [55/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7520) loss 0.6620 (0.6545) grad_norm 0.3071 (0.3826) loss_scale 262144.0000 (255933.6060) mem 28968MB [2024-03-08 21:09:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 55 training takes 0:05:02 [2024-03-08 21:09:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [56/800][0/402] eta 0:32:03 lr 0.000025 time 4.7846 (4.7846) loss 0.6418 (0.6418) grad_norm 0.3840 (0.3840) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:10:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[56/800][100/402] eta 0:03:57 lr 0.000025 time 0.7458 (0.7860) loss 0.6567 (0.6519) grad_norm 0.4400 (0.3789) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:12:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [56/800][200/402] eta 0:02:34 lr 0.000025 time 0.7456 (0.7660) loss 0.6607 (0.6542) grad_norm 0.3731 (0.3846) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:13:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [56/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7592) loss 0.6620 (0.6543) grad_norm 0.3265 (0.3780) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:14:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [56/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7558) loss 0.6513 (0.6543) grad_norm 0.3752 (0.3771) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:14:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 56 training takes 0:05:03 [2024-03-08 21:14:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [57/800][0/402] eta 0:21:46 lr 0.000025 time 3.2502 (3.2502) loss 0.6380 (0.6380) grad_norm 0.3298 (0.3298) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:15:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [57/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7706) loss 0.6641 (0.6556) grad_norm 0.6119 (0.4068) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:17:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [57/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7582) loss 0.6709 (0.6544) grad_norm 0.3320 (0.3883) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:18:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [57/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7540) loss 0.6538 (0.6550) grad_norm 0.3424 (0.3840) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:19:36 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [57/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7519) loss 0.6396 (0.6544) grad_norm 0.3045 (0.3846) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:19:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 57 training takes 0:05:02 [2024-03-08 21:19:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [58/800][0/402] eta 0:23:01 lr 0.000025 time 3.4357 (3.4357) loss 0.6375 (0.6375) grad_norm 0.2837 (0.2837) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:20:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [58/800][100/402] eta 0:03:53 lr 0.000025 time 0.7452 (0.7721) loss 0.6723 (0.6499) grad_norm 0.3346 (0.3580) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:22:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [58/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7590) loss 0.6444 (0.6501) grad_norm 0.2906 (0.3694) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:23:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [58/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7545) loss 0.6785 (0.6516) grad_norm 0.3724 (0.3665) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:24:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [58/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7523) loss 0.6257 (0.6521) grad_norm 0.2915 (0.3707) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:24:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 58 training takes 0:05:02 [2024-03-08 21:24:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [59/800][0/402] eta 0:22:25 lr 0.000025 time 3.3469 (3.3469) loss 0.6312 (0.6312) grad_norm 0.3153 (0.3153) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:25:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[59/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7714) loss 0.6248 (0.6519) grad_norm 0.3666 (0.3590) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:27:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [59/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7587) loss 0.6426 (0.6516) grad_norm 0.4228 (0.3648) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:28:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [59/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7544) loss 0.6819 (0.6513) grad_norm 0.3926 (0.3666) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:29:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [59/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7522) loss 0.6352 (0.6516) grad_norm 0.3577 (0.3676) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:29:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 59 training takes 0:05:02 [2024-03-08 21:29:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [60/800][0/402] eta 0:23:04 lr 0.000025 time 3.4441 (3.4441) loss 0.6511 (0.6511) grad_norm 0.3824 (0.3824) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:30:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [60/800][100/402] eta 0:03:53 lr 0.000025 time 0.7455 (0.7724) loss 0.6599 (0.6516) grad_norm 0.3140 (0.3631) loss_scale 524288.0000 (500928.6337) mem 28968MB [2024-03-08 21:32:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [60/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7591) loss 0.6259 (0.6531) grad_norm 0.3806 (inf) loss_scale 262144.0000 (469511.6418) mem 28968MB [2024-03-08 21:33:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [60/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7547) loss 0.6592 (0.6529) grad_norm 0.4220 (inf) loss_scale 262144.0000 (400618.7375) mem 28968MB [2024-03-08 21:34:43 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [60/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7524) loss 0.6433 (0.6527) grad_norm 0.3152 (inf) loss_scale 262144.0000 (366086.3840) mem 28968MB [2024-03-08 21:34:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 60 training takes 0:05:02 [2024-03-08 21:34:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [61/800][0/402] eta 0:31:54 lr 0.000025 time 4.7618 (4.7618) loss 0.6910 (0.6910) grad_norm 0.5302 (0.5302) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:36:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [61/800][100/402] eta 0:03:57 lr 0.000025 time 0.7454 (0.7857) loss 0.6263 (0.6512) grad_norm 0.3873 (0.3722) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:37:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [61/800][200/402] eta 0:02:34 lr 0.000025 time 0.7463 (0.7662) loss 0.6703 (0.6529) grad_norm 0.2840 (0.3634) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:38:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [61/800][300/402] eta 0:01:17 lr 0.000025 time 0.7471 (0.7596) loss 0.6439 (0.6519) grad_norm 0.4923 (0.3628) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:39:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [61/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7562) loss 0.6713 (0.6514) grad_norm 0.3171 (0.3578) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:39:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 61 training takes 0:05:04 [2024-03-08 21:39:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [62/800][0/402] eta 0:22:09 lr 0.000025 time 3.3065 (3.3065) loss 0.6439 (0.6439) grad_norm 0.3915 (0.3915) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:41:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[62/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7710) loss 0.6516 (0.6530) grad_norm 0.3762 (0.3598) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:42:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [62/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7584) loss 0.6380 (0.6526) grad_norm 0.3030 (0.3584) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:43:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [62/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7541) loss 0.6258 (0.6519) grad_norm 0.2889 (0.3583) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:44:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [62/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7520) loss 0.6437 (0.6517) grad_norm 0.3272 (0.3537) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:44:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 62 training takes 0:05:02 [2024-03-08 21:44:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [63/800][0/402] eta 0:21:43 lr 0.000025 time 3.2426 (3.2426) loss 0.6266 (0.6266) grad_norm 0.3948 (0.3948) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:46:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [63/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7703) loss 0.6524 (0.6526) grad_norm 0.3547 (0.3510) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:47:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [63/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7580) loss 0.6653 (0.6522) grad_norm 0.2866 (0.3457) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:48:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [63/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7539) loss 0.6474 (0.6513) grad_norm 0.3025 (0.3494) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:49:52 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [63/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7518) loss 0.6616 (0.6510) grad_norm 0.3467 (0.3474) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:49:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 63 training takes 0:05:02 [2024-03-08 21:49:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [64/800][0/402] eta 0:22:04 lr 0.000025 time 3.2957 (3.2957) loss 0.6523 (0.6523) grad_norm 0.3587 (0.3587) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:51:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [64/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7709) loss 0.6866 (0.6482) grad_norm 0.3077 (0.3403) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:52:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [64/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7584) loss 0.6316 (0.6495) grad_norm 0.4353 (0.3418) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:53:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [64/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7542) loss 0.5927 (0.6501) grad_norm 0.3382 (0.3391) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:54:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [64/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7520) loss 0.6375 (0.6503) grad_norm 0.3034 (0.3450) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:54:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 64 training takes 0:05:02 [2024-03-08 21:54:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [65/800][0/402] eta 0:22:15 lr 0.000025 time 3.3229 (3.3229) loss 0.6558 (0.6558) grad_norm 0.3486 (0.3486) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:56:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[65/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7711) loss 0.6630 (0.6505) grad_norm 0.3240 (0.3348) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 21:57:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [65/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7584) loss 0.6428 (0.6499) grad_norm 0.2780 (inf) loss_scale 262144.0000 (266056.5970) mem 28968MB [2024-03-08 21:58:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [65/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7542) loss 0.6672 (0.6501) grad_norm 0.2883 (inf) loss_scale 262144.0000 (264756.7309) mem 28968MB [2024-03-08 21:59:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [65/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7520) loss 0.6441 (0.6497) grad_norm 0.3113 (inf) loss_scale 262144.0000 (264105.1771) mem 28968MB [2024-03-08 21:59:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 65 training takes 0:05:02 [2024-03-08 22:00:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [66/800][0/402] eta 0:32:45 lr 0.000025 time 4.8886 (4.8886) loss 0.6417 (0.6417) grad_norm 0.3478 (0.3478) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 22:01:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [66/800][100/402] eta 0:03:57 lr 0.000025 time 0.7457 (0.7867) loss 0.6425 (0.6488) grad_norm 0.4447 (0.3460) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 22:02:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [66/800][200/402] eta 0:02:34 lr 0.000025 time 0.7457 (0.7664) loss 0.6736 (0.6502) grad_norm 0.3013 (0.3500) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 22:03:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [66/800][300/402] eta 0:01:17 lr 0.000025 time 0.7460 (0.7596) loss 0.6374 (0.6499) grad_norm 0.2947 (0.3427) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 22:05:01 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [66/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7561) loss 0.6239 (0.6503) grad_norm 0.3508 (0.3408) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 22:05:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 66 training takes 0:05:04 [2024-03-08 22:05:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [67/800][0/402] eta 0:22:50 lr 0.000025 time 3.4094 (3.4094) loss 0.6582 (0.6582) grad_norm 0.3901 (0.3901) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 22:06:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [67/800][100/402] eta 0:03:53 lr 0.000025 time 0.7454 (0.7721) loss 0.6342 (0.6495) grad_norm 0.2853 (0.3350) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 22:07:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [67/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7589) loss 0.6517 (0.6498) grad_norm 0.3848 (inf) loss_scale 131072.0000 (256275.1045) mem 28968MB [2024-03-08 22:08:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [67/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7545) loss 0.6595 (0.6496) grad_norm 0.4043 (inf) loss_scale 131072.0000 (214679.3887) mem 28968MB [2024-03-08 22:10:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [67/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7522) loss 0.6709 (0.6496) grad_norm 0.2948 (inf) loss_scale 131072.0000 (193829.6658) mem 28968MB [2024-03-08 22:10:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 67 training takes 0:05:02 [2024-03-08 22:10:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [68/800][0/402] eta 0:22:29 lr 0.000025 time 3.3575 (3.3575) loss 0.6702 (0.6702) grad_norm 0.3013 (0.3013) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:11:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [68/800][100/402] 
eta 0:03:52 lr 0.000025 time 0.7452 (0.7715) loss 0.6335 (0.6503) grad_norm 0.3931 (0.3360) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:12:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [68/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7587) loss 0.6610 (0.6486) grad_norm 0.3017 (0.3361) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:13:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [68/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7544) loss 0.6489 (0.6487) grad_norm 0.2835 (0.3308) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:15:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [68/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7522) loss 0.6853 (0.6488) grad_norm 0.2684 (0.3312) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:15:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 68 training takes 0:05:02 [2024-03-08 22:15:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [69/800][0/402] eta 0:22:03 lr 0.000025 time 3.2914 (3.2914) loss 0.6888 (0.6888) grad_norm 0.2972 (0.2972) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:16:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [69/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7709) loss 0.6384 (0.6480) grad_norm 0.3549 (0.3228) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:17:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [69/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7584) loss 0.6063 (0.6497) grad_norm 0.3927 (0.3251) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:18:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [69/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7541) loss 0.6372 (0.6495) grad_norm 0.3055 (0.3236) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:20:08 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [69/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7519) loss 0.6135 (0.6489) grad_norm 0.6230 (0.3236) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:20:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 69 training takes 0:05:02 [2024-03-08 22:20:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [70/800][0/402] eta 0:21:23 lr 0.000025 time 3.1937 (3.1937) loss 0.6737 (0.6737) grad_norm 0.2332 (0.2332) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:21:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [70/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7699) loss 0.6496 (0.6501) grad_norm 0.3076 (0.3180) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:22:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [70/800][200/402] eta 0:02:33 lr 0.000025 time 0.7452 (0.7578) loss 0.6340 (0.6488) grad_norm 0.3615 (0.3187) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:23:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [70/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7538) loss 0.6757 (0.6481) grad_norm 0.1925 (0.3199) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:25:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [70/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7518) loss 0.6259 (0.6477) grad_norm 0.3718 (0.3214) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:25:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 70 training takes 0:05:02 [2024-03-08 22:25:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [71/800][0/402] eta 0:32:08 lr 0.000025 time 4.7971 (4.7971) loss 0.6247 (0.6247) grad_norm 0.2570 (0.2570) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:26:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[71/800][100/402] eta 0:03:57 lr 0.000025 time 0.7459 (0.7860) loss 0.6587 (0.6477) grad_norm 0.3140 (0.3162) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:27:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [71/800][200/402] eta 0:02:34 lr 0.000025 time 0.7453 (0.7659) loss 0.6324 (0.6482) grad_norm 0.4110 (0.3216) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:29:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [71/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7592) loss 0.6859 (0.6477) grad_norm 0.2632 (0.3197) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:30:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [71/800][400/402] eta 0:00:01 lr 0.000025 time 0.7452 (0.7558) loss 0.6459 (0.6478) grad_norm 0.2960 (0.3230) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:30:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 71 training takes 0:05:03 [2024-03-08 22:30:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [72/800][0/402] eta 0:22:41 lr 0.000025 time 3.3866 (3.3866) loss 0.6490 (0.6490) grad_norm 0.2788 (0.2788) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:31:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [72/800][100/402] eta 0:03:53 lr 0.000025 time 0.7460 (0.7717) loss 0.6363 (0.6483) grad_norm 0.3968 (0.3157) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:32:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [72/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7587) loss 0.6298 (0.6473) grad_norm 0.2451 (0.3167) loss_scale 262144.0000 (143461.8905) mem 28968MB [2024-03-08 22:34:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [72/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7544) loss 0.6432 (0.6482) grad_norm 0.3406 (0.3129) loss_scale 262144.0000 (182891.1628) mem 28968MB [2024-03-08 22:35:17 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [72/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7522) loss 0.6384 (0.6476) grad_norm 0.3497 (0.3103) loss_scale 262144.0000 (202654.9626) mem 28968MB [2024-03-08 22:35:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 72 training takes 0:05:02 [2024-03-08 22:35:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [73/800][0/402] eta 0:22:18 lr 0.000025 time 3.3296 (3.3296) loss 0.6638 (0.6638) grad_norm 0.2567 (0.2567) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 22:36:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [73/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7713) loss 0.6249 (0.6471) grad_norm 0.3194 (0.3094) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 22:37:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [73/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7586) loss 0.6336 (0.6453) grad_norm 0.2981 (0.3050) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 22:39:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [73/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7543) loss 0.6477 (0.6465) grad_norm 0.2576 (0.3091) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 22:40:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [73/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7521) loss 0.6581 (0.6465) grad_norm 0.2461 (inf) loss_scale 131072.0000 (251030.6633) mem 28968MB [2024-03-08 22:40:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 73 training takes 0:05:02 [2024-03-08 22:40:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [74/800][0/402] eta 0:22:42 lr 0.000025 time 3.3902 (3.3902) loss 0.6716 (0.6716) grad_norm 0.2411 (0.2411) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:41:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[74/800][100/402] eta 0:03:53 lr 0.000025 time 0.7453 (0.7718) loss 0.6990 (0.6489) grad_norm 0.2485 (0.2968) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:42:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [74/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7588) loss 0.6310 (0.6495) grad_norm 0.3101 (0.3030) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:44:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [74/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7544) loss 0.6580 (0.6485) grad_norm 0.3992 (0.3026) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:45:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [74/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7522) loss 0.6641 (0.6482) grad_norm 0.2939 (0.3045) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:45:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 74 training takes 0:05:02 [2024-03-08 22:45:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [75/800][0/402] eta 0:22:55 lr 0.000025 time 3.4214 (3.4214) loss 0.6080 (0.6080) grad_norm 0.3005 (0.3005) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:46:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [75/800][100/402] eta 0:03:53 lr 0.000025 time 0.7454 (0.7722) loss 0.6230 (0.6486) grad_norm 0.2417 (0.2903) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:47:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [75/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7590) loss 0.6342 (0.6473) grad_norm 0.2679 (0.2961) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:49:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [75/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7546) loss 0.6492 (0.6474) grad_norm 0.3086 (0.2973) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:50:24 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [75/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7523) loss 0.6463 (0.6467) grad_norm 0.2515 (0.2983) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:50:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 75 training takes 0:05:02 [2024-03-08 22:50:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [76/800][0/402] eta 0:32:35 lr 0.000025 time 4.8642 (4.8642) loss 0.6039 (0.6039) grad_norm 0.2685 (0.2685) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:51:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [76/800][100/402] eta 0:03:57 lr 0.000025 time 0.7454 (0.7864) loss 0.6377 (0.6494) grad_norm 0.3104 (0.3051) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:52:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [76/800][200/402] eta 0:02:34 lr 0.000025 time 0.7457 (0.7661) loss 0.6249 (0.6483) grad_norm 0.3223 (0.2989) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:54:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [76/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7593) loss 0.6563 (0.6481) grad_norm 0.3570 (0.3037) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:55:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [76/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7559) loss 0.6537 (0.6474) grad_norm 0.3331 (0.3041) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:55:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 76 training takes 0:05:03 [2024-03-08 22:55:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [77/800][0/402] eta 0:22:16 lr 0.000025 time 3.3255 (3.3255) loss 0.6396 (0.6396) grad_norm 0.2484 (0.2484) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:56:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[77/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7712) loss 0.6510 (0.6464) grad_norm 0.2405 (0.2918) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:58:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [77/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7585) loss 0.6399 (0.6468) grad_norm 0.4133 (0.2970) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 22:59:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [77/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7542) loss 0.6485 (0.6474) grad_norm 0.3322 (0.2984) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:00:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [77/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7520) loss 0.6514 (0.6463) grad_norm 0.2604 (0.2973) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:00:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 77 training takes 0:05:02 [2024-03-08 23:00:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [78/800][0/402] eta 0:22:55 lr 0.000025 time 3.4218 (3.4218) loss 0.6337 (0.6337) grad_norm 0.3610 (0.3610) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:01:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [78/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7721) loss 0.6643 (0.6467) grad_norm 0.3087 (0.2960) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:03:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [78/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7589) loss 0.6467 (0.6455) grad_norm 0.2891 (0.2998) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:04:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [78/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7545) loss 0.6467 (0.6463) grad_norm 0.2463 (0.3041) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:05:33 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [78/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7522) loss 0.6538 (0.6471) grad_norm 0.2737 (0.2970) loss_scale 262144.0000 (145453.9651) mem 28968MB [2024-03-08 23:05:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 78 training takes 0:05:02 [2024-03-08 23:05:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [79/800][0/402] eta 0:22:38 lr 0.000025 time 3.3782 (3.3782) loss 0.6430 (0.6430) grad_norm 0.2186 (0.2186) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:06:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [79/800][100/402] eta 0:03:53 lr 0.000025 time 0.7452 (0.7718) loss 0.6680 (0.6453) grad_norm 0.4548 (0.2797) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:08:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [79/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7589) loss 0.6325 (0.6453) grad_norm 0.2583 (0.2747) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:09:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [79/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7545) loss 0.6778 (0.6451) grad_norm 0.2328 (0.2741) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:10:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [79/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7523) loss 0.6384 (0.6449) grad_norm 0.3294 (0.2782) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:10:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 79 training takes 0:05:02 [2024-03-08 23:10:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [80/800][0/402] eta 0:21:41 lr 0.000025 time 3.2376 (3.2376) loss 0.6242 (0.6242) grad_norm 0.2877 (0.2877) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:11:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[80/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7704) loss 0.6557 (0.6449) grad_norm 0.2197 (0.2858) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:13:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [80/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7582) loss 0.6298 (0.6470) grad_norm 0.4871 (0.2844) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:14:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [80/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7541) loss 0.6522 (0.6463) grad_norm 0.2987 (0.2869) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:15:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [80/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7520) loss 0.6346 (0.6463) grad_norm 0.2734 (0.2858) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:15:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 80 training takes 0:05:02 [2024-03-08 23:15:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [81/800][0/402] eta 0:32:05 lr 0.000025 time 4.7887 (4.7887) loss 0.6894 (0.6894) grad_norm 0.2916 (0.2916) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:16:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [81/800][100/402] eta 0:03:57 lr 0.000025 time 0.7456 (0.7857) loss 0.6599 (0.6466) grad_norm 0.2651 (0.2774) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:18:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [81/800][200/402] eta 0:02:34 lr 0.000025 time 0.7461 (0.7659) loss 0.6407 (0.6467) grad_norm 0.2597 (0.2706) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:19:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [81/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7592) loss 0.6414 (0.6464) grad_norm 0.2670 (0.2707) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:20:42 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [81/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7558) loss 0.6149 (0.6460) grad_norm 0.2761 (0.2708) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:20:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 81 training takes 0:05:03 [2024-03-08 23:20:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [82/800][0/402] eta 0:22:29 lr 0.000025 time 3.3571 (3.3571) loss 0.6317 (0.6317) grad_norm 0.3330 (0.3330) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:22:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [82/800][100/402] eta 0:03:53 lr 0.000025 time 0.7458 (0.7716) loss 0.6080 (0.6444) grad_norm 0.3204 (0.2840) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:23:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [82/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7587) loss 0.6607 (0.6440) grad_norm 0.3955 (0.2796) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:24:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [82/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7544) loss 0.6479 (0.6452) grad_norm 0.2296 (0.2746) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:25:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [82/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7522) loss 0.6515 (0.6458) grad_norm 0.2763 (0.2777) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:25:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 82 training takes 0:05:02 [2024-03-08 23:25:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [83/800][0/402] eta 0:22:26 lr 0.000025 time 3.3498 (3.3498) loss 0.6423 (0.6423) grad_norm 0.3051 (0.3051) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:27:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[83/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7714) loss 0.6333 (0.6453) grad_norm 0.3341 (0.2799) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:28:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [83/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7586) loss 0.6634 (0.6435) grad_norm 0.2960 (0.2767) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:29:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [83/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7543) loss 0.6371 (0.6438) grad_norm 0.2714 (0.2731) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:30:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [83/800][400/402] eta 0:00:01 lr 0.000025 time 0.7331 (0.7521) loss 0.5952 (0.6441) grad_norm inf (inf) loss_scale 262144.0000 (296791.4613) mem 28968MB [2024-03-08 23:30:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 83 training takes 0:05:02 [2024-03-08 23:30:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [84/800][0/402] eta 0:22:58 lr 0.000025 time 3.4301 (3.4301) loss 0.6458 (0.6458) grad_norm 0.2413 (0.2413) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:32:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [84/800][100/402] eta 0:03:53 lr 0.000025 time 0.7458 (0.7722) loss 0.6443 (0.6436) grad_norm 0.2284 (0.2682) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:33:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [84/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7590) loss 0.6412 (0.6446) grad_norm 0.3307 (0.2634) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:34:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [84/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7546) loss 0.6329 (0.6450) grad_norm 0.3626 (0.2713) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:35:49 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [84/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7523) loss 0.5984 (0.6453) grad_norm 0.3743 (0.2749) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:35:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 84 training takes 0:05:02 [2024-03-08 23:35:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [85/800][0/402] eta 0:21:59 lr 0.000025 time 3.2815 (3.2815) loss 0.6461 (0.6461) grad_norm 0.2555 (0.2555) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:37:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [85/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7707) loss 0.6281 (0.6446) grad_norm 0.2674 (0.2599) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:38:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [85/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7582) loss 0.6582 (0.6449) grad_norm 0.2012 (0.2707) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:39:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [85/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7541) loss 0.6344 (0.6449) grad_norm 0.2804 (0.2716) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:40:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [85/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7519) loss 0.6331 (0.6439) grad_norm 0.2275 (0.2686) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:40:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 85 training takes 0:05:02 [2024-03-08 23:40:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [86/800][0/402] eta 0:31:54 lr 0.000025 time 4.7635 (4.7635) loss 0.6620 (0.6620) grad_norm 0.3092 (0.3092) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:42:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[86/800][100/402] eta 0:03:57 lr 0.000025 time 0.7455 (0.7863) loss 0.6606 (0.6444) grad_norm 0.2932 (0.2734) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-08 23:43:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [86/800][200/402] eta 0:02:34 lr 0.000025 time 0.7471 (0.7665) loss 0.6336 (0.6430) grad_norm 0.2198 (inf) loss_scale 131072.0000 (245189.4129) mem 28968MB [2024-03-08 23:44:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [86/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7599) loss 0.6283 (0.6426) grad_norm 0.2561 (inf) loss_scale 131072.0000 (207276.6512) mem 28968MB [2024-03-08 23:45:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [86/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7563) loss 0.6335 (0.6429) grad_norm 0.2273 (inf) loss_scale 131072.0000 (188272.9975) mem 28968MB [2024-03-08 23:45:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 86 training takes 0:05:04 [2024-03-08 23:46:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [87/800][0/402] eta 0:22:20 lr 0.000025 time 3.3354 (3.3354) loss 0.6592 (0.6592) grad_norm 0.2001 (0.2001) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:47:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [87/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7713) loss 0.6312 (0.6461) grad_norm 0.2751 (0.2637) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:48:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [87/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7585) loss 0.6353 (0.6453) grad_norm 0.2601 (0.2701) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:49:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [87/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7542) loss 0.6552 (0.6444) grad_norm 0.2352 (0.2665) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:50:58 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [87/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7520) loss 0.6035 (0.6438) grad_norm 0.2633 (0.2642) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:50:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 87 training takes 0:05:02 [2024-03-08 23:51:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [88/800][0/402] eta 0:22:11 lr 0.000025 time 3.3125 (3.3125) loss 0.6678 (0.6678) grad_norm 0.3050 (0.3050) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:52:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [88/800][100/402] eta 0:03:52 lr 0.000025 time 0.7461 (0.7711) loss 0.6220 (0.6441) grad_norm 0.2268 (0.2588) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:53:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [88/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7584) loss 0.6255 (0.6436) grad_norm 0.2615 (0.2626) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:54:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [88/800][300/402] eta 0:01:16 lr 0.000025 time 0.7463 (0.7542) loss 0.6365 (0.6446) grad_norm 0.2532 (0.2630) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:56:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [88/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7520) loss 0.6391 (0.6442) grad_norm 0.2605 (0.2623) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:56:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 88 training takes 0:05:02 [2024-03-08 23:56:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [89/800][0/402] eta 0:23:02 lr 0.000025 time 3.4395 (3.4395) loss 0.6622 (0.6622) grad_norm 0.2854 (0.2854) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:57:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[89/800][100/402] eta 0:03:53 lr 0.000025 time 0.7457 (0.7723) loss 0.6503 (0.6498) grad_norm 0.2524 (0.2588) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:58:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [89/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7590) loss 0.6457 (0.6463) grad_norm 0.2615 (0.2558) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-08 23:59:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [89/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7546) loss 0.6297 (0.6446) grad_norm 0.2619 (0.2586) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:01:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [89/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7524) loss 0.6523 (0.6437) grad_norm 0.2402 (0.2586) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:01:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 89 training takes 0:05:02 [2024-03-09 00:01:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [90/800][0/402] eta 0:22:44 lr 0.000025 time 3.3937 (3.3937) loss 0.6384 (0.6384) grad_norm 0.3351 (0.3351) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:02:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [90/800][100/402] eta 0:03:53 lr 0.000025 time 0.7451 (0.7719) loss 0.6613 (0.6418) grad_norm 0.2463 (0.2641) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:03:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [90/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7589) loss 0.6768 (0.6427) grad_norm 0.2031 (0.2626) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:04:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [90/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7545) loss 0.6680 (0.6436) grad_norm 0.2570 (0.2581) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:06:05 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [90/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7523) loss 0.6171 (0.6432) grad_norm 0.1861 (0.2597) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:06:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 90 training takes 0:05:02 [2024-03-09 00:06:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [91/800][0/402] eta 0:31:57 lr 0.000025 time 4.7689 (4.7689) loss 0.6200 (0.6200) grad_norm 0.2305 (0.2305) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:07:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [91/800][100/402] eta 0:03:57 lr 0.000025 time 0.7456 (0.7854) loss 0.6735 (0.6444) grad_norm 0.2556 (0.2518) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:08:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [91/800][200/402] eta 0:02:34 lr 0.000025 time 0.7456 (0.7656) loss 0.6410 (0.6449) grad_norm 0.2066 (0.2493) loss_scale 262144.0000 (154547.5821) mem 28968MB [2024-03-09 00:09:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [91/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7590) loss 0.6379 (0.6439) grad_norm 0.2779 (0.2539) loss_scale 262144.0000 (190293.9003) mem 28968MB [2024-03-09 00:11:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [91/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7556) loss 0.6649 (0.6435) grad_norm 0.2511 (0.2553) loss_scale 262144.0000 (208211.6309) mem 28968MB [2024-03-09 00:11:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 91 training takes 0:05:03 [2024-03-09 00:11:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [92/800][0/402] eta 0:22:02 lr 0.000025 time 3.2896 (3.2896) loss 0.6202 (0.6202) grad_norm 0.2503 (0.2503) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:12:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[92/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7708) loss 0.6392 (0.6467) grad_norm 0.1526 (0.2411) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:13:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [92/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7583) loss 0.6408 (0.6449) grad_norm 0.2536 (0.2446) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:14:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [92/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7541) loss 0.6422 (0.6440) grad_norm 0.2216 (0.2480) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:16:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [92/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7520) loss 0.6284 (0.6442) grad_norm 0.1882 (0.2481) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:16:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 92 training takes 0:05:02 [2024-03-09 00:16:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [93/800][0/402] eta 0:22:12 lr 0.000025 time 3.3145 (3.3145) loss 0.6212 (0.6212) grad_norm 0.2396 (0.2396) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:17:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [93/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7710) loss 0.6300 (0.6431) grad_norm 0.1899 (0.2524) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:18:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [93/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7584) loss 0.6594 (0.6432) grad_norm 0.2577 (0.2513) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:19:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [93/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7542) loss 0.6005 (0.6426) grad_norm 0.2980 (0.2492) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:21:14 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [93/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7520) loss 0.6462 (0.6429) grad_norm 0.1782 (0.2512) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:21:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 93 training takes 0:05:02 [2024-03-09 00:21:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [94/800][0/402] eta 0:22:41 lr 0.000025 time 3.3869 (3.3869) loss 0.6419 (0.6419) grad_norm 0.2165 (0.2165) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:22:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [94/800][100/402] eta 0:03:53 lr 0.000025 time 0.7454 (0.7717) loss 0.6101 (0.6429) grad_norm 0.2006 (inf) loss_scale 131072.0000 (193363.6436) mem 28968MB [2024-03-09 00:23:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [94/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7587) loss 0.6447 (0.6425) grad_norm 0.2727 (inf) loss_scale 131072.0000 (162372.7761) mem 28968MB [2024-03-09 00:25:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [94/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7544) loss 0.6536 (0.6437) grad_norm 0.2457 (inf) loss_scale 131072.0000 (151973.8472) mem 28968MB [2024-03-09 00:26:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [94/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7522) loss 0.6698 (0.6437) grad_norm 0.1958 (inf) loss_scale 131072.0000 (146761.4165) mem 28968MB [2024-03-09 00:26:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 94 training takes 0:05:02 [2024-03-09 00:26:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [95/800][0/402] eta 0:21:42 lr 0.000025 time 3.2388 (3.2388) loss 0.6459 (0.6459) grad_norm 0.2254 (0.2254) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:27:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [95/800][100/402] eta 
0:03:52 lr 0.000025 time 0.7460 (0.7704) loss 0.6356 (0.6430) grad_norm 0.2030 (0.2409) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:28:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [95/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7581) loss 0.6583 (0.6413) grad_norm 0.1817 (0.2376) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:30:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [95/800][300/402] eta 0:01:16 lr 0.000025 time 0.7463 (0.7540) loss 0.6669 (0.6414) grad_norm 0.2335 (0.2389) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:31:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [95/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7519) loss 0.6435 (0.6415) grad_norm 0.2201 (0.2397) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:31:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 95 training takes 0:05:02 [2024-03-09 00:31:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [96/800][0/402] eta 0:32:17 lr 0.000025 time 4.8184 (4.8184) loss 0.6780 (0.6780) grad_norm 0.2102 (0.2102) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:32:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [96/800][100/402] eta 0:03:57 lr 0.000025 time 0.7467 (0.7872) loss 0.6262 (0.6426) grad_norm 0.2140 (0.2370) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:33:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [96/800][200/402] eta 0:02:34 lr 0.000025 time 0.7467 (0.7672) loss 0.6470 (0.6428) grad_norm 0.2346 (0.2464) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:35:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [96/800][300/402] eta 0:01:17 lr 0.000025 time 0.7465 (0.7604) loss 0.6342 (0.6416) grad_norm 0.2359 (0.2441) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:36:23 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [96/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7570) loss 0.6469 (0.6420) grad_norm 0.4481 (0.2420) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:36:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 96 training takes 0:05:04 [2024-03-09 00:36:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [97/800][0/402] eta 0:22:20 lr 0.000025 time 3.3356 (3.3356) loss 0.6146 (0.6146) grad_norm 0.2214 (0.2214) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:37:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [97/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7713) loss 0.6154 (0.6392) grad_norm 0.2102 (0.2571) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:38:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [97/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7585) loss 0.6484 (0.6400) grad_norm 0.2159 (0.2462) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:40:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [97/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7543) loss 0.6521 (0.6407) grad_norm 0.1969 (0.2393) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:41:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [97/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7521) loss 0.6283 (0.6413) grad_norm 0.2260 (0.2365) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:41:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 97 training takes 0:05:02 [2024-03-09 00:41:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [98/800][0/402] eta 0:22:02 lr 0.000025 time 3.2893 (3.2893) loss 0.6450 (0.6450) grad_norm 0.2280 (0.2280) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:42:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[98/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7709) loss 0.6431 (0.6430) grad_norm 0.2898 (0.2367) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:43:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [98/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7584) loss 0.6592 (0.6430) grad_norm 0.2282 (0.2448) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:45:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [98/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7542) loss 0.6749 (0.6425) grad_norm 0.1700 (0.2390) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:46:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [98/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7520) loss 0.6731 (0.6422) grad_norm 0.2203 (0.2345) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:46:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 98 training takes 0:05:02 [2024-03-09 00:46:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [99/800][0/402] eta 0:22:00 lr 0.000025 time 3.2838 (3.2838) loss 0.6501 (0.6501) grad_norm 0.2022 (0.2022) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 00:47:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [99/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7709) loss 0.6263 (0.6413) grad_norm 0.1936 (0.2475) loss_scale 262144.0000 (212829.7822) mem 28968MB [2024-03-09 00:49:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [99/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7583) loss 0.6552 (0.6428) grad_norm 0.2299 (0.2419) loss_scale 262144.0000 (237364.2189) mem 28968MB [2024-03-09 00:50:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [99/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7541) loss 0.6656 (0.6425) grad_norm 0.1911 (0.2352) loss_scale 262144.0000 (245596.7043) mem 28968MB [2024-03-09 00:51:30 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [99/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7520) loss 0.6362 (0.6415) grad_norm 0.2313 (0.2333) loss_scale 262144.0000 (249723.2120) mem 28968MB [2024-03-09 00:51:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 99 training takes 0:05:02 [2024-03-09 00:51:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [100/800][0/402] eta 0:22:14 lr 0.000025 time 3.3195 (3.3195) loss 0.6678 (0.6678) grad_norm 0.1867 (0.1867) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:52:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [100/800][100/402] eta 0:03:52 lr 0.000025 time 0.7463 (0.7711) loss 0.6141 (0.6419) grad_norm 0.2424 (0.2361) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:54:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [100/800][200/402] eta 0:02:33 lr 0.000025 time 0.7451 (0.7585) loss 0.6410 (0.6415) grad_norm 0.2473 (0.2301) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:55:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [100/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7542) loss 0.6218 (0.6415) grad_norm 0.2594 (0.2299) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:56:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [100/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7520) loss 0.6530 (0.6415) grad_norm 0.2239 (0.2257) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:56:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 100 training takes 0:05:02 [2024-03-09 00:56:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [101/800][0/402] eta 0:32:08 lr 0.000025 time 4.7975 (4.7975) loss 0.6539 (0.6539) grad_norm 0.1980 (0.1980) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:57:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[101/800][100/402] eta 0:03:57 lr 0.000025 time 0.7464 (0.7860) loss 0.6440 (0.6434) grad_norm 0.2436 (0.2540) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 00:59:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [101/800][200/402] eta 0:02:34 lr 0.000025 time 0.7456 (0.7660) loss 0.6401 (0.6425) grad_norm 0.1651 (0.2387) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:00:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [101/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7593) loss 0.6184 (0.6420) grad_norm 0.2182 (0.2359) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:01:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [101/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7559) loss 0.6469 (0.6416) grad_norm 0.2542 (0.2330) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:01:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 101 training takes 0:05:03 [2024-03-09 01:01:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [102/800][0/402] eta 0:21:41 lr 0.000025 time 3.2386 (3.2386) loss 0.6488 (0.6488) grad_norm 0.2458 (0.2458) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:02:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [102/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7704) loss 0.6245 (0.6403) grad_norm 0.1858 (0.2236) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:04:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [102/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7581) loss 0.6641 (0.6403) grad_norm 0.2282 (0.2251) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:05:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [102/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7540) loss 0.6613 (0.6408) grad_norm 0.2204 (0.2273) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 
01:06:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [102/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7519) loss 0.6651 (0.6415) grad_norm 0.1610 (0.2266) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:06:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 102 training takes 0:05:02 [2024-03-09 01:06:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [103/800][0/402] eta 0:21:41 lr 0.000025 time 3.2372 (3.2372) loss 0.6298 (0.6298) grad_norm 0.2140 (0.2140) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:07:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [103/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7704) loss 0.6366 (0.6401) grad_norm 0.2390 (0.2304) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:09:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [103/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7582) loss 0.6353 (0.6403) grad_norm 0.2167 (0.2267) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:10:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [103/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7540) loss 0.6151 (0.6408) grad_norm 0.2589 (0.2266) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:11:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [103/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7519) loss 0.6267 (0.6410) grad_norm 0.2425 (0.2259) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:11:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 103 training takes 0:05:02 [2024-03-09 01:11:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [104/800][0/402] eta 0:22:33 lr 0.000025 time 3.3681 (3.3681) loss 0.6576 (0.6576) grad_norm 0.2301 (0.2301) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:13:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [104/800][100/402] eta 0:03:53 lr 0.000025 time 0.7458 (0.7715) loss 0.6530 (0.6372) grad_norm 0.1557 (inf) loss_scale 262144.0000 (275121.4257) mem 28968MB [2024-03-09 01:14:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [104/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7587) loss 0.6677 (0.6394) grad_norm 0.2057 (inf) loss_scale 262144.0000 (268664.9950) mem 28968MB [2024-03-09 01:15:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [104/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7543) loss 0.6618 (0.6403) grad_norm 0.2164 (inf) loss_scale 262144.0000 (266498.5515) mem 28968MB [2024-03-09 01:16:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [104/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7521) loss 0.6357 (0.6394) grad_norm 0.2459 (inf) loss_scale 262144.0000 (265412.6284) mem 28968MB [2024-03-09 01:16:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 104 training takes 0:05:02 [2024-03-09 01:16:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [105/800][0/402] eta 0:22:07 lr 0.000025 time 3.3013 (3.3013) loss 0.6331 (0.6331) grad_norm 0.2392 (0.2392) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:18:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [105/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7709) loss 0.6519 (0.6388) grad_norm 0.2567 (0.2209) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:19:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [105/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7584) loss 0.6553 (0.6387) grad_norm 0.1910 (0.2221) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:20:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [105/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7542) loss 0.6565 (0.6403) grad_norm 0.1969 (0.2199) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:21:46 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [105/800][400/402] eta 0:00:01 lr 0.000025 time 0.7437 (0.7520) loss 0.6376 (0.6405) grad_norm 0.1957 (0.2191) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:21:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 105 training takes 0:05:02 [2024-03-09 01:21:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [106/800][0/402] eta 0:31:40 lr 0.000025 time 4.7279 (4.7279) loss 0.6509 (0.6509) grad_norm 0.2231 (0.2231) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:23:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [106/800][100/402] eta 0:03:57 lr 0.000025 time 0.7469 (0.7861) loss 0.6504 (0.6385) grad_norm 0.1788 (0.2180) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:24:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [106/800][200/402] eta 0:02:34 lr 0.000025 time 0.7465 (0.7665) loss 0.6467 (0.6397) grad_norm 0.2267 (0.2189) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:25:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [106/800][300/402] eta 0:01:17 lr 0.000025 time 0.7466 (0.7600) loss 0.6567 (0.6403) grad_norm 0.2117 (0.2202) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:26:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [106/800][400/402] eta 0:00:01 lr 0.000025 time 0.7466 (0.7567) loss 0.6271 (0.6411) grad_norm 0.2260 (0.2188) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:26:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 106 training takes 0:05:04 [2024-03-09 01:26:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [107/800][0/402] eta 0:21:56 lr 0.000025 time 3.2744 (3.2744) loss 0.6485 (0.6485) grad_norm 0.1864 (0.1864) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:28:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[107/800][100/402] eta 0:03:52 lr 0.000025 time 0.7462 (0.7707) loss 0.6846 (0.6390) grad_norm 0.2524 (0.2290) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 01:29:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [107/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7582) loss 0.6053 (0.6406) grad_norm 0.2042 (inf) loss_scale 131072.0000 (236060.0199) mem 28968MB [2024-03-09 01:30:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [107/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7541) loss 0.6556 (0.6416) grad_norm 0.1844 (inf) loss_scale 131072.0000 (201180.2791) mem 28968MB [2024-03-09 01:31:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [107/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7520) loss 0.6369 (0.6412) grad_norm 0.2720 (inf) loss_scale 131072.0000 (183696.9177) mem 28968MB [2024-03-09 01:31:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 107 training takes 0:05:02 [2024-03-09 01:31:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [108/800][0/402] eta 0:22:13 lr 0.000025 time 3.3178 (3.3178) loss 0.6425 (0.6425) grad_norm 0.2259 (0.2259) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:33:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [108/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7712) loss 0.6636 (0.6449) grad_norm 0.1984 (0.2186) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:34:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [108/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7585) loss 0.6441 (0.6430) grad_norm 0.1967 (0.2177) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:35:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [108/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7542) loss 0.6261 (0.6420) grad_norm 0.2207 (0.2126) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:36:55 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [108/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.6319 (0.6423) grad_norm 0.2083 (0.2138) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:36:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 108 training takes 0:05:02 [2024-03-09 01:36:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [109/800][0/402] eta 0:21:40 lr 0.000025 time 3.2362 (3.2362) loss 0.6444 (0.6444) grad_norm 0.2107 (0.2107) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:38:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [109/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7703) loss 0.6542 (0.6396) grad_norm 0.1896 (0.2150) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:39:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [109/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7581) loss 0.6263 (0.6395) grad_norm 0.2104 (0.2147) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:40:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [109/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7540) loss 0.6203 (0.6398) grad_norm 0.2127 (0.2121) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:41:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [109/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7519) loss 0.6430 (0.6402) grad_norm 0.2084 (0.2111) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:41:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 109 training takes 0:05:02 [2024-03-09 01:42:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [110/800][0/402] eta 0:22:20 lr 0.000025 time 3.3356 (3.3356) loss 0.6616 (0.6616) grad_norm 0.1865 (0.1865) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:43:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[110/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7714) loss 0.6059 (0.6369) grad_norm 0.2064 (0.2125) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:44:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [110/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7586) loss 0.6280 (0.6387) grad_norm 0.2123 (0.2115) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:45:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [110/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7543) loss 0.6552 (0.6394) grad_norm 0.2525 (0.2102) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:47:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [110/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7521) loss 0.6212 (0.6403) grad_norm 0.1853 (0.2090) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:47:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 110 training takes 0:05:02 [2024-03-09 01:47:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [111/800][0/402] eta 0:31:41 lr 0.000025 time 4.7301 (4.7301) loss 0.6524 (0.6524) grad_norm 0.2433 (0.2433) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:48:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [111/800][100/402] eta 0:03:57 lr 0.000025 time 0.7456 (0.7850) loss 0.6457 (0.6383) grad_norm 0.1927 (0.2065) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:49:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [111/800][200/402] eta 0:02:34 lr 0.000025 time 0.7459 (0.7655) loss 0.6804 (0.6397) grad_norm 0.1530 (0.2053) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:50:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [111/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7589) loss 0.6305 (0.6398) grad_norm 0.1726 (0.2062) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 
01:52:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [111/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7556) loss 0.6494 (0.6399) grad_norm 0.4420 (0.2080) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:52:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 111 training takes 0:05:03 [2024-03-09 01:52:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [112/800][0/402] eta 0:21:15 lr 0.000025 time 3.1731 (3.1731) loss 0.6417 (0.6417) grad_norm 0.2305 (0.2305) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:53:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [112/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7699) loss 0.6254 (0.6411) grad_norm 0.1969 (0.2038) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:54:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [112/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7578) loss 0.6427 (0.6409) grad_norm 0.2260 (inf) loss_scale 131072.0000 (144113.9900) mem 28968MB [2024-03-09 01:55:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [112/800][300/402] eta 0:01:16 lr 0.000025 time 0.7468 (0.7538) loss 0.6319 (0.6412) grad_norm 0.2372 (inf) loss_scale 131072.0000 (139781.1030) mem 28968MB [2024-03-09 01:57:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [112/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7517) loss 0.6341 (0.6410) grad_norm 0.1843 (inf) loss_scale 131072.0000 (137609.2569) mem 28968MB [2024-03-09 01:57:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 112 training takes 0:05:02 [2024-03-09 01:57:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [113/800][0/402] eta 0:22:28 lr 0.000025 time 3.3543 (3.3543) loss 0.6306 (0.6306) grad_norm 0.1919 (0.1919) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:58:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[113/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7715) loss 0.6569 (0.6404) grad_norm 0.2006 (0.2070) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 01:59:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [113/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7587) loss 0.6434 (0.6411) grad_norm 0.1676 (0.2097) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:00:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [113/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7544) loss 0.6408 (0.6417) grad_norm 0.2172 (0.2082) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:02:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [113/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7522) loss 0.6560 (0.6407) grad_norm 0.2228 (0.2108) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:02:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 113 training takes 0:05:02 [2024-03-09 02:02:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [114/800][0/402] eta 0:22:37 lr 0.000025 time 3.3773 (3.3773) loss 0.6585 (0.6585) grad_norm 0.2514 (0.2514) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:03:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [114/800][100/402] eta 0:03:53 lr 0.000025 time 0.7458 (0.7717) loss 0.6287 (0.6395) grad_norm 0.2058 (0.2062) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:04:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [114/800][200/402] eta 0:02:33 lr 0.000025 time 0.7475 (0.7588) loss 0.6192 (0.6387) grad_norm 0.2192 (0.2036) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:05:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [114/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7544) loss 0.6147 (0.6389) grad_norm 0.1961 (0.2016) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 
02:07:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [114/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7522) loss 0.5872 (0.6395) grad_norm 0.3231 (0.2030) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:07:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 114 training takes 0:05:02 [2024-03-09 02:07:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [115/800][0/402] eta 0:22:09 lr 0.000025 time 3.3069 (3.3069) loss 0.6743 (0.6743) grad_norm 0.1973 (0.1973) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:08:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [115/800][100/402] eta 0:03:52 lr 0.000025 time 0.7476 (0.7710) loss 0.6466 (0.6402) grad_norm 0.1828 (0.2032) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:09:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [115/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7584) loss 0.6175 (0.6386) grad_norm 0.2246 (0.2044) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:10:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [115/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7542) loss 0.6374 (0.6392) grad_norm 0.2090 (0.2073) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:12:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [115/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7520) loss 0.6225 (0.6404) grad_norm 0.1828 (0.2072) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:12:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 115 training takes 0:05:02 [2024-03-09 02:12:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [116/800][0/402] eta 0:32:12 lr 0.000025 time 4.8077 (4.8077) loss 0.6325 (0.6325) grad_norm 0.2167 (0.2167) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:13:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [116/800][100/402] eta 0:03:57 lr 0.000025 time 0.7459 (0.7859) loss 0.6612 (0.6392) grad_norm 0.1862 (0.2083) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:14:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [116/800][200/402] eta 0:02:34 lr 0.000025 time 0.7455 (0.7659) loss 0.6325 (0.6399) grad_norm 0.2006 (0.2072) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:16:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [116/800][300/402] eta 0:01:17 lr 0.000025 time 0.7455 (0.7592) loss 0.6437 (0.6405) grad_norm 0.1893 (0.2032) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:17:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [116/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7558) loss 0.6587 (0.6411) grad_norm 0.1756 (0.2003) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:17:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 116 training takes 0:05:03 [2024-03-09 02:17:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [117/800][0/402] eta 0:22:58 lr 0.000025 time 3.4300 (3.4300) loss 0.6278 (0.6278) grad_norm 0.1935 (0.1935) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:18:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [117/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7723) loss 0.6683 (0.6414) grad_norm 0.1825 (0.2122) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:19:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [117/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7591) loss 0.6507 (0.6420) grad_norm 0.1801 (0.2059) loss_scale 262144.0000 (157155.9801) mem 28968MB [2024-03-09 02:21:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [117/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7546) loss 0.6485 (0.6411) grad_norm 0.2188 (0.2010) loss_scale 262144.0000 (192035.7209) mem 28968MB 
[2024-03-09 02:22:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [117/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7524) loss 0.6561 (0.6401) grad_norm 0.2050 (0.2017) loss_scale 262144.0000 (209519.0823) mem 28968MB [2024-03-09 02:22:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 117 training takes 0:05:02 [2024-03-09 02:22:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [118/800][0/402] eta 0:21:59 lr 0.000025 time 3.2826 (3.2826) loss 0.6151 (0.6151) grad_norm 0.1615 (0.1615) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 02:23:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [118/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7708) loss 0.6210 (0.6401) grad_norm 0.1962 (0.1986) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 02:24:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [118/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7583) loss 0.6811 (0.6394) grad_norm 0.1876 (0.2014) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 02:26:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [118/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7541) loss 0.6557 (0.6405) grad_norm 0.1972 (0.2020) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 02:27:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [118/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7520) loss 0.6093 (0.6398) grad_norm 0.2024 (0.1999) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 02:27:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 118 training takes 0:05:02 [2024-03-09 02:27:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [119/800][0/402] eta 0:21:57 lr 0.000025 time 3.2782 (3.2782) loss 0.6351 (0.6351) grad_norm 0.1736 (0.1736) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 02:28:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [119/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7708) loss 0.6241 (0.6408) grad_norm 0.2531 (0.2032) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 02:29:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [119/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7584) loss 0.6461 (0.6403) grad_norm 0.2039 (0.2012) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 02:31:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [119/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7542) loss 0.6666 (0.6399) grad_norm 0.2031 (0.1986) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 02:32:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [119/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7520) loss 0.6403 (0.6393) grad_norm 0.2277 (0.1997) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 02:32:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 119 training takes 0:05:02 [2024-03-09 02:32:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [120/800][0/402] eta 0:21:45 lr 0.000025 time 3.2465 (3.2465) loss 0.6380 (0.6380) grad_norm 0.1882 (0.1882) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 02:33:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [120/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7703) loss 0.6505 (0.6418) grad_norm 0.1784 (inf) loss_scale 131072.0000 (163515.5644) mem 28968MB [2024-03-09 02:34:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [120/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7581) loss 0.6143 (0.6395) grad_norm 0.2165 (inf) loss_scale 131072.0000 (147374.4876) mem 28968MB [2024-03-09 02:36:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [120/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7540) loss 0.6585 (0.6394) grad_norm 0.2031 (inf) loss_scale 131072.0000 (141958.3787) mem 28968MB 
[2024-03-09 02:37:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [120/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7519) loss 0.6253 (0.6395) grad_norm 0.1821 (inf) loss_scale 131072.0000 (139243.5711) mem 28968MB [2024-03-09 02:37:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 120 training takes 0:05:02 [2024-03-09 02:37:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [121/800][0/402] eta 0:31:18 lr 0.000025 time 4.6720 (4.6720) loss 0.6214 (0.6214) grad_norm 0.2103 (0.2103) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:38:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [121/800][100/402] eta 0:03:57 lr 0.000025 time 0.7454 (0.7848) loss 0.6374 (0.6366) grad_norm 0.1751 (0.1965) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:40:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [121/800][200/402] eta 0:02:34 lr 0.000025 time 0.7464 (0.7654) loss 0.6536 (0.6374) grad_norm 0.1539 (0.1991) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:41:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [121/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7589) loss 0.6548 (0.6376) grad_norm 0.2104 (0.1991) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:42:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [121/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7556) loss 0.6501 (0.6390) grad_norm 0.1847 (0.1983) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:42:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 121 training takes 0:05:03 [2024-03-09 02:42:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [122/800][0/402] eta 0:21:34 lr 0.000025 time 3.2208 (3.2208) loss 0.6130 (0.6130) grad_norm 0.2580 (0.2580) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:43:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [122/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7702) loss 0.6475 (0.6413) grad_norm 0.1813 (0.1992) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:45:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [122/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7580) loss 0.6230 (0.6402) grad_norm 0.2120 (0.1977) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:46:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [122/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7539) loss 0.6535 (0.6403) grad_norm 0.2035 (0.1967) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:47:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [122/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7518) loss 0.6621 (0.6398) grad_norm 0.1876 (0.1960) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:47:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 122 training takes 0:05:02 [2024-03-09 02:47:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [123/800][0/402] eta 0:22:28 lr 0.000025 time 3.3550 (3.3550) loss 0.6755 (0.6755) grad_norm 0.1873 (0.1873) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:48:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [123/800][100/402] eta 0:03:53 lr 0.000025 time 0.7455 (0.7716) loss 0.6272 (0.6389) grad_norm 0.1761 (0.2011) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:50:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [123/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7587) loss 0.6361 (0.6382) grad_norm 0.1991 (0.1988) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:51:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [123/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7544) loss 0.6256 (0.6389) grad_norm 0.2168 (0.1966) loss_scale 131072.0000 (131072.0000) mem 
28968MB [2024-03-09 02:52:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [123/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7522) loss 0.6349 (0.6389) grad_norm 0.1892 (0.1965) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:52:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 123 training takes 0:05:02 [2024-03-09 02:52:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [124/800][0/402] eta 0:22:20 lr 0.000025 time 3.3355 (3.3355) loss 0.6257 (0.6257) grad_norm 0.1831 (0.1831) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:53:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [124/800][100/402] eta 0:03:52 lr 0.000025 time 0.7461 (0.7714) loss 0.6265 (0.6396) grad_norm 0.1926 (0.1905) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:55:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [124/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7586) loss 0.6417 (0.6385) grad_norm 0.1854 (0.1897) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:56:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [124/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7544) loss 0.5901 (0.6387) grad_norm 0.1895 (0.1907) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:57:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [124/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7522) loss 0.6480 (0.6388) grad_norm 0.2161 (0.1904) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:57:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 124 training takes 0:05:02 [2024-03-09 02:57:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [125/800][0/402] eta 0:22:31 lr 0.000025 time 3.3615 (3.3615) loss 0.6813 (0.6813) grad_norm 0.1917 (0.1917) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 02:58:57 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [125/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7715) loss 0.6250 (0.6386) grad_norm 0.1784 (0.1929) loss_scale 262144.0000 (242677.8614) mem 28968MB [2024-03-09 03:00:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [125/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7587) loss 0.6370 (0.6383) grad_norm 0.1704 (0.1905) loss_scale 262144.0000 (252362.5075) mem 28968MB [2024-03-09 03:01:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [125/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7544) loss 0.6335 (0.6374) grad_norm 0.1740 (0.1911) loss_scale 262144.0000 (255612.1728) mem 28968MB [2024-03-09 03:02:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [125/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7522) loss 0.6613 (0.6384) grad_norm 0.1820 (0.1909) loss_scale 262144.0000 (257241.0574) mem 28968MB [2024-03-09 03:02:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 125 training takes 0:05:02 [2024-03-09 03:02:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [126/800][0/402] eta 0:32:22 lr 0.000025 time 4.8327 (4.8327) loss 0.6436 (0.6436) grad_norm 0.1565 (0.1565) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:04:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [126/800][100/402] eta 0:03:57 lr 0.000025 time 0.7461 (0.7861) loss 0.6404 (0.6386) grad_norm 0.1798 (0.1946) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:05:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [126/800][200/402] eta 0:02:34 lr 0.000025 time 0.7455 (0.7660) loss 0.6224 (0.6378) grad_norm 0.1766 (0.1932) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:06:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [126/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7593) loss 0.6283 (0.6380) grad_norm 0.2600 (0.1910) loss_scale 262144.0000 
(262144.0000) mem 28968MB [2024-03-09 03:07:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [126/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7559) loss 0.6204 (0.6389) grad_norm 0.1910 (0.1916) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:07:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 126 training takes 0:05:03 [2024-03-09 03:07:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [127/800][0/402] eta 0:22:21 lr 0.000025 time 3.3378 (3.3378) loss 0.6245 (0.6245) grad_norm 0.1790 (0.1790) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:09:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [127/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7714) loss 0.6653 (0.6385) grad_norm 0.1905 (0.1997) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:10:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [127/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7586) loss 0.6338 (0.6382) grad_norm 0.2459 (0.1929) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:11:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [127/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7544) loss 0.6153 (0.6385) grad_norm 0.2404 (0.1905) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:12:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [127/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7522) loss 0.6270 (0.6385) grad_norm 0.1866 (0.1900) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:12:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 127 training takes 0:05:02 [2024-03-09 03:12:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [128/800][0/402] eta 0:22:05 lr 0.000025 time 3.2969 (3.2969) loss 0.6187 (0.6187) grad_norm 0.2002 (0.2002) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:14:05 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [128/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7710) loss 0.6518 (0.6394) grad_norm 0.1769 (0.1901) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:15:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [128/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7585) loss 0.6192 (0.6391) grad_norm 0.1823 (0.1855) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:16:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [128/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7543) loss 0.6534 (0.6387) grad_norm 0.1558 (0.1858) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:17:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [128/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7521) loss 0.6158 (0.6387) grad_norm 0.2222 (0.1868) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:17:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 128 training takes 0:05:02 [2024-03-09 03:17:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [129/800][0/402] eta 0:21:34 lr 0.000025 time 3.2207 (3.2207) loss 0.6004 (0.6004) grad_norm 0.1902 (0.1902) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:19:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [129/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7701) loss 0.6473 (0.6358) grad_norm 0.1591 (0.1969) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:20:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [129/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7580) loss 0.6361 (0.6395) grad_norm 0.1835 (0.1904) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:21:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [129/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7539) loss 0.6415 (0.6391) grad_norm 0.2065 (0.1905) 
loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:22:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [129/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7518) loss 0.6636 (0.6388) grad_norm 0.2316 (0.1895) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:22:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 129 training takes 0:05:02 [2024-03-09 03:22:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [130/800][0/402] eta 0:21:54 lr 0.000025 time 3.2691 (3.2691) loss 0.6525 (0.6525) grad_norm 0.1840 (0.1840) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:24:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [130/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7707) loss 0.6559 (0.6420) grad_norm 0.1475 (0.1808) loss_scale 524288.0000 (511310.5743) mem 28968MB [2024-03-09 03:25:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [130/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7582) loss 0.6409 (0.6391) grad_norm 0.1895 (nan) loss_scale 262144.0000 (449948.6567) mem 28968MB [2024-03-09 03:26:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [130/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7541) loss 0.6447 (0.6395) grad_norm 0.1894 (nan) loss_scale 262144.0000 (387555.0831) mem 28968MB [2024-03-09 03:27:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [130/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7520) loss 0.6316 (0.6389) grad_norm 0.2205 (nan) loss_scale 262144.0000 (356280.4988) mem 28968MB [2024-03-09 03:27:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 130 training takes 0:05:02 [2024-03-09 03:28:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [131/800][0/402] eta 0:32:22 lr 0.000025 time 4.8319 (4.8319) loss 0.6278 (0.6278) grad_norm 0.1863 (0.1863) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:29:14 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [131/800][100/402] eta 0:03:57 lr 0.000025 time 0.7452 (0.7860) loss 0.6155 (0.6384) grad_norm 0.2102 (0.1881) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:30:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [131/800][200/402] eta 0:02:34 lr 0.000025 time 0.7454 (0.7660) loss 0.5889 (0.6369) grad_norm 0.1619 (0.1895) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:31:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [131/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7592) loss 0.6083 (0.6377) grad_norm 0.2133 (0.1913) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:32:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [131/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7558) loss 0.6248 (0.6376) grad_norm 0.1862 (0.1886) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:32:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 131 training takes 0:05:03 [2024-03-09 03:33:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [132/800][0/402] eta 0:22:07 lr 0.000025 time 3.3012 (3.3012) loss 0.6374 (0.6374) grad_norm 0.1776 (0.1776) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:34:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [132/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7711) loss 0.6268 (0.6356) grad_norm 0.1831 (0.1830) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:35:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [132/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7585) loss 0.6425 (0.6362) grad_norm 0.2877 (0.1830) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:36:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [132/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7543) loss 0.6612 (0.6376) grad_norm 0.2275 (0.1844) 
loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:38:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [132/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7521) loss 0.6241 (0.6380) grad_norm 0.2285 (0.1844) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:38:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 132 training takes 0:05:02 [2024-03-09 03:38:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [133/800][0/402] eta 0:21:37 lr 0.000025 time 3.2269 (3.2269) loss 0.6403 (0.6403) grad_norm 0.2191 (0.2191) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:39:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [133/800][100/402] eta 0:03:52 lr 0.000025 time 0.7462 (0.7702) loss 0.6511 (0.6384) grad_norm 0.1669 (0.1843) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:40:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [133/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7580) loss 0.6441 (0.6383) grad_norm 0.1802 (0.1853) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:41:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [133/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7539) loss 0.6502 (0.6381) grad_norm 0.2383 (0.1864) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:43:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [133/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7518) loss 0.6397 (0.6377) grad_norm 0.1707 (0.1852) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:43:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 133 training takes 0:05:02 [2024-03-09 03:43:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [134/800][0/402] eta 0:22:00 lr 0.000025 time 3.2860 (3.2860) loss 0.6273 (0.6273) grad_norm 0.1748 (0.1748) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 
03:44:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [134/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7708) loss 0.6660 (0.6398) grad_norm 0.1913 (0.1774) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:45:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [134/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7584) loss 0.6681 (0.6383) grad_norm 0.1979 (0.1786) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:46:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [134/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7542) loss 0.6478 (0.6381) grad_norm 0.1710 (0.1803) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:48:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [134/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7520) loss 0.6528 (0.6380) grad_norm 0.1867 (0.1822) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:48:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 134 training takes 0:05:02 [2024-03-09 03:48:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [135/800][0/402] eta 0:21:41 lr 0.000025 time 3.2375 (3.2375) loss 0.6134 (0.6134) grad_norm 0.1618 (0.1618) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:49:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [135/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7703) loss 0.6321 (0.6424) grad_norm 0.1421 (0.1744) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:50:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [135/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7581) loss 0.6393 (0.6394) grad_norm 0.2375 (0.1785) loss_scale 524288.0000 (343004.3383) mem 28968MB [2024-03-09 03:51:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [135/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7540) loss 0.6410 (0.6375) grad_norm 0.1736 
(0.1814) loss_scale 524288.0000 (403231.4684) mem 28968MB [2024-03-09 03:53:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [135/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7519) loss 0.6206 (0.6374) grad_norm 0.1695 (0.1817) loss_scale 524288.0000 (433420.1297) mem 28968MB [2024-03-09 03:53:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 135 training takes 0:05:02 [2024-03-09 03:53:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [136/800][0/402] eta 0:31:20 lr 0.000025 time 4.6780 (4.6780) loss 0.6398 (0.6398) grad_norm 0.2067 (0.2067) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 03:54:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [136/800][100/402] eta 0:03:56 lr 0.000025 time 0.7459 (0.7844) loss 0.6385 (0.6379) grad_norm 0.1478 (inf) loss_scale 262144.0000 (389322.7723) mem 28968MB [2024-03-09 03:55:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [136/800][200/402] eta 0:02:34 lr 0.000025 time 0.7458 (0.7652) loss 0.6365 (0.6380) grad_norm 0.1685 (inf) loss_scale 262144.0000 (326049.7512) mem 28968MB [2024-03-09 03:56:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [136/800][300/402] eta 0:01:17 lr 0.000025 time 0.7453 (0.7587) loss 0.6521 (0.6391) grad_norm 0.1771 (inf) loss_scale 262144.0000 (304818.6047) mem 28968MB [2024-03-09 03:58:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [136/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7554) loss 0.6338 (0.6384) grad_norm 0.1449 (inf) loss_scale 262144.0000 (294176.5586) mem 28968MB [2024-03-09 03:58:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 136 training takes 0:05:03 [2024-03-09 03:58:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [137/800][0/402] eta 0:22:23 lr 0.000025 time 3.3430 (3.3430) loss 0.6678 (0.6678) grad_norm 0.2107 (0.2107) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 03:59:30 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [137/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7712) loss 0.6196 (0.6368) grad_norm 0.1921 (inf) loss_scale 131072.0000 (215425.2673) mem 28968MB [2024-03-09 04:00:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [137/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7585) loss 0.6273 (0.6363) grad_norm 0.1721 (inf) loss_scale 131072.0000 (173458.4677) mem 28968MB [2024-03-09 04:01:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [137/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7542) loss 0.6661 (0.6371) grad_norm 0.1544 (inf) loss_scale 131072.0000 (159376.5847) mem 28968MB [2024-03-09 04:03:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [137/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7521) loss 0.6308 (0.6377) grad_norm 0.2321 (inf) loss_scale 131072.0000 (152318.0848) mem 28968MB [2024-03-09 04:03:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 137 training takes 0:05:02 [2024-03-09 04:03:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [138/800][0/402] eta 0:21:55 lr 0.000025 time 3.2721 (3.2721) loss 0.6095 (0.6095) grad_norm 0.2300 (0.2300) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:04:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [138/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7706) loss 0.6358 (0.6363) grad_norm 0.1766 (0.1755) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:05:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [138/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7582) loss 0.6557 (0.6371) grad_norm 0.1722 (0.1741) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:07:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [138/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7541) loss 0.6643 (0.6367) grad_norm 0.1661 (0.1766) loss_scale 
131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:08:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [138/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7519) loss 0.6236 (0.6365) grad_norm 0.2589 (0.1769) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:08:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 138 training takes 0:05:02 [2024-03-09 04:08:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [139/800][0/402] eta 0:22:06 lr 0.000025 time 3.2994 (3.2994) loss 0.6462 (0.6462) grad_norm 0.1731 (0.1731) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:09:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [139/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7709) loss 0.6588 (0.6395) grad_norm 0.1713 (0.1717) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:10:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [139/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7584) loss 0.6279 (0.6363) grad_norm 0.1459 (0.1732) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:12:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [139/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7541) loss 0.6380 (0.6374) grad_norm 0.1303 (0.1741) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:13:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [139/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7520) loss 0.6277 (0.6377) grad_norm 0.1907 (0.1726) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:13:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 139 training takes 0:05:02 [2024-03-09 04:13:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [140/800][0/402] eta 0:21:40 lr 0.000025 time 3.2355 (3.2355) loss 0.6084 (0.6084) grad_norm 0.1592 (0.1592) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:14:37 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [140/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7704) loss 0.6085 (0.6378) grad_norm 0.1963 (0.1787) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:15:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [140/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7581) loss 0.6308 (0.6352) grad_norm 0.3374 (0.1801) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:17:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [140/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7540) loss 0.6081 (0.6350) grad_norm 0.1850 (0.1800) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:18:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [140/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7519) loss 0.6088 (0.6356) grad_norm 0.2065 (0.1788) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:18:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 140 training takes 0:05:02 [2024-03-09 04:18:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [141/800][0/402] eta 0:32:57 lr 0.000025 time 4.9192 (4.9192) loss 0.6397 (0.6397) grad_norm 0.1743 (0.1743) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:19:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [141/800][100/402] eta 0:03:57 lr 0.000025 time 0.7456 (0.7871) loss 0.6294 (0.6396) grad_norm 0.1802 (0.1756) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:20:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [141/800][200/402] eta 0:02:34 lr 0.000025 time 0.7461 (0.7666) loss 0.6410 (0.6387) grad_norm 0.1872 (0.1732) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:22:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [141/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7596) loss 0.6255 (0.6390) grad_norm 0.1838 (0.1746) 
loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:23:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [141/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7561) loss 0.6167 (0.6379) grad_norm 0.1526 (0.1757) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:23:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 141 training takes 0:05:04 [2024-03-09 04:23:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [142/800][0/402] eta 0:22:12 lr 0.000025 time 3.3153 (3.3153) loss 0.6481 (0.6481) grad_norm 0.1639 (0.1639) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 04:24:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [142/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7711) loss 0.6191 (0.6382) grad_norm 0.1569 (0.1716) loss_scale 262144.0000 (190768.1584) mem 28968MB [2024-03-09 04:25:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [142/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7585) loss 0.6366 (0.6374) grad_norm 0.1461 (0.1718) loss_scale 262144.0000 (226278.5274) mem 28968MB [2024-03-09 04:27:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [142/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7543) loss 0.6483 (0.6365) grad_norm 0.1479 (0.1742) loss_scale 262144.0000 (238193.9668) mem 28968MB [2024-03-09 04:28:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [142/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7521) loss 0.6257 (0.6373) grad_norm 0.1508 (0.1748) loss_scale 262144.0000 (244166.5436) mem 28968MB [2024-03-09 04:28:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 142 training takes 0:05:02 [2024-03-09 04:28:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [143/800][0/402] eta 0:22:19 lr 0.000025 time 3.3319 (3.3319) loss 0.6481 (0.6481) grad_norm 0.1882 (0.1882) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 
04:29:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [143/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7713) loss 0.6784 (0.6386) grad_norm 0.1584 (0.1665) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:31:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [143/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7586) loss 0.6249 (0.6373) grad_norm 0.2546 (0.1727) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:32:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [143/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7543) loss 0.6314 (0.6368) grad_norm 0.1557 (0.1753) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:33:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [143/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.6434 (0.6371) grad_norm 0.1466 (0.1735) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:33:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 143 training takes 0:05:02 [2024-03-09 04:33:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [144/800][0/402] eta 0:22:24 lr 0.000025 time 3.3456 (3.3456) loss 0.6869 (0.6869) grad_norm 0.1396 (0.1396) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:34:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [144/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7714) loss 0.6496 (0.6370) grad_norm 0.1642 (0.1777) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:36:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [144/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7587) loss 0.6266 (0.6374) grad_norm 0.1704 (0.1761) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:37:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [144/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7544) loss 0.6185 (0.6375) grad_norm 0.1695 
(0.1734) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:38:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [144/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7522) loss 0.6610 (0.6376) grad_norm 0.2109 (0.1734) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:38:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 144 training takes 0:05:02 [2024-03-09 04:38:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [145/800][0/402] eta 0:22:11 lr 0.000025 time 3.3130 (3.3130) loss 0.6453 (0.6453) grad_norm 0.1335 (0.1335) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:39:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [145/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7711) loss 0.6493 (0.6359) grad_norm 0.1780 (0.1715) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:41:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [145/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7585) loss 0.6313 (0.6387) grad_norm 0.1748 (0.1734) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:42:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [145/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7543) loss 0.6412 (0.6380) grad_norm 0.1998 (0.1737) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:43:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [145/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.6080 (0.6373) grad_norm 0.1986 (0.1728) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:43:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 145 training takes 0:05:02 [2024-03-09 04:43:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [146/800][0/402] eta 0:31:18 lr 0.000025 time 4.6731 (4.6731) loss 0.6328 (0.6328) grad_norm 0.1652 (0.1652) loss_scale 262144.0000 (262144.0000) mem 28968MB 
[2024-03-09 04:44:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [146/800][100/402] eta 0:03:56 lr 0.000025 time 0.7459 (0.7845) loss 0.6307 (0.6394) grad_norm 0.1621 (0.1667) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:46:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [146/800][200/402] eta 0:02:34 lr 0.000025 time 0.7459 (0.7652) loss 0.6394 (0.6375) grad_norm 0.1546 (0.1691) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:47:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [146/800][300/402] eta 0:01:17 lr 0.000025 time 0.7452 (0.7588) loss 0.6466 (0.6368) grad_norm 0.1566 (0.1693) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:48:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [146/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7555) loss 0.6251 (0.6367) grad_norm 0.1940 (0.1695) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:48:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 146 training takes 0:05:03 [2024-03-09 04:48:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [147/800][0/402] eta 0:22:06 lr 0.000025 time 3.3000 (3.3000) loss 0.6598 (0.6598) grad_norm 0.1505 (0.1505) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:49:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [147/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7711) loss 0.6126 (0.6361) grad_norm 0.1775 (0.1672) loss_scale 524288.0000 (407491.1683) mem 28968MB [2024-03-09 04:51:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [147/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7585) loss 0.6043 (0.6362) grad_norm 0.1585 (inf) loss_scale 262144.0000 (358654.7264) mem 28968MB [2024-03-09 04:52:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [147/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7542) loss 0.6260 (0.6356) 
grad_norm 0.1376 (inf) loss_scale 262144.0000 (326591.3621) mem 28968MB [2024-03-09 04:53:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [147/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7521) loss 0.6559 (0.6354) grad_norm 0.1441 (inf) loss_scale 262144.0000 (310519.7007) mem 28968MB [2024-03-09 04:53:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 147 training takes 0:05:02 [2024-03-09 04:53:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [148/800][0/402] eta 0:22:28 lr 0.000025 time 3.3553 (3.3553) loss 0.6359 (0.6359) grad_norm 0.1589 (0.1589) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:54:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [148/800][100/402] eta 0:03:53 lr 0.000025 time 0.7461 (0.7715) loss 0.6051 (0.6346) grad_norm 0.3022 (0.1718) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:56:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [148/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7587) loss 0.6267 (0.6367) grad_norm 0.2042 (0.1728) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:57:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [148/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7544) loss 0.6374 (0.6365) grad_norm 0.1644 (0.1716) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:58:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [148/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7522) loss 0.6472 (0.6362) grad_norm 0.1365 (0.1700) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 04:58:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 148 training takes 0:05:02 [2024-03-09 04:58:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [149/800][0/402] eta 0:22:14 lr 0.000025 time 3.3201 (3.3201) loss 0.6589 (0.6589) grad_norm 0.1346 (0.1346) loss_scale 262144.0000 (262144.0000) mem 
28968MB [2024-03-09 05:00:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [149/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7711) loss 0.6117 (0.6374) grad_norm 0.2119 (0.1684) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:01:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [149/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7585) loss 0.6529 (0.6367) grad_norm 0.1853 (0.1689) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:02:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [149/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7543) loss 0.6461 (0.6372) grad_norm 0.1337 (0.1708) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:03:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [149/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7521) loss 0.6214 (0.6371) grad_norm 0.1500 (0.1687) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:03:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 149 training takes 0:05:02 [2024-03-09 05:03:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [150/800][0/402] eta 0:21:45 lr 0.000025 time 3.2476 (3.2476) loss 0.6118 (0.6118) grad_norm 0.1512 (0.1512) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:05:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [150/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7705) loss 0.5989 (0.6344) grad_norm 0.2120 (0.1702) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:06:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [150/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7582) loss 0.6540 (0.6357) grad_norm 0.1718 (0.1691) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:07:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [150/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7540) loss 0.6526 
(0.6363) grad_norm 0.1337 (0.1687) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:08:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [150/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7519) loss 0.6411 (0.6366) grad_norm 0.1757 (0.1689) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:08:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 150 training takes 0:05:02 [2024-03-09 05:08:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [151/800][0/402] eta 0:32:30 lr 0.000025 time 4.8510 (4.8510) loss 0.6078 (0.6078) grad_norm 0.2319 (0.2319) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:10:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [151/800][100/402] eta 0:03:57 lr 0.000025 time 0.7460 (0.7864) loss 0.6217 (0.6335) grad_norm 0.1835 (0.1695) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:11:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [151/800][200/402] eta 0:02:34 lr 0.000025 time 0.7457 (0.7661) loss 0.6265 (0.6350) grad_norm 0.1898 (0.1664) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:12:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [151/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7594) loss 0.6452 (0.6355) grad_norm 0.1897 (0.1672) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:13:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [151/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7560) loss 0.6200 (0.6359) grad_norm 0.1508 (0.1670) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:13:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 151 training takes 0:05:03 [2024-03-09 05:13:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [152/800][0/402] eta 0:21:30 lr 0.000025 time 3.2094 (3.2094) loss 0.6175 (0.6175) grad_norm 0.1523 (0.1523) loss_scale 262144.0000 
(262144.0000) mem 28968MB [2024-03-09 05:15:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [152/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7701) loss 0.6386 (0.6327) grad_norm 0.1518 (0.1689) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:16:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [152/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7580) loss 0.6231 (0.6348) grad_norm 0.2371 (0.1667) loss_scale 524288.0000 (382130.3085) mem 28968MB [2024-03-09 05:17:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [152/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7539) loss 0.6524 (0.6353) grad_norm 0.1304 (inf) loss_scale 262144.0000 (425875.1362) mem 28968MB [2024-03-09 05:18:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [152/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7518) loss 0.6321 (0.6361) grad_norm 0.1556 (nan) loss_scale 131072.0000 (374911.6808) mem 28968MB [2024-03-09 05:18:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 152 training takes 0:05:02 [2024-03-09 05:18:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [153/800][0/402] eta 0:21:54 lr 0.000025 time 3.2701 (3.2701) loss 0.6395 (0.6395) grad_norm 0.1764 (0.1764) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:20:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [153/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7706) loss 0.6339 (0.6358) grad_norm 0.1734 (0.1675) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:21:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [153/800][200/402] eta 0:02:33 lr 0.000025 time 0.7467 (0.7582) loss 0.6389 (0.6367) grad_norm 0.1387 (0.1645) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:22:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [153/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7540) loss 
0.6475 (0.6375) grad_norm 0.1719 (0.1666) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:23:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [153/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7519) loss 0.6501 (0.6369) grad_norm 0.1530 (0.1694) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:23:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 153 training takes 0:05:02 [2024-03-09 05:24:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [154/800][0/402] eta 0:22:54 lr 0.000025 time 3.4181 (3.4181) loss 0.6313 (0.6313) grad_norm 0.1673 (0.1673) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:25:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [154/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7721) loss 0.6397 (0.6322) grad_norm 0.1400 (0.1633) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:26:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [154/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7590) loss 0.6649 (0.6348) grad_norm 0.1785 (0.1634) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:27:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [154/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7546) loss 0.6354 (0.6347) grad_norm 0.1552 (0.1638) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:28:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [154/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7523) loss 0.6203 (0.6345) grad_norm 0.1916 (0.1641) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:29:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 154 training takes 0:05:02 [2024-03-09 05:29:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [155/800][0/402] eta 0:22:06 lr 0.000025 time 3.3008 (3.3008) loss 0.6735 (0.6735) grad_norm 0.1816 (0.1816) loss_scale 131072.0000 
(131072.0000) mem 28968MB [2024-03-09 05:30:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [155/800][100/402] eta 0:03:52 lr 0.000025 time 0.7462 (0.7710) loss 0.6756 (0.6407) grad_norm 0.1834 (0.1609) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:31:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [155/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7585) loss 0.6196 (0.6375) grad_norm 0.1616 (0.1622) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:32:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [155/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7543) loss 0.6659 (0.6358) grad_norm 0.1455 (0.1635) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:34:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [155/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.6255 (0.6357) grad_norm 0.1748 (0.1641) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:34:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 155 training takes 0:05:02 [2024-03-09 05:34:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [156/800][0/402] eta 0:31:22 lr 0.000025 time 4.6817 (4.6817) loss 0.6622 (0.6622) grad_norm 0.1827 (0.1827) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:35:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [156/800][100/402] eta 0:03:57 lr 0.000025 time 0.7478 (0.7852) loss 0.6189 (0.6373) grad_norm 0.1942 (0.1704) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:36:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [156/800][200/402] eta 0:02:34 lr 0.000025 time 0.7462 (0.7662) loss 0.6441 (0.6375) grad_norm 0.2014 (0.1706) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:37:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [156/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7597) 
loss 0.6625 (0.6368) grad_norm 0.1817 (0.1684) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:39:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [156/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7564) loss 0.6691 (0.6370) grad_norm 0.1407 (0.1683) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:39:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 156 training takes 0:05:04 [2024-03-09 05:39:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [157/800][0/402] eta 0:21:47 lr 0.000025 time 3.2516 (3.2516) loss 0.6116 (0.6116) grad_norm 0.1597 (0.1597) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:40:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [157/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7705) loss 0.6358 (0.6357) grad_norm 0.1428 (0.1595) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:41:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [157/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7582) loss 0.5886 (0.6354) grad_norm 0.1723 (0.1598) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:42:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [157/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7540) loss 0.6124 (0.6354) grad_norm 0.1537 (0.1601) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:44:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [157/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7519) loss 0.6547 (0.6356) grad_norm 0.1986 (0.1616) loss_scale 262144.0000 (144473.3766) mem 28968MB [2024-03-09 05:44:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 157 training takes 0:05:02 [2024-03-09 05:44:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [158/800][0/402] eta 0:22:01 lr 0.000025 time 3.2866 (3.2866) loss 0.6611 (0.6611) grad_norm 0.1658 (0.1658) loss_scale 
262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:45:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [158/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7709) loss 0.6350 (0.6368) grad_norm 0.1614 (0.1642) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 05:46:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [158/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7583) loss 0.6573 (0.6351) grad_norm 0.1593 (inf) loss_scale 131072.0000 (209976.0398) mem 28968MB [2024-03-09 05:47:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [158/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7542) loss 0.6526 (0.6356) grad_norm 0.1416 (inf) loss_scale 131072.0000 (183762.0731) mem 28968MB [2024-03-09 05:49:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [158/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7520) loss 0.6581 (0.6356) grad_norm 0.1456 (inf) loss_scale 131072.0000 (170622.4040) mem 28968MB [2024-03-09 05:49:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 158 training takes 0:05:02 [2024-03-09 05:49:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [159/800][0/402] eta 0:22:31 lr 0.000025 time 3.3627 (3.3627) loss 0.6276 (0.6276) grad_norm 0.1405 (0.1405) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:50:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [159/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7717) loss 0.6024 (0.6377) grad_norm 0.1724 (0.1578) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:51:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [159/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7588) loss 0.6549 (0.6366) grad_norm 0.1726 (0.1630) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:52:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [159/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 
(0.7545) loss 0.6529 (0.6366) grad_norm 0.1784 (0.1632) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:54:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [159/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7522) loss 0.6526 (0.6364) grad_norm 0.1425 (0.1637) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:54:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 159 training takes 0:05:02 [2024-03-09 05:54:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [160/800][0/402] eta 0:22:23 lr 0.000025 time 3.3431 (3.3431) loss 0.6612 (0.6612) grad_norm 0.1437 (0.1437) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:55:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [160/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7714) loss 0.6699 (0.6370) grad_norm 0.1424 (0.1680) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:56:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [160/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7587) loss 0.6546 (0.6346) grad_norm 0.1627 (0.1655) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:58:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [160/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7544) loss 0.6171 (0.6356) grad_norm 0.1488 (0.1643) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:59:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [160/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7522) loss 0.6460 (0.6359) grad_norm 0.1385 (0.1640) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 05:59:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 160 training takes 0:05:02 [2024-03-09 05:59:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [161/800][0/402] eta 0:31:17 lr 0.000025 time 4.6709 (4.6709) loss 0.6391 (0.6391) grad_norm 0.1910 (0.1910) loss_scale 
131072.0000 (131072.0000) mem 28968MB [2024-03-09 06:00:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [161/800][100/402] eta 0:03:56 lr 0.000025 time 0.7450 (0.7845) loss 0.6567 (0.6336) grad_norm 0.1808 (0.1596) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 06:01:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [161/800][200/402] eta 0:02:34 lr 0.000025 time 0.7455 (0.7652) loss 0.6758 (0.6355) grad_norm 0.1473 (0.1623) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 06:03:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [161/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7588) loss 0.6277 (0.6353) grad_norm 0.1897 (0.1625) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 06:04:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [161/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7555) loss 0.6248 (0.6351) grad_norm 0.1839 (0.1619) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 06:04:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 161 training takes 0:05:03 [2024-03-09 06:04:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [162/800][0/402] eta 0:22:30 lr 0.000025 time 3.3595 (3.3595) loss 0.6303 (0.6303) grad_norm 0.1540 (0.1540) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 06:05:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [162/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7715) loss 0.6388 (0.6347) grad_norm 0.1449 (0.1649) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 06:06:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [162/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7587) loss 0.6240 (0.6358) grad_norm 0.1156 (0.1606) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 06:08:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [162/800][300/402] eta 0:01:16 lr 0.000025 time 
0.7453 (0.7544) loss 0.6186 (0.6354) grad_norm 0.1636 (0.1613) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 06:09:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [162/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7522) loss 0.6403 (0.6349) grad_norm 0.1303 (0.1608) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 06:09:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 162 training takes 0:05:02 [2024-03-09 06:09:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [163/800][0/402] eta 0:22:22 lr 0.000025 time 3.3384 (3.3384) loss 0.6595 (0.6595) grad_norm 0.1524 (0.1524) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 06:10:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [163/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7713) loss 0.6435 (0.6372) grad_norm 0.1432 (0.1555) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 06:11:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [163/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7586) loss 0.6393 (0.6369) grad_norm 0.2316 (0.1562) loss_scale 262144.0000 (189760.9552) mem 28968MB [2024-03-09 06:13:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [163/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7543) loss 0.6274 (0.6369) grad_norm 0.1597 (0.1575) loss_scale 262144.0000 (213808.4784) mem 28968MB [2024-03-09 06:14:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [163/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7521) loss 0.6490 (0.6361) grad_norm 0.1519 (0.1576) loss_scale 262144.0000 (225862.2244) mem 28968MB [2024-03-09 06:14:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 163 training takes 0:05:02 [2024-03-09 06:14:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [164/800][0/402] eta 0:21:41 lr 0.000025 time 3.2379 (3.2379) loss 0.6379 (0.6379) grad_norm 0.1648 (0.1648) 
loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:15:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [164/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7703) loss 0.5738 (0.6337) grad_norm 0.1462 (0.1574) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:16:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [164/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7580) loss 0.6077 (0.6349) grad_norm 0.1323 (0.1609) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:18:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [164/800][300/402] eta 0:01:16 lr 0.000025 time 0.7472 (0.7542) loss 0.6213 (0.6357) grad_norm 0.1460 (0.1582) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:19:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [164/800][400/402] eta 0:00:01 lr 0.000025 time 0.7457 (0.7523) loss 0.6055 (0.6347) grad_norm 0.1678 (0.1603) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:19:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 164 training takes 0:05:02 [2024-03-09 06:19:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [165/800][0/402] eta 0:22:08 lr 0.000025 time 3.3045 (3.3045) loss 0.6495 (0.6495) grad_norm 0.1614 (0.1614) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:20:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [165/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7709) loss 0.6181 (0.6328) grad_norm 0.1720 (0.1621) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:22:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [165/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7584) loss 0.6513 (0.6356) grad_norm 0.1455 (0.1574) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:23:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [165/800][300/402] eta 0:01:16 lr 
0.000025 time 0.7451 (0.7542) loss 0.6366 (0.6353) grad_norm 0.1492 (0.1583) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:24:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [165/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7520) loss 0.6273 (0.6352) grad_norm 0.1540 (0.1590) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:24:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 165 training takes 0:05:02 [2024-03-09 06:24:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [166/800][0/402] eta 0:32:15 lr 0.000025 time 4.8138 (4.8138) loss 0.5822 (0.5822) grad_norm 0.1968 (0.1968) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:25:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [166/800][100/402] eta 0:03:57 lr 0.000025 time 0.7457 (0.7863) loss 0.6475 (0.6355) grad_norm 0.1421 (0.1671) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:27:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [166/800][200/402] eta 0:02:34 lr 0.000025 time 0.7459 (0.7664) loss 0.6121 (0.6353) grad_norm 0.1549 (0.1607) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:28:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [166/800][300/402] eta 0:01:17 lr 0.000025 time 0.7451 (0.7595) loss 0.6542 (0.6349) grad_norm 0.1362 (0.1609) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:29:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [166/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7562) loss 0.6376 (0.6358) grad_norm 0.1539 (0.1604) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:29:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 166 training takes 0:05:04 [2024-03-09 06:29:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [167/800][0/402] eta 0:22:51 lr 0.000025 time 3.4111 (3.4111) loss 0.6166 (0.6166) grad_norm 
0.1412 (0.1412) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:30:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [167/800][100/402] eta 0:03:53 lr 0.000025 time 0.7453 (0.7721) loss 0.6346 (0.6346) grad_norm 0.1702 (0.1617) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:32:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [167/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7590) loss 0.6356 (0.6329) grad_norm 0.1574 (0.1635) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:33:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [167/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7546) loss 0.6440 (0.6343) grad_norm 0.1628 (0.1631) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:34:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [167/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7524) loss 0.6382 (0.6345) grad_norm 0.1713 (0.1600) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:34:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 167 training takes 0:05:02 [2024-03-09 06:34:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [168/800][0/402] eta 0:22:52 lr 0.000025 time 3.4140 (3.4140) loss 0.6341 (0.6341) grad_norm 0.1774 (0.1774) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:35:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [168/800][100/402] eta 0:03:53 lr 0.000025 time 0.7469 (0.7731) loss 0.6302 (0.6340) grad_norm 0.1552 (0.1588) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:37:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [168/800][200/402] eta 0:02:33 lr 0.000025 time 0.7469 (0.7600) loss 0.6592 (0.6344) grad_norm 0.1309 (0.1602) loss_scale 524288.0000 (392563.9005) mem 28968MB [2024-03-09 06:38:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [168/800][300/402] eta 
0:01:17 lr 0.000025 time 0.7467 (0.7556) loss 0.6224 (0.6339) grad_norm 0.1353 (inf) loss_scale 262144.0000 (379716.8904) mem 28968MB [2024-03-09 06:39:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [168/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7534) loss 0.6379 (0.6344) grad_norm 0.1752 (inf) loss_scale 262144.0000 (350396.9676) mem 28968MB [2024-03-09 06:39:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 168 training takes 0:05:02 [2024-03-09 06:39:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [169/800][0/402] eta 0:22:25 lr 0.000025 time 3.3459 (3.3459) loss 0.6295 (0.6295) grad_norm 0.1471 (0.1471) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:40:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [169/800][100/402] eta 0:03:53 lr 0.000025 time 0.7468 (0.7725) loss 0.6375 (0.6342) grad_norm 0.1487 (0.1587) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:42:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [169/800][200/402] eta 0:02:33 lr 0.000025 time 0.7473 (0.7597) loss 0.6426 (0.6349) grad_norm 0.1097 (0.1596) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:43:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [169/800][300/402] eta 0:01:17 lr 0.000025 time 0.7469 (0.7555) loss 0.6287 (0.6350) grad_norm 0.1267 (0.1587) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:44:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [169/800][400/402] eta 0:00:01 lr 0.000025 time 0.7456 (0.7532) loss 0.6084 (0.6348) grad_norm 0.1684 (0.1591) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:44:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 169 training takes 0:05:02 [2024-03-09 06:44:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [170/800][0/402] eta 0:21:37 lr 0.000025 time 3.2266 (3.2266) loss 0.6326 (0.6326) grad_norm 
0.1323 (0.1323) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:46:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [170/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7702) loss 0.6453 (0.6365) grad_norm 0.1335 (0.1561) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:47:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [170/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7580) loss 0.6400 (0.6367) grad_norm 0.1507 (0.1541) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:48:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [170/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7539) loss 0.6404 (0.6356) grad_norm 0.1581 (0.1549) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:49:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [170/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7518) loss 0.6309 (0.6355) grad_norm 0.1606 (0.1564) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:49:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 170 training takes 0:05:02 [2024-03-09 06:49:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [171/800][0/402] eta 0:32:06 lr 0.000025 time 4.7923 (4.7923) loss 0.6185 (0.6185) grad_norm 0.1534 (0.1534) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:51:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [171/800][100/402] eta 0:03:57 lr 0.000025 time 0.7459 (0.7857) loss 0.6174 (0.6324) grad_norm 0.1429 (0.1529) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:52:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [171/800][200/402] eta 0:02:34 lr 0.000025 time 0.7456 (0.7658) loss 0.6953 (0.6353) grad_norm 0.1394 (0.1551) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:53:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [171/800][300/402] eta 
0:01:17 lr 0.000025 time 0.7455 (0.7592) loss 0.6266 (0.6356) grad_norm 0.1605 (0.1558) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:54:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [171/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7558) loss 0.6608 (0.6347) grad_norm 0.1383 (0.1571) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:54:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 171 training takes 0:05:03 [2024-03-09 06:54:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [172/800][0/402] eta 0:22:19 lr 0.000025 time 3.3328 (3.3328) loss 0.6513 (0.6513) grad_norm 0.1624 (0.1624) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:56:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [172/800][100/402] eta 0:03:52 lr 0.000025 time 0.7462 (0.7713) loss 0.6305 (0.6332) grad_norm 0.2125 (0.1626) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:57:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [172/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7586) loss 0.6063 (0.6345) grad_norm 0.1314 (0.1583) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:58:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [172/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7543) loss 0.6168 (0.6352) grad_norm 0.1395 (0.1577) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:59:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [172/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7521) loss 0.6477 (0.6355) grad_norm 0.1442 (0.1571) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 06:59:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 172 training takes 0:05:02 [2024-03-09 06:59:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [173/800][0/402] eta 0:21:39 lr 0.000025 time 3.2337 (3.2337) loss 0.6125 (0.6125) 
grad_norm 0.1359 (0.1359) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 07:01:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [173/800][100/402] eta 0:03:52 lr 0.000025 time 0.7461 (0.7702) loss 0.6361 (0.6349) grad_norm 0.1169 (0.1554) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 07:02:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [173/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7580) loss 0.5910 (0.6350) grad_norm 0.1852 (0.1560) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 07:03:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [173/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7539) loss 0.6318 (0.6349) grad_norm 0.1740 (0.1571) loss_scale 524288.0000 (327462.2724) mem 28968MB [2024-03-09 07:04:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [173/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7518) loss 0.6124 (0.6350) grad_norm 0.1707 (0.1556) loss_scale 524288.0000 (376545.9950) mem 28968MB [2024-03-09 07:04:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 173 training takes 0:05:02 [2024-03-09 07:04:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [174/800][0/402] eta 0:22:05 lr 0.000025 time 3.2978 (3.2978) loss 0.6520 (0.6520) grad_norm 0.1740 (0.1740) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 07:06:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [174/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7710) loss 0.6493 (0.6328) grad_norm 0.1784 (0.1540) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 07:07:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [174/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7584) loss 0.6179 (0.6342) grad_norm 0.1665 (0.1565) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 07:08:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[174/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7542) loss 0.6235 (0.6340) grad_norm 0.1850 (0.1565) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 07:09:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [174/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7521) loss 0.6295 (0.6338) grad_norm 0.1759 (0.1554) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 07:09:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 174 training takes 0:05:02 [2024-03-09 07:09:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [175/800][0/402] eta 0:21:56 lr 0.000025 time 3.2742 (3.2742) loss 0.6457 (0.6457) grad_norm 0.1569 (0.1569) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 07:11:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [175/800][100/402] eta 0:03:53 lr 0.000025 time 0.7468 (0.7718) loss 0.6091 (0.6340) grad_norm 0.1498 (0.1514) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 07:12:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [175/800][200/402] eta 0:02:33 lr 0.000025 time 0.7473 (0.7594) loss 0.5989 (0.6337) grad_norm 0.1807 (0.1553) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 07:13:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [175/800][300/402] eta 0:01:17 lr 0.000025 time 0.7469 (0.7552) loss 0.6539 (0.6338) grad_norm 0.1462 (0.1556) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 07:14:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [175/800][400/402] eta 0:00:01 lr 0.000025 time 0.7454 (0.7530) loss 0.6330 (0.6341) grad_norm 0.1799 (inf) loss_scale 131072.0000 (498465.8354) mem 28968MB [2024-03-09 07:14:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 175 training takes 0:05:02 [2024-03-09 07:15:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [176/800][0/402] eta 0:31:21 lr 0.000025 time 4.6814 (4.6814) loss 
0.6156 (0.6156) grad_norm 0.1471 (0.1471) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:16:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [176/800][100/402] eta 0:03:57 lr 0.000025 time 0.7478 (0.7857) loss 0.5986 (0.6358) grad_norm 0.1566 (0.1508) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:17:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [176/800][200/402] eta 0:02:34 lr 0.000025 time 0.7482 (0.7663) loss 0.6156 (0.6363) grad_norm 0.1582 (0.1524) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:18:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [176/800][300/402] eta 0:01:17 lr 0.000025 time 0.7475 (0.7598) loss 0.6404 (0.6354) grad_norm 0.1571 (0.1568) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:20:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [176/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7566) loss 0.6564 (0.6351) grad_norm 0.1577 (0.1566) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:20:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 176 training takes 0:05:04 [2024-03-09 07:20:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [177/800][0/402] eta 0:21:46 lr 0.000025 time 3.2501 (3.2501) loss 0.6405 (0.6405) grad_norm 0.1578 (0.1578) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:21:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [177/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7706) loss 0.6399 (0.6345) grad_norm 0.1777 (0.1560) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:22:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [177/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7582) loss 0.6274 (0.6346) grad_norm 0.1488 (0.1561) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:23:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[177/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7540) loss 0.6469 (0.6348) grad_norm 0.1521 (0.1557) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:25:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [177/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7519) loss 0.6277 (0.6358) grad_norm 0.1505 (0.1555) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:25:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 177 training takes 0:05:02 [2024-03-09 07:25:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [178/800][0/402] eta 0:22:47 lr 0.000025 time 3.4019 (3.4019) loss 0.6017 (0.6017) grad_norm 0.1310 (0.1310) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:26:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [178/800][100/402] eta 0:03:53 lr 0.000025 time 0.7454 (0.7719) loss 0.6346 (0.6363) grad_norm 0.1421 (0.1549) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:27:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [178/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7589) loss 0.6323 (0.6371) grad_norm 0.1445 (0.1519) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:28:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [178/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7545) loss 0.6545 (0.6363) grad_norm 0.1565 (0.1528) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:30:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [178/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7523) loss 0.6150 (0.6354) grad_norm 0.1844 (0.1532) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:30:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 178 training takes 0:05:02 [2024-03-09 07:30:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [179/800][0/402] eta 0:21:44 lr 0.000025 time 3.2456 (3.2456) 
loss 0.6274 (0.6274) grad_norm 0.1134 (0.1134) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:31:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [179/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7704) loss 0.6363 (0.6341) grad_norm 0.1510 (0.1502) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:32:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [179/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7581) loss 0.6096 (0.6331) grad_norm 0.1339 (0.1515) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:33:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [179/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7540) loss 0.6405 (0.6339) grad_norm 0.1625 (0.1522) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:35:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [179/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7519) loss 0.6430 (0.6340) grad_norm 0.1148 (0.1530) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:35:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 179 training takes 0:05:02 [2024-03-09 07:35:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [180/800][0/402] eta 0:22:44 lr 0.000025 time 3.3950 (3.3950) loss 0.6335 (0.6335) grad_norm 0.1447 (0.1447) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:36:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [180/800][100/402] eta 0:03:53 lr 0.000025 time 0.7454 (0.7719) loss 0.6351 (0.6350) grad_norm 0.1340 (0.1510) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:37:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [180/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7589) loss 0.6332 (0.6360) grad_norm 0.1467 (0.1516) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:38:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [180/800][300/402] eta 0:01:16 lr 0.000025 time 0.7468 (0.7545) loss 0.6644 (0.6358) grad_norm 0.1359 (0.1521) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:40:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [180/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7523) loss 0.6358 (0.6352) grad_norm 0.1690 (0.1522) loss_scale 262144.0000 (142512.1995) mem 28968MB [2024-03-09 07:40:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 180 training takes 0:05:02 [2024-03-09 07:40:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [181/800][0/402] eta 0:31:48 lr 0.000025 time 4.7469 (4.7469) loss 0.6215 (0.6215) grad_norm 0.1451 (0.1451) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 07:41:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [181/800][100/402] eta 0:03:57 lr 0.000025 time 0.7456 (0.7854) loss 0.6566 (0.6348) grad_norm 0.1599 (inf) loss_scale 131072.0000 (189470.4158) mem 28968MB [2024-03-09 07:42:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [181/800][200/402] eta 0:02:34 lr 0.000025 time 0.7473 (0.7660) loss 0.6323 (0.6358) grad_norm 0.1553 (inf) loss_scale 131072.0000 (160416.4776) mem 28968MB [2024-03-09 07:44:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [181/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7595) loss 0.6147 (0.6353) grad_norm 0.1195 (inf) loss_scale 131072.0000 (150667.4817) mem 28968MB [2024-03-09 07:45:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [181/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7562) loss 0.5852 (0.6346) grad_norm 0.1782 (inf) loss_scale 131072.0000 (145780.8279) mem 28968MB [2024-03-09 07:45:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 181 training takes 0:05:04 [2024-03-09 07:45:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [182/800][0/402] eta 0:21:44 lr 0.000025 time 3.2456 (3.2456) loss 
0.6048 (0.6048) grad_norm 0.1427 (0.1427) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:46:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [182/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7704) loss 0.6097 (0.6329) grad_norm 0.1504 (0.1529) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:47:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [182/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7581) loss 0.6485 (0.6327) grad_norm 0.1211 (0.1552) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:49:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [182/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7540) loss 0.6511 (0.6328) grad_norm 0.2060 (0.1551) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:50:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [182/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7519) loss 0.6042 (0.6332) grad_norm 0.1658 (0.1534) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:50:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 182 training takes 0:05:02 [2024-03-09 07:50:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [183/800][0/402] eta 0:21:59 lr 0.000025 time 3.2827 (3.2827) loss 0.6119 (0.6119) grad_norm 0.1866 (0.1866) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:51:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [183/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7707) loss 0.6577 (0.6330) grad_norm 0.1337 (0.1572) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:52:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [183/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7583) loss 0.6497 (0.6352) grad_norm 0.1416 (0.1546) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:54:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[183/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7541) loss 0.6831 (0.6345) grad_norm 0.1566 (0.1548) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:55:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [183/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7520) loss 0.6256 (0.6345) grad_norm 0.1454 (0.1553) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:55:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 183 training takes 0:05:02 [2024-03-09 07:55:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [184/800][0/402] eta 0:21:25 lr 0.000025 time 3.1966 (3.1966) loss 0.6048 (0.6048) grad_norm 0.1738 (0.1738) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:56:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [184/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7699) loss 0.6476 (0.6365) grad_norm 0.1477 (0.1532) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:57:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [184/800][200/402] eta 0:02:33 lr 0.000025 time 0.7467 (0.7582) loss 0.6249 (0.6371) grad_norm 0.1312 (0.1534) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 07:59:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [184/800][300/402] eta 0:01:16 lr 0.000025 time 0.7470 (0.7545) loss 0.6277 (0.6365) grad_norm 0.1645 (0.1536) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:00:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [184/800][400/402] eta 0:00:01 lr 0.000025 time 0.7452 (0.7525) loss 0.6161 (0.6355) grad_norm 0.1500 (0.1526) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:00:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 184 training takes 0:05:02 [2024-03-09 08:00:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [185/800][0/402] eta 0:21:51 lr 0.000025 time 3.2616 (3.2616) 
loss 0.6316 (0.6316) grad_norm 0.1419 (0.1419) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:01:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [185/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7706) loss 0.6450 (0.6384) grad_norm 0.1645 (0.1525) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:02:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [185/800][200/402] eta 0:02:33 lr 0.000025 time 0.7466 (0.7582) loss 0.6188 (0.6366) grad_norm 0.1298 (0.1527) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:04:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [185/800][300/402] eta 0:01:16 lr 0.000025 time 0.7452 (0.7541) loss 0.6534 (0.6349) grad_norm 0.1701 (0.1524) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:05:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [185/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7519) loss 0.5985 (0.6336) grad_norm 0.1608 (0.1530) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:05:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 185 training takes 0:05:02 [2024-03-09 08:05:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [186/800][0/402] eta 0:32:43 lr 0.000025 time 4.8842 (4.8842) loss 0.6555 (0.6555) grad_norm 0.1495 (0.1495) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:06:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [186/800][100/402] eta 0:03:57 lr 0.000025 time 0.7454 (0.7867) loss 0.6257 (0.6356) grad_norm 0.1343 (0.1501) loss_scale 262144.0000 (216723.0099) mem 28968MB [2024-03-09 08:08:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [186/800][200/402] eta 0:02:34 lr 0.000025 time 0.7456 (0.7663) loss 0.6296 (0.6358) grad_norm 0.1840 (0.1518) loss_scale 262144.0000 (239320.5174) mem 28968MB [2024-03-09 08:09:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [186/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7595) loss 0.6469 (0.6340) grad_norm 0.1619 (0.1520) loss_scale 262144.0000 (246903.0698) mem 28968MB [2024-03-09 08:10:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [186/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7560) loss 0.6639 (0.6337) grad_norm 0.1778 (0.1521) loss_scale 262144.0000 (250703.8005) mem 28968MB [2024-03-09 08:10:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 186 training takes 0:05:03 [2024-03-09 08:10:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [187/800][0/402] eta 0:22:07 lr 0.000025 time 3.3018 (3.3018) loss 0.6475 (0.6475) grad_norm 0.1283 (0.1283) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:11:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [187/800][100/402] eta 0:03:52 lr 0.000025 time 0.7461 (0.7712) loss 0.6417 (0.6359) grad_norm 0.1274 (0.1511) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:13:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [187/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7585) loss 0.6366 (0.6351) grad_norm 0.1511 (0.1527) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:14:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [187/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7543) loss 0.6185 (0.6356) grad_norm 0.1423 (0.1521) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:15:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [187/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7521) loss 0.6357 (0.6352) grad_norm 0.1567 (0.1529) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:15:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 187 training takes 0:05:02 [2024-03-09 08:15:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [188/800][0/402] eta 0:22:22 lr 0.000025 time 3.3396 
(3.3396) loss 0.6615 (0.6615) grad_norm 0.1495 (0.1495) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:16:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [188/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7713) loss 0.6240 (0.6367) grad_norm 0.1614 (0.1521) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:18:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [188/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7586) loss 0.6391 (0.6356) grad_norm 0.1953 (0.1528) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:19:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [188/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7543) loss 0.6351 (0.6358) grad_norm 0.1824 (0.1535) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:20:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [188/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.6724 (0.6348) grad_norm 0.1278 (0.1533) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:20:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 188 training takes 0:05:02 [2024-03-09 08:20:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [189/800][0/402] eta 0:21:47 lr 0.000025 time 3.2528 (3.2528) loss 0.6289 (0.6289) grad_norm 0.1225 (0.1225) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:21:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [189/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7706) loss 0.6516 (0.6332) grad_norm 0.1430 (0.1514) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:23:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [189/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7582) loss 0.6060 (0.6347) grad_norm 0.1428 (0.1531) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:24:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [189/800][300/402] eta 0:01:16 lr 0.000025 time 0.7464 (0.7541) loss 0.5998 (0.6339) grad_norm 0.1560 (0.1525) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:25:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [189/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7520) loss 0.6199 (0.6339) grad_norm 0.1665 (0.1520) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:25:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 189 training takes 0:05:02 [2024-03-09 08:25:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [190/800][0/402] eta 0:22:07 lr 0.000025 time 3.3022 (3.3022) loss 0.6287 (0.6287) grad_norm 0.2035 (0.2035) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:26:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [190/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7709) loss 0.6402 (0.6362) grad_norm 0.1325 (inf) loss_scale 131072.0000 (186874.9307) mem 28968MB [2024-03-09 08:28:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [190/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7584) loss 0.6438 (0.6353) grad_norm 0.1345 (inf) loss_scale 131072.0000 (159112.2786) mem 28968MB [2024-03-09 08:29:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [190/800][300/402] eta 0:01:16 lr 0.000025 time 0.7563 (0.7542) loss 0.6330 (0.6341) grad_norm 0.1343 (inf) loss_scale 131072.0000 (149796.5714) mem 28968MB [2024-03-09 08:30:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [190/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7520) loss 0.6382 (0.6339) grad_norm 0.1263 (inf) loss_scale 131072.0000 (145127.1022) mem 28968MB [2024-03-09 08:30:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 190 training takes 0:05:02 [2024-03-09 08:30:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [191/800][0/402] eta 0:32:42 lr 0.000025 time 4.8828 
(4.8828) loss 0.6170 (0.6170) grad_norm 0.1660 (0.1660) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:31:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [191/800][100/402] eta 0:03:57 lr 0.000025 time 0.7456 (0.7869) loss 0.6379 (0.6329) grad_norm 0.1441 (0.1549) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:33:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [191/800][200/402] eta 0:02:34 lr 0.000025 time 0.7466 (0.7664) loss 0.6099 (0.6348) grad_norm 0.1271 (0.1516) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:34:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [191/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7597) loss 0.6574 (0.6346) grad_norm 0.1500 (0.1514) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:35:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [191/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7562) loss 0.6336 (0.6345) grad_norm 0.1545 (0.1521) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:35:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 191 training takes 0:05:04 [2024-03-09 08:35:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [192/800][0/402] eta 0:22:19 lr 0.000025 time 3.3317 (3.3317) loss 0.6504 (0.6504) grad_norm 0.1524 (0.1524) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:37:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [192/800][100/402] eta 0:03:52 lr 0.000025 time 0.7461 (0.7713) loss 0.6526 (0.6336) grad_norm 0.1373 (0.1520) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:38:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [192/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7585) loss 0.6362 (0.6351) grad_norm 0.1503 (0.1505) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:39:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [192/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7543) loss 0.6484 (0.6348) grad_norm 0.1227 (0.1498) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:40:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [192/800][400/402] eta 0:00:01 lr 0.000025 time 0.7437 (0.7521) loss 0.6417 (0.6342) grad_norm 0.1380 (0.1505) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:40:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 192 training takes 0:05:02 [2024-03-09 08:40:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [193/800][0/402] eta 0:22:37 lr 0.000025 time 3.3760 (3.3760) loss 0.6135 (0.6135) grad_norm 0.1674 (0.1674) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:42:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [193/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7718) loss 0.6387 (0.6329) grad_norm 0.1496 (0.1553) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:43:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [193/800][200/402] eta 0:02:33 lr 0.000025 time 0.7452 (0.7588) loss 0.6292 (0.6342) grad_norm 0.1678 (0.1601) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:44:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [193/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7545) loss 0.6273 (0.6340) grad_norm 0.1620 (0.1575) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:45:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [193/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7523) loss 0.6384 (0.6338) grad_norm 0.1537 (0.1553) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:45:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 193 training takes 0:05:02 [2024-03-09 08:45:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [194/800][0/402] eta 0:21:28 lr 0.000025 time 
3.2048 (3.2048) loss 0.6313 (0.6313) grad_norm 0.1894 (0.1894) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:47:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [194/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7700) loss 0.6246 (0.6350) grad_norm 0.1581 (0.1512) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:48:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [194/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7579) loss 0.6280 (0.6342) grad_norm 0.1635 (0.1515) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:49:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [194/800][300/402] eta 0:01:16 lr 0.000025 time 0.7451 (0.7538) loss 0.6284 (0.6345) grad_norm 0.1275 (0.1523) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:50:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [194/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7517) loss 0.6200 (0.6340) grad_norm 0.1918 (0.1529) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:50:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 194 training takes 0:05:02 [2024-03-09 08:50:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [195/800][0/402] eta 0:21:45 lr 0.000025 time 3.2472 (3.2472) loss 0.6035 (0.6035) grad_norm 0.1386 (0.1386) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 08:52:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [195/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7705) loss 0.6563 (0.6303) grad_norm 0.1404 (0.1474) loss_scale 262144.0000 (219318.4950) mem 28968MB [2024-03-09 08:53:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [195/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7582) loss 0.6423 (0.6316) grad_norm 0.1295 (0.1501) loss_scale 262144.0000 (240624.7164) mem 28968MB [2024-03-09 08:54:37 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [195/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7540) loss 0.6272 (0.6330) grad_norm 0.1403 (0.1488) loss_scale 262144.0000 (247773.9801) mem 28968MB [2024-03-09 08:55:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [195/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7519) loss 0.6242 (0.6331) grad_norm 0.1451 (0.1492) loss_scale 262144.0000 (251357.5262) mem 28968MB [2024-03-09 08:55:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 195 training takes 0:05:02 [2024-03-09 08:55:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [196/800][0/402] eta 0:32:22 lr 0.000025 time 4.8315 (4.8315) loss 0.6429 (0.6429) grad_norm 0.1523 (0.1523) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 08:57:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [196/800][100/402] eta 0:03:57 lr 0.000025 time 0.7456 (0.7862) loss 0.6069 (0.6337) grad_norm 0.1647 (inf) loss_scale 131072.0000 (225807.2079) mem 28968MB [2024-03-09 08:58:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [196/800][200/402] eta 0:02:34 lr 0.000025 time 0.7457 (0.7661) loss 0.6395 (0.6343) grad_norm 0.1306 (inf) loss_scale 131072.0000 (178675.2637) mem 28968MB [2024-03-09 08:59:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [196/800][300/402] eta 0:01:17 lr 0.000025 time 0.7460 (0.7593) loss 0.6079 (0.6341) grad_norm 0.1428 (inf) loss_scale 131072.0000 (162860.2259) mem 28968MB [2024-03-09 09:00:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [196/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7559) loss 0.5939 (0.6342) grad_norm 0.1345 (inf) loss_scale 131072.0000 (154932.9875) mem 28968MB [2024-03-09 09:00:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 196 training takes 0:05:03 [2024-03-09 09:01:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [197/800][0/402] eta 0:22:14 lr 
0.000025 time 3.3203 (3.3203) loss 0.6339 (0.6339) grad_norm 0.1473 (0.1473) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:02:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [197/800][100/402] eta 0:03:52 lr 0.000025 time 0.7462 (0.7711) loss 0.6363 (0.6313) grad_norm 0.1269 (0.1514) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:03:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [197/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7585) loss 0.6310 (0.6326) grad_norm 0.1831 (0.1545) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:04:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [197/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7543) loss 0.6623 (0.6338) grad_norm 0.1469 (0.1534) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:05:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [197/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7521) loss 0.6600 (0.6343) grad_norm 0.1574 (0.1526) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:05:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 197 training takes 0:05:02 [2024-03-09 09:06:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [198/800][0/402] eta 0:22:20 lr 0.000025 time 3.3356 (3.3356) loss 0.6213 (0.6213) grad_norm 0.1602 (0.1602) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:07:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [198/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7713) loss 0.6373 (0.6342) grad_norm 0.1137 (0.1440) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:08:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [198/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7586) loss 0.6533 (0.6337) grad_norm 0.1351 (0.1475) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:09:46 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [198/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7543) loss 0.6372 (0.6328) grad_norm 0.1377 (0.1475) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:11:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [198/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7521) loss 0.6282 (0.6330) grad_norm 0.1365 (inf) loss_scale 65536.0000 (130254.8429) mem 28968MB [2024-03-09 09:11:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 198 training takes 0:05:02 [2024-03-09 09:11:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [199/800][0/402] eta 0:21:31 lr 0.000025 time 3.2133 (3.2133) loss 0.6186 (0.6186) grad_norm 0.1646 (0.1646) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:12:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [199/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7701) loss 0.6548 (0.6314) grad_norm 0.1214 (0.1500) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:13:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [199/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7579) loss 0.6329 (0.6329) grad_norm 0.1433 (0.1499) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:14:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [199/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7538) loss 0.6333 (0.6329) grad_norm 0.1456 (0.1503) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:16:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [199/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7518) loss 0.6506 (0.6334) grad_norm 0.1265 (0.1497) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:16:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 199 training takes 0:05:02 [2024-03-09 09:16:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [200/800][0/402] eta 0:22:10 lr 0.000025 
time 3.3091 (3.3091) loss 0.6193 (0.6193) grad_norm 0.1711 (0.1711) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:17:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [200/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7710) loss 0.6563 (0.6297) grad_norm 0.1357 (0.1515) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:18:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [200/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7584) loss 0.6524 (0.6307) grad_norm 0.1285 (0.1500) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:19:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [200/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7542) loss 0.6220 (0.6322) grad_norm 0.1542 (0.1483) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:21:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [200/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7520) loss 0.6544 (0.6328) grad_norm 0.1431 (0.1499) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:21:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 200 training takes 0:05:02 [2024-03-09 09:21:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [201/800][0/402] eta 0:31:20 lr 0.000025 time 4.6770 (4.6770) loss 0.6286 (0.6286) grad_norm 0.1442 (0.1442) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:22:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [201/800][100/402] eta 0:03:56 lr 0.000025 time 0.7457 (0.7845) loss 0.6465 (0.6348) grad_norm 0.1144 (0.1501) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:23:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [201/800][200/402] eta 0:02:34 lr 0.000025 time 0.7459 (0.7652) loss 0.6428 (0.6329) grad_norm 0.1660 (0.1489) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:24:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): 
INFO Train: [201/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7587) loss 0.6616 (0.6324) grad_norm 0.1178 (0.1494) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:26:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [201/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7554) loss 0.6219 (0.6329) grad_norm 0.1212 (0.1484) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:26:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 201 training takes 0:05:03 [2024-03-09 09:26:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [202/800][0/402] eta 0:22:09 lr 0.000025 time 3.3070 (3.3070) loss 0.6295 (0.6295) grad_norm 0.1390 (0.1390) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:27:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [202/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7711) loss 0.6611 (0.6328) grad_norm 0.1575 (0.1509) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:28:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [202/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7584) loss 0.6469 (0.6321) grad_norm 0.1311 (0.1494) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:29:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [202/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7542) loss 0.6116 (0.6321) grad_norm 0.1288 (0.1491) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:31:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [202/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7520) loss 0.6456 (0.6326) grad_norm 0.1388 (0.1495) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:31:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 202 training takes 0:05:02 [2024-03-09 09:31:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [203/800][0/402] eta 0:22:22 lr 0.000025 time 3.3399 (3.3399) loss 
0.6414 (0.6414) grad_norm 0.1403 (0.1403) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:32:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [203/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7713) loss 0.6572 (0.6328) grad_norm 0.1633 (0.1450) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:33:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [203/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7586) loss 0.6214 (0.6340) grad_norm 0.1253 (0.1456) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:35:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [203/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7543) loss 0.6491 (0.6334) grad_norm 0.1448 (0.1451) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 09:36:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [203/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7521) loss 0.6375 (0.6329) grad_norm 0.1563 (0.1472) loss_scale 131072.0000 (67987.4713) mem 28968MB [2024-03-09 09:36:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 203 training takes 0:05:02 [2024-03-09 09:36:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [204/800][0/402] eta 0:22:15 lr 0.000025 time 3.3214 (3.3214) loss 0.6391 (0.6391) grad_norm 0.1594 (0.1594) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:37:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [204/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7713) loss 0.6388 (0.6356) grad_norm 0.1839 (0.1507) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:38:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [204/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7586) loss 0.6394 (0.6351) grad_norm 0.1619 (0.1510) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:40:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[204/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7544) loss 0.6226 (0.6348) grad_norm 0.1457 (0.1512) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:41:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [204/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7522) loss 0.6184 (0.6338) grad_norm 0.1452 (0.1500) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:41:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 204 training takes 0:05:02 [2024-03-09 09:41:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [205/800][0/402] eta 0:22:25 lr 0.000025 time 3.3472 (3.3472) loss 0.6689 (0.6689) grad_norm 0.1303 (0.1303) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:42:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [205/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7714) loss 0.6309 (0.6322) grad_norm 0.1781 (0.1456) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:43:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [205/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7586) loss 0.6272 (0.6327) grad_norm 0.1322 (0.1477) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:45:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [205/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7543) loss 0.6754 (0.6321) grad_norm 0.1519 (0.1480) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:46:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [205/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7521) loss 0.6066 (0.6326) grad_norm 0.1923 (0.1483) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:46:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 205 training takes 0:05:02 [2024-03-09 09:46:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [206/800][0/402] eta 0:31:27 lr 0.000025 time 4.6960 (4.6960) 
loss 0.6452 (0.6452) grad_norm 0.1537 (0.1537) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:47:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [206/800][100/402] eta 0:03:57 lr 0.000025 time 0.7463 (0.7860) loss 0.6421 (0.6352) grad_norm 0.1471 (0.1514) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:48:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [206/800][200/402] eta 0:02:34 lr 0.000025 time 0.7468 (0.7665) loss 0.6368 (0.6341) grad_norm 0.1343 (0.1495) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:50:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [206/800][300/402] eta 0:01:17 lr 0.000025 time 0.7468 (0.7599) loss 0.6245 (0.6334) grad_norm 0.1794 (0.1496) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:51:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [206/800][400/402] eta 0:00:01 lr 0.000025 time 0.7452 (0.7567) loss 0.6450 (0.6340) grad_norm 0.1327 (0.1502) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:51:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 206 training takes 0:05:04 [2024-03-09 09:51:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [207/800][0/402] eta 0:21:57 lr 0.000025 time 3.2779 (3.2779) loss 0.6267 (0.6267) grad_norm 0.1298 (0.1298) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:52:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [207/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7707) loss 0.6399 (0.6343) grad_norm 0.1419 (0.1462) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:53:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [207/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7582) loss 0.6767 (0.6344) grad_norm 0.1447 (0.1484) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:55:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [207/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7540) loss 0.6442 (0.6340) grad_norm 0.1202 (0.1498) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:56:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [207/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7519) loss 0.6643 (0.6334) grad_norm 0.1423 (0.1495) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:56:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 207 training takes 0:05:02 [2024-03-09 09:56:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [208/800][0/402] eta 0:22:27 lr 0.000025 time 3.3521 (3.3521) loss 0.6590 (0.6590) grad_norm 0.1720 (0.1720) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:57:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [208/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7714) loss 0.6395 (0.6340) grad_norm 0.1559 (0.1498) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 09:58:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [208/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7586) loss 0.6177 (0.6342) grad_norm 0.1197 (0.1475) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:00:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [208/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7543) loss 0.6545 (0.6324) grad_norm 0.1237 (0.1484) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:01:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [208/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7521) loss 0.5659 (0.6319) grad_norm 0.1453 (0.1499) loss_scale 262144.0000 (139243.5711) mem 28968MB [2024-03-09 10:01:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 208 training takes 0:05:02 [2024-03-09 10:01:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [209/800][0/402] eta 0:22:13 lr 0.000025 time 3.3182 
(3.3182) loss 0.6367 (0.6367) grad_norm 0.1305 (0.1305) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:02:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [209/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7713) loss 0.6693 (0.6351) grad_norm 0.1388 (0.1423) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:04:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [209/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7586) loss 0.6348 (0.6338) grad_norm 0.1657 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:05:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [209/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7543) loss 0.6527 (0.6337) grad_norm 0.1295 (0.1470) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:06:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [209/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7521) loss 0.6368 (0.6336) grad_norm 0.1730 (0.1479) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:06:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 209 training takes 0:05:02 [2024-03-09 10:06:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [210/800][0/402] eta 0:21:47 lr 0.000025 time 3.2526 (3.2526) loss 0.6485 (0.6485) grad_norm 0.1469 (0.1469) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:07:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [210/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7705) loss 0.6560 (0.6325) grad_norm 0.1466 (0.1484) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:09:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [210/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7582) loss 0.6272 (0.6316) grad_norm 0.1209 (0.1463) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:10:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [210/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7541) loss 0.6044 (0.6309) grad_norm 0.1867 (0.1471) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:11:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [210/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7519) loss 0.6582 (0.6324) grad_norm 0.1607 (0.1475) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:11:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 210 training takes 0:05:02 [2024-03-09 10:11:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [211/800][0/402] eta 0:31:01 lr 0.000025 time 4.6300 (4.6300) loss 0.6431 (0.6431) grad_norm 0.1447 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:12:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [211/800][100/402] eta 0:03:56 lr 0.000025 time 0.7464 (0.7841) loss 0.6432 (0.6354) grad_norm 0.1289 (0.1479) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:14:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [211/800][200/402] eta 0:02:34 lr 0.000025 time 0.7459 (0.7651) loss 0.6362 (0.6323) grad_norm 0.1278 (0.1467) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:15:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [211/800][300/402] eta 0:01:17 lr 0.000025 time 0.7465 (0.7586) loss 0.6346 (0.6329) grad_norm 0.1292 (0.1486) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:16:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [211/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7553) loss 0.6727 (0.6324) grad_norm 0.1470 (0.1504) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:16:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 211 training takes 0:05:03 [2024-03-09 10:16:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [212/800][0/402] eta 0:21:55 lr 0.000025 time 
3.2731 (3.2731) loss 0.6607 (0.6607) grad_norm 0.1394 (0.1394) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:17:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [212/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7706) loss 0.6043 (0.6318) grad_norm 0.1456 (0.1450) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:19:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [212/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7582) loss 0.6094 (0.6320) grad_norm 0.1781 (0.1475) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:20:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [212/800][300/402] eta 0:01:16 lr 0.000025 time 0.7465 (0.7540) loss 0.6305 (0.6322) grad_norm 0.1572 (0.1527) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:21:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [212/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7519) loss 0.6297 (0.6326) grad_norm 0.1346 (inf) loss_scale 131072.0000 (246127.7207) mem 28968MB [2024-03-09 10:21:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 212 training takes 0:05:02 [2024-03-09 10:21:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [213/800][0/402] eta 0:22:49 lr 0.000025 time 3.4067 (3.4067) loss 0.6539 (0.6539) grad_norm 0.1182 (0.1182) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:22:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [213/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7720) loss 0.6328 (0.6325) grad_norm 0.1398 (0.1446) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:24:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [213/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7589) loss 0.6370 (0.6327) grad_norm 0.1678 (0.1478) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:25:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [213/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7546) loss 0.6169 (0.6325) grad_norm 0.1383 (0.1471) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:26:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [213/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7524) loss 0.6078 (0.6329) grad_norm 0.1405 (0.1463) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:26:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 213 training takes 0:05:02 [2024-03-09 10:26:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [214/800][0/402] eta 0:21:41 lr 0.000025 time 3.2380 (3.2380) loss 0.6180 (0.6180) grad_norm 0.1622 (0.1622) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:28:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [214/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7704) loss 0.6354 (0.6337) grad_norm 0.1882 (0.1511) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:29:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [214/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7581) loss 0.6381 (0.6332) grad_norm 0.1172 (0.1488) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:30:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [214/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7540) loss 0.6015 (0.6335) grad_norm 0.1985 (0.1484) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:31:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [214/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7519) loss 0.6403 (0.6331) grad_norm 0.1777 (0.1484) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:31:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 214 training takes 0:05:02 [2024-03-09 10:31:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [215/800][0/402] eta 0:21:31 lr 0.000025 time 
3.2128 (3.2128) loss 0.6412 (0.6412) grad_norm 0.1850 (0.1850) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:33:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [215/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7701) loss 0.6197 (0.6334) grad_norm 0.1450 (0.1481) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:34:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [215/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7580) loss 0.6513 (0.6335) grad_norm 0.1166 (0.1467) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:35:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [215/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7539) loss 0.6235 (0.6327) grad_norm 0.1228 (0.1460) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:36:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [215/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7518) loss 0.6501 (0.6326) grad_norm 0.1700 (0.1461) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:36:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 215 training takes 0:05:02 [2024-03-09 10:36:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [216/800][0/402] eta 0:32:33 lr 0.000025 time 4.8584 (4.8584) loss 0.6442 (0.6442) grad_norm 0.1656 (0.1656) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:38:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [216/800][100/402] eta 0:03:57 lr 0.000025 time 0.7458 (0.7864) loss 0.5960 (0.6332) grad_norm 0.1468 (0.1458) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:39:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [216/800][200/402] eta 0:02:34 lr 0.000025 time 0.7456 (0.7662) loss 0.6265 (0.6310) grad_norm 0.1551 (0.1511) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:40:35 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [216/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7594) loss 0.6153 (0.6325) grad_norm 0.1718 (0.1500) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:41:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [216/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7560) loss 0.6116 (0.6326) grad_norm 0.1591 (0.1510) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:41:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 216 training takes 0:05:03 [2024-03-09 10:41:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [217/800][0/402] eta 0:22:38 lr 0.000025 time 3.3790 (3.3790) loss 0.6670 (0.6670) grad_norm 0.1306 (0.1306) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:43:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [217/800][100/402] eta 0:03:53 lr 0.000025 time 0.7458 (0.7719) loss 0.6301 (0.6334) grad_norm 0.1539 (0.1457) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:44:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [217/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7589) loss 0.6379 (0.6320) grad_norm 0.1217 (0.1471) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:45:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [217/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7545) loss 0.6735 (0.6323) grad_norm 0.1265 (0.1466) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:46:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [217/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7523) loss 0.6527 (0.6325) grad_norm 0.1264 (0.1461) loss_scale 262144.0000 (150356.9077) mem 28968MB [2024-03-09 10:46:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 217 training takes 0:05:02 [2024-03-09 10:46:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [218/800][0/402] eta 
0:22:16 lr 0.000025 time 3.3241 (3.3241) loss 0.6371 (0.6371) grad_norm 0.1232 (0.1232) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:48:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [218/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7713) loss 0.6426 (0.6322) grad_norm 0.1685 (0.1509) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:49:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [218/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7586) loss 0.6503 (0.6317) grad_norm 0.1274 (0.1490) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:50:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [218/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7543) loss 0.6407 (0.6321) grad_norm 0.1549 (0.1474) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:51:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [218/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7521) loss 0.6359 (0.6315) grad_norm 0.1603 (0.1472) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:51:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 218 training takes 0:05:02 [2024-03-09 10:51:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [219/800][0/402] eta 0:21:34 lr 0.000025 time 3.2201 (3.2201) loss 0.6341 (0.6341) grad_norm 0.1315 (0.1315) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:53:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [219/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7702) loss 0.6699 (0.6308) grad_norm 0.1348 (0.1493) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:54:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [219/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7580) loss 0.6375 (0.6323) grad_norm 0.1375 (0.1489) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:55:43 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [219/800][300/402] eta 0:01:16 lr 0.000025 time 0.7470 (0.7539) loss 0.6402 (0.6322) grad_norm 0.1313 (0.1487) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 10:56:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [219/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7518) loss 0.6201 (0.6321) grad_norm 0.1434 (inf) loss_scale 131072.0000 (231745.7556) mem 28968MB [2024-03-09 10:56:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 219 training takes 0:05:02 [2024-03-09 10:57:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [220/800][0/402] eta 0:22:02 lr 0.000025 time 3.2891 (3.2891) loss 0.6104 (0.6104) grad_norm 0.1547 (0.1547) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:58:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [220/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7709) loss 0.6305 (0.6309) grad_norm 0.1629 (0.1445) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 10:59:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [220/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7584) loss 0.6554 (0.6328) grad_norm 0.1577 (0.1439) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:00:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [220/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7542) loss 0.5940 (0.6322) grad_norm 0.1686 (0.1457) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:02:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [220/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7521) loss 0.6380 (0.6311) grad_norm 0.1343 (0.1471) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:02:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 220 training takes 0:05:02 [2024-03-09 11:02:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[221/800][0/402] eta 0:32:17 lr 0.000025 time 4.8186 (4.8186) loss 0.6402 (0.6402) grad_norm 0.1369 (0.1369) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:03:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [221/800][100/402] eta 0:03:57 lr 0.000025 time 0.7486 (0.7859) loss 0.6195 (0.6338) grad_norm 0.1635 (0.1505) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:04:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [221/800][200/402] eta 0:02:34 lr 0.000025 time 0.7454 (0.7659) loss 0.6613 (0.6338) grad_norm 0.1357 (0.1497) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:05:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [221/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7593) loss 0.6224 (0.6329) grad_norm 0.1556 (0.1478) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:07:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [221/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7560) loss 0.6668 (0.6326) grad_norm 0.1343 (0.1468) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:07:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 221 training takes 0:05:03 [2024-03-09 11:07:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [222/800][0/402] eta 0:21:52 lr 0.000025 time 3.2639 (3.2639) loss 0.6557 (0.6557) grad_norm 0.1351 (0.1351) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:08:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [222/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7707) loss 0.6015 (0.6323) grad_norm 0.1671 (0.1466) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:09:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [222/800][200/402] eta 0:02:33 lr 0.000025 time 0.7550 (0.7583) loss 0.6153 (0.6316) grad_norm 0.1258 (0.1461) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 
11:10:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [222/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7542) loss 0.6413 (0.6329) grad_norm 0.1468 (0.1459) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:12:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [222/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7520) loss 0.6284 (0.6325) grad_norm 0.1538 (0.1472) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:12:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 222 training takes 0:05:02 [2024-03-09 11:12:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [223/800][0/402] eta 0:21:51 lr 0.000025 time 3.2623 (3.2623) loss 0.6147 (0.6147) grad_norm 0.1485 (0.1485) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:13:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [223/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7707) loss 0.6171 (0.6332) grad_norm 0.1475 (0.1474) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:14:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [223/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7583) loss 0.6402 (0.6337) grad_norm 0.1297 (0.1477) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:15:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [223/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7541) loss 0.6458 (0.6326) grad_norm 0.1230 (0.1476) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:17:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [223/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7520) loss 0.6364 (0.6326) grad_norm 0.1377 (0.1470) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:17:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 223 training takes 0:05:02 [2024-03-09 11:17:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): 
INFO Train: [224/800][0/402] eta 0:22:00 lr 0.000025 time 3.2857 (3.2857) loss 0.5991 (0.5991) grad_norm 0.1743 (0.1743) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:18:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [224/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7709) loss 0.6115 (0.6358) grad_norm 0.1662 (0.1451) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:19:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [224/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7584) loss 0.6428 (0.6337) grad_norm 0.2146 (0.1466) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:20:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [224/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7542) loss 0.6102 (0.6326) grad_norm 0.1286 (0.1480) loss_scale 262144.0000 (132378.3654) mem 28968MB [2024-03-09 11:22:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [224/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.6424 (0.6332) grad_norm 0.1137 (0.1483) loss_scale 262144.0000 (164738.8728) mem 28968MB [2024-03-09 11:22:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 224 training takes 0:05:02 [2024-03-09 11:22:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [225/800][0/402] eta 0:21:40 lr 0.000025 time 3.2361 (3.2361) loss 0.6591 (0.6591) grad_norm 0.1825 (0.1825) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:23:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [225/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7704) loss 0.6655 (0.6322) grad_norm 0.1530 (0.1451) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:24:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [225/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7581) loss 0.6103 (0.6299) grad_norm 0.1462 (0.1457) loss_scale 262144.0000 (262144.0000) mem 28968MB 
[2024-03-09 11:25:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [225/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7540) loss 0.6047 (0.6311) grad_norm 0.1561 (0.1459) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:27:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [225/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7519) loss 0.6356 (0.6311) grad_norm 0.1382 (0.1453) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:27:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 225 training takes 0:05:02 [2024-03-09 11:27:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [226/800][0/402] eta 0:32:00 lr 0.000025 time 4.7778 (4.7778) loss 0.6113 (0.6113) grad_norm 0.1840 (0.1840) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:28:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [226/800][100/402] eta 0:03:57 lr 0.000025 time 0.7454 (0.7861) loss 0.6063 (0.6291) grad_norm 0.1527 (0.1495) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:29:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [226/800][200/402] eta 0:02:34 lr 0.000025 time 0.7453 (0.7661) loss 0.6370 (0.6313) grad_norm 0.1467 (0.1471) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:31:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [226/800][300/402] eta 0:01:17 lr 0.000025 time 0.7455 (0.7593) loss 0.6411 (0.6313) grad_norm 0.1620 (0.1484) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:32:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [226/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7559) loss 0.6341 (0.6312) grad_norm 0.1455 (0.1477) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:32:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 226 training takes 0:05:03 [2024-03-09 11:32:21 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [227/800][0/402] eta 0:22:15 lr 0.000025 time 3.3232 (3.3232) loss 0.6292 (0.6292) grad_norm 0.1177 (0.1177) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:33:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [227/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7713) loss 0.6530 (0.6286) grad_norm 0.1507 (0.1435) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:34:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [227/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7586) loss 0.5881 (0.6297) grad_norm 0.1383 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:36:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [227/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7544) loss 0.6347 (0.6305) grad_norm 0.1150 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:37:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [227/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7522) loss 0.6332 (0.6311) grad_norm 0.1217 (0.1462) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:37:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 227 training takes 0:05:02 [2024-03-09 11:37:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [228/800][0/402] eta 0:21:29 lr 0.000025 time 3.2071 (3.2071) loss 0.6422 (0.6422) grad_norm 0.1233 (0.1233) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:38:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [228/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7701) loss 0.6157 (0.6316) grad_norm 0.1671 (0.1465) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:39:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [228/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7580) loss 0.6053 (0.6324) grad_norm 0.1545 (0.1466) loss_scale 262144.0000 
(262144.0000) mem 28968MB [2024-03-09 11:41:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [228/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7539) loss 0.6362 (0.6328) grad_norm 0.1418 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:42:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [228/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7519) loss 0.6163 (0.6327) grad_norm 0.1622 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:42:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 228 training takes 0:05:02 [2024-03-09 11:42:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [229/800][0/402] eta 0:22:25 lr 0.000025 time 3.3458 (3.3458) loss 0.5791 (0.5791) grad_norm 0.1490 (0.1490) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:43:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [229/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7714) loss 0.6429 (0.6331) grad_norm 0.1756 (0.1492) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 11:44:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [229/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7585) loss 0.6073 (0.6322) grad_norm 0.1659 (inf) loss_scale 131072.0000 (200846.6468) mem 28968MB [2024-03-09 11:46:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [229/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7543) loss 0.6354 (0.6317) grad_norm 0.1714 (inf) loss_scale 131072.0000 (177665.7010) mem 28968MB [2024-03-09 11:47:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [229/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7521) loss 0.6232 (0.6320) grad_norm 0.1612 (inf) loss_scale 131072.0000 (166046.3242) mem 28968MB [2024-03-09 11:47:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 229 training takes 0:05:02 [2024-03-09 11:47:29 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [230/800][0/402] eta 0:22:30 lr 0.000025 time 3.3582 (3.3582) loss 0.6506 (0.6506) grad_norm 0.1216 (0.1216) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:48:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [230/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7715) loss 0.6336 (0.6320) grad_norm 0.1268 (0.1415) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:49:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [230/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7587) loss 0.6648 (0.6321) grad_norm 0.1252 (0.1428) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:51:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [230/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7544) loss 0.6168 (0.6325) grad_norm 0.1371 (0.1432) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:52:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [230/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7522) loss 0.6405 (0.6323) grad_norm 0.1369 (0.1440) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:52:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 230 training takes 0:05:02 [2024-03-09 11:52:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [231/800][0/402] eta 0:31:45 lr 0.000025 time 4.7391 (4.7391) loss 0.6163 (0.6163) grad_norm 0.1574 (0.1574) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:53:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [231/800][100/402] eta 0:03:57 lr 0.000025 time 0.7452 (0.7852) loss 0.6341 (0.6318) grad_norm 0.1646 (0.1470) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:55:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [231/800][200/402] eta 0:02:34 lr 0.000025 time 0.7460 (0.7656) loss 0.6570 (0.6321) grad_norm 0.1722 (0.1474) loss_scale 131072.0000 
(131072.0000) mem 28968MB [2024-03-09 11:56:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [231/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7590) loss 0.6146 (0.6305) grad_norm 0.1370 (0.1476) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:57:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [231/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7556) loss 0.6328 (0.6319) grad_norm 0.1415 (0.1473) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:57:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 231 training takes 0:05:03 [2024-03-09 11:57:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [232/800][0/402] eta 0:21:57 lr 0.000025 time 3.2775 (3.2775) loss 0.6061 (0.6061) grad_norm 0.1250 (0.1250) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 11:58:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [232/800][100/402] eta 0:03:52 lr 0.000025 time 0.7461 (0.7709) loss 0.6697 (0.6326) grad_norm 0.1311 (0.1439) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:00:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [232/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7584) loss 0.6619 (0.6331) grad_norm 0.1388 (0.1460) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:01:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [232/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7542) loss 0.6409 (0.6327) grad_norm 0.1510 (0.1464) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:02:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [232/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7521) loss 0.6032 (0.6320) grad_norm 0.1423 (0.1466) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:02:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 232 training takes 0:05:02 [2024-03-09 12:02:37 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [233/800][0/402] eta 0:22:05 lr 0.000025 time 3.2981 (3.2981) loss 0.6494 (0.6494) grad_norm 0.1512 (0.1512) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:03:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [233/800][100/402] eta 0:03:52 lr 0.000025 time 0.7466 (0.7710) loss 0.6074 (0.6317) grad_norm 0.1542 (0.1430) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:05:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [233/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7584) loss 0.6078 (0.6314) grad_norm 0.1456 (0.1427) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:06:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [233/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7542) loss 0.6575 (0.6313) grad_norm 0.1569 (0.1451) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:07:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [233/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7521) loss 0.6415 (0.6312) grad_norm 0.1523 (0.1457) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:07:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 233 training takes 0:05:02 [2024-03-09 12:07:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [234/800][0/402] eta 0:21:33 lr 0.000025 time 3.2170 (3.2170) loss 0.6266 (0.6266) grad_norm 0.1378 (0.1378) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:08:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [234/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7702) loss 0.6472 (0.6314) grad_norm 0.1388 (0.1490) loss_scale 262144.0000 (136262.9703) mem 28968MB [2024-03-09 12:10:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [234/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7581) loss 0.5975 (0.6312) grad_norm 0.1481 (0.1450) 
loss_scale 262144.0000 (198890.3483) mem 28968MB [2024-03-09 12:11:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [234/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7540) loss 0.6201 (0.6318) grad_norm 0.1414 (0.1438) loss_scale 262144.0000 (219904.8505) mem 28968MB [2024-03-09 12:12:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [234/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7519) loss 0.5994 (0.6320) grad_norm 0.1731 (0.1443) loss_scale 262144.0000 (230438.3042) mem 28968MB [2024-03-09 12:12:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 234 training takes 0:05:02 [2024-03-09 12:12:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [235/800][0/402] eta 0:21:51 lr 0.000025 time 3.2633 (3.2633) loss 0.6336 (0.6336) grad_norm 0.1770 (0.1770) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:13:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [235/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7706) loss 0.6363 (0.6331) grad_norm 0.1345 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:15:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [235/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7583) loss 0.6323 (0.6316) grad_norm 0.1517 (0.1445) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:16:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [235/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7541) loss 0.6678 (0.6325) grad_norm 0.1264 (0.1436) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:17:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [235/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7520) loss 0.6042 (0.6317) grad_norm 0.1515 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:17:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 235 training takes 0:05:02 [2024-03-09 
12:17:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [236/800][0/402] eta 0:32:20 lr 0.000025 time 4.8271 (4.8271) loss 0.6423 (0.6423) grad_norm 0.1725 (0.1725) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:19:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [236/800][100/402] eta 0:03:57 lr 0.000025 time 0.7456 (0.7862) loss 0.6282 (0.6286) grad_norm 0.1510 (0.1469) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:20:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [236/800][200/402] eta 0:02:34 lr 0.000025 time 0.7458 (0.7661) loss 0.6577 (0.6314) grad_norm 0.1575 (0.1462) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:21:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [236/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7593) loss 0.6463 (0.6309) grad_norm 0.1184 (0.1460) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:22:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [236/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7559) loss 0.6305 (0.6311) grad_norm 0.1824 (0.1466) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:22:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 236 training takes 0:05:03 [2024-03-09 12:22:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [237/800][0/402] eta 0:21:42 lr 0.000025 time 3.2402 (3.2402) loss 0.6002 (0.6002) grad_norm 0.1507 (0.1507) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:24:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [237/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7703) loss 0.6523 (0.6316) grad_norm 0.1364 (0.1481) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:25:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [237/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7581) loss 0.6126 (0.6313) grad_norm 0.1457 
(0.1479) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:26:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [237/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7540) loss 0.6358 (0.6309) grad_norm 0.1307 (0.1475) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:27:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [237/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7519) loss 0.6716 (0.6307) grad_norm 0.1596 (0.1466) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:27:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 237 training takes 0:05:02 [2024-03-09 12:27:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [238/800][0/402] eta 0:21:53 lr 0.000025 time 3.2669 (3.2669) loss 0.6243 (0.6243) grad_norm 0.1574 (0.1574) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:29:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [238/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7706) loss 0.6404 (0.6320) grad_norm 0.1521 (0.1483) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:30:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [238/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7582) loss 0.6433 (0.6317) grad_norm 0.1447 (0.1478) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:31:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [238/800][300/402] eta 0:01:16 lr 0.000025 time 0.7463 (0.7540) loss 0.6399 (0.6316) grad_norm 0.1740 (0.1473) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:32:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [238/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7519) loss 0.6004 (0.6311) grad_norm 0.1369 (0.1476) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:32:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 238 training takes 0:05:02 
[2024-03-09 12:32:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [239/800][0/402] eta 0:22:35 lr 0.000025 time 3.3715 (3.3715) loss 0.6522 (0.6522) grad_norm 0.1313 (0.1313) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 12:34:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [239/800][100/402] eta 0:03:53 lr 0.000025 time 0.7477 (0.7727) loss 0.6219 (0.6333) grad_norm 0.1649 (0.1417) loss_scale 524288.0000 (298480.7921) mem 28968MB [2024-03-09 12:35:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [239/800][200/402] eta 0:02:33 lr 0.000025 time 0.7470 (0.7599) loss 0.6145 (0.6326) grad_norm 0.1384 (0.1451) loss_scale 524288.0000 (410822.6866) mem 28968MB [2024-03-09 12:36:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [239/800][300/402] eta 0:01:17 lr 0.000025 time 0.7468 (0.7555) loss 0.6363 (0.6321) grad_norm 0.1210 (inf) loss_scale 131072.0000 (381458.7110) mem 28968MB [2024-03-09 12:37:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [239/800][400/402] eta 0:00:01 lr 0.000025 time 0.7459 (0.7533) loss 0.6309 (0.6308) grad_norm 0.1336 (inf) loss_scale 131072.0000 (319018.1347) mem 28968MB [2024-03-09 12:37:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 239 training takes 0:05:02 [2024-03-09 12:37:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [240/800][0/402] eta 0:21:55 lr 0.000025 time 3.2719 (3.2719) loss 0.6158 (0.6158) grad_norm 0.1473 (0.1473) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:39:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [240/800][100/402] eta 0:03:53 lr 0.000025 time 0.7469 (0.7718) loss 0.6591 (0.6299) grad_norm 0.1355 (0.1467) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:40:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [240/800][200/402] eta 0:02:33 lr 0.000025 time 0.7471 (0.7594) loss 0.6379 (0.6323) grad_norm 
0.1664 (0.1458) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:41:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [240/800][300/402] eta 0:01:17 lr 0.000025 time 0.7470 (0.7552) loss 0.6479 (0.6306) grad_norm 0.1665 (0.1447) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:42:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [240/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7531) loss 0.6056 (0.6304) grad_norm 0.1536 (0.1445) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:42:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 240 training takes 0:05:02 [2024-03-09 12:43:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [241/800][0/402] eta 0:31:38 lr 0.000025 time 4.7228 (4.7228) loss 0.6238 (0.6238) grad_norm 0.1438 (0.1438) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:44:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [241/800][100/402] eta 0:03:57 lr 0.000025 time 0.7455 (0.7858) loss 0.6285 (0.6279) grad_norm 0.1534 (0.1453) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:45:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [241/800][200/402] eta 0:02:34 lr 0.000025 time 0.7456 (0.7659) loss 0.6499 (0.6297) grad_norm 0.1305 (0.1447) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:46:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [241/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7592) loss 0.6385 (0.6303) grad_norm 0.1291 (0.1453) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:47:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [241/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7558) loss 0.6039 (0.6305) grad_norm 0.1432 (0.1460) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:47:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 241 training takes 0:05:03 
[2024-03-09 12:48:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [242/800][0/402] eta 0:21:37 lr 0.000025 time 3.2271 (3.2271) loss 0.6041 (0.6041) grad_norm 0.1651 (0.1651) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:49:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [242/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7703) loss 0.6338 (0.6291) grad_norm 0.1401 (0.1461) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:50:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [242/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7581) loss 0.6487 (0.6306) grad_norm 0.1478 (0.1448) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:51:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [242/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7540) loss 0.6426 (0.6308) grad_norm 0.1518 (0.1452) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:53:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [242/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7519) loss 0.6261 (0.6311) grad_norm 0.1546 (0.1454) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:53:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 242 training takes 0:05:02 [2024-03-09 12:53:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [243/800][0/402] eta 0:22:04 lr 0.000025 time 3.2942 (3.2942) loss 0.6160 (0.6160) grad_norm 0.1490 (0.1490) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:54:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [243/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7709) loss 0.6247 (0.6303) grad_norm 0.1512 (0.1445) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:55:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [243/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7584) loss 0.6293 (0.6314) 
grad_norm 0.1521 (0.1465) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:56:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [243/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7542) loss 0.6172 (0.6305) grad_norm 0.1344 (0.1455) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:58:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [243/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7520) loss 0.5759 (0.6301) grad_norm 0.1236 (0.1462) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:58:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 243 training takes 0:05:02 [2024-03-09 12:58:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [244/800][0/402] eta 0:21:29 lr 0.000025 time 3.2066 (3.2066) loss 0.6668 (0.6668) grad_norm 0.1740 (0.1740) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 12:59:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [244/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7701) loss 0.6444 (0.6319) grad_norm 0.1436 (0.1443) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:00:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [244/800][200/402] eta 0:02:33 lr 0.000025 time 0.7466 (0.7580) loss 0.6612 (0.6306) grad_norm 0.1307 (0.1435) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:01:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [244/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7540) loss 0.6307 (0.6297) grad_norm 0.1223 (0.1447) loss_scale 262144.0000 (149361.1163) mem 28968MB [2024-03-09 13:03:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [244/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7519) loss 0.6429 (0.6292) grad_norm 0.1514 (0.1460) loss_scale 262144.0000 (177486.5237) mem 28968MB [2024-03-09 13:03:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 244 training 
takes 0:05:02 [2024-03-09 13:03:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [245/800][0/402] eta 0:22:28 lr 0.000025 time 3.3550 (3.3550) loss 0.6182 (0.6182) grad_norm 0.1935 (0.1935) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:04:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [245/800][100/402] eta 0:03:52 lr 0.000025 time 0.7465 (0.7715) loss 0.6490 (0.6329) grad_norm 0.1458 (0.1458) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:05:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [245/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7586) loss 0.6464 (0.6321) grad_norm 0.1163 (inf) loss_scale 131072.0000 (256275.1045) mem 28968MB [2024-03-09 13:06:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [245/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7544) loss 0.6307 (0.6318) grad_norm 0.1283 (inf) loss_scale 131072.0000 (214679.3887) mem 28968MB [2024-03-09 13:08:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [245/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7522) loss 0.6378 (0.6309) grad_norm 0.1817 (inf) loss_scale 131072.0000 (193829.6658) mem 28968MB [2024-03-09 13:08:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 245 training takes 0:05:02 [2024-03-09 13:08:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [246/800][0/402] eta 0:31:27 lr 0.000025 time 4.6963 (4.6963) loss 0.6273 (0.6273) grad_norm 0.1294 (0.1294) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:09:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [246/800][100/402] eta 0:03:57 lr 0.000025 time 0.7455 (0.7851) loss 0.6263 (0.6283) grad_norm 0.1395 (0.1436) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:10:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [246/800][200/402] eta 0:02:34 lr 0.000025 time 0.7461 (0.7656) loss 0.6261 (0.6306) 
grad_norm 0.1504 (0.1432) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:11:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [246/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7592) loss 0.6118 (0.6302) grad_norm 0.1291 (0.1439) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:13:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [246/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7558) loss 0.6338 (0.6306) grad_norm 0.1612 (0.1442) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:13:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 246 training takes 0:05:03 [2024-03-09 13:13:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [247/800][0/402] eta 0:22:04 lr 0.000025 time 3.2943 (3.2943) loss 0.6302 (0.6302) grad_norm 0.1324 (0.1324) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:14:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [247/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7710) loss 0.6372 (0.6300) grad_norm 0.1626 (0.1437) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:15:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [247/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7584) loss 0.6291 (0.6304) grad_norm 0.1370 (0.1445) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:17:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [247/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7542) loss 0.6000 (0.6301) grad_norm 0.1369 (0.1441) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:18:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [247/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7520) loss 0.6839 (0.6311) grad_norm 0.1235 (0.1432) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:18:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 247 training 
takes 0:05:02 [2024-03-09 13:18:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [248/800][0/402] eta 0:22:29 lr 0.000025 time 3.3560 (3.3560) loss 0.6199 (0.6199) grad_norm 0.1372 (0.1372) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:19:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [248/800][100/402] eta 0:03:53 lr 0.000025 time 0.7454 (0.7716) loss 0.6025 (0.6337) grad_norm 0.1336 (0.1490) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:20:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [248/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7587) loss 0.6727 (0.6325) grad_norm 0.1397 (0.1479) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:22:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [248/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7544) loss 0.6248 (0.6317) grad_norm 0.1595 (0.1475) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:23:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [248/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7522) loss 0.6159 (0.6315) grad_norm 0.1328 (0.1466) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:23:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 248 training takes 0:05:02 [2024-03-09 13:23:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [249/800][0/402] eta 0:22:31 lr 0.000025 time 3.3615 (3.3615) loss 0.6224 (0.6224) grad_norm 0.1390 (0.1390) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:24:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [249/800][100/402] eta 0:03:53 lr 0.000025 time 0.7463 (0.7726) loss 0.6337 (0.6285) grad_norm 0.1762 (0.1461) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:25:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [249/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7593) loss 0.6231 
(0.6290) grad_norm 0.1873 (0.1463) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:27:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [249/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7548) loss 0.6421 (0.6299) grad_norm 0.1551 (0.1462) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:28:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [249/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7525) loss 0.6321 (0.6301) grad_norm 0.1271 (0.1459) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:28:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 249 training takes 0:05:02 [2024-03-09 13:28:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [250/800][0/402] eta 0:22:21 lr 0.000025 time 3.3380 (3.3380) loss 0.6110 (0.6110) grad_norm 0.1386 (0.1386) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:29:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [250/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7713) loss 0.6299 (0.6291) grad_norm 0.1681 (0.1456) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 13:30:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [250/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7586) loss 0.6574 (0.6299) grad_norm 0.1268 (0.1454) loss_scale 262144.0000 (143461.8905) mem 28968MB [2024-03-09 13:32:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [250/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7543) loss 0.6628 (0.6299) grad_norm 0.1265 (0.1444) loss_scale 262144.0000 (182891.1628) mem 28968MB [2024-03-09 13:33:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [250/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.6587 (0.6297) grad_norm 0.1539 (0.1443) loss_scale 262144.0000 (202654.9626) mem 28968MB [2024-03-09 13:33:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 250 
training takes 0:05:02 [2024-03-09 13:33:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [251/800][0/402] eta 0:32:09 lr 0.000025 time 4.8007 (4.8007) loss 0.6174 (0.6174) grad_norm 0.1479 (0.1479) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:34:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [251/800][100/402] eta 0:03:57 lr 0.000025 time 0.7471 (0.7872) loss 0.6395 (0.6316) grad_norm 0.1455 (0.1427) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:35:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [251/800][200/402] eta 0:02:34 lr 0.000025 time 0.7470 (0.7672) loss 0.5970 (0.6312) grad_norm 0.1829 (0.1437) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:37:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [251/800][300/402] eta 0:01:17 lr 0.000025 time 0.7455 (0.7601) loss 0.6280 (0.6307) grad_norm 0.1811 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:38:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [251/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7565) loss 0.6369 (0.6312) grad_norm 0.1515 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:38:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 251 training takes 0:05:04 [2024-03-09 13:38:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [252/800][0/402] eta 0:22:43 lr 0.000025 time 3.3913 (3.3913) loss 0.6030 (0.6030) grad_norm 0.1389 (0.1389) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:39:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [252/800][100/402] eta 0:03:53 lr 0.000025 time 0.7460 (0.7719) loss 0.6411 (0.6313) grad_norm 0.1448 (0.1416) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:40:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [252/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7589) loss 
0.6370 (0.6323) grad_norm 0.1496 (0.1441) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:42:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [252/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7545) loss 0.6200 (0.6316) grad_norm 0.1511 (0.1475) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:43:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [252/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7523) loss 0.6466 (0.6308) grad_norm 0.1334 (0.1464) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:43:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 252 training takes 0:05:02 [2024-03-09 13:43:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [253/800][0/402] eta 0:22:00 lr 0.000025 time 3.2857 (3.2857) loss 0.6498 (0.6498) grad_norm 0.1353 (0.1353) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:44:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [253/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7709) loss 0.6201 (0.6308) grad_norm 0.1251 (0.1456) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:46:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [253/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7587) loss 0.5892 (0.6302) grad_norm 0.1736 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:47:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [253/800][300/402] eta 0:01:16 lr 0.000025 time 0.7463 (0.7544) loss 0.5986 (0.6301) grad_norm 0.1447 (0.1462) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:48:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [253/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7522) loss 0.6452 (0.6297) grad_norm 0.1372 (0.1459) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:48:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 
253 training takes 0:05:02 [2024-03-09 13:48:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [254/800][0/402] eta 0:22:04 lr 0.000025 time 3.2940 (3.2940) loss 0.6492 (0.6492) grad_norm 0.1345 (0.1345) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:49:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [254/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7710) loss 0.6613 (0.6302) grad_norm 0.1517 (0.1548) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:51:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [254/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7584) loss 0.6521 (0.6315) grad_norm 0.1493 (0.1506) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:52:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [254/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7542) loss 0.6632 (0.6308) grad_norm 0.1210 (0.1483) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:53:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [254/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7521) loss 0.6117 (0.6300) grad_norm 0.1459 (0.1479) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:53:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 254 training takes 0:05:02 [2024-03-09 13:53:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [255/800][0/402] eta 0:22:16 lr 0.000025 time 3.3253 (3.3253) loss 0.6240 (0.6240) grad_norm 0.1592 (0.1592) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:54:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [255/800][100/402] eta 0:03:52 lr 0.000025 time 0.7464 (0.7714) loss 0.5765 (0.6314) grad_norm 0.1539 (0.1491) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:56:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [255/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7587) 
loss 0.6421 (0.6293) grad_norm 0.1360 (0.1460) loss_scale 524288.0000 (299965.7711) mem 28968MB [2024-03-09 13:57:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [255/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7546) loss 0.6500 (0.6299) grad_norm 0.1333 (0.1456) loss_scale 524288.0000 (374491.4286) mem 28968MB [2024-03-09 13:58:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [255/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7523) loss 0.6426 (0.6302) grad_norm 0.1353 (inf) loss_scale 262144.0000 (398772.6683) mem 28968MB [2024-03-09 13:58:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 255 training takes 0:05:02 [2024-03-09 13:58:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [256/800][0/402] eta 0:35:58 lr 0.000025 time 5.3685 (5.3685) loss 0.6156 (0.6156) grad_norm 0.1493 (0.1493) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 13:59:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [256/800][100/402] eta 0:03:59 lr 0.000025 time 0.7460 (0.7919) loss 0.6316 (0.6307) grad_norm 0.1408 (0.1412) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:01:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [256/800][200/402] eta 0:02:35 lr 0.000025 time 0.7457 (0.7691) loss 0.6226 (0.6297) grad_norm 0.1481 (0.1416) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:02:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [256/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7616) loss 0.6291 (0.6294) grad_norm 0.1373 (0.1430) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:03:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [256/800][400/402] eta 0:00:01 lr 0.000025 time 0.7455 (0.7577) loss 0.6350 (0.6294) grad_norm 0.1390 (0.1432) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:03:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO 
EPOCH 256 training takes 0:05:04 [2024-03-09 14:03:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [257/800][0/402] eta 0:25:18 lr 0.000025 time 3.7785 (3.7785) loss 0.6468 (0.6468) grad_norm 0.1459 (0.1459) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:05:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [257/800][100/402] eta 0:03:54 lr 0.000025 time 0.7470 (0.7764) loss 0.6291 (0.6287) grad_norm 0.1370 (0.1431) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:06:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [257/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7616) loss 0.6268 (0.6304) grad_norm 0.1521 (0.1441) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:07:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [257/800][300/402] eta 0:01:17 lr 0.000025 time 0.8034 (0.7567) loss 0.6204 (0.6304) grad_norm 0.1600 (0.1450) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:08:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [257/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7540) loss 0.6229 (0.6306) grad_norm 0.1560 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:08:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 257 training takes 0:05:03 [2024-03-09 14:08:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [258/800][0/402] eta 0:25:52 lr 0.000025 time 3.8616 (3.8616) loss 0.6155 (0.6155) grad_norm 0.1618 (0.1618) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:10:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [258/800][100/402] eta 0:03:54 lr 0.000025 time 0.7464 (0.7772) loss 0.6099 (0.6285) grad_norm 0.1434 (0.1467) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:11:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [258/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 
(0.7618) loss 0.6362 (0.6296) grad_norm 0.1536 (0.1446) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:12:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [258/800][300/402] eta 0:01:17 lr 0.000025 time 0.7471 (0.7568) loss 0.6276 (0.6305) grad_norm 0.1437 (0.1456) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:13:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [258/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7542) loss 0.6321 (0.6303) grad_norm 0.1337 (0.1454) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:13:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 258 training takes 0:05:03 [2024-03-09 14:13:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [259/800][0/402] eta 0:24:37 lr 0.000025 time 3.6754 (3.6754) loss 0.6146 (0.6146) grad_norm 0.1593 (0.1593) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:15:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [259/800][100/402] eta 0:03:54 lr 0.000025 time 0.7476 (0.7757) loss 0.6583 (0.6322) grad_norm 0.1346 (0.1435) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:16:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [259/800][200/402] eta 0:02:33 lr 0.000025 time 0.7472 (0.7612) loss 0.6125 (0.6311) grad_norm 0.1767 (0.1457) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:17:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [259/800][300/402] eta 0:01:17 lr 0.000025 time 0.7473 (0.7563) loss 0.6423 (0.6300) grad_norm 0.1170 (0.1458) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:18:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [259/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7539) loss 0.6169 (0.6294) grad_norm 0.1426 (0.1454) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:18:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
185): INFO EPOCH 259 training takes 0:05:03 [2024-03-09 14:18:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [260/800][0/402] eta 0:24:23 lr 0.000025 time 3.6403 (3.6403) loss 0.6169 (0.6169) grad_norm 0.2193 (0.2193) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:20:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [260/800][100/402] eta 0:03:54 lr 0.000025 time 0.7462 (0.7758) loss 0.6462 (0.6304) grad_norm 0.1402 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:21:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [260/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7612) loss 0.6216 (0.6304) grad_norm 0.1490 (0.1459) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:22:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [260/800][300/402] eta 0:01:17 lr 0.000025 time 0.7467 (0.7562) loss 0.6453 (0.6300) grad_norm 0.1467 (0.1459) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:23:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [260/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7537) loss 0.6559 (0.6297) grad_norm 0.1482 (0.1449) loss_scale 524288.0000 (281755.7706) mem 28968MB [2024-03-09 14:23:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 260 training takes 0:05:03 [2024-03-09 14:24:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [261/800][0/402] eta 0:36:14 lr 0.000025 time 5.4096 (5.4096) loss 0.6499 (0.6499) grad_norm 0.1507 (0.1507) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 14:25:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [261/800][100/402] eta 0:03:59 lr 0.000025 time 0.7462 (0.7926) loss 0.6284 (0.6315) grad_norm 0.1507 (0.1451) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 14:26:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [261/800][200/402] eta 0:02:35 lr 0.000025 time 
0.7458 (0.7696) loss 0.6461 (0.6304) grad_norm 0.1472 (0.1454) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 14:27:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [261/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7619) loss 0.6368 (0.6296) grad_norm 0.1187 (0.1459) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 14:28:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [261/800][400/402] eta 0:00:01 lr 0.000025 time 0.7452 (0.7579) loss 0.6092 (0.6303) grad_norm 0.1310 (0.1450) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 14:28:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 261 training takes 0:05:04 [2024-03-09 14:29:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [262/800][0/402] eta 0:24:17 lr 0.000025 time 3.6246 (3.6246) loss 0.6714 (0.6714) grad_norm 0.1264 (0.1264) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 14:30:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [262/800][100/402] eta 0:03:54 lr 0.000025 time 0.7463 (0.7764) loss 0.6353 (0.6301) grad_norm 0.1396 (0.1442) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 14:31:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [262/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7615) loss 0.6188 (0.6295) grad_norm 0.1529 (0.1448) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 14:32:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [262/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7564) loss 0.6363 (0.6298) grad_norm 0.1278 (0.1447) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 14:34:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [262/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7538) loss 0.6387 (0.6293) grad_norm 0.1499 (0.1447) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 14:34:02 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 185): INFO EPOCH 262 training takes 0:05:03 [2024-03-09 14:34:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [263/800][0/402] eta 0:24:30 lr 0.000025 time 3.6586 (3.6586) loss 0.6297 (0.6297) grad_norm 0.1340 (0.1340) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 14:35:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [263/800][100/402] eta 0:03:54 lr 0.000025 time 0.7460 (0.7750) loss 0.6329 (0.6276) grad_norm 0.1526 (inf) loss_scale 262144.0000 (495737.6634) mem 28968MB [2024-03-09 14:36:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [263/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7609) loss 0.6429 (0.6288) grad_norm 0.1296 (inf) loss_scale 262144.0000 (379521.9104) mem 28968MB [2024-03-09 14:37:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [263/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7561) loss 0.6233 (0.6288) grad_norm 0.1422 (inf) loss_scale 262144.0000 (340525.9269) mem 28968MB [2024-03-09 14:39:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [263/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7537) loss 0.5969 (0.6284) grad_norm 0.1539 (inf) loss_scale 262144.0000 (320979.3117) mem 28968MB [2024-03-09 14:39:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 263 training takes 0:05:03 [2024-03-09 14:39:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [264/800][0/402] eta 0:24:40 lr 0.000025 time 3.6819 (3.6819) loss 0.6173 (0.6173) grad_norm 0.1455 (0.1455) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:40:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [264/800][100/402] eta 0:03:54 lr 0.000025 time 0.7460 (0.7757) loss 0.6424 (0.6279) grad_norm 0.1411 (0.1443) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:41:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [264/800][200/402] eta 0:02:33 lr 
0.000025 time 0.7462 (0.7617) loss 0.6582 (0.6286) grad_norm 0.1468 (0.1438) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:42:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [264/800][300/402] eta 0:01:17 lr 0.000025 time 0.7469 (0.7567) loss 0.6078 (0.6279) grad_norm 0.1337 (0.1445) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:44:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [264/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7541) loss 0.6120 (0.6281) grad_norm 0.1624 (0.1450) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:44:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 264 training takes 0:05:03 [2024-03-09 14:44:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [265/800][0/402] eta 0:24:29 lr 0.000025 time 3.6565 (3.6565) loss 0.6340 (0.6340) grad_norm 0.1391 (0.1391) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:45:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [265/800][100/402] eta 0:03:54 lr 0.000025 time 0.7460 (0.7754) loss 0.6193 (0.6302) grad_norm 0.1284 (0.1428) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:46:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [265/800][200/402] eta 0:02:33 lr 0.000025 time 0.7483 (0.7610) loss 0.6339 (0.6315) grad_norm 0.1441 (0.1429) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:47:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [265/800][300/402] eta 0:01:17 lr 0.000025 time 0.7470 (0.7562) loss 0.6429 (0.6297) grad_norm 0.1403 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:49:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [265/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7537) loss 0.6425 (0.6292) grad_norm 0.1193 (0.1439) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:49:12 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 185): INFO EPOCH 265 training takes 0:05:03 [2024-03-09 14:49:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [266/800][0/402] eta 0:36:20 lr 0.000025 time 5.4247 (5.4247) loss 0.6093 (0.6093) grad_norm 0.1492 (0.1492) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:50:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [266/800][100/402] eta 0:03:59 lr 0.000025 time 0.7461 (0.7927) loss 0.6142 (0.6321) grad_norm 0.1340 (0.1448) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:51:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [266/800][200/402] eta 0:02:35 lr 0.000025 time 0.7461 (0.7695) loss 0.6262 (0.6299) grad_norm 0.1342 (0.1439) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:53:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [266/800][300/402] eta 0:01:17 lr 0.000025 time 0.7454 (0.7622) loss 0.6140 (0.6291) grad_norm 0.1287 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:54:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [266/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7582) loss 0.6416 (0.6292) grad_norm 0.1459 (0.1445) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:54:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 266 training takes 0:05:04 [2024-03-09 14:54:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [267/800][0/402] eta 0:25:10 lr 0.000025 time 3.7578 (3.7578) loss 0.6034 (0.6034) grad_norm 0.1500 (0.1500) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 14:55:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [267/800][100/402] eta 0:03:54 lr 0.000025 time 0.7458 (0.7765) loss 0.5995 (0.6313) grad_norm 0.1293 (inf) loss_scale 131072.0000 (184279.4455) mem 28968MB [2024-03-09 14:56:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [267/800][200/402] eta 0:02:33 
lr 0.000025 time 0.7457 (0.7615) loss 0.6000 (0.6301) grad_norm 0.1792 (inf) loss_scale 131072.0000 (157808.0796) mem 28968MB [2024-03-09 14:58:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [267/800][300/402] eta 0:01:17 lr 0.000025 time 0.7464 (0.7566) loss 0.6294 (0.6297) grad_norm 0.1127 (inf) loss_scale 131072.0000 (148925.6611) mem 28968MB [2024-03-09 14:59:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [267/800][400/402] eta 0:00:01 lr 0.000025 time 0.7457 (0.7540) loss 0.6046 (0.6297) grad_norm 0.1350 (inf) loss_scale 131072.0000 (144473.3766) mem 28968MB [2024-03-09 14:59:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 267 training takes 0:05:03 [2024-03-09 14:59:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [268/800][0/402] eta 0:25:40 lr 0.000025 time 3.8315 (3.8315) loss 0.6103 (0.6103) grad_norm 0.1699 (0.1699) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:00:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [268/800][100/402] eta 0:03:54 lr 0.000025 time 0.7465 (0.7769) loss 0.6163 (0.6291) grad_norm 0.1557 (0.1457) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:01:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [268/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7618) loss 0.6294 (0.6299) grad_norm 0.1554 (0.1434) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:03:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [268/800][300/402] eta 0:01:17 lr 0.000025 time 0.7473 (0.7567) loss 0.6048 (0.6296) grad_norm 0.1348 (0.1438) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:04:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [268/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7542) loss 0.6402 (0.6296) grad_norm 0.1306 (0.1443) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:04:23 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 185): INFO EPOCH 268 training takes 0:05:03 [2024-03-09 15:04:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [269/800][0/402] eta 0:26:08 lr 0.000025 time 3.9017 (3.9017) loss 0.6245 (0.6245) grad_norm 0.1687 (0.1687) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:05:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [269/800][100/402] eta 0:03:54 lr 0.000025 time 0.7465 (0.7777) loss 0.6170 (0.6283) grad_norm 0.1280 (0.1457) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:06:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [269/800][200/402] eta 0:02:33 lr 0.000025 time 0.7466 (0.7621) loss 0.6308 (0.6287) grad_norm 0.1354 (0.1436) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:08:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [269/800][300/402] eta 0:01:17 lr 0.000025 time 0.7465 (0.7569) loss 0.6328 (0.6304) grad_norm 0.1336 (0.1431) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:09:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [269/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7543) loss 0.6428 (0.6299) grad_norm 0.1239 (0.1444) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:09:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 269 training takes 0:05:03 [2024-03-09 15:09:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [270/800][0/402] eta 0:25:08 lr 0.000025 time 3.7521 (3.7521) loss 0.6307 (0.6307) grad_norm 0.1371 (0.1371) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:10:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [270/800][100/402] eta 0:03:54 lr 0.000025 time 0.7466 (0.7762) loss 0.6384 (0.6287) grad_norm 0.1476 (0.1455) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:11:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [270/800][200/402] eta 
0:02:33 lr 0.000025 time 0.7461 (0.7614) loss 0.6124 (0.6293) grad_norm 0.1444 (0.1455) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:13:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [270/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7564) loss 0.6170 (0.6294) grad_norm 0.1189 (0.1443) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:14:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [270/800][400/402] eta 0:00:01 lr 0.000025 time 0.7459 (0.7539) loss 0.6460 (0.6299) grad_norm 0.1318 (0.1451) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:14:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 270 training takes 0:05:03 [2024-03-09 15:14:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [271/800][0/402] eta 0:36:54 lr 0.000025 time 5.5099 (5.5099) loss 0.6391 (0.6391) grad_norm 0.1519 (0.1519) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:15:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [271/800][100/402] eta 0:03:59 lr 0.000025 time 0.7476 (0.7936) loss 0.6260 (0.6317) grad_norm 0.1553 (0.1485) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:17:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [271/800][200/402] eta 0:02:35 lr 0.000025 time 0.7457 (0.7703) loss 0.6216 (0.6292) grad_norm 0.1339 (0.1482) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:18:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [271/800][300/402] eta 0:01:17 lr 0.000025 time 0.7489 (0.7624) loss 0.6436 (0.6295) grad_norm 0.1589 (0.1456) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:19:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [271/800][400/402] eta 0:00:01 lr 0.000025 time 0.7436 (0.7583) loss 0.6543 (0.6298) grad_norm 0.1351 (0.1460) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:19:34 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 271 training takes 0:05:04 [2024-03-09 15:19:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [272/800][0/402] eta 0:24:58 lr 0.000025 time 3.7276 (3.7276) loss 0.6164 (0.6164) grad_norm 0.1518 (0.1518) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:20:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [272/800][100/402] eta 0:03:54 lr 0.000025 time 0.7460 (0.7759) loss 0.6025 (0.6297) grad_norm 0.1765 (0.1453) loss_scale 262144.0000 (221913.9802) mem 28968MB [2024-03-09 15:22:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [272/800][200/402] eta 0:02:33 lr 0.000025 time 0.7471 (0.7612) loss 0.6286 (0.6305) grad_norm 0.1516 (inf) loss_scale 131072.0000 (202150.8458) mem 28968MB [2024-03-09 15:23:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [272/800][300/402] eta 0:01:17 lr 0.000025 time 0.7464 (0.7563) loss 0.6175 (0.6300) grad_norm 0.1542 (inf) loss_scale 131072.0000 (178536.6113) mem 28968MB [2024-03-09 15:24:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [272/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7537) loss 0.6208 (0.6294) grad_norm 0.1303 (inf) loss_scale 131072.0000 (166700.0499) mem 28968MB [2024-03-09 15:24:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 272 training takes 0:05:03 [2024-03-09 15:24:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [273/800][0/402] eta 0:24:41 lr 0.000025 time 3.6847 (3.6847) loss 0.6612 (0.6612) grad_norm 0.1129 (0.1129) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:25:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [273/800][100/402] eta 0:03:54 lr 0.000025 time 0.7463 (0.7754) loss 0.6357 (0.6303) grad_norm 0.1347 (0.1437) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:27:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[273/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7618) loss 0.6293 (0.6311) grad_norm 0.1225 (0.1454) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:28:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [273/800][300/402] eta 0:01:17 lr 0.000025 time 0.7479 (0.7568) loss 0.6221 (0.6311) grad_norm 0.1547 (0.1455) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:29:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [273/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7542) loss 0.5846 (0.6295) grad_norm 0.1680 (0.1489) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:29:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 273 training takes 0:05:03 [2024-03-09 15:29:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [274/800][0/402] eta 0:24:52 lr 0.000025 time 3.7125 (3.7125) loss 0.6347 (0.6347) grad_norm 0.1347 (0.1347) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:30:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [274/800][100/402] eta 0:03:54 lr 0.000025 time 0.7462 (0.7757) loss 0.6322 (0.6283) grad_norm 0.1459 (0.1459) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:32:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [274/800][200/402] eta 0:02:33 lr 0.000025 time 0.7469 (0.7612) loss 0.6448 (0.6278) grad_norm 0.1132 (0.1425) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:33:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [274/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7563) loss 0.6353 (0.6289) grad_norm 0.1409 (0.1426) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:34:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [274/800][400/402] eta 0:00:01 lr 0.000025 time 0.7471 (0.7538) loss 0.6071 (0.6291) grad_norm 0.1556 (0.1423) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 
15:34:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 274 training takes 0:05:03 [2024-03-09 15:34:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [275/800][0/402] eta 0:26:14 lr 0.000025 time 3.9161 (3.9161) loss 0.6420 (0.6420) grad_norm 0.1606 (0.1606) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:36:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [275/800][100/402] eta 0:03:54 lr 0.000025 time 0.7462 (0.7778) loss 0.6086 (0.6278) grad_norm 0.1383 (0.1463) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:37:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [275/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7622) loss 0.5937 (0.6301) grad_norm 0.1452 (0.1450) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:38:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [275/800][300/402] eta 0:01:17 lr 0.000025 time 0.7473 (0.7573) loss 0.6429 (0.6292) grad_norm 0.1342 (0.1446) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:39:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [275/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7547) loss 0.6254 (0.6289) grad_norm 0.1425 (0.1442) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:39:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 275 training takes 0:05:03 [2024-03-09 15:39:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [276/800][0/402] eta 0:36:46 lr 0.000025 time 5.4887 (5.4887) loss 0.5989 (0.5989) grad_norm 0.1395 (0.1395) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:41:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [276/800][100/402] eta 0:03:59 lr 0.000025 time 0.7461 (0.7935) loss 0.6225 (0.6297) grad_norm 0.1504 (0.1460) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:42:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [276/800][200/402] eta 0:02:35 lr 0.000025 time 0.7461 (0.7700) loss 0.6321 (0.6286) grad_norm 0.1369 (0.1442) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:43:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [276/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7622) loss 0.6047 (0.6276) grad_norm 0.1111 (0.1445) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:44:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [276/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7582) loss 0.6689 (0.6281) grad_norm 0.1529 (0.1445) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:44:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 276 training takes 0:05:04 [2024-03-09 15:44:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [277/800][0/402] eta 0:25:16 lr 0.000025 time 3.7715 (3.7715) loss 0.6271 (0.6271) grad_norm 0.1561 (0.1561) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:46:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [277/800][100/402] eta 0:03:54 lr 0.000025 time 0.7456 (0.7765) loss 0.6353 (0.6316) grad_norm 0.1277 (0.1419) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 15:47:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [277/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7615) loss 0.6017 (0.6312) grad_norm 0.1266 (0.1410) loss_scale 262144.0000 (177371.0647) mem 28968MB [2024-03-09 15:48:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [277/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7564) loss 0.6114 (0.6296) grad_norm 0.1413 (0.1427) loss_scale 262144.0000 (205534.8306) mem 28968MB [2024-03-09 15:49:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [277/800][400/402] eta 0:00:01 lr 0.000025 time 0.7456 (0.7539) loss 0.6064 (0.6296) grad_norm 0.1351 (0.1429) loss_scale 262144.0000 (219651.8304) mem 28968MB 
[2024-03-09 15:49:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 277 training takes 0:05:03 [2024-03-09 15:49:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [278/800][0/402] eta 0:22:44 lr 0.000025 time 3.3933 (3.3933) loss 0.6197 (0.6197) grad_norm 0.1463 (0.1463) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 15:51:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [278/800][100/402] eta 0:03:53 lr 0.000025 time 0.7461 (0.7721) loss 0.6271 (0.6278) grad_norm 0.1415 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 15:52:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [278/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7590) loss 0.6593 (0.6292) grad_norm 0.1324 (0.1464) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 15:53:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [278/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7546) loss 0.6361 (0.6290) grad_norm 0.1688 (0.1451) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 15:54:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [278/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7523) loss 0.6182 (0.6297) grad_norm 0.1327 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 15:54:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 278 training takes 0:05:02 [2024-03-09 15:55:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [279/800][0/402] eta 0:22:07 lr 0.000025 time 3.3020 (3.3020) loss 0.6187 (0.6187) grad_norm 0.1455 (0.1455) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 15:56:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [279/800][100/402] eta 0:03:53 lr 0.000025 time 0.7465 (0.7721) loss 0.6503 (0.6254) grad_norm 0.1618 (0.1448) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 15:57:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [279/800][200/402] eta 0:02:33 lr 0.000025 time 0.7468 (0.7595) loss 0.6406 (0.6299) grad_norm 0.1516 (0.1435) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 15:58:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [279/800][300/402] eta 0:01:17 lr 0.000025 time 0.7475 (0.7553) loss 0.6114 (0.6297) grad_norm 0.1775 (0.1434) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:00:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [279/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7531) loss 0.6282 (0.6299) grad_norm 0.1197 (0.1438) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:00:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 279 training takes 0:05:02 [2024-03-09 16:00:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [280/800][0/402] eta 0:21:55 lr 0.000025 time 3.2716 (3.2716) loss 0.6144 (0.6144) grad_norm 0.1829 (0.1829) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:01:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [280/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7709) loss 0.6423 (0.6295) grad_norm 0.1891 (0.1438) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:02:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [280/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7584) loss 0.6369 (0.6296) grad_norm 0.1318 (0.1428) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:03:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [280/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7542) loss 0.6296 (0.6296) grad_norm 0.1347 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:05:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [280/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7521) loss 0.6260 (0.6290) grad_norm 0.1463 (0.1450) loss_scale 262144.0000 (262144.0000) mem 
28968MB [2024-03-09 16:05:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 280 training takes 0:05:02 [2024-03-09 16:05:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [281/800][0/402] eta 0:32:33 lr 0.000025 time 4.8586 (4.8586) loss 0.6602 (0.6602) grad_norm 0.1241 (0.1241) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:06:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [281/800][100/402] eta 0:03:57 lr 0.000025 time 0.7455 (0.7864) loss 0.6498 (0.6295) grad_norm 0.1109 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:07:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [281/800][200/402] eta 0:02:34 lr 0.000025 time 0.7459 (0.7662) loss 0.6542 (0.6302) grad_norm 0.1168 (0.1481) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:08:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [281/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7594) loss 0.6674 (0.6293) grad_norm 0.1436 (0.1470) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:10:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [281/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7559) loss 0.6288 (0.6291) grad_norm 0.1475 (0.1457) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:10:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 281 training takes 0:05:03 [2024-03-09 16:10:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [282/800][0/402] eta 0:21:59 lr 0.000025 time 3.2834 (3.2834) loss 0.6248 (0.6248) grad_norm 0.3229 (0.3229) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:11:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [282/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7709) loss 0.6443 (0.6279) grad_norm 0.1412 (0.1515) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:12:40 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [282/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7584) loss 0.6356 (0.6284) grad_norm 0.1237 (0.1459) loss_scale 524288.0000 (367784.1194) mem 28968MB [2024-03-09 16:13:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [282/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7541) loss 0.6574 (0.6284) grad_norm 0.1328 (inf) loss_scale 262144.0000 (401489.6478) mem 28968MB [2024-03-09 16:15:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [282/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7520) loss 0.6356 (0.6288) grad_norm 0.1733 (inf) loss_scale 262144.0000 (366740.1097) mem 28968MB [2024-03-09 16:15:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 282 training takes 0:05:02 [2024-03-09 16:15:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [283/800][0/402] eta 0:22:42 lr 0.000025 time 3.3899 (3.3899) loss 0.6056 (0.6056) grad_norm 0.1245 (0.1245) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:16:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [283/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7719) loss 0.6660 (0.6284) grad_norm 0.1379 (0.1456) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:17:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [283/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7589) loss 0.6148 (0.6297) grad_norm 0.1482 (0.1456) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:18:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [283/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7545) loss 0.6398 (0.6298) grad_norm 0.1427 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:20:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [283/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7523) loss 0.6081 (0.6291) grad_norm 0.1577 (0.1450) loss_scale 262144.0000 
(262144.0000) mem 28968MB [2024-03-09 16:20:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 283 training takes 0:05:02 [2024-03-09 16:20:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [284/800][0/402] eta 0:22:20 lr 0.000025 time 3.3338 (3.3338) loss 0.6256 (0.6256) grad_norm 0.1188 (0.1188) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:21:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [284/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7713) loss 0.6118 (0.6292) grad_norm 0.1546 (0.1450) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:22:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [284/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7585) loss 0.6308 (0.6291) grad_norm 0.1438 (inf) loss_scale 131072.0000 (208019.7413) mem 28968MB [2024-03-09 16:23:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [284/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7543) loss 0.6473 (0.6293) grad_norm 0.1926 (inf) loss_scale 131072.0000 (182455.7076) mem 28968MB [2024-03-09 16:25:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [284/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.6318 (0.6291) grad_norm 0.1306 (inf) loss_scale 131072.0000 (169641.8155) mem 28968MB [2024-03-09 16:25:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 284 training takes 0:05:02 [2024-03-09 16:25:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [285/800][0/402] eta 0:22:10 lr 0.000025 time 3.3094 (3.3094) loss 0.6557 (0.6557) grad_norm 0.1527 (0.1527) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:26:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [285/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7711) loss 0.6271 (0.6293) grad_norm 0.1409 (0.1430) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:27:47 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [285/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7585) loss 0.6084 (0.6285) grad_norm 0.1209 (0.1454) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:29:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [285/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7542) loss 0.6509 (0.6278) grad_norm 0.1334 (0.1448) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:30:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [285/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7521) loss 0.6461 (0.6289) grad_norm 0.1528 (0.1446) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:30:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 285 training takes 0:05:02 [2024-03-09 16:30:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [286/800][0/402] eta 0:32:07 lr 0.000025 time 4.7952 (4.7952) loss 0.6160 (0.6160) grad_norm 0.1330 (0.1330) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:31:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [286/800][100/402] eta 0:03:57 lr 0.000025 time 0.7454 (0.7858) loss 0.6153 (0.6288) grad_norm 0.1167 (0.1428) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:32:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [286/800][200/402] eta 0:02:34 lr 0.000025 time 0.7455 (0.7658) loss 0.6174 (0.6300) grad_norm 0.1489 (0.1440) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:34:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [286/800][300/402] eta 0:01:17 lr 0.000025 time 0.7460 (0.7592) loss 0.6529 (0.6301) grad_norm 0.1286 (0.1445) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:35:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [286/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7558) loss 0.6406 (0.6294) grad_norm 0.1382 (0.1448) loss_scale 131072.0000 
(131072.0000) mem 28968MB [2024-03-09 16:35:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 286 training takes 0:05:03 [2024-03-09 16:35:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [287/800][0/402] eta 0:22:01 lr 0.000025 time 3.2871 (3.2871) loss 0.6371 (0.6371) grad_norm 0.1310 (0.1310) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:36:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [287/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7708) loss 0.6233 (0.6276) grad_norm 0.1347 (0.1435) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:37:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [287/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7584) loss 0.6104 (0.6283) grad_norm 0.1407 (0.1429) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:39:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [287/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7542) loss 0.6359 (0.6286) grad_norm 0.1695 (0.1426) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:40:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [287/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7520) loss 0.6226 (0.6295) grad_norm 0.1623 (0.1430) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:40:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 287 training takes 0:05:02 [2024-03-09 16:40:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [288/800][0/402] eta 0:21:39 lr 0.000025 time 3.2317 (3.2317) loss 0.6272 (0.6272) grad_norm 0.1320 (0.1320) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:41:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [288/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7702) loss 0.6340 (0.6298) grad_norm 0.1376 (0.1501) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:42:56 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [288/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7580) loss 0.6334 (0.6298) grad_norm 0.1424 (0.1501) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:44:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [288/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7539) loss 0.6001 (0.6295) grad_norm 0.1477 (0.1478) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:45:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [288/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7518) loss 0.6420 (0.6301) grad_norm 0.1258 (0.1463) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:45:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 288 training takes 0:05:02 [2024-03-09 16:45:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [289/800][0/402] eta 0:22:29 lr 0.000025 time 3.3563 (3.3563) loss 0.6333 (0.6333) grad_norm 0.1314 (0.1314) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:46:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [289/800][100/402] eta 0:03:53 lr 0.000025 time 0.7453 (0.7717) loss 0.6028 (0.6275) grad_norm 0.1421 (0.1434) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 16:47:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [289/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7588) loss 0.6106 (0.6281) grad_norm 0.1750 (0.1429) loss_scale 262144.0000 (191717.2537) mem 28968MB [2024-03-09 16:49:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [289/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7545) loss 0.6555 (0.6286) grad_norm 0.1812 (0.1439) loss_scale 262144.0000 (215114.8439) mem 28968MB [2024-03-09 16:50:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [289/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7522) loss 0.6267 (0.6285) grad_norm 0.1468 (0.1445) 
loss_scale 262144.0000 (226842.8130) mem 28968MB [2024-03-09 16:50:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 289 training takes 0:05:02 [2024-03-09 16:50:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [290/800][0/402] eta 0:22:11 lr 0.000025 time 3.3111 (3.3111) loss 0.6291 (0.6291) grad_norm 0.1463 (0.1463) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:51:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [290/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7711) loss 0.6403 (0.6275) grad_norm 0.1195 (0.1435) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:53:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [290/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7584) loss 0.6570 (0.6298) grad_norm 0.1341 (0.1433) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:54:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [290/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7542) loss 0.6108 (0.6297) grad_norm 0.1362 (0.1437) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:55:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [290/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7521) loss 0.6468 (0.6301) grad_norm 0.1492 (0.1438) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:55:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 290 training takes 0:05:02 [2024-03-09 16:55:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [291/800][0/402] eta 0:32:07 lr 0.000025 time 4.7940 (4.7940) loss 0.6491 (0.6491) grad_norm 0.1381 (0.1381) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:56:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [291/800][100/402] eta 0:03:57 lr 0.000025 time 0.7473 (0.7863) loss 0.6085 (0.6309) grad_norm 0.1294 (0.1432) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 
16:58:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [291/800][200/402] eta 0:02:34 lr 0.000025 time 0.7470 (0.7667) loss 0.6282 (0.6299) grad_norm 0.1581 (0.1442) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 16:59:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [291/800][300/402] eta 0:01:17 lr 0.000025 time 0.7466 (0.7601) loss 0.6364 (0.6296) grad_norm 0.1648 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:00:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [291/800][400/402] eta 0:00:01 lr 0.000025 time 0.7458 (0.7568) loss 0.6377 (0.6291) grad_norm 0.1432 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:00:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 291 training takes 0:05:04 [2024-03-09 17:00:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [292/800][0/402] eta 0:21:45 lr 0.000025 time 3.2483 (3.2483) loss 0.6332 (0.6332) grad_norm 0.1358 (0.1358) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:01:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [292/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7705) loss 0.6195 (0.6299) grad_norm 0.1673 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:03:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [292/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7582) loss 0.6168 (0.6280) grad_norm 0.1555 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:04:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [292/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7541) loss 0.6362 (0.6285) grad_norm 0.1559 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:05:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [292/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7520) loss 0.6239 (0.6282) grad_norm 0.1290 
(0.1441) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:05:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 292 training takes 0:05:02 [2024-03-09 17:05:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [293/800][0/402] eta 0:21:44 lr 0.000025 time 3.2458 (3.2458) loss 0.6511 (0.6511) grad_norm 0.1396 (0.1396) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:06:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [293/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7705) loss 0.6413 (0.6269) grad_norm 0.1501 (0.1459) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:08:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [293/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7582) loss 0.6626 (0.6264) grad_norm 0.1430 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:09:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [293/800][300/402] eta 0:01:16 lr 0.000025 time 0.7465 (0.7541) loss 0.6238 (0.6272) grad_norm 0.1734 (0.1448) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:10:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [293/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7520) loss 0.6484 (0.6283) grad_norm 0.1290 (0.1445) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:10:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 293 training takes 0:05:02 [2024-03-09 17:10:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [294/800][0/402] eta 0:22:28 lr 0.000025 time 3.3540 (3.3540) loss 0.6380 (0.6380) grad_norm 0.1517 (0.1517) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:11:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [294/800][100/402] eta 0:03:52 lr 0.000025 time 0.7461 (0.7715) loss 0.6036 (0.6266) grad_norm 0.1676 (0.1418) loss_scale 524288.0000 (269930.4554) mem 28968MB 
[2024-03-09 17:13:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [294/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7586) loss 0.6465 (0.6279) grad_norm 0.1266 (inf) loss_scale 262144.0000 (286923.7811) mem 28968MB [2024-03-09 17:14:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [294/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7544) loss 0.6350 (0.6287) grad_norm 0.1507 (inf) loss_scale 262144.0000 (278691.2957) mem 28968MB [2024-03-09 17:15:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [294/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7522) loss 0.6215 (0.6287) grad_norm 0.1428 (inf) loss_scale 262144.0000 (274564.7880) mem 28968MB [2024-03-09 17:15:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 294 training takes 0:05:02 [2024-03-09 17:15:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [295/800][0/402] eta 0:22:11 lr 0.000025 time 3.3122 (3.3122) loss 0.6464 (0.6464) grad_norm 0.1283 (0.1283) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:17:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [295/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7711) loss 0.6566 (0.6270) grad_norm 0.1635 (0.1424) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:18:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [295/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7584) loss 0.6461 (0.6278) grad_norm 0.1186 (0.1417) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:19:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [295/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7542) loss 0.6188 (0.6283) grad_norm 0.1219 (0.1428) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:20:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [295/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7520) loss 0.6281 (0.6280) grad_norm 
0.1681 (0.1440) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:20:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 295 training takes 0:05:02 [2024-03-09 17:20:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [296/800][0/402] eta 0:32:14 lr 0.000025 time 4.8123 (4.8123) loss 0.6373 (0.6373) grad_norm 0.1244 (0.1244) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:22:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [296/800][100/402] eta 0:03:57 lr 0.000025 time 0.7455 (0.7861) loss 0.6322 (0.6295) grad_norm 0.1397 (0.1458) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:23:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [296/800][200/402] eta 0:02:34 lr 0.000025 time 0.7456 (0.7662) loss 0.6360 (0.6288) grad_norm 0.1376 (0.1442) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:24:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [296/800][300/402] eta 0:01:17 lr 0.000025 time 0.7469 (0.7596) loss 0.6453 (0.6284) grad_norm 0.1283 (0.1441) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:25:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [296/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7563) loss 0.6454 (0.6284) grad_norm 0.1180 (0.1438) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:25:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 296 training takes 0:05:04 [2024-03-09 17:25:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [297/800][0/402] eta 0:22:10 lr 0.000025 time 3.3090 (3.3090) loss 0.6404 (0.6404) grad_norm 0.1223 (0.1223) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:27:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [297/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7711) loss 0.6132 (0.6278) grad_norm 0.1725 (0.1443) loss_scale 262144.0000 (262144.0000) mem 28968MB 
[2024-03-09 17:28:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [297/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7585) loss 0.6416 (0.6283) grad_norm 0.1360 (0.1454) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:29:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [297/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7543) loss 0.6403 (0.6286) grad_norm 0.1487 (0.1438) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:30:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [297/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7521) loss 0.6379 (0.6291) grad_norm 0.1349 (0.1451) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:30:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 297 training takes 0:05:02 [2024-03-09 17:30:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [298/800][0/402] eta 0:22:28 lr 0.000025 time 3.3552 (3.3552) loss 0.6434 (0.6434) grad_norm 0.1396 (0.1396) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:32:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [298/800][100/402] eta 0:03:53 lr 0.000025 time 0.7457 (0.7715) loss 0.6420 (0.6304) grad_norm 0.1484 (0.1457) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:33:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [298/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7587) loss 0.6081 (0.6287) grad_norm 0.1373 (0.1435) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:34:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [298/800][300/402] eta 0:01:16 lr 0.000025 time 0.7463 (0.7544) loss 0.6321 (0.6288) grad_norm 0.1512 (0.1435) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:35:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [298/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7522) loss 0.5942 (0.6282) 
grad_norm 0.1329 (0.1432) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:35:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 298 training takes 0:05:02 [2024-03-09 17:35:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [299/800][0/402] eta 0:21:46 lr 0.000025 time 3.2506 (3.2506) loss 0.6513 (0.6513) grad_norm 0.1429 (0.1429) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:37:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [299/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7705) loss 0.6347 (0.6298) grad_norm 0.1444 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:38:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [299/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7582) loss 0.6496 (0.6294) grad_norm 0.1566 (0.1451) loss_scale 524288.0000 (384738.7065) mem 28968MB [2024-03-09 17:39:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [299/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7541) loss 0.6408 (0.6297) grad_norm 0.1576 (0.1448) loss_scale 524288.0000 (431100.5980) mem 28968MB [2024-03-09 17:40:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [299/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7520) loss 0.6287 (0.6292) grad_norm 0.1200 (0.1445) loss_scale 524288.0000 (454339.3516) mem 28968MB [2024-03-09 17:40:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 299 training takes 0:05:02 [2024-03-09 17:40:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [300/800][0/402] eta 0:21:45 lr 0.000025 time 3.2484 (3.2484) loss 0.6378 (0.6378) grad_norm 0.1508 (0.1508) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 17:42:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [300/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7705) loss 0.6095 (0.6320) grad_norm 0.1482 (0.1450) loss_scale 524288.0000 (524288.0000) mem 
28968MB [2024-03-09 17:43:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [300/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7582) loss 0.6412 (0.6301) grad_norm 0.1522 (0.1435) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 17:44:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [300/800][300/402] eta 0:01:16 lr 0.000025 time 0.7469 (0.7541) loss 0.6485 (0.6296) grad_norm 0.1192 (0.1425) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 17:45:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [300/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7519) loss 0.6246 (0.6292) grad_norm 0.1339 (0.1430) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 17:45:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 300 training takes 0:05:02 [2024-03-09 17:46:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [301/800][0/402] eta 0:31:34 lr 0.000025 time 4.7122 (4.7122) loss 0.6269 (0.6269) grad_norm 0.1545 (0.1545) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 17:47:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [301/800][100/402] eta 0:03:57 lr 0.000025 time 0.7456 (0.7849) loss 0.6304 (0.6300) grad_norm 0.1771 (inf) loss_scale 262144.0000 (316649.1881) mem 28968MB [2024-03-09 17:48:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [301/800][200/402] eta 0:02:34 lr 0.000025 time 0.7462 (0.7654) loss 0.6027 (0.6282) grad_norm 0.1539 (inf) loss_scale 262144.0000 (289532.1791) mem 28968MB [2024-03-09 17:49:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [301/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7589) loss 0.6197 (0.6286) grad_norm 0.1717 (inf) loss_scale 262144.0000 (280433.1163) mem 28968MB [2024-03-09 17:51:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [301/800][400/402] eta 0:00:01 lr 0.000025 time 0.7437 (0.7555) loss 0.6418 (0.6287) 
grad_norm 0.1229 (inf) loss_scale 262144.0000 (275872.2394) mem 28968MB [2024-03-09 17:51:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 301 training takes 0:05:03 [2024-03-09 17:51:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [302/800][0/402] eta 0:22:35 lr 0.000025 time 3.3718 (3.3718) loss 0.6459 (0.6459) grad_norm 0.1218 (0.1218) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:52:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [302/800][100/402] eta 0:03:53 lr 0.000025 time 0.7457 (0.7717) loss 0.6134 (0.6260) grad_norm 0.1322 (0.1459) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 17:53:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [302/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7588) loss 0.6272 (0.6280) grad_norm 0.1712 (inf) loss_scale 131072.0000 (250406.2090) mem 28968MB [2024-03-09 17:54:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [302/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7545) loss 0.6436 (0.6288) grad_norm 0.1450 (inf) loss_scale 131072.0000 (210760.2924) mem 28968MB [2024-03-09 17:56:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [302/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7522) loss 0.6205 (0.6282) grad_norm 0.1454 (inf) loss_scale 131072.0000 (190887.9002) mem 28968MB [2024-03-09 17:56:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 302 training takes 0:05:02 [2024-03-09 17:56:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [303/800][0/402] eta 0:22:32 lr 0.000025 time 3.3657 (3.3657) loss 0.6204 (0.6204) grad_norm 0.1805 (0.1805) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 17:57:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [303/800][100/402] eta 0:03:53 lr 0.000025 time 0.7457 (0.7716) loss 0.5954 (0.6303) grad_norm 0.1536 (0.1410) loss_scale 131072.0000 (131072.0000) mem 28968MB 
[2024-03-09 17:58:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [303/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7587) loss 0.6363 (0.6280) grad_norm 0.1409 (0.1434) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 17:59:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [303/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7544) loss 0.6507 (0.6291) grad_norm 0.1551 (0.1463) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:01:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [303/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7522) loss 0.6453 (0.6292) grad_norm 0.1532 (0.1464) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:01:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 303 training takes 0:05:02 [2024-03-09 18:01:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [304/800][0/402] eta 0:22:24 lr 0.000025 time 3.3446 (3.3446) loss 0.6416 (0.6416) grad_norm 0.1301 (0.1301) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:02:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [304/800][100/402] eta 0:03:52 lr 0.000025 time 0.7465 (0.7715) loss 0.6386 (0.6256) grad_norm 0.1417 (0.1375) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:03:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [304/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7587) loss 0.6231 (0.6272) grad_norm 0.1701 (0.1412) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:04:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [304/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7544) loss 0.6371 (0.6279) grad_norm 0.1520 (0.1407) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:06:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [304/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7522) loss 0.6366 (0.6271) 
grad_norm 0.1502 (inf) loss_scale 65536.0000 (124698.1746) mem 28968MB [2024-03-09 18:06:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 304 training takes 0:05:02 [2024-03-09 18:06:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [305/800][0/402] eta 0:21:42 lr 0.000025 time 3.2412 (3.2412) loss 0.6451 (0.6451) grad_norm 0.1506 (0.1506) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:07:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [305/800][100/402] eta 0:03:52 lr 0.000025 time 0.7467 (0.7704) loss 0.5937 (0.6274) grad_norm 0.1548 (0.1464) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:08:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [305/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7581) loss 0.6311 (0.6263) grad_norm 0.1645 (0.1451) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:09:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [305/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7540) loss 0.6247 (0.6274) grad_norm 0.1243 (0.1437) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:11:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [305/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7519) loss 0.6226 (0.6276) grad_norm 0.1615 (0.1432) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:11:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 305 training takes 0:05:02 [2024-03-09 18:11:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [306/800][0/402] eta 0:31:29 lr 0.000025 time 4.7006 (4.7006) loss 0.6360 (0.6360) grad_norm 0.1253 (0.1253) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:12:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [306/800][100/402] eta 0:03:57 lr 0.000025 time 0.7532 (0.7862) loss 0.6158 (0.6259) grad_norm 0.1391 (0.1437) loss_scale 65536.0000 (65536.0000) mem 28968MB 
[2024-03-09 18:13:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [306/800][200/402] eta 0:02:34 lr 0.000025 time 0.7457 (0.7662) loss 0.6289 (0.6283) grad_norm 0.1952 (0.1436) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:15:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [306/800][300/402] eta 0:01:17 lr 0.000025 time 0.7468 (0.7596) loss 0.6151 (0.6277) grad_norm 0.1203 (0.1454) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:16:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [306/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7561) loss 0.6453 (0.6275) grad_norm 0.1388 (0.1452) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:16:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 306 training takes 0:05:04 [2024-03-09 18:16:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [307/800][0/402] eta 0:21:49 lr 0.000025 time 3.2587 (3.2587) loss 0.6289 (0.6289) grad_norm 0.1243 (0.1243) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:17:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [307/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7706) loss 0.6036 (0.6284) grad_norm 0.1795 (0.1435) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:18:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [307/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7583) loss 0.6308 (0.6280) grad_norm 0.1698 (0.1449) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:20:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [307/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7541) loss 0.6002 (0.6278) grad_norm 0.1535 (0.1449) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:21:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [307/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7520) loss 0.6247 (0.6282) grad_norm 0.1643 
(0.1440) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:21:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 307 training takes 0:05:02 [2024-03-09 18:21:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [308/800][0/402] eta 0:21:53 lr 0.000025 time 3.2678 (3.2678) loss 0.6141 (0.6141) grad_norm 0.1223 (0.1223) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:22:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [308/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7708) loss 0.6284 (0.6308) grad_norm 0.1166 (0.1423) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:23:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [308/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7583) loss 0.6168 (0.6296) grad_norm 0.1607 (0.1449) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:25:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [308/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7542) loss 0.6265 (0.6297) grad_norm 0.1512 (0.1440) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:26:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [308/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.5994 (0.6293) grad_norm 0.1581 (0.1439) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:26:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 308 training takes 0:05:02 [2024-03-09 18:26:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [309/800][0/402] eta 0:21:53 lr 0.000025 time 3.2673 (3.2673) loss 0.6339 (0.6339) grad_norm 0.1310 (0.1310) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:27:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [309/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7707) loss 0.6099 (0.6279) grad_norm 0.1590 (0.1459) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:28:53 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [309/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7583) loss 0.6127 (0.6270) grad_norm 0.1380 (0.1462) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:30:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [309/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7541) loss 0.5667 (0.6272) grad_norm 0.1188 (0.1454) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-09 18:31:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [309/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7520) loss 0.6177 (0.6279) grad_norm 0.1458 (0.1468) loss_scale 131072.0000 (73544.1397) mem 28968MB [2024-03-09 18:31:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 309 training takes 0:05:02 [2024-03-09 18:31:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [310/800][0/402] eta 0:22:24 lr 0.000025 time 3.3435 (3.3435) loss 0.6302 (0.6302) grad_norm 0.1536 (0.1536) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:32:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [310/800][100/402] eta 0:03:52 lr 0.000025 time 0.7464 (0.7715) loss 0.6126 (0.6278) grad_norm 0.1402 (0.1424) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:33:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [310/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7588) loss 0.6603 (0.6277) grad_norm 0.1453 (0.1426) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:35:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [310/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7544) loss 0.6189 (0.6291) grad_norm 0.1529 (0.1429) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:36:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [310/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7523) loss 0.6274 (0.6291) grad_norm 0.1444 (0.1427) 
loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:36:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 310 training takes 0:05:02 [2024-03-09 18:36:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [311/800][0/402] eta 0:32:23 lr 0.000025 time 4.8347 (4.8347) loss 0.6373 (0.6373) grad_norm 0.1599 (0.1599) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:37:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [311/800][100/402] eta 0:03:57 lr 0.000025 time 0.7467 (0.7865) loss 0.6159 (0.6301) grad_norm 0.2079 (0.1464) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:38:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [311/800][200/402] eta 0:02:34 lr 0.000025 time 0.7459 (0.7664) loss 0.6162 (0.6282) grad_norm 0.1393 (0.1467) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:40:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [311/800][300/402] eta 0:01:17 lr 0.000025 time 0.7468 (0.7596) loss 0.6486 (0.6281) grad_norm 0.1580 (0.1467) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:41:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [311/800][400/402] eta 0:00:01 lr 0.000025 time 0.7456 (0.7563) loss 0.6262 (0.6288) grad_norm 0.1335 (0.1468) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:41:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 311 training takes 0:05:04 [2024-03-09 18:41:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [312/800][0/402] eta 0:22:16 lr 0.000025 time 3.3237 (3.3237) loss 0.6545 (0.6545) grad_norm 0.1482 (0.1482) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:42:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [312/800][100/402] eta 0:03:53 lr 0.000025 time 0.7468 (0.7724) loss 0.6361 (0.6309) grad_norm 0.1205 (0.1406) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 
18:44:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [312/800][200/402] eta 0:02:33 lr 0.000025 time 0.7469 (0.7597) loss 0.6209 (0.6277) grad_norm 0.1409 (0.1431) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:45:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [312/800][300/402] eta 0:01:17 lr 0.000025 time 0.7465 (0.7554) loss 0.6493 (0.6282) grad_norm 0.1270 (0.1431) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:46:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [312/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7532) loss 0.6284 (0.6285) grad_norm 0.1432 (0.1426) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:46:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 312 training takes 0:05:02 [2024-03-09 18:46:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [313/800][0/402] eta 0:21:49 lr 0.000025 time 3.2584 (3.2584) loss 0.6306 (0.6306) grad_norm 0.1673 (0.1673) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:47:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [313/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7706) loss 0.6236 (0.6258) grad_norm 0.1493 (0.1430) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:49:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [313/800][200/402] eta 0:02:33 lr 0.000025 time 0.7452 (0.7582) loss 0.6530 (0.6272) grad_norm 0.1466 (0.1454) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:50:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [313/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7540) loss 0.6330 (0.6265) grad_norm 0.1436 (0.1459) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:51:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [313/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7519) loss 0.6141 (0.6267) grad_norm 0.1641 
(0.1454) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:51:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 313 training takes 0:05:02 [2024-03-09 18:51:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [314/800][0/402] eta 0:21:42 lr 0.000025 time 3.2395 (3.2395) loss 0.6220 (0.6220) grad_norm 0.1373 (0.1373) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:52:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [314/800][100/402] eta 0:03:52 lr 0.000025 time 0.7462 (0.7703) loss 0.6220 (0.6275) grad_norm 0.1685 (0.1444) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:54:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [314/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7581) loss 0.6519 (0.6270) grad_norm 0.1475 (0.1442) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:55:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [314/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7540) loss 0.6641 (0.6284) grad_norm 0.1341 (0.1441) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 18:56:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [314/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7519) loss 0.6487 (0.6282) grad_norm 0.1385 (0.1438) loss_scale 262144.0000 (150356.9077) mem 28968MB [2024-03-09 18:56:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 314 training takes 0:05:02 [2024-03-09 18:56:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [315/800][0/402] eta 0:22:01 lr 0.000025 time 3.2873 (3.2873) loss 0.5826 (0.5826) grad_norm 0.1519 (0.1519) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 18:57:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [315/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7709) loss 0.6415 (0.6306) grad_norm 0.1298 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB 
[2024-03-09 18:59:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [315/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7584) loss 0.6269 (0.6290) grad_norm 0.1788 (0.1465) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:00:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [315/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7541) loss 0.6432 (0.6290) grad_norm 0.1290 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:01:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [315/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7520) loss 0.6145 (0.6289) grad_norm 0.1228 (0.1441) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:01:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 315 training takes 0:05:02 [2024-03-09 19:01:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [316/800][0/402] eta 0:32:01 lr 0.000025 time 4.7810 (4.7810) loss 0.6297 (0.6297) grad_norm 0.1193 (0.1193) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:02:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [316/800][100/402] eta 0:03:57 lr 0.000025 time 0.7462 (0.7867) loss 0.6478 (0.6283) grad_norm 0.1270 (0.1453) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:04:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [316/800][200/402] eta 0:02:34 lr 0.000025 time 0.7455 (0.7665) loss 0.6049 (0.6275) grad_norm 0.1400 (0.1441) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:05:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [316/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7596) loss 0.5829 (0.6282) grad_norm 0.1586 (0.1419) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:06:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [316/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7561) loss 0.6322 (0.6278) 
grad_norm 0.1375 (0.1432) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:06:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 316 training takes 0:05:04 [2024-03-09 19:06:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [317/800][0/402] eta 0:22:40 lr 0.000025 time 3.3840 (3.3840) loss 0.6013 (0.6013) grad_norm 0.1617 (0.1617) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:08:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [317/800][100/402] eta 0:03:53 lr 0.000025 time 0.7454 (0.7718) loss 0.6518 (0.6270) grad_norm 0.1229 (0.1450) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:09:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [317/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7589) loss 0.5995 (0.6276) grad_norm 0.1451 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:10:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [317/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7545) loss 0.6432 (0.6277) grad_norm 0.1237 (0.1445) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:11:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [317/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7523) loss 0.6187 (0.6272) grad_norm 0.1359 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:11:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 317 training takes 0:05:02 [2024-03-09 19:11:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [318/800][0/402] eta 0:21:30 lr 0.000025 time 3.2112 (3.2112) loss 0.6312 (0.6312) grad_norm 0.1921 (0.1921) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:13:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [318/800][100/402] eta 0:03:52 lr 0.000025 time 0.7463 (0.7701) loss 0.6356 (0.6275) grad_norm 0.1695 (0.1451) loss_scale 262144.0000 (262144.0000) mem 
28968MB [2024-03-09 19:14:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [318/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7579) loss 0.6124 (0.6280) grad_norm 0.1349 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:15:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [318/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7539) loss 0.6345 (0.6279) grad_norm 0.1349 (0.1453) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:16:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [318/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7518) loss 0.6347 (0.6282) grad_norm 0.1348 (0.1450) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:16:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 318 training takes 0:05:02 [2024-03-09 19:16:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [319/800][0/402] eta 0:22:26 lr 0.000025 time 3.3483 (3.3483) loss 0.6114 (0.6114) grad_norm 0.1740 (0.1740) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:18:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [319/800][100/402] eta 0:03:52 lr 0.000025 time 0.7461 (0.7715) loss 0.6384 (0.6260) grad_norm 0.1503 (0.1470) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:19:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [319/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7587) loss 0.6210 (0.6273) grad_norm 0.1432 (0.1433) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:20:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [319/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7544) loss 0.6041 (0.6276) grad_norm 0.1443 (0.1437) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:21:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [319/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7522) loss 0.6173 
(0.6279) grad_norm 0.1384 (0.1443) loss_scale 524288.0000 (307251.0723) mem 28968MB [2024-03-09 19:21:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 319 training takes 0:05:02 [2024-03-09 19:21:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [320/800][0/402] eta 0:21:39 lr 0.000025 time 3.2329 (3.2329) loss 0.6202 (0.6202) grad_norm 0.1416 (0.1416) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 19:23:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [320/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7704) loss 0.6123 (0.6261) grad_norm 0.1532 (0.1437) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 19:24:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [320/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7580) loss 0.6443 (0.6265) grad_norm 0.1130 (inf) loss_scale 262144.0000 (439515.0647) mem 28968MB [2024-03-09 19:25:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [320/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7539) loss 0.6280 (0.6274) grad_norm 0.1409 (inf) loss_scale 262144.0000 (380587.8007) mem 28968MB [2024-03-09 19:26:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [320/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7518) loss 0.6357 (0.6281) grad_norm 0.1724 (inf) loss_scale 262144.0000 (351050.6933) mem 28968MB [2024-03-09 19:26:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 320 training takes 0:05:02 [2024-03-09 19:26:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [321/800][0/402] eta 0:32:19 lr 0.000025 time 4.8242 (4.8242) loss 0.6056 (0.6056) grad_norm 0.1517 (0.1517) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:28:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [321/800][100/402] eta 0:03:57 lr 0.000025 time 0.7467 (0.7871) loss 0.5915 (0.6259) grad_norm 0.1435 (0.1449) loss_scale 262144.0000 (262144.0000) mem 
28968MB [2024-03-09 19:29:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [321/800][200/402] eta 0:02:34 lr 0.000025 time 0.7467 (0.7671) loss 0.6341 (0.6269) grad_norm 0.1487 (0.1434) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:30:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [321/800][300/402] eta 0:01:17 lr 0.000025 time 0.7467 (0.7604) loss 0.6488 (0.6271) grad_norm 0.1490 (0.1451) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:31:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [321/800][400/402] eta 0:00:01 lr 0.000025 time 0.7458 (0.7570) loss 0.6342 (0.6275) grad_norm 0.1588 (0.1448) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:31:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 321 training takes 0:05:04 [2024-03-09 19:32:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [322/800][0/402] eta 0:22:07 lr 0.000025 time 3.3015 (3.3015) loss 0.6403 (0.6403) grad_norm 0.1613 (0.1613) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:33:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [322/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7710) loss 0.6170 (0.6285) grad_norm 0.1598 (0.1389) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:34:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [322/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7584) loss 0.6580 (0.6274) grad_norm 0.1297 (0.1426) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:35:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [322/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7542) loss 0.6326 (0.6280) grad_norm 0.1351 (0.1439) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:36:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [322/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7520) loss 0.6128 
(0.6281) grad_norm 0.1314 (0.1431) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:37:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 322 training takes 0:05:02 [2024-03-09 19:37:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [323/800][0/402] eta 0:21:43 lr 0.000025 time 3.2419 (3.2419) loss 0.6189 (0.6189) grad_norm 0.1315 (0.1315) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:38:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [323/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7704) loss 0.6079 (0.6262) grad_norm 0.1343 (0.1453) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:39:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [323/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7581) loss 0.6269 (0.6278) grad_norm 0.1237 (0.1448) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:40:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [323/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7540) loss 0.6246 (0.6279) grad_norm 0.1614 (0.1440) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:42:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [323/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7519) loss 0.6267 (0.6278) grad_norm 0.1490 (0.1446) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:42:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 323 training takes 0:05:02 [2024-03-09 19:42:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [324/800][0/402] eta 0:21:56 lr 0.000025 time 3.2743 (3.2743) loss 0.6189 (0.6189) grad_norm 0.1448 (0.1448) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:43:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [324/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7707) loss 0.6385 (0.6261) grad_norm 0.1355 (0.1474) loss_scale 262144.0000 
(262144.0000) mem 28968MB [2024-03-09 19:44:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [324/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7583) loss 0.6656 (0.6265) grad_norm 0.1387 (0.1467) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:45:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [324/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7541) loss 0.6603 (0.6269) grad_norm 0.1248 (0.1468) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:47:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [324/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7519) loss 0.6236 (0.6273) grad_norm 0.1700 (0.1456) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:47:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 324 training takes 0:05:02 [2024-03-09 19:47:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [325/800][0/402] eta 0:22:48 lr 0.000025 time 3.4045 (3.4045) loss 0.6100 (0.6100) grad_norm 0.1775 (0.1775) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:48:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [325/800][100/402] eta 0:03:53 lr 0.000025 time 0.7454 (0.7720) loss 0.6244 (0.6277) grad_norm 0.1730 (0.1432) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:49:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [325/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7589) loss 0.6099 (0.6281) grad_norm 0.1413 (0.1439) loss_scale 524288.0000 (359958.9254) mem 28968MB [2024-03-09 19:50:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [325/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7545) loss 0.6474 (0.6285) grad_norm 0.1444 (0.1440) loss_scale 524288.0000 (414553.3023) mem 28968MB [2024-03-09 19:52:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [325/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7523) 
loss 0.6154 (0.6279) grad_norm 0.1539 (0.1441) loss_scale 524288.0000 (441918.5636) mem 28968MB [2024-03-09 19:52:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 325 training takes 0:05:02 [2024-03-09 19:52:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [326/800][0/402] eta 0:32:11 lr 0.000025 time 4.8042 (4.8042) loss 0.5882 (0.5882) grad_norm 0.1538 (0.1538) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 19:53:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [326/800][100/402] eta 0:03:57 lr 0.000025 time 0.7459 (0.7861) loss 0.6144 (0.6274) grad_norm 0.1661 (0.1466) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 19:54:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [326/800][200/402] eta 0:02:34 lr 0.000025 time 0.7461 (0.7660) loss 0.5668 (0.6291) grad_norm 0.1611 (0.1452) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 19:55:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [326/800][300/402] eta 0:01:17 lr 0.000025 time 0.7451 (0.7593) loss 0.6133 (0.6280) grad_norm 0.1716 (0.1436) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 19:57:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [326/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7558) loss 0.6534 (0.6275) grad_norm 0.1487 (inf) loss_scale 262144.0000 (495524.0698) mem 28968MB [2024-03-09 19:57:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 326 training takes 0:05:03 [2024-03-09 19:57:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [327/800][0/402] eta 0:22:08 lr 0.000025 time 3.3052 (3.3052) loss 0.6497 (0.6497) grad_norm 0.1099 (0.1099) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 19:58:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [327/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7710) loss 0.6058 (0.6241) grad_norm 0.1533 (0.1441) loss_scale 262144.0000 
(262144.0000) mem 28968MB [2024-03-09 19:59:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [327/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7584) loss 0.6261 (0.6253) grad_norm 0.1581 (0.1469) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:00:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [327/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7542) loss 0.6537 (0.6268) grad_norm 0.1450 (0.1453) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:02:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [327/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7520) loss 0.6194 (0.6275) grad_norm 0.1378 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:02:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 327 training takes 0:05:02 [2024-03-09 20:02:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [328/800][0/402] eta 0:22:56 lr 0.000025 time 3.4245 (3.4245) loss 0.6354 (0.6354) grad_norm 0.1851 (0.1851) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:03:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [328/800][100/402] eta 0:03:53 lr 0.000025 time 0.7462 (0.7723) loss 0.6135 (0.6297) grad_norm 0.1500 (0.1462) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:04:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [328/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7591) loss 0.6035 (0.6270) grad_norm 0.1180 (0.1435) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:06:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [328/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7547) loss 0.6166 (0.6273) grad_norm 0.1200 (0.1432) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:07:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [328/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7524) 
loss 0.5944 (0.6274) grad_norm 0.1655 (0.1426) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:07:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 328 training takes 0:05:02 [2024-03-09 20:07:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [329/800][0/402] eta 0:22:18 lr 0.000025 time 3.3291 (3.3291) loss 0.6348 (0.6348) grad_norm 0.1676 (0.1676) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:08:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [329/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7713) loss 0.6476 (0.6288) grad_norm 0.1398 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:09:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [329/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7586) loss 0.6287 (0.6277) grad_norm 0.1295 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:11:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [329/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7543) loss 0.6367 (0.6279) grad_norm 0.1138 (0.1443) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:12:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [329/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.6244 (0.6272) grad_norm 0.1418 (inf) loss_scale 131072.0000 (257894.7830) mem 28968MB [2024-03-09 20:12:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 329 training takes 0:05:02 [2024-03-09 20:12:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [330/800][0/402] eta 0:21:40 lr 0.000025 time 3.2343 (3.2343) loss 0.6301 (0.6301) grad_norm 0.1306 (0.1306) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:13:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [330/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7704) loss 0.6209 (0.6230) grad_norm 0.1422 (0.1428) loss_scale 131072.0000 
(131072.0000) mem 28968MB [2024-03-09 20:14:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [330/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7581) loss 0.6532 (0.6252) grad_norm 0.1815 (0.1416) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:16:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [330/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7540) loss 0.6499 (0.6260) grad_norm 0.1553 (0.1417) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:17:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [330/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7519) loss 0.6217 (0.6271) grad_norm 0.1206 (0.1411) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:17:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 330 training takes 0:05:02 [2024-03-09 20:17:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [331/800][0/402] eta 0:32:01 lr 0.000025 time 4.7800 (4.7800) loss 0.6339 (0.6339) grad_norm 0.1283 (0.1283) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:18:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [331/800][100/402] eta 0:03:57 lr 0.000025 time 0.7464 (0.7859) loss 0.6257 (0.6297) grad_norm 0.1326 (0.1415) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:19:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [331/800][200/402] eta 0:02:34 lr 0.000025 time 0.7463 (0.7659) loss 0.6458 (0.6295) grad_norm 0.1438 (0.1429) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:21:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [331/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7592) loss 0.6550 (0.6292) grad_norm 0.1252 (0.1433) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:22:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [331/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7558) 
loss 0.6111 (0.6287) grad_norm 0.1390 (0.1438) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:22:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 331 training takes 0:05:03 [2024-03-09 20:22:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [332/800][0/402] eta 0:22:08 lr 0.000025 time 3.3051 (3.3051) loss 0.6382 (0.6382) grad_norm 0.1418 (0.1418) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:23:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [332/800][100/402] eta 0:03:52 lr 0.000025 time 0.7484 (0.7711) loss 0.5980 (0.6261) grad_norm 0.1263 (0.1416) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:24:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [332/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7585) loss 0.6102 (0.6265) grad_norm 0.1339 (0.1427) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:26:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [332/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7542) loss 0.6413 (0.6263) grad_norm 0.1194 (0.1431) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:27:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [332/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7521) loss 0.6041 (0.6275) grad_norm 0.1484 (0.1419) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:27:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 332 training takes 0:05:02 [2024-03-09 20:27:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [333/800][0/402] eta 0:22:08 lr 0.000025 time 3.3040 (3.3040) loss 0.6082 (0.6082) grad_norm 0.1173 (0.1173) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:28:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [333/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7710) loss 0.6374 (0.6270) grad_norm 0.1570 (0.1458) loss_scale 
131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:29:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [333/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7585) loss 0.6299 (0.6270) grad_norm 0.1689 (0.1442) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:31:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [333/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7543) loss 0.6273 (0.6279) grad_norm 0.1248 (0.1437) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:32:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [333/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.6020 (0.6277) grad_norm 0.1612 (0.1436) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:32:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 333 training takes 0:05:02 [2024-03-09 20:32:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [334/800][0/402] eta 0:21:38 lr 0.000025 time 3.2298 (3.2298) loss 0.6570 (0.6570) grad_norm 0.1373 (0.1373) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:33:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [334/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7702) loss 0.6560 (0.6254) grad_norm 0.1209 (0.1417) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:35:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [334/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7580) loss 0.6162 (0.6268) grad_norm 0.1235 (0.1423) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:36:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [334/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7539) loss 0.6292 (0.6277) grad_norm 0.1606 (0.1420) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-09 20:37:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [334/800][400/402] eta 0:00:01 lr 0.000025 time 
0.7446 (0.7518) loss 0.6020 (0.6277) grad_norm 0.1315 (0.1434) loss_scale 262144.0000 (138589.8454) mem 28968MB [2024-03-09 20:37:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 334 training takes 0:05:02 [2024-03-09 20:37:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [335/800][0/402] eta 0:22:31 lr 0.000025 time 3.3618 (3.3618) loss 0.6265 (0.6265) grad_norm 0.1293 (0.1293) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:38:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [335/800][100/402] eta 0:03:53 lr 0.000025 time 0.7451 (0.7718) loss 0.6333 (0.6243) grad_norm 0.1341 (0.1412) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:40:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [335/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7588) loss 0.6336 (0.6261) grad_norm 0.1337 (0.1421) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:41:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [335/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7545) loss 0.6131 (0.6270) grad_norm 0.1590 (0.1424) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:42:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [335/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7522) loss 0.6345 (0.6271) grad_norm 0.1814 (0.1440) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:42:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 335 training takes 0:05:02 [2024-03-09 20:42:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [336/800][0/402] eta 0:32:40 lr 0.000025 time 4.8770 (4.8770) loss 0.6524 (0.6524) grad_norm 0.1100 (0.1100) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:43:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [336/800][100/402] eta 0:03:57 lr 0.000025 time 0.7468 (0.7873) loss 0.6234 (0.6265) grad_norm 0.1262 (0.1455) 
loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:45:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [336/800][200/402] eta 0:02:34 lr 0.000025 time 0.7464 (0.7668) loss 0.5997 (0.6260) grad_norm 0.1565 (0.1439) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:46:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [336/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7599) loss 0.6341 (0.6266) grad_norm 0.1394 (0.1440) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:47:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [336/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7565) loss 0.6211 (0.6268) grad_norm 0.1337 (0.1440) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:47:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 336 training takes 0:05:04 [2024-03-09 20:47:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [337/800][0/402] eta 0:21:51 lr 0.000025 time 3.2632 (3.2632) loss 0.6217 (0.6217) grad_norm 0.1429 (0.1429) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:48:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [337/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7707) loss 0.6165 (0.6254) grad_norm 0.1209 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:50:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [337/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7582) loss 0.6450 (0.6268) grad_norm 0.1391 (0.1435) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:51:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [337/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7541) loss 0.6113 (0.6269) grad_norm 0.1463 (0.1437) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:52:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [337/800][400/402] eta 0:00:01 lr 
0.000025 time 0.7442 (0.7520) loss 0.5932 (0.6267) grad_norm 0.1402 (0.1432) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:52:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 337 training takes 0:05:02 [2024-03-09 20:52:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [338/800][0/402] eta 0:22:04 lr 0.000025 time 3.2941 (3.2941) loss 0.6544 (0.6544) grad_norm 0.1686 (0.1686) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:53:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [338/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7709) loss 0.5901 (0.6285) grad_norm 0.1689 (0.1473) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:55:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [338/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7584) loss 0.5934 (0.6285) grad_norm 0.1704 (0.1466) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:56:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [338/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7542) loss 0.6469 (0.6280) grad_norm 0.1526 (0.1468) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:57:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [338/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7520) loss 0.6451 (0.6276) grad_norm 0.1453 (0.1458) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:57:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 338 training takes 0:05:02 [2024-03-09 20:57:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [339/800][0/402] eta 0:21:45 lr 0.000025 time 3.2477 (3.2477) loss 0.6242 (0.6242) grad_norm 0.1301 (0.1301) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 20:59:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [339/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7705) loss 0.6272 (0.6280) grad_norm 
0.1431 (0.1435) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:00:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [339/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7582) loss 0.6349 (0.6262) grad_norm 0.1387 (0.1431) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:01:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [339/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7541) loss 0.6223 (0.6263) grad_norm 0.1670 (0.1456) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:02:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [339/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7519) loss 0.6227 (0.6267) grad_norm 0.1307 (0.1452) loss_scale 524288.0000 (283716.9476) mem 28968MB [2024-03-09 21:02:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 339 training takes 0:05:02 [2024-03-09 21:02:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [340/800][0/402] eta 0:22:21 lr 0.000025 time 3.3382 (3.3382) loss 0.6326 (0.6326) grad_norm 0.1542 (0.1542) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:04:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [340/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7714) loss 0.6557 (0.6259) grad_norm 0.1121 (0.1413) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:05:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [340/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7586) loss 0.6301 (0.6263) grad_norm 0.1694 (0.1452) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:06:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [340/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7543) loss 0.6327 (0.6256) grad_norm 0.1546 (0.1451) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:07:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [340/800][400/402] eta 
0:00:01 lr 0.000025 time 0.7439 (0.7521) loss 0.6219 (0.6259) grad_norm 0.1472 (0.1443) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:07:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 340 training takes 0:05:02 [2024-03-09 21:07:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [341/800][0/402] eta 0:31:56 lr 0.000025 time 4.7686 (4.7686) loss 0.6512 (0.6512) grad_norm 0.1417 (0.1417) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:09:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [341/800][100/402] eta 0:03:57 lr 0.000025 time 0.7454 (0.7861) loss 0.6034 (0.6267) grad_norm 0.1731 (0.1406) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:10:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [341/800][200/402] eta 0:02:34 lr 0.000025 time 0.7459 (0.7663) loss 0.6242 (0.6283) grad_norm 0.1483 (0.1434) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:11:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [341/800][300/402] eta 0:01:17 lr 0.000025 time 0.7464 (0.7597) loss 0.6240 (0.6271) grad_norm 0.1600 (0.1437) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:12:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [341/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7565) loss 0.6285 (0.6272) grad_norm 0.1309 (0.1433) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:12:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 341 training takes 0:05:04 [2024-03-09 21:12:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [342/800][0/402] eta 0:22:23 lr 0.000025 time 3.3421 (3.3421) loss 0.6205 (0.6205) grad_norm 0.1375 (0.1375) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:14:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [342/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7715) loss 0.6263 (0.6288) 
grad_norm 0.1196 (0.1443) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:15:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [342/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7587) loss 0.6590 (0.6278) grad_norm 0.1586 (0.1431) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:16:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [342/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7544) loss 0.6482 (0.6283) grad_norm 0.1311 (0.1435) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:17:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [342/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7522) loss 0.6175 (0.6273) grad_norm 0.1557 (inf) loss_scale 262144.0000 (477873.4763) mem 28968MB [2024-03-09 21:17:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 342 training takes 0:05:02 [2024-03-09 21:17:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [343/800][0/402] eta 0:21:50 lr 0.000025 time 3.2596 (3.2596) loss 0.6352 (0.6352) grad_norm 0.1865 (0.1865) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:19:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [343/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7705) loss 0.6585 (0.6260) grad_norm 0.1104 (0.1409) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:20:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [343/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7582) loss 0.6500 (0.6261) grad_norm 0.1252 (0.1407) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:21:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [343/800][300/402] eta 0:01:16 lr 0.000025 time 0.7463 (0.7541) loss 0.6493 (0.6264) grad_norm 0.1442 (0.1427) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:22:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [343/800][400/402] 
eta 0:00:01 lr 0.000025 time 0.7445 (0.7520) loss 0.6282 (0.6267) grad_norm 0.1403 (0.1427) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:22:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 343 training takes 0:05:02 [2024-03-09 21:23:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [344/800][0/402] eta 0:22:44 lr 0.000025 time 3.3952 (3.3952) loss 0.6219 (0.6219) grad_norm 0.1499 (0.1499) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:24:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [344/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7719) loss 0.6288 (0.6294) grad_norm 0.1370 (0.1406) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:25:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [344/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7589) loss 0.6112 (0.6281) grad_norm 0.1589 (0.1426) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:26:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [344/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7545) loss 0.6091 (0.6276) grad_norm 0.1543 (0.1425) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:27:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [344/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7523) loss 0.6022 (0.6272) grad_norm 0.1600 (0.1441) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:27:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 344 training takes 0:05:02 [2024-03-09 21:28:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [345/800][0/402] eta 0:22:12 lr 0.000025 time 3.3148 (3.3148) loss 0.6453 (0.6453) grad_norm 0.1264 (0.1264) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:29:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [345/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7711) loss 0.6212 (0.6282) 
grad_norm 0.1868 (0.1435) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:30:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [345/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7585) loss 0.5990 (0.6269) grad_norm 0.1687 (0.1455) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:31:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [345/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7542) loss 0.6330 (0.6268) grad_norm 0.1537 (0.1456) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:33:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [345/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7520) loss 0.5959 (0.6264) grad_norm 0.1813 (0.1448) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:33:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 345 training takes 0:05:02 [2024-03-09 21:33:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [346/800][0/402] eta 0:32:28 lr 0.000025 time 4.8465 (4.8465) loss 0.6542 (0.6542) grad_norm 0.1563 (0.1563) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:34:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [346/800][100/402] eta 0:03:57 lr 0.000025 time 0.7460 (0.7865) loss 0.6235 (0.6287) grad_norm 0.1871 (0.1441) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:35:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [346/800][200/402] eta 0:02:34 lr 0.000025 time 0.7461 (0.7663) loss 0.5920 (0.6276) grad_norm 0.1471 (0.1441) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:36:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [346/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7595) loss 0.6076 (0.6279) grad_norm 0.1506 (0.1442) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:38:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[346/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7560) loss 0.5832 (0.6274) grad_norm 0.1248 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:38:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 346 training takes 0:05:04 [2024-03-09 21:38:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [347/800][0/402] eta 0:21:40 lr 0.000025 time 3.2356 (3.2356) loss 0.6001 (0.6001) grad_norm 0.1748 (0.1748) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:39:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [347/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7703) loss 0.6362 (0.6259) grad_norm 0.1376 (0.1437) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:40:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [347/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7581) loss 0.6107 (0.6271) grad_norm 0.1373 (0.1442) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:41:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [347/800][300/402] eta 0:01:16 lr 0.000025 time 0.7463 (0.7540) loss 0.6166 (0.6272) grad_norm 0.1604 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 21:43:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [347/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7519) loss 0.6731 (0.6273) grad_norm 0.1384 (0.1450) loss_scale 524288.0000 (315095.7805) mem 28968MB [2024-03-09 21:43:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 347 training takes 0:05:02 [2024-03-09 21:43:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [348/800][0/402] eta 0:22:17 lr 0.000025 time 3.3279 (3.3279) loss 0.6243 (0.6243) grad_norm 0.1375 (0.1375) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:44:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [348/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7712) 
loss 0.6487 (0.6293) grad_norm 0.1539 (0.1415) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:45:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [348/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7586) loss 0.6338 (0.6275) grad_norm 0.1422 (0.1423) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:46:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [348/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7543) loss 0.5691 (0.6280) grad_norm 0.1625 (0.1432) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:48:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [348/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7521) loss 0.6233 (0.6274) grad_norm 0.1450 (0.1440) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:48:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 348 training takes 0:05:02 [2024-03-09 21:48:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [349/800][0/402] eta 0:21:48 lr 0.000025 time 3.2551 (3.2551) loss 0.6373 (0.6373) grad_norm 0.1152 (0.1152) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:49:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [349/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7706) loss 0.6132 (0.6306) grad_norm 0.1574 (0.1436) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:50:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [349/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7583) loss 0.6190 (0.6280) grad_norm 0.1285 (0.1427) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:51:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [349/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7542) loss 0.6192 (0.6278) grad_norm 0.1222 (0.1434) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:53:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [349/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7520) loss 0.6338 (0.6276) grad_norm 0.1341 (0.1425) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:53:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 349 training takes 0:05:02 [2024-03-09 21:53:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [350/800][0/402] eta 0:22:38 lr 0.000025 time 3.3793 (3.3793) loss 0.6544 (0.6544) grad_norm 0.1350 (0.1350) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:54:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [350/800][100/402] eta 0:03:53 lr 0.000025 time 0.7452 (0.7718) loss 0.6219 (0.6238) grad_norm 0.1268 (0.1435) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:55:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [350/800][200/402] eta 0:02:33 lr 0.000025 time 0.7467 (0.7589) loss 0.6556 (0.6255) grad_norm 0.1271 (0.1414) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:57:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [350/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7545) loss 0.6362 (0.6261) grad_norm 0.1492 (0.1407) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:58:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [350/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7523) loss 0.6643 (0.6263) grad_norm 0.1316 (0.1410) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:58:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 350 training takes 0:05:02 [2024-03-09 21:58:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [351/800][0/402] eta 0:32:15 lr 0.000025 time 4.8155 (4.8155) loss 0.6254 (0.6254) grad_norm 0.1324 (0.1324) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 21:59:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [351/800][100/402] eta 0:03:57 lr 0.000025 time 0.7454 
(0.7859) loss 0.6279 (0.6262) grad_norm 0.1239 (0.1465) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 22:00:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [351/800][200/402] eta 0:02:34 lr 0.000025 time 0.7459 (0.7659) loss 0.6269 (0.6262) grad_norm 0.1645 (0.1472) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 22:02:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [351/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7592) loss 0.6469 (0.6266) grad_norm 0.1043 (0.1465) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 22:03:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [351/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7558) loss 0.6416 (0.6270) grad_norm 0.1449 (0.1451) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 22:03:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 351 training takes 0:05:03 [2024-03-09 22:03:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [352/800][0/402] eta 0:22:56 lr 0.000025 time 3.4248 (3.4248) loss 0.6622 (0.6622) grad_norm 0.1406 (0.1406) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 22:04:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [352/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7721) loss 0.6357 (0.6289) grad_norm 0.1264 (inf) loss_scale 262144.0000 (386727.2871) mem 28968MB [2024-03-09 22:05:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [352/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7589) loss 0.6146 (0.6260) grad_norm 0.1323 (inf) loss_scale 262144.0000 (324745.5522) mem 28968MB [2024-03-09 22:07:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [352/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7546) loss 0.6442 (0.6267) grad_norm 0.1552 (inf) loss_scale 262144.0000 (303947.6944) mem 28968MB [2024-03-09 22:08:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [352/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7523) loss 0.6179 (0.6269) grad_norm 0.1512 (inf) loss_scale 262144.0000 (293522.8329) mem 28968MB [2024-03-09 22:08:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 352 training takes 0:05:02 [2024-03-09 22:08:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [353/800][0/402] eta 0:21:56 lr 0.000025 time 3.2751 (3.2751) loss 0.6149 (0.6149) grad_norm 0.1449 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:09:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [353/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7707) loss 0.6365 (0.6293) grad_norm 0.1422 (0.1455) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:10:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [353/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7582) loss 0.6266 (0.6273) grad_norm 0.1249 (0.1464) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:12:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [353/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7541) loss 0.5997 (0.6271) grad_norm 0.1608 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:13:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [353/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7519) loss 0.6384 (0.6268) grad_norm 0.1405 (0.1450) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:13:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 353 training takes 0:05:02 [2024-03-09 22:13:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [354/800][0/402] eta 0:22:35 lr 0.000025 time 3.3721 (3.3721) loss 0.6308 (0.6308) grad_norm 0.1495 (0.1495) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:14:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [354/800][100/402] eta 0:03:53 lr 0.000025 time 0.7455 
(0.7718) loss 0.6231 (0.6242) grad_norm 0.1490 (0.1462) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:15:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [354/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7588) loss 0.5951 (0.6259) grad_norm 0.1815 (0.1463) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:17:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [354/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7544) loss 0.6313 (0.6254) grad_norm 0.1381 (0.1464) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:18:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [354/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7522) loss 0.6292 (0.6258) grad_norm 0.1306 (0.1450) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:18:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 354 training takes 0:05:02 [2024-03-09 22:18:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [355/800][0/402] eta 0:21:48 lr 0.000025 time 3.2556 (3.2556) loss 0.6417 (0.6417) grad_norm 0.1182 (0.1182) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:19:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [355/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7706) loss 0.6304 (0.6265) grad_norm 0.1359 (0.1424) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:20:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [355/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7582) loss 0.6511 (0.6259) grad_norm 0.1269 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:22:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [355/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7540) loss 0.6074 (0.6269) grad_norm 0.1486 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:23:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [355/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7519) loss 0.6305 (0.6276) grad_norm 0.1189 (0.1436) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:23:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 355 training takes 0:05:02 [2024-03-09 22:23:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [356/800][0/402] eta 0:31:26 lr 0.000025 time 4.6934 (4.6934) loss 0.6243 (0.6243) grad_norm 0.1547 (0.1547) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:24:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [356/800][100/402] eta 0:03:57 lr 0.000025 time 0.7464 (0.7853) loss 0.6319 (0.6275) grad_norm 0.2102 (0.1450) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:26:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [356/800][200/402] eta 0:02:34 lr 0.000025 time 0.7457 (0.7659) loss 0.6179 (0.6270) grad_norm 0.1808 (0.1440) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:27:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [356/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7592) loss 0.6303 (0.6270) grad_norm 0.1821 (0.1464) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:28:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [356/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7560) loss 0.6464 (0.6268) grad_norm 0.1480 (0.1454) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:28:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 356 training takes 0:05:03 [2024-03-09 22:28:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [357/800][0/402] eta 0:22:32 lr 0.000025 time 3.3644 (3.3644) loss 0.6019 (0.6019) grad_norm 0.1673 (0.1673) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:29:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [357/800][100/402] eta 0:03:53 lr 0.000025 time 
0.7458 (0.7718) loss 0.6074 (0.6266) grad_norm 0.1427 (0.1469) loss_scale 524288.0000 (425659.5644) mem 28968MB [2024-03-09 22:31:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [357/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7588) loss 0.6278 (0.6253) grad_norm 0.1424 (0.1440) loss_scale 524288.0000 (474728.4378) mem 28968MB [2024-03-09 22:32:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [357/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7545) loss 0.6420 (0.6259) grad_norm 0.1561 (0.1431) loss_scale 524288.0000 (491193.4086) mem 28968MB [2024-03-09 22:33:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [357/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7523) loss 0.6183 (0.6266) grad_norm 0.1227 (0.1428) loss_scale 524288.0000 (499446.4239) mem 28968MB [2024-03-09 22:33:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 357 training takes 0:05:02 [2024-03-09 22:33:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [358/800][0/402] eta 0:22:39 lr 0.000025 time 3.3824 (3.3824) loss 0.6359 (0.6359) grad_norm 0.1316 (0.1316) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 22:34:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [358/800][100/402] eta 0:03:53 lr 0.000025 time 0.7457 (0.7717) loss 0.5987 (0.6260) grad_norm 0.1620 (0.1469) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 22:36:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [358/800][200/402] eta 0:02:33 lr 0.000025 time 0.7452 (0.7588) loss 0.6282 (0.6259) grad_norm 0.1304 (0.1450) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 22:37:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [358/800][300/402] eta 0:01:16 lr 0.000025 time 0.7336 (0.7544) loss 0.6206 (0.6265) grad_norm inf (inf) loss_scale 262144.0000 (523417.0897) mem 28968MB [2024-03-09 22:38:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [358/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7522) loss 0.6103 (0.6259) grad_norm 0.1622 (inf) loss_scale 262144.0000 (458261.7057) mem 28968MB [2024-03-09 22:38:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 358 training takes 0:05:02 [2024-03-09 22:38:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [359/800][0/402] eta 0:22:39 lr 0.000025 time 3.3831 (3.3831) loss 0.6363 (0.6363) grad_norm 0.1244 (0.1244) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:39:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [359/800][100/402] eta 0:03:53 lr 0.000025 time 0.7458 (0.7718) loss 0.6333 (0.6255) grad_norm 0.1340 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:41:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [359/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7588) loss 0.6233 (0.6265) grad_norm 0.1429 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:42:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [359/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7545) loss 0.6308 (0.6256) grad_norm 0.1422 (0.1446) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:43:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [359/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7522) loss 0.6184 (0.6259) grad_norm 0.1497 (0.1445) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:43:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 359 training takes 0:05:02 [2024-03-09 22:43:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [360/800][0/402] eta 0:21:41 lr 0.000025 time 3.2380 (3.2380) loss 0.6171 (0.6171) grad_norm 0.1448 (0.1448) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:44:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [360/800][100/402] eta 0:03:52 lr 0.000025 time 
0.7457 (0.7704) loss 0.6112 (0.6260) grad_norm 0.1375 (0.1419) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:46:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [360/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7581) loss 0.6306 (0.6256) grad_norm 0.1534 (0.1422) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:47:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [360/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7540) loss 0.6062 (0.6244) grad_norm 0.1608 (0.1425) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:48:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [360/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7519) loss 0.6546 (0.6253) grad_norm 0.1391 (0.1437) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:48:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 360 training takes 0:05:02 [2024-03-09 22:48:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [361/800][0/402] eta 0:31:58 lr 0.000025 time 4.7729 (4.7729) loss 0.6345 (0.6345) grad_norm 0.1140 (0.1140) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:50:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [361/800][100/402] eta 0:03:57 lr 0.000025 time 0.7460 (0.7856) loss 0.5955 (0.6263) grad_norm 0.1510 (0.1428) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:51:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [361/800][200/402] eta 0:02:34 lr 0.000025 time 0.7456 (0.7659) loss 0.5958 (0.6255) grad_norm 0.1604 (0.1426) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:52:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [361/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7592) loss 0.6321 (0.6262) grad_norm 0.1672 (0.1426) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:53:46 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [361/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7558) loss 0.6346 (0.6264) grad_norm 0.1225 (0.1430) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:53:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 361 training takes 0:05:03 [2024-03-09 22:53:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [362/800][0/402] eta 0:21:42 lr 0.000025 time 3.2409 (3.2409) loss 0.6194 (0.6194) grad_norm 0.1568 (0.1568) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:55:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [362/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7704) loss 0.6336 (0.6271) grad_norm 0.1555 (0.1477) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:56:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [362/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7581) loss 0.6532 (0.6274) grad_norm 0.1255 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:57:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [362/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7540) loss 0.6494 (0.6274) grad_norm 0.1710 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:58:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [362/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7519) loss 0.6273 (0.6271) grad_norm 0.1376 (0.1457) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 22:58:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 362 training takes 0:05:02 [2024-03-09 22:58:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [363/800][0/402] eta 0:21:33 lr 0.000025 time 3.2180 (3.2180) loss 0.6356 (0.6356) grad_norm 0.1664 (0.1664) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:00:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [363/800][100/402] eta 
0:03:52 lr 0.000025 time 0.7461 (0.7703) loss 0.6513 (0.6268) grad_norm 0.1196 (0.1478) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:01:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [363/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7581) loss 0.6333 (0.6272) grad_norm 0.1484 (0.1457) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:02:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [363/800][300/402] eta 0:01:16 lr 0.000025 time 0.7464 (0.7540) loss 0.6516 (0.6278) grad_norm 0.1597 (0.1453) loss_scale 524288.0000 (271724.0133) mem 28968MB [2024-03-09 23:03:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [363/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7519) loss 0.5794 (0.6267) grad_norm 0.1321 (inf) loss_scale 262144.0000 (292215.3815) mem 28968MB [2024-03-09 23:03:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 363 training takes 0:05:02 [2024-03-09 23:03:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [364/800][0/402] eta 0:21:41 lr 0.000025 time 3.2373 (3.2373) loss 0.6041 (0.6041) grad_norm 0.1149 (0.1149) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:05:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [364/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7704) loss 0.6353 (0.6279) grad_norm 0.1547 (0.1431) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:06:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [364/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7582) loss 0.6263 (0.6274) grad_norm 0.1367 (0.1423) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:07:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [364/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7541) loss 0.6231 (0.6257) grad_norm 0.1795 (0.1438) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:08:53 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [364/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7519) loss 0.6224 (0.6266) grad_norm 0.1300 (0.1424) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:08:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 364 training takes 0:05:02 [2024-03-09 23:08:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [365/800][0/402] eta 0:22:40 lr 0.000025 time 3.3834 (3.3834) loss 0.6055 (0.6055) grad_norm 0.1515 (0.1515) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:10:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [365/800][100/402] eta 0:03:53 lr 0.000025 time 0.7454 (0.7718) loss 0.6444 (0.6279) grad_norm 0.1301 (0.1434) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:11:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [365/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7588) loss 0.5948 (0.6278) grad_norm 0.1257 (0.1411) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:12:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [365/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7545) loss 0.6101 (0.6270) grad_norm 0.1426 (0.1429) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:13:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [365/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7523) loss 0.6317 (0.6271) grad_norm 0.1342 (0.1435) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:13:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 365 training takes 0:05:02 [2024-03-09 23:14:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [366/800][0/402] eta 0:30:54 lr 0.000025 time 4.6142 (4.6142) loss 0.5944 (0.5944) grad_norm 0.1466 (0.1466) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:15:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[366/800][100/402] eta 0:03:56 lr 0.000025 time 0.7460 (0.7841) loss 0.5992 (0.6267) grad_norm 0.1370 (0.1414) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:16:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [366/800][200/402] eta 0:02:34 lr 0.000025 time 0.7459 (0.7650) loss 0.6302 (0.6266) grad_norm 0.1295 (0.1432) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:17:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [366/800][300/402] eta 0:01:17 lr 0.000025 time 0.7454 (0.7586) loss 0.6260 (0.6257) grad_norm 0.1242 (0.1431) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:18:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [366/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7553) loss 0.6621 (0.6253) grad_norm 0.1229 (0.1434) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:19:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 366 training takes 0:05:03 [2024-03-09 23:19:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [367/800][0/402] eta 0:22:37 lr 0.000025 time 3.3765 (3.3765) loss 0.6288 (0.6288) grad_norm 0.1581 (0.1581) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:20:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [367/800][100/402] eta 0:03:53 lr 0.000025 time 0.7460 (0.7717) loss 0.6596 (0.6290) grad_norm 0.1315 (0.1527) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:21:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [367/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7587) loss 0.6392 (0.6274) grad_norm 0.1205 (0.1475) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:22:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [367/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7544) loss 0.6112 (0.6259) grad_norm 0.1233 (0.1470) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 
23:24:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [367/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7522) loss 0.6261 (0.6265) grad_norm 0.1549 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:24:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 367 training takes 0:05:02 [2024-03-09 23:24:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [368/800][0/402] eta 0:22:09 lr 0.000025 time 3.3069 (3.3069) loss 0.5877 (0.5877) grad_norm 0.1345 (0.1345) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:25:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [368/800][100/402] eta 0:03:52 lr 0.000025 time 0.7461 (0.7711) loss 0.6371 (0.6275) grad_norm 0.1603 (0.1446) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:26:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [368/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7585) loss 0.6183 (0.6280) grad_norm 0.1262 (0.1421) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:27:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [368/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7543) loss 0.6121 (0.6273) grad_norm 0.1596 (0.1443) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-09 23:29:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [368/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7521) loss 0.6470 (0.6270) grad_norm 0.1428 (0.1443) loss_scale 524288.0000 (311173.4264) mem 28968MB [2024-03-09 23:29:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 368 training takes 0:05:02 [2024-03-09 23:29:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [369/800][0/402] eta 0:21:37 lr 0.000025 time 3.2264 (3.2264) loss 0.6300 (0.6300) grad_norm 0.1244 (0.1244) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:30:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [369/800][100/402] eta 0:03:52 lr 0.000025 time 0.7462 (0.7703) loss 0.6431 (0.6253) grad_norm 0.1412 (0.1455) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:31:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [369/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7581) loss 0.6177 (0.6272) grad_norm 0.1586 (0.1438) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:32:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [369/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7540) loss 0.6180 (0.6278) grad_norm 0.1392 (0.1429) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:34:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [369/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7519) loss 0.6531 (0.6269) grad_norm 0.1529 (0.1432) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:34:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 369 training takes 0:05:02 [2024-03-09 23:34:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [370/800][0/402] eta 0:21:50 lr 0.000025 time 3.2603 (3.2603) loss 0.6020 (0.6020) grad_norm 0.1596 (0.1596) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:35:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [370/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7707) loss 0.6397 (0.6252) grad_norm 0.1282 (0.1447) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:36:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [370/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7583) loss 0.6021 (0.6268) grad_norm 0.1448 (0.1443) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:37:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [370/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7541) loss 0.6225 (0.6269) grad_norm 0.1339 (0.1434) loss_scale 524288.0000 (524288.0000) mem 28968MB 
[2024-03-09 23:39:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [370/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7520) loss 0.6475 (0.6269) grad_norm 0.1446 (0.1432) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:39:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 370 training takes 0:05:02 [2024-03-09 23:39:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [371/800][0/402] eta 0:31:27 lr 0.000025 time 4.6948 (4.6948) loss 0.6353 (0.6353) grad_norm 0.1265 (0.1265) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:40:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [371/800][100/402] eta 0:03:57 lr 0.000025 time 0.7459 (0.7855) loss 0.6304 (0.6282) grad_norm 0.1258 (0.1452) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:41:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [371/800][200/402] eta 0:02:34 lr 0.000025 time 0.7453 (0.7659) loss 0.6357 (0.6263) grad_norm 0.1461 (0.1428) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:42:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [371/800][300/402] eta 0:01:17 lr 0.000025 time 0.7482 (0.7594) loss 0.6289 (0.6261) grad_norm 0.1363 (0.1427) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:44:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [371/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7560) loss 0.6151 (0.6266) grad_norm 0.1507 (0.1434) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:44:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 371 training takes 0:05:03 [2024-03-09 23:44:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [372/800][0/402] eta 0:22:27 lr 0.000025 time 3.3522 (3.3522) loss 0.6123 (0.6123) grad_norm 0.1239 (0.1239) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:45:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [372/800][100/402] eta 0:03:53 lr 0.000025 time 0.7457 (0.7715) loss 0.5986 (0.6249) grad_norm 0.1621 (0.1435) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:46:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [372/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7588) loss 0.6226 (0.6258) grad_norm 0.1529 (0.1435) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:48:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [372/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7544) loss 0.6380 (0.6258) grad_norm 0.1452 (0.1437) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:49:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [372/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7522) loss 0.6272 (0.6259) grad_norm 0.1498 (0.1439) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:49:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 372 training takes 0:05:02 [2024-03-09 23:49:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [373/800][0/402] eta 0:22:30 lr 0.000025 time 3.3596 (3.3596) loss 0.6233 (0.6233) grad_norm 0.1487 (0.1487) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:50:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [373/800][100/402] eta 0:03:53 lr 0.000025 time 0.7451 (0.7716) loss 0.6010 (0.6292) grad_norm 0.1703 (0.1433) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:51:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [373/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7588) loss 0.6156 (0.6262) grad_norm 0.1738 (0.1490) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:53:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [373/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7544) loss 0.6145 (0.6260) grad_norm 0.1326 (0.1479) loss_scale 524288.0000 (524288.0000) mem 
28968MB [2024-03-09 23:54:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [373/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7522) loss 0.6202 (0.6262) grad_norm 0.1529 (inf) loss_scale 524288.0000 (550437.0274) mem 28968MB [2024-03-09 23:54:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 373 training takes 0:05:02 [2024-03-09 23:54:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [374/800][0/402] eta 0:22:11 lr 0.000025 time 3.3129 (3.3129) loss 0.6098 (0.6098) grad_norm 0.1278 (0.1278) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:55:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [374/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7711) loss 0.6338 (0.6285) grad_norm 0.1495 (0.1427) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:56:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [374/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7585) loss 0.6393 (0.6264) grad_norm 0.1399 (0.1428) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:58:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [374/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7544) loss 0.6393 (0.6268) grad_norm 0.1486 (0.1434) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:59:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [374/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7522) loss 0.6302 (0.6266) grad_norm 0.1475 (0.1432) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-09 23:59:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 374 training takes 0:05:02 [2024-03-09 23:59:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [375/800][0/402] eta 0:22:51 lr 0.000025 time 3.4126 (3.4126) loss 0.6326 (0.6326) grad_norm 0.1231 (0.1231) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:00:39 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [375/800][100/402] eta 0:03:53 lr 0.000025 time 0.7464 (0.7722) loss 0.6348 (0.6254) grad_norm 0.1677 (0.1419) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:01:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [375/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7590) loss 0.6368 (0.6252) grad_norm 0.1305 (0.1426) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:03:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [375/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7546) loss 0.6218 (0.6262) grad_norm 0.1360 (0.1424) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:04:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [375/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7524) loss 0.6419 (0.6259) grad_norm 0.1402 (0.1421) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:04:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 375 training takes 0:05:02 [2024-03-10 00:04:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [376/800][0/402] eta 0:32:13 lr 0.000025 time 4.8085 (4.8085) loss 0.6146 (0.6146) grad_norm 0.1278 (0.1278) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:05:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [376/800][100/402] eta 0:03:57 lr 0.000025 time 0.7461 (0.7860) loss 0.6330 (0.6245) grad_norm 0.1368 (0.1470) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:06:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [376/800][200/402] eta 0:02:34 lr 0.000025 time 0.7463 (0.7660) loss 0.6379 (0.6258) grad_norm 0.1523 (0.1455) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:08:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [376/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7593) loss 0.5997 (0.6258) grad_norm 0.1682 (0.1454) loss_scale 524288.0000 
(524288.0000) mem 28968MB [2024-03-10 00:09:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [376/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7559) loss 0.6301 (0.6262) grad_norm 0.1355 (0.1460) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:09:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 376 training takes 0:05:03 [2024-03-10 00:09:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [377/800][0/402] eta 0:22:14 lr 0.000025 time 3.3185 (3.3185) loss 0.6076 (0.6076) grad_norm 0.1393 (0.1393) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:10:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [377/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7714) loss 0.6098 (0.6238) grad_norm 0.1301 (0.1431) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:12:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [377/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7586) loss 0.6264 (0.6247) grad_norm 0.1340 (inf) loss_scale 262144.0000 (456469.6517) mem 28968MB [2024-03-10 00:13:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [377/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7544) loss 0.6378 (0.6243) grad_norm 0.1464 (inf) loss_scale 262144.0000 (391909.6346) mem 28968MB [2024-03-10 00:14:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [377/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7522) loss 0.6341 (0.6247) grad_norm 0.1490 (inf) loss_scale 262144.0000 (359549.1272) mem 28968MB [2024-03-10 00:14:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 377 training takes 0:05:02 [2024-03-10 00:14:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [378/800][0/402] eta 0:22:22 lr 0.000025 time 3.3407 (3.3407) loss 0.6317 (0.6317) grad_norm 0.1459 (0.1459) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:15:48 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [378/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7715) loss 0.6203 (0.6277) grad_norm 0.1587 (0.1476) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:17:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [378/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7587) loss 0.6269 (0.6272) grad_norm 0.1639 (0.1457) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:18:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [378/800][300/402] eta 0:01:16 lr 0.000025 time 0.7452 (0.7544) loss 0.6525 (0.6274) grad_norm 0.1373 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:19:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [378/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7522) loss 0.6327 (0.6267) grad_norm 0.1257 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:19:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 378 training takes 0:05:02 [2024-03-10 00:19:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [379/800][0/402] eta 0:22:24 lr 0.000025 time 3.3456 (3.3456) loss 0.6055 (0.6055) grad_norm 0.1334 (0.1334) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:20:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [379/800][100/402] eta 0:03:53 lr 0.000025 time 0.7453 (0.7715) loss 0.5909 (0.6250) grad_norm 0.1415 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:22:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [379/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7588) loss 0.6143 (0.6256) grad_norm 0.1376 (0.1448) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:23:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [379/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7544) loss 0.6266 (0.6266) grad_norm 0.1506 (0.1440) loss_scale 262144.0000 
(262144.0000) mem 28968MB [2024-03-10 00:24:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [379/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7522) loss 0.6144 (0.6265) grad_norm 0.1534 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:24:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 379 training takes 0:05:02 [2024-03-10 00:24:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [380/800][0/402] eta 0:22:14 lr 0.000025 time 3.3194 (3.3194) loss 0.6142 (0.6142) grad_norm 0.1551 (0.1551) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:25:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [380/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7712) loss 0.6681 (0.6263) grad_norm 0.1448 (0.1422) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:27:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [380/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7585) loss 0.6085 (0.6249) grad_norm 0.1356 (0.1420) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:28:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [380/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7543) loss 0.6401 (0.6254) grad_norm 0.1106 (0.1433) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:29:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [380/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7521) loss 0.6347 (0.6250) grad_norm 0.1250 (0.1442) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:29:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 380 training takes 0:05:02 [2024-03-10 00:29:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [381/800][0/402] eta 0:31:13 lr 0.000025 time 4.6615 (4.6615) loss 0.6124 (0.6124) grad_norm 0.1457 (0.1457) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:30:56 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [381/800][100/402] eta 0:03:56 lr 0.000025 time 0.7456 (0.7846) loss 0.5865 (0.6248) grad_norm 0.1542 (0.1412) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:32:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [381/800][200/402] eta 0:02:34 lr 0.000025 time 0.7457 (0.7652) loss 0.6535 (0.6254) grad_norm 0.1485 (0.1419) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:33:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [381/800][300/402] eta 0:01:17 lr 0.000025 time 0.7455 (0.7588) loss 0.6050 (0.6254) grad_norm 0.1402 (0.1419) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:34:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [381/800][400/402] eta 0:00:01 lr 0.000025 time 0.7437 (0.7555) loss 0.6242 (0.6259) grad_norm 0.1515 (0.1420) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:34:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 381 training takes 0:05:03 [2024-03-10 00:34:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [382/800][0/402] eta 0:21:39 lr 0.000025 time 3.2338 (3.2338) loss 0.6495 (0.6495) grad_norm 0.1361 (0.1361) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:35:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [382/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7703) loss 0.6353 (0.6272) grad_norm 0.1365 (0.1469) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 00:37:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [382/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7580) loss 0.6411 (0.6261) grad_norm 0.1226 (0.1448) loss_scale 524288.0000 (343004.3383) mem 28968MB [2024-03-10 00:38:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [382/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7539) loss 0.6076 (0.6267) grad_norm 0.1488 (0.1451) 
loss_scale 524288.0000 (403231.4684) mem 28968MB [2024-03-10 00:39:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [382/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7518) loss 0.6057 (0.6265) grad_norm 0.1572 (0.1454) loss_scale 524288.0000 (433420.1297) mem 28968MB [2024-03-10 00:39:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 382 training takes 0:05:02 [2024-03-10 00:39:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [383/800][0/402] eta 0:21:45 lr 0.000025 time 3.2466 (3.2466) loss 0.6125 (0.6125) grad_norm 0.1335 (0.1335) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:41:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [383/800][100/402] eta 0:03:52 lr 0.000025 time 0.7452 (0.7704) loss 0.6425 (0.6270) grad_norm 0.1253 (0.1446) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:42:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [383/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7582) loss 0.6346 (0.6263) grad_norm 0.1775 (0.1436) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:43:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [383/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7540) loss 0.6413 (0.6269) grad_norm 0.1347 (0.1442) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:44:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [383/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7519) loss 0.6493 (0.6267) grad_norm 0.1483 (0.1441) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:44:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 383 training takes 0:05:02 [2024-03-10 00:44:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [384/800][0/402] eta 0:21:44 lr 0.000025 time 3.2439 (3.2439) loss 0.5980 (0.5980) grad_norm 0.1309 (0.1309) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 
00:46:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [384/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7704) loss 0.6130 (0.6237) grad_norm 0.1437 (0.1443) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:47:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [384/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7582) loss 0.6068 (0.6243) grad_norm 0.1300 (0.1440) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:48:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [384/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7541) loss 0.6562 (0.6252) grad_norm 0.1613 (0.1435) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:49:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [384/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7519) loss 0.6412 (0.6252) grad_norm 0.1356 (0.1440) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:49:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 384 training takes 0:05:02 [2024-03-10 00:49:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [385/800][0/402] eta 0:21:54 lr 0.000025 time 3.2692 (3.2692) loss 0.6210 (0.6210) grad_norm 0.1318 (0.1318) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:51:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [385/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7706) loss 0.6080 (0.6219) grad_norm 0.1532 (0.1454) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:52:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [385/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7583) loss 0.6161 (0.6241) grad_norm 0.1333 (0.1459) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:53:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [385/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7541) loss 0.6453 (0.6248) grad_norm 0.1411 
(0.1448) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:54:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [385/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7520) loss 0.6557 (0.6254) grad_norm 0.1322 (0.1456) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:54:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 385 training takes 0:05:02 [2024-03-10 00:54:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [386/800][0/402] eta 0:31:45 lr 0.000025 time 4.7409 (4.7409) loss 0.6430 (0.6430) grad_norm 0.1443 (0.1443) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:56:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [386/800][100/402] eta 0:03:57 lr 0.000025 time 0.7455 (0.7853) loss 0.6354 (0.6273) grad_norm 0.1290 (0.1400) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:57:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [386/800][200/402] eta 0:02:34 lr 0.000025 time 0.7456 (0.7656) loss 0.6104 (0.6264) grad_norm 0.1428 (0.1412) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:58:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [386/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7590) loss 0.6471 (0.6257) grad_norm 0.1502 (0.1427) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:59:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [386/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7557) loss 0.6151 (0.6262) grad_norm 0.1403 (0.1421) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 00:59:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 386 training takes 0:05:03 [2024-03-10 00:59:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [387/800][0/402] eta 0:21:57 lr 0.000025 time 3.2773 (3.2773) loss 0.6068 (0.6068) grad_norm 0.1360 (0.1360) loss_scale 524288.0000 (524288.0000) mem 28968MB 
[2024-03-10 01:01:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [387/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7707) loss 0.6026 (0.6275) grad_norm 0.1373 (0.1408) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 01:02:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [387/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7582) loss 0.6109 (0.6263) grad_norm 0.1335 (inf) loss_scale 524288.0000 (568630.7662) mem 28968MB [2024-03-10 01:03:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [387/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7541) loss 0.6465 (0.6267) grad_norm 0.1407 (inf) loss_scale 524288.0000 (553898.9502) mem 28968MB [2024-03-10 01:04:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [387/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7520) loss 0.6256 (0.6257) grad_norm 0.1463 (inf) loss_scale 524288.0000 (546514.6733) mem 28968MB [2024-03-10 01:04:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 387 training takes 0:05:02 [2024-03-10 01:05:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [388/800][0/402] eta 0:22:24 lr 0.000025 time 3.3453 (3.3453) loss 0.6486 (0.6486) grad_norm 0.1622 (0.1622) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 01:06:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [388/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7715) loss 0.6402 (0.6266) grad_norm 0.1212 (0.1442) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 01:07:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [388/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7587) loss 0.6453 (0.6257) grad_norm 0.1673 (0.1439) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 01:08:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [388/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7544) loss 0.6411 (0.6252) grad_norm 
0.1748 (0.1441) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 01:09:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [388/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7522) loss 0.6610 (0.6250) grad_norm 0.1348 (0.1439) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 01:09:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 388 training takes 0:05:02 [2024-03-10 01:10:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [389/800][0/402] eta 0:22:28 lr 0.000025 time 3.3549 (3.3549) loss 0.6423 (0.6423) grad_norm 0.1463 (0.1463) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 01:11:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [389/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7716) loss 0.6357 (0.6244) grad_norm 0.1265 (0.1463) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 01:12:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [389/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7587) loss 0.6352 (0.6236) grad_norm 0.1140 (0.1439) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 01:13:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [389/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7544) loss 0.6416 (0.6239) grad_norm 0.1124 (0.1453) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 01:15:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [389/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7522) loss 0.6437 (0.6243) grad_norm 0.1643 (0.1454) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 01:15:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 389 training takes 0:05:02 [2024-03-10 01:15:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [390/800][0/402] eta 0:21:53 lr 0.000025 time 3.2679 (3.2679) loss 0.6012 (0.6012) grad_norm 0.1365 (0.1365) loss_scale 524288.0000 (524288.0000) mem 28968MB 
[2024-03-10 01:16:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [390/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7709) loss 0.6340 (0.6277) grad_norm 0.1382 (0.1468) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 01:17:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [390/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7584) loss 0.6420 (0.6272) grad_norm 0.1222 (0.1453) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 01:18:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [390/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7541) loss 0.6218 (0.6264) grad_norm 0.1465 (inf) loss_scale 262144.0000 (519062.5382) mem 28968MB [2024-03-10 01:20:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [390/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7520) loss 0.6344 (0.6258) grad_norm 0.1403 (inf) loss_scale 262144.0000 (454993.0773) mem 28968MB [2024-03-10 01:20:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 390 training takes 0:05:02 [2024-03-10 01:20:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [391/800][0/402] eta 0:31:22 lr 0.000025 time 4.6818 (4.6818) loss 0.6414 (0.6414) grad_norm 0.1172 (0.1172) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:21:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [391/800][100/402] eta 0:03:56 lr 0.000025 time 0.7460 (0.7847) loss 0.6175 (0.6242) grad_norm 0.1408 (0.1460) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:22:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [391/800][200/402] eta 0:02:34 lr 0.000025 time 0.7458 (0.7653) loss 0.6163 (0.6252) grad_norm 0.1534 (0.1469) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:23:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [391/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7588) loss 0.6178 (0.6251) grad_norm 
0.1659 (0.1469) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:25:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [391/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7555) loss 0.6123 (0.6256) grad_norm 0.1616 (0.1467) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:25:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 391 training takes 0:05:03 [2024-03-10 01:25:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [392/800][0/402] eta 0:22:47 lr 0.000025 time 3.4009 (3.4009) loss 0.5970 (0.5970) grad_norm 0.1391 (0.1391) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:26:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [392/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7720) loss 0.6079 (0.6231) grad_norm 0.1500 (0.1462) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:27:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [392/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7589) loss 0.6410 (0.6264) grad_norm 0.1494 (0.1465) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:28:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [392/800][300/402] eta 0:01:16 lr 0.000025 time 0.7463 (0.7546) loss 0.6430 (0.6264) grad_norm 0.1543 (0.1465) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:30:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [392/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7523) loss 0.5892 (0.6266) grad_norm 0.1481 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:30:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 392 training takes 0:05:02 [2024-03-10 01:30:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [393/800][0/402] eta 0:21:47 lr 0.000025 time 3.2520 (3.2520) loss 0.6269 (0.6269) grad_norm 0.1813 (0.1813) loss_scale 262144.0000 (262144.0000) mem 28968MB 
[2024-03-10 01:31:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [393/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7709) loss 0.6167 (0.6242) grad_norm 0.1561 (0.1459) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:32:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [393/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7584) loss 0.6223 (0.6230) grad_norm 0.1631 (0.1458) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:33:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [393/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7542) loss 0.6483 (0.6250) grad_norm 0.1397 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:35:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [393/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7521) loss 0.6278 (0.6251) grad_norm 0.1394 (0.1455) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:35:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 393 training takes 0:05:02 [2024-03-10 01:35:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [394/800][0/402] eta 0:21:53 lr 0.000025 time 3.2667 (3.2667) loss 0.6188 (0.6188) grad_norm 0.1552 (0.1552) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:36:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [394/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7707) loss 0.6227 (0.6241) grad_norm 0.1647 (0.1465) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:37:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [394/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7583) loss 0.6496 (0.6255) grad_norm 0.1218 (0.1464) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:39:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [394/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7542) loss 0.6345 (0.6266) 
grad_norm 0.1406 (0.1451) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:40:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [394/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7520) loss 0.6275 (0.6262) grad_norm 0.1554 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:40:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 394 training takes 0:05:02 [2024-03-10 01:40:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [395/800][0/402] eta 0:22:44 lr 0.000025 time 3.3939 (3.3939) loss 0.6287 (0.6287) grad_norm 0.1126 (0.1126) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:41:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [395/800][100/402] eta 0:03:53 lr 0.000025 time 0.7475 (0.7719) loss 0.6419 (0.6280) grad_norm 0.1598 (0.1479) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:42:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [395/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7589) loss 0.6577 (0.6269) grad_norm 0.1478 (0.1469) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:44:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [395/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7546) loss 0.5974 (0.6266) grad_norm 0.1379 (0.1459) loss_scale 524288.0000 (276078.5648) mem 28968MB [2024-03-10 01:45:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [395/800][400/402] eta 0:00:01 lr 0.000025 time 0.7437 (0.7523) loss 0.6050 (0.6270) grad_norm 0.1562 (0.1469) loss_scale 524288.0000 (337976.1796) mem 28968MB [2024-03-10 01:45:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 395 training takes 0:05:02 [2024-03-10 01:45:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [396/800][0/402] eta 0:31:41 lr 0.000025 time 4.7294 (4.7294) loss 0.6328 (0.6328) grad_norm 0.1281 (0.1281) loss_scale 524288.0000 (524288.0000) mem 
28968MB [2024-03-10 01:46:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [396/800][100/402] eta 0:03:57 lr 0.000025 time 0.7471 (0.7858) loss 0.6384 (0.6257) grad_norm 0.1268 (inf) loss_scale 262144.0000 (415277.6238) mem 28968MB [2024-03-10 01:47:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [396/800][200/402] eta 0:02:34 lr 0.000025 time 0.7468 (0.7664) loss 0.6346 (0.6254) grad_norm 0.1643 (inf) loss_scale 262144.0000 (339091.7413) mem 28968MB [2024-03-10 01:49:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [396/800][300/402] eta 0:01:17 lr 0.000025 time 0.7468 (0.7600) loss 0.6235 (0.6260) grad_norm 0.1158 (inf) loss_scale 262144.0000 (313527.7076) mem 28968MB [2024-03-10 01:50:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [396/800][400/402] eta 0:00:01 lr 0.000025 time 0.7452 (0.7567) loss 0.6212 (0.6254) grad_norm 0.1220 (inf) loss_scale 262144.0000 (300713.8155) mem 28968MB [2024-03-10 01:50:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 396 training takes 0:05:04 [2024-03-10 01:50:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [397/800][0/402] eta 0:22:21 lr 0.000025 time 3.3380 (3.3380) loss 0.6110 (0.6110) grad_norm 0.1465 (0.1465) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:51:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [397/800][100/402] eta 0:03:53 lr 0.000025 time 0.7470 (0.7725) loss 0.6479 (0.6272) grad_norm 0.1460 (0.1505) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:52:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [397/800][200/402] eta 0:02:33 lr 0.000025 time 0.7470 (0.7597) loss 0.6222 (0.6265) grad_norm 0.1293 (0.1481) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:54:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [397/800][300/402] eta 0:01:17 lr 0.000025 time 0.7468 (0.7555) loss 0.6266 (0.6267) grad_norm 
0.1333 (0.1465) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:55:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [397/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7533) loss 0.6055 (0.6263) grad_norm 0.1514 (0.1465) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:55:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 397 training takes 0:05:02 [2024-03-10 01:55:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [398/800][0/402] eta 0:22:21 lr 0.000025 time 3.3360 (3.3360) loss 0.6478 (0.6478) grad_norm 0.1770 (0.1770) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:56:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [398/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7714) loss 0.6331 (0.6251) grad_norm 0.1407 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:57:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [398/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7586) loss 0.6296 (0.6261) grad_norm 0.1342 (0.1437) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 01:59:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [398/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7544) loss 0.6266 (0.6252) grad_norm 0.1272 (0.1437) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:00:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [398/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7522) loss 0.6302 (0.6252) grad_norm 0.1560 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:00:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 398 training takes 0:05:02 [2024-03-10 02:00:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [399/800][0/402] eta 0:22:30 lr 0.000025 time 3.3595 (3.3595) loss 0.6307 (0.6307) grad_norm 0.1380 (0.1380) loss_scale 262144.0000 (262144.0000) mem 28968MB 
[2024-03-10 02:01:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [399/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7716) loss 0.6188 (0.6246) grad_norm 0.1605 (0.1424) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:03:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [399/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7587) loss 0.6089 (0.6236) grad_norm 0.1545 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:04:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [399/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7544) loss 0.6454 (0.6251) grad_norm 0.1603 (0.1448) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:05:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [399/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7522) loss 0.6315 (0.6250) grad_norm 0.1493 (0.1448) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:05:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 399 training takes 0:05:02 [2024-03-10 02:05:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [400/800][0/402] eta 0:22:10 lr 0.000025 time 3.3109 (3.3109) loss 0.6443 (0.6443) grad_norm 0.1419 (0.1419) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:06:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [400/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7713) loss 0.6161 (0.6292) grad_norm 0.1316 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:08:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [400/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7586) loss 0.6212 (0.6268) grad_norm 0.1260 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:09:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [400/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7543) loss 0.6318 (0.6262) 
grad_norm 0.1510 (inf) loss_scale 131072.0000 (241242.1528) mem 28968MB [2024-03-10 02:10:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [400/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7521) loss 0.6358 (0.6249) grad_norm 0.1491 (inf) loss_scale 131072.0000 (213768.2993) mem 28968MB [2024-03-10 02:10:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 400 training takes 0:05:02 [2024-03-10 02:10:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [401/800][0/402] eta 0:31:32 lr 0.000025 time 4.7082 (4.7082) loss 0.6452 (0.6452) grad_norm 0.1428 (0.1428) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:11:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [401/800][100/402] eta 0:03:57 lr 0.000025 time 0.7462 (0.7851) loss 0.6442 (0.6240) grad_norm 0.1187 (0.1446) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:13:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [401/800][200/402] eta 0:02:34 lr 0.000025 time 0.7461 (0.7655) loss 0.6301 (0.6250) grad_norm 0.1450 (0.1431) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:14:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [401/800][300/402] eta 0:01:17 lr 0.000025 time 0.7454 (0.7589) loss 0.6266 (0.6258) grad_norm 0.1657 (0.1426) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:15:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [401/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7556) loss 0.6175 (0.6259) grad_norm 0.1486 (0.1433) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:15:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 401 training takes 0:05:03 [2024-03-10 02:15:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [402/800][0/402] eta 0:21:58 lr 0.000025 time 3.2809 (3.2809) loss 0.6257 (0.6257) grad_norm 0.1381 (0.1381) loss_scale 131072.0000 (131072.0000) mem 
28968MB [2024-03-10 02:16:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [402/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7708) loss 0.6206 (0.6249) grad_norm 0.1455 (0.1479) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:18:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [402/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7584) loss 0.6312 (0.6253) grad_norm 0.1531 (0.1470) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:19:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [402/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7541) loss 0.6290 (0.6257) grad_norm 0.1651 (0.1489) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:20:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [402/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7520) loss 0.6393 (0.6256) grad_norm 0.1500 (0.1475) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:20:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 402 training takes 0:05:02 [2024-03-10 02:20:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [403/800][0/402] eta 0:22:06 lr 0.000025 time 3.2998 (3.2998) loss 0.6379 (0.6379) grad_norm 0.1506 (0.1506) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:21:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [403/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7709) loss 0.6360 (0.6245) grad_norm 0.1578 (0.1439) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:23:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [403/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7584) loss 0.6221 (0.6241) grad_norm 0.1431 (0.1443) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:24:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [403/800][300/402] eta 0:01:16 lr 0.000025 time 0.7451 (0.7542) loss 0.6242 
(0.6248) grad_norm 0.1543 (0.1449) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:25:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [403/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7520) loss 0.6261 (0.6251) grad_norm 0.1158 (0.1441) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:25:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 403 training takes 0:05:02 [2024-03-10 02:25:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [404/800][0/402] eta 0:22:02 lr 0.000025 time 3.2892 (3.2892) loss 0.6122 (0.6122) grad_norm 0.1169 (0.1169) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:26:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [404/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7708) loss 0.6510 (0.6262) grad_norm 0.1386 (0.1442) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:28:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [404/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7583) loss 0.6107 (0.6257) grad_norm 0.1427 (0.1450) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:29:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [404/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7541) loss 0.6092 (0.6256) grad_norm 0.1462 (0.1442) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:30:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [404/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7520) loss 0.6355 (0.6253) grad_norm 0.1500 (0.1445) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:30:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 404 training takes 0:05:02 [2024-03-10 02:30:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [405/800][0/402] eta 0:22:16 lr 0.000025 time 3.3240 (3.3240) loss 0.6015 (0.6015) grad_norm 0.1361 (0.1361) loss_scale 131072.0000 
(131072.0000) mem 28968MB [2024-03-10 02:32:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [405/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7712) loss 0.6169 (0.6231) grad_norm 0.1448 (0.1474) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:33:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [405/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7586) loss 0.6266 (0.6243) grad_norm 0.1673 (0.1481) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 02:34:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [405/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7543) loss 0.6361 (0.6245) grad_norm 0.1622 (0.1472) loss_scale 262144.0000 (156328.3987) mem 28968MB [2024-03-10 02:35:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [405/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7521) loss 0.6067 (0.6245) grad_norm 0.1178 (0.1469) loss_scale 262144.0000 (182716.3292) mem 28968MB [2024-03-10 02:35:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 405 training takes 0:05:02 [2024-03-10 02:35:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [406/800][0/402] eta 0:31:39 lr 0.000025 time 4.7252 (4.7252) loss 0.6382 (0.6382) grad_norm 0.1657 (0.1657) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:37:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [406/800][100/402] eta 0:03:57 lr 0.000025 time 0.7455 (0.7852) loss 0.6260 (0.6260) grad_norm 0.1346 (0.1484) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:38:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [406/800][200/402] eta 0:02:34 lr 0.000025 time 0.7468 (0.7658) loss 0.6423 (0.6253) grad_norm 0.1350 (0.1483) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:39:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [406/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7594) 
loss 0.6339 (0.6266) grad_norm 0.1394 (0.1466) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:40:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [406/800][400/402] eta 0:00:01 lr 0.000025 time 0.7452 (0.7560) loss 0.6404 (0.6264) grad_norm 0.1400 (0.1459) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:40:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 406 training takes 0:05:03 [2024-03-10 02:40:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [407/800][0/402] eta 0:21:59 lr 0.000025 time 3.2821 (3.2821) loss 0.6338 (0.6338) grad_norm 0.1360 (0.1360) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:42:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [407/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7709) loss 0.5924 (0.6234) grad_norm 0.1394 (0.1472) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:43:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [407/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7584) loss 0.6509 (0.6246) grad_norm 0.1367 (0.1463) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:44:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [407/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7542) loss 0.5987 (0.6246) grad_norm 0.1579 (0.1454) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:45:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [407/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7521) loss 0.6169 (0.6255) grad_norm 0.1544 (0.1454) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:45:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 407 training takes 0:05:02 [2024-03-10 02:45:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [408/800][0/402] eta 0:21:41 lr 0.000025 time 3.2381 (3.2381) loss 0.6206 (0.6206) grad_norm 0.1580 (0.1580) loss_scale 
262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:47:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [408/800][100/402] eta 0:03:52 lr 0.000025 time 0.7468 (0.7714) loss 0.6287 (0.6250) grad_norm 0.1898 (0.1436) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:48:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [408/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7592) loss 0.6307 (0.6256) grad_norm 0.1567 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:49:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [408/800][300/402] eta 0:01:17 lr 0.000025 time 0.7467 (0.7551) loss 0.6394 (0.6254) grad_norm 0.1407 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:50:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [408/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7530) loss 0.6119 (0.6258) grad_norm 0.1304 (0.1446) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:50:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 408 training takes 0:05:02 [2024-03-10 02:50:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [409/800][0/402] eta 0:21:33 lr 0.000025 time 3.2169 (3.2169) loss 0.6029 (0.6029) grad_norm 0.1376 (0.1376) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:52:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [409/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7702) loss 0.6189 (0.6253) grad_norm 0.1721 (0.1443) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:53:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [409/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7581) loss 0.6429 (0.6256) grad_norm 0.1425 (0.1453) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:54:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [409/800][300/402] eta 0:01:16 lr 0.000025 time 
0.7463 (0.7540) loss 0.6434 (0.6262) grad_norm 0.1238 (0.1432) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:55:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [409/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7518) loss 0.5939 (0.6257) grad_norm 0.1416 (0.1446) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:55:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 409 training takes 0:05:02 [2024-03-10 02:56:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [410/800][0/402] eta 0:22:18 lr 0.000025 time 3.3295 (3.3295) loss 0.6069 (0.6069) grad_norm 0.1428 (0.1428) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:57:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [410/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7713) loss 0.6021 (0.6254) grad_norm 0.1434 (0.1450) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:58:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [410/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7586) loss 0.6410 (0.6261) grad_norm 0.1380 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 02:59:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [410/800][300/402] eta 0:01:16 lr 0.000025 time 0.7464 (0.7543) loss 0.6416 (0.6254) grad_norm 0.1257 (0.1459) loss_scale 524288.0000 (321365.9003) mem 28968MB [2024-03-10 03:00:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [410/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7521) loss 0.5812 (0.6252) grad_norm 0.1554 (0.1460) loss_scale 524288.0000 (371969.9152) mem 28968MB [2024-03-10 03:00:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 410 training takes 0:05:02 [2024-03-10 03:01:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [411/800][0/402] eta 0:31:46 lr 0.000025 time 4.7418 (4.7418) loss 0.6299 (0.6299) grad_norm 0.1429 (0.1429) 
loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:02:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [411/800][100/402] eta 0:03:57 lr 0.000025 time 0.7458 (0.7853) loss 0.5981 (0.6225) grad_norm 0.1468 (0.1449) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:03:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [411/800][200/402] eta 0:02:34 lr 0.000025 time 0.7457 (0.7657) loss 0.6386 (0.6225) grad_norm 0.1392 (0.1453) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:04:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [411/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7591) loss 0.5950 (0.6242) grad_norm 0.1530 (0.1454) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:06:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [411/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7557) loss 0.6173 (0.6247) grad_norm 0.1750 (0.1446) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:06:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 411 training takes 0:05:03 [2024-03-10 03:06:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [412/800][0/402] eta 0:21:34 lr 0.000025 time 3.2199 (3.2199) loss 0.6360 (0.6360) grad_norm 0.1421 (0.1421) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:07:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [412/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7703) loss 0.6346 (0.6245) grad_norm 0.1228 (0.1496) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:08:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [412/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7581) loss 0.5968 (0.6251) grad_norm 0.1463 (0.1463) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:09:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [412/800][300/402] eta 0:01:16 lr 
0.000025 time 0.7460 (0.7540) loss 0.6005 (0.6247) grad_norm 0.1250 (0.1459) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:11:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [412/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7519) loss 0.5971 (0.6249) grad_norm 0.1937 (0.1450) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:11:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 412 training takes 0:05:02 [2024-03-10 03:11:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [413/800][0/402] eta 0:22:04 lr 0.000025 time 3.2939 (3.2939) loss 0.6313 (0.6313) grad_norm 0.1491 (0.1491) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:12:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [413/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7709) loss 0.6392 (0.6237) grad_norm 0.1272 (0.1425) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:13:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [413/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7584) loss 0.6445 (0.6233) grad_norm 0.1308 (0.1443) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:14:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [413/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7542) loss 0.6467 (0.6245) grad_norm 0.1967 (0.1449) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:16:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [413/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.6371 (0.6244) grad_norm 0.1402 (0.1446) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:16:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 413 training takes 0:05:02 [2024-03-10 03:16:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [414/800][0/402] eta 0:22:02 lr 0.000025 time 3.2907 (3.2907) loss 0.6563 (0.6563) grad_norm 
0.1155 (0.1155) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:17:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [414/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7710) loss 0.6285 (0.6264) grad_norm 0.1308 (0.1448) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:18:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [414/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7584) loss 0.6352 (0.6249) grad_norm 0.1162 (inf) loss_scale 262144.0000 (500812.4179) mem 28968MB [2024-03-10 03:19:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [414/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7542) loss 0.6073 (0.6245) grad_norm 0.1510 (inf) loss_scale 262144.0000 (421520.5847) mem 28968MB [2024-03-10 03:21:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [414/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7520) loss 0.6068 (0.6252) grad_norm 0.1792 (inf) loss_scale 262144.0000 (381775.8005) mem 28968MB [2024-03-10 03:21:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 414 training takes 0:05:02 [2024-03-10 03:21:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [415/800][0/402] eta 0:22:15 lr 0.000025 time 3.3212 (3.3212) loss 0.6275 (0.6275) grad_norm 0.1715 (0.1715) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:22:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [415/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7712) loss 0.6457 (0.6229) grad_norm 0.1332 (0.1451) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:23:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [415/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7586) loss 0.6431 (0.6232) grad_norm 0.1249 (0.1480) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:24:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [415/800][300/402] eta 0:01:16 lr 
0.000025 time 0.7459 (0.7543) loss 0.6180 (0.6240) grad_norm 0.1431 (0.1476) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:26:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [415/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7521) loss 0.6438 (0.6248) grad_norm 0.1516 (0.1472) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:26:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 415 training takes 0:05:02 [2024-03-10 03:26:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [416/800][0/402] eta 0:31:19 lr 0.000025 time 4.6759 (4.6759) loss 0.6307 (0.6307) grad_norm 0.1432 (0.1432) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:27:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [416/800][100/402] eta 0:03:56 lr 0.000025 time 0.7458 (0.7846) loss 0.6186 (0.6261) grad_norm 0.1664 (0.1441) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:28:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [416/800][200/402] eta 0:02:34 lr 0.000025 time 0.7454 (0.7653) loss 0.6215 (0.6241) grad_norm 0.1554 (0.1456) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:30:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [416/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7588) loss 0.6397 (0.6246) grad_norm 0.1194 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:31:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [416/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7555) loss 0.6074 (0.6245) grad_norm 0.1796 (0.1458) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:31:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 416 training takes 0:05:03 [2024-03-10 03:31:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [417/800][0/402] eta 0:22:00 lr 0.000025 time 3.2842 (3.2842) loss 0.5809 (0.5809) grad_norm 
0.1582 (0.1582) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:32:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [417/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7708) loss 0.6279 (0.6251) grad_norm 0.1577 (0.1469) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:33:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [417/800][200/402] eta 0:02:33 lr 0.000025 time 0.7466 (0.7583) loss 0.6360 (0.6254) grad_norm 0.1534 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:35:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [417/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7541) loss 0.6291 (0.6265) grad_norm 0.1672 (0.1466) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:36:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [417/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7520) loss 0.5712 (0.6255) grad_norm 0.1606 (0.1462) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:36:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 417 training takes 0:05:02 [2024-03-10 03:36:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [418/800][0/402] eta 0:22:01 lr 0.000025 time 3.2874 (3.2874) loss 0.6248 (0.6248) grad_norm 0.1208 (0.1208) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:37:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [418/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7708) loss 0.6358 (0.6258) grad_norm 0.1327 (0.1419) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:38:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [418/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7584) loss 0.6167 (0.6263) grad_norm 0.1510 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:40:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [418/800][300/402] eta 
0:01:16 lr 0.000025 time 0.7457 (0.7542) loss 0.6193 (0.6253) grad_norm 0.1600 (0.1445) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:41:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [418/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7521) loss 0.6266 (0.6253) grad_norm 0.1340 (0.1451) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:41:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 418 training takes 0:05:02 [2024-03-10 03:41:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [419/800][0/402] eta 0:21:58 lr 0.000025 time 3.2797 (3.2797) loss 0.6336 (0.6336) grad_norm 0.1243 (0.1243) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:42:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [419/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7707) loss 0.6301 (0.6226) grad_norm 0.1541 (0.1494) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:43:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [419/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7583) loss 0.6132 (0.6232) grad_norm 0.1549 (0.1473) loss_scale 524288.0000 (298661.5721) mem 28968MB [2024-03-10 03:45:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [419/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7541) loss 0.6102 (0.6235) grad_norm 0.1642 (0.1460) loss_scale 524288.0000 (373620.5183) mem 28968MB [2024-03-10 03:46:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [419/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7520) loss 0.6385 (0.6233) grad_norm 0.1370 (0.1452) loss_scale 524288.0000 (411193.4564) mem 28968MB [2024-03-10 03:46:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 419 training takes 0:05:02 [2024-03-10 03:46:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [420/800][0/402] eta 0:22:13 lr 0.000025 time 3.3181 (3.3181) loss 0.6255 (0.6255) 
grad_norm 0.1290 (0.1290) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:47:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [420/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7712) loss 0.6115 (0.6253) grad_norm 0.1440 (0.1469) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:48:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [420/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7586) loss 0.6321 (0.6260) grad_norm 0.1270 (0.1434) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:50:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [420/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7544) loss 0.6330 (0.6257) grad_norm 0.1267 (0.1440) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:51:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [420/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7522) loss 0.6185 (0.6252) grad_norm 0.1717 (0.1440) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:51:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 420 training takes 0:05:02 [2024-03-10 03:51:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [421/800][0/402] eta 0:33:30 lr 0.000025 time 5.0014 (5.0014) loss 0.6214 (0.6214) grad_norm 0.1627 (0.1627) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 03:52:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [421/800][100/402] eta 0:03:57 lr 0.000025 time 0.7454 (0.7880) loss 0.6471 (0.6240) grad_norm 0.1673 (inf) loss_scale 262144.0000 (415277.6238) mem 28968MB [2024-03-10 03:54:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [421/800][200/402] eta 0:02:34 lr 0.000025 time 0.7461 (0.7670) loss 0.6161 (0.6244) grad_norm 0.1388 (inf) loss_scale 262144.0000 (339091.7413) mem 28968MB [2024-03-10 03:55:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [421/800][300/402] 
eta 0:01:17 lr 0.000025 time 0.7461 (0.7599) loss 0.6074 (0.6245) grad_norm 0.1432 (inf) loss_scale 262144.0000 (313527.7076) mem 28968MB [2024-03-10 03:56:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [421/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7563) loss 0.6387 (0.6255) grad_norm 0.1526 (inf) loss_scale 262144.0000 (300713.8155) mem 28968MB [2024-03-10 03:56:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 421 training takes 0:05:04 [2024-03-10 03:56:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [422/800][0/402] eta 0:21:45 lr 0.000025 time 3.2463 (3.2463) loss 0.6191 (0.6191) grad_norm 0.1187 (0.1187) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:57:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [422/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7705) loss 0.5947 (0.6264) grad_norm 0.1333 (0.1407) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 03:59:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [422/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7581) loss 0.6335 (0.6257) grad_norm 0.1626 (0.1426) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:00:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [422/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7540) loss 0.6270 (0.6257) grad_norm 0.1485 (0.1434) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:01:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [422/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7519) loss 0.6338 (0.6247) grad_norm 0.1528 (0.1435) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:01:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 422 training takes 0:05:02 [2024-03-10 04:01:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [423/800][0/402] eta 0:21:44 lr 0.000025 time 3.2443 (3.2443) loss 0.6255 (0.6255) 
grad_norm 0.1514 (0.1514) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:02:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [423/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7704) loss 0.6180 (0.6236) grad_norm 0.1559 (0.1491) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:04:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [423/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7582) loss 0.6068 (0.6232) grad_norm 0.1414 (0.1465) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:05:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [423/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7541) loss 0.6253 (0.6236) grad_norm 0.1651 (0.1463) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:06:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [423/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7520) loss 0.6174 (0.6237) grad_norm 0.1559 (0.1457) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:06:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 423 training takes 0:05:02 [2024-03-10 04:06:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [424/800][0/402] eta 0:21:38 lr 0.000025 time 3.2301 (3.2301) loss 0.6455 (0.6455) grad_norm 0.1399 (0.1399) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:07:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [424/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7703) loss 0.6140 (0.6260) grad_norm 0.1658 (0.1435) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:09:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [424/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7580) loss 0.6414 (0.6260) grad_norm 0.1508 (0.1445) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:10:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[424/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7540) loss 0.6294 (0.6261) grad_norm 0.1624 (0.1445) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:11:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [424/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7518) loss 0.6078 (0.6251) grad_norm 0.1391 (0.1445) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:11:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 424 training takes 0:05:02 [2024-03-10 04:11:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [425/800][0/402] eta 0:22:49 lr 0.000025 time 3.4077 (3.4077) loss 0.5970 (0.5970) grad_norm 0.1469 (0.1469) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:12:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [425/800][100/402] eta 0:03:53 lr 0.000025 time 0.7458 (0.7720) loss 0.6226 (0.6238) grad_norm 0.1470 (0.1432) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:14:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [425/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7590) loss 0.6698 (0.6245) grad_norm 0.1351 (0.1438) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:15:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [425/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7546) loss 0.6236 (0.6233) grad_norm 0.1412 (0.1458) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:16:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [425/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7523) loss 0.6416 (0.6236) grad_norm 0.1220 (0.1456) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:16:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 425 training takes 0:05:02 [2024-03-10 04:16:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [426/800][0/402] eta 0:32:14 lr 0.000025 time 4.8115 (4.8115) 
loss 0.6315 (0.6315) grad_norm 0.1405 (0.1405) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:17:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [426/800][100/402] eta 0:03:57 lr 0.000025 time 0.7455 (0.7858) loss 0.6019 (0.6288) grad_norm 0.1533 (0.1475) loss_scale 524288.0000 (397109.2277) mem 28968MB [2024-03-10 04:19:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [426/800][200/402] eta 0:02:34 lr 0.000025 time 0.7458 (0.7659) loss 0.6324 (0.6269) grad_norm 0.1669 (0.1470) loss_scale 524288.0000 (460382.2488) mem 28968MB [2024-03-10 04:20:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [426/800][300/402] eta 0:01:17 lr 0.000025 time 0.7460 (0.7592) loss 0.6193 (0.6261) grad_norm 0.1299 (0.1463) loss_scale 524288.0000 (481613.3953) mem 28968MB [2024-03-10 04:21:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [426/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7559) loss 0.6455 (0.6263) grad_norm 0.1179 (0.1465) loss_scale 524288.0000 (492255.4414) mem 28968MB [2024-03-10 04:21:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 426 training takes 0:05:03 [2024-03-10 04:21:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [427/800][0/402] eta 0:22:03 lr 0.000025 time 3.2931 (3.2931) loss 0.6403 (0.6403) grad_norm 0.1665 (0.1665) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 04:23:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [427/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7710) loss 0.6144 (0.6283) grad_norm 0.1594 (0.1425) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 04:24:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [427/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7585) loss 0.6389 (0.6258) grad_norm 0.1342 (0.1430) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 04:25:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [427/800][300/402] eta 0:01:16 lr 0.000025 time 0.7463 (0.7543) loss 0.6154 (0.6255) grad_norm 0.1502 (0.1442) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 04:26:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [427/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7521) loss 0.6569 (0.6255) grad_norm 0.1677 (0.1452) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 04:26:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 427 training takes 0:05:02 [2024-03-10 04:26:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [428/800][0/402] eta 0:22:09 lr 0.000025 time 3.3061 (3.3061) loss 0.6306 (0.6306) grad_norm 0.1615 (0.1615) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 04:28:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [428/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7711) loss 0.6550 (0.6250) grad_norm 0.1391 (0.1460) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 04:29:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [428/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7585) loss 0.6359 (0.6249) grad_norm 0.1421 (0.1466) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 04:30:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [428/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7542) loss 0.6087 (0.6246) grad_norm 0.1521 (0.1459) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 04:31:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [428/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7521) loss 0.6326 (0.6239) grad_norm 0.1421 (0.1459) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 04:31:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 428 training takes 0:05:02 [2024-03-10 04:31:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [429/800][0/402] eta 0:21:41 lr 0.000025 time 3.2385 
(3.2385) loss 0.6253 (0.6253) grad_norm 0.1146 (0.1146) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 04:33:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [429/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7704) loss 0.6144 (0.6249) grad_norm 0.1382 (0.1464) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 04:34:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [429/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7581) loss 0.6552 (0.6251) grad_norm 0.1297 (0.1465) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 04:35:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [429/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7541) loss 0.6531 (0.6251) grad_norm 0.1647 (0.1470) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 04:36:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [429/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7520) loss 0.6414 (0.6248) grad_norm 0.1130 (inf) loss_scale 262144.0000 (505983.6808) mem 28968MB [2024-03-10 04:36:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 429 training takes 0:05:02 [2024-03-10 04:36:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [430/800][0/402] eta 0:22:01 lr 0.000025 time 3.2885 (3.2885) loss 0.6328 (0.6328) grad_norm 0.1261 (0.1261) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:38:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [430/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7708) loss 0.6240 (0.6233) grad_norm 0.1651 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:39:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [430/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7583) loss 0.6338 (0.6226) grad_norm 0.1658 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:40:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): 
INFO Train: [430/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7541) loss 0.6186 (0.6233) grad_norm 0.1446 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:41:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [430/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7520) loss 0.6605 (0.6237) grad_norm 0.1192 (0.1445) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:41:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 430 training takes 0:05:02 [2024-03-10 04:41:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [431/800][0/402] eta 0:32:01 lr 0.000025 time 4.7794 (4.7794) loss 0.5943 (0.5943) grad_norm 0.1681 (0.1681) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:43:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [431/800][100/402] eta 0:03:57 lr 0.000025 time 0.7451 (0.7856) loss 0.6326 (0.6247) grad_norm 0.1641 (0.1451) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:44:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [431/800][200/402] eta 0:02:34 lr 0.000025 time 0.7462 (0.7658) loss 0.6337 (0.6255) grad_norm 0.1460 (0.1436) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:45:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [431/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7591) loss 0.6334 (0.6247) grad_norm 0.1546 (0.1442) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:46:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [431/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7557) loss 0.6526 (0.6251) grad_norm 0.1514 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:46:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 431 training takes 0:05:03 [2024-03-10 04:47:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [432/800][0/402] eta 0:22:28 lr 0.000025 time 3.3556 
(3.3556) loss 0.6276 (0.6276) grad_norm 0.1653 (0.1653) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:48:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [432/800][100/402] eta 0:03:53 lr 0.000025 time 0.7472 (0.7715) loss 0.6359 (0.6246) grad_norm 0.1625 (0.1487) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:49:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [432/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7587) loss 0.6494 (0.6239) grad_norm 0.1296 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:50:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [432/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7544) loss 0.6525 (0.6245) grad_norm 0.1699 (0.1450) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:51:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [432/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7522) loss 0.6220 (0.6247) grad_norm 0.1421 (0.1453) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:52:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 432 training takes 0:05:02 [2024-03-10 04:52:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [433/800][0/402] eta 0:21:48 lr 0.000025 time 3.2542 (3.2542) loss 0.6168 (0.6168) grad_norm 0.1350 (0.1350) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:53:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [433/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7705) loss 0.6572 (0.6256) grad_norm 0.1451 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:54:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [433/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7582) loss 0.6286 (0.6232) grad_norm 0.1641 (0.1442) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:55:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [433/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7540) loss 0.6304 (0.6228) grad_norm 0.1450 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:57:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [433/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7519) loss 0.6243 (0.6238) grad_norm 0.1484 (0.1440) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:57:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 433 training takes 0:05:02 [2024-03-10 04:57:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [434/800][0/402] eta 0:22:17 lr 0.000025 time 3.3283 (3.3283) loss 0.6245 (0.6245) grad_norm 0.1411 (0.1411) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:58:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [434/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7713) loss 0.6389 (0.6250) grad_norm 0.1494 (0.1467) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 04:59:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [434/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7586) loss 0.6050 (0.6267) grad_norm 0.1398 (0.1438) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:00:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [434/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7543) loss 0.6339 (0.6262) grad_norm 0.1535 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:02:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [434/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7521) loss 0.6373 (0.6260) grad_norm 0.1304 (0.1450) loss_scale 524288.0000 (286985.5761) mem 28968MB [2024-03-10 05:02:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 434 training takes 0:05:02 [2024-03-10 05:02:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [435/800][0/402] eta 0:22:39 lr 0.000025 time 
3.3823 (3.3823) loss 0.6330 (0.6330) grad_norm 0.1241 (0.1241) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:03:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [435/800][100/402] eta 0:03:53 lr 0.000025 time 0.7458 (0.7718) loss 0.6114 (0.6279) grad_norm 0.1381 (0.1426) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:04:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [435/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7589) loss 0.6223 (0.6259) grad_norm 0.1254 (0.1438) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:05:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [435/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7545) loss 0.6030 (0.6255) grad_norm 0.1461 (0.1446) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:07:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [435/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7523) loss 0.6065 (0.6250) grad_norm 0.1481 (0.1440) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:07:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 435 training takes 0:05:02 [2024-03-10 05:07:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [436/800][0/402] eta 0:32:32 lr 0.000025 time 4.8572 (4.8572) loss 0.6168 (0.6168) grad_norm 0.1227 (0.1227) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:08:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [436/800][100/402] eta 0:03:57 lr 0.000025 time 0.7459 (0.7867) loss 0.5989 (0.6275) grad_norm 0.1568 (inf) loss_scale 262144.0000 (334817.5842) mem 28968MB [2024-03-10 05:09:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [436/800][200/402] eta 0:02:34 lr 0.000025 time 0.7463 (0.7666) loss 0.5954 (0.6251) grad_norm 0.1312 (inf) loss_scale 262144.0000 (298661.5721) mem 28968MB [2024-03-10 05:10:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [436/800][300/402] eta 0:01:17 lr 0.000025 time 0.7454 (0.7597) loss 0.5888 (0.6238) grad_norm 0.1548 (inf) loss_scale 262144.0000 (286529.4884) mem 28968MB [2024-03-10 05:12:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [436/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7562) loss 0.6216 (0.6247) grad_norm 0.1559 (inf) loss_scale 262144.0000 (280448.3192) mem 28968MB [2024-03-10 05:12:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 436 training takes 0:05:04 [2024-03-10 05:12:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [437/800][0/402] eta 0:21:46 lr 0.000025 time 3.2507 (3.2507) loss 0.6370 (0.6370) grad_norm 0.1191 (0.1191) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:13:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [437/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7705) loss 0.6142 (0.6273) grad_norm 0.1409 (0.1467) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:14:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [437/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7582) loss 0.6365 (0.6270) grad_norm 0.1428 (0.1462) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:15:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [437/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7541) loss 0.6247 (0.6258) grad_norm 0.1440 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:17:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [437/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7520) loss 0.6210 (0.6252) grad_norm 0.1556 (0.1460) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:17:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 437 training takes 0:05:02 [2024-03-10 05:17:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [438/800][0/402] eta 0:22:34 lr 0.000025 time 3.3705 
(3.3705) loss 0.6140 (0.6140) grad_norm 0.1287 (0.1287) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:18:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [438/800][100/402] eta 0:03:53 lr 0.000025 time 0.7457 (0.7716) loss 0.6257 (0.6219) grad_norm 0.1624 (0.1484) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:19:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [438/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7588) loss 0.6112 (0.6233) grad_norm 0.1509 (0.1479) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:21:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [438/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7544) loss 0.6393 (0.6238) grad_norm 0.1286 (0.1482) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:22:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [438/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7522) loss 0.6389 (0.6243) grad_norm 0.1269 (0.1475) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:22:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 438 training takes 0:05:02 [2024-03-10 05:22:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [439/800][0/402] eta 0:22:30 lr 0.000025 time 3.3597 (3.3597) loss 0.6584 (0.6584) grad_norm 0.1478 (0.1478) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:23:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [439/800][100/402] eta 0:03:53 lr 0.000025 time 0.7464 (0.7717) loss 0.6324 (0.6260) grad_norm 0.1177 (0.1408) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:24:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [439/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7588) loss 0.6126 (0.6242) grad_norm 0.1394 (0.1429) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:26:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [439/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7544) loss 0.6402 (0.6244) grad_norm 0.1319 (0.1435) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:27:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [439/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7522) loss 0.6299 (0.6244) grad_norm 0.1475 (0.1445) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:27:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 439 training takes 0:05:02 [2024-03-10 05:27:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [440/800][0/402] eta 0:22:41 lr 0.000025 time 3.3857 (3.3857) loss 0.6435 (0.6435) grad_norm 0.1597 (0.1597) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:28:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [440/800][100/402] eta 0:03:53 lr 0.000025 time 0.7463 (0.7718) loss 0.6508 (0.6238) grad_norm 0.1090 (0.1467) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:29:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [440/800][200/402] eta 0:02:33 lr 0.000025 time 0.7466 (0.7588) loss 0.6417 (0.6236) grad_norm 0.1219 (0.1439) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:31:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [440/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7545) loss 0.6645 (0.6238) grad_norm 0.1342 (0.1441) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:32:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [440/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7523) loss 0.6362 (0.6236) grad_norm 0.1365 (0.1446) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:32:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 440 training takes 0:05:02 [2024-03-10 05:32:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [441/800][0/402] eta 0:32:33 lr 0.000025 time 
4.8601 (4.8601) loss 0.6385 (0.6385) grad_norm 0.1514 (0.1514) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:33:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [441/800][100/402] eta 0:03:57 lr 0.000025 time 0.7465 (0.7871) loss 0.6234 (0.6259) grad_norm 0.1260 (0.1484) loss_scale 524288.0000 (477569.2673) mem 28968MB [2024-03-10 05:34:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [441/800][200/402] eta 0:02:34 lr 0.000025 time 0.7464 (0.7669) loss 0.6219 (0.6269) grad_norm 0.1799 (0.1466) loss_scale 524288.0000 (500812.4179) mem 28968MB [2024-03-10 05:36:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [441/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7600) loss 0.6125 (0.6254) grad_norm 0.1498 (0.1466) loss_scale 524288.0000 (508611.6146) mem 28968MB [2024-03-10 05:37:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [441/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7565) loss 0.6572 (0.6255) grad_norm 0.1229 (0.1462) loss_scale 524288.0000 (512520.9377) mem 28968MB [2024-03-10 05:37:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 441 training takes 0:05:04 [2024-03-10 05:37:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [442/800][0/402] eta 0:22:19 lr 0.000025 time 3.3318 (3.3318) loss 0.6577 (0.6577) grad_norm 0.1418 (0.1418) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:38:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [442/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7714) loss 0.5983 (0.6235) grad_norm 0.1271 (0.1468) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:39:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [442/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7586) loss 0.6315 (0.6235) grad_norm 0.1440 (0.1478) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:41:12 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [442/800][300/402] eta 0:01:16 lr 0.000025 time 0.7453 (0.7543) loss 0.6411 (0.6239) grad_norm 0.1447 (0.1477) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:42:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [442/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7522) loss 0.6261 (0.6244) grad_norm 0.1414 (0.1469) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:42:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 442 training takes 0:05:02 [2024-03-10 05:42:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [443/800][0/402] eta 0:22:32 lr 0.000025 time 3.3634 (3.3634) loss 0.6274 (0.6274) grad_norm 0.1440 (0.1440) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:43:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [443/800][100/402] eta 0:03:53 lr 0.000025 time 0.7455 (0.7716) loss 0.6260 (0.6240) grad_norm 0.1388 (0.1479) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:45:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [443/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7587) loss 0.6207 (0.6236) grad_norm 0.1422 (0.1458) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:46:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [443/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7544) loss 0.6393 (0.6235) grad_norm 0.1272 (0.1469) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:47:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [443/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7522) loss 0.6421 (0.6244) grad_norm 0.1226 (0.1462) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:47:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 443 training takes 0:05:02 [2024-03-10 05:47:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [444/800][0/402] eta 
0:21:48 lr 0.000025 time 3.2537 (3.2537) loss 0.6226 (0.6226) grad_norm 0.1495 (0.1495) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:48:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [444/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7705) loss 0.6123 (0.6245) grad_norm 0.1597 (0.1483) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:50:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [444/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7582) loss 0.5719 (0.6241) grad_norm 0.1521 (0.1471) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:51:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [444/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7541) loss 0.6274 (0.6251) grad_norm 0.1290 (0.1463) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:52:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [444/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7519) loss 0.5975 (0.6249) grad_norm 0.1392 (0.1456) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:52:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 444 training takes 0:05:02 [2024-03-10 05:52:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [445/800][0/402] eta 0:21:50 lr 0.000025 time 3.2602 (3.2602) loss 0.6119 (0.6119) grad_norm 0.1533 (0.1533) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:53:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [445/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7707) loss 0.6266 (0.6245) grad_norm 0.1343 (0.1452) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 05:55:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [445/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7582) loss 0.6645 (0.6254) grad_norm 0.1532 (inf) loss_scale 262144.0000 (513854.4080) mem 28968MB [2024-03-10 05:56:19 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [445/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7540) loss 0.6141 (0.6244) grad_norm 0.1825 (inf) loss_scale 262144.0000 (430229.6877) mem 28968MB [2024-03-10 05:57:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [445/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7519) loss 0.6174 (0.6245) grad_norm 0.1344 (inf) loss_scale 262144.0000 (388313.0574) mem 28968MB [2024-03-10 05:57:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 445 training takes 0:05:02 [2024-03-10 05:57:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [446/800][0/402] eta 0:32:25 lr 0.000025 time 4.8395 (4.8395) loss 0.6150 (0.6150) grad_norm 0.1777 (0.1777) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 05:58:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [446/800][100/402] eta 0:03:57 lr 0.000025 time 0.7461 (0.7860) loss 0.6645 (0.6246) grad_norm 0.1579 (inf) loss_scale 131072.0000 (236189.1485) mem 28968MB [2024-03-10 06:00:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [446/800][200/402] eta 0:02:34 lr 0.000025 time 0.7456 (0.7659) loss 0.6278 (0.6248) grad_norm 0.1324 (inf) loss_scale 131072.0000 (183892.0597) mem 28968MB [2024-03-10 06:01:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [446/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7592) loss 0.6202 (0.6244) grad_norm 0.1448 (inf) loss_scale 131072.0000 (166343.8671) mem 28968MB [2024-03-10 06:02:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [446/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7558) loss 0.6137 (0.6251) grad_norm 0.1467 (inf) loss_scale 131072.0000 (157547.8903) mem 28968MB [2024-03-10 06:02:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 446 training takes 0:05:03 [2024-03-10 06:02:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [447/800][0/402] 
eta 0:21:52 lr 0.000025 time 3.2660 (3.2660) loss 0.6169 (0.6169) grad_norm 0.1426 (0.1426) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:03:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [447/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7707) loss 0.6186 (0.6237) grad_norm 0.1350 (0.1472) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:05:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [447/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7583) loss 0.6535 (0.6251) grad_norm 0.1297 (0.1461) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:06:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [447/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7542) loss 0.6055 (0.6248) grad_norm 0.1291 (0.1455) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:07:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [447/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7520) loss 0.5866 (0.6247) grad_norm 0.1707 (0.1449) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:07:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 447 training takes 0:05:02 [2024-03-10 06:07:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [448/800][0/402] eta 0:22:09 lr 0.000025 time 3.3075 (3.3075) loss 0.6615 (0.6615) grad_norm 0.1427 (0.1427) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:08:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [448/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7711) loss 0.6188 (0.6258) grad_norm 0.1495 (0.1475) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:10:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [448/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7585) loss 0.6379 (0.6248) grad_norm 0.1387 (0.1472) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:11:28 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [448/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7543) loss 0.6176 (0.6242) grad_norm 0.1464 (0.1508) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:12:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [448/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7521) loss 0.6203 (0.6250) grad_norm 0.1273 (0.1492) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:12:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 448 training takes 0:05:02 [2024-03-10 06:12:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [449/800][0/402] eta 0:22:00 lr 0.000025 time 3.2839 (3.2839) loss 0.6103 (0.6103) grad_norm 0.1300 (0.1300) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:14:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [449/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7708) loss 0.5971 (0.6258) grad_norm 0.1645 (0.1405) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:15:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [449/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7583) loss 0.6226 (0.6253) grad_norm 0.1288 (0.1424) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:16:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [449/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7542) loss 0.6168 (0.6251) grad_norm 0.1425 (0.1428) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:17:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [449/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7521) loss 0.6138 (0.6244) grad_norm 0.1424 (0.1432) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:17:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 449 training takes 0:05:02 [2024-03-10 06:17:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[450/800][0/402] eta 0:21:39 lr 0.000025 time 3.2315 (3.2315) loss 0.5945 (0.5945) grad_norm 0.1594 (0.1594) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:19:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [450/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7703) loss 0.6009 (0.6231) grad_norm 0.1303 (0.1447) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:20:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [450/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7580) loss 0.6089 (0.6229) grad_norm 0.1340 (0.1448) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:21:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [450/800][300/402] eta 0:01:16 lr 0.000025 time 0.7464 (0.7540) loss 0.6162 (0.6239) grad_norm 0.1510 (0.1448) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:22:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [450/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7519) loss 0.6504 (0.6248) grad_norm 0.1366 (0.1451) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:22:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 450 training takes 0:05:02 [2024-03-10 06:22:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [451/800][0/402] eta 0:32:20 lr 0.000025 time 4.8279 (4.8279) loss 0.6312 (0.6312) grad_norm 0.1653 (0.1653) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 06:24:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [451/800][100/402] eta 0:03:57 lr 0.000025 time 0.7455 (0.7861) loss 0.6447 (0.6256) grad_norm 0.1633 (0.1438) loss_scale 262144.0000 (170004.2772) mem 28968MB [2024-03-10 06:25:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [451/800][200/402] eta 0:02:34 lr 0.000025 time 0.7454 (0.7660) loss 0.6095 (0.6242) grad_norm 0.1396 (0.1439) loss_scale 262144.0000 (215844.9353) mem 28968MB [2024-03-10 
06:26:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [451/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7593) loss 0.6416 (0.6241) grad_norm 0.1277 (0.1454) loss_scale 262144.0000 (231226.6844) mem 28968MB [2024-03-10 06:27:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [451/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7559) loss 0.6041 (0.6240) grad_norm 0.1820 (0.1460) loss_scale 262144.0000 (238936.7382) mem 28968MB [2024-03-10 06:27:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 451 training takes 0:05:03 [2024-03-10 06:27:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [452/800][0/402] eta 0:22:00 lr 0.000025 time 3.2859 (3.2859) loss 0.6166 (0.6166) grad_norm 0.1453 (0.1453) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:29:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [452/800][100/402] eta 0:03:53 lr 0.000025 time 0.7470 (0.7719) loss 0.5946 (0.6229) grad_norm 0.1594 (0.1405) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:30:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [452/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7594) loss 0.6296 (0.6239) grad_norm 0.1306 (0.1436) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:31:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [452/800][300/402] eta 0:01:17 lr 0.000025 time 0.7467 (0.7552) loss 0.6324 (0.6239) grad_norm 0.1414 (0.1446) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:32:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [452/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7531) loss 0.6182 (0.6240) grad_norm 0.1481 (0.1450) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:32:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 452 training takes 0:05:02 [2024-03-10 06:32:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): 
INFO Train: [453/800][0/402] eta 0:22:04 lr 0.000025 time 3.2941 (3.2941) loss 0.6276 (0.6276) grad_norm 0.1445 (0.1445) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:34:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [453/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7709) loss 0.6335 (0.6230) grad_norm 0.1296 (0.1440) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:35:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [453/800][200/402] eta 0:02:33 lr 0.000025 time 0.7468 (0.7584) loss 0.6241 (0.6238) grad_norm 0.1309 (0.1463) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:36:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [453/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7542) loss 0.6519 (0.6239) grad_norm 0.1700 (0.1467) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:37:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [453/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.6367 (0.6242) grad_norm 0.1367 (0.1463) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:37:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 453 training takes 0:05:02 [2024-03-10 06:38:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [454/800][0/402] eta 0:22:14 lr 0.000025 time 3.3184 (3.3184) loss 0.6328 (0.6328) grad_norm 0.1772 (0.1772) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:39:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [454/800][100/402] eta 0:03:52 lr 0.000025 time 0.7462 (0.7713) loss 0.6032 (0.6233) grad_norm 0.1584 (0.1476) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:40:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [454/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7586) loss 0.6104 (0.6251) grad_norm 0.1377 (0.1453) loss_scale 262144.0000 (262144.0000) mem 28968MB 
[2024-03-10 06:41:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [454/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7544) loss 0.6074 (0.6241) grad_norm 0.1352 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:42:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [454/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7522) loss 0.6172 (0.6242) grad_norm 0.1446 (0.1453) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:43:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 454 training takes 0:05:02 [2024-03-10 06:43:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [455/800][0/402] eta 0:22:12 lr 0.000025 time 3.3151 (3.3151) loss 0.6096 (0.6096) grad_norm 0.1647 (0.1647) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:44:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [455/800][100/402] eta 0:03:52 lr 0.000025 time 0.7462 (0.7712) loss 0.6200 (0.6248) grad_norm 0.1426 (0.1441) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:45:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [455/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7585) loss 0.6044 (0.6245) grad_norm 0.1507 (0.1455) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:46:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [455/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7543) loss 0.6355 (0.6248) grad_norm 0.1145 (0.1455) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:48:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [455/800][400/402] eta 0:00:01 lr 0.000025 time 0.7437 (0.7521) loss 0.6317 (0.6250) grad_norm 0.1585 (0.1456) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:48:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 455 training takes 0:05:02 [2024-03-10 06:48:07 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [456/800][0/402] eta 0:31:18 lr 0.000025 time 4.6722 (4.6722) loss 0.6296 (0.6296) grad_norm 0.1176 (0.1176) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 06:49:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [456/800][100/402] eta 0:03:56 lr 0.000025 time 0.7460 (0.7846) loss 0.6432 (0.6225) grad_norm 0.1468 (0.1474) loss_scale 524288.0000 (365963.4059) mem 28968MB [2024-03-10 06:50:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [456/800][200/402] eta 0:02:34 lr 0.000025 time 0.7462 (0.7653) loss 0.6244 (0.6234) grad_norm 0.1289 (0.1474) loss_scale 524288.0000 (444731.8607) mem 28968MB [2024-03-10 06:51:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [456/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7588) loss 0.6431 (0.6244) grad_norm 0.1295 (0.1475) loss_scale 524288.0000 (471162.4718) mem 28968MB [2024-03-10 06:53:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [456/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7555) loss 0.5996 (0.6248) grad_norm 0.1394 (0.1474) loss_scale 524288.0000 (484410.7332) mem 28968MB [2024-03-10 06:53:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 456 training takes 0:05:03 [2024-03-10 06:53:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [457/800][0/402] eta 0:21:45 lr 0.000025 time 3.2483 (3.2483) loss 0.6010 (0.6010) grad_norm 0.1334 (0.1334) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 06:54:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [457/800][100/402] eta 0:03:52 lr 0.000025 time 0.7450 (0.7704) loss 0.5990 (0.6243) grad_norm 0.1460 (0.1416) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 06:55:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [457/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7582) loss 0.6167 (0.6239) grad_norm 0.1647 (0.1458) loss_scale 524288.0000 
(524288.0000) mem 28968MB [2024-03-10 06:56:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [457/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7541) loss 0.6330 (0.6246) grad_norm 0.1229 (0.1463) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 06:58:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [457/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7519) loss 0.6572 (0.6246) grad_norm 0.1581 (0.1461) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 06:58:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 457 training takes 0:05:02 [2024-03-10 06:58:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [458/800][0/402] eta 0:22:12 lr 0.000025 time 3.3146 (3.3146) loss 0.6354 (0.6354) grad_norm 0.1253 (0.1253) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 06:59:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [458/800][100/402] eta 0:03:52 lr 0.000025 time 0.7454 (0.7711) loss 0.6290 (0.6229) grad_norm 0.1555 (0.1486) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:00:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [458/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7584) loss 0.6205 (0.6241) grad_norm 0.1338 (0.1477) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:01:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [458/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7542) loss 0.6305 (0.6241) grad_norm 0.1427 (0.1465) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:03:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [458/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.6257 (0.6241) grad_norm 0.1642 (0.1470) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:03:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 458 training takes 0:05:02 [2024-03-10 07:03:14 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [459/800][0/402] eta 0:21:42 lr 0.000025 time 3.2394 (3.2394) loss 0.6254 (0.6254) grad_norm 0.1559 (0.1559) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:04:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [459/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7704) loss 0.5847 (0.6233) grad_norm 0.1637 (0.1451) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:05:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [459/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7581) loss 0.6441 (0.6246) grad_norm 0.1410 (0.1458) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:06:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [459/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7541) loss 0.6248 (0.6237) grad_norm 0.1350 (0.1464) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:08:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [459/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7519) loss 0.6278 (0.6238) grad_norm 0.1480 (0.1454) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:08:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 459 training takes 0:05:02 [2024-03-10 07:08:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [460/800][0/402] eta 0:21:31 lr 0.000025 time 3.2136 (3.2136) loss 0.6365 (0.6365) grad_norm 0.1301 (0.1301) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:09:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [460/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7701) loss 0.6262 (0.6238) grad_norm 0.1401 (0.1448) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:10:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [460/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7580) loss 0.6260 (0.6255) grad_norm 0.1479 (0.1441) 
loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:12:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [460/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7539) loss 0.5968 (0.6241) grad_norm 0.1517 (0.1462) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:13:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [460/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7518) loss 0.6160 (0.6235) grad_norm 0.1244 (0.1466) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:13:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 460 training takes 0:05:02 [2024-03-10 07:13:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [461/800][0/402] eta 0:31:18 lr 0.000025 time 4.6734 (4.6734) loss 0.6115 (0.6115) grad_norm 0.1403 (0.1403) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:14:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [461/800][100/402] eta 0:03:56 lr 0.000025 time 0.7454 (0.7846) loss 0.6193 (0.6251) grad_norm 0.1397 (inf) loss_scale 524288.0000 (726735.8416) mem 28968MB [2024-03-10 07:15:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [461/800][200/402] eta 0:02:34 lr 0.000025 time 0.7457 (0.7653) loss 0.6016 (0.6239) grad_norm 0.1548 (inf) loss_scale 524288.0000 (626015.5224) mem 28968MB [2024-03-10 07:17:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [461/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7588) loss 0.6475 (0.6244) grad_norm 0.1377 (inf) loss_scale 524288.0000 (592219.0033) mem 28968MB [2024-03-10 07:18:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [461/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7555) loss 0.6027 (0.6235) grad_norm 0.1474 (inf) loss_scale 524288.0000 (575278.6035) mem 28968MB [2024-03-10 07:18:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 461 training takes 0:05:03 [2024-03-10 07:18:22 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [462/800][0/402] eta 0:21:36 lr 0.000025 time 3.2260 (3.2260) loss 0.6076 (0.6076) grad_norm 0.1437 (0.1437) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:19:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [462/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7703) loss 0.5977 (0.6237) grad_norm 0.1402 (0.1438) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:20:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [462/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7581) loss 0.6248 (0.6232) grad_norm 0.1780 (0.1448) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:22:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [462/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7540) loss 0.6265 (0.6242) grad_norm 0.1766 (0.1455) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:23:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [462/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7519) loss 0.6543 (0.6248) grad_norm 0.1374 (0.1460) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:23:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 462 training takes 0:05:02 [2024-03-10 07:23:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [463/800][0/402] eta 0:22:45 lr 0.000025 time 3.3975 (3.3975) loss 0.6239 (0.6239) grad_norm 0.1546 (0.1546) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:24:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [463/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7720) loss 0.6005 (0.6224) grad_norm 0.1515 (0.1439) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:25:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [463/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7590) loss 0.6346 (0.6244) grad_norm 0.1608 (0.1439) 
loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:27:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [463/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7546) loss 0.6089 (0.6247) grad_norm 0.1638 (0.1451) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:28:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [463/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7523) loss 0.6587 (0.6241) grad_norm 0.1303 (0.1468) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:28:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 463 training takes 0:05:02 [2024-03-10 07:28:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [464/800][0/402] eta 0:21:56 lr 0.000025 time 3.2754 (3.2754) loss 0.6319 (0.6319) grad_norm 0.1276 (0.1276) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:29:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [464/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7708) loss 0.6074 (0.6230) grad_norm 0.2073 (0.1506) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:30:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [464/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7584) loss 0.5980 (0.6237) grad_norm 0.1316 (0.1467) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:32:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [464/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7542) loss 0.6118 (0.6241) grad_norm 0.1521 (0.1461) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:33:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [464/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7521) loss 0.5892 (0.6236) grad_norm 0.1415 (0.1469) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:33:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 464 training takes 0:05:02 [2024-03-10 
07:33:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [465/800][0/402] eta 0:21:36 lr 0.000025 time 3.2254 (3.2254) loss 0.6253 (0.6253) grad_norm 0.1418 (0.1418) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:34:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [465/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7703) loss 0.6200 (0.6251) grad_norm 0.1420 (0.1478) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:35:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [465/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7580) loss 0.6476 (0.6260) grad_norm 0.1506 (0.1471) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 07:37:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [465/800][300/402] eta 0:01:16 lr 0.000025 time 0.7450 (0.7539) loss 0.6333 (0.6257) grad_norm 0.1255 (inf) loss_scale 262144.0000 (515578.8970) mem 28968MB [2024-03-10 07:38:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [465/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7518) loss 0.6584 (0.6262) grad_norm 0.1300 (inf) loss_scale 262144.0000 (452378.1746) mem 28968MB [2024-03-10 07:38:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 465 training takes 0:05:02 [2024-03-10 07:38:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [466/800][0/402] eta 0:32:17 lr 0.000025 time 4.8193 (4.8193) loss 0.6111 (0.6111) grad_norm 0.1443 (0.1443) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:39:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [466/800][100/402] eta 0:03:57 lr 0.000025 time 0.7455 (0.7863) loss 0.6054 (0.6255) grad_norm 0.1466 (0.1424) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:41:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [466/800][200/402] eta 0:02:34 lr 0.000025 time 0.7463 (0.7662) loss 0.6244 (0.6234) grad_norm 0.1376 (0.1450) 
loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:42:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [466/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7596) loss 0.6431 (0.6240) grad_norm 0.1269 (0.1459) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:43:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [466/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7561) loss 0.6343 (0.6243) grad_norm 0.1296 (0.1462) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:43:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 466 training takes 0:05:04 [2024-03-10 07:43:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [467/800][0/402] eta 0:22:36 lr 0.000025 time 3.3733 (3.3733) loss 0.6448 (0.6448) grad_norm 0.1333 (0.1333) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:44:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [467/800][100/402] eta 0:03:53 lr 0.000025 time 0.7457 (0.7718) loss 0.6272 (0.6245) grad_norm 0.1323 (0.1436) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:46:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [467/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7588) loss 0.6348 (0.6239) grad_norm 0.1500 (0.1458) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:47:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [467/800][300/402] eta 0:01:16 lr 0.000025 time 0.7468 (0.7544) loss 0.6353 (0.6244) grad_norm 0.1259 (0.1462) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:48:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [467/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7522) loss 0.6519 (0.6239) grad_norm 0.1283 (0.1468) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:48:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 467 training takes 0:05:02 [2024-03-10 
07:48:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [468/800][0/402] eta 0:22:25 lr 0.000025 time 3.3460 (3.3460) loss 0.6286 (0.6286) grad_norm 0.1462 (0.1462) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:49:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [468/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7714) loss 0.6196 (0.6261) grad_norm 0.1308 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:51:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [468/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7586) loss 0.6434 (0.6232) grad_norm 0.1568 (0.1492) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:52:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [468/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7543) loss 0.6464 (0.6230) grad_norm 0.1470 (0.1481) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:53:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [468/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7522) loss 0.6058 (0.6232) grad_norm 0.1409 (0.1482) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:53:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 468 training takes 0:05:02 [2024-03-10 07:53:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [469/800][0/402] eta 0:22:03 lr 0.000025 time 3.2914 (3.2914) loss 0.6139 (0.6139) grad_norm 0.1901 (0.1901) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:54:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [469/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7709) loss 0.6163 (0.6232) grad_norm 0.1488 (0.1508) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:56:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [469/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7584) loss 0.6210 (0.6240) grad_norm 0.1440 
(0.1468) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:57:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [469/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7542) loss 0.6193 (0.6240) grad_norm 0.1470 (0.1451) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:58:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [469/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7520) loss 0.6076 (0.6236) grad_norm 0.1434 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:58:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 469 training takes 0:05:02 [2024-03-10 07:58:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [470/800][0/402] eta 0:21:45 lr 0.000025 time 3.2480 (3.2480) loss 0.6311 (0.6311) grad_norm 0.1358 (0.1358) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 07:59:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [470/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7704) loss 0.6276 (0.6248) grad_norm 0.1632 (0.1468) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:01:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [470/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7581) loss 0.6359 (0.6244) grad_norm 0.1602 (0.1456) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:02:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [470/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7540) loss 0.6422 (0.6243) grad_norm 0.1552 (0.1454) loss_scale 524288.0000 (279562.2060) mem 28968MB [2024-03-10 08:03:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [470/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7519) loss 0.6184 (0.6238) grad_norm 0.1835 (0.1455) loss_scale 524288.0000 (340591.0823) mem 28968MB [2024-03-10 08:03:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 470 training takes 0:05:02 
[2024-03-10 08:03:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [471/800][0/402] eta 0:31:59 lr 0.000025 time 4.7742 (4.7742) loss 0.6039 (0.6039) grad_norm 0.1482 (0.1482) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:05:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [471/800][100/402] eta 0:03:57 lr 0.000025 time 0.7460 (0.7856) loss 0.6194 (0.6252) grad_norm 0.1626 (0.1497) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:06:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [471/800][200/402] eta 0:02:34 lr 0.000025 time 0.7456 (0.7658) loss 0.6153 (0.6256) grad_norm 0.1577 (0.1464) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:07:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [471/800][300/402] eta 0:01:17 lr 0.000025 time 0.7454 (0.7593) loss 0.6382 (0.6248) grad_norm 0.1262 (0.1466) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:08:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [471/800][400/402] eta 0:00:01 lr 0.000025 time 0.7452 (0.7559) loss 0.6307 (0.6246) grad_norm 0.1435 (0.1463) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:08:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 471 training takes 0:05:03 [2024-03-10 08:08:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [472/800][0/402] eta 0:22:34 lr 0.000025 time 3.3682 (3.3682) loss 0.6291 (0.6291) grad_norm 0.1262 (0.1262) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:10:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [472/800][100/402] eta 0:03:53 lr 0.000025 time 0.7458 (0.7716) loss 0.6348 (0.6238) grad_norm 0.1376 (0.1462) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:11:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [472/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7587) loss 0.5900 (0.6244) 
grad_norm 0.1349 (0.1456) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:12:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [472/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7544) loss 0.6057 (0.6249) grad_norm 0.1435 (0.1443) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:13:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [472/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7522) loss 0.6227 (0.6251) grad_norm 0.1275 (0.1451) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:13:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 472 training takes 0:05:02 [2024-03-10 08:13:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [473/800][0/402] eta 0:21:58 lr 0.000025 time 3.2807 (3.2807) loss 0.6326 (0.6326) grad_norm 0.1442 (0.1442) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:15:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [473/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7708) loss 0.5678 (0.6237) grad_norm 0.1395 (0.1472) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:16:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [473/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7583) loss 0.6158 (0.6227) grad_norm 0.2267 (0.1476) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:17:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [473/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7542) loss 0.6176 (0.6236) grad_norm 0.1453 (0.1480) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:18:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [473/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7523) loss 0.6207 (0.6245) grad_norm 0.1707 (0.1470) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:18:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 473 training 
takes 0:05:02 [2024-03-10 08:18:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [474/800][0/402] eta 0:22:02 lr 0.000025 time 3.2888 (3.2888) loss 0.6415 (0.6415) grad_norm 0.1413 (0.1413) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:20:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [474/800][100/402] eta 0:03:52 lr 0.000025 time 0.7465 (0.7709) loss 0.6237 (0.6257) grad_norm 0.1417 (0.1483) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:21:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [474/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7584) loss 0.6324 (0.6241) grad_norm 0.1333 (0.1455) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:22:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [474/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7542) loss 0.6201 (0.6244) grad_norm 0.1575 (0.1460) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:23:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [474/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7521) loss 0.6411 (0.6251) grad_norm 0.1776 (0.1460) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:23:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 474 training takes 0:05:02 [2024-03-10 08:23:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [475/800][0/402] eta 0:22:23 lr 0.000025 time 3.3417 (3.3417) loss 0.6297 (0.6297) grad_norm 0.1475 (0.1475) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:25:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [475/800][100/402] eta 0:03:52 lr 0.000025 time 0.7459 (0.7714) loss 0.6022 (0.6278) grad_norm 0.1433 (0.1431) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:26:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [475/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7587) loss 0.6178 
(0.6248) grad_norm 0.1816 (0.1448) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:27:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [475/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7543) loss 0.6330 (0.6240) grad_norm 0.1514 (0.1453) loss_scale 1048576.0000 (576542.6179) mem 28968MB [2024-03-10 08:28:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [475/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7521) loss 0.6147 (0.6236) grad_norm 0.1613 (inf) loss_scale 524288.0000 (573971.1521) mem 28968MB [2024-03-10 08:28:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 475 training takes 0:05:02 [2024-03-10 08:29:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [476/800][0/402] eta 0:32:53 lr 0.000025 time 4.9080 (4.9080) loss 0.6434 (0.6434) grad_norm 0.1495 (0.1495) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:30:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [476/800][100/402] eta 0:03:57 lr 0.000025 time 0.7455 (0.7870) loss 0.6288 (0.6216) grad_norm 0.1335 (0.1454) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:31:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [476/800][200/402] eta 0:02:34 lr 0.000025 time 0.7460 (0.7665) loss 0.6457 (0.6227) grad_norm 0.1238 (0.1472) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:32:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [476/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7596) loss 0.6226 (0.6238) grad_norm 0.1340 (0.1468) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:33:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [476/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7561) loss 0.5963 (0.6241) grad_norm 0.1718 (0.1470) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:34:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 476 
training takes 0:05:04 [2024-03-10 08:34:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [477/800][0/402] eta 0:23:08 lr 0.000025 time 3.4533 (3.4533) loss 0.6337 (0.6337) grad_norm 0.1494 (0.1494) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:35:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [477/800][100/402] eta 0:03:53 lr 0.000025 time 0.7458 (0.7726) loss 0.5846 (0.6222) grad_norm 0.1261 (0.1431) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:36:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [477/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7593) loss 0.6395 (0.6235) grad_norm 0.1401 (0.1466) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 08:37:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [477/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7547) loss 0.6434 (0.6242) grad_norm 0.1200 (nan) loss_scale 262144.0000 (495547.9601) mem 28968MB [2024-03-10 08:39:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [477/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7524) loss 0.6589 (0.6233) grad_norm 0.1256 (nan) loss_scale 262144.0000 (437342.4838) mem 28968MB [2024-03-10 08:39:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 477 training takes 0:05:02 [2024-03-10 08:39:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [478/800][0/402] eta 0:21:48 lr 0.000025 time 3.2545 (3.2545) loss 0.6673 (0.6673) grad_norm 0.1335 (0.1335) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:40:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [478/800][100/402] eta 0:03:52 lr 0.000025 time 0.7468 (0.7707) loss 0.6388 (0.6240) grad_norm 0.1269 (0.1452) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:41:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [478/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7585) loss 0.6593 
(0.6242) grad_norm 0.1260 (0.1446) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:42:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [478/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7543) loss 0.6260 (0.6235) grad_norm 0.1322 (0.1459) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:44:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [478/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7521) loss 0.6201 (0.6241) grad_norm 0.1287 (0.1450) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:44:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 478 training takes 0:05:02 [2024-03-10 08:44:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [479/800][0/402] eta 0:22:18 lr 0.000025 time 3.3301 (3.3301) loss 0.6250 (0.6250) grad_norm 0.1668 (0.1668) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:45:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [479/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7713) loss 0.5934 (0.6239) grad_norm 0.1330 (0.1475) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:46:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [479/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7586) loss 0.6339 (0.6245) grad_norm 0.1458 (0.1469) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:47:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [479/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7543) loss 0.5928 (0.6234) grad_norm 0.1572 (0.1462) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:49:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [479/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7521) loss 0.6122 (0.6242) grad_norm 0.1708 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:49:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 479 
training takes 0:05:02 [2024-03-10 08:49:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [480/800][0/402] eta 0:23:13 lr 0.000025 time 3.4676 (3.4676) loss 0.6111 (0.6111) grad_norm 0.2308 (0.2308) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:50:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [480/800][100/402] eta 0:03:53 lr 0.000025 time 0.7452 (0.7726) loss 0.6050 (0.6231) grad_norm 0.1510 (0.1463) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:51:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [480/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7593) loss 0.5970 (0.6236) grad_norm 0.1643 (0.1466) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:52:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [480/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7550) loss 0.6193 (0.6234) grad_norm 0.1587 (0.1470) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:54:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [480/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7527) loss 0.6052 (0.6235) grad_norm 0.1593 (0.1468) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:54:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 480 training takes 0:05:02 [2024-03-10 08:54:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [481/800][0/402] eta 0:33:13 lr 0.000025 time 4.9595 (4.9595) loss 0.6392 (0.6392) grad_norm 0.1271 (0.1271) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:55:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [481/800][100/402] eta 0:03:57 lr 0.000025 time 0.7479 (0.7881) loss 0.6336 (0.6227) grad_norm 0.1339 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:56:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [481/800][200/402] eta 0:02:34 lr 0.000025 time 0.7466 (0.7673) loss 
0.6456 (0.6250) grad_norm 0.1517 (0.1470) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:57:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [481/800][300/402] eta 0:01:17 lr 0.000025 time 0.7452 (0.7602) loss 0.5767 (0.6243) grad_norm 0.1337 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:59:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [481/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7566) loss 0.6387 (0.6238) grad_norm 0.1403 (0.1458) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 08:59:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 481 training takes 0:05:04 [2024-03-10 08:59:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [482/800][0/402] eta 0:24:43 lr 0.000025 time 3.6892 (3.6892) loss 0.6295 (0.6295) grad_norm 0.1522 (0.1522) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:00:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [482/800][100/402] eta 0:03:54 lr 0.000025 time 0.7459 (0.7757) loss 0.6358 (0.6268) grad_norm 0.1203 (0.1456) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:01:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [482/800][200/402] eta 0:02:33 lr 0.000025 time 0.7468 (0.7611) loss 0.6429 (0.6265) grad_norm 0.1277 (0.1462) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:03:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [482/800][300/402] eta 0:01:17 lr 0.000025 time 0.7467 (0.7563) loss 0.6187 (0.6257) grad_norm 0.1483 (0.1462) loss_scale 524288.0000 (299593.1429) mem 28968MB [2024-03-10 09:04:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [482/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7540) loss 0.5959 (0.6245) grad_norm 0.1361 (0.1454) loss_scale 524288.0000 (355626.7731) mem 28968MB [2024-03-10 09:04:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 
482 training takes 0:05:03 [2024-03-10 09:04:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [483/800][0/402] eta 0:24:55 lr 0.000025 time 3.7203 (3.7203) loss 0.6349 (0.6349) grad_norm 0.1445 (0.1445) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:05:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [483/800][100/402] eta 0:03:54 lr 0.000025 time 0.7460 (0.7758) loss 0.6305 (0.6244) grad_norm 0.1525 (0.1462) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:06:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [483/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7614) loss 0.6080 (0.6235) grad_norm 0.1711 (0.1462) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:08:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [483/800][300/402] eta 0:01:17 lr 0.000025 time 0.7470 (0.7564) loss 0.6286 (0.6239) grad_norm 0.1521 (0.1469) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:09:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [483/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7539) loss 0.6498 (0.6242) grad_norm 0.1518 (0.1470) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:09:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 483 training takes 0:05:03 [2024-03-10 09:09:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [484/800][0/402] eta 0:24:50 lr 0.000025 time 3.7072 (3.7072) loss 0.6333 (0.6333) grad_norm 0.1602 (0.1602) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:10:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [484/800][100/402] eta 0:03:54 lr 0.000025 time 0.7454 (0.7756) loss 0.6169 (0.6226) grad_norm 0.1592 (inf) loss_scale 262144.0000 (314053.7030) mem 28968MB [2024-03-10 09:11:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [484/800][200/402] eta 0:02:33 lr 0.000025 time 0.7472 (0.7612) loss 
0.6365 (0.6226) grad_norm 0.1471 (inf) loss_scale 262144.0000 (288227.9801) mem 28968MB [2024-03-10 09:13:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [484/800][300/402] eta 0:01:17 lr 0.000025 time 0.7455 (0.7563) loss 0.6709 (0.6234) grad_norm 0.1352 (inf) loss_scale 262144.0000 (279562.2060) mem 28968MB [2024-03-10 09:14:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [484/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7538) loss 0.5842 (0.6236) grad_norm 0.1424 (inf) loss_scale 262144.0000 (275218.5137) mem 28968MB [2024-03-10 09:14:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 484 training takes 0:05:03 [2024-03-10 09:14:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [485/800][0/402] eta 0:24:43 lr 0.000025 time 3.6911 (3.6911) loss 0.6281 (0.6281) grad_norm 0.1454 (0.1454) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:15:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [485/800][100/402] eta 0:03:54 lr 0.000025 time 0.7458 (0.7761) loss 0.6013 (0.6234) grad_norm 0.1422 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:16:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [485/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7613) loss 0.6223 (0.6244) grad_norm 0.1659 (0.1458) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:18:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [485/800][300/402] eta 0:01:17 lr 0.000025 time 0.7466 (0.7563) loss 0.6313 (0.6235) grad_norm 0.1182 (0.1455) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:19:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [485/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7538) loss 0.6166 (0.6232) grad_norm 0.1422 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:19:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 485 
training takes 0:05:03 [2024-03-10 09:19:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [486/800][0/402] eta 0:36:47 lr 0.000025 time 5.4924 (5.4924) loss 0.6417 (0.6417) grad_norm 0.1647 (0.1647) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:20:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [486/800][100/402] eta 0:03:59 lr 0.000025 time 0.7459 (0.7930) loss 0.6666 (0.6267) grad_norm 0.1399 (0.1479) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:22:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [486/800][200/402] eta 0:02:35 lr 0.000025 time 0.7456 (0.7698) loss 0.5835 (0.6246) grad_norm 0.1756 (0.1464) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:23:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [486/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7621) loss 0.6193 (0.6242) grad_norm 0.1623 (0.1465) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:24:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [486/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7582) loss 0.6410 (0.6239) grad_norm 0.1503 (0.1466) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:24:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 486 training takes 0:05:04 [2024-03-10 09:24:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [487/800][0/402] eta 0:25:21 lr 0.000025 time 3.7838 (3.7838) loss 0.5886 (0.5886) grad_norm 0.1600 (0.1600) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:25:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [487/800][100/402] eta 0:03:54 lr 0.000025 time 0.7463 (0.7770) loss 0.6014 (0.6246) grad_norm 0.1557 (0.1464) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:27:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [487/800][200/402] eta 0:02:33 lr 0.000025 time 0.7467 (0.7619) loss 
0.6411 (0.6255) grad_norm 0.1443 (0.1455) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:28:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [487/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7567) loss 0.5990 (0.6250) grad_norm 0.1247 (0.1455) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:29:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [487/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7541) loss 0.6529 (0.6243) grad_norm 0.1441 (0.1460) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:29:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 487 training takes 0:05:03 [2024-03-10 09:29:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [488/800][0/402] eta 0:25:08 lr 0.000025 time 3.7531 (3.7531) loss 0.6454 (0.6454) grad_norm 0.1348 (0.1348) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:30:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [488/800][100/402] eta 0:03:54 lr 0.000025 time 0.7455 (0.7767) loss 0.6315 (0.6276) grad_norm 0.1354 (0.1447) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:32:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [488/800][200/402] eta 0:02:33 lr 0.000025 time 0.7478 (0.7617) loss 0.6287 (0.6265) grad_norm 0.1470 (0.1445) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:33:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [488/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7567) loss 0.6467 (0.6249) grad_norm 0.1496 (0.1458) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:34:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [488/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7541) loss 0.6280 (0.6245) grad_norm 0.1310 (0.1465) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:34:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 
488 training takes 0:05:03 [2024-03-10 09:34:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [489/800][0/402] eta 0:24:51 lr 0.000025 time 3.7092 (3.7092) loss 0.5746 (0.5746) grad_norm 0.1436 (0.1436) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 09:35:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [489/800][100/402] eta 0:03:54 lr 0.000025 time 0.7459 (0.7759) loss 0.6292 (0.6235) grad_norm 0.1594 (0.1487) loss_scale 524288.0000 (498333.1485) mem 28968MB [2024-03-10 09:37:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [489/800][200/402] eta 0:02:33 lr 0.000025 time 0.7467 (0.7615) loss 0.6328 (0.6234) grad_norm 0.1610 (0.1460) loss_scale 524288.0000 (511246.0100) mem 28968MB [2024-03-10 09:38:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [489/800][300/402] eta 0:01:17 lr 0.000025 time 0.7476 (0.7567) loss 0.6334 (0.6236) grad_norm 0.1234 (0.1455) loss_scale 524288.0000 (515578.8970) mem 28968MB [2024-03-10 09:39:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [489/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7541) loss 0.6228 (0.6230) grad_norm 0.1580 (0.1457) loss_scale 524288.0000 (517750.7431) mem 28968MB [2024-03-10 09:39:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 489 training takes 0:05:03 [2024-03-10 09:39:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [490/800][0/402] eta 0:24:38 lr 0.000025 time 3.6778 (3.6778) loss 0.6163 (0.6163) grad_norm 0.1525 (0.1525) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:41:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [490/800][100/402] eta 0:03:54 lr 0.000025 time 0.7470 (0.7753) loss 0.5934 (0.6249) grad_norm 0.1632 (0.1461) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:42:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [490/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7608) 
loss 0.6359 (0.6246) grad_norm 0.1043 (0.1459) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:43:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [490/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7561) loss 0.6188 (0.6242) grad_norm 0.1347 (0.1464) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:44:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [490/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7537) loss 0.5952 (0.6241) grad_norm 0.1672 (0.1462) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:44:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 490 training takes 0:05:03 [2024-03-10 09:44:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [491/800][0/402] eta 0:37:01 lr 0.000025 time 5.5264 (5.5264) loss 0.6091 (0.6091) grad_norm 0.1410 (0.1410) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:46:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [491/800][100/402] eta 0:03:59 lr 0.000025 time 0.7464 (0.7938) loss 0.6240 (0.6252) grad_norm 0.1357 (0.1488) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:47:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [491/800][200/402] eta 0:02:35 lr 0.000025 time 0.7464 (0.7703) loss 0.5938 (0.6242) grad_norm 0.1808 (0.1491) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:48:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [491/800][300/402] eta 0:01:17 lr 0.000025 time 0.7476 (0.7624) loss 0.6283 (0.6234) grad_norm 0.1506 (0.1485) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:49:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [491/800][400/402] eta 0:00:01 lr 0.000025 time 0.7437 (0.7583) loss 0.6205 (0.6234) grad_norm 0.1463 (0.1477) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:49:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO 
EPOCH 491 training takes 0:05:04 [2024-03-10 09:49:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [492/800][0/402] eta 0:25:38 lr 0.000025 time 3.8282 (3.8282) loss 0.5989 (0.5989) grad_norm 0.1523 (0.1523) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:51:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [492/800][100/402] eta 0:03:54 lr 0.000025 time 0.7454 (0.7771) loss 0.6044 (0.6229) grad_norm 0.1283 (0.1469) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:52:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [492/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7620) loss 0.6323 (0.6239) grad_norm 0.1437 (0.1462) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:53:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [492/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7567) loss 0.6197 (0.6249) grad_norm 0.1458 (0.1466) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:54:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [492/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7541) loss 0.6012 (0.6242) grad_norm 0.1556 (0.1469) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:54:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 492 training takes 0:05:03 [2024-03-10 09:54:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [493/800][0/402] eta 0:24:55 lr 0.000025 time 3.7195 (3.7195) loss 0.6206 (0.6206) grad_norm 0.1274 (0.1274) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:56:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [493/800][100/402] eta 0:03:54 lr 0.000025 time 0.7463 (0.7761) loss 0.6376 (0.6218) grad_norm 0.1594 (0.1421) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:57:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [493/800][200/402] eta 0:02:33 lr 0.000025 time 0.7472 
(0.7616) loss 0.5697 (0.6221) grad_norm 0.1612 (0.1460) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 09:58:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [493/800][300/402] eta 0:01:17 lr 0.000025 time 0.7486 (0.7566) loss 0.6249 (0.6223) grad_norm 0.1407 (inf) loss_scale 262144.0000 (443293.3422) mem 28968MB [2024-03-10 09:59:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [493/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7542) loss 0.6319 (0.6228) grad_norm 0.1325 (inf) loss_scale 262144.0000 (398118.9426) mem 28968MB [2024-03-10 09:59:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 493 training takes 0:05:03 [2024-03-10 10:00:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [494/800][0/402] eta 0:25:41 lr 0.000025 time 3.8334 (3.8334) loss 0.6330 (0.6330) grad_norm 0.1389 (0.1389) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:01:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [494/800][100/402] eta 0:03:54 lr 0.000025 time 0.7499 (0.7769) loss 0.5850 (0.6202) grad_norm 0.1352 (0.1469) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:02:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [494/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7619) loss 0.6404 (0.6247) grad_norm 0.1521 (0.1460) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:03:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [494/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7567) loss 0.6258 (0.6241) grad_norm 0.1339 (0.1461) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:04:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [494/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7540) loss 0.6265 (0.6246) grad_norm 0.1691 (0.1456) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:05:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): 
INFO EPOCH 494 training takes 0:05:03 [2024-03-10 10:05:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [495/800][0/402] eta 0:25:39 lr 0.000025 time 3.8293 (3.8293) loss 0.6129 (0.6129) grad_norm 0.1726 (0.1726) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:06:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [495/800][100/402] eta 0:03:54 lr 0.000025 time 0.7457 (0.7768) loss 0.5944 (0.6225) grad_norm 0.1682 (0.1516) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:07:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [495/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7615) loss 0.6134 (0.6251) grad_norm 0.1412 (0.1485) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:08:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [495/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7563) loss 0.6234 (0.6237) grad_norm 0.1539 (0.1469) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:10:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [495/800][400/402] eta 0:00:01 lr 0.000025 time 0.7455 (0.7536) loss 0.6437 (0.6232) grad_norm 0.1286 (0.1465) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:10:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 495 training takes 0:05:03 [2024-03-10 10:10:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [496/800][0/402] eta 0:38:24 lr 0.000025 time 5.7336 (5.7336) loss 0.6096 (0.6096) grad_norm 0.1288 (0.1288) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:11:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [496/800][100/402] eta 0:04:00 lr 0.000025 time 0.7457 (0.7956) loss 0.6392 (0.6246) grad_norm 0.1676 (0.1476) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:12:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [496/800][200/402] eta 0:02:35 lr 0.000025 time 0.7462 
(0.7710) loss 0.6339 (0.6245) grad_norm 0.1478 (0.1466) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:13:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [496/800][300/402] eta 0:01:17 lr 0.000025 time 0.7466 (0.7629) loss 0.6330 (0.6232) grad_norm 0.1743 (0.1478) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:15:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [496/800][400/402] eta 0:00:01 lr 0.000025 time 0.7462 (0.7587) loss 0.5621 (0.6230) grad_norm 0.1540 (0.1475) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:15:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 496 training takes 0:05:05 [2024-03-10 10:15:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [497/800][0/402] eta 0:26:31 lr 0.000025 time 3.9581 (3.9581) loss 0.6322 (0.6322) grad_norm 0.1249 (0.1249) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:16:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [497/800][100/402] eta 0:03:55 lr 0.000025 time 0.7455 (0.7783) loss 0.6089 (0.6217) grad_norm 0.1618 (0.1463) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:17:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [497/800][200/402] eta 0:02:34 lr 0.000025 time 0.7457 (0.7624) loss 0.6357 (0.6230) grad_norm 0.1361 (0.1467) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:18:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [497/800][300/402] eta 0:01:17 lr 0.000025 time 0.7460 (0.7570) loss 0.6300 (0.6235) grad_norm 0.1295 (0.1471) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:20:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [497/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7542) loss 0.6060 (0.6234) grad_norm 0.1790 (0.1471) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:20:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
185): INFO EPOCH 497 training takes 0:05:03 [2024-03-10 10:20:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [498/800][0/402] eta 0:23:26 lr 0.000025 time 3.4985 (3.4985) loss 0.6331 (0.6331) grad_norm 0.1372 (0.1372) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:21:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [498/800][100/402] eta 0:03:53 lr 0.000025 time 0.7460 (0.7737) loss 0.5910 (0.6242) grad_norm 0.1371 (0.1502) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 10:22:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [498/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7604) loss 0.5830 (0.6235) grad_norm 0.1574 (0.1492) loss_scale 524288.0000 (266056.5970) mem 28968MB [2024-03-10 10:23:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [498/800][300/402] eta 0:01:17 lr 0.000025 time 0.7475 (0.7558) loss 0.6284 (0.6235) grad_norm 0.1514 (0.1485) loss_scale 524288.0000 (351847.7608) mem 28968MB [2024-03-10 10:25:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [498/800][400/402] eta 0:00:01 lr 0.000025 time 0.7458 (0.7537) loss 0.6320 (0.6238) grad_norm 0.1250 (0.1482) loss_scale 524288.0000 (394850.3142) mem 28968MB [2024-03-10 10:25:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 498 training takes 0:05:03 [2024-03-10 10:25:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [499/800][0/402] eta 0:25:00 lr 0.000025 time 3.7333 (3.7333) loss 0.6358 (0.6358) grad_norm 0.1239 (0.1239) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:26:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [499/800][100/402] eta 0:03:54 lr 0.000025 time 0.7457 (0.7761) loss 0.6346 (0.6250) grad_norm 0.1376 (0.1409) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:27:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [499/800][200/402] eta 0:02:33 lr 0.000025 time 
0.7475 (0.7613) loss 0.6039 (0.6210) grad_norm 0.1331 (0.1451) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:29:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [499/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7564) loss 0.6404 (0.6219) grad_norm 0.1769 (0.1451) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:30:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [499/800][400/402] eta 0:00:01 lr 0.000025 time 0.7458 (0.7539) loss 0.6236 (0.6221) grad_norm 0.1823 (0.1451) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:30:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 499 training takes 0:05:03 [2024-03-10 10:30:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [500/800][0/402] eta 0:25:00 lr 0.000025 time 3.7315 (3.7315) loss 0.5923 (0.5923) grad_norm 0.1422 (0.1422) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:31:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [500/800][100/402] eta 0:03:54 lr 0.000025 time 0.7465 (0.7763) loss 0.5645 (0.6210) grad_norm 0.1556 (0.1472) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:32:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [500/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7616) loss 0.6401 (0.6224) grad_norm 0.1516 (0.1475) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:34:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [500/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7568) loss 0.6297 (0.6228) grad_norm 0.1249 (0.1464) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:35:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [500/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7542) loss 0.6171 (0.6227) grad_norm 0.1638 (0.1463) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:35:21 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 185): INFO EPOCH 500 training takes 0:05:03 [2024-03-10 10:35:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [501/800][0/402] eta 0:37:40 lr 0.000025 time 5.6223 (5.6223) loss 0.6295 (0.6295) grad_norm 0.1291 (0.1291) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:36:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [501/800][100/402] eta 0:04:00 lr 0.000025 time 0.7456 (0.7947) loss 0.6124 (0.6247) grad_norm 0.1623 (0.1555) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:37:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [501/800][200/402] eta 0:02:35 lr 0.000025 time 0.7461 (0.7709) loss 0.6365 (0.6245) grad_norm 0.1180 (0.1502) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:39:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [501/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7625) loss 0.5939 (0.6237) grad_norm 0.1567 (0.1495) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:40:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [501/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7583) loss 0.5891 (0.6233) grad_norm 0.1797 (0.1490) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:40:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 501 training takes 0:05:04 [2024-03-10 10:40:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [502/800][0/402] eta 0:25:14 lr 0.000025 time 3.7672 (3.7672) loss 0.6769 (0.6769) grad_norm 0.1193 (0.1193) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:41:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [502/800][100/402] eta 0:03:54 lr 0.000025 time 0.7458 (0.7763) loss 0.6487 (0.6214) grad_norm 0.1332 (0.1502) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:42:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [502/800][200/402] eta 
0:02:33 lr 0.000025 time 0.7462 (0.7613) loss 0.6455 (0.6214) grad_norm 0.1416 (0.1481) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:44:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [502/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7565) loss 0.6261 (0.6211) grad_norm 0.1103 (0.1470) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:45:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [502/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7539) loss 0.6013 (0.6222) grad_norm 0.1270 (0.1457) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:45:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 502 training takes 0:05:03 [2024-03-10 10:45:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [503/800][0/402] eta 0:25:37 lr 0.000025 time 3.8235 (3.8235) loss 0.6238 (0.6238) grad_norm 0.1293 (0.1293) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:46:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [503/800][100/402] eta 0:03:54 lr 0.000025 time 0.7461 (0.7768) loss 0.6418 (0.6212) grad_norm 0.1413 (0.1441) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:48:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [503/800][200/402] eta 0:02:33 lr 0.000025 time 0.7468 (0.7617) loss 0.6547 (0.6242) grad_norm 0.1304 (inf) loss_scale 524288.0000 (542546.7861) mem 28968MB [2024-03-10 10:49:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [503/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7567) loss 0.6122 (0.6241) grad_norm 0.1829 (inf) loss_scale 524288.0000 (536480.7442) mem 28968MB [2024-03-10 10:50:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [503/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7541) loss 0.6284 (0.6230) grad_norm 0.1259 (inf) loss_scale 524288.0000 (533440.1596) mem 28968MB [2024-03-10 10:50:32 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 185): INFO EPOCH 503 training takes 0:05:03 [2024-03-10 10:50:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [504/800][0/402] eta 0:25:13 lr 0.000025 time 3.7650 (3.7650) loss 0.6147 (0.6147) grad_norm 0.1506 (0.1506) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:51:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [504/800][100/402] eta 0:03:54 lr 0.000025 time 0.7472 (0.7764) loss 0.6113 (0.6246) grad_norm 0.1137 (0.1452) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:53:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [504/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7614) loss 0.6071 (0.6226) grad_norm 0.1485 (0.1465) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:54:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [504/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7565) loss 0.6126 (0.6222) grad_norm 0.1566 (0.1478) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:55:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [504/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7539) loss 0.6166 (0.6230) grad_norm 0.1819 (0.1480) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:55:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 504 training takes 0:05:03 [2024-03-10 10:55:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [505/800][0/402] eta 0:25:44 lr 0.000025 time 3.8417 (3.8417) loss 0.6186 (0.6186) grad_norm 0.1568 (0.1568) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:56:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [505/800][100/402] eta 0:03:54 lr 0.000025 time 0.7469 (0.7778) loss 0.6144 (0.6253) grad_norm 0.1405 (0.1463) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:58:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [505/800][200/402] eta 
0:02:33 lr 0.000025 time 0.7459 (0.7623) loss 0.6037 (0.6248) grad_norm 0.1651 (0.1460) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 10:59:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [505/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7569) loss 0.6355 (0.6238) grad_norm 0.1367 (0.1467) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:00:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [505/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7544) loss 0.6198 (0.6241) grad_norm 0.1284 (0.1462) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:00:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 505 training takes 0:05:03 [2024-03-10 11:00:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [506/800][0/402] eta 0:37:30 lr 0.000025 time 5.5981 (5.5981) loss 0.6268 (0.6268) grad_norm 0.1329 (0.1329) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:01:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [506/800][100/402] eta 0:03:59 lr 0.000025 time 0.7459 (0.7945) loss 0.6102 (0.6254) grad_norm 0.1503 (0.1452) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:03:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [506/800][200/402] eta 0:02:35 lr 0.000025 time 0.7460 (0.7706) loss 0.6558 (0.6239) grad_norm 0.1606 (0.1471) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:04:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [506/800][300/402] eta 0:01:17 lr 0.000025 time 0.7464 (0.7625) loss 0.6251 (0.6246) grad_norm 0.1977 (0.1476) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:05:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [506/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7584) loss 0.6164 (0.6239) grad_norm 0.1596 (0.1473) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:05:44 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 506 training takes 0:05:04 [2024-03-10 11:05:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [507/800][0/402] eta 0:25:05 lr 0.000025 time 3.7452 (3.7452) loss 0.6280 (0.6280) grad_norm 0.1550 (0.1550) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:07:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [507/800][100/402] eta 0:03:54 lr 0.000025 time 0.7484 (0.7762) loss 0.6620 (0.6232) grad_norm 0.1444 (0.1485) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:08:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [507/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7613) loss 0.6114 (0.6237) grad_norm 0.1623 (0.1462) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:09:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [507/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7563) loss 0.6141 (0.6226) grad_norm 0.1473 (0.1468) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:10:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [507/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7538) loss 0.5868 (0.6229) grad_norm 0.1339 (0.1454) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:10:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 507 training takes 0:05:03 [2024-03-10 11:10:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [508/800][0/402] eta 0:24:59 lr 0.000025 time 3.7302 (3.7302) loss 0.5998 (0.5998) grad_norm 0.1537 (0.1537) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:12:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [508/800][100/402] eta 0:03:54 lr 0.000025 time 0.7468 (0.7762) loss 0.6313 (0.6252) grad_norm 0.1457 (inf) loss_scale 262144.0000 (371154.3762) mem 28968MB [2024-03-10 11:13:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[508/800][200/402] eta 0:02:33 lr 0.000025 time 0.7472 (0.7614) loss 0.6134 (0.6255) grad_norm 0.2024 (inf) loss_scale 262144.0000 (316920.3582) mem 28968MB [2024-03-10 11:14:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [508/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7565) loss 0.6103 (0.6246) grad_norm 0.1403 (inf) loss_scale 262144.0000 (298722.2326) mem 28968MB [2024-03-10 11:15:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [508/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7540) loss 0.6133 (0.6242) grad_norm 0.1444 (inf) loss_scale 262144.0000 (289600.4788) mem 28968MB [2024-03-10 11:15:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 508 training takes 0:05:03 [2024-03-10 11:15:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [509/800][0/402] eta 0:25:31 lr 0.000025 time 3.8087 (3.8087) loss 0.6264 (0.6264) grad_norm 0.1255 (0.1255) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:17:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [509/800][100/402] eta 0:03:54 lr 0.000025 time 0.7461 (0.7765) loss 0.6265 (0.6227) grad_norm 0.1622 (0.1449) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:18:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [509/800][200/402] eta 0:02:33 lr 0.000025 time 0.7474 (0.7617) loss 0.5678 (0.6213) grad_norm 0.1458 (0.1470) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:19:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [509/800][300/402] eta 0:01:17 lr 0.000025 time 0.7455 (0.7567) loss 0.6161 (0.6211) grad_norm 0.1412 (0.1476) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:20:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [509/800][400/402] eta 0:00:01 lr 0.000025 time 0.7456 (0.7541) loss 0.6131 (0.6217) grad_norm 0.1302 (0.1471) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:20:53 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 509 training takes 0:05:03 [2024-03-10 11:20:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [510/800][0/402] eta 0:26:25 lr 0.000025 time 3.9449 (3.9449) loss 0.6301 (0.6301) grad_norm 0.1459 (0.1459) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:22:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [510/800][100/402] eta 0:03:55 lr 0.000025 time 0.7466 (0.7782) loss 0.6305 (0.6272) grad_norm 0.1453 (0.1475) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:23:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [510/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7624) loss 0.6131 (0.6258) grad_norm 0.1687 (0.1483) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:24:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [510/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7572) loss 0.6243 (0.6242) grad_norm 0.1436 (0.1497) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:25:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [510/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7545) loss 0.6216 (0.6243) grad_norm 0.1306 (0.1483) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:25:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 510 training takes 0:05:03 [2024-03-10 11:26:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [511/800][0/402] eta 0:36:44 lr 0.000025 time 5.4830 (5.4830) loss 0.6236 (0.6236) grad_norm 0.1382 (0.1382) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:27:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [511/800][100/402] eta 0:03:59 lr 0.000025 time 0.7455 (0.7935) loss 0.6550 (0.6241) grad_norm 0.1376 (0.1433) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:28:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[511/800][200/402] eta 0:02:35 lr 0.000025 time 0.7473 (0.7702) loss 0.6593 (0.6247) grad_norm 0.1444 (0.1460) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:29:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [511/800][300/402] eta 0:01:17 lr 0.000025 time 0.7465 (0.7624) loss 0.6070 (0.6239) grad_norm 0.1495 (0.1477) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:31:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [511/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7584) loss 0.5896 (0.6239) grad_norm 0.1612 (0.1482) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:31:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 511 training takes 0:05:05 [2024-03-10 11:31:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [512/800][0/402] eta 0:25:48 lr 0.000025 time 3.8529 (3.8529) loss 0.6669 (0.6669) grad_norm 0.1530 (0.1530) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:32:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [512/800][100/402] eta 0:03:54 lr 0.000025 time 0.7457 (0.7774) loss 0.5983 (0.6268) grad_norm 0.1179 (0.1434) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:33:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [512/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7619) loss 0.6275 (0.6250) grad_norm 0.1272 (0.1437) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:34:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [512/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7566) loss 0.6418 (0.6244) grad_norm 0.1408 (0.1460) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:36:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [512/800][400/402] eta 0:00:01 lr 0.000025 time 0.7452 (0.7541) loss 0.6610 (0.6242) grad_norm 0.1235 (0.1462) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 
11:36:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 512 training takes 0:05:03 [2024-03-10 11:36:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [513/800][0/402] eta 0:24:35 lr 0.000025 time 3.6700 (3.6700) loss 0.6191 (0.6191) grad_norm 0.1439 (0.1439) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 11:37:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [513/800][100/402] eta 0:03:54 lr 0.000025 time 0.7457 (0.7752) loss 0.6324 (0.6229) grad_norm 0.1475 (0.1521) loss_scale 524288.0000 (441232.4752) mem 28968MB [2024-03-10 11:38:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [513/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7607) loss 0.6501 (0.6237) grad_norm 0.1324 (0.1497) loss_scale 524288.0000 (482553.6318) mem 28968MB [2024-03-10 11:39:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [513/800][300/402] eta 0:01:17 lr 0.000025 time 0.7449 (0.7558) loss 0.6248 (0.6239) grad_norm 0.1339 (0.1487) loss_scale 524288.0000 (496418.8704) mem 28968MB [2024-03-10 11:41:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [513/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7532) loss 0.6072 (0.6241) grad_norm 0.1622 (0.1484) loss_scale 524288.0000 (503368.7781) mem 28968MB [2024-03-10 11:41:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 513 training takes 0:05:02 [2024-03-10 11:41:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [514/800][0/402] eta 0:24:35 lr 0.000025 time 3.6697 (3.6697) loss 0.6278 (0.6278) grad_norm 0.1511 (0.1511) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:42:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [514/800][100/402] eta 0:03:54 lr 0.000025 time 0.7456 (0.7752) loss 0.6197 (0.6248) grad_norm 0.1513 (0.1446) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:43:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [514/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7607) loss 0.6330 (0.6233) grad_norm 0.1241 (0.1454) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:44:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [514/800][300/402] eta 0:01:17 lr 0.000025 time 0.7485 (0.7559) loss 0.6394 (0.6225) grad_norm 0.1467 (0.1471) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:46:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [514/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7534) loss 0.6100 (0.6227) grad_norm 0.1628 (0.1467) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:46:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 514 training takes 0:05:02 [2024-03-10 11:46:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [515/800][0/402] eta 0:25:21 lr 0.000025 time 3.7836 (3.7836) loss 0.6339 (0.6339) grad_norm 0.1485 (0.1485) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:47:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [515/800][100/402] eta 0:03:54 lr 0.000025 time 0.7457 (0.7772) loss 0.6283 (0.6221) grad_norm 0.1314 (0.1466) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:48:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [515/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7619) loss 0.6358 (0.6214) grad_norm 0.1376 (0.1475) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:49:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [515/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7567) loss 0.6262 (0.6218) grad_norm 0.1513 (0.1479) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:51:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [515/800][400/402] eta 0:00:01 lr 0.000025 time 0.7452 (0.7541) loss 0.6238 (0.6226) grad_norm 0.1374 (0.1479) loss_scale 524288.0000 (524288.0000) mem 28968MB 
[2024-03-10 11:51:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 515 training takes 0:05:03 [2024-03-10 11:51:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [516/800][0/402] eta 0:37:23 lr 0.000025 time 5.5807 (5.5807) loss 0.6275 (0.6275) grad_norm 0.1582 (0.1582) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:52:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [516/800][100/402] eta 0:04:00 lr 0.000025 time 0.7461 (0.7951) loss 0.6350 (0.6252) grad_norm 0.1466 (0.1458) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:53:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [516/800][200/402] eta 0:02:35 lr 0.000025 time 0.7462 (0.7709) loss 0.6137 (0.6241) grad_norm 0.1485 (0.1467) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:55:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [516/800][300/402] eta 0:01:17 lr 0.000025 time 0.7473 (0.7628) loss 0.6254 (0.6242) grad_norm 0.1310 (0.1468) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:56:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [516/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7587) loss 0.6462 (0.6244) grad_norm 0.1286 (0.1467) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:56:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 516 training takes 0:05:05 [2024-03-10 11:56:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [517/800][0/402] eta 0:26:28 lr 0.000025 time 3.9505 (3.9505) loss 0.5660 (0.5660) grad_norm 0.1695 (0.1695) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:57:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [517/800][100/402] eta 0:03:55 lr 0.000025 time 0.7465 (0.7783) loss 0.6337 (0.6196) grad_norm 0.1268 (0.1476) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 11:58:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [517/800][200/402] eta 0:02:34 lr 0.000025 time 0.7463 (0.7627) loss 0.6559 (0.6220) grad_norm 0.1272 (0.1464) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:00:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [517/800][300/402] eta 0:01:17 lr 0.000025 time 0.7470 (0.7572) loss 0.6263 (0.6219) grad_norm 0.1291 (0.1461) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:01:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [517/800][400/402] eta 0:00:01 lr 0.000025 time 0.7452 (0.7544) loss 0.5720 (0.6229) grad_norm 0.1653 (0.1458) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:01:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 517 training takes 0:05:03 [2024-03-10 12:01:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [518/800][0/402] eta 0:28:32 lr 0.000025 time 4.2607 (4.2607) loss 0.6192 (0.6192) grad_norm 0.1343 (0.1343) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:02:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [518/800][100/402] eta 0:03:55 lr 0.000025 time 0.7452 (0.7806) loss 0.6305 (0.6253) grad_norm 0.1432 (inf) loss_scale 524288.0000 (545051.8812) mem 28968MB [2024-03-10 12:03:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [518/800][200/402] eta 0:02:34 lr 0.000025 time 0.7457 (0.7632) loss 0.6250 (0.6232) grad_norm 0.1520 (inf) loss_scale 524288.0000 (534721.5920) mem 28968MB [2024-03-10 12:05:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [518/800][300/402] eta 0:01:17 lr 0.000025 time 0.7453 (0.7576) loss 0.6284 (0.6238) grad_norm 0.1563 (inf) loss_scale 524288.0000 (531255.2824) mem 28968MB [2024-03-10 12:06:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [518/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7546) loss 0.6268 (0.6236) grad_norm 0.1182 (inf) loss_scale 524288.0000 (529517.8055) mem 28968MB 
[2024-03-10 12:06:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 518 training takes 0:05:03 [2024-03-10 12:06:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [519/800][0/402] eta 0:25:22 lr 0.000025 time 3.7864 (3.7864) loss 0.5763 (0.5763) grad_norm 0.1398 (0.1398) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:07:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [519/800][100/402] eta 0:03:54 lr 0.000025 time 0.7458 (0.7762) loss 0.6523 (0.6233) grad_norm 0.1396 (0.1491) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:08:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [519/800][200/402] eta 0:02:33 lr 0.000025 time 0.7471 (0.7612) loss 0.6058 (0.6231) grad_norm 0.1754 (0.1496) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:10:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [519/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7560) loss 0.6123 (0.6223) grad_norm 0.1597 (0.1487) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:11:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [519/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7536) loss 0.6203 (0.6227) grad_norm 0.1436 (0.1486) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:11:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 519 training takes 0:05:03 [2024-03-10 12:11:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [520/800][0/402] eta 0:25:13 lr 0.000025 time 3.7637 (3.7637) loss 0.6021 (0.6021) grad_norm 0.2065 (0.2065) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:12:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [520/800][100/402] eta 0:03:54 lr 0.000025 time 0.7468 (0.7761) loss 0.6463 (0.6231) grad_norm 0.1466 (0.1508) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:14:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [520/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7611) loss 0.6539 (0.6221) grad_norm 0.1728 (0.1496) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:15:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [520/800][300/402] eta 0:01:17 lr 0.000025 time 0.7498 (0.7561) loss 0.5900 (0.6221) grad_norm 0.1661 (0.1496) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:16:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [520/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7536) loss 0.6345 (0.6232) grad_norm 0.1445 (0.1495) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:16:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 520 training takes 0:05:03 [2024-03-10 12:16:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [521/800][0/402] eta 0:33:59 lr 0.000025 time 5.0741 (5.0741) loss 0.6686 (0.6686) grad_norm 0.1223 (0.1223) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:17:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [521/800][100/402] eta 0:03:58 lr 0.000025 time 0.7456 (0.7888) loss 0.6391 (0.6258) grad_norm 0.1556 (0.1478) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:19:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [521/800][200/402] eta 0:02:35 lr 0.000025 time 0.7459 (0.7676) loss 0.6386 (0.6238) grad_norm 0.1539 (0.1482) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:20:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [521/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7604) loss 0.6187 (0.6231) grad_norm 0.1297 (0.1474) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:21:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [521/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7567) loss 0.6266 (0.6237) grad_norm 0.1555 (0.1473) loss_scale 524288.0000 (524288.0000) mem 
28968MB [2024-03-10 12:21:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 521 training takes 0:05:04 [2024-03-10 12:21:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [522/800][0/402] eta 0:25:02 lr 0.000025 time 3.7369 (3.7369) loss 0.6353 (0.6353) grad_norm 0.1582 (0.1582) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:22:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [522/800][100/402] eta 0:03:54 lr 0.000025 time 0.7464 (0.7766) loss 0.6373 (0.6220) grad_norm 0.1447 (0.1476) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:24:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [522/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7615) loss 0.6220 (0.6233) grad_norm 0.1696 (0.1473) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:25:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [522/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7563) loss 0.6254 (0.6233) grad_norm 0.1416 (0.1473) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:26:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [522/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7538) loss 0.6170 (0.6231) grad_norm 0.1513 (0.1471) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:26:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 522 training takes 0:05:03 [2024-03-10 12:26:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [523/800][0/402] eta 0:25:30 lr 0.000025 time 3.8081 (3.8081) loss 0.6168 (0.6168) grad_norm 0.1469 (0.1469) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:27:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [523/800][100/402] eta 0:03:54 lr 0.000025 time 0.7461 (0.7766) loss 0.6288 (0.6198) grad_norm 0.1568 (inf) loss_scale 524288.0000 (814982.3366) mem 28968MB [2024-03-10 12:29:13 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [523/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7615) loss 0.6387 (0.6231) grad_norm 0.1568 (inf) loss_scale 524288.0000 (670358.2886) mem 28968MB [2024-03-10 12:30:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [523/800][300/402] eta 0:01:17 lr 0.000025 time 0.7454 (0.7564) loss 0.6391 (0.6229) grad_norm 0.1423 (inf) loss_scale 524288.0000 (621829.9535) mem 28968MB [2024-03-10 12:31:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [523/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7537) loss 0.6507 (0.6229) grad_norm 0.1445 (inf) loss_scale 524288.0000 (597505.2768) mem 28968MB [2024-03-10 12:31:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 523 training takes 0:05:03 [2024-03-10 12:31:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [524/800][0/402] eta 0:24:43 lr 0.000025 time 3.6898 (3.6898) loss 0.6179 (0.6179) grad_norm 0.1642 (0.1642) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:33:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [524/800][100/402] eta 0:03:54 lr 0.000025 time 0.7456 (0.7755) loss 0.6348 (0.6232) grad_norm 0.1216 (0.1444) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:34:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [524/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7611) loss 0.6170 (0.6227) grad_norm 0.1380 (0.1464) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:35:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [524/800][300/402] eta 0:01:17 lr 0.000025 time 0.7481 (0.7561) loss 0.6347 (0.6224) grad_norm 0.1273 (0.1469) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:36:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [524/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7536) loss 0.6463 (0.6234) grad_norm 0.1516 (0.1472) loss_scale 524288.0000 (524288.0000) 
mem 28968MB [2024-03-10 12:36:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 524 training takes 0:05:03 [2024-03-10 12:36:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [525/800][0/402] eta 0:25:14 lr 0.000025 time 3.7679 (3.7679) loss 0.6322 (0.6322) grad_norm 0.1316 (0.1316) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:38:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [525/800][100/402] eta 0:03:54 lr 0.000025 time 0.7456 (0.7763) loss 0.5943 (0.6243) grad_norm 0.1580 (0.1454) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:39:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [525/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7613) loss 0.6057 (0.6237) grad_norm 0.1482 (0.1479) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:40:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [525/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7562) loss 0.6009 (0.6234) grad_norm 0.1585 (0.1478) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:41:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [525/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7536) loss 0.6263 (0.6233) grad_norm 0.1325 (0.1472) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:41:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 525 training takes 0:05:03 [2024-03-10 12:41:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [526/800][0/402] eta 0:34:24 lr 0.000025 time 5.1348 (5.1348) loss 0.6128 (0.6128) grad_norm 0.1355 (0.1355) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:43:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [526/800][100/402] eta 0:03:58 lr 0.000025 time 0.7456 (0.7896) loss 0.6341 (0.6234) grad_norm 0.1331 (0.1514) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:44:23 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [526/800][200/402] eta 0:02:35 lr 0.000025 time 0.7494 (0.7680) loss 0.6306 (0.6236) grad_norm 0.1437 (0.1499) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:45:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [526/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7609) loss 0.5979 (0.6230) grad_norm 0.1722 (0.1513) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:46:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [526/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7571) loss 0.6027 (0.6228) grad_norm 0.1137 (0.1497) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:46:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 526 training takes 0:05:04 [2024-03-10 12:46:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [527/800][0/402] eta 0:24:46 lr 0.000025 time 3.6980 (3.6980) loss 0.6292 (0.6292) grad_norm 0.1227 (0.1227) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:48:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [527/800][100/402] eta 0:03:54 lr 0.000025 time 0.7455 (0.7761) loss 0.6247 (0.6250) grad_norm 0.1605 (0.1436) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:49:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [527/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7611) loss 0.6224 (0.6239) grad_norm 0.1400 (0.1444) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:50:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [527/800][300/402] eta 0:01:17 lr 0.000025 time 0.7455 (0.7561) loss 0.6301 (0.6237) grad_norm 0.1591 (0.1458) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:51:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [527/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7535) loss 0.6258 (0.6241) grad_norm 0.1730 (0.1463) loss_scale 524288.0000 
(524288.0000) mem 28968MB [2024-03-10 12:51:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 527 training takes 0:05:03 [2024-03-10 12:52:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [528/800][0/402] eta 0:24:28 lr 0.000025 time 3.6541 (3.6541) loss 0.6357 (0.6357) grad_norm 0.1174 (0.1174) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:53:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [528/800][100/402] eta 0:03:54 lr 0.000025 time 0.7452 (0.7750) loss 0.6159 (0.6254) grad_norm 0.1256 (inf) loss_scale 524288.0000 (617725.4653) mem 28968MB [2024-03-10 12:54:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [528/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7606) loss 0.6140 (0.6233) grad_norm 0.1430 (inf) loss_scale 524288.0000 (571239.1642) mem 28968MB [2024-03-10 12:55:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [528/800][300/402] eta 0:01:17 lr 0.000025 time 0.7455 (0.7557) loss 0.6201 (0.6229) grad_norm 0.1322 (inf) loss_scale 524288.0000 (555640.7708) mem 28968MB [2024-03-10 12:56:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [528/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7532) loss 0.6325 (0.6230) grad_norm 0.1416 (inf) loss_scale 524288.0000 (547822.1247) mem 28968MB [2024-03-10 12:56:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 528 training takes 0:05:02 [2024-03-10 12:57:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [529/800][0/402] eta 0:24:29 lr 0.000025 time 3.6553 (3.6553) loss 0.6095 (0.6095) grad_norm 0.1659 (0.1659) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:58:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [529/800][100/402] eta 0:03:54 lr 0.000025 time 0.7470 (0.7760) loss 0.6367 (0.6245) grad_norm 0.1352 (0.1494) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 12:59:32 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [529/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7614) loss 0.6143 (0.6253) grad_norm 0.1718 (0.1496) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:00:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [529/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7563) loss 0.6370 (0.6233) grad_norm 0.1406 (0.1491) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:02:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [529/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7538) loss 0.6034 (0.6238) grad_norm 0.1417 (0.1487) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:02:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 529 training takes 0:05:03 [2024-03-10 13:02:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [530/800][0/402] eta 0:24:32 lr 0.000025 time 3.6632 (3.6632) loss 0.6275 (0.6275) grad_norm 0.1345 (0.1345) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:03:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [530/800][100/402] eta 0:03:54 lr 0.000025 time 0.7461 (0.7751) loss 0.6370 (0.6211) grad_norm 0.1710 (0.1477) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:04:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [530/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7606) loss 0.6107 (0.6226) grad_norm 0.1539 (0.1466) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:05:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [530/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7556) loss 0.5995 (0.6226) grad_norm 0.1587 (0.1470) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:07:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [530/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7531) loss 0.6253 (0.6225) grad_norm 0.1423 (0.1470) loss_scale 524288.0000 
(524288.0000) mem 28968MB [2024-03-10 13:07:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 530 training takes 0:05:02 [2024-03-10 13:07:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [531/800][0/402] eta 0:33:47 lr 0.000025 time 5.0447 (5.0447) loss 0.6497 (0.6497) grad_norm 0.1571 (0.1571) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:08:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [531/800][100/402] eta 0:03:58 lr 0.000025 time 0.7462 (0.7888) loss 0.6412 (0.6251) grad_norm 0.1401 (0.1477) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:09:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [531/800][200/402] eta 0:02:35 lr 0.000025 time 0.7460 (0.7684) loss 0.6255 (0.6236) grad_norm 0.1531 (0.1498) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:10:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [531/800][300/402] eta 0:01:17 lr 0.000025 time 0.7460 (0.7610) loss 0.6202 (0.6230) grad_norm 0.1644 (0.1478) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:12:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [531/800][400/402] eta 0:00:01 lr 0.000025 time 0.7464 (0.7572) loss 0.5882 (0.6232) grad_norm 0.1503 (0.1487) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:12:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 531 training takes 0:05:04 [2024-03-10 13:12:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [532/800][0/402] eta 0:25:33 lr 0.000025 time 3.8137 (3.8137) loss 0.5849 (0.5849) grad_norm 0.1616 (0.1616) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:13:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [532/800][100/402] eta 0:03:54 lr 0.000025 time 0.7454 (0.7767) loss 0.6494 (0.6198) grad_norm 0.1687 (0.1492) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:14:43 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [532/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7615) loss 0.6373 (0.6217) grad_norm 0.1385 (0.1500) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:15:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [532/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7564) loss 0.6149 (0.6223) grad_norm 0.1658 (0.1498) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:17:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [532/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7538) loss 0.6061 (0.6223) grad_norm 0.1607 (0.1492) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:17:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 532 training takes 0:05:03 [2024-03-10 13:17:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [533/800][0/402] eta 0:24:46 lr 0.000025 time 3.6974 (3.6974) loss 0.6106 (0.6106) grad_norm 0.1316 (0.1316) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:18:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [533/800][100/402] eta 0:03:54 lr 0.000025 time 0.7465 (0.7754) loss 0.6081 (0.6192) grad_norm 0.1746 (inf) loss_scale 524288.0000 (628107.4059) mem 28968MB [2024-03-10 13:19:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [533/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7611) loss 0.6000 (0.6206) grad_norm 0.1596 (inf) loss_scale 524288.0000 (576455.9602) mem 28968MB [2024-03-10 13:21:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [533/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7563) loss 0.6364 (0.6211) grad_norm 0.1492 (inf) loss_scale 524288.0000 (559124.4120) mem 28968MB [2024-03-10 13:22:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [533/800][400/402] eta 0:00:01 lr 0.000025 time 0.7459 (0.7540) loss 0.6291 (0.6215) grad_norm 0.1129 (inf) loss_scale 
524288.0000 (550437.0274) mem 28968MB [2024-03-10 13:22:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 533 training takes 0:05:03 [2024-03-10 13:22:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [534/800][0/402] eta 0:25:42 lr 0.000025 time 3.8382 (3.8382) loss 0.6228 (0.6228) grad_norm 0.1442 (0.1442) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:23:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [534/800][100/402] eta 0:03:54 lr 0.000025 time 0.7460 (0.7771) loss 0.6053 (0.6210) grad_norm 0.1550 (0.1503) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:24:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [534/800][200/402] eta 0:02:33 lr 0.000025 time 0.7486 (0.7620) loss 0.5932 (0.6214) grad_norm 0.1409 (0.1494) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:26:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [534/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7568) loss 0.6351 (0.6222) grad_norm 0.1608 (0.1486) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:27:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [534/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7542) loss 0.5979 (0.6226) grad_norm 0.1390 (0.1488) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:27:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 534 training takes 0:05:03 [2024-03-10 13:27:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [535/800][0/402] eta 0:25:39 lr 0.000025 time 3.8288 (3.8288) loss 0.6012 (0.6012) grad_norm 0.1564 (0.1564) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:28:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [535/800][100/402] eta 0:03:54 lr 0.000025 time 0.7450 (0.7766) loss 0.6481 (0.6234) grad_norm 0.1383 (0.1478) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:29:52 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [535/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7611) loss 0.6041 (0.6228) grad_norm 0.1440 (0.1465) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:31:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [535/800][300/402] eta 0:01:17 lr 0.000025 time 0.7451 (0.7560) loss 0.6488 (0.6229) grad_norm 0.1457 (0.1475) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:32:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [535/800][400/402] eta 0:00:01 lr 0.000025 time 0.7466 (0.7534) loss 0.5866 (0.6227) grad_norm 0.1681 (0.1484) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:32:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 535 training takes 0:05:02 [2024-03-10 13:32:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [536/800][0/402] eta 0:37:44 lr 0.000025 time 5.6343 (5.6343) loss 0.6121 (0.6121) grad_norm 0.1302 (0.1302) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:33:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [536/800][100/402] eta 0:04:00 lr 0.000025 time 0.7462 (0.7948) loss 0.6438 (0.6253) grad_norm 0.1363 (0.1440) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:34:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [536/800][200/402] eta 0:02:35 lr 0.000025 time 0.7458 (0.7704) loss 0.6416 (0.6245) grad_norm 0.1498 (0.1486) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:36:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [536/800][300/402] eta 0:01:17 lr 0.000025 time 0.7455 (0.7623) loss 0.6078 (0.6229) grad_norm 0.1508 (0.1497) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:37:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [536/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7581) loss 0.6249 (0.6224) grad_norm 0.1408 (0.1487) 
loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:37:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 536 training takes 0:05:04 [2024-03-10 13:37:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [537/800][0/402] eta 0:24:57 lr 0.000025 time 3.7252 (3.7252) loss 0.6302 (0.6302) grad_norm 0.1596 (0.1596) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:38:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [537/800][100/402] eta 0:03:54 lr 0.000025 time 0.7464 (0.7757) loss 0.6249 (0.6226) grad_norm 0.1534 (0.1465) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:40:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [537/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7609) loss 0.6161 (0.6226) grad_norm 0.1477 (0.1459) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:41:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [537/800][300/402] eta 0:01:17 lr 0.000025 time 0.7460 (0.7558) loss 0.6159 (0.6230) grad_norm 0.1494 (0.1481) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:42:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [537/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7532) loss 0.5813 (0.6220) grad_norm 0.1461 (0.1486) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:42:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 537 training takes 0:05:02 [2024-03-10 13:42:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [538/800][0/402] eta 0:24:54 lr 0.000025 time 3.7177 (3.7177) loss 0.6110 (0.6110) grad_norm 0.1497 (0.1497) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:43:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [538/800][100/402] eta 0:03:54 lr 0.000025 time 0.7462 (0.7756) loss 0.5968 (0.6239) grad_norm 0.1604 (0.1453) loss_scale 1048576.0000 (633298.3762) mem 28968MB [2024-03-10 
13:45:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [538/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7614) loss 0.6058 (0.6236) grad_norm 0.1448 (inf) loss_scale 524288.0000 (599931.5423) mem 28968MB [2024-03-10 13:46:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [538/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7562) loss 0.6313 (0.6227) grad_norm 0.1440 (inf) loss_scale 524288.0000 (574800.7973) mem 28968MB [2024-03-10 13:47:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [538/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7536) loss 0.6402 (0.6231) grad_norm 0.1356 (inf) loss_scale 524288.0000 (562204.0898) mem 28968MB [2024-03-10 13:47:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 538 training takes 0:05:03 [2024-03-10 13:47:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [539/800][0/402] eta 0:25:11 lr 0.000025 time 3.7591 (3.7591) loss 0.6641 (0.6641) grad_norm 0.1533 (0.1533) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:48:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [539/800][100/402] eta 0:03:54 lr 0.000025 time 0.7461 (0.7760) loss 0.6157 (0.6174) grad_norm 0.1430 (0.1533) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:50:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [539/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7611) loss 0.6475 (0.6220) grad_norm 0.1585 (0.1505) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:51:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [539/800][300/402] eta 0:01:17 lr 0.000025 time 0.7454 (0.7561) loss 0.6348 (0.6224) grad_norm 0.1724 (0.1487) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:52:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [539/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7535) loss 0.6462 (0.6229) grad_norm 0.1074 (0.1471) 
loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:52:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 539 training takes 0:05:03 [2024-03-10 13:52:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [540/800][0/402] eta 0:25:26 lr 0.000025 time 3.7979 (3.7979) loss 0.5945 (0.5945) grad_norm 0.1561 (0.1561) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:53:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [540/800][100/402] eta 0:03:54 lr 0.000025 time 0.7478 (0.7764) loss 0.5944 (0.6194) grad_norm 0.1393 (0.1463) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:55:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [540/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7614) loss 0.6274 (0.6222) grad_norm 0.1327 (0.1478) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:56:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [540/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7568) loss 0.6459 (0.6225) grad_norm 0.1458 (0.1474) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:57:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [540/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7541) loss 0.5941 (0.6224) grad_norm 0.1485 (0.1475) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:57:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 540 training takes 0:05:03 [2024-03-10 13:57:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [541/800][0/402] eta 0:37:30 lr 0.000025 time 5.5975 (5.5975) loss 0.6352 (0.6352) grad_norm 0.1665 (0.1665) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 13:59:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [541/800][100/402] eta 0:03:59 lr 0.000025 time 0.7488 (0.7944) loss 0.6082 (0.6230) grad_norm 0.1677 (0.1492) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 
14:00:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [541/800][200/402] eta 0:02:35 lr 0.000025 time 0.7455 (0.7704) loss 0.6347 (0.6233) grad_norm 0.1267 (0.1494) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:01:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [541/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7623) loss 0.5856 (0.6231) grad_norm 0.1667 (0.1487) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:02:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [541/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7582) loss 0.6339 (0.6222) grad_norm 0.1345 (0.1482) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:02:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 541 training takes 0:05:04 [2024-03-10 14:02:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [542/800][0/402] eta 0:24:28 lr 0.000025 time 3.6533 (3.6533) loss 0.6570 (0.6570) grad_norm 0.1095 (0.1095) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:04:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [542/800][100/402] eta 0:03:54 lr 0.000025 time 0.7455 (0.7749) loss 0.6237 (0.6242) grad_norm 0.1519 (0.1482) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:05:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [542/800][200/402] eta 0:02:33 lr 0.000025 time 0.7452 (0.7605) loss 0.6149 (0.6230) grad_norm 0.1634 (0.1486) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:06:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [542/800][300/402] eta 0:01:17 lr 0.000025 time 0.7448 (0.7557) loss 0.6057 (0.6233) grad_norm 0.1306 (0.1486) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:07:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [542/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7533) loss 0.6074 (0.6235) grad_norm 0.1505 
(0.1486) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:07:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 542 training takes 0:05:02 [2024-03-10 14:07:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [543/800][0/402] eta 0:25:39 lr 0.000025 time 3.8284 (3.8284) loss 0.6138 (0.6138) grad_norm 0.1388 (0.1388) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:09:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [543/800][100/402] eta 0:03:54 lr 0.000025 time 0.7463 (0.7767) loss 0.6189 (0.6200) grad_norm 0.1451 (0.1487) loss_scale 1048576.0000 (534669.9406) mem 28968MB [2024-03-10 14:10:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [543/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7613) loss 0.6467 (0.6217) grad_norm 0.1343 (inf) loss_scale 524288.0000 (539938.3881) mem 28968MB [2024-03-10 14:11:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [543/800][300/402] eta 0:01:17 lr 0.000025 time 0.7467 (0.7563) loss 0.6076 (0.6214) grad_norm 0.1407 (inf) loss_scale 524288.0000 (534738.9236) mem 28968MB [2024-03-10 14:12:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [543/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7538) loss 0.6280 (0.6219) grad_norm 0.1381 (inf) loss_scale 524288.0000 (532132.7082) mem 28968MB [2024-03-10 14:12:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 543 training takes 0:05:03 [2024-03-10 14:12:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [544/800][0/402] eta 0:24:52 lr 0.000025 time 3.7135 (3.7135) loss 0.6103 (0.6103) grad_norm 0.1422 (0.1422) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:14:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [544/800][100/402] eta 0:03:54 lr 0.000025 time 0.7455 (0.7763) loss 0.6088 (0.6179) grad_norm 0.1296 (0.1514) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 
14:15:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [544/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7615) loss 0.6006 (0.6214) grad_norm 0.1328 (0.1489) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:16:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [544/800][300/402] eta 0:01:17 lr 0.000025 time 0.7474 (0.7565) loss 0.6214 (0.6217) grad_norm 0.1457 (0.1493) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:17:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [544/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7541) loss 0.5872 (0.6216) grad_norm 0.1425 (0.1492) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:17:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 544 training takes 0:05:03 [2024-03-10 14:17:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [545/800][0/402] eta 0:24:39 lr 0.000025 time 3.6794 (3.6794) loss 0.6347 (0.6347) grad_norm 0.1513 (0.1513) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:19:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [545/800][100/402] eta 0:03:54 lr 0.000025 time 0.7460 (0.7766) loss 0.6353 (0.6240) grad_norm 0.1522 (0.1490) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:20:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [545/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7615) loss 0.6171 (0.6223) grad_norm 0.1387 (0.1480) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:21:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [545/800][300/402] eta 0:01:17 lr 0.000025 time 0.7469 (0.7567) loss 0.6279 (0.6232) grad_norm 0.1623 (0.1470) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:22:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [545/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7539) loss 0.6229 (0.6229) grad_norm 0.1442 
(0.1465) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:22:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 545 training takes 0:05:03 [2024-03-10 14:23:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [546/800][0/402] eta 0:36:05 lr 0.000025 time 5.3857 (5.3857) loss 0.6082 (0.6082) grad_norm 0.1620 (0.1620) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:24:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [546/800][100/402] eta 0:03:59 lr 0.000025 time 0.7480 (0.7922) loss 0.6176 (0.6225) grad_norm 0.1496 (0.1462) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:25:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [546/800][200/402] eta 0:02:35 lr 0.000025 time 0.7462 (0.7695) loss 0.6249 (0.6233) grad_norm 0.1244 (0.1453) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:26:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [546/800][300/402] eta 0:01:17 lr 0.000025 time 0.7503 (0.7620) loss 0.6206 (0.6232) grad_norm 0.1448 (0.1452) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:28:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [546/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7581) loss 0.6512 (0.6231) grad_norm 0.1413 (0.1457) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:28:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 546 training takes 0:05:04 [2024-03-10 14:28:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [547/800][0/402] eta 0:25:35 lr 0.000025 time 3.8187 (3.8187) loss 0.6488 (0.6488) grad_norm 0.1595 (0.1595) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:29:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [547/800][100/402] eta 0:03:54 lr 0.000025 time 0.7468 (0.7777) loss 0.6346 (0.6215) grad_norm 0.1635 (0.1460) loss_scale 524288.0000 (524288.0000) mem 28968MB 
[2024-03-10 14:30:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [547/800][200/402] eta 0:02:34 lr 0.000025 time 0.8271 (0.7626) loss 0.6273 (0.6220) grad_norm 0.1290 (0.1489) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:31:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [547/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7572) loss 0.6139 (0.6230) grad_norm 0.1788 (0.1483) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:33:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [547/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7544) loss 0.6220 (0.6231) grad_norm 0.1352 (0.1482) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:33:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 547 training takes 0:05:03 [2024-03-10 14:33:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [548/800][0/402] eta 0:24:42 lr 0.000025 time 3.6890 (3.6890) loss 0.5801 (0.5801) grad_norm 0.1375 (0.1375) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:34:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [548/800][100/402] eta 0:03:54 lr 0.000025 time 0.7458 (0.7755) loss 0.6058 (0.6237) grad_norm 0.1539 (0.1520) loss_scale 1048576.0000 (555433.8218) mem 28968MB [2024-03-10 14:35:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [548/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7610) loss 0.5938 (0.6236) grad_norm 0.1895 (inf) loss_scale 524288.0000 (563413.9701) mem 28968MB [2024-03-10 14:36:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [548/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7562) loss 0.6211 (0.6231) grad_norm 0.1486 (inf) loss_scale 524288.0000 (550415.3090) mem 28968MB [2024-03-10 14:38:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [548/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7537) loss 0.6063 (0.6228) grad_norm 
0.1445 (inf) loss_scale 524288.0000 (543899.7706) mem 28968MB [2024-03-10 14:38:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 548 training takes 0:05:03 [2024-03-10 14:38:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [549/800][0/402] eta 0:25:10 lr 0.000025 time 3.7568 (3.7568) loss 0.6084 (0.6084) grad_norm 0.1435 (0.1435) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:39:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [549/800][100/402] eta 0:03:54 lr 0.000025 time 0.7460 (0.7764) loss 0.6147 (0.6220) grad_norm 0.1311 (0.1503) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:40:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [549/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7618) loss 0.6257 (0.6223) grad_norm 0.1407 (0.1496) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:41:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [549/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7568) loss 0.6356 (0.6216) grad_norm 0.1754 (0.1487) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:43:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [549/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7544) loss 0.6264 (0.6216) grad_norm 0.1343 (0.1494) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:43:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 549 training takes 0:05:03 [2024-03-10 14:43:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [550/800][0/402] eta 0:24:40 lr 0.000025 time 3.6839 (3.6839) loss 0.6233 (0.6233) grad_norm 0.1664 (0.1664) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:44:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [550/800][100/402] eta 0:03:54 lr 0.000025 time 0.7464 (0.7755) loss 0.6026 (0.6242) grad_norm 0.1239 (0.1477) loss_scale 524288.0000 (524288.0000) mem 28968MB 
[2024-03-10 14:45:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [550/800][200/402] eta 0:02:33 lr 0.000025 time 0.7476 (0.7612) loss 0.6306 (0.6229) grad_norm 0.1334 (0.1500) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:46:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [550/800][300/402] eta 0:01:17 lr 0.000025 time 0.7470 (0.7564) loss 0.6240 (0.6219) grad_norm 0.1846 (0.1503) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:48:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [550/800][400/402] eta 0:00:01 lr 0.000025 time 0.7456 (0.7539) loss 0.6498 (0.6220) grad_norm 0.1757 (0.1503) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:48:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 550 training takes 0:05:03 [2024-03-10 14:48:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [551/800][0/402] eta 0:36:12 lr 0.000025 time 5.4042 (5.4042) loss 0.6121 (0.6121) grad_norm 0.1493 (0.1493) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:49:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [551/800][100/402] eta 0:03:59 lr 0.000025 time 0.7480 (0.7924) loss 0.6457 (0.6206) grad_norm 0.1668 (0.1475) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:50:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [551/800][200/402] eta 0:02:35 lr 0.000025 time 0.7463 (0.7698) loss 0.6321 (0.6225) grad_norm 0.1393 (0.1483) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:52:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [551/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7620) loss 0.5940 (0.6225) grad_norm 0.1546 (0.1483) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:53:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [551/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7579) loss 0.6075 (0.6222) 
grad_norm 0.1376 (0.1482) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:53:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 551 training takes 0:05:04 [2024-03-10 14:53:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [552/800][0/402] eta 0:24:41 lr 0.000025 time 3.6853 (3.6853) loss 0.6228 (0.6228) grad_norm 0.1602 (0.1602) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:54:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [552/800][100/402] eta 0:03:54 lr 0.000025 time 0.7459 (0.7767) loss 0.6138 (0.6225) grad_norm 0.1504 (0.1451) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:55:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [552/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7616) loss 0.6264 (0.6247) grad_norm 0.1462 (0.1471) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:57:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [552/800][300/402] eta 0:01:17 lr 0.000025 time 0.7453 (0.7566) loss 0.6359 (0.6243) grad_norm 0.1337 (0.1475) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:58:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [552/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7540) loss 0.6200 (0.6235) grad_norm 0.1420 (0.1484) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:58:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 552 training takes 0:05:03 [2024-03-10 14:58:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [553/800][0/402] eta 0:25:18 lr 0.000025 time 3.7779 (3.7779) loss 0.5956 (0.5956) grad_norm 0.1773 (0.1773) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 14:59:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [553/800][100/402] eta 0:03:54 lr 0.000025 time 0.7465 (0.7761) loss 0.6302 (0.6217) grad_norm 0.1287 (inf) loss_scale 262144.0000 (469782.8119) mem 
28968MB [2024-03-10 15:00:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [553/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7614) loss 0.6397 (0.6220) grad_norm 0.1299 (inf) loss_scale 262144.0000 (366479.9204) mem 28968MB [2024-03-10 15:02:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [553/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7563) loss 0.6120 (0.6227) grad_norm 0.1785 (inf) loss_scale 262144.0000 (331816.8239) mem 28968MB [2024-03-10 15:03:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [553/800][400/402] eta 0:00:01 lr 0.000025 time 0.7475 (0.7539) loss 0.6174 (0.6225) grad_norm 0.1374 (inf) loss_scale 262144.0000 (314442.0549) mem 28968MB [2024-03-10 15:03:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 553 training takes 0:05:03 [2024-03-10 15:03:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [554/800][0/402] eta 0:26:08 lr 0.000025 time 3.9024 (3.9024) loss 0.6374 (0.6374) grad_norm 0.1792 (0.1792) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:04:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [554/800][100/402] eta 0:03:54 lr 0.000025 time 0.7457 (0.7774) loss 0.6252 (0.6223) grad_norm 0.1386 (0.1521) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:05:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [554/800][200/402] eta 0:02:33 lr 0.000025 time 0.7466 (0.7620) loss 0.6201 (0.6221) grad_norm 0.1436 (0.1488) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:07:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [554/800][300/402] eta 0:01:17 lr 0.000025 time 0.7467 (0.7568) loss 0.6126 (0.6230) grad_norm 0.1720 (0.1474) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:08:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [554/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7542) loss 0.6580 (0.6235) 
grad_norm 0.1594 (0.1469) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:08:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 554 training takes 0:05:03 [2024-03-10 15:08:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [555/800][0/402] eta 0:25:00 lr 0.000025 time 3.7322 (3.7322) loss 0.6181 (0.6181) grad_norm 0.1476 (0.1476) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:09:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [555/800][100/402] eta 0:03:54 lr 0.000025 time 0.7461 (0.7761) loss 0.6689 (0.6178) grad_norm 0.1325 (0.1506) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:11:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [555/800][200/402] eta 0:02:33 lr 0.000025 time 0.7467 (0.7615) loss 0.6187 (0.6209) grad_norm 0.1648 (0.1490) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:12:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [555/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7565) loss 0.6136 (0.6216) grad_norm 0.1240 (0.1500) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:13:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [555/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7540) loss 0.6233 (0.6221) grad_norm 0.1447 (0.1486) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:13:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 555 training takes 0:05:03 [2024-03-10 15:13:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [556/800][0/402] eta 0:37:36 lr 0.000025 time 5.6132 (5.6132) loss 0.6398 (0.6398) grad_norm 0.1394 (0.1394) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:14:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [556/800][100/402] eta 0:03:59 lr 0.000025 time 0.7457 (0.7946) loss 0.6113 (0.6234) grad_norm 0.1357 (0.1502) loss_scale 262144.0000 (262144.0000) mem 
28968MB [2024-03-10 15:16:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [556/800][200/402] eta 0:02:35 lr 0.000025 time 0.7465 (0.7706) loss 0.6126 (0.6206) grad_norm 0.1498 (0.1510) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:17:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [556/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7628) loss 0.6506 (0.6216) grad_norm 0.1606 (0.1488) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:18:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [556/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7586) loss 0.6228 (0.6229) grad_norm 0.1806 (0.1478) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:18:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 556 training takes 0:05:05 [2024-03-10 15:18:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [557/800][0/402] eta 0:26:12 lr 0.000025 time 3.9127 (3.9127) loss 0.6422 (0.6422) grad_norm 0.1531 (0.1531) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:19:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [557/800][100/402] eta 0:03:54 lr 0.000025 time 0.7462 (0.7778) loss 0.6470 (0.6190) grad_norm 0.1591 (0.1482) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:21:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [557/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7621) loss 0.6285 (0.6212) grad_norm 0.1773 (0.1496) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:22:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [557/800][300/402] eta 0:01:17 lr 0.000025 time 0.7464 (0.7568) loss 0.6521 (0.6228) grad_norm 0.1613 (0.1503) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:23:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [557/800][400/402] eta 0:00:01 lr 0.000025 time 0.7457 (0.7542) loss 0.6333 
(0.6227) grad_norm 0.1591 (0.1493) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:23:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 557 training takes 0:05:03 [2024-03-10 15:23:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [558/800][0/402] eta 0:25:37 lr 0.000025 time 3.8254 (3.8254) loss 0.5908 (0.5908) grad_norm 0.1967 (0.1967) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 15:24:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [558/800][100/402] eta 0:03:54 lr 0.000025 time 0.7485 (0.7771) loss 0.6347 (0.6218) grad_norm 0.1371 (inf) loss_scale 131072.0000 (166111.0495) mem 28968MB [2024-03-10 15:26:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [558/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7622) loss 0.6363 (0.6249) grad_norm 0.1341 (inf) loss_scale 131072.0000 (148678.6866) mem 28968MB [2024-03-10 15:27:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [558/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7568) loss 0.6045 (0.6244) grad_norm 0.1656 (inf) loss_scale 131072.0000 (142829.2890) mem 28968MB [2024-03-10 15:28:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [558/800][400/402] eta 0:00:01 lr 0.000025 time 0.7461 (0.7541) loss 0.5955 (0.6236) grad_norm 0.1456 (inf) loss_scale 131072.0000 (139897.2968) mem 28968MB [2024-03-10 15:28:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 558 training takes 0:05:03 [2024-03-10 15:28:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [559/800][0/402] eta 0:26:16 lr 0.000025 time 3.9208 (3.9208) loss 0.5999 (0.5999) grad_norm 0.1933 (0.1933) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:30:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [559/800][100/402] eta 0:03:55 lr 0.000025 time 0.7461 (0.7787) loss 0.6156 (0.6217) grad_norm 0.1475 (0.1482) loss_scale 131072.0000 (131072.0000) mem 
28968MB [2024-03-10 15:31:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [559/800][200/402] eta 0:02:34 lr 0.000025 time 0.7463 (0.7625) loss 0.5706 (0.6219) grad_norm 0.1642 (0.1488) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:32:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [559/800][300/402] eta 0:01:17 lr 0.000025 time 0.7455 (0.7571) loss 0.5950 (0.6222) grad_norm 0.1365 (0.1497) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:33:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [559/800][400/402] eta 0:00:01 lr 0.000025 time 0.7458 (0.7544) loss 0.6072 (0.6226) grad_norm 0.1767 (0.1481) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:33:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 559 training takes 0:05:03 [2024-03-10 15:33:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [560/800][0/402] eta 0:26:23 lr 0.000025 time 3.9381 (3.9381) loss 0.5933 (0.5933) grad_norm 0.1467 (0.1467) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:35:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [560/800][100/402] eta 0:03:54 lr 0.000025 time 0.7482 (0.7780) loss 0.6003 (0.6200) grad_norm 0.1627 (0.1470) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:36:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [560/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7621) loss 0.5848 (0.6202) grad_norm 0.1583 (0.1474) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:37:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [560/800][300/402] eta 0:01:17 lr 0.000025 time 0.7475 (0.7570) loss 0.6249 (0.6211) grad_norm 0.1597 (0.1472) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:38:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [560/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7542) loss 0.5955 
(0.6221) grad_norm 0.1412 (0.1485) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:38:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 560 training takes 0:05:03 [2024-03-10 15:38:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [561/800][0/402] eta 0:36:49 lr 0.000025 time 5.4969 (5.4969) loss 0.6200 (0.6200) grad_norm 0.1502 (0.1502) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:40:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [561/800][100/402] eta 0:03:59 lr 0.000025 time 0.7464 (0.7931) loss 0.6327 (0.6223) grad_norm 0.1490 (0.1470) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:41:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [561/800][200/402] eta 0:02:35 lr 0.000025 time 0.7455 (0.7700) loss 0.6334 (0.6230) grad_norm 0.1406 (0.1472) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:42:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [561/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7620) loss 0.6376 (0.6222) grad_norm 0.1863 (0.1484) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:43:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [561/800][400/402] eta 0:00:01 lr 0.000025 time 0.7483 (0.7579) loss 0.6260 (0.6229) grad_norm 0.1371 (0.1477) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:43:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 561 training takes 0:05:04 [2024-03-10 15:43:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [562/800][0/402] eta 0:24:36 lr 0.000025 time 3.6735 (3.6735) loss 0.6415 (0.6415) grad_norm 0.1396 (0.1396) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:45:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [562/800][100/402] eta 0:03:54 lr 0.000025 time 0.7451 (0.7751) loss 0.5986 (0.6211) grad_norm 0.1643 (0.1477) loss_scale 131072.0000 
(131072.0000) mem 28968MB [2024-03-10 15:46:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [562/800][200/402] eta 0:02:33 lr 0.000025 time 0.7449 (0.7605) loss 0.6394 (0.6222) grad_norm 0.1575 (0.1495) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 15:47:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [562/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7557) loss 0.6298 (0.6235) grad_norm 0.1484 (inf) loss_scale 65536.0000 (129330.1794) mem 28968MB [2024-03-10 15:48:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [562/800][400/402] eta 0:00:01 lr 0.000025 time 0.7466 (0.7532) loss 0.6242 (0.6227) grad_norm 0.1734 (inf) loss_scale 65536.0000 (113421.4065) mem 28968MB [2024-03-10 15:48:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 562 training takes 0:05:02 [2024-03-10 15:49:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [563/800][0/402] eta 0:25:11 lr 0.000025 time 3.7594 (3.7594) loss 0.5967 (0.5967) grad_norm 0.1594 (0.1594) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 15:50:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [563/800][100/402] eta 0:03:54 lr 0.000025 time 0.7461 (0.7760) loss 0.6073 (0.6242) grad_norm 0.1333 (0.1494) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 15:51:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [563/800][200/402] eta 0:02:33 lr 0.000025 time 0.7491 (0.7612) loss 0.6165 (0.6235) grad_norm 0.1732 (0.1500) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 15:52:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [563/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7564) loss 0.6461 (0.6237) grad_norm 0.1420 (0.1493) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 15:54:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [563/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7538) loss 0.6096 
(0.6234) grad_norm 0.1512 (0.1496) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 15:54:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 563 training takes 0:05:03 [2024-03-10 15:54:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [564/800][0/402] eta 0:24:56 lr 0.000025 time 3.7219 (3.7219) loss 0.6353 (0.6353) grad_norm 0.1939 (0.1939) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 15:55:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [564/800][100/402] eta 0:03:54 lr 0.000025 time 0.7460 (0.7760) loss 0.6155 (0.6222) grad_norm 0.1815 (0.1540) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 15:56:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [564/800][200/402] eta 0:02:33 lr 0.000025 time 0.7475 (0.7611) loss 0.6164 (0.6236) grad_norm 0.1429 (0.1504) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 15:57:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [564/800][300/402] eta 0:01:17 lr 0.000025 time 0.7455 (0.7560) loss 0.6428 (0.6241) grad_norm 0.1590 (0.1482) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 15:59:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [564/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7534) loss 0.6357 (0.6231) grad_norm 0.1426 (0.1472) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 15:59:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 564 training takes 0:05:02 [2024-03-10 15:59:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [565/800][0/402] eta 0:24:31 lr 0.000025 time 3.6609 (3.6609) loss 0.6231 (0.6231) grad_norm 0.1282 (0.1282) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 16:00:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [565/800][100/402] eta 0:03:54 lr 0.000025 time 0.7472 (0.7759) loss 0.6208 (0.6234) grad_norm 0.1722 (0.1495) loss_scale 65536.0000 (65536.0000) mem 
28968MB [2024-03-10 16:01:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [565/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7611) loss 0.5835 (0.6224) grad_norm 0.1396 (0.1475) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 16:02:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [565/800][300/402] eta 0:01:17 lr 0.000025 time 0.7454 (0.7561) loss 0.5877 (0.6220) grad_norm 0.1372 (0.1474) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 16:04:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [565/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7535) loss 0.6057 (0.6214) grad_norm 0.1703 (0.1492) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 16:04:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 565 training takes 0:05:03 [2024-03-10 16:04:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [566/800][0/402] eta 0:38:01 lr 0.000025 time 5.6762 (5.6762) loss 0.6300 (0.6300) grad_norm 0.1708 (0.1708) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 16:05:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [566/800][100/402] eta 0:04:00 lr 0.000025 time 0.7458 (0.7959) loss 0.6125 (0.6218) grad_norm 0.1295 (0.1506) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 16:06:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [566/800][200/402] eta 0:02:35 lr 0.000025 time 0.7463 (0.7712) loss 0.6398 (0.6228) grad_norm 0.1276 (0.1484) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 16:07:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [566/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7629) loss 0.6534 (0.6222) grad_norm 0.1220 (0.1471) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 16:09:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [566/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7587) loss 0.6107 (0.6219) grad_norm 
0.1158 (0.1475) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 16:09:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 566 training takes 0:05:05 [2024-03-10 16:09:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [567/800][0/402] eta 0:25:31 lr 0.000025 time 3.8091 (3.8091) loss 0.6503 (0.6503) grad_norm 0.1181 (0.1181) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 16:10:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [567/800][100/402] eta 0:03:54 lr 0.000025 time 0.7467 (0.7768) loss 0.6254 (0.6183) grad_norm 0.1693 (0.1484) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 16:11:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [567/800][200/402] eta 0:02:33 lr 0.000025 time 0.7515 (0.7619) loss 0.6408 (0.6214) grad_norm 0.1414 (0.1480) loss_scale 65536.0000 (65536.0000) mem 28968MB [2024-03-10 16:13:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [567/800][300/402] eta 0:01:17 lr 0.000025 time 0.7460 (0.7568) loss 0.6387 (0.6221) grad_norm 0.1396 (0.1486) loss_scale 131072.0000 (69455.0963) mem 28968MB [2024-03-10 16:14:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [567/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7541) loss 0.6402 (0.6227) grad_norm 0.1103 (0.1483) loss_scale 131072.0000 (84820.9077) mem 28968MB [2024-03-10 16:14:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 567 training takes 0:05:03 [2024-03-10 16:14:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [568/800][0/402] eta 0:25:33 lr 0.000025 time 3.8135 (3.8135) loss 0.6616 (0.6616) grad_norm 0.1160 (0.1160) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:15:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [568/800][100/402] eta 0:03:54 lr 0.000025 time 0.7457 (0.7767) loss 0.6169 (0.6221) grad_norm 0.1561 (0.1480) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 
16:16:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [568/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7618) loss 0.6251 (0.6207) grad_norm 0.1287 (0.1494) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:18:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [568/800][300/402] eta 0:01:17 lr 0.000025 time 0.7505 (0.7568) loss 0.6455 (0.6205) grad_norm 0.1445 (0.1496) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:19:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [568/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7541) loss 0.6320 (0.6212) grad_norm 0.1398 (0.1489) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:19:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 568 training takes 0:05:03 [2024-03-10 16:19:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [569/800][0/402] eta 0:24:52 lr 0.000025 time 3.7134 (3.7134) loss 0.5978 (0.5978) grad_norm 0.1188 (0.1188) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:20:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [569/800][100/402] eta 0:03:54 lr 0.000025 time 0.7461 (0.7758) loss 0.6628 (0.6237) grad_norm 0.1444 (0.1484) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:21:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [569/800][200/402] eta 0:02:33 lr 0.000025 time 0.7482 (0.7614) loss 0.6215 (0.6237) grad_norm 0.1329 (0.1485) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:23:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [569/800][300/402] eta 0:01:17 lr 0.000025 time 0.7464 (0.7564) loss 0.6083 (0.6241) grad_norm 0.1570 (0.1493) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:24:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [569/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7538) loss 0.6330 (0.6239) grad_norm 0.1232 
(0.1496) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:24:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 569 training takes 0:05:03 [2024-03-10 16:24:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [570/800][0/402] eta 0:24:49 lr 0.000025 time 3.7043 (3.7043) loss 0.6148 (0.6148) grad_norm 0.1391 (0.1391) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:25:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [570/800][100/402] eta 0:03:54 lr 0.000025 time 0.7460 (0.7755) loss 0.6281 (0.6233) grad_norm 0.1428 (0.1474) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:26:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [570/800][200/402] eta 0:02:33 lr 0.000025 time 0.7466 (0.7611) loss 0.6284 (0.6209) grad_norm 0.2110 (0.1476) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:28:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [570/800][300/402] eta 0:01:17 lr 0.000025 time 0.7465 (0.7564) loss 0.6006 (0.6210) grad_norm 0.1349 (0.1480) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:29:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [570/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7538) loss 0.5836 (0.6216) grad_norm 0.1257 (0.1485) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:29:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 570 training takes 0:05:03 [2024-03-10 16:29:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [571/800][0/402] eta 0:37:00 lr 0.000025 time 5.5236 (5.5236) loss 0.6123 (0.6123) grad_norm 0.1599 (0.1599) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:30:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [571/800][100/402] eta 0:03:59 lr 0.000025 time 0.7470 (0.7936) loss 0.6254 (0.6226) grad_norm 0.1613 (0.1522) loss_scale 131072.0000 (131072.0000) mem 28968MB 
[2024-03-10 16:32:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [571/800][200/402] eta 0:02:35 lr 0.000025 time 0.7465 (0.7702) loss 0.6137 (0.6226) grad_norm 0.1803 (0.1523) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:33:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [571/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7625) loss 0.5968 (0.6224) grad_norm 0.1589 (0.1500) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:34:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [571/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7584) loss 0.6200 (0.6222) grad_norm 0.1292 (0.1506) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:34:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 571 training takes 0:05:04 [2024-03-10 16:34:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [572/800][0/402] eta 0:25:50 lr 0.000025 time 3.8566 (3.8566) loss 0.6395 (0.6395) grad_norm 0.1537 (0.1537) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:35:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [572/800][100/402] eta 0:03:54 lr 0.000025 time 0.7464 (0.7772) loss 0.6354 (0.6211) grad_norm 0.1339 (0.1487) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:37:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [572/800][200/402] eta 0:02:33 lr 0.000025 time 0.7479 (0.7617) loss 0.6416 (0.6221) grad_norm 0.1405 (0.1502) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-10 16:38:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [572/800][300/402] eta 0:01:17 lr 0.000025 time 0.7464 (0.7566) loss 0.6063 (0.6214) grad_norm 0.1630 (0.1496) loss_scale 262144.0000 (143264.7442) mem 28968MB [2024-03-10 16:39:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [572/800][400/402] eta 0:00:01 lr 0.000025 time 0.7455 (0.7541) loss 0.5959 (0.6216) 
grad_norm 0.1299 (0.1491) loss_scale 262144.0000 (172910.4439) mem 28968MB [2024-03-10 16:39:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 572 training takes 0:05:03 [2024-03-10 16:39:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [573/800][0/402] eta 0:25:27 lr 0.000025 time 3.7997 (3.7997) loss 0.6154 (0.6154) grad_norm 0.1264 (0.1264) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:40:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [573/800][100/402] eta 0:03:54 lr 0.000025 time 0.7478 (0.7765) loss 0.6250 (0.6208) grad_norm 0.1592 (0.1486) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:42:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [573/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7615) loss 0.6294 (0.6213) grad_norm 0.1138 (0.1502) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:43:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [573/800][300/402] eta 0:01:17 lr 0.000025 time 0.7487 (0.7565) loss 0.6396 (0.6232) grad_norm 0.1745 (0.1501) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:44:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [573/800][400/402] eta 0:00:01 lr 0.000025 time 0.7454 (0.7539) loss 0.6305 (0.6230) grad_norm 0.1523 (0.1496) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:44:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 573 training takes 0:05:03 [2024-03-10 16:44:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [574/800][0/402] eta 0:24:56 lr 0.000025 time 3.7233 (3.7233) loss 0.6345 (0.6345) grad_norm 0.1337 (0.1337) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:45:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [574/800][100/402] eta 0:03:54 lr 0.000025 time 0.7466 (0.7770) loss 0.6491 (0.6222) grad_norm 0.1265 (0.1473) loss_scale 262144.0000 (262144.0000) mem 
28968MB [2024-03-10 16:47:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [574/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7618) loss 0.5999 (0.6230) grad_norm 0.1421 (0.1479) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:48:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [574/800][300/402] eta 0:01:17 lr 0.000025 time 0.7475 (0.7568) loss 0.6127 (0.6230) grad_norm 0.1513 (0.1481) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:49:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [574/800][400/402] eta 0:00:01 lr 0.000025 time 0.7454 (0.7543) loss 0.5971 (0.6231) grad_norm 0.1284 (0.1486) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:49:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 574 training takes 0:05:03 [2024-03-10 16:49:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [575/800][0/402] eta 0:24:55 lr 0.000025 time 3.7195 (3.7195) loss 0.5896 (0.5896) grad_norm 0.1541 (0.1541) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:50:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [575/800][100/402] eta 0:03:54 lr 0.000025 time 0.7701 (0.7765) loss 0.6287 (0.6244) grad_norm 0.1483 (0.1465) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:52:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [575/800][200/402] eta 0:02:33 lr 0.000025 time 0.7478 (0.7616) loss 0.6264 (0.6229) grad_norm 0.1519 (0.1472) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:53:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [575/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7565) loss 0.6199 (0.6228) grad_norm 0.1521 (0.1491) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:54:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [575/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7538) loss 0.6462 
(0.6228) grad_norm 0.1202 (0.1495) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:54:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 575 training takes 0:05:03 [2024-03-10 16:54:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [576/800][0/402] eta 0:36:19 lr 0.000025 time 5.4209 (5.4209) loss 0.6279 (0.6279) grad_norm 0.1642 (0.1642) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:56:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [576/800][100/402] eta 0:03:59 lr 0.000025 time 0.7492 (0.7933) loss 0.6139 (0.6202) grad_norm 0.1221 (0.1508) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:57:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [576/800][200/402] eta 0:02:35 lr 0.000025 time 0.7458 (0.7695) loss 0.6291 (0.6214) grad_norm 0.1618 (0.1511) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:58:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [576/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7615) loss 0.6128 (0.6221) grad_norm 0.1221 (0.1488) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:59:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [576/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7575) loss 0.6041 (0.6226) grad_norm 0.1839 (0.1485) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 16:59:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 576 training takes 0:05:04 [2024-03-10 16:59:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [577/800][0/402] eta 0:25:21 lr 0.000025 time 3.7860 (3.7860) loss 0.6116 (0.6116) grad_norm 0.1659 (0.1659) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:01:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [577/800][100/402] eta 0:03:54 lr 0.000025 time 0.7452 (0.7760) loss 0.5829 (0.6191) grad_norm 0.1810 (0.1487) loss_scale 262144.0000 
(262144.0000) mem 28968MB [2024-03-10 17:02:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [577/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7609) loss 0.6222 (0.6201) grad_norm 0.1418 (0.1496) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:03:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [577/800][300/402] eta 0:01:17 lr 0.000025 time 0.7451 (0.7561) loss 0.6072 (0.6210) grad_norm 0.1746 (0.1490) loss_scale 524288.0000 (295238.5914) mem 28968MB [2024-03-10 17:04:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [577/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7536) loss 0.5885 (0.6220) grad_norm 0.1394 (0.1493) loss_scale 524288.0000 (352358.1446) mem 28968MB [2024-03-10 17:04:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 577 training takes 0:05:03 [2024-03-10 17:04:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [578/800][0/402] eta 0:25:19 lr 0.000025 time 3.7795 (3.7795) loss 0.6160 (0.6160) grad_norm 0.1515 (0.1515) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:06:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [578/800][100/402] eta 0:03:54 lr 0.000025 time 0.7457 (0.7765) loss 0.6355 (0.6237) grad_norm 0.1803 (0.1487) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:07:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [578/800][200/402] eta 0:02:33 lr 0.000025 time 0.7466 (0.7615) loss 0.6410 (0.6219) grad_norm 0.1417 (0.1471) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:08:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [578/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7565) loss 0.6123 (0.6227) grad_norm 0.1272 (0.1487) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:09:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [578/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7539) 
loss 0.6387 (0.6227) grad_norm 0.1295 (0.1483) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:09:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 578 training takes 0:05:03 [2024-03-10 17:09:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [579/800][0/402] eta 0:24:41 lr 0.000025 time 3.6859 (3.6859) loss 0.5779 (0.5779) grad_norm 0.1543 (0.1543) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:11:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [579/800][100/402] eta 0:03:54 lr 0.000025 time 0.7459 (0.7756) loss 0.6523 (0.6192) grad_norm 0.1399 (0.1449) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:12:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [579/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7608) loss 0.6395 (0.6211) grad_norm 0.1666 (0.1464) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:13:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [579/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7558) loss 0.6213 (0.6221) grad_norm 0.1677 (0.1491) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:14:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [579/800][400/402] eta 0:00:01 lr 0.000025 time 0.7498 (0.7534) loss 0.5992 (0.6219) grad_norm 0.1943 (0.1503) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:14:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 579 training takes 0:05:02 [2024-03-10 17:15:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [580/800][0/402] eta 0:25:35 lr 0.000025 time 3.8184 (3.8184) loss 0.6409 (0.6409) grad_norm 0.1369 (0.1369) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:16:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [580/800][100/402] eta 0:03:54 lr 0.000025 time 0.7468 (0.7765) loss 0.6286 (0.6236) grad_norm 0.1193 (0.1503) loss_scale 
524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:17:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [580/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7614) loss 0.6070 (0.6231) grad_norm 0.1379 (0.1493) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:18:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [580/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7563) loss 0.6448 (0.6221) grad_norm 0.1332 (0.1497) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:20:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [580/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7536) loss 0.6659 (0.6219) grad_norm 0.1170 (inf) loss_scale 262144.0000 (498792.6983) mem 28968MB [2024-03-10 17:20:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 580 training takes 0:05:03 [2024-03-10 17:20:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [581/800][0/402] eta 0:34:17 lr 0.000025 time 5.1169 (5.1169) loss 0.6459 (0.6459) grad_norm 0.1312 (0.1312) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:21:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [581/800][100/402] eta 0:03:58 lr 0.000025 time 0.7462 (0.7894) loss 0.6271 (0.6236) grad_norm 0.1425 (0.1483) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:22:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [581/800][200/402] eta 0:02:35 lr 0.000025 time 0.7453 (0.7677) loss 0.6289 (0.6232) grad_norm 0.1457 (0.1494) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:23:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [581/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7605) loss 0.6527 (0.6232) grad_norm 0.1426 (0.1485) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:25:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [581/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 
(0.7568) loss 0.5895 (0.6222) grad_norm 0.1636 (0.1495) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:25:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 581 training takes 0:05:04 [2024-03-10 17:25:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [582/800][0/402] eta 0:23:36 lr 0.000025 time 3.5231 (3.5231) loss 0.6233 (0.6233) grad_norm 0.1650 (0.1650) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:26:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [582/800][100/402] eta 0:03:53 lr 0.000025 time 0.7458 (0.7743) loss 0.6212 (0.6181) grad_norm 0.1462 (0.1534) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:27:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [582/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7603) loss 0.6244 (0.6198) grad_norm 0.1534 (0.1507) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:28:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [582/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7556) loss 0.6007 (0.6196) grad_norm 0.1656 (0.1518) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:30:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [582/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7532) loss 0.6123 (0.6200) grad_norm 0.1599 (0.1512) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:30:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 582 training takes 0:05:02 [2024-03-10 17:30:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [583/800][0/402] eta 0:23:23 lr 0.000025 time 3.4901 (3.4901) loss 0.6197 (0.6197) grad_norm 0.1569 (0.1569) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:31:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [583/800][100/402] eta 0:03:53 lr 0.000025 time 0.7461 (0.7732) loss 0.6325 (0.6215) grad_norm 0.1579 (0.1487) loss_scale 
262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:32:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [583/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7597) loss 0.6184 (0.6204) grad_norm 0.1546 (0.1497) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:33:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [583/800][300/402] eta 0:01:17 lr 0.000025 time 0.7469 (0.7552) loss 0.6370 (0.6215) grad_norm 0.1615 (0.1503) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:35:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [583/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7529) loss 0.6332 (0.6215) grad_norm 0.1340 (0.1495) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:35:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 583 training takes 0:05:02 [2024-03-10 17:35:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [584/800][0/402] eta 0:23:55 lr 0.000025 time 3.5713 (3.5713) loss 0.6217 (0.6217) grad_norm 0.1624 (0.1624) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:36:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [584/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7740) loss 0.6310 (0.6239) grad_norm 0.1330 (0.1482) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:37:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [584/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7604) loss 0.6434 (0.6231) grad_norm 0.1526 (0.1495) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:38:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [584/800][300/402] eta 0:01:17 lr 0.000025 time 0.7468 (0.7556) loss 0.6336 (0.6233) grad_norm 0.1444 (0.1486) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:40:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [584/800][400/402] eta 0:00:01 lr 0.000025 time 
0.7438 (0.7532) loss 0.6350 (0.6225) grad_norm 0.1509 (0.1487) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:40:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 584 training takes 0:05:02 [2024-03-10 17:40:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [585/800][0/402] eta 0:22:24 lr 0.000025 time 3.3442 (3.3442) loss 0.6382 (0.6382) grad_norm 0.2172 (0.2172) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:41:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [585/800][100/402] eta 0:03:53 lr 0.000025 time 0.7466 (0.7718) loss 0.6067 (0.6235) grad_norm 0.1550 (0.1509) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:42:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [585/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7590) loss 0.6021 (0.6236) grad_norm 0.1518 (0.1495) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:44:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [585/800][300/402] eta 0:01:16 lr 0.000025 time 0.7464 (0.7547) loss 0.6525 (0.6228) grad_norm 0.1455 (0.1508) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 17:45:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [585/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7525) loss 0.6013 (0.6223) grad_norm 0.1631 (0.1515) loss_scale 524288.0000 (294176.5586) mem 28968MB [2024-03-10 17:45:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 585 training takes 0:05:02 [2024-03-10 17:45:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [586/800][0/402] eta 0:32:10 lr 0.000025 time 4.8013 (4.8013) loss 0.5999 (0.5999) grad_norm 0.1778 (0.1778) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:46:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [586/800][100/402] eta 0:03:57 lr 0.000025 time 0.7475 (0.7860) loss 0.6219 (0.6226) grad_norm 0.1745 (0.1524) 
loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:47:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [586/800][200/402] eta 0:02:34 lr 0.000025 time 0.7466 (0.7661) loss 0.6266 (0.6225) grad_norm 0.1582 (0.1530) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:49:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [586/800][300/402] eta 0:01:17 lr 0.000025 time 0.7476 (0.7594) loss 0.6242 (0.6225) grad_norm 0.1569 (0.1533) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:50:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [586/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7562) loss 0.6523 (0.6220) grad_norm 0.1484 (0.1519) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:50:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 586 training takes 0:05:04 [2024-03-10 17:50:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [587/800][0/402] eta 0:22:40 lr 0.000025 time 3.3852 (3.3852) loss 0.6463 (0.6463) grad_norm 0.1361 (0.1361) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:51:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [587/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7723) loss 0.6022 (0.6230) grad_norm 0.1568 (0.1445) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:52:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [587/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7593) loss 0.6337 (0.6216) grad_norm 0.1960 (0.1487) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:54:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [587/800][300/402] eta 0:01:16 lr 0.000025 time 0.7464 (0.7549) loss 0.6262 (0.6219) grad_norm 0.1311 (0.1496) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:55:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [587/800][400/402] eta 0:00:01 lr 
0.000025 time 0.7443 (0.7526) loss 0.6236 (0.6213) grad_norm 0.1895 (0.1508) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:55:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 587 training takes 0:05:02 [2024-03-10 17:55:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [588/800][0/402] eta 0:21:57 lr 0.000025 time 3.2778 (3.2778) loss 0.6052 (0.6052) grad_norm 0.1436 (0.1436) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:56:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [588/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7709) loss 0.6164 (0.6253) grad_norm 0.1338 (0.1450) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:57:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [588/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7585) loss 0.6498 (0.6230) grad_norm 0.1310 (0.1480) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 17:59:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [588/800][300/402] eta 0:01:16 lr 0.000025 time 0.7467 (0.7544) loss 0.6346 (0.6218) grad_norm 0.1489 (0.1483) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:00:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [588/800][400/402] eta 0:00:01 lr 0.000025 time 0.7436 (0.7522) loss 0.6149 (0.6224) grad_norm 0.1568 (0.1484) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:00:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 588 training takes 0:05:02 [2024-03-10 18:00:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [589/800][0/402] eta 0:22:16 lr 0.000025 time 3.3254 (3.3254) loss 0.6324 (0.6324) grad_norm 0.1844 (0.1844) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:01:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [589/800][100/402] eta 0:03:53 lr 0.000025 time 0.7465 (0.7726) loss 0.6606 (0.6205) grad_norm 
0.1447 (0.1502) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:02:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [589/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7594) loss 0.6510 (0.6213) grad_norm 0.1481 (0.1499) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:04:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [589/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7549) loss 0.6399 (0.6217) grad_norm 0.1423 (0.1499) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:05:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [589/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7526) loss 0.6494 (0.6219) grad_norm 0.1482 (0.1495) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:05:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 589 training takes 0:05:02 [2024-03-10 18:05:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [590/800][0/402] eta 0:22:47 lr 0.000025 time 3.4028 (3.4028) loss 0.6055 (0.6055) grad_norm 0.1551 (0.1551) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:06:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [590/800][100/402] eta 0:03:53 lr 0.000025 time 0.7465 (0.7723) loss 0.6122 (0.6250) grad_norm 0.1647 (0.1446) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:08:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [590/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7592) loss 0.6184 (0.6227) grad_norm 0.1275 (0.1462) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:09:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [590/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7548) loss 0.6161 (0.6224) grad_norm 0.1424 (0.1475) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:10:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [590/800][400/402] eta 
0:00:01 lr 0.000025 time 0.7440 (0.7525) loss 0.6042 (0.6218) grad_norm 0.2069 (inf) loss_scale 524288.0000 (583123.3117) mem 28968MB [2024-03-10 18:10:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 590 training takes 0:05:02 [2024-03-10 18:10:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [591/800][0/402] eta 0:32:48 lr 0.000025 time 4.8972 (4.8972) loss 0.6388 (0.6388) grad_norm 0.1296 (0.1296) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:11:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [591/800][100/402] eta 0:03:57 lr 0.000025 time 0.7460 (0.7877) loss 0.6318 (0.6223) grad_norm 0.1556 (0.1487) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:13:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [591/800][200/402] eta 0:02:35 lr 0.000025 time 0.7461 (0.7675) loss 0.6022 (0.6211) grad_norm 0.1520 (0.1483) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:14:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [591/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7605) loss 0.6471 (0.6214) grad_norm 0.1476 (0.1492) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:15:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [591/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7569) loss 0.5958 (0.6216) grad_norm 0.1497 (0.1490) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:15:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 591 training takes 0:05:04 [2024-03-10 18:15:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [592/800][0/402] eta 0:22:41 lr 0.000025 time 3.3859 (3.3859) loss 0.6192 (0.6192) grad_norm 0.1406 (0.1406) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:16:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [592/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7720) loss 0.6045 (0.6244) 
grad_norm 0.1580 (inf) loss_scale 262144.0000 (345199.5248) mem 28968MB [2024-03-10 18:18:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [592/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7591) loss 0.6370 (0.6228) grad_norm 0.1733 (inf) loss_scale 262144.0000 (303878.3682) mem 28968MB [2024-03-10 18:19:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [592/800][300/402] eta 0:01:16 lr 0.000025 time 0.7465 (0.7547) loss 0.6490 (0.6233) grad_norm 0.1426 (inf) loss_scale 262144.0000 (290013.1296) mem 28968MB [2024-03-10 18:20:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [592/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7525) loss 0.6567 (0.6221) grad_norm 0.1389 (inf) loss_scale 262144.0000 (283063.2219) mem 28968MB [2024-03-10 18:20:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 592 training takes 0:05:02 [2024-03-10 18:20:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [593/800][0/402] eta 0:23:14 lr 0.000025 time 3.4684 (3.4684) loss 0.6095 (0.6095) grad_norm 0.1641 (0.1641) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:21:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [593/800][100/402] eta 0:03:53 lr 0.000025 time 0.7460 (0.7729) loss 0.6247 (0.6196) grad_norm 0.1393 (0.1494) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:23:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [593/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7595) loss 0.6219 (0.6209) grad_norm 0.1553 (0.1503) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:24:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [593/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7554) loss 0.6111 (0.6212) grad_norm 0.1658 (0.1502) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:25:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [593/800][400/402] eta 
0:00:01 lr 0.000025 time 0.7440 (0.7530) loss 0.6076 (0.6208) grad_norm 0.1500 (0.1508) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:25:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 593 training takes 0:05:02 [2024-03-10 18:25:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [594/800][0/402] eta 0:22:46 lr 0.000025 time 3.3999 (3.3999) loss 0.6429 (0.6429) grad_norm 0.1532 (0.1532) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:26:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [594/800][100/402] eta 0:03:53 lr 0.000025 time 0.7460 (0.7723) loss 0.6188 (0.6225) grad_norm 0.1616 (0.1509) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:28:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [594/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7592) loss 0.6156 (0.6227) grad_norm 0.1375 (0.1510) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:29:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [594/800][300/402] eta 0:01:16 lr 0.000025 time 0.7465 (0.7548) loss 0.6341 (0.6222) grad_norm 0.1390 (0.1503) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:30:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [594/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7526) loss 0.6094 (0.6220) grad_norm 0.1474 (0.1495) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:30:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 594 training takes 0:05:02 [2024-03-10 18:30:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [595/800][0/402] eta 0:22:22 lr 0.000025 time 3.3406 (3.3406) loss 0.6231 (0.6231) grad_norm 0.1428 (0.1428) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:32:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [595/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7717) loss 0.6452 (0.6232) 
grad_norm 0.1427 (0.1485) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:33:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [595/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7589) loss 0.6346 (0.6217) grad_norm 0.1299 (0.1488) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:34:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [595/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7547) loss 0.5853 (0.6220) grad_norm 0.1349 (0.1497) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:35:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [595/800][400/402] eta 0:00:01 lr 0.000025 time 0.7464 (0.7524) loss 0.5876 (0.6218) grad_norm 0.1507 (0.1494) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:35:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 595 training takes 0:05:02 [2024-03-10 18:35:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [596/800][0/402] eta 0:33:48 lr 0.000025 time 5.0458 (5.0458) loss 0.6339 (0.6339) grad_norm 0.1347 (0.1347) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:37:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [596/800][100/402] eta 0:03:58 lr 0.000025 time 0.7459 (0.7886) loss 0.6257 (0.6235) grad_norm 0.1705 (0.1486) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:38:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [596/800][200/402] eta 0:02:35 lr 0.000025 time 0.7463 (0.7675) loss 0.6117 (0.6223) grad_norm 0.1461 (0.1499) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:39:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [596/800][300/402] eta 0:01:17 lr 0.000025 time 0.7454 (0.7604) loss 0.6102 (0.6223) grad_norm 0.1262 (0.1489) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:40:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[596/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7567) loss 0.6053 (0.6218) grad_norm 0.1499 (0.1489) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:40:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 596 training takes 0:05:04 [2024-03-10 18:40:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [597/800][0/402] eta 0:23:30 lr 0.000025 time 3.5078 (3.5078) loss 0.6493 (0.6493) grad_norm 0.1456 (0.1456) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 18:42:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [597/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7732) loss 0.6221 (0.6205) grad_norm 0.1515 (0.1501) loss_scale 524288.0000 (467187.3267) mem 28968MB [2024-03-10 18:43:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [597/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7597) loss 0.6112 (0.6215) grad_norm 0.1707 (0.1497) loss_scale 524288.0000 (495595.6219) mem 28968MB [2024-03-10 18:44:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [597/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7551) loss 0.6463 (0.6211) grad_norm 0.1312 (0.1499) loss_scale 524288.0000 (505127.9734) mem 28968MB [2024-03-10 18:45:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [597/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7528) loss 0.6323 (0.6213) grad_norm 0.1524 (0.1498) loss_scale 524288.0000 (509906.0349) mem 28968MB [2024-03-10 18:45:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 597 training takes 0:05:02 [2024-03-10 18:45:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [598/800][0/402] eta 0:23:17 lr 0.000025 time 3.4757 (3.4757) loss 0.6338 (0.6338) grad_norm 0.1734 (0.1734) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:47:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [598/800][100/402] eta 0:03:53 lr 0.000025 time 0.7463 (0.7730) 
loss 0.6192 (0.6237) grad_norm 0.1287 (0.1500) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:48:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [598/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7598) loss 0.6246 (0.6209) grad_norm 0.1273 (0.1498) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:49:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [598/800][300/402] eta 0:01:17 lr 0.000025 time 0.7467 (0.7552) loss 0.6086 (0.6207) grad_norm 0.1505 (0.1494) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:50:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [598/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7528) loss 0.6281 (0.6208) grad_norm 0.1193 (0.1492) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:50:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 598 training takes 0:05:02 [2024-03-10 18:50:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [599/800][0/402] eta 0:22:45 lr 0.000025 time 3.3969 (3.3969) loss 0.6257 (0.6257) grad_norm 0.1592 (0.1592) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:52:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [599/800][100/402] eta 0:03:53 lr 0.000025 time 0.7464 (0.7723) loss 0.6057 (0.6197) grad_norm 0.1464 (0.1529) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:53:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [599/800][200/402] eta 0:02:33 lr 0.000025 time 0.7485 (0.7593) loss 0.6173 (0.6217) grad_norm 0.1405 (0.1502) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:54:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [599/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7549) loss 0.6452 (0.6216) grad_norm 0.1502 (0.1492) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:55:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [599/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7526) loss 0.6351 (0.6219) grad_norm 0.1542 (0.1497) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:55:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 599 training takes 0:05:02 [2024-03-10 18:56:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [600/800][0/402] eta 0:22:55 lr 0.000025 time 3.4222 (3.4222) loss 0.6227 (0.6227) grad_norm 0.1518 (0.1518) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 18:57:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [600/800][100/402] eta 0:03:53 lr 0.000025 time 0.7461 (0.7724) loss 0.6181 (0.6208) grad_norm 0.1388 (inf) loss_scale 262144.0000 (285503.3663) mem 28968MB [2024-03-10 18:58:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [600/800][200/402] eta 0:02:33 lr 0.000025 time 0.7469 (0.7593) loss 0.6045 (0.6216) grad_norm 0.1605 (inf) loss_scale 262144.0000 (273881.7910) mem 28968MB [2024-03-10 18:59:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [600/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7552) loss 0.6144 (0.6227) grad_norm 0.1536 (inf) loss_scale 262144.0000 (269982.1927) mem 28968MB [2024-03-10 19:00:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [600/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7529) loss 0.6308 (0.6231) grad_norm 0.1452 (inf) loss_scale 262144.0000 (268027.5312) mem 28968MB [2024-03-10 19:01:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 600 training takes 0:05:02 [2024-03-10 19:01:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [601/800][0/402] eta 0:33:55 lr 0.000025 time 5.0629 (5.0629) loss 0.6353 (0.6353) grad_norm 0.1281 (0.1281) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:02:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [601/800][100/402] eta 0:03:58 lr 0.000025 time 0.7463 (0.7890) loss 
0.6173 (0.6230) grad_norm 0.1294 (0.1492) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:03:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [601/800][200/402] eta 0:02:35 lr 0.000025 time 0.7458 (0.7676) loss 0.6598 (0.6230) grad_norm 0.1213 (0.1482) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:04:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [601/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7605) loss 0.6089 (0.6230) grad_norm 0.1418 (0.1494) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:06:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [601/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7568) loss 0.6352 (0.6225) grad_norm 0.1259 (0.1493) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:06:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 601 training takes 0:05:04 [2024-03-10 19:06:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [602/800][0/402] eta 0:22:51 lr 0.000025 time 3.4116 (3.4116) loss 0.5855 (0.5855) grad_norm 0.1648 (0.1648) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:07:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [602/800][100/402] eta 0:03:53 lr 0.000025 time 0.7458 (0.7723) loss 0.6237 (0.6217) grad_norm 0.1403 (0.1561) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:08:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [602/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7592) loss 0.6040 (0.6209) grad_norm 0.2152 (0.1537) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:09:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [602/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7551) loss 0.6130 (0.6218) grad_norm 0.1860 (0.1510) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:11:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[602/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7528) loss 0.6187 (0.6221) grad_norm 0.1777 (0.1512) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:11:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 602 training takes 0:05:02 [2024-03-10 19:11:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [603/800][0/402] eta 0:23:36 lr 0.000025 time 3.5242 (3.5242) loss 0.6093 (0.6093) grad_norm 0.2306 (0.2306) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:12:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [603/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7734) loss 0.6313 (0.6225) grad_norm 0.1346 (0.1515) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:13:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [603/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7598) loss 0.6140 (0.6213) grad_norm 0.1387 (0.1505) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:14:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [603/800][300/402] eta 0:01:17 lr 0.000025 time 0.7464 (0.7552) loss 0.6119 (0.6211) grad_norm 0.1515 (0.1502) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:16:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [603/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7528) loss 0.6434 (0.6216) grad_norm 0.1556 (0.1510) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:16:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 603 training takes 0:05:02 [2024-03-10 19:16:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [604/800][0/402] eta 0:22:40 lr 0.000025 time 3.3845 (3.3845) loss 0.6138 (0.6138) grad_norm 0.1728 (0.1728) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:17:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [604/800][100/402] eta 0:03:53 lr 0.000025 time 0.7465 (0.7721) 
loss 0.6328 (0.6212) grad_norm 0.1522 (0.1490) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:18:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [604/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7591) loss 0.6396 (0.6204) grad_norm 0.1347 (0.1509) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:19:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [604/800][300/402] eta 0:01:16 lr 0.000025 time 0.7461 (0.7548) loss 0.6200 (0.6217) grad_norm 0.1331 (0.1511) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:21:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [604/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7525) loss 0.6173 (0.6221) grad_norm 0.1372 (0.1504) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 19:21:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 604 training takes 0:05:02 [2024-03-10 19:21:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [605/800][0/402] eta 0:23:42 lr 0.000025 time 3.5378 (3.5378) loss 0.6379 (0.6379) grad_norm 0.1239 (0.1239) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:22:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [605/800][100/402] eta 0:03:53 lr 0.000025 time 0.7470 (0.7745) loss 0.6387 (0.6195) grad_norm 0.1321 (0.1471) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:23:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [605/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7605) loss 0.6339 (0.6210) grad_norm 0.1431 (0.1498) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:25:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [605/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7558) loss 0.6172 (0.6207) grad_norm 0.1687 (0.1503) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:26:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [605/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7533) loss 0.6110 (0.6207) grad_norm 0.1549 (0.1502) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:26:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 605 training takes 0:05:02 [2024-03-10 19:26:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [606/800][0/402] eta 0:34:11 lr 0.000025 time 5.1037 (5.1037) loss 0.6433 (0.6433) grad_norm 0.1315 (0.1315) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:27:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [606/800][100/402] eta 0:03:58 lr 0.000025 time 0.7459 (0.7893) loss 0.6401 (0.6242) grad_norm 0.1729 (0.1504) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:28:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [606/800][200/402] eta 0:02:35 lr 0.000025 time 0.7458 (0.7678) loss 0.6096 (0.6229) grad_norm 0.1444 (0.1517) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:30:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [606/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7606) loss 0.6255 (0.6226) grad_norm 0.1561 (0.1522) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:31:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [606/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7569) loss 0.6373 (0.6217) grad_norm 0.1363 (0.1514) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:31:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 606 training takes 0:05:04 [2024-03-10 19:31:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [607/800][0/402] eta 0:22:51 lr 0.000025 time 3.4122 (3.4122) loss 0.5779 (0.5779) grad_norm 0.1666 (0.1666) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:32:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [607/800][100/402] eta 0:03:53 lr 0.000025 time 0.7465 
(0.7730) loss 0.6324 (0.6246) grad_norm 0.1440 (0.1513) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:33:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [607/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7597) loss 0.6077 (0.6232) grad_norm 0.1646 (0.1496) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:35:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [607/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7552) loss 0.6372 (0.6231) grad_norm 0.1346 (0.1492) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:36:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [607/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7528) loss 0.6115 (0.6225) grad_norm 0.1413 (0.1499) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:36:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 607 training takes 0:05:02 [2024-03-10 19:36:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [608/800][0/402] eta 0:22:38 lr 0.000025 time 3.3803 (3.3803) loss 0.6255 (0.6255) grad_norm 0.1232 (0.1232) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:37:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [608/800][100/402] eta 0:03:53 lr 0.000025 time 0.7473 (0.7721) loss 0.6097 (0.6227) grad_norm 0.1605 (0.1479) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:38:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [608/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7592) loss 0.6450 (0.6229) grad_norm 0.1716 (0.1524) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:40:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [608/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7548) loss 0.6334 (0.6240) grad_norm 0.1401 (0.1513) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:41:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [608/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7526) loss 0.6093 (0.6235) grad_norm 0.1760 (0.1511) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:41:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 608 training takes 0:05:02 [2024-03-10 19:41:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [609/800][0/402] eta 0:23:17 lr 0.000025 time 3.4764 (3.4764) loss 0.6457 (0.6457) grad_norm 0.1550 (0.1550) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:42:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [609/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7730) loss 0.6077 (0.6205) grad_norm 0.1333 (0.1499) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:43:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [609/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7600) loss 0.6473 (0.6217) grad_norm 0.1483 (0.1485) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:45:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [609/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7553) loss 0.6075 (0.6212) grad_norm 0.1569 (0.1485) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:46:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [609/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7530) loss 0.6132 (0.6219) grad_norm 0.1266 (inf) loss_scale 524288.0000 (528210.3541) mem 28968MB [2024-03-10 19:46:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 609 training takes 0:05:02 [2024-03-10 19:46:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [610/800][0/402] eta 0:23:04 lr 0.000025 time 3.4431 (3.4431) loss 0.6329 (0.6329) grad_norm 0.1488 (0.1488) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:47:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [610/800][100/402] eta 0:03:53 lr 0.000025 time 
0.7463 (0.7728) loss 0.6153 (0.6249) grad_norm 0.1640 (0.1458) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:49:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [610/800][200/402] eta 0:02:33 lr 0.000025 time 0.7466 (0.7595) loss 0.6316 (0.6219) grad_norm 0.1473 (0.1471) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:50:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [610/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7550) loss 0.6419 (0.6217) grad_norm 0.1412 (0.1488) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:51:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [610/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7527) loss 0.6382 (0.6220) grad_norm 0.1374 (0.1491) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:51:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 610 training takes 0:05:02 [2024-03-10 19:51:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [611/800][0/402] eta 0:36:32 lr 0.000025 time 5.4535 (5.4535) loss 0.6219 (0.6219) grad_norm 0.1296 (0.1296) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:52:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [611/800][100/402] eta 0:03:59 lr 0.000025 time 0.7457 (0.7930) loss 0.6516 (0.6233) grad_norm 0.1277 (0.1543) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:54:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [611/800][200/402] eta 0:02:35 lr 0.000025 time 0.7455 (0.7699) loss 0.6124 (0.6216) grad_norm 0.1871 (0.1540) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:55:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [611/800][300/402] eta 0:01:17 lr 0.000025 time 0.7460 (0.7622) loss 0.6197 (0.6232) grad_norm 0.1502 (0.1529) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:56:35 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [611/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7582) loss 0.6216 (0.6225) grad_norm 0.1340 (0.1525) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:56:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 611 training takes 0:05:04 [2024-03-10 19:56:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [612/800][0/402] eta 0:24:33 lr 0.000025 time 3.6650 (3.6650) loss 0.5953 (0.5953) grad_norm 0.1262 (0.1262) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:57:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [612/800][100/402] eta 0:03:54 lr 0.000025 time 0.7457 (0.7754) loss 0.6480 (0.6253) grad_norm 0.1543 (0.1506) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 19:59:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [612/800][200/402] eta 0:02:33 lr 0.000025 time 0.7480 (0.7609) loss 0.6079 (0.6220) grad_norm 0.1686 (0.1513) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:00:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [612/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7561) loss 0.6126 (0.6215) grad_norm 0.1418 (0.1512) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:01:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [612/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7536) loss 0.6398 (0.6220) grad_norm 0.1608 (0.1516) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:01:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 612 training takes 0:05:03 [2024-03-10 20:01:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [613/800][0/402] eta 0:23:55 lr 0.000025 time 3.5704 (3.5704) loss 0.6189 (0.6189) grad_norm 0.1467 (0.1467) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:02:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [613/800][100/402] eta 
0:03:53 lr 0.000025 time 0.7460 (0.7742) loss 0.6294 (0.6233) grad_norm 0.1700 (0.1492) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:04:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [613/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7602) loss 0.6381 (0.6228) grad_norm 0.1605 (0.1505) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:05:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [613/800][300/402] eta 0:01:17 lr 0.000025 time 0.7466 (0.7555) loss 0.6160 (0.6228) grad_norm 0.1626 (0.1505) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:06:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [613/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7532) loss 0.6225 (0.6223) grad_norm 0.1343 (0.1508) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:06:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 613 training takes 0:05:02 [2024-03-10 20:06:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [614/800][0/402] eta 0:24:16 lr 0.000025 time 3.6228 (3.6228) loss 0.6094 (0.6094) grad_norm 0.1560 (0.1560) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:08:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [614/800][100/402] eta 0:03:53 lr 0.000025 time 0.7463 (0.7746) loss 0.6234 (0.6222) grad_norm 0.1311 (0.1491) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:09:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [614/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7608) loss 0.6266 (0.6211) grad_norm 0.1672 (0.1525) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:10:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [614/800][300/402] eta 0:01:17 lr 0.000025 time 0.7497 (0.7561) loss 0.6085 (0.6215) grad_norm 0.1793 (0.1529) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:11:44 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [614/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7536) loss 0.6021 (0.6214) grad_norm 0.1754 (0.1516) loss_scale 1048576.0000 (546514.6733) mem 28968MB [2024-03-10 20:11:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 614 training takes 0:05:03 [2024-03-10 20:11:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [615/800][0/402] eta 0:24:01 lr 0.000025 time 3.5856 (3.5856) loss 0.6483 (0.6483) grad_norm 0.1589 (0.1589) loss_scale 1048576.0000 (1048576.0000) mem 28968MB [2024-03-10 20:13:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [615/800][100/402] eta 0:03:53 lr 0.000025 time 0.7457 (0.7745) loss 0.6181 (0.6199) grad_norm 0.1624 (0.1508) loss_scale 1048576.0000 (1048576.0000) mem 28968MB [2024-03-10 20:14:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [615/800][200/402] eta 0:02:33 lr 0.000025 time 0.7469 (0.7607) loss 0.6223 (0.6196) grad_norm 0.1861 (inf) loss_scale 524288.0000 (842512.5572) mem 28968MB [2024-03-10 20:15:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [615/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7559) loss 0.6433 (0.6205) grad_norm 0.1445 (inf) loss_scale 524288.0000 (736790.1130) mem 28968MB [2024-03-10 20:16:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [615/800][400/402] eta 0:00:01 lr 0.000025 time 0.7454 (0.7535) loss 0.6282 (0.6207) grad_norm 0.1670 (inf) loss_scale 524288.0000 (683797.0673) mem 28968MB [2024-03-10 20:16:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 615 training takes 0:05:03 [2024-03-10 20:16:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [616/800][0/402] eta 0:37:48 lr 0.000025 time 5.6436 (5.6436) loss 0.6479 (0.6479) grad_norm 0.1445 (0.1445) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:18:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[616/800][100/402] eta 0:04:00 lr 0.000025 time 0.7468 (0.7950) loss 0.6061 (0.6226) grad_norm 0.1708 (0.1504) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:19:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [616/800][200/402] eta 0:02:35 lr 0.000025 time 0.7462 (0.7708) loss 0.6034 (0.6220) grad_norm 0.1609 (0.1495) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:20:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [616/800][300/402] eta 0:01:17 lr 0.000025 time 0.7491 (0.7630) loss 0.5920 (0.6215) grad_norm 0.1497 (0.1497) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:21:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [616/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7588) loss 0.6357 (0.6221) grad_norm 0.1505 (0.1503) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:21:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 616 training takes 0:05:05 [2024-03-10 20:21:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [617/800][0/402] eta 0:23:49 lr 0.000025 time 3.5557 (3.5557) loss 0.6196 (0.6196) grad_norm 0.1432 (0.1432) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:23:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [617/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7740) loss 0.6374 (0.6194) grad_norm 0.1137 (0.1484) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:24:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [617/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7601) loss 0.6210 (0.6196) grad_norm 0.1816 (0.1508) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:25:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [617/800][300/402] eta 0:01:17 lr 0.000025 time 0.7460 (0.7556) loss 0.6193 (0.6209) grad_norm 0.1455 (0.1511) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 
20:26:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [617/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7532) loss 0.6246 (0.6217) grad_norm 0.1355 (0.1509) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:26:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 617 training takes 0:05:02 [2024-03-10 20:26:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [618/800][0/402] eta 0:24:09 lr 0.000025 time 3.6069 (3.6069) loss 0.6035 (0.6035) grad_norm 0.1553 (0.1553) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:28:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [618/800][100/402] eta 0:03:53 lr 0.000025 time 0.7464 (0.7743) loss 0.6436 (0.6231) grad_norm 0.1471 (0.1541) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:29:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [618/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7606) loss 0.6394 (0.6231) grad_norm 0.1515 (0.1539) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:30:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [618/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7558) loss 0.6285 (0.6240) grad_norm 0.1396 (0.1528) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:31:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [618/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7534) loss 0.6201 (0.6224) grad_norm 0.1353 (0.1520) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:31:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 618 training takes 0:05:02 [2024-03-10 20:32:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [619/800][0/402] eta 0:23:14 lr 0.000025 time 3.4699 (3.4699) loss 0.6348 (0.6348) grad_norm 0.1473 (0.1473) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:33:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [619/800][100/402] eta 0:03:53 lr 0.000025 time 0.7465 (0.7746) loss 0.6293 (0.6253) grad_norm 0.1560 (0.1484) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:34:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [619/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7606) loss 0.6482 (0.6238) grad_norm 0.1596 (0.1496) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:35:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [619/800][300/402] eta 0:01:17 lr 0.000025 time 0.7467 (0.7560) loss 0.6354 (0.6237) grad_norm 0.1471 (0.1497) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:37:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [619/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7535) loss 0.5861 (0.6230) grad_norm 0.1587 (0.1501) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:37:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 619 training takes 0:05:03 [2024-03-10 20:37:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [620/800][0/402] eta 0:23:31 lr 0.000025 time 3.5102 (3.5102) loss 0.5700 (0.5700) grad_norm 0.1644 (0.1644) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:38:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [620/800][100/402] eta 0:03:53 lr 0.000025 time 0.7461 (0.7735) loss 0.6129 (0.6224) grad_norm 0.1598 (0.1512) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:39:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [620/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7599) loss 0.6277 (0.6226) grad_norm 0.1338 (0.1499) loss_scale 1048576.0000 (756435.4229) mem 28968MB [2024-03-10 20:40:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [620/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7555) loss 0.5791 (0.6212) grad_norm 0.1963 (inf) loss_scale 524288.0000 (797753.8339) mem 28968MB 
[2024-03-10 20:42:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [620/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7531) loss 0.6196 (0.6220) grad_norm 0.1407 (inf) loss_scale 524288.0000 (729557.8653) mem 28968MB [2024-03-10 20:42:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 620 training takes 0:05:02 [2024-03-10 20:42:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [621/800][0/402] eta 0:37:33 lr 0.000025 time 5.6053 (5.6053) loss 0.5992 (0.5992) grad_norm 0.1421 (0.1421) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:43:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [621/800][100/402] eta 0:03:59 lr 0.000025 time 0.7465 (0.7946) loss 0.6139 (0.6219) grad_norm 0.1664 (0.1484) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:44:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [621/800][200/402] eta 0:02:35 lr 0.000025 time 0.7466 (0.7708) loss 0.6257 (0.6232) grad_norm 0.1679 (0.1491) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:45:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [621/800][300/402] eta 0:01:17 lr 0.000025 time 0.7465 (0.7625) loss 0.6378 (0.6221) grad_norm 0.1556 (0.1497) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:47:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [621/800][400/402] eta 0:00:01 lr 0.000025 time 0.7438 (0.7585) loss 0.6473 (0.6225) grad_norm 0.1446 (0.1501) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:47:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 621 training takes 0:05:05 [2024-03-10 20:47:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [622/800][0/402] eta 0:23:45 lr 0.000025 time 3.5461 (3.5461) loss 0.6014 (0.6014) grad_norm 0.1377 (0.1377) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:48:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [622/800][100/402] eta 0:03:53 lr 0.000025 time 0.7473 (0.7742) loss 0.6164 (0.6212) grad_norm 0.1424 (0.1487) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:49:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [622/800][200/402] eta 0:02:33 lr 0.000025 time 0.7475 (0.7605) loss 0.6162 (0.6206) grad_norm 0.1507 (0.1507) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:50:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [622/800][300/402] eta 0:01:17 lr 0.000025 time 0.7464 (0.7562) loss 0.6187 (0.6217) grad_norm 0.1623 (0.1518) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:52:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [622/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7537) loss 0.6176 (0.6214) grad_norm 0.1326 (0.1521) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:52:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 622 training takes 0:05:03 [2024-03-10 20:52:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [623/800][0/402] eta 0:24:41 lr 0.000025 time 3.6855 (3.6855) loss 0.5912 (0.5912) grad_norm 0.1446 (0.1446) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:53:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [623/800][100/402] eta 0:03:54 lr 0.000025 time 0.7459 (0.7755) loss 0.6280 (0.6194) grad_norm 0.1598 (0.1531) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:54:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [623/800][200/402] eta 0:02:33 lr 0.000025 time 0.7475 (0.7611) loss 0.6399 (0.6213) grad_norm 0.1410 (0.1525) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:56:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [623/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7564) loss 0.6426 (0.6222) grad_norm 0.1253 (0.1517) loss_scale 524288.0000 (524288.0000) mem 
28968MB [2024-03-10 20:57:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [623/800][400/402] eta 0:00:01 lr 0.000025 time 0.7437 (0.7539) loss 0.6176 (0.6217) grad_norm 0.1703 (0.1519) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:57:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 623 training takes 0:05:03 [2024-03-10 20:57:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [624/800][0/402] eta 0:25:02 lr 0.000025 time 3.7372 (3.7372) loss 0.6021 (0.6021) grad_norm 0.1609 (0.1609) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:58:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [624/800][100/402] eta 0:03:54 lr 0.000025 time 0.7454 (0.7759) loss 0.5898 (0.6215) grad_norm 0.1382 (0.1530) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 20:59:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [624/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7611) loss 0.6367 (0.6226) grad_norm 0.1414 (0.1512) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:01:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [624/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7561) loss 0.6152 (0.6223) grad_norm 0.1556 (0.1523) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:02:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [624/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7536) loss 0.6316 (0.6225) grad_norm 0.1377 (0.1517) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:02:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 624 training takes 0:05:03 [2024-03-10 21:02:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [625/800][0/402] eta 0:24:55 lr 0.000025 time 3.7198 (3.7198) loss 0.5991 (0.5991) grad_norm 0.1574 (0.1574) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:03:37 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [625/800][100/402] eta 0:03:54 lr 0.000025 time 0.7460 (0.7764) loss 0.6108 (0.6242) grad_norm 0.1479 (0.1526) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:04:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [625/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7614) loss 0.6337 (0.6215) grad_norm 0.1665 (0.1520) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:06:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [625/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7563) loss 0.6462 (0.6204) grad_norm 0.1398 (inf) loss_scale 524288.0000 (571317.1561) mem 28968MB [2024-03-10 21:07:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [625/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7539) loss 0.6162 (0.6210) grad_norm 0.1392 (inf) loss_scale 524288.0000 (559589.1870) mem 28968MB [2024-03-10 21:07:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 625 training takes 0:05:03 [2024-03-10 21:07:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [626/800][0/402] eta 0:38:04 lr 0.000025 time 5.6820 (5.6820) loss 0.6124 (0.6124) grad_norm 0.1391 (0.1391) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:08:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [626/800][100/402] eta 0:04:00 lr 0.000025 time 0.7465 (0.7954) loss 0.6281 (0.6212) grad_norm 0.1324 (0.1507) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:09:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [626/800][200/402] eta 0:02:35 lr 0.000025 time 0.7467 (0.7710) loss 0.6106 (0.6220) grad_norm 0.1622 (0.1517) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:11:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [626/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7629) loss 0.6221 (0.6221) grad_norm 0.1609 (0.1512) loss_scale 524288.0000 
(524288.0000) mem 28968MB [2024-03-10 21:12:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [626/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7588) loss 0.5854 (0.6215) grad_norm 0.1793 (0.1516) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:12:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 626 training takes 0:05:05 [2024-03-10 21:12:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [627/800][0/402] eta 0:23:54 lr 0.000025 time 3.5676 (3.5676) loss 0.6317 (0.6317) grad_norm 0.1520 (0.1520) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:13:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [627/800][100/402] eta 0:03:54 lr 0.000025 time 0.7460 (0.7750) loss 0.6596 (0.6213) grad_norm 0.1483 (0.1534) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:15:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [627/800][200/402] eta 0:02:33 lr 0.000025 time 0.7468 (0.7607) loss 0.5971 (0.6217) grad_norm 0.1544 (0.1522) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:16:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [627/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7559) loss 0.6084 (0.6226) grad_norm 0.1649 (0.1534) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:17:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [627/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7535) loss 0.6133 (0.6220) grad_norm 0.1513 (inf) loss_scale 262144.0000 (482449.5561) mem 28968MB [2024-03-10 21:17:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 627 training takes 0:05:03 [2024-03-10 21:17:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [628/800][0/402] eta 0:24:39 lr 0.000025 time 3.6809 (3.6809) loss 0.6623 (0.6623) grad_norm 0.1388 (0.1388) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:18:49 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [628/800][100/402] eta 0:03:54 lr 0.000025 time 0.7456 (0.7754) loss 0.6510 (0.6237) grad_norm 0.1600 (0.1489) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:20:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [628/800][200/402] eta 0:02:33 lr 0.000025 time 0.7475 (0.7610) loss 0.6302 (0.6233) grad_norm 0.1557 (0.1514) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:21:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [628/800][300/402] eta 0:01:17 lr 0.000025 time 0.7468 (0.7562) loss 0.6138 (0.6226) grad_norm 0.1602 (0.1506) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:22:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [628/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7537) loss 0.6387 (0.6226) grad_norm 0.1547 (0.1499) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:22:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 628 training takes 0:05:03 [2024-03-10 21:22:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [629/800][0/402] eta 0:24:06 lr 0.000025 time 3.5973 (3.5973) loss 0.5991 (0.5991) grad_norm 0.1576 (0.1576) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:23:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [629/800][100/402] eta 0:03:53 lr 0.000025 time 0.7488 (0.7746) loss 0.6279 (0.6206) grad_norm 0.1584 (0.1558) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:25:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [629/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7607) loss 0.6231 (0.6208) grad_norm 0.1191 (0.1535) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:26:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [629/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7559) loss 0.6556 (0.6207) grad_norm 0.1231 (0.1520) 
loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:27:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [629/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7535) loss 0.6047 (0.6210) grad_norm 0.1656 (0.1515) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:27:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 629 training takes 0:05:02 [2024-03-10 21:27:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [630/800][0/402] eta 0:24:29 lr 0.000025 time 3.6560 (3.6560) loss 0.6323 (0.6323) grad_norm 0.1652 (0.1652) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:28:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [630/800][100/402] eta 0:03:54 lr 0.000025 time 0.7461 (0.7749) loss 0.6214 (0.6219) grad_norm 0.1250 (0.1489) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:30:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [630/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7605) loss 0.6178 (0.6210) grad_norm 0.1464 (0.1478) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:31:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [630/800][300/402] eta 0:01:17 lr 0.000025 time 0.7468 (0.7558) loss 0.6161 (0.6211) grad_norm 0.1632 (0.1488) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:32:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [630/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7534) loss 0.6129 (0.6215) grad_norm 0.1428 (0.1495) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:32:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 630 training takes 0:05:02 [2024-03-10 21:32:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [631/800][0/402] eta 0:37:59 lr 0.000025 time 5.6692 (5.6692) loss 0.6267 (0.6267) grad_norm 0.1329 (0.1329) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 
21:34:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [631/800][100/402] eta 0:04:00 lr 0.000025 time 0.7493 (0.7953) loss 0.6101 (0.6201) grad_norm 0.1507 (0.1509) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:35:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [631/800][200/402] eta 0:02:35 lr 0.000025 time 0.7467 (0.7709) loss 0.6173 (0.6206) grad_norm 0.1575 (0.1534) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:36:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [631/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7628) loss 0.6483 (0.6207) grad_norm 0.1512 (0.1525) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:37:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [631/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7586) loss 0.6284 (0.6201) grad_norm 0.1590 (0.1524) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:37:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 631 training takes 0:05:05 [2024-03-10 21:37:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [632/800][0/402] eta 0:24:29 lr 0.000025 time 3.6548 (3.6548) loss 0.6024 (0.6024) grad_norm 0.1608 (0.1608) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:39:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [632/800][100/402] eta 0:03:54 lr 0.000025 time 0.7469 (0.7753) loss 0.6143 (0.6205) grad_norm 0.1372 (0.1487) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:40:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [632/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7610) loss 0.6161 (0.6213) grad_norm 0.1730 (0.1496) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:41:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [632/800][300/402] eta 0:01:17 lr 0.000025 time 0.7464 (0.7561) loss 0.6246 (0.6216) grad_norm 0.1278 
(0.1498) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 21:42:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [632/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7537) loss 0.6172 (0.6215) grad_norm 0.1740 (0.1514) loss_scale 524288.0000 (310519.7007) mem 28968MB [2024-03-10 21:42:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 632 training takes 0:05:03 [2024-03-10 21:42:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [633/800][0/402] eta 0:25:05 lr 0.000025 time 3.7458 (3.7458) loss 0.6412 (0.6412) grad_norm 0.1454 (0.1454) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:44:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [633/800][100/402] eta 0:03:54 lr 0.000025 time 0.7458 (0.7761) loss 0.6377 (0.6205) grad_norm 0.1567 (0.1519) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:45:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [633/800][200/402] eta 0:02:33 lr 0.000025 time 0.7493 (0.7613) loss 0.6305 (0.6208) grad_norm 0.1593 (0.1514) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:46:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [633/800][300/402] eta 0:01:17 lr 0.000025 time 0.7470 (0.7563) loss 0.6164 (0.6206) grad_norm 0.1147 (0.1508) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:47:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [633/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7537) loss 0.6313 (0.6208) grad_norm 0.1595 (0.1508) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:47:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 633 training takes 0:05:03 [2024-03-10 21:47:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [634/800][0/402] eta 0:24:03 lr 0.000025 time 3.5908 (3.5908) loss 0.6205 (0.6205) grad_norm 0.1443 (0.1443) loss_scale 524288.0000 (524288.0000) mem 28968MB 
[2024-03-10 21:49:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [634/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7746) loss 0.6332 (0.6244) grad_norm 0.1473 (0.1516) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:50:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [634/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7605) loss 0.6320 (0.6226) grad_norm 0.1456 (0.1511) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:51:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [634/800][300/402] eta 0:01:17 lr 0.000025 time 0.7502 (0.7558) loss 0.5817 (0.6215) grad_norm 0.1821 (0.1513) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:52:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [634/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7534) loss 0.6354 (0.6216) grad_norm 0.1320 (0.1508) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:52:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 634 training takes 0:05:02 [2024-03-10 21:52:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [635/800][0/402] eta 0:25:26 lr 0.000025 time 3.7969 (3.7969) loss 0.6210 (0.6210) grad_norm 0.1460 (0.1460) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:54:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [635/800][100/402] eta 0:03:54 lr 0.000025 time 0.7471 (0.7775) loss 0.6084 (0.6203) grad_norm 0.1494 (0.1519) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:55:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [635/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7621) loss 0.5898 (0.6206) grad_norm 0.1652 (0.1524) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:56:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [635/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7569) loss 0.6096 (0.6213) 
grad_norm 0.1609 (0.1512) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:57:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [635/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7543) loss 0.6124 (0.6211) grad_norm 0.1566 (0.1508) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:57:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 635 training takes 0:05:03 [2024-03-10 21:58:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [636/800][0/402] eta 0:37:06 lr 0.000025 time 5.5393 (5.5393) loss 0.6103 (0.6103) grad_norm 0.1480 (0.1480) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 21:59:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [636/800][100/402] eta 0:03:59 lr 0.000025 time 0.7460 (0.7936) loss 0.6027 (0.6243) grad_norm 0.1502 (0.1490) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:00:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [636/800][200/402] eta 0:02:35 lr 0.000025 time 0.7457 (0.7698) loss 0.5888 (0.6231) grad_norm 0.1545 (0.1507) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:01:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [636/800][300/402] eta 0:01:17 lr 0.000025 time 0.7452 (0.7618) loss 0.6139 (0.6232) grad_norm 0.1554 (0.1503) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:03:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [636/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7577) loss 0.6336 (0.6222) grad_norm 0.1793 (inf) loss_scale 262144.0000 (465452.6883) mem 28968MB [2024-03-10 22:03:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 636 training takes 0:05:04 [2024-03-10 22:03:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [637/800][0/402] eta 0:22:52 lr 0.000025 time 3.4145 (3.4145) loss 0.6576 (0.6576) grad_norm 0.1300 (0.1300) loss_scale 262144.0000 (262144.0000) mem 
28968MB [2024-03-10 22:04:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [637/800][100/402] eta 0:03:53 lr 0.000025 time 0.7479 (0.7721) loss 0.6379 (0.6235) grad_norm 0.1466 (0.1546) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:05:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [637/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7593) loss 0.6236 (0.6226) grad_norm 0.1592 (0.1517) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:06:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [637/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7550) loss 0.6211 (0.6214) grad_norm 0.1320 (0.1505) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:08:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [637/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7528) loss 0.6281 (0.6213) grad_norm 0.1306 (0.1509) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:08:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 637 training takes 0:05:02 [2024-03-10 22:08:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [638/800][0/402] eta 0:22:05 lr 0.000025 time 3.2967 (3.2967) loss 0.6345 (0.6345) grad_norm 0.1463 (0.1463) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:09:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [638/800][100/402] eta 0:03:52 lr 0.000025 time 0.7461 (0.7712) loss 0.6206 (0.6209) grad_norm 0.1488 (0.1491) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:10:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [638/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7586) loss 0.6148 (0.6207) grad_norm 0.1644 (0.1505) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:11:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [638/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7548) loss 0.6131 
(0.6205) grad_norm 0.1696 (0.1508) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:13:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [638/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7525) loss 0.6443 (0.6206) grad_norm 0.1375 (0.1511) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:13:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 638 training takes 0:05:02 [2024-03-10 22:13:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [639/800][0/402] eta 0:22:50 lr 0.000025 time 3.4095 (3.4095) loss 0.6416 (0.6416) grad_norm 0.1555 (0.1555) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:14:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [639/800][100/402] eta 0:03:53 lr 0.000025 time 0.7465 (0.7723) loss 0.6317 (0.6245) grad_norm 0.1784 (0.1516) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:15:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [639/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7591) loss 0.6279 (0.6222) grad_norm 0.1618 (0.1510) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:16:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [639/800][300/402] eta 0:01:16 lr 0.000025 time 0.7450 (0.7548) loss 0.6053 (0.6220) grad_norm 0.1447 (0.1527) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:18:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [639/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7526) loss 0.6110 (0.6217) grad_norm 0.1588 (0.1526) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:18:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 639 training takes 0:05:02 [2024-03-10 22:18:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [640/800][0/402] eta 0:22:48 lr 0.000025 time 3.4051 (3.4051) loss 0.6527 (0.6527) grad_norm 0.1493 (0.1493) loss_scale 262144.0000 
(262144.0000) mem 28968MB [2024-03-10 22:19:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [640/800][100/402] eta 0:03:53 lr 0.000025 time 0.7458 (0.7724) loss 0.6101 (0.6215) grad_norm 0.1613 (0.1526) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:20:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [640/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7592) loss 0.6085 (0.6215) grad_norm 0.1760 (0.1512) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:21:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [640/800][300/402] eta 0:01:16 lr 0.000025 time 0.7463 (0.7549) loss 0.6094 (0.6210) grad_norm 0.1764 (0.1523) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:23:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [640/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7527) loss 0.6266 (0.6216) grad_norm 0.1370 (0.1532) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:23:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 640 training takes 0:05:02 [2024-03-10 22:23:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [641/800][0/402] eta 0:32:53 lr 0.000025 time 4.9090 (4.9090) loss 0.6548 (0.6548) grad_norm 0.1246 (0.1246) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:24:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [641/800][100/402] eta 0:03:57 lr 0.000025 time 0.7457 (0.7870) loss 0.5984 (0.6218) grad_norm 0.1466 (0.1504) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:25:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [641/800][200/402] eta 0:02:34 lr 0.000025 time 0.7458 (0.7665) loss 0.5997 (0.6218) grad_norm 0.1233 (0.1537) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:27:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [641/800][300/402] eta 0:01:17 lr 0.000025 time 0.7465 (0.7596) 
loss 0.6351 (0.6221) grad_norm 0.1496 (0.1535) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 22:28:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [641/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7562) loss 0.6450 (0.6219) grad_norm 0.1433 (0.1526) loss_scale 524288.0000 (327516.5686) mem 28968MB [2024-03-10 22:28:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 641 training takes 0:05:04 [2024-03-10 22:28:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [642/800][0/402] eta 0:21:26 lr 0.000025 time 3.2015 (3.2015) loss 0.5963 (0.5963) grad_norm 0.1433 (0.1433) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:29:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [642/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7709) loss 0.6056 (0.6205) grad_norm 0.1876 (0.1528) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:30:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [642/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7584) loss 0.6119 (0.6205) grad_norm 0.1628 (0.1508) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:32:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [642/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7543) loss 0.6472 (0.6215) grad_norm 0.1580 (0.1496) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:33:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [642/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7522) loss 0.6253 (0.6213) grad_norm 0.1208 (0.1496) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:33:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 642 training takes 0:05:02 [2024-03-10 22:33:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [643/800][0/402] eta 0:22:53 lr 0.000025 time 3.4159 (3.4159) loss 0.6253 (0.6253) grad_norm 0.1550 (0.1550) loss_scale 
524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:34:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [643/800][100/402] eta 0:03:53 lr 0.000025 time 0.7462 (0.7722) loss 0.6287 (0.6218) grad_norm 0.1374 (0.1516) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:35:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [643/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7591) loss 0.6437 (0.6219) grad_norm 0.1443 (0.1516) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:37:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [643/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7547) loss 0.6254 (0.6218) grad_norm 0.1630 (0.1530) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:38:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [643/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7524) loss 0.6166 (0.6221) grad_norm 0.1569 (0.1526) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:38:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 643 training takes 0:05:02 [2024-03-10 22:38:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [644/800][0/402] eta 0:22:51 lr 0.000025 time 3.4111 (3.4111) loss 0.6235 (0.6235) grad_norm 0.1778 (0.1778) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:39:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [644/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7721) loss 0.6144 (0.6250) grad_norm 0.1323 (0.1514) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:40:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [644/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7594) loss 0.6227 (0.6245) grad_norm 0.1298 (0.1505) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:42:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [644/800][300/402] eta 0:01:16 lr 0.000025 time 
0.7457 (0.7548) loss 0.6301 (0.6231) grad_norm 0.1370 (0.1519) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:43:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [644/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7525) loss 0.6459 (0.6228) grad_norm 0.1347 (0.1526) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:43:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 644 training takes 0:05:02 [2024-03-10 22:43:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [645/800][0/402] eta 0:22:25 lr 0.000025 time 3.3469 (3.3469) loss 0.6434 (0.6434) grad_norm 0.1561 (0.1561) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:44:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [645/800][100/402] eta 0:03:53 lr 0.000025 time 0.7465 (0.7716) loss 0.6051 (0.6246) grad_norm 0.1630 (0.1514) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:45:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [645/800][200/402] eta 0:02:33 lr 0.000025 time 0.7455 (0.7587) loss 0.6221 (0.6230) grad_norm 0.1525 (0.1514) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:47:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [645/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7544) loss 0.6484 (0.6214) grad_norm 0.1454 (0.1519) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:48:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [645/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7522) loss 0.6160 (0.6217) grad_norm 0.1213 (0.1523) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:48:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 645 training takes 0:05:02 [2024-03-10 22:48:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [646/800][0/402] eta 0:31:43 lr 0.000025 time 4.7344 (4.7344) loss 0.5992 (0.5992) grad_norm 0.1582 (0.1582) 
loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:49:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [646/800][100/402] eta 0:03:57 lr 0.000025 time 0.7463 (0.7854) loss 0.6167 (0.6205) grad_norm 0.1296 (0.1511) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:51:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [646/800][200/402] eta 0:02:34 lr 0.000025 time 0.7456 (0.7657) loss 0.6103 (0.6210) grad_norm 0.1917 (0.1503) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:52:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [646/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7591) loss 0.6003 (0.6202) grad_norm 0.1405 (0.1510) loss_scale 1048576.0000 (541706.2060) mem 28968MB [2024-03-10 22:53:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [646/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7558) loss 0.6410 (0.6208) grad_norm 0.1715 (inf) loss_scale 524288.0000 (655033.1372) mem 28968MB [2024-03-10 22:53:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 646 training takes 0:05:03 [2024-03-10 22:53:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [647/800][0/402] eta 0:22:50 lr 0.000025 time 3.4095 (3.4095) loss 0.6384 (0.6384) grad_norm 0.1628 (0.1628) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:54:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [647/800][100/402] eta 0:03:53 lr 0.000025 time 0.7460 (0.7724) loss 0.6081 (0.6230) grad_norm 0.1689 (0.1525) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:56:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [647/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7591) loss 0.6196 (0.6225) grad_norm 0.1411 (0.1534) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:57:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [647/800][300/402] eta 0:01:16 lr 0.000025 
time 0.7464 (0.7547) loss 0.6424 (0.6216) grad_norm 0.1538 (0.1527) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:58:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [647/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7524) loss 0.5991 (0.6220) grad_norm 0.1396 (0.1525) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:58:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 647 training takes 0:05:02 [2024-03-10 22:58:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [648/800][0/402] eta 0:22:11 lr 0.000025 time 3.3121 (3.3121) loss 0.6262 (0.6262) grad_norm 0.1770 (0.1770) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 22:59:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [648/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7711) loss 0.6369 (0.6209) grad_norm 0.1395 (0.1496) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:01:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [648/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7585) loss 0.6168 (0.6211) grad_norm 0.1116 (0.1491) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:02:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [648/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7543) loss 0.6270 (0.6199) grad_norm 0.1440 (0.1489) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:03:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [648/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7522) loss 0.6241 (0.6202) grad_norm 0.1769 (0.1499) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:03:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 648 training takes 0:05:02 [2024-03-10 23:03:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [649/800][0/402] eta 0:22:46 lr 0.000025 time 3.3980 (3.3980) loss 0.6212 (0.6212) grad_norm 0.1753 
(0.1753) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:04:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [649/800][100/402] eta 0:03:53 lr 0.000025 time 0.7457 (0.7720) loss 0.5976 (0.6192) grad_norm 0.1695 (0.1520) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:06:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [649/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7591) loss 0.6086 (0.6194) grad_norm 0.1526 (0.1519) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:07:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [649/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7547) loss 0.6185 (0.6213) grad_norm 0.1803 (0.1512) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:08:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [649/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7524) loss 0.6021 (0.6216) grad_norm 0.1652 (0.1513) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:08:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 649 training takes 0:05:02 [2024-03-10 23:08:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [650/800][0/402] eta 0:22:13 lr 0.000025 time 3.3173 (3.3173) loss 0.6146 (0.6146) grad_norm 0.1412 (0.1412) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:09:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [650/800][100/402] eta 0:03:52 lr 0.000025 time 0.7452 (0.7712) loss 0.6207 (0.6216) grad_norm 0.1553 (0.1497) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:11:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [650/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7585) loss 0.6230 (0.6220) grad_norm 0.1541 (0.1536) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:12:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [650/800][300/402] eta 0:01:16 
lr 0.000025 time 0.7457 (0.7543) loss 0.6281 (0.6224) grad_norm 0.1663 (0.1520) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:13:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [650/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7521) loss 0.5929 (0.6218) grad_norm 0.1452 (0.1529) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:13:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 650 training takes 0:05:02 [2024-03-10 23:13:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [651/800][0/402] eta 0:32:30 lr 0.000025 time 4.8513 (4.8513) loss 0.6395 (0.6395) grad_norm 0.1316 (0.1316) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:15:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [651/800][100/402] eta 0:03:57 lr 0.000025 time 0.7458 (0.7864) loss 0.6320 (0.6220) grad_norm 0.1490 (0.1494) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:16:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [651/800][200/402] eta 0:02:34 lr 0.000025 time 0.7457 (0.7663) loss 0.6349 (0.6211) grad_norm 0.1539 (0.1494) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:17:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [651/800][300/402] eta 0:01:17 lr 0.000025 time 0.7460 (0.7595) loss 0.6293 (0.6204) grad_norm 0.1608 (0.1509) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:18:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [651/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7560) loss 0.6361 (0.6208) grad_norm 0.1413 (0.1508) loss_scale 1048576.0000 (550437.0274) mem 28968MB [2024-03-10 23:18:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 651 training takes 0:05:03 [2024-03-10 23:18:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [652/800][0/402] eta 0:22:42 lr 0.000025 time 3.3896 (3.3896) loss 0.6149 (0.6149) grad_norm 
0.1455 (0.1455) loss_scale 1048576.0000 (1048576.0000) mem 28968MB [2024-03-10 23:20:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [652/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7719) loss 0.6273 (0.6220) grad_norm 0.1378 (0.1500) loss_scale 1048576.0000 (1048576.0000) mem 28968MB [2024-03-10 23:21:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [652/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7590) loss 0.6406 (0.6198) grad_norm 0.1263 (0.1518) loss_scale 1048576.0000 (1048576.0000) mem 28968MB [2024-03-10 23:22:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [652/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7546) loss 0.6382 (0.6217) grad_norm 0.1600 (inf) loss_scale 524288.0000 (1013739.5880) mem 28968MB [2024-03-10 23:23:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [652/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7523) loss 0.6014 (0.6218) grad_norm 0.1377 (inf) loss_scale 524288.0000 (891681.8354) mem 28968MB [2024-03-10 23:23:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 652 training takes 0:05:02 [2024-03-10 23:23:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [653/800][0/402] eta 0:22:02 lr 0.000025 time 3.2890 (3.2890) loss 0.5915 (0.5915) grad_norm 0.1516 (0.1516) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:25:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [653/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7709) loss 0.6295 (0.6209) grad_norm 0.1732 (0.1528) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:26:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [653/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7584) loss 0.5740 (0.6201) grad_norm 0.1539 (0.1534) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:27:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [653/800][300/402] eta 
0:01:16 lr 0.000025 time 0.7461 (0.7543) loss 0.6043 (0.6208) grad_norm 0.1561 (0.1528) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:28:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [653/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7522) loss 0.6233 (0.6215) grad_norm 0.1555 (0.1518) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:28:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 653 training takes 0:05:02 [2024-03-10 23:28:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [654/800][0/402] eta 0:21:54 lr 0.000025 time 3.2708 (3.2708) loss 0.6202 (0.6202) grad_norm 0.1463 (0.1463) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:30:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [654/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7708) loss 0.5904 (0.6224) grad_norm 0.1437 (0.1519) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:31:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [654/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7584) loss 0.5885 (0.6220) grad_norm 0.1243 (0.1522) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:32:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [654/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7542) loss 0.5916 (0.6225) grad_norm 0.1741 (0.1524) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:33:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [654/800][400/402] eta 0:00:01 lr 0.000025 time 0.7459 (0.7522) loss 0.6181 (0.6218) grad_norm 0.1533 (0.1532) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:33:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 654 training takes 0:05:02 [2024-03-10 23:33:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [655/800][0/402] eta 0:21:40 lr 0.000025 time 3.2354 (3.2354) loss 0.5984 (0.5984) 
grad_norm 0.1582 (0.1582) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-10 23:35:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [655/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7714) loss 0.6132 (0.6211) grad_norm 0.1684 (inf) loss_scale 262144.0000 (487951.2079) mem 28968MB [2024-03-10 23:36:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [655/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7587) loss 0.6073 (0.6205) grad_norm 0.1318 (inf) loss_scale 262144.0000 (375609.3134) mem 28968MB [2024-03-10 23:37:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [655/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7544) loss 0.6095 (0.6204) grad_norm 0.1572 (inf) loss_scale 262144.0000 (337913.1960) mem 28968MB [2024-03-10 23:38:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [655/800][400/402] eta 0:00:01 lr 0.000025 time 0.7440 (0.7523) loss 0.5967 (0.6205) grad_norm 0.1480 (inf) loss_scale 262144.0000 (319018.1347) mem 28968MB [2024-03-10 23:38:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 655 training takes 0:05:02 [2024-03-10 23:38:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [656/800][0/402] eta 0:33:49 lr 0.000025 time 5.0478 (5.0478) loss 0.6319 (0.6319) grad_norm 0.1542 (0.1542) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:40:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [656/800][100/402] eta 0:03:58 lr 0.000025 time 0.7454 (0.7883) loss 0.6138 (0.6221) grad_norm 0.1290 (0.1514) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:41:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [656/800][200/402] eta 0:02:34 lr 0.000025 time 0.7464 (0.7671) loss 0.6222 (0.6218) grad_norm 0.1490 (0.1515) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:42:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [656/800][300/402] eta 
0:01:17 lr 0.000025 time 0.7459 (0.7600) loss 0.5752 (0.6221) grad_norm 0.1717 (0.1529) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:43:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [656/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7564) loss 0.5855 (0.6220) grad_norm 0.1778 (0.1529) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:43:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 656 training takes 0:05:04 [2024-03-10 23:44:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [657/800][0/402] eta 0:23:00 lr 0.000025 time 3.4342 (3.4342) loss 0.6245 (0.6245) grad_norm 0.1495 (0.1495) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:45:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [657/800][100/402] eta 0:03:53 lr 0.000025 time 0.7461 (0.7724) loss 0.6403 (0.6200) grad_norm 0.1435 (0.1524) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:46:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [657/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7591) loss 0.6526 (0.6212) grad_norm 0.1660 (0.1538) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:47:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [657/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7547) loss 0.5880 (0.6221) grad_norm 0.1619 (0.1521) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:49:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [657/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7524) loss 0.6022 (0.6210) grad_norm 0.1452 (0.1531) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:49:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 657 training takes 0:05:02 [2024-03-10 23:49:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [658/800][0/402] eta 0:22:19 lr 0.000025 time 3.3317 (3.3317) loss 0.6258 (0.6258) 
grad_norm 0.1553 (0.1553) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:50:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [658/800][100/402] eta 0:03:52 lr 0.000025 time 0.7455 (0.7715) loss 0.6382 (0.6226) grad_norm 0.1532 (0.1508) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:51:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [658/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7587) loss 0.6061 (0.6219) grad_norm 0.1518 (0.1503) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:52:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [658/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7544) loss 0.5922 (0.6214) grad_norm 0.1598 (0.1508) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:54:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [658/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7522) loss 0.6205 (0.6212) grad_norm 0.1770 (0.1517) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:54:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 658 training takes 0:05:02 [2024-03-10 23:54:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [659/800][0/402] eta 0:22:23 lr 0.000025 time 3.3414 (3.3414) loss 0.6214 (0.6214) grad_norm 0.1771 (0.1771) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:55:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [659/800][100/402] eta 0:03:52 lr 0.000025 time 0.7463 (0.7714) loss 0.6022 (0.6204) grad_norm 0.1544 (0.1520) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:56:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [659/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7587) loss 0.6434 (0.6205) grad_norm 0.1398 (0.1523) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:57:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[659/800][300/402] eta 0:01:16 lr 0.000025 time 0.7468 (0.7545) loss 0.6413 (0.6197) grad_norm 0.1668 (0.1527) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:59:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [659/800][400/402] eta 0:00:01 lr 0.000025 time 0.7452 (0.7525) loss 0.6340 (0.6202) grad_norm 0.1378 (0.1531) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-10 23:59:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 659 training takes 0:05:02 [2024-03-10 23:59:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [660/800][0/402] eta 0:21:58 lr 0.000025 time 3.2802 (3.2802) loss 0.6012 (0.6012) grad_norm 0.1224 (0.1224) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:00:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [660/800][100/402] eta 0:03:52 lr 0.000025 time 0.7452 (0.7709) loss 0.6033 (0.6218) grad_norm 0.1811 (0.1494) loss_scale 524288.0000 (324435.6436) mem 28968MB [2024-03-11 00:01:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [660/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7586) loss 0.6518 (0.6208) grad_norm 0.1433 (0.1527) loss_scale 524288.0000 (423864.6766) mem 28968MB [2024-03-11 00:02:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [660/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7543) loss 0.6530 (0.6210) grad_norm 0.1391 (0.1526) loss_scale 524288.0000 (457227.9070) mem 28968MB [2024-03-11 00:04:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [660/800][400/402] eta 0:00:01 lr 0.000025 time 0.7450 (0.7522) loss 0.6153 (0.6207) grad_norm 0.1263 (0.1516) loss_scale 524288.0000 (473951.1222) mem 28968MB [2024-03-11 00:04:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 660 training takes 0:05:02 [2024-03-11 00:04:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [661/800][0/402] eta 0:32:16 lr 0.000025 time 4.8170 (4.8170) 
loss 0.6125 (0.6125) grad_norm 0.1456 (0.1456) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 00:05:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [661/800][100/402] eta 0:03:57 lr 0.000025 time 0.7460 (0.7860) loss 0.6295 (0.6219) grad_norm 0.1790 (0.1515) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 00:06:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [661/800][200/402] eta 0:02:34 lr 0.000025 time 0.7457 (0.7660) loss 0.6201 (0.6220) grad_norm 0.1634 (0.1523) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 00:07:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [661/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7593) loss 0.6203 (0.6217) grad_norm 0.1582 (inf) loss_scale 262144.0000 (505127.9734) mem 28968MB [2024-03-11 00:09:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [661/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7559) loss 0.6530 (0.6207) grad_norm 0.1491 (inf) loss_scale 262144.0000 (444533.4663) mem 28968MB [2024-03-11 00:09:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 661 training takes 0:05:03 [2024-03-11 00:09:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [662/800][0/402] eta 0:21:44 lr 0.000025 time 3.2442 (3.2442) loss 0.6070 (0.6070) grad_norm 0.1542 (0.1542) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:10:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [662/800][100/402] eta 0:03:52 lr 0.000025 time 0.7457 (0.7705) loss 0.6256 (0.6215) grad_norm 0.1299 (0.1474) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:11:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [662/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7586) loss 0.5913 (0.6225) grad_norm 0.1668 (0.1500) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:13:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[662/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7544) loss 0.6014 (0.6220) grad_norm 0.1506 (0.1505) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:14:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [662/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7522) loss 0.6132 (0.6216) grad_norm 0.1784 (0.1511) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:14:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 662 training takes 0:05:02 [2024-03-11 00:14:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [663/800][0/402] eta 0:22:13 lr 0.000025 time 3.3162 (3.3162) loss 0.6248 (0.6248) grad_norm 0.1390 (0.1390) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:15:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [663/800][100/402] eta 0:03:52 lr 0.000025 time 0.7453 (0.7712) loss 0.6180 (0.6211) grad_norm 0.1603 (0.1538) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:16:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [663/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7586) loss 0.6140 (0.6220) grad_norm 0.1652 (0.1511) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:18:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [663/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7543) loss 0.6344 (0.6224) grad_norm 0.1416 (0.1509) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:19:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [663/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7522) loss 0.5977 (0.6217) grad_norm 0.1381 (0.1511) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:19:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 663 training takes 0:05:02 [2024-03-11 00:19:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [664/800][0/402] eta 0:22:11 lr 0.000025 time 3.3115 (3.3115) 
loss 0.6185 (0.6185) grad_norm 0.1607 (0.1607) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:20:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [664/800][100/402] eta 0:03:52 lr 0.000025 time 0.7464 (0.7711) loss 0.6289 (0.6228) grad_norm 0.1243 (0.1502) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:21:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [664/800][200/402] eta 0:02:33 lr 0.000025 time 0.7453 (0.7584) loss 0.6045 (0.6224) grad_norm 0.1813 (inf) loss_scale 131072.0000 (256275.1045) mem 28968MB [2024-03-11 00:23:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [664/800][300/402] eta 0:01:16 lr 0.000025 time 0.7457 (0.7547) loss 0.6313 (0.6209) grad_norm 0.1756 (inf) loss_scale 131072.0000 (214679.3887) mem 28968MB [2024-03-11 00:24:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [664/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7524) loss 0.6091 (0.6208) grad_norm 0.1333 (inf) loss_scale 131072.0000 (193829.6658) mem 28968MB [2024-03-11 00:24:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 664 training takes 0:05:02 [2024-03-11 00:24:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [665/800][0/402] eta 0:22:20 lr 0.000025 time 3.3337 (3.3337) loss 0.6360 (0.6360) grad_norm 0.1379 (0.1379) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:25:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [665/800][100/402] eta 0:03:53 lr 0.000025 time 0.7457 (0.7716) loss 0.6204 (0.6203) grad_norm 0.1356 (0.1501) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:26:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [665/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7587) loss 0.5761 (0.6205) grad_norm 0.1716 (0.1513) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:28:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[665/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7544) loss 0.6152 (0.6201) grad_norm 0.1464 (0.1513) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:29:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [665/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7522) loss 0.6007 (0.6206) grad_norm 0.1550 (0.1511) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:29:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 665 training takes 0:05:02 [2024-03-11 00:29:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [666/800][0/402] eta 0:32:49 lr 0.000025 time 4.8987 (4.8987) loss 0.6304 (0.6304) grad_norm 0.1333 (0.1333) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:30:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [666/800][100/402] eta 0:03:57 lr 0.000025 time 0.7470 (0.7877) loss 0.6402 (0.6220) grad_norm 0.1441 (0.1523) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:31:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [666/800][200/402] eta 0:02:34 lr 0.000025 time 0.7461 (0.7670) loss 0.6221 (0.6203) grad_norm 0.1667 (0.1522) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:33:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [666/800][300/402] eta 0:01:17 lr 0.000025 time 0.7464 (0.7601) loss 0.6315 (0.6205) grad_norm 0.1600 (0.1526) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:34:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [666/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7568) loss 0.6476 (0.6208) grad_norm 0.1283 (0.1518) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:34:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 666 training takes 0:05:04 [2024-03-11 00:34:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [667/800][0/402] eta 0:22:06 lr 0.000025 time 3.2986 (3.2986) 
loss 0.6393 (0.6393) grad_norm 0.1247 (0.1247) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:35:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [667/800][100/402] eta 0:03:52 lr 0.000025 time 0.7460 (0.7711) loss 0.6418 (0.6211) grad_norm 0.1600 (0.1560) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:36:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [667/800][200/402] eta 0:02:33 lr 0.000025 time 0.7456 (0.7586) loss 0.6169 (0.6204) grad_norm 0.1346 (0.1538) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:38:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [667/800][300/402] eta 0:01:16 lr 0.000025 time 0.7454 (0.7543) loss 0.6566 (0.6207) grad_norm 0.1581 (0.1532) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:39:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [667/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7521) loss 0.6373 (0.6207) grad_norm 0.1259 (0.1528) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:39:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 667 training takes 0:05:02 [2024-03-11 00:39:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [668/800][0/402] eta 0:23:09 lr 0.000025 time 3.4575 (3.4575) loss 0.6371 (0.6371) grad_norm 0.1190 (0.1190) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:40:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [668/800][100/402] eta 0:03:53 lr 0.000025 time 0.7467 (0.7726) loss 0.6499 (0.6270) grad_norm 0.1594 (0.1520) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:42:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [668/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7593) loss 0.6126 (0.6224) grad_norm 0.1884 (0.1509) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:43:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [668/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7548) loss 0.6059 (0.6221) grad_norm 0.1342 (0.1515) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:44:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [668/800][400/402] eta 0:00:01 lr 0.000025 time 0.7439 (0.7525) loss 0.6486 (0.6223) grad_norm 0.1397 (0.1514) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:44:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 668 training takes 0:05:02 [2024-03-11 00:44:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [669/800][0/402] eta 0:22:55 lr 0.000025 time 3.4212 (3.4212) loss 0.6064 (0.6064) grad_norm 0.1528 (0.1528) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:45:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [669/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7731) loss 0.5851 (0.6229) grad_norm 0.1382 (0.1511) loss_scale 131072.0000 (131072.0000) mem 28968MB [2024-03-11 00:47:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [669/800][200/402] eta 0:02:33 lr 0.000025 time 0.7470 (0.7596) loss 0.5943 (0.6240) grad_norm 0.1679 (0.1525) loss_scale 262144.0000 (143461.8905) mem 28968MB [2024-03-11 00:48:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [669/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7551) loss 0.6252 (0.6228) grad_norm 0.1543 (0.1520) loss_scale 262144.0000 (182891.1628) mem 28968MB [2024-03-11 00:49:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [669/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7527) loss 0.6252 (0.6220) grad_norm 0.1133 (0.1525) loss_scale 262144.0000 (202654.9626) mem 28968MB [2024-03-11 00:49:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 669 training takes 0:05:02 [2024-03-11 00:49:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [670/800][0/402] eta 0:22:49 lr 0.000025 time 3.4069 
(3.4069) loss 0.6066 (0.6066) grad_norm 0.1603 (0.1603) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:50:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [670/800][100/402] eta 0:03:53 lr 0.000025 time 0.7467 (0.7733) loss 0.6410 (0.6192) grad_norm 0.1776 (0.1488) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:52:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [670/800][200/402] eta 0:02:33 lr 0.000025 time 0.7465 (0.7602) loss 0.6564 (0.6209) grad_norm 0.1841 (0.1518) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:53:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [670/800][300/402] eta 0:01:17 lr 0.000025 time 0.7466 (0.7558) loss 0.6354 (0.6208) grad_norm 0.1571 (0.1536) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:54:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [670/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7535) loss 0.6422 (0.6207) grad_norm 0.1399 (0.1529) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:54:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 670 training takes 0:05:02 [2024-03-11 00:54:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [671/800][0/402] eta 0:34:33 lr 0.000025 time 5.1585 (5.1585) loss 0.6351 (0.6351) grad_norm 0.1379 (0.1379) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:55:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [671/800][100/402] eta 0:03:59 lr 0.000025 time 0.7460 (0.7914) loss 0.6157 (0.6204) grad_norm 0.1509 (0.1530) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:57:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [671/800][200/402] eta 0:02:35 lr 0.000025 time 0.7458 (0.7688) loss 0.6108 (0.6215) grad_norm 0.1581 (0.1526) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:58:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [671/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7611) loss 0.6196 (0.6218) grad_norm 0.1828 (0.1510) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:59:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [671/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7573) loss 0.6439 (0.6213) grad_norm 0.1401 (0.1517) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 00:59:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 671 training takes 0:05:04 [2024-03-11 00:59:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [672/800][0/402] eta 0:23:11 lr 0.000025 time 3.4618 (3.4618) loss 0.6154 (0.6154) grad_norm 0.1370 (0.1370) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 01:01:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [672/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7726) loss 0.6436 (0.6220) grad_norm 0.1353 (0.1517) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 01:02:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [672/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7593) loss 0.6226 (0.6227) grad_norm 0.1779 (0.1542) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 01:03:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [672/800][300/402] eta 0:01:16 lr 0.000025 time 0.7460 (0.7548) loss 0.5930 (0.6223) grad_norm 0.1544 (0.1532) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 01:04:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [672/800][400/402] eta 0:00:01 lr 0.000025 time 0.7443 (0.7525) loss 0.5922 (0.6220) grad_norm 0.1527 (0.1523) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 01:04:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 672 training takes 0:05:02 [2024-03-11 01:04:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [673/800][0/402] eta 0:22:01 lr 0.000025 time 
3.2885 (3.2885) loss 0.6143 (0.6143) grad_norm 0.1758 (0.1758) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 01:06:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [673/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7709) loss 0.6047 (0.6210) grad_norm 0.2039 (0.1509) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 01:07:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [673/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7584) loss 0.6233 (0.6206) grad_norm 0.1414 (0.1512) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 01:08:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [673/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7545) loss 0.6094 (0.6208) grad_norm 0.1572 (0.1507) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 01:09:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [673/800][400/402] eta 0:00:01 lr 0.000025 time 0.7446 (0.7523) loss 0.6232 (0.6207) grad_norm 0.1588 (0.1520) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 01:09:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 673 training takes 0:05:02 [2024-03-11 01:09:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [674/800][0/402] eta 0:22:31 lr 0.000025 time 3.3621 (3.3621) loss 0.6350 (0.6350) grad_norm 0.2007 (0.2007) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 01:11:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [674/800][100/402] eta 0:03:53 lr 0.000025 time 0.7455 (0.7717) loss 0.6641 (0.6199) grad_norm 0.1532 (0.1500) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 01:12:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [674/800][200/402] eta 0:02:33 lr 0.000025 time 0.7460 (0.7588) loss 0.6321 (0.6205) grad_norm 0.1595 (0.1493) loss_scale 524288.0000 (299965.7711) mem 28968MB [2024-03-11 01:13:34 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [674/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7545) loss 0.6387 (0.6208) grad_norm 0.1492 (0.1498) loss_scale 524288.0000 (374491.4286) mem 28968MB [2024-03-11 01:14:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [674/800][400/402] eta 0:00:01 lr 0.000025 time 0.7445 (0.7523) loss 0.6660 (0.6214) grad_norm 0.1517 (0.1511) loss_scale 524288.0000 (411847.1820) mem 28968MB [2024-03-11 01:14:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 674 training takes 0:05:02 [2024-03-11 01:14:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [675/800][0/402] eta 0:21:43 lr 0.000025 time 3.2436 (3.2436) loss 0.6486 (0.6486) grad_norm 0.1436 (0.1436) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:16:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [675/800][100/402] eta 0:03:52 lr 0.000025 time 0.7458 (0.7705) loss 0.6096 (0.6222) grad_norm 0.1338 (0.1521) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:17:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [675/800][200/402] eta 0:02:33 lr 0.000025 time 0.7462 (0.7582) loss 0.6366 (0.6209) grad_norm 0.1318 (0.1524) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:18:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [675/800][300/402] eta 0:01:16 lr 0.000025 time 0.7456 (0.7541) loss 0.6365 (0.6223) grad_norm 0.1572 (0.1523) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:19:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [675/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7522) loss 0.6140 (0.6216) grad_norm 0.1918 (0.1518) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:19:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 675 training takes 0:05:02 [2024-03-11 01:19:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [676/800][0/402] eta 
0:32:59 lr 0.000025 time 4.9251 (4.9251) loss 0.6206 (0.6206) grad_norm 0.1425 (0.1425) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:21:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [676/800][100/402] eta 0:03:57 lr 0.000025 time 0.7458 (0.7879) loss 0.5872 (0.6186) grad_norm 0.1330 (0.1528) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:22:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [676/800][200/402] eta 0:02:34 lr 0.000025 time 0.7454 (0.7671) loss 0.6298 (0.6209) grad_norm 0.1224 (0.1521) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:23:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [676/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7602) loss 0.5839 (0.6213) grad_norm 0.1828 (0.1517) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:24:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [676/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7566) loss 0.6297 (0.6212) grad_norm 0.1261 (0.1517) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:24:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 676 training takes 0:05:04 [2024-03-11 01:25:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [677/800][0/402] eta 0:22:33 lr 0.000025 time 3.3674 (3.3674) loss 0.6189 (0.6189) grad_norm 0.1300 (0.1300) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:26:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [677/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7717) loss 0.6325 (0.6210) grad_norm 0.1142 (0.1528) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:27:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [677/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7588) loss 0.5915 (0.6201) grad_norm 0.1484 (0.1537) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:28:43 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [677/800][300/402] eta 0:01:16 lr 0.000025 time 0.7458 (0.7545) loss 0.6192 (0.6212) grad_norm 0.1479 (0.1531) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:29:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [677/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7525) loss 0.6167 (0.6211) grad_norm 0.1716 (0.1529) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:29:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 677 training takes 0:05:02 [2024-03-11 01:30:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [678/800][0/402] eta 0:22:20 lr 0.000025 time 3.3353 (3.3353) loss 0.6553 (0.6553) grad_norm 0.1094 (0.1094) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:31:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [678/800][100/402] eta 0:03:53 lr 0.000025 time 0.7472 (0.7722) loss 0.6108 (0.6209) grad_norm 0.1207 (0.1478) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:32:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [678/800][200/402] eta 0:02:33 lr 0.000025 time 0.7472 (0.7597) loss 0.5989 (0.6201) grad_norm 0.1427 (0.1486) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:33:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [678/800][300/402] eta 0:01:17 lr 0.000025 time 0.7471 (0.7554) loss 0.6214 (0.6213) grad_norm 0.1684 (0.1502) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:35:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [678/800][400/402] eta 0:00:01 lr 0.000025 time 0.7457 (0.7533) loss 0.5948 (0.6211) grad_norm 0.1360 (0.1515) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:35:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 678 training takes 0:05:02 [2024-03-11 01:35:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[679/800][0/402] eta 0:21:35 lr 0.000025 time 3.2221 (3.2221) loss 0.6322 (0.6322) grad_norm 0.1291 (0.1291) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:36:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [679/800][100/402] eta 0:03:52 lr 0.000025 time 0.7456 (0.7703) loss 0.6280 (0.6187) grad_norm 0.1300 (0.1497) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:37:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [679/800][200/402] eta 0:02:33 lr 0.000025 time 0.7464 (0.7581) loss 0.6150 (0.6200) grad_norm 0.1357 (inf) loss_scale 524288.0000 (555588.7761) mem 28968MB [2024-03-11 01:38:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [679/800][300/402] eta 0:01:16 lr 0.000025 time 0.7459 (0.7540) loss 0.6127 (0.6196) grad_norm 0.1954 (inf) loss_scale 524288.0000 (545189.8472) mem 28968MB [2024-03-11 01:40:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [679/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7519) loss 0.6258 (0.6197) grad_norm 0.1202 (inf) loss_scale 524288.0000 (539977.4165) mem 28968MB [2024-03-11 01:40:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 679 training takes 0:05:02 [2024-03-11 01:40:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [680/800][0/402] eta 0:22:11 lr 0.000025 time 3.3129 (3.3129) loss 0.6048 (0.6048) grad_norm 0.1544 (0.1544) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:41:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [680/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7718) loss 0.6148 (0.6219) grad_norm 0.1251 (0.1544) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:42:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [680/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7589) loss 0.5957 (0.6220) grad_norm 0.1834 (0.1528) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:43:51 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [680/800][300/402] eta 0:01:16 lr 0.000025 time 0.7462 (0.7546) loss 0.6333 (0.6211) grad_norm 0.1360 (0.1526) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:45:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [680/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7524) loss 0.5999 (0.6210) grad_norm 0.1214 (0.1526) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:45:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 680 training takes 0:05:02 [2024-03-11 01:45:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [681/800][0/402] eta 0:34:43 lr 0.000025 time 5.1817 (5.1817) loss 0.6032 (0.6032) grad_norm 0.1343 (0.1343) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:46:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [681/800][100/402] eta 0:03:58 lr 0.000025 time 0.7489 (0.7903) loss 0.6230 (0.6207) grad_norm 0.1530 (0.1523) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:47:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [681/800][200/402] eta 0:02:35 lr 0.000025 time 0.7461 (0.7685) loss 0.6255 (0.6210) grad_norm 0.1521 (0.1522) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:48:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [681/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7613) loss 0.6633 (0.6212) grad_norm 0.1423 (0.1516) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:50:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [681/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7576) loss 0.6125 (0.6208) grad_norm 0.1465 (0.1524) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:50:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 681 training takes 0:05:04 [2024-03-11 01:50:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[682/800][0/402] eta 0:23:26 lr 0.000025 time 3.4984 (3.4984) loss 0.6289 (0.6289) grad_norm 0.1643 (0.1643) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:51:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [682/800][100/402] eta 0:03:53 lr 0.000025 time 0.7466 (0.7738) loss 0.6154 (0.6178) grad_norm 0.1949 (0.1536) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:52:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [682/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7605) loss 0.6324 (0.6201) grad_norm 0.1799 (0.1534) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:53:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [682/800][300/402] eta 0:01:17 lr 0.000025 time 0.7460 (0.7556) loss 0.6264 (0.6201) grad_norm 0.1628 (0.1534) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:55:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [682/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7532) loss 0.6178 (0.6200) grad_norm 0.1538 (0.1525) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:55:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 682 training takes 0:05:02 [2024-03-11 01:55:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [683/800][0/402] eta 0:22:40 lr 0.000025 time 3.3833 (3.3833) loss 0.6387 (0.6387) grad_norm 0.1222 (0.1222) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:56:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [683/800][100/402] eta 0:03:53 lr 0.000025 time 0.7455 (0.7720) loss 0.6430 (0.6201) grad_norm 0.1344 (0.1538) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 01:57:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [683/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7591) loss 0.6086 (0.6205) grad_norm 0.1804 (0.1537) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 
01:59:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [683/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7547) loss 0.6024 (0.6206) grad_norm 0.1435 (0.1559) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:00:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [683/800][400/402] eta 0:00:01 lr 0.000025 time 0.7444 (0.7525) loss 0.6263 (0.6211) grad_norm 0.1409 (0.1555) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:00:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 683 training takes 0:05:02 [2024-03-11 02:00:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [684/800][0/402] eta 0:22:32 lr 0.000025 time 3.3651 (3.3651) loss 0.6414 (0.6414) grad_norm 0.1423 (0.1423) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:01:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [684/800][100/402] eta 0:03:53 lr 0.000025 time 0.7458 (0.7718) loss 0.6351 (0.6178) grad_norm 0.1870 (0.1533) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:02:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [684/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7589) loss 0.6276 (0.6202) grad_norm 0.1566 (0.1523) loss_scale 1048576.0000 (620798.7264) mem 28968MB [2024-03-11 02:04:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [684/800][300/402] eta 0:01:16 lr 0.000025 time 0.7455 (0.7548) loss 0.6429 (0.6208) grad_norm 0.1391 (inf) loss_scale 524288.0000 (668859.1096) mem 28968MB [2024-03-11 02:05:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [684/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7527) loss 0.6173 (0.6214) grad_norm 0.1645 (inf) loss_scale 524288.0000 (632806.4638) mem 28968MB [2024-03-11 02:05:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 684 training takes 0:05:02 [2024-03-11 02:05:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [685/800][0/402] eta 0:24:05 lr 0.000025 time 3.5950 (3.5950) loss 0.5957 (0.5957) grad_norm 0.2010 (0.2010) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:06:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [685/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7739) loss 0.6338 (0.6188) grad_norm 0.1400 (0.1548) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:07:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [685/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7600) loss 0.6367 (0.6207) grad_norm 0.1306 (0.1533) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:09:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [685/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7553) loss 0.6022 (0.6209) grad_norm 0.1746 (0.1543) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:10:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [685/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7529) loss 0.6286 (0.6212) grad_norm 0.1581 (0.1531) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:10:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 685 training takes 0:05:02 [2024-03-11 02:10:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [686/800][0/402] eta 0:34:53 lr 0.000025 time 5.2068 (5.2068) loss 0.6209 (0.6209) grad_norm 0.1556 (0.1556) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:11:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [686/800][100/402] eta 0:03:58 lr 0.000025 time 0.7472 (0.7906) loss 0.6290 (0.6225) grad_norm 0.1227 (0.1488) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:12:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [686/800][200/402] eta 0:02:35 lr 0.000025 time 0.7468 (0.7687) loss 0.5803 (0.6220) grad_norm 0.1429 (0.1514) loss_scale 524288.0000 (524288.0000) mem 28968MB 
[2024-03-11 02:14:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [686/800][300/402] eta 0:01:17 lr 0.000025 time 0.7459 (0.7612) loss 0.6230 (0.6216) grad_norm 0.1359 (inf) loss_scale 262144.0000 (477258.8439) mem 28968MB [2024-03-11 02:15:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [686/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7576) loss 0.5756 (0.6213) grad_norm 0.1937 (inf) loss_scale 262144.0000 (423614.2444) mem 28968MB [2024-03-11 02:15:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 686 training takes 0:05:04 [2024-03-11 02:15:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [687/800][0/402] eta 0:24:36 lr 0.000025 time 3.6731 (3.6731) loss 0.6492 (0.6492) grad_norm 0.1550 (0.1550) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:16:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [687/800][100/402] eta 0:03:53 lr 0.000025 time 0.7460 (0.7748) loss 0.6414 (0.6211) grad_norm 0.1448 (0.1516) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:18:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [687/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7605) loss 0.6023 (0.6212) grad_norm 0.1612 (0.1538) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:19:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [687/800][300/402] eta 0:01:17 lr 0.000025 time 0.7456 (0.7556) loss 0.6416 (0.6205) grad_norm 0.1417 (0.1534) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:20:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [687/800][400/402] eta 0:00:01 lr 0.000025 time 0.7448 (0.7532) loss 0.6084 (0.6207) grad_norm 0.1225 (0.1540) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:20:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 687 training takes 0:05:02 [2024-03-11 02:20:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [688/800][0/402] eta 0:23:50 lr 0.000025 time 3.5574 (3.5574) loss 0.6359 (0.6359) grad_norm 0.1444 (0.1444) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:21:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [688/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7739) loss 0.6147 (0.6235) grad_norm 0.1942 (0.1519) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:23:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [688/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7599) loss 0.6118 (0.6220) grad_norm 0.1561 (0.1531) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:24:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [688/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7552) loss 0.6269 (0.6217) grad_norm 0.1260 (0.1531) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:25:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [688/800][400/402] eta 0:00:01 lr 0.000025 time 0.7453 (0.7529) loss 0.6244 (0.6211) grad_norm 0.1664 (0.1530) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:25:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 688 training takes 0:05:02 [2024-03-11 02:25:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [689/800][0/402] eta 0:23:30 lr 0.000025 time 3.5097 (3.5097) loss 0.6278 (0.6278) grad_norm 0.1411 (0.1411) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:26:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [689/800][100/402] eta 0:03:53 lr 0.000025 time 0.7463 (0.7742) loss 0.6068 (0.6239) grad_norm 0.1485 (0.1524) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:28:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [689/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7604) loss 0.6197 (0.6233) grad_norm 0.1366 (0.1503) loss_scale 262144.0000 (262144.0000) mem 28968MB 
[2024-03-11 02:29:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [689/800][300/402] eta 0:01:17 lr 0.000025 time 0.7461 (0.7558) loss 0.6114 (0.6216) grad_norm 0.1727 (0.1521) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:30:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [689/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7533) loss 0.6237 (0.6214) grad_norm 0.2301 (0.1540) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:30:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 689 training takes 0:05:02 [2024-03-11 02:30:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [690/800][0/402] eta 0:23:23 lr 0.000025 time 3.4911 (3.4911) loss 0.6266 (0.6266) grad_norm 0.1190 (0.1190) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:31:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [690/800][100/402] eta 0:03:53 lr 0.000025 time 0.7455 (0.7731) loss 0.6374 (0.6198) grad_norm 0.1606 (0.1557) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:33:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [690/800][200/402] eta 0:02:33 lr 0.000025 time 0.7466 (0.7596) loss 0.6258 (0.6186) grad_norm 0.1294 (0.1563) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:34:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [690/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7550) loss 0.6262 (0.6201) grad_norm 0.1683 (0.1547) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:35:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [690/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7527) loss 0.5676 (0.6202) grad_norm 0.1483 (0.1540) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:35:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 690 training takes 0:05:02 [2024-03-11 02:35:44 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [691/800][0/402] eta 0:37:05 lr 0.000025 time 5.5349 (5.5349) loss 0.5919 (0.5919) grad_norm 0.1611 (0.1611) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:36:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [691/800][100/402] eta 0:03:59 lr 0.000025 time 0.7455 (0.7938) loss 0.5794 (0.6225) grad_norm 0.1417 (0.1500) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:38:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [691/800][200/402] eta 0:02:35 lr 0.000025 time 0.7458 (0.7709) loss 0.5928 (0.6214) grad_norm 0.1280 (0.1515) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 02:39:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [691/800][300/402] eta 0:01:17 lr 0.000025 time 0.7464 (0.7628) loss 0.6079 (0.6209) grad_norm 0.1484 (0.1516) loss_scale 524288.0000 (317882.2591) mem 28968MB [2024-03-11 02:40:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [691/800][400/402] eta 0:00:01 lr 0.000025 time 0.7452 (0.7588) loss 0.6294 (0.6202) grad_norm 0.1494 (0.1522) loss_scale 524288.0000 (369355.0125) mem 28968MB [2024-03-11 02:40:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 691 training takes 0:05:05 [2024-03-11 02:40:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [692/800][0/402] eta 0:24:03 lr 0.000025 time 3.5909 (3.5909) loss 0.6220 (0.6220) grad_norm 0.1611 (0.1611) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:42:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [692/800][100/402] eta 0:03:53 lr 0.000025 time 0.7456 (0.7740) loss 0.6229 (0.6234) grad_norm 0.1555 (0.1531) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:43:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [692/800][200/402] eta 0:02:33 lr 0.000025 time 0.7461 (0.7601) loss 0.5955 (0.6218) grad_norm 0.1577 (0.1551) loss_scale 524288.0000 
(524288.0000) mem 28968MB [2024-03-11 02:44:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [692/800][300/402] eta 0:01:17 lr 0.000025 time 0.7463 (0.7555) loss 0.6558 (0.6220) grad_norm 0.1591 (0.1558) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:45:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [692/800][400/402] eta 0:00:01 lr 0.000025 time 0.7442 (0.7530) loss 0.6147 (0.6218) grad_norm 0.1348 (0.1554) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:45:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 692 training takes 0:05:02 [2024-03-11 02:45:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [693/800][0/402] eta 0:23:23 lr 0.000025 time 3.4913 (3.4913) loss 0.6135 (0.6135) grad_norm 0.1457 (0.1457) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:47:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [693/800][100/402] eta 0:03:53 lr 0.000025 time 0.7455 (0.7730) loss 0.6215 (0.6222) grad_norm 0.1248 (0.1504) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:48:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [693/800][200/402] eta 0:02:33 lr 0.000025 time 0.7454 (0.7595) loss 0.6476 (0.6214) grad_norm 0.1960 (0.1528) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:49:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [693/800][300/402] eta 0:01:17 lr 0.000025 time 0.7453 (0.7552) loss 0.6060 (0.6203) grad_norm 0.1505 (0.1523) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:50:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [693/800][400/402] eta 0:00:01 lr 0.000025 time 0.7457 (0.7529) loss 0.6493 (0.6213) grad_norm 0.1043 (0.1529) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:50:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 693 training takes 0:05:02 [2024-03-11 02:50:52 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [694/800][0/402] eta 0:23:56 lr 0.000025 time 3.5741 (3.5741) loss 0.6216 (0.6216) grad_norm 0.1301 (0.1301) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:52:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [694/800][100/402] eta 0:03:53 lr 0.000025 time 0.7459 (0.7739) loss 0.6094 (0.6204) grad_norm 0.1773 (0.1547) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:53:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [694/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7600) loss 0.6223 (0.6200) grad_norm 0.1679 (0.1551) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:54:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [694/800][300/402] eta 0:01:17 lr 0.000025 time 0.7464 (0.7553) loss 0.6319 (0.6206) grad_norm 0.1512 (0.1532) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:55:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [694/800][400/402] eta 0:00:01 lr 0.000025 time 0.7447 (0.7529) loss 0.6357 (0.6200) grad_norm 0.1634 (0.1536) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:55:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 694 training takes 0:05:02 [2024-03-11 02:55:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [695/800][0/402] eta 0:23:34 lr 0.000025 time 3.5189 (3.5189) loss 0.6209 (0.6209) grad_norm 0.1159 (0.1159) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:57:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [695/800][100/402] eta 0:03:53 lr 0.000025 time 0.7460 (0.7737) loss 0.6302 (0.6201) grad_norm 0.1470 (0.1534) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:58:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [695/800][200/402] eta 0:02:33 lr 0.000025 time 0.7457 (0.7599) loss 0.6077 (0.6193) grad_norm 0.1416 (0.1524) 
loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 02:59:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [695/800][300/402] eta 0:01:17 lr 0.000025 time 0.7457 (0.7553) loss 0.5994 (0.6197) grad_norm 0.1796 (0.1525) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:00:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [695/800][400/402] eta 0:00:01 lr 0.000025 time 0.7451 (0.7529) loss 0.6525 (0.6195) grad_norm 0.1361 (0.1522) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:00:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 695 training takes 0:05:02 [2024-03-11 03:01:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [696/800][0/402] eta 0:37:55 lr 0.000025 time 5.6597 (5.6597) loss 0.6375 (0.6375) grad_norm 0.1256 (0.1256) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:02:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [696/800][100/402] eta 0:04:00 lr 0.000025 time 0.7452 (0.7950) loss 0.6367 (0.6199) grad_norm 0.2492 (0.1546) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:03:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [696/800][200/402] eta 0:02:35 lr 0.000025 time 0.7476 (0.7707) loss 0.6228 (0.6214) grad_norm 0.1304 (0.1536) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:04:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [696/800][300/402] eta 0:01:17 lr 0.000025 time 0.7462 (0.7625) loss 0.6176 (0.6215) grad_norm 0.1540 (0.1547) loss_scale 1048576.0000 (653182.7243) mem 28968MB [2024-03-11 03:05:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [696/800][400/402] eta 0:00:01 lr 0.000025 time 0.7458 (0.7583) loss 0.6194 (0.6214) grad_norm 0.1599 (inf) loss_scale 262144.0000 (683797.0673) mem 28968MB [2024-03-11 03:05:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 696 training takes 0:05:04 [2024-03-11 
03:06:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [697/800][0/402] eta 0:24:29 lr 0.000025 time 3.6566 (3.6566) loss 0.5869 (0.5869) grad_norm 0.1314 (0.1314) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:07:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [697/800][100/402] eta 0:03:54 lr 0.000025 time 0.7466 (0.7752) loss 0.6270 (0.6230) grad_norm 0.1580 (0.1527) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:08:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [697/800][200/402] eta 0:02:33 lr 0.000025 time 0.7458 (0.7613) loss 0.6205 (0.6227) grad_norm 0.1293 (0.1526) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:09:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [697/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7563) loss 0.6185 (0.6227) grad_norm 0.1697 (0.1548) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:11:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [697/800][400/402] eta 0:00:01 lr 0.000025 time 0.7463 (0.7537) loss 0.6293 (0.6217) grad_norm 0.1485 (0.1536) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:11:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 697 training takes 0:05:03 [2024-03-11 03:11:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [698/800][0/402] eta 0:23:32 lr 0.000025 time 3.5141 (3.5141) loss 0.6100 (0.6100) grad_norm 0.1303 (0.1303) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:12:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [698/800][100/402] eta 0:03:53 lr 0.000025 time 0.7461 (0.7736) loss 0.6090 (0.6178) grad_norm 0.1628 (0.1577) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:13:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [698/800][200/402] eta 0:02:33 lr 0.000025 time 0.7459 (0.7598) loss 0.6298 (0.6195) grad_norm 0.1647 
(0.1564) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:14:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [698/800][300/402] eta 0:01:17 lr 0.000025 time 0.7472 (0.7553) loss 0.6018 (0.6202) grad_norm 0.1308 (0.1554) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:16:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [698/800][400/402] eta 0:00:01 lr 0.000025 time 0.7441 (0.7529) loss 0.5987 (0.6212) grad_norm 0.2031 (0.1553) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:16:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 698 training takes 0:05:02 [2024-03-11 03:16:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [699/800][0/402] eta 0:22:59 lr 0.000025 time 3.4320 (3.4320) loss 0.6011 (0.6011) grad_norm 0.1402 (0.1402) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:17:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [699/800][100/402] eta 0:03:53 lr 0.000025 time 0.7465 (0.7725) loss 0.6221 (0.6224) grad_norm 0.1435 (0.1520) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:18:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [699/800][200/402] eta 0:02:33 lr 0.000025 time 0.7463 (0.7593) loss 0.6310 (0.6214) grad_norm 0.1629 (0.1521) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:19:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [699/800][300/402] eta 0:01:17 lr 0.000025 time 0.7458 (0.7552) loss 0.6325 (0.6210) grad_norm 0.1445 (0.1539) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:21:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [699/800][400/402] eta 0:00:01 lr 0.000025 time 0.7449 (0.7529) loss 0.6282 (0.6213) grad_norm 0.1295 (0.1538) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:21:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 699 training takes 0:05:02 
[2024-03-11 03:21:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [700/800][0/402] eta 0:23:36 lr 0.000003 time 3.5240 (3.5240) loss 0.6289 (0.6289) grad_norm 0.1467 (0.1467) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:22:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [700/800][100/402] eta 0:03:53 lr 0.000003 time 0.7461 (0.7736) loss 0.6667 (0.6230) grad_norm 0.1124 (0.1368) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:23:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [700/800][200/402] eta 0:02:33 lr 0.000003 time 0.7466 (0.7600) loss 0.6648 (0.6212) grad_norm 0.1595 (0.1357) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:24:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [700/800][300/402] eta 0:01:17 lr 0.000003 time 0.7461 (0.7554) loss 0.6450 (0.6202) grad_norm 0.1445 (0.1372) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:26:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [700/800][400/402] eta 0:00:01 lr 0.000003 time 0.7437 (0.7530) loss 0.6286 (0.6207) grad_norm 0.1122 (0.1369) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:26:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 700 training takes 0:05:02 [2024-03-11 03:26:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [701/800][0/402] eta 0:34:03 lr 0.000003 time 5.0835 (5.0835) loss 0.6162 (0.6162) grad_norm 0.1249 (0.1249) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:27:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [701/800][100/402] eta 0:03:58 lr 0.000003 time 0.7456 (0.7892) loss 0.6012 (0.6223) grad_norm 0.1357 (0.1358) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:28:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [701/800][200/402] eta 0:02:35 lr 0.000003 time 0.7463 (0.7677) loss 0.5956 (0.6227) 
grad_norm 0.1367 (0.1379) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:30:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [701/800][300/402] eta 0:01:17 lr 0.000003 time 0.7458 (0.7605) loss 0.6300 (0.6208) grad_norm 0.1436 (0.1391) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 03:31:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [701/800][400/402] eta 0:00:01 lr 0.000003 time 0.7442 (0.7571) loss 0.6165 (0.6196) grad_norm 0.1286 (0.1389) loss_scale 524288.0000 (285678.1247) mem 28968MB [2024-03-11 03:31:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 701 training takes 0:05:04 [2024-03-11 03:31:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [702/800][0/402] eta 0:23:27 lr 0.000003 time 3.5024 (3.5024) loss 0.6010 (0.6010) grad_norm 0.1746 (0.1746) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:32:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [702/800][100/402] eta 0:03:53 lr 0.000003 time 0.7461 (0.7734) loss 0.6415 (0.6207) grad_norm 0.1160 (0.1396) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:33:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [702/800][200/402] eta 0:02:33 lr 0.000003 time 0.7468 (0.7599) loss 0.6092 (0.6191) grad_norm 0.1505 (0.1422) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:35:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [702/800][300/402] eta 0:01:17 lr 0.000003 time 0.7459 (0.7553) loss 0.6184 (0.6186) grad_norm 0.1569 (0.1423) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:36:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [702/800][400/402] eta 0:00:01 lr 0.000003 time 0.7440 (0.7529) loss 0.6059 (0.6194) grad_norm 0.1347 (0.1416) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:36:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 702 training 
takes 0:05:02 [2024-03-11 03:36:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [703/800][0/402] eta 0:23:56 lr 0.000003 time 3.5733 (3.5733) loss 0.6190 (0.6190) grad_norm 0.1212 (0.1212) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:37:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [703/800][100/402] eta 0:03:53 lr 0.000003 time 0.7460 (0.7740) loss 0.6314 (0.6160) grad_norm 0.1274 (0.1423) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:38:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [703/800][200/402] eta 0:02:33 lr 0.000003 time 0.7459 (0.7601) loss 0.6256 (0.6177) grad_norm 0.1255 (0.1441) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:40:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [703/800][300/402] eta 0:01:17 lr 0.000003 time 0.7456 (0.7555) loss 0.5885 (0.6187) grad_norm 0.1351 (0.1435) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:41:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [703/800][400/402] eta 0:00:01 lr 0.000003 time 0.7439 (0.7531) loss 0.6546 (0.6191) grad_norm 0.1251 (0.1433) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:41:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 703 training takes 0:05:02 [2024-03-11 03:41:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [704/800][0/402] eta 0:23:22 lr 0.000003 time 3.4882 (3.4882) loss 0.6244 (0.6244) grad_norm 0.1519 (0.1519) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:42:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [704/800][100/402] eta 0:03:53 lr 0.000003 time 0.7459 (0.7739) loss 0.5814 (0.6231) grad_norm 0.1511 (0.1432) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:43:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [704/800][200/402] eta 0:02:33 lr 0.000003 time 0.7459 (0.7600) loss 0.6245 
(0.6227) grad_norm 0.1266 (0.1420) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:45:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [704/800][300/402] eta 0:01:17 lr 0.000003 time 0.7459 (0.7555) loss 0.6405 (0.6208) grad_norm 0.1218 (0.1412) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:46:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [704/800][400/402] eta 0:00:01 lr 0.000003 time 0.7440 (0.7531) loss 0.6062 (0.6204) grad_norm 0.1770 (0.1421) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:46:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 704 training takes 0:05:02 [2024-03-11 03:46:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [705/800][0/402] eta 0:23:15 lr 0.000003 time 3.4721 (3.4721) loss 0.6252 (0.6252) grad_norm 0.1422 (0.1422) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:47:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [705/800][100/402] eta 0:03:53 lr 0.000003 time 0.7458 (0.7730) loss 0.6283 (0.6207) grad_norm 0.1672 (0.1428) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:48:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [705/800][200/402] eta 0:02:33 lr 0.000003 time 0.7459 (0.7596) loss 0.5980 (0.6187) grad_norm 0.1469 (0.1431) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:50:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [705/800][300/402] eta 0:01:17 lr 0.000003 time 0.7462 (0.7551) loss 0.6201 (0.6191) grad_norm 0.1354 (0.1441) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:51:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [705/800][400/402] eta 0:00:01 lr 0.000003 time 0.7440 (0.7528) loss 0.6233 (0.6196) grad_norm 0.1540 (0.1441) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:51:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 705 
training takes 0:05:02 [2024-03-11 03:51:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [706/800][0/402] eta 0:34:00 lr 0.000003 time 5.0748 (5.0748) loss 0.6480 (0.6480) grad_norm 0.1376 (0.1376) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:52:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [706/800][100/402] eta 0:03:58 lr 0.000003 time 0.7465 (0.7895) loss 0.6045 (0.6186) grad_norm 0.1474 (0.1465) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:54:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [706/800][200/402] eta 0:02:35 lr 0.000003 time 0.7460 (0.7681) loss 0.6086 (0.6187) grad_norm 0.1407 (0.1443) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:55:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [706/800][300/402] eta 0:01:17 lr 0.000003 time 0.7463 (0.7613) loss 0.6243 (0.6196) grad_norm 0.1286 (0.1444) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 03:56:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [706/800][400/402] eta 0:00:01 lr 0.000003 time 0.7444 (0.7575) loss 0.6343 (0.6196) grad_norm 0.1183 (0.1446) loss_scale 1048576.0000 (584430.7631) mem 28968MB [2024-03-11 03:56:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 706 training takes 0:05:04 [2024-03-11 03:56:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [707/800][0/402] eta 0:23:00 lr 0.000003 time 3.4337 (3.4337) loss 0.6503 (0.6503) grad_norm 0.1433 (0.1433) loss_scale 1048576.0000 (1048576.0000) mem 28968MB [2024-03-11 03:57:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [707/800][100/402] eta 0:03:53 lr 0.000003 time 0.7464 (0.7726) loss 0.6193 (0.6172) grad_norm 0.1698 (0.1433) loss_scale 1048576.0000 (1048576.0000) mem 28968MB [2024-03-11 03:59:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [707/800][200/402] eta 0:02:33 lr 0.000003 time 0.7466 (0.7595) 
loss 0.6038 (0.6187) grad_norm 0.1382 (0.1440) loss_scale 1048576.0000 (1048576.0000) mem 28968MB [2024-03-11 04:00:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [707/800][300/402] eta 0:01:17 lr 0.000003 time 0.7462 (0.7550) loss 0.6307 (0.6185) grad_norm 0.1308 (0.1455) loss_scale 1048576.0000 (1048576.0000) mem 28968MB [2024-03-11 04:01:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [707/800][400/402] eta 0:00:01 lr 0.000003 time 0.7442 (0.7527) loss 0.6305 (0.6185) grad_norm 0.1266 (inf) loss_scale 524288.0000 (962284.2095) mem 28968MB [2024-03-11 04:01:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 707 training takes 0:05:02 [2024-03-11 04:01:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [708/800][0/402] eta 0:22:54 lr 0.000003 time 3.4187 (3.4187) loss 0.6371 (0.6371) grad_norm 0.1642 (0.1642) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:02:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [708/800][100/402] eta 0:03:53 lr 0.000003 time 0.7455 (0.7725) loss 0.6044 (0.6230) grad_norm 0.1766 (0.1471) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:04:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [708/800][200/402] eta 0:02:33 lr 0.000003 time 0.7464 (0.7593) loss 0.6235 (0.6210) grad_norm 0.1338 (0.1469) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:05:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [708/800][300/402] eta 0:01:17 lr 0.000003 time 0.7458 (0.7549) loss 0.6070 (0.6205) grad_norm 0.1388 (0.1469) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:06:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [708/800][400/402] eta 0:00:01 lr 0.000003 time 0.7441 (0.7527) loss 0.5898 (0.6204) grad_norm 0.1820 (0.1461) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:06:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO 
EPOCH 708 training takes 0:05:02 [2024-03-11 04:06:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [709/800][0/402] eta 0:23:26 lr 0.000003 time 3.4978 (3.4978) loss 0.6045 (0.6045) grad_norm 0.1627 (0.1627) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:07:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [709/800][100/402] eta 0:03:53 lr 0.000003 time 0.7460 (0.7734) loss 0.6251 (0.6205) grad_norm 0.1837 (0.1445) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:09:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [709/800][200/402] eta 0:02:33 lr 0.000003 time 0.7463 (0.7598) loss 0.6049 (0.6193) grad_norm 0.1402 (0.1452) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:10:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [709/800][300/402] eta 0:01:17 lr 0.000003 time 0.7467 (0.7552) loss 0.6217 (0.6195) grad_norm 0.1466 (0.1458) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:11:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [709/800][400/402] eta 0:00:01 lr 0.000003 time 0.7443 (0.7529) loss 0.6458 (0.6194) grad_norm 0.1166 (0.1461) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:11:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 709 training takes 0:05:02 [2024-03-11 04:11:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [710/800][0/402] eta 0:22:51 lr 0.000003 time 3.4105 (3.4105) loss 0.6199 (0.6199) grad_norm 0.1192 (0.1192) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:12:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [710/800][100/402] eta 0:03:53 lr 0.000003 time 0.7457 (0.7723) loss 0.6103 (0.6175) grad_norm 0.1814 (0.1479) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:14:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [710/800][200/402] eta 0:02:33 lr 0.000003 time 0.7461 
(0.7593) loss 0.6262 (0.6195) grad_norm 0.1322 (0.1474) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:15:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [710/800][300/402] eta 0:01:17 lr 0.000003 time 0.7461 (0.7549) loss 0.6147 (0.6205) grad_norm 0.1407 (0.1466) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:16:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [710/800][400/402] eta 0:00:01 lr 0.000003 time 0.7438 (0.7526) loss 0.6028 (0.6199) grad_norm 0.1494 (0.1471) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:16:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 710 training takes 0:05:02 [2024-03-11 04:16:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [711/800][0/402] eta 0:33:06 lr 0.000003 time 4.9422 (4.9422) loss 0.6166 (0.6166) grad_norm 0.1430 (0.1430) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:18:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [711/800][100/402] eta 0:03:58 lr 0.000003 time 0.7461 (0.7888) loss 0.6518 (0.6185) grad_norm 0.1115 (0.1466) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:19:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [711/800][200/402] eta 0:02:35 lr 0.000003 time 0.7472 (0.7676) loss 0.6209 (0.6202) grad_norm 0.1417 (0.1466) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:20:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [711/800][300/402] eta 0:01:17 lr 0.000003 time 0.7468 (0.7605) loss 0.6246 (0.6202) grad_norm 0.1248 (0.1470) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:21:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [711/800][400/402] eta 0:00:01 lr 0.000003 time 0.7440 (0.7569) loss 0.6420 (0.6206) grad_norm 0.1421 (0.1468) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:21:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
185): INFO EPOCH 711 training takes 0:05:04 [2024-03-11 04:21:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [712/800][0/402] eta 0:23:15 lr 0.000003 time 3.4706 (3.4706) loss 0.6359 (0.6359) grad_norm 0.1402 (0.1402) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:23:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [712/800][100/402] eta 0:03:53 lr 0.000003 time 0.7459 (0.7730) loss 0.6504 (0.6193) grad_norm 0.1587 (0.1455) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:24:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [712/800][200/402] eta 0:02:33 lr 0.000003 time 0.7464 (0.7596) loss 0.6264 (0.6193) grad_norm 0.1600 (0.1465) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:25:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [712/800][300/402] eta 0:01:17 lr 0.000003 time 0.7456 (0.7551) loss 0.6192 (0.6194) grad_norm 0.1326 (0.1468) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:26:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [712/800][400/402] eta 0:00:01 lr 0.000003 time 0.7444 (0.7527) loss 0.6119 (0.6196) grad_norm 0.1479 (inf) loss_scale 524288.0000 (585738.2145) mem 28968MB [2024-03-11 04:26:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 712 training takes 0:05:02 [2024-03-11 04:26:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [713/800][0/402] eta 0:23:07 lr 0.000003 time 3.4526 (3.4526) loss 0.6081 (0.6081) grad_norm 0.1600 (0.1600) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:28:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [713/800][100/402] eta 0:03:53 lr 0.000003 time 0.7462 (0.7728) loss 0.6021 (0.6187) grad_norm 0.1694 (0.1495) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:29:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [713/800][200/402] eta 0:02:33 lr 0.000003 time 
0.7458 (0.7599) loss 0.6142 (0.6194) grad_norm 0.1307 (0.1487) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:30:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [713/800][300/402] eta 0:01:17 lr 0.000003 time 0.7456 (0.7553) loss 0.6474 (0.6199) grad_norm 0.1718 (0.1490) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:31:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [713/800][400/402] eta 0:00:01 lr 0.000003 time 0.7437 (0.7529) loss 0.5972 (0.6199) grad_norm 0.1404 (0.1486) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:31:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 713 training takes 0:05:02 [2024-03-11 04:31:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [714/800][0/402] eta 0:23:23 lr 0.000003 time 3.4921 (3.4921) loss 0.6057 (0.6057) grad_norm 0.1778 (0.1778) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:33:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [714/800][100/402] eta 0:03:53 lr 0.000003 time 0.7463 (0.7732) loss 0.6047 (0.6180) grad_norm 0.1326 (0.1491) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:34:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [714/800][200/402] eta 0:02:33 lr 0.000003 time 0.7462 (0.7597) loss 0.5981 (0.6188) grad_norm 0.1555 (0.1476) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:35:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [714/800][300/402] eta 0:01:17 lr 0.000003 time 0.7460 (0.7551) loss 0.6258 (0.6186) grad_norm 0.1440 (0.1480) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:36:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [714/800][400/402] eta 0:00:01 lr 0.000003 time 0.7445 (0.7528) loss 0.6559 (0.6185) grad_norm 0.1875 (0.1486) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:36:54 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 185): INFO EPOCH 714 training takes 0:05:02 [2024-03-11 04:36:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [715/800][0/402] eta 0:23:07 lr 0.000003 time 3.4521 (3.4521) loss 0.6297 (0.6297) grad_norm 0.1659 (0.1659) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:38:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [715/800][100/402] eta 0:03:53 lr 0.000003 time 0.7462 (0.7728) loss 0.6295 (0.6173) grad_norm 0.1430 (0.1477) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:39:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [715/800][200/402] eta 0:02:33 lr 0.000003 time 0.7459 (0.7594) loss 0.6276 (0.6183) grad_norm 0.1427 (0.1474) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:40:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [715/800][300/402] eta 0:01:17 lr 0.000003 time 0.7459 (0.7553) loss 0.6290 (0.6199) grad_norm 0.1390 (0.1480) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:41:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [715/800][400/402] eta 0:00:01 lr 0.000003 time 0.7444 (0.7529) loss 0.6402 (0.6189) grad_norm 0.1330 (0.1488) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:41:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 715 training takes 0:05:02 [2024-03-11 04:42:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [716/800][0/402] eta 0:35:51 lr 0.000003 time 5.3516 (5.3516) loss 0.6448 (0.6448) grad_norm 0.1384 (0.1384) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:43:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [716/800][100/402] eta 0:03:59 lr 0.000003 time 0.7458 (0.7920) loss 0.5921 (0.6208) grad_norm 0.1298 (0.1514) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:44:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [716/800][200/402] eta 
0:02:35 lr 0.000003 time 0.7461 (0.7692) loss 0.6373 (0.6194) grad_norm 0.1433 (0.1516) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:45:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [716/800][300/402] eta 0:01:17 lr 0.000003 time 0.7456 (0.7616) loss 0.6422 (0.6198) grad_norm 0.1711 (0.1516) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:47:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [716/800][400/402] eta 0:00:01 lr 0.000003 time 0.7450 (0.7577) loss 0.6065 (0.6195) grad_norm 0.1717 (0.1514) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:47:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 716 training takes 0:05:04 [2024-03-11 04:47:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [717/800][0/402] eta 0:22:53 lr 0.000003 time 3.4170 (3.4170) loss 0.6264 (0.6264) grad_norm 0.1567 (0.1567) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:48:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [717/800][100/402] eta 0:03:53 lr 0.000003 time 0.7456 (0.7725) loss 0.6431 (0.6212) grad_norm 0.1891 (0.1558) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:49:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [717/800][200/402] eta 0:02:33 lr 0.000003 time 0.7460 (0.7593) loss 0.6462 (0.6203) grad_norm 0.1739 (0.1543) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:50:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [717/800][300/402] eta 0:01:16 lr 0.000003 time 0.7464 (0.7549) loss 0.6550 (0.6211) grad_norm 0.1390 (0.1525) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:52:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [717/800][400/402] eta 0:00:01 lr 0.000003 time 0.7445 (0.7528) loss 0.6168 (0.6204) grad_norm 0.1462 (0.1520) loss_scale 1048576.0000 (575278.6035) mem 28968MB [2024-03-11 04:52:04 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 717 training takes 0:05:02 [2024-03-11 04:52:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [718/800][0/402] eta 0:23:49 lr 0.000003 time 3.5571 (3.5571) loss 0.6280 (0.6280) grad_norm 0.1333 (0.1333) loss_scale 1048576.0000 (1048576.0000) mem 28968MB [2024-03-11 04:53:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [718/800][100/402] eta 0:03:53 lr 0.000003 time 0.7459 (0.7740) loss 0.6210 (0.6182) grad_norm 0.1506 (inf) loss_scale 524288.0000 (757881.6634) mem 28968MB [2024-03-11 04:54:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [718/800][200/402] eta 0:02:33 lr 0.000003 time 0.7467 (0.7601) loss 0.6314 (0.6184) grad_norm 0.1586 (inf) loss_scale 524288.0000 (641665.9104) mem 28968MB [2024-03-11 04:55:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [718/800][300/402] eta 0:01:17 lr 0.000003 time 0.7458 (0.7555) loss 0.6632 (0.6200) grad_norm 0.1452 (inf) loss_scale 524288.0000 (602669.9269) mem 28968MB [2024-03-11 04:57:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [718/800][400/402] eta 0:00:01 lr 0.000003 time 0.7441 (0.7531) loss 0.6254 (0.6200) grad_norm 0.1646 (inf) loss_scale 524288.0000 (583123.3117) mem 28968MB [2024-03-11 04:57:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 718 training takes 0:05:02 [2024-03-11 04:57:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [719/800][0/402] eta 0:22:35 lr 0.000003 time 3.3713 (3.3713) loss 0.5990 (0.5990) grad_norm 0.1591 (0.1591) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:58:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [719/800][100/402] eta 0:03:53 lr 0.000003 time 0.7460 (0.7719) loss 0.6154 (0.6180) grad_norm 0.1191 (0.1500) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 04:59:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[719/800][200/402] eta 0:02:33 lr 0.000003 time 0.7456 (0.7590) loss 0.6265 (0.6174) grad_norm 0.1418 (0.1500) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:00:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [719/800][300/402] eta 0:01:16 lr 0.000003 time 0.7457 (0.7547) loss 0.6223 (0.6186) grad_norm 0.1497 (0.1502) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:02:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [719/800][400/402] eta 0:00:01 lr 0.000003 time 0.7439 (0.7525) loss 0.6306 (0.6186) grad_norm 0.1210 (0.1508) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:02:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 719 training takes 0:05:02 [2024-03-11 05:02:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [720/800][0/402] eta 0:23:06 lr 0.000003 time 3.4496 (3.4496) loss 0.6228 (0.6228) grad_norm 0.1598 (0.1598) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:03:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [720/800][100/402] eta 0:03:53 lr 0.000003 time 0.7464 (0.7736) loss 0.6189 (0.6182) grad_norm 0.1556 (0.1532) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:04:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [720/800][200/402] eta 0:02:33 lr 0.000003 time 0.7461 (0.7601) loss 0.6026 (0.6178) grad_norm 0.1600 (0.1515) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:05:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [720/800][300/402] eta 0:01:17 lr 0.000003 time 0.7461 (0.7555) loss 0.5925 (0.6183) grad_norm 0.1500 (0.1511) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:07:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [720/800][400/402] eta 0:00:01 lr 0.000003 time 0.7443 (0.7531) loss 0.6402 (0.6190) grad_norm 0.1579 (0.1509) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 
05:07:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 720 training takes 0:05:02 [2024-03-11 05:07:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [721/800][0/402] eta 0:34:13 lr 0.000003 time 5.1075 (5.1075) loss 0.6279 (0.6279) grad_norm 0.1534 (0.1534) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:08:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [721/800][100/402] eta 0:03:58 lr 0.000003 time 0.7461 (0.7892) loss 0.6075 (0.6226) grad_norm 0.2018 (0.1511) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:09:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [721/800][200/402] eta 0:02:35 lr 0.000003 time 0.7461 (0.7679) loss 0.6099 (0.6209) grad_norm 0.1392 (0.1522) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:11:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [721/800][300/402] eta 0:01:17 lr 0.000003 time 0.7465 (0.7608) loss 0.6124 (0.6200) grad_norm 0.1328 (0.1522) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:12:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [721/800][400/402] eta 0:00:01 lr 0.000003 time 0.7441 (0.7572) loss 0.6452 (0.6204) grad_norm 0.1702 (0.1523) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:12:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 721 training takes 0:05:04 [2024-03-11 05:12:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [722/800][0/402] eta 0:23:11 lr 0.000003 time 3.4614 (3.4614) loss 0.6076 (0.6076) grad_norm 0.2091 (0.2091) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:13:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [722/800][100/402] eta 0:03:53 lr 0.000003 time 0.7463 (0.7737) loss 0.6292 (0.6221) grad_norm 0.1461 (0.1491) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:14:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [722/800][200/402] eta 0:02:33 lr 0.000003 time 0.7461 (0.7600) loss 0.6395 (0.6215) grad_norm 0.1211 (0.1507) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:16:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [722/800][300/402] eta 0:01:17 lr 0.000003 time 0.7456 (0.7556) loss 0.6417 (0.6198) grad_norm 0.1623 (0.1504) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:17:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [722/800][400/402] eta 0:00:01 lr 0.000003 time 0.7448 (0.7533) loss 0.6337 (0.6199) grad_norm 0.1512 (0.1511) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:17:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 722 training takes 0:05:02 [2024-03-11 05:17:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [723/800][0/402] eta 0:24:03 lr 0.000003 time 3.5919 (3.5919) loss 0.6255 (0.6255) grad_norm 0.1450 (0.1450) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:18:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [723/800][100/402] eta 0:03:53 lr 0.000003 time 0.7464 (0.7744) loss 0.6323 (0.6206) grad_norm 0.1491 (0.1532) loss_scale 1048576.0000 (866892.0396) mem 28968MB [2024-03-11 05:19:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [723/800][200/402] eta 0:02:33 lr 0.000003 time 0.7463 (0.7604) loss 0.6085 (0.6199) grad_norm 0.1249 (inf) loss_scale 524288.0000 (725134.6468) mem 28968MB [2024-03-11 05:21:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [723/800][300/402] eta 0:01:17 lr 0.000003 time 0.7462 (0.7557) loss 0.6216 (0.6196) grad_norm 0.1404 (inf) loss_scale 524288.0000 (658408.1860) mem 28968MB [2024-03-11 05:22:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [723/800][400/402] eta 0:00:01 lr 0.000003 time 0.7441 (0.7533) loss 0.6420 (0.6199) grad_norm 0.1382 (inf) loss_scale 524288.0000 (624961.7556) mem 28968MB [2024-03-11 
05:22:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 723 training takes 0:05:02 [2024-03-11 05:22:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [724/800][0/402] eta 0:22:51 lr 0.000003 time 3.4124 (3.4124) loss 0.5989 (0.5989) grad_norm 0.1667 (0.1667) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:23:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [724/800][100/402] eta 0:03:53 lr 0.000003 time 0.7460 (0.7728) loss 0.6332 (0.6206) grad_norm 0.1467 (0.1500) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:24:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [724/800][200/402] eta 0:02:33 lr 0.000003 time 0.7466 (0.7598) loss 0.6134 (0.6203) grad_norm 0.1361 (0.1499) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:26:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [724/800][300/402] eta 0:01:17 lr 0.000003 time 0.7460 (0.7553) loss 0.5909 (0.6199) grad_norm 0.1642 (0.1504) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:27:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [724/800][400/402] eta 0:00:01 lr 0.000003 time 0.7448 (0.7530) loss 0.6129 (0.6198) grad_norm 0.1380 (0.1510) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:27:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 724 training takes 0:05:02 [2024-03-11 05:27:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [725/800][0/402] eta 0:23:53 lr 0.000003 time 3.5657 (3.5657) loss 0.5821 (0.5821) grad_norm 0.1808 (0.1808) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:28:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [725/800][100/402] eta 0:03:54 lr 0.000003 time 0.7461 (0.7751) loss 0.6126 (0.6184) grad_norm 0.1444 (0.1526) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:29:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [725/800][200/402] eta 0:02:33 lr 0.000003 time 0.7469 (0.7608) loss 0.6358 (0.6189) grad_norm 0.1761 (0.1526) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:31:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [725/800][300/402] eta 0:01:17 lr 0.000003 time 0.7458 (0.7560) loss 0.6391 (0.6189) grad_norm 0.1394 (0.1531) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:32:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [725/800][400/402] eta 0:00:01 lr 0.000003 time 0.7437 (0.7535) loss 0.6256 (0.6192) grad_norm 0.1702 (0.1535) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:32:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 725 training takes 0:05:03 [2024-03-11 05:32:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [726/800][0/402] eta 0:38:02 lr 0.000003 time 5.6781 (5.6781) loss 0.6140 (0.6140) grad_norm 0.1625 (0.1625) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:33:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [726/800][100/402] eta 0:04:00 lr 0.000003 time 0.7492 (0.7953) loss 0.5983 (0.6193) grad_norm 0.1644 (0.1534) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:35:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [726/800][200/402] eta 0:02:35 lr 0.000003 time 0.7466 (0.7710) loss 0.5928 (0.6194) grad_norm 0.1482 (0.1531) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:36:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [726/800][300/402] eta 0:01:17 lr 0.000003 time 0.7460 (0.7629) loss 0.6034 (0.6192) grad_norm 0.1438 (0.1545) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:37:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [726/800][400/402] eta 0:00:01 lr 0.000003 time 0.7443 (0.7588) loss 0.6467 (0.6203) grad_norm 0.1601 (0.1545) loss_scale 524288.0000 (524288.0000) mem 28968MB 
[2024-03-11 05:37:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 726 training takes 0:05:05 [2024-03-11 05:37:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [727/800][0/402] eta 0:25:10 lr 0.000003 time 3.7574 (3.7574) loss 0.6305 (0.6305) grad_norm 0.1479 (0.1479) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:38:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [727/800][100/402] eta 0:03:54 lr 0.000003 time 0.7465 (0.7758) loss 0.6071 (0.6188) grad_norm 0.1895 (0.1531) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:40:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [727/800][200/402] eta 0:02:33 lr 0.000003 time 0.7461 (0.7613) loss 0.5993 (0.6197) grad_norm 0.1844 (0.1533) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:41:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [727/800][300/402] eta 0:01:17 lr 0.000003 time 0.7458 (0.7563) loss 0.6234 (0.6197) grad_norm 0.1483 (0.1533) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:42:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [727/800][400/402] eta 0:00:01 lr 0.000003 time 0.7447 (0.7537) loss 0.6188 (0.6195) grad_norm 0.1373 (0.1543) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:42:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 727 training takes 0:05:03 [2024-03-11 05:42:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [728/800][0/402] eta 0:23:30 lr 0.000003 time 3.5087 (3.5087) loss 0.6277 (0.6277) grad_norm 0.1451 (0.1451) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:43:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [728/800][100/402] eta 0:03:53 lr 0.000003 time 0.7465 (0.7736) loss 0.5999 (0.6211) grad_norm 0.1642 (0.1555) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:45:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [728/800][200/402] eta 0:02:33 lr 0.000003 time 0.7460 (0.7602) loss 0.6079 (0.6189) grad_norm 0.1587 (0.1537) loss_scale 1048576.0000 (782519.4030) mem 28968MB [2024-03-11 05:46:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [728/800][300/402] eta 0:01:17 lr 0.000003 time 0.7461 (0.7556) loss 0.6201 (0.6201) grad_norm 0.1848 (inf) loss_scale 524288.0000 (726339.1894) mem 28968MB [2024-03-11 05:47:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [728/800][400/402] eta 0:00:01 lr 0.000003 time 0.7442 (0.7534) loss 0.6174 (0.6198) grad_norm 0.1600 (inf) loss_scale 524288.0000 (675952.3591) mem 28968MB [2024-03-11 05:47:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 728 training takes 0:05:02 [2024-03-11 05:47:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [729/800][0/402] eta 0:24:52 lr 0.000003 time 3.7134 (3.7134) loss 0.6178 (0.6178) grad_norm 0.1670 (0.1670) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:48:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [729/800][100/402] eta 0:03:54 lr 0.000003 time 0.7461 (0.7758) loss 0.6491 (0.6183) grad_norm 0.1663 (0.1541) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:50:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [729/800][200/402] eta 0:02:33 lr 0.000003 time 0.7461 (0.7610) loss 0.6386 (0.6183) grad_norm 0.1591 (0.1520) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:51:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [729/800][300/402] eta 0:01:17 lr 0.000003 time 0.7458 (0.7562) loss 0.6409 (0.6179) grad_norm 0.1843 (0.1531) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:52:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [729/800][400/402] eta 0:00:01 lr 0.000003 time 0.7440 (0.7536) loss 0.6078 (0.6190) grad_norm 0.1560 (0.1529) loss_scale 524288.0000 (524288.0000) mem 28968MB 
[2024-03-11 05:52:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 729 training takes 0:05:03 [2024-03-11 05:52:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [730/800][0/402] eta 0:23:33 lr 0.000003 time 3.5168 (3.5168) loss 0.5752 (0.5752) grad_norm 0.1552 (0.1552) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:54:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [730/800][100/402] eta 0:03:53 lr 0.000003 time 0.7461 (0.7742) loss 0.6055 (0.6172) grad_norm 0.1194 (0.1561) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:55:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [730/800][200/402] eta 0:02:33 lr 0.000003 time 0.7462 (0.7603) loss 0.6345 (0.6189) grad_norm 0.1371 (0.1552) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:56:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [730/800][300/402] eta 0:01:17 lr 0.000003 time 0.7460 (0.7557) loss 0.6002 (0.6188) grad_norm 0.1476 (0.1545) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:57:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [730/800][400/402] eta 0:00:01 lr 0.000003 time 0.7443 (0.7533) loss 0.6267 (0.6202) grad_norm 0.1287 (0.1540) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:57:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 730 training takes 0:05:02 [2024-03-11 05:57:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [731/800][0/402] eta 0:37:14 lr 0.000003 time 5.5593 (5.5593) loss 0.6154 (0.6154) grad_norm 0.1869 (0.1869) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 05:59:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [731/800][100/402] eta 0:03:59 lr 0.000003 time 0.7452 (0.7943) loss 0.5663 (0.6178) grad_norm 0.1690 (0.1512) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:00:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [731/800][200/402] eta 0:02:35 lr 0.000003 time 0.7458 (0.7702) loss 0.6479 (0.6204) grad_norm 0.1404 (0.1520) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:01:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [731/800][300/402] eta 0:01:17 lr 0.000003 time 0.7456 (0.7623) loss 0.6346 (0.6197) grad_norm 0.1511 (0.1531) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:02:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [731/800][400/402] eta 0:00:01 lr 0.000003 time 0.7465 (0.7584) loss 0.6085 (0.6200) grad_norm 0.1744 (0.1540) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:02:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 731 training takes 0:05:04 [2024-03-11 06:02:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [732/800][0/402] eta 0:24:48 lr 0.000003 time 3.7030 (3.7030) loss 0.6310 (0.6310) grad_norm 0.1595 (0.1595) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:04:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [732/800][100/402] eta 0:03:54 lr 0.000003 time 0.7464 (0.7756) loss 0.6166 (0.6188) grad_norm 0.2115 (0.1535) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:05:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [732/800][200/402] eta 0:02:33 lr 0.000003 time 0.7466 (0.7612) loss 0.6172 (0.6200) grad_norm 0.1529 (0.1530) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:06:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [732/800][300/402] eta 0:01:17 lr 0.000003 time 0.7485 (0.7563) loss 0.6267 (0.6200) grad_norm 0.1440 (0.1537) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:07:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [732/800][400/402] eta 0:00:01 lr 0.000003 time 0.7439 (0.7538) loss 0.6022 (0.6194) grad_norm 0.1677 (0.1536) loss_scale 524288.0000 (524288.0000) mem 
28968MB [2024-03-11 06:07:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 732 training takes 0:05:03 [2024-03-11 06:07:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [733/800][0/402] eta 0:24:10 lr 0.000003 time 3.6072 (3.6072) loss 0.6287 (0.6287) grad_norm 0.1738 (0.1738) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:09:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [733/800][100/402] eta 0:03:53 lr 0.000003 time 0.7457 (0.7748) loss 0.6316 (0.6193) grad_norm 0.1636 (0.1554) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:10:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [733/800][200/402] eta 0:02:33 lr 0.000003 time 0.7458 (0.7608) loss 0.6147 (0.6196) grad_norm 0.1421 (0.1558) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:11:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [733/800][300/402] eta 0:01:17 lr 0.000003 time 0.7460 (0.7560) loss 0.6129 (0.6196) grad_norm 0.1534 (inf) loss_scale 524288.0000 (611379.0299) mem 28968MB [2024-03-11 06:12:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [733/800][400/402] eta 0:00:01 lr 0.000003 time 0.7441 (0.7535) loss 0.5866 (0.6190) grad_norm 0.1645 (inf) loss_scale 524288.0000 (589660.5686) mem 28968MB [2024-03-11 06:12:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 733 training takes 0:05:03 [2024-03-11 06:13:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [734/800][0/402] eta 0:23:59 lr 0.000003 time 3.5817 (3.5817) loss 0.6078 (0.6078) grad_norm 0.1430 (0.1430) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:14:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [734/800][100/402] eta 0:03:54 lr 0.000003 time 0.7468 (0.7761) loss 0.6361 (0.6197) grad_norm 0.1203 (0.1550) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:15:30 hydro_rgb_simmim_pretrain] 
(main_simmim_pt.py 176): INFO Train: [734/800][200/402] eta 0:02:33 lr 0.000003 time 0.7461 (0.7613) loss 0.5845 (0.6190) grad_norm 0.1673 (0.1549) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:16:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [734/800][300/402] eta 0:01:17 lr 0.000003 time 0.7460 (0.7564) loss 0.6339 (0.6186) grad_norm 0.1821 (0.1541) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:17:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [734/800][400/402] eta 0:00:01 lr 0.000003 time 0.7440 (0.7539) loss 0.6077 (0.6191) grad_norm 0.1566 (0.1553) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:18:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 734 training takes 0:05:03 [2024-03-11 06:18:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [735/800][0/402] eta 0:24:05 lr 0.000003 time 3.5946 (3.5946) loss 0.5988 (0.5988) grad_norm 0.1412 (0.1412) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:19:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [735/800][100/402] eta 0:03:53 lr 0.000003 time 0.7457 (0.7743) loss 0.6257 (0.6180) grad_norm 0.1635 (0.1544) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:20:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [735/800][200/402] eta 0:02:33 lr 0.000003 time 0.7461 (0.7603) loss 0.5928 (0.6205) grad_norm 0.1644 (0.1550) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:21:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [735/800][300/402] eta 0:01:17 lr 0.000003 time 0.7459 (0.7557) loss 0.6171 (0.6201) grad_norm 0.1911 (0.1545) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:23:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [735/800][400/402] eta 0:00:01 lr 0.000003 time 0.7438 (0.7533) loss 0.6222 (0.6205) grad_norm 0.1499 (0.1550) loss_scale 524288.0000 
(524288.0000) mem 28968MB [2024-03-11 06:23:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 735 training takes 0:05:02 [2024-03-11 06:23:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [736/800][0/402] eta 0:37:25 lr 0.000003 time 5.5861 (5.5861) loss 0.6199 (0.6199) grad_norm 0.1478 (0.1478) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:24:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [736/800][100/402] eta 0:03:59 lr 0.000003 time 0.7467 (0.7943) loss 0.6067 (0.6161) grad_norm 0.1456 (0.1543) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:25:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [736/800][200/402] eta 0:02:35 lr 0.000003 time 0.7469 (0.7705) loss 0.6114 (0.6196) grad_norm 0.1598 (0.1549) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:26:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [736/800][300/402] eta 0:01:17 lr 0.000003 time 0.7457 (0.7625) loss 0.5943 (0.6194) grad_norm 0.1512 (0.1552) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:28:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [736/800][400/402] eta 0:00:01 lr 0.000003 time 0.7449 (0.7584) loss 0.6255 (0.6195) grad_norm 0.1420 (0.1548) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:28:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 736 training takes 0:05:04 [2024-03-11 06:28:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [737/800][0/402] eta 0:24:36 lr 0.000003 time 3.6737 (3.6737) loss 0.6043 (0.6043) grad_norm 0.1645 (0.1645) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:29:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [737/800][100/402] eta 0:03:54 lr 0.000003 time 0.7477 (0.7753) loss 0.5706 (0.6194) grad_norm 0.1645 (0.1545) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:30:41 
hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [737/800][200/402] eta 0:02:33 lr 0.000003 time 0.7474 (0.7609) loss 0.6177 (0.6207) grad_norm 0.1781 (0.1549) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:31:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [737/800][300/402] eta 0:01:17 lr 0.000003 time 0.7465 (0.7561) loss 0.6400 (0.6209) grad_norm 0.1386 (0.1558) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:33:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [737/800][400/402] eta 0:00:01 lr 0.000003 time 0.7446 (0.7537) loss 0.6207 (0.6206) grad_norm 0.2202 (0.1560) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:33:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 737 training takes 0:05:03 [2024-03-11 06:33:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [738/800][0/402] eta 0:25:24 lr 0.000003 time 3.7929 (3.7929) loss 0.6158 (0.6158) grad_norm 0.1956 (0.1956) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:34:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [738/800][100/402] eta 0:03:54 lr 0.000003 time 0.7464 (0.7764) loss 0.6461 (0.6183) grad_norm 0.1506 (0.1586) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:35:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [738/800][200/402] eta 0:02:33 lr 0.000003 time 0.7467 (0.7613) loss 0.5858 (0.6171) grad_norm 0.1462 (0.1572) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:36:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [738/800][300/402] eta 0:01:17 lr 0.000003 time 0.7467 (0.7562) loss 0.6365 (0.6176) grad_norm 0.1610 (inf) loss_scale 524288.0000 (567833.5150) mem 28968MB [2024-03-11 06:38:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [738/800][400/402] eta 0:00:01 lr 0.000003 time 0.7441 (0.7537) loss 0.6112 (0.6191) grad_norm 0.1688 (inf) 
loss_scale 524288.0000 (556974.2843) mem 28968MB [2024-03-11 06:38:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 738 training takes 0:05:03 [2024-03-11 06:38:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [739/800][0/402] eta 0:23:55 lr 0.000003 time 3.5710 (3.5710) loss 0.6212 (0.6212) grad_norm 0.1933 (0.1933) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:39:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [739/800][100/402] eta 0:03:53 lr 0.000003 time 0.7464 (0.7745) loss 0.6261 (0.6227) grad_norm 0.1323 (0.1561) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:40:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [739/800][200/402] eta 0:02:33 lr 0.000003 time 0.7465 (0.7606) loss 0.6144 (0.6221) grad_norm 0.2113 (0.1571) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:42:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [739/800][300/402] eta 0:01:17 lr 0.000003 time 0.7464 (0.7559) loss 0.6158 (0.6210) grad_norm 0.1527 (0.1574) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:43:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [739/800][400/402] eta 0:00:01 lr 0.000003 time 0.7441 (0.7535) loss 0.6206 (0.6205) grad_norm 0.1632 (0.1577) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:43:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 739 training takes 0:05:02 [2024-03-11 06:43:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [740/800][0/402] eta 0:24:42 lr 0.000003 time 3.6891 (3.6891) loss 0.6121 (0.6121) grad_norm 0.1401 (0.1401) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:44:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [740/800][100/402] eta 0:03:54 lr 0.000003 time 0.7449 (0.7762) loss 0.6122 (0.6216) grad_norm 0.1427 (0.1557) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 
06:45:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [740/800][200/402] eta 0:02:33 lr 0.000003 time 0.7462 (0.7615) loss 0.6580 (0.6212) grad_norm 0.1442 (0.1559) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:47:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [740/800][300/402] eta 0:01:17 lr 0.000003 time 0.7469 (0.7565) loss 0.6079 (0.6202) grad_norm 0.1667 (0.1574) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:48:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [740/800][400/402] eta 0:00:01 lr 0.000003 time 0.7442 (0.7540) loss 0.6371 (0.6202) grad_norm 0.1727 (0.1582) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:48:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 740 training takes 0:05:03 [2024-03-11 06:48:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [741/800][0/402] eta 0:37:06 lr 0.000003 time 5.5380 (5.5380) loss 0.5900 (0.5900) grad_norm 0.1521 (0.1521) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:49:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [741/800][100/402] eta 0:03:59 lr 0.000003 time 0.7466 (0.7947) loss 0.6354 (0.6223) grad_norm 0.1408 (0.1595) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:50:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [741/800][200/402] eta 0:02:35 lr 0.000003 time 0.7455 (0.7705) loss 0.6146 (0.6217) grad_norm 0.1394 (0.1568) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:52:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [741/800][300/402] eta 0:01:17 lr 0.000003 time 0.7459 (0.7624) loss 0.5936 (0.6208) grad_norm 0.1797 (0.1568) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:53:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [741/800][400/402] eta 0:00:01 lr 0.000003 time 0.7441 (0.7582) loss 0.5942 (0.6205) grad_norm 0.1848 
(0.1572) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:53:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 741 training takes 0:05:04 [2024-03-11 06:53:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [742/800][0/402] eta 0:23:51 lr 0.000003 time 3.5604 (3.5604) loss 0.6119 (0.6119) grad_norm 0.1692 (0.1692) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:54:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [742/800][100/402] eta 0:03:54 lr 0.000003 time 0.7460 (0.7749) loss 0.6058 (0.6187) grad_norm 0.1725 (0.1554) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:55:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [742/800][200/402] eta 0:02:33 lr 0.000003 time 0.7464 (0.7608) loss 0.6225 (0.6187) grad_norm 0.1706 (0.1568) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:57:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [742/800][300/402] eta 0:01:17 lr 0.000003 time 0.7460 (0.7561) loss 0.6322 (0.6197) grad_norm 0.1625 (0.1567) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:58:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [742/800][400/402] eta 0:00:01 lr 0.000003 time 0.7438 (0.7536) loss 0.6227 (0.6204) grad_norm 0.1705 (0.1562) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:58:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 742 training takes 0:05:03 [2024-03-11 06:58:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [743/800][0/402] eta 0:24:00 lr 0.000003 time 3.5833 (3.5833) loss 0.6152 (0.6152) grad_norm 0.1854 (0.1854) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 06:59:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [743/800][100/402] eta 0:03:53 lr 0.000003 time 0.7456 (0.7740) loss 0.6060 (0.6204) grad_norm 0.1601 (0.1587) loss_scale 524288.0000 (524288.0000) mem 28968MB 
[2024-03-11 07:01:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [743/800][200/402] eta 0:02:33 lr 0.000003 time 0.7462 (0.7608) loss 0.6257 (0.6183) grad_norm 0.1543 (0.1583) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:02:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [743/800][300/402] eta 0:01:17 lr 0.000003 time 0.7494 (0.7560) loss 0.6388 (0.6185) grad_norm 0.1720 (inf) loss_scale 524288.0000 (562608.0532) mem 28968MB [2024-03-11 07:03:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [743/800][400/402] eta 0:00:01 lr 0.000003 time 0.7437 (0.7535) loss 0.6446 (0.6197) grad_norm 0.1358 (inf) loss_scale 524288.0000 (553051.9302) mem 28968MB [2024-03-11 07:03:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 743 training takes 0:05:03 [2024-03-11 07:03:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [744/800][0/402] eta 0:22:56 lr 0.000003 time 3.4243 (3.4243) loss 0.6199 (0.6199) grad_norm 0.1326 (0.1326) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:04:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [744/800][100/402] eta 0:03:53 lr 0.000003 time 0.7458 (0.7726) loss 0.6210 (0.6181) grad_norm 0.1742 (0.1582) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:06:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [744/800][200/402] eta 0:02:33 lr 0.000003 time 0.7461 (0.7595) loss 0.6379 (0.6190) grad_norm 0.1380 (0.1566) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:07:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [744/800][300/402] eta 0:01:17 lr 0.000003 time 0.7467 (0.7553) loss 0.6256 (0.6191) grad_norm 0.1817 (0.1573) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:08:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [744/800][400/402] eta 0:00:01 lr 0.000003 time 0.7445 (0.7529) loss 0.6121 (0.6192) grad_norm 
0.1696 (0.1579) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:08:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 744 training takes 0:05:02 [2024-03-11 07:08:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [745/800][0/402] eta 0:23:44 lr 0.000003 time 3.5435 (3.5435) loss 0.5968 (0.5968) grad_norm 0.1922 (0.1922) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:09:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [745/800][100/402] eta 0:03:53 lr 0.000003 time 0.7462 (0.7737) loss 0.6320 (0.6189) grad_norm 0.1397 (0.1603) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:11:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [745/800][200/402] eta 0:02:33 lr 0.000003 time 0.7461 (0.7600) loss 0.6198 (0.6193) grad_norm 0.1982 (0.1613) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:12:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [745/800][300/402] eta 0:01:17 lr 0.000003 time 0.7458 (0.7555) loss 0.6271 (0.6193) grad_norm 0.1417 (0.1598) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:13:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [745/800][400/402] eta 0:00:01 lr 0.000003 time 0.7451 (0.7531) loss 0.6347 (0.6195) grad_norm 0.1505 (inf) loss_scale 262144.0000 (474604.8479) mem 28968MB [2024-03-11 07:13:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 745 training takes 0:05:02 [2024-03-11 07:13:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [746/800][0/402] eta 0:37:45 lr 0.000003 time 5.6353 (5.6353) loss 0.6143 (0.6143) grad_norm 0.1554 (0.1554) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:14:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [746/800][100/402] eta 0:04:00 lr 0.000003 time 0.7471 (0.7949) loss 0.6020 (0.6223) grad_norm 0.2393 (0.1588) loss_scale 262144.0000 (262144.0000) mem 28968MB 
[2024-03-11 07:16:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [746/800][200/402] eta 0:02:35 lr 0.000003 time 0.7454 (0.7708) loss 0.6466 (0.6212) grad_norm 0.1651 (0.1590) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:17:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [746/800][300/402] eta 0:01:17 lr 0.000003 time 0.7484 (0.7626) loss 0.6259 (0.6198) grad_norm 0.1693 (0.1590) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:18:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [746/800][400/402] eta 0:00:01 lr 0.000003 time 0.7442 (0.7585) loss 0.6151 (0.6193) grad_norm 0.1514 (0.1585) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:18:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 746 training takes 0:05:05 [2024-03-11 07:18:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [747/800][0/402] eta 0:24:17 lr 0.000003 time 3.6268 (3.6268) loss 0.6195 (0.6195) grad_norm 0.1843 (0.1843) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:20:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [747/800][100/402] eta 0:03:54 lr 0.000003 time 0.7455 (0.7750) loss 0.6116 (0.6191) grad_norm 0.1719 (0.1636) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:21:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [747/800][200/402] eta 0:02:33 lr 0.000003 time 0.7464 (0.7608) loss 0.5798 (0.6192) grad_norm 0.1537 (0.1615) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:22:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [747/800][300/402] eta 0:01:17 lr 0.000003 time 0.7467 (0.7560) loss 0.6127 (0.6196) grad_norm 0.1302 (0.1598) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:23:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [747/800][400/402] eta 0:00:01 lr 0.000003 time 0.7444 (0.7536) loss 0.6250 (0.6195) 
grad_norm 0.2160 (0.1593) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:23:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 747 training takes 0:05:03 [2024-03-11 07:23:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [748/800][0/402] eta 0:25:48 lr 0.000003 time 3.8524 (3.8524) loss 0.6010 (0.6010) grad_norm 0.1685 (0.1685) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:25:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [748/800][100/402] eta 0:03:54 lr 0.000003 time 0.7461 (0.7774) loss 0.6162 (0.6191) grad_norm 0.1558 (0.1548) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:26:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [748/800][200/402] eta 0:02:33 lr 0.000003 time 0.7461 (0.7620) loss 0.6494 (0.6183) grad_norm 0.1319 (0.1557) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:27:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [748/800][300/402] eta 0:01:17 lr 0.000003 time 0.7467 (0.7568) loss 0.5946 (0.6196) grad_norm 0.1721 (0.1570) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:28:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [748/800][400/402] eta 0:00:01 lr 0.000003 time 0.7437 (0.7541) loss 0.6568 (0.6195) grad_norm 0.1687 (0.1574) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:28:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 748 training takes 0:05:03 [2024-03-11 07:28:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [749/800][0/402] eta 0:24:25 lr 0.000003 time 3.6466 (3.6466) loss 0.5972 (0.5972) grad_norm 0.1725 (0.1725) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:30:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [749/800][100/402] eta 0:03:54 lr 0.000003 time 0.7462 (0.7758) loss 0.6422 (0.6214) grad_norm 0.1491 (0.1589) loss_scale 262144.0000 (262144.0000) mem 
28968MB [2024-03-11 07:31:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [749/800][200/402] eta 0:02:33 lr 0.000003 time 0.7461 (0.7612) loss 0.6158 (0.6206) grad_norm 0.1488 (0.1586) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:32:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [749/800][300/402] eta 0:01:17 lr 0.000003 time 0.7478 (0.7564) loss 0.6335 (0.6197) grad_norm 0.1583 (0.1594) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:33:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [749/800][400/402] eta 0:00:01 lr 0.000003 time 0.7438 (0.7538) loss 0.5950 (0.6203) grad_norm 0.2170 (0.1605) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:33:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 749 training takes 0:05:03 [2024-03-11 07:33:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [750/800][0/402] eta 0:24:58 lr 0.000003 time 3.7280 (3.7280) loss 0.5833 (0.5833) grad_norm 0.1698 (0.1698) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:35:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [750/800][100/402] eta 0:03:54 lr 0.000003 time 0.7462 (0.7761) loss 0.6106 (0.6206) grad_norm 0.1370 (0.1607) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:36:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [750/800][200/402] eta 0:02:33 lr 0.000003 time 0.7467 (0.7614) loss 0.6043 (0.6177) grad_norm 0.1597 (0.1598) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:37:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [750/800][300/402] eta 0:01:17 lr 0.000003 time 0.7466 (0.7564) loss 0.6222 (0.6193) grad_norm 0.1950 (0.1594) loss_scale 262144.0000 (262144.0000) mem 28968MB [2024-03-11 07:38:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [750/800][400/402] eta 0:00:01 lr 0.000003 time 0.7446 (0.7539) loss 0.6024 
(0.6195) grad_norm 0.1486 (0.1592) loss_scale 524288.0000 (318364.4090) mem 28968MB [2024-03-11 07:38:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 750 training takes 0:05:03 [2024-03-11 07:39:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [751/800][0/402] eta 0:37:46 lr 0.000003 time 5.6391 (5.6391) loss 0.6329 (0.6329) grad_norm 0.1433 (0.1433) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:40:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [751/800][100/402] eta 0:04:00 lr 0.000003 time 0.7468 (0.7952) loss 0.6222 (0.6197) grad_norm 0.1389 (0.1607) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:41:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [751/800][200/402] eta 0:02:35 lr 0.000003 time 0.7461 (0.7710) loss 0.6416 (0.6189) grad_norm 0.1658 (0.1607) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:42:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [751/800][300/402] eta 0:01:17 lr 0.000003 time 0.7461 (0.7629) loss 0.6044 (0.6194) grad_norm 0.1337 (0.1592) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:43:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [751/800][400/402] eta 0:00:01 lr 0.000003 time 0.7444 (0.7587) loss 0.6108 (0.6195) grad_norm 0.1404 (0.1593) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:44:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 751 training takes 0:05:05 [2024-03-11 07:44:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [752/800][0/402] eta 0:24:04 lr 0.000003 time 3.5931 (3.5931) loss 0.5936 (0.5936) grad_norm 0.1776 (0.1776) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:45:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [752/800][100/402] eta 0:03:53 lr 0.000003 time 0.7458 (0.7745) loss 0.6485 (0.6181) grad_norm 0.2146 (0.1621) loss_scale 524288.0000 
(524288.0000) mem 28968MB [2024-03-11 07:46:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [752/800][200/402] eta 0:02:33 lr 0.000003 time 0.7462 (0.7604) loss 0.6090 (0.6189) grad_norm 0.1407 (0.1595) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:47:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [752/800][300/402] eta 0:01:17 lr 0.000003 time 0.7459 (0.7558) loss 0.6141 (0.6201) grad_norm 0.1196 (0.1593) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:49:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [752/800][400/402] eta 0:00:01 lr 0.000003 time 0.7446 (0.7534) loss 0.6389 (0.6201) grad_norm 0.1554 (0.1601) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:49:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 752 training takes 0:05:02 [2024-03-11 07:49:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [753/800][0/402] eta 0:23:49 lr 0.000003 time 3.5555 (3.5555) loss 0.6379 (0.6379) grad_norm 0.1789 (0.1789) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:50:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [753/800][100/402] eta 0:03:53 lr 0.000003 time 0.7463 (0.7739) loss 0.6110 (0.6206) grad_norm 0.1689 (0.1616) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:51:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [753/800][200/402] eta 0:02:33 lr 0.000003 time 0.7464 (0.7604) loss 0.6017 (0.6206) grad_norm 0.1451 (0.1618) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:52:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [753/800][300/402] eta 0:01:17 lr 0.000003 time 0.7457 (0.7557) loss 0.6200 (0.6208) grad_norm 0.1549 (0.1611) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:54:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [753/800][400/402] eta 0:00:01 lr 0.000003 time 0.7446 (0.7532) 
loss 0.6168 (0.6204) grad_norm 0.2005 (0.1613) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:54:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 753 training takes 0:05:02 [2024-03-11 07:54:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [754/800][0/402] eta 0:23:58 lr 0.000003 time 3.5773 (3.5773) loss 0.6226 (0.6226) grad_norm 0.1499 (0.1499) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:55:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [754/800][100/402] eta 0:03:53 lr 0.000003 time 0.7456 (0.7740) loss 0.6135 (0.6200) grad_norm 0.1404 (0.1594) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:56:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [754/800][200/402] eta 0:02:33 lr 0.000003 time 0.7459 (0.7601) loss 0.6366 (0.6197) grad_norm 0.1571 (0.1600) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:57:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [754/800][300/402] eta 0:01:17 lr 0.000003 time 0.7457 (0.7554) loss 0.6180 (0.6186) grad_norm 0.1489 (0.1596) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:59:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [754/800][400/402] eta 0:00:01 lr 0.000003 time 0.7446 (0.7531) loss 0.6313 (0.6188) grad_norm 0.1701 (0.1592) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 07:59:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 754 training takes 0:05:02 [2024-03-11 07:59:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [755/800][0/402] eta 0:23:01 lr 0.000003 time 3.4360 (3.4360) loss 0.6379 (0.6379) grad_norm 0.1819 (0.1819) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:00:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [755/800][100/402] eta 0:03:53 lr 0.000003 time 0.7458 (0.7726) loss 0.6367 (0.6169) grad_norm 0.1601 (0.1590) loss_scale 
524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:01:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [755/800][200/402] eta 0:02:33 lr 0.000003 time 0.7462 (0.7594) loss 0.6134 (0.6187) grad_norm 0.1656 (0.1592) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:02:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [755/800][300/402] eta 0:01:17 lr 0.000003 time 0.7458 (0.7551) loss 0.6294 (0.6193) grad_norm 0.1352 (0.1595) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:04:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [755/800][400/402] eta 0:00:01 lr 0.000003 time 0.7440 (0.7528) loss 0.5955 (0.6194) grad_norm 0.1563 (inf) loss_scale 524288.0000 (604042.5337) mem 28968MB [2024-03-11 08:04:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 755 training takes 0:05:02 [2024-03-11 08:04:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [756/800][0/402] eta 0:36:45 lr 0.000003 time 5.4865 (5.4865) loss 0.6232 (0.6232) grad_norm 0.1463 (0.1463) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:05:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [756/800][100/402] eta 0:03:59 lr 0.000003 time 0.7467 (0.7934) loss 0.6022 (0.6192) grad_norm 0.1599 (0.1601) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:06:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [756/800][200/402] eta 0:02:35 lr 0.000003 time 0.7468 (0.7700) loss 0.6340 (0.6213) grad_norm 0.1448 (0.1602) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:08:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [756/800][300/402] eta 0:01:17 lr 0.000003 time 0.7462 (0.7622) loss 0.6088 (0.6212) grad_norm 0.1573 (0.1624) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:09:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [756/800][400/402] eta 0:00:01 lr 0.000003 time 0.7441 
(0.7582) loss 0.6394 (0.6204) grad_norm 0.1534 (0.1615) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:09:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 756 training takes 0:05:04 [2024-03-11 08:09:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [757/800][0/402] eta 0:25:27 lr 0.000003 time 3.8008 (3.8008) loss 0.6280 (0.6280) grad_norm 0.1451 (0.1451) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:10:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [757/800][100/402] eta 0:03:54 lr 0.000003 time 0.7461 (0.7773) loss 0.5829 (0.6202) grad_norm 0.1671 (0.1573) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:11:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [757/800][200/402] eta 0:02:33 lr 0.000003 time 0.7486 (0.7620) loss 0.6088 (0.6207) grad_norm 0.1772 (0.1591) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:13:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [757/800][300/402] eta 0:01:17 lr 0.000003 time 0.7455 (0.7571) loss 0.6053 (0.6201) grad_norm 0.1470 (0.1606) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:14:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [757/800][400/402] eta 0:00:01 lr 0.000003 time 0.7443 (0.7544) loss 0.6219 (0.6202) grad_norm 0.1358 (0.1608) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:14:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 757 training takes 0:05:03 [2024-03-11 08:14:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [758/800][0/402] eta 0:23:44 lr 0.000003 time 3.5439 (3.5439) loss 0.6396 (0.6396) grad_norm 0.1319 (0.1319) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:15:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [758/800][100/402] eta 0:03:53 lr 0.000003 time 0.7474 (0.7740) loss 0.6068 (0.6175) grad_norm 0.1733 (0.1612) loss_scale 
524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:16:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [758/800][200/402] eta 0:02:33 lr 0.000003 time 0.7460 (0.7602) loss 0.6168 (0.6185) grad_norm 0.1509 (0.1599) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:18:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [758/800][300/402] eta 0:01:17 lr 0.000003 time 0.7457 (0.7555) loss 0.6101 (0.6187) grad_norm 0.1361 (0.1601) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:19:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [758/800][400/402] eta 0:00:01 lr 0.000003 time 0.7445 (0.7531) loss 0.6175 (0.6191) grad_norm 0.1399 (0.1602) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:19:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 758 training takes 0:05:02 [2024-03-11 08:19:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [759/800][0/402] eta 0:25:00 lr 0.000003 time 3.7324 (3.7324) loss 0.6072 (0.6072) grad_norm 0.1606 (0.1606) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:20:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [759/800][100/402] eta 0:03:54 lr 0.000003 time 0.7459 (0.7760) loss 0.6446 (0.6222) grad_norm 0.1564 (0.1567) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:21:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [759/800][200/402] eta 0:02:33 lr 0.000003 time 0.7488 (0.7613) loss 0.6116 (0.6188) grad_norm 0.1575 (0.1594) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:23:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [759/800][300/402] eta 0:01:17 lr 0.000003 time 0.7460 (0.7565) loss 0.6235 (0.6206) grad_norm 0.1739 (0.1612) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:24:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [759/800][400/402] eta 0:00:01 lr 0.000003 time 
0.7444 (0.7539) loss 0.6210 (0.6201) grad_norm 0.1517 (0.1621) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:24:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 759 training takes 0:05:03 [2024-03-11 08:24:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [760/800][0/402] eta 0:23:19 lr 0.000003 time 3.4820 (3.4820) loss 0.6133 (0.6133) grad_norm 0.1762 (0.1762) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:25:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [760/800][100/402] eta 0:03:53 lr 0.000003 time 0.7467 (0.7740) loss 0.6301 (0.6189) grad_norm 0.1485 (0.1590) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:26:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [760/800][200/402] eta 0:02:33 lr 0.000003 time 0.7464 (0.7603) loss 0.6185 (0.6203) grad_norm 0.1538 (0.1612) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:28:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [760/800][300/402] eta 0:01:17 lr 0.000003 time 0.7465 (0.7557) loss 0.6184 (0.6204) grad_norm 0.1646 (0.1618) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:29:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [760/800][400/402] eta 0:00:01 lr 0.000003 time 0.7444 (0.7532) loss 0.6426 (0.6197) grad_norm 0.1638 (inf) loss_scale 524288.0000 (560896.6384) mem 28968MB [2024-03-11 08:29:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 760 training takes 0:05:02 [2024-03-11 08:29:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [761/800][0/402] eta 0:36:37 lr 0.000003 time 5.4657 (5.4657) loss 0.6013 (0.6013) grad_norm 0.1730 (0.1730) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:30:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [761/800][100/402] eta 0:03:59 lr 0.000003 time 0.7459 (0.7931) loss 0.6364 (0.6196) grad_norm 0.2260 (0.1602) 
loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:32:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [761/800][200/402] eta 0:02:35 lr 0.000003 time 0.7462 (0.7700) loss 0.6040 (0.6192) grad_norm 0.1737 (0.1635) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:33:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [761/800][300/402] eta 0:01:17 lr 0.000003 time 0.7461 (0.7622) loss 0.6066 (0.6191) grad_norm 0.1512 (0.1629) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:34:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [761/800][400/402] eta 0:00:01 lr 0.000003 time 0.7437 (0.7584) loss 0.6081 (0.6192) grad_norm 0.1456 (0.1617) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:34:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 761 training takes 0:05:04 [2024-03-11 08:34:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [762/800][0/402] eta 0:23:54 lr 0.000003 time 3.5696 (3.5696) loss 0.6294 (0.6294) grad_norm 0.1370 (0.1370) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:35:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [762/800][100/402] eta 0:03:54 lr 0.000003 time 0.7459 (0.7749) loss 0.6351 (0.6202) grad_norm 0.1355 (0.1626) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:37:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [762/800][200/402] eta 0:02:33 lr 0.000003 time 0.7459 (0.7608) loss 0.6388 (0.6202) grad_norm 0.1673 (0.1637) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:38:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [762/800][300/402] eta 0:01:17 lr 0.000003 time 0.7467 (0.7560) loss 0.6394 (0.6201) grad_norm 0.1575 (0.1632) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:39:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [762/800][400/402] eta 0:00:01 lr 
0.000003 time 0.7444 (0.7536) loss 0.6136 (0.6197) grad_norm 0.1609 (0.1625) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:39:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 762 training takes 0:05:03 [2024-03-11 08:39:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [763/800][0/402] eta 0:25:01 lr 0.000003 time 3.7346 (3.7346) loss 0.5965 (0.5965) grad_norm 0.1746 (0.1746) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:40:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [763/800][100/402] eta 0:03:54 lr 0.000003 time 0.7466 (0.7757) loss 0.6369 (0.6174) grad_norm 0.1684 (0.1625) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:42:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [763/800][200/402] eta 0:02:33 lr 0.000003 time 0.7464 (0.7611) loss 0.6366 (0.6182) grad_norm 0.1675 (0.1620) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:43:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [763/800][300/402] eta 0:01:17 lr 0.000003 time 0.7463 (0.7563) loss 0.6255 (0.6194) grad_norm 0.2099 (0.1625) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:44:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [763/800][400/402] eta 0:00:01 lr 0.000003 time 0.7442 (0.7537) loss 0.6062 (0.6194) grad_norm 0.1777 (0.1624) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:44:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 763 training takes 0:05:03 [2024-03-11 08:44:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [764/800][0/402] eta 0:24:48 lr 0.000003 time 3.7016 (3.7016) loss 0.6144 (0.6144) grad_norm 0.1800 (0.1800) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:45:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [764/800][100/402] eta 0:03:54 lr 0.000003 time 0.7458 (0.7767) loss 0.6443 (0.6230) grad_norm 
0.1576 (0.1594) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:47:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [764/800][200/402] eta 0:02:33 lr 0.000003 time 0.7464 (0.7620) loss 0.6390 (0.6206) grad_norm 0.1427 (0.1613) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:48:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [764/800][300/402] eta 0:01:17 lr 0.000003 time 0.7472 (0.7569) loss 0.6291 (0.6200) grad_norm 0.1712 (0.1619) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:49:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [764/800][400/402] eta 0:00:01 lr 0.000003 time 0.7442 (0.7542) loss 0.6345 (0.6197) grad_norm 0.1509 (0.1616) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:49:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 764 training takes 0:05:03 [2024-03-11 08:49:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [765/800][0/402] eta 0:23:12 lr 0.000003 time 3.4648 (3.4648) loss 0.6340 (0.6340) grad_norm 0.1730 (0.1730) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:51:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [765/800][100/402] eta 0:03:53 lr 0.000003 time 0.7451 (0.7731) loss 0.6251 (0.6182) grad_norm 0.1592 (0.1638) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:52:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [765/800][200/402] eta 0:02:33 lr 0.000003 time 0.7461 (0.7598) loss 0.5783 (0.6181) grad_norm 0.1749 (0.1640) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:53:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [765/800][300/402] eta 0:01:17 lr 0.000003 time 0.7458 (0.7552) loss 0.6280 (0.6192) grad_norm 0.1551 (0.1618) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:54:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [765/800][400/402] eta 
0:00:01 lr 0.000003 time 0.7444 (0.7529) loss 0.6698 (0.6199) grad_norm 0.1784 (inf) loss_scale 524288.0000 (542592.3192) mem 28968MB [2024-03-11 08:54:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 765 training takes 0:05:02 [2024-03-11 08:54:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [766/800][0/402] eta 0:33:08 lr 0.000003 time 4.9476 (4.9476) loss 0.6058 (0.6058) grad_norm 0.2022 (0.2022) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:56:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [766/800][100/402] eta 0:03:57 lr 0.000003 time 0.7457 (0.7876) loss 0.5962 (0.6214) grad_norm 0.1965 (0.1625) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:57:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [766/800][200/402] eta 0:02:34 lr 0.000003 time 0.7462 (0.7671) loss 0.6270 (0.6198) grad_norm 0.1883 (0.1652) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:58:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [766/800][300/402] eta 0:01:17 lr 0.000003 time 0.7459 (0.7601) loss 0.6293 (0.6194) grad_norm 0.1526 (0.1637) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:59:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [766/800][400/402] eta 0:00:01 lr 0.000003 time 0.7443 (0.7565) loss 0.6254 (0.6197) grad_norm 0.1580 (0.1633) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 08:59:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 766 training takes 0:05:04 [2024-03-11 08:59:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [767/800][0/402] eta 0:22:06 lr 0.000003 time 3.2990 (3.2990) loss 0.6393 (0.6393) grad_norm 0.1540 (0.1540) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:01:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [767/800][100/402] eta 0:03:52 lr 0.000003 time 0.7456 (0.7710) loss 0.6335 (0.6203) 
grad_norm 0.1306 (0.1620) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:02:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [767/800][200/402] eta 0:02:33 lr 0.000003 time 0.7457 (0.7585) loss 0.6233 (0.6192) grad_norm 0.1600 (0.1641) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:03:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [767/800][300/402] eta 0:01:16 lr 0.000003 time 0.7463 (0.7543) loss 0.5951 (0.6196) grad_norm 0.1628 (0.1643) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:04:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [767/800][400/402] eta 0:00:01 lr 0.000003 time 0.7444 (0.7521) loss 0.6139 (0.6187) grad_norm 0.1643 (0.1630) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:04:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 767 training takes 0:05:02 [2024-03-11 09:04:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [768/800][0/402] eta 0:22:46 lr 0.000003 time 3.4005 (3.4005) loss 0.5967 (0.5967) grad_norm 0.1518 (0.1518) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:06:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [768/800][100/402] eta 0:03:53 lr 0.000003 time 0.7457 (0.7720) loss 0.6222 (0.6181) grad_norm 0.1616 (0.1627) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:07:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [768/800][200/402] eta 0:02:33 lr 0.000003 time 0.7463 (0.7590) loss 0.6298 (0.6186) grad_norm 0.1638 (0.1627) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:08:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [768/800][300/402] eta 0:01:16 lr 0.000003 time 0.7465 (0.7549) loss 0.5964 (0.6172) grad_norm 0.1825 (0.1640) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:09:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[768/800][400/402] eta 0:00:01 lr 0.000003 time 0.7448 (0.7526) loss 0.6516 (0.6179) grad_norm 0.1506 (0.1642) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:09:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 768 training takes 0:05:02 [2024-03-11 09:09:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [769/800][0/402] eta 0:22:28 lr 0.000003 time 3.3543 (3.3543) loss 0.6028 (0.6028) grad_norm 0.1504 (0.1504) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:11:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [769/800][100/402] eta 0:03:53 lr 0.000003 time 0.7456 (0.7716) loss 0.6027 (0.6199) grad_norm 0.1596 (0.1605) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:12:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [769/800][200/402] eta 0:02:33 lr 0.000003 time 0.7456 (0.7587) loss 0.5795 (0.6187) grad_norm 0.1730 (0.1626) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:13:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [769/800][300/402] eta 0:01:16 lr 0.000003 time 0.7457 (0.7544) loss 0.6139 (0.6191) grad_norm 0.1980 (0.1628) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:14:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [769/800][400/402] eta 0:00:01 lr 0.000003 time 0.7444 (0.7522) loss 0.5974 (0.6192) grad_norm 0.1578 (0.1632) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:14:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 769 training takes 0:05:02 [2024-03-11 09:15:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [770/800][0/402] eta 0:22:22 lr 0.000003 time 3.3401 (3.3401) loss 0.6428 (0.6428) grad_norm 0.1370 (0.1370) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:16:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [770/800][100/402] eta 0:03:52 lr 0.000003 time 0.7467 (0.7715) 
loss 0.6250 (0.6218) grad_norm 0.1669 (0.1632) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:17:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [770/800][200/402] eta 0:02:33 lr 0.000003 time 0.7458 (0.7587) loss 0.6368 (0.6195) grad_norm 0.1585 (0.1637) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:18:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [770/800][300/402] eta 0:01:16 lr 0.000003 time 0.7455 (0.7544) loss 0.6221 (0.6193) grad_norm 0.1418 (0.1631) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:19:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [770/800][400/402] eta 0:00:01 lr 0.000003 time 0.7449 (0.7522) loss 0.6082 (0.6194) grad_norm 0.1730 (0.1628) loss_scale 1048576.0000 (554359.3815) mem 28968MB [2024-03-11 09:20:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 770 training takes 0:05:02 [2024-03-11 09:20:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [771/800][0/402] eta 0:33:53 lr 0.000003 time 5.0579 (5.0579) loss 0.6194 (0.6194) grad_norm 0.1423 (0.1423) loss_scale 1048576.0000 (1048576.0000) mem 28968MB [2024-03-11 09:21:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [771/800][100/402] eta 0:03:58 lr 0.000003 time 0.7461 (0.7889) loss 0.6246 (0.6206) grad_norm 0.1324 (nan) loss_scale 524288.0000 (747499.7228) mem 28968MB [2024-03-11 09:22:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [771/800][200/402] eta 0:02:35 lr 0.000003 time 0.7458 (0.7674) loss 0.6048 (0.6206) grad_norm 0.1505 (nan) loss_scale 524288.0000 (636449.1144) mem 28968MB [2024-03-11 09:23:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [771/800][300/402] eta 0:01:17 lr 0.000003 time 0.7453 (0.7602) loss 0.6167 (0.6201) grad_norm 0.1766 (nan) loss_scale 524288.0000 (599186.2857) mem 28968MB [2024-03-11 09:25:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[771/800][400/402] eta 0:00:01 lr 0.000003 time 0.7449 (0.7565) loss 0.6251 (0.6202) grad_norm 0.1420 (nan) loss_scale 524288.0000 (580508.4090) mem 28968MB [2024-03-11 09:25:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 771 training takes 0:05:04 [2024-03-11 09:25:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [772/800][0/402] eta 0:22:01 lr 0.000003 time 3.2865 (3.2865) loss 0.5998 (0.5998) grad_norm 0.1568 (0.1568) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:26:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [772/800][100/402] eta 0:03:52 lr 0.000003 time 0.7460 (0.7710) loss 0.6205 (0.6176) grad_norm 0.1850 (0.1631) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:27:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [772/800][200/402] eta 0:02:33 lr 0.000003 time 0.7459 (0.7584) loss 0.5971 (0.6182) grad_norm 0.1639 (0.1650) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:28:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [772/800][300/402] eta 0:01:16 lr 0.000003 time 0.7463 (0.7543) loss 0.6590 (0.6192) grad_norm 0.1691 (0.1640) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:30:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [772/800][400/402] eta 0:00:01 lr 0.000003 time 0.7442 (0.7521) loss 0.6475 (0.6190) grad_norm 0.1451 (0.1637) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:30:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 772 training takes 0:05:02 [2024-03-11 09:30:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [773/800][0/402] eta 0:22:10 lr 0.000003 time 3.3091 (3.3091) loss 0.6353 (0.6353) grad_norm 0.1651 (0.1651) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:31:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [773/800][100/402] eta 0:03:52 lr 0.000003 time 0.7459 (0.7711) loss 
0.6441 (0.6201) grad_norm 0.1357 (0.1647) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:32:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [773/800][200/402] eta 0:02:33 lr 0.000003 time 0.7457 (0.7586) loss 0.6033 (0.6191) grad_norm 0.1430 (0.1649) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:33:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [773/800][300/402] eta 0:01:16 lr 0.000003 time 0.7458 (0.7543) loss 0.6167 (0.6203) grad_norm 0.1864 (0.1642) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:35:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [773/800][400/402] eta 0:00:01 lr 0.000003 time 0.7444 (0.7522) loss 0.6498 (0.6203) grad_norm 0.1714 (0.1641) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:35:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 773 training takes 0:05:02 [2024-03-11 09:35:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [774/800][0/402] eta 0:22:58 lr 0.000003 time 3.4294 (3.4294) loss 0.6401 (0.6401) grad_norm 0.1842 (0.1842) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:36:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [774/800][100/402] eta 0:03:53 lr 0.000003 time 0.7459 (0.7724) loss 0.6429 (0.6199) grad_norm 0.1639 (0.1632) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:37:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [774/800][200/402] eta 0:02:33 lr 0.000003 time 0.7464 (0.7592) loss 0.6033 (0.6201) grad_norm 0.1560 (0.1651) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:38:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [774/800][300/402] eta 0:01:16 lr 0.000003 time 0.7455 (0.7548) loss 0.6269 (0.6197) grad_norm 0.1888 (0.1643) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:40:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[774/800][400/402] eta 0:00:01 lr 0.000003 time 0.7448 (0.7525) loss 0.5987 (0.6195) grad_norm 0.2059 (0.1640) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:40:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 774 training takes 0:05:02 [2024-03-11 09:40:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [775/800][0/402] eta 0:22:00 lr 0.000003 time 3.2847 (3.2847) loss 0.6073 (0.6073) grad_norm 0.1448 (0.1448) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:41:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [775/800][100/402] eta 0:03:52 lr 0.000003 time 0.7460 (0.7709) loss 0.6090 (0.6207) grad_norm 0.1798 (0.1672) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:42:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [775/800][200/402] eta 0:02:33 lr 0.000003 time 0.7457 (0.7585) loss 0.6425 (0.6196) grad_norm 0.1636 (0.1642) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:43:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [775/800][300/402] eta 0:01:16 lr 0.000003 time 0.7459 (0.7543) loss 0.6270 (0.6184) grad_norm 0.1885 (0.1642) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:45:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [775/800][400/402] eta 0:00:01 lr 0.000003 time 0.7443 (0.7522) loss 0.6414 (0.6191) grad_norm 0.1609 (0.1635) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:45:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 775 training takes 0:05:02 [2024-03-11 09:45:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [776/800][0/402] eta 0:32:55 lr 0.000003 time 4.9153 (4.9153) loss 0.6560 (0.6560) grad_norm 0.1356 (0.1356) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:46:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [776/800][100/402] eta 0:03:57 lr 0.000003 time 0.7460 (0.7875) 
loss 0.6296 (0.6186) grad_norm 0.1660 (0.1644) loss_scale 1048576.0000 (877273.9802) mem 28968MB [2024-03-11 09:47:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [776/800][200/402] eta 0:02:34 lr 0.000003 time 0.7456 (0.7668) loss 0.6312 (0.6190) grad_norm 0.1690 (inf) loss_scale 524288.0000 (957282.0697) mem 28968MB [2024-03-11 09:49:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [776/800][300/402] eta 0:01:17 lr 0.000003 time 0.7465 (0.7599) loss 0.6104 (0.6199) grad_norm 0.1672 (inf) loss_scale 524288.0000 (813430.2193) mem 28968MB [2024-03-11 09:50:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [776/800][400/402] eta 0:00:01 lr 0.000003 time 0.7446 (0.7564) loss 0.6376 (0.6194) grad_norm 0.1554 (inf) loss_scale 524288.0000 (741324.9277) mem 28968MB [2024-03-11 09:50:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 776 training takes 0:05:04 [2024-03-11 09:50:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [777/800][0/402] eta 0:22:35 lr 0.000003 time 3.3720 (3.3720) loss 0.6350 (0.6350) grad_norm 0.1416 (0.1416) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:51:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [777/800][100/402] eta 0:03:53 lr 0.000003 time 0.7462 (0.7717) loss 0.6132 (0.6193) grad_norm 0.1559 (0.1678) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:52:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [777/800][200/402] eta 0:02:33 lr 0.000003 time 0.7451 (0.7589) loss 0.6430 (0.6190) grad_norm 0.1494 (0.1662) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:54:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [777/800][300/402] eta 0:01:16 lr 0.000003 time 0.7455 (0.7545) loss 0.6120 (0.6201) grad_norm 0.1753 (0.1651) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:55:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: 
[777/800][400/402] eta 0:00:01 lr 0.000003 time 0.7440 (0.7523) loss 0.6268 (0.6202) grad_norm 0.1569 (0.1654) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:55:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 777 training takes 0:05:02 [2024-03-11 09:55:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [778/800][0/402] eta 0:22:45 lr 0.000003 time 3.3964 (3.3964) loss 0.6091 (0.6091) grad_norm 0.1715 (0.1715) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:56:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [778/800][100/402] eta 0:03:53 lr 0.000003 time 0.7458 (0.7719) loss 0.6020 (0.6185) grad_norm 0.1514 (0.1674) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:57:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [778/800][200/402] eta 0:02:33 lr 0.000003 time 0.7459 (0.7589) loss 0.6369 (0.6191) grad_norm 0.1964 (0.1672) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 09:59:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [778/800][300/402] eta 0:01:16 lr 0.000003 time 0.7458 (0.7545) loss 0.6086 (0.6195) grad_norm 0.1716 (0.1667) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:00:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [778/800][400/402] eta 0:00:01 lr 0.000003 time 0.7449 (0.7523) loss 0.6103 (0.6192) grad_norm 0.1744 (0.1662) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:00:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 778 training takes 0:05:02 [2024-03-11 10:00:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [779/800][0/402] eta 0:22:29 lr 0.000003 time 3.3582 (3.3582) loss 0.5857 (0.5857) grad_norm 0.1704 (0.1704) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:01:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [779/800][100/402] eta 0:03:53 lr 0.000003 time 0.7453 (0.7716) 
loss 0.6292 (0.6183) grad_norm 0.1769 (0.1672) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:02:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [779/800][200/402] eta 0:02:33 lr 0.000003 time 0.7463 (0.7587) loss 0.6457 (0.6205) grad_norm 0.1415 (0.1648) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:04:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [779/800][300/402] eta 0:01:16 lr 0.000003 time 0.7458 (0.7545) loss 0.5953 (0.6197) grad_norm 0.1792 (0.1653) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:05:25 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [779/800][400/402] eta 0:00:01 lr 0.000003 time 0.7447 (0.7523) loss 0.6362 (0.6191) grad_norm 0.1944 (0.1657) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:05:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 779 training takes 0:05:02 [2024-03-11 10:05:29 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [780/800][0/402] eta 0:22:27 lr 0.000003 time 3.3532 (3.3532) loss 0.6494 (0.6494) grad_norm 0.1430 (0.1430) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:06:44 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [780/800][100/402] eta 0:03:53 lr 0.000003 time 0.7461 (0.7717) loss 0.6056 (0.6203) grad_norm 0.1357 (0.1659) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:07:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [780/800][200/402] eta 0:02:33 lr 0.000003 time 0.7464 (0.7588) loss 0.6387 (0.6214) grad_norm 0.1706 (0.1650) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:09:13 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [780/800][300/402] eta 0:01:16 lr 0.000003 time 0.7456 (0.7545) loss 0.6346 (0.6205) grad_norm 0.2144 (0.1666) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:10:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [780/800][400/402] eta 0:00:01 lr 0.000003 time 0.7450 (0.7523) loss 0.5977 (0.6202) grad_norm 0.1631 (0.1672) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:10:28 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 780 training takes 0:05:02 [2024-03-11 10:10:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [781/800][0/402] eta 0:32:38 lr 0.000003 time 4.8719 (4.8719) loss 0.6387 (0.6387) grad_norm 0.1449 (0.1449) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:11:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [781/800][100/402] eta 0:03:57 lr 0.000003 time 0.7455 (0.7866) loss 0.5837 (0.6188) grad_norm 0.1626 (0.1654) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:13:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [781/800][200/402] eta 0:02:34 lr 0.000003 time 0.7468 (0.7663) loss 0.6229 (0.6191) grad_norm 0.1502 (0.1661) loss_scale 1048576.0000 (555588.7761) mem 28968MB [2024-03-11 10:14:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [781/800][300/402] eta 0:01:17 lr 0.000003 time 0.7459 (0.7595) loss 0.6103 (0.6197) grad_norm 0.1560 (inf) loss_scale 524288.0000 (562608.0532) mem 28968MB [2024-03-11 10:15:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [781/800][400/402] eta 0:00:01 lr 0.000003 time 0.7443 (0.7562) loss 0.6201 (0.6189) grad_norm 0.1760 (inf) loss_scale 524288.0000 (553051.9302) mem 28968MB [2024-03-11 10:15:32 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 781 training takes 0:05:04 [2024-03-11 10:15:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [782/800][0/402] eta 0:21:45 lr 0.000003 time 3.2481 (3.2481) loss 0.6162 (0.6162) grad_norm 0.1655 (0.1655) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:16:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [782/800][100/402] eta 0:03:52 lr 0.000003 time 0.7454 (0.7705) 
loss 0.6061 (0.6188) grad_norm 0.1886 (0.1642) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:18:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [782/800][200/402] eta 0:02:33 lr 0.000003 time 0.7459 (0.7583) loss 0.5847 (0.6202) grad_norm 0.1690 (0.1649) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:19:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [782/800][300/402] eta 0:01:16 lr 0.000003 time 0.7462 (0.7541) loss 0.6088 (0.6199) grad_norm 0.1759 (0.1655) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:20:34 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [782/800][400/402] eta 0:00:01 lr 0.000003 time 0.7452 (0.7520) loss 0.6269 (0.6190) grad_norm 0.1388 (0.1667) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:20:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 782 training takes 0:05:02 [2024-03-11 10:20:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [783/800][0/402] eta 0:21:38 lr 0.000003 time 3.2308 (3.2308) loss 0.6066 (0.6066) grad_norm 0.1656 (0.1656) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:21:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [783/800][100/402] eta 0:03:52 lr 0.000003 time 0.7462 (0.7703) loss 0.6412 (0.6194) grad_norm 0.1715 (0.1669) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:23:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [783/800][200/402] eta 0:02:33 lr 0.000003 time 0.7457 (0.7581) loss 0.5994 (0.6181) grad_norm 0.1534 (0.1641) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:24:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [783/800][300/402] eta 0:01:16 lr 0.000003 time 0.7457 (0.7540) loss 0.5877 (0.6179) grad_norm 0.1721 (0.1647) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:25:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO 
Train: [783/800][400/402] eta 0:00:01 lr 0.000003 time 0.7442 (0.7519) loss 0.5985 (0.6185) grad_norm 0.1713 (0.1646) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:25:37 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 783 training takes 0:05:02 [2024-03-11 10:25:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [784/800][0/402] eta 0:22:47 lr 0.000003 time 3.4008 (3.4008) loss 0.6511 (0.6511) grad_norm 0.1242 (0.1242) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:26:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [784/800][100/402] eta 0:03:53 lr 0.000003 time 0.7461 (0.7723) loss 0.6167 (0.6202) grad_norm 0.1642 (0.1669) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:28:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [784/800][200/402] eta 0:02:33 lr 0.000003 time 0.7456 (0.7592) loss 0.6113 (0.6212) grad_norm 0.1736 (0.1670) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:29:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [784/800][300/402] eta 0:01:16 lr 0.000003 time 0.7454 (0.7547) loss 0.6360 (0.6207) grad_norm 0.1667 (0.1673) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:30:39 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [784/800][400/402] eta 0:00:01 lr 0.000003 time 0.7450 (0.7525) loss 0.6504 (0.6201) grad_norm 0.1400 (0.1668) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:30:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 784 training takes 0:05:02 [2024-03-11 10:30:43 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [785/800][0/402] eta 0:22:50 lr 0.000003 time 3.4104 (3.4104) loss 0.6222 (0.6222) grad_norm 0.1631 (0.1631) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:31:58 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [785/800][100/402] eta 0:03:53 lr 0.000003 time 0.7456 
(0.7722) loss 0.6452 (0.6185) grad_norm 0.1793 (0.1649) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:33:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [785/800][200/402] eta 0:02:33 lr 0.000003 time 0.7457 (0.7591) loss 0.6166 (0.6178) grad_norm 0.1585 (0.1652) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:34:27 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [785/800][300/402] eta 0:01:16 lr 0.000003 time 0.7466 (0.7547) loss 0.6195 (0.6181) grad_norm 0.1654 (0.1651) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:35:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [785/800][400/402] eta 0:00:01 lr 0.000003 time 0.7445 (0.7524) loss 0.6532 (0.6185) grad_norm 0.1571 (0.1656) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:35:42 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 785 training takes 0:05:02 [2024-03-11 10:35:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [786/800][0/402] eta 0:32:13 lr 0.000003 time 4.8101 (4.8101) loss 0.5953 (0.5953) grad_norm 0.1627 (0.1627) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:37:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [786/800][100/402] eta 0:03:57 lr 0.000003 time 0.7455 (0.7871) loss 0.6239 (0.6193) grad_norm 0.1598 (0.1661) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:38:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [786/800][200/402] eta 0:02:34 lr 0.000003 time 0.7458 (0.7667) loss 0.6007 (0.6194) grad_norm 0.1844 (0.1666) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:39:31 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [786/800][300/402] eta 0:01:17 lr 0.000003 time 0.7461 (0.7598) loss 0.6159 (0.6193) grad_norm 0.1651 (inf) loss_scale 524288.0000 (661891.8272) mem 28968MB [2024-03-11 10:40:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): 
INFO Train: [786/800][400/402] eta 0:00:01 lr 0.000003 time 0.7448 (0.7563) loss 0.6140 (0.6193) grad_norm 0.1613 (inf) loss_scale 524288.0000 (627576.6584) mem 28968MB [2024-03-11 10:40:46 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 786 training takes 0:05:04 [2024-03-11 10:40:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [787/800][0/402] eta 0:22:24 lr 0.000003 time 3.3435 (3.3435) loss 0.6460 (0.6460) grad_norm 0.1337 (0.1337) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:42:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [787/800][100/402] eta 0:03:53 lr 0.000003 time 0.7458 (0.7716) loss 0.6069 (0.6187) grad_norm 0.1708 (0.1670) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:43:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [787/800][200/402] eta 0:02:33 lr 0.000003 time 0.7457 (0.7588) loss 0.6468 (0.6209) grad_norm 0.2082 (0.1660) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:44:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [787/800][300/402] eta 0:01:16 lr 0.000003 time 0.7456 (0.7545) loss 0.6057 (0.6208) grad_norm 0.1854 (0.1664) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:45:48 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [787/800][400/402] eta 0:00:01 lr 0.000003 time 0.7440 (0.7523) loss 0.6248 (0.6197) grad_norm 0.2030 (0.1667) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:45:49 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 787 training takes 0:05:02 [2024-03-11 10:45:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [788/800][0/402] eta 0:22:05 lr 0.000003 time 3.2983 (3.2983) loss 0.6296 (0.6296) grad_norm 0.1185 (0.1185) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:47:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [788/800][100/402] eta 0:03:52 lr 0.000003 time 0.7455 
(0.7711) loss 0.6289 (0.6217) grad_norm 0.1700 (0.1664) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:48:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [788/800][200/402] eta 0:02:33 lr 0.000003 time 0.7459 (0.7585) loss 0.6138 (0.6203) grad_norm 0.1591 (0.1673) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:49:36 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [788/800][300/402] eta 0:01:16 lr 0.000003 time 0.7462 (0.7543) loss 0.6074 (0.6196) grad_norm 0.1557 (0.1664) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:50:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [788/800][400/402] eta 0:00:01 lr 0.000003 time 0.7443 (0.7521) loss 0.5852 (0.6199) grad_norm 0.1587 (0.1676) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:50:51 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 788 training takes 0:05:02 [2024-03-11 10:50:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [789/800][0/402] eta 0:22:16 lr 0.000003 time 3.3253 (3.3253) loss 0.6258 (0.6258) grad_norm 0.1804 (0.1804) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:52:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [789/800][100/402] eta 0:03:52 lr 0.000003 time 0.7460 (0.7714) loss 0.6110 (0.6196) grad_norm 0.1638 (0.1667) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:53:24 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [789/800][200/402] eta 0:02:33 lr 0.000003 time 0.7459 (0.7588) loss 0.6153 (0.6196) grad_norm 0.1526 (0.1677) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:54:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [789/800][300/402] eta 0:01:16 lr 0.000003 time 0.7456 (0.7545) loss 0.6358 (0.6188) grad_norm 0.1833 (0.1674) loss_scale 524288.0000 (524288.0000) mem 28968MB [2024-03-11 10:55:53 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 
176): INFO Train: [789/800][400/402] eta 0:00:01 lr 0.000003 time 0.7451 (0.7523) loss 0.5829 (0.6191) grad_norm 0.1267 (0.1676) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 10:55:54 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 789 training takes 0:05:02
[2024-03-11 10:55:57 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [790/800][0/402] eta 0:22:06 lr 0.000003 time 3.2991 (3.2991) loss 0.6424 (0.6424) grad_norm 0.1374 (0.1374) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 10:57:12 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [790/800][100/402] eta 0:03:52 lr 0.000003 time 0.7459 (0.7710) loss 0.5974 (0.6207) grad_norm 0.2264 (0.1700) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 10:58:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [790/800][200/402] eta 0:02:33 lr 0.000003 time 0.7458 (0.7585) loss 0.6221 (0.6183) grad_norm 0.1379 (0.1680) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 10:59:41 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [790/800][300/402] eta 0:01:16 lr 0.000003 time 0.7455 (0.7543) loss 0.6185 (0.6198) grad_norm 0.1506 (0.1687) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:00:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [790/800][400/402] eta 0:00:01 lr 0.000003 time 0.7445 (0.7522) loss 0.6157 (0.6195) grad_norm 0.1706 (0.1681) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:00:56 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 790 training takes 0:05:02
[2024-03-11 11:01:01 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [791/800][0/402] eta 0:33:05 lr 0.000003 time 4.9395 (4.9395) loss 0.6071 (0.6071) grad_norm 0.1618 (0.1618) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:02:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [791/800][100/402] eta 0:03:57 lr 0.000003 time 0.7458 (0.7876) loss 0.5772 (0.6191) grad_norm 0.1759 (0.1664) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:03:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [791/800][200/402] eta 0:02:34 lr 0.000003 time 0.7456 (0.7668) loss 0.6286 (0.6188) grad_norm 0.1611 (0.1682) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:04:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [791/800][300/402] eta 0:01:17 lr 0.000003 time 0.7462 (0.7599) loss 0.6266 (0.6195) grad_norm 0.1309 (0.1687) loss_scale 1048576.0000 (578284.4385) mem 28968MB
[2024-03-11 11:05:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [791/800][400/402] eta 0:00:01 lr 0.000003 time 0.7442 (0.7563) loss 0.5992 (0.6192) grad_norm 0.1603 (inf) loss_scale 524288.0000 (585738.2145) mem 28968MB
[2024-03-11 11:06:00 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 791 training takes 0:05:04
[2024-03-11 11:06:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [792/800][0/402] eta 0:22:49 lr 0.000003 time 3.4076 (3.4076) loss 0.6214 (0.6214) grad_norm 0.1848 (0.1848) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:07:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [792/800][100/402] eta 0:03:53 lr 0.000003 time 0.7457 (0.7721) loss 0.6115 (0.6224) grad_norm 0.1505 (0.1679) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:08:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [792/800][200/402] eta 0:02:33 lr 0.000003 time 0.7454 (0.7590) loss 0.6499 (0.6196) grad_norm 0.1194 (0.1680) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:09:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [792/800][300/402] eta 0:01:16 lr 0.000003 time 0.7457 (0.7546) loss 0.6272 (0.6190) grad_norm 0.1921 (0.1685) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:11:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [792/800][400/402] eta 0:00:01 lr 0.000003 time 0.7443 (0.7525) loss 0.6171 (0.6192) grad_norm 0.1730 (0.1681) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:11:03 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 792 training takes 0:05:02
[2024-03-11 11:11:06 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [793/800][0/402] eta 0:22:45 lr 0.000003 time 3.3974 (3.3974) loss 0.5979 (0.5979) grad_norm 0.1675 (0.1675) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:12:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [793/800][100/402] eta 0:03:53 lr 0.000003 time 0.7456 (0.7720) loss 0.6384 (0.6184) grad_norm 0.1553 (0.1663) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:13:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [793/800][200/402] eta 0:02:33 lr 0.000003 time 0.7466 (0.7589) loss 0.5950 (0.6188) grad_norm 0.1904 (0.1656) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:14:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [793/800][300/402] eta 0:01:16 lr 0.000003 time 0.7456 (0.7546) loss 0.6490 (0.6195) grad_norm 0.1686 (0.1680) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:16:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [793/800][400/402] eta 0:00:01 lr 0.000003 time 0.7446 (0.7523) loss 0.6180 (0.6193) grad_norm 0.1691 (0.1682) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:16:05 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 793 training takes 0:05:02
[2024-03-11 11:16:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [794/800][0/402] eta 0:21:35 lr 0.000003 time 3.2221 (3.2221) loss 0.5745 (0.5745) grad_norm 0.1587 (0.1587) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:17:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [794/800][100/402] eta 0:03:52 lr 0.000003 time 0.7462 (0.7706) loss 0.6128 (0.6199) grad_norm 0.1888 (0.1686) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:18:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [794/800][200/402] eta 0:02:33 lr 0.000003 time 0.7458 (0.7583) loss 0.5807 (0.6192) grad_norm 0.1680 (0.1672) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:19:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [794/800][300/402] eta 0:01:16 lr 0.000003 time 0.7454 (0.7541) loss 0.6095 (0.6198) grad_norm 0.2040 (0.1677) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:21:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [794/800][400/402] eta 0:00:01 lr 0.000003 time 0.7444 (0.7520) loss 0.6363 (0.6195) grad_norm 0.1590 (0.1674) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:21:08 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 794 training takes 0:05:02
[2024-03-11 11:21:11 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [795/800][0/402] eta 0:21:50 lr 0.000003 time 3.2590 (3.2590) loss 0.6251 (0.6251) grad_norm 0.1614 (0.1614) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:22:26 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [795/800][100/402] eta 0:03:52 lr 0.000003 time 0.7461 (0.7713) loss 0.6098 (0.6187) grad_norm 0.1654 (0.1665) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:23:40 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [795/800][200/402] eta 0:02:33 lr 0.000003 time 0.7459 (0.7586) loss 0.6175 (0.6192) grad_norm 0.1529 (0.1677) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:24:55 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [795/800][300/402] eta 0:01:16 lr 0.000003 time 0.7461 (0.7544) loss 0.6112 (0.6194) grad_norm 0.1765 (0.1681) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:26:09 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [795/800][400/402] eta 0:00:01 lr 0.000003 time 0.7454 (0.7522) loss 0.6246 (0.6187) grad_norm 0.1689 (0.1686) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:26:10 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 795 training takes 0:05:02
[2024-03-11 11:26:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [796/800][0/402] eta 0:34:06 lr 0.000003 time 5.0899 (5.0899) loss 0.6040 (0.6040) grad_norm 0.1993 (0.1993) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:27:30 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [796/800][100/402] eta 0:03:58 lr 0.000003 time 0.7457 (0.7889) loss 0.6330 (0.6200) grad_norm 0.1666 (0.1685) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:28:45 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [796/800][200/402] eta 0:02:35 lr 0.000003 time 0.7455 (0.7677) loss 0.5866 (0.6207) grad_norm 0.1705 (0.1697) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:29:59 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [796/800][300/402] eta 0:01:17 lr 0.000003 time 0.7464 (0.7605) loss 0.6156 (0.6193) grad_norm 0.2010 (0.1709) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:31:14 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [796/800][400/402] eta 0:00:01 lr 0.000003 time 0.7452 (0.7568) loss 0.6366 (0.6198) grad_norm 0.1593 (inf) loss_scale 524288.0000 (550437.0274) mem 28968MB
[2024-03-11 11:31:15 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 796 training takes 0:05:04
[2024-03-11 11:31:18 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [797/800][0/402] eta 0:23:03 lr 0.000003 time 3.4419 (3.4419) loss 0.6261 (0.6261) grad_norm 0.1412 (0.1412) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:32:33 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [797/800][100/402] eta 0:03:53 lr 0.000003 time 0.7461 (0.7725) loss 0.6305 (0.6207) grad_norm 0.1816 (0.1709) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:33:47 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [797/800][200/402] eta 0:02:33 lr 0.000003 time 0.7456 (0.7597) loss 0.6267 (0.6203) grad_norm 0.1997 (0.1704) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:35:02 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [797/800][300/402] eta 0:01:17 lr 0.000003 time 0.7463 (0.7551) loss 0.6339 (0.6195) grad_norm 0.1944 (0.1701) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:36:16 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [797/800][400/402] eta 0:00:01 lr 0.000003 time 0.7446 (0.7527) loss 0.6176 (0.6197) grad_norm 0.2123 (0.1705) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:36:17 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 797 training takes 0:05:02
[2024-03-11 11:36:21 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [798/800][0/402] eta 0:22:02 lr 0.000003 time 3.2898 (3.2898) loss 0.6149 (0.6149) grad_norm 0.1457 (0.1457) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:37:35 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [798/800][100/402] eta 0:03:52 lr 0.000003 time 0.7455 (0.7709) loss 0.5957 (0.6207) grad_norm 0.1734 (0.1697) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:38:50 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [798/800][200/402] eta 0:02:33 lr 0.000003 time 0.7466 (0.7584) loss 0.6556 (0.6216) grad_norm 0.1667 (0.1681) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:40:04 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [798/800][300/402] eta 0:01:16 lr 0.000003 time 0.7457 (0.7543) loss 0.6101 (0.6206) grad_norm 0.1332 (0.1702) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:41:19 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [798/800][400/402] eta 0:00:01 lr 0.000003 time 0.7454 (0.7522) loss 0.6183 (0.6210) grad_norm 0.1587 (0.1703) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:41:20 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 798 training takes 0:05:02
[2024-03-11 11:41:23 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [799/800][0/402] eta 0:22:26 lr 0.000003 time 3.3496 (3.3496) loss 0.6242 (0.6242) grad_norm 0.1593 (0.1593) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:42:38 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [799/800][100/402] eta 0:03:53 lr 0.000003 time 0.7457 (0.7716) loss 0.6187 (0.6229) grad_norm 0.1838 (0.1698) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:43:52 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [799/800][200/402] eta 0:02:33 lr 0.000003 time 0.7460 (0.7588) loss 0.6415 (0.6211) grad_norm 0.1893 (0.1678) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:45:07 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [799/800][300/402] eta 0:01:16 lr 0.000003 time 0.7456 (0.7547) loss 0.6130 (0.6207) grad_norm 0.1758 (0.1673) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:46:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 176): INFO Train: [799/800][400/402] eta 0:00:01 lr 0.000003 time 0.7443 (0.7525) loss 0.6142 (0.6201) grad_norm 0.1448 (0.1683) loss_scale 524288.0000 (524288.0000) mem 28968MB
[2024-03-11 11:46:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 185): INFO EPOCH 799 training takes 0:05:02
[2024-03-11 11:46:22 hydro_rgb_simmim_pretrain] (main_simmim_pt.py 117): INFO Training time 2 days, 19:19:17