2023-03-08 16:52:49,517 - mmseg - INFO - Multi-processing start method is `None`
2023-03-08 16:52:49,518 - mmseg - INFO - OpenCV num_threads is `112`
2023-03-08 16:52:49,518 - mmseg - INFO - OMP num threads is 1
2023-03-08 16:52:49,562 - mmseg - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.7.13 (default, Mar 29 2022, 02:18:16) [GCC 7.5.0]
CUDA available: True
GPU 0,1,2,3: A100-SXM-80GB
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.2.r11.2/compiler.29618528_0
GCC: gcc (GCC) 6.1.0
PyTorch: 1.9.0+cu111
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.0.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
TorchVision: 0.10.0+cu111
OpenCV: 4.6.0
MMCV: 1.4.2
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMSegmentation: 0.29.0+
------------------------------------------------------------
2023-03-08 16:52:49,563 - mmseg - INFO - Distributed training: True
2023-03-08 16:52:49,799 - mmseg - INFO - Config:
dataset_type = 'CityscapesDataset'
data_root = 'data/cityscapes/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 1024)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(2048, 1024), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=(512, 1024), cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size=(512, 1024), pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2048, 1024),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=4,
    train=dict(
        type='CityscapesDataset',
        data_root='data/cityscapes/',
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations'),
            dict(
                type='Resize',
                img_scale=(2048, 1024),
                ratio_range=(0.5, 2.0)),
            dict(type='RandomCrop', crop_size=(512, 1024), cat_max_ratio=0.75),
            dict(type='RandomFlip', prob=0.5),
            dict(type='PhotoMetricDistortion'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size=(512, 1024), pad_val=0, seg_pad_val=255),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_semantic_seg'])
        ]),
    val=dict(
        type='CityscapesDataset',
        data_root='data/cityscapes/',
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(2048, 1024),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='CityscapesDataset',
        data_root='data/cityscapes/',
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(2048, 1024),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
log_config = dict(
    interval=50, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = 'work_dirs/deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_best_mIoU_iter_128000_.pth'
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True
optimizer = dict(
    type='AdamW',
    lr=6e-06,
    betas=(0.9, 0.999),
    weight_decay=0.01,
    paramwise_cfg=dict(
        custom_keys=dict(
            pos_block=dict(decay_mult=0.0),
            norm=dict(decay_mult=0.0),
            head=dict(lr_mult=1.0))))
optimizer_config = dict()
lr_config = dict(
    policy='poly',
    warmup='linear',
    warmup_iters=1500,
    warmup_ratio=1e-06,
    power=1.0,
    min_lr=0.0,
    by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=20000)
checkpoint_config = dict(by_epoch=False, interval=2000)
evaluation = dict(interval=500, metric='mIoU', pre_eval=True, save_best='mIoU')
custom_imports = dict(imports='mmcls.models', allow_failed_imports=False)
norm_cfg = dict(type='SyncBN', requires_grad=True)
backbone_norm_cfg = dict(type='LN', requires_grad=True)
model = dict(
    type='DiffSegV22',
    bit_scale=0.01,
    timesteps=10,
    pretrained=None,
    backbone=dict(
        type='mmcls.ConvNeXt',
        arch='tiny',
        out_indices=[0, 1, 2, 3],
        drop_path_rate=0.4,
        layer_scale_init_value=1.0,
        gap_before_final_norm=False,
        init_cfg=None),
    neck=[
        dict(
            type='FPN',
            in_channels=[96, 192, 384, 768],
            out_channels=256,
            act_cfg=None,
            norm_cfg=dict(type='GN', num_groups=32),
            num_outs=4),
        dict(
            type='MultiStageMerging',
            in_channels=[256, 256, 256, 256],
            out_channels=256,
            kernel_size=1,
            norm_cfg=dict(type='GN', num_groups=32),
            act_cfg=None)
    ],
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=256,
        in_index=0,
        channels=256,
        num_convs=1,
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=19,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
    decode_head=dict(
        type='DeformableHeadWithTime',
        in_channels=[256],
        channels=256,
        in_index=[0],
        dropout_ratio=0.0,
        num_classes=19,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False,
        num_feature_levels=1,
        encoder=dict(
            type='DetrTransformerEncoder',
            num_layers=6,
            transformerlayers=dict(
                type='BaseTransformerLayer',
                use_time_mlp=True,
                attn_cfgs=dict(
                    type='MultiScaleDeformableAttention',
                    embed_dims=256,
                    num_levels=1,
                    num_heads=8,
                    dropout=0.0),
                ffn_cfgs=dict(
                    type='FFN',
                    embed_dims=256,
                    feedforward_channels=1024,
                    ffn_drop=0.0,
                    act_cfg=dict(type='GELU')),
                operation_order=('self_attn', 'norm', 'ffn', 'norm'))),
        positional_encoding=dict(
            type='SinePositionalEncoding',
            num_feats=128,
            normalize=True,
            offset=-0.5),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))
work_dir = './work_dirs/deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22'
gpu_ids = range(0, 4)
auto_resume = False
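[Editor's note on the schedule above: with by_epoch=False, MMCV 1.x resolves this lr_config per iteration via its PolyLrUpdaterHook, with a linear warmup over the first 1500 iterations. A minimal sketch of the resulting schedule, in plain Python rather than the mmcv source; the function name is illustrative:

def poly_lr(it, base_lr=6e-06, max_iters=20000, power=1.0, min_lr=0.0,
            warmup_iters=1500, warmup_ratio=1e-06):
    # Regular poly decay: interpolate from base_lr down to min_lr.
    coeff = (1 - it / max_iters) ** power
    lr = (base_lr - min_lr) * coeff + min_lr
    if it < warmup_iters:
        # Linear warmup: scale the regular lr from warmup_ratio*lr up to lr.
        k = (1 - it / warmup_iters) * (1 - warmup_ratio)
        lr = lr * (1 - k)
    return lr

# e.g. poly_lr(0) ~= 6e-12, poly_lr(1500) ~= 5.55e-06, poly_lr(20000) == 0.0
]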
2023-03-08 16:52:54,149 - mmseg - INFO - Set random seed to 1941061547, deterministic: True
2023-03-08 16:52:54,470 - mmseg - INFO - initialize FPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
2023-03-08 16:52:54,485 - mmseg - INFO - initialize MultiStageMerging with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
2023-03-08 16:52:54,532 - mmseg - INFO - initialize FCNHead with init_cfg {'type': 'Normal', 'std': 0.01, 'override': {'name': 'conv_seg'}}
Name of parameter - Initialization information
backbone.downsample_layers.0.0.weight - torch.Size([96, 3, 4, 4]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.0.0.bias - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.0.1.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.0.1.bias - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.1.0.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.1.0.bias - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.1.1.weight - torch.Size([192, 96, 2, 2]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.1.1.bias - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.2.0.weight - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.2.0.bias - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.2.1.weight - torch.Size([384, 192, 2, 2]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.2.1.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.3.0.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.3.0.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.3.1.weight - torch.Size([768, 384, 2, 2]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.3.1.bias - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.gamma - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.depthwise_conv.weight - torch.Size([96, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.depthwise_conv.bias - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.norm.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.norm.bias - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.pointwise_conv1.weight - torch.Size([384, 96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.pointwise_conv1.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.pointwise_conv2.weight - torch.Size([96, 384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.pointwise_conv2.bias - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.gamma - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.depthwise_conv.weight - torch.Size([96, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.depthwise_conv.bias - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.norm.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.norm.bias - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.pointwise_conv1.weight - torch.Size([384, 96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.pointwise_conv1.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.pointwise_conv2.weight - torch.Size([96, 384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.pointwise_conv2.bias - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.gamma - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.depthwise_conv.weight - torch.Size([96, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.depthwise_conv.bias - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.norm.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.norm.bias - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.pointwise_conv1.weight - torch.Size([384, 96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.pointwise_conv1.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.pointwise_conv2.weight - torch.Size([96, 384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.pointwise_conv2.bias - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.gamma - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.depthwise_conv.weight - torch.Size([192, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.depthwise_conv.bias - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.norm.weight - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.norm.bias - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.pointwise_conv1.weight - torch.Size([768, 192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.pointwise_conv1.bias - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.pointwise_conv2.weight - torch.Size([192, 768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.pointwise_conv2.bias - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.gamma - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.depthwise_conv.weight - torch.Size([192, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.depthwise_conv.bias - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.norm.weight - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.norm.bias - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.pointwise_conv1.weight - torch.Size([768, 192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.pointwise_conv1.bias - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.pointwise_conv2.weight - torch.Size([192, 768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.pointwise_conv2.bias - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.gamma - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.depthwise_conv.weight - torch.Size([192, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.depthwise_conv.bias - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.norm.weight - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.norm.bias - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.pointwise_conv1.weight - torch.Size([768, 192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.pointwise_conv1.bias - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.pointwise_conv2.weight - torch.Size([192, 768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.pointwise_conv2.bias - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.gamma - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.depthwise_conv.weight - torch.Size([384, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.depthwise_conv.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.norm.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.norm.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.pointwise_conv1.weight - torch.Size([1536, 384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.pointwise_conv1.bias - torch.Size([1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.pointwise_conv2.weight - torch.Size([384, 1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.pointwise_conv2.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.gamma - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.depthwise_conv.weight - torch.Size([384, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.depthwise_conv.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.norm.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.norm.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.pointwise_conv1.weight - torch.Size([1536, 384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.pointwise_conv1.bias - torch.Size([1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.pointwise_conv2.weight - torch.Size([384, 1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.pointwise_conv2.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.gamma - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.depthwise_conv.weight - torch.Size([384, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.depthwise_conv.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.norm.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.norm.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.pointwise_conv1.weight - torch.Size([1536, 384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.pointwise_conv1.bias - torch.Size([1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.pointwise_conv2.weight - torch.Size([384, 1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.pointwise_conv2.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.gamma - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.depthwise_conv.weight - torch.Size([384, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.depthwise_conv.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.norm.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.norm.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.pointwise_conv1.weight - torch.Size([1536, 384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.pointwise_conv1.bias - torch.Size([1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.pointwise_conv2.weight - torch.Size([384, 1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.pointwise_conv2.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.gamma - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.depthwise_conv.weight - torch.Size([384, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.depthwise_conv.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.norm.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.norm.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.pointwise_conv1.weight - torch.Size([1536, 384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.pointwise_conv1.bias - torch.Size([1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.pointwise_conv2.weight - torch.Size([384, 1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.pointwise_conv2.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.gamma - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.depthwise_conv.weight - torch.Size([384, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.depthwise_conv.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.norm.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.norm.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.pointwise_conv1.weight - torch.Size([1536, 384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.pointwise_conv1.bias - torch.Size([1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.pointwise_conv2.weight - torch.Size([384, 1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.pointwise_conv2.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.gamma - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.depthwise_conv.weight - torch.Size([384, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.depthwise_conv.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.norm.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.norm.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.pointwise_conv1.weight - torch.Size([1536, 384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.pointwise_conv1.bias - torch.Size([1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.pointwise_conv2.weight - torch.Size([384, 1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.pointwise_conv2.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.gamma - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.depthwise_conv.weight - torch.Size([384, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.depthwise_conv.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.norm.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.norm.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.pointwise_conv1.weight - torch.Size([1536, 384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.pointwise_conv1.bias - torch.Size([1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.pointwise_conv2.weight - torch.Size([384, 1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.pointwise_conv2.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.gamma - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.depthwise_conv.weight - torch.Size([384, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.depthwise_conv.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.norm.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.norm.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.pointwise_conv1.weight - torch.Size([1536, 384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.pointwise_conv1.bias - torch.Size([1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.pointwise_conv2.weight - torch.Size([384, 1536]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.pointwise_conv2.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.gamma - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.depthwise_conv.weight - torch.Size([768, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.depthwise_conv.bias - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.norm.weight - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.norm.bias - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.pointwise_conv1.weight - torch.Size([3072, 768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.pointwise_conv1.bias - torch.Size([3072]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.pointwise_conv2.weight - torch.Size([768, 3072]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.pointwise_conv2.bias - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.gamma - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.depthwise_conv.weight - torch.Size([768, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.depthwise_conv.bias - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.norm.weight - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.norm.bias - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.pointwise_conv1.weight - torch.Size([3072, 768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.pointwise_conv1.bias - torch.Size([3072]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.pointwise_conv2.weight - torch.Size([768, 3072]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.pointwise_conv2.bias - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.gamma - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.depthwise_conv.weight - torch.Size([768, 1, 7, 7]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.depthwise_conv.bias - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.norm.weight - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.norm.bias - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.pointwise_conv1.weight - torch.Size([3072, 768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.pointwise_conv1.bias - torch.Size([3072]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.pointwise_conv2.weight - torch.Size([768, 3072]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.pointwise_conv2.bias - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm0.weight - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm0.bias - torch.Size([96]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm1.weight - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm1.bias - torch.Size([192]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm2.weight - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm2.bias - torch.Size([384]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm3.weight - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm3.bias - torch.Size([768]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.0.conv.weight - torch.Size([256, 96, 1, 1]): XavierInit: gain=1, distribution=uniform, bias=0
neck.0.lateral_convs.0.gn.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.0.gn.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.1.conv.weight - torch.Size([256, 192, 1, 1]): XavierInit: gain=1, distribution=uniform, bias=0
neck.0.lateral_convs.1.gn.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.1.gn.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.2.conv.weight - torch.Size([256, 384, 1, 1]): XavierInit: gain=1, distribution=uniform, bias=0
neck.0.lateral_convs.2.gn.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.2.gn.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.3.conv.weight - torch.Size([256, 768, 1, 1]): XavierInit: gain=1, distribution=uniform, bias=0
neck.0.lateral_convs.3.gn.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.3.gn.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.0.conv.weight - torch.Size([256, 256, 3, 3]): XavierInit: gain=1, distribution=uniform, bias=0
neck.0.fpn_convs.0.gn.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.0.gn.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.1.conv.weight - torch.Size([256, 256, 3, 3]): XavierInit: gain=1, distribution=uniform, bias=0
neck.0.fpn_convs.1.gn.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.1.gn.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.2.conv.weight - torch.Size([256, 256, 3, 3]): XavierInit: gain=1, distribution=uniform, bias=0
neck.0.fpn_convs.2.gn.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.2.gn.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.3.conv.weight - torch.Size([256, 256, 3, 3]): XavierInit: gain=1, distribution=uniform, bias=0
neck.0.fpn_convs.3.gn.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.3.gn.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.1.down.conv.weight - torch.Size([256, 1024, 1, 1]): Initialized by user-defined `init_weights` in ConvModule
neck.1.down.gn.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
neck.1.down.gn.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.conv_seg.weight - torch.Size([19, 256, 1, 1]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.conv_seg.bias - torch.Size([19]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.attentions.0.sampling_offsets.weight - torch.Size([64, 256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.attentions.0.sampling_offsets.bias - torch.Size([64]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.attentions.0.attention_weights.weight - torch.Size([32, 256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.attentions.0.attention_weights.bias - torch.Size([32]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.attentions.0.value_proj.weight - torch.Size([256, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.0.attentions.0.value_proj.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.attentions.0.output_proj.weight - torch.Size([256, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.0.attentions.0.output_proj.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.time_mlp.1.weight - torch.Size([512, 1024]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.0.time_mlp.1.bias - torch.Size([512]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.ffns.0.layers.0.0.weight - torch.Size([1024, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.0.ffns.0.layers.0.0.bias - torch.Size([1024]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.ffns.0.layers.1.weight - torch.Size([256, 1024]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.0.ffns.0.layers.1.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.norms.0.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.norms.0.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.norms.1.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.norms.1.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.attentions.0.sampling_offsets.weight - torch.Size([64, 256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.attentions.0.sampling_offsets.bias - torch.Size([64]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.attentions.0.attention_weights.weight - torch.Size([32, 256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.attentions.0.attention_weights.bias - torch.Size([32]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.attentions.0.value_proj.weight - torch.Size([256, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.1.attentions.0.value_proj.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.attentions.0.output_proj.weight - torch.Size([256, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.1.attentions.0.output_proj.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.time_mlp.1.weight - torch.Size([512, 1024]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.1.time_mlp.1.bias - torch.Size([512]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.ffns.0.layers.0.0.weight - torch.Size([1024, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.1.ffns.0.layers.0.0.bias - torch.Size([1024]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.ffns.0.layers.1.weight - torch.Size([256, 1024]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.1.ffns.0.layers.1.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.norms.0.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.norms.0.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.norms.1.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.norms.1.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.attentions.0.sampling_offsets.weight - torch.Size([64, 256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.attentions.0.sampling_offsets.bias - torch.Size([64]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.attentions.0.attention_weights.weight - torch.Size([32, 256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.attentions.0.attention_weights.bias - torch.Size([32]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.attentions.0.value_proj.weight - torch.Size([256, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.2.attentions.0.value_proj.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.attentions.0.output_proj.weight - torch.Size([256, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.2.attentions.0.output_proj.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.time_mlp.1.weight - torch.Size([512, 1024]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.2.time_mlp.1.bias - torch.Size([512]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.ffns.0.layers.0.0.weight - torch.Size([1024, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.2.ffns.0.layers.0.0.bias - torch.Size([1024]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.ffns.0.layers.1.weight - torch.Size([256, 1024]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.2.ffns.0.layers.1.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.norms.0.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.norms.0.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.norms.1.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.norms.1.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.attentions.0.sampling_offsets.weight - torch.Size([64, 256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.attentions.0.sampling_offsets.bias - torch.Size([64]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.attentions.0.attention_weights.weight - torch.Size([32, 256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.attentions.0.attention_weights.bias - torch.Size([32]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.attentions.0.value_proj.weight - torch.Size([256, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.3.attentions.0.value_proj.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.attentions.0.output_proj.weight - torch.Size([256, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.3.attentions.0.output_proj.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.time_mlp.1.weight - torch.Size([512, 1024]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.3.time_mlp.1.bias - torch.Size([512]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.ffns.0.layers.0.0.weight - torch.Size([1024, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.3.ffns.0.layers.0.0.bias - torch.Size([1024]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.ffns.0.layers.1.weight - torch.Size([256, 1024]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.3.ffns.0.layers.1.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.norms.0.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.norms.0.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.norms.1.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.norms.1.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.attentions.0.sampling_offsets.weight - torch.Size([64, 256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.attentions.0.sampling_offsets.bias - torch.Size([64]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.attentions.0.attention_weights.weight - torch.Size([32, 256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.attentions.0.attention_weights.bias - torch.Size([32]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.attentions.0.value_proj.weight - torch.Size([256, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.4.attentions.0.value_proj.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.attentions.0.output_proj.weight - torch.Size([256, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.4.attentions.0.output_proj.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.time_mlp.1.weight - torch.Size([512, 1024]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.4.time_mlp.1.bias - torch.Size([512]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.ffns.0.layers.0.0.weight - torch.Size([1024, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.4.ffns.0.layers.0.0.bias - torch.Size([1024]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.ffns.0.layers.1.weight - torch.Size([256, 1024]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.4.ffns.0.layers.1.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.norms.0.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.norms.0.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.norms.1.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.norms.1.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.attentions.0.sampling_offsets.weight - torch.Size([64, 256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.attentions.0.sampling_offsets.bias - torch.Size([64]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.attentions.0.attention_weights.weight - torch.Size([32, 256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.attentions.0.attention_weights.bias - torch.Size([32]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.attentions.0.value_proj.weight - torch.Size([256, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.5.attentions.0.value_proj.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.attentions.0.output_proj.weight - torch.Size([256, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.5.attentions.0.output_proj.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.time_mlp.1.weight - torch.Size([512, 1024]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.5.time_mlp.1.bias - torch.Size([512]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.ffns.0.layers.0.0.weight - torch.Size([1024, 256]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.5.ffns.0.layers.0.0.bias - torch.Size([1024]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.ffns.0.layers.1.weight - torch.Size([256, 1024]): Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.5.ffns.0.layers.1.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.norms.0.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.norms.0.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.norms.1.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.norms.1.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
auxiliary_head.conv_seg.weight - torch.Size([19, 256, 1, 1]): NormalInit: mean=0, std=0.01, bias=0
auxiliary_head.conv_seg.bias - torch.Size([19]): NormalInit: mean=0, std=0.01, bias=0
auxiliary_head.convs.0.conv.weight - torch.Size([256, 256, 3, 3]): The value is the same before and after calling `init_weights` of DiffSegV22
auxiliary_head.convs.0.bn.weight - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
auxiliary_head.convs.0.bn.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
embedding_table.weight - torch.Size([20, 256]): The value is the same before and after calling `init_weights` of DiffSegV22
transform.conv.weight - torch.Size([256, 512, 1, 1]): Initialized by user-defined `init_weights` in ConvModule
transform.conv.bias - torch.Size([256]): The value is the same before and after calling `init_weights` of DiffSegV22
time_mlp.0.weights - torch.Size([8]): The value is the same before and after calling `init_weights` of DiffSegV22
time_mlp.1.weight - torch.Size([1024, 17]): The value is the same before and after calling `init_weights` of DiffSegV22
time_mlp.1.bias - torch.Size([1024]): The value is the same before and after calling `init_weights` of DiffSegV22
time_mlp.3.weight - torch.Size([1024, 1024]): The value is the same before and after calling `init_weights` of DiffSegV22
time_mlp.3.bias - torch.Size([1024]): The value is the same before and after calling `init_weights` of DiffSegV22
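[Editor's note on the time_mlp shapes above: `time_mlp.0.weights` of size [8] feeding a Linear with weight [1024, 17] (17 = 2*8 + 1) is consistent with the learned sinusoidal time embedding used in common diffusion codebases, but DiffSegV22's actual module is not shown here. A hypothetical reconstruction, for orientation only:

import math
import torch
import torch.nn as nn

class LearnedSinusoidalPosEmb(nn.Module):
    def __init__(self, dim=16):  # dim // 2 = 8 learned frequencies -> weights [8]
        super().__init__()
        self.weights = nn.Parameter(torch.randn(dim // 2))

    def forward(self, t):                                     # t: (B,) timesteps
        t = t[:, None]                                        # (B, 1)
        freqs = t * self.weights[None, :] * 2 * math.pi      # (B, 8)
        fourier = torch.cat([freqs.sin(), freqs.cos()], dim=-1)  # (B, 16)
        return torch.cat([t, fourier], dim=-1)                # (B, 17)

time_mlp = nn.Sequential(
    LearnedSinusoidalPosEmb(16),
    nn.Linear(17, 1024),    # matches time_mlp.1.weight - torch.Size([1024, 17])
    nn.GELU(),              # index 2, carries no parameters (hence 0, 1, 3 in the log)
    nn.Linear(1024, 1024),  # matches time_mlp.3.weight - torch.Size([1024, 1024])
)
]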
2023-03-08 16:52:54,536 - mmseg - INFO - DiffSegV22(
  (backbone): ConvNeXt(
    (downsample_layers): ModuleList(
      (0): Sequential(
        (0): Conv2d(3, 96, kernel_size=(4, 4), stride=(4, 4))
        (1): LayerNorm2d((96,), eps=1e-06, elementwise_affine=True)
      )
      (1): Sequential(
        (0): LayerNorm2d((96,), eps=1e-05, elementwise_affine=True)
        (1): Conv2d(96, 192, kernel_size=(2, 2), stride=(2, 2))
      )
      (2): Sequential(
        (0): LayerNorm2d((192,), eps=1e-05, elementwise_affine=True)
        (1): Conv2d(192, 384, kernel_size=(2, 2), stride=(2, 2))
      )
      (3): Sequential(
        (0): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True)
        (1): Conv2d(384, 768, kernel_size=(2, 2), stride=(2, 2))
      )
    )
    (stages): ModuleList(
      (0): Sequential(
        (0): ConvNeXtBlock(
          (depthwise_conv): Conv2d(96, 96, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=96)
          (norm): LayerNorm2d((96,), eps=1e-06, elementwise_affine=True)
          (pointwise_conv1): Linear(in_features=96, out_features=384, bias=True)
          (act): GELU()
          (pointwise_conv2): Linear(in_features=384, out_features=96, bias=True)
          (drop_path): Identity()
        )
        (1): ConvNeXtBlock(
          (depthwise_conv): Conv2d(96, 96, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=96)
          (norm): LayerNorm2d((96,), eps=1e-06, elementwise_affine=True)
          (pointwise_conv1): Linear(in_features=96, out_features=384, bias=True)
          (act): GELU()
          (pointwise_conv2): Linear(in_features=384, out_features=96, bias=True)
          (drop_path): DropPath()
        )
        (2): ConvNeXtBlock(
          (depthwise_conv): Conv2d(96, 96, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=96)
          (norm): LayerNorm2d((96,), eps=1e-06, elementwise_affine=True)
          (pointwise_conv1): Linear(in_features=96, out_features=384, bias=True)
          (act): GELU()
          (pointwise_conv2): Linear(in_features=384, out_features=96, bias=True)
          (drop_path): DropPath()
        )
      )
      (1): Sequential(
        (0): ConvNeXtBlock(
          (depthwise_conv): Conv2d(192, 192, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=192)
          (norm): LayerNorm2d((192,), eps=1e-06, elementwise_affine=True)
          (pointwise_conv1): Linear(in_features=192, out_features=768, bias=True)
          (act): GELU()
          (pointwise_conv2): Linear(in_features=768, out_features=192, bias=True)
          (drop_path): DropPath()
        )
        (1): ConvNeXtBlock(
          (depthwise_conv): Conv2d(192, 192, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=192)
          (norm): LayerNorm2d((192,), eps=1e-06, elementwise_affine=True)
          (pointwise_conv1): Linear(in_features=192, out_features=768, bias=True)
          (act): GELU()
          (pointwise_conv2): Linear(in_features=768, out_features=192, bias=True)
          (drop_path): DropPath()
        )
        (2): ConvNeXtBlock(
          (depthwise_conv): Conv2d(192, 192, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=192)
          (norm): LayerNorm2d((192,), eps=1e-06, elementwise_affine=True)
          (pointwise_conv1): Linear(in_features=192, out_features=768, bias=True)
          (act): GELU()
          (pointwise_conv2): Linear(in_features=768, out_features=192, bias=True)
          (drop_path): DropPath()
        )
      )
      (2): Sequential(
        (0): ConvNeXtBlock(
          (depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
          (norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
          (pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True)
          (act): GELU()
          (pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True)
          (drop_path): DropPath()
        )
        (1): ConvNeXtBlock(
          (depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
          (norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
          (pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True)
          (act): GELU()
          (pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True)
          (drop_path): DropPath()
        )
        (2): ConvNeXtBlock(
          (depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
          (norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
          (pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True)
          (act): GELU()
          (pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True)
          (drop_path): DropPath()
        )
        (3): ConvNeXtBlock(
          (depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
          (norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
          (pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True)
          (act): GELU()
          (pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True)
          (drop_path): DropPath()
        )
        (4): ConvNeXtBlock(
          (depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
          (norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
          (pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True)
          (act): GELU()
(pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True) (drop_path): DropPath() ) (5): ConvNeXtBlock( (depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384) (norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True) (pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True) (drop_path): DropPath() ) (6): ConvNeXtBlock( (depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384) (norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True) (pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True) (drop_path): DropPath() ) (7): ConvNeXtBlock( (depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384) (norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True) (pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True) (drop_path): DropPath() ) (8): ConvNeXtBlock( (depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384) (norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True) (pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True) (drop_path): DropPath() ) ) (3): Sequential( (0): ConvNeXtBlock( (depthwise_conv): Conv2d(768, 768, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=768) (norm): LayerNorm2d((768,), eps=1e-06, elementwise_affine=True) (pointwise_conv1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (pointwise_conv2): Linear(in_features=3072, out_features=768, bias=True) (drop_path): DropPath() ) (1): ConvNeXtBlock( (depthwise_conv): Conv2d(768, 768, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=768) (norm): LayerNorm2d((768,), eps=1e-06, elementwise_affine=True) (pointwise_conv1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (pointwise_conv2): Linear(in_features=3072, out_features=768, bias=True) (drop_path): DropPath() ) (2): ConvNeXtBlock( (depthwise_conv): Conv2d(768, 768, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=768) (norm): LayerNorm2d((768,), eps=1e-06, elementwise_affine=True) (pointwise_conv1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (pointwise_conv2): Linear(in_features=3072, out_features=768, bias=True) (drop_path): DropPath() ) ) ) (norm0): LayerNorm2d((96,), eps=1e-06, elementwise_affine=True) (norm1): LayerNorm2d((192,), eps=1e-06, elementwise_affine=True) (norm2): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True) (norm3): LayerNorm2d((768,), eps=1e-06, elementwise_affine=True) ) (neck): Sequential( (0): FPN( (lateral_convs): ModuleList( (0): ConvModule( (conv): Conv2d(96, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (gn): GroupNorm(32, 256, eps=1e-05, affine=True) ) (1): ConvModule( (conv): Conv2d(192, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (gn): GroupNorm(32, 256, eps=1e-05, affine=True) ) (2): ConvModule( (conv): Conv2d(384, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (gn): GroupNorm(32, 256, eps=1e-05, affine=True) ) (3): ConvModule( (conv): Conv2d(768, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (gn): GroupNorm(32, 256, 
eps=1e-05, affine=True) ) ) (fpn_convs): ModuleList( (0): ConvModule( (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (gn): GroupNorm(32, 256, eps=1e-05, affine=True) ) (1): ConvModule( (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (gn): GroupNorm(32, 256, eps=1e-05, affine=True) ) (2): ConvModule( (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (gn): GroupNorm(32, 256, eps=1e-05, affine=True) ) (3): ConvModule( (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (gn): GroupNorm(32, 256, eps=1e-05, affine=True) ) ) ) init_cfg={'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'} (1): MultiStageMerging( (down): ConvModule( (conv): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (gn): GroupNorm(32, 256, eps=1e-05, affine=True) ) ) init_cfg={'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'} ) (decode_head): DeformableHeadWithTime( input_transform=multiple_select, ignore_index=255, align_corners=False (loss_decode): CrossEntropyLoss(avg_non_ignore=False) (conv_seg): Conv2d(256, 19, kernel_size=(1, 1), stride=(1, 1)) (encoder): DetrTransformerEncoder( (layers): ModuleList( (0): BaseTransformerLayer( (attentions): ModuleList( (0): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=64, bias=True) (attention_weights): Linear(in_features=256, out_features=32, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (time_mlp): Sequential( (0): SiLU() (1): Linear(in_features=1024, out_features=512, bias=True) ) (ffns): ModuleList( (0): FFN( (activate): GELU() (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=1024, bias=True) (1): GELU() (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=1024, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) (dropout_layer): Identity() ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (1): BaseTransformerLayer( (attentions): ModuleList( (0): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=64, bias=True) (attention_weights): Linear(in_features=256, out_features=32, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (time_mlp): Sequential( (0): SiLU() (1): Linear(in_features=1024, out_features=512, bias=True) ) (ffns): ModuleList( (0): FFN( (activate): GELU() (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=1024, bias=True) (1): GELU() (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=1024, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) (dropout_layer): Identity() ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (2): BaseTransformerLayer( (attentions): ModuleList( (0): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=64, bias=True) (attention_weights): Linear(in_features=256, out_features=32, bias=True) (value_proj): Linear(in_features=256, 
out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (time_mlp): Sequential( (0): SiLU() (1): Linear(in_features=1024, out_features=512, bias=True) ) (ffns): ModuleList( (0): FFN( (activate): GELU() (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=1024, bias=True) (1): GELU() (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=1024, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) (dropout_layer): Identity() ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (3): BaseTransformerLayer( (attentions): ModuleList( (0): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=64, bias=True) (attention_weights): Linear(in_features=256, out_features=32, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (time_mlp): Sequential( (0): SiLU() (1): Linear(in_features=1024, out_features=512, bias=True) ) (ffns): ModuleList( (0): FFN( (activate): GELU() (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=1024, bias=True) (1): GELU() (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=1024, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) (dropout_layer): Identity() ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (4): BaseTransformerLayer( (attentions): ModuleList( (0): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=64, bias=True) (attention_weights): Linear(in_features=256, out_features=32, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (time_mlp): Sequential( (0): SiLU() (1): Linear(in_features=1024, out_features=512, bias=True) ) (ffns): ModuleList( (0): FFN( (activate): GELU() (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=1024, bias=True) (1): GELU() (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=1024, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) (dropout_layer): Identity() ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (5): BaseTransformerLayer( (attentions): ModuleList( (0): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=64, bias=True) (attention_weights): Linear(in_features=256, out_features=32, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (time_mlp): Sequential( (0): SiLU() (1): Linear(in_features=1024, out_features=512, bias=True) ) (ffns): ModuleList( (0): FFN( (activate): GELU() (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=1024, bias=True) (1): GELU() (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=1024, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) (dropout_layer): Identity() ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): 
LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) ) ) (positional_encoding): SinePositionalEncoding(num_feats=128, temperature=10000, normalize=True, scale=6.283185307179586, eps=1e-06) ) init_cfg={'type': 'Normal', 'std': 0.01, 'override': {'name': 'conv_seg'}} (auxiliary_head): FCNHead( input_transform=None, ignore_index=255, align_corners=False (loss_decode): CrossEntropyLoss(avg_non_ignore=False) (conv_seg): Conv2d(256, 19, kernel_size=(1, 1), stride=(1, 1)) (dropout): Dropout2d(p=0.1, inplace=False) (convs): Sequential( (0): ConvModule( (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn): SyncBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (activate): ReLU(inplace=True) ) ) ) init_cfg={'type': 'Normal', 'std': 0.01, 'override': {'name': 'conv_seg'}} (embedding_table): Embedding(20, 256) (transform): ConvModule( (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1)) ) (time_mlp): Sequential( (0): LearnedSinusoidalPosEmb() (1): Linear(in_features=17, out_features=1024, bias=True) (2): GELU() (3): Linear(in_features=1024, out_features=1024, bias=True) ) ) 2023-03-08 16:52:54,542 - mmseg - INFO - Model size:136.06 2023-03-08 16:52:54,588 - mmseg - INFO - Loaded 2975 images 2023-03-08 16:52:55,010 - mmseg - INFO - Loaded 500 images 2023-03-08 16:52:55,011 - mmseg - INFO - load checkpoint from local path: work_dirs/deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_best_mIoU_iter_128000_.pth 2023-03-08 16:52:55,104 - mmseg - INFO - Hooks will be executed in the following order: before_run: (VERY_HIGH ) PolyLrUpdaterHook (NORMAL ) CheckpointHook (LOW ) DistEvalHook (VERY_LOW ) TextLoggerHook -------------------- before_train_epoch: (VERY_HIGH ) PolyLrUpdaterHook (LOW ) IterTimerHook (LOW ) DistEvalHook (VERY_LOW ) TextLoggerHook -------------------- before_train_iter: (VERY_HIGH ) PolyLrUpdaterHook (LOW ) IterTimerHook (LOW ) DistEvalHook -------------------- after_train_iter: (ABOVE_NORMAL) OptimizerHook (NORMAL ) CheckpointHook (LOW ) IterTimerHook (LOW ) DistEvalHook (VERY_LOW ) TextLoggerHook -------------------- after_train_epoch: (NORMAL ) CheckpointHook (LOW ) DistEvalHook (VERY_LOW ) TextLoggerHook -------------------- before_val_epoch: (LOW ) IterTimerHook (VERY_LOW ) TextLoggerHook -------------------- before_val_iter: (LOW ) IterTimerHook -------------------- after_val_iter: (LOW ) IterTimerHook -------------------- after_val_epoch: (VERY_LOW ) TextLoggerHook -------------------- after_run: (VERY_LOW ) TextLoggerHook -------------------- 2023-03-08 16:52:55,104 - mmseg - INFO - workflow: [('train', 1)], max: 20000 iters 2023-03-08 16:53:49,599 - mmseg - INFO - Iter [50/20000] lr: 1.955e-07, eta: 4:35:26, time: 0.828, data_time: 0.017, memory: 22072, pred_decode.loss_ce: 0.1134, pred_decode.acc_seg: 96.4060, aux.loss_ce: 0.0416, aux.acc_seg: 96.0319, loss: 0.1550 2023-03-08 16:54:39,586 - mmseg - INFO - Iter [100/20000] lr: 3.940e-07, eta: 5:03:10, time: 1.000, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.1234, pred_decode.acc_seg: 96.2790, aux.loss_ce: 0.0412, aux.acc_seg: 95.9889, loss: 0.1646 2023-03-08 16:55:29,563 - mmseg - INFO - Iter [150/20000] lr: 5.916e-07, eta: 5:11:50, time: 1.000, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.1157, pred_decode.acc_seg: 96.3692, aux.loss_ce: 0.0419, aux.acc_seg: 96.0071, loss: 0.1576 2023-03-08 16:56:22,387 - mmseg - INFO - Iter [200/20000] lr: 7.881e-07, eta: 5:20:26, time: 1.056, data_time: 0.059, memory: 22072, 
pred_decode.loss_ce: 0.0981, pred_decode.acc_seg: 96.6626, aux.loss_ce: 0.0380, aux.acc_seg: 96.2943, loss: 0.1362 2023-03-08 16:56:49,326 - mmseg - INFO - Iter [250/20000] lr: 9.836e-07, eta: 4:51:11, time: 0.539, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.1044, pred_decode.acc_seg: 96.5844, aux.loss_ce: 0.0392, aux.acc_seg: 96.2089, loss: 0.1437 2023-03-08 16:57:12,525 - mmseg - INFO - Iter [300/20000] lr: 1.178e-06, eta: 4:27:25, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.1116, pred_decode.acc_seg: 96.5339, aux.loss_ce: 0.0394, aux.acc_seg: 96.1893, loss: 0.1511 2023-03-08 16:57:35,716 - mmseg - INFO - Iter [350/20000] lr: 1.372e-06, eta: 4:10:20, time: 0.464, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.1016, pred_decode.acc_seg: 96.5901, aux.loss_ce: 0.0400, aux.acc_seg: 96.1521, loss: 0.1416 2023-03-08 16:58:01,484 - mmseg - INFO - Iter [400/20000] lr: 1.564e-06, eta: 3:59:32, time: 0.515, data_time: 0.056, memory: 22072, pred_decode.loss_ce: 0.1042, pred_decode.acc_seg: 96.4568, aux.loss_ce: 0.0406, aux.acc_seg: 96.0870, loss: 0.1448 2023-03-08 16:58:24,666 - mmseg - INFO - Iter [450/20000] lr: 1.756e-06, eta: 3:49:09, time: 0.464, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.0918, pred_decode.acc_seg: 96.6885, aux.loss_ce: 0.0396, aux.acc_seg: 96.1739, loss: 0.1315 2023-03-08 16:58:47,940 - mmseg - INFO - Iter [500/20000] lr: 1.946e-06, eta: 3:40:50, time: 0.465, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.0912, pred_decode.acc_seg: 96.6136, aux.loss_ce: 0.0400, aux.acc_seg: 96.1327, loss: 0.1312 2023-03-08 17:00:26,107 - mmseg - INFO - per class results: 2023-03-08 17:00:26,108 - mmseg - INFO -
+---------------+-------+-------+
| Class         | IoU   | Acc   |
+---------------+-------+-------+
| road          | 98.64 | 99.39 |
| sidewalk      | 87.98 | 94.05 |
| building      | 93.41 | 97.38 |
| wall          | 56.4  | 60.95 |
| fence         | 65.57 | 74.13 |
| pole          | 71.29 | 81.93 |
| traffic light | 75.85 | 86.07 |
| traffic sign  | 83.52 | 89.53 |
| vegetation    | 93.03 | 96.38 |
| terrain       | 65.37 | 72.62 |
| sky           | 95.37 | 98.56 |
| person        | 85.16 | 92.44 |
| rider         | 66.45 | 77.64 |
| car           | 96.26 | 98.24 |
| truck         | 87.35 | 91.57 |
| bus           | 92.94 | 95.42 |
| train         | 88.41 | 90.8  |
| motorcycle    | 72.27 | 79.13 |
| bicycle       | 81.62 | 91.82 |
+---------------+-------+-------+
2023-03-08 17:00:26,108 - mmseg - INFO - Summary: 2023-03-08 17:00:26,108 - mmseg - INFO -
+-------+-------+-------+
| aAcc  | mIoU  | mAcc  |
+-------+-------+-------+
| 96.72 | 81.94 | 87.79 |
+-------+-------+-------+
2023-03-08 17:00:26,782 - mmseg - INFO - Now best checkpoint is saved as best_mIoU_iter_500.pth. 2023-03-08 17:00:26,782 - mmseg - INFO - Best mIoU is 0.8194 at 500 iter.
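The `time_mlp` printed in the model summary above is Sequential(LearnedSinusoidalPosEmb(), Linear(17, 1024), GELU(), Linear(1024, 1024)), and the init report lists `time_mlp.0.weights` with shape [8]. Those shapes are consistent with the learned sinusoidal timestep embedding common in diffusion codebases, where a scalar timestep t is mapped to [t, sin(2*pi*w*t), cos(2*pi*w*t)] with 8 learnable frequencies w, giving 2*8+1 = 17 features. A minimal sketch under that assumption (the actual DiffSegV22 source may differ):

```python
import math
import torch
import torch.nn as nn

class LearnedSinusoidalPosEmb(nn.Module):
    """Fourier features with learnable frequencies: t -> [t, sin(2*pi*w*t), cos(2*pi*w*t)]."""

    def __init__(self, dim=16):
        super().__init__()
        assert dim % 2 == 0
        self.weights = nn.Parameter(torch.randn(dim // 2))  # shape [8], as in the init report

    def forward(self, t):                                   # t: (B,) diffusion timesteps
        t = t[:, None]                                      # (B, 1)
        freqs = t * self.weights[None, :] * 2 * math.pi     # (B, 8)
        return torch.cat([t, freqs.sin(), freqs.cos()], dim=-1)  # (B, 17)

time_mlp = nn.Sequential(
    LearnedSinusoidalPosEmb(16),    # 17 output features, matching Linear(in_features=17, ...)
    nn.Linear(17, 1024),
    nn.GELU(),
    nn.Linear(1024, 1024),
)
emb = time_mlp(torch.randint(0, 10, (4,)).float())  # timesteps=10 in this config
print(emb.shape)                                    # torch.Size([4, 1024])
```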
2023-03-08 17:00:26,782 - mmseg - INFO - Iter(val) [125] aAcc: 0.9672, mIoU: 0.8194, mAcc: 0.8779, IoU.road: 0.9864, IoU.sidewalk: 0.8798, IoU.building: 0.9341, IoU.wall: 0.5640, IoU.fence: 0.6557, IoU.pole: 0.7129, IoU.traffic light: 0.7585, IoU.traffic sign: 0.8352, IoU.vegetation: 0.9303, IoU.terrain: 0.6537, IoU.sky: 0.9537, IoU.person: 0.8516, IoU.rider: 0.6645, IoU.car: 0.9626, IoU.truck: 0.8735, IoU.bus: 0.9294, IoU.train: 0.8841, IoU.motorcycle: 0.7227, IoU.bicycle: 0.8162, Acc.road: 0.9939, Acc.sidewalk: 0.9405, Acc.building: 0.9738, Acc.wall: 0.6095, Acc.fence: 0.7413, Acc.pole: 0.8193, Acc.traffic light: 0.8607, Acc.traffic sign: 0.8953, Acc.vegetation: 0.9638, Acc.terrain: 0.7262, Acc.sky: 0.9856, Acc.person: 0.9244, Acc.rider: 0.7764, Acc.car: 0.9824, Acc.truck: 0.9157, Acc.bus: 0.9542, Acc.train: 0.9080, Acc.motorcycle: 0.7913, Acc.bicycle: 0.9182 2023-03-08 17:00:49,956 - mmseg - INFO - Iter [550/20000] lr: 2.136e-06, eta: 4:32:10, time: 2.440, data_time: 1.985, memory: 22072, pred_decode.loss_ce: 0.0908, pred_decode.acc_seg: 96.6846, aux.loss_ce: 0.0395, aux.acc_seg: 96.1980, loss: 0.1302 2023-03-08 17:01:15,810 - mmseg - INFO - Iter [600/20000] lr: 2.324e-06, eta: 4:22:46, time: 0.517, data_time: 0.058, memory: 22072, pred_decode.loss_ce: 0.0918, pred_decode.acc_seg: 96.6927, aux.loss_ce: 0.0392, aux.acc_seg: 96.2090, loss: 0.1310 2023-03-08 17:01:39,047 - mmseg - INFO - Iter [650/20000] lr: 2.512e-06, eta: 4:13:28, time: 0.465, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.0917, pred_decode.acc_seg: 96.6253, aux.loss_ce: 0.0403, aux.acc_seg: 96.0885, loss: 0.1320 2023-03-08 17:02:02,342 - mmseg - INFO - Iter [700/20000] lr: 2.698e-06, eta: 4:05:27, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0946, pred_decode.acc_seg: 96.6179, aux.loss_ce: 0.0404, aux.acc_seg: 96.1226, loss: 0.1350 2023-03-08 17:02:28,215 - mmseg - INFO - Iter [750/20000] lr: 2.884e-06, eta: 3:59:34, time: 0.517, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0908, pred_decode.acc_seg: 96.6657, aux.loss_ce: 0.0400, aux.acc_seg: 96.1454, loss: 0.1308 2023-03-08 17:02:51,326 - mmseg - INFO - Iter [800/20000] lr: 3.068e-06, eta: 3:53:15, time: 0.462, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.0878, pred_decode.acc_seg: 96.7597, aux.loss_ce: 0.0383, aux.acc_seg: 96.3302, loss: 0.1261 2023-03-08 17:03:14,395 - mmseg - INFO - Iter [850/20000] lr: 3.252e-06, eta: 3:47:37, time: 0.461, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.0909, pred_decode.acc_seg: 96.6438, aux.loss_ce: 0.0398, aux.acc_seg: 96.1616, loss: 0.1306 2023-03-08 17:03:37,519 - mmseg - INFO - Iter [900/20000] lr: 3.434e-06, eta: 3:42:35, time: 0.462, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.0904, pred_decode.acc_seg: 96.6393, aux.loss_ce: 0.0400, aux.acc_seg: 96.1476, loss: 0.1304 2023-03-08 17:04:03,251 - mmseg - INFO - Iter [950/20000] lr: 3.616e-06, eta: 3:38:55, time: 0.515, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0865, pred_decode.acc_seg: 96.6811, aux.loss_ce: 0.0398, aux.acc_seg: 96.1517, loss: 0.1263 2023-03-08 17:04:26,503 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py 2023-03-08 17:04:26,503 - mmseg - INFO - Iter [1000/20000] lr: 3.796e-06, eta: 3:34:48, time: 0.465, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0842, pred_decode.acc_seg: 96.7599, aux.loss_ce: 0.0384, aux.acc_seg: 96.2946, loss: 0.1226 2023-03-08 17:05:51,586 - mmseg - INFO - per class results: 
2023-03-08 17:05:51,588 - mmseg - INFO -
+---------------+-------+-------+
| Class         | IoU   | Acc   |
+---------------+-------+-------+
| road          | 98.68 | 99.26 |
| sidewalk      | 88.34 | 94.66 |
| building      | 93.57 | 97.3  |
| wall          | 56.21 | 61.02 |
| fence         | 66.41 | 74.31 |
| pole          | 72.98 | 84.95 |
| traffic light | 75.88 | 85.72 |
| traffic sign  | 83.89 | 89.82 |
| vegetation    | 93.19 | 96.64 |
| terrain       | 65.03 | 73.42 |
| sky           | 95.37 | 98.52 |
| person        | 85.27 | 92.41 |
| rider         | 66.91 | 79.02 |
| car           | 96.2  | 98.24 |
| truck         | 85.96 | 89.58 |
| bus           | 93.18 | 95.41 |
| train         | 88.94 | 91.51 |
| motorcycle    | 72.76 | 80.83 |
| bicycle       | 81.68 | 91.57 |
+---------------+-------+-------+
2023-03-08 17:05:51,588 - mmseg - INFO - Summary: 2023-03-08 17:05:51,588 - mmseg - INFO -
+-------+-------+-------+
| aAcc  | mIoU  | mAcc  |
+-------+-------+-------+
| 96.78 | 82.13 | 88.12 |
+-------+-------+-------+
2023-03-08 17:05:52,308 - mmseg - INFO - Now best checkpoint is saved as best_mIoU_iter_1000.pth. 2023-03-08 17:05:52,308 - mmseg - INFO - Best mIoU is 0.8213 at 1000 iter. 2023-03-08 17:05:52,308 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py 2023-03-08 17:05:52,308 - mmseg - INFO - Iter(val) [125] aAcc: 0.9678, mIoU: 0.8213, mAcc: 0.8812, IoU.road: 0.9868, IoU.sidewalk: 0.8834, IoU.building: 0.9357, IoU.wall: 0.5621, IoU.fence: 0.6641, IoU.pole: 0.7298, IoU.traffic light: 0.7588, IoU.traffic sign: 0.8389, IoU.vegetation: 0.9319, IoU.terrain: 0.6503, IoU.sky: 0.9537, IoU.person: 0.8527, IoU.rider: 0.6691, IoU.car: 0.9620, IoU.truck: 0.8596, IoU.bus: 0.9318, IoU.train: 0.8894, IoU.motorcycle: 0.7276, IoU.bicycle: 0.8168, Acc.road: 0.9926, Acc.sidewalk: 0.9466, Acc.building: 0.9730, Acc.wall: 0.6102, Acc.fence: 0.7431, Acc.pole: 0.8495, Acc.traffic light: 0.8572, Acc.traffic sign: 0.8982, Acc.vegetation: 0.9664, Acc.terrain: 0.7342, Acc.sky: 0.9852, Acc.person: 0.9241, Acc.rider: 0.7902, Acc.car: 0.9824, Acc.truck: 0.8958, Acc.bus: 0.9541, Acc.train: 0.9151, Acc.motorcycle: 0.8083, Acc.bicycle: 0.9157 2023-03-08 17:06:15,596 - mmseg - INFO - Iter [1050/20000] lr: 3.976e-06, eta: 3:56:50, time: 2.182, data_time: 1.724, memory: 22072, pred_decode.loss_ce: 0.0849, pred_decode.acc_seg: 96.7223, aux.loss_ce: 0.0398, aux.acc_seg: 96.1278, loss: 0.1247 2023-03-08 17:06:38,916 - mmseg - INFO - Iter [1100/20000] lr: 4.154e-06, eta: 3:52:09, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0914, pred_decode.acc_seg: 96.5312, aux.loss_ce: 0.0413, aux.acc_seg: 96.0190, loss: 0.1327 2023-03-08 17:07:04,789 - mmseg - INFO - Iter [1150/20000] lr: 4.332e-06, eta: 3:48:33, time: 0.517, data_time: 0.060, memory: 22072, pred_decode.loss_ce: 0.0853, pred_decode.acc_seg: 96.7118, aux.loss_ce: 0.0400, aux.acc_seg: 96.1225, loss: 0.1253 2023-03-08 17:07:28,078 - mmseg - INFO - Iter [1200/20000] lr: 4.508e-06, eta: 3:44:31, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0851, pred_decode.acc_seg: 96.7069, aux.loss_ce: 0.0398, aux.acc_seg: 96.1556, loss: 0.1249 2023-03-08 17:07:51,414 - mmseg - INFO - Iter [1250/20000] lr: 4.684e-06, eta: 3:40:48, time: 0.467, data_time: 0.009, memory: 22072, pred_decode.loss_ce: 0.0886, pred_decode.acc_seg: 96.6296, aux.loss_ce: 0.0406, aux.acc_seg: 96.0961, loss: 0.1292 2023-03-08 17:08:14,724 - mmseg - INFO - Iter [1300/20000] lr: 4.859e-06, eta: 3:37:20, time: 0.466, data_time: 0.009, memory: 22072, pred_decode.loss_ce: 0.0869, pred_decode.acc_seg: 96.6646, aux.loss_ce: 0.0398, aux.acc_seg:
96.1204, loss: 0.1267 2023-03-08 17:08:40,655 - mmseg - INFO - Iter [1350/20000] lr: 5.032e-06, eta: 3:34:41, time: 0.518, data_time: 0.060, memory: 22072, pred_decode.loss_ce: 0.0856, pred_decode.acc_seg: 96.6600, aux.loss_ce: 0.0407, aux.acc_seg: 96.0743, loss: 0.1263 2023-03-08 17:09:03,890 - mmseg - INFO - Iter [1400/20000] lr: 5.205e-06, eta: 3:31:37, time: 0.465, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0856, pred_decode.acc_seg: 96.6822, aux.loss_ce: 0.0403, aux.acc_seg: 96.1359, loss: 0.1259 2023-03-08 17:09:27,178 - mmseg - INFO - Iter [1450/20000] lr: 5.376e-06, eta: 3:28:44, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0844, pred_decode.acc_seg: 96.7204, aux.loss_ce: 0.0394, aux.acc_seg: 96.1896, loss: 0.1238 2023-03-08 17:09:52,921 - mmseg - INFO - Iter [1500/20000] lr: 5.547e-06, eta: 3:26:31, time: 0.515, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0835, pred_decode.acc_seg: 96.7633, aux.loss_ce: 0.0395, aux.acc_seg: 96.2057, loss: 0.1230 2023-03-08 17:11:17,659 - mmseg - INFO - per class results: 2023-03-08 17:11:17,660 - mmseg - INFO -
+---------------+-------+-------+
| Class         | IoU   | Acc   |
+---------------+-------+-------+
| road          | 98.71 | 99.24 |
| sidewalk      | 88.59 | 95.02 |
| building      | 93.6  | 97.35 |
| wall          | 56.06 | 60.02 |
| fence         | 65.81 | 73.91 |
| pole          | 72.88 | 83.61 |
| traffic light | 75.84 | 85.46 |
| traffic sign  | 83.75 | 90.61 |
| vegetation    | 93.18 | 96.7  |
| terrain       | 65.44 | 73.0  |
| sky           | 95.35 | 98.57 |
| person        | 85.49 | 92.15 |
| rider         | 68.08 | 80.85 |
| car           | 96.2  | 98.39 |
| truck         | 86.85 | 90.2  |
| bus           | 92.48 | 95.25 |
| train         | 86.48 | 88.52 |
| motorcycle    | 72.05 | 79.76 |
| bicycle       | 81.35 | 92.05 |
+---------------+-------+-------+
2023-03-08 17:11:17,660 - mmseg - INFO - Summary: 2023-03-08 17:11:17,660 - mmseg - INFO -
+------+-------+-------+
| aAcc | mIoU  | mAcc  |
+------+-------+-------+
| 96.8 | 82.01 | 87.93 |
+------+-------+-------+
2023-03-08 17:11:17,661 - mmseg - INFO - Iter(val) [125] aAcc: 0.9680, mIoU: 0.8201, mAcc: 0.8793, IoU.road: 0.9871, IoU.sidewalk: 0.8859, IoU.building: 0.9360, IoU.wall: 0.5606, IoU.fence: 0.6581, IoU.pole: 0.7288, IoU.traffic light: 0.7584, IoU.traffic sign: 0.8375, IoU.vegetation: 0.9318, IoU.terrain: 0.6544, IoU.sky: 0.9535, IoU.person: 0.8549, IoU.rider: 0.6808, IoU.car: 0.9620, IoU.truck: 0.8685, IoU.bus: 0.9248, IoU.train: 0.8648, IoU.motorcycle: 0.7205, IoU.bicycle: 0.8135, Acc.road: 0.9924, Acc.sidewalk: 0.9502, Acc.building: 0.9735, Acc.wall: 0.6002, Acc.fence: 0.7391, Acc.pole: 0.8361, Acc.traffic light: 0.8546, Acc.traffic sign: 0.9061, Acc.vegetation: 0.9670, Acc.terrain: 0.7300, Acc.sky: 0.9857, Acc.person: 0.9215, Acc.rider: 0.8085, Acc.car: 0.9839, Acc.truck: 0.9020, Acc.bus: 0.9525, Acc.train: 0.8852, Acc.motorcycle: 0.7976, Acc.bicycle: 0.9205 2023-03-08 17:11:40,975 - mmseg - INFO - Iter [1550/20000] lr: 5.535e-06, eta: 3:40:45, time: 2.161, data_time: 1.703, memory: 22072, pred_decode.loss_ce: 0.0857, pred_decode.acc_seg: 96.6469, aux.loss_ce: 0.0414, aux.acc_seg: 96.0427, loss: 0.1270 2023-03-08 17:12:04,297 - mmseg - INFO - Iter [1600/20000] lr: 5.520e-06, eta: 3:37:45, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0882, pred_decode.acc_seg: 96.5887, aux.loss_ce: 0.0410, aux.acc_seg: 96.0157, loss: 0.1293 2023-03-08 17:12:27,573 - mmseg - INFO - Iter [1650/20000] lr: 5.505e-06, eta: 3:34:53, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0857, pred_decode.acc_seg: 96.6629, aux.loss_ce:
0.0404, aux.acc_seg: 96.1349, loss: 0.1261 2023-03-08 17:12:53,582 - mmseg - INFO - Iter [1700/20000] lr: 5.490e-06, eta: 3:32:40, time: 0.520, data_time: 0.060, memory: 22072, pred_decode.loss_ce: 0.0819, pred_decode.acc_seg: 96.7658, aux.loss_ce: 0.0390, aux.acc_seg: 96.2226, loss: 0.1208 2023-03-08 17:13:16,875 - mmseg - INFO - Iter [1750/20000] lr: 5.475e-06, eta: 3:30:04, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0846, pred_decode.acc_seg: 96.6970, aux.loss_ce: 0.0407, aux.acc_seg: 96.0782, loss: 0.1253 2023-03-08 17:13:40,209 - mmseg - INFO - Iter [1800/20000] lr: 5.460e-06, eta: 3:27:37, time: 0.467, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0861, pred_decode.acc_seg: 96.6460, aux.loss_ce: 0.0414, aux.acc_seg: 96.0429, loss: 0.1275 2023-03-08 17:14:03,478 - mmseg - INFO - Iter [1850/20000] lr: 5.445e-06, eta: 3:25:15, time: 0.465, data_time: 0.009, memory: 22072, pred_decode.loss_ce: 0.0819, pred_decode.acc_seg: 96.7853, aux.loss_ce: 0.0394, aux.acc_seg: 96.1878, loss: 0.1213 2023-03-08 17:14:29,364 - mmseg - INFO - Iter [1900/20000] lr: 5.430e-06, eta: 3:23:24, time: 0.518, data_time: 0.060, memory: 22072, pred_decode.loss_ce: 0.0837, pred_decode.acc_seg: 96.7140, aux.loss_ce: 0.0398, aux.acc_seg: 96.1676, loss: 0.1235 2023-03-08 17:14:52,722 - mmseg - INFO - Iter [1950/20000] lr: 5.415e-06, eta: 3:21:15, time: 0.467, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0838, pred_decode.acc_seg: 96.6902, aux.loss_ce: 0.0396, aux.acc_seg: 96.1878, loss: 0.1233 2023-03-08 17:15:16,012 - mmseg - INFO - Saving checkpoint at 2000 iterations 2023-03-08 17:15:16,740 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py 2023-03-08 17:15:16,740 - mmseg - INFO - Iter [2000/20000] lr: 5.400e-06, eta: 3:19:16, time: 0.481, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0869, pred_decode.acc_seg: 96.6271, aux.loss_ce: 0.0402, aux.acc_seg: 96.1311, loss: 0.1271 2023-03-08 17:16:41,335 - mmseg - INFO - per class results: 2023-03-08 17:16:41,336 - mmseg - INFO -
+---------------+-------+-------+
| Class         | IoU   | Acc   |
+---------------+-------+-------+
| road          | 98.7  | 99.28 |
| sidewalk      | 88.59 | 94.91 |
| building      | 93.61 | 97.35 |
| wall          | 58.18 | 62.83 |
| fence         | 66.8  | 72.79 |
| pole          | 73.19 | 83.47 |
| traffic light | 75.94 | 85.53 |
| traffic sign  | 84.19 | 90.28 |
| vegetation    | 93.11 | 96.58 |
| terrain       | 66.26 | 75.15 |
| sky           | 95.21 | 98.77 |
| person        | 85.37 | 92.2  |
| rider         | 66.77 | 78.66 |
| car           | 96.25 | 98.41 |
| truck         | 87.69 | 91.03 |
| bus           | 92.85 | 95.38 |
| train         | 87.62 | 89.68 |
| motorcycle    | 71.57 | 78.51 |
| bicycle       | 81.45 | 91.37 |
+---------------+-------+-------+
2023-03-08 17:16:41,336 - mmseg - INFO - Summary: 2023-03-08 17:16:41,336 - mmseg - INFO -
+-------+-------+-------+
| aAcc  | mIoU  | mAcc  |
+-------+-------+-------+
| 96.81 | 82.28 | 88.01 |
+-------+-------+-------+
2023-03-08 17:16:42,064 - mmseg - INFO - Now best checkpoint is saved as best_mIoU_iter_2000.pth. 2023-03-08 17:16:42,064 - mmseg - INFO - Best mIoU is 0.8228 at 2000 iter.
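For reference, the per-class IoU and Acc tables above follow the standard confusion-matrix definitions: IoU is TP / (TP + FP + FN) per class, Acc is the per-class recall TP / (TP + FN), mIoU and mAcc are unweighted means over the 19 Cityscapes classes, and aAcc is overall pixel accuracy. A schematic NumPy version of that computation (mmseg's `pre_eval` path is more elaborate, but it derives the same quantities):

```python
import numpy as np

def confusion_matrix(pred, gt, num_classes=19, ignore_index=255):
    """Accumulate predictions into a (num_classes x num_classes) confusion matrix."""
    mask = gt != ignore_index                       # pixels padded with seg_pad_val=255 are ignored
    idx = num_classes * gt[mask].astype(np.int64) + pred[mask].astype(np.int64)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def metrics(conf):
    tp = np.diag(conf).astype(np.float64)
    iou = tp / (conf.sum(axis=0) + conf.sum(axis=1) - tp)  # per-class IoU
    acc = tp / conf.sum(axis=1)                            # per-class Acc (recall)
    aacc = tp.sum() / conf.sum()                           # overall pixel accuracy
    return iou, acc, aacc

# toy example with random label maps of the training crop size
pred = np.random.randint(0, 19, size=(512, 1024))
gt = np.random.randint(0, 19, size=(512, 1024))
iou, acc, aacc = metrics(confusion_matrix(pred, gt))
print(f'aAcc {aacc:.4f}  mIoU {np.nanmean(iou):.4f}  mAcc {np.nanmean(acc):.4f}')
```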
2023-03-08 17:16:42,065 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py 2023-03-08 17:16:42,065 - mmseg - INFO - Iter(val) [125] aAcc: 0.9681, mIoU: 0.8228, mAcc: 0.8801, IoU.road: 0.9870, IoU.sidewalk: 0.8859, IoU.building: 0.9361, IoU.wall: 0.5818, IoU.fence: 0.6680, IoU.pole: 0.7319, IoU.traffic light: 0.7594, IoU.traffic sign: 0.8419, IoU.vegetation: 0.9311, IoU.terrain: 0.6626, IoU.sky: 0.9521, IoU.person: 0.8537, IoU.rider: 0.6677, IoU.car: 0.9625, IoU.truck: 0.8769, IoU.bus: 0.9285, IoU.train: 0.8762, IoU.motorcycle: 0.7157, IoU.bicycle: 0.8145, Acc.road: 0.9928, Acc.sidewalk: 0.9491, Acc.building: 0.9735, Acc.wall: 0.6283, Acc.fence: 0.7279, Acc.pole: 0.8347, Acc.traffic light: 0.8553, Acc.traffic sign: 0.9028, Acc.vegetation: 0.9658, Acc.terrain: 0.7515, Acc.sky: 0.9877, Acc.person: 0.9220, Acc.rider: 0.7866, Acc.car: 0.9841, Acc.truck: 0.9103, Acc.bus: 0.9538, Acc.train: 0.8968, Acc.motorcycle: 0.7851, Acc.bicycle: 0.9137 2023-03-08 17:17:07,824 - mmseg - INFO - Iter [2050/20000] lr: 5.385e-06, eta: 3:30:05, time: 2.222, data_time: 1.765, memory: 22072, pred_decode.loss_ce: 0.0805, pred_decode.acc_seg: 96.7433, aux.loss_ce: 0.0388, aux.acc_seg: 96.2138, loss: 0.1193 2023-03-08 17:17:31,038 - mmseg - INFO - Iter [2100/20000] lr: 5.370e-06, eta: 3:27:48, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0830, pred_decode.acc_seg: 96.7384, aux.loss_ce: 0.0399, aux.acc_seg: 96.1385, loss: 0.1229 2023-03-08 17:17:54,406 - mmseg - INFO - Iter [2150/20000] lr: 5.355e-06, eta: 3:25:38, time: 0.467, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0875, pred_decode.acc_seg: 96.6208, aux.loss_ce: 0.0413, aux.acc_seg: 96.0257, loss: 0.1289 2023-03-08 17:18:17,751 - mmseg - INFO - Iter [2200/20000] lr: 5.340e-06, eta: 3:23:33, time: 0.467, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0826, pred_decode.acc_seg: 96.7552, aux.loss_ce: 0.0392, aux.acc_seg: 96.2115, loss: 0.1218 2023-03-08 17:18:43,627 - mmseg - INFO - Iter [2250/20000] lr: 5.325e-06, eta: 3:21:52, time: 0.518, data_time: 0.058, memory: 22072, pred_decode.loss_ce: 0.0815, pred_decode.acc_seg: 96.7622, aux.loss_ce: 0.0394, aux.acc_seg: 96.1841, loss: 0.1209 2023-03-08 17:19:06,917 - mmseg - INFO - Iter [2300/20000] lr: 5.310e-06, eta: 3:19:55, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0866, pred_decode.acc_seg: 96.6346, aux.loss_ce: 0.0422, aux.acc_seg: 95.9662, loss: 0.1288 2023-03-08 17:19:30,259 - mmseg - INFO - Iter [2350/20000] lr: 5.295e-06, eta: 3:18:02, time: 0.467, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0849, pred_decode.acc_seg: 96.6571, aux.loss_ce: 0.0403, aux.acc_seg: 96.1105, loss: 0.1252 2023-03-08 17:19:53,532 - mmseg - INFO - Iter [2400/20000] lr: 5.280e-06, eta: 3:16:12, time: 0.465, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0842, pred_decode.acc_seg: 96.6795, aux.loss_ce: 0.0412, aux.acc_seg: 96.0268, loss: 0.1254 2023-03-08 17:20:19,478 - mmseg - INFO - Iter [2450/20000] lr: 5.265e-06, eta: 3:14:45, time: 0.519, data_time: 0.060, memory: 22072, pred_decode.loss_ce: 0.0891, pred_decode.acc_seg: 96.5415, aux.loss_ce: 0.0425, aux.acc_seg: 95.9152, loss: 0.1316 2023-03-08 17:20:42,746 - mmseg - INFO - Iter [2500/20000] lr: 5.250e-06, eta: 3:13:01, time: 0.465, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0812, pred_decode.acc_seg: 96.7967, aux.loss_ce: 0.0400, aux.acc_seg: 96.1538, loss: 0.1212 2023-03-08 17:22:07,759 - mmseg - INFO - per class 
results: 2023-03-08 17:22:07,761 - mmseg - INFO -
+---------------+-------+-------+
| Class         | IoU   | Acc   |
+---------------+-------+-------+
| road          | 98.76 | 99.35 |
| sidewalk      | 88.98 | 94.69 |
| building      | 93.71 | 97.32 |
| wall          | 57.9  | 62.91 |
| fence         | 67.73 | 75.67 |
| pole          | 73.34 | 83.05 |
| traffic light | 76.11 | 86.31 |
| traffic sign  | 84.24 | 90.69 |
| vegetation    | 93.27 | 96.78 |
| terrain       | 65.94 | 74.51 |
| sky           | 95.3  | 98.66 |
| person        | 85.43 | 92.59 |
| rider         | 66.99 | 77.87 |
| car           | 96.2  | 98.35 |
| truck         | 85.85 | 89.02 |
| bus           | 92.04 | 95.35 |
| train         | 84.2  | 86.24 |
| motorcycle    | 72.13 | 81.39 |
| bicycle       | 81.72 | 91.34 |
+---------------+-------+-------+
2023-03-08 17:22:07,761 - mmseg - INFO - Summary: 2023-03-08 17:22:07,761 - mmseg - INFO -
+-------+------+------+
| aAcc  | mIoU | mAcc |
+-------+------+------+
| 96.86 | 82.1 | 88.0 |
+-------+------+------+
2023-03-08 17:22:07,761 - mmseg - INFO - Iter(val) [125] aAcc: 0.9686, mIoU: 0.8210, mAcc: 0.8800, IoU.road: 0.9876, IoU.sidewalk: 0.8898, IoU.building: 0.9371, IoU.wall: 0.5790, IoU.fence: 0.6773, IoU.pole: 0.7334, IoU.traffic light: 0.7611, IoU.traffic sign: 0.8424, IoU.vegetation: 0.9327, IoU.terrain: 0.6594, IoU.sky: 0.9530, IoU.person: 0.8543, IoU.rider: 0.6699, IoU.car: 0.9620, IoU.truck: 0.8585, IoU.bus: 0.9204, IoU.train: 0.8420, IoU.motorcycle: 0.7213, IoU.bicycle: 0.8172, Acc.road: 0.9935, Acc.sidewalk: 0.9469, Acc.building: 0.9732, Acc.wall: 0.6291, Acc.fence: 0.7567, Acc.pole: 0.8305, Acc.traffic light: 0.8631, Acc.traffic sign: 0.9069, Acc.vegetation: 0.9678, Acc.terrain: 0.7451, Acc.sky: 0.9866, Acc.person: 0.9259, Acc.rider: 0.7787, Acc.car: 0.9835, Acc.truck: 0.8902, Acc.bus: 0.9535, Acc.train: 0.8624, Acc.motorcycle: 0.8139, Acc.bicycle: 0.9134 2023-03-08 17:22:31,049 - mmseg - INFO - Iter [2550/20000] lr: 5.235e-06, eta: 3:21:03, time: 2.166, data_time: 1.708, memory: 22072, pred_decode.loss_ce: 0.0792, pred_decode.acc_seg: 96.8770, aux.loss_ce: 0.0379, aux.acc_seg: 96.3405, loss: 0.1171 2023-03-08 17:22:54,237 - mmseg - INFO - Iter [2600/20000] lr: 5.220e-06, eta: 3:19:12, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0841, pred_decode.acc_seg: 96.6992, aux.loss_ce: 0.0407, aux.acc_seg: 96.1105, loss: 0.1248 2023-03-08 17:23:20,007 - mmseg - INFO - Iter [2650/20000] lr: 5.205e-06, eta: 3:17:42, time: 0.515, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0856, pred_decode.acc_seg: 96.5922, aux.loss_ce: 0.0420, aux.acc_seg: 95.9452, loss: 0.1276 2023-03-08 17:23:43,199 - mmseg - INFO - Iter [2700/20000] lr: 5.190e-06, eta: 3:15:57, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0810, pred_decode.acc_seg: 96.7928, aux.loss_ce: 0.0393, aux.acc_seg: 96.2001, loss: 0.1202 2023-03-08 17:24:06,392 - mmseg - INFO - Iter [2750/20000] lr: 5.175e-06, eta: 3:14:15, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0816, pred_decode.acc_seg: 96.7622, aux.loss_ce: 0.0387, aux.acc_seg: 96.2437, loss: 0.1202 2023-03-08 17:24:32,306 - mmseg - INFO - Iter [2800/20000] lr: 5.160e-06, eta: 3:12:53, time: 0.518, data_time: 0.060, memory: 22072, pred_decode.loss_ce: 0.0861, pred_decode.acc_seg: 96.6071, aux.loss_ce: 0.0408, aux.acc_seg: 96.0280, loss: 0.1270 2023-03-08 17:24:55,589 - mmseg - INFO - Iter [2850/20000] lr: 5.145e-06, eta: 3:11:17, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0836, pred_decode.acc_seg: 96.7181, aux.loss_ce: 0.0407, aux.acc_seg: 96.1362, loss: 0.1243 2023-03-08 17:25:18,842 - mmseg -
INFO - Iter [2900/20000] lr: 5.130e-06, eta: 3:09:44, time: 0.465, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0817, pred_decode.acc_seg: 96.7532, aux.loss_ce: 0.0400, aux.acc_seg: 96.1376, loss: 0.1218 2023-03-08 17:25:42,127 - mmseg - INFO - Iter [2950/20000] lr: 5.115e-06, eta: 3:08:12, time: 0.466, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.0851, pred_decode.acc_seg: 96.6016, aux.loss_ce: 0.0414, aux.acc_seg: 95.9999, loss: 0.1265 2023-03-08 17:26:08,013 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py 2023-03-08 17:26:08,013 - mmseg - INFO - Iter [3000/20000] lr: 5.100e-06, eta: 3:06:58, time: 0.518, data_time: 0.058, memory: 22072, pred_decode.loss_ce: 0.0820, pred_decode.acc_seg: 96.7119, aux.loss_ce: 0.0394, aux.acc_seg: 96.1584, loss: 0.1214 2023-03-08 17:27:32,982 - mmseg - INFO - per class results: 2023-03-08 17:27:32,983 - mmseg - INFO -
+---------------+-------+-------+
| Class         | IoU   | Acc   |
+---------------+-------+-------+
| road          | 98.7  | 99.24 |
| sidewalk      | 88.52 | 95.13 |
| building      | 93.68 | 97.44 |
| wall          | 57.52 | 62.28 |
| fence         | 66.91 | 74.23 |
| pole          | 73.07 | 83.12 |
| traffic light | 75.94 | 86.21 |
| traffic sign  | 84.19 | 90.33 |
| vegetation    | 93.27 | 96.71 |
| terrain       | 65.94 | 73.21 |
| sky           | 95.48 | 98.64 |
| person        | 85.47 | 92.61 |
| rider         | 67.96 | 81.44 |
| car           | 96.28 | 98.35 |
| truck         | 87.83 | 91.05 |
| bus           | 93.14 | 95.34 |
| train         | 88.04 | 90.09 |
| motorcycle    | 71.85 | 79.5  |
| bicycle       | 81.53 | 90.55 |
+---------------+-------+-------+
2023-03-08 17:27:32,983 - mmseg - INFO - Summary: 2023-03-08 17:27:32,984 - mmseg - INFO -
+-------+-------+-------+
| aAcc  | mIoU  | mAcc  |
+-------+-------+-------+
| 96.84 | 82.38 | 88.18 |
+-------+-------+-------+
2023-03-08 17:27:33,750 - mmseg - INFO - Now best checkpoint is saved as best_mIoU_iter_3000.pth. 2023-03-08 17:27:33,750 - mmseg - INFO - Best mIoU is 0.8238 at 3000 iter.
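The `lr` column can be reproduced from the `lr_config` in the header: linear warmup over the first 1500 iterations starting from a ratio of 1e-06, followed by poly decay with power 1.0 down to min_lr 0.0 at 20000 iterations. A small sketch of that schedule (the one-iteration offset is an assumption about when the logger samples the value, i.e. after the just-finished iteration):

```python
def lr_at(i, base_lr=6e-06, max_iters=20000, power=1.0, min_lr=0.0,
          warmup_iters=1500, warmup_ratio=1e-06):
    """Poly decay with linear warmup, per the lr_config of this run."""
    lr = (base_lr - min_lr) * (1 - i / max_iters) ** power + min_lr  # poly part
    if i < warmup_iters:                                             # linear warmup part
        k = (1 - i / warmup_iters) * (1 - warmup_ratio)
        lr *= 1 - k
    return lr

print(f'{lr_at(49):.3e}')    # 1.955e-07, the value logged at Iter [50/20000]
print(f'{lr_at(1549):.3e}')  # 5.535e-06, the value logged at Iter [1550/20000]
```

This also explains the kink visible in the log: the lr climbs until roughly iter 1500 and then decreases by about 1.5e-08 per 50 iterations for the rest of the run.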
2023-03-08 17:27:33,750 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py 2023-03-08 17:27:33,750 - mmseg - INFO - Iter(val) [125] aAcc: 0.9684, mIoU: 0.8238, mAcc: 0.8818, IoU.road: 0.9870, IoU.sidewalk: 0.8852, IoU.building: 0.9368, IoU.wall: 0.5752, IoU.fence: 0.6691, IoU.pole: 0.7307, IoU.traffic light: 0.7594, IoU.traffic sign: 0.8419, IoU.vegetation: 0.9327, IoU.terrain: 0.6594, IoU.sky: 0.9548, IoU.person: 0.8547, IoU.rider: 0.6796, IoU.car: 0.9628, IoU.truck: 0.8783, IoU.bus: 0.9314, IoU.train: 0.8804, IoU.motorcycle: 0.7185, IoU.bicycle: 0.8153, Acc.road: 0.9924, Acc.sidewalk: 0.9513, Acc.building: 0.9744, Acc.wall: 0.6228, Acc.fence: 0.7423, Acc.pole: 0.8312, Acc.traffic light: 0.8621, Acc.traffic sign: 0.9033, Acc.vegetation: 0.9671, Acc.terrain: 0.7321, Acc.sky: 0.9864, Acc.person: 0.9261, Acc.rider: 0.8144, Acc.car: 0.9835, Acc.truck: 0.9105, Acc.bus: 0.9534, Acc.train: 0.9009, Acc.motorcycle: 0.7950, Acc.bicycle: 0.9055 2023-03-08 17:27:57,013 - mmseg - INFO - Iter [3050/20000] lr: 5.085e-06, eta: 3:13:28, time: 2.180, data_time: 1.722, memory: 22072, pred_decode.loss_ce: 0.0801, pred_decode.acc_seg: 96.8182, aux.loss_ce: 0.0393, aux.acc_seg: 96.2089, loss: 0.1195 2023-03-08 17:28:20,283 - mmseg - INFO - Iter [3100/20000] lr: 5.070e-06, eta: 3:11:54, time: 0.465, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0851, pred_decode.acc_seg: 96.6242, aux.loss_ce: 0.0411, aux.acc_seg: 96.0318, loss: 0.1262 2023-03-08 17:28:43,588 - mmseg - INFO - Iter [3150/20000] lr: 5.055e-06, eta: 3:10:22, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0860, pred_decode.acc_seg: 96.6358, aux.loss_ce: 0.0415, aux.acc_seg: 96.0528, loss: 0.1275 2023-03-08 17:29:09,488 - mmseg - INFO - Iter [3200/20000] lr: 5.040e-06, eta: 3:09:06, time: 0.518, data_time: 0.058, memory: 22072, pred_decode.loss_ce: 0.0836, pred_decode.acc_seg: 96.6524, aux.loss_ce: 0.0408, aux.acc_seg: 96.0325, loss: 0.1243 2023-03-08 17:29:32,616 - mmseg - INFO - Iter [3250/20000] lr: 5.025e-06, eta: 3:07:38, time: 0.463, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0820, pred_decode.acc_seg: 96.7534, aux.loss_ce: 0.0402, aux.acc_seg: 96.1318, loss: 0.1222 2023-03-08 17:29:55,808 - mmseg - INFO - Iter [3300/20000] lr: 5.010e-06, eta: 3:06:11, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0841, pred_decode.acc_seg: 96.6607, aux.loss_ce: 0.0415, aux.acc_seg: 95.9994, loss: 0.1256 2023-03-08 17:30:21,537 - mmseg - INFO - Iter [3350/20000] lr: 4.995e-06, eta: 3:04:59, time: 0.515, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0796, pred_decode.acc_seg: 96.7991, aux.loss_ce: 0.0386, aux.acc_seg: 96.2239, loss: 0.1182 2023-03-08 17:30:44,725 - mmseg - INFO - Iter [3400/20000] lr: 4.980e-06, eta: 3:03:37, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0820, pred_decode.acc_seg: 96.7456, aux.loss_ce: 0.0395, aux.acc_seg: 96.2064, loss: 0.1214 2023-03-08 17:31:07,949 - mmseg - INFO - Iter [3450/20000] lr: 4.965e-06, eta: 3:02:16, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0830, pred_decode.acc_seg: 96.6756, aux.loss_ce: 0.0402, aux.acc_seg: 96.0959, loss: 0.1233 2023-03-08 17:31:31,257 - mmseg - INFO - Iter [3500/20000] lr: 4.950e-06, eta: 3:00:57, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0819, pred_decode.acc_seg: 96.7773, aux.loss_ce: 0.0398, aux.acc_seg: 96.2114, loss: 0.1217 2023-03-08 17:32:56,316 - mmseg - INFO - per class 
results: 2023-03-08 17:32:56,317 - mmseg - INFO -
+---------------+-------+-------+
| Class         | IoU   | Acc   |
+---------------+-------+-------+
| road          | 98.73 | 99.22 |
| sidewalk      | 88.69 | 95.39 |
| building      | 93.68 | 97.22 |
| wall          | 59.11 | 63.83 |
| fence         | 67.58 | 74.45 |
| pole          | 72.98 | 83.22 |
| traffic light | 76.11 | 87.33 |
| traffic sign  | 83.66 | 90.87 |
| vegetation    | 93.32 | 96.96 |
| terrain       | 66.29 | 73.77 |
| sky           | 95.39 | 98.28 |
| person        | 85.22 | 93.27 |
| rider         | 66.58 | 76.88 |
| car           | 96.25 | 98.33 |
| truck         | 87.17 | 90.39 |
| bus           | 93.45 | 95.04 |
| train         | 89.47 | 91.77 |
| motorcycle    | 71.73 | 79.78 |
| bicycle       | 81.66 | 91.53 |
+---------------+-------+-------+
2023-03-08 17:32:56,317 - mmseg - INFO - Summary: 2023-03-08 17:32:56,317 - mmseg - INFO -
+-------+-------+-------+
| aAcc  | mIoU  | mAcc  |
+-------+-------+-------+
| 96.86 | 82.48 | 88.29 |
+-------+-------+-------+
2023-03-08 17:32:57,044 - mmseg - INFO - Now best checkpoint is saved as best_mIoU_iter_3500.pth. 2023-03-08 17:32:57,044 - mmseg - INFO - Best mIoU is 0.8248 at 3500 iter. 2023-03-08 17:32:57,044 - mmseg - INFO - Iter(val) [125] aAcc: 0.9686, mIoU: 0.8248, mAcc: 0.8829, IoU.road: 0.9873, IoU.sidewalk: 0.8869, IoU.building: 0.9368, IoU.wall: 0.5911, IoU.fence: 0.6758, IoU.pole: 0.7298, IoU.traffic light: 0.7611, IoU.traffic sign: 0.8366, IoU.vegetation: 0.9332, IoU.terrain: 0.6629, IoU.sky: 0.9539, IoU.person: 0.8522, IoU.rider: 0.6658, IoU.car: 0.9625, IoU.truck: 0.8717, IoU.bus: 0.9345, IoU.train: 0.8947, IoU.motorcycle: 0.7173, IoU.bicycle: 0.8166, Acc.road: 0.9922, Acc.sidewalk: 0.9539, Acc.building: 0.9722, Acc.wall: 0.6383, Acc.fence: 0.7445, Acc.pole: 0.8322, Acc.traffic light: 0.8733, Acc.traffic sign: 0.9087, Acc.vegetation: 0.9696, Acc.terrain: 0.7377, Acc.sky: 0.9828, Acc.person: 0.9327, Acc.rider: 0.7688, Acc.car: 0.9833, Acc.truck: 0.9039, Acc.bus: 0.9504, Acc.train: 0.9177, Acc.motorcycle: 0.7978, Acc.bicycle: 0.9153 2023-03-08 17:33:22,945 - mmseg - INFO - Iter [3550/20000] lr: 4.935e-06, eta: 3:06:29, time: 2.234, data_time: 1.775, memory: 22072, pred_decode.loss_ce: 0.0801, pred_decode.acc_seg: 96.8240, aux.loss_ce: 0.0389, aux.acc_seg: 96.2308, loss: 0.1191 2023-03-08 17:33:46,295 - mmseg - INFO - Iter [3600/20000] lr: 4.920e-06, eta: 3:05:06, time: 0.467, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0801, pred_decode.acc_seg: 96.8076, aux.loss_ce: 0.0388, aux.acc_seg: 96.2750, loss: 0.1189 2023-03-08 17:34:09,514 - mmseg - INFO - Iter [3650/20000] lr: 4.905e-06, eta: 3:03:45, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0835, pred_decode.acc_seg: 96.7013, aux.loss_ce: 0.0402, aux.acc_seg: 96.1315, loss: 0.1237 2023-03-08 17:34:32,783 - mmseg - INFO - Iter [3700/20000] lr: 4.890e-06, eta: 3:02:25, time: 0.465, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0841, pred_decode.acc_seg: 96.6855, aux.loss_ce: 0.0401, aux.acc_seg: 96.1196, loss: 0.1243 2023-03-08 17:34:58,765 - mmseg - INFO - Iter [3750/20000] lr: 4.875e-06, eta: 3:01:19, time: 0.520, data_time: 0.060, memory: 22072, pred_decode.loss_ce: 0.0839, pred_decode.acc_seg: 96.6755, aux.loss_ce: 0.0400, aux.acc_seg: 96.1140, loss: 0.1239 2023-03-08 17:35:22,062 - mmseg - INFO - Iter [3800/20000] lr: 4.860e-06, eta: 3:00:02, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0796, pred_decode.acc_seg: 96.8543, aux.loss_ce: 0.0387, aux.acc_seg: 96.2756, loss: 0.1183 2023-03-08 17:35:45,359 - mmseg - INFO - Iter [3850/20000] lr: 4.845e-06, eta: 2:58:46, time: 0.466,
data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0783, pred_decode.acc_seg: 96.8886, aux.loss_ce: 0.0380, aux.acc_seg: 96.3097, loss: 0.1164 2023-03-08 17:36:08,531 - mmseg - INFO - Iter [3900/20000] lr: 4.830e-06, eta: 2:57:32, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0847, pred_decode.acc_seg: 96.6052, aux.loss_ce: 0.0412, aux.acc_seg: 96.0052, loss: 0.1259 2023-03-08 17:36:34,297 - mmseg - INFO - Iter [3950/20000] lr: 4.815e-06, eta: 2:56:29, time: 0.515, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0833, pred_decode.acc_seg: 96.6770, aux.loss_ce: 0.0405, aux.acc_seg: 96.0830, loss: 0.1238 2023-03-08 17:36:57,477 - mmseg - INFO - Saving checkpoint at 4000 iterations 2023-03-08 17:36:58,196 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py 2023-03-08 17:36:58,196 - mmseg - INFO - Iter [4000/20000] lr: 4.800e-06, eta: 2:55:19, time: 0.478, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0845, pred_decode.acc_seg: 96.6431, aux.loss_ce: 0.0411, aux.acc_seg: 96.0373, loss: 0.1256 2023-03-08 17:38:23,031 - mmseg - INFO - per class results: 2023-03-08 17:38:23,032 - mmseg - INFO -
+---------------+-------+-------+
| Class         | IoU   | Acc   |
+---------------+-------+-------+
| road          | 98.72 | 99.19 |
| sidewalk      | 88.66 | 95.43 |
| building      | 93.7  | 97.32 |
| wall          | 60.42 | 65.32 |
| fence         | 67.24 | 73.89 |
| pole          | 72.9  | 83.58 |
| traffic light | 76.21 | 86.63 |
| traffic sign  | 84.06 | 90.67 |
| vegetation    | 93.31 | 96.84 |
| terrain       | 66.32 | 73.87 |
| sky           | 95.42 | 98.58 |
| person        | 85.32 | 92.05 |
| rider         | 67.02 | 78.82 |
| car           | 96.3  | 98.35 |
| truck         | 88.51 | 91.82 |
| bus           | 93.3  | 95.19 |
| train         | 88.62 | 90.6  |
| motorcycle    | 72.15 | 80.65 |
| bicycle       | 81.51 | 91.56 |
+---------------+-------+-------+
2023-03-08 17:38:23,033 - mmseg - INFO - Summary: 2023-03-08 17:38:23,033 - mmseg - INFO -
+-------+-------+-------+
| aAcc  | mIoU  | mAcc  |
+-------+-------+-------+
| 96.86 | 82.62 | 88.44 |
+-------+-------+-------+
2023-03-08 17:38:23,770 - mmseg - INFO - Now best checkpoint is saved as best_mIoU_iter_4000.pth. 2023-03-08 17:38:23,770 - mmseg - INFO - Best mIoU is 0.8262 at 4000 iter.
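The paired messages that follow each improved validation ("Now best checkpoint is saved as ...", "Best mIoU is ... at ... iter.") come from `evaluation = dict(interval=500, metric='mIoU', pre_eval=True, save_best='mIoU')`: every 500 iterations the model is evaluated on the 500 val images and a new best checkpoint replaces the previous one whenever mIoU improves. A rough sketch of that bookkeeping (`update_best` is a hypothetical helper for illustration, not mmcv's actual EvalHook code):

```python
import os
import torch
import torch.nn as nn

def update_best(model, miou, it, best, work_dir='work_dirs/demo'):
    """Keep only the highest-mIoU checkpoint, mimicking save_best='mIoU' (sketch)."""
    if miou > best['score']:
        os.makedirs(work_dir, exist_ok=True)
        if best['path'] and os.path.exists(best['path']):
            os.remove(best['path'])          # the previous best is superseded
        best['score'] = miou
        best['path'] = os.path.join(work_dir, f'best_mIoU_iter_{it}.pth')
        torch.save(model.state_dict(), best['path'])
        print(f'Now best checkpoint is saved as best_mIoU_iter_{it}.pth.')
        print(f'Best mIoU is {miou:.4f} at {it} iter.')
    return best

best = {'score': -1.0, 'path': None}
model = nn.Conv2d(256, 19, 1)                          # stand-in for the real model
for it, miou in [(3500, 0.8248), (4000, 0.8262)]:      # scores taken from the log above
    best = update_best(model, miou, it, best)
```

Note that the regular `checkpoint_config = dict(by_epoch=False, interval=2000)` snapshots ("Saving checkpoint at 2000 iterations", "... at 4000 iterations") are saved independently of this best-model tracking.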
2023-03-08 17:38:23,770 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py 2023-03-08 17:38:23,770 - mmseg - INFO - Iter(val) [125] aAcc: 0.9686, mIoU: 0.8262, mAcc: 0.8844, IoU.road: 0.9872, IoU.sidewalk: 0.8866, IoU.building: 0.9370, IoU.wall: 0.6042, IoU.fence: 0.6724, IoU.pole: 0.7290, IoU.traffic light: 0.7621, IoU.traffic sign: 0.8406, IoU.vegetation: 0.9331, IoU.terrain: 0.6632, IoU.sky: 0.9542, IoU.person: 0.8532, IoU.rider: 0.6702, IoU.car: 0.9630, IoU.truck: 0.8851, IoU.bus: 0.9330, IoU.train: 0.8862, IoU.motorcycle: 0.7215, IoU.bicycle: 0.8151, Acc.road: 0.9919, Acc.sidewalk: 0.9543, Acc.building: 0.9732, Acc.wall: 0.6532, Acc.fence: 0.7389, Acc.pole: 0.8358, Acc.traffic light: 0.8663, Acc.traffic sign: 0.9067, Acc.vegetation: 0.9684, Acc.terrain: 0.7387, Acc.sky: 0.9858, Acc.person: 0.9205, Acc.rider: 0.7882, Acc.car: 0.9835, Acc.truck: 0.9182, Acc.bus: 0.9519, Acc.train: 0.9060, Acc.motorcycle: 0.8065, Acc.bicycle: 0.9156 2023-03-08 17:38:47,051 - mmseg - INFO - Iter [4050/20000] lr: 4.785e-06, eta: 2:59:46, time: 2.177, data_time: 1.719, memory: 22072, pred_decode.loss_ce: 0.0805, pred_decode.acc_seg: 96.8142, aux.loss_ce: 0.0389, aux.acc_seg: 96.2473, loss: 0.1195 2023-03-08 17:39:13,000 - mmseg - INFO - Iter [4100/20000] lr: 4.770e-06, eta: 2:58:41, time: 0.519, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0830, pred_decode.acc_seg: 96.7186, aux.loss_ce: 0.0406, aux.acc_seg: 96.1151, loss: 0.1235
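One quirk visible throughout the log: the `eta` estimate jumps up right after every validation (for example from 2:55:19 at Iter [4000] to 2:59:46 at Iter [4050]), because the 50-iteration window that absorbs the evaluation pass reports time ~2.2 s while ordinary windows report ~0.47 s, and the ETA is an average iteration time multiplied by the iterations remaining. A toy version of that estimate (mmcv's timer hooks average over the whole run since the start, which is why the printed ETA recovers only gradually):

```python
class EtaMeter:
    """Toy ETA: average logged iteration time times the iterations left."""

    def __init__(self, max_iters=20000):
        self.max_iters = max_iters
        self.total_time = 0.0
        self.num_iters = 0

    def update(self, iter_time, n_iters=50):
        # the log reports an average time per 50-iteration window
        self.total_time += iter_time * n_iters
        self.num_iters += n_iters

    def eta(self, cur_iter):
        secs = int(self.total_time / self.num_iters * (self.max_iters - cur_iter))
        h, rem = divmod(secs, 3600)
        m, s = divmod(rem, 60)
        return f'{h}:{m:02d}:{s:02d}'

meter = EtaMeter()
meter.update(2.177)      # window ending at Iter [4050], inflated by the validation pass
meter.update(0.519)      # window ending at Iter [4100]
# With only these two windows the average is dominated by the eval spike:
print(meter.eta(4100))   # '5:57:13' here; averaging all windows since iter 0 yields
                         # the much smaller 2:58:41 printed in the log
```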