DDP-Weight / ddp_convnext_t_4x4_512x1024_5k_cityscapes_aligned.log
2023-03-08 16:52:49,517 - mmseg - INFO - Multi-processing start method is `None`
2023-03-08 16:52:49,518 - mmseg - INFO - OpenCV num_threads is `112`
2023-03-08 16:52:49,518 - mmseg - INFO - OMP num threads is 1
2023-03-08 16:52:49,562 - mmseg - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.7.13 (default, Mar 29 2022, 02:18:16) [GCC 7.5.0]
CUDA available: True
GPU 0,1,2,3: A100-SXM-80GB
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.2.r11.2/compiler.29618528_0
GCC: gcc (GCC) 6.1.0
PyTorch: 1.9.0+cu111
PyTorch compiling details: PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 11.1
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
- CuDNN 8.0.5
- Magma 2.5.2
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
TorchVision: 0.10.0+cu111
OpenCV: 4.6.0
MMCV: 1.4.2
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMSegmentation: 0.29.0+
------------------------------------------------------------
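The environment block above is the standard report emitted by mmseg's collect_env helper; a minimal sketch of reproducing it, assuming an mmcv 1.x / mmseg 0.x installation like the one logged:

# Sketch: print the same environment report (assumes mmcv 1.x / mmseg 0.x).
from mmseg.utils import collect_env

for name, val in collect_env().items():
    print(f'{name}: {val}')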
2023-03-08 16:52:49,563 - mmseg - INFO - Distributed training: True
2023-03-08 16:52:49,799 - mmseg - INFO - Config:
dataset_type = 'CityscapesDataset'
data_root = 'data/cityscapes/'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 1024)
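# The mean/std above are the standard ImageNet statistics; to_rgb=True converts
# the BGR arrays produced by OpenCV image loading to RGB before normalization.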
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations'),
dict(type='Resize', img_scale=(2048, 1024), ratio_range=(0.5, 2.0)),
dict(type='RandomCrop', crop_size=(512, 1024), cat_max_ratio=0.75),
dict(type='RandomFlip', prob=0.5),
dict(type='PhotoMetricDistortion'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size=(512, 1024), pad_val=0, seg_pad_val=255),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
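# Train-time augmentation: random 0.5-2.0x scale jitter around 2048x1024, a
# 512x1024 crop (re-drawn while a single class covers more than 75% of the
# crop), p=0.5 horizontal flip, and photometric jitter; seg_pad_val=255 marks
# padded pixels with the ignore index.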
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(2048, 1024),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
]
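# Test-time: single-scale inference at 2048x1024 with no flip augmentation.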
data = dict(
samples_per_gpu=4,
workers_per_gpu=4,
train=dict(
type='CityscapesDataset',
data_root='data/cityscapes/',
img_dir='leftImg8bit/train',
ann_dir='gtFine/train',
pipeline=[
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations'),
dict(
type='Resize', img_scale=(2048, 1024), ratio_range=(0.5, 2.0)),
dict(type='RandomCrop', crop_size=(512, 1024), cat_max_ratio=0.75),
dict(type='RandomFlip', prob=0.5),
dict(type='PhotoMetricDistortion'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size=(512, 1024), pad_val=0, seg_pad_val=255),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]),
val=dict(
type='CityscapesDataset',
data_root='data/cityscapes/',
img_dir='leftImg8bit/val',
ann_dir='gtFine/val',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(2048, 1024),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
]),
test=dict(
type='CityscapesDataset',
data_root='data/cityscapes/',
img_dir='leftImg8bit/val',
ann_dir='gtFine/val',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(2048, 1024),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
]))
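# 4 samples per GPU across 4 GPUs gives an effective batch size of 16; both
# the val and test splits point at the Cityscapes validation set.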
log_config = dict(
interval=50, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = 'work_dirs/deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_best_mIoU_iter_128000_.pth'
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True
optimizer = dict(
type='AdamW',
lr=6e-06,
betas=(0.9, 0.999),
weight_decay=0.01,
paramwise_cfg=dict(
custom_keys=dict(
pos_block=dict(decay_mult=0.0),
norm=dict(decay_mult=0.0),
head=dict(lr_mult=1.0))))
optimizer_config = dict()
lr_config = dict(
policy='poly',
warmup='linear',
warmup_iters=1500,
warmup_ratio=1e-06,
power=1.0,
min_lr=0.0,
by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=20000)
checkpoint_config = dict(by_epoch=False, interval=2000)
evaluation = dict(interval=500, metric='mIoU', pre_eval=True, save_best='mIoU')
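# Schedule: AdamW at lr=6e-6, with weight decay disabled for norm layers and
# pos_block; after a 1500-iter linear warmup the poly policy decays the rate as
#   lr(t) = 6e-6 * (1 - t / 20000) ** 1.0
# down to min_lr=0. Checkpoints are written every 2000 iters and mIoU is
# evaluated every 500 iters, keeping the best-scoring checkpoint.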
custom_imports = dict(imports='mmcls.models', allow_failed_imports=False)
norm_cfg = dict(type='SyncBN', requires_grad=True)
backbone_norm_cfg = dict(type='LN', requires_grad=True)
model = dict(
type='DiffSegV22',
bit_scale=0.01,
timesteps=10,
pretrained=None,
backbone=dict(
type='mmcls.ConvNeXt',
arch='tiny',
out_indices=[0, 1, 2, 3],
drop_path_rate=0.4,
layer_scale_init_value=1.0,
gap_before_final_norm=False,
init_cfg=None),
neck=[
dict(
type='FPN',
in_channels=[96, 192, 384, 768],
out_channels=256,
act_cfg=None,
norm_cfg=dict(type='GN', num_groups=32),
num_outs=4),
dict(
type='MultiStageMerging',
in_channels=[256, 256, 256, 256],
out_channels=256,
kernel_size=1,
norm_cfg=dict(type='GN', num_groups=32),
act_cfg=None)
],
auxiliary_head=dict(
type='FCNHead',
in_channels=256,
in_index=0,
channels=256,
num_convs=1,
concat_input=False,
dropout_ratio=0.1,
num_classes=19,
norm_cfg=dict(type='SyncBN', requires_grad=True),
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
decode_head=dict(
type='DeformableHeadWithTime',
in_channels=[256],
channels=256,
in_index=[0],
dropout_ratio=0.0,
num_classes=19,
norm_cfg=dict(type='SyncBN', requires_grad=True),
align_corners=False,
num_feature_levels=1,
encoder=dict(
type='DetrTransformerEncoder',
num_layers=6,
transformerlayers=dict(
type='BaseTransformerLayer',
use_time_mlp=True,
attn_cfgs=dict(
type='MultiScaleDeformableAttention',
embed_dims=256,
num_levels=1,
num_heads=8,
dropout=0.0),
ffn_cfgs=dict(
type='FFN',
embed_dims=256,
feedforward_channels=1024,
ffn_drop=0.0,
act_cfg=dict(type='GELU')),
operation_order=('self_attn', 'norm', 'ffn', 'norm'))),
positional_encoding=dict(
type='SinePositionalEncoding',
num_feats=128,
normalize=True,
offset=-0.5),
loss_decode=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
train_cfg=dict(),
test_cfg=dict(mode='whole'))
work_dir = './work_dirs/deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22'
gpu_ids = range(0, 4)
auto_resume = False
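Rebuilding the model from this dump requires the repo's custom modules (DiffSegV22, DeformableHeadWithTime, MultiStageMerging) to be importable and registered; a minimal sketch under that assumption, with 'cityscapes_diffv22.py' as a hypothetical file holding the config text above:

# Sketch: rebuild the segmentor from the dumped config (mmcv 1.x / mmseg 0.x).
from mmcv import Config
from mmcv.runner import load_checkpoint
from mmseg.models import build_segmentor

cfg = Config.fromfile('cityscapes_diffv22.py')  # hypothetical config path
if cfg.get('custom_imports'):
    # mirror tools/train.py: import mmcls.models so mmcls.ConvNeXt registers
    from mmcv.utils import import_modules_from_strings
    import_modules_from_strings(**cfg['custom_imports'])
model = build_segmentor(
    cfg.model, train_cfg=cfg.get('train_cfg'), test_cfg=cfg.get('test_cfg'))
model.init_weights()
# Warm-start from the checkpoint named in load_from (file must exist locally).
load_checkpoint(model, cfg.load_from, map_location='cpu')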
2023-03-08 16:52:54,149 - mmseg - INFO - Set random seed to 1941061547, deterministic: True
2023-03-08 16:52:54,470 - mmseg - INFO - initialize FPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
2023-03-08 16:52:54,485 - mmseg - INFO - initialize MultiStageMerging with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
2023-03-08 16:52:54,532 - mmseg - INFO - initialize FCNHead with init_cfg {'type': 'Normal', 'std': 0.01, 'override': {'name': 'conv_seg'}}
Name of parameter - Initialization information
backbone.downsample_layers.0.0.weight - torch.Size([96, 3, 4, 4]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.0.0.bias - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.0.1.weight - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.0.1.bias - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.1.0.weight - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.1.0.bias - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.1.1.weight - torch.Size([192, 96, 2, 2]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.1.1.bias - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.2.0.weight - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.2.0.bias - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.2.1.weight - torch.Size([384, 192, 2, 2]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.2.1.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.3.0.weight - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.3.0.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.3.1.weight - torch.Size([768, 384, 2, 2]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.downsample_layers.3.1.bias - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.gamma - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.depthwise_conv.weight - torch.Size([96, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.depthwise_conv.bias - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.norm.weight - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.norm.bias - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.pointwise_conv1.weight - torch.Size([384, 96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.pointwise_conv1.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.pointwise_conv2.weight - torch.Size([96, 384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.0.pointwise_conv2.bias - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.gamma - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.depthwise_conv.weight - torch.Size([96, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.depthwise_conv.bias - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.norm.weight - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.norm.bias - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.pointwise_conv1.weight - torch.Size([384, 96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.pointwise_conv1.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.pointwise_conv2.weight - torch.Size([96, 384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.1.pointwise_conv2.bias - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.gamma - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.depthwise_conv.weight - torch.Size([96, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.depthwise_conv.bias - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.norm.weight - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.norm.bias - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.pointwise_conv1.weight - torch.Size([384, 96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.pointwise_conv1.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.pointwise_conv2.weight - torch.Size([96, 384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.0.2.pointwise_conv2.bias - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.gamma - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.depthwise_conv.weight - torch.Size([192, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.depthwise_conv.bias - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.norm.weight - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.norm.bias - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.pointwise_conv1.weight - torch.Size([768, 192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.pointwise_conv1.bias - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.pointwise_conv2.weight - torch.Size([192, 768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.0.pointwise_conv2.bias - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.gamma - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.depthwise_conv.weight - torch.Size([192, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.depthwise_conv.bias - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.norm.weight - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.norm.bias - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.pointwise_conv1.weight - torch.Size([768, 192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.pointwise_conv1.bias - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.pointwise_conv2.weight - torch.Size([192, 768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.1.pointwise_conv2.bias - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.gamma - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.depthwise_conv.weight - torch.Size([192, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.depthwise_conv.bias - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.norm.weight - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.norm.bias - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.pointwise_conv1.weight - torch.Size([768, 192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.pointwise_conv1.bias - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.pointwise_conv2.weight - torch.Size([192, 768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.1.2.pointwise_conv2.bias - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.gamma - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.depthwise_conv.weight - torch.Size([384, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.depthwise_conv.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.norm.weight - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.norm.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.pointwise_conv1.weight - torch.Size([1536, 384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.pointwise_conv1.bias - torch.Size([1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.pointwise_conv2.weight - torch.Size([384, 1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.0.pointwise_conv2.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.gamma - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.depthwise_conv.weight - torch.Size([384, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.depthwise_conv.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.norm.weight - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.norm.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.pointwise_conv1.weight - torch.Size([1536, 384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.pointwise_conv1.bias - torch.Size([1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.pointwise_conv2.weight - torch.Size([384, 1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.1.pointwise_conv2.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.gamma - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.depthwise_conv.weight - torch.Size([384, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.depthwise_conv.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.norm.weight - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.norm.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.pointwise_conv1.weight - torch.Size([1536, 384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.pointwise_conv1.bias - torch.Size([1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.pointwise_conv2.weight - torch.Size([384, 1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.2.pointwise_conv2.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.gamma - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.depthwise_conv.weight - torch.Size([384, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.depthwise_conv.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.norm.weight - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.norm.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.pointwise_conv1.weight - torch.Size([1536, 384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.pointwise_conv1.bias - torch.Size([1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.pointwise_conv2.weight - torch.Size([384, 1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.3.pointwise_conv2.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.gamma - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.depthwise_conv.weight - torch.Size([384, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.depthwise_conv.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.norm.weight - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.norm.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.pointwise_conv1.weight - torch.Size([1536, 384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.pointwise_conv1.bias - torch.Size([1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.pointwise_conv2.weight - torch.Size([384, 1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.4.pointwise_conv2.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.gamma - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.depthwise_conv.weight - torch.Size([384, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.depthwise_conv.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.norm.weight - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.norm.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.pointwise_conv1.weight - torch.Size([1536, 384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.pointwise_conv1.bias - torch.Size([1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.pointwise_conv2.weight - torch.Size([384, 1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.5.pointwise_conv2.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.gamma - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.depthwise_conv.weight - torch.Size([384, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.depthwise_conv.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.norm.weight - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.norm.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.pointwise_conv1.weight - torch.Size([1536, 384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.pointwise_conv1.bias - torch.Size([1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.pointwise_conv2.weight - torch.Size([384, 1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.6.pointwise_conv2.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.gamma - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.depthwise_conv.weight - torch.Size([384, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.depthwise_conv.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.norm.weight - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.norm.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.pointwise_conv1.weight - torch.Size([1536, 384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.pointwise_conv1.bias - torch.Size([1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.pointwise_conv2.weight - torch.Size([384, 1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.7.pointwise_conv2.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.gamma - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.depthwise_conv.weight - torch.Size([384, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.depthwise_conv.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.norm.weight - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.norm.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.pointwise_conv1.weight - torch.Size([1536, 384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.pointwise_conv1.bias - torch.Size([1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.pointwise_conv2.weight - torch.Size([384, 1536]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.2.8.pointwise_conv2.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.gamma - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.depthwise_conv.weight - torch.Size([768, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.depthwise_conv.bias - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.norm.weight - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.norm.bias - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.pointwise_conv1.weight - torch.Size([3072, 768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.pointwise_conv1.bias - torch.Size([3072]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.pointwise_conv2.weight - torch.Size([768, 3072]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.0.pointwise_conv2.bias - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.gamma - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.depthwise_conv.weight - torch.Size([768, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.depthwise_conv.bias - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.norm.weight - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.norm.bias - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.pointwise_conv1.weight - torch.Size([3072, 768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.pointwise_conv1.bias - torch.Size([3072]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.pointwise_conv2.weight - torch.Size([768, 3072]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.1.pointwise_conv2.bias - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.gamma - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.depthwise_conv.weight - torch.Size([768, 1, 7, 7]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.depthwise_conv.bias - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.norm.weight - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.norm.bias - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.pointwise_conv1.weight - torch.Size([3072, 768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.pointwise_conv1.bias - torch.Size([3072]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.pointwise_conv2.weight - torch.Size([768, 3072]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.stages.3.2.pointwise_conv2.bias - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm0.weight - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm0.bias - torch.Size([96]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm1.weight - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm1.bias - torch.Size([192]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm2.weight - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm2.bias - torch.Size([384]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm3.weight - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
backbone.norm3.bias - torch.Size([768]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.0.conv.weight - torch.Size([256, 96, 1, 1]):
XavierInit: gain=1, distribution=uniform, bias=0
neck.0.lateral_convs.0.gn.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.0.gn.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.1.conv.weight - torch.Size([256, 192, 1, 1]):
XavierInit: gain=1, distribution=uniform, bias=0
neck.0.lateral_convs.1.gn.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.1.gn.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.2.conv.weight - torch.Size([256, 384, 1, 1]):
XavierInit: gain=1, distribution=uniform, bias=0
neck.0.lateral_convs.2.gn.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.2.gn.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.3.conv.weight - torch.Size([256, 768, 1, 1]):
XavierInit: gain=1, distribution=uniform, bias=0
neck.0.lateral_convs.3.gn.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.lateral_convs.3.gn.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.0.conv.weight - torch.Size([256, 256, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0
neck.0.fpn_convs.0.gn.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.0.gn.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.1.conv.weight - torch.Size([256, 256, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0
neck.0.fpn_convs.1.gn.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.1.gn.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.2.conv.weight - torch.Size([256, 256, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0
neck.0.fpn_convs.2.gn.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.2.gn.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.3.conv.weight - torch.Size([256, 256, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0
neck.0.fpn_convs.3.gn.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.0.fpn_convs.3.gn.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.1.down.conv.weight - torch.Size([256, 1024, 1, 1]):
Initialized by user-defined `init_weights` in ConvModule
neck.1.down.gn.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
neck.1.down.gn.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.conv_seg.weight - torch.Size([19, 256, 1, 1]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.conv_seg.bias - torch.Size([19]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.attentions.0.sampling_offsets.weight - torch.Size([64, 256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.attentions.0.sampling_offsets.bias - torch.Size([64]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.attentions.0.attention_weights.weight - torch.Size([32, 256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.attentions.0.attention_weights.bias - torch.Size([32]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.attentions.0.value_proj.weight - torch.Size([256, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.0.attentions.0.value_proj.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.attentions.0.output_proj.weight - torch.Size([256, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.0.attentions.0.output_proj.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.time_mlp.1.weight - torch.Size([512, 1024]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.0.time_mlp.1.bias - torch.Size([512]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.ffns.0.layers.0.0.weight - torch.Size([1024, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.0.ffns.0.layers.0.0.bias - torch.Size([1024]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.ffns.0.layers.1.weight - torch.Size([256, 1024]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.0.ffns.0.layers.1.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.norms.0.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.norms.0.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.norms.1.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.0.norms.1.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.attentions.0.sampling_offsets.weight - torch.Size([64, 256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.attentions.0.sampling_offsets.bias - torch.Size([64]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.attentions.0.attention_weights.weight - torch.Size([32, 256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.attentions.0.attention_weights.bias - torch.Size([32]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.attentions.0.value_proj.weight - torch.Size([256, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.1.attentions.0.value_proj.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.attentions.0.output_proj.weight - torch.Size([256, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.1.attentions.0.output_proj.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.time_mlp.1.weight - torch.Size([512, 1024]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.1.time_mlp.1.bias - torch.Size([512]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.ffns.0.layers.0.0.weight - torch.Size([1024, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.1.ffns.0.layers.0.0.bias - torch.Size([1024]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.ffns.0.layers.1.weight - torch.Size([256, 1024]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.1.ffns.0.layers.1.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.norms.0.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.norms.0.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.norms.1.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.1.norms.1.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.attentions.0.sampling_offsets.weight - torch.Size([64, 256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.attentions.0.sampling_offsets.bias - torch.Size([64]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.attentions.0.attention_weights.weight - torch.Size([32, 256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.attentions.0.attention_weights.bias - torch.Size([32]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.attentions.0.value_proj.weight - torch.Size([256, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.2.attentions.0.value_proj.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.attentions.0.output_proj.weight - torch.Size([256, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.2.attentions.0.output_proj.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.time_mlp.1.weight - torch.Size([512, 1024]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.2.time_mlp.1.bias - torch.Size([512]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.ffns.0.layers.0.0.weight - torch.Size([1024, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.2.ffns.0.layers.0.0.bias - torch.Size([1024]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.ffns.0.layers.1.weight - torch.Size([256, 1024]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.2.ffns.0.layers.1.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.norms.0.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.norms.0.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.norms.1.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.2.norms.1.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.attentions.0.sampling_offsets.weight - torch.Size([64, 256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.attentions.0.sampling_offsets.bias - torch.Size([64]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.attentions.0.attention_weights.weight - torch.Size([32, 256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.attentions.0.attention_weights.bias - torch.Size([32]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.attentions.0.value_proj.weight - torch.Size([256, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.3.attentions.0.value_proj.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.attentions.0.output_proj.weight - torch.Size([256, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.3.attentions.0.output_proj.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.time_mlp.1.weight - torch.Size([512, 1024]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.3.time_mlp.1.bias - torch.Size([512]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.ffns.0.layers.0.0.weight - torch.Size([1024, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.3.ffns.0.layers.0.0.bias - torch.Size([1024]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.ffns.0.layers.1.weight - torch.Size([256, 1024]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.3.ffns.0.layers.1.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.norms.0.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.norms.0.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.norms.1.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.3.norms.1.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.attentions.0.sampling_offsets.weight - torch.Size([64, 256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.attentions.0.sampling_offsets.bias - torch.Size([64]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.attentions.0.attention_weights.weight - torch.Size([32, 256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.attentions.0.attention_weights.bias - torch.Size([32]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.attentions.0.value_proj.weight - torch.Size([256, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.4.attentions.0.value_proj.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.attentions.0.output_proj.weight - torch.Size([256, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.4.attentions.0.output_proj.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.time_mlp.1.weight - torch.Size([512, 1024]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.4.time_mlp.1.bias - torch.Size([512]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.ffns.0.layers.0.0.weight - torch.Size([1024, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.4.ffns.0.layers.0.0.bias - torch.Size([1024]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.ffns.0.layers.1.weight - torch.Size([256, 1024]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.4.ffns.0.layers.1.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.norms.0.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.norms.0.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.norms.1.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.4.norms.1.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.attentions.0.sampling_offsets.weight - torch.Size([64, 256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.attentions.0.sampling_offsets.bias - torch.Size([64]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.attentions.0.attention_weights.weight - torch.Size([32, 256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.attentions.0.attention_weights.bias - torch.Size([32]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.attentions.0.value_proj.weight - torch.Size([256, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.5.attentions.0.value_proj.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.attentions.0.output_proj.weight - torch.Size([256, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.5.attentions.0.output_proj.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.time_mlp.1.weight - torch.Size([512, 1024]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.5.time_mlp.1.bias - torch.Size([512]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.ffns.0.layers.0.0.weight - torch.Size([1024, 256]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.5.ffns.0.layers.0.0.bias - torch.Size([1024]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.ffns.0.layers.1.weight - torch.Size([256, 1024]):
Initialized by user-defined `init_weights` in DeformableHeadWithTime
decode_head.encoder.layers.5.ffns.0.layers.1.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.norms.0.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.norms.0.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.norms.1.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
decode_head.encoder.layers.5.norms.1.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
auxiliary_head.conv_seg.weight - torch.Size([19, 256, 1, 1]):
NormalInit: mean=0, std=0.01, bias=0
auxiliary_head.conv_seg.bias - torch.Size([19]):
NormalInit: mean=0, std=0.01, bias=0
auxiliary_head.convs.0.conv.weight - torch.Size([256, 256, 3, 3]):
The value is the same before and after calling `init_weights` of DiffSegV22
auxiliary_head.convs.0.bn.weight - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
auxiliary_head.convs.0.bn.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
embedding_table.weight - torch.Size([20, 256]):
The value is the same before and after calling `init_weights` of DiffSegV22
transform.conv.weight - torch.Size([256, 512, 1, 1]):
Initialized by user-defined `init_weights` in ConvModule
transform.conv.bias - torch.Size([256]):
The value is the same before and after calling `init_weights` of DiffSegV22
time_mlp.0.weights - torch.Size([8]):
The value is the same before and after calling `init_weights` of DiffSegV22
time_mlp.1.weight - torch.Size([1024, 17]):
The value is the same before and after calling `init_weights` of DiffSegV22
time_mlp.1.bias - torch.Size([1024]):
The value is the same before and after calling `init_weights` of DiffSegV22
time_mlp.3.weight - torch.Size([1024, 1024]):
The value is the same before and after calling `init_weights` of DiffSegV22
time_mlp.3.bias - torch.Size([1024]):
The value is the same before and after calling `init_weights` of DiffSegV22
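One way to sanity-check the parameter table above is to total parameters per top-level module on the built model; a minimal sketch, assuming `model` from the rebuild snippet earlier:

# Sketch: per-module parameter counts for the segmentor built above.
from collections import defaultdict

totals = defaultdict(int)
for name, param in model.named_parameters():
    totals[name.split('.')[0]] += param.numel()
for module, n in sorted(totals.items()):
    print(f'{module}: {n / 1e6:.2f}M parameters')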
2023-03-08 16:52:54,536 - mmseg - INFO - DiffSegV22(
(backbone): ConvNeXt(
(downsample_layers): ModuleList(
(0): Sequential(
(0): Conv2d(3, 96, kernel_size=(4, 4), stride=(4, 4))
(1): LayerNorm2d((96,), eps=1e-06, elementwise_affine=True)
)
(1): Sequential(
(0): LayerNorm2d((96,), eps=1e-05, elementwise_affine=True)
(1): Conv2d(96, 192, kernel_size=(2, 2), stride=(2, 2))
)
(2): Sequential(
(0): LayerNorm2d((192,), eps=1e-05, elementwise_affine=True)
(1): Conv2d(192, 384, kernel_size=(2, 2), stride=(2, 2))
)
(3): Sequential(
(0): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True)
(1): Conv2d(384, 768, kernel_size=(2, 2), stride=(2, 2))
)
)
(stages): ModuleList(
(0): Sequential(
(0): ConvNeXtBlock(
(depthwise_conv): Conv2d(96, 96, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=96)
(norm): LayerNorm2d((96,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=96, out_features=384, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=384, out_features=96, bias=True)
(drop_path): Identity()
)
(1): ConvNeXtBlock(
(depthwise_conv): Conv2d(96, 96, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=96)
(norm): LayerNorm2d((96,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=96, out_features=384, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=384, out_features=96, bias=True)
(drop_path): DropPath()
)
(2): ConvNeXtBlock(
(depthwise_conv): Conv2d(96, 96, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=96)
(norm): LayerNorm2d((96,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=96, out_features=384, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=384, out_features=96, bias=True)
(drop_path): DropPath()
)
)
(1): Sequential(
(0): ConvNeXtBlock(
(depthwise_conv): Conv2d(192, 192, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=192)
(norm): LayerNorm2d((192,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=192, out_features=768, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=768, out_features=192, bias=True)
(drop_path): DropPath()
)
(1): ConvNeXtBlock(
(depthwise_conv): Conv2d(192, 192, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=192)
(norm): LayerNorm2d((192,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=192, out_features=768, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=768, out_features=192, bias=True)
(drop_path): DropPath()
)
(2): ConvNeXtBlock(
(depthwise_conv): Conv2d(192, 192, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=192)
(norm): LayerNorm2d((192,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=192, out_features=768, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=768, out_features=192, bias=True)
(drop_path): DropPath()
)
)
(2): Sequential(
(0): ConvNeXtBlock(
(depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
(norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True)
(drop_path): DropPath()
)
(1): ConvNeXtBlock(
(depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
(norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True)
(drop_path): DropPath()
)
(2): ConvNeXtBlock(
(depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
(norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True)
(drop_path): DropPath()
)
(3): ConvNeXtBlock(
(depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
(norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True)
(drop_path): DropPath()
)
(4): ConvNeXtBlock(
(depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
(norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True)
(drop_path): DropPath()
)
(5): ConvNeXtBlock(
(depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
(norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True)
(drop_path): DropPath()
)
(6): ConvNeXtBlock(
(depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
(norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True)
(drop_path): DropPath()
)
(7): ConvNeXtBlock(
(depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
(norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True)
(drop_path): DropPath()
)
(8): ConvNeXtBlock(
(depthwise_conv): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
(norm): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=384, out_features=1536, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=1536, out_features=384, bias=True)
(drop_path): DropPath()
)
)
(3): Sequential(
(0): ConvNeXtBlock(
(depthwise_conv): Conv2d(768, 768, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=768)
(norm): LayerNorm2d((768,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=768, out_features=3072, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=3072, out_features=768, bias=True)
(drop_path): DropPath()
)
(1): ConvNeXtBlock(
(depthwise_conv): Conv2d(768, 768, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=768)
(norm): LayerNorm2d((768,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=768, out_features=3072, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=3072, out_features=768, bias=True)
(drop_path): DropPath()
)
(2): ConvNeXtBlock(
(depthwise_conv): Conv2d(768, 768, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=768)
(norm): LayerNorm2d((768,), eps=1e-06, elementwise_affine=True)
(pointwise_conv1): Linear(in_features=768, out_features=3072, bias=True)
(act): GELU()
(pointwise_conv2): Linear(in_features=3072, out_features=768, bias=True)
(drop_path): DropPath()
)
)
)
(norm0): LayerNorm2d((96,), eps=1e-06, elementwise_affine=True)
(norm1): LayerNorm2d((192,), eps=1e-06, elementwise_affine=True)
(norm2): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
(norm3): LayerNorm2d((768,), eps=1e-06, elementwise_affine=True)
)
(neck): Sequential(
(0): FPN(
(lateral_convs): ModuleList(
(0): ConvModule(
(conv): Conv2d(96, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(gn): GroupNorm(32, 256, eps=1e-05, affine=True)
)
(1): ConvModule(
(conv): Conv2d(192, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(gn): GroupNorm(32, 256, eps=1e-05, affine=True)
)
(2): ConvModule(
(conv): Conv2d(384, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(gn): GroupNorm(32, 256, eps=1e-05, affine=True)
)
(3): ConvModule(
(conv): Conv2d(768, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(gn): GroupNorm(32, 256, eps=1e-05, affine=True)
)
)
(fpn_convs): ModuleList(
(0): ConvModule(
(conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(gn): GroupNorm(32, 256, eps=1e-05, affine=True)
)
(1): ConvModule(
(conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(gn): GroupNorm(32, 256, eps=1e-05, affine=True)
)
(2): ConvModule(
(conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(gn): GroupNorm(32, 256, eps=1e-05, affine=True)
)
(3): ConvModule(
(conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(gn): GroupNorm(32, 256, eps=1e-05, affine=True)
)
)
)
init_cfg={'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
(1): MultiStageMerging(
(down): ConvModule(
(conv): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(gn): GroupNorm(32, 256, eps=1e-05, affine=True)
)
)
init_cfg={'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
)
(decode_head): DeformableHeadWithTime(
input_transform=multiple_select, ignore_index=255, align_corners=False
(loss_decode): CrossEntropyLoss(avg_non_ignore=False)
(conv_seg): Conv2d(256, 19, kernel_size=(1, 1), stride=(1, 1))
(encoder): DetrTransformerEncoder(
(layers): ModuleList(
(0): BaseTransformerLayer(
(attentions): ModuleList(
(0): MultiScaleDeformableAttention(
(dropout): Dropout(p=0.0, inplace=False)
(sampling_offsets): Linear(in_features=256, out_features=64, bias=True)
(attention_weights): Linear(in_features=256, out_features=32, bias=True)
(value_proj): Linear(in_features=256, out_features=256, bias=True)
(output_proj): Linear(in_features=256, out_features=256, bias=True)
)
)
(time_mlp): Sequential(
(0): SiLU()
(1): Linear(in_features=1024, out_features=512, bias=True)
)
(ffns): ModuleList(
(0): FFN(
(activate): GELU()
(layers): Sequential(
(0): Sequential(
(0): Linear(in_features=256, out_features=1024, bias=True)
(1): GELU()
(2): Dropout(p=0.0, inplace=False)
)
(1): Linear(in_features=1024, out_features=256, bias=True)
(2): Dropout(p=0.0, inplace=False)
)
(dropout_layer): Identity()
)
)
(norms): ModuleList(
(0): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
)
(1): BaseTransformerLayer(
(attentions): ModuleList(
(0): MultiScaleDeformableAttention(
(dropout): Dropout(p=0.0, inplace=False)
(sampling_offsets): Linear(in_features=256, out_features=64, bias=True)
(attention_weights): Linear(in_features=256, out_features=32, bias=True)
(value_proj): Linear(in_features=256, out_features=256, bias=True)
(output_proj): Linear(in_features=256, out_features=256, bias=True)
)
)
(time_mlp): Sequential(
(0): SiLU()
(1): Linear(in_features=1024, out_features=512, bias=True)
)
(ffns): ModuleList(
(0): FFN(
(activate): GELU()
(layers): Sequential(
(0): Sequential(
(0): Linear(in_features=256, out_features=1024, bias=True)
(1): GELU()
(2): Dropout(p=0.0, inplace=False)
)
(1): Linear(in_features=1024, out_features=256, bias=True)
(2): Dropout(p=0.0, inplace=False)
)
(dropout_layer): Identity()
)
)
(norms): ModuleList(
(0): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
)
(2): BaseTransformerLayer(
(attentions): ModuleList(
(0): MultiScaleDeformableAttention(
(dropout): Dropout(p=0.0, inplace=False)
(sampling_offsets): Linear(in_features=256, out_features=64, bias=True)
(attention_weights): Linear(in_features=256, out_features=32, bias=True)
(value_proj): Linear(in_features=256, out_features=256, bias=True)
(output_proj): Linear(in_features=256, out_features=256, bias=True)
)
)
(time_mlp): Sequential(
(0): SiLU()
(1): Linear(in_features=1024, out_features=512, bias=True)
)
(ffns): ModuleList(
(0): FFN(
(activate): GELU()
(layers): Sequential(
(0): Sequential(
(0): Linear(in_features=256, out_features=1024, bias=True)
(1): GELU()
(2): Dropout(p=0.0, inplace=False)
)
(1): Linear(in_features=1024, out_features=256, bias=True)
(2): Dropout(p=0.0, inplace=False)
)
(dropout_layer): Identity()
)
)
(norms): ModuleList(
(0): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
)
(3): BaseTransformerLayer(
(attentions): ModuleList(
(0): MultiScaleDeformableAttention(
(dropout): Dropout(p=0.0, inplace=False)
(sampling_offsets): Linear(in_features=256, out_features=64, bias=True)
(attention_weights): Linear(in_features=256, out_features=32, bias=True)
(value_proj): Linear(in_features=256, out_features=256, bias=True)
(output_proj): Linear(in_features=256, out_features=256, bias=True)
)
)
(time_mlp): Sequential(
(0): SiLU()
(1): Linear(in_features=1024, out_features=512, bias=True)
)
(ffns): ModuleList(
(0): FFN(
(activate): GELU()
(layers): Sequential(
(0): Sequential(
(0): Linear(in_features=256, out_features=1024, bias=True)
(1): GELU()
(2): Dropout(p=0.0, inplace=False)
)
(1): Linear(in_features=1024, out_features=256, bias=True)
(2): Dropout(p=0.0, inplace=False)
)
(dropout_layer): Identity()
)
)
(norms): ModuleList(
(0): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
)
(4): BaseTransformerLayer(
(attentions): ModuleList(
(0): MultiScaleDeformableAttention(
(dropout): Dropout(p=0.0, inplace=False)
(sampling_offsets): Linear(in_features=256, out_features=64, bias=True)
(attention_weights): Linear(in_features=256, out_features=32, bias=True)
(value_proj): Linear(in_features=256, out_features=256, bias=True)
(output_proj): Linear(in_features=256, out_features=256, bias=True)
)
)
(time_mlp): Sequential(
(0): SiLU()
(1): Linear(in_features=1024, out_features=512, bias=True)
)
(ffns): ModuleList(
(0): FFN(
(activate): GELU()
(layers): Sequential(
(0): Sequential(
(0): Linear(in_features=256, out_features=1024, bias=True)
(1): GELU()
(2): Dropout(p=0.0, inplace=False)
)
(1): Linear(in_features=1024, out_features=256, bias=True)
(2): Dropout(p=0.0, inplace=False)
)
(dropout_layer): Identity()
)
)
(norms): ModuleList(
(0): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
)
(5): BaseTransformerLayer(
(attentions): ModuleList(
(0): MultiScaleDeformableAttention(
(dropout): Dropout(p=0.0, inplace=False)
(sampling_offsets): Linear(in_features=256, out_features=64, bias=True)
(attention_weights): Linear(in_features=256, out_features=32, bias=True)
(value_proj): Linear(in_features=256, out_features=256, bias=True)
(output_proj): Linear(in_features=256, out_features=256, bias=True)
)
)
(time_mlp): Sequential(
(0): SiLU()
(1): Linear(in_features=1024, out_features=512, bias=True)
)
(ffns): ModuleList(
(0): FFN(
(activate): GELU()
(layers): Sequential(
(0): Sequential(
(0): Linear(in_features=256, out_features=1024, bias=True)
(1): GELU()
(2): Dropout(p=0.0, inplace=False)
)
(1): Linear(in_features=1024, out_features=256, bias=True)
(2): Dropout(p=0.0, inplace=False)
)
(dropout_layer): Identity()
)
)
(norms): ModuleList(
(0): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
)
)
)
(positional_encoding): SinePositionalEncoding(num_feats=128, temperature=10000, normalize=True, scale=6.283185307179586, eps=1e-06)
)
init_cfg={'type': 'Normal', 'std': 0.01, 'override': {'name': 'conv_seg'}}
(auxiliary_head): FCNHead(
input_transform=None, ignore_index=255, align_corners=False
(loss_decode): CrossEntropyLoss(avg_non_ignore=False)
(conv_seg): Conv2d(256, 19, kernel_size=(1, 1), stride=(1, 1))
(dropout): Dropout2d(p=0.1, inplace=False)
(convs): Sequential(
(0): ConvModule(
(conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): SyncBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(activate): ReLU(inplace=True)
)
)
)
init_cfg={'type': 'Normal', 'std': 0.01, 'override': {'name': 'conv_seg'}}
(embedding_table): Embedding(20, 256)
(transform): ConvModule(
(conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
)
(time_mlp): Sequential(
(0): LearnedSinusoidalPosEmb()
(1): Linear(in_features=17, out_features=1024, bias=True)
(2): GELU()
(3): Linear(in_features=1024, out_features=1024, bias=True)
)
)
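------------------------------------------------------------
Note: the shapes in the dump pin down the time embedding fairly precisely: `time_mlp.0.weights` holds 8 learned frequencies, whose sin/cos values (16 features) are concatenated with the raw timestep to give the 17 inputs consumed by `time_mlp.1`. A minimal sketch consistent with those shapes (modeled on the LearnedSinusoidalPosEmb used in common diffusion codebases; the actual implementation here may differ in detail):

import math
import torch
import torch.nn as nn

class LearnedSinusoidalPosEmb(nn.Module):
    def __init__(self, dim: int = 16):
        super().__init__()
        assert dim % 2 == 0
        self.weights = nn.Parameter(torch.randn(dim // 2))  # torch.Size([8]) for dim=16

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: (B,) diffusion timesteps -> (B, dim + 1) features
        t = t[:, None]                                   # (B, 1)
        freqs = t * self.weights[None, :] * 2 * math.pi  # (B, dim // 2)
        fourier = torch.cat((freqs.sin(), freqs.cos()), dim=-1)
        return torch.cat((t, fourier), dim=-1)           # (B, 17) for dim=16

time_mlp = nn.Sequential(
    LearnedSinusoidalPosEmb(16),
    nn.Linear(17, 1024),   # time_mlp.1 in the dump
    nn.GELU(),
    nn.Linear(1024, 1024), # time_mlp.3 in the dump
)
emb = time_mlp(torch.rand(4))  # (4, 1024)
------------------------------------------------------------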
2023-03-08 16:52:54,542 - mmseg - INFO - Model size: 136.06
2023-03-08 16:52:54,588 - mmseg - INFO - Loaded 2975 images
2023-03-08 16:52:55,010 - mmseg - INFO - Loaded 500 images
2023-03-08 16:52:55,011 - mmseg - INFO - load checkpoint from local path: work_dirs/deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_best_mIoU_iter_128000_.pth
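------------------------------------------------------------
Note: the run is warm-started from a checkpoint of the earlier diffv20 experiment. A minimal sketch of the corresponding mmcv call (the path is copied from the line above; `model` is assumed to be the already-built DiffSegV22 instance):

from mmcv.runner import load_checkpoint

load_checkpoint(
    model,
    'work_dirs/deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_best_mIoU_iter_128000_.pth',
    map_location='cpu',
)
------------------------------------------------------------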
2023-03-08 16:52:55,104 - mmseg - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH ) PolyLrUpdaterHook
(NORMAL ) CheckpointHook
(LOW ) DistEvalHook
(VERY_LOW ) TextLoggerHook
--------------------
before_train_epoch:
(VERY_HIGH ) PolyLrUpdaterHook
(LOW ) IterTimerHook
(LOW ) DistEvalHook
(VERY_LOW ) TextLoggerHook
--------------------
before_train_iter:
(VERY_HIGH ) PolyLrUpdaterHook
(LOW ) IterTimerHook
(LOW ) DistEvalHook
--------------------
after_train_iter:
(ABOVE_NORMAL) OptimizerHook
(NORMAL ) CheckpointHook
(LOW ) IterTimerHook
(LOW ) DistEvalHook
(VERY_LOW ) TextLoggerHook
--------------------
after_train_epoch:
(NORMAL ) CheckpointHook
(LOW ) DistEvalHook
(VERY_LOW ) TextLoggerHook
--------------------
before_val_epoch:
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
--------------------
before_val_iter:
(LOW ) IterTimerHook
--------------------
after_val_iter:
(LOW ) IterTimerHook
--------------------
after_val_epoch:
(VERY_LOW ) TextLoggerHook
--------------------
after_run:
(VERY_LOW ) TextLoggerHook
--------------------
2023-03-08 16:52:55,104 - mmseg - INFO - workflow: [('train', 1)], max: 20000 iters
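------------------------------------------------------------
Note: the logged learning rates are consistent with a linear warmup followed by a polynomial decay of power 1.0: the lr climbs for the first ~1500 iterations, peaks near 5.55e-06, and the values at iters 2000/2500/3000/4000 match 6e-06 * (1 - iter/20000) exactly. A minimal sketch of that schedule (base_lr=6e-06, warmup_iters=1500, and power=1.0 are inferred from the log, not read from the config; PolyLrUpdaterHook applies the warmup factor on top of the decayed lr):

def poly_lr(i: int, base_lr: float = 6e-6, max_iters: int = 20000,
            power: float = 1.0, min_lr: float = 0.0,
            warmup_iters: int = 1500) -> float:
    """Learning rate at 0-indexed iteration i."""
    lr = (base_lr - min_lr) * (1 - i / max_iters) ** power + min_lr
    if i < warmup_iters:               # linear warmup scales the decayed lr
        lr *= i / warmup_iters
    return lr

# e.g. poly_lr(49) ~= 1.955e-07 and poly_lr(1999) ~= 5.400e-06,
# matching the "Iter [50/...]" and "Iter [2000/...]" lines below.
------------------------------------------------------------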
2023-03-08 16:53:49,599 - mmseg - INFO - Iter [50/20000] lr: 1.955e-07, eta: 4:35:26, time: 0.828, data_time: 0.017, memory: 22072, pred_decode.loss_ce: 0.1134, pred_decode.acc_seg: 96.4060, aux.loss_ce: 0.0416, aux.acc_seg: 96.0319, loss: 0.1550
2023-03-08 16:54:39,586 - mmseg - INFO - Iter [100/20000] lr: 3.940e-07, eta: 5:03:10, time: 1.000, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.1234, pred_decode.acc_seg: 96.2790, aux.loss_ce: 0.0412, aux.acc_seg: 95.9889, loss: 0.1646
2023-03-08 16:55:29,563 - mmseg - INFO - Iter [150/20000] lr: 5.916e-07, eta: 5:11:50, time: 1.000, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.1157, pred_decode.acc_seg: 96.3692, aux.loss_ce: 0.0419, aux.acc_seg: 96.0071, loss: 0.1576
2023-03-08 16:56:22,387 - mmseg - INFO - Iter [200/20000] lr: 7.881e-07, eta: 5:20:26, time: 1.056, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0981, pred_decode.acc_seg: 96.6626, aux.loss_ce: 0.0380, aux.acc_seg: 96.2943, loss: 0.1362
2023-03-08 16:56:49,326 - mmseg - INFO - Iter [250/20000] lr: 9.836e-07, eta: 4:51:11, time: 0.539, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.1044, pred_decode.acc_seg: 96.5844, aux.loss_ce: 0.0392, aux.acc_seg: 96.2089, loss: 0.1437
2023-03-08 16:57:12,525 - mmseg - INFO - Iter [300/20000] lr: 1.178e-06, eta: 4:27:25, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.1116, pred_decode.acc_seg: 96.5339, aux.loss_ce: 0.0394, aux.acc_seg: 96.1893, loss: 0.1511
2023-03-08 16:57:35,716 - mmseg - INFO - Iter [350/20000] lr: 1.372e-06, eta: 4:10:20, time: 0.464, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.1016, pred_decode.acc_seg: 96.5901, aux.loss_ce: 0.0400, aux.acc_seg: 96.1521, loss: 0.1416
2023-03-08 16:58:01,484 - mmseg - INFO - Iter [400/20000] lr: 1.564e-06, eta: 3:59:32, time: 0.515, data_time: 0.056, memory: 22072, pred_decode.loss_ce: 0.1042, pred_decode.acc_seg: 96.4568, aux.loss_ce: 0.0406, aux.acc_seg: 96.0870, loss: 0.1448
2023-03-08 16:58:24,666 - mmseg - INFO - Iter [450/20000] lr: 1.756e-06, eta: 3:49:09, time: 0.464, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.0918, pred_decode.acc_seg: 96.6885, aux.loss_ce: 0.0396, aux.acc_seg: 96.1739, loss: 0.1315
2023-03-08 16:58:47,940 - mmseg - INFO - Iter [500/20000] lr: 1.946e-06, eta: 3:40:50, time: 0.465, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.0912, pred_decode.acc_seg: 96.6136, aux.loss_ce: 0.0400, aux.acc_seg: 96.1327, loss: 0.1312
2023-03-08 17:00:26,107 - mmseg - INFO - per class results:
2023-03-08 17:00:26,108 - mmseg - INFO -
+---------------+-------+-------+
| Class | IoU | Acc |
+---------------+-------+-------+
| road | 98.64 | 99.39 |
| sidewalk | 87.98 | 94.05 |
| building | 93.41 | 97.38 |
| wall | 56.4 | 60.95 |
| fence | 65.57 | 74.13 |
| pole | 71.29 | 81.93 |
| traffic light | 75.85 | 86.07 |
| traffic sign | 83.52 | 89.53 |
| vegetation | 93.03 | 96.38 |
| terrain | 65.37 | 72.62 |
| sky | 95.37 | 98.56 |
| person | 85.16 | 92.44 |
| rider | 66.45 | 77.64 |
| car | 96.26 | 98.24 |
| truck | 87.35 | 91.57 |
| bus | 92.94 | 95.42 |
| train | 88.41 | 90.8 |
| motorcycle | 72.27 | 79.13 |
| bicycle | 81.62 | 91.82 |
+---------------+-------+-------+
2023-03-08 17:00:26,108 - mmseg - INFO - Summary:
2023-03-08 17:00:26,108 - mmseg - INFO -
+-------+-------+-------+
| aAcc | mIoU | mAcc |
+-------+-------+-------+
| 96.72 | 81.94 | 87.79 |
+-------+-------+-------+
2023-03-08 17:00:26,782 - mmseg - INFO - Now best checkpoint is saved as best_mIoU_iter_500.pth.
2023-03-08 17:00:26,782 - mmseg - INFO - Best mIoU is 0.8194 at 500 iter.
2023-03-08 17:00:26,782 - mmseg - INFO - Iter(val) [125] aAcc: 0.9672, mIoU: 0.8194, mAcc: 0.8779, IoU.road: 0.9864, IoU.sidewalk: 0.8798, IoU.building: 0.9341, IoU.wall: 0.5640, IoU.fence: 0.6557, IoU.pole: 0.7129, IoU.traffic light: 0.7585, IoU.traffic sign: 0.8352, IoU.vegetation: 0.9303, IoU.terrain: 0.6537, IoU.sky: 0.9537, IoU.person: 0.8516, IoU.rider: 0.6645, IoU.car: 0.9626, IoU.truck: 0.8735, IoU.bus: 0.9294, IoU.train: 0.8841, IoU.motorcycle: 0.7227, IoU.bicycle: 0.8162, Acc.road: 0.9939, Acc.sidewalk: 0.9405, Acc.building: 0.9738, Acc.wall: 0.6095, Acc.fence: 0.7413, Acc.pole: 0.8193, Acc.traffic light: 0.8607, Acc.traffic sign: 0.8953, Acc.vegetation: 0.9638, Acc.terrain: 0.7262, Acc.sky: 0.9856, Acc.person: 0.9244, Acc.rider: 0.7764, Acc.car: 0.9824, Acc.truck: 0.9157, Acc.bus: 0.9542, Acc.train: 0.9080, Acc.motorcycle: 0.7913, Acc.bicycle: 0.9182
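------------------------------------------------------------
Note: the mIoU in the Summary is the unweighted mean of the 19 per-class IoUs in the table above, which makes for a quick sanity check on any eval dump:

iou = [98.64, 87.98, 93.41, 56.40, 65.57, 71.29, 75.85, 83.52, 93.03, 65.37,
       95.37, 85.16, 66.45, 96.26, 87.35, 92.94, 88.41, 72.27, 81.62]
print(round(sum(iou) / len(iou), 2))  # 81.94, matching the Summary row
------------------------------------------------------------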
2023-03-08 17:00:49,956 - mmseg - INFO - Iter [550/20000] lr: 2.136e-06, eta: 4:32:10, time: 2.440, data_time: 1.985, memory: 22072, pred_decode.loss_ce: 0.0908, pred_decode.acc_seg: 96.6846, aux.loss_ce: 0.0395, aux.acc_seg: 96.1980, loss: 0.1302
2023-03-08 17:01:15,810 - mmseg - INFO - Iter [600/20000] lr: 2.324e-06, eta: 4:22:46, time: 0.517, data_time: 0.058, memory: 22072, pred_decode.loss_ce: 0.0918, pred_decode.acc_seg: 96.6927, aux.loss_ce: 0.0392, aux.acc_seg: 96.2090, loss: 0.1310
2023-03-08 17:01:39,047 - mmseg - INFO - Iter [650/20000] lr: 2.512e-06, eta: 4:13:28, time: 0.465, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.0917, pred_decode.acc_seg: 96.6253, aux.loss_ce: 0.0403, aux.acc_seg: 96.0885, loss: 0.1320
2023-03-08 17:02:02,342 - mmseg - INFO - Iter [700/20000] lr: 2.698e-06, eta: 4:05:27, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0946, pred_decode.acc_seg: 96.6179, aux.loss_ce: 0.0404, aux.acc_seg: 96.1226, loss: 0.1350
2023-03-08 17:02:28,215 - mmseg - INFO - Iter [750/20000] lr: 2.884e-06, eta: 3:59:34, time: 0.517, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0908, pred_decode.acc_seg: 96.6657, aux.loss_ce: 0.0400, aux.acc_seg: 96.1454, loss: 0.1308
2023-03-08 17:02:51,326 - mmseg - INFO - Iter [800/20000] lr: 3.068e-06, eta: 3:53:15, time: 0.462, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.0878, pred_decode.acc_seg: 96.7597, aux.loss_ce: 0.0383, aux.acc_seg: 96.3302, loss: 0.1261
2023-03-08 17:03:14,395 - mmseg - INFO - Iter [850/20000] lr: 3.252e-06, eta: 3:47:37, time: 0.461, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.0909, pred_decode.acc_seg: 96.6438, aux.loss_ce: 0.0398, aux.acc_seg: 96.1616, loss: 0.1306
2023-03-08 17:03:37,519 - mmseg - INFO - Iter [900/20000] lr: 3.434e-06, eta: 3:42:35, time: 0.462, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.0904, pred_decode.acc_seg: 96.6393, aux.loss_ce: 0.0400, aux.acc_seg: 96.1476, loss: 0.1304
2023-03-08 17:04:03,251 - mmseg - INFO - Iter [950/20000] lr: 3.616e-06, eta: 3:38:55, time: 0.515, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0865, pred_decode.acc_seg: 96.6811, aux.loss_ce: 0.0398, aux.acc_seg: 96.1517, loss: 0.1263
2023-03-08 17:04:26,503 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py
2023-03-08 17:04:26,503 - mmseg - INFO - Iter [1000/20000] lr: 3.796e-06, eta: 3:34:48, time: 0.465, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0842, pred_decode.acc_seg: 96.7599, aux.loss_ce: 0.0384, aux.acc_seg: 96.2946, loss: 0.1226
2023-03-08 17:05:51,586 - mmseg - INFO - per class results:
2023-03-08 17:05:51,588 - mmseg - INFO -
+---------------+-------+-------+
| Class | IoU | Acc |
+---------------+-------+-------+
| road | 98.68 | 99.26 |
| sidewalk | 88.34 | 94.66 |
| building | 93.57 | 97.3 |
| wall | 56.21 | 61.02 |
| fence | 66.41 | 74.31 |
| pole | 72.98 | 84.95 |
| traffic light | 75.88 | 85.72 |
| traffic sign | 83.89 | 89.82 |
| vegetation | 93.19 | 96.64 |
| terrain | 65.03 | 73.42 |
| sky | 95.37 | 98.52 |
| person | 85.27 | 92.41 |
| rider | 66.91 | 79.02 |
| car | 96.2 | 98.24 |
| truck | 85.96 | 89.58 |
| bus | 93.18 | 95.41 |
| train | 88.94 | 91.51 |
| motorcycle | 72.76 | 80.83 |
| bicycle | 81.68 | 91.57 |
+---------------+-------+-------+
2023-03-08 17:05:51,588 - mmseg - INFO - Summary:
2023-03-08 17:05:51,588 - mmseg - INFO -
+-------+-------+-------+
| aAcc | mIoU | mAcc |
+-------+-------+-------+
| 96.78 | 82.13 | 88.12 |
+-------+-------+-------+
2023-03-08 17:05:52,308 - mmseg - INFO - Now best checkpoint is saved as best_mIoU_iter_1000.pth.
2023-03-08 17:05:52,308 - mmseg - INFO - Best mIoU is 0.8213 at 1000 iter.
2023-03-08 17:05:52,308 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py
2023-03-08 17:05:52,308 - mmseg - INFO - Iter(val) [125] aAcc: 0.9678, mIoU: 0.8213, mAcc: 0.8812, IoU.road: 0.9868, IoU.sidewalk: 0.8834, IoU.building: 0.9357, IoU.wall: 0.5621, IoU.fence: 0.6641, IoU.pole: 0.7298, IoU.traffic light: 0.7588, IoU.traffic sign: 0.8389, IoU.vegetation: 0.9319, IoU.terrain: 0.6503, IoU.sky: 0.9537, IoU.person: 0.8527, IoU.rider: 0.6691, IoU.car: 0.9620, IoU.truck: 0.8596, IoU.bus: 0.9318, IoU.train: 0.8894, IoU.motorcycle: 0.7276, IoU.bicycle: 0.8168, Acc.road: 0.9926, Acc.sidewalk: 0.9466, Acc.building: 0.9730, Acc.wall: 0.6102, Acc.fence: 0.7431, Acc.pole: 0.8495, Acc.traffic light: 0.8572, Acc.traffic sign: 0.8982, Acc.vegetation: 0.9664, Acc.terrain: 0.7342, Acc.sky: 0.9852, Acc.person: 0.9241, Acc.rider: 0.7902, Acc.car: 0.9824, Acc.truck: 0.8958, Acc.bus: 0.9541, Acc.train: 0.9151, Acc.motorcycle: 0.8083, Acc.bicycle: 0.9157
2023-03-08 17:06:15,596 - mmseg - INFO - Iter [1050/20000] lr: 3.976e-06, eta: 3:56:50, time: 2.182, data_time: 1.724, memory: 22072, pred_decode.loss_ce: 0.0849, pred_decode.acc_seg: 96.7223, aux.loss_ce: 0.0398, aux.acc_seg: 96.1278, loss: 0.1247
2023-03-08 17:06:38,916 - mmseg - INFO - Iter [1100/20000] lr: 4.154e-06, eta: 3:52:09, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0914, pred_decode.acc_seg: 96.5312, aux.loss_ce: 0.0413, aux.acc_seg: 96.0190, loss: 0.1327
2023-03-08 17:07:04,789 - mmseg - INFO - Iter [1150/20000] lr: 4.332e-06, eta: 3:48:33, time: 0.517, data_time: 0.060, memory: 22072, pred_decode.loss_ce: 0.0853, pred_decode.acc_seg: 96.7118, aux.loss_ce: 0.0400, aux.acc_seg: 96.1225, loss: 0.1253
2023-03-08 17:07:28,078 - mmseg - INFO - Iter [1200/20000] lr: 4.508e-06, eta: 3:44:31, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0851, pred_decode.acc_seg: 96.7069, aux.loss_ce: 0.0398, aux.acc_seg: 96.1556, loss: 0.1249
2023-03-08 17:07:51,414 - mmseg - INFO - Iter [1250/20000] lr: 4.684e-06, eta: 3:40:48, time: 0.467, data_time: 0.009, memory: 22072, pred_decode.loss_ce: 0.0886, pred_decode.acc_seg: 96.6296, aux.loss_ce: 0.0406, aux.acc_seg: 96.0961, loss: 0.1292
2023-03-08 17:08:14,724 - mmseg - INFO - Iter [1300/20000] lr: 4.859e-06, eta: 3:37:20, time: 0.466, data_time: 0.009, memory: 22072, pred_decode.loss_ce: 0.0869, pred_decode.acc_seg: 96.6646, aux.loss_ce: 0.0398, aux.acc_seg: 96.1204, loss: 0.1267
2023-03-08 17:08:40,655 - mmseg - INFO - Iter [1350/20000] lr: 5.032e-06, eta: 3:34:41, time: 0.518, data_time: 0.060, memory: 22072, pred_decode.loss_ce: 0.0856, pred_decode.acc_seg: 96.6600, aux.loss_ce: 0.0407, aux.acc_seg: 96.0743, loss: 0.1263
2023-03-08 17:09:03,890 - mmseg - INFO - Iter [1400/20000] lr: 5.205e-06, eta: 3:31:37, time: 0.465, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0856, pred_decode.acc_seg: 96.6822, aux.loss_ce: 0.0403, aux.acc_seg: 96.1359, loss: 0.1259
2023-03-08 17:09:27,178 - mmseg - INFO - Iter [1450/20000] lr: 5.376e-06, eta: 3:28:44, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0844, pred_decode.acc_seg: 96.7204, aux.loss_ce: 0.0394, aux.acc_seg: 96.1896, loss: 0.1238
2023-03-08 17:09:52,921 - mmseg - INFO - Iter [1500/20000] lr: 5.547e-06, eta: 3:26:31, time: 0.515, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0835, pred_decode.acc_seg: 96.7633, aux.loss_ce: 0.0395, aux.acc_seg: 96.2057, loss: 0.1230
2023-03-08 17:11:17,659 - mmseg - INFO - per class results:
2023-03-08 17:11:17,660 - mmseg - INFO -
+---------------+-------+-------+
| Class | IoU | Acc |
+---------------+-------+-------+
| road | 98.71 | 99.24 |
| sidewalk | 88.59 | 95.02 |
| building | 93.6 | 97.35 |
| wall | 56.06 | 60.02 |
| fence | 65.81 | 73.91 |
| pole | 72.88 | 83.61 |
| traffic light | 75.84 | 85.46 |
| traffic sign | 83.75 | 90.61 |
| vegetation | 93.18 | 96.7 |
| terrain | 65.44 | 73.0 |
| sky | 95.35 | 98.57 |
| person | 85.49 | 92.15 |
| rider | 68.08 | 80.85 |
| car | 96.2 | 98.39 |
| truck | 86.85 | 90.2 |
| bus | 92.48 | 95.25 |
| train | 86.48 | 88.52 |
| motorcycle | 72.05 | 79.76 |
| bicycle | 81.35 | 92.05 |
+---------------+-------+-------+
2023-03-08 17:11:17,660 - mmseg - INFO - Summary:
2023-03-08 17:11:17,660 - mmseg - INFO -
+------+-------+-------+
| aAcc | mIoU | mAcc |
+------+-------+-------+
| 96.8 | 82.01 | 87.93 |
+------+-------+-------+
2023-03-08 17:11:17,661 - mmseg - INFO - Iter(val) [125] aAcc: 0.9680, mIoU: 0.8201, mAcc: 0.8793, IoU.road: 0.9871, IoU.sidewalk: 0.8859, IoU.building: 0.9360, IoU.wall: 0.5606, IoU.fence: 0.6581, IoU.pole: 0.7288, IoU.traffic light: 0.7584, IoU.traffic sign: 0.8375, IoU.vegetation: 0.9318, IoU.terrain: 0.6544, IoU.sky: 0.9535, IoU.person: 0.8549, IoU.rider: 0.6808, IoU.car: 0.9620, IoU.truck: 0.8685, IoU.bus: 0.9248, IoU.train: 0.8648, IoU.motorcycle: 0.7205, IoU.bicycle: 0.8135, Acc.road: 0.9924, Acc.sidewalk: 0.9502, Acc.building: 0.9735, Acc.wall: 0.6002, Acc.fence: 0.7391, Acc.pole: 0.8361, Acc.traffic light: 0.8546, Acc.traffic sign: 0.9061, Acc.vegetation: 0.9670, Acc.terrain: 0.7300, Acc.sky: 0.9857, Acc.person: 0.9215, Acc.rider: 0.8085, Acc.car: 0.9839, Acc.truck: 0.9020, Acc.bus: 0.9525, Acc.train: 0.8852, Acc.motorcycle: 0.7976, Acc.bicycle: 0.9205
2023-03-08 17:11:40,975 - mmseg - INFO - Iter [1550/20000] lr: 5.535e-06, eta: 3:40:45, time: 2.161, data_time: 1.703, memory: 22072, pred_decode.loss_ce: 0.0857, pred_decode.acc_seg: 96.6469, aux.loss_ce: 0.0414, aux.acc_seg: 96.0427, loss: 0.1270
2023-03-08 17:12:04,297 - mmseg - INFO - Iter [1600/20000] lr: 5.520e-06, eta: 3:37:45, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0882, pred_decode.acc_seg: 96.5887, aux.loss_ce: 0.0410, aux.acc_seg: 96.0157, loss: 0.1293
2023-03-08 17:12:27,573 - mmseg - INFO - Iter [1650/20000] lr: 5.505e-06, eta: 3:34:53, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0857, pred_decode.acc_seg: 96.6629, aux.loss_ce: 0.0404, aux.acc_seg: 96.1349, loss: 0.1261
2023-03-08 17:12:53,582 - mmseg - INFO - Iter [1700/20000] lr: 5.490e-06, eta: 3:32:40, time: 0.520, data_time: 0.060, memory: 22072, pred_decode.loss_ce: 0.0819, pred_decode.acc_seg: 96.7658, aux.loss_ce: 0.0390, aux.acc_seg: 96.2226, loss: 0.1208
2023-03-08 17:13:16,875 - mmseg - INFO - Iter [1750/20000] lr: 5.475e-06, eta: 3:30:04, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0846, pred_decode.acc_seg: 96.6970, aux.loss_ce: 0.0407, aux.acc_seg: 96.0782, loss: 0.1253
2023-03-08 17:13:40,209 - mmseg - INFO - Iter [1800/20000] lr: 5.460e-06, eta: 3:27:37, time: 0.467, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0861, pred_decode.acc_seg: 96.6460, aux.loss_ce: 0.0414, aux.acc_seg: 96.0429, loss: 0.1275
2023-03-08 17:14:03,478 - mmseg - INFO - Iter [1850/20000] lr: 5.445e-06, eta: 3:25:15, time: 0.465, data_time: 0.009, memory: 22072, pred_decode.loss_ce: 0.0819, pred_decode.acc_seg: 96.7853, aux.loss_ce: 0.0394, aux.acc_seg: 96.1878, loss: 0.1213
2023-03-08 17:14:29,364 - mmseg - INFO - Iter [1900/20000] lr: 5.430e-06, eta: 3:23:24, time: 0.518, data_time: 0.060, memory: 22072, pred_decode.loss_ce: 0.0837, pred_decode.acc_seg: 96.7140, aux.loss_ce: 0.0398, aux.acc_seg: 96.1676, loss: 0.1235
2023-03-08 17:14:52,722 - mmseg - INFO - Iter [1950/20000] lr: 5.415e-06, eta: 3:21:15, time: 0.467, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0838, pred_decode.acc_seg: 96.6902, aux.loss_ce: 0.0396, aux.acc_seg: 96.1878, loss: 0.1233
2023-03-08 17:15:16,012 - mmseg - INFO - Saving checkpoint at 2000 iterations
2023-03-08 17:15:16,740 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py
2023-03-08 17:15:16,740 - mmseg - INFO - Iter [2000/20000] lr: 5.400e-06, eta: 3:19:16, time: 0.481, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0869, pred_decode.acc_seg: 96.6271, aux.loss_ce: 0.0402, aux.acc_seg: 96.1311, loss: 0.1271
2023-03-08 17:16:41,335 - mmseg - INFO - per class results:
2023-03-08 17:16:41,336 - mmseg - INFO -
+---------------+-------+-------+
| Class | IoU | Acc |
+---------------+-------+-------+
| road | 98.7 | 99.28 |
| sidewalk | 88.59 | 94.91 |
| building | 93.61 | 97.35 |
| wall | 58.18 | 62.83 |
| fence | 66.8 | 72.79 |
| pole | 73.19 | 83.47 |
| traffic light | 75.94 | 85.53 |
| traffic sign | 84.19 | 90.28 |
| vegetation | 93.11 | 96.58 |
| terrain | 66.26 | 75.15 |
| sky | 95.21 | 98.77 |
| person | 85.37 | 92.2 |
| rider | 66.77 | 78.66 |
| car | 96.25 | 98.41 |
| truck | 87.69 | 91.03 |
| bus | 92.85 | 95.38 |
| train | 87.62 | 89.68 |
| motorcycle | 71.57 | 78.51 |
| bicycle | 81.45 | 91.37 |
+---------------+-------+-------+
2023-03-08 17:16:41,336 - mmseg - INFO - Summary:
2023-03-08 17:16:41,336 - mmseg - INFO -
+-------+-------+-------+
| aAcc | mIoU | mAcc |
+-------+-------+-------+
| 96.81 | 82.28 | 88.01 |
+-------+-------+-------+
2023-03-08 17:16:42,064 - mmseg - INFO - Now best checkpoint is saved as best_mIoU_iter_2000.pth.
2023-03-08 17:16:42,064 - mmseg - INFO - Best mIoU is 0.8228 at 2000 iter.
2023-03-08 17:16:42,065 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py
2023-03-08 17:16:42,065 - mmseg - INFO - Iter(val) [125] aAcc: 0.9681, mIoU: 0.8228, mAcc: 0.8801, IoU.road: 0.9870, IoU.sidewalk: 0.8859, IoU.building: 0.9361, IoU.wall: 0.5818, IoU.fence: 0.6680, IoU.pole: 0.7319, IoU.traffic light: 0.7594, IoU.traffic sign: 0.8419, IoU.vegetation: 0.9311, IoU.terrain: 0.6626, IoU.sky: 0.9521, IoU.person: 0.8537, IoU.rider: 0.6677, IoU.car: 0.9625, IoU.truck: 0.8769, IoU.bus: 0.9285, IoU.train: 0.8762, IoU.motorcycle: 0.7157, IoU.bicycle: 0.8145, Acc.road: 0.9928, Acc.sidewalk: 0.9491, Acc.building: 0.9735, Acc.wall: 0.6283, Acc.fence: 0.7279, Acc.pole: 0.8347, Acc.traffic light: 0.8553, Acc.traffic sign: 0.9028, Acc.vegetation: 0.9658, Acc.terrain: 0.7515, Acc.sky: 0.9877, Acc.person: 0.9220, Acc.rider: 0.7866, Acc.car: 0.9841, Acc.truck: 0.9103, Acc.bus: 0.9538, Acc.train: 0.8968, Acc.motorcycle: 0.7851, Acc.bicycle: 0.9137
2023-03-08 17:17:07,824 - mmseg - INFO - Iter [2050/20000] lr: 5.385e-06, eta: 3:30:05, time: 2.222, data_time: 1.765, memory: 22072, pred_decode.loss_ce: 0.0805, pred_decode.acc_seg: 96.7433, aux.loss_ce: 0.0388, aux.acc_seg: 96.2138, loss: 0.1193
2023-03-08 17:17:31,038 - mmseg - INFO - Iter [2100/20000] lr: 5.370e-06, eta: 3:27:48, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0830, pred_decode.acc_seg: 96.7384, aux.loss_ce: 0.0399, aux.acc_seg: 96.1385, loss: 0.1229
2023-03-08 17:17:54,406 - mmseg - INFO - Iter [2150/20000] lr: 5.355e-06, eta: 3:25:38, time: 0.467, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0875, pred_decode.acc_seg: 96.6208, aux.loss_ce: 0.0413, aux.acc_seg: 96.0257, loss: 0.1289
2023-03-08 17:18:17,751 - mmseg - INFO - Iter [2200/20000] lr: 5.340e-06, eta: 3:23:33, time: 0.467, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0826, pred_decode.acc_seg: 96.7552, aux.loss_ce: 0.0392, aux.acc_seg: 96.2115, loss: 0.1218
2023-03-08 17:18:43,627 - mmseg - INFO - Iter [2250/20000] lr: 5.325e-06, eta: 3:21:52, time: 0.518, data_time: 0.058, memory: 22072, pred_decode.loss_ce: 0.0815, pred_decode.acc_seg: 96.7622, aux.loss_ce: 0.0394, aux.acc_seg: 96.1841, loss: 0.1209
2023-03-08 17:19:06,917 - mmseg - INFO - Iter [2300/20000] lr: 5.310e-06, eta: 3:19:55, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0866, pred_decode.acc_seg: 96.6346, aux.loss_ce: 0.0422, aux.acc_seg: 95.9662, loss: 0.1288
2023-03-08 17:19:30,259 - mmseg - INFO - Iter [2350/20000] lr: 5.295e-06, eta: 3:18:02, time: 0.467, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0849, pred_decode.acc_seg: 96.6571, aux.loss_ce: 0.0403, aux.acc_seg: 96.1105, loss: 0.1252
2023-03-08 17:19:53,532 - mmseg - INFO - Iter [2400/20000] lr: 5.280e-06, eta: 3:16:12, time: 0.465, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0842, pred_decode.acc_seg: 96.6795, aux.loss_ce: 0.0412, aux.acc_seg: 96.0268, loss: 0.1254
2023-03-08 17:20:19,478 - mmseg - INFO - Iter [2450/20000] lr: 5.265e-06, eta: 3:14:45, time: 0.519, data_time: 0.060, memory: 22072, pred_decode.loss_ce: 0.0891, pred_decode.acc_seg: 96.5415, aux.loss_ce: 0.0425, aux.acc_seg: 95.9152, loss: 0.1316
2023-03-08 17:20:42,746 - mmseg - INFO - Iter [2500/20000] lr: 5.250e-06, eta: 3:13:01, time: 0.465, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0812, pred_decode.acc_seg: 96.7967, aux.loss_ce: 0.0400, aux.acc_seg: 96.1538, loss: 0.1212
2023-03-08 17:22:07,759 - mmseg - INFO - per class results:
2023-03-08 17:22:07,761 - mmseg - INFO -
+---------------+-------+-------+
| Class | IoU | Acc |
+---------------+-------+-------+
| road | 98.76 | 99.35 |
| sidewalk | 88.98 | 94.69 |
| building | 93.71 | 97.32 |
| wall | 57.9 | 62.91 |
| fence | 67.73 | 75.67 |
| pole | 73.34 | 83.05 |
| traffic light | 76.11 | 86.31 |
| traffic sign | 84.24 | 90.69 |
| vegetation | 93.27 | 96.78 |
| terrain | 65.94 | 74.51 |
| sky | 95.3 | 98.66 |
| person | 85.43 | 92.59 |
| rider | 66.99 | 77.87 |
| car | 96.2 | 98.35 |
| truck | 85.85 | 89.02 |
| bus | 92.04 | 95.35 |
| train | 84.2 | 86.24 |
| motorcycle | 72.13 | 81.39 |
| bicycle | 81.72 | 91.34 |
+---------------+-------+-------+
2023-03-08 17:22:07,761 - mmseg - INFO - Summary:
2023-03-08 17:22:07,761 - mmseg - INFO -
+-------+------+------+
| aAcc | mIoU | mAcc |
+-------+------+------+
| 96.86 | 82.1 | 88.0 |
+-------+------+------+
2023-03-08 17:22:07,761 - mmseg - INFO - Iter(val) [125] aAcc: 0.9686, mIoU: 0.8210, mAcc: 0.8800, IoU.road: 0.9876, IoU.sidewalk: 0.8898, IoU.building: 0.9371, IoU.wall: 0.5790, IoU.fence: 0.6773, IoU.pole: 0.7334, IoU.traffic light: 0.7611, IoU.traffic sign: 0.8424, IoU.vegetation: 0.9327, IoU.terrain: 0.6594, IoU.sky: 0.9530, IoU.person: 0.8543, IoU.rider: 0.6699, IoU.car: 0.9620, IoU.truck: 0.8585, IoU.bus: 0.9204, IoU.train: 0.8420, IoU.motorcycle: 0.7213, IoU.bicycle: 0.8172, Acc.road: 0.9935, Acc.sidewalk: 0.9469, Acc.building: 0.9732, Acc.wall: 0.6291, Acc.fence: 0.7567, Acc.pole: 0.8305, Acc.traffic light: 0.8631, Acc.traffic sign: 0.9069, Acc.vegetation: 0.9678, Acc.terrain: 0.7451, Acc.sky: 0.9866, Acc.person: 0.9259, Acc.rider: 0.7787, Acc.car: 0.9835, Acc.truck: 0.8902, Acc.bus: 0.9535, Acc.train: 0.8624, Acc.motorcycle: 0.8139, Acc.bicycle: 0.9134
2023-03-08 17:22:31,049 - mmseg - INFO - Iter [2550/20000] lr: 5.235e-06, eta: 3:21:03, time: 2.166, data_time: 1.708, memory: 22072, pred_decode.loss_ce: 0.0792, pred_decode.acc_seg: 96.8770, aux.loss_ce: 0.0379, aux.acc_seg: 96.3405, loss: 0.1171
2023-03-08 17:22:54,237 - mmseg - INFO - Iter [2600/20000] lr: 5.220e-06, eta: 3:19:12, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0841, pred_decode.acc_seg: 96.6992, aux.loss_ce: 0.0407, aux.acc_seg: 96.1105, loss: 0.1248
2023-03-08 17:23:20,007 - mmseg - INFO - Iter [2650/20000] lr: 5.205e-06, eta: 3:17:42, time: 0.515, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0856, pred_decode.acc_seg: 96.5922, aux.loss_ce: 0.0420, aux.acc_seg: 95.9452, loss: 0.1276
2023-03-08 17:23:43,199 - mmseg - INFO - Iter [2700/20000] lr: 5.190e-06, eta: 3:15:57, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0810, pred_decode.acc_seg: 96.7928, aux.loss_ce: 0.0393, aux.acc_seg: 96.2001, loss: 0.1202
2023-03-08 17:24:06,392 - mmseg - INFO - Iter [2750/20000] lr: 5.175e-06, eta: 3:14:15, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0816, pred_decode.acc_seg: 96.7622, aux.loss_ce: 0.0387, aux.acc_seg: 96.2437, loss: 0.1202
2023-03-08 17:24:32,306 - mmseg - INFO - Iter [2800/20000] lr: 5.160e-06, eta: 3:12:53, time: 0.518, data_time: 0.060, memory: 22072, pred_decode.loss_ce: 0.0861, pred_decode.acc_seg: 96.6071, aux.loss_ce: 0.0408, aux.acc_seg: 96.0280, loss: 0.1270
2023-03-08 17:24:55,589 - mmseg - INFO - Iter [2850/20000] lr: 5.145e-06, eta: 3:11:17, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0836, pred_decode.acc_seg: 96.7181, aux.loss_ce: 0.0407, aux.acc_seg: 96.1362, loss: 0.1243
2023-03-08 17:25:18,842 - mmseg - INFO - Iter [2900/20000] lr: 5.130e-06, eta: 3:09:44, time: 0.465, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0817, pred_decode.acc_seg: 96.7532, aux.loss_ce: 0.0400, aux.acc_seg: 96.1376, loss: 0.1218
2023-03-08 17:25:42,127 - mmseg - INFO - Iter [2950/20000] lr: 5.115e-06, eta: 3:08:12, time: 0.466, data_time: 0.007, memory: 22072, pred_decode.loss_ce: 0.0851, pred_decode.acc_seg: 96.6016, aux.loss_ce: 0.0414, aux.acc_seg: 95.9999, loss: 0.1265
2023-03-08 17:26:08,013 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py
2023-03-08 17:26:08,013 - mmseg - INFO - Iter [3000/20000] lr: 5.100e-06, eta: 3:06:58, time: 0.518, data_time: 0.058, memory: 22072, pred_decode.loss_ce: 0.0820, pred_decode.acc_seg: 96.7119, aux.loss_ce: 0.0394, aux.acc_seg: 96.1584, loss: 0.1214
2023-03-08 17:27:32,982 - mmseg - INFO - per class results:
2023-03-08 17:27:32,983 - mmseg - INFO -
+---------------+-------+-------+
| Class | IoU | Acc |
+---------------+-------+-------+
| road | 98.7 | 99.24 |
| sidewalk | 88.52 | 95.13 |
| building | 93.68 | 97.44 |
| wall | 57.52 | 62.28 |
| fence | 66.91 | 74.23 |
| pole | 73.07 | 83.12 |
| traffic light | 75.94 | 86.21 |
| traffic sign | 84.19 | 90.33 |
| vegetation | 93.27 | 96.71 |
| terrain | 65.94 | 73.21 |
| sky | 95.48 | 98.64 |
| person | 85.47 | 92.61 |
| rider | 67.96 | 81.44 |
| car | 96.28 | 98.35 |
| truck | 87.83 | 91.05 |
| bus | 93.14 | 95.34 |
| train | 88.04 | 90.09 |
| motorcycle | 71.85 | 79.5 |
| bicycle | 81.53 | 90.55 |
+---------------+-------+-------+
2023-03-08 17:27:32,983 - mmseg - INFO - Summary:
2023-03-08 17:27:32,984 - mmseg - INFO -
+-------+-------+-------+
| aAcc | mIoU | mAcc |
+-------+-------+-------+
| 96.84 | 82.38 | 88.18 |
+-------+-------+-------+
2023-03-08 17:27:33,750 - mmseg - INFO - Now best checkpoint is saved as best_mIoU_iter_3000.pth.
2023-03-08 17:27:33,750 - mmseg - INFO - Best mIoU is 0.8238 at 3000 iter.
2023-03-08 17:27:33,750 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py
2023-03-08 17:27:33,750 - mmseg - INFO - Iter(val) [125] aAcc: 0.9684, mIoU: 0.8238, mAcc: 0.8818, IoU.road: 0.9870, IoU.sidewalk: 0.8852, IoU.building: 0.9368, IoU.wall: 0.5752, IoU.fence: 0.6691, IoU.pole: 0.7307, IoU.traffic light: 0.7594, IoU.traffic sign: 0.8419, IoU.vegetation: 0.9327, IoU.terrain: 0.6594, IoU.sky: 0.9548, IoU.person: 0.8547, IoU.rider: 0.6796, IoU.car: 0.9628, IoU.truck: 0.8783, IoU.bus: 0.9314, IoU.train: 0.8804, IoU.motorcycle: 0.7185, IoU.bicycle: 0.8153, Acc.road: 0.9924, Acc.sidewalk: 0.9513, Acc.building: 0.9744, Acc.wall: 0.6228, Acc.fence: 0.7423, Acc.pole: 0.8312, Acc.traffic light: 0.8621, Acc.traffic sign: 0.9033, Acc.vegetation: 0.9671, Acc.terrain: 0.7321, Acc.sky: 0.9864, Acc.person: 0.9261, Acc.rider: 0.8144, Acc.car: 0.9835, Acc.truck: 0.9105, Acc.bus: 0.9534, Acc.train: 0.9009, Acc.motorcycle: 0.7950, Acc.bicycle: 0.9055
2023-03-08 17:27:57,013 - mmseg - INFO - Iter [3050/20000] lr: 5.085e-06, eta: 3:13:28, time: 2.180, data_time: 1.722, memory: 22072, pred_decode.loss_ce: 0.0801, pred_decode.acc_seg: 96.8182, aux.loss_ce: 0.0393, aux.acc_seg: 96.2089, loss: 0.1195
2023-03-08 17:28:20,283 - mmseg - INFO - Iter [3100/20000] lr: 5.070e-06, eta: 3:11:54, time: 0.465, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0851, pred_decode.acc_seg: 96.6242, aux.loss_ce: 0.0411, aux.acc_seg: 96.0318, loss: 0.1262
2023-03-08 17:28:43,588 - mmseg - INFO - Iter [3150/20000] lr: 5.055e-06, eta: 3:10:22, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0860, pred_decode.acc_seg: 96.6358, aux.loss_ce: 0.0415, aux.acc_seg: 96.0528, loss: 0.1275
2023-03-08 17:29:09,488 - mmseg - INFO - Iter [3200/20000] lr: 5.040e-06, eta: 3:09:06, time: 0.518, data_time: 0.058, memory: 22072, pred_decode.loss_ce: 0.0836, pred_decode.acc_seg: 96.6524, aux.loss_ce: 0.0408, aux.acc_seg: 96.0325, loss: 0.1243
2023-03-08 17:29:32,616 - mmseg - INFO - Iter [3250/20000] lr: 5.025e-06, eta: 3:07:38, time: 0.463, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0820, pred_decode.acc_seg: 96.7534, aux.loss_ce: 0.0402, aux.acc_seg: 96.1318, loss: 0.1222
2023-03-08 17:29:55,808 - mmseg - INFO - Iter [3300/20000] lr: 5.010e-06, eta: 3:06:11, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0841, pred_decode.acc_seg: 96.6607, aux.loss_ce: 0.0415, aux.acc_seg: 95.9994, loss: 0.1256
2023-03-08 17:30:21,537 - mmseg - INFO - Iter [3350/20000] lr: 4.995e-06, eta: 3:04:59, time: 0.515, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0796, pred_decode.acc_seg: 96.7991, aux.loss_ce: 0.0386, aux.acc_seg: 96.2239, loss: 0.1182
2023-03-08 17:30:44,725 - mmseg - INFO - Iter [3400/20000] lr: 4.980e-06, eta: 3:03:37, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0820, pred_decode.acc_seg: 96.7456, aux.loss_ce: 0.0395, aux.acc_seg: 96.2064, loss: 0.1214
2023-03-08 17:31:07,949 - mmseg - INFO - Iter [3450/20000] lr: 4.965e-06, eta: 3:02:16, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0830, pred_decode.acc_seg: 96.6756, aux.loss_ce: 0.0402, aux.acc_seg: 96.0959, loss: 0.1233
2023-03-08 17:31:31,257 - mmseg - INFO - Iter [3500/20000] lr: 4.950e-06, eta: 3:00:57, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0819, pred_decode.acc_seg: 96.7773, aux.loss_ce: 0.0398, aux.acc_seg: 96.2114, loss: 0.1217
2023-03-08 17:32:56,316 - mmseg - INFO - per class results:
2023-03-08 17:32:56,317 - mmseg - INFO -
+---------------+-------+-------+
| Class | IoU | Acc |
+---------------+-------+-------+
| road | 98.73 | 99.22 |
| sidewalk | 88.69 | 95.39 |
| building | 93.68 | 97.22 |
| wall | 59.11 | 63.83 |
| fence | 67.58 | 74.45 |
| pole | 72.98 | 83.22 |
| traffic light | 76.11 | 87.33 |
| traffic sign | 83.66 | 90.87 |
| vegetation | 93.32 | 96.96 |
| terrain | 66.29 | 73.77 |
| sky | 95.39 | 98.28 |
| person | 85.22 | 93.27 |
| rider | 66.58 | 76.88 |
| car | 96.25 | 98.33 |
| truck | 87.17 | 90.39 |
| bus | 93.45 | 95.04 |
| train | 89.47 | 91.77 |
| motorcycle | 71.73 | 79.78 |
| bicycle | 81.66 | 91.53 |
+---------------+-------+-------+
2023-03-08 17:32:56,317 - mmseg - INFO - Summary:
2023-03-08 17:32:56,317 - mmseg - INFO -
+-------+-------+-------+
| aAcc | mIoU | mAcc |
+-------+-------+-------+
| 96.86 | 82.48 | 88.29 |
+-------+-------+-------+
2023-03-08 17:32:57,044 - mmseg - INFO - Now best checkpoint is saved as best_mIoU_iter_3500.pth.
2023-03-08 17:32:57,044 - mmseg - INFO - Best mIoU is 0.8248 at 3500 iter.
2023-03-08 17:32:57,044 - mmseg - INFO - Iter(val) [125] aAcc: 0.9686, mIoU: 0.8248, mAcc: 0.8829, IoU.road: 0.9873, IoU.sidewalk: 0.8869, IoU.building: 0.9368, IoU.wall: 0.5911, IoU.fence: 0.6758, IoU.pole: 0.7298, IoU.traffic light: 0.7611, IoU.traffic sign: 0.8366, IoU.vegetation: 0.9332, IoU.terrain: 0.6629, IoU.sky: 0.9539, IoU.person: 0.8522, IoU.rider: 0.6658, IoU.car: 0.9625, IoU.truck: 0.8717, IoU.bus: 0.9345, IoU.train: 0.8947, IoU.motorcycle: 0.7173, IoU.bicycle: 0.8166, Acc.road: 0.9922, Acc.sidewalk: 0.9539, Acc.building: 0.9722, Acc.wall: 0.6383, Acc.fence: 0.7445, Acc.pole: 0.8322, Acc.traffic light: 0.8733, Acc.traffic sign: 0.9087, Acc.vegetation: 0.9696, Acc.terrain: 0.7377, Acc.sky: 0.9828, Acc.person: 0.9327, Acc.rider: 0.7688, Acc.car: 0.9833, Acc.truck: 0.9039, Acc.bus: 0.9504, Acc.train: 0.9177, Acc.motorcycle: 0.7978, Acc.bicycle: 0.9153
2023-03-08 17:33:22,945 - mmseg - INFO - Iter [3550/20000] lr: 4.935e-06, eta: 3:06:29, time: 2.234, data_time: 1.775, memory: 22072, pred_decode.loss_ce: 0.0801, pred_decode.acc_seg: 96.8240, aux.loss_ce: 0.0389, aux.acc_seg: 96.2308, loss: 0.1191
2023-03-08 17:33:46,295 - mmseg - INFO - Iter [3600/20000] lr: 4.920e-06, eta: 3:05:06, time: 0.467, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0801, pred_decode.acc_seg: 96.8076, aux.loss_ce: 0.0388, aux.acc_seg: 96.2750, loss: 0.1189
2023-03-08 17:34:09,514 - mmseg - INFO - Iter [3650/20000] lr: 4.905e-06, eta: 3:03:45, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0835, pred_decode.acc_seg: 96.7013, aux.loss_ce: 0.0402, aux.acc_seg: 96.1315, loss: 0.1237
2023-03-08 17:34:32,783 - mmseg - INFO - Iter [3700/20000] lr: 4.890e-06, eta: 3:02:25, time: 0.465, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0841, pred_decode.acc_seg: 96.6855, aux.loss_ce: 0.0401, aux.acc_seg: 96.1196, loss: 0.1243
2023-03-08 17:34:58,765 - mmseg - INFO - Iter [3750/20000] lr: 4.875e-06, eta: 3:01:19, time: 0.520, data_time: 0.060, memory: 22072, pred_decode.loss_ce: 0.0839, pred_decode.acc_seg: 96.6755, aux.loss_ce: 0.0400, aux.acc_seg: 96.1140, loss: 0.1239
2023-03-08 17:35:22,062 - mmseg - INFO - Iter [3800/20000] lr: 4.860e-06, eta: 3:00:02, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0796, pred_decode.acc_seg: 96.8543, aux.loss_ce: 0.0387, aux.acc_seg: 96.2756, loss: 0.1183
2023-03-08 17:35:45,359 - mmseg - INFO - Iter [3850/20000] lr: 4.845e-06, eta: 2:58:46, time: 0.466, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0783, pred_decode.acc_seg: 96.8886, aux.loss_ce: 0.0380, aux.acc_seg: 96.3097, loss: 0.1164
2023-03-08 17:36:08,531 - mmseg - INFO - Iter [3900/20000] lr: 4.830e-06, eta: 2:57:32, time: 0.464, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0847, pred_decode.acc_seg: 96.6052, aux.loss_ce: 0.0412, aux.acc_seg: 96.0052, loss: 0.1259
2023-03-08 17:36:34,297 - mmseg - INFO - Iter [3950/20000] lr: 4.815e-06, eta: 2:56:29, time: 0.515, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0833, pred_decode.acc_seg: 96.6770, aux.loss_ce: 0.0405, aux.acc_seg: 96.0830, loss: 0.1238
2023-03-08 17:36:57,477 - mmseg - INFO - Saving checkpoint at 4000 iterations
2023-03-08 17:36:58,196 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py
2023-03-08 17:36:58,196 - mmseg - INFO - Iter [4000/20000] lr: 4.800e-06, eta: 2:55:19, time: 0.478, data_time: 0.008, memory: 22072, pred_decode.loss_ce: 0.0845, pred_decode.acc_seg: 96.6431, aux.loss_ce: 0.0411, aux.acc_seg: 96.0373, loss: 0.1256
2023-03-08 17:38:23,031 - mmseg - INFO - per class results:
2023-03-08 17:38:23,032 - mmseg - INFO -
+---------------+-------+-------+
| Class | IoU | Acc |
+---------------+-------+-------+
| road | 98.72 | 99.19 |
| sidewalk | 88.66 | 95.43 |
| building | 93.7 | 97.32 |
| wall | 60.42 | 65.32 |
| fence | 67.24 | 73.89 |
| pole | 72.9 | 83.58 |
| traffic light | 76.21 | 86.63 |
| traffic sign | 84.06 | 90.67 |
| vegetation | 93.31 | 96.84 |
| terrain | 66.32 | 73.87 |
| sky | 95.42 | 98.58 |
| person | 85.32 | 92.05 |
| rider | 67.02 | 78.82 |
| car | 96.3 | 98.35 |
| truck | 88.51 | 91.82 |
| bus | 93.3 | 95.19 |
| train | 88.62 | 90.6 |
| motorcycle | 72.15 | 80.65 |
| bicycle | 81.51 | 91.56 |
+---------------+-------+-------+
2023-03-08 17:38:23,033 - mmseg - INFO - Summary:
2023-03-08 17:38:23,033 - mmseg - INFO -
+-------+-------+-------+
| aAcc | mIoU | mAcc |
+-------+-------+-------+
| 96.86 | 82.62 | 88.44 |
+-------+-------+-------+
2023-03-08 17:38:23,770 - mmseg - INFO - Now best checkpoint is saved as best_mIoU_iter_4000.pth.
2023-03-08 17:38:23,770 - mmseg - INFO - Best mIoU is 0.8262 at 4000 iter.
2023-03-08 17:38:23,770 - mmseg - INFO - Exp name: deform_convnext_t_fpn_4x4_512x1024_160k_cityscapes_adam_diffv20_align_diffv22.py
2023-03-08 17:38:23,770 - mmseg - INFO - Iter(val) [125] aAcc: 0.9686, mIoU: 0.8262, mAcc: 0.8844, IoU.road: 0.9872, IoU.sidewalk: 0.8866, IoU.building: 0.9370, IoU.wall: 0.6042, IoU.fence: 0.6724, IoU.pole: 0.7290, IoU.traffic light: 0.7621, IoU.traffic sign: 0.8406, IoU.vegetation: 0.9331, IoU.terrain: 0.6632, IoU.sky: 0.9542, IoU.person: 0.8532, IoU.rider: 0.6702, IoU.car: 0.9630, IoU.truck: 0.8851, IoU.bus: 0.9330, IoU.train: 0.8862, IoU.motorcycle: 0.7215, IoU.bicycle: 0.8151, Acc.road: 0.9919, Acc.sidewalk: 0.9543, Acc.building: 0.9732, Acc.wall: 0.6532, Acc.fence: 0.7389, Acc.pole: 0.8358, Acc.traffic light: 0.8663, Acc.traffic sign: 0.9067, Acc.vegetation: 0.9684, Acc.terrain: 0.7387, Acc.sky: 0.9858, Acc.person: 0.9205, Acc.rider: 0.7882, Acc.car: 0.9835, Acc.truck: 0.9182, Acc.bus: 0.9519, Acc.train: 0.9060, Acc.motorcycle: 0.8065, Acc.bicycle: 0.9156
2023-03-08 17:38:47,051 - mmseg - INFO - Iter [4050/20000] lr: 4.785e-06, eta: 2:59:46, time: 2.177, data_time: 1.719, memory: 22072, pred_decode.loss_ce: 0.0805, pred_decode.acc_seg: 96.8142, aux.loss_ce: 0.0389, aux.acc_seg: 96.2473, loss: 0.1195
2023-03-08 17:39:13,000 - mmseg - INFO - Iter [4100/20000] lr: 4.770e-06, eta: 2:58:41, time: 0.519, data_time: 0.059, memory: 22072, pred_decode.loss_ce: 0.0830, pred_decode.acc_seg: 96.7186, aux.loss_ce: 0.0406, aux.acc_seg: 96.1151, loss: 0.1235