DETRs with Collaborative Hybrid Assignments Training

Introduction

In this paper, we present Co-DETR, a novel collaborative hybrid assignments training scheme that learns more efficient and effective DETR-based detectors from a variety of label assignment strategies.

Encoder optimization: The proposed training scheme can easily enhance the encoder's learning ability in end-to-end detectors by training multiple parallel auxiliary heads supervised by one-to-many label assignments.
Decoder optimization: We construct extra customized positive queries by extracting the positive coordinates from these auxiliary heads, which improves attention learning in the decoder.
State-of-the-art performance: Co-DETR with ViT-Large (304M parameters) is the first model to achieve 66.0 AP on COCO test-dev.
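
As a rough illustration of the two ideas above, the following toy Python sketch contrasts one-to-one matching with one-to-many assignment and then collects the coordinates of the positives. It is not the Co-DETR implementation: the IoU-only score, the 0.5 threshold, the brute-force matcher, and the function names `one_to_one`, `one_to_many`, and `positive_coordinates` are all assumptions made for this example.

```python
from itertools import permutations


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def one_to_one(preds, gts):
    """Brute-force bipartite matching: one prediction per ground truth.

    Toy stand-in for the matching used by end-to-end detectors; it only
    scales to a handful of boxes and scores pairs by IoU alone (assumption).
    """
    if len(preds) < len(gts):
        raise ValueError("need at least as many predictions as ground truths")
    best, best_score = (), -1.0
    for perm in permutations(range(len(preds)), len(gts)):
        score = sum(iou(preds[p], gts[g]) for g, p in enumerate(perm))
        if score > best_score:
            best, best_score = perm, score
    return {g: [p] for g, p in enumerate(best)}


def one_to_many(preds, gts, thr=0.5):
    """Every candidate whose IoU with a ground truth reaches thr is positive.

    The threshold rule is an assumed example of a one-to-many assignment.
    """
    return {g: [p for p, box in enumerate(preds) if iou(box, gt) >= thr]
            for g, gt in enumerate(gts)}


def positive_coordinates(preds, assignment):
    """Collect the boxes of all positives (hypothetical source of extra queries)."""
    return [preds[p] for matched in assignment.values() for p in matched]


gts = [(0, 0, 10, 10)]
preds = [(0, 0, 10, 10), (1, 1, 10, 10), (0, 0, 9, 9), (20, 20, 30, 30)]

print(one_to_one(preds, gts))   # {0: [0]}
print(one_to_many(preds, gts))  # {0: [0, 1, 2]}
print(positive_coordinates(preds, one_to_many(preds, gts)))
```

With one ground-truth box and three overlapping candidates, one-to-one matching supervises a single candidate, while the one-to-many rule supervises all three. This denser signal is what the auxiliary heads are meant to provide, and the collected boxes stand in for the positive coordinates that seed the extra queries.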

Model Zoo

This model is the Co-DETR checkpoint pre-trained on the large-scale Objects365 dataset.

Model	Backbone	Aug	Dataset	box AP (val)	mask AP (val)	box AP (test-dev)	mask AP (test-dev)
Co-DETR	ViT-L	DETR	COCO	65.3	56.2	-	-
Co-DETR (+TTA)	ViT-L	DETR	COCO	-	-	-	-

How to use

We implement Co-DETR using MMDetection V2.25.3 and MMCV V1.5.0. Please refer to our GitHub repository for more details.
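
Before training or testing, it can help to confirm that the local environment matches the versions stated above. The snippet below is a minimal sketch using only the Python standard library. The distribution names `mmdet` and `mmcv-full` are assumptions about how the packages were installed, and `check_environment` is a hypothetical helper, not part of MMDetection or this repository.

```python
from importlib import metadata

# Versions stated in this README; the distribution names are assumptions.
REQUIRED = {"mmdet": "2.25.3", "mmcv-full": "1.5.0"}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(required=REQUIRED, lookup=installed_version):
    """Map each distribution name to (expected, found, matches)."""
    report = {}
    for name, expected in required.items():
        found = lookup(name)
        report[name] = (expected, found, found == expected)
    return report


if __name__ == "__main__":
    for name, (expected, found, ok) in check_environment().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, found {found} [{status}]")
```

A mismatch does not necessarily mean the code will fail; it only indicates that the environment differs from the one this README describes.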

Training

Train Co-Deformable-DETR + ResNet-50 with 8 GPUs:

sh tools/dist_train.sh projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py 8 path_to_exp

Train using Slurm:

sh tools/slurm_train.sh partition job_name projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py path_to_exp

Testing

Test Co-Deformable-DETR + ResNet-50 with 8 GPUs, and evaluate:

sh tools/dist_test.sh projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py path_to_checkpoint 8 --eval bbox

Test using Slurm:

sh tools/slurm_test.sh partition job_name projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py path_to_checkpoint --eval bbox
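
The training and testing commands in this README share a fixed positional pattern, which makes them easy to generate from a script when launching several runs. The sketch below only builds the command lines and does not execute anything. The helper names are hypothetical, and the argument order is copied from the commands in this README rather than from the scripts themselves.

```python
import shlex

CONFIG = "projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py"


def dist_train_cmd(config, num_gpus, work_dir):
    """Argument list for distributed training: config, GPU count, output dir."""
    return ["sh", "tools/dist_train.sh", config, str(num_gpus), work_dir]


def slurm_train_cmd(partition, job_name, config, work_dir):
    """Argument list for Slurm training: partition and job name come first."""
    return ["sh", "tools/slurm_train.sh", partition, job_name, config, work_dir]


def dist_test_cmd(config, checkpoint, num_gpus, metric="bbox"):
    """Argument list for distributed testing with one evaluation metric."""
    return ["sh", "tools/dist_test.sh", config, checkpoint, str(num_gpus),
            "--eval", metric]


def slurm_test_cmd(partition, job_name, config, checkpoint, metric="bbox"):
    """Argument list for Slurm testing with one evaluation metric."""
    return ["sh", "tools/slurm_test.sh", partition, job_name, config,
            checkpoint, "--eval", metric]


if __name__ == "__main__":
    # shlex.join quotes arguments only where the shell would need it.
    print(shlex.join(dist_train_cmd(CONFIG, 8, "path_to_exp")))
    print(shlex.join(dist_test_cmd(CONFIG, "path_to_checkpoint", 8)))
```

Returning a list instead of a single string keeps each argument separate, so the result can be passed to `subprocess.run` directly without shell quoting concerns.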

Cite Co-DETR

If you find this repository useful, please use the following BibTeX entry for citation.

@inproceedings{zong2023detrs,
  title={{DETR}s with Collaborative Hybrid Assignments Training},
  author={Zong, Zhuofan and Song, Guanglu and Liu, Yu},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={6748--6758},
  year={2023}
}
