DETRs with Collaborative Hybrid Assignments Training
Introduction
In this paper, we present Co-DETR, a novel collaborative hybrid assignments training scheme that learns more efficient and effective DETR-based detectors from a variety of label assignment strategies.
- Encoder optimization: The proposed training scheme can easily enhance the encoder's learning ability in end-to-end detectors by training multiple parallel auxiliary heads supervised by one-to-many label assignments.
- Decoder optimization: We construct extra customized positive queries by extracting the positive coordinates from these auxiliary heads, which improves the decoder's attention learning.
- State-of-the-art performance: Co-DETR with ViT-Large (304M parameters) is the first model to achieve 66.0 AP on COCO test-dev.
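To make the difference between the two assignment styles concrete, here is a minimal toy sketch, not the repository's implementation. The IoU values, the 0.5 threshold, and the function names are all assumptions made for illustration. Brute-force search over permutations stands in for Hungarian matching, and a plain IoU threshold stands in for whatever one-to-many rule an auxiliary head actually uses.

```python
from itertools import permutations

# Toy IoU matrix (made-up values): rows are ground-truth boxes,
# columns are candidate predictions.
IOU = [
    [0.80, 0.65, 0.10, 0.05],
    [0.05, 0.20, 0.70, 0.60],
]


def one_to_one_assign(iou):
    """Pick exactly one distinct candidate per ground truth, maximizing total IoU.

    Brute force over permutations is a stand-in for Hungarian matching and is
    only practical for tiny matrices. Assumes at least as many candidates as
    ground truths.
    """
    num_gt, num_cand = len(iou), len(iou[0])
    best = max(
        permutations(range(num_cand), num_gt),
        key=lambda cols: sum(iou[g][c] for g, c in enumerate(cols)),
    )
    return {g: [c] for g, c in enumerate(best)}


def one_to_many_assign(iou, threshold=0.5):
    """Assign every candidate whose IoU with a ground truth meets the threshold."""
    return {
        g: [c for c, value in enumerate(row) if value >= threshold]
        for g, row in enumerate(iou)
    }


def count_positives(assignment):
    """Total number of positive samples across all ground truths."""
    return sum(len(cols) for cols in assignment.values())


o2o = one_to_one_assign(IOU)
o2m = one_to_many_assign(IOU)
print("one-to-one :", o2o, "positives:", count_positives(o2o))
print("one-to-many:", o2m, "positives:", count_positives(o2m))
```

On this toy input, the one-to-one rule yields 2 positives and the one-to-many rule yields 4, which illustrates why one-to-many supervision gives the encoder a denser training signal. In the same spirit, the candidate indices returned by `one_to_many_assign` play the role of the positive coordinates that could be turned into extra decoder queries.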
Model Zoo
Model | Backbone | Aug | Dataset | box AP (val) | box AP (minival) |
---|---|---|---|---|---|
Co-DETR | ViT-L | LSJ | LVIS | 68.0 | 72.0 |
How to use
We implement Co-DETR using MMDetection V2.25.3 and MMCV V1.5.0. Please refer to our GitHub repository for more details.
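Since the implementation is pinned to specific library versions, it can help to check the environment before training. The following is a small sketch that uses only the Python standard library. The distribution names `mmdet` and `mmcv-full` are assumptions about how the packages were installed; adjust them if your environment uses different names.

```python
from importlib import metadata

# Versions stated above; the distribution names are assumptions.
EXPECTED = {"mmdet": "2.25.3", "mmcv-full": "1.5.0"}


def check_versions(expected):
    """Return {package: (expected, found)} for each mismatch.

    `found` is None when the package is not installed.
    """
    problems = {}
    for name, want in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != want:
            problems[name] = (want, found)
    return problems


for name, (want, found) in check_versions(EXPECTED).items():
    print(f"{name}: expected {want}, found {found or 'not installed'}")
```

An empty result means the installed versions match the ones listed here.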
Training
Train Co-Deformable-DETR + ResNet-50 with 8 GPUs:
sh tools/dist_train.sh projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py 8 path_to_exp
Train using Slurm:
sh tools/slurm_train.sh partition job_name projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py path_to_exp
Testing
Test Co-Deformable-DETR + ResNet-50 with 8 GPUs, and evaluate:
sh tools/dist_test.sh projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py path_to_checkpoint 8 --eval bbox
Test using Slurm:
sh tools/slurm_test.sh partition job_name projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py path_to_checkpoint --eval bbox
Cite Co-DETR
If you find this repository useful, please use the following BibTeX entry for citation.
@inproceedings{zong2023detrs,
title={{DETRs} with Collaborative Hybrid Assignments Training},
author={Zong, Zhuofan and Song, Guanglu and Liu, Yu},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={6748--6758},
year={2023}
}