Could you add some models trained on the COCO dataset, e.g. ones based on Vision Transformers (ViT) such as BEiT?