Different Behavior between this hf model and the local model

#8
by QHQK - opened

Hi, thanks for your excellent work.

When testing Grounding DINO with the provided "how to use" code, I changed the input text from "a cat. a remote control." to "a cat. a remote control. the right cat". I found that grounding-dino (hf) tends to match all detected cat boxes to the phrase "a cat", while the locally installed code with "groundingdino_swinb_cogcoor.pth" always matches the box of the right cat to the phrase "the right cat" and ignores "a cat".

I'm curious whether these two base models are the same. If they are different, which one will perform better?

Hey @QHQK! Are you using .post_process_grounded_object_detection?
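
For context on why the post-processing step matters here: the same box can end up with a different phrase depending on how its per-token scores are turned into a label. The sketch below is only an illustration under stated assumptions, not the actual code of either the hf model or the local repository. The scores are made up, the tokens are whole words rather than real tokenizer output, and the helper names (split_phrases, label_by_threshold, label_by_best_phrase) are hypothetical.

```python
def split_phrases(tokens):
    """Group token indices into phrases, using "." as the separator."""
    phrases, current = [], []
    for i, tok in enumerate(tokens):
        if tok == ".":
            if current:
                phrases.append(current)
            current = []
        else:
            current.append(i)
    if current:
        phrases.append(current)
    return phrases


def label_by_threshold(tokens, scores, text_threshold):
    """Keep every non-separator token whose score exceeds the threshold."""
    kept = [t for t, s in zip(tokens, scores) if t != "." and s > text_threshold]
    return " ".join(kept)


def label_by_best_phrase(tokens, scores):
    """Pick the whole phrase whose tokens have the highest mean score."""
    def mean_score(indices):
        return sum(scores[i] for i in indices) / len(indices)

    best = max(split_phrases(tokens), key=mean_score)
    return " ".join(tokens[i] for i in best)


# Word-level stand-in for the prompt "a cat. a remote control. the right cat."
tokens = ["a", "cat", ".", "a", "remote", "control", ".", "the", "right", "cat", "."]
# Hypothetical token scores for ONE detected box (not real model output).
scores = [0.05, 0.45, 0.0, 0.02, 0.02, 0.03, 0.0, 0.28, 0.29, 0.29, 0.0]

threshold_label = label_by_threshold(tokens, scores, text_threshold=0.3)
phrase_label = label_by_best_phrase(tokens, scores)

print("threshold-based label:", threshold_label)
print("best-phrase label:   ", phrase_label)
```

With these made-up numbers the two strategies disagree on the same box, so comparing the post-processing settings (for example the text threshold) on both sides may be worth doing before concluding that the weights differ.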
