How to use visual grounding with this model?
#25
by r4hul77
The documentation says this model supports visual grounding (object detection and segmentation). What is the best way to use that capability with this model, given that (as I understand it) Llama only outputs text tokens?
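For context, grounding-capable vision-language models usually do express detections through text tokens: the box (or mask polygon) coordinates are serialized into the generated string, and you parse them out afterwards. Below is a minimal sketch assuming a Hugging Face `transformers` checkpoint and a hypothetical output convention of `<box>x1,y1,x2,y2</box>` with coordinates normalized to 0-1000; the model ID, prompt format, and coordinate scheme here are placeholders, so check the model card for the actual ones.

```python
import re
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

# Hypothetical checkpoint name -- substitute the actual model ID.
MODEL_ID = "org/model-with-grounding"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForVision2Seq.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

image = Image.open("street.jpg")
# Prompt format is model-specific; many grounding VLMs use a
# detect/ground-style instruction like this.
prompt = "Detect all cars in the image."

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
# Keep special tokens: grounding formats often rely on them.
text = processor.batch_decode(output_ids, skip_special_tokens=False)[0]

# Assumed serialization: <box>x1,y1,x2,y2</box>, coordinates in [0, 1000].
# Rescale each parsed box to pixel coordinates of the input image.
width, height = image.size
boxes = []
for m in re.finditer(r"<box>(\d+),(\d+),(\d+),(\d+)</box>", text):
    x1, y1, x2, y2 = (int(v) for v in m.groups())
    boxes.append((x1 / 1000 * width, y1 / 1000 * height,
                  x2 / 1000 * width, y2 / 1000 * height))
print(boxes)
```

The same pattern applies to segmentation: models that ground masks typically emit a sequence of polygon points or special mask tokens in the text, which you decode with whatever scheme the model card documents.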