How to call it through transformer

by awelker - opened Nov 18, 2024

Nov 18, 2024

Could you describe the expected prompt format and how to inject the image and text start and stop words? Maybe even an example call through Hugging Face transformers?
Thanks in advance.

alanzhuly

Nexa AI org Nov 19, 2024

edited Nov 19, 2024

Hi @awelker! To run this model, please follow these 2 steps.

Step 1:
Install Nexa-SDK
https://github.com/NexaAI/nexa-sdk?tab=readme-ov-file#install-option-1-executable-installer

Step 2:
To use the CLI, type in a terminal: nexa run omnivision
To use the local UI, type in a terminal: nexa run omnivision -st
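
If you would rather start the model from a script than type the command by hand, the two commands above can be wrapped roughly like this. This is only a sketch: the helper names `build_nexa_command` and `launch` are made up for illustration, it assumes the `nexa` executable from Step 1 is on your PATH, and it does not use any Nexa Python API. It simply starts the same interactive CLI session.

```python
import shutil
import subprocess


def build_nexa_command(local_ui: bool = False) -> list[str]:
    """Build the argv for the two commands quoted in Step 2 (hypothetical helper)."""
    cmd = ["nexa", "run", "omnivision"]
    if local_ui:
        cmd.append("-st")  # flag quoted in Step 2 for the local UI
    return cmd


def launch(local_ui: bool = False) -> int:
    """Start the interactive session and return the process exit code."""
    # Assumption: Step 1 put a `nexa` executable on PATH.
    if shutil.which("nexa") is None:
        raise RuntimeError("nexa not found on PATH; install Nexa-SDK first (Step 1)")
    # stdin/stdout are inherited, so the interactive prompt stays usable.
    return subprocess.run(build_nexa_command(local_ui)).returncode


# Example: launch() for the CLI, launch(local_ui=True) for the local UI.
```

Image and prompt input still happen inside the interactive session that this starts.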

Here is a quick tutorial video on how to input images and prompts. You can drag a photo into your terminal and write a prompt for image captioning or question answering tasks.

zackli4ai

Nexa AI org Nov 19, 2024

@awelker
Please try our Nexa SDK with the GGUF model format for now.
We plan to release a transformers version soon and to share the forward propagation implementation with the community.
