Tutorial: How to run infinity and nv-embed-2

#30

by michaelfeil - opened Dec 17, 2024

Discussion

michaelfeil

Dec 17, 2024

•

edited Jan 7

Usage for Infinity

Run the model via Infinity (MIT License).
This needs a GPU with 24 GB or more of memory.

docker run -it --gpus all  -v ./data:/app/.cache -p 7997:7997 michaelf34/infinity:0.0.70 \
v2 --model-id nvidia/NV-Embed-v2 --revision "refs/pr/23" --batch-size 8
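
Once the container is up, the server can be queried over HTTP. Below is a minimal client sketch, not an official example. It assumes the server is reachable at `http://localhost:7997` (from the `-p 7997:7997` mapping above), that it exposes an OpenAI-style `/embeddings` route with no URL prefix, and that the response carries a `data` list of `{"index", "embedding"}` objects. The helper names (`build_embedding_request`, `parse_embeddings`, `embed`) are made up for illustration.

```python
import json
import urllib.request

# Assumptions: host/port come from the docker command's port mapping;
# the "/embeddings" route and response shape follow the OpenAI-style API.
BASE_URL = "http://localhost:7997"
MODEL_ID = "nvidia/NV-Embed-v2"


def build_embedding_request(texts, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for a batch of texts."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(response_body):
    """Return the embedding vectors ordered by their 'index' field."""
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed(texts, timeout=60.0):
    """Send the texts to the running server and return one vector per text."""
    request = build_embedding_request(texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_embeddings(json.load(response))
```

With the container running, `embed(["first text", "second text"])` should return one vector per input. If the route or response shape differs in your Infinity version, adjust the two helper functions accordingly.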

michaelfeil

Dec 17, 2024

https://github.com/michaelfeil/infinity/issues/470
https://github.com/michaelfeil/infinity/issues/498#issuecomment-2549521971

nada5

NVIDIA org Dec 31, 2024

Hi, @michaelfeil. Thank you for supporting the NV-Embed integration in Infinity. Your previous PR has been approved, and the suggested changes to the modeling code and configs have been merged. However, we decided not to include the Infinity instructions in the README, as NV-Embed is a research-only model and we cannot extend our support beyond Hugging Face and Sentence Transformers.

nada5 changed discussion status to closed Jan 2

Excel2

Jan 3

michaelfeil

Jan 3

@nada5 I opened this discussion as part of the documentation, and I acknowledge your decision! Keeping it closed will not help funnel users into commenting in a single thread, though. Could you please reopen it?

nada5 changed discussion status to open Jan 7

michaelfeil

Jan 7

Thanks! :)

michaelfeil changed discussion title from How to run infinity and nv-embed-2 to Tutorial: How to run infinity and nv-embed-2 Jan 7
