How to run infinity and nv-embed-2

#30
by michaelfeil - opened

A lot of users pinged me, and my PR got rejected.
I am stopping support for NV-Embed-{1|2} and will leave it to users to discuss how best to run it. Thanks!
https://huggingface.co/nvidia/NV-Embed-v2/discussions/23

From the PR:

Usage (Infinity)

Usage via Infinity, MIT License.
This needs a 24GB+ GPU.

docker run -it --gpus all -v ./data:/app/.cache -p 7997:7997 michaelf34/infinity:0.0.70 \
v2 --model-id nvidia/NV-Embed-v2 --revision "refs/pr/23" --batch-size 8
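
For anyone who gets the container running, here is a rough sketch of how a client might request embeddings from it. It is not part of the original instructions. It assumes the server listens on localhost:7997 (matching the `-p 7997:7997` mapping above), that it serves an OpenAI-style `/embeddings` route with no URL prefix, and that the `model` field should match the `--model-id` value. The helper names `build_embedding_request` and `embed` are made up for illustration.

```python
import json
import urllib.request

# Assumptions (not from the original post): the container started by the
# docker command is reachable on localhost:7997, and the model is served
# under the same name that was passed to --model-id.
BASE_URL = "http://localhost:7997"
MODEL_ID = "nvidia/NV-Embed-v2"


def build_embedding_request(texts, model=MODEL_ID, base_url=BASE_URL):
    """Build (but do not send) a POST request for an OpenAI-style /embeddings route."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, timeout=60.0):
    """Send the request and return one embedding vector (list of floats) per text.

    Assumes the response follows the OpenAI layout:
    {"data": [{"embedding": [...], "index": 0}, ...]}
    """
    request = build_embedding_request(texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return [item["embedding"] for item in body["data"]]
```

With the server up, `embed(["first sentence", "second sentence"])` should return two vectors. If the route or served model name differs in your setup, adjust `BASE_URL` and `MODEL_ID` accordingly.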
NVIDIA org

Hi @michaelfeil, thank you for supporting the NV-Embed integration in Infinity. Your previous PR has been approved, and the suggested changes to the modeling code and configs have been merged. However, we decided not to include the Infinity instructions in the README, as NV-Embed is a research-only model and we cannot extend our support beyond Hugging Face and Sentence Transformers.
