GPU requirements
#31, opened by fedorn
For anyone wondering what GPU is required to run the model: I was able to run it on an AWS EC2 g6e.xlarge instance with 32 GiB of RAM and one NVIDIA L40S GPU with 48 GiB of VRAM. The sentence-transformers code didn't require any modifications, and with Hugging Face Transformers the model can be loaded with `model = AutoModel.from_pretrained('nvidia/NV-Embed-v2', trust_remote_code=True, device_map="cuda")`. After loading, the model takes around 30 GiB of VRAM (30393 MiB of the 46068 MiB that nvidia-smi reports):
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.08             Driver Version: 550.127.08     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L40S                    Off |   00000000:30:00.0 Off |                    0 |
| N/A   29C    P0             79W /  350W |   30393MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2859      C   .../bin/python                              30386MiB |
+-----------------------------------------------------------------------------------------+
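If you would rather check the after-load figure from inside Python than from nvidia-smi, something like the sketch below might work. It reuses the `from_pretrained` call quoted above; the helper names `bytes_to_gib`, `cuda_memory_gib` and `load_model` are made up for illustration. Note that PyTorch's counters only cover memory handled by its own allocator, so they will probably read a bit lower than the nvidia-smi number.

```python
def bytes_to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (1 GiB = 1024**3 bytes)."""
    return n_bytes / 1024**3


def cuda_memory_gib(device: int = 0) -> dict:
    """Return PyTorch's view of GPU memory on `device`, in GiB."""
    import torch  # imported lazily so bytes_to_gib works without torch installed

    return {
        # Memory currently occupied by live tensors.
        "allocated": bytes_to_gib(torch.cuda.memory_allocated(device)),
        # Memory held by PyTorch's caching allocator (allocated + cached).
        "reserved": bytes_to_gib(torch.cuda.memory_reserved(device)),
    }


def load_model():
    """Load the model with the same call as in the post (needs a large GPU)."""
    from transformers import AutoModel

    return AutoModel.from_pretrained(
        "nvidia/NV-Embed-v2", trust_remote_code=True, device_map="cuda"
    )


# Usage on a machine with a suitable GPU:
#   model = load_model()
#   print(cuda_memory_gib())
```

Comparing `allocated` with `reserved` after an embedding call is one way to see how much of the usage is live tensors and how much is cache.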
After embedding the example, usage rises to almost all of the available VRAM (45499 MiB of 46068 MiB):
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.08             Driver Version: 550.127.08     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L40S                    Off |   00000000:30:00.0 Off |                    0 |
| N/A   32C    P0             78W /  350W |   45499MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2859      C   .../bin/python                              45492MiB |
+-----------------------------------------------------------------------------------------+
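To put the two nvidia-smi readings side by side, here is a small sketch that parses the "Memory-Usage" cells and works out the growth and the remaining headroom. The function name `parse_memory_usage` is just illustrative. My assumption (not verified here) is that part of the growth is PyTorch's caching allocator holding on to freed blocks rather than memory the model strictly needs, so the peak requirement may be lower than the second reading suggests.

```python
import re


def parse_memory_usage(cell: str) -> tuple[int, int]:
    """Parse an nvidia-smi 'Memory-Usage' cell into (used_mib, total_mib)."""
    used, total = re.findall(r"(\d+)\s*MiB", cell)
    return int(used), int(total)


# Values copied from the two nvidia-smi outputs above.
after_load = parse_memory_usage("30393MiB / 46068MiB")
after_embed = parse_memory_usage("45499MiB / 46068MiB")

# Extra VRAM in use after embedding the example.
growth_mib = after_embed[0] - after_load[0]
# VRAM still free after embedding.
headroom_mib = after_embed[1] - after_embed[0]

print(f"after load:  {after_load[0] / 1024:.2f} GiB")
print(f"after embed: {after_embed[0] / 1024:.2f} GiB")
print(f"growth:      {growth_mib} MiB ({growth_mib / 1024:.2f} GiB)")
print(f"headroom:    {headroom_mib} MiB")
```

So embedding the example added roughly 14.75 GiB on top of the 29.68 GiB used after loading, leaving 569 MiB free on this card.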