What kind of GPU do I need to run this model locally on-prem?

#8
by eliastick - opened

I'd like to run this model on-premise. What hardware and GPU do I need? Thank you.

@eliastick Without quantization, you would need roughly 16 GB of VRAM, so any GPU with 16 GB of VRAM or more should be fine.
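
As a rough sanity check, that figure follows from parameter count times bytes per parameter. Here is a minimal Python sketch, assuming an ~8B-parameter model (an assumption; check the model card for the actual size). It counts weight memory only, so treat it as a lower bound:

```python
# Back-of-envelope VRAM estimate: parameter count x bytes per parameter.
# Real usage is higher because of activations and the KV cache.
def estimate_vram_gb(num_params: float, bytes_per_param: float) -> float:
    """Weight memory only, in GB."""
    return num_params * bytes_per_param / 1e9

params = 8e9  # assumed ~8B-parameter model; check the model card

print(f"fp16 (2 bytes/param): ~{estimate_vram_gb(params, 2.0):.0f} GB")        # ~16 GB
print(f"~4-bit quant (~0.6 bytes/param): ~{estimate_vram_gb(params, 0.6):.1f} GB")  # ~5 GB
```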

With quantization, you would need roughly 5 GB of VRAM, so any GPU with 6 GB of VRAM or more should work. I would recommend using llama.cpp, and possibly ExLlamaV2.
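
For the llama.cpp route, here is a minimal sketch using llama-cpp-python (the Python bindings for llama.cpp). The GGUF file path is a placeholder; you would first need to download a quantized GGUF build of the model:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./model-q4_k_m.gguf",  # placeholder path to a ~4-bit GGUF file
    n_gpu_layers=-1,                   # offload all layers to the GPU
    n_ctx=4096,                        # context window; larger values use more memory
)

output = llm("Q: What hardware do I need to run this? A:", max_tokens=64)
print(output["choices"][0]["text"])
```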
