Can you make a model that can run without quantization on a GPU with only 8 GB of VRAM?
no
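For context on why the answer is no: unquantized fp16 weights take 2 bytes per parameter, so weight storage alone quickly exceeds 8 GB for common model sizes. A minimal sketch of that arithmetic (the 7-billion-parameter figure is an illustrative assumption, not a statement about this particular model):

```python
def fp16_vram_gb(num_params: float) -> float:
    """Approximate VRAM for fp16 weights alone: 2 bytes per parameter."""
    return num_params * 2 / 1024**3

# Hypothetical 7B-parameter model: weights alone need ~13 GB,
# before activations or KV cache -- already past an 8 GB card.
print(round(fp16_vram_gb(7e9), 1))  # → 13.0
```

Fitting such a model on 8 GB generally requires quantization (e.g. 8-bit or 4-bit weights) or offloading layers to CPU RAM.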