Accurate FP8 quantized models by Neural Magic, ready for use with vLLM!
All four checkpoints are Text Generation models; the `-KV` variants additionally quantize the KV cache to FP8:

- neuralmagic/Meta-Llama-3-8B-Instruct-FP8
- neuralmagic/Meta-Llama-3-8B-Instruct-FP8-KV
- neuralmagic/Meta-Llama-3-70B-Instruct-FP8
- neuralmagic/Meta-Llama-3-70B-Instruct-FP8-KV
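A minimal sketch of loading one of these checkpoints with vLLM's offline `LLM` API. It assumes vLLM is installed on a CUDA GPU with FP8 support (e.g. Hopper or Ada); vLLM reads the quantization scheme from the checkpoint's config, so no extra flags are needed. The prompt and sampling settings are illustrative.

```python
# Illustrative sketch: serving an FP8 checkpoint from this collection with vLLM.
MODEL_ID = "neuralmagic/Meta-Llama-3-8B-Instruct-FP8"

def main():
    # Import inside main() so the sketch can be inspected on machines
    # without vLLM or a compatible GPU installed.
    from vllm import LLM, SamplingParams

    # Weights are already FP8-quantized; vLLM detects this from the model config.
    llm = LLM(model=MODEL_ID)
    params = SamplingParams(temperature=0.7, max_tokens=128)

    outputs = llm.generate(["What is FP8 quantization?"], params)
    print(outputs[0].outputs[0].text)

if __name__ == "__main__":
    main()
```

The same model ID works with the OpenAI-compatible server (`vllm serve <model>`), which is the more common deployment path.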