catid/cat-llama-3-8b-awq-q128-w4-gemm

Text Generation

text-generation-inference

Inference Endpoints

4-bit precision


catid committed on Apr 19

Commit 0465785 • 1 Parent(s): 9ec8175

Create README.md

Files changed (1)

README.md +3 -0

README.md ADDED

@@ -0,0 +1,3 @@
+ AI Model Name: Llama 3 8B "Built with Meta Llama 3" https://llama.meta.com/llama3/license/
+
+ This is the result of running AutoAWQ to quantize the LLaMA-3 8B model to ~4 bits/parameter.
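To make the "~4 bits/parameter" figure concrete, here is a rough sketch of the arithmetic. It is not taken from the commit. It assumes the repository name encodes the settings (`w4` for 4-bit weights, `q128` for a quantization group size of 128) and that each group additionally stores one 16-bit scale and one 4-bit zero point, which is a common layout for group-wise 4-bit quantization. The function name `effective_bits_per_weight` is hypothetical.

```python
def effective_bits_per_weight(w_bit=4, group_size=128, scale_bits=16, zero_bits=4):
    """Estimate storage cost per quantized weight, including per-group metadata.

    All defaults are assumptions inferred from the repo name
    (w4 -> 4-bit weights, q128 -> group size 128) and a typical
    group-wise layout (one fp16 scale and one 4-bit zero point per group).
    """
    # Each group of `group_size` weights shares one scale and one zero point,
    # so that metadata is amortized across the group.
    overhead = (scale_bits + zero_bits) / group_size
    return w_bit + overhead


bits = effective_bits_per_weight()
print(f"{bits:.5f} bits per quantized weight")
```

Under these assumptions the quantized linear layers cost slightly more than 4 bits per weight, which is consistent with the "~4" in the README. Layers that are usually left unquantized, such as embeddings, are not covered by this estimate.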