Joseph717171/Llama-3.2-3B-Instruct-OQ8_0-F32.EF32.IQ4_K-Q8_0-GGUF
Joseph717171 committed on Sep 25
Commit ff553b2 • 1 parent: 20c5b90
Create README.md
Files changed (1)
README.md ADDED (+1 -0)
@@ -0,0 +1 @@
+Custom GGUF quants of Meta’s [Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct), where the Output Tensors are quantized to Q8_0 or kept at F32, and the Embeddings are kept at F32. Enjoy! 🧠🔥🚀
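
As a quick illustration of how a quant like this can be loaded, here is a minimal sketch using the llama-cpp-python bindings; the local GGUF filename below is hypothetical, and the IQ4_K tensor type may require a llama.cpp build that supports it.

```python
# Minimal sketch, assuming llama-cpp-python is installed and a GGUF file
# from this repo has been downloaded locally (filename is hypothetical).
from llama_cpp import Llama

llm = Llama(
    model_path="Llama-3.2-3B-Instruct-OQ8_0-F32.EF32.IQ4_K-Q8_0.gguf",  # hypothetical path
    n_ctx=4096,        # context window size
    n_gpu_layers=-1,   # offload all layers if built with GPU support
)

# Run a simple chat completion against the instruct model.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "In one sentence, what is GGUF?"}]
)
print(response["choices"][0]["message"]["content"])
```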