Joseph717171/Llama-3.1-SuperNova-Lite-8.0B-OQ8_0-F32.EF32.IQ4_K-Q8_0-GGUF
Tags: GGUF · Inference Endpoints · imatrix · conversational
Joseph717171 committed on Sep 11, 2024 · verified
Commit 9e1746d · 1 parent: c6783bc
Create README.md
Files changed (1): README.md (+1, -0)

README.md ADDED
@@ -0,0 +1 @@
+ Custom GGUF quants of arcee-ai’s [Llama-3.1-SuperNova-Lite-8B](https://huggingface.co/arcee-ai/Llama-3.1-SuperNova-Lite), where the Output Tensors are quantized to Q8_0 while the Embeddings are kept at F32. 🧠🔥🚀
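The kind of mixed-precision quant the README describes can be produced with llama.cpp's `llama-quantize` tool, which accepts per-tensor-class type overrides. The sketch below is a hedged illustration, not the author's exact commands: file names are illustrative, the `imatrix.dat` input is assumed from the repo's "imatrix" tag, and the `IQ4_K` quant type comes from the ik_llama.cpp fork (mainline llama.cpp would use a type such as `Q8_0` here instead).

```shell
# Hypothetical invocation of llama.cpp's llama-quantize, assuming an F32 GGUF
# export of the base model already exists.
#
#   --imatrix imatrix.dat        apply an importance matrix during quantization
#   --output-tensor-type q8_0    quantize the output tensor to Q8_0
#   --token-embedding-type f32   keep the token embeddings at full F32
#
./llama-quantize --imatrix imatrix.dat \
    --output-tensor-type q8_0 \
    --token-embedding-type f32 \
    Llama-3.1-SuperNova-Lite-8B-F32.gguf \
    Llama-3.1-SuperNova-Lite-8.0B-OQ8_0-F32.EF32.IQ4_K.gguf \
    IQ4_K
```

The naming convention in the repo title encodes these choices: `OQ8_0` for the Q8_0 output tensor, `EF32` for the F32 embeddings, and `IQ4_K`/`Q8_0` for the base quantization of the remaining weights.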