Joseph717171/Hermes-3-Llama-3.2-3B-OQ8_0-F32.EF32.IQ4_K-Q8_0-GGUF
Joseph717171 committed on Dec 13, 2024
Commit 5010adc · verified · 1 Parent(s): 1a93520
Create README.md
README.md (ADDED, +1 -0)
@@ -0,0 +1 @@
+Custom GGUF quants of Hermes-3-Llama-3.2-3B, where the Output Tensors are quantized to Q8_0 or upcast to F32 while the Embeddings are kept at F32. Enjoy! 🧠🔥🚀