Upload folder using huggingface_hub

Files changed (8)

.gitattributes CHANGED

@@ -33,3 +33,9 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+Llama-3.2-1B-Instruct.f16.gguf filter=lfs diff=lfs merge=lfs -text
+Llama-3.2-1B-Instruct.q5_k.gguf filter=lfs diff=lfs merge=lfs -text
+Llama-3.2-1B-Instruct.q6_k.gguf filter=lfs diff=lfs merge=lfs -text
+Llama-3.2-1B-Instruct.q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+Llama-3.2-1B-Instruct.q8_p.gguf filter=lfs diff=lfs merge=lfs -text
+Llama-3.2-1B-Instruct.q8q4.gguf filter=lfs diff=lfs merge=lfs -text
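
The attribute lines in this hunk mark which paths are stored through Git LFS. As a rough illustration, the sketch below collects the patterns that carry `filter=lfs` and checks a filename against them. The helper names `lfs_patterns` and `is_lfs_tracked` are hypothetical, and `fnmatch` only approximates Git's own pattern matching (it does not handle `**` or directory semantics), so treat this as a sketch rather than how Git resolves attributes.

```python
from fnmatch import fnmatch


def lfs_patterns(gitattributes_text):
    """Return the patterns whose attributes include filter=lfs."""
    patterns = []
    for line in gitattributes_text.splitlines():
        parts = line.split()
        # Skip blank lines, comments, and patterns without the LFS filter.
        if len(parts) >= 2 and not parts[0].startswith("#") and "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


def is_lfs_tracked(filename, patterns):
    """Approximate check: fnmatch is not identical to Git's matching rules."""
    return any(fnmatch(filename, p) for p in patterns)


# A few attribute lines copied from the hunk above.
ATTRS = """\
*.zip filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
Llama-3.2-1B-Instruct.q8_0.gguf filter=lfs diff=lfs merge=lfs -text
"""
patterns = lfs_patterns(ATTRS)
print(is_lfs_tracked("Llama-3.2-1B-Instruct.q8_0.gguf", patterns))
print(is_lfs_tracked("README.md", patterns))
```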

Llama-3.2-1B-Instruct.f16.gguf ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:1f33ad43d2b85b908ff06fe7002b69806a57359b9b2617ca27d7bdea428ae146
+size 2479595360

Llama-3.2-1B-Instruct.q5_k.gguf ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:3d14220bbfef251c582820bb9f7d493d85db6b746a54bffc831509aa934d364f
+size 1221369696

Llama-3.2-1B-Instruct.q6_k.gguf ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:ecf98ee3f1de6d04c47bb4a23eeb623f5f244a432e581c327e776b7cfeba604d
+size 1331666784

Llama-3.2-1B-Instruct.q8_0.gguf ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:80f02df7997db0a45285f874109aac66d948b8e83039204171b7fe0327afe150
+size 1567334240

Llama-3.2-1B-Instruct.q8_p.gguf ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:6c0522ea0da43f9a089b5f2df7699a659cc320a5259c33506c44925fa03232fc
+size 1321082720

Llama-3.2-1B-Instruct.q8q4.gguf ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:936b78412d3ef2688b7aad4c6760f9b05047ddf7153d379654ae1e4d34e9ce58
+size 871309152
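
Each pointer file above records the spec version, a SHA-256 object ID, and the size in bytes of the real file. A minimal sketch of how a downloaded file could be checked against such a pointer follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the sketch assumes the simple three-line pointer layout shown in this diff.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 the pointer records."""
    h = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so multi-gigabyte files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and h.hexdigest() == pointer["digest"]


# Pointer text copied from the q8q4 entry above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:936b78412d3ef2688b7aad4c6760f9b05047ddf7153d379654ae1e4d34e9ce58
size 871309152
"""
pointer = parse_lfs_pointer(POINTER)
```

With the model file downloaded locally, `matches_pointer("Llama-3.2-1B-Instruct.q8q4.gguf", pointer)` would report whether the download is intact. The local path is an assumption about where the file was saved.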

README.md ADDED

+---
+license: mit
+language:
+- en
+pipeline_tag: text-generation
+---
+My own (ZeroWw) quantizations.
+Output and embedding tensors are quantized to f16.
+All other tensors are quantized to q5_k or q6_k.
+Result:
+Both f16.q6 and f16.q5 are smaller than the standard q8_0 quantization,
+and they perform as well as the pure f16.
+Updated on: Tue Oct 08, 23:22:36
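
The size claim in the README can be checked against the `size` fields of the LFS pointers in this commit. The sketch below does only that arithmetic. It assumes the README's "f16.q5" and "f16.q6" correspond to the `q5_k` and `q6_k` files, which the commit does not state explicitly, and it says nothing about the quality claim. The helper `relative_size` is a hypothetical name.

```python
# File sizes in bytes, copied from the LFS pointers in this commit.
SIZES = {
    "f16": 2479595360,
    "q5_k": 1221369696,
    "q6_k": 1331666784,
    "q8_0": 1567334240,
    "q8_p": 1321082720,
    "q8q4": 871309152,
}


def relative_size(name, baseline="f16"):
    """Size of one variant as a fraction of the baseline variant."""
    return SIZES[name] / SIZES[baseline]


# List variants from smallest to largest with their share of the f16 size.
for name, size in sorted(SIZES.items(), key=lambda kv: kv[1]):
    print(f"{name:5s} {size / 2**30:6.2f} GiB  {relative_size(name):6.1%} of f16")

# The README's size claim, under the naming assumption stated above.
print(SIZES["q5_k"] < SIZES["q8_0"] and SIZES["q6_k"] < SIZES["q8_0"])
```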