InferenceIllusionist committed
Commit 8016a2c
1 Parent(s): c341c08

Update README.md

Files changed (1)
  1. README.md +42 -5
README.md CHANGED
@@ -1,5 +1,42 @@
- ---
- license: other
- license_name: other
- license_link: LICENSE
- ---
+ ---
+ license: other
+ license_name: other
+ license_link: LICENSE
+ tags:
+ - GGUF
+ - iMat
+ - llama3
+ ---
+
+ ```
+   e88 88e                               d8
+  d888 888b  8888 8888  ,"Y88b 888 8e   d88
+ C8888 8888D 8888 8888 "8" 888 888 88b d88888
+  Y888 888P  Y888 888P ,ee 888 888 888  888
+   "88 88"    "88 88"  "88 888 888 888  888
+        b
+        8b,
+
+  e88'Y88                   d8           888
+ d888  'Y   ,"Y88b 888,8,  d88    ,e e,  888
+ C8888     "8" 888 888 "  d88888 d88 88b 888
+  Y888 ,d  ,ee 888 888      888   888   , 888
+   "88,d88 "88 888 888      888    "YeeP" 888
+
+               PROUDLY PRESENTS
+ ```
+
+ ## Dusk-Miqu-70B-iMat-GGUF
+
+ Quantized from fp16.
+ * Weighted quantizations were created using the fp16 GGUF and [groups_merged-enhancedV2-TurboMini.txt](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-9432658) in 234 chunks with n_ctx=512 (see the sketch after this list)
+ * This method of calculating the importance matrix showed improvements in some areas for Mistral 7B and Llama 3 8B models; see the post linked above for details
+
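For readers who want to reproduce a similar pipeline, here is a minimal sketch of that two-step workflow using llama.cpp's `imatrix` and `quantize` tools. The file names below are placeholders and flags can vary between llama.cpp versions, so treat it as illustrative rather than the exact commands used for this repo:

```bash
# Step 1: compute the importance matrix from the fp16 GGUF using the
# enhanced calibration file (234 chunks, context of 512 tokens).
./imatrix -m Dusk-Miqu-70B-f16.gguf \
    -f groups_merged-enhancedV2-TurboMini.txt \
    -o imatrix.dat \
    --chunks 234 -c 512

# Step 2: feed that matrix into quantize to produce a weighted quant
# (IQ4_XS is shown as an example target type).
./quantize --imatrix imatrix.dat \
    Dusk-Miqu-70B-f16.gguf Dusk-Miqu-70B-IQ4_XS.gguf IQ4_XS
```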
+ For a brief rundown of iMatrix quant performance, please see this [PR](https://github.com/ggerganov/llama.cpp/pull/5747)
+
+ <b>All quants are verified working prior to upload to this repo for your safety and convenience.</b>
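As an illustration of what such a check can look like, a quant can be smoke-tested by loading it and generating a few tokens with llama.cpp's `main` example (hypothetical file name; any short prompt will do):

```bash
# Smoke test: load the finished quant and generate a handful of tokens.
./main -m Dusk-Miqu-70B-IQ4_XS.gguf -p "The quick brown fox" -n 32
```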
+
+ The original model card can be found [here](https://huggingface.co/jukofyork/Dusk-Miqu-70B/) and is reproduced below
+
+ ---