InferenceIllusionist committed
Commit
70350a0
1 Parent(s): 7efcdee

Update README.md

Files changed (1):
  1. README.md +40 -5
README.md CHANGED
@@ -1,5 +1,40 @@
- ---
- license: other
- license_name: tongyi-qianwen
- license_link: LICENSE
- ---
+ ---
+ license: other
+ license_name: tongyi-qianwen
+ license_link: LICENSE
+ tags:
+ - chat
+ - qwen
+ - opus
+ ---
+
+ ```
+ e88 88e d8
+ d888 888b 8888 8888 ,"Y88b 888 8e d88
+ C8888 8888D 8888 8888 "8" 888 888 88b d88888
+ Y888 888P Y888 888P ,ee 888 888 888 888
+ "88 88" "88 88" "88 888 888 888 888
+ b
+ 8b,
+
+ e88'Y88 d8 888
+ d888 'Y ,"Y88b 888,8, d88 ,e e, 888
+ C8888 "8" 888 888 " d88888 d88 88b 888
+ Y888 ,d ,ee 888 888 888 888 , 888
+ "88,d88 "88 888 888 888 "YeeP" 888
+
+ PROUDLY PRESENTS
+ ```
+
+ ## magnum-72b-v1-iMat-GGUF
+
+ Quantized from fp16 with love.
+ * Weighted quantizations were created from the fp16 GGUF using [groups_merged.txt](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384) in 92 chunks with n_ctx=512.
+
+ For a brief rundown of iMatrix quant performance, please see this [PR](https://github.com/ggerganov/llama.cpp/pull/5747).
+
+ <b>All quants are verified working prior to uploading to the repo, for your safety and convenience.</b>
+
+ Original model card [here](https://huggingface.co/alpindale/magnum-72b-v1)
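For a sense of scale, the calibration pass described in the card (92 chunks at n_ctx=512) covers roughly 47K tokens. A minimal sketch of that arithmetic, assuming each chunk fills a full context window:

```python
# Sketch: total tokens seen during the imatrix calibration pass described above.
# Assumption: each of the 92 chunks fills a complete 512-token context window.
chunks = 92
n_ctx = 512
calibration_tokens = chunks * n_ctx
print(calibration_tokens)  # 47104
```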