InferenceIllusionist
committed
Commit cd78bf4
1 Parent(s): 1217063
Create README.md
README.md ADDED
---
base_model: Undi95/Meta-Llama-3.1-8B-Claude
library_name: transformers
quantized_by: InferenceIllusionist
tags:
- iMat
- gguf
- llama3
license: apache-2.0
---
<img src="https://i.imgur.com/P68dXux.png" width="400"/>

# Meta-Llama-3.1-8B-Claude-iMat-GGUF

Quantized from Meta-Llama-3.1-8B-Claude fp16
* Weighted quantizations were created using the fp16 GGUF and groups_merged.txt in 88 chunks at n_ctx=512 (see the sketch below)
* The static fp16 will also be included in the repo
* For a brief rundown of iMatrix quant performance, please see this [PR](https://github.com/ggerganov/llama.cpp/pull/5747)
* <i>All quants are verified working prior to uploading to the repo for your safety and convenience</i>
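For reference, below is a minimal sketch of the standard llama.cpp imatrix workflow consistent with the settings above (groups_merged.txt, n_ctx=512). The tool names, flags, file names, and the IQ4_XS example type are assumptions based on current llama.cpp usage, not the exact commands used to produce the files in this repo.

```python
import subprocess

# Minimal sketch of the llama.cpp imatrix workflow, assuming the llama-imatrix
# and llama-quantize binaries are built and on PATH. All file names below are
# illustrative placeholders, not the actual paths used for this repo.

FP16_GGUF = "Meta-Llama-3.1-8B-Claude-fp16.gguf"   # static fp16 GGUF
CALIB_TXT = "groups_merged.txt"                    # calibration text
IMATRIX   = "imatrix.dat"                          # importance-matrix output

# 1) Compute the importance matrix from the fp16 GGUF at n_ctx=512; at this
#    context size the calibration text is processed in chunks (88 in this run).
subprocess.run(
    ["llama-imatrix", "-m", FP16_GGUF, "-f", CALIB_TXT, "-o", IMATRIX, "-c", "512"],
    check=True,
)

# 2) Produce a weighted (iMat) quant guided by the importance matrix,
#    using IQ4_XS as one example quant type.
subprocess.run(
    ["llama-quantize", "--imatrix", IMATRIX, FP16_GGUF,
     "Meta-Llama-3.1-8B-Claude-iMat-IQ4_XS.gguf", "IQ4_XS"],
    check=True,
)
```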

<b>KL-Divergence Reference Chart</b>
(Click on image to view in full size)
[<img src="https://i.imgur.com/mV0nYdA.png" width="920"/>](https://i.imgur.com/mV0nYdA.png)


Original model card can be found [here](https://huggingface.co/Undi95/Meta-Llama-3.1-8B-Claude)