File size: 4,229 Bytes
6da6fe6
 
 
e57eda5
6da6fe6
 
 
 
 
 
94a920d
6da6fe6
94a920d
 
2fb7058
94a920d
 
 
 
 
 
6da6fe6
 
110e177
 
6da6fe6
 
2fb7058
0d6b15b
 
2e7fd30
1ece0cb
0d6b15b
cbc9a3e
1ece0cb
0d6b15b
cbc9a3e
0d6b15b
2e7fd30
0d6b15b
 
7333c42
cbc9a3e
c48d3a9
 
 
6da6fe6
94a920d
43a010a
 
110e177
43a010a
110e177
3b9837f
 
 
73728fc
 
 
 
 
 
6da6fe6
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
---
datasets:
- ehartford/samantha-data
exported_from: cognitivecomputations/Samantha-1.11-70b
language:
- en
library_name: transformers
license: llama2
quantized_by: mradermacher
---
## About

weighted/imatrix quants of https://huggingface.co/cognitivecomputations/Samantha-1.11-70b

<!-- provided-files -->
## Usage

If you are unsure how to use GGUF files, refer to one of [TheBloke's
READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
more details, including on how to concatenate multi-part files.

## Provided Quants

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ1_S.gguf) | i1-IQ1_S | 15.0 | for the desperate |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ2_XXS.gguf) | i1-IQ2_XXS | 18.7 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ2_XS.gguf) | i1-IQ2_XS | 20.8 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ2_S.gguf) | i1-IQ2_S | 21.8 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ2_M.gguf) | i1-IQ2_M | 23.7 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q2_K.gguf) | i1-Q2_K | 25.9 | IQ3_XXS probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ3_XXS.gguf) | i1-IQ3_XXS | 27.4 | lower quality |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ3_XS.gguf) | i1-IQ3_XS | 28.6 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q3_K_XS.gguf) | i1-Q3_K_XS | 28.7 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ3_S.gguf) | i1-IQ3_S | 30.3 | beats Q3_K* |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q3_K_S.gguf) | i1-Q3_K_S | 30.3 | IQ3_XS probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ3_M.gguf) | i1-IQ3_M | 31.4 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q3_K_M.gguf) | i1-Q3_K_M | 33.7 | IQ3_S probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q3_K_L.gguf) | i1-Q3_K_L | 36.6 | IQ3_M probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q4_K_S.gguf) | i1-Q4_K_S | 39.7 | optimal size/speed/quality |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q4_K_M.gguf) | i1-Q4_K_M | 41.8 | fast, recommended |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q5_K_S.gguf) | i1-Q5_K_S | 47.9 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q5_K_M.gguf) | i1-Q5_K_M | 49.2 |  |
| [PART 1](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q6_K.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q6_K.gguf.part2of2) | i1-Q6_K | 57.0 | practically like static Q6_K |


Here is a handy graph by ikawrakow comparing some lower-quality quant
types (lower is better):

![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)

And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

## Thanks

I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
me use its servers and providing upgrades to my workstation to enable
this work in my free time.

<!-- end -->