---
base_model: alpindale/miquella-120b
language:
- en
library_name: transformers
pipeline_tag: text-generation
quantized_by: mradermacher
tags:
- mergekit
- merge
---
## About

Static quants of https://huggingface.co/alpindale/miquella-120b (commit 25de83c).

The model author also provides GGUF quants at https://huggingface.co/alpindale/miquella-120b-gguf.

<!-- provided-files -->
Weighted/imatrix quants are available at https://huggingface.co/mradermacher/miquella-120b-i1-GGUF.

## Usage

If you are unsure how to use GGUF files, refer to one of [TheBloke's
READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
more details, including how to concatenate multi-part files.
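
For example, the larger quants below ship as `.split-aa`/`.split-ab` (or `.part1of2`/`.part2of2`) pieces that must be joined back into a single `.gguf` file before loading. A minimal Python sketch of the join, assuming the parts are plain byte-splits sitting in the current directory (the file name is illustrative):

```python
import glob

# Illustrative file name; substitute the quant you actually downloaded.
parts = sorted(glob.glob("miquella-120b.Q4_K_S.gguf.split-*"))
assert parts, "no split parts found in the current directory"

# Concatenate the parts in order; the result is a normal GGUF file.
with open("miquella-120b.Q4_K_S.gguf", "wb") as out:
    for part in parts:
        with open(part, "rb") as src:
            while chunk := src.read(1 << 20):  # copy in 1 MiB chunks
                out.write(chunk)
```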

## Provided Quants

(sorted by size, which is not necessarily an indicator of quality; IQ-quants are often preferable to similar-sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q2_K.gguf) | Q2_K | 43.3 |  |
| [GGUF](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q3_K_XS.gguf) | Q3_K_XS | 48.0 |  |
| [GGUF](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.IQ3_XS.gguf) | IQ3_XS | 48.2 |  |
| [PART 1](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q3_K_S.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q3_K_S.gguf.split-ab) | Q3_K_S | 50.8 |  |
| [PART 1](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.IQ3_S.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.IQ3_S.gguf.part2of2) | IQ3_S | 51.0 | beats Q3_K* |
| [PART 1](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.IQ3_M.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.IQ3_M.gguf.part2of2) | IQ3_M | 52.7 |  |
| [PART 1](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q3_K_M.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q3_K_M.gguf.split-ab) | Q3_K_M | 56.7 | lower quality |
| [PART 1](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q3_K_L.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q3_K_L.gguf.split-ab) | Q3_K_L | 61.8 |  |
| [PART 1](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.IQ4_XS.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.IQ4_XS.gguf.part2of2) | IQ4_XS | 63.5 |  |
| [PART 1](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q4_K_S.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q4_K_S.gguf.split-ab) | Q4_K_S | 66.9 | fast, recommended |
| [PART 1](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q4_K_M.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q4_K_M.gguf.split-ab) | Q4_K_M | 70.7 | fast, recommended |
| [PART 1](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q5_K_S.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q5_K_S.gguf.split-ab) | Q5_K_S | 81.1 |  |
| [PART 1](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q5_K_M.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q5_K_M.gguf.split-ab) | Q5_K_M | 83.3 |  |
| [PART 1](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q6_K.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q6_K.gguf.split-ab) | Q6_K | 96.7 | very good quality |
| [PART 1](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q8_0.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q8_0.gguf.split-ab) [PART 3](https://huggingface.co/mradermacher/miquella-120b-GGUF/resolve/main/miquella-120b.Q8_0.gguf.split-ac) | Q8_0 | 125.2 | fast, best quality |
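
To try one of the single-file quants end to end, here is a minimal sketch using `huggingface_hub` to fetch the file and `llama-cpp-python` to run it. Both packages are assumptions on my part (install via `pip install huggingface-hub llama-cpp-python`), and even the smallest quant is ~43 GB, so check your RAM/VRAM first:

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch the smallest quant (~43 GB); the file is cached locally.
path = hf_hub_download(
    repo_id="mradermacher/miquella-120b-GGUF",
    filename="miquella-120b.Q2_K.gguf",
)

# Load and generate; n_ctx and max_tokens are illustrative values.
llm = Llama(model_path=path, n_ctx=4096)
out = llm("Q: What is a GGUF file?\nA:", max_tokens=64)
print(out["choices"][0]["text"])
```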

Here is a handy graph by ikawrakow comparing some lower-quality quant
types (lower is better):

![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)

And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

## FAQ / Model Request

See https://huggingface.co/mradermacher/model_requests for answers to
common questions and for requesting quants of other models.

## Thanks

I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
me use its servers and for providing upgrades to my workstation, which
enable this work in my free time.

<!-- end -->