---
license: mit
language:
- nl
tags:
- gguf
---
This repository contains quantized versions of [BramVanroy/fietje-2-instruct](https://huggingface.co/BramVanroy/fietje-2-instruct).
Available quantization types and the expected quality loss relative to the base `f16` model, expressed as the perplexity increase (higher = worse) that llama.cpp reports for LLaMA-v1-7B:
```
Q3_K_M : 3.07G, +0.2496 ppl @ LLaMA-v1-7B
Q4_K_M : 3.80G, +0.0532 ppl @ LLaMA-v1-7B
Q5_K_M : 4.45G, +0.0122 ppl @ LLaMA-v1-7B
Q6_K : 5.15G, +0.0008 ppl @ LLaMA-v1-7B
Q8_0 : 6.70G, +0.0004 ppl @ LLaMA-v1-7B
F16 : 13.00G @ 7B
```
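As a rough sketch of how quants like these are produced with llama.cpp (the file names below are placeholders, not the exact files in this repository, and script/binary names have changed across llama.cpp releases):

```shell
# Convert the original Hugging Face model to an f16 GGUF
# (convert-hf-to-gguf.py in llama.cpp; older releases used convert.py)
python convert-hf-to-gguf.py path/to/fietje-2-instruct \
    --outfile fietje-2-instruct-f16.gguf --outtype f16

# Quantize the f16 GGUF to one of the listed types, e.g. Q4_K_M
# (the binary was called `quantize` around release b2777,
#  `llama-quantize` in later releases)
./quantize fietje-2-instruct-f16.gguf fietje-2-instruct-Q4_K_M.gguf Q4_K_M
```

The type argument (`Q4_K_M` here) selects the trade-off from the table above: smaller types save disk and memory at the cost of a larger perplexity increase.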
Also available on [ollama](https://ollama.com/bramvanroy/fietje-2b-instruct).
Quants were made with release [`b2777`](https://github.com/ggerganov/llama.cpp/releases/tag/b2777) of llama.cpp.
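To try a quant locally with llama.cpp, something like the following should work (the model file name is a placeholder; around release `b2777` the CLI binary was called `main`, renamed `llama-cli` in later releases):

```shell
# Run an interactive generation with a quantized GGUF
./main -m fietje-2-instruct-Q4_K_M.gguf \
    -p "Schrijf een kort gedicht over de zee." -n 128
```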