---
license: cc-by-nc-4.0
language:
- nl
tags:
- gguf
- llamacpp
- dpo
- geitje
- conversational
datasets:
- BramVanroy/ultra_feedback_dutch
---

<p align="center" style="margin:0;padding:0">
<img src="https://huggingface.co/BramVanroy/GEITje-7B-ultra/resolve/main/geitje-ultra-banner.png" alt="GEITje Ultra banner" width="800" style="margin-left:auto; margin-right:auto; display:block"/>
</p>

<div style="margin:auto; text-align:center">
<h1 style="margin-bottom: 0">GEITje 7B ultra (GGUF version)</h1>
<em>A conversational model for Dutch, aligned through AI feedback.</em>
</div>

This is a `Q5_K_M` GGUF version of [BramVanroy/GEITje-7B-ultra](https://huggingface.co/BramVanroy/GEITje-7B-ultra), a powerful Dutch chatbot. It is ultimately a Mistral-based model that was further pretrained on Dutch and then treated with supervised finetuning and DPO alignment. For more information on the model, data, licensing, and usage, see the main model's README.

## Usage

### LM Studio

You can use this model in [LM Studio](https://lmstudio.ai/), an easy-to-use interface to locally run optimized models. Simply search for `BramVanroy/GEITje-7B-ultra-GGUF`, and download the available file.

### Ollama

The model is available on `ollama` and can be easily run as follows:

```shell
ollama run bramvanroy/geitje-7b-ultra-gguf
```
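Once the model is running under Ollama, you can also query it programmatically through Ollama's local REST API rather than the interactive prompt. A minimal sketch using only the Python standard library (it assumes an Ollama server is listening on the default port `11434`):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint


def build_payload(prompt: str, model: str = "bramvanroy/geitje-7b-ultra-gguf") -> dict:
    """Build a request body for Ollama's /api/generate endpoint."""
    # stream=False asks the server for a single JSON response instead of chunks
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str) -> str:
    """Send a prompt to the locally running Ollama server and return the generated text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    print(generate("Schrijf een kort gedicht over een geitje."))
```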

To reproduce this, i.e. to create the ollama files manually instead of downloading them via ollama, follow the steps below.

First download the [GGUF file](https://huggingface.co/BramVanroy/GEITje-7B-ultra-GGUF/resolve/main/GEITje-7B-ultra-Q5_K_M.gguf?download=true) and [Modelfile](https://huggingface.co/BramVanroy/GEITje-7B-ultra-GGUF/resolve/main/Modelfile?download=true) to your computer. You can adapt the Modelfile as you wish.
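The published Modelfile is the authoritative source, but as a rough illustration, an Ollama Modelfile for a local GGUF file generally looks like the sketch below. Note that the `TEMPLATE` and `PARAMETER` values shown here are placeholder assumptions for illustration only; copy the real values from the published Modelfile.

```
# Point Ollama at the local GGUF file
FROM ./GEITje-7B-ultra-Q5_K_M.gguf

# Prompt template -- illustrative only; use the template from the published Modelfile
TEMPLATE """<|user|>
{{ .Prompt }}</s>
<|assistant|>
"""

# Stop token -- illustrative only
PARAMETER stop "</s>"
```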

Then, create the ollama model and run it.

```shell
ollama create geitje-7b-ultra-gguf -f ./Modelfile
ollama run geitje-7b-ultra-gguf
```

## Reproduce this GGUF version from the non-quantized model

The following assumes you have cloned and built llama.cpp, and that your current working directory is the root of the llama.cpp repository, with the compiled binaries in its `build` subdirectory.

Download the initial model (a `huggingface-cli` alternative probably exists, too):

```python
from huggingface_hub import snapshot_download
model_id = "BramVanroy/GEITje-7B-ultra"
# Download into llama.cpp's build directory so the paths in the next step line up
snapshot_download(repo_id=model_id, local_dir="build/geitje-ultra-hf", local_dir_use_symlinks=False)
```
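The parenthetical above guesses at a CLI alternative; recent versions of `huggingface_hub` do ship one. Assuming `huggingface_hub` is installed, the equivalent download is:

```shell
# Same download as the Python snippet, via the huggingface-cli tool
huggingface-cli download BramVanroy/GEITje-7B-ultra --local-dir build/geitje-ultra-hf
```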

Convert the model to GGUF format and quantize it to `Q5_K_M`:

```shell
# Convert the HF checkpoint to an f32 GGUF file (run from the llama.cpp root)
python convert.py build/geitje-ultra-hf/

cd build

# Quantize the f32 GGUF file to Q5_K_M
bin/quantize geitje-ultra-hf/ggml-model-f32.gguf geitje-ultra-hf/GEITje-7B-ultra-Q5_K_M.gguf Q5_K_M
```