---
license: cc-by-nc-4.0
language:
- nl
tags:
- gguf
- llamacpp
- dpo
- geitje
- conversational
datasets:
- BramVanroy/ultra_feedback_dutch
---
# GEITje 7B ultra (GGUF version)
A conversational model for Dutch, aligned through AI feedback.

This is a Q5_K_M GGUF version of BramVanroy/GEITje-7B-ultra, a powerful Dutch chatbot. It is ultimately a Mistral-based model, further pretrained on Dutch and subsequently treated with supervised finetuning and DPO alignment. For more information on the model, data, licensing, and usage, see the main model's README.
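As a rough sanity check on download size: Q5_K_M stores weights at about 5.7 bits each on average (an approximation; the exact figure depends on the tensor mix and metadata), so for a ~7.24B-parameter Mistral-style model the expected file size can be estimated as follows:

```python
# Back-of-the-envelope size estimate for a Q5_K_M quantized 7B model.
# Assumptions: ~7.24e9 parameters (Mistral 7B) and ~5.7 bits per weight
# on average for Q5_K_M (approximate; varies per tensor).
n_params = 7.24e9
bits_per_weight = 5.7

size_gb = n_params * bits_per_weight / 8 / 1e9
print(f"Estimated GGUF size: {size_gb:.1f} GB")  # → Estimated GGUF size: 5.2 GB
```

The actual file will deviate slightly from this estimate, but it tells you roughly how much disk space and RAM to budget.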
## Usage
### LM Studio
You can use this model in LM Studio, an easy-to-use interface for running optimized models locally. Simply search for `BramVanroy/GEITje-7B-ultra-GGUF` and download the available file.
### Ollama
The model is available on ollama and can be run as follows:

```shell
ollama run bramvanroy/geitje-7b-ultra-gguf
```
To reproduce, i.e. to create the ollama files manually instead of downloading them via ollama, follow these steps.

First download the GGUF file and Modelfile to your computer. You can adapt the Modelfile as you wish. Then create the ollama model and run it:

```shell
ollama create geitje-7b-ultra-gguf -f ./Modelfile
ollama run geitje-7b-ultra-gguf
```
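If you prefer to write the Modelfile from scratch instead of downloading it, a minimal sketch could look like the following. The `TEMPLATE` here assumes the zephyr-style chat format used by the GEITje-7B-ultra recipe; check the published Modelfile for the exact template and parameters before relying on it.

```
FROM ./GEITje-7B-ultra-Q5_K_M.gguf

# Zephyr-style chat template (assumption: matches the base model's format)
TEMPLATE """<|system|>
{{ .System }}</s>
<|user|>
{{ .Prompt }}</s>
<|assistant|>
"""
PARAMETER stop </s>
```

The `FROM` path must point at the downloaded GGUF file relative to where you run `ollama create`.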
## Reproduce this GGUF version from the non-quantized model
The following assumes that you have installed and built llama.cpp, and that your current working directory is the `build` directory inside llama.cpp.

First, download the initial model (a `huggingface-cli` alternative probably exists, too):
```python
from huggingface_hub import snapshot_download

model_id = "BramVanroy/GEITje-7B-ultra"
snapshot_download(repo_id=model_id, local_dir="geitje-ultra-hf", local_dir_use_symlinks=False)
```
Then convert to GGUF format and quantize:

```shell
# Convert to GGUF format (run from the build directory, hence ../convert.py)
python ../convert.py geitje-ultra-hf/
# Quantize to Q5_K_M
bin/quantize geitje-ultra-hf/ggml-model-f32.gguf geitje-ultra-hf/GEITje-7B-ultra-Q5_K_M.gguf Q5_K_M
```
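To sanity-check the quantized file, you can run a quick prompt through llama.cpp's CLI. Note that binary names vary between llama.cpp versions (older builds use `main`/`quantize`, newer ones `llama-cli`/`llama-quantize`); this sketch assumes the older naming used above.

```shell
# Quick smoke test of the quantized model (run from the build directory)
bin/main -m geitje-ultra-hf/GEITje-7B-ultra-Q5_K_M.gguf \
    -p "Wat is de hoofdstad van Nederland?" -n 64
```

If the model loads and produces coherent Dutch output, the conversion and quantization succeeded.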