metadata

license: cc-by-nc-4.0
language:
  - nl
tags:
  - gguf
  - llamacpp
  - dpo
  - geitje
  - conversational
datasets:
  - BramVanroy/ultra_feedback_dutch

GEITje 7B ultra (GGUF version)

A conversational model for Dutch, aligned through AI feedback.

This is a Q5_K_M GGUF version of BramVanroy/GEITje-7B-ultra, a powerful Dutch chatbot. It is ultimately a Mistral-based model, further pretrained on Dutch and additionally treated with supervised fine-tuning and DPO alignment. For more information on the model, data, licensing, and usage, see the main model's README.

Usage

LM Studio

You can use this model in LM Studio, an easy-to-use interface for running optimized models locally. Simply search for BramVanroy/GEITje-7B-ultra-GGUF and download the available file.

Ollama

The model is available on ollama and can be easily run as follows:

ollama run bramvanroy/geitje-7b-ultra-gguf

To reproduce this setup, i.e. to create the ollama files manually instead of downloading them via ollama, follow the steps below.

First download the GGUF file and Modelfile to your computer. You can adapt the Modelfile as you wish.
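One way to fetch just these two files is with huggingface_hub. This is a minimal sketch, not the author's procedure: it assumes the files live in the BramVanroy/GEITje-7B-ultra-GGUF repository named in the LM Studio section, that the Modelfile is stored under the name `Modelfile`, and the helper name `download_gguf_and_modelfile` is made up for illustration.

```python
# Assumption: the GGUF repository named in the LM Studio section.
GGUF_REPO_ID = "BramVanroy/GEITje-7B-ultra-GGUF"
# Assumption: the weights end in .gguf and the Modelfile is named "Modelfile".
ALLOW_PATTERNS = ["*.gguf", "Modelfile"]


def download_gguf_and_modelfile(local_dir="."):
    """Download only the GGUF weights and the Modelfile into local_dir."""
    # Imported lazily so the constants above can be inspected
    # without huggingface_hub being installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=GGUF_REPO_ID,
        allow_patterns=ALLOW_PATTERNS,  # skip README and other files
        local_dir=local_dir,
    )
```

Calling `download_gguf_and_modelfile(".")` places the files in the current directory, which is where the `ollama create` command expects `./Modelfile`.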

Then, create the ollama model and run it.

ollama create geitje-7b-ultra-gguf -f ./Modelfile
ollama run geitje-7b-ultra-gguf
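Once the model is running, it can also be queried from a script through Ollama's local REST API. The sketch below uses only the Python standard library. It assumes the Ollama server is listening on its default address (`localhost:11434`) and that the model was pulled under the name used in `ollama run` above; if you built it manually, pass `geitje-7b-ultra-gguf` as the model name instead. The function names are illustrative.

```python
import json
import urllib.request

# Assumption: a local Ollama server on its default host and port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: the model name used with `ollama run` above.
MODEL_NAME = "bramvanroy/geitje-7b-ultra-gguf"


def build_request(prompt, model=MODEL_NAME):
    """Build a non-streaming generation request for the Ollama REST API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model=MODEL_NAME):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False the server answers with a single JSON object
    # whose "response" field holds the full completion.
    return body["response"]
```

For example, `generate("Wat is de hoofdstad van Nederland?")` returns the model's answer as a string, provided the server is running.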

Reproduce this GGUF version from the non-quantized model

This assumes that you have installed and built llama.cpp and that the current working directory is the build directory inside the llama.cpp repository.

Download the initial model (a huggingface-cli alternative probably exists, too):

from huggingface_hub import snapshot_download

# Download all files of the non-quantized model into ./geitje-ultra-hf
model_id = "BramVanroy/GEITje-7B-ultra"
snapshot_download(repo_id=model_id, local_dir="geitje-ultra-hf", local_dir_use_symlinks=False)

Convert to GGUF format, then quantize

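# Convert the downloaded checkpoint to an unquantized GGUF file.
# Run this from the llama.cpp repository root, where convert.py lives.
python convert.py build/geitje-ultra-hf/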

cd build

# Quantize to Q5_K_M
bin/quantize geitje-ultra-hf/ggml-model-f32.gguf geitje-ultra-hf/GEITje-7B-ultra-Q5_K_M.gguf Q5_K_M
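As a quick sanity check after quantization, you can confirm that the output really is a GGUF file by reading its header. The sketch below relies on the GGUF layout, in which a file starts with the four magic bytes `GGUF` followed by a 32-bit version number. It assumes the usual little-endian encoding, and the function name `read_gguf_version` is made up for illustration. It only inspects the first eight bytes and does not validate the tensors.

```python
import struct
from pathlib import Path


def read_gguf_version(path):
    """Return the GGUF format version stored in the file header.

    Raises ValueError if the file does not start with the GGUF magic bytes.
    """
    with Path(path).open("rb") as handle:
        header = handle.read(8)
    if len(header) < 8 or header[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")
    # Assumption: little-endian file, so the version is an unsigned
    # 32-bit little-endian integer right after the magic bytes.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

From the build directory, `read_gguf_version("geitje-ultra-hf/GEITje-7B-ultra-Q5_K_M.gguf")` should return an integer version rather than raise an error.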