**Description**
GGUF format model files for [this project](https://huggingface.co/bogdan1/llama2-bg).

From [@bogdan1](https://huggingface.co/bogdan1): Llama-2-7b-base fine-tuned on the Chitanka dataset and a dataset of scraped news comments, dating mostly from 2022/2023. Big thank you :)


**About GGUF**

**Introduction:**

GGUF was introduced by the llama.cpp team on August 21st, 2023, as a replacement for GGML, which is no longer supported.
GGUF is the successor file format to GGML, GGMF, and GGJT. It is designed to provide a comprehensive solution for model loading:
it represents data unambiguously and is extensible, so future enhancements no longer require breaking changes. It also adds support
for various non-llama models such as Falcon, RWKV, and BLOOM, and simplifies configuration by automating prompt format settings.
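The unambiguous representation comes from a fixed, versioned header at the start of every file. A minimal sketch of reading it (not part of this repository; field layout as documented in the llama.cpp GGUF specification: magic bytes `GGUF`, then a uint32 version and uint64 tensor and metadata counts, all little-endian):

```python
import struct

def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed GGUF header fields from raw bytes.

    Layout (little-endian, per the GGUF spec, v2+):
    4-byte magic b"GGUF", uint32 version, uint64 tensor_count,
    uint64 metadata_kv_count.
    """
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}

# Example with a synthetic header (the counts here are made up):
header = struct.pack("<4sIQQ", b"GGUF", 3, 291, 24)
print(read_gguf_header(header))
# → {'version': 3, 'tensors': 291, 'metadata_kv': 24}
```

In practice the clients listed below read this header for you; the sketch only illustrates why tools can reject or auto-configure a model from the first few bytes of the file.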

**Key Features:**

1. **No More Breaking Changes:** GGUF is engineered to prevent compatibility issues with older models, ensuring a seamless transition from previous file formats
   like GGML, GGMF, and GGJT.

2. **Support for Non-Llama Models:** GGUF extends its compatibility to a wide range of models beyond llamas, including Falcon, RWKV, BLOOM, and more.

3. **Streamlined Configuration:** Say goodbye to complex settings like rope-freq-base, rope-freq-scale, gqa, and rms-norm-eps. GGUF simplifies the
   configuration process, making it more user-friendly.

4. **Automatic Prompt Format:** GGUF introduces the ability to automatically set prompt formats, reducing the need for manual adjustments.

5. **Extensibility:** GGUF is designed to accommodate future updates and enhancements, ensuring long-term compatibility and adaptability.

6. **Enhanced Tokenization:** GGUF features improved tokenization code, including support for special tokens, which enhances overall performance,
   especially for models using new special tokens and custom prompt templates.

**Supported Clients and Libraries:**

GGUF is supported by a variety of clients and libraries, making it accessible and versatile for different use cases:

1. [**llama.cpp**](https://github.com/ggerganov/llama.cpp)
2. [**text-generation-webui**](https://github.com/oobabooga/text-generation-webui)
3. [**KoboldCpp**](https://github.com/LostRuins/koboldcpp)
4. [**LM Studio**](https://lmstudio.ai/)
5. [**LoLLMS Web UI**](https://github.com/ParisNeo/lollms-webui)
6. [**ctransformers**](https://github.com/marella/ctransformers)
7. [**llama-cpp-python**](https://github.com/abetlen/llama-cpp-python)
8. [**candle**](https://github.com/huggingface/candle)