QuantPanda/LongWriter-glm4-9B-GGUF

LongWriter-glm4-9b

Original model link: https://huggingface.co/THUDM/LongWriter-glm4-9b

Model by: THUDM

Quants by: QuantPanda

GGUF quantization for llama.cpp and similar applications.

Example:

./llama-cli -m LongWriter-glm4-9B-Q5_K_M.gguf -p "You are a helpful AI assistant." --conversation

If the model takes too long to load, you can reduce the context size with --ctx-size.

Example with a smaller context size:

./llama-cli -m LongWriter-glm4-9B-Q5_K_M.gguf -p "You are a helpful AI assistant." --conversation --ctx-size 4096
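
The two invocations above differ only in the --ctx-size flag, so it can be convenient to generate the command line from a couple of variables when trying different context sizes. The following is a minimal sketch, not part of llama.cpp: `build_cmd`, `MODEL` and `CTX_SIZE` are hypothetical names introduced here for illustration, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with llama.cpp): prints a llama-cli
# command line in the same form as the examples above.
# MODEL and CTX_SIZE are illustrative variable names; override them
# from the environment to try a different file or context size.
MODEL="${MODEL:-LongWriter-glm4-9B-Q5_K_M.gguf}"
CTX_SIZE="${CTX_SIZE:-4096}"

# build_cmd MODEL_FILE CTX_SIZE
# Writes the full command to stdout; nothing is executed.
build_cmd() {
  printf './llama-cli -m %s -p "%s" --conversation --ctx-size %s\n' \
    "$1" "You are a helpful AI assistant." "$2"
}

build_cmd "$MODEL" "$CTX_SIZE"
```

Assuming the script is saved under a name of your choice, running it with, for example, `CTX_SIZE=2048` set in the environment prints the same command with `--ctx-size 2048`, which can then be copied into a terminal in the llama.cpp directory.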

Format: GGUF

Model size: 9.4B params

Architecture: chatglm

Quantizations: 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 32-bit