DESCRIPTION
This quantized model does not represent the intended quality of the original model.
To perform this quantization, we started from llama.cpp and modified convert_hf_to_gguf_update.py to add support for this model, basing our changes on PR https://github.com/ggerganov/llama.cpp/pull/11716.
Note: as of this writing, upstream llama.cpp does not yet support this model or its chat template; see https://github.com/ggerganov/llama.cpp/pull/11716.
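For illustration, here is a minimal sketch of the kind of entry such a change adds to the `models` list in convert_hf_to_gguf_update.py. The entry shown here is hypothetical: the tokenizer type (BPE) is an assumption, and the exact name and fields used in the PR linked above may differ.

```python
from enum import IntEnum, auto

# Mirrors the small enum defined in llama.cpp's convert_hf_to_gguf_update.py,
# included here only so the sketch is self-contained.
class TOKENIZER_TYPE(IntEnum):
    SPM = auto()
    BPE = auto()
    WPM = auto()
    UGM = auto()

# Hypothetical entry appended to the script's `models` list so that it
# downloads the Velvet tokenizer and registers its pre-tokenizer hash.
# The tokenizer type is an assumption; check the PR above for the actual change.
models = [
    # ... existing entries ...
    {"name": "velvet", "tokt": TOKENIZER_TYPE.BPE, "repo": "https://huggingface.co/Almawave/Velvet-2B"},
]
```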
PROMPT FORMAT
Basic prompt format:

```
<s><instruction>{prompt}</instruction>
```

Prompt format with system message:

```
<s><instruction>{system_prompt}
{prompt}</instruction>
```
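As a quick sanity check, here is a small helper that assembles prompt strings matching the templates above. `build_prompt` is a hypothetical helper written for this card, not an API from llama.cpp or the original model.

```python
# Minimal sketch: build Velvet-2B prompt strings following the
# templates shown above.
from typing import Optional

def build_prompt(prompt: str, system_prompt: Optional[str] = None) -> str:
    if system_prompt is not None:
        return f"<s><instruction>{system_prompt}\n{prompt}</instruction>"
    return f"<s><instruction>{prompt}</instruction>"

print(build_prompt("What is the capital of Italy?"))
# -> <s><instruction>What is the capital of Italy?</instruction>
```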
DOWNLOAD
| Quant | Link |
|---|---|
| BF16 | Velvet-2B-BF16 |
| F16 | Velvet-2B-F16.gguf |
| Q4_K_M | Velvet-2B-Q4_K_M |
| Q4_K_S | Velvet-2B-Q4_K_S |
| Q5_K_M | Velvet-2B-Q5_K_M |
| Q6_K | Velvet-2B-Q6_K.gguf |
| Q8_0 | Velvet-2B-Q8_0.gguf |
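To fetch a single quant programmatically, something like the following sketch should work with huggingface_hub. The repo id comes from this page; the filename is assumed to follow the Velvet-2B-&lt;quant&gt;.gguf pattern shown in the table and should be verified against the repo's file listing.

```python
# Sketch: download one quantized file from this repo with huggingface_hub.
# The filename is an assumption; verify it against the repo's files.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="SistInf/Velvet-2B-GGUF",
    filename="Velvet-2B-Q4_K_M.gguf",
)
print(f"Downloaded to {path}")
```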
Original Model: https://huggingface.co/Almawave/Velvet-2B
LICENSE
Velvet-2B is made available under the Apache 2.0 license.