GGUF Quantizations
Collection
A quantization format that supports both CPU and GPU inference. It is currently the most widely used quantization method. Read more here: https://github.com/ggerganov/llama.cpp
17 items
This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
Base model
meta-llama/Llama-3.2-1B-Instruct