Volko76
/

Llama3.2-1B-Instruct-French-GGUF

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

Uploaded model

Developed by: Volko76
License: apache-2.0
Finetuned from model : unsloth/Llama-3.2-1B-Instruct-bnb-4bit

This llama model was trained 2x faster with Unsloth and Huggingface's TRL library.

Downloads last month: 56

GGUF

Model size

1.24B params

Architecture

llama

4-bit

8-bit

Inference API

Unable to determine this model’s pipeline type. Check the docs .

Model tree for Volko76/Llama3.2-1B-Instruct-French-GGUF

Base model

meta-llama/Llama-3.2-1B-Instruct

Quantized

unsloth/Llama-3.2-1B-Instruct-bnb-4bit

Quantized

(51)

this model

Collection including Volko76/Llama3.2-1B-Instruct-French-GGUF

GGUF Quantizations

A CPU + GPU support type of quantization. It's currently the most used quantization method. Read more here : https://github.com/ggerganov/llama.cpp • 17 items • Updated 24 days ago