
Felladrin/TinyMistral-248M-SFT-v3-GGUF

Quantized GGUF model files for TinyMistral-248M-SFT-v3 from Felladrin

Original Model Card:

Locutusque's TinyMistral-248M trained on OpenAssistant TOP-1 Conversation Threads

Recommended Prompt Format

<|im_start|>user
{message}<|im_end|>
<|im_start|>assistant
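
A minimal Python helper that produces this prompt format (the function name is illustrative, not part of the card):

```python
def chatml_prompt(message: str) -> str:
    """Wrap a user message in the recommended prompt format, leaving
    the assistant turn open for the model to complete."""
    return (
        "<|im_start|>user\n"
        f"{message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = chatml_prompt("What is GGUF?")
```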

How it was trained

%pip install autotrain-advanced

!autotrain setup

!autotrain llm \
    --train \
    --trainer "sft" \
    --model './TinyMistral-248M/' \
    --model_max_length 4096 \
    --block-size 1024 \
    --project-name 'trained-model' \
    --data-path "OpenAssistant/oasst_top1_2023-08-25" \
    --train_split "train" \
    --valid_split "test" \
    --text-column "text" \
    --lr 1e-5 \
    --train_batch_size 2 \
    --epochs 5 \
    --evaluation_strategy "steps" \
    --save-strategy "steps" \
    --save-total-limit 2 \
    --warmup-ratio 0.05 \
    --weight-decay 0.0 \
    --gradient-accumulation 8 \
    --logging-steps 10 \
    --scheduler "constant"
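
For reference, the effective batch size implied by the flags above follows from `train_batch_size` and `gradient_accumulation` (values taken directly from the command, nothing else assumed):

```python
# Values from the autotrain command above.
train_batch_size = 2
gradient_accumulation = 8

# One optimizer step accumulates gradients over 8 micro-batches of
# 2 sequences each, so weights are updated once per 16 sequences.
effective_batch_size = train_batch_size * gradient_accumulation
print(effective_batch_size)  # → 16
```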
Model Details

Format: GGUF
Model size: 248M params
Architecture: llama
Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
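
A quantized file can be loaded with llama-cpp-python, for example. The sketch below is an assumption, not part of the card: the model file name is hypothetical, so substitute the GGUF file you actually downloaded.

```python
# ChatML-style template recommended by the card.
PROMPT_TEMPLATE = "<|im_start|>user\n{message}<|im_end|>\n<|im_start|>assistant\n"

def generate(message: str,
             model_path: str = "tinymistral-248m-sft-v3.q4_k_m.gguf") -> str:
    """Run one generation via llama-cpp-python (pip install llama-cpp-python).
    The default model_path is an assumed file name; point it at your download."""
    from llama_cpp import Llama
    llm = Llama(model_path=model_path)
    out = llm(PROMPT_TEMPLATE.format(message=message),
              max_tokens=128,
              stop=["<|im_end|>"])  # stop at the end of the assistant turn
    return out["choices"][0]["text"]
```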

Inference API (serverless) has been turned off for this model.

Model tree for afrideva/TinyMistral-248M-SFT-v3-GGUF

Quantized from Felladrin/TinyMistral-248M-SFT-v3.

Dataset used to train: OpenAssistant/oasst_top1_2023-08-25