
The model accepts prompts in the ChatML and Alpaca formats.
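As a rough illustration of the two prompt formats, the sketch below builds each one as a plain string. The card does not state the exact templates used in training, so the ChatML markers and the Alpaca preamble here are the commonly used conventional forms and should be treated as assumptions. The helper names `chatml_prompt` and `alpaca_prompt` are hypothetical.

```python
from typing import Optional

# Conventional ChatML markers. Whether this model expects exactly these
# tokens is an assumption; the card only names the format.
IM_START, IM_END = "<|im_start|>", "<|im_end|>"

# Conventional Alpaca preamble (no-input variant); also an assumption here.
ALPACA_PREAMBLE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)


def chatml_prompt(user: str, system: Optional[str] = None) -> str:
    """Build a ChatML prompt that ends with an open assistant turn."""
    parts = []
    if system:
        parts.append(f"{IM_START}system\n{system}{IM_END}\n")
    parts.append(f"{IM_START}user\n{user}{IM_END}\n")
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


def alpaca_prompt(instruction: str) -> str:
    """Build an Alpaca-style prompt that ends at the response header."""
    return (
        f"{ALPACA_PREAMBLE}\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n"
    )


if __name__ == "__main__":
    print(chatml_prompt("What is 12 * 7?", system="You are a friendly assistant named FastLlama."))
    print(alpaca_prompt("What is 12 * 7?"))
```

Either string can be passed as a raw text prompt. When using chat-style message lists instead, the tokenizer's own chat template decides the final formatting.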

You can chat with the model via this space.

Overview:

FastLlama is a fine-tuned version of the Llama-3.2-1B-Instruct model, aimed at resource-constrained environments where speed and a small footprint matter. It was fine-tuned on the MetaMathQA-50k subset of the HuggingFaceTB/smoltalk dataset to improve its mathematical reasoning and problem-solving abilities.

Features:

Lightweight and Fast: Built on a 1B-class base model to deliver Llama-class capabilities with low computational overhead.

Fine-Tuned for Math Reasoning: Trained on MetaMathQA-50k to better handle complex mathematical problems and logical reasoning tasks.

Instruction-Tuned: Built on an instruction-tuned base model, making it robust at understanding and executing detailed queries.

Versatile Use Cases: Suitable for educational tools, tutoring systems, or any application requiring mathematical reasoning.

Performance Highlights:

Smaller Footprint: The model delivers comparable results to larger counterparts while operating efficiently on smaller hardware.

Enhanced Accuracy: Demonstrates improved performance on mathematical QA benchmarks.

Instruction Adherence: Retains high fidelity in understanding and following user instructions, even for complex queries.

Loading the Model:

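# Requires the transformers package; device_map="auto" also needs accelerate.
from transformers import pipeline

model_id = "suayptalha/FastLlama-3.2-1B-Instruct"

# Build a text-generation pipeline and place the model on the available device(s).
pipe = pipeline(
    "text-generation",
    model=model_id,
    device_map="auto",
)

# Chat-style input: a list of role/content messages.
messages = [
    {"role": "system", "content": "You are a friendly assistant named FastLlama."},
    {"role": "user", "content": "Who are you?"},
]

outputs = pipe(
    messages,
    max_new_tokens=256,
)

# generated_text holds the whole conversation; the last entry is the assistant's reply.
print(outputs[0]["generated_text"][-1])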
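The final `print` above shows the last message as a role/content dictionary. A minimal sketch of pulling out only the reply text is given below. The helper name `last_reply_text` is hypothetical, and the sample data is a hand-written mock shaped like the pipeline's chat output, not real model output.

```python
from typing import Any, Dict, List


def last_reply_text(outputs: List[Dict[str, Any]]) -> str:
    """Return the content of the final message in the first generated conversation."""
    conversation = outputs[0]["generated_text"]  # list of {"role", "content"} dicts
    return conversation[-1]["content"]


# Hand-written mock with the same shape as the pipeline result for chat input.
# The assistant text is a placeholder, not something the model produced.
mock_outputs = [
    {
        "generated_text": [
            {"role": "system", "content": "You are a friendly assistant named FastLlama."},
            {"role": "user", "content": "Who are you?"},
            {"role": "assistant", "content": "<assistant reply goes here>"},
        ]
    }
]

print(last_reply_text(mock_outputs))  # prints: <assistant reply goes here>
```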

Dataset:

Dataset: MetaMathQA-50k

The MetaMathQA-50k subset of HuggingFaceTB/smoltalk was selected for fine-tuning due to its focus on mathematical reasoning, multi-step problem-solving, and logical inference. The dataset includes:

Algebraic problems

Geometric reasoning tasks

Statistical and probabilistic questions

Logical deduction problems
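For readers who want to inspect the training data, the sketch below shows one way to load the subset with the `datasets` library and read the first question of a record. The config name `"metamathqa-50k"` and the `messages` record layout are assumptions based on how the dataset is usually published, so check the dataset page before relying on them. Both helper names are hypothetical.

```python
from typing import Dict, List


def load_metamathqa_subset(split: str = "train"):
    """Download the subset. Needs `pip install datasets` and network access."""
    # Imported lazily so the helper below stays usable without the library.
    from datasets import load_dataset

    # Assumption: the subset is exposed as the "metamathqa-50k" config.
    return load_dataset("HuggingFaceTB/smoltalk", "metamathqa-50k", split=split)


def first_question(record: Dict[str, List[Dict[str, str]]]) -> str:
    """Return the first user turn of a record.

    Assumes records look like {"messages": [{"role": ..., "content": ...}, ...]}.
    """
    for message in record["messages"]:
        if message["role"] == "user":
            return message["content"]
    raise ValueError("record has no user turn")


# Hand-written example record (not taken from the dataset).
example = {
    "messages": [
        {"role": "user", "content": "Solve x + 2 = 5."},
        {"role": "assistant", "content": "x = 3"},
    ]
}
print(first_question(example))  # prints: Solve x + 2 = 5.
```

With the library installed, `first_question(load_metamathqa_subset()[0])` would show a real sample, provided the assumed config name and layout hold.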

Model Fine-Tuning:

Fine-tuning was conducted using the following configuration:

Learning Rate: 2e-4

Epochs: 1

Optimizer: AdamW

Framework: Unsloth
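The configuration above can be written down as a small, framework-agnostic settings object. This is only a sketch: the learning rate, epoch count, and optimizer family come from the card, while the exact AdamW variant, the batch size, the gradient accumulation steps, and the single-device setup are assumptions added for illustration. The keys mirror Hugging Face `TrainingArguments` parameter names. The class and function names are hypothetical.

```python
import math
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    learning_rate: float = 2e-4            # from the model card
    num_train_epochs: int = 1              # from the model card
    optim: str = "adamw_torch"             # card says AdamW; exact variant is an assumption
    per_device_train_batch_size: int = 2   # assumption, not stated in the card
    gradient_accumulation_steps: int = 4   # assumption, not stated in the card

    def trainer_kwargs(self) -> dict:
        """Keyword arguments named after TrainingArguments parameters."""
        return asdict(self)


def optimizer_steps(num_examples: int, cfg: FineTuneConfig) -> int:
    """Estimate optimizer updates for a single-device run (an assumption)."""
    effective_batch = cfg.per_device_train_batch_size * cfg.gradient_accumulation_steps
    return math.ceil(num_examples / effective_batch) * cfg.num_train_epochs


cfg = FineTuneConfig()
print(cfg.trainer_kwargs())
# Treating the subset as about 50,000 examples, per its name:
print(optimizer_steps(50_000, cfg))  # prints: 6250
```

The step count is an estimate only. The trainer's real number depends on the true dataset size, the device count, and how partial batches are handled.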

License:

This model is licensed under the Apache 2.0 License. See the LICENSE file for details.


Model size: 1.24B params (Safetensors), tensor type F32