---
language:
- en
tags:
- falcon3
---
# Table of Contents
0. [TL;DR](#tldr)
1. [Model Details](#model-details)
2. [Usage](#usage)
3. [Training Details](#training-details)
4. [Evaluation](#evaluation)
# TL;DR
The Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters.
The models achieve state-of-the-art results on reasoning, language understanding, instruction following, code, and mathematics tasks.
They support a context length of up to 32K tokens.
This repository contains Falcon3-7B-Instruct, the best instruct LLM under 8B parameters at the time of release.
# Model Details
## Model Description
- **Developed by:** [https://www.tii.ae](https://www.tii.ae)
- **Model type:** Causal decoder-only
- **Architecture:** Transformer-based
- **Language(s) (NLP):** Mainly English
- **License:** TII Falcon-LLM License 2.0
## Model Architecture
Falcon3 uses grouped-query attention (GQA) for faster inference and a wider head dimension of 256.
A high RoPE (rotary position embedding) base value is used to support long-context understanding.
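
One reason GQA helps at inference time is that fewer key/value heads means a smaller KV cache to store and read for each generated token. The sketch below estimates that cache size for a single 32K-token sequence. Only the head dimension (256) and the 32K context come from this card; the layer count, the head counts, and the helper name `kv_cache_bytes` are assumptions chosen for illustration and are not taken from the model configuration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Estimate the bytes needed to cache keys and values for one sequence."""
    # Factor 2: each layer stores one key tensor and one value tensor.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative shapes. Only HEAD_DIM and SEQ_LEN come from this model card.
NUM_LAYERS = 28        # assumed
NUM_QUERY_HEADS = 12   # assumed
NUM_KV_HEADS = 4       # assumed; GQA shares each KV head across several query heads
HEAD_DIM = 256
SEQ_LEN = 32 * 1024

# Standard multi-head attention would keep one KV head per query head.
mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, SEQ_LEN)

print(f"MHA cache: {mha_bytes / 2**30:.2f} GiB")
print(f"GQA cache: {gqa_bytes / 2**30:.2f} GiB")
print(f"Reduction: {mha_bytes / gqa_bytes:.1f}x")
```

Under these assumed shapes the cache shrinks by the ratio of query heads to KV heads. The actual values can be read from the model's `config.json`.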
# Usage
The example below shows how to use the model with `transformers`. Make sure you have the latest release of `transformers` installed, or a build from source:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tiiuae/Falcon3-7B-Instruct"

# Load the model; dtype and device placement are selected automatically.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "How many hours in one day?"
messages = [
    {"role": "system", "content": "You are a helpful friendly assistant Falcon3 from TII, try to follow instructions as much as possible."},
    {"role": "user", "content": prompt}
]

# Render the conversation into a single prompt string using the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024
)
# Drop the prompt tokens so that only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
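
The prompt-trimming step in the example above can be seen in isolation with plain Python lists. This is a minimal sketch: the token IDs are made up for illustration, and in the real example both sides are tensors produced by the tokenizer and by `model.generate`.

```python
# Made-up token IDs for illustration; not real tokenizer output.
prompt_ids = [[101, 7, 8, 9]]               # one tokenized prompt
output_ids = [[101, 7, 8, 9, 42, 43, 2]]    # prompt followed by newly generated tokens

# For each sequence, keep only the tokens that come after the prompt.
new_token_ids = [
    out[len(inp):] for inp, out in zip(prompt_ids, output_ids)
]
print(new_token_ids)
```

Without this step, decoding the output would repeat the full prompt, including the system message, before the model's answer.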
# Evaluation
| Category | Benchmark | Llama-3.1-8B-Instruct | Qwen2-7B-Instruct | Qwen2.5-7B-Instruct | Falcon3-7B-Instruct |
|---|---|---|---|---|---|
| General | MMLU (5-shot) | - | - | - | - |
| | MMLU-PRO (5-shot) | - | - | - | - |
| | IFEval | - | - | - | - |
| Math | GSM8K (5-shot) | - | - | - | - |
| | MATH (4-shot) | - | - | - | - |
| Reasoning | Arc Challenge (25-shot) | - | - | - | - |
| | GPQA (0-shot) | - | - | - | - |
| | MUSR (0-shot) | - | - | - | - |
| | BBH (3-shot) | - | - | - | - |
| CommonSense Understanding | PIQA (0-shot) | - | - | - | - |
| | SciQ (0-shot) | - | - | - | - |
| | Winogrande (0-shot) | - | - | - | - |
| | OpenbookQA (0-shot) | - | - | - | - |