---
language:
- en
tags:
- falcon3
---

# Table of Contents

0. [TL;DR](#TL;DR)
1. [Model Details](#model-details)
2. [Usage](#usage)
3. [Benchmarks](#benchmarks)
4. [Citation](#citation)

# TL;DR

The Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. The models achieve state-of-the-art results on reasoning, language understanding, instruction following, code, and mathematics tasks, and support context lengths of up to 32K tokens.

This repository contains Falcon3-7B-Instruct, the best instruct LLM under 8B parameters at the time of release.

# Model Details

## Model Description

- **Developed by:** [https://www.tii.ae](https://www.tii.ae)
- **Model type:** Causal decoder-only
- **Architecture:** Transformer-based
- **Language(s) (NLP):** Mainly English
- **License:** TII Falcon-LLM License 2.0

## Model Architecture

Falcon3 uses grouped query attention (GQA) for faster inference and a wider head dimension of 256. A high RoPE (rotary position embedding) value is used to support long-context understanding.

# Usage

Below is an example of how to use the model with `transformers` (make sure you have the latest `transformers` release, or a build from source):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tiiuae/Falcon3-7B-Instruct"

# Load the model and tokenizer; dtype and device placement are chosen automatically.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "How many hours in one day?"
messages = [
    {"role": "system", "content": "You are a helpful friendly assistant Falcon3 from TII, try to follow instructions as much as possible."},
    {"role": "user", "content": prompt}
]

# Render the conversation with the model's chat template and append the assistant turn marker.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024
)
# Drop the prompt tokens from each sequence so only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
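
The slicing step in the example above relies on `generate` returning the prompt tokens followed by the new tokens for decoder-only models. As a minimal sketch of that step in isolation, the snippet below applies the same slicing to plain Python lists. The helper name `strip_prompt_tokens` and the token IDs are hypothetical and chosen only for illustration; they are not part of `transformers` or the Falcon3 vocabulary.

```python
from typing import List


def strip_prompt_tokens(
    input_ids: List[List[int]], output_ids: List[List[int]]
) -> List[List[int]]:
    """Drop the leading prompt tokens from each generated sequence."""
    return [out[len(inp):] for inp, out in zip(input_ids, output_ids)]


# Toy token IDs (hypothetical values, not real Falcon3 vocabulary entries).
prompt_batch = [[11, 12, 13]]
generated_batch = [[11, 12, 13, 21, 22]]  # prompt followed by two new tokens

new_tokens = strip_prompt_tokens(prompt_batch, generated_batch)
print(new_tokens)  # [[21, 22]]
```

Without this step, decoding the full output would repeat the rendered chat prompt in front of the model's answer.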

# Benchmarks

The following table reports results from our internal benchmarking pipeline:


| Category | Benchmark | Llama-3.1-8B-Instruct | Qwen2-7B-Instruct | Qwen2.5-7B-Instruct | Falcon3-7B-Instruct |
|---|---|---|---|---|---|
| General | MMLU (5-shot) | - | - | - | - |
| | MMLU-PRO (5-shot) | - | - | - | - |
| | IFEval | - | - | - | - |
| Math | GSM8K (5-shot) | - | - | - | - |
| | MATH (4-shot) | - | - | - | - |
| Reasoning | Arc Challenge (25-shot) | - | - | - | - |
| | GPQA (0-shot) | - | - | - | - |
| | MUSR (0-shot) | - | - | - | - |
| | BBH (3-shot) | - | - | - | - |
| CommonSense Understanding | PIQA (0-shot) | - | - | - | - |
| | SciQ (0-shot) | - | - | - | - |
| | Winogrande (0-shot) | - | - | - | - |
| | OpenbookQA (0-shot) | - | - | - | - |

# Citation

If the Falcon3 family of models was helpful to your work, feel free to cite us.

```
@misc{Falcon3,
    title = {Falcon 3 family of Open Foundation Models},
    author = {TII Team},
    month = {December},
    year = {2024}
}
```