---
base_model: HoangHa/Pensez-v0.1-e5
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- gguf
license: apache-2.0
language:
- en
---
# Uploaded model
- **Developed by:** HoangHa
- **License:** apache-2.0
- **Converted to GGUF from:** [HoangHa/Pensez-v0.1-e5](https://huggingface.co/HoangHa/Pensez-v0.1-e5)
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
<div align="center">
# Pensez: Less Data, Better Reasoning – Rethinking French LLM
[**About**](#about) | [**How to Run Locally**](#run-locally) | [**Models and Datasets**](#models-and-datasets) | [**Benchmarks**](#benchmarks) | [**Training Details**](#training-details)

</div>
## About
Pensez is a bilingual (French-English) reasoning model designed to reason well from a significantly reduced training budget. It is trained on a curated dataset of everyday reasoning tasks and scientific questions to enhance performance.
Key strategies for improved reasoning:
- **Concise reasoning** for simple tasks to prevent overthinking.
- **Extended reasoning** for complex domains like mathematics, coding, and science.
- **Special tokens (`<think>...</think>`)** to explicitly guide the model’s reasoning process.
These optimizations yield stronger reasoning than models like [DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) while maintaining robust general understanding.
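Because the reasoning trace is delimited by explicit `<think>...</think>` tokens, it can be separated from the final answer after generation. A minimal sketch (this helper is illustrative, not part of the model release):

```python
import re

def split_reasoning(text: str):
    """Split a generation into its <think>...</think> trace and the final answer.
    Illustrative helper; assumes the model emits at most one think block."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return None, text.strip()
    return match.group(1).strip(), text[match.end():].strip()

reasoning, answer = split_reasoning("<think>2 + 2 = 4</think>La réponse est 4.")
print(answer)  # -> La réponse est 4.
```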
## Models and Datasets
### Model Versions
Pensez is built upon [Qwen 2.5 Instruct 7B](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) and trained for five epochs, with a checkpoint released after each epoch.
| Model | Backbone | Size | Download Link |
|---------------|----------------------------------------|------|---------------|
| Pensez-v0.1-e1 | Qwen2.5-7B-Instruct | 7B | [🤗 Pensez-v0.1-e1](https://huggingface.co/HoangHa/Pensez-v0.1-e1) |
| Pensez-v0.1-e2 | Qwen2.5-7B-Instruct | 7B | [🤗 Pensez-v0.1-e2](https://huggingface.co/HoangHa/Pensez-v0.1-e2) |
| Pensez-v0.1-e3 | Qwen2.5-7B-Instruct | 7B | [🤗 Pensez-v0.1-e3](https://huggingface.co/HoangHa/Pensez-v0.1-e3) |
| Pensez-v0.1-e4 | Qwen2.5-7B-Instruct | 7B | [🤗 Pensez-v0.1-e4](https://huggingface.co/HoangHa/Pensez-v0.1-e4) |
| Pensez-v0.1-e5 | Qwen2.5-7B-Instruct | 7B | [🤗 Pensez-v0.1-e5](https://huggingface.co/HoangHa/Pensez-v0.1-e5) |
### Dataset
Pensez was trained on the hand-curated [Pensez v0.1](https://huggingface.co/datasets/HoangHa/Pensez-v0.1) dataset containing 2,000 samples (1,000 French, 1,000 English).
| Dataset | Description | Size | Link |
|--------------|----------------------|-------|-------|
| Pensez v0.1 | SFT Training Dataset | 2K samples | [🤗 Pensez v0.1](https://huggingface.co/datasets/HoangHa/Pensez-v0.1) |
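The dataset is public and can be pulled directly with the `datasets` library (the `train` split name is an assumption; check the dataset card):

```python
from datasets import load_dataset

# Load the Pensez v0.1 SFT dataset; split name assumed to be "train"
ds = load_dataset("HoangHa/Pensez-v0.1", split="train")
print(len(ds), ds[0])
```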
## Benchmarks
Pensez was evaluated on French-specific benchmarks, demonstrating strong reasoning ability and improved task-specific performance:
| Benchmark | Pensez-v0.1-e5 | DeepSeek-R1-Distill-Qwen-7B | Qwen2.5-7B-Instruct |
|-----------|---------------|-----------------------------|----------------------|
| Math-hard (fr) | 0.3458 | 0.3403 | 0.2253 |
| MMLU (fr) | 0.5766 | 0.4961 | 0.6612 |
| BoolQA (fr) | 0.9157 | 0.7079 | 0.9382 |
| Trivia (en) | 0.4421 | 0.2711 | 0.5316 |
| HellaSwag (en) | 0.5050 | 0.3540 | 0.5258 |
**Key Observations:**
- Pensez clearly outperforms Qwen2.5-7B-Instruct on hard reasoning tasks (Math-hard fr).
- It reasons on par with DeepSeek-R1-Distill-Qwen-7B while preserving far more general understanding.
- Degradation on knowledge-based tasks (MMLU, BoolQA, Trivia) is much smaller than for the R1 distillation.
<details>
<summary>Click for detailed benchmark results</summary>

| Tasks | Pensez v0.1 e1 | Pensez v0.1 e2 | Pensez v0.1 e3 | Pensez v0.1 e4 | Pensez v0.1 e5 | Qwen 7B instruct | R1 distil |
|------------------------------------------------|---------------|---------------|---------------|---------------|---------------|-----------------|-----------|
| leaderboard_math_hard_fr | 0.0918 | 0.2547 | 0.2783 | 0.3035 | 0.3458 | 0.2253 | 0.3403 |
| leaderboard_math_algebra_hard_fr | 0.1029 | 0.3914 | 0.3971 | 0.5114 | 0.5000 | 0.4229 | 0.4771 |
| leaderboard_math_counting_and_prob_hard_fr | 0.0765 | 0.1378 | 0.1939 | 0.2041 | 0.2398 | 0.1224 | 0.2347 |
| leaderboard_math_geometry_hard_fr | 0.0388 | 0.1019 | 0.1408 | 0.1359 | 0.1748 | 0.1019 | 0.2330 |
| leaderboard_math_num_theory_hard_fr | 0.1198 | 0.2581 | 0.3502 | 0.3548 | 0.4332 | 0.3180 | 0.3963 |
| leaderboard_math_prealgebra_hard_fr | 0.1681 | 0.4425 | 0.4690 | 0.4956 | 0.5841 | 0.3274 | 0.4867 |
| leaderboard_math_precalculus_hard_fr | 0.0357 | 0.0714 | 0.1190 | 0.1190 | 0.1429 | 0.0595 | 0.2143 |
| leaderboard_mmlu_fr | 0.3806 | 0.3329 | - | - | 0.5766 | 0.6612 | 0.4961 |
| french_bench_arc_challenge | 0.5047 | 0.5021 | 0.4919 | 0.4859 | 0.4842 | 0.5518 | 0.3447 |
| french_bench_boolqa | 0.9326 | 0.9326 | 0.9326 | 0.9270 | 0.9157 | 0.9382 | 0.7079 |
| french_bench_fquadv2 | 0.4325 | 0.4400 | 0.4412 | 0.4375 | 0.4387 | 0.4800 | 0.2988 |
| french_bench_hellaswag | 0.4970 | 0.5055 | 0.5092 | 0.5058 | 0.5050 | 0.5258 | 0.3540 |
| french_bench_trivia | 0.4763 | 0.4763 | 0.4553 | 0.4395 | 0.4421 | 0.5316 | 0.2711 |

</details>
## Run Locally
You can run Pensez using Hugging Face’s `transformers` library:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
model_path = "HoangHa/Pensez-v0.1-e5"

# Load tokenizer and model (fp16, automatic device placement)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

# Build a chat-formatted prompt and move it to the model's device
messages = [{"role": "user", "content": "Bonjour!"}]
input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sample a response and decode only the newly generated tokens
generated_ids = model.generate(
    input_ids, max_new_tokens=2500, temperature=0.8,
    repetition_penalty=1.1, do_sample=True, eos_token_id=tokenizer.eos_token_id,
)
response = tokenizer.decode(
    generated_ids[0][input_ids.shape[-1]:],
    skip_special_tokens=True, clean_up_tokenization_spaces=True,
)
print(f"Réponse: {response}")
```
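Since this repository ships GGUF weights (see the conversion note at the top), the model can also run CPU-only via `llama-cpp-python`. A sketch, assuming a quantization filename; substitute an actual `.gguf` file from this repo's file list:

```python
from llama_cpp import Llama

# Load a GGUF quantization directly from the Hub; the repo id and filename
# pattern below are assumptions — pick a real .gguf file from this repository.
llm = Llama.from_pretrained(
    repo_id="HoangHa/Pensez-v0.1-e5",
    filename="*Q4_K_M.gguf",
    n_ctx=16384,  # matches the training max sequence length
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Bonjour!"}],
    max_tokens=512,
    temperature=0.8,
)
print(out["choices"][0]["message"]["content"])
```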
## Training Details
Pensez was trained with:
- **Packing Inputs Without Cross-Contamination Attention** ([Reference](https://github.com/MeetKai/functionary/tree/main/functionary/train/packing)), sketched after this list
- **Liger Kernel** ([Reference](https://github.com/linkedin/Liger-Kernel))
- **DeepSpeed ZeRO Stage 3** ([Reference](https://github.com/deepspeedai/DeepSpeed))
- **NEFTune Noise** ([Reference](https://arxiv.org/abs/2310.05914)) for robustness.
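The packing technique concatenates several training samples into one sequence while masking attention so tokens never attend across sample boundaries. A minimal sketch of such a block-diagonal causal mask (illustrative, not the referenced implementation):

```python
import torch

def packed_causal_mask(seq_lens: list[int]) -> torch.Tensor:
    """Block-diagonal causal mask: each packed sample attends only to its
    own earlier tokens, preventing cross-contamination between samples."""
    total = sum(seq_lens)
    mask = torch.zeros(total, total, dtype=torch.bool)
    start = 0
    for n in seq_lens:
        mask[start:start + n, start:start + n] = True  # block for one sample
        start += n
    return mask & torch.tril(torch.ones(total, total, dtype=torch.bool))

# Two samples of lengths 3 and 2 packed into one sequence of length 5
print(packed_causal_mask([3, 2]).int())
```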
| **Parameter** | **Value** |
|--------------|----------|
| Epochs | 5 |
| Global Batch Size | 200 |
| Learning Rate | 1e-5 |
| Scheduler | Cosine |
| Optimizer | AdamW |
| Warmup Ratio | 0.05 |
| Weight Decay | 0.01 |
| Max Sequence Length | 16,384 |
More details: [Training Config]() | Loss curves: [Wandb](https://wandb.ai/hahuyhoanghhh41/llamafactory?nw=nwuserhahuyhoanghhh41)
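The hyperparameters above map onto standard `transformers` training arguments. A hedged sketch (the actual run used LLaMA-Factory, whose config is not reproduced here; the batch-size split, NEFTune alpha, and precision are assumptions):

```python
from transformers import TrainingArguments

# Sketch mirroring the table above; names are transformers' equivalents,
# not the original LLaMA-Factory config.
args = TrainingArguments(
    output_dir="pensez-v0.1",        # hypothetical output path
    num_train_epochs=5,
    learning_rate=1e-5,
    lr_scheduler_type="cosine",
    optim="adamw_torch",
    warmup_ratio=0.05,
    weight_decay=0.01,
    per_device_train_batch_size=4,   # 4 x 50 accumulation = global batch 200
    gradient_accumulation_steps=50,  # (assumed single-device split)
    neftune_noise_alpha=5.0,         # NEFTune noise; alpha value assumed
    bf16=True,                       # assumed precision
)
```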
## Citation
```bibtex
@misc{hoang2025pensez,
      title={Pensez: Less Data, Better Reasoning – Rethinking French LLM},
      author={Ha Huy Hoang},
      year={2025},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={},
}
```
## Acknowledgement
- [llama-factory](https://github.com/hiyouga/LLaMA-Factory)
- [Deepseek R1](https://github.com/deepseek-ai/DeepSeek-R1)
- [Qwen 2.5](https://github.com/QwenLM/Qwen2.5)
- [NEFTune Noise](https://arxiv.org/abs/2310.05914)
- [Packing Inputs Without Cross-Contamination Attention](https://github.com/MeetKai/functionary/tree/main/functionary/train/packing)
- [Liger Kernel](https://github.com/linkedin/Liger-Kernel)
- [Deepspeed](https://github.com/deepspeedai/DeepSpeed)
- [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
- [Hyperbolic](https://hyperbolic.xyz/)
- [Modal](https://modal.com/)