Files changed (1)
1. README.md +15 -133

README.md CHANGED
@@ -4,118 +4,19 @@ language:
 - fr
 - es
 - pt
-license: other
-library_name: transformers
 tags:
 - falcon3
-license_name: falcon-llm-license
+license: other
+license_name: falcon-llm-license
 license_link: https://falconllm.tii.ae/falcon-terms-and-conditions.html
-model-index:
-- name: Falcon3-10B-Base
-  results:
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: IFEval (0-Shot)
-      type: HuggingFaceH4/ifeval
-      args:
-        num_few_shot: 0
-    metrics:
-    - type: inst_level_strict_acc and prompt_level_strict_acc
-      value: 36.48
-      name: strict accuracy
-    source:
-      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Base
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: BBH (3-Shot)
-      type: BBH
-      args:
-        num_few_shot: 3
-    metrics:
-    - type: acc_norm
-      value: 41.38
-      name: normalized accuracy
-    source:
-      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Base
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: MATH Lvl 5 (4-Shot)
-      type: hendrycks/competition_math
-      args:
-        num_few_shot: 4
-    metrics:
-    - type: exact_match
-      value: 24.77
-      name: exact match
-    source:
-      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Base
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: GPQA (0-shot)
-      type: Idavidrein/gpqa
-      args:
-        num_few_shot: 0
-    metrics:
-    - type: acc_norm
-      value: 12.75
-      name: acc_norm
-    source:
-      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Base
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: MuSR (0-shot)
-      type: TAUR-Lab/MuSR
-      args:
-        num_few_shot: 0
-    metrics:
-    - type: acc_norm
-      value: 14.17
-      name: acc_norm
-    source:
-      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Base
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: MMLU-PRO (5-shot)
-      type: TIGER-Lab/MMLU-Pro
-      config: main
-      split: test
-      args:
-        num_few_shot: 5
-    metrics:
-    - type: acc
-      value: 36.0
-      name: accuracy
-    source:
-      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Base
-      name: Open LLM Leaderboard
 ---
 
-<div align="center">
-<img src="https://huggingface.co/datasets/tiiuae/documentation-images/resolve/main/general/falco3-logo.png" alt="drawing" width="500"/>
-</div>
 
 # Falcon3-10B-Base
 
 **Falcon3** family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters.
 
-This repository contains the **Falcon3-10B-Base**. It achieves state-of-the-art results (at the time of release) on reasoning, language understanding, instruction following, code and mathematics tasks.
+This repository contains **Falcon3-10B-Base**, which achieves state-of-the-art results (at the time of release) on reasoning, language understanding, instruction following, code and mathematics tasks.
 Falcon3-10B-Base supports 4 languages (English, French, Spanish, Portuguese) and a context length of up to 32K.
 
 ⚠️ **This is a raw, pretrained model, which should be further finetuned using SFT, RLHF, continued pretraining, etc. for most use cases.**
@@ -130,7 +31,7 @@ Falcon3-10B-Base supports 4 languages (English, French, Spanish, Portuguese) and
 - Uses SwiGLu and RMSNorm
 - 32K context length
 - 131K vocab size
-- Depth up-scaled from **Falcon3-7B-Base** with continual pretraining on 2 Teratokens of datasets comprising of web, code, STEM, high quality and mutlilingual data using 1024 H100 GPU chips
+- Depth up-scaled from **Falcon3-7B-Base** with 2 teratokens of datasets comprising web, code, STEM, high-quality and multilingual data, using 2048 H100 GPU chips
 - Supports EN, FR, ES, PT
 - Developed by [Technology Innovation Institute](https://www.tii.ae)
 - License: TII Falcon-LLM License 2.0
@@ -161,11 +62,8 @@ print(response[0]['generated_text'])
 <br>
 
 ## Benchmarks
-We report in the following table our internal pipeline benchmarks.
-- We use [lm-evaluation harness](https://github.com/EleutherAI/lm-evaluation-harness).
-- We report **raw scores**.
-- We use same batch-size across all models.
-
+We report our internal pipeline benchmarks in the following table:
+
 <table border="1" style="width: 100%; text-align: center; border-collapse: collapse;">
 <colgroup>
 <col style="width: 10%;">
@@ -181,7 +79,7 @@ We report in the following table our internal pipeline benchmarks.
 <th>Benchmark</th>
 <th>Gemma2-9B</th>
 <th>Yi1.5-9B</th>
-<th>Mistral-Nemo-Base-2407 (12B)</th>
+<th>Mistral-NeMo-12B</th>
 <th>Falcon3-10B-Base</th>
 </tr>
 </thead>
@@ -203,7 +101,7 @@ We report in the following table our internal pipeline benchmarks.
 </tr>
 <tr>
 <td>IFEval</td>
-<td>21.3</td>
+<td>21.2</td>
 <td>29.1</td>
 <td>16.1</td>
 <td><b>36.4</b></td>
@@ -217,7 +115,7 @@ We report in the following table our internal pipeline benchmarks.
 <td><b>81.4</b></td>
 </tr>
 <tr>
-<td>MATH Lvl-5 (4-shot)</td>
+<td>MATH (4-shot)</td>
 <td>10.5</td>
 <td>9.2</td>
 <td>4.9</td>
@@ -240,7 +138,7 @@ We report in the following table our internal pipeline benchmarks.
 </tr>
 <tr>
 <td>MUSR (0-shot)</td>
-<td><b>45.3</b></td>
+<td><b>45.2</b></td>
 <td>43.3</td>
 <td>39.2</td>
 <td>44.2</td>
@@ -258,7 +156,7 @@ We report in the following table our internal pipeline benchmarks.
 <td><b>83.0</b></td>
 <td>80.5</td>
 <td>82.1</td>
-<td>79.4</td>
+<td>79.5</td>
 </tr>
 <tr>
 <td>SciQ (0-shot)</td>
@@ -284,36 +182,20 @@ We report in the following table our internal pipeline benchmarks.
 </tbody>
 </table>
 
-## Useful links
-- View our [release blogpost](https://huggingface.co/blog/falcon3).
-- Feel free to join [our discord server](https://discord.gg/fwXpMyGc) if you have any questions or to interact with our researchers and developers.
-
 ## Technical Report
 
 Coming soon....
 
 ## Citation
-If the Falcon3 family of models were helpful to your work, feel free to give us a cite.
-
+If the Falcon3 family of models was helpful to your work, feel free to cite us.
+
 ```
 @misc{Falcon3,
-    title = {The Falcon 3 Family of Open Models},
-    url = {https://huggingface.co/blog/falcon3},
-    author = {Falcon-LLM Team},
+    title = {The Falcon 3 Family of Open Models},
+    author = {TII Team},
     month = {December},
     year = {2024}
 }
 ```
-# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/tiiuae__Falcon3-10B-Base-details)
 
-| Metric             |Value|
-|-------------------|----:|
-|Avg.               |27.59|
-|IFEval (0-Shot)    |36.48|
-|BBH (3-Shot)       |41.38|
-|MATH Lvl 5 (4-Shot)|24.77|
-|GPQA (0-shot)      |12.75|
-|MuSR (0-shot)      |14.17|
-|MMLU-PRO (5-shot)  |36.00|
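A note on the quickstart the third hunk patches around: its context line, `print(response[0]['generated_text'])`, comes from the README's `transformers` example, which this diff leaves untouched. A minimal sketch of such a call follows; the prompt and generation settings here are illustrative assumptions, not the README's exact snippet.

```python
# Minimal sketch of a text-generation pipeline call for this model;
# prompt and generation settings are illustrative, not the README's snippet.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="tiiuae/Falcon3-10B-Base",
    torch_dtype=torch.bfloat16,  # roughly 20 GB of weights at bf16
    device_map="auto",
)

response = pipe("The three primary colors are", max_new_tokens=32)
print(response[0]["generated_text"])
```

Since this is a base (non-instruct) model, the prompt is a plain text completion; no chat template is involved.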
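On the "depth up-scaled from **Falcon3-7B-Base**" bullet in the second hunk: the card states the fact but not the recipe. The sketch below shows the general technique (SOLAR-style layer duplication followed by continued pretraining), assuming the Llama-style module layout `transformers` exposes for this family; the band size `k` and everything else here are assumptions, not the actual Falcon3 procedure.

```python
# Illustrative SOLAR-style depth up-scaling; NOT the actual Falcon3 recipe
# (the card does not describe the layer-selection scheme).
import copy

import torch.nn as nn
from transformers import AutoModelForCausalLM

# Starting point per the card: the 7B base checkpoint.
model = AutoModelForCausalLM.from_pretrained("tiiuae/Falcon3-7B-Base")

layers = model.model.layers  # decoder stack (nn.ModuleList), assuming Llama-style layout
n = len(layers)
k = n // 4                   # overlap trimmed from each copy (arbitrary choice here)

# First (n - k) layers followed by a copy of the last (n - k):
# the middle n - 2k layers appear twice, deepening the stack to 2(n - k).
new_layers = nn.ModuleList(
    [copy.deepcopy(layer) for layer in layers[: n - k]]
    + [copy.deepcopy(layer) for layer in layers[k:]]
)
model.model.layers = new_layers
model.config.num_hidden_layers = len(new_layers)
# A real implementation would also renumber per-layer attributes such as
# self_attn.layer_idx before continued pretraining
# (per the card: 2 teratokens on 2048 H100 GPUs).
```

Duplicating an interior band, rather than stacking two full copies, keeps the embedding-adjacent and output-adjacent layers unique, which is the rationale usually given for trimming the overlap.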