Dhia-GB committed on
Commit
7e687ae
1 Parent(s): 28519b8

Update README.md


Clarification about the evaluation pipeline.

Files changed (1)
README.md +5 -1
README.md CHANGED
@@ -91,7 +91,10 @@ print(response)
 <br>

 ## Benchmarks
- We report in the following table our internal pipeline benchmarks:
+ We report our internal pipeline benchmarks in the following table.
+ - We use [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness).
+ - We report **raw scores** obtained by applying the chat template **without fewshot_as_multiturn** (unlike Llama 3.1).
+ - We use the same batch size across all models.

 <table border="1" style="width: 100%; text-align: center; border-collapse: collapse;">
 <colgroup>
@@ -228,6 +231,7 @@ We report in the following table our internal pipeline benchmarks:
 </tbody>
 </table>

+
 ## Technical Report
 Coming soon....
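
For reference, the evaluation settings described in the added lines above could be reproduced along these lines with lm-evaluation-harness. This is a minimal sketch, not the authors' exact pipeline: it assumes a recent harness version (≥ 0.4.3, where `simple_evaluate` exposes `apply_chat_template` and `fewshot_as_multiturn`), and the model ID, task list, and batch size are illustrative placeholders.

```python
# Sketch of an lm-evaluation-harness run matching the commit's description:
# chat template applied, few-shot examples NOT packed as a multi-turn
# dialogue, and a fixed batch size across all models being compared.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=your-org/your-model",  # placeholder model ID
    tasks=["hellaswag"],                          # placeholder task list
    apply_chat_template=True,    # raw scores reported with the chat template applied
    fewshot_as_multiturn=False,  # unlike the setup reported for Llama 3.1
    batch_size=16,               # keep identical across all evaluated models
)
print(results["results"])      # per-task metric dictionary
```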