Terry Zhuo
committed on
Commit c373956
1 Parent(s): cd5ba8d
fix: add more notes
app.py
CHANGED
@@ -228,6 +228,7 @@ with demo:
 - `complete` and `instruct` represent the calibrated Pass@1 score on the BigCodeBench benchmark variants.
 - `elo_mle` represents the task-level Bootstrap of Maximum Likelihood Elo rating on `BigCodeBench-Complete`, which starts from 1000 and is boostrapped 500 times.
 - `size` is the amount of activated model weight during inference.
+- Some instruction-tuned models are marked with 🟢 symbol, as they miss the chat templates in their tokenizer configurations.
 - Model providers have the responsibility to avoid data contamination. Models trained on close data can be affected by contamination.
 - For more details check the 📝 About section.
 - Models with a 🔴 symbol represent external evaluation submission, this means that we didn't verify the results, you can find the author's submission under `Submission PR` field from `See All Columns` tab.
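The note added in this commit marks instruction-tuned models whose tokenizer configuration lacks a chat template. As a rough illustration only, the check might look like the sketch below. It assumes the configuration is a parsed `tokenizer_config.json` in which the template, when present, is stored under the `chat_template` key. The function name `lacks_chat_template` and the sample configs are hypothetical and do not come from `app.py`.

```python
import json


def lacks_chat_template(tokenizer_config: dict) -> bool:
    """Return True when the config has no usable `chat_template` entry.

    Assumption: the template, if any, lives under the "chat_template" key.
    """
    template = tokenizer_config.get("chat_template")
    # A missing key, None, or an empty value all count as "no template".
    return not template


# Hypothetical configs, for illustration only.
with_template = json.loads(
    '{"chat_template": "{% for m in messages %}{{ m.content }}{% endfor %}"}'
)
without_template = json.loads('{"model_max_length": 4096}')

print(lacks_chat_template(with_template))
print(lacks_chat_template(without_template))
```

A leaderboard could use such a flag to decide which rows receive the 🟢 marker. How the Space actually determines this is not shown in the diff.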