Update README.md
README.md CHANGED
@@ -60,7 +60,7 @@ The model was trained with compute provided by [HessianAI](https://hessian.ai/)
 ### Hugging Face Leaderboard
 
 This model is still an early alpha and we can't guarantee that there isn't any contamination.
-However, the average of **71.24** would earn the #2 spot on the HF leaderboard at the time of writing
+However, the average of **71.24** would earn the #2 spot on the HF leaderboard at the time of writing.
 
 | Metric | Value |
 |-----------------------|-------|
@@ -84,6 +84,9 @@ We use [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-eval
 | MMLU | 64.7 |
 | **Avg.** | **48.87** |
 
+Screenshot of the current (sadly no longer maintained) FastEval CoT leaderboard:
+![FastEval Leaderboard](imgs/cot_leaderboard.png)
+
 ### MTBench
 
 ```json
@@ -103,7 +106,8 @@ We use [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-eval
 "average": 7.48125
 }
 ```
-
+Screenshot of the current FastEval MT Bench leaderboard:
+![FastEval Leaderboard](imgs/mtbench_leaderboard.png)
 ## Prompt Format
 
 This model follows the ChatML format:
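For readers unfamiliar with ChatML: each turn is wrapped in `<|im_start|>` / `<|im_end|>` tokens with a role header. Below is a minimal Python sketch of assembling such a prompt; the helper `build_chatml_prompt` is illustrative and not part of the model card, whose own format example continues past this hunk.

```python
# Illustrative sketch of the ChatML layout the README refers to.
# build_chatml_prompt is a hypothetical helper, not from the model card;
# <|im_start|>/<|im_end|> are the standard ChatML tokens.

def build_chatml_prompt(messages):
    """Format a list of {"role": ..., "content": ...} dicts as ChatML."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

print(build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]))
```

In practice, `tokenizer.apply_chat_template` from `transformers` should produce the same layout when the model ships a ChatML chat template.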