Update README.md
README.md
CHANGED
@@ -151,6 +151,13 @@ We gain a slight edge over our previous releases, again topping the leaderboard,
+## MT-Bench Performance
+
+MT-Bench uses GPT-4 as a judge of model response quality across a wide range of challenges.
+We find our performance is *on par with `Llama2-70b-chat`*, averaging **6.86**.
+
+
+

 # Dataset