Update README.md
README.md (changed)
@@ -22,7 +22,7 @@ base_model: stabilityai/stablelm-3b-4e1t
 
 
 ## Performance
-Despite its compact dimensions, the model achieves outstanding scores in both
+Despite its compact dimensions, the model achieves outstanding scores in both [MT-Bench](https://huggingface.co/spaces/lmsys/mt-bench) and [AlpacaEval](https://tatsu-lab.github.io/alpaca_eval/) benchmarks, surpassing the performance of considerably larger models.
 
 | Model | Size | Alignment | MT-Bench (score) | AlpacaEval (win rate %) |
 |-------------|-----|----|---------------|--------------|
@@ -63,18 +63,17 @@ In AlpacaEval, Rocket 🦝 achieves a near 80% win rate, coupled with an average
 | **Rocket** 🦝 | **79.75** | **1.42** | **1242** |
 
 
-##
+## Open LLM leaderboard
 
 | Metric | Value |
 |-----------------------|---------------------------|
-| Average |
-| ARC
-| HellaSwag
-| MMLU
-| TruthfulQA
-| Winogrande
-| GSM8K
-| DROP (3-shot) | 24.49 |
+| Average | 55.77 |
+| ARC | 50.6 |
+| HellaSwag | 76.69 |
+| MMLU | 47.1 |
+| TruthfulQA | 55.82 |
+| Winogrande | 67.96 |
+| GSM8K | 36.47 |
 
 
 ## Intended uses & limitations
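As a side note, the Average row in the Open LLM leaderboard table is just the arithmetic mean of the six task scores listed below it (DROP was dropped from the average in this revision). A quick sanity check, assuming simple unweighted averaging rounded to two decimals:

```python
# Open LLM Leaderboard task scores, taken from the table above
scores = {
    "ARC": 50.6,
    "HellaSwag": 76.69,
    "MMLU": 47.1,
    "TruthfulQA": 55.82,
    "Winogrande": 67.96,
    "GSM8K": 36.47,
}

# Unweighted mean of the six task scores, rounded to two decimals
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 55.77, matching the Average row
```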