Update README.md
README.md
CHANGED
@@ -52,20 +52,26 @@ The chat template for Chikuma_10.7B - V2 is a modified version of ChatML, optimi
 {asistant}<|im_end|>
 ```
 
-##
-
-
-
-
-
-
-
-
-
-
-
-
-
+## Nous Benchmark Evaluation
+| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
+|-------------------------------|---------|---------|------------|----------|---------|
+| SynthIQ-7b | 42.67 | 73.71 | 56.51 | 44.59 | 54.37 |
+| openchat/openchat-3.5-0106 | 44.17 | 73.72 | 52.53 | 44.4 | 53.71 |
+| Chikuma_10.7B | 42.41 | 73.41 | 56.69 | 43.5 | 54.00 |
+| **distilabled_Chikuma_10.7B** | **42.77** | **73.81** | **58.83** | **44.83** | **55.06** |
+
+# OpenLLM Leaderboard
+
+| Benchmark Name | Performance |
+|----------------|-------------|
+| ARC | 66.38 |
+| HellaSwag | 85 |
+| MMLU | 65.27 |
+| TruthfulQA | 58.83 |
+| Winogrande | 78.77 |
+| GSM8K | 63.68 |
+| **Average** | **69.65** |
+
 
 ### Training Environment
 - Hardware: Single A100 80GB GPU in a runpod, utilized for approximately 1.5 hours.
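
The reported averages in the two added tables can be sanity-checked with a few lines of Python. This is a minimal sketch that assumes each "Average" is an unweighted arithmetic mean of the listed scores; the README does not state how the averages were computed. The names `nous_rows`, `open_llm`, and `matches_reported` are illustrative only, and the 0.01 tolerance allows for rounding in the published figures.

```python
# Scores copied from the tables added in this diff.
# Each entry maps to (per-benchmark scores, reported average).
nous_rows = {
    "SynthIQ-7b": ([42.67, 73.71, 56.51, 44.59], 54.37),
    "openchat/openchat-3.5-0106": ([44.17, 73.72, 52.53, 44.4], 53.71),
    "Chikuma_10.7B": ([42.41, 73.41, 56.69, 43.5], 54.00),
    "distilabled_Chikuma_10.7B": ([42.77, 73.81, 58.83, 44.83], 55.06),
}

# ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K and the reported average.
open_llm = ([66.38, 85, 65.27, 58.83, 78.77, 63.68], 69.65)


def mean(scores):
    """Unweighted arithmetic mean (assumed averaging method)."""
    return sum(scores) / len(scores)


def matches_reported(scores, reported, tol=0.01):
    """True if the recomputed mean is within `tol` of the reported value."""
    return abs(mean(scores) - reported) <= tol


if __name__ == "__main__":
    for model, (scores, reported) in nous_rows.items():
        print(model, round(mean(scores), 3), reported, matches_reported(scores, reported))
    print("OpenLLM", round(mean(open_llm[0]), 3), open_llm[1], matches_reported(*open_llm))
```

Under that assumption, every row agrees with its reported average to within rounding.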