sethuiyer/Chikuma_10.7B_v2

sethuiyer committed on Jan 13

Commit 137218a (1 parent: 79a5ec9)

Update README.md

Files changed (1)

README.md +20 -14

@@ -52,20 +52,26 @@ The chat template for Chikuma_10.7B - V2 is a modified version of ChatML, optimi
 {asistant}<|im_end|>
 ```
-## Benchmarks
-| Benchmark Name         | Performance |
-|------------------------|-------------|
-| AGIEval                | 42.77       |
-| GPT4All                | 73.81       |
-| TruthfulQA             | 58.83       |
-| Bigbench               | 44.83       |
-| ARC                    | 66.38       |
-| HellaSwag              | 85          |
-| Winogrande             | 78.77       |
-| GSM8K                  | 63.68       |
-| **Average**            | **63.66**   |
-Details can be found [here](https://gist.github.com/sethuiyer)
+## Nous Benchmark Evaluation
+| Model                         | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
+|-------------------------------|---------|---------|------------|----------|---------|
+| SynthIQ-7b                    | 42.67   | 73.71   | 56.51      | 44.59    | 54.37   |
+| openchat/openchat-3.5-0106    | 44.17   | 73.72   | 52.53      | 44.4     | 53.71   |
+| Chikuma_10.7B                 | 42.41   | 73.41   | 56.69      | 43.5     | 54.00   |
+| **distilabled_Chikuma_10.7B** | **42.77** | **73.81** | **58.83**  | **44.83** | **55.06** |
+# OpenLLM Leaderboard
+| Benchmark Name | Performance |
+|----------------|-------------|
+| ARC            | 66.38       |
+| HellaSwag      | 85          |
+| MMLU           | 65.27       |
+| TruthfulQA     | 58.83       |
+| Winogrande     | 78.77       |
+| GSM8K          | 63.68       |
+| **Average**    | **69.65**   |
 ### Training Environment
 - Hardware: Single A100 80GB GPU in a runpod, utilized for approximately 1.5 hours.
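
The averages in the added tables can be sanity-checked by recomputing them from the per-benchmark scores. The following is a minimal Python sketch, assuming each reported average is a plain arithmetic mean of its row or column; the names `average`, `within_tolerance`, `NOUS`, and `OPEN_LLM` are hypothetical helpers introduced here and are not part of the repository.

```python
# Scores and reported averages copied from the tables added in this diff.
# Assumption: each "Average" is an unweighted arithmetic mean.

def average(scores):
    """Return the arithmetic mean of a list of scores."""
    return sum(scores) / len(scores)


def within_tolerance(scores, reported, tol=0.01):
    """True if the recomputed mean is within `tol` of the reported value."""
    return abs(average(scores) - reported) < tol


# Nous benchmark rows: (AGIEval, GPT4All, TruthfulQA, Bigbench), reported average
NOUS = {
    "SynthIQ-7b": ([42.67, 73.71, 56.51, 44.59], 54.37),
    "openchat/openchat-3.5-0106": ([44.17, 73.72, 52.53, 44.4], 53.71),
    "Chikuma_10.7B": ([42.41, 73.41, 56.69, 43.5], 54.00),
    "distilabled_Chikuma_10.7B": ([42.77, 73.81, 58.83, 44.83], 55.06),
}

# OpenLLM Leaderboard column:
# ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K, then the reported average
OPEN_LLM = ([66.38, 85, 65.27, 58.83, 78.77, 63.68], 69.65)

if __name__ == "__main__":
    for name, (scores, reported) in NOUS.items():
        print(f"{name}: computed {average(scores):.3f}, reported {reported}")
    scores, reported = OPEN_LLM
    print(f"OpenLLM: computed {average(scores):.3f}, reported {reported}")
```

A tolerance of 0.01 is used rather than exact equality because the reported values are rounded to two decimals and some means fall exactly on a rounding boundary.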