huihui-ai committed on
Commit dddbc12
1 Parent(s): 0c136b3

Update README.md

Files changed (1)
  1. README.md +9 -1
README.md CHANGED
@@ -103,4 +103,12 @@ while True:
 
  ```
  ## Evaluations
- We will be submitting this model to the OpenLLM Leaderboard for a more conclusive benchmark - but here are our internal benchmarks using the main branch of lm evaluation harness:
+
+ The following data has been re-evaluated; each score is the average across runs for that test.
+ | Benchmark | SuperNova-Lite | Meta-Llama-3.1-8B-Instruct-abliterated | Llama-3.1-8B-Fusion-9010 | Llama-3.1-8B-Fusion-8020 | Llama-3.1-8B-Fusion-7030 | Llama-3.1-8B-Fusion-6040 | Llama-3.1-8B-Fusion-5050 |
+ |-------------|----------------|----------------------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|
+ | IF_Eval | 82.09 | 76.29 | 82.44 | 82.93 | **83.10** | 82.94 | 82.03 |
+ | MMLU Pro | **35.87** | 33.10 | 35.65 | 35.32 | 34.91 | 34.50 | 33.96 |
+ | TruthfulQA | **64.35** | 53.25 | 62.67 | 61.04 | 59.09 | 57.80 | 56.75 |
+ | BBH | **49.48** | 44.87 | 48.86 | 48.47 | 48.30 | 48.19 | 47.93 |
+ | GPQA | 31.98 | 29.50 | 32.25 | 32.38 | **32.61** | 31.14 | 30.60 |
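For context on the "average across runs" wording in the added line, here is a minimal sketch of that averaging step, assuming each benchmark was scored over several independent evaluation runs; every per-run score below is a hypothetical placeholder, not the actual data behind the table:

```python
# Minimal sketch of the averaging described in the diff, assuming each
# benchmark was re-run several times. All per-run scores here are
# hypothetical placeholders, not the raw numbers behind the README table.
from statistics import mean

# Hypothetical per-run scores for one model (e.g. Llama-3.1-8B-Fusion-9010).
runs = {
    "IF_Eval":    [82.51, 82.39, 82.42],
    "MMLU Pro":   [35.71, 35.62, 35.62],
    "TruthfulQA": [62.70, 62.61, 62.70],
    "BBH":        [48.90, 48.81, 48.87],
    "GPQA":       [32.31, 32.20, 32.24],
}

# Each reported figure is the mean of its runs, rounded to two decimals.
for benchmark, scores in runs.items():
    print(f"{benchmark}: {mean(scores):.2f}")
```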