Update README.md
README.md CHANGED
@@ -19,10 +19,10 @@ pipeline_tag: text-generation
| Metric | lgaalves/falcon-7b_guanaco | tiiuae/falcon-7b (base) |
|-----------------------|-------|-------|
| Avg. | - | 47.01 |
-| ARC (25-shot) | 50.0 | 47.87 |
-| HellaSwag (10-shot) |
+| ARC (25-shot) | **50.0** | 47.87 |
+| HellaSwag (10-shot) | **78.54** | 78.13 |
| MMLU (5-shot) | - | 27.79 |
-| TruthfulQA (0-shot) | 40.45 | 34.26 |
+| TruthfulQA (0-shot) | **40.45** | 34.26 |

We use the state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, using the same version as the HuggingFace LLM Leaderboard. Please see below for detailed instructions on reproducing benchmark results.
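For context, a minimal sketch of what reproducing one of the scores above with the harness can look like is shown below. It uses the harness's `simple_evaluate` Python API; the adapter string `hf-causal`, the task name `arc_challenge`, and the 25-shot setting follow common leaderboard-era usage and are assumptions here, so they should be checked against the exact harness version pinned by the HuggingFace LLM Leaderboard.

```python
# Hedged sketch: scoring one benchmark from the table with the
# EleutherAI lm-evaluation-harness (pip install lm-eval). The adapter
# name ("hf-causal") and task name ("arc_challenge") follow the
# leaderboard-era harness and may differ in newer releases.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal",                                   # HuggingFace causal-LM wrapper
    model_args="pretrained=lgaalves/falcon-7b_guanaco",  # model under evaluation
    tasks=["arc_challenge"],                             # ARC, reported 25-shot above
    num_fewshot=25,
    batch_size=1,
)
print(results["results"]["arc_challenge"])               # e.g. acc_norm ~ 0.50 per the table
```

The other rows follow the same pattern with their own task names and few-shot counts (10 for HellaSwag, 5 for MMLU, 0 for TruthfulQA); matching the leaderboard's pinned harness commit matters, since task names and scoring defaults have changed across harness versions.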