MaziyarPanahi/phi-2-logical-sft


Adding Evaluation Results

#7

by leaderboard-pr-bot - opened Mar 4

base: refs/heads/main ← from: refs/pr/7


README.md +18 -6


@@ -1,6 +1,5 @@
 ---
 license: mit
-base_model: microsoft/phi-2
 tags:
 - axolotl
 - generated_from_trainer
@@ -10,16 +9,16 @@ tags:
 - reasoning
 - transformers
 - text-generation-inference
-model-index:
-- name: phi-2-logical-sft
-  results: []
 datasets:
 - garage-bAInd/Open-Platypus
-model_name: phi-2-logical-sft
+base_model: microsoft/phi-2
 inference: false
 model_creator: MaziyarPanahi
 pipeline_tag: text-generation
 quantized_by: MaziyarPanahi
+model-index:
+- name: phi-2-logical-sft
+  results: []
 ---
 <img src="https://cdn-uploads.huggingface.co/production/uploads/5fd5e18a90b6dc4633f6d292/uhDf-zhThjoAwQVAMEo2t.webp" width="600" />
@@ -252,4 +251,17 @@ special_tokens:
   pad_token: "<|endoftext|>"
 ```
-</details><br>
+</details><br>
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__phi-2-logical-sft)
+|             Metric              |Value|
+|---------------------------------|----:|
+|Avg.                             |61.50|
+|AI2 Reasoning Challenge (25-Shot)|61.35|
+|HellaSwag (10-Shot)              |75.14|
+|MMLU (5-Shot)                    |57.40|
+|TruthfulQA (0-shot)              |44.39|
+|Winogrande (5-shot)              |74.90|
+|GSM8k (5-shot)                   |55.80|
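
The `Avg.` row in the table above looks like the unweighted mean of the six benchmark scores. The PR does not state how the average is computed, so the sketch below only checks that assumption: it takes a plain arithmetic mean of the values copied from the table and rounds to two decimals. The names `scores` and `mean_score` are illustrative and are not part of any leaderboard tooling.

```python
# Benchmark scores copied from the results table added by this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 61.35,
    "HellaSwag (10-Shot)": 75.14,
    "MMLU (5-Shot)": 57.40,
    "TruthfulQA (0-shot)": 44.39,
    "Winogrande (5-shot)": 74.90,
    "GSM8k (5-shot)": 55.80,
}


def mean_score(values):
    """Unweighted arithmetic mean (an assumed aggregation, not confirmed by the PR)."""
    values = list(values)
    return sum(values) / len(values)


# Round to two decimals to match the precision used in the table.
average = round(mean_score(scores.values()), 2)
print(f"Avg. = {average:.2f}")
```

Under that assumption the computed value agrees with the 61.50 reported in the `Avg.` row.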