Adding Evaluation Results (#1)

- Adding Evaluation Results (fbc911e495fb59f2aec647d70ef05148dc92f62b)

Co-authored-by: Open LLM Leaderboard PR Bot <leaderboard-pr-bot@users.noreply.huggingface.co>

Files changed (1)

README.md +121 -5

README.md CHANGED

@@ -1,13 +1,116 @@
 ---
 license: apache-2.0
 datasets:
 - Open-Orca/SlimOrca
-language:
-- en
 pipeline_tag: text-generation
 inference: false
-tags:
-- text-generation-inference
 ---
 # 🌟 Falcon-RW-1B-Instruct-OpenOrca
@@ -76,4 +179,17 @@ This model may generate inaccurate or misleading information and is prone to hal
 The model is provided 'as is' without any warranties, and the creators are not liable for any damages arising from its use. Users are responsible for their interactions with the model.
 ## 📬 Contact
-For further inquiries or feedback, please contact at eric.fu96@aol.com.

 ---
+language:
+- en
 license: apache-2.0
+tags:
+- text-generation-inference
 datasets:
 - Open-Orca/SlimOrca
 pipeline_tag: text-generation
 inference: false
+model-index:
+- name: falcon-rw-1b-instruct-openorca
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: acc_norm
+      value: 34.56
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HellaSwag (10-Shot)
+      type: hellaswag
+      split: validation
+      args:
+        num_few_shot: 10
+    metrics:
+    - type: acc_norm
+      value: 60.93
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU (5-Shot)
+      type: cais/mmlu
+      config: all
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 28.77
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: TruthfulQA (0-shot)
+      type: truthful_qa
+      config: multiple_choice
+      split: validation
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: mc2
+      value: 37.42
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Winogrande (5-shot)
+      type: winogrande
+      config: winogrande_xl
+      split: validation
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 60.69
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k (5-shot)
+      type: gsm8k
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 3.41
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
 ---
 # 🌟 Falcon-RW-1B-Instruct-OpenOrca
 The model is provided 'as is' without any warranties, and the creators are not liable for any damages arising from its use. Users are responsible for their interactions with the model.
 ## 📬 Contact
+For further inquiries or feedback, please contact at eric.fu96@aol.com.
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ericzzz__falcon-rw-1b-instruct-openorca)
+|             Metric              |Value|
+|---------------------------------|----:|
+|Avg.                             |37.63|
+|AI2 Reasoning Challenge (25-Shot)|34.56|
+|HellaSwag (10-Shot)              |60.93|
+|MMLU (5-Shot)                    |28.77|
+|TruthfulQA (0-shot)              |37.42|
+|Winogrande (5-shot)              |60.69|
+|GSM8k (5-shot)                   | 3.41|
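
The `Avg.` row in the table added above appears to be the unweighted mean of the six benchmark scores. A minimal sketch to check that, assuming the average is a plain arithmetic mean rounded to two decimals (the `scores` dict and `average` helper are illustrative names, not part of the leaderboard tooling):

```python
# Benchmark scores copied from the table added in this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 34.56,
    "HellaSwag (10-Shot)": 60.93,
    "MMLU (5-Shot)": 28.77,
    "TruthfulQA (0-shot)": 37.42,
    "Winogrande (5-shot)": 60.69,
    "GSM8k (5-shot)": 3.41,
}


def average(values_by_name):
    """Return the unweighted mean of the benchmark scores."""
    values = list(values_by_name.values())
    return sum(values) / len(values)


if __name__ == "__main__":
    # Assumption: the leaderboard reports the mean rounded to two decimals.
    print(f"Avg. = {average(scores):.2f}")  # prints "Avg. = 37.63"
```

Under that assumption the computed mean matches the reported 37.63, so the table and the `model-index` values in the front matter agree with each other.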