metadata
license: cc-by-nc-4.0
model-index:
  - name: Athena-v3
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 61.69
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=IkariDev/Athena-v3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 84.34
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=IkariDev/Athena-v3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 57.87
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=IkariDev/Athena-v3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 51.26
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=IkariDev/Athena-v3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 75.77
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=IkariDev/Athena-v3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 11.6
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=IkariDev/Athena-v3
          name: Open LLM Leaderboard


Experimental Athena v3 model. Use the Alpaca prompt format. Suitable for RP, ERP, and general use.

Description

This repo contains fp16 files of Athena-V3.

GGUF - By TheBloke

GPTQ - By TheBloke

AWQ - By TheBloke

fp16 - by IkariDev+Undi95

OLD(GGUF - by IkariDev+Undi95)

Ratings:

Note: I have permission from all users to upload their ratings. I DON'T screenshot random reviews without asking whether I can put them here!

https://snombler.neocities.org/logs#athenav3

Models and loras used

  • Athena-v2
  • migtissera/Synthia-13B-v1.2
  • The-Face-Of-Goonery/Huginn-13b-FP16
  • PygmalionAI/pygmalion-2-13b
  • The-Face-Of-Goonery/LegerDemain-FP16
  • chargoddard/storytime-13b
  • lemonilia/LimaRP-Llama2-13B-v3-EXPERIMENT
  • zattio770/120-Days-of-LORA-v2-13B
Loras: [lemonilia/LimaRP-Llama2-13B-v3-EXPERIMENT(0.65) + zattio770/120-Days-of-LORA-v2-13B(0.35)](0.3) to the final model

+ [Athena-v2(0.70) + migtissera/Synthia-13B-v1.2(0.3)](0.5)
+ [The-Face-Of-Goonery/Huginn-13b-FP16(0.85) + PygmalionAI/pygmalion-2-13b(0.15)](0.40)
+ [The-Face-Of-Goonery/LegerDemain-FP16(0.3) + chargoddard/storytime-13b(0.7)](0.10)
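
The nested weights in the recipe above can be flattened into one effective weight per base model. The sketch below is only illustrative arithmetic, not the actual merge script: it assumes each bracket is a plain weighted average (the card does not name the merge method), it assumes the pygmalion-2-13b weight is 0.15 (the complement of Huginn's 0.85), and it leaves out the LoRA step. `RECIPE` and `effective_weights` are hypothetical names.

```python
# Recipe as (outer_weight, {model: inner_weight}) groups.
# ASSUMPTION: every bracket is a simple weighted average of its members.
# ASSUMPTION: pygmalion-2-13b has weight 0.15 (complement of Huginn's 0.85).
RECIPE = [
    (0.50, {"Athena-v2": 0.70,
            "migtissera/Synthia-13B-v1.2": 0.30}),
    (0.40, {"The-Face-Of-Goonery/Huginn-13b-FP16": 0.85,
            "PygmalionAI/pygmalion-2-13b": 0.15}),
    (0.10, {"The-Face-Of-Goonery/LegerDemain-FP16": 0.30,
            "chargoddard/storytime-13b": 0.70}),
]


def effective_weights(recipe):
    """Multiply each inner weight by its group's outer weight."""
    weights = {}
    for outer, group in recipe:
        for name, inner in group.items():
            weights[name] = weights.get(name, 0.0) + outer * inner
    return weights


weights = effective_weights(RECIPE)

# List the models from largest to smallest contribution.
for name, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {weight:.3f}")
```

Under these assumptions the effective weights sum to 1, with Athena-v2 and Huginn-13b contributing the most to the merged base before the LoRAs are applied.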

Prompt template: Alpaca

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
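
A minimal sketch of filling the template above from Python. `ALPACA_TEMPLATE` and `build_alpaca_prompt` are hypothetical helper names, and ending the prompt with a single newline after `### Response:` is an assumption about spacing rather than something the card specifies.

```python
# The Alpaca template from this card, with {prompt} as the only placeholder.
# ASSUMPTION: the prompt ends with one newline after "### Response:".
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(prompt: str) -> str:
    """Insert one user instruction into the Alpaca template."""
    return ALPACA_TEMPLATE.format(prompt=prompt.strip())


print(build_alpaca_prompt("Write a short greeting for a tavern keeper."))
```

The returned string is what gets passed to the model as input; the model's completion is then read as the text following `### Response:`.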

HUGE thanks to Undi95 for doing the merging (Recipe was my idea, he merged)

To TheBloke: if you quant this, please include IkariDev + Undi95 in all the credits/links to the creator.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 57.09 |
| AI2 Reasoning Challenge (25-Shot) | 61.69 |
| HellaSwag (10-Shot) | 84.34 |
| MMLU (5-Shot) | 57.87 |
| TruthfulQA (0-shot) | 51.26 |
| Winogrande (5-shot) | 75.77 |
| GSM8k (5-shot) | 11.60 |
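
The "Avg." row appears to be the unweighted mean of the six benchmark scores. A quick check in Python, using only the values listed above (the `scores` dictionary is just an illustrative container):

```python
# Benchmark scores copied from the results listed above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 61.69,
    "HellaSwag (10-Shot)": 84.34,
    "MMLU (5-Shot)": 57.87,
    "TruthfulQA (0-shot)": 51.26,
    "Winogrande (5-shot)": 75.77,
    "GSM8k (5-shot)": 11.60,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")
```

The result rounds to the reported 57.09, and the low GSM8k score is what pulls the mean well below the other five benchmarks.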