---
language:
- en
- ko
tags:
- generation
license: apache-2.0
model-index:
- name: task_1
  results:
  - task:
      type: natural-language-generation
    dataset:
      type: hellaswag
      name: hellaswag(10 shots)
    metrics:
    - type: acc_norm
      value: 27.7
- name: task_2
  results:
  - task:
      type: natural-language-generation
    dataset:
      type: ARC
      name: ARC(25 shots)
    metrics:
    - type: acc_norm
      value: 23.8
- name: task_3
  results:
  - task:
      type: natural-language-generation
    dataset:
      type: MMLU
      name: MMLU(5 shots)
    metrics:
    - type: acc
      value: 24.9
- name: task_4
  results:
  - task:
      type: natural-language-generation
    dataset:
      type: TruthfulQA
      name: TruthfulQA(0 shots)
    metrics:
    - type: mc2
      value: 46.5
---
A GPT-2 model pretrained on Korean text, with the context length (`n_ctx`) expanded to 2048 and the embedding dimension expanded to 1536.
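
To put the two size changes in perspective, the sketch below compares the learned position-embedding table implied by these settings with that of the original GPT-2 small (`n_ctx=1024`, `n_embd=768`). The `GPT2Shape` class and its method are hypothetical helpers written for illustration, not part of this model's code; only the values 2048 and 1536 come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GPT2Shape:
    """Hypothetical container for the two hyperparameters named above."""

    n_ctx: int   # maximum sequence length (number of position slots)
    n_embd: int  # width of each token/position embedding vector

    def position_embedding_params(self) -> int:
        # GPT-2 learns one n_embd-wide vector per position slot.
        return self.n_ctx * self.n_embd


# Defaults of the original GPT-2 small, used here only as a baseline.
gpt2_small = GPT2Shape(n_ctx=1024, n_embd=768)
# Values stated in this model card.
this_model = GPT2Shape(n_ctx=2048, n_embd=1536)

small_params = gpt2_small.position_embedding_params()
model_params = this_model.position_embedding_params()
growth = model_params / small_params

print(f"GPT-2 small position table: {small_params:,} parameters")
print(f"This model position table:  {model_params:,} parameters")
print(f"Growth factor: {growth:.1f}x")
```

Doubling both the context length and the embedding width quadruples the position-embedding table. Other layer sizes depend on settings this card does not state, such as the number of layers and attention heads, so they are left out.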
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 24.27 |
ARC (25-shot) | 21.16 |
HellaSwag (10-shot) | 28.11 |
MMLU (5-shot) | 26.56 |
TruthfulQA (0-shot) | 42.06 |
Winogrande (5-shot) | 49.09 |
GSM8K (5-shot) | 0.0 |
DROP (3-shot) | 2.89 |
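
The `Avg.` row appears to be the unweighted mean of the seven benchmark scores. That reading is an assumption rather than something this card states, and the short check below recomputes it from the table values.

```python
# Scores copied from the table above.
scores = {
    "ARC (25-shot)": 21.16,
    "HellaSwag (10-shot)": 28.11,
    "MMLU (5-shot)": 26.56,
    "TruthfulQA (0-shot)": 42.06,
    "Winogrande (5-shot)": 49.09,
    "GSM8K (5-shot)": 0.0,
    "DROP (3-shot)": 2.89,
}

# Assumption: the leaderboard average is a plain mean over all benchmarks.
average = round(sum(scores.values()) / len(scores), 2)

print(f"Recomputed average: {average}")
```

Under that assumption the recomputed value matches the reported 24.27.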