denial07/Qwen2-72B-Instruct-kor-dpo

denial07 committed on Aug 3, 2024

Commit 66c7932 · verified · 1 Parent(s): 3673d7c

Update README.md

Files changed (1)

README.md +10 -2

@@ -18,5 +18,13 @@ This model is an improved version for Korean, based on the [Qwen2-72B-Instruct](
 | 10    | Qwen/Qwen2-72B-Instruct | 8.00 | 8.14 | 9.07 | 9.85 | 9.78 | 7.28 | 8.61 | 8.76 | 8.69 | 72B |
 | 11    | google/gemini-1.5-pro-001 | 7.00 | 8.00 | 9.57 | 8.85 | 9.35 | 8.64 | 8.61 | 8.52 | 8.57 | ? |
-### KMMLU Benchmark (in progress)
-* [HAERAE-HUB/KMMLU](https://huggingface.co/datasets/HAERAE-HUB/KMMLU) benchmark score.

+### KMMLU Benchmark
+* [HAERAE-HUB/KMMLU](https://huggingface.co/datasets/HAERAE-HUB/KMMLU) benchmark accuracy score.
+  | Category        |Qwen2-72B kor-dpo| Qwen2-72B  | Questions  |
+  |-----------------|-----------------|------------|------------|
+  | HUMSS           |     0.63        |   0.63     | 5130       |
+  | STEM            |     0.59        |   0.59     | 9900       |
+  | Applied Science |     0.56        |   0.56     | 11600      |
+  | Other           |     0.58        |   0.58     | 8400       |
+  | Overall Accuracy|     0.58        |   0.58     | 35030      |
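
As a sanity check on the added table, the sketch below recomputes the total question count and a question-weighted mean of the per-category accuracies. It assumes that "Overall Accuracy" is weighted by the number of questions in each category, which the README does not state, and it works from the two-decimal rounded values shown in the table, so the result is only approximate. The variable names are illustrative.

```python
# Category rows copied from the KMMLU table: (accuracy, question count).
# The accuracies are the rounded two-decimal values from the README,
# so the weighted mean below is an approximation, not the exact score.
rows = {
    "HUMSS": (0.63, 5130),
    "STEM": (0.59, 9900),
    "Applied Science": (0.56, 11600),
    "Other": (0.58, 8400),
}

# Total number of questions across all categories.
total_questions = sum(count for _, count in rows.values())

# Question-weighted mean accuracy (assumed weighting, see note above).
weighted_accuracy = sum(acc * count for acc, count in rows.values()) / total_questions

print(f"total questions: {total_questions}")
print(f"weighted accuracy: {weighted_accuracy:.2f}")
```

Under that assumption, the category counts sum to the 35030 questions listed in the last row, and the weighted mean rounds to the 0.58 reported for both models.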