likejazz commited on
Commit
77b9166
·
verified ·
1 Parent(s): b58233e

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +12 -13
README.md CHANGED
@@ -42,17 +42,16 @@ DNA 1.0 8B Instruct was fine-tuned on approximately 10B tokens of carefully cura
42
 
43
  We evaluated DNA 1.0 8B Instruct against other prominent language models of similar size across various benchmarks, including Korean-specific tasks and general language understanding metrics. More details will be provided in the upcoming <u>Technical Report</u>.
44
 
45
- | Language | Benchmark | **dnotitia/DNA-1.0-8B-Instruct** | LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct | LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct | yanolja/EEVE-Korean-Instruct-10.8B-v1.0 | Qwen/Qwen2.5-7B-Instruct | meta-llama/Llama-3.1-8B-Instruct | mistralai/Mistral-7B-Instruct-v0.3 | NCSOFT/Llama-VARCO-8B-Instruct | upstage/SOLAR-10.7B-Instruct-v1.0 |
46
- |----------|------------|----------------------------------|--------------------------------------|--------------------------------------|-----------------------------------------|--------------------------|----------------------------------|------------------------------------|--------------------------------|-----------------------------------|
47
- | Korean | KMMLU | **53.26** (1st) | 45.30 | 45.28 | 42.17 | <u>45.66</u> | 41.66 | 31.45 | 38.49 | 41.50 |
48
- | | KMMLU-hard | **29.46** (1st) | 23.17 | 20.78 | 19.25 | <u>24.78</u> | 20.49 | 17.86 | 19.83 | 20.61 |
49
- | | KoBEST | **83.40** (1st) | 79.05 | 80.13 | <u>81.67</u> | 78.51 | 67.56 | 63.77 | 72.99 | 73.26 |
50
- | | Belebele | **57.99** (1st) | 40.97 | 45.11 | 49.40 | <u>54.85</u> | 54.70 | 40.31 | 53.17 | 48.68 |
51
- | | CSATQA | <u>43.32</u> (2nd) | 40.11 | 34.76 | 39.57 | **45.45** | 36.90 | 27.27 | 32.62 | 34.22 |
52
- | English | MMLU | 66.64 (3rd) | 65.27 | 64.32 | 63.63 | **74.26** | <u>68.26</u> | 62.04 | 63.25 | 65.30 |
53
- | | MMLU-Pro | **43.05** (1st) | 40.73 | 38.90 | 32.79 | <u>42.5</u> | 40.92 | 33.49 | 37.11 | 30.25 |
54
- | | GSM8K | **80.52** (1st) | 65.96 | <u>80.06</u> | 56.18 | 75.74 | 75.82 | 49.66 | 64.14 | 69.22 |
55
-
56
  - The *highest* *scores* are in **bold** form, and the *second*\-*highest* *scores* are <u>underlined</u>.
57
 
58
  **Evaluation Protocol**
@@ -76,8 +75,8 @@ This model requires `transformers >= 4.43.0`.
76
  ```python
77
  from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
78
 
79
- tokenizer = AutoTokenizer.from_pretrained('dnotitia/DNA-1.0-8B-Instruct')
80
- model = AutoModelForCausalLM.from_pretrained('dnotitia/DNA-1.0-8B-Instruct', device_map='auto')
81
  streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
82
 
83
  conversation = [
 
42
 
43
  We evaluated DNA 1.0 8B Instruct against other prominent language models of similar size across various benchmarks, including Korean-specific tasks and general language understanding metrics. More details will be provided in the upcoming <u>Technical Report</u>.
44
 
45
+ | Language | Benchmark | **dnotitia/Llama-DNA-1.0-8B-Instruct** | LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct | LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct | yanolja/EEVE-Korean-Instruct-10.8B-v1.0 | Qwen/Qwen2.5-7B-Instruct | meta-llama/Llama-3.1-8B-Instruct | mistralai/Mistral-7B-Instruct-v0.3 | NCSOFT/Llama-VARCO-8B-Instruct | upstage/SOLAR-10.7B-Instruct-v1.0 |
46
+ |----------|------------|----------------------------------------|--------------------------------------|--------------------------------------|-----------------------------------------|--------------------------|----------------------------------|------------------------------------|--------------------------------|-----------------------------------|
47
+ | Korean | KMMLU | **53.26** (1st) | 45.30 | 45.28 | 42.17 | <u>45.66</u> | 41.66 | 31.45 | 38.49 | 41.50 |
48
+ | | KMMLU-hard | **29.46** (1st) | 23.17 | 20.78 | 19.25 | <u>24.78</u> | 20.49 | 17.86 | 19.83 | 20.61 |
49
+ | | KoBEST | **83.40** (1st) | 79.05 | 80.13 | <u>81.67</u> | 78.51 | 67.56 | 63.77 | 72.99 | 73.26 |
50
+ | | Belebele | **57.99** (1st) | 40.97 | 45.11 | 49.40 | <u>54.85</u> | 54.70 | 40.31 | 53.17 | 48.68 |
51
+ | | CSATQA | <u>43.32</u> (2nd) | 40.11 | 34.76 | 39.57 | **45.45** | 36.90 | 27.27 | 32.62 | 34.22 |
52
+ | English | MMLU | 66.64 (3rd) | 65.27 | 64.32 | 63.63 | **74.26** | <u>68.26</u> | 62.04 | 63.25 | 65.30 |
53
+ | | MMLU-Pro | **43.05** (1st) | 40.73 | 38.90 | 32.79 | <u>42.5</u> | 40.92 | 33.49 | 37.11 | 30.25 |
54
+ | | GSM8K | **80.52** (1st) | 65.96 | <u>80.06</u> | 56.18 | 75.74 | 75.82 | 49.66 | 64.14 | 69.22 |
 
55
  - The *highest* *scores* are in **bold** form, and the *second*\-*highest* *scores* are <u>underlined</u>.
56
 
57
  **Evaluation Protocol**
 
75
  ```python
76
  from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
77
 
78
+ tokenizer = AutoTokenizer.from_pretrained('dnotitia/Llama-DNA-1.0-8B-Instruct')
79
+ model = AutoModelForCausalLM.from_pretrained('dnotitia/Llama-DNA-1.0-8B-Instruct', device_map='auto')
80
  streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
81
 
82
  conversation = [