cstr committed on
Commit f212568
1 Parent(s): 3b08f95

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -10,7 +10,7 @@ language:
  - de
  ---
 
- # llama3-8b-spaetzle-v51
+ # llama3.1-8b-spaetzle-v51
 
  This is only a quick test of merging Llama 3 and 3.1 models despite a number of differences in tokenizer setup, among other things. It is also motivated by ongoing problems with 3.1 (BOS handling, looping, etc.), especially with llama.cpp, which still lacks full RoPE scaling support. Performance is of course not yet satisfactory, which might have a number of causes.
 
@@ -19,12 +19,12 @@
 
  | Model | AGIEval | TruthfulQA | Bigbench |
  |----------------------------------------------------------------------------|--------:|-----------:|---------:|
- | [llama3-8b-spaetzle-v51](https://huggingface.co/cstr/llama3-8b-spaetzle-v51)| 42.23 | 57.29 | 44.3 |
+ | [llama3.1-8b-spaetzle-v51](https://huggingface.co/cstr/llama3-8b-spaetzle-v51)| 42.23 | 57.29 | 44.3 |
  | [llama3-8b-spaetzle-v39](https://huggingface.co/cstr/llama3-8b-spaetzle-v39)| 43.43 | 60.0 | 45.89 |
 
  ### AGIEval Results
 
- | Task | llama3-8b-spaetzle-v51 | llama3-8b-spaetzle-v39 |
+ | Task | llama3.1-8b-spaetzle-v51 | llama3-8b-spaetzle-v39 |
  |------------------------------|-----------------------:|-----------------------:|
  | agieval_aqua_rat | 27.95| 24.41|
  | agieval_logiqa_en | 38.10| 37.94|
@@ -38,7 +38,7 @@
 
  ### TruthfulQA Results
 
- | Task | llama3-8b-spaetzle-v51 | llama3-8b-spaetzle-v39 |
+ | Task | llama3.1-8b-spaetzle-v51 | llama3-8b-spaetzle-v39 |
  |-------------|-----------------------:|-----------------------:|
  | mc1 | 38.07| 43.82|
  | mc2 | 57.29| 60.00|
@@ -46,7 +46,7 @@
 
  ### Bigbench Results
 
- | Task | llama3-8b-spaetzle-v51 | llama3-8b-spaetzle-v39 |
+ | Task | llama3.1-8b-spaetzle-v51 | llama3-8b-spaetzle-v39 |
  |------------------------------------------------|-----------------------:|-----------------------:|
  | bigbench_causal_judgement | 56.32| 59.47|
  | bigbench_date_understanding | 69.65| 70.73|
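
As a quick way to sanity-check the renamed model outside of the benchmarks above, here is a minimal usage sketch. It is not part of this commit; it assumes the merged weights are the ones published at cstr/llama3-8b-spaetzle-v51 (the repo linked in the overview table) and that the model ships a Llama-3-style chat template. The prompt text is only an illustration.

```python
# Minimal usage sketch (assumption: merged weights live at the repo linked above
# and expose a chat template). Not part of the commit itself.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cstr/llama3-8b-spaetzle-v51"  # repo from the overview table

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # fits an 8B model on a single 24 GB GPU
    device_map="auto",
)

# The README mentions BOS/looping issues with 3.1-based merges, so keep the
# smoke test bounded: greedy decoding, explicit EOS, capped new tokens.
messages = [{"role": "user", "content": "Nenne drei Sehenswürdigkeiten in Berlin."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=128,
    do_sample=False,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For GGUF/llama.cpp use, the BOS and looping issues mentioned in the README may still apply, which is why this sketch sticks to transformers.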