kuotient committed
Commit 6559d95 • 1 Parent(s): 033c152

Update README.md

Files changed (1): README.md (+10, -5)
@@ -14,14 +14,19 @@ What I understand here:
 
 So before (my) initial purpose in comparing which method is better, `llama3 → CP + chat vector → FT` vs. `llama3 → CP → FT + chat vector`, it seems reasonable to compare it with other methods in [Mergekit](https://github.com/arcee-ai/mergekit).
 
-| Model | Merge Method | Score(but what?) |
-|---|---|---|
-| [beomi/Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview) | chat vector | - |
-| [kuotient/Llama-3-Ko-8B-ties](https://huggingface.co/kuotient/Llama-3-Ko-8B-ties) | Ties | - |
-| [kuotient/Llama-3-Ko-8B-dare-ties](https://huggingface.co/kuotient/Llama-3-Ko-8B-dare-ties) | Dare-ties | - |
+| Model | Method | Kobest(f1) | Haerae(acc) |
+|---|---|---|---|
+| [beomi/Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview) | chat vector | 0.4368 | 0.439 |
+| [kuotient/Llama-3-Ko-8B-ties](https://huggingface.co/kuotient/Llama-3-Ko-8B-ties) | Ties | 0.4821 | 0.5160 |
+| [kuotient/Llama-3-Ko-8B-dare-ties](https://huggingface.co/kuotient/Llama-3-Ko-8B-dare-ties) | Dare-ties | 0.4950 | 0.5399 |
 | [kuotient/Llama-3-Ko-8B-TA](https://huggingface.co/kuotient/Llama-3-Ko-8B-TA) | Task Arithmetic(maybe...? not sure about this) | - |
 | WIP | Model stock(I don't read this paper yet but still) | - |
 | [kuotient/Llama-3-Ko-8B-EMM](https://huggingface.co/kuotient/Llama-3-Ko-8B-EMM) | Evolutionary Model Merging | - |
+|---|---|---|---|
+| [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | Base | 0.4368 | 0.439 |
+| [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | - | 0.4239 | 0.4931 |
+| [beomi/Llama-3-Open-Ko-8B](https://huggingface.co/beomi/Llama-3-Open-Ko-8B) | Korean Base | 0.4374 | 0.3813 |
+
 
 All that aside, I'd like to thank @[beomi](https://huggingface.co/beomi) for creating such an awesome korean-based model.
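For context on the Ties row above, a Mergekit merge of this kind is driven by a small YAML config. The sketch below is a minimal, assumed example of what a `ties` merge of a Korean continually-pretrained model with the Instruct model might look like; the `density`/`weight` values and model pairing here are illustrative guesses, not the actual settings behind kuotient/Llama-3-Ko-8B-ties.

```yaml
# Hypothetical Mergekit config for a TIES merge (values are assumptions,
# not the settings used for kuotient/Llama-3-Ko-8B-ties).
models:
  - model: beomi/Llama-3-Open-Ko-8B          # Korean continued-pretrain
    parameters:
      density: 0.5   # fraction of delta weights kept before sign-consensus
      weight: 0.5
  - model: meta-llama/Meta-Llama-3-8B-Instruct  # source of instruction-following
    parameters:
      density: 0.5
      weight: 0.5
merge_method: ties
base_model: meta-llama/Meta-Llama-3-8B        # deltas are taken against this
parameters:
  normalize: true
dtype: bfloat16
```

Swapping `merge_method` to `dare_ties` or `task_arithmetic` (with the parameters each method expects) would cover the other rows of the table.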
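The "chat vector" baseline in the table is weight arithmetic: take the delta between Instruct and base Llama 3, then add it onto the Korean continued-pretrain checkpoint. A minimal sketch of that recipe, with a hypothetical helper (`apply_chat_vector`) and state dicts modeled as plain lists of floats for illustration; with real checkpoints each value would be a tensor, and this is not beomi's actual script:

```python
# chat_vector = W(Instruct) - W(Base); merged = W(Korean-CP) + chat_vector
# (hypothetical helper sketching the recipe, not the author's code)

def apply_chat_vector(cp_sd, base_sd, inst_sd):
    merged = {}
    for name, w in cp_sd.items():
        if name in base_sd and len(base_sd[name]) == len(w):
            delta = [i - b for i, b in zip(inst_sd[name], base_sd[name])]
            merged[name] = [x + d for x, d in zip(w, delta)]
        else:
            # e.g. embedding rows added for Korean tokens: no matching
            # base weights exist, so keep the CP weights unchanged
            merged[name] = list(w)
    return merged

# Toy example: one shared weight matrix, one Korean-only embedding row
base  = {"mlp.w": [1.0, 2.0]}
inst  = {"mlp.w": [1.5, 2.5]}                  # chat vector = [0.5, 0.5]
ko_cp = {"mlp.w": [3.0, 4.0], "embed.ko": [9.0]}

print(apply_chat_vector(ko_cp, base, inst))
# {'mlp.w': [3.5, 4.5], 'embed.ko': [9.0]}
```

The two pipelines being compared differ only in where this addition happens: before fine-tuning (`CP + chat vector → FT`) or after it (`CP → FT + chat vector`).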