---
license: cc-by-nc-4.0
---

# **CoT-llama2-7B**

![img](./CoT-llama.png)

**More details in the GitHub repo: [CoT-llama2](https://github.com/Marker-Inc-Korea/CoT-llama2)**

## Model Details

CoT-llama2 is an auto-regressive language model based on the LLaMA2 transformer architecture.
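
For quick inference, here is a minimal sketch using Hugging Face `transformers`. The repo id below is an assumption (substitute this card's actual Hub id), and the prompt and generation settings are only illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for this card; replace with the actual one if it differs.
repo_id = "kyujinpy/CoT-llama-2k-7b"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # fits a 7B model on a single modern GPU
    device_map="auto",          # requires the `accelerate` package
)

# Illustrative Korean prompt: "Explain step by step why the sky is blue."
prompt = "하늘이 파란 이유를 단계별로 설명해줘."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```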

I used [KoCoT_2000](https://huggingface.co/datasets/kyujinpy/KoCoT_2000), created by translating [kaist-CoT](https://huggingface.co/datasets/kaist-ai/CoT-Collection) into Korean with DeepL.
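
The dataset is public on the Hub; a quick way to inspect it with the `datasets` library (the split name is an assumption, so check the printed `DatasetDict` if it differs):

```python
from datasets import load_dataset

# Load the Korean CoT dataset used for fine-tuning.
kocot = load_dataset("kyujinpy/KoCoT_2000")
print(kocot)              # available splits and row counts
print(kocot["train"][0])  # one raw example record
```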

I trained on an A100 40GB GPU in Colab.

**Training Hyperparameters**

| Hyperparameters | Value |
| --- | --- |
| batch_size | `64` |
| micro_batch_size | `1` |
| Epochs | `15` |
| learning_rate | `1e-5` |
| cutoff_len | `2048` |
| lr_scheduler | `linear` |
| base_model | `beomi/llama-2-ko-7b` |

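The original training script is not part of this card; as a rough sketch, the table above maps onto Hugging Face `TrainingArguments` roughly as below. The gradient-accumulation arithmetic (64 = 1 × 64) and every other detail here are assumptions for illustration:

```python
from transformers import TrainingArguments

# Sketch only: one way the hyperparameters above could be expressed.
# batch_size 64 with micro_batch_size 1 implies 64 accumulation steps.
args = TrainingArguments(
    output_dir="cot-llama2-7b",
    per_device_train_batch_size=1,   # micro_batch_size
    gradient_accumulation_steps=64,  # batch_size / micro_batch_size
    num_train_epochs=15,
    learning_rate=1e-5,
    lr_scheduler_type="linear",
    bf16=True,                       # reasonable choice on an A100
)
# cutoff_len = 2048 is applied at tokenization time, e.g.
# tokenizer(text, truncation=True, max_length=2048).
```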
# **Model Benchmark**

> Question Answering (QA)

### COPA (F1)

| Model | 0-shot | 5-shot | 10-shot | 50-shot |
| --- | --- | --- | --- | --- |
| [Polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.7196 | 0.7193 | 0.7204 | 0.7206 |

> Natural Language Inference (NLI)

### HellaSwag (F1)

| Model | 0-shot | 5-shot | 10-shot | 50-shot |
| --- | --- | --- | --- | --- |
| [Polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.5247 | 0.5260 | 0.5278 | 0.5427 |

> Question Answering (QA)

### BoolQ (F1)

| Model | 0-shot | 5-shot | 10-shot | 50-shot |
| --- | --- | --- | --- | --- |
| [Polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.3552 | 0.4751 | 0.4109 | 0.4038 |

> Classification

### SentiNeg (F1)

| Model | 0-shot | 5-shot | 10-shot | 50-shot |
| --- | --- | --- | --- | --- |
| [Polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.6790 | 0.6257 | 0.5514 | 0.7851 |
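
COPA, HellaSwag, BoolQ, and SentiNeg are the tasks of the KoBEST benchmark; assuming that suite is the source of the scores above, each task can be pulled from the Hub for inspection:

```python
from datasets import load_dataset

# KoBEST configs: "boolq", "copa", "hellaswag", "sentineg", "wic".
copa = load_dataset("skt/kobest_v1", "copa")
print(copa["test"][0])  # premise, two alternatives, and the gold label
```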