kyujinpy committed
Commit 8df9558
Parent: 34e288e

Upload README.md

Files changed (1)
  1. README.md +13 -6

README.md CHANGED
@@ -11,7 +11,7 @@ license: cc-by-nc-4.0
 # **CoT-llama2-7B**
 ![img](./CoT-llama.png)
 
-**More detail repo(Github): [CoT-llama2](Not yet)**
+**For more details, see the GitHub repo: [CoT-llama2](https://github.com/Marker-Inc-Korea/CoT-llama2)**
 
 ## Model Details
 
@@ -33,7 +33,18 @@ CoT-llama2 is an auto-regressive language model based on the LLaMA2 transformer
 I used [KoCoT_2000](https://huggingface.co/datasets/kyujinpy/KoCoT_2000).
 It was translated from [kaist-CoT](https://huggingface.co/datasets/kaist-ai/CoT-Collection) using DeepL.
 
-I use A100 GPU 40GB and COLAB, when trianing.
+I trained on a single A100 40GB GPU on Colab.
+
+**Training Hyperparameters**
+| Hyperparameter | Value |
+| --- | --- |
+| batch_size | `64` |
+| micro_batch_size | `1` |
+| Epochs | `15` |
+| learning_rate | `1e-5` |
+| cutoff_len | `2048` |
+| lr_scheduler | `linear` |
+| base_model | `beomi/llama-2-ko-7b` |
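For context on the translation step, here is a minimal sketch of how such a DeepL pass over CoT-Collection could look using the official `deepl` Python client. The field name `source`, the 2,000-sample cut, and the auth key are illustrative assumptions; the author's actual script is not published here.

```python
# Hypothetical sketch of the DeepL translation pass (not the author's script).
import deepl
from datasets import load_dataset

translator = deepl.Translator("YOUR_DEEPL_AUTH_KEY")  # placeholder auth key
cot = load_dataset("kaist-ai/CoT-Collection", split="train")

def to_korean(example):
    # "source" is an assumed field name, used here only for illustration.
    example["source_ko"] = translator.translate_text(
        example["source"], source_lang="EN", target_lang="KO"
    ).text
    return example

# The KoCoT_2000 name suggests roughly 2,000 translated samples.
cot_ko = cot.select(range(2000)).map(to_korean)
```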
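Likewise, a minimal `transformers` sketch consistent with the hyperparameter table above: batch_size 64 with micro_batch_size 1 implies 64 gradient-accumulation steps. The actual training code may differ; `output_dir` and the data handling are placeholders.

```python
# Hypothetical training setup matching the hyperparameter table above;
# the real CoT-llama2 training script may differ.
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments

base_model = "beomi/llama-2-ko-7b"    # base_model
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

args = TrainingArguments(
    output_dir="./CoT-llama2-7B",     # placeholder
    per_device_train_batch_size=1,    # micro_batch_size
    gradient_accumulation_steps=64,   # batch_size / micro_batch_size = 64 / 1
    num_train_epochs=15,              # Epochs
    learning_rate=1e-5,               # learning_rate
    lr_scheduler_type="linear",       # lr_scheduler
)

# Training examples would be tokenized with truncation at cutoff_len:
# tokenizer(text, truncation=True, max_length=2048)
```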
 
 
 # **Model Benchmark**
@@ -43,7 +54,6 @@ I use A100 GPU 40GB and COLAB, when trianing.
 
 > Question Answering (QA)
 ### COPA (F1)
-![jpg](./results/copa.jpg)
 | Model | 0-shot | 5-shot | 10-shot | 50-shot |
 | --- | --- | --- | --- | --- |
 | [Polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.7196 | 0.7193 | 0.7204 | 0.7206 |
@@ -58,7 +68,6 @@ I use A100 GPU 40GB and COLAB, when trianing.
 
 > Natural Language Inference (NLI)
 ### HellaSwag (F1)
-![jpg](./results/hella.jpg)
 | Model | 0-shot | 5-shot | 10-shot | 50-shot |
 | --- | --- | --- | --- | --- |
 | [Polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.5247 | 0.5260 | 0.5278 | 0.5427 |
@@ -73,7 +82,6 @@ I use A100 GPU 40GB and COLAB, when trianing.
 
 > Question Answering (QA)
 ### BoolQ (F1)
-![jpg](./results/bool.jpg)
 | Model | 0-shot | 5-shot | 10-shot | 50-shot |
 | --- | --- | --- | --- | --- |
 | [Polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.3552 | 0.4751 | 0.4109 | 0.4038 |
@@ -88,7 +96,6 @@ I use A100 GPU 40GB and COLAB, when trianing.
 
 > Classification
 ### SentiNeg (F1)
-![jpg](./results/senti.jpg)
 | Model | 0-shot | 5-shot | 10-shot | 50-shot |
 | --- | --- | --- | --- | --- |
 | [Polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.6790 | 0.6257 | 0.5514 | 0.7851 |
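COPA, HellaSwag, BoolQ, and SentiNeg here are the KoBEST benchmark tasks. Below is a hedged sketch of reproducing one few-shot column of these tables with EleutherAI's lm-evaluation-harness, assuming its `kobest_*` task ports; the model id shown is the baseline from the tables, and the author's exact evaluation setup may differ.

```python
# Hypothetical sketch: scoring one few-shot column of the tables above with
# EleutherAI's lm-evaluation-harness (pip install lm-eval). The kobest_* task
# names assume the harness's KoBEST ports; swap in this repo's model id to
# evaluate CoT-llama2-7B itself.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=EleutherAI/polyglot-ko-1.3b",
    tasks=["kobest_copa", "kobest_hellaswag", "kobest_boolq", "kobest_sentineg"],
    num_fewshot=5,  # rerun with 0, 10, and 50 for the other columns
)
print(results["results"])  # per-task metrics, including F1
```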