---
license: cc-by-nc-4.0
---

# **CoT-llama2-7B**

![img](./CoT-llama.png)

**More details in the GitHub repo: [CoT-llama2](https://github.com/Marker-Inc-Korea/CoT-llama2)**

## Model Details

CoT-llama2 is an auto-regressive language model based on the LLaMA2 transformer architecture.
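
For quick inference, here is a minimal sketch using Hugging Face `transformers`. The repo id below is an assumption (substitute this card's actual Hub id), and the prompt and generation settings are only illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for this card; replace with the actual one if it differs.
repo_id = "kyujinpy/CoT-llama-2k-7b"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # fits a 7B model on a single modern GPU
    device_map="auto",          # requires the `accelerate` package
)

# Illustrative Korean prompt: "Explain step by step why the sky is blue."
prompt = "하늘이 파란 이유를 단계별로 설명해줘."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```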

I used [KoCoT_2000](https://huggingface.co/datasets/kyujinpy/KoCoT_2000), created by translating [kaist-CoT](https://huggingface.co/datasets/kaist-ai/CoT-Collection) into Korean with DeepL.
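
The dataset is public on the Hub; a quick way to inspect it with the `datasets` library (the split name is an assumption, so check the printed `DatasetDict` if it differs):

```python
from datasets import load_dataset

# Load the Korean CoT dataset used for fine-tuning.
kocot = load_dataset("kyujinpy/KoCoT_2000")
print(kocot)              # available splits and row counts
print(kocot["train"][0])  # one raw example record
```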

I trained on an A100 40GB GPU in Colab.

**Training Hyperparameters**

| Hyperparameters | Value |
| --- | --- |
| batch_size | `64` |
| micro_batch_size | `1` |
| Epochs | `15` |
| learning_rate | `1e-5` |
| cutoff_len | `2048` |
| lr_scheduler | `linear` |
| base_model | `beomi/llama-2-ko-7b` |

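The original training script is not part of this card; as a rough sketch, the table above maps onto Hugging Face `TrainingArguments` roughly as below. The gradient-accumulation arithmetic (64 = 1 × 64) and every other detail here are assumptions for illustration:

```python
from transformers import TrainingArguments

# Sketch only: one way the hyperparameters above could be expressed.
# batch_size 64 with micro_batch_size 1 implies 64 accumulation steps.
args = TrainingArguments(
    output_dir="cot-llama2-7b",
    per_device_train_batch_size=1,   # micro_batch_size
    gradient_accumulation_steps=64,  # batch_size / micro_batch_size
    num_train_epochs=15,
    learning_rate=1e-5,
    lr_scheduler_type="linear",
    bf16=True,                       # reasonable choice on an A100
)
# cutoff_len = 2048 is applied at tokenization time, e.g.
# tokenizer(text, truncation=True, max_length=2048).
```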
# **Model Benchmark**

> Question Answering (QA)

### COPA (F1)

| Model | 0-shot | 5-shot | 10-shot | 50-shot |
| --- | --- | --- | --- | --- |
| [Polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.7196 | 0.7193 | 0.7204 | 0.7206 |

> Natural Language Inference (NLI)

### HellaSwag (F1)

| Model | 0-shot | 5-shot | 10-shot | 50-shot |
| --- | --- | --- | --- | --- |
| [Polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.5247 | 0.5260 | 0.5278 | 0.5427 |

> Question Answering (QA)

### BoolQ (F1)

| Model | 0-shot | 5-shot | 10-shot | 50-shot |
| --- | --- | --- | --- | --- |
| [Polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.3552 | 0.4751 | 0.4109 | 0.4038 |

> Classification

### SentiNeg (F1)

| Model | 0-shot | 5-shot | 10-shot | 50-shot |
| --- | --- | --- | --- | --- |
| [Polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.6790 | 0.6257 | 0.5514 | 0.7851 |
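
COPA, HellaSwag, BoolQ, and SentiNeg are the tasks of the KoBEST benchmark; assuming that suite is the source of the scores above, each task can be pulled from the Hub for inspection:

```python
from datasets import load_dataset

# KoBEST configs: "boolq", "copa", "hellaswag", "sentineg", "wic".
copa = load_dataset("skt/kobest_v1", "copa")
print(copa["test"][0])  # premise, two alternatives, and the gold label
```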