|
--- |
|
license: other |
|
datasets: |
|
- nsmc |
|
language: |
|
- ko |
|
--- |
|
|
|
# Korean GPT Bot Sentiment Classification (ko-gpt-bot-sc) |
|
|
|
### Method |
|
- Promt-Tuning/Prefix-tuning/Soft Embedding |
|
- Parameters |
|
| Parameters | No. | |
|
|------------|---------------------| |
|
| All | 6173039616 (100.0%) | |
|
| Trainable | 6537216 (0.1%) | |
|
| Freezed | 6166502400 (99.9%) | |
|
|
|
<img src="https://huggingface.co/anhdungitvn/ko-gpt-bot-sc-7b/resolve/main/metrics/prompt_tuning.png" width="800"> |
|
|
|
|
|
### Model |
|
``` |
|
LAYER NAME #PARAMS RATIO MEM(MB) |
|
--model: 6,177,233,921 100.00% 23552.28 |
|
--learned_embedding: 6,537,216 0.11% 24.94 |
|
--transformer: 5,906,391,041 95.62% 22519.09 |
|
--wte |
|
--weight: 264,241,152 4.28% 1008.00 |
|
--h: 5,642,141,697 91.34% 21511.06 |
|
--0: 205,549,569 3.33% 772.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn: 71,303,169 1.15% 260.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--1(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--2(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--3(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--4(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--5(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--6(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--7(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--8(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--9(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--10(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--11(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--12(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--13(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--14(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--15(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--16(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--17(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--18(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--19(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--20(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--21(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--22(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--23(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--24(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--25(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--26(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--27(partially shared): 201,355,264 3.26% 768.11 |
|
--ln_1: 8,192 0.00% 0.03 |
|
--attn(shared): 67,108,864 1.09% 256.00 |
|
--mlp: 134,238,208 2.17% 512.08 |
|
--ln_f: 8,192 0.00% 0.03 |
|
--weight: 4,096 0.00% 0.02 |
|
--bias: 4,096 0.00% 0.02 |
|
--lm_head: 264,305,664 4.28% 1008.25 |
|
--weight: 264,241,152 4.28% 1008.00 |
|
--bias: 64,512 0.00% 0.25 |
|
``` |
|
|
|
### Metrics |
|
|
|
| Metric | Value | |
|
|--------|--------| |
|
| step | 520 | |
|
| loss | 3.1814 | |
|
|
|
| | precision | recall | f1-score | support | |
|
|--------------|-----------|--------|----------|---------| |
|
| 긍정 | 0.92549 | 0.944 | 0.934653 | 500 | |
|
| 부정 | 0.942857 | 0.924 | 0.933333 | 500 | |
|
| accuracy | 0.934 | 0.934 | 0.934 | 0.934 | |
|
| macro avg | 0.934174 | 0.934 | 0.933993 | 1000 | |
|
| weighted avg | 0.934174 | 0.934 | 0.933993 | 1000 | |
|
|
|
|
|
<img src="https://huggingface.co/anhdungitvn/ko-gpt-bot-sc/resolve/main/metrics/loss.png" width="800"> |
|
<img src="https://huggingface.co/anhdungitvn/ko-gpt-bot-sc/resolve/main/metrics/labels_preds.png" width="800"> |
|
<img src="https://huggingface.co/anhdungitvn/ko-gpt-bot-sc/resolve/main/metrics/test_dataset_preds-01.png" width="800"> |
|
|
|
|
|
|
|
### References |
|
- Prompt Tuning: <a href="https://arxiv.org/abs/2104.08691" download>**The Power of Scale for Parameter-Efficient Prompt Tuning**</a> |
|
- Prompt Tuning v2: <a href="https://arxiv.org/abs/2110.07602" download>**P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks**</a> |