tlphams committed
Commit 8556743 β€’ 1 Parent(s): 33eed1f

Update README.md

Files changed (1)
1. README.md +78 -0
README.md CHANGED
@@ -1,3 +1,81 @@
  ---
  license: cc-by-nc-4.0
+ base_model: EleutherAI/polyglot-ko-12.8b
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: gollm-12.8b-instruct-v2.2
+   results: []
  ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # gollm-12.8b-instruct-v2.2
+
+ This model is a fine-tuned version of [EleutherAI/polyglot-ko-12.8b](https://huggingface.co/EleutherAI/polyglot-ko-12.8b) on a custom mixed dataset.
+
+ ## Model description
+
+ - No-context template
+
+ ```
+ μ•„λž˜λŠ” μž‘μ—…μ„ μ„€λͺ…ν•˜λŠ” μ§ˆλ¬Έμ–΄μ™€ μΆ”κ°€ μ»¨ν…μŠ€νŠΈλ₯Ό μ œκ³΅ν•˜λŠ” λ§₯락이 ν•¨κ»˜ μ œκ³΅λ©λ‹ˆλ‹€. μš”μ²­μ„ 적절히 μ™„λ£Œν•˜λŠ” 닡변을 μž‘μ„±ν•˜μ„Έμš”.
+
+ ### 질문:
+ {instruction}
+
+ ### λ‹΅λ³€:
+
+ ```
+
+ - With-context template (a usage sketch follows the templates)
+
+ ```
+ μ•„λž˜λŠ” μž‘μ—…μ„ μ„€λͺ…ν•˜λŠ” μ§ˆλ¬Έμ–΄μ™€ μΆ”κ°€ μ»¨ν…μŠ€νŠΈλ₯Ό μ œκ³΅ν•˜λŠ” λ§₯락이 ν•¨κ»˜ μ œκ³΅λ©λ‹ˆλ‹€. μš”μ²­μ„ 적절히 μ™„λ£Œν•˜λŠ” 닡변을 μž‘μ„±ν•˜μ„Έμš”.
+
+ ### λ§₯락:
+ {input}
+
+ ### 질문:
+ {instruction}
+
+ ### λ‹΅λ³€:
+
+ ```
+
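+ A minimal usage sketch, not part of the original card: it fills the two templates above and runs generation with transformers. The Korean preamble translates to "Below, a question describing the task is provided together with context that gives additional context. Write a response that appropriately completes the request."; the section markers are 질문 (question), λ§₯락 (context), and λ‹΅λ³€ (answer). The repo id `tlphams/gollm-12.8b-instruct-v2.2`, the helper `build_prompt`, and the generation settings are illustrative assumptions.
+
+ ```python
+ from typing import Optional
+
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Assumed repo id, derived from the model name on this card.
+ MODEL_ID = "tlphams/gollm-12.8b-instruct-v2.2"
+
+ HEADER = (
+     "μ•„λž˜λŠ” μž‘μ—…μ„ μ„€λͺ…ν•˜λŠ” μ§ˆλ¬Έμ–΄μ™€ μΆ”κ°€ μ»¨ν…μŠ€νŠΈλ₯Ό μ œκ³΅ν•˜λŠ” λ§₯락이 ν•¨κ»˜ μ œκ³΅λ©λ‹ˆλ‹€. "
+     "μš”μ²­μ„ 적절히 μ™„λ£Œν•˜λŠ” 닡변을 μž‘μ„±ν•˜μ„Έμš”."
+ )
+
+ def build_prompt(instruction: str, context: Optional[str] = None) -> str:
+     """Fill the with-context template ({input} slot) when `context` is given,
+     otherwise the no-context template."""
+     middle = f"### λ§₯락:\n{context}\n\n" if context else ""
+     return f"{HEADER}\n\n{middle}### 질문:\n{instruction}\n\n### λ‹΅λ³€:\n"
+
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
+ model = AutoModelForCausalLM.from_pretrained(
+     MODEL_ID, torch_dtype=torch.float16, device_map="auto"
+ )
+
+ prompt = build_prompt("λŒ€ν•œλ―Όκ΅­μ˜ μˆ˜λ„λŠ” μ–΄λ””μΈκ°€μš”?")  # "What is the capital of South Korea?"
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ output = model.generate(**inputs, max_new_tokens=128)
+ # Decode only the newly generated tokens after the prompt.
+ print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
+ ```
+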
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ - Self-introduction (20 samples)
+ - High-quality reasoning dataset built from private documents, with QA pairs generated by Claude AI (1.3k samples)
+ - EverythingLM-v2 (0.9k samples)
+ - KoCoT (2k samples)
+ - Private MRC dataset, with answers generated by GPT-3.5 (55k samples)
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training (a sketch of the equivalent Trainer configuration follows the list):
+ - learning_rate: 5e-05
+ - train_batch_size: 2
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 8
+ - total_train_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 8
+ - stop_at_epoch: 4
+
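+ A hedged sketch, not from the card, of a transformers `TrainingArguments` object matching the values above. `stop_at_epoch` is not a standard `TrainingArguments` field; training 4 of the planned 8 epochs is approximated here by capping `num_train_epochs` at 4. The `output_dir` is an assumption.
+
+ ```python
+ from transformers import TrainingArguments
+
+ training_args = TrainingArguments(
+     output_dir="gollm-12.8b-instruct-v2.2",  # assumed output path
+     learning_rate=5e-05,
+     per_device_train_batch_size=2,      # train_batch_size
+     per_device_eval_batch_size=8,       # eval_batch_size
+     seed=42,
+     gradient_accumulation_steps=8,      # 2 x 8 = total_train_batch_size of 16
+     adam_beta1=0.9,                     # optimizer: Adam with betas=(0.9,0.999)
+     adam_beta2=0.999,
+     adam_epsilon=1e-08,                 # and epsilon=1e-08
+     lr_scheduler_type="linear",
+     num_train_epochs=4,                 # num_epochs was 8; training stopped at epoch 4
+ )
+ ```
+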
+ ### Framework versions
+
+ - Transformers 4.32.0.dev0
+ - Pytorch 2.0.0+cu117
+ - Datasets 2.11.0
+ - Tokenizers 0.13.3