drlee1 committed · verified
Commit ca81d01 · 1 Parent(s): 6465126

Update README.md

Files changed (1):
  1. README.md +64 -8
README.md CHANGED
@@ -1,6 +1,11 @@
  ---
  library_name: transformers
- tags: []
+ datasets:
+ - daekeun-ml/naver-news-summarization-ko
+ language:
+ - ko
+ base_model:
+ - google/gemma-2-9b-it
  ---

  # Model Card for Model ID
@@ -35,7 +40,47 @@ This is the model card of a 🤗 transformers model that has been pushed on the

  ## Uses

- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+ ```python
+ import torch
+ from peft import PeftModel
+ from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
+
+ MODEL_ID = "google/gemma-2-9b-it"
+ PEFT_MODEL_ID = "drlee1/gemma2-9b-it-qdora-summary"
+
+ model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto", torch_dtype=torch.float16)
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
+
+ model = PeftModel.from_pretrained(model, PEFT_MODEL_ID)  # attach the QDoRA adapter
+
+ pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=512)
+
+ doc = "..."  # the article to summarize
+
+ messages = [
+     {"role": "user", "content": "다음 글을 요약해주세요:\n\n{}".format(doc)}  # "Please summarize the following text:"
+ ]
+
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+
+ outputs = pipe(
+     prompt,
+     do_sample=True,
+     temperature=0.2,
+     top_k=50,
+     top_p=0.95,
+     add_special_tokens=True,
+ )
+
+ print(outputs[0]["generated_text"][len(prompt):])
+ ```
+
+ ### Template
+
+ ```text
+ # chat template
+ <bos><start_of_turn>user\n다음 글을 요약해주세요:\n\n{data}<end_of_turn>\n<start_of_turn>model\n{label}
+ ```
+

  ### Direct Use

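The template above is what Gemma 2's chat template produces for a single user turn, with the reference summary appended as the model turn. A minimal sketch of that correspondence, assuming the same tokenizer as the usage example; `doc` and `summary` are hypothetical placeholders:

```python
# Sketch: relate the documented training template to the tokenizer's chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")

doc = "..."      # source article ({data} in the template)
summary = "..."  # reference summary ({label} in the template)

messages = [{"role": "user", "content": "다음 글을 요약해주세요:\n\n" + doc}]

# Yields "<bos><start_of_turn>user\n다음 글을 요약해주세요:\n\n{data}<end_of_turn>\n<start_of_turn>model\n".
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Appending the label reproduces the training string from the Template section.
train_text = prompt + summary
```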
 
@@ -83,7 +128,9 @@ Use the code below to get started with the model.

  ### Training Procedure

- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+ - SFT (supervised fine-tuning)
+ - Quantization
+ - DoRA (weight-decomposed low-rank adaptation)

  #### Preprocessing [optional]

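The three bullets added above describe quantized fine-tuning with DoRA adapters (QDoRA). A minimal sketch of one way to wire that up with `bitsandbytes` and `peft`; the rank, alpha, and target modules are illustrative assumptions, not values taken from this commit:

```python
# Sketch of a QDoRA setup: 4-bit quantized base model + DoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b-it",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # standard prep for k-bit training

peft_config = LoraConfig(
    r=16,                 # assumed rank
    lora_alpha=32,        # assumed scaling
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    use_dora=True,        # DoRA instead of plain LoRA
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, peft_config)
```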
 
@@ -92,7 +139,12 @@ Use the code below to get started with the model.

  #### Training Hyperparameters

- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+ - per_device_train_batch_size: 2
+ - gradient_accumulation_steps: 4
+ - optimizer: paged_adamw_8bit
+ - learning_rate: 2e-4
+ - bf16: True
+ - max_steps: 500

  #### Speeds, Sizes, Times [optional]

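The hyperparameters above map directly onto `trl`'s `SFTConfig` (a subclass of `TrainingArguments`). A sketch under that assumption; `output_dir` is a placeholder and `logging_steps` is inferred from the loss table in the next hunk:

```python
# The listed hyperparameters expressed as a trl SFTConfig.
from trl import SFTConfig

args = SFTConfig(
    output_dir="gemma2-9b-it-qdora-summary",  # placeholder
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    optim="paged_adamw_8bit",
    learning_rate=2e-4,
    bf16=True,
    max_steps=500,
    logging_steps=100,  # inferred from the loss table below
)
```

Together with the quantized DoRA model from the previous sketch, this config would be passed to `SFTTrainer` along with the dataset.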
 
@@ -120,9 +172,15 @@ Use the code below to get started with the model.

  #### Metrics

- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+ - Training Loss

- [More Information Needed]
+ | Step | Training Loss |
+ | ---- | ------------- |
+ | 100  | 1.528100      |
+ | 200  | 1.409400      |
+ | 300  | 1.372800      |
+ | 400  | 1.325900      |
+ | 500  | 1.341600      |

  ### Results

 
@@ -134,8 +192,6 @@ Use the code below to get started with the model.

  ## Model Examination [optional]

- <!-- Relevant interpretability work for the model goes here -->
-
  [More Information Needed]

  ## Environmental Impact
 