sjrhuschlee committed on
Commit
51e5707
1 Parent(s): 638828b

Update README.md

Files changed (1)
  1. README.md +47 -1
README.md CHANGED
@@ -142,4 +142,50 @@ answer = tokenizer.decode(tokenizer.convert_tokens_to_ids(answer_tokens))
  from peft import LoraConfig, PeftModelForQuestionAnswering
  from transformers import AutoModelForQuestionAnswering, AutoTokenizer
  model_name = "sjrhuschlee/deberta-v3-large-squad2"
- ```
+ ```
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 24
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 1
+ - total_train_batch_size: 24
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 4.0
+
+ ### LoRA Config
+ ```
+ {
+   "base_model_name_or_path": "microsoft/deberta-v3-large",
+   "bias": "none",
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "lora_alpha": 32,
+   "lora_dropout": 0.1,
+   "modules_to_save": ["qa_outputs"],
+   "peft_type": "LORA",
+   "r": 8,
+   "target_modules": [
+     "query_proj",
+     "key_proj",
+     "value_proj",
+     "dense"
+   ],
+   "task_type": "QUESTION_ANS"
+ }
+ ```
+
+ ### Framework versions
+
+ - Transformers 4.30.0.dev0
+ - Pytorch 2.0.1+cu117
+ - Datasets 2.12.0
+ - Tokenizers 0.13.3