Commit d647b79 (verified), committed by avrecum · 1 parent: 237c45c

Update README.md

Files changed (1)
  1. README.md +11 -0
README.md CHANGED
@@ -7,3 +7,14 @@ pipeline_tag: text-generation
  <!-- Provide a quick summary of what the model is/does. -->
  
  Mistral 7B v0.3 finetuned on cleaned Stanford Alpaca dataset using LoRA
+
+ Model was finetuned for 1 epoch using the paged_adamw_8bit optimizer with these params:
+ per_device_train_batch_size = 10,
+ gradient_accumulation_steps = 4,
+ warmup_steps = 5,
+ num_train_epochs = 1,
+ learning_rate = 2e-4,
+ optim = "paged_adamw_8bit",
+ weight_decay = 0.01,
+ lr_scheduler_type = "linear",
+ seed = 3407
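
The parameter names added in this commit match fields of `transformers.TrainingArguments`, although the commit itself does not show the training script. As a rough illustration of what the values imply, the dependency-free sketch below computes the effective batch size and the learning rate under a linear schedule with warmup. The device count and `TOTAL_STEPS` are assumptions made for the example, not values from the README.

```python
# Hyperparameters copied from the README lines added in this commit.
params = {
    "per_device_train_batch_size": 10,
    "gradient_accumulation_steps": 4,
    "warmup_steps": 5,
    "num_train_epochs": 1,
    "learning_rate": 2e-4,
    "optim": "paged_adamw_8bit",
    "weight_decay": 0.01,
    "lr_scheduler_type": "linear",
    "seed": 3407,
}

# Hypothetical number of optimizer steps; the README does not state it.
TOTAL_STEPS = 100


def effective_batch_size(p, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return (
        p["per_device_train_batch_size"]
        * p["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, total_steps, p):
    """Learning rate at `step`: linear warmup from 0, then linear decay to 0."""
    warmup = p["warmup_steps"]
    peak = p["learning_rate"]
    if step < warmup:
        return peak * step / max(1, warmup)
    remaining = (total_steps - step) / max(1, total_steps - warmup)
    return peak * max(0.0, remaining)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(params))
    for step in (0, 2, 5, 50, TOTAL_STEPS):
        print(step, linear_lr(step, TOTAL_STEPS, params))
```

On a single device, each optimizer step therefore covers 10 × 4 = 40 examples, and the learning rate reaches its peak of 2e-4 after the 5 warmup steps before decaying linearly to zero.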