kaizuberbuehler committed on
Commit 06607c1
1 Parent(s): ab54f6e

Update README.md

Files changed (1)
  1. README.md +16 -1
README.md CHANGED
@@ -12,7 +12,22 @@ license: llama3
 
 ## Training Details
 
-
+Hardware: 1x RTX 4090
+Duration: ~30 hours in total (~2 hours for first phase and ~28 hours for second phase)
+
+### Hyperparameters
+
+Adapter: QLoRA
+Precision: 4 bit
+Optimizer: adamw_bnb_8bit
+LoRA Rank: 256
+LoRA Alpha: 256
+Learning Rate: 1e-5
+Context Length: 4096 tokens
+Batch Size: 1
+Gradient Accumulation Steps: 1
+Sample Packing: Off for first phase, on for second phase
+Epochs: 2
 
 ## Limitations
 
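
The hyperparameters added in this commit imply a few derived quantities that may be easier to see in code. The following is a minimal sketch, not the author's training script: the README does not name the training framework, so the `TrainingConfig` class, its field names, and the `SAMPLE_PACKING` and `PHASE_HOURS` dictionaries are all hypothetical. It assumes the standard LoRA convention of scaling the adapter update by alpha / rank.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Values copied from the README; the field names are hypothetical."""

    adapter: str = "QLoRA"
    precision_bits: int = 4
    optimizer: str = "adamw_bnb_8bit"
    lora_rank: int = 256
    lora_alpha: int = 256
    learning_rate: float = 1e-5
    context_length: int = 4096
    batch_size: int = 1
    gradient_accumulation_steps: int = 1
    epochs: int = 2

    @property
    def lora_scaling(self) -> float:
        # Assumption: standard LoRA scales the low-rank update by alpha / rank.
        return self.lora_alpha / self.lora_rank

    @property
    def effective_batch_size(self) -> int:
        # Sequences consumed per optimizer step on the single GPU.
        return self.batch_size * self.gradient_accumulation_steps

    @property
    def max_tokens_per_step(self) -> int:
        # Upper bound: reached only when each sequence fills the context window.
        return self.effective_batch_size * self.context_length


# Sample packing differs per phase, so it is kept outside the shared config.
SAMPLE_PACKING = {"phase_1": False, "phase_2": True}

# Approximate wall-clock hours per phase, as reported in the README.
PHASE_HOURS = {"phase_1": 2, "phase_2": 28}

config = TrainingConfig()
print("LoRA scaling:", config.lora_scaling)
print("Effective batch size:", config.effective_batch_size)
print("Max tokens per optimizer step:", config.max_tokens_per_step)
print("Total hours (approx.):", sum(PHASE_HOURS.values()))
```

With rank and alpha both at 256 the scaling factor comes out to 1.0, and with batch size 1 and no gradient accumulation each optimizer step sees a single sequence of at most 4096 tokens. That may be why sample packing is enabled for the longer second phase, though the README does not state the reason.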