Commit b23fe7d by BramVanroy
1 parent: 92b6946
Update README.md
README.md CHANGED
@@ -41,9 +41,12 @@ Trained with LoRA targetting `["q_proj", "v_proj"]` in 4 bit and merged before u
 
 The adapters are in the `adapters` branch.
 
+Initial training investigation on the Tier-1 HPC of [Vlaams Supercomputer Centrum (VSC)](https://www.vscentrum.be/) and training on our own research cluster of 4x 3090s.
+
+
 ### Training hyperparameters
 
-The following hyperparameters were used during training:
+The following hyperparameters were used during training in the HPC investigation:
 - learning_rate: 0.0003
 - train_batch_size: 12
 - eval_batch_size: 12