Update README.md
README.md (CHANGED)
````diff
@@ -149,16 +149,16 @@ with torch.cuda.amp.autocast():
 print('\n\n', tokenizer.decode(output_tokens[0], skip_special_tokens=True))
 ```
 
-
-
-- See attached [Colab Notebook](https://huggingface.co/dfurman/falcon-7b-chat-oasst1/blob/main/finetune_falcon7b_oasst1_with_bnb_peft.ipynb) for the code (and hyperparams) used to train the model.
-
-### CUDA Info
+### Training Procedure
 
 - CUDA Version: 12.0
--
+- Hardware: 1 A100-SXM
 - Max Memory: {0: "37GB"}
 - Device Map: {"": 0}
+- Optimizer: paged_adamw_8bit
+- Gradient Accumulations: 4
+- Dataset Size: 9823 conversation trees
+- Learning Rate: 2e-5
 
 ### Package Versions Employed
 
````
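For context, the hyperparameters added in the new "Training Procedure" section map onto a standard `transformers` fine-tuning setup roughly as sketched below. This is an illustrative sketch, not the card's actual training code (the previous revision linked a Colab notebook using bitsandbytes and PEFT for that); the base checkpoint, `output_dir`, batch size, dtype, and logging settings here are assumptions, and the quantization/LoRA pieces from the notebook are omitted for brevity.

```python
import torch
from transformers import AutoModelForCausalLM, TrainingArguments

# Load the base model onto GPU 0, mirroring the card's
# Device Map ({"": 0}) and Max Memory ({0: "37GB"}) settings.
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",              # assumed base checkpoint
    trust_remote_code=True,          # Falcon required custom code at the time
    torch_dtype=torch.bfloat16,      # assumed dtype
    device_map={"": 0},
    max_memory={0: "37GB"},
)

# Map the card's listed hyperparameters onto TrainingArguments.
training_args = TrainingArguments(
    output_dir="falcon-7b-chat-oasst1",  # assumed output directory
    per_device_train_batch_size=1,       # assumption; not stated in the card
    gradient_accumulation_steps=4,       # "Gradient Accumulations: 4"
    learning_rate=2e-5,                  # "Learning Rate: 2e-5"
    optim="paged_adamw_8bit",            # "Optimizer: paged_adamw_8bit" (needs bitsandbytes)
    bf16=True,                           # assumed; consistent with A100 hardware
    logging_steps=10,                    # assumed
)
```

The `paged_adamw_8bit` optimizer comes from `bitsandbytes`; its paged optimizer states can spill from GPU to CPU memory under pressure, which, together with gradient accumulation, is what makes a 7B fine-tune feasible within the card's 37GB single-GPU budget.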