Update README.md
README.md
CHANGED
@@ -36,7 +36,8 @@ This model is a fine-tuned version of **Qwen 2.5-14B**, trained on QwQ 32B Previ
 - Gradient Accumulation Steps: 2 (Effective Batch Size: 16)
 - Warm-Up Steps: 5
 - Weight Decay: 0.01
-- **Training Steps**: 500 steps
+- **Training Steps**: 500 steps
+- **Hardware Information**: A100-80GB
 
 ---
 