Update README.md
README.md (CHANGED)
@@ -81,12 +81,12 @@ The model was fine-tuned using the mlabonne/mini-platypus dataset, which consist
 
 The training utilized a supervised fine-tuning procedure with the following hyperparameters:
 
-Training regime: bf16 mixed precision
-Number of epochs: 1
-Batch size: 10
-Learning rate: 2e-4
-Warmup steps: 10
-Gradient accumulation steps: 1
+- Training regime: bf16 mixed precision
+- Number of epochs: 1
+- Batch size: 10
+- Learning rate: 2e-4
+- Warmup steps: 10
+- Gradient accumulation steps: 1
 
 
 #### Training Hyperparameters

@@ -94,17 +94,12 @@ Gradient accumulation steps: 1
 Training regime: bf16 mixed precision
 Explanation: The model was trained using bfloat16 (bf16) mixed precision, which allows for faster training times and reduced memory usage compared to traditional fp32 (float32). This precision format is particularly beneficial when working with large models, as it helps to maintain numerical stability while optimizing performance on compatible hardware.
 
-Number of epochs: 1
-
-
-
-
-
-Warmup steps: 10
-
-Gradient accumulation steps: 1
-
-Evaluation strategy: Evaluations are performed every 1000 steps to monitor the model's performance during training.
+- Number of epochs: 1
+- Batch size: 10
+- Learning rate: 2e-4
+- Warmup steps: 10
+- Gradient accumulation steps: 1
+- Evaluation strategy: Evaluations are performed every 1000 steps to monitor the model's performance during training.
 
 
 ### Testing Data

@@ -135,4 +130,4 @@ PyTorch, Transformers Library (from Hugging Face),PEFT, TRL, WandB, Intel Extens
 
 ## Model Card Contact
 
-Md. Jannatul nayem | [Mail](nayemalimran106@gmail.com) | [LinkedIn](https://www.linkedin.com/in/md-jannatul-nayem)
+🤖 Md. Jannatul nayem | [Mail](nayemalimran106@gmail.com) | [LinkedIn](https://www.linkedin.com/in/md-jannatul-nayem)
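
The hyperparameters listed in the diff above map naturally onto a `transformers`/`trl` supervised fine-tuning setup. The sketch below is a minimal illustration under assumptions, not the card's actual training script: the base-model id, the train/eval split, the output directory, and the logging backend choice are placeholders, while the hyperparameter values themselves come directly from the list above.

```python
# Hedged sketch: the card lists hyperparameters but not the training script, so
# the base model id and the train/eval split below are placeholders/assumptions.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

BASE_MODEL = "your-base-model-id"  # placeholder: the base model is not named in this section

# Dataset named in the card; the small held-out split is an assumption so that
# "evaluate every 1000 steps" has something to evaluate on.
splits = load_dataset("mlabonne/mini-platypus", split="train").train_test_split(test_size=0.05)

# Hyperparameter values taken from the list in the diff above.
args = TrainingArguments(
    output_dir="sft-output",
    bf16=True,                       # Training regime: bf16 mixed precision
    num_train_epochs=1,              # Number of epochs: 1
    per_device_train_batch_size=10,  # Batch size: 10
    learning_rate=2e-4,              # Learning rate: 2e-4
    warmup_steps=10,                 # Warmup steps: 10
    gradient_accumulation_steps=1,   # Gradient accumulation steps: 1
    eval_strategy="steps",           # on older transformers: evaluation_strategy="steps"
    eval_steps=1000,                 # Evaluation strategy: evaluate every 1000 steps
    report_to="wandb",               # WandB is listed among the frameworks used
)

trainer = SFTTrainer(
    model=BASE_MODEL,                # SFTTrainer accepts a Hub model id or an already loaded model
    args=args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
)
trainer.train()
```

PEFT adapters and the Intel Extension mentioned in the card's frameworks list are omitted here to keep the sketch short.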