Update README.md
README.md (CHANGED)
@@ -81,12 +81,12 @@ The model was fine-tuned using the mlabonne/mini-platypus dataset, which consist
 
 The training utilized a supervised fine-tuning procedure with the following hyperparameters:
 
-Training regime: bf16 mixed precision
-Number of epochs: 1
-Batch size: 10
-Learning rate: 2e-4
-Warmup steps: 10
-Gradient accumulation steps: 1
+- Training regime: bf16 mixed precision
+- Number of epochs: 1
+- Batch size: 10
+- Learning rate: 2e-4
+- Warmup steps: 10
+- Gradient accumulation steps: 1
 
 
 #### Training Hyperparameters

@@ -94,17 +94,12 @@ Gradient accumulation steps: 1
 Training regime: bf16 mixed precision
 Explanation: The model was trained using bfloat16 (bf16) mixed precision, which allows for faster training times and reduced memory usage compared to traditional fp32 (float32). This precision format is particularly beneficial when working with large models, as it helps to maintain numerical stability while optimizing performance on compatible hardware.
 
-Number of epochs: 1
-
-
-
-
-
-Warmup steps: 10
-
-Gradient accumulation steps: 1
-
-Evaluation strategy: Evaluations are performed every 1000 steps to monitor the model's performance during training.
+- Number of epochs: 1
+- Batch size: 10
+- Learning rate: 2e-4
+- Warmup steps: 10
+- Gradient accumulation steps: 1
+- Evaluation strategy: Evaluations are performed every 1000 steps to monitor the model's performance during training.
 
 
 ### Testing Data

@@ -135,4 +130,4 @@ PyTorch, Transformers Library (from Hugging Face),PEFT, TRL, WandB, Intel Extens
 
 ## Model Card Contact
 
-Md. Jannatul nayem | [Mail](nayemalimran106@gmail.com) | [LinkedIn](https://www.linkedin.com/in/md-jannatul-nayem)
+🤖 Md. Jannatul nayem | [Mail](nayemalimran106@gmail.com) | [LinkedIn](https://www.linkedin.com/in/md-jannatul-nayem)
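
The hyperparameters listed in the diff above map naturally onto a `transformers`/`trl` supervised fine-tuning setup. The sketch below is a minimal illustration under assumptions, not the card's actual training script: the base-model id, the train/eval split, the output directory, and the logging backend choice are placeholders, while the hyperparameter values themselves come directly from the list above.

```python
# Hedged sketch: the card lists hyperparameters but not the training script, so
# the base model id and the train/eval split below are placeholders/assumptions.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

BASE_MODEL = "your-base-model-id"  # placeholder: the base model is not named in this section

# Dataset named in the card; the small held-out split is an assumption so that
# "evaluate every 1000 steps" has something to evaluate on.
splits = load_dataset("mlabonne/mini-platypus", split="train").train_test_split(test_size=0.05)

# Hyperparameter values taken from the list in the diff above.
args = TrainingArguments(
    output_dir="sft-output",
    bf16=True,                       # Training regime: bf16 mixed precision
    num_train_epochs=1,              # Number of epochs: 1
    per_device_train_batch_size=10,  # Batch size: 10
    learning_rate=2e-4,              # Learning rate: 2e-4
    warmup_steps=10,                 # Warmup steps: 10
    gradient_accumulation_steps=1,   # Gradient accumulation steps: 1
    eval_strategy="steps",           # on older transformers: evaluation_strategy="steps"
    eval_steps=1000,                 # Evaluation strategy: evaluate every 1000 steps
    report_to="wandb",               # WandB is listed among the frameworks used
)

trainer = SFTTrainer(
    model=BASE_MODEL,                # SFTTrainer accepts a Hub model id or an already loaded model
    args=args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
)
trainer.train()
```

PEFT adapters and the Intel Extension mentioned in the card's frameworks list are omitted here to keep the sketch short.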