Training procedure
The following bitsandbytes quantization config was used during training:
- load_in_8bit: True
- load_in_4bit: False
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
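The settings listed above could be reproduced roughly as follows. This is a minimal sketch, not the author's training code: the names `BNB_KWARGS` and `load_quantized_base` are illustrative, and loading the base checkpoint with `device_map="auto"` is an assumption. Because `load_in_8bit` is True, the `bnb_4bit_*` entries are kept only so the sketch mirrors the list.

```python
# Values copied from the list above; the variable and function names are illustrative.
BNB_KWARGS = {
    "load_in_8bit": True,
    "load_in_4bit": False,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": False,
    "bnb_4bit_compute_dtype": "float32",
}


def load_quantized_base(model_id="facebook/opt-2.7b"):
    """Load a base checkpoint in 8-bit.

    Assumes transformers, accelerate, and bitsandbytes are installed and a
    CUDA GPU is available. Imports are deferred so the settings above can be
    inspected without those packages.
    """
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(**BNB_KWARGS)
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # assumption: let accelerate place the weights
    )
```

Calling `load_quantized_base()` downloads the base model and quantizes it on load, so it is best run on the same kind of GPU environment used for training.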
Model Description
For more information on how it was created, check out the following link: https://github.com/DunnBC22/NLP_Projects/blob/main/OPT%20Models/Grade%20School%20Math%20Instructions%20Fine-Tune%20OPT.ipynb.
Intended uses & limitations
This model is intended to demonstrate what fine-tuning OPT on grade-school math instructions can achieve. Its main limitation is the input data used for fine-tuning.
Training & Evaluation Dataset
Dataset Source: https://huggingface.co/datasets/qwedsacf/grade-school-math-instructions
Hyperparameters Used
Hyperparameter | Value |
---|---|
Model Checkpoint | facebook/opt-2.7b |
per_device_train_batch_size | 4 |
gradient_accumulation_steps | 4 |
fp16 | True |
warmup_steps | 225 |
learning_rate | 2e-4 |
Training Steps | 450 |
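To make the table easier to interpret, the sketch below works out what these values imply. It assumes a single device and that each of the 450 training steps is one optimizer update; neither is stated in this card, and the function name is illustrative.

```python
# Values from the hyperparameter table above.
PER_DEVICE_TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 225
TRAINING_STEPS = 450


def effective_batch_size(per_device, accumulation, num_devices=1):
    """Examples contributing to one optimizer update."""
    return per_device * accumulation * num_devices


# Assumption: one device, and one optimizer update per training step.
per_update = effective_batch_size(
    PER_DEVICE_TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS
)
examples_seen = per_update * TRAINING_STEPS
warmup_fraction = WARMUP_STEPS / TRAINING_STEPS

print(per_update)       # 16
print(examples_seen)    # 7200
print(warmup_fraction)  # 0.5
```

Under these assumptions, the learning rate spends the first half of training in warmup and only reaches 2e-4 at step 225.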
Framework versions
Library | Version |
---|---|
Python | 3.10.1 |
Torch | 2.0.1+cu118 |
Datasets | 2.14.4 |
Transformers | 4.31.0 |
PEFT | 0.4.0 |
Metric
Perplexity = 6.35
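This card does not say how the perplexity was computed. Assuming the usual definition for causal language models (the exponential of the mean token-level cross-entropy loss, in nats), the relationship can be sketched as below; the function names are illustrative.

```python
import math


def perplexity_from_loss(mean_cross_entropy):
    """Perplexity is exp of the mean per-token cross-entropy (natural log)."""
    return math.exp(mean_cross_entropy)


def loss_from_perplexity(perplexity):
    """Inverse mapping: the mean loss implied by a given perplexity."""
    return math.log(perplexity)


# The reported perplexity of 6.35 would correspond to this mean loss.
implied_loss = loss_from_perplexity(6.35)
print(round(implied_loss, 3))  # 1.848
```

Under that assumption, a perplexity of 6.35 corresponds to a mean evaluation loss of about 1.85.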