|
--- |
|
library_name: peft |
|
datasets: |
|
- qwedsacf/grade-school-math-instructions |
|
language: |
|
- en |
|
metrics: |
|
- perplexity |
|
--- |
|
## Training procedure |
|
|
|
|
|
The following `bitsandbytes` quantization config was used during training (a sketch of the equivalent `BitsAndBytesConfig` follows the list):
|
- load_in_8bit: True |
|
- load_in_4bit: False |
|
- llm_int8_threshold: 6.0 |
|
- llm_int8_skip_modules: None |
|
- llm_int8_enable_fp32_cpu_offload: False |
|
- llm_int8_has_fp16_weight: False |
|
- bnb_4bit_quant_type: fp4 |
|
- bnb_4bit_use_double_quant: False |
|
- bnb_4bit_compute_dtype: float32 |
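For reference, here is roughly how that config maps onto `transformers.BitsAndBytesConfig`. This is a reconstruction from the values above, not the exact code from the training notebook; note the `bnb_4bit_*` fields are inert because `load_in_4bit` is `False`.

```python
import torch
from transformers import BitsAndBytesConfig

# 8-bit quantization config matching the values listed above.
# The bnb_4bit_* fields are carried over for completeness but have
# no effect while load_in_4bit=False.
bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
    load_in_4bit=False,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=None,
    llm_int8_enable_fp32_cpu_offload=False,
    llm_int8_has_fp16_weight=False,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float32,
)
```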
|
|
|
### Model Description |
|
|
|
For more information on how this model was created, see the project notebook: https://github.com/DunnBC22/NLP_Projects/blob/main/OPT%20Models/Grade%20School%20Math%20Instructions%20Fine-Tune%20OPT.ipynb
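As a minimal sketch of how an adapter like this one is typically loaded for inference with PEFT (the adapter repo id below is a placeholder, and the prompt and generation settings are illustrative assumptions):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the 8-bit base model the adapter was trained against.
base_model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-2.7b",
    load_in_8bit=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-2.7b")

# "<adapter-repo-or-path>" is a placeholder for wherever these
# PEFT adapter weights are hosted.
model = PeftModel.from_pretrained(base_model, "<adapter-repo-or-path>")

prompt = "A class has 12 students and each student brings 3 pencils. How many pencils are there in total?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```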
|
|
|
### Intended uses & limitations |
|
|
|
This model is intended as a demonstration of parameter-efficient fine-tuning rather than a production system. Its capabilities are mainly limited by the scope and quality of the training data.
|
|
|
### Training & Evaluation Dataset |
|
|
|
Dataset Source: https://huggingface.co/datasets/qwedsacf/grade-school-math-instructions |
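The dataset can be pulled directly with the `datasets` library; this sketch assumes the default `train` split:

```python
from datasets import load_dataset

# Instruction-formatted grade-school math problems used for
# both training and evaluation.
dataset = load_dataset("qwedsacf/grade-school-math-instructions")

print(dataset)              # splits and row counts
print(dataset["train"][0])  # inspect one example
```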
|
|
|
### Hyperparameters Used |
|
|
|
| Hyperparameter | Value |
|:-----:|:-----:|
| Model Checkpoint | facebook/opt-2.7b |
| per_device_train_batch_size | 4 |
| gradient_accumulation_steps | 4 |
| fp16 | True |
| warmup_steps | 225 |
| learning_rate | 2e-4 |
| Training Steps | 450 |
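These values map onto `transformers.TrainingArguments` roughly as follows; `output_dir` is a hypothetical name, and any optimizer or logging settings beyond the table are left at their defaults rather than taken from the notebook:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="opt-2.7b-gsm-instructions",  # hypothetical, not from the notebook
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    fp16=True,
    warmup_steps=225,
    learning_rate=2e-4,
    max_steps=450,  # "Training Steps" from the table above
)
```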
|
|
|
|
|
### Framework versions |
|
|
|
| Library | Version |
|:-----:|:-----:|
| Python | 3.10.1 |
| Torch | 2.0.1+cu118 |
| Datasets | 2.14.4 |
| Transformers | 4.31.0 |
| PEFT | 0.4.0 |
|
|
|
|
|
### Metric |
|
|
|
Perplexity = 6.35 |
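Perplexity is presumably computed in the standard way, as the exponential of the mean cross-entropy evaluation loss; a minimal sketch:

```python
import math

# perplexity = exp(mean cross-entropy loss); an eval loss of
# roughly 1.849 corresponds to the reported value (illustrative,
# not taken from the actual training logs).
eval_loss = 1.849
print(round(math.exp(eval_loss), 2))  # ~6.35
```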