---
base_model: meta-llama/Llama-3.2-3B-Instruct
datasets:
- tatsu-lab/alpaca
language: en
tags:
- torchtune
---
# my_cool_model

This model is a fine-tuned version of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on the [tatsu-lab/alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca) dataset.
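A minimal loading sketch with the `transformers` library, assuming the fine-tuned weights are published to the Hub; `your-username/my_cool_model` below is a hypothetical repo id, so substitute wherever the checkpoint actually lives:

```python
# Minimal sketch: load the checkpoint with the Hugging Face transformers API.
# NOTE: "your-username/my_cool_model" is a hypothetical placeholder repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-username/my_cool_model"  # hypothetical placeholder
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("Give three tips for staying healthy.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```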
# Model description

More information needed

# Training and evaluation results

More information needed
# Training procedure

This model was trained with the [torchtune](https://github.com/pytorch/torchtune) library, using the following command:
```bash
tune run ppo_full_finetune_single_device \
  --config ./target/7B_full_ppo_low_memory_single_device.yaml \
  device=cuda \
  metric_logger._component_=torchtune.utils.metric_logging.WandBLogger \
  metric_logger.project=torchtune_ppo \
  forward_batch_size=2 \
  batch_size=64 \
  ppo_batch_size=32 \
  gradient_accumulation_steps=16 \
  compile=True \
  optimizer._component_=bitsandbytes.optim.PagedAdamW \
  optimizer.lr=3e-4
```
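For orientation, the batch-related overrides combine as in the sketch below. This is a reading of the flags assuming torchtune's usual PPO recipe semantics (trajectories are generated in chunks of `forward_batch_size`; each `ppo_batch_size` optimization batch is split across `gradient_accumulation_steps` backward passes), not something the card itself documents:

```python
# Batch arithmetic implied by the overrides above, under the assumed
# torchtune PPO recipe semantics described in the lead-in.
batch_size = 64                  # trajectories collected per PPO step
forward_batch_size = 2           # generation chunk size
ppo_batch_size = 32              # samples per PPO optimization batch
gradient_accumulation_steps = 16

generation_chunks = batch_size // forward_batch_size   # 32 forward passes per step
ppo_batches_per_epoch = batch_size // ppo_batch_size   # 2 PPO batches per collected batch
backward_micro_batch = ppo_batch_size // gradient_accumulation_steps  # 2 samples per backward
print(generation_chunks, ppo_batches_per_epoch, backward_micro_batch)
```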
# Framework versions

- torchtune
- torchao 0.5.0
- datasets 2.20.0
- sentencepiece 0.2.0