---
base_model: meta-llama/Llama-3.2-3B-Instruct
datasets:
- tatsu-lab/alpaca
language: en
tags:
- torchtune
---

# my_cool_model

This model is a fine-tuned version of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on the [tatsu-lab/alpaca](https://huggingface.co/tatsu-lab/alpaca) dataset.

# Model description

More information needed

# Training and evaluation results

More information needed

# Training procedure

This model was trained with the [torchtune](https://github.com/pytorch/torchtune) library using the following command:

```bash
ppo_full_finetune_single_device.py \
    --config ./target/7B_full_ppo_low_memory_single_device.yaml \
    device=cuda \
    metric_logger._component_=torchtune.utils.metric_logging.WandBLogger \
    metric_logger.project=torchtune_ppo \
    forward_batch_size=2 \
    batch_size=64 \
    ppo_batch_size=32 \
    gradient_accumulation_steps=16 \
    compile=True \
    optimizer._component_=bitsandbytes.optim.PagedAdamW \
    optimizer.lr=3e-4
```

# Framework versions

- torchtune
- torchao 0.5.0
- datasets 2.20.0
- sentencepiece 0.2.0
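
# How to use

A minimal inference sketch, assuming the torchtune checkpoint has been converted to the Hugging Face format and published to the Hub. The repo id `your-username/my_cool_model` is a hypothetical placeholder; substitute the real Hub id or a local checkpoint directory.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id: replace with the actual Hub id or a local path.
model_id = "your-username/my_cool_model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Llama 3.2 Instruct checkpoints expect the chat template shipped with the tokenizer.
messages = [{"role": "user", "content": "Give three tips for staying healthy."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```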