|
# torchtune research repo: token coloring (colorful llama) |
|
|
|
Playground to try out [token coloring](https://docs.google.com/document/d/1Win9vhddD-pu5P3SsG7E-dzN5oQl5DYWW1DhO7sBOgI/edit#heading=h.oqq00pt8expe) with TorchTune. |
|
|
|
The repo was generated using the alpha version of [torchtune](https://github.com/pytorch-labs/torchtune). |
|
|
|
Brief notes: |
|
|
|
- The starting recipe is based on the Alpaca Llama2 7B full finetune recipe (switched to bf16). |
|
- I assume `output/` is used to store model outputs and `model/` is used to store the base model checkpoints. |
|
|
|
For the `colorful` recipe: |
|
|
|
- I copied a lot of functionality (like the actual model definition, dataset, etc) from torchtune repository directly since I needed to make changes. |
|
- I reduced the flexiblity of the recipe (e.g. cannot specify the model or tokenizer) and increased it in other ways (e.g. can pass in a dataset path directly). |
|
- I added intermediate checkpointing (i.e. every `n` steps) and automatically upload the checkpoint to HuggingFace Hub. |
|
|
|
## Getting started |
|
|
|
The below instructions can be copy-pasted as is on to a running instance. They assume that the `HF_TOKEN` environment variable is set with a valid token. |
|
|
|
```bash |
|
# for RunPod |
|
cd /workspace |
|
git clone git@github.com:pytorch-labs/torchtune.git |
|
cd torchtune |
|
pip install -e . |
|
|
|
cd /workspace |
|
git clone git@github.com:laurencer/torchtune-colorful-llama.git |
|
cd torchtune-colorful-llama |
|
|
|
# for wandb support |
|
pip install wandb |
|
``` |
|
|
|
```bash |
|
mkdir -p model/ |
|
tune download --repo-id meta-llama/Llama-2-7b --output-dir model/ |
|
``` |
|
|
|
```bash |
|
tune convert_checkpoint --checkpoint-path model/consolidated.00.pth --output-path model/llama2_native.tune |
|
``` |
|
|
|
```bash |
|
mkdir -p output/ |
|
# tune --nnodes 1 --nproc_per_node 1 ./colorful/full_finetune.py --config ./colorful/basic_config.yaml |
|
nohup tune --nnodes 1 --nproc_per_node 1 ./colorful/full_finetune.py --config ./colorful/basic_config.yaml 2>&1 > training_log_$(date "+%Y.%m.%d_%H.%M.%S").log & |
|
sleep 1 |
|
tail -f training_log_*.log |
|
``` |
|
|
|
## Baselines |
|
|
|
Two baseline configs are provided in the `baseline` directory. |
|
We forked the original recipe to support customizing the location/path of the Alpaca dataset. |
|
|
|
```bash |
|
# tune --nnodes 1 --nproc_per_node 1 ./baseline/full_finetune.py --config ./baseline/baseline_config.yaml |
|
nohup tune --nnodes 1 --nproc_per_node 1 ./baseline/full_finetune.py --config ./baseline/baseline_config.yaml 2>&1 > training_log_$(date "+%Y.%m.%d_%H.%M.%S").log & |
|
sleep 1 |
|
tail -f training_log_*.log |
|
``` |
|
|
|
The adversarial config uses a dataset that is equivalent to 4x the original alpaca cleaned dataset with extra examples that include prompt injection attempts. See [token coloring description](https://docs.google.com/document/d/1Win9vhddD-pu5P3SsG7E-dzN5oQl5DYWW1DhO7sBOgI/edit#heading=h.oqq00pt8expe) for more info. |
|
|
|
```bash |
|
# tune --nnodes 1 --nproc_per_node 1 ./baseline/full_finetune.py --config ./baseline/adversarial_config.yaml |
|
nohup tune --nnodes 1 --nproc_per_node 1 ./baseline/full_finetune.py --config ./baseline/adversarial_config.yaml 2>&1 > training_log_$(date "+%Y.%m.%d_%H.%M.%S").log & |
|
sleep 1 |
|
tail -f training_log_*.log |
|
``` |
|
|
|
## Colorful |
|
|
|
The `colorful` directory implements the changes required to support token coloring. This includes a custom dataset implementation and training script. |