
Overview

This is a test of qlora fine-tuning of the mpt-30b model, with 3 epochs.

qlora compatible model: https://huggingface.co/jondurbin/mpt-30b-qlora-compatible

My fork of qlora with mpt-30b support: https://github.com/jondurbin/qlora

Differences in the qlora scripts:

  • requires adding --mpt True for mpt-based models
  • uses --num_train_epochs instead of --max_steps
  • uses the airoboros prompt format (mostly 1:1 with vicuna) rather than alpaca, and expects an input file in JSONL format with "instruction" and "response" fields
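For reference, a JSONL input file is just one JSON object per line with those two fields. A minimal sketch of producing and reading such a file (the record contents here are invented examples, and the exact field handling lives in the fork's data-loading code):

```python
import json

# Hypothetical training examples in the expected "instruction"/"response" schema.
records = [
    {
        "instruction": "Explain what parameter-efficient fine-tuning means.",
        "response": "It adapts a large model by training a small number of added parameters...",
    },
]

# Write one JSON object per line (JSONL).
with open("instructions.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Reading it back: each line parses independently.
with open("instructions.jsonl") as f:
    loaded = [json.loads(line) for line in f]

print(loaded[0]["instruction"])
```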

I think there's a bug in gradient accumulation, so if you try this, you may want to set gradient accumulation steps to 1.
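Putting the flags together, a training invocation might look something like the following. This is a hypothetical sketch: only `--mpt True`, `--num_train_epochs`, and the gradient-accumulation workaround come from the notes above; the other flag names follow the upstream qlora script and the paths/values are placeholders.

```shell
# Hypothetical invocation of the forked qlora script for an MPT-based model.
python qlora.py \
    --model_name_or_path jondurbin/mpt-30b-qlora-compatible \
    --mpt True \
    --dataset ./instructions.jsonl \
    --num_train_epochs 3 \
    --gradient_accumulation_steps 1 \
    --output_dir ./output
```

Setting `--gradient_accumulation_steps 1` sidesteps the suspected accumulation bug at the cost of a smaller effective batch size.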

See the mpt-30b-qlora-compatible model card for training details.

Unfortunately, this is not as high quality as the llama-33b versions, and I don't have a great answer as to why. Perhaps there are fewer forward layers that can be tuned?

License and usage

This is a real gray area; here's why:

  • the dataset was generated with gpt-4, via https://github.com/jondurbin/airoboros
  • the ToS for openai API usage has a clause preventing the output from being used to train a model that competes with OpenAI
    • what does compete actually mean here?
    • a 30b parameter model isn't anywhere near the quality of gpt-4, or even gpt-3.5, so I can't imagine this could credibly be considered competing in the first place
    • if someone else uses the dataset to do the same, they wouldn't necessarily be violating the ToS, because they didn't call the API themselves, so I don't know how that works
  • the training data used in essentially all large language models includes a significant amount of copyrighted or otherwise questionably licensed material in the first place
  • other work using the self-instruct method, e.g. the original here: https://github.com/yizhongw/self-instruct released the data and model as apache-2

I am purposely not placing a license on here, because I am not a lawyer and refuse to attempt to interpret all of these terms. Your best bet is probably to avoid using this commercially, especially since it didn't perform quite as well as expected using qlora.
