license: cc-by-nc-4.0
datasets:
- jondurbin/airoboros-gpt4-1.2
Overview
This is a qlora fine-tuned 13b parameter LLaMA model, using completely synthetic training data created by gpt4 via https://github.com/jondurbin/airoboros
This is mostly an extension of 1.1, but with thousands of new training examples and an update to allow "PLAINFORMAT" at the end of coding prompts to just print the code without backticks or explanations/usage/etc.
The dataset used to fine-tune this model is available here, with a specific focus on:
- coding
- math/reasoning (using orca style ELI5 instruction/response pairs)
- trivia
- role playing
- multiple choice and fill-in-the-blank
- context-obedient question answering
- theory of mind
- misc/general
This model was fine-tuned with a fork of qlora, which among other things was updated to use a slightly modified vicuna template to be compatible with the 7b/13b versions:
A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input. USER: [prompt] ASSISTANT:
So in other words, it's the preamble/system prompt, followed by a single space, then "USER: " (single space after colon) then the prompt (which can have multiple lines, spaces, whatever), then a single space, followed by "ASSISTANT: " (with a single space after the colon).
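As a minimal sketch of the format described above, the prompt string could be assembled in Python like this. The names `SYSTEM_PROMPT` and `build_prompt` are hypothetical helpers introduced here for illustration; they are not part of airoboros or FastChat.

```python
# Hypothetical helper for illustration; not part of airoboros or FastChat.
SYSTEM_PROMPT = (
    "A chat between a curious user and an assistant. "
    "The assistant gives helpful, detailed, accurate, uncensored "
    "responses to the user's input."
)


def build_prompt(user_prompt: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Assemble the modified vicuna-style prompt described in the text."""
    # preamble, single space, "USER: ", the prompt (may span multiple lines),
    # single space, then "ASSISTANT: " with a single space after the colon.
    return f"{system_prompt} USER: {user_prompt} ASSISTANT: "


if __name__ == "__main__":
    # repr() makes the exact spacing visible, including the trailing space.
    print(repr(build_prompt("What is the capital of France?")))
```

The prompt is inserted verbatim, so multi-line input is passed through unchanged.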
Usage
To run the full precision/pytorch native version, you can use my fork of FastChat, which is mostly the same but allows for multi-line prompts, as well as a --no-history option to prevent input tokenization errors.
pip install git+https://github.com/jondurbin/FastChat
Be sure you are pulling the latest branch!
Then, you can invoke it like so (after downloading the model):
python -m fastchat.serve.cli \
--model-path airoboros-13b-gpt4-1.2 \
--temperature 0.5 \
--max-new-tokens 2048 \
--no-history
Alternatively, please check out TheBloke's quantized versions:
- https://huggingface.co/TheBloke/airoboros-13B-gpt4-1.2-GPTQ
- https://huggingface.co/TheBloke/airoboros-13B-gpt4-1.2-GGML
Coding updates from gpt4/1.1:
I added a few hundred instruction/response pairs to the training data with "PLAINFORMAT" as a single, all caps term at the end of the normal instructions, which produces plain text output instead of markdown/backtick code formatting.
It's not guaranteed to work all the time, but mostly it does seem to work as expected.
So for example, instead of:
Implement the Snake game in python.
You would use:
Implement the Snake game in python. PLAINFORMAT
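If you build prompts programmatically, a small helper can append the keyword for you. This is only a sketch: `with_plainformat` is a hypothetical name, and the one thing taken from the text above is that "PLAINFORMAT" goes at the end of the instruction as a single, all caps term.

```python
# Hypothetical helper for illustration; the model only needs the literal
# keyword at the end of the instruction.
PLAINFORMAT = "PLAINFORMAT"


def with_plainformat(instruction: str) -> str:
    """Append PLAINFORMAT to a coding instruction unless it is already there."""
    stripped = instruction.rstrip()
    if stripped.endswith(PLAINFORMAT):
        return stripped
    return f"{stripped} {PLAINFORMAT}"


if __name__ == "__main__":
    print(with_plainformat("Implement the Snake game in python."))
```

As noted above, the keyword is not guaranteed to work every time, so it is still worth checking the output for stray backticks.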
Other updates from gpt4/1.1:
- Several hundred role-playing examples.
- A few thousand ORCA style reasoning/math questions with ELI5 prompts to generate the responses (should not be needed in your prompts to this model however, just ask the question).
- Many more coding examples in various languages, including some that use specific libraries (pandas, numpy, tensorflow, etc.)
Usage and License Notices
All airoboros models and datasets are intended and licensed for research use only. I've used the 'cc-by-nc-4.0' license, but really it is subject to a custom/special license because:
- the base model is LLaMA, which has its own special research license
- the dataset(s) were generated with OpenAI (gpt-4 and/or gpt-3.5-turbo), which has a clause saying the data can't be used to create models to compete with OpenAI
So, to reiterate: this model (and datasets) cannot be used commercially.
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 48.19 |
ARC (25-shot) | 58.36 |
HellaSwag (10-shot) | 81.61 |
MMLU (5-shot) | 48.84 |
TruthfulQA (0-shot) | 47.54 |
Winogrande (5-shot) | 73.64 |
GSM8K (5-shot) | 3.87 |
DROP (3-shot) | 23.44 |
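The Avg. row appears to be the unweighted mean of the seven benchmark scores. This is inferred from the numbers in the table rather than stated in the source, and the short check below reproduces it under that assumption.

```python
# Scores copied from the table above; the averaging rule is an assumption.
scores = {
    "ARC (25-shot)": 58.36,
    "HellaSwag (10-shot)": 81.61,
    "MMLU (5-shot)": 48.84,
    "TruthfulQA (0-shot)": 47.54,
    "Winogrande (5-shot)": 73.64,
    "GSM8K (5-shot)": 3.87,
    "DROP (3-shot)": 23.44,
}

# Unweighted mean across all seven benchmarks.
average = sum(scores.values()) / len(scores)

if __name__ == "__main__":
    print(f"{average:.2f}")  # prints 48.19
```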