---
license: wtfpl
base_model: CausalLM/7B
tags:
- generated_from_trainer
model-index:
- name: workspace/causal-dolphin-v0.1
  results: []
datasets:
- ehartford/dolphin
- THUDM/AgentInstruct
---

[Built with Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)

# Causal-Dolphin-Agent-v0.1

This model is a LoRA fine-tune of [CausalLM/7B](https://huggingface.co/CausalLM/7B) on Eric's wonderful [Dolphin](https://huggingface.co/datasets/ehartford/dolphin) dataset, with [THUDM/AgentInstruct](https://huggingface.co/datasets/THUDM/AgentInstruct) mixed into both training runs.

Causal-Dolphin-Agent was first trained for 3 epochs on 5 million GPT-3.5-augmented FLAN instructions together with the AgentInstruct dataset, all in ChatML format. It was then trained for a further 3 epochs on 1 million GPT-4-augmented FLAN instructions, again with AgentInstruct shuffled in.

It achieves the following results on the evaluation set:
- Loss: 2.8435

## Prompt Format

Causal-Dolphin-Agent uses ChatML as the prompt format (a minimal inference sketch is given at the end of this card):

```
<|im_start|>system
You are Dolphin, a helpful AI assistant.<|im_end|>
<|im_start|>user
If Danny owns a bike, then Edward owns a bike. If Edward owns a bike, then Freddy owns a bike. If Danny owns a bike, which of the following statements must be true? Let's think step by step.

I. Edward owns a bike.
II. Freddy owns a bike.
III. Freddy does not own a bike.

Choose one answer:
I only
II only
III only
I and II only
I and III only<|im_end|>
<|im_start|>assistant
```

## Training and evaluation data

- [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin)
- [THUDM/AgentInstruct](https://huggingface.co/datasets/THUDM/AgentInstruct)

## Training procedure

Training ran in two stages, both in ChatML format:

1. 3 epochs on 5 million GPT-3.5-augmented FLAN instructions mixed with the AgentInstruct dataset.
2. A further 3 epochs on 1 million GPT-4-augmented FLAN instructions, again with AgentInstruct shuffled in.

### Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch at the end of this card):
- learning_rate: 6e-06
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 3

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.7774 | 0.0 | 1 | 5.1009 |
| 3.2798 | 0.15 | 46 | 5.1010 |
| 2.0722 | 0.3 | 92 | 5.0489 |
| 2.5919 | 0.45 | 138 | 4.8834 |
| 2.0011 | 0.6 | 184 | 4.6678 |
| 1.3733 | 0.75 | 230 | 4.4628 |
| 1.7321 | 0.9 | 276 | 4.2757 |
| 1.3994 | 1.05 | 322 | 4.1029 |
| 1.2308 | 1.2 | 368 | 3.8916 |
| 0.8229 | 1.35 | 414 | 3.6451 |
| 0.9592 | 1.5 | 460 | 3.4106 |
| 0.8528 | 1.65 | 506 | 3.2250 |
| 0.7362 | 1.8 | 552 | 3.0852 |
| 0.8077 | 1.95 | 598 | 2.9881 |
| 0.6912 | 2.1 | 644 | 2.9315 |
| 0.7776 | 2.25 | 690 | 2.8911 |
| 0.6916 | 2.41 | 736 | 2.8678 |
| 0.8674 | 2.56 | 782 | 2.8534 |
| 0.7797 | 2.71 | 828 | 2.8545 |
| 0.6838 | 2.86 | 874 | 2.8435 |

### Framework versions

- Transformers 4.34.1
- Pytorch 2.1.0+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1
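### Hyperparameters as `TrainingArguments` (sketch)

The run itself was built with Axolotl and the exact config was not published, so the snippet below is only a rough sketch: the hyperparameters reported above translated into a `transformers.TrainingArguments` object. The `output_dir` is arbitrary and simply mirrors the model-index name.

```python
# Approximate translation of the reported hyperparameters; the actual
# run used Axolotl, so this is a sketch, not the real training config.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="workspace/causal-dolphin-v0.1",  # arbitrary; mirrors the model-index name
    learning_rate=6e-6,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,                # Adam with betas=(0.9, 0.95)
    adam_beta2=0.95,
    adam_epsilon=1e-5,             # and epsilon=1e-05
    lr_scheduler_type="cosine",
    warmup_steps=100,
    num_train_epochs=3,
)
```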
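## Example inference

A minimal inference sketch using the ChatML format above. This is illustrative rather than official: it assumes the checkpoint loads through the standard `transformers` `AutoModelForCausalLM`/`AutoTokenizer` API with the LoRA weights already merged, and `model_id` below is a hypothetical placeholder for wherever the weights are hosted.

```python
# Minimal ChatML inference sketch; model_id is a hypothetical placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "workspace/causal-dolphin-v0.1"  # hypothetical placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Build the prompt exactly as shown in "Prompt Format" above.
prompt = (
    "<|im_start|>system\n"
    "You are Dolphin, a helpful AI assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "If Danny owns a bike, then Edward owns a bike. If Edward owns a bike, "
    "then Freddy owns a bike. If Danny owns a bike, must Freddy own a bike? "
    "Let's think step by step.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens (the assistant's reply).
reply = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(reply)
```

Depending on the tokenizer configuration, generation may run past the turn boundary; if so, pass the token id of `<|im_end|>` as `eos_token_id` to `model.generate`, or truncate the output at the first `<|im_end|>`.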