---
license: wtfpl
base_model: CausalLM/7B
tags:
- generated_from_trainer
model-index:
- name: workspace/causal-dolphin-v0.1
  results: []
datasets:
- ehartford/dolphin
- THUDM/AgentInstruct
---

[Built with Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)

# Causal-Dolphin-Agent-v0.1

This model is a LoRA fine-tune of [CausalLM/7B](https://huggingface.co/CausalLM/7B) on Eric's wonderful [Dolphin](https://huggingface.co/datasets/ehartford/dolphin) dataset, with [THUDM/AgentInstruct](https://huggingface.co/datasets/THUDM/AgentInstruct) mixed into both training runs.

Causal-Dolphin-Agent was first trained for 3 epochs on 5 million GPT-3.5-augmented FLAN instructions together with the AgentInstruct dataset, all in ChatML format. It was then trained for a further 3 epochs on 1 million GPT-4-augmented FLAN instructions, again with AgentInstruct shuffled in.

It achieves the following results on the evaluation set:
- Loss: 2.8435

## Prompt Format

Causal-Dolphin-Agent uses ChatML as the prompt format (a minimal inference sketch is given at the end of this card):

```
<|im_start|>system
You are Dolphin, a helpful AI assistant.<|im_end|>
<|im_start|>user
If Danny owns a bike, then Edward owns a bike. If Edward owns a bike, then Freddy owns a bike. If Danny owns a bike, which of the following statements must be true? Let's think step by step.

I. Edward owns a bike.
II. Freddy owns a bike.
III. Freddy does not own a bike.

Choose one answer:
I only
II only
III only
I and II only
I and III only<|im_end|>
<|im_start|>assistant
```

## Training and evaluation data

- [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin)
- [THUDM/AgentInstruct](https://huggingface.co/datasets/THUDM/AgentInstruct)

## Training procedure

Training ran in two stages, both in ChatML format:

1. 3 epochs on 5 million GPT-3.5-augmented FLAN instructions mixed with the AgentInstruct dataset.
2. A further 3 epochs on 1 million GPT-4-augmented FLAN instructions, again with AgentInstruct shuffled in.

### Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch at the end of this card):
- learning_rate: 6e-06
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 3

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.7774 | 0.0 | 1 | 5.1009 |
| 3.2798 | 0.15 | 46 | 5.1010 |
| 2.0722 | 0.3 | 92 | 5.0489 |
| 2.5919 | 0.45 | 138 | 4.8834 |
| 2.0011 | 0.6 | 184 | 4.6678 |
| 1.3733 | 0.75 | 230 | 4.4628 |
| 1.7321 | 0.9 | 276 | 4.2757 |
| 1.3994 | 1.05 | 322 | 4.1029 |
| 1.2308 | 1.2 | 368 | 3.8916 |
| 0.8229 | 1.35 | 414 | 3.6451 |
| 0.9592 | 1.5 | 460 | 3.4106 |
| 0.8528 | 1.65 | 506 | 3.2250 |
| 0.7362 | 1.8 | 552 | 3.0852 |
| 0.8077 | 1.95 | 598 | 2.9881 |
| 0.6912 | 2.1 | 644 | 2.9315 |
| 0.7776 | 2.25 | 690 | 2.8911 |
| 0.6916 | 2.41 | 736 | 2.8678 |
| 0.8674 | 2.56 | 782 | 2.8534 |
| 0.7797 | 2.71 | 828 | 2.8545 |
| 0.6838 | 2.86 | 874 | 2.8435 |

### Framework versions

- Transformers 4.34.1
- Pytorch 2.1.0+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1
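### Hyperparameters as `TrainingArguments` (sketch)

The run itself was built with Axolotl and the exact config was not published, so the snippet below is only a rough sketch: the hyperparameters reported above translated into a `transformers.TrainingArguments` object. The `output_dir` is arbitrary and simply mirrors the model-index name.

```python
# Approximate translation of the reported hyperparameters; the actual
# run used Axolotl, so this is a sketch, not the real training config.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="workspace/causal-dolphin-v0.1",  # arbitrary; mirrors the model-index name
    learning_rate=6e-6,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,                # Adam with betas=(0.9, 0.95)
    adam_beta2=0.95,
    adam_epsilon=1e-5,             # and epsilon=1e-05
    lr_scheduler_type="cosine",
    warmup_steps=100,
    num_train_epochs=3,
)
```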
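## Example inference

A minimal inference sketch using the ChatML format above. This is illustrative rather than official: it assumes the checkpoint loads through the standard `transformers` `AutoModelForCausalLM`/`AutoTokenizer` API with the LoRA weights already merged, and `model_id` below is a hypothetical placeholder for wherever the weights are hosted.

```python
# Minimal ChatML inference sketch; model_id is a hypothetical placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "workspace/causal-dolphin-v0.1"  # hypothetical placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Build the prompt exactly as shown in "Prompt Format" above.
prompt = (
    "<|im_start|>system\n"
    "You are Dolphin, a helpful AI assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "If Danny owns a bike, then Edward owns a bike. If Edward owns a bike, "
    "then Freddy owns a bike. If Danny owns a bike, must Freddy own a bike? "
    "Let's think step by step.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens (the assistant's reply).
reply = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(reply)
```

Depending on the tokenizer configuration, generation may run past the turn boundary; if so, pass the token id of `<|im_end|>` as `eos_token_id` to `model.generate`, or truncate the output at the first `<|im_end|>`.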