cognitivecomputations
/

dolphin-2.9.2-mixtral-8x22b

Text Generation

Generated from Trainer

Inference Endpoints

text-generation-inference

Model card Files Files and versions Community

dolphin-2.9.2-mixtral-8x22b / README.md

ehartford's picture

Update README.md

6bc3a8e verified about 1 month ago

|

raw history blame contribute delete

No virus

3.15 kB

	---
	license: apache-2.0
	base_model: mistral-community/Mixtral-8x22B-v0.1
	tags:
	- generated_from_trainer
	- axolotl
	model-index:
	- name: out
	results: []
	datasets:
	- cognitivecomputations/Dolphin-2.9.2
	- cognitivecomputations/SystemChat-2.0
	- teknium/OpenHermes-2.5
	- m-a-p/CodeFeedback-Filtered-Instruction
	- cognitivecomputations/dolphin-coder
	- cognitivecomputations/samantha-data
	- HuggingFaceH4/ultrachat_200k
	- microsoft/orca-math-word-problems-200k
	- abacusai/SystemChat-1.1
	- Locutusque/function-calling-chatml
	- internlm/Agent-FLAN
	language:
	- en
	---

	# Dolphin 2.9.2 Mixtral 8x22b 🐬

	Curated and trained by Eric Hartford, Lucas Atkins, and Fernando Fernandes, and Cognitive Computations

	[![Discord](https://img.shields.io/discord/1156064224225808488?logo=Discord&logoColor=%23ffffff&label=Discord&link=https%3A%2F%2Fdiscord.gg%2FtCMkMDDHwm)](https://discord.gg/cognitivecomputations)
	Discord: https://discord.gg/cognitivecomputations

	<img src="https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/ldkN1J0WIDQwU4vutGYiD.png" width="600" />

	New in 2.9.2 is SystemChat 2.0 - a dataset designed to teach Dolphin to obey the system prompt, even over a long conversation.

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/z1u6U91tL-H__7JCDbWys.png)

	My appreciation for the sponsors of Dolphin 2.9.2:
	- [Crusoe Cloud](https://crusoe.ai/) - provided excellent on-demand 8xH100 node
	- [OnDemand](https://on-demand.io/) - provided inference sponsorship, enabling creation of SystemChat

	This model is based on Dolphin-2.9-Mixtral-8x22b, and is Apache-2.0 licensed.

	The base model has 64k context, and fine-tuning was with 16k sequence length.

	It took 1 week on 8xH100 provided by Crusoe Cloud

	This model was trained FFT on 50% parameters (targeted with [Laser Scanner](https://github.com/cognitivecomputations/laserRMT/blob/main/laser_scanner.py) by Fernando Fernandes, David Golchinfar, Lucas Atkins, and Eric Hartford), using ChatML prompt template format.

	example:

	```
	<\|im_start\|>system
	You are Dolphin, a helpful AI assistant.<\|im_end\|>
	<\|im_start\|>user
	{prompt}<\|im_end\|>
	<\|im_start\|>assistant

	```

	Dolphin-2.9 has a variety of instruction, conversational, and coding skills. It also has initial agentic abilities and supports function calling.

	Dolphin is uncensored. I have filtered the dataset to remove alignment and bias. This makes the model more compliant. You are advised to implement your own alignment layer before exposing the model as a service. It will be highly compliant with any requests, even unethical ones. Please read my blog post about uncensored models. https://erichartford.com/uncensored-models You are responsible for any content you create using this model. Enjoy responsibly.

	Dolphin is licensed Apache 2.0. I grant permission for any use, including commercial, that falls within accordance with Apache-2.0 license. Dolphin was trained on data generated from GPT4, among other models.

	## Evals

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/SDWV3SvJ8xR1gjl1z0LyO.png)

	## Training