---
tags:
- moe
- llama
- '3'
- llama 3
- 2x8b
---
|
<img src="https://i.imgur.com/eFrFD6v.jpeg" alt="drawing" width="640"/> |
|
|
|
# Llama-3-Teal-Instruct-2x8B-MoE |
|
This is an experimental Mixture-of-Experts (MoE) model created from [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) and [nvidia/Llama3-ChatQA-1.5-8B](https://huggingface.co/nvidia/Llama3-ChatQA-1.5-8B) using Mergekit.
|
|
|
Green + Blue = Teal. |
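In a merge like this, each token is routed between the two 8B experts by a per-token gate. As a rough intuition for what the gate computes, here is a minimal numerical sketch: a linear gate projection scored against a token's hidden state, followed by a softmax over the two experts. This is illustrative only, not Mergekit's actual implementation, and all names in it are made up for the example.

```python
import numpy as np

def route_token(hidden_state, gate_weights):
    """Score each expert for one token and return softmax routing weights.

    hidden_state: (d,) token representation
    gate_weights: (num_experts, d) gate projection, one row per expert
    """
    logits = gate_weights @ hidden_state      # one raw score per expert
    exp = np.exp(logits - logits.max())       # numerically stable softmax
    return exp / exp.sum()                    # weights sum to 1

rng = np.random.default_rng(0)
d, num_experts = 16, 2
gate = rng.normal(size=(num_experts, d))
token = rng.normal(size=d)
weights = route_token(token, gate)
print(weights)  # two non-negative weights summing to 1
```

With `gate_mode: hidden` (used in the config below), Mergekit derives the gate vectors from the hidden states produced by each expert's `positive_prompts`, rather than training them.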
|
|
|
Mergekit YAML file:

```yaml
base_model: Meta-Llama-3-8B-Instruct
experts:
  - source_model: Meta-Llama-3-8B-Instruct
    positive_prompts:
      - "explain"
      - "chat"
      - "assistant"
  - source_model: Llama3-ChatQA-1.5-8B
    positive_prompts:
      - "python"
      - "math"
      - "solve"
      - "code"
gate_mode: hidden
dtype: float16
```
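With Mergekit installed, a config like the one above is run through its MoE script in a single command. The config filename and output directory below are placeholders for this example:

```shell
# install mergekit (the mergekit-moe script ships with the package)
pip install mergekit

# build the 2x8B MoE from the config above
mergekit-moe config.yaml ./Llama-3-Teal-Instruct-2x8B-MoE
```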