---
license: apache-2.0
datasets:
- argilla/distilabel-intel-orca-dpo-pairs
library_name: transformers
pipeline_tag: text-generation
---
# Chikuma_10.7B - V2
This model is a DPO fine-tune of [Chikuma_10.7B](https://huggingface.co/sethuiyer/Chikuma_10.7B) using the [argilla/distilabel-intel-orca-dpo-pairs](https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs) dataset.
# Dataset
Dataset: `argilla/distilabel-intel-orca-dpo-pairs`

The filtered dataset contains roughly 3,000 samples, all high quality according to their `chosen_score`. The following filters were applied to the original dataset:
```python
from datasets import load_dataset

# Load the published DPO pairs and keep only decisive, high-quality
# samples that are not part of the GSM8K train split
dataset = load_dataset("argilla/distilabel-intel-orca-dpo-pairs", split="train")
dataset = dataset.filter(
    lambda r:
        r["status"] != "tie" and
        r["chosen_score"] >= 8 and
        not r["in_gsm8k_train"]
)
```
# Chat Template
I decided to go with a slight modification of ChatML.
```
<|im_start|>GPT4 Correct system:
{system} Always use <|end_of_turn|> when you want to end the answer. <|im_end|>
<|im_start|>GPT4 Correct user:
{user}<|im_end|>
<|im_start|>GPT4 Correct Assistant:
{assistant}<|im_end|>
```
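For illustration, here is a minimal sketch that renders a single-turn conversation into this template by hand. The `format_chikuma_prompt` helper is hypothetical and only mirrors the format above; in practice, the tokenizer's built-in chat template (see Usage below) is the recommended path.

```python
def format_chikuma_prompt(system: str, user: str) -> str:
    # Hypothetical helper mirroring the modified ChatML format above;
    # prefer tokenizer.apply_chat_template in practice.
    return (
        "<|im_start|>GPT4 Correct system:\n"
        f"{system} Always use <|end_of_turn|> when you want to end the answer. <|im_end|>\n"
        "<|im_start|>GPT4 Correct user:\n"
        f"{user}<|im_end|>\n"
        "<|im_start|>GPT4 Correct Assistant:\n"
    )

print(format_chikuma_prompt("You are a helpful assistant.", "What is a large language model?"))
```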
### Training Hardware
I used 1 x A100 80GB on RunPod for about 1.5 hours.
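The training script itself isn't included in this card, so as a rough sketch only: a DPO run over the filtered dataset with trl's `DPOTrainer` might look like the following. All hyperparameters, the column mapping, and the trl version (0.7-style API, where `beta` is a trainer argument) are illustrative assumptions, not the actual configuration.

```python
# Illustrative DPO sketch with trl (0.7-style API assumed); hyperparameters
# and column names are assumptions, not the actual training configuration.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "sethuiyer/Chikuma_10.7B"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

dataset = load_dataset("argilla/distilabel-intel-orca-dpo-pairs", split="train")
dataset = dataset.filter(
    lambda r: r["status"] != "tie" and r["chosen_score"] >= 8 and not r["in_gsm8k_train"]
)
# Map rows to the prompt/chosen/rejected columns DPOTrainer expects
# ("input" as the prompt column is an assumption about the source schema).
dataset = dataset.map(
    lambda r: {"prompt": r["input"], "chosen": r["chosen"], "rejected": r["rejected"]}
)

trainer = DPOTrainer(
    model,
    ref_model=None,  # trl builds a frozen reference copy of the model
    beta=0.1,        # assumed DPO temperature
    args=TrainingArguments(
        output_dir="chikuma-dpo",
        per_device_train_batch_size=1,
        num_train_epochs=1,
        remove_unused_columns=False,  # DPOTrainer needs the extra columns kept
    ),
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```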
## Usage
```python
import transformers
from transformers import AutoTokenizer

model_name = "sethuiyer/Chikuma_10.7B_v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Create pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=model_name,
    tokenizer=tokenizer,
    device="cuda",
)

# Format the prompt with the chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant chatbot. Always use <|end_of_turn|> when you want to end the answer."},
    {"role": "user", "content": "What is a large language model?"},
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

# Generate text
sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_length=512,
)
print(sequences[0]["generated_text"])
```
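Since the template asks the model to end answers with `<|end_of_turn|>`, it can help to also stop generation on that token (assuming it exists in the tokenizer's vocabulary, as it does for OpenChat-style models):

```python
# Stop generation at <|end_of_turn|> in addition to the default EOS token.
end_of_turn_id = tokenizer.convert_tokens_to_ids("<|end_of_turn|>")
sequences = pipeline(prompt, max_length=512, eos_token_id=end_of_turn_id)
```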
## Things in the Pipeline
1. Manual testing and evaluation against GPT-4 on text-generation-webui across 45 complex sample prompts.
2. Nous benchmark suite.
3. GGUF format.
4. Ollama model (if the benchmark results are good).
## Acknowledgements
I'd like to thank the amazing open-source community, and in particular:
* The Intel team, for publishing a great open dataset and showing how well it worked in the first place.
* Teknium and NousResearch for their awesome work and models.
* Maxime for sharing such great resources.
* Argilla for publishing argilla/distilabel-intel-orca-dpo-pairs.