aloobun
/

Cypher-CoT-1.8B

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

Cypher-CoT-1.8B / README.md

aloobun's picture

Update README.md

ea8902a verified 8 months ago

|

history blame contribute delete

2.54 kB

	---
	library_name: transformers
	license: apache-2.0
	datasets:
	- kaist-ai/CoT-Collection
	tags:
	- finetune
	- gpt4
	- synthetic data
	- custom_code
	- h2oai
	---

	![Cypher aloobun h2oai1.8B](https://i.imgur.com/2R6f4EX.jpeg)
	- This is an experimental model, Finetuned [h2oai/h2o-danube-1.8b-chat](https://huggingface.co/h2oai/h2o-danube-1.8b-chat), on variety of CoT tasks.
	- The original idea was to use this 1.8B model, divide the dataset based on task specific capabilities, train models and transform them into a mixture of experts.
	- Hyperparameters: adamw with eps of 1e-8, cosine decay w/ 20% warmup, lr=2e-5.

	## Format:
	```
	<\|system\|></s><\|prompt\|></s><\|answer\|>
	```

	## Benchamrks:

	WIP

	## Example:
	```
	from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, StoppingCriteria
	import torch

	class MyStoppingCriteria(StoppingCriteria):
	def __init__(self, target_sequence, prompt):
	self.target_sequence = target_sequence
	self.prompt=prompt

	def __call__(self, input_ids, scores, **kwargs):
	generated_text = tokenizer.decode(input_ids[0])
	generated_text = generated_text.replace(self.prompt,'')
	if self.target_sequence in generated_text:
	return True
	return False

	def __len__(self):
	return 1

	def __iter__(self):
	yield self

	modelpath="aloobun/Cypher-CoT-1.8B"

	model = AutoModelForCausalLM.from_pretrained(
	modelpath,
	torch_dtype=torch.bfloat16,
	device_map="cuda",
	trust_remote_code=True,
	)

	tokenizer = AutoTokenizer.from_pretrained(
	modelpath,
	trust_remote_code=True,
	use_fast=False,
	)

	prompt = "<\|prompt\|>James takes a spinning class 3 times a week. He works out for 1.5 hours each class and burns 7 calories per minute. How many calories does he burn per week?</s><\|answer\|>"
	encoded_input = tokenizer(prompt, return_tensors='pt')
	input_ids=encoded_input['input_ids'].cuda()
	streamer = TextStreamer(tokenizer=tokenizer, skip_prompt=True)
	op = model.generate(
	input_ids,
	streamer=streamer,
	pad_token_id=tokenizer.eos_token_id,
	do_sample=True,
	temperature=0.7,
	top_p=0.8,
	max_new_tokens=512,
	stopping_criteria=MyStoppingCriteria("</s>", prompt)
	)
	```

	## Output:
	>James takes a spinning class 3 times a week, so he spends a total of 3 * 1.5 = 4.5 hours in the class each week.
	>Since there are 60 minutes in an hour, this is equivalent to 4.5 * 60 = 270 minutes.
	>If he burns 7 calories per minute, then he burns a total of 270 * 7 = 1890 calories per week.
	>####1890
	>The answer is: 1890</s>