---
license_name: tongyi-qianwen-research
license_link: https://huggingface.co/Qwen/Qwen1.5-1.8B-Chat/raw/main/LICENSE
library_name: transformers
license: other
tags:
- finetune
- synthetic data
- custom_code
- qwen2
- COT
---

![Reyna aloobun qwen4B](https://i.imgur.com/QfbOY6c.jpeg)

- Finetuned [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) with SFT on a variety of CoT tasks, including Reasoning, Closed Book Question Answering, Ethics, and more.
- Datasets: curated from [kaist-ai/CoT-Collection](https://huggingface.co/datasets/kaist-ai/CoT-Collection), [euclaise/TinyCoT](https://huggingface.co/datasets/euclaise/TinyCoT), and a very small subset of [teknium/OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5).
- This is the fourth model in this series. The experiment aims to improve Chain of Thought (CoT) capabilities in smaller language models.
- In the next run, I may rerun the finetuning experiment using an iterative rationale-bootstrapping procedure inspired by euclaise/Memphis-CoT-3B.

## Benchmarks:

WIP

## Example:

```
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    StoppingCriteria,
    StoppingCriteriaList,
    TextStreamer,
)
import torch


class MyStoppingCriteria(StoppingCriteria):
    """Stop generation once target_sequence appears in the newly generated text."""

    def __init__(self, target_sequence, prompt, tokenizer):
        self.target_sequence = target_sequence
        self.prompt = prompt
        self.tokenizer = tokenizer

    def __call__(self, input_ids, scores, **kwargs):
        # Decode the full sequence, then drop the prompt so only new text is checked.
        generated_text = self.tokenizer.decode(input_ids[0])
        generated_text = generated_text.replace(self.prompt, "")
        return self.target_sequence in generated_text


modelpath = "aloobun/Reyna-CoT-4B-v0.1"

model = AutoModelForCausalLM.from_pretrained(
    modelpath,
    torch_dtype=torch.bfloat16,
    device_map="cuda",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(
    modelpath,
    trust_remote_code=True,
    use_fast=False,
)

prompt = "Avery opens a flower shop. She ties 8 bunches of flowers with 9 flowers in each bunch. How many bunches would she have if she put 12 flowers in each bunch instead?\n"

encoded_input = tokenizer(prompt, return_tensors="pt")
input_ids = encoded_input["input_ids"].cuda()

streamer = TextStreamer(tokenizer=tokenizer, skip_prompt=True)

# generate() expects a StoppingCriteriaList, so wrap the custom criterion.
stopping_criteria = StoppingCriteriaList(
    [MyStoppingCriteria("<|endoftext|>", prompt, tokenizer)]
)

op = model.generate(
    input_ids,
    streamer=streamer,
    pad_token_id=tokenizer.eos_token_id,
    do_sample=True,
    temperature=0.6,
    top_p=0.8,
    max_new_tokens=512,
    stopping_criteria=stopping_criteria,
)
```

## Output:

>She would have 8 x 9 = 72 flowers in total.
>She would have 72 / 12 = 6 bunches of flowers with 12 flowers in each bunch.
>Therefore, the answer is 6.<|endoftext|>
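## Extracting the final answer:

The sample output ends with a sentence of the form "Therefore, the answer is 6." followed by the end-of-text token. If you want just the final answer for scoring or downstream use, something like the following may help. This is a minimal sketch, assuming completions close with "the answer is ..." as in the sample; `extract_answer` and `ANSWER_PATTERN` are hypothetical names introduced here, not part of the model or of transformers, and other phrasings would need their own patterns.

```python
import re

# Assumption: the rationale closes with "the answer is <value>", as in the sample output.
ANSWER_PATTERN = re.compile(r"the answer is\s*(.+?)\.?\s*$", re.IGNORECASE)


def extract_answer(generated_text, eos_token="<|endoftext|>"):
    """Return the final answer string from a CoT completion, or None if absent."""
    # Keep only the text before the first end-of-text marker.
    text = generated_text.split(eos_token, 1)[0].strip()
    match = ANSWER_PATTERN.search(text)
    return match.group(1).strip() if match else None


sample = (
    "She would have 8 x 9 = 72 flowers in total.\n"
    "She would have 72 / 12 = 6 bunches of flowers with 12 flowers in each bunch.\n"
    "Therefore, the answer is 6.<|endoftext|>"
)
print(extract_answer(sample))  # prints 6
```

The function returns `None` when no closing sentence is found, so callers can tell a missing answer from an empty one.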