---
license: other
inference: false
language:
- en
pipeline_tag: text-generation
tags:
- transformers
- gguf
- imatrix
- stablelm-zephyr-3b
- stabilityai
---
Quantizations of https://huggingface.co/stabilityai/stablelm-zephyr-3b

# From original readme

## Usage

`StableLM Zephyr 3B` uses the following instruction format:
```
<|user|>
List 3 synonyms for the word "tiny"<|endoftext|>
<|assistant|>
1. Dwarf
2. Little
3. Petite<|endoftext|>
```

This format is also available through the tokenizer's `apply_chat_template` method:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('stabilityai/stablelm-zephyr-3b')
model = AutoModelForCausalLM.from_pretrained(
    'stabilityai/stablelm-zephyr-3b',
    device_map="auto"
)

prompt = [{'role': 'user', 'content': 'List 3 synonyms for the word "tiny"'}]
inputs = tokenizer.apply_chat_template(
    prompt,
    add_generation_prompt=True,
    return_tensors='pt'
)

tokens = model.generate(
    inputs.to(model.device),
    max_new_tokens=1024,
    temperature=0.8,
    do_sample=True
)

print(tokenizer.decode(tokens[0], skip_special_tokens=False))
```

You can also see how to run a performance optimized version of this model [here](https://github.com/openvinotoolkit/openvino_notebooks/blob/main/notebooks/273-stable-zephyr-3b-chatbot/273-stable-zephyr-3b-chatbot.ipynb) using [OpenVINO](https://docs.openvino.ai/2023.2/home.html) from Intel.
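
When a runtime takes a raw prompt string instead of a chat message list (as some GGUF runtimes may), the instruction format shown in the Usage section can be assembled by hand. The following is a minimal sketch, not the model's official template: `build_prompt` and `END_OF_TEXT` are hypothetical names introduced here, and the exact newline placement is an assumption inferred from the example above. It is worth comparing the result against `tokenizer.apply_chat_template(..., tokenize=False)` before relying on it.

```python
from typing import Dict, List

# End-of-turn marker, taken from the instruction format in the Usage section.
END_OF_TEXT = "<|endoftext|>"


def build_prompt(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Assemble a raw prompt string from chat messages (hypothetical helper).

    Each message becomes "<|role|>\\ncontent<|endoftext|>\\n". The newline
    placement is an assumption based on the example in the README.
    """
    parts = []
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}{END_OF_TEXT}\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [{"role": "user", "content": 'List 3 synonyms for the word "tiny"'}]
print(build_prompt(messages))
```

Passing `add_generation_prompt=False` omits the trailing `<|assistant|>` line, which may be useful when formatting a finished conversation rather than requesting a new reply.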