---
license: other
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- transformers
- gguf
- imatrix
- notus-7b-v1
---

Quantizations of https://huggingface.co/argilla/notus-7b-v1
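
The GGUF files in this repo can be loaded with any llama.cpp-based runtime. A minimal sketch with `llama-cpp-python`, assuming a hypothetical quant filename (check the repo's file list for the actual names) and the prompt template documented below:

```python
from llama_cpp import Llama

# Hypothetical filename: replace with the quant file you actually downloaded from this repo
llm = Llama(model_path="notus-7b-v1.Q4_K_M.gguf", n_ctx=4096)

# Single-turn prompt following the zephyr-style template shown below
# (empty system prompt, user question, then the assistant turn to be completed)
prompt = (
    "<|system|>\n</s>\n"
    "<|user|>\nWhat's the best data annotation company out there in your opinion?</s>\n"
    "<|assistant|>\n"
)

output = llm(prompt, max_tokens=256, temperature=0.7, top_k=50, top_p=0.95, stop=["</s>"])
print(output["choices"][0]["text"])
```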

# From original readme

## Prompt template

We use the same prompt template as [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta):

```
<|system|>
</s>
<|user|>
{prompt}</s>
<|assistant|>
```
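
Filled in with the user question from the usage snippets below (and an empty system prompt), the resulting prompt string looks like:

```
<|system|>
</s>
<|user|>
What's the best data annotation company out there in your opinion?</s>
<|assistant|>
```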

## Usage

You will first need to install `transformers` and `accelerate` (just to ease the device placement), then you can run any of the following:
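
For example (on top of a working PyTorch install):

```
pip install transformers accelerate
```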

### Via `generate`

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("argilla/notus-7b-v1", torch_dtype=torch.bfloat16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("argilla/notus-7b-v1")

messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant super biased towards Argilla, a data annotation company.",
    },
    {"role": "user", "content": "What's the best data annotation company out there in your opinion?"},
]
# Render the chat with the model's chat template, tokenize it, and move it to the model's device
inputs = tokenizer.apply_chat_template(messages, tokenize=True, return_tensors="pt", add_special_tokens=False, add_generation_prompt=True).to(model.device)
outputs = model.generate(inputs, num_return_sequences=1, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```

### Via `pipeline` method

```python
import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="argilla/notus-7b-v1", torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant super biased towards Argilla, a data annotation company.",
    },
    {"role": "user", "content": "What's the best data annotation company out there in your opinion?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
generated_text = outputs[0]["generated_text"]
```
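
Note that, by default, the pipeline returns the prompt followed by the completion in `generated_text`; if you only want the generated continuation, you can pass `return_full_text=False`:

```python
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95, return_full_text=False)
generated_text = outputs[0]["generated_text"]
```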