---
license: apache-2.0
---

# Model Card for Soniox-7B-v1.0

Soniox 7B is a powerful large language model. It supports English and code with an 8K context window, and approaches GPT-4 performance on many benchmarks. It is built on top of Mistral 7B and enhanced with additional pre-training and fine-tuning for strong problem-solving capabilities. The model is released under the Apache 2.0 License. For more details, please read our [blog post](https://soniox.com/news/soniox-7b).

## Usage in Transformers

The model is available in `transformers` and can be used as follows:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model in half precision along with its tokenizer.
model_path = "soniox/Soniox-7B-v1.0"
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_path)

device = "cuda"
model.to(device)

# Build a prompt from a chat history using the model's chat template.
messages = [
    {"role": "user", "content": "12 plus 21?"},
    {"role": "assistant", "content": "33."},
    {"role": "user", "content": "Five minus one?"},
]
tok_prompt = tokenizer.apply_chat_template(messages, return_tensors="pt")

# Generate a sampled response and decode it back to text.
model_inputs = tok_prompt.to(device)
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```

## Inference deployment

Refer to our [documentation](https://docs.soniox.com) for inference with vLLM and other deployment options.
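
As one possible path, a vLLM deployment can be sketched with vLLM's OpenAI-compatible server. The commands below are a generic illustration, not Soniox-specific instructions; they assume vLLM is installed, the model architecture is supported, and a GPU with enough memory is available. Consult the documentation linked above for the supported options.

```shell
# Start an OpenAI-compatible server backed by vLLM (assumption: vLLM is
# installed and can load this model on the local GPU).
python -m vllm.entrypoints.openai.api_server --model soniox/Soniox-7B-v1.0

# From another shell, send a chat completion request to the local endpoint.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "soniox/Soniox-7B-v1.0",
        "messages": [{"role": "user", "content": "Five minus one?"}],
        "max_tokens": 100
      }'
```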