---
language:
- en
- fr
- de
- es
- it
- pt
- zh
- ja
- ru
- ko
license: other
license_name: mrl
inference: false
license_link: https://mistral.ai/licenses/MRL-0.1.md
base_model:
- anthracite-org/magnum-v4-123b
---

# Magnum-v4-123b HQQ

This repo contains magnum-v4-123b quantized to 4-bit precision using [HQQ](https://github.com/mobiusml/hqq/).

HQQ provides a similar level of precision to AWQ at 4-bit, but with no need for calibration.
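For intuition, group-wise 4-bit quantization stores each weight group as integers 0–15 plus a per-group scale and zero-point; what HQQ adds is solving for those parameters directly from the weights, with no calibration set. A generic round-trip sketch in plain NumPy — illustrative only, not HQQ's actual half-quadratic solver:

```python
import numpy as np

def quantize_4bit(w, group_size=64):
    # Group-wise affine quantization: w ≈ scale * (q - zero), with q in [0, 15]
    g = w.reshape(-1, group_size)
    lo = g.min(axis=1, keepdims=True)
    hi = g.max(axis=1, keepdims=True)
    scale = (hi - lo) / 15.0            # 4 bits -> 16 levels
    zero = np.round(-lo / scale)
    q = np.clip(np.round(g / scale + zero), 0, 15).astype(np.uint8)
    return q, scale, zero

def dequantize_4bit(q, scale, zero):
    return scale * (q.astype(np.float32) - zero)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 64)).astype(np.float32)
q, scale, zero = quantize_4bit(w)
w_hat = dequantize_4bit(q, scale, zero).reshape(w.shape)
print("max abs error:", np.abs(w - w_hat).max())
```

Per group, the reconstruction error is bounded by about half the scale; HQQ improves on this naive min-max fit by optimizing the zero-point against the weight distribution.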

```python
# ...
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
output_path = "magnum-v4-123b-hqq-4bit"
model.save_pretrained(output_path)
tokenizer.save_pretrained(output_path)
```

## Inference

You can perform inference directly with transformers, or using [aphrodite](https://github.com/PygmalionAI/aphrodite-engine):

```sh
pip install aphrodite-engine

aphrodite run alpindale/magnum-v4-123b-hqq-4bit -tp 2
```
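For the transformers route, a minimal sketch — assuming the repo's saved config carries the HQQ quantization settings and that enough GPU memory is available; the prompt and generation settings here are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "alpindale/magnum-v4-123b-hqq-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",         # shard across available GPUs
    torch_dtype=torch.float16,
)

inputs = tokenizer("Hello,", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```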