mobicham committed
Commit d0699d4
1 parent: 0913ac0

Update README.md

Files changed (1): README.md (+4, -7)
README.md CHANGED
@@ -10,14 +10,11 @@ This is a version of the LLama-2-7B-hf model quantized to 4-bit via Half-Quadrat
 
 To run the model, install the HQQ library from https://github.com/mobiusml/hqq and use it as follows:
 ``` Python
-from hqq.models.llama_hf import LlamaHQQ
-import transformers
-
 model_id = 'mobiuslabsgmbh/Llama-2-7b-hf-4bit_g64-HQQ'
-#Load the tokenizer
-tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
-#Load the model
-model = LlamaHQQ.from_quantized(model_id)
+
+from hqq.engine.hf import HQQModelForCausalLM, AutoTokenizer
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = HQQModelForCausalLM.from_quantized(model_id)
 ```
 
 *Limitations*: <br>