Update README.md
README.md
CHANGED
@@ -8,6 +8,7 @@ pipeline_tag: text-generation
 ## Llama-2-70b-chat-hf-2bit_g16_s128-HQQ
 This is a version of the LLama-2-70B-chat-hf model quantized to 2-bit via Half-Quadratic Quantization (HQQ): https://mobiusml.github.io/hqq_blog/
 
+### Basic Usage
 To run the model, install the HQQ library from https://github.com/mobiusml/hqq and use it as follows:
 ``` Python
 from hqq.models.llama_hf import LlamaHQQ
@@ -20,6 +21,97 @@ tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
 model = LlamaHQQ.from_quantized(model_id)
 ```
 
+### Basic Chat Example
+``` Python
+import transformers
+from hqq.models.llama_hf import LlamaHQQ
+
+model_id = 'mobiuslabsgmbh/Llama-2-70b-chat-hf-2bit_g16_s128-HQQ'
+#Load the tokenizer
+tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
+#Load the model
+model = LlamaHQQ.from_quantized(model_id)
+
+##########################################################################################################
+from threading import Thread
+
+from sys import stdout
+def print_flush(data):
+    stdout.write("\r" + data)
+    stdout.flush()
+
+#Adapted from https://huggingface.co/spaces/huggingface-projects/llama-2-7b-chat/blob/main/app.py
+def process_conversation(chat):
+    system_prompt = chat['system_prompt']
+    chat_history = chat['chat_history']
+    message = chat['message']
+
+    conversation = []
+    if system_prompt:
+        conversation.append({"role": "system", "content": system_prompt})
+    for user, assistant in chat_history:
+        conversation.extend([{"role": "user", "content": user}, {"role": "assistant", "content": assistant}])
+    conversation.append({"role": "user", "content": message})
+
+    return tokenizer.apply_chat_template(conversation, return_tensors="pt").to('cuda')
+
+def chat_processor(chat, max_new_tokens=100, do_sample=True):
+    tokenizer.use_default_system_prompt = False
+    streamer = transformers.TextIteratorStreamer(tokenizer, timeout=10.0, skip_prompt=True, skip_special_tokens=True)
+
+    generate_params = dict(
+        {"input_ids": process_conversation(chat)},
+        streamer=streamer,
+        max_new_tokens=max_new_tokens,
+        do_sample=do_sample,
+        top_p=0.90,
+        top_k=50,
+        temperature= 0.6,
+        num_beams=1,
+        repetition_penalty=1.2,
+    )
+
+    t = Thread(target=model.generate, kwargs=generate_params)
+    t.start()
+
+    outputs = []
+    for text in streamer:
+        outputs.append(text)
+        print_flush("".join(outputs))
+
+    return outputs
+
+###################################################################################################
+
+outputs = chat_processor({'system_prompt':"You are a helpful assistant.",
+                          'chat_history':[],
+                          'message':"How can I build a car?"
+                         },
+                         max_new_tokens=1000, do_sample=False)
+```
+
+<b>Output</b>:
+<p>
+Building a car is a complex process that involves designing, prototyping, testing, and manufacturing. Here are some general steps you can follow to build a car:
+
+1. Design the car: Determine the type of car you want to build, including the size, shape, and features. Create a detailed set of blueprints or computer-aided design (CAD) drawings to guide your building process.
+2. Source materials: Purchase or gather all the necessary materials, such as steel, aluminum, rubber, plastics, and any other components required for the car's body, frame, and engine.
+3. Build the frame: Construct the frame, which is the foundation of the car. This includes creating the chassis, suspension, and steering systems.
+4. Install the engine: Choose an appropriate engine and install it in the frame. Connect the engine to the transmission, exhaust system, and cooling system.
+5. Add the body: Attach the body panels to the frame, including the hood, doors, trunk lid, and roof. Ensure proper alignment and fitment.
+6. Install the electrical system: Connect the battery, starter, alternator, and wiring harness to the engine and other components. Install headlights, taillights, and other electrical accessories.
+7. Add the brakes: Install the brake system, including the brake pads, rotors, calipers, and master cylinder. Connect the brake lines and bleed the system to remove air bubbles.
+8. Install the interior: Fit the seats, dashboard, carpeting, and other interior components. Install the steering column, pedals, and shifter.
+9. Test and inspect: Check the car's systems, including the brakes, suspension, and engine performance. Make sure everything is functioning properly and safely.
+10. Register and insure: Obtain registration and insurance for your newly built car. Comply with local regulations and laws regarding vehicle ownership and operation.
+
+Please note that this is a high-level overview of the process, and building a car can be a complex and time-consuming task. It requires specialized knowledge, skills, and tools, as well as a clean and organized workspace. Additionally, safety precautions should always be taken when working on vehicles, as they can be dangerous if mishandled.
+
+If you are not experienced in automotive construction, it may be advisable to seek guidance from professionals or take a course in automotive mechanics before attempting to build a car.
+
+----------------------------------------------------------------------------------------------------------------------------------
+</p>
+
 *Limitations*: <br>
 -Only supports single GPU runtime.<br>
 -Not compatible with HuggingFace's PEFT.<br>
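
Note on the added chat example: the message list that `process_conversation` builds can be checked without loading the 70B model or a GPU. The sketch below is an illustration only and is not part of this commit. `build_conversation` is a hypothetical helper name that mirrors just the list-building step and stops before `tokenizer.apply_chat_template`; it assumes the same `chat` dictionary keys used in the README (`system_prompt`, `chat_history`, `message`).

```python
def build_conversation(chat):
    """Return the role/content messages for one chat turn.

    Hypothetical helper (not in the README): it mirrors the list-building
    part of process_conversation and does no tokenization.
    """
    conversation = []
    # An empty system prompt is skipped entirely.
    if chat["system_prompt"]:
        conversation.append({"role": "system", "content": chat["system_prompt"]})
    # chat_history is a list of (user, assistant) pairs from earlier turns.
    for user, assistant in chat["chat_history"]:
        conversation.append({"role": "user", "content": user})
        conversation.append({"role": "assistant", "content": assistant})
    # The new user message always goes last.
    conversation.append({"role": "user", "content": chat["message"]})
    return conversation


example = build_conversation({
    "system_prompt": "You are a helpful assistant.",
    "chat_history": [("Hi", "Hello! How can I help?")],
    "message": "How can I build a car?",
})
for turn in example:
    print(turn["role"], "->", turn["content"])
```

Passing earlier turns through `chat_history` in this shape is what would let the README's `chat_processor` continue a multi-turn conversation, since the example in the commit only exercises the empty-history case.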