---
license: llama2
train: false
inference: false
pipeline_tag: text-generation
---

## Llama-2-70b-chat-hf-2bit_g16_s128-HQQ
This is a version of the Llama-2-70b-chat-hf model quantized to 2-bit via Half-Quadratic Quantization (HQQ): https://mobiusml.github.io/hqq_blog/
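
To give a rough sense of what 2-bit quantization means for storage, here is a back-of-envelope sketch. It assumes a nominal 70e9 parameters (taken from the "70b" in the model name) and counts the weight bits only: it ignores quantization metadata such as scales and zero-points, any layers left unquantized, and runtime memory for activations and the KV cache, so the real footprint will be higher. The function name `weight_storage_gb` is illustrative and not part of HQQ.

``` Python
def weight_storage_gb(n_params: float, bits_per_weight: float) -> float:
    """Storage for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Assumption: nominal parameter count implied by "70b" in the model name.
N_PARAMS = 70e9

fp16_gb = weight_storage_gb(N_PARAMS, 16)  # half-precision baseline
q2_gb = weight_storage_gb(N_PARAMS, 2)     # 2-bit weights, metadata excluded

print(f"fp16: {fp16_gb:.1f} GB, 2-bit: {q2_gb:.1f} GB")
```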

### Basic Usage
To run the model, install the HQQ library from https://github.com/mobiusml/hqq and use it as follows:
``` Python
model_id = 'mobiuslabsgmbh/Llama-2-70b-chat-hf-2bit_g16_s128-HQQ'

from hqq.engine.hf import HQQModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = HQQModelForCausalLM.from_quantized(model_id)
```

### Basic Chat Example
``` Python
model_id = 'mobiuslabsgmbh/Llama-2-70b-chat-hf-2bit_g16_s128-HQQ'

from hqq.engine.hf import HQQModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = HQQModelForCausalLM.from_quantized(model_id)

##########################################################################################################
import transformers
from threading import Thread

from sys import stdout
def print_flush(data):
    # Overwrite the current console line with the text generated so far.
    stdout.write("\r" + data)
    stdout.flush()

#Adapted from https://huggingface.co/spaces/huggingface-projects/llama-2-7b-chat/blob/main/app.py
def process_conversation(chat):
    system_prompt = chat['system_prompt']
    chat_history = chat['chat_history']
    message = chat['message']

    # Build the role/content message list: optional system prompt,
    # past (user, assistant) turns, then the new user message.
    conversation = []
    if system_prompt:
        conversation.append({"role": "system", "content": system_prompt})
    for user, assistant in chat_history:
        conversation.extend([{"role": "user", "content": user}, {"role": "assistant", "content": assistant}])
    conversation.append({"role": "user", "content": message})

    return tokenizer.apply_chat_template(conversation, return_tensors="pt").to('cuda')

def chat_processor(chat, max_new_tokens=100, do_sample=True):
    tokenizer.use_default_system_prompt = False
    streamer = transformers.TextIteratorStreamer(tokenizer, timeout=10.0, skip_prompt=True, skip_special_tokens=True)

    # top_p, top_k, and temperature only take effect when do_sample=True.
    generate_params = dict(
        input_ids=process_conversation(chat),
        streamer=streamer,
        max_new_tokens=max_new_tokens,
        do_sample=do_sample,
        top_p=0.90,
        top_k=50,
        temperature=0.6,
        num_beams=1,
        repetition_penalty=1.2,
    )

    # Run generation in a background thread so the streamer can be consumed here.
    t = Thread(target=model.generate, kwargs=generate_params)
    t.start()

    outputs = []
    for text in streamer:
        outputs.append(text)
        print_flush("".join(outputs))

    return outputs

###################################################################################################

outputs = chat_processor({'system_prompt':"You are a helpful assistant.",
                          'chat_history':[],
                          'message':"How can I build a car?"
                          },
                         max_new_tokens=1000, do_sample=False)
```

<b>Output</b>:
<p>
Building a car is a complex process that involves designing, prototyping, testing, and manufacturing. Here are some general steps you can follow to build a car:

1. Design the car: Determine the type of car you want to build, including the size, shape, and features. Create a detailed set of blueprints or computer-aided design (CAD) drawings to guide your building process.
2. Source materials: Purchase or gather all the necessary materials, such as steel, aluminum, rubber, plastics, and any other components required for the car's body, frame, and engine.
3. Build the frame: Construct the frame, which is the foundation of the car. This includes creating the chassis, suspension, and steering systems.
4. Install the engine: Choose an appropriate engine and install it in the frame. Connect the engine to the transmission, exhaust system, and cooling system.
5. Add the body: Attach the body panels to the frame, including the hood, doors, trunk lid, and roof. Ensure proper alignment and fitment.
6. Install the electrical system: Connect the battery, starter, alternator, and wiring harness to the engine and other components. Install headlights, taillights, and other electrical accessories.
7. Add the brakes: Install the brake system, including the brake pads, rotors, calipers, and master cylinder. Connect the brake lines and bleed the system to remove air bubbles.
8. Install the interior: Fit the seats, dashboard, carpeting, and other interior components. Install the steering column, pedals, and shifter.
9. Test and inspect: Check the car's systems, including the brakes, suspension, and engine performance. Make sure everything is functioning properly and safely.
10. Register and insure: Obtain registration and insurance for your newly built car. Comply with local regulations and laws regarding vehicle ownership and operation.

Please note that this is a high-level overview of the process, and building a car can be a complex and time-consuming task. It requires specialized knowledge, skills, and tools, as well as a clean and organized workspace. Additionally, safety precautions should always be taken when working on vehicles, as they can be dangerous if mishandled.

If you are not experienced in automotive construction, it may be advisable to seek guidance from professionals or take a course in automotive mechanics before attempting to build a car.

----------------------------------------------------------------------------------------------------------------------------------
</p>
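
The chat example passes an empty `chat_history`. For multi-turn use, the history is a list of `(user, assistant)` pairs that get expanded into role/content messages before templating. The sketch below reproduces just that message-building step without the tokenizer, so the structure can be inspected on its own. The helper name `build_conversation` and the follow-up question are hypothetical, chosen for illustration.

``` Python
def build_conversation(system_prompt, chat_history, message):
    """Return the role/content message list for one chat turn (no tokenizer needed)."""
    conversation = []
    if system_prompt:
        conversation.append({"role": "system", "content": system_prompt})
    # Each past turn is a (user, assistant) pair.
    for user, assistant in chat_history:
        conversation.append({"role": "user", "content": user})
        conversation.append({"role": "assistant", "content": assistant})
    conversation.append({"role": "user", "content": message})
    return conversation


# Hypothetical two-turn exchange: one completed turn plus a new question.
chat = {
    "system_prompt": "You are a helpful assistant.",
    "chat_history": [("How can I build a car?", "Building a car is a complex process...")],
    "message": "Which of those steps is the most expensive?",
}

messages = build_conversation(**chat)
for m in messages:
    print(m["role"])
```

Because the dictionary uses the same three keys (`system_prompt`, `chat_history`, `message`), it can be passed unchanged to `chat_processor` from the example above.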

*Limitations*: <br>
-Only supports single GPU runtime.<br>
-Not compatible with HuggingFace's PEFT.<br>