---
license: llama2
train: false
inference: false
pipeline_tag: text-generation
---
## Llama-2-70b-chat-hf-2bit_g16_s128-HQQ
This is a version of the Llama-2-70b-chat-hf model quantized to 2 bits with Half-Quadratic Quantization (HQQ): https://mobiusml.github.io/hqq_blog/
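
To give a feel for what 2-bit quantization means in storage terms, here is a rough back-of-the-envelope sketch. It is not a measurement of this checkpoint: the parameter count is the nominal 70e9 taken from the model name, the helper `packed_weight_gigabytes` is hypothetical, and the estimate ignores the per-group scale and zero-point metadata that a quantized model also has to store, so the real footprint is larger.

```python
def packed_weight_gigabytes(num_params: float, nbits: int) -> float:
    """Size of the packed weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * nbits / 8 / 1e9

# Assumption: nominal parameter count read off the "70b" in the model name.
NOMINAL_PARAMS = 70e9

fp16_gb = packed_weight_gigabytes(NOMINAL_PARAMS, 16)
hqq_2bit_gb = packed_weight_gigabytes(NOMINAL_PARAMS, 2)

print(f"fp16 weights:  ~{fp16_gb:.1f} GB")
print(f"2-bit weights: ~{hqq_2bit_gb:.1f} GB (before quantization metadata)")
```
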
### Basic Usage
To run the model, install the HQQ library from https://github.com/mobiusml/hqq and use it as follows:
``` Python
model_id = 'mobiuslabsgmbh/Llama-2-70b-chat-hf-2bit_g16_s128-HQQ'
from hqq.engine.hf import HQQModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = HQQModelForCausalLM.from_quantized(model_id)
```
### Basic Chat Example
``` Python
model_id = 'mobiuslabsgmbh/Llama-2-70b-chat-hf-2bit_g16_s128-HQQ'
from hqq.engine.hf import HQQModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = HQQModelForCausalLM.from_quantized(model_id)
#Use the torch.compile-based HQQ backend
from hqq.core.quantize import HQQLinear, HQQBackend
HQQLinear.set_backend(HQQBackend.PYTORCH_COMPILE)
##########################################################################################################
import transformers
from threading import Thread
from sys import stdout
def print_flush(data):
    stdout.write("\r" + data)
    stdout.flush()
#Adapted from https://huggingface.co/spaces/huggingface-projects/llama-2-7b-chat/blob/main/app.py
def process_conversation(chat):
    system_prompt = chat['system_prompt']
    chat_history = chat['chat_history']
    message = chat['message']

    conversation = []
    if system_prompt:
        conversation.append({"role": "system", "content": system_prompt})
    for user, assistant in chat_history:
        conversation.extend([{"role": "user", "content": user}, {"role": "assistant", "content": assistant}])
    conversation.append({"role": "user", "content": message})

    return tokenizer.apply_chat_template(conversation, return_tensors="pt").to('cuda')
def chat_processor(chat, max_new_tokens=100, do_sample=True):
    tokenizer.use_default_system_prompt = False
    streamer = transformers.TextIteratorStreamer(tokenizer, timeout=10.0, skip_prompt=True, skip_special_tokens=True)

    generate_params = dict(
        {"input_ids": process_conversation(chat)},
        streamer=streamer,
        max_new_tokens=max_new_tokens,
        do_sample=do_sample,
        top_p=0.90,
        top_k=50,
        temperature=0.6,
        num_beams=1,
        repetition_penalty=1.2,
    )

    #Generate in a background thread so the streamer can be consumed here
    t = Thread(target=model.generate, kwargs=generate_params)
    t.start()

    outputs = []
    for text in streamer:
        outputs.append(text)
        print_flush("".join(outputs))

    return outputs
###################################################################################################
outputs = chat_processor({'system_prompt':"You are a helpful assistant.",
                          'chat_history':[],
                          'message':"How can I build a car?"
                         },
                         max_new_tokens=1000, do_sample=False)
```
<b>Output</b>:
<p>
Building a car is a complex process that involves designing, prototyping, testing, and manufacturing. Here are some general steps you can follow to build a car:
1. Design the car: Determine the type of car you want to build, including the size, shape, and features. Create a detailed set of blueprints or computer-aided design (CAD) drawings to guide your building process.
2. Source materials: Purchase or gather all the necessary materials, such as steel, aluminum, rubber, plastics, and any other components required for the car's body, frame, and engine.
3. Build the frame: Construct the frame, which is the foundation of the car. This includes creating the chassis, suspension, and steering systems.
4. Install the engine: Choose an appropriate engine and install it in the frame. Connect the engine to the transmission, exhaust system, and cooling system.
5. Add the body: Attach the body panels to the frame, including the hood, doors, trunk lid, and roof. Ensure proper alignment and fitment.
6. Install the electrical system: Connect the battery, starter, alternator, and wiring harness to the engine and other components. Install headlights, taillights, and other electrical accessories.
7. Add the brakes: Install the brake system, including the brake pads, rotors, calipers, and master cylinder. Connect the brake lines and bleed the system to remove air bubbles.
8. Install the interior: Fit the seats, dashboard, carpeting, and other interior components. Install the steering column, pedals, and shifter.
9. Test and inspect: Check the car's systems, including the brakes, suspension, and engine performance. Make sure everything is functioning properly and safely.
10. Register and insure: Obtain registration and insurance for your newly built car. Comply with local regulations and laws regarding vehicle ownership and operation.
Please note that this is a high-level overview of the process, and building a car can be a complex and time-consuming task. It requires specialized knowledge, skills, and tools, as well as a clean and organized workspace. Additionally, safety precautions should always be taken when working on vehicles, as they can be dangerous if mishandled.
If you are not experienced in automotive construction, it may be advisable to seek guidance from professionals or take a course in automotive mechanics before attempting to build a car.
----------------------------------------------------------------------------------------------------------------------------------
</p>
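
The message list that `process_conversation` hands to `tokenizer.apply_chat_template` in the chat example can be inspected without loading the model. The sketch below reproduces only that list-building step in plain Python; `build_conversation` is a hypothetical helper name, and the one-turn chat history is made up for illustration.

```python
def build_conversation(system_prompt, chat_history, message):
    """Return the role/content message list used as chat-template input."""
    conversation = []
    if system_prompt:
        conversation.append({"role": "system", "content": system_prompt})
    # chat_history is a list of (user_text, assistant_text) pairs
    for user, assistant in chat_history:
        conversation.append({"role": "user", "content": user})
        conversation.append({"role": "assistant", "content": assistant})
    # The new user message always comes last
    conversation.append({"role": "user", "content": message})
    return conversation

# Illustrative input: the history turn below is invented for this example.
example = build_conversation(
    system_prompt="You are a helpful assistant.",
    chat_history=[("Hi", "Hello! How can I help?")],
    message="How can I build a car?",
)

for turn in example:
    print(f'{turn["role"]:>9}: {turn["content"]}')
```

Passing an empty `system_prompt` simply omits the system message, which matches the `if system_prompt:` check in the original function.
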
*Limitations*: <br>
-Supports single-GPU inference only.<br>
-Not compatible with Hugging Face's PEFT.<br>
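
Given the single-GPU limitation, one common way to make sure a process only sees one device on a multi-GPU machine is to set `CUDA_VISIBLE_DEVICES` before any CUDA-using library is imported. This is a general CUDA convention rather than something specific to HQQ, and the index `"0"` below is an assumption; pick whichever device suits your machine.

```python
import os

# Assumption: GPU index "0" is the device to use.
# This must run before torch (or any other CUDA-using library) initializes CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
print(f"Visible GPU indices: {visible}")
```
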