Llama-2-13b-chat-hf-4bit_g64-HQQ

This is a version of the Llama-2-13b-chat-hf model quantized to 4 bits via Half-Quadratic Quantization (HQQ): https://mobiusml.github.io/hqq_blog/
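To give a rough feel for what "4-bit" with grouping means, here is a minimal sketch of plain round-to-nearest group-wise quantization in pure Python. It is not the HQQ algorithm (HQQ additionally optimizes the quantization parameters, as described in the blog post linked above), and it assumes that the `4bit_g64` suffix in the model name denotes 4-bit weights with a group size of 64. All function names here are hypothetical and not part of the HQQ library.

```python
def quantize_group(weights, n_bits=4):
    """Round-to-nearest affine quantization of one group of floats (illustrative only)."""
    q_max = 2 ** n_bits - 1                   # 15 for 4-bit codes
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / q_max or 1.0    # fall back to 1.0 for a constant group
    codes = [min(q_max, max(0, round((w - w_min) / scale))) for w in weights]
    return codes, scale, w_min                # w_min acts as the zero offset


def quantize(weights, group_size=64, n_bits=4):
    """Split a flat weight list into groups, each with its own scale and offset."""
    return [
        quantize_group(weights[start:start + group_size], n_bits)
        for start in range(0, len(weights), group_size)
    ]


def dequantize(groups):
    """Rebuild approximate float weights from the integer codes."""
    restored = []
    for codes, scale, zero in groups:
        restored.extend(code * scale + zero for code in codes)
    return restored


# Toy data standing in for one row of a weight matrix (assumption, not real weights).
weights = [((i * 37) % 101) / 50.0 - 1.0 for i in range(128)]
groups = quantize(weights)
restored = dequantize(groups)
```

Each group of 64 weights shares one scale and one offset, so the reconstruction error of any weight is bounded by half of its group's scale.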

Basic Usage

To run the model, install the HQQ library from https://github.com/mobiusml/hqq and use it as follows:

model_id = 'mobiuslabsgmbh/Llama-2-13b-chat-hf-4bit_g64-HQQ'

from hqq.engine.hf import HQQModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model     = HQQModelForCausalLM.from_quantized(model_id)

Basic Chat Example

model_id = 'mobiuslabsgmbh/Llama-2-13b-chat-hf-4bit_g64-HQQ'

from hqq.engine.hf import HQQModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model     = HQQModelForCausalLM.from_quantized(model_id)

##########################################################################################################
import transformers
from threading import Thread

from sys import stdout
def print_flush(data):
    stdout.write("\r" + data)
    stdout.flush()

#Adapted from https://huggingface.co/spaces/huggingface-projects/llama-2-7b-chat/blob/main/app.py
def process_conversation(chat):
    system_prompt = chat['system_prompt']
    chat_history  = chat['chat_history']
    message       = chat['message']

    conversation = []
    if system_prompt:
        conversation.append({"role": "system", "content": system_prompt})
    for user, assistant in chat_history:
        conversation.extend([{"role": "user", "content": user}, {"role": "assistant", "content": assistant}])
    conversation.append({"role": "user", "content": message})

    return tokenizer.apply_chat_template(conversation, return_tensors="pt").to('cuda')

def chat_processor(chat, max_new_tokens=100, do_sample=True):
    tokenizer.use_default_system_prompt = False
    streamer = transformers.TextIteratorStreamer(tokenizer, timeout=10.0, skip_prompt=True, skip_special_tokens=True)

    generate_params = dict(
        {"input_ids": process_conversation(chat)},
        streamer=streamer,
        max_new_tokens=max_new_tokens,
        do_sample=do_sample,
        top_p=0.90,
        top_k=50,
        temperature= 0.6,
        num_beams=1,
        repetition_penalty=1.2,
    )

    t = Thread(target=model.generate, kwargs=generate_params)
    t.start()

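    # Collect and print the streamed text as it arrives
    outputs = []
    for text in streamer:
        outputs.append(text)
        print_flush("".join(outputs))

    t.join()  # wait for the generation thread to finish before returning
    return outputs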

###################################################################################################

outputs = chat_processor({'system_prompt': "You are a helpful assistant.",
                          'chat_history': [],
                          'message': "How can I build a car?"},
                         max_new_tokens=1000, do_sample=False)

Output:

Wow, that's an exciting project! Building a car from scratch can be a challenging but rewarding experience. Here are some general steps you might consider as you embark on this journey:

  1. Define your goals and requirements: What kind of car do you want to build? Do you have any specific performance or design preferences? How many passengers will the car seat? Answers to these questions will help guide your decisions throughout the process.
  2. Choose a platform: Decide on the type of vehicle you want to build (e.g., sedan, SUV, sports car) and select a suitable platform (chassis and body style). This will influence the choice of components and materials for the rest of the build.
  3. Design and prototype: Sketch out your ideas and create a detailed set of blueprints or computer-aided designs (CADs). Build a scale model or mockup to test fit and form before committing to final plans.
  4. Source materials and components: Purchase or fabricate the necessary parts, including the frame, suspension, steering system, brakes, engine, transmission, and electrical components. You may need to customize certain pieces or find specialized suppliers.
  5. Assemble the chassis: Start by building the frame and attaching the suspension, steering, and brake systems. Ensure everything is properly aligned and balanced.
  6. Install the powertrain: Select and install the appropriate engine and transmission, considering factors like power output, fuel efficiency, and compatibility with your chosen platform.
  7. Add the bodywork: Install the body panels, paying close attention to gaps, fits, and finishes. Use rustproofing methods to protect the metal components.
  8. Integrate the electrical and electronics: Install the wiring harness, batteries, and other essential electrical components. Consider adding advanced features like infotainment systems or driver assistance technologies.
  9. Test and refine: Once the major components are in place, perform thorough tests to ensure proper functioning and durability. Make any necessary adjustments or upgrades based on your testing results.
  10. Register and insure your creation: Depending on where you live, you may need to register your DIY car with local authorities and obtain insurance coverage. Be sure to research the laws and regulations in your area regarding homemade vehicles.

Please note that building a car from scratch can be a complex and time-consuming process, requiring significant expertise, tools, and resources. It's important to approach this project carefully and thoughtfully, taking into account safety considerations, legal requirements, and your own skills and limitations. If you're new to car construction, it may be helpful to seek guidance from experienced enthusiasts or professionals in the field. Good luck with your project!
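The example above uses an empty `chat_history`. As a sketch of how a follow-up turn could be passed, the snippet below mirrors the message-building logic of `process_conversation` without the tokenizer, so the resulting structure can be inspected on its own. The helper name `build_conversation` and the sample texts are hypothetical; `chat_history` is assumed to be a list of `(user, assistant)` pairs, as the loop in `process_conversation` implies.

```python
def build_conversation(chat):
    """Return the list of role/content messages for a chat dict (no tokenization)."""
    conversation = []
    if chat.get("system_prompt"):
        conversation.append({"role": "system", "content": chat["system_prompt"]})
    # Each history entry is a (user message, assistant reply) pair
    for user, assistant in chat.get("chat_history", []):
        conversation.append({"role": "user", "content": user})
        conversation.append({"role": "assistant", "content": assistant})
    conversation.append({"role": "user", "content": chat["message"]})
    return conversation


follow_up = {
    "system_prompt": "You are a helpful assistant.",
    "chat_history": [("How can I build a car?", "Here are some general steps...")],
    "message": "Which of those steps is the most expensive?",
}
messages = build_conversation(follow_up)
roles = [m["role"] for m in messages]
```

A dict shaped like `follow_up` can be passed to `chat_processor` in the same way as the single-turn example.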


Limitations:
- Only supports single-GPU runtime.
- Not compatible with Hugging Face's PEFT.
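Since the model has to fit on a single GPU, a back-of-the-envelope estimate of the quantized weight size can be useful. The sketch below is only an approximation under stated assumptions: roughly 13e9 parameters, 4-bit codes, a group size of 64, and 32 bits of per-group metadata (for example a 16-bit scale and a 16-bit zero point). The actual storage layout used by HQQ may differ, and unquantized layers, activations, and the KV cache are ignored. The function name is hypothetical.

```python
def estimate_weight_bytes(n_params, n_bits=4, group_size=64, meta_bits_per_group=32):
    """Approximate bytes for quantized weights plus per-group metadata."""
    # Metadata cost is shared by all weights in a group
    bits_per_weight = n_bits + meta_bits_per_group / group_size
    return n_params * bits_per_weight / 8


# Assumed parameter count of about 13 billion
approx_gib = estimate_weight_bytes(13e9) / 2**30
print(f"approx. {approx_gib:.2f} GiB for the quantized weights")
```

Under these assumptions each weight costs 4.5 bits on average, which puts the quantized weights at a little under 7 GiB.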
