Edit model card
YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)

Quantization made by Richard Erkhov.

Github

Discord

Request more models

llama-2-70b-fb16-guanaco-1k - GGUF

Original model description:

license: cc-by-nc-4.0 language: - en pipeline_tag: text-generation

quantumaikr/llama-2-70b-fb16-guanaco-1k

Model Description

quantumaikr/llama-2-70b-fb16-guanaco-1k is a Llama2 70B model finetuned on an guanaco, mlabonne/guanaco-llama2-1k Dataset

Usage

Start chatting with quantumaikr/llama-2-70b-fb16-guanaco-1k using the following code snippet:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained("quantumaikr/llama-2-70b-fb16-guanaco-1k")
model = AutoModelForCausalLM.from_pretrained("quantumaikr/llama-2-70b-fb16-guanaco-1k", torch_dtype=torch.float16, device_map="auto")

system_prompt = "### System:\nYou are QuantumLM, an AI that follows instructions extremely well. Help as much as you can. Remember, be safe, and don't do anything illegal.\n\n"

message = "Write me a poem please"
prompt = f"{system_prompt}### User: {message}\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
output = model.generate(**inputs, do_sample=True, top_p=0.95, top_k=0, max_new_tokens=256)

print(tokenizer.decode(output[0], skip_special_tokens=True))

QuantumLM should be used with this prompt format:

### System:
This is a system prompt, please behave and help the user.

### User:
Your prompt here

### Assistant
The output of QuantumLM

Use and Limitations

Intended Use

These models are intended for research only, in adherence with the CC BY-NC-4.0 license.

Limitations and bias

Although the aforementioned dataset helps to steer the base language models into "safer" distributions of text, not all biases and toxicity can be mitigated through fine-tuning. We ask that users be mindful of such potential issues that can arise in generated responses. Do not treat model outputs as substitutes for human judgment or as sources of truth. Please use it responsibly.

Contact us : hi@quantumai.kr

Downloads last month
43
GGUF
Model size
69B params
Architecture
llama

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

Inference API
Unable to determine this model's library. Check the docs .