Meta-Llama-3-13B-Instruct

Meta-Llama-3-13B-Instruct is a self-merge of meta-llama/Meta-Llama-3-8B-Instruct made with MergeKit. It stacks overlapping layer ranges of the same base model using the passthrough merge method.

Configuration

The following YAML configuration was used to produce this model:

slices:
- sources:
  - layer_range: [0, 16]
    model: meta-llama/Meta-Llama-3-8B-Instruct
- sources:
  - layer_range: [4, 24]
    model: meta-llama/Meta-Llama-3-8B-Instruct
- sources:
  - layer_range: [8, 31]
    model: meta-llama/Meta-Llama-3-8B-Instruct
merge_method: passthrough
dtype: float16
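
As a rough check on what this configuration produces, the sketch below counts the stacked layers and estimates the parameter total. It assumes that each layer_range is half-open, [start, end), and it uses the published Llama 3 8B dimensions, which are not stated in this card; treat both as assumptions rather than confirmed details of the merge.

```python
# Slices copied from the YAML configuration above.
slices = [(0, 16), (4, 24), (8, 31)]

# Assumption: each layer_range is half-open, [start, end).
layers_per_slice = [end - start for start, end in slices]
total_layers = sum(layers_per_slice)

# Assumed Llama 3 8B dimensions (taken from the base model's config, not this card).
hidden, intermediate, vocab = 4096, 14336, 128256
n_heads, n_kv_heads = 32, 8
kv_dim = n_kv_heads * (hidden // n_heads)  # grouped-query attention: 8 * 128 = 1024

attn = 2 * hidden * hidden + 2 * hidden * kv_dim  # q and o projections, then k and v
mlp = 3 * hidden * intermediate                   # gate, up and down projections
norms = 2 * hidden                                # two RMSNorm weight vectors per layer
per_layer = attn + mlp + norms

embeddings = 2 * vocab * hidden  # input embeddings plus an untied output head
total_params = total_layers * per_layer + embeddings + hidden  # + final norm

print(layers_per_slice, total_layers)   # [16, 20, 23] 59
print(f"{total_params / 1e9:.1f}B")     # 13.9B
```

Under these assumptions the merged model has 59 decoder layers, and the final layer of the base model (index 31) is not part of the stack.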

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "andrijdavid/Meta-Llama-3-13B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # matches the dtype in the merge configuration
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
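
To make the role of `<|eot_id|>` in the terminators list clearer, here is a minimal sketch of the prompt layout that the Llama 3 chat template is understood to produce. The helper `build_llama3_prompt` is hypothetical and only approximates the template; the template shipped with the tokenizer remains the authoritative version.

```python
def build_llama3_prompt(messages):
    """Approximate the Llama 3 chat layout as a plain string (illustrative only)."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the content, then <|eot_id|>.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    # Counterpart of add_generation_prompt=True: open an assistant turn.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
prompt = build_llama3_prompt(messages)
print(prompt)
```

Because every turn ends with `<|eot_id|>`, the model emits that token when it finishes its own reply, which is why the usage code passes it to generate alongside the tokenizer's regular end-of-sequence token.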

Model size: 13.9B params (Safetensors, tensor type FP16)