metadata

base_model: tog/TinyLlama-1.1B-alpaca-chat-v1.5
datasets:
  - tatsu-lab/alpaca
inference: false
language:
  - en
license: apache-2.0
model_creator: tog
model_name: TinyLlama-1.1B-alpaca-chat-v1.5
pipeline_tag: text-generation
quantized_by: afrideva
tags:
  - gguf
  - ggml
  - quantized
  - q2_k
  - q3_k_m
  - q4_k_m
  - q5_k_m
  - q6_k
  - q8_0
widget:
  - text: >-
      ###Instruction:\nWhat is a large language model? Be concise\n\n###
      Response:\n

tog/TinyLlama-1.1B-alpaca-chat-v1.5-GGUF

Quantized GGUF model files for TinyLlama-1.1B-alpaca-chat-v1.5 from tog

Name	Quant method	Size
tinyllama-1.1b-alpaca-chat-v1.5.q2_k.gguf	q2_k	482.14 MB
tinyllama-1.1b-alpaca-chat-v1.5.q3_k_m.gguf	q3_k_m	549.85 MB
tinyllama-1.1b-alpaca-chat-v1.5.q4_k_m.gguf	q4_k_m	667.81 MB
tinyllama-1.1b-alpaca-chat-v1.5.q5_k_m.gguf	q5_k_m	782.04 MB
tinyllama-1.1b-alpaca-chat-v1.5.q6_k.gguf	q6_k	903.41 MB
tinyllama-1.1b-alpaca-chat-v1.5.q8_0.gguf	q8_0	1.17 GB
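
As a rough sketch of how one of the files above might be loaded, the snippet below downloads a GGUF file and runs one completion with llama-cpp-python. Several things here are assumptions rather than statements from this card: the repository id `afrideva/TinyLlama-1.1B-alpaca-chat-v1.5-GGUF`, the helper names `gguf_filename` and `generate`, the `n_ctx=2048` setting, and the availability of the `huggingface_hub` and `llama-cpp-python` packages.

```python
# Sketch only: assumes `pip install huggingface_hub llama-cpp-python`.
REPO_ID = "afrideva/TinyLlama-1.1B-alpaca-chat-v1.5-GGUF"  # assumed repository id
QUANTS = ("q2_k", "q3_k_m", "q4_k_m", "q5_k_m", "q6_k", "q8_0")  # from the table


def gguf_filename(quant: str) -> str:
    """Map a quant method from the table to its file name (hypothetical helper)."""
    if quant not in QUANTS:
        raise ValueError(f"unknown quant method: {quant!r}")
    return f"tinyllama-1.1b-alpaca-chat-v1.5.{quant}.gguf"


def generate(instruction: str, quant: str = "q4_k_m", max_tokens: int = 128) -> str:
    """Download one GGUF file and run a single completion (needs network access)."""
    # Imported lazily so gguf_filename() works without these packages installed.
    from huggingface_hub import hf_hub_download
    from llama_cpp import Llama

    model_path = hf_hub_download(repo_id=REPO_ID, filename=gguf_filename(quant))
    llm = Llama(model_path=model_path, n_ctx=2048, verbose=False)  # n_ctx is assumed
    # Short Alpaca-style prompt, following the format in the original model card.
    prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
    result = llm(prompt, max_tokens=max_tokens, stop=["### Instruction:"])
    return result["choices"][0]["text"].strip()
```

Calling `generate("What is a large language model? Be concise.")` would download the q4_k_m file on first use. Smaller quants such as q2_k trade output quality for a smaller download and memory footprint.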

Original Model Card:

This Model

This is a chat model fine-tuned from PY007/TinyLlama-1.1B-intermediate-step-715k-1.5T on the tatsu-lab/stanford_alpaca dataset. The prompt format is:

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
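
A minimal sketch of filling in the template above from Python. The helper name `build_prompt` is hypothetical and not part of the original card, and the trailing newline after `### Response:` is an assumption based on the usage example in this card.

```python
# Alpaca-style template copied from the model card above.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"  # trailing newline is an assumption
)


def build_prompt(instruction: str) -> str:
    """Return the full prompt for one instruction (hypothetical helper)."""
    # Surrounding whitespace is stripped so the section markers stay aligned.
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())
```

The returned string can be passed directly as the first argument of a text-generation pipeline.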

You can use it with the transformers library:

import torch
import transformers
from transformers import AutoTokenizer

model = "tog/TinyLlama-1.1B-alpaca-chat-v1.5"
tokenizer = AutoTokenizer.from_pretrained(model)

# Build a half-precision text-generation pipeline.
# device_map="auto" requires the accelerate package.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample one completion. max_length counts prompt tokens plus generated tokens.
sequences = pipeline(
    '###Instruction:\nWhat is a large language model? Be concise.\n\n### Response:\n',
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,
)

# generated_text contains the prompt followed by the model's completion.
for seq in sequences:
    print(f"Result: {seq['generated_text']}")

You should get something along these lines. Because sampling is enabled (do_sample=True), the exact text varies between runs:

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Result: ###Instruction:
What is a large language model? Be concise.

### Response:
A large language model is a type of natural language understanding model that can learn to accurately recognize and interpret text data by understanding the context of words. Languages used for text understanding are typically trained on a corpus of text data.
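
As the sample output shows, the generated text repeats the prompt before the answer. One possible way to keep only the answer is sketched below; `extract_response` is a hypothetical helper, and it assumes the prompt contains the `### Response:` marker used above.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(generated_text: str) -> str:
    """Return the text after the first response marker (hypothetical helper)."""
    # partition() splits on the first occurrence, which is the marker in the prompt.
    _, marker, answer = generated_text.partition(RESPONSE_MARKER)
    # If the marker is missing, fall back to the whole text.
    return answer.strip() if marker else generated_text.strip()
```

The transformers text-generation pipeline also accepts `return_full_text=False`, which omits the prompt from `generated_text` and may make a helper like this unnecessary.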