
Example usage:

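import torch
from transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

# Assumed settings: the snippet uses `device` and `max_length` but never defined them.
device = "cuda" if torch.cuda.is_available() else "cpu"
max_length = 256

tokenizer = AutoTokenizer.from_pretrained("s3nh/TinyLLama-1.1B-MoE")
model = AutoModelForCausalLM.from_pretrained("s3nh/TinyLLama-1.1B-MoE").to(device)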

input_text = """
###Input: You are a pirate. Tell me a story about a wrecked ship.
###Response:
"""

input_ids = tokenizer.encode(input_text, return_tensors='pt').to(device)
output = model.generate(inputs=input_ids,
                        max_length=max_length,
                        do_sample=True,
                        top_k=10,
                        temperature=0.7,
                        pad_token_id=tokenizer.eos_token_id,
                        attention_mask=input_ids.new_ones(input_ids.shape))
# Print the decoded text so the result is visible outside a notebook as well.
print(tokenizer.decode(output[0], skip_special_tokens=True))
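
The prompt in the example follows a simple `###Input:` / `###Response:` layout. A minimal sketch of two helpers for that layout is shown below; `build_prompt` and `extract_response` are hypothetical names introduced here for illustration and are not part of the model or of transformers. The sketch assumes the decoded output contains the prompt followed by the continuation, which is the usual behaviour of `generate` for causal language models.

```python
def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the ###Input / ###Response layout used above."""
    return f"\n###Input: {instruction.strip()}\n###Response:\n"


def extract_response(generated: str) -> str:
    """Return the text after the first ###Response: marker.

    Falls back to the whole (stripped) string if the marker is missing.
    """
    marker = "###Response:"
    _, found, tail = generated.partition(marker)
    return tail.strip() if found else generated.strip()


prompt = build_prompt("You are a pirate. Tell me a story about a wrecked ship.")
# Stand-in for a decoded model output: the prompt plus some continuation.
decoded = prompt + "Arr, gather round..."
print(extract_response(decoded))
```

With the snippet above, `build_prompt(...)` could replace the hand-written `input_text`, and `extract_response(...)` could be applied to the decoded string to drop the echoed prompt.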

This model was made possible by the tremendous work of the mergekit developers. I merged several TinyLlama models into a mixture of experts. The config used is below:

base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
experts:
  - source_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
    positive_prompts:
    - "chat"
    - "assistant"
    - "tell me"
    - "explain"
  - source_model: 78health/TinyLlama_1.1B-function-calling
    positive_prompts:
    - "code"
    - "python"
    - "javascript"
    - "programming"
    - "algorithm"
  - source_model: phanerozoic/Tiny-Pirate-1.1b-v0.1
    positive_prompts:
    - "storywriting"
    - "write"
    - "scene"
    - "story"
    - "character"
  - source_model: Tensoic/TinyLlama-1.1B-3T-openhermes
    positive_prompts:
    - "reason"
    - "provide"
    - "instruct"
    - "summarize"
    - "count"
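
A quick sanity check one might run on such a config before merging is sketched below. It assumes the config has already been loaded into a Python dict (for example with a YAML parser); here the dict is written out by hand to mirror the config above. `check_experts` is a hypothetical helper introduced for illustration and is not part of mergekit.

```python
from collections import Counter

# Hand-written mirror of the merge config shown above.
config = {
    "base_model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    "experts": [
        {"source_model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
         "positive_prompts": ["chat", "assistant", "tell me", "explain"]},
        {"source_model": "78health/TinyLlama_1.1B-function-calling",
         "positive_prompts": ["code", "python", "javascript", "programming", "algorithm"]},
        {"source_model": "phanerozoic/Tiny-Pirate-1.1b-v0.1",
         "positive_prompts": ["storywriting", "write", "scene", "story", "character"]},
        {"source_model": "Tensoic/TinyLlama-1.1B-3T-openhermes",
         "positive_prompts": ["reason", "provide", "instruct", "summarize", "count"]},
    ],
}


def check_experts(cfg):
    """Return (number of experts, prompts listed under more than one expert)."""
    counts = Counter()
    for expert in cfg["experts"]:
        if not expert.get("source_model"):
            raise ValueError("expert entry without a source_model")
        if not expert.get("positive_prompts"):
            raise ValueError(f"no positive_prompts for {expert['source_model']}")
        # Count each prompt once per expert so duplicates across experts stand out.
        counts.update(set(expert["positive_prompts"]))
    shared = sorted(p for p, n in counts.items() if n > 1)
    return len(cfg["experts"]), shared


n_experts, shared = check_experts(config)
print(f"{n_experts} experts, prompts shared between experts: {shared}")
```

Prompts that appear under more than one expert would give the router less to distinguish the experts by, so an empty `shared` list is the result one would hope for here.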
Model size: 3.38B params (Safetensors). Tensor type: BF16.
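
The 3.38B figure is smaller than four times 1.1B, and a back-of-the-envelope count suggests why. The sketch below assumes the usual TinyLlama-1.1B architecture values (hidden size 2048, MLP size 5632, 22 layers, 32 attention heads with 4 key/value heads, vocabulary of 32000, untied output head), none of which are stated in this card. It also assumes a Mixtral-style merge in which only the MLP blocks are replicated per expert, while attention, embeddings, and norms are kept once and a small router is added to each layer.

```python
# Assumed TinyLlama-1.1B architecture values (not stated in this card).
HIDDEN = 2048
INTERMEDIATE = 5632
LAYERS = 22
VOCAB = 32000
N_HEADS = 32
N_KV_HEADS = 4
N_EXPERTS = 4  # four experts, as in the merge config

head_dim = HIDDEN // N_HEADS
kv_dim = head_dim * N_KV_HEADS

# Per-layer weights: q and o projections are HIDDEN x HIDDEN, k and v are HIDDEN x kv_dim.
attn_per_layer = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim
# Gate, up, and down projections of the MLP.
mlp_per_layer = 3 * HIDDEN * INTERMEDIATE
# Router: one linear map from the hidden state to a score per expert.
router_per_layer = HIDDEN * N_EXPERTS

# Two norms per layer plus the final norm; input embeddings plus output head.
norms = 2 * HIDDEN * LAYERS + HIDDEN
embeddings = 2 * VOCAB * HIDDEN

dense_total = LAYERS * (attn_per_layer + mlp_per_layer) + norms + embeddings
moe_total = (
    LAYERS * (attn_per_layer + N_EXPERTS * mlp_per_layer + router_per_layer)
    + norms
    + embeddings
)

print(f"single dense model: {dense_total / 1e9:.2f}B")
print(f"4-expert MoE:       {moe_total / 1e9:.2f}B")
```

Under these assumptions the dense model comes out at about 1.10B parameters and the four-expert merge at about 3.38B, which is consistent with the reported model size.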
