Micro Llama v0 (Development)

Micro Llama v0 is a lightweight and experimental version of the LlamaForCausalLM model designed for development and testing purposes. This repository contains the necessary model configuration, tokenizer, and generation settings to run a minimal Llama architecture.

Model Overview

Micro Llama v0 is based on the LlamaForCausalLM architecture. It is scaled down to fit resource-constrained environments and to exercise the foundational components of a transformer-based language model. This version features:

  • Hidden layers: 1
  • Hidden size: 2048
  • Attention heads: 32
  • Intermediate size: 5632
  • Maximum position embeddings: 2048
  • Vocabulary size: 32,000

These parameters keep the model compact (roughly 175M parameters, stored as float32) and suitable for development while preserving the key characteristics of the Llama architecture; a matching configuration sketch follows.
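For reference, these hyperparameters map onto the standard Hugging Face LlamaConfig roughly as follows (a minimal sketch, assuming the transformers LlamaConfig API; any field not listed above is left at its default):

    from transformers import LlamaConfig

    # Mirrors the hyperparameters listed above; everything else
    # (e.g. num_key_value_heads, RoPE settings) keeps its default value.
    config = LlamaConfig(
        hidden_size=2048,
        num_hidden_layers=1,
        num_attention_heads=32,
        intermediate_size=5632,
        max_position_embeddings=2048,
        vocab_size=32000,
    )
    print(config)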

Files and Configuration

  • config.json: Contains the model architecture configuration, such as hidden size, number of attention heads, hidden layers, and activation functions.
  • generation_config.json: Specifies generation parameters, including maximum length and token behavior (see the inspection sketch after this list).
  • model.safetensors: Stores the model weights in a safe and efficient format.
  • special_tokens_map.json: Maps the special tokens used by the model: <s> (beginning of sequence), </s> (end of sequence), and <unk> (unknown); </s> is reused as the padding token.
  • tokenizer.json: Defines the tokenizer configuration, including vocabulary size and token mapping.
  • tokenizer_config.json: Further configures the tokenizer, specifying token types, maximum sequence length, and other tokenizer options.
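The architecture and generation settings shipped with the repository can be inspected without downloading the full weights. A minimal sketch, assuming the standard transformers AutoConfig and GenerationConfig loaders and the repository id used in the Usage section below:

    from transformers import AutoConfig, GenerationConfig

    repo = "UnieAI-Wilson/micro-llama-0-dev"

    # config.json -> architecture hyperparameters
    config = AutoConfig.from_pretrained(repo)
    print(config.hidden_size, config.num_hidden_layers, config.num_attention_heads)

    # generation_config.json -> default generation parameters
    gen_config = GenerationConfig.from_pretrained(repo)
    print(gen_config)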

Requirements

  • transformers version 4.44.0 or later
  • A PyTorch build compatible with the model's float32 weights
  • The safetensors package for loading the model weights (a quick check of all three is sketched below)
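Before loading the model, you can confirm that the required packages are available in your environment. A minimal sketch, assuming Python 3.8+ (for importlib.metadata):

    from importlib.metadata import version

    # Print the installed version of each required package.
    for pkg in ("transformers", "safetensors", "torch"):
        print(pkg, version(pkg))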

Usage

  1. Clone the repository:
    git clone https://github.com/your-repo/micro-llama.git
    cd micro-llama
    
  2. Install the required dependencies:
    pip install transformers safetensors torch
    
  3. Load the model in your code (a parameter-count sanity check is sketched after these steps):
    from transformers import LlamaForCausalLM, LlamaTokenizer

    # Load the tokenizer and the model weights (stored as float32).
    tokenizer = LlamaTokenizer.from_pretrained("UnieAI-Wilson/micro-llama-0-dev")
    model = LlamaForCausalLM.from_pretrained("UnieAI-Wilson/micro-llama-0-dev", torch_dtype="float32")

    # Tokenize a prompt, generate a continuation, and decode it back to text.
    inputs = tokenizer("Your text here", return_tensors="pt")
    outputs = model.generate(**inputs)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    

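As a quick sanity check after installation, you can confirm that the loaded checkpoint matches the configuration described above (roughly 175M parameters stored as float32). A minimal sketch, assuming the same repository id as in step 3:

    from transformers import LlamaForCausalLM

    # Load the checkpoint and report its parameter count and weight dtype.
    model = LlamaForCausalLM.from_pretrained("UnieAI-Wilson/micro-llama-0-dev")
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{n_params / 1e6:.0f}M parameters, dtype: {next(model.parameters()).dtype}")
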
License

Micro Llama v0 is licensed under the Apache 2.0 License. See the LICENSE file for details.

Contribution

This is an experimental and evolving project. Contributions are welcome; feel free to submit issues or pull requests.

Disclaimer

This is an early-stage development version, and the model may undergo significant changes. It is not intended for production use.
