license: apache-2.0
Micro Llama v0 (Development)
Micro Llama v0 is a lightweight and experimental version of the LlamaForCausalLM model designed for development and testing purposes. This repository contains the necessary model configuration, tokenizer, and generation settings to run a minimal Llama architecture.
Model Overview
Micro Llama v0 is based on the LlamaForCausalLM architecture. It is tailored to fit resource-constrained environments for testing the foundational components of a transformer-based language model. This version features:
- 1 hidden layer
- 2048 hidden size
- 32 attention heads
- 5632 intermediate size
- Max position embeddings of 2048
- Vocabulary size of 32,000
These parameters make the model compact and suitable for development, while still maintaining key characteristics of the Llama architecture.
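As a rough sanity check on "compact", the parameter count implied by these settings can be estimated by hand. The sketch below assumes standard Llama layer shapes with no bias terms, as many key/value heads as query heads, and an output projection that is not tied to the input embeddings. None of these assumptions is stated in this README, so treat the total as an estimate rather than the model's reported size.

```python
# Values listed in the overview above.
hidden_size = 2048
num_layers = 1
num_heads = 32
intermediate_size = 5632
vocab_size = 32_000

# ASSUMPTIONS (not stated in this README): no bias terms, key/value heads
# equal to query heads, and untied input/output embeddings.
head_dim = hidden_size // num_heads

embedding = vocab_size * hidden_size
attention = 4 * hidden_size * hidden_size      # q, k, v, o projections
mlp = 3 * hidden_size * intermediate_size      # gate, up, down projections
norms = 2 * hidden_size                        # two RMSNorm weights per layer
per_layer = attention + mlp + norms

final_norm = hidden_size
lm_head = vocab_size * hidden_size

total = embedding + num_layers * per_layer + final_norm + lm_head

print(f"head_dim     = {head_dim}")
print(f"per layer    = {per_layer:,}")
print(f"total params = {total:,}")
```

Under these assumptions, most of the weights sit in the embedding and output matrices, which is what one would expect with a single transformer layer and a 32,000-token vocabulary.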
Files and Configuration
- config.json: Contains the model architecture configuration, such as hidden size, number of attention heads, hidden layers, and activation functions.
- generation_config.json: Specifies generation parameters, including max length and token behavior.
- model.safetensors: Stores the model weights in a safe and efficient format.
- special_tokens_map.json: Maps the special tokens used by the model, including <s>, </s>, <unk>, and </s> (for padding).
- tokenizer.json: Defines the tokenizer configuration, including vocabulary size and token mapping.
- tokenizer_config.json: Further configures the tokenizer, specifying token types, maximum sequence length, and other tokenizer options.
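To confirm that a local copy of the repository is complete, the file list above can be checked programmatically. This is a minimal sketch using only the standard library; `missing_files` is a hypothetical helper introduced for illustration, and the directory passed to it is a placeholder.

```python
from pathlib import Path

# File names as listed in this README.
EXPECTED_FILES = [
    "config.json",
    "generation_config.json",
    "model.safetensors",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
]


def missing_files(repo_dir):
    """Return the expected files that are absent from repo_dir (hypothetical helper)."""
    root = Path(repo_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # "." is a placeholder; point it at a local copy of the repository.
    missing = missing_files(".")
    print("All files present" if not missing else f"Missing: {missing}")
```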
Requirements
- Transformers version 4.44.0 or above
- PyTorch version compatible with the model's float32 tensor type
- safetensors package for loading model weights
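One way to check an environment against these requirements is to read the installed package versions with the standard library. The sketch below is illustrative: `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical helpers, and the comparison is deliberately simple (it looks only at the leading numeric components of a version string).

```python
from importlib import metadata

MIN_TRANSFORMERS = "4.44.0"  # minimum stated in the requirements above


def version_tuple(version):
    """Turn '4.44.0' into (4, 44, 0), stopping at the first non-numeric component."""
    parts = []
    for piece in version.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum=MIN_TRANSFORMERS):
    """Compare versions numerically, so that '4.100.0' counts as newer than '4.44.0'."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment():
    """Print the installed version of each required package, if any."""
    for package in ("transformers", "torch", "safetensors"):
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            print(f"{package}: not installed")
            continue
        note = ""
        if package == "transformers" and not meets_minimum(installed):
            note = f" (older than {MIN_TRANSFORMERS})"
        print(f"{package}: {installed}{note}")


if __name__ == "__main__":
    check_environment()
```

For pre-release or otherwise unusual version strings, a dedicated parser such as `packaging.version.Version` is likely a more robust choice than this hand-rolled comparison.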
Usage
- Clone the repository:
git clone https://github.com/your-repo/micro-llama.git
cd micro-llama
- Install the required dependencies:
pip install transformers safetensors torch
- Load the model in your code:
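from transformers import AutoTokenizer, LlamaForCausalLM

# AutoTokenizer loads the fast tokenizer from tokenizer.json; the slow
# LlamaTokenizer needs sentencepiece, which the install step above omits.
tokenizer = AutoTokenizer.from_pretrained("UnieAI-Wilson/micro-llama-0-dev")
# The stored weights are float32; torch_dtype casts them to float16 on load.
model = LlamaForCausalLM.from_pretrained("UnieAI-Wilson/micro-llama-0-dev", torch_dtype="float16")

# Tokenize a prompt, generate a continuation, and decode it back to text.
inputs = tokenizer("Your text here", return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))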
License
Micro Llama v0 is licensed under the Apache 2.0 License. See the LICENSE file for details.
Contribution
This is an experimental and evolving project. Contributions are welcome; feel free to submit issues or pull requests.
Disclaimer
This is an early-stage development version, and the model may undergo significant changes. It is not intended for production use.