Pico-OpenLAiNN-testing 🤗
Hey there fellow researchers, developers, and AI enthusiasts! Today I'm releasing a smol open LLM. This is mainly just a test and I plan to release actually usable models in the near future.
Models Overview
- Pico-OpenLAiNN-100M-SmallData: The smallest of the bunch, this 100M parameter model is perfect for quick experiments and applications where computational resources are extremely limited.
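For a rough sense of what "extremely limited" resources means here, the sketch below estimates the memory needed just to hold the weights. It assumes a nominal count of exactly 100M parameters (the real count may differ a little) and ignores activations, the KV cache, and framework overhead. `weight_memory_mib` is a helper made up for this illustration.

```python
def weight_memory_mib(num_params: int, bytes_per_param: int) -> float:
    """Memory needed to store the weights alone, in MiB."""
    return num_params * bytes_per_param / (1024 ** 2)


# Assumption: the nominal 100M figure from the model name, not a measured count.
NOMINAL_PARAMS = 100_000_000

for dtype_name, nbytes in [("float32", 4), ("float16/bfloat16", 2), ("int8", 1)]:
    print(f"{dtype_name}: ~{weight_memory_mib(NOMINAL_PARAMS, nbytes):.0f} MiB")
```

Under these assumptions the weights come to roughly 381 MiB in float32 and about half that in 16-bit precision.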
Pretraining Details
This specific version of Pico-OpenLAiNN was trained on just 8 billion tokens of the FineWeb dataset.
Other information:
- Compatibility: Built to be compatible with existing projects that use Llama 2's tokenizer and architecture.
- Ease of Use: No need to reinvent the wheel. These models are ready to be plugged into your applications.
- Open Source: Fully open source, so you can tweak, tune, and twist them to your heart's content.
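If you want to sanity-check the compatibility claim before wiring a model into an existing Llama pipeline, something like the following may help. It is a minimal sketch: `looks_like_llama` and `load_and_check` are hypothetical helpers, and it assumes the repository's config reports the `model_type` value `"llama"`, which this card does not state explicitly.

```python
from types import SimpleNamespace


def looks_like_llama(config) -> bool:
    """Return True if a loaded config object reports the Llama model type."""
    return getattr(config, "model_type", None) == "llama"


def load_and_check(repo_id: str) -> bool:
    """Download the config for repo_id and check its model type."""
    # Imported lazily: this path needs transformers installed and network access.
    from transformers import AutoConfig

    return looks_like_llama(AutoConfig.from_pretrained(repo_id))


# Offline demonstration with stand-in config objects (not real model configs).
print(looks_like_llama(SimpleNamespace(model_type="llama")))
print(looks_like_llama(SimpleNamespace(model_type="gpt2")))
```

Calling `load_and_check` with the repository name you plan to use should tell you whether the standard Llama code path will be picked up.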
Getting Started
To start using these models, you can simply load them via the Hugging Face transformers library:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "UUFO-Aigis/Pico-OpenLAiNN-100M"  # Replace 100M with 250M or 500M if you prefer those models.

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)


def generate_text(prompt, model, tokenizer, max_length=512, temperature=1, top_k=50, top_p=0.95):
    # Tokenize the prompt into a tensor of input ids
    inputs = tokenizer.encode(prompt, return_tensors="pt")
    # Sample a continuation using top-k and top-p (nucleus) sampling
    outputs = model.generate(
        inputs,
        max_length=max_length,
        temperature=temperature,
        top_k=top_k,
        top_p=top_p,
        do_sample=True
    )
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return generated_text


def main():
    # Define your prompt
    prompt = "According to all known laws of aviation, there is no way a bee should be able to fly."
    generated_text = generate_text(prompt, model, tokenizer)
    print(generated_text)


if __name__ == "__main__":
    main()
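The `temperature`, `top_k`, and `top_p` arguments above decide which tokens are eligible at each sampling step. The pure-Python sketch below shows the general idea on a toy four-token vocabulary. It is a simplified illustration, not the transformers implementation, and `top_k_top_p_filter` is a name invented for this example.

```python
import math


def top_k_top_p_filter(logits, temperature=1.0, top_k=50, top_p=0.95):
    """Return {token_id: probability} for the tokens that survive filtering."""
    # Temperature scaling, then a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep the k most probable token ids, most probable first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_mass = sum(probs[i] for i in ranked)

    # Top-p: keep the shortest prefix whose renormalised mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / k_mass
        if cumulative >= top_p:
            break

    # Renormalise so the surviving probabilities sum to 1.
    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}


# Toy vocabulary with probabilities 0.5, 0.3, 0.15, 0.05.
toy_logits = [math.log(p) for p in (0.5, 0.3, 0.15, 0.05)]
print(top_k_top_p_filter(toy_logits, top_k=3, top_p=0.8))
```

With these toy settings only the two most likely tokens survive. Lowering `top_p` or `top_k` narrows the candidate set further, which tends to make output more repetitive but less erratic.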
Benchy :3
Tasks | Value | | Stderr |
---|---|---|---|
arc_challenge | 0.1826 | ± | 0.0113 |
arc_easy | 0.3859 | ± | 0.0100 |
boolq | 0.5804 | ± | 0.0086 |
hellaswag | 0.2791 | ± | 0.0045 |
lambada_openai | 0.2437 | ± | 0.0060 |
piqa | 0.6159 | ± | 0.0113 |
winogrande | 0.5067 | ± | 0.0141 |
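To read the standard errors in the table, one option is to turn each score into an approximate 95% confidence interval (value ± 1.96 × stderr). The sketch below does that with the numbers copied from the table. It assumes a normal approximation, and `confidence_interval` is a helper written for this example.

```python
# (value, stderr) pairs copied from the table above.
RESULTS = {
    "arc_challenge": (0.1826, 0.0113),
    "arc_easy": (0.3859, 0.0100),
    "boolq": (0.5804, 0.0086),
    "hellaswag": (0.2791, 0.0045),
    "lambada_openai": (0.2437, 0.0060),
    "piqa": (0.6159, 0.0113),
    "winogrande": (0.5067, 0.0141),
}


def confidence_interval(value, stderr, z=1.96):
    """Approximate confidence interval under a normal approximation."""
    return value - z * stderr, value + z * stderr


for task, (value, stderr) in RESULTS.items():
    low, high = confidence_interval(value, stderr)
    print(f"{task:15s} {value:.4f}  95% CI [{low:.4f}, {high:.4f}]")
```

For example, the winogrande interval comes out to roughly [0.479, 0.534], which includes 0.5, the score expected from guessing between its two options.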
Future Plans
- More Models: I'm currently training the bigger siblings of this model, including a 1B parameter version and beyond. 2-4 billion parameter versions are planned.
- New architecture: This is still up in the air and I'm still developing it. I'll release it if I deem it to be actually useful, so stay tuned!
- Paper: A detailed paper will be posted at some point.
Credit Where Credit's Due
If you find these models useful and decide to use them, a link to this repository would be highly appreciated. I am a one-man show running this. Thanks 🤗
Contact
If you have questions, please reach out to me at urlsys32dll@gmail.com