Pico-OpenLAiNN-testing 🤗

Hey there fellow researchers, developers, and AI enthusiasts! Today I'm releasing a smol open LLM. This is mainly just a test and I plan to release actually usable models in the near future.

Models Overview

  • Pico-OpenLAiNN-100M-SmallData: The smallest of the bunch, this 100M parameter model is perfect for quick experiments and applications where computational resources are extremely limited.

Pretraining Details

This specific version of Pico LAiNN was trained on just 8 billion tokens of the FineWeb dataset.
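
For a rough sense of scale, here is a back-of-the-envelope sketch of the training budget. It assumes the model has exactly 100M parameters (taken from the model name, the real count may differ) and uses the common "6 × parameters × tokens" rule of thumb for training compute, which is an approximation and not something measured for this model. That works out to roughly 80 tokens per parameter.

```python
# Back-of-the-envelope training budget for this model.
# Assumption: exactly 100M parameters (from the model name; the real count may differ).
N_PARAMS = 100_000_000
N_TOKENS = 8_000_000_000  # "8 billion tokens" as stated in this README

# How many training tokens the model saw per parameter.
tokens_per_param = N_TOKENS / N_PARAMS

# Rule of thumb (an approximation, not a measurement):
# training compute is about 6 * parameters * tokens floating-point operations.
approx_train_flops = 6 * N_PARAMS * N_TOKENS

print(f"tokens per parameter: {tokens_per_param:.0f}")
print(f"approx. training FLOPs: {approx_train_flops:.1e}")
```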

Other information:

  • Compatibility: Built to be compatible with existing projects that use Llama 2's tokenizer and architecture.
  • Ease of Use: No need to reinvent the wheel. These models are ready to be plugged into your applications.
  • Open Source: Fully open source, so you can tweak, tune, and twist them to your heart's content.
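
If your project needs to verify the compatibility claim before plugging a model in, a check along these lines might help. This is only a sketch: the function name is hypothetical, and it assumes the model's config reports `model_type` as `"llama"` and a vocabulary size of 32,000 (the size of the Llama 2 tokenizer). Neither value is stated in this README, so check the actual `config.json` first.

```python
# Hypothetical compatibility check on a model config (e.g. a parsed config.json).
# Assumptions (not stated in this README): model_type is "llama" and the
# vocabulary size matches the Llama 2 tokenizer's 32,000 entries.
LLAMA2_VOCAB_SIZE = 32_000


def looks_llama2_compatible(config: dict) -> bool:
    """Return True if the config looks like a Llama-architecture model
    using a Llama 2 sized vocabulary."""
    return (
        config.get("model_type") == "llama"
        and config.get("vocab_size") == LLAMA2_VOCAB_SIZE
    )


# Illustrative values only, not copied from the real repository.
example_config = {"model_type": "llama", "vocab_size": 32_000}
print(looks_llama2_compatible(example_config))
```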

Getting Started

To start using these models, you can simply load them via the Hugging Face transformers library:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


MODEL_NAME = "UUFO-Aigis/Pico-OpenLAiNN-100M"  # Replace 100M with 250M or 500M if you prefer those models.

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

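def generate_text(prompt, model, tokenizer, max_length=512, temperature=1.0, top_k=50, top_p=0.95):
    # Tokenize the prompt; this returns both input_ids and an attention_mask.
    inputs = tokenizer(prompt, return_tensors="pt")

    # Sample a continuation. Gradients are not needed for inference.
    with torch.no_grad():
        outputs = model.generate(
            inputs.input_ids,
            attention_mask=inputs.attention_mask,
            max_length=max_length,  # total length in tokens, prompt included
            temperature=temperature,
            top_k=top_k,
            top_p=top_p,
            do_sample=True,
        )

    # Decode the first (and only) generated sequence back into text.
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return generated_text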

def main():
    # Define your prompt
    prompt = "According to all known laws of aviation, there is no way a bee should be able to fly."

    generated_text = generate_text(prompt, model, tokenizer)

    print(generated_text)

if __name__ == "__main__":
    main()
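
If you're wondering what `temperature`, `top_k`, and `top_p` actually do to the next-token distribution, here is a simplified pure-Python sketch. It is an illustration of the general technique (temperature scaling, then top-k, then nucleus/top-p filtering), not the transformers library's actual implementation, and the function names and example logits are made up for this example.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def filtered_distribution(logits, temperature=1.0, top_k=50, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature: values below 1 sharpen the distribution, values above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Top-k: keep only the k highest-scoring token indices.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    probs = softmax([scaled[i] for i in ranked])

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for idx, p in zip(ranked, probs):
        kept.append((idx, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {idx: p / total for idx, p in kept}


# Made-up logits for a 4-token vocabulary.
dist = filtered_distribution([2.0, 1.0, 0.0, -1.0], temperature=1.0, top_k=3, top_p=0.9)
print(dist)
```

Sampling then just means drawing one token index from the returned distribution, for example with `random.choices(list(dist), weights=list(dist.values()))`.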

Benchy :3

Tasks            Value     Stderr
arc_challenge    0.1826 ±  0.0113
arc_easy         0.3859 ±  0.0100
boolq            0.5804 ±  0.0086
hellaswag        0.2791 ±  0.0045
lambada_openai   0.2437 ±  0.0060
piqa             0.6159 ±  0.0113
winogrande       0.5067 ±  0.0141
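
To read the ± column a bit more concretely, the sketch below turns each score into an approximate 95% confidence interval. It assumes the reported "Stderr" is a standard error of the score and that a normal approximation (value ± 1.96 × stderr) is reasonable, neither of which this README states explicitly.

```python
# Scores copied from the benchmark table: task -> (value, stderr).
RESULTS = {
    "arc_challenge": (0.1826, 0.0113),
    "arc_easy": (0.3859, 0.0100),
    "boolq": (0.5804, 0.0086),
    "hellaswag": (0.2791, 0.0045),
    "lambada_openai": (0.2437, 0.0060),
    "piqa": (0.6159, 0.0113),
    "winogrande": (0.5067, 0.0141),
}


def confidence_interval(value, stderr, z=1.96):
    """Approximate 95% interval under a normal approximation (an assumption)."""
    return value - z * stderr, value + z * stderr


for task, (value, stderr) in RESULTS.items():
    lower, upper = confidence_interval(value, stderr)
    print(f"{task:<15} {value:.4f}  [{lower:.4f}, {upper:.4f}]")
```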

Future Plans

  • More Models: I'm currently training the bigger siblings of this model, including a 1B parameter version and beyond; 2-4 billion parameter versions are planned.
  • New architecture: This is still up in the air and in development. I'll release it if I deem it actually useful, so stay tuned!
  • Paper: A detailed paper will be posted at some point.

Credit Where Credit's Due

If you find these models useful and decide to use them, a link to this repository would be highly appreciated. I am a one-man show running this. Thanks 🤗

Contact

If you have questions, please reach out to me at urlsys32dll@gmail.com
