Democratizing access to LLMs for the open-source community.
Let's advance AI, together.


## Introduction 🎉

We are thrilled to announce the open-sourcing of our boomer-634m model, an important milestone in our ongoing AI research. This model, with 634 million parameters, was meticulously pre-trained from scratch on a custom synthetic dataset comprising 12 billion tokens.

## Run the model

Here is a quick guide to get you started with boomer-634m. Note that, at the moment, `trust_remote_code=True` is required to run the model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code=True is required because the repository ships custom model code.
model = AutoModelForCausalLM.from_pretrained(
    "budecosystem/boomer-634m", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("budecosystem/boomer-634m")

inputs = tokenizer("Explain why the sky is blue.", return_tensors="pt").to(model.device)
outputs = model.generate(inputs["input_ids"], max_new_tokens=216)
print(tokenizer.batch_decode(outputs)[0])
```
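
The snippet above uses greedy decoding. For more varied output you can enable sampling through the standard `generate` options; the hyperparameters below are illustrative defaults, not values tuned for boomer-634m:

```python
# Sampling-based generation; temperature/top_p are illustrative, not model-specific.
outputs = model.generate(
    inputs["input_ids"],
    max_new_tokens=216,
    do_sample=True,    # sample from the distribution instead of greedy decoding
    temperature=0.8,   # soften the next-token distribution
    top_p=0.9,         # nucleus sampling: keep the smallest token set with 90% mass
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```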

## Evaluations

The boomer-634m model has been evaluated on a range of standard benchmarks; the scores below (higher is better) summarize its performance across different tasks:

| Model Name  | MMLU  | ARC   | HellaSwag | GSM8K | Winogrande | MathQA | LogiQA |
|-------------|-------|-------|-----------|-------|------------|--------|--------|
| boomer-634m | 25.91 | 29.86 | 39.24     | 1.67  | 50.67      | 23.55  | 28.42  |
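
The card does not state which harness produced these numbers. One common way to compute comparable scores is EleutherAI's lm-evaluation-harness; the sketch below assumes the `lm_eval` package (v0.4+) is installed and that the benchmarks map to the task names shown, which may differ from the original evaluation setup:

```python
# Reproduction sketch using EleutherAI's lm-evaluation-harness (assumed setup;
# the original harness version and few-shot configuration are not documented here).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=budecosystem/boomer-634m,trust_remote_code=True",
    tasks=["mmlu", "arc_challenge", "hellaswag", "gsm8k", "winogrande", "mathqa", "logiqa"],
    batch_size=8,
)
for task, metrics in results["results"].items():
    print(task, metrics)
```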

## Final thought on Boomer!

Embarking on the journey with boomer-634m is just the beginning. We are committed to developing more advanced, efficient, and accessible AI models. Join us in this exciting adventure to shape the future of AI.

## Acknowledgements

Our heartfelt thanks go to the open-source community and the trailblazers in AI research whose work has paved the way for innovations like boomer-634m.
