test of ModernBERT2Olmo-large_1b

experimental seq2seq model built with EncoderDecoderModel. You will need to patch modeling_llama.py with this code for it to work.

WIP: the output of this model is currently gibberish because the cross-attention layers still need training.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("pszemraj/ModernBERT2Olmo-large_1b-test")
model = AutoModelForSeq2SeqLM.from_pretrained("pszemraj/ModernBERT2Olmo-large_1b-test")

ARTICLE_TO_SUMMARIZE = (
    "PG&E stated it scheduled the blackouts in response to forecasts for high winds "
    "amid dry conditions. The aim is to reduce the risk of wildfires. Nearly 800 thousand customers were "
    "scheduled to be affected by the shutoffs which were expected to last through at least midday tomorrow."
)
prompt = f"summarize dis botmon: {ARTICLE_TO_SUMMARIZE}"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# autoregressively generate summary (uses greedy decoding by default)
generated_ids = model.generate(
    **inputs,
    min_new_tokens=10,
    max_new_tokens=100,
)
generated_text = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)
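Since the cross-attention weights are the untrained part, one possible next step is to train only those layers and keep the pretrained encoder and decoder weights frozen. The following is a minimal sketch of that idea, not the author's training setup. The name fragments in `CROSS_ATTN_PATTERNS` and the parameter names in `demo_params` are hypothetical; inspect `model.named_parameters()` on the patched model to find the real cross-attention names.

```python
from types import SimpleNamespace

# Hypothetical name fragments (assumption): check model.named_parameters()
# for the names the patched decoder actually uses.
CROSS_ATTN_PATTERNS = ("crossattention", "cross_attn", "encoder_attn")


def freeze_all_but_cross_attention(named_parameters, patterns=CROSS_ATTN_PATTERNS):
    """Leave requires_grad=True only on parameters whose name matches a pattern.

    Returns the list of parameter names left trainable.
    """
    trainable = []
    for name, param in named_parameters:
        is_cross_attn = any(fragment in name for fragment in patterns)
        param.requires_grad = is_cross_attn
        if is_cross_attn:
            trainable.append(name)
    return trainable


# Stand-in parameters with made-up names so the sketch runs without
# downloading the model. With the real model, pass model.named_parameters().
demo_params = [
    ("encoder.layers.0.attn.Wqkv.weight", SimpleNamespace(requires_grad=True)),
    ("decoder.model.layers.0.self_attn.q_proj.weight", SimpleNamespace(requires_grad=True)),
    ("decoder.model.layers.0.cross_attn.q_proj.weight", SimpleNamespace(requires_grad=True)),
]
trainable_names = freeze_all_but_cross_attention(demo_params)
print(trainable_names)
```

If the returned list is empty on the real model, none of the assumed fragments matched and the patterns need adjusting before any training run.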
Model size: 1.84B params (Safetensors, tensor type F32)