BEE-spoke-data/tFINE-900m-e16-d32-flan

This is a basic text-to-text "instruct" model, similar to Google's original flan-t5 model series (but not trained for as long).

Details

Fine-tuned from the base model on the flan-v2 subset of the pszemraj/flan-subsets-deduped dataset for one epoch. It achieves the following results on the evaluation set (a sketch of how the ROUGE metrics are computed follows the list):

  • Loss: 1.4134
  • ROUGE-1: 62.9142
  • ROUGE-2: 22.5279
  • ROUGE-L: 61.4902
  • ROUGE-Lsum: 61.7795
  • Generation length: 12.0586
  • Input tokens seen: 1,931,815,668
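
ROUGE scores of this kind can be computed with the Hugging Face evaluate library. A minimal sketch; the prediction and reference strings below are placeholders for illustration, not drawn from the actual evaluation set:

import evaluate

# Load the ROUGE metric (computes rouge1 / rouge2 / rougeL / rougeLsum)
rouge = evaluate.load("rouge")

predictions = ["the cat sat on the mat"]  # placeholder model outputs
references = ["the cat lay on the mat"]   # placeholder gold targets
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # dict with rouge1, rouge2, rougeL, rougeLsum values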

Model features

  • Pretrained and fine-tuned at a 1024-token (input) context length
  • Tokenizer with byte-pair fallback, so it can understand and generate text beyond what the original T5 tokenizer covers (see the sketch below)
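
A quick way to see the fallback in action; this assumes the repo's tokenizer loads through AutoTokenizer, and the sample string is purely illustrative:

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("BEE-spoke-data/tFINE-900m-e16-d32-flan")

# Characters like these map to <unk> under the original T5 tokenizer;
# byte-pair fallback lets them round-trip instead of being lost.
text = "naïve déjà vu: ∑ 1/n² ≈ 1.645 🤖"
ids = tok(text).input_ids
print(tok.decode(ids, skip_special_tokens=True))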

Usage Example

from transformers import pipeline

pipe = pipeline(
    "text2text-generation",
    model="BEE-spoke-data/tFINE-900m-e16-d32-flan",
)
prompt = "What color is tuesday?"
# top_k combined with penalty_alpha enables contrastive search decoding
res = pipe(prompt, max_new_tokens=96, top_k=4, penalty_alpha=0.6)
print(res[0]["generated_text"])
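
The same generation can be done with the lower-level API. A sketch, assuming the checkpoint loads through the standard seq2seq auto classes:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "BEE-spoke-data/tFINE-900m-e16-d32-flan"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tok("What color is tuesday?", return_tensors="pt")
# Same contrastive-search settings as the pipeline call above
out = model.generate(**inputs, max_new_tokens=96, top_k=4, penalty_alpha=0.6)
print(tok.decode(out[0], skip_special_tokens=True))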

Quick eval

Quick eval for: BEE-spoke-data/tFINE-900m-e16-d32-flan

hf (pretrained=BEE-spoke-data/tFINE-900m-e16-d32-flan,trust_remote_code=True,dtype=bfloat16), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
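
The results below should be reproducible with lm-evaluation-harness. A sketch using its Python API; the task list here is a subset of those reported, and simple_evaluate keyword names may vary across harness versions:

import lm_eval

# Mirror the configuration string above: HF backend, bfloat16, batch size 8
results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=BEE-spoke-data/tFINE-900m-e16-d32-flan,"
        "trust_remote_code=True,dtype=bfloat16"
    ),
    tasks=["boolq", "openbookqa", "piqa", "winogrande"],
    batch_size=8,
)
print(results["results"])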

| Tasks         | Version | Filter           | n-shot | Metric        | Value  |   | Stderr |
|---------------|--------:|------------------|-------:|---------------|-------:|---|-------:|
| boolq         |       2 | none             |      0 | acc ↑         | 0.6700 | ± | 0.0082 |
| openbookqa    |       1 | none             |      0 | acc ↑         | 0.1900 | ± | 0.0176 |
|               |         | none             |      0 | acc_norm ↑    | 0.2980 | ± | 0.0205 |
| piqa          |       1 | none             |      0 | acc ↑         | 0.6001 | ± | 0.0114 |
|               |         | none             |      0 | acc_norm ↑    | 0.6072 | ± | 0.0114 |
| social_iqa    |       0 | none             |      0 | acc ↑         | 0.4299 | ± | 0.0112 |
| tinyArc       |       0 | none             |     25 | acc_norm ↑    | 0.3214 | ± | N/A    |
| tinyGSM8k     |       0 | flexible-extract |      5 | exact_match ↑ | 0.0492 | ± | N/A    |
|               |         | strict-match     |      5 | exact_match ↑ | 0.0380 | ± | N/A    |
| tinyHellaswag |       0 | none             |     10 | acc_norm ↑    | 0.4005 | ± | N/A    |
| tinyMMLU      |       0 | none             |      0 | acc_norm ↑    | 0.2857 | ± | N/A    |
| winogrande    |       1 | none             |      0 | acc ↑         | 0.4988 | ± | 0.0141 |
Model size: 887M parameters (F32, safetensors)