---
library_name: transformers
license: apache-2.0
datasets:
- pszemraj/flan-subsets-deduped
language:
- en
base_model: pszemraj/tFINE-900m-e16-d32-1024ctx
pipeline_tag: text2text-generation
---
# BEE-spoke-data/tFINE-900m-e16-d32-flan
This is a basic text-to-text "instruct" model, similar to Google's original [flan-t5](https://huggingface.co/collections/google/flan-t5-release-65005c39e3201fff885e22fb) model series (but not trained for as long).
<details>
<summary>Details: Click here to expand</summary>
Fine-tuned from [the base model](https://hf.co/pszemraj/tFINE-900m-e16-d32-1024ctx) for 1 epoch on the `flan-v2` subset of the `pszemraj/flan-subsets-deduped` dataset. It achieves the following results on the evaluation set:
- Loss: 1.4134
- Rouge1: 62.9142
- Rouge2: 22.5279
- RougeL: 61.4902
- RougeLsum: 61.7795
- Gen Len: 12.0586
- Num Input Tokens Seen: 1931815668
### Model features
- pretrained and fine-tuned at a 1024-token context length (input)
- tokenizer with byte-pair fallback, so the model can read and generate text that the original T5 tokenizer cannot represent
</details>
## Usage Example
```py
from transformers import pipeline

# Load the model into a text-to-text generation pipeline
pipe = pipeline(
    "text2text-generation",
    model="BEE-spoke-data/tFINE-900m-e16-d32-flan",
)

prompt = "What color is tuesday?"
# top_k + penalty_alpha select contrastive search decoding
res = pipe(prompt, max_new_tokens=96, top_k=4, penalty_alpha=0.6)
print(res[0]["generated_text"])
```
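
The pipeline call above can also be written with the tokenizer and model loaded explicitly, which makes it easy to truncate inputs to the 1024-token input context listed under the model features. This is a minimal sketch rather than an official recipe: the helper names `generation_kwargs` and `answer` and the constant `MAX_INPUT_TOKENS` are made up for illustration, and whether the decoding arguments are accepted unchanged may depend on the installed `transformers` version.

```py
MODEL_NAME = "BEE-spoke-data/tFINE-900m-e16-d32-flan"
MAX_INPUT_TOKENS = 1024  # input context length stated in the model features


def generation_kwargs(max_new_tokens: int = 96) -> dict:
    """Decoding settings copied from the pipeline example (hypothetical helper)."""
    return {"max_new_tokens": max_new_tokens, "top_k": 4, "penalty_alpha": 0.6}


def answer(prompt: str) -> str:
    """Generate a response for one prompt (hypothetical helper)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

    # Truncate anything longer than the context the model was trained with.
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
    )
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **generation_kwargs(),
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `answer("What color is tuesday?")` downloads the weights on first use and reloads the model on every call, so for repeated prompts it would make sense to load the tokenizer and model once and reuse them.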
## Quick eval
Quick eval for: `BEE-spoke-data/tFINE-900m-e16-d32-flan`
hf (pretrained=BEE-spoke-data/tFINE-900m-e16-d32-flan,trust_remote_code=True,dtype=bfloat16,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|-------------|------:|----------------|-----:|-----------|---|-----:|---|------|
|boolq | 2|none | 0|acc |↑ |0.6700|± |0.0082|
|openbookqa | 1|none | 0|acc |↑ |0.1900|± |0.0176|
| | |none | 0|acc_norm |↑ |0.2980|± |0.0205|
|piqa | 1|none | 0|acc |↑ |0.6001|± |0.0114|
| | |none | 0|acc_norm |↑ |0.6072|± |0.0114|
|social_iqa | 0|none | 0|acc |↑ |0.4299|± |0.0112|
|tinyArc | 0|none | 25|acc_norm |↑ |0.3214|± | N/A|
|tinyGSM8k | 0|flexible-extract| 5|exact_match|↑ |0.0492|± | N/A|
| | |strict-match | 5|exact_match|↑ |0.0380|± | N/A|
|tinyHellaswag| 0|none | 10|acc_norm |↑ |0.4005|± | N/A|
|tinyMMLU | 0|none | 0|acc_norm |↑ |0.2857|± | N/A|
|winogrande | 1|none | 0|acc |↑ |0.4988|± |0.0141|
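
To read the `Stderr` column, one option is to turn each score into an approximate 95% confidence interval (value ± 1.96 × stderr) and compare it with a guessing baseline. The sketch below does this for three of the two-choice tasks in the table. The 0.5 baseline assumes uniform random guessing between two options and the normal approximation is also an assumption; neither comes from the evaluation harness, and the names `RESULTS`, `CHANCE`, `confidence_interval`, and `beats_chance` are illustrative.

```py
# (value, stderr) pairs copied from the acc rows of the table above
RESULTS = {
    "boolq": (0.6700, 0.0082),
    "piqa": (0.6001, 0.0114),
    "winogrande": (0.4988, 0.0141),
}

# Assumed baseline: uniform guessing between two answer options
CHANCE = {"boolq": 0.5, "piqa": 0.5, "winogrande": 0.5}


def confidence_interval(value: float, stderr: float, z: float = 1.96) -> tuple:
    """Normal-approximation interval around a reported score."""
    half_width = z * stderr
    return value - half_width, value + half_width


def beats_chance(task: str) -> bool:
    """True if the whole interval lies above the assumed guessing baseline."""
    low, _ = confidence_interval(*RESULTS[task])
    return low > CHANCE[task]


for task in RESULTS:
    low, high = confidence_interval(*RESULTS[task])
    print(f"{task}: [{low:.4f}, {high:.4f}] beats chance: {beats_chance(task)}")
```

Under these assumptions, the boolq and piqa intervals sit above 0.5, while the winogrande interval contains 0.5, so that score is not distinguishable from guessing.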