---
library_name: transformers
license: apache-2.0
datasets:
- pszemraj/flan-subsets-deduped
language:
- en
base_model: pszemraj/tFINE-900m-e16-d32-1024ctx
pipeline_tag: text2text-generation
---

# BEE-spoke-data/tFINE-900m-e16-d32-flan

This is a basic text-to-text "instruct" model in the same spirit as Google's original [flan-t5](https://huggingface.co/collections/google/flan-t5-release-65005c39e3201fff885e22fb) model series, though not trained for as long.


<details>
  <summary>Details: Click here to expand</summary>

Fine-tuned from [the base model](https://hf.co/pszemraj/tFINE-900m-e16-d32-1024ctx) for 1 epoch on the `flan-v2` subset of the `pszemraj/flan-subsets-deduped` dataset. It achieves the following results on the evaluation set:
- Loss: 1.4134
- Rouge1: 62.9142
- Rouge2: 22.5279
- Rougel: 61.4902
- Rougelsum: 61.7795
- Gen Len: 12.0586
- Num Input Tokens Seen: 1931815668
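
For reference, the fine-tuning data is public. A minimal loading sketch, assuming `flan-v2` is exposed as a config of the dataset (as the subset name above suggests):

```py
from datasets import load_dataset

# stream the flan-v2 subset used for fine-tuning, to avoid a full download
ds = load_dataset("pszemraj/flan-subsets-deduped", "flan-v2", streaming=True)
print(next(iter(ds["train"])))  # assumes a "train" split
```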

### Model features

- pretrained and fine-tuned with a 1024-token input context length
- tokenizer with byte-pair fallback, so the model can encode and generate text outside the original T5 tokenizer's vocabulary (see the sketch below)
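
A quick way to see the second point, as a hedged illustration (the comparison tokenizer `google/t5-v1_1-base` is just a stand-in for the original T5 vocabulary):

```py
from transformers import AutoTokenizer

tok_new = AutoTokenizer.from_pretrained("BEE-spoke-data/tFINE-900m-e16-d32-flan")
tok_t5 = AutoTokenizer.from_pretrained("google/t5-v1_1-base")  # original T5 vocabulary, for comparison

text = "naïve café, 数学 homework"
# this model's tokenizer round-trips the text; characters missing from the
# original T5 vocabulary come back as unknown tokens instead
print(tok_new.decode(tok_new(text).input_ids, skip_special_tokens=True))
print(tok_t5.decode(tok_t5(text).input_ids, skip_special_tokens=True))
```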

</details>

## Usage Example

```py
from transformers import pipeline

pipe = pipeline(
    "text2text-generation",
    model="BEE-spoke-data/tFINE-900m-e16-d32-flan",
)
prompt = "What color is tuesday?"
res = pipe(prompt, max_new_tokens=96, top_k=4, penalty_alpha=0.6)
print(res[0]["generated_text"])
```
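
The `top_k=4, penalty_alpha=0.6` combination enables contrastive search decoding; the pipeline's default settings also work. An equivalent sketch with the lower-level API, using the same generation parameters:

```py
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "BEE-spoke-data/tFINE-900m-e16-d32-flan"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

inputs = tokenizer("What color is tuesday?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=96,
    top_k=4,            # contrastive search: size of the candidate pool
    penalty_alpha=0.6,  # contrastive search: degeneration penalty
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```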

## Quick eval

Quick eval for:	`BEE-spoke-data/tFINE-900m-e16-d32-flan`


hf (pretrained=BEE-spoke-data/tFINE-900m-e16-d32-flan,trust_remote_code=True,dtype=bfloat16), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
|    Tasks    |Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-------------|------:|----------------|-----:|-----------|---|-----:|---|------|
|boolq        |      2|none            |     0|acc        |↑  |0.6700|±  |0.0082|
|openbookqa   |      1|none            |     0|acc        |↑  |0.1900|±  |0.0176|
|             |       |none            |     0|acc_norm   |↑  |0.2980|±  |0.0205|
|piqa         |      1|none            |     0|acc        |↑  |0.6001|±  |0.0114|
|             |       |none            |     0|acc_norm   |↑  |0.6072|±  |0.0114|
|social_iqa   |      0|none            |     0|acc        |↑  |0.4299|±  |0.0112|
|tinyArc      |      0|none            |    25|acc_norm   |↑  |0.3214|±  |   N/A|
|tinyGSM8k    |      0|flexible-extract|     5|exact_match|↑  |0.0492|±  |   N/A|
|             |       |strict-match    |     5|exact_match|↑  |0.0380|±  |   N/A|
|tinyHellaswag|      0|none            |    10|acc_norm   |↑  |0.4005|±  |   N/A|
|tinyMMLU     |      0|none            |     0|acc_norm   |↑  |0.2857|±  |   N/A|
|winogrande   |      1|none            |     0|acc        |↑  |0.4988|±  |0.0141|
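
The table was generated with EleutherAI's lm-evaluation-harness, using the model args shown in the header line above. A rough reproduction sketch via the harness's Python API (assumes lm-eval v0.4+; only the zero-shot tasks from the table are listed, the `tiny*` tasks need their own few-shot settings):

```py
import lm_eval

# reproduce the zero-shot rows of the table above (assumes `pip install lm-eval`)
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=BEE-spoke-data/tFINE-900m-e16-d32-flan,dtype=bfloat16,trust_remote_code=True",
    tasks=["boolq", "openbookqa", "piqa", "social_iqa", "winogrande"],
    batch_size=8,
)
print(results["results"])
```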