---
language:
- en
license: mit
datasets:
- multi_nli
library_name: transformers
pipeline_tag: zero-shot-classification
tags:
- t5
- text-classification
- mnli
model-index:
- name: sjrhuschlee/flan-t5-base-mnli
  results:
  - task:
      type: natural-language-inference
      name: Natural Language Inference
    dataset:
      name: MultiNLI-matched
      type: multi_nli
      config: default
      split: validation_matched
    metrics:
    - type: accuracy
      value: 87.468
      name: Accuracy
  - task:
      type: natural-language-inference
      name: Natural Language Inference
    dataset:
      name: MultiNLI-mismatched
      type: multi_nli
      config: default
      split: validation_mismatched
    metrics:
    - type: accuracy
      value: 87.276
      name: Accuracy
---

# flan-t5-base-mnli

flan-t5-base-mnli is the [FLAN-T5 base model](https://huggingface.co/google/flan-t5-base) fine-tuned on the [Multi-Genre Natural Language Inference (MNLI)](https://huggingface.co/datasets/multi_nli) corpus.

## Overview

- **License:** MIT 
- **Language model:** flan-t5-base  
- **Language:** English  
- **Downstream-task:** Zero-shot Classification, Text Classification  
- **Training data:** MNLI  
- **Eval data:** MNLI (Matched and Mismatched)  
- **Infrastructure:** 1x NVIDIA RTX 3070

## Model Usage

Use the code below to get started with the model. It can be loaded with the `zero-shot-classification` pipeline like so:

```python
from transformers import pipeline

# the checkpoint ships a custom T5 classification head, hence trust_remote_code=True
classifier = pipeline(
    'zero-shot-classification',
    model='sjrhuschlee/flan-t5-base-mnli',
    trust_remote_code=True,
)
```

You can then use this pipeline to classify sequences into any of the class names you specify. For example:

```python
sequence_to_classify = "one day I will see the world"
candidate_labels = ['travel', 'cooking', 'dancing']
classifier(sequence_to_classify, candidate_labels)
# {'sequence': 'one day I will see the world',
#  'labels': ['travel', 'cooking', 'dancing'],
#  'scores': [0.7944864630699158, 0.10624771565198898, 0.09926578402519226]}
```
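
If more than one label can apply to a sequence, the zero-shot pipeline's standard `multi_label=True` option scores each candidate label independently, so the scores no longer sum to 1. A minimal sketch reusing the `classifier` created above:

```python
sequence_to_classify = "one day I will see the world"
candidate_labels = ['travel', 'exploration', 'adventure']
# each label is scored on its own entailment-vs-contradiction axis
classifier(sequence_to_classify, candidate_labels, multi_label=True)
```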

## Metrics
```python
# MNLI
{
    "eval_accuracy": 0.8746816097809476,
    "eval_accuracy_mm": 0.8727624084621644,
    "eval_loss": 0.4271220564842224,
    "eval_loss_mm": 0.4265698492527008,
    "eval_samples": 9815,
    "eval_samples_mm": 9832,
}
```

## Uses

#### Direct Use

This fine-tuned model can be used for zero-shot classification tasks, including zero-shot sentence-pair classification and zero-shot sequence classification.
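
For zero-shot sentence-pair classification, the NLI head can also be run directly instead of going through the pipeline. A minimal sketch, assuming the checkpoint loads through `AutoModelForSequenceClassification` with `trust_remote_code=True` (as the pipeline example implies); the premise/hypothesis pair is hypothetical, and the entailment/neutral/contradiction label order is read from the model config rather than hard-coded:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model_name = 'sjrhuschlee/flan-t5-base-mnli'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, trust_remote_code=True)

premise = "one day I will see the world"
hypothesis = "This example is about travel."

# encode the pair and score it with the NLI head
inputs = tokenizer(premise, hypothesis, return_tensors='pt', truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
probs = logits.softmax(dim=-1)[0]

# read the label names from the config instead of assuming an order
for idx, p in enumerate(probs.tolist()):
    print(model.config.id2label[idx], round(p, 4))
```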

#### Misuse and Out-of-scope Use

The model should not be used to intentionally create hostile or alienating environments for people. In addition, the model was not trained to produce factual or true representations of people or events, so using it to generate such content is out of scope for this model's abilities.

## Risks, Limitations and Biases

**CONTENT WARNING: Readers should be aware this section contains content that is disturbing, offensive, and can propagate historical and current stereotypes.**

Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). 

Predictions generated by the model can include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups. For example:

```python
sequence_to_classify = "The CEO had a strong handshake."
candidate_labels = ['male', 'female']
# hypothesis_template replaces the pipeline's default "This example is {}." hypothesis
hypothesis_template = "This text speaks about a {} profession."
classifier(sequence_to_classify, candidate_labels, hypothesis_template=hypothesis_template)
```

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.