---
language: en
---
# UnifiedQA-Reddit-SYAC  

This is an abstractive title answering (TA) / clickbait spoiling model.  
This is a variant of [allenai/unifiedqa-t5-large](https://huggingface.co/allenai/unifiedqa-t5-large), fine-tuned on the Reddit SYAC dataset.  
The model was trained as part of my master's thesis:

_Abstractive title answering for clickbait content_  
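
A minimal usage sketch with the `transformers` library is shown below. Several things here are assumptions and not taken from this card: `MODEL_ID` is a placeholder that points at the base checkpoint and should be replaced with the Hub ID of the fine-tuned model, the helper names `build_input` and `spoil` are hypothetical, and the input format (title, a literal `\n`, then the article) follows the UnifiedQA convention, which the fine-tuned model may or may not share. `MAX_INPUT_TOKENS` mirrors the 2048-token limit used in the evaluation.

```python
# Placeholder: the base checkpoint. Replace with the fine-tuned model's Hub ID.
MODEL_ID = "allenai/unifiedqa-t5-large"
# Input limit reported for the intrinsic evaluation on this card.
MAX_INPUT_TOKENS = 2048


def build_input(title: str, article: str) -> str:
    """Join title and article UnifiedQA-style (assumed format).

    UnifiedQA separates question and context with a literal backslash-n,
    not a newline character.
    """
    return f"{title.strip()} \\n {article.strip()}"


def spoil(title: str, article: str, max_new_tokens: int = 64) -> str:
    """Generate an answer to a clickbait title from the article body."""
    # Imported lazily so build_input can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokens beyond MAX_INPUT_TOKENS are truncated.
    inputs = tokenizer(
        build_input(title, article),
        return_tensors="pt",
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `spoil("You won't believe what this actor said", article_text)` returns the generated answer as a string. Loading the model inside the function keeps the sketch short; for repeated calls, load the tokenizer and model once and reuse them.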


### Disinformation  
This model has been shown to generate and hallucinate false information.  
Anyone using a TA system such as this one should be aware of this risk.


## Performance  

### Intrinsic  

The following scores are the result of intrinsic evaluation on the Reddit SYAC test set.  
We used a maximum input length of 2048 tokens and truncated any tokens exceeding this limit.  

| rouge1    | rouge2    | rougeL    | bleu      | meteor   |
|:----------|:----------|:----------|:----------|:---------|
| **44.58** | **23.89** | **43.45** | 17.46     | 36.22    |
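
To give a feel for what a score such as `rouge1` measures, here is a simplified sketch of ROUGE-1 as a unigram-overlap F-measure. It is not the evaluation script used for the table: it tokenizes on whitespace, applies no stemming, and the function name `rouge1_f1` is hypothetical. It also assumes the table reports F-measures scaled to 0-100, which this card does not state.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap, whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

For example, with the reference `"the answer is coffee"` and the candidate `"coffee"`, precision is 1.0 and recall is 0.25, so the F1 is 0.4 (40 on a 0-100 scale).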


### Quality  
Using human evaluation, we measured model performance by asking evaluators to rate,
on a scale from 1 to 5, how good each model's generated answer was for a given clickbait article.  

Mean quality = 4.065  

### Factuality  
We included a factuality assessment to address the issue of generating false information.  
Human raters were asked to assign each output to one of three categories: "True", "Irrelevant", or "False".  

| True    | Irrelevant | False    |
|:-------:|:----------:|:--------:|
|   85%   |    7.5%    | 7.5%     |

## Cite  

If you use this model, please cite my master's thesis:

```
@mastersthesis{heiervang2022AbstractiveTA,
  title={Abstractive title answering for clickbait content},
  author={Markus Sverdvik Heiervang},
  publisher={University of Oslo, Department of Informatics},
  year={2022}
}
```