A2T Entailment model

Important: These pretrained entailment models are intended to be used with the Ask2Transformers library, but they are also fully compatible with the ZeroShotClassificationPipeline from Transformers.
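As a minimal sketch of that compatibility, the model can be loaded with the standard Transformers zero-shot classification pipeline; the input sentence and candidate labels below are purely illustrative:

```python
# Minimal sketch using the standard Transformers zero-shot classification pipeline.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="HiTZ/A2T_RoBERTa_SMFA_ACE-arg_WikiEvents-arg",
)

# The text and labels are made-up examples, not taken from the training tasks.
result = classifier(
    "The company announced a merger with its main competitor.",
    candidate_labels=["business", "sports", "politics"],
)
print(result["labels"][0], result["scores"][0])
```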

Textual Entailment (or Natural Language Inference) has turned out to be a good choice for zero-shot text classification problems (Yin et al., 2019; Wang et al., 2021; Sainz and Rigau, 2021). Recent research has addressed Information Extraction problems with the same idea (Lyu et al., 2021; Sainz et al., 2021; Sainz et al., 2022a; Sainz et al., 2022b). The A2T entailment models are first trained on NLI datasets such as MNLI (Williams et al., 2018), SNLI (Bowman et al., 2015) and/or ANLI (Nie et al., 2020), and then fine-tuned on specific tasks that were previously converted to the textual entailment format.
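The underlying idea can be sketched as follows: each task instance is cast as premise/hypothesis pairs, and candidate labels are ranked by the entailment probability. The label verbalizations below are invented for illustration, and the entailment index is an assumption (check model.config.label2id for the actual mapping of a given checkpoint):

```python
# Illustrative sketch: score candidate labels by entailment probability.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "HiTZ/A2T_RoBERTa_SMFA_ACE-arg_WikiEvents-arg"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

premise = "Smith was appointed CEO of the company in 2019."
# Hypothetical verbalizations, not the templates used during fine-tuning.
hypotheses = {
    "employment": "Smith works for the company.",
    "founder": "Smith founded the company.",
}

# Assumes the checkpoint exposes an "entailment" entry in label2id.
entailment_id = model.config.label2id.get("entailment", 2)
for label, hypothesis in hypotheses.items():
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)
    print(f"{label}: {probs[0, entailment_id].item():.3f}")
```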

For more information, please take a look at the Ask2Transformers library or the papers cited above.

About the model

The model name describes the configuration used for training as follows:

HiTZ/A2T_[pretrained_model]_[NLI_datasets]_[finetune_datasets]

  • pretrained_model: The checkpoint used for initialization. For example: RoBERTa-large.
  • NLI_datasets: The NLI datasets used for pivot training.
    • S: Stanford Natural Language Inference (SNLI) dataset.
    • M: Multi Natural Language Inference (MNLI) dataset.
    • F: FEVER-NLI dataset.
    • A: Adversarial Natural Language Inference (ANLI) dataset.
  • finetune_datasets: The datasets used for fine-tuning the entailment model. Note that when more than one dataset is listed, training was performed on them sequentially. For example: ACE-arg.

Some models, such as HiTZ/A2T_RoBERTa_SMFA_ACE-arg, have been trained with some spans of information, such as the event trigger, marked between double square brackets ('[[' and ']]'). Make sure you apply the same preprocessing to obtain the best results; a sketch follows below.
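For instance, a trigger-marked premise and a candidate hypothesis might look like the following. The exact spacing around the markers and the hypothesis wording are assumptions, not the templates used during training:

```python
# Hypothetical preprocessing example for argument extraction: the event
# trigger "visited" is wrapped in double square brackets in the premise.
premise = "The president [[ visited ]] Berlin last week."
hypothesis = "Berlin is the destination of the visit."
# Score the (premise, hypothesis) pair with the entailment model as in the
# sketch above.
```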

Cite

If you use this model, consider citing the following publication:

@inproceedings{sainz-etal-2021-label,
    title = "Label Verbalization and Entailment for Effective Zero and Few-Shot Relation Extraction",
    author = "Sainz, Oscar  and
      Lopez de Lacalle, Oier  and
      Labaka, Gorka  and
      Barrena, Ander  and
      Agirre, Eneko",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.92",
    doi = "10.18653/v1/2021.emnlp-main.92",
    pages = "1199--1212",
}