sjrhuschlee/flan-t5-large-squad2

sjrhuschlee committed on Jun 14, 2023

Commit f451ebc (1 parent: cb2b286)

Update README.md

Files changed (1)

README.md +78 -0

@@ -1,3 +1,81 @@
 ---
 license: mit
+datasets:
+- squad_v2
+- squad
+language:
+- en
+library_name: transformers
+tags:
+- question-answering
+- squad
+- squad_v2
+- t5
 ---
+# flan-t5-large for Extractive QA
+This is the [flan-t5-large](https://huggingface.co/google/flan-t5-large) model, fine-tuned using the [SQuAD2.0](https://huggingface.co/datasets/squad_v2) dataset. It's been trained on question-answer pairs, including unanswerable questions, for the task of Extractive Question Answering.
+This model was trained using LoRA available through the [PEFT library](https://github.com/huggingface/peft).
+NOTE: The `<cls>` token must be manually added to the beginning of the question for this model to work properly. The model uses the `<cls>` token to make "no answer" predictions. The T5 tokenizer does not add this special token automatically, so it has to be added manually.
+## Overview
+**Language model:** flan-t5-large
+**Language:** English
+**Downstream-task:** Extractive QA
+**Training data:** SQuAD 2.0
+**Eval data:** SQuAD 2.0
+**Infrastructure:** 1x NVIDIA 3070
+## Model Usage
+### Using Transformers
+This uses the merged weights (base model weights + LoRA weights) to allow for simple use in Transformers pipelines. It has the same performance as using the weights separately when using the PEFT library.
+```python
+import torch
+from transformers import (
+  AutoModelForQuestionAnswering,
+  AutoTokenizer,
+  pipeline
+)
+model_name = "sjrhuschlee/flan-t5-large-squad2"
+# a) Using pipelines
+nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
+qa_input = {
+    'question': f'{nlp.tokenizer.cls_token}Where do I live?',  # '<cls>Where do I live?'
+    'context': 'My name is Sarah and I live in London'
+}
+res = nlp(qa_input)
+# {'score': 0.984, 'start': 30, 'end': 37, 'answer': ' London'}
+# b) Load model & tokenizer
+model = AutoModelForQuestionAnswering.from_pretrained(model_name)
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+question = f'{tokenizer.cls_token}Where do I live?'  # '<cls>Where do I live?'
+context = 'My name is Sarah and I live in London'
+encoding = tokenizer(question, context, return_tensors="pt")
+start_scores, end_scores = model(
+  encoding["input_ids"],
+  attention_mask=encoding["attention_mask"],
+  return_dict=False
+)
+all_tokens = tokenizer.convert_ids_to_tokens(encoding["input_ids"][0].tolist())
+answer_tokens = all_tokens[torch.argmax(start_scores):torch.argmax(end_scores) + 1]
+answer = tokenizer.decode(tokenizer.convert_tokens_to_ids(answer_tokens))
+# 'London'
+```
+### Using with PEFT
+**NOTE**: This requires code in the PR https://github.com/huggingface/peft/pull/473 for the PEFT library.
+```python
+#!pip install peft
+from peft import LoraConfig, PeftModelForQuestionAnswering
+from transformers import AutoModelForQuestionAnswering, AutoTokenizer
+model_name = "sjrhuschlee/flan-t5-large-squad2"
+```
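
Not part of the commit: the README note says the model relies on the `<cls>` token for "no answer" predictions, and option b) picks the start and end positions with two independent `argmax` calls, which can yield an end before the start. A minimal sketch of a joint span search that also handles the "no answer" case might look like the following. It assumes the `<cls>` token sits at index 0 of the encoded input and that a best span of `(0, 0)` means "no answer"; `pick_span` and `max_answer_len` are hypothetical names, and plain Python lists stand in for the model's start and end scores.

```python
from typing import List, Optional, Tuple


def pick_span(
    start_scores: List[float],
    end_scores: List[float],
    max_answer_len: int = 30,
) -> Optional[Tuple[int, int]]:
    """Return the best (start, end) token span, or None for "no answer".

    Assumption: index 0 holds the <cls> token, and a best span of (0, 0)
    is read as "no answer".
    """
    best_score = float("-inf")
    best_span = (0, 0)
    for start, s_score in enumerate(start_scores):
        # Only consider ends at or after the start, within max_answer_len tokens.
        last = min(start + max_answer_len, len(end_scores))
        for end in range(start, last):
            score = s_score + end_scores[end]
            if score > best_score:
                best_score = score
                best_span = (start, end)
    return None if best_span == (0, 0) else best_span


# Toy scores, not real model output.
print(pick_span([0.1, 0.2, 3.0, 0.1], [0.1, 0.1, 0.5, 2.5]))  # (2, 3)
print(pick_span([5.0, 0.2, 0.3, 0.1], [4.0, 0.1, 0.5, 0.2]))  # None
```

With real model output, the two score tensors from option b) could be converted with `start_scores[0].tolist()` and `end_scores[0].tolist()` before being passed in; the returned indices would then slice `all_tokens` as in the original snippet.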