michal-stefanik committed 15ace39 (1 parent: 640a529): "Readme examples"
README.md CHANGED
@@ -5,19 +5,28 @@ language:
 - multilingual
 - cs
 - en
+widget:
+- text: "Otázka: Jaký je důvod dotazu zákazníka?\nKontext: Dobrý den, Žádáme zaslání nové smlouvy kvůli řešení pojistné události. Zašlete na tento mail nebo přímo do systému. S pozdravem Petra Hladká | disponentka servisu.\nOdpověď: řešení pojistné události\nOtázka: Jaký je důvod dotazu zákazníka?\nKontext: Dobrý den, chtěla bych Vás požádat o zaslání kopie technického průkazu z důvodu jeho ztráty. S pozdravem Milan Tvrdý.\nOdpověď:"
+  example_title: "Few-shot: Customer request (cs)"
+- text: "Otázka: Jaké schopnosti daly magické předměty Jurovi Jánošíkovi? \nKontext: Podle slovenského lidového podání byl Juro Jánošík obdařen magickými předměty (kouzelná valaška, čarovný opasek), které mu dodávaly nadpřirozené schopnosti. Okrádal především šlechtice, trestal panské dráby a ze svého lupu vyděloval část pro chudé, tedy bohatým bral a chudým dával. \nOdpověď:"
+  example_title: "Zero-shot: Question Answering (cs)"
+- text: "Question: What is the score of this review? \n Context: I did not like the plot at all. Not recommended. \n Answer: 1 \n Question: What is the score of this review? \n Context: I loved the performance. Can’t believe they did not use CGI for the finale. I think it’s my new favourite movie. \nAnswer: 5 \nQuestion: Is the score of this review 1, 2, 3, 4 or 5? \nContext: The beginning was awesome, but at the end it felt a little rushed. I enjoyed the movie, but probably won’t rewatch soon. \nAnswer:"
+  example_title: "Few-shot: Movie reviews (en)"
+- text: "Question: What is the score of this review? \n Context: I did not like the plot at all. Not recommended. \n Answer: 1 \n Question: What is the score of this review? \n Context: I loved the performance. Can’t believe they did not use CGI for the finale. I think it’s my new favourite movie. \nAnswer: 5 \nQuestion: Is the score of this review 1, 2, 3, 4 or 5? \nContext: The beginning was awesome, but at the end it felt a little rushed. I enjoyed the movie, but probably won’t rewatch soon. \nAnswer:"
+  example_title: "Few-shot: Customer request (en)"
 ---
 
-# Mt5-large for
+# mT5-large for Few-shot Czech+English Generative Question Answering
 
-This is the [mt5-
+This is the [mt5-large](https://huggingface.co/google/mt5-large) model with an LM head for generating extractive answers,
 given a small set of 2-5 demonstrations (i.e. primes).
 
-##
+## Few-shot (i.e. priming)
 
-Note that **this is a
+Note that **this is primarily a few-shot model** that expects a **set of demonstrations** of your task of interest,
 similarly to GPT-3.
 Rather than performing well on conventional question answering, it aims to learn to extrapolate the pattern of the given demonstrations
-to novel tasks, such as Named Entity Recognition or Keywords Extraction from a given pattern.
+to novel tasks, such as Named Entity Recognition or Keyword Extraction, based on the given pattern. However, it can also be used as a conventional QA model (see the examples).
 
 ## Data & Training
 
@@ -29,10 +38,10 @@ To train the model to use the demonstrations, we've **clustered** the samples by
 in English AdversarialQA and by the category in the Czech SQAD and used the examples of the same cluster as the demonstrations
 of the task in training.
 
-We find that the specific algorithm of selection of these demonstrations
-to new tasks
+We find that the specific algorithm for selecting these demonstrations is crucial for the model's ability to extrapolate
+to new tasks. We'll share more details in a follow-up article; stay tuned!
 
-For the Czech SQAD 3.0, original contexts (=whole Wikipedia websites) were limited to a maximum of
+For the Czech SQAD 3.0, the original contexts (i.e. whole Wikipedia pages) were limited to a maximum of 4000 characters
 per sequence of prime demonstrations.
 The pre-processing script for Czech SQAD is available [here](https://huggingface.co/gaussalgo/xlm-roberta-large_extractive-QA_en-cs/blob/main/parse_czech_squad.py).
 
@@ -88,11 +97,6 @@ input_text = """
 Context: Customer id: Barack Obama, if not deliverable, return to Bill Clinton.
 Answer:"""
 ```
-
-Note that despite its size, English AdversarialQA has a variety of reported biases,
-conditioned by the relative position or type of the answer in the context that can affect the model's performance on new data
-(see, e.g. [L. Mikula (2022)](https://is.muni.cz/th/adh58/?lang=en), Chap. 4.1).
-
 ## Usage
 
 Here is how to use this model to answer the question on a given context using 🤗 Transformers in PyTorch:
@@ -100,8 +104,8 @@ Here is how to use this model to answer the question on a given context using
 ```python
 from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
 
-tokenizer = AutoTokenizer.from_pretrained("gaussalgo/mt5-
-model = AutoModelForSeq2SeqLM.from_pretrained("gaussalgo/mt5-
+tokenizer = AutoTokenizer.from_pretrained("gaussalgo/mt5-large-priming-QA_en-cs")
+model = AutoModelForSeq2SeqLM.from_pretrained("gaussalgo/mt5-large-priming-QA_en-cs")
 
 # For the expected format of input_text, see Intended use above
 inputs = tokenizer(input_text, return_tensors="pt")
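For a concrete picture of what `input_text` looks like, a primed prompt can be assembled by concatenating a few solved Question/Context/Answer demonstrations and ending with the unanswered question, as in the widget examples above. The sketch below reuses the movie-review demonstrations from the widget; the helper variables are illustrative and not part of the model card:

```python
# A minimal sketch of building a primed input_text in the
# Question/Context/Answer format used by the widget examples above.
demonstrations = [
    ("What is the score of this review?",
     "I did not like the plot at all. Not recommended.", "1"),
    ("What is the score of this review?",
     "I loved the performance. Can't believe they did not use CGI for the finale.", "5"),
]
# The final, unanswered query the model should complete.
query = ("Is the score of this review 1, 2, 3, 4 or 5?",
         "The beginning was awesome, but at the end it felt a little rushed.")

primes = "\n".join(f"Question: {q} \nContext: {c} \nAnswer: {a}" for q, c, a in demonstrations)
input_text = primes + f"\nQuestion: {query[0]} \nContext: {query[1]} \nAnswer:"
```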
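The usage snippet in the last hunk stops at tokenization; a natural continuation with standard 🤗 Transformers calls (the `max_new_tokens` value is an illustrative choice, not taken from this card) generates and decodes the answer:

```python
# Generate the answer continuation and decode it back to text.
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```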