|
---
license: mit
datasets:
- sberquad
- adversarial_qa
language:
- en
- ru
metrics:
- rouge
pipeline_tag: text2text-generation
---
|
|
|
# Model Card for mTk-AdversarialQA_en-SberQuAD_ru-1B |
|
This model is a generative in-context few-shot learner specialized in Russian. It was trained on a combination of English AdversarialQA and Russian SberQuAD datasets. |
|
|
|
You can find detailed information in the [project GitHub repository](https://github.com/fewshot-goes-multilingual/slavic-incontext-learning) and in the referenced paper.
|
|
|
## Model Details |
|
### Model Description |
|
- **Developed by:** Michal Stefanik & Marek Kadlcik, Masaryk University |
|
- **Model type:** mT5

- **Language(s) (NLP):** en, ru
|
- **License:** MIT |
|
- **Finetuned from model:** google/mt5-large |
|
### Model Sources |
|
- **Repository:** https://github.com/fewshot-goes-multilingual/slavic-incontext-learning |
|
- **Paper:** [To be filled] |
|
## Uses |
|
This model is intended to be used in a few-shot in-context learning format in the target language (Russian) or in the source language (English; see below).

It was evaluated on unseen-task learning (with k=3 demonstrations) in Russian; see the referenced paper for details.
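
For illustration, a k-shot prompt is simply k (question, context, answer) demonstrations concatenated in front of the target question, whose answer is left empty for the model to fill in. Below is a minimal sketch of assembling such a prompt; the `build_prompt` helper is hypothetical, not part of the project codebase:

```python
# Minimal sketch of assembling a k-shot QA prompt in the format used
# in the example below; `build_prompt` is a hypothetical helper, not
# part of the project codebase.
def build_prompt(demonstrations, question, context):
    # Each demonstration is a (question, context, answer) triple.
    parts = [
        f"Question: {q}\nContext: {c}\nAnswer: {a},"
        for q, c, a in demonstrations
    ]
    # The target question is appended with an empty answer for the model to fill in.
    parts.append(f"Question: {question}\nContext: {context}\nAnswer:")
    return "\n".join(parts)
```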
|
### How to Get Started with the Model |
|
Use the code below to get started with the model. |
|
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("{this model path}")
tokenizer = AutoTokenizer.from_pretrained("{this model path}")

# For Russian few-shot prompts, use the keywords "Вопрос", "Контекст" and "Отвечать" instead.
input_text = """
Question: What is the customer's name?
Context: Origin: Barack Obama, Customer id: Bill Moe.
Answer: Bill Moe,
Question: What is the customer's name?
Context: Customer id: Barack Obama, if not deliverable, return to Bill Clinton.
Answer:
"""
inputs = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(**inputs)

print("Answer:")
# `generate` returns a batch of output sequences; decode the first one.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
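
The same format works for Russian few-shot prompts with the keywords noted in the comment above ("Вопрос" = Question, "Контекст" = Context, "Отвечать" = Answer). A minimal sketch, with made-up example data, continuing from the loaded model and tokenizer:

```python
# Russian few-shot prompt using the keywords noted above; the
# question/context pairs are made-up illustration data, all asking
# "What is the customer's name?".
input_text_ru = """
Вопрос: Как зовут клиента?
Контекст: Отправитель: Иван Петров, Номер клиента: Анна Смирнова.
Отвечать: Анна Смирнова,
Вопрос: Как зовут клиента?
Контекст: Номер клиента: Иван Петров, в случае недоставки вернуть Анне Ивановой.
Отвечать:
"""
inputs = tokenizer(input_text_ru, return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```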
|
## Training Details |
|
Training this model can be reproduced by running `pip install -r requirements.txt && python train_mt5_qa_en_AQA+ru_info.py`.
|
See the referenced script for hyperparameters and other training configurations. |
|
## Citation |
|
|
**BibTeX:** |
|
[Will be filled soon] |