---
license: mit
datasets:
- squad
- fewshot-goes-multilingual/cs_squad-3.0
language:
- cs
- en
metrics:
- rouge
pipeline_tag: text2text-generation
---

# Model Card for mTk-SQuAD_en-SQAD_cs-1B

This model is a generative in-context few-shot learner specialized in Czech. It was trained on a combination of the English SQuAD and Czech SQAD datasets.

Detailed information is available on the [project GitHub](https://github.com/fewshot-goes-multilingual/slavic-incontext-learning) and in the referenced paper.

## Model Details

### Model Description

- **Developed by:** Michal Stefanik & Marek Kadlcik, Masaryk University
- **Model type:** mt5
- **Language(s) (NLP):** cs, en
- **License:** MIT
- **Finetuned from model:** google/mt5-large

### Model Sources

- **Repository:** https://github.com/fewshot-goes-multilingual/slavic-incontext-learning
- **Paper:** https://arxiv.org/abs/2304.01922

## Uses

This model is intended for few-shot in-context learning, with prompts written either in the target language (Czech) or in the source language (English, see below). It was evaluated on learning unseen tasks in Czech from k=3 demonstrations; see the referenced paper for details.

### How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("{this model path}")
tokenizer = AutoTokenizer.from_pretrained("{this model path}")

# English few-shot prompt: one solved demonstration followed by the query.
# For Czech prompts, use the keywords "Otázka", "Kontext" and "Odpověď" instead.
input_text = """
Question: What is the customer's name?
Context: Origin: Barrack Obama, Customer id: Bill Moe.
Answer: Bill Moe,
Question: What is the customer's name?
Context: Customer id: Barrack Obama, if not deliverable, return to Bill Clinton.
Answer:
"""

inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs)

# generate() returns a batch of token-id sequences; decode the first (and only) one.
print("Answer:")
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

Training can be reproduced by running `pip install -r requirements.txt && python train_mt5_qa_en_SQuAD+cs_random.py`.

See the referenced script for hyperparameters and other training configurations.

## Citation

**BibTeX:**

```bib
@inproceedings{stefanik2023resources,
  author = {\v{S}tef\'{a}nik, Michal and Kadlčík, Marek and Gramacki, Piotr and Sojka, Petr},
  title = {Resources and Few-shot Learners for In-context Learning in Slavic Languages},
  booktitle = {Proceedings of the 9th Workshop on Slavic Natural Language Processing},
  publisher = {ACL},
  numpages = {9},
  url = {https://arxiv.org/abs/2304.01922},
}
```
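
The prompt format described under Uses (question, context and answer triples, with the Czech keywords "Otázka", "Kontext" and "Odpověď") can also be assembled programmatically. The following is a minimal sketch, not the authors' code: the helper name `build_prompt`, the newline separator between fields, and the Czech demonstration texts are all assumptions made for illustration. Check the linked repository for the exact format used in training.

```python
# Hypothetical helper (not part of the model's repository) that builds a
# few-shot QA prompt from solved demonstrations plus one open query.
KEYWORDS = {
    "en": ("Question", "Context", "Answer"),
    "cs": ("Otázka", "Kontext", "Odpověď"),
}


def build_prompt(demonstrations, question, context, lang="cs"):
    """Return a prompt string ending with an empty answer slot.

    demonstrations: iterable of (question, context, answer) triples.
    The newline separator between fields is an assumption.
    """
    q_kw, c_kw, a_kw = KEYWORDS[lang]
    lines = []
    for demo_q, demo_c, demo_a in demonstrations:
        lines += [f"{q_kw}: {demo_q}", f"{c_kw}: {demo_c}", f"{a_kw}: {demo_a}"]
    # The final answer keyword is left open for the model to complete.
    lines += [f"{q_kw}: {question}", f"{c_kw}: {context}", f"{a_kw}:"]
    return "\n".join(lines)


# Illustrative Czech demonstration (made-up text):
# "What is the customer's name?" / "Sender: Jan Novák, customer: Petr Svoboda."
demos = [
    (
        "Jak se jmenuje zákazník?",
        "Odesílatel: Jan Novák, zákazník: Petr Svoboda.",
        "Petr Svoboda",
    ),
]
prompt = build_prompt(
    demos,
    question="Jak se jmenuje zákazník?",
    context="Zákazník: Eva Dvořáková, odesílatel: Karel Černý.",
    lang="cs",
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of `input_text` in the getting-started snippet. Since the model was evaluated with k=3 demonstrations, supplying three triples in `demonstrations` matches the evaluated setting most closely.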