Polish Question Answering
Collection of models and datasets for Polish Question Answering.
Sentence Similarity • Updated • 830 • 8
Note: SilverRetriever is a state-of-the-art neural passage retriever trained on the PolQA and MAUPQA datasets.
ipipan/silver-retriever-base-v1
Sentence Similarity • Updated • 2.41k • 10
Note: SilverRetriever is a state-of-the-art neural passage retriever trained on the PolQA and MAUPQA datasets.
ipipan/polqa
Updated • 3.77k • 7
Note: PolQA is the first Polish dataset for open-domain question answering. It consists of 7,000 questions, 87,525 manually labeled evidence passages, and a corpus of over 7 million candidate passages. The dataset can be used to train both a passage retriever and an abstractive reader.
ipipan/maupqa
Updated • 98 • 4
Note: MAUPQA is a collection of 14 datasets for Polish document retrieval. Most of the datasets are either machine-generated or machine-translated from English. Across all datasets, it contains over 1M questions, 1M positive question-passage pairs, and 7M hard-negative question-passage pairs.
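As a rough illustration of how positive and hard-negative question-passage pairs of this kind are commonly consumed when training a retriever, the sketch below groups labeled pairs into (question, positive, negative) triplets. The field names (`question`, `passage`, `relevant`) and the sample rows are hypothetical and are not taken from the MAUPQA schema; check the dataset card for the real column names.

```python
from collections import defaultdict
from itertools import product


def build_triplets(pairs):
    """Group labeled question-passage pairs into (question, positive, negative) triplets."""
    positives = defaultdict(list)
    negatives = defaultdict(list)
    for row in pairs:
        # Hypothetical schema: "relevant" is True for positives, False for hard negatives.
        bucket = positives if row["relevant"] else negatives
        bucket[row["question"]].append(row["passage"])

    triplets = []
    for question, pos_passages in positives.items():
        # Pair every positive passage with every hard negative for the same question.
        for pos, neg in product(pos_passages, negatives.get(question, [])):
            triplets.append((question, pos, neg))
    return triplets


# Hypothetical sample rows, invented for illustration only.
sample_pairs = [
    {"question": "Jaka jest stolica Polski?",
     "passage": "Warszawa jest stolicą Polski.", "relevant": True},
    {"question": "Jaka jest stolica Polski?",
     "passage": "Kraków to miasto w południowej Polsce.", "relevant": False},
    {"question": "Jaka jest stolica Polski?",
     "passage": "Gdańsk leży nad Morzem Bałtyckim.", "relevant": False},
]

triplets = build_triplets(sample_pairs)
```

Questions without any positive passage produce no triplets in this sketch, and questions without negatives are skipped as well.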
clarin-pl/poquad
Viewer • Updated • 52k • 653 • 4
Note: PoQuAD is the Polish equivalent of SQuAD. It consists of more than 70,000 question-passage pairs, as well as extractive and abstractive answers.
allegro/polish-question-passage-pairs
Viewer • Updated • 10.4k • 12 • 4
Note: Over 10,000 manually annotated question-passage pairs. The questions are taken from the PolQA dataset, but the passages are often unique. The dataset consists mostly of hard negatives (8k pairs).
allegro/klej-dyk
Viewer • Updated • 5.18k • 2.63k • 1
Note: The "Czy wiesz?" (Eng. "Did you know?") dataset consists of almost 5k question-passage pairs obtained from the "Czy wiesz..." section of Polish Wikipedia. Each question is written by a Wikipedia collaborator and is answered with a link to a relevant Wikipedia article.
piotr-rybak/allegro-faq
Viewer • Updated • 1.88k
Note: Allegro FAQ is one of the PolEval 2022 test sets. It consists of 900 frequently asked questions and 921 help articles about Allegro.com, a large Polish e-commerce platform. Each question-passage pair is manually checked and edited where necessary.
piotr-rybak/legal-questions
Updated • 5
Note: Legal Questions is one of the PolEval 2022 test sets. It consists of 718 questions and approximately 26,000 passages extracted from over 1,000 acts of law.
Polish Information Retrieval Benchmark (PIRB)
Note: A benchmark for Polish information retrieval, consisting of 41 datasets.
sdadas/mmlw-retrieval-roberta-base
Sentence Similarity • Updated • 239 • 1
Note: Neural text encoder for Polish. More models: https://huggingface.co/sdadas?search_models=mmlw
sdadas/gpt-exams
Viewer • Updated • 8.13k • 6 • 2
Note: The dataset contains 8,131 multi-domain question-answer pairs. It was created semi-automatically using the gpt-3.5-turbo-0613 model available in the OpenAI API.
apohllo/plt5-base-poquad
Text2Text Generation • Updated • 8 • 1
Note: This is a plT5-base model trained on the PoQuAD dataset. It comes from a single experiment run, so don't expect state-of-the-art results.
sdadas/polish-reranker-large-ranknet
Text Classification • Updated • 111
Note: Cross-encoder for Polish. More models: https://huggingface.co/sdadas?search_models=reranker
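Text encoders and cross-encoders like the ones listed here are often combined in a two-stage pipeline: the encoder retrieves a shortlist of candidates cheaply, and the cross-encoder rescores each question-passage pair. The sketch below shows only that control flow. Both scoring functions are toy stand-ins (bag-of-words cosine and token overlap), not the listed models, and all function names are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    # Toy stand-in for a neural text encoder: a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, passages, top_k):
    # Stage 1: embed query and passages independently, keep the top_k most similar passages.
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]


def pair_score(query, passage):
    # Toy stand-in for a cross-encoder: scores the (query, passage) pair jointly
    # as the fraction of query tokens that appear in the passage.
    q_tokens = set(query.lower().split())
    p_tokens = set(passage.lower().split())
    return len(q_tokens & p_tokens) / len(q_tokens) if q_tokens else 0.0


def retrieve_and_rerank(query, passages, top_k=2):
    # Stage 2: rescore only the shortlisted candidates with the pairwise scorer.
    candidates = retrieve(query, passages, top_k)
    return sorted(candidates, key=lambda p: pair_score(query, p), reverse=True)


# Hypothetical toy corpus, invented for illustration only.
corpus = [
    "stolica polski to warszawa",
    "krakow lezy nad wisla",
    "warszawa lezy nad wisla",
]
results = retrieve_and_rerank("stolica polski", corpus, top_k=2)
```

In a real setup the stand-ins would be replaced by actual model calls, with the shortlist size chosen to balance the reranker's cost against recall.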