TEXT_Datasets
Datasets for fine-tuning, instruction tuning, and evaluation of text models from projecte-aina
projecte-aina/ceil
Note: Named Entity Recognition
projecte-aina/catalanqa
Note: QA dataset
projecte-aina/GuiaCat
Note: Sentiment analysis
projecte-aina/CaWikiTC
Note: Text classification
projecte-aina/ancora-ca-ner
Note: Named Entity Recognition
projecte-aina/teca
Note: Textual entailment
projecte-aina/viquiquad
Note: Extractive-QA
projecte-aina/xquad-ca
Note: Cross-lingual-QA, Extractive-QA
projecte-aina/WikiCAT_ca
Note: Text classification
projecte-aina/Parafraseja
Note: Paraphrase
projecte-aina/sts-ca
Note: Semantic Textual Similarity
projecte-aina/wnli-ca
Note: Textual entailment
projecte-aina/tecla
Note: Text classification
projecte-aina/vilaquad
Note: Extractive-QA
projecte-aina/catalan_general_crawling
Note: A 435-million-token web corpus of Catalan mainly intended to pretrain language models and word representations.
projecte-aina/raco_forums
Note: A 19-million-sentence corpus of Catalan user-generated text built from the forums mainly intended to pretrain language models and word representations.
projecte-aina/catalan_government_crawling
Note: A 39-million-token web corpus of Catalan mainly intended to pretrain language models and word representations.
projecte-aina/catalan_textual_corpus
Note: A 1760-million-token web corpus of Catalan mainly intended to pretrain language models and word representations.
projecte-aina/CoQCat
Note: Conversational QA
projecte-aina/caBreu
Note: Summarization
projecte-aina/CaSERa-catalan-stance-emotions-raco
Note: Emotion and dynamic stance detection
projecte-aina/InToxiCat
Note: Abusive language detection
projecte-aina/UD_Catalan-AnCora
Note: POS tagging
projecte-aina/CaSSA-catalan-structured-sentiment-analysis
Note: Sentiment analysis
projecte-aina/CaSET-catalan-stance-emotions-twitter
Note: Emotion, static stance, and dynamic stance detection.
projecte-aina/COPA-ca
Note: Commonsense reasoning
projecte-aina/xnli-ca
Note: Textual entailment
projecte-aina/casum
Note: Summarization
projecte-aina/vilasum
Note: Summarization
projecte-aina/CATalog
Note: Language Modeling
projecte-aina/mgsm_ca
Note: Question Answering
projecte-aina/MentorES
Note: Instruction Tuning
projecte-aina/MentorCA
Note: Instruction Tuning
projecte-aina/openbookqa_ca
Note: Question Answering
projecte-aina/PAWS-ca
Note: Paraphrase Identification
projecte-aina/NLUCat
Note: Intent classification, span identification, and example generation.
projecte-aina/siqa_ca
Note: Multiple Choice Question Answering
projecte-aina/piqa_ca
Note: Multiple Choice Question Answering
projecte-aina/xstorycloze_ca
Note: Multiple Choice Commonsense Reasoning
projecte-aina/arc_ca
Note: Multiple Choice Question Answering
projecte-aina/oasst1_ca
Note: Instruction Tuning
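
As a rough illustration of how the entries in this collection might be used, the sketch below groups a handful of the repository ids listed above by their task note and wraps the Hugging Face `datasets.load_dataset` call for fetching one of them. The names `CATALOG`, `group_by_task`, and `load_aina_dataset` are hypothetical helpers introduced here, not part of any projecte-aina tooling. It also assumes each repository loads with default arguments; some may need a configuration name or other options, so check the individual dataset card first.

```python
from collections import defaultdict

# A few entries copied from the collection above (repo id -> task note).
# This is an illustrative subset, not the full catalog.
CATALOG = {
    "projecte-aina/viquiquad": "Extractive-QA",
    "projecte-aina/vilaquad": "Extractive-QA",
    "projecte-aina/caBreu": "Summarization",
    "projecte-aina/casum": "Summarization",
    "projecte-aina/vilasum": "Summarization",
    "projecte-aina/MentorCA": "Instruction Tuning",
}


def group_by_task(catalog):
    """Return {task note: [repo ids]}, preserving the catalog's order."""
    grouped = defaultdict(list)
    for repo_id, task in catalog.items():
        grouped[task].append(repo_id)
    return dict(grouped)


def load_aina_dataset(repo_id, **kwargs):
    """Fetch one dataset from the Hub.

    Requires `pip install datasets` and network access. Extra keyword
    arguments are passed straight through to `datasets.load_dataset`.
    """
    from datasets import load_dataset  # imported lazily: optional dependency

    return load_dataset(repo_id, **kwargs)


# Example (not executed here because it downloads data):
# ds = load_aina_dataset("projecte-aina/MentorCA")
```

Grouping by the note text makes it easy to pick every dataset for a given task, for example `group_by_task(CATALOG)["Summarization"]`, before deciding which ones to download.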