A collection of training corpora and models for "Multilingual Pretraining Using a Large Corpus Machine-Translated from a Single Source Language".
BritLLM
contact@llm.org.uk
Collections: 1
Datasets: 17
britllm/TransWeb-Edu-English (36M rows, 81 downloads)
britllm/TransWeb-Edu-Spanish (35.2M rows, 455 downloads, 2 likes)
britllm/TransWeb-Edu-French (36M rows, 552 downloads)
britllm/TransWeb-Edu-German (36M rows, 471 downloads)
britllm/xnli_brit (9.69k rows, 40 downloads)
britllm/piqa_scottish_gaelic (4 downloads)
britllm/piqa_welsh (3 downloads)
britllm/piqa_irish
britllm/arc_scottish_gaelic (7.56k rows, 39 downloads)
britllm/arc_welsh (7.72k rows, 45 downloads)