fdelucaf committed on
Commit f5c46e3 · verified · 1 Parent(s): 87c6b12

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -322,7 +322,8 @@ including all of the official European languages plus Catalan, Basque, Galician,
  It amounts to 6,574,251,526 parallel sentence pairs.
 
  This highly multilingual corpus is predominantly composed of data sourced from [OPUS](https://opus.nlpl.eu/),
- with additional data taken from the [NTEU project](https://nteu.eu/), [Aina Project](https://projecteaina.cat/), and other sources (see: [Data Sources#](#pre-data-sources) and [References below](#pre-references)).
+ with additional data taken from the [NTEU project](https://nteu.eu/), [Aina Project](https://projecteaina.cat/), and other sources
+ (see: [Data Sources](#pre-data-sources) and [References](#pre-references)).
  Where little parallel Catalan <-> xx data could be found, synthetic Catalan data was generated from the Spanish side of the collected Spanish <-> xx corpora using
  [Projecte Aina’s Spanish-Catalan model](https://huggingface.co/projecte-aina/aina-translator-es-ca). The final distribution of languages was as below: