ylacombe PHBJT committed on
Commit
fed4f57
1 Parent(s): bb39386

Update README.md (#1)

- Update README.md (dff727b9fa48abaa2ae4589c210b251249b35ff8)


Co-authored-by: Paul Henri Biojout <PHBJT@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -40,7 +40,7 @@ datasets:
  It is a fine-tuned version, trained on a [cleaned version](https://huggingface.co/datasets/PHBJT/cml-tts-cleaned-levenshtein) of [CML-TTS](https://huggingface.co/datasets/ylacombe/cml-tts) and on the non-English version of [Multilingual LibriSpeech](https://huggingface.co/datasets/facebook/multilingual_librispeech).
  In all, this represents some 9,200 hours of non-English data. To retain English capabilities, we also added back the [LibriTTS-R English dataset](https://huggingface.co/datasets/parler-tts/libritts_r_filtered), some 580h of high-quality English data.
 
- **Parler-TTS Mini Multilingual** can speak in 7 European languages: English, French, Spanish, Portuguese, Polish, German, Italian and Dutch.
+ **Parler-TTS Mini Multilingual** can speak in 8 European languages: English, French, Spanish, Portuguese, Polish, German, Italian and Dutch.
 
  Thanks to its **better prompt tokenizer**, it can easily be extended to other languages. This tokenizer has a larger vocabulary and handles byte fallback, which simplifies multilingual training.
 
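
The README text in this hunk credits byte fallback for making multilingual training simpler. As a rough illustration of that idea only, here is a toy sketch in Python. It is not the Parler-TTS prompt tokenizer: the vocabulary `TOY_VOCAB`, the `BYTE_OFFSET` id layout, and the `encode`/`decode` helpers are all hypothetical names made up for this example. The point it shows is that a character missing from the vocabulary is split into its UTF-8 bytes rather than mapped to a single unknown token, so text in a new language still round-trips.

```python
from __future__ import annotations

# Toy illustration of byte fallback. NOT the Parler-TTS tokenizer:
# the vocabulary and id layout below are assumptions for this sketch.
TOY_VOCAB = {"hello": 0, " ": 1, "world": 2}
# Byte tokens occupy ids BYTE_OFFSET .. BYTE_OFFSET + 255.
BYTE_OFFSET = len(TOY_VOCAB)


def encode(text: str) -> list[int]:
    """Greedy longest-match encoding with UTF-8 byte fallback."""
    pieces = sorted(TOY_VOCAB, key=len, reverse=True)
    ids: list[int] = []
    i = 0
    while i < len(text):
        for piece in pieces:
            if text.startswith(piece, i):
                ids.append(TOY_VOCAB[piece])
                i += len(piece)
                break
        else:
            # No vocabulary piece matches here: emit one token per UTF-8 byte
            # of the current character instead of an "unknown" token.
            ids.extend(BYTE_OFFSET + b for b in text[i].encode("utf-8"))
            i += 1
    return ids


def decode(ids: list[int]) -> str:
    """Invert encode(): vocabulary pieces and byte tokens both become bytes."""
    inverse = {token_id: piece for piece, token_id in TOY_VOCAB.items()}
    out = bytearray()
    for token_id in ids:
        if token_id >= BYTE_OFFSET:
            out.append(token_id - BYTE_OFFSET)
        else:
            out.extend(inverse[token_id].encode("utf-8"))
    return out.decode("utf-8")


if __name__ == "__main__":
    sample = "hello żółw"  # the Polish word is not in the toy vocabulary
    ids = encode(sample)
    print(ids)
    print(decode(ids) == sample)
```

In this sketch no input can fall outside the id space, because any character can be written as bytes. That is the property the README appears to rely on when it says the tokenizer can be extended to other languages; the real tokenizer's vocabulary and merge rules will differ.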