lpw committed on
Commit
a51c8ed
1 Parent(s): 91986b5

Update README.md

Files changed (1)
  1. README.md +5 -20
README.md CHANGED
@@ -9,17 +9,16 @@ tags:
  - speech-to-speech-translation
 
  datasets:
- - mtedx
- - covost2
- - europarl_st
- - voxpopuli
+ - MuST-C
+ - TAT
+ - Hokkien dramas
 
  ---
  ## xm_transformer_unity_hk-en
 
- Speech-to-speech translation model from fairseq S2UT ([paper](https://arxiv.org/abs/2204.02967)/[code](https://github.com/facebookresearch/fairseq/blob/main/examples/speech_to_speech/docs/enhanced_direct_s2st_discrete_units.md)):
+ Speech-to-speech translation model from fairseq S2UT:
  - Hokkien-English
- - Trained on mTEDx, CoVoST 2, Europarl-ST and VoxPopuli
+ - Trained on MuST-C, TAT and Hokkien dramas
  - Speech synthesis with [facebook/unit_hifigan_mhubert_vp_en_es_fr_it3_400k_layer11_km1000_lj_dur](https://huggingface.co/facebook/unit_hifigan_mhubert_vp_en_es_fr_it3_400k_layer11_km1000_lj_dur)
 
  ## Usage
@@ -88,18 +87,4 @@ tts_sample = tts_model.get_model_input(unit)
  wav, sr = tts_model.get_prediction(tts_sample)
 
  ipd.Audio(wav, rate=sr)
- ```
-
- ## Citation
- ```bibtex
- @misc{https://doi.org/10.48550/arxiv.2204.02967,
- doi = {10.48550/ARXIV.2204.02967},
- url = {https://arxiv.org/abs/2204.02967},
- author = {Popuri, Sravya and Chen, Peng-Jen and Wang, Changhan and Pino, Juan and Adi, Yossi and Gu, Jiatao and Hsu, Wei-Ning and Lee, Ann},
- keywords = {Computation and Language (cs.CL), Sound (cs.SD), Audio and Speech Processing (eess.AS), FOS: Computer and information sciences, FOS: Computer and information sciences, FOS: Electrical engineering, electronic engineering, information engineering, FOS: Electrical engineering, electronic engineering, information engineering},
- title = {Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation},
- publisher = {arXiv},
- year = {2022},
- copyright = {arXiv.org perpetual, non-exclusive license}
- }
  ```