espnet/owsm_v3.1_ebf_small_lowrestriction

Automatic Speech Recognition

speech-translation

pyf98 committed on Sep 3, 2024

Commit cfcc86a · verified · 1 Parent(s): a4b693c

Update README.md

Files changed (1)

README.md +1 -1

README.md CHANGED

@@ -19,7 +19,7 @@ Our demo is available [here](https://huggingface.co/spaces/pyf98/OWSM_v3_demo).
 [OWSM v3.1](https://arxiv.org/abs/2401.16658) is an improved version of OWSM v3. It significantly outperforms OWSM v3 in almost all evaluation benchmarks.
 We do not include any new training data. Instead, we utilize a state-of-the-art speech encoder, [E-Branchformer](https://arxiv.org/abs/2210.00077).
-This is a small size model with 367M parameters and is trained on 70k hours of public speech data with lower restrictions (compared to the full OWSM data). Please check our [project page](https://www.wavlab.org/activities/2024/owsm/) for more information.
+**This is a small size model with 367M parameters and is trained on 70k hours of public speech data with lower restrictions (compared to the full OWSM data).** Please check our [project page](https://www.wavlab.org/activities/2024/owsm/) for more information.
 Specifically, it supports the following speech-to-text tasks:
 - Speech recognition
 - Utterance-level alignment