nvidia/stt_uk_citrinet_1024_gamma_0_25

Automatic Speech Recognition

hf-asr-leaderboard

dpykhtar committed on Jul 27, 2022

Commit 9be9066 (1 parent: e602fb6)

Update README.md

Files changed (1)

README.md +23 -0

@@ -45,6 +45,29 @@ To train, fine-tune or play with the model you will need to install [NVIDIA NeMo
 pip install nemo_toolkit['all']
 ```
+### Automatically instantiate the model
+```python
+import nemo.collections.asr as nemo_asr
+asr_model = nemo_asr.models.EncDecCTCModel.from_pretrained("nvidia/stt_uk_citrinet_1024_gamma_0_25")
+```
+### Transcribing using Python
+First, get a sample of spoken Ukrainian.
+Then simply do:
+```python
+asr_model.transcribe(['<Path of audio file(s)>'])
+```
+### Transcribing many audio files
+```shell
+python [NEMO_GIT_FOLDER]/examples/asr/transcribe_speech.py \
+ pretrained_name="nvidia/stt_uk_citrinet_1024_gamma_0_25" \
+ audio_dir="<DIRECTORY CONTAINING AUDIO FILES>"
+```
 ### Input
 This model accepts 16 kHz (16000 Hz) mono-channel audio (WAV files) as input.
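
Since the Input section requires mono WAV audio at a fixed sample rate, it can help to check files before passing them to `transcribe`. The following is a minimal sketch using only Python's standard `wave` module; the helper names `is_16k_mono_wav` and `collect_valid_wavs` are hypothetical and not part of NeMo. It assumes the intended sample rate is 16000 Hz, and note that `wave` only reads PCM-encoded WAV files, so other encodings are reported as invalid here.

```python
import wave
from pathlib import Path

# Assumed input format, taken from the README's Input section.
EXPECTED_SAMPLE_RATE = 16000  # Hz
EXPECTED_CHANNELS = 1         # mono


def is_16k_mono_wav(path):
    """Return True if `path` is a readable PCM WAV file at 16000 Hz with one channel."""
    try:
        with wave.open(str(path), "rb") as wav_file:
            return (
                wav_file.getframerate() == EXPECTED_SAMPLE_RATE
                and wav_file.getnchannels() == EXPECTED_CHANNELS
            )
    except (wave.Error, EOFError, OSError):
        # Missing, unreadable, or non-PCM-WAV files are treated as invalid.
        return False


def collect_valid_wavs(directory):
    """Return sorted path strings of the .wav files in `directory` that match the expected format."""
    return sorted(
        str(p) for p in Path(directory).glob("*.wav") if is_16k_mono_wav(p)
    )
```

The list returned by `collect_valid_wavs` has the same shape as the argument shown in the README's `asr_model.transcribe([...])` call, so it could be passed there directly. Files that fail the check would need resampling or downmixing first, which this sketch does not attempt.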