reazonspeech-k2-v2

reazonspeech-k2-v2 is an automatic speech recognition (ASR) model trained on the ReazonSpeech v2.0 corpus.

This model provides end-to-end Japanese speech recognition based on Next-gen Kaldi.

Model Architecture

  • Character-based RNN-T model. The total parameter count is 159.34M.

  • This model utilizes an enhanced Transformer architecture called Zipformer.

  • The training recipe is available on k2-fsa/icefall.

Note that this model can process Japanese audio clips up to ~30 seconds.
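For recordings longer than the ~30 second limit, one option is to cut the audio into shorter windows and transcribe each window separately. The sketch below only computes the window boundaries. The helper name split_spans and its max_seconds parameter are hypothetical and not part of the reazonspeech library, and fixed-length windows are an assumption made for illustration.

```python
def split_spans(total_seconds, max_seconds=30.0):
    """Return (start, end) pairs in seconds that cover [0, total_seconds].

    Hypothetical helper: every span is at most max_seconds long, and the
    last span is shorter when the duration is not an exact multiple.
    """
    if total_seconds < 0 or max_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and max_seconds > 0")

    spans = []
    start = 0.0
    while start < total_seconds:
        # Clamp the window so it never runs past the end of the recording.
        end = min(start + max_seconds, total_seconds)
        spans.append((start, end))
        start = end
    return spans


if __name__ == "__main__":
    # A 70-second recording is covered by three windows.
    for start, end in split_spans(70.0):
        print(f"{start:.1f}s - {end:.1f}s")
```

Fixed-length windows can cut a word in half at a boundary, so splitting at pauses in the speech would likely give better transcripts. This sketch does not attempt that.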

Usage

We recommend using this model through our reazonspeech library.

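from reazonspeech.k2.asr import load_model, transcribe, audio_from_path

# Read the audio file to transcribe.
audio = audio_from_path("speech.wav")
# Load the ASR model.
model = load_model()
# Run recognition and print the transcribed text.
ret = transcribe(model, audio)
print(ret.text)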

License

Apache License 2.0
