TedLium3 Zipformer

rnnt_type=regular

The WERs are

| decoding method | dev | test | comment |
|---|---|---|---|
| greedy search | 6.74 | 6.16 | --epoch 50, --avg 22, --max-duration 500 |
| beam search (beam size 4) | 6.56 | 5.95 | --epoch 50, --avg 22, --max-duration 500 |
| modified beam search (beam size 4) | 6.54 | 6.00 | --epoch 50, --avg 22, --max-duration 500 |
| fast beam search (set as default) | 6.91 | 6.28 | --epoch 50, --avg 22, --max-duration 500 |

The training command to reproduce these results is:

export CUDA_VISIBLE_DEVICES="0,1,2,3"

./zipformer/train.py \
  --use-fp16 true \
  --world-size 4 \
  --num-epochs 50 \
  --start-epoch 0 \
  --exp-dir zipformer/exp \
  --max-duration 1000

The tensorboard training log can be found at https://tensorboard.dev/experiment/AKXbJha0S9aXyfmuvG4h5A/#scalars

The decoding commands are:

epoch=50
avg=22

## greedy search
./zipformer/decode.py \
  --epoch $epoch \
  --avg $avg \
  --exp-dir zipformer/exp \
  --bpe-model ./data/lang_bpe_500/bpe.model \
  --max-duration 500

## beam search
./zipformer/decode.py \
  --epoch $epoch \
  --avg $avg \
  --exp-dir zipformer/exp \
  --bpe-model ./data/lang_bpe_500/bpe.model \
  --max-duration 500 \
  --decoding-method beam_search \
  --beam-size 4

## modified beam search
./zipformer/decode.py \
  --epoch $epoch \
  --avg $avg \
  --exp-dir zipformer/exp \
  --bpe-model ./data/lang_bpe_500/bpe.model \
  --max-duration 500 \
  --decoding-method modified_beam_search \
  --beam-size 4

## fast beam search
./zipformer/decode.py \
  --epoch $epoch \
  --avg $avg \
  --exp-dir ./zipformer/exp \
  --bpe-model ./data/lang_bpe_500/bpe.model \
  --max-duration 1500 \
  --decoding-method fast_beam_search \
  --beam 4 \
  --max-contexts 4 \
  --max-states 8
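
The four decoding commands above share most of their arguments, so they can be generated from one place. The following is a minimal dry-run sketch, not part of the recipe: the helper name `build_decode_cmd` is hypothetical, and it only prints each command using the flags and values already listed above rather than running `decode.py`.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the decode command for each decoding method.
# build_decode_cmd is a hypothetical helper, not part of icefall.
epoch=50
avg=22

build_decode_cmd() {
  local method="$1"
  # Arguments common to every decoding method listed above.
  local cmd="./zipformer/decode.py --epoch $epoch --avg $avg"
  cmd="$cmd --exp-dir zipformer/exp --bpe-model ./data/lang_bpe_500/bpe.model"
  case "$method" in
    greedy_search)
      # The greedy search command above passes no --decoding-method flag.
      cmd="$cmd --max-duration 500" ;;
    beam_search|modified_beam_search)
      cmd="$cmd --max-duration 500 --decoding-method $method --beam-size 4" ;;
    fast_beam_search)
      cmd="$cmd --max-duration 1500 --decoding-method $method"
      cmd="$cmd --beam 4 --max-contexts 4 --max-states 8" ;;
    *)
      echo "unknown method: $method" >&2
      return 1 ;;
  esac
  echo "$cmd"
}

for method in greedy_search beam_search modified_beam_search fast_beam_search; do
  build_decode_cmd "$method"
done
```

Each printed line is a complete command, so the output can be inspected first and then piped into `bash` from the recipe directory if it looks right.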

rnnt_type=modified

These results use the code from PR https://github.com/k2-fsa/icefall/pull/1125.

The WERs are

| decoding method | dev | test | comment |
|---|---|---|---|
| greedy search | 6.32 | 5.83 | --epoch 50, --avg 22, --max-duration 500 |
| modified beam search (beam size 4) | 6.16 | 5.79 | --epoch 50, --avg 22, --max-duration 500 |
| fast beam search (set as default) | 6.30 | 5.89 | --epoch 50, --avg 22, --max-duration 500 |
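
To compare these numbers with the rnnt_type=regular results, the relative WER reduction on the test set can be computed as (regular - modified) / regular. The sketch below is only illustrative arithmetic on the values reported in the two tables; the helper name `rel_reduction` is hypothetical.

```shell
#!/usr/bin/env bash
# Relative WER reduction in percent: (regular - modified) / regular * 100.
# rel_reduction is a hypothetical helper; inputs are test-set WERs from the tables.
rel_reduction() {
  awk -v regular="$1" -v modified="$2" \
    'BEGIN { printf "%.1f\n", (regular - modified) / regular * 100 }'
}

echo "greedy search:        $(rel_reduction 6.16 5.83)"
echo "modified beam search: $(rel_reduction 6.00 5.79)"
echo "fast beam search:     $(rel_reduction 6.28 5.89)"
```

The same function works for the dev column by passing the dev WERs instead.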

The training command to reproduce these results is:

export CUDA_VISIBLE_DEVICES="0,1,2,3"

./zipformer/train.py \
  --use-fp16 true \
  --world-size 4 \
  --num-epochs 50 \
  --start-epoch 0 \
  --exp-dir zipformer/exp \
  --max-duration 1000 \
  --rnnt-type modified

The tensorboard training log can be found at https://tensorboard.dev/experiment/3d4bYmbJTGiWQQaW88CVEQ/#scalars

The decoding commands are the same as above.
