base_model: facebook/w2v-bert-2.0
datasets:
  - common_voice_10_0
metrics:
  - wer
model-index:
  - name: w2v-bert-2.0-uk
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: common_voice_10_0
          type: common_voice_10_0
          config: uk
          split: test
          args: uk
        metrics:
          - name: Wer
            type: wer
            value: 0.0655

wav2vec2-bert-uk

Quality:

  • AM (acoustic model only):
    • WER: 0.0727
    • CER: 0.0151
    • Accuracy (1 − WER): 92.73%
  • AM + LM (acoustic model with a language model):
    • WER: 0.0655
    • CER: 0.0139
    • Accuracy (1 − WER): 93.45%
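
For readers unfamiliar with these metrics, here is a minimal sketch of how WER and CER are commonly defined: the edit distance between reference and hypothesis, divided by the reference length in words or characters. It is not the evaluation script used for this model. The helper names and the sample sentences are hypothetical, and real evaluations usually sum errors over the whole test set instead of scoring one utterance.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong word out of three, one wrong character out of 16.
reference = "добрий день усім"
hypothesis = "добрий день всім"
print(f"WER: {wer(reference, hypothesis):.4f}")
print(f"CER: {cer(reference, hypothesis):.4f}")
print(f"Accuracy: {(1 - wer(reference, hypothesis)) * 100:.2f}%")
```

The accuracy figures in the list above are consistent with 1 − WER: for example, 1 − 0.0727 = 92.73%.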

This model was trained on 2 RTX A4000 GPUs with the following hyperparameters:

torchrun --standalone --nnodes=1 --nproc-per-node=2 ../train_w2v2_bert.py \
  --custom_set ~/cv10/train.csv \
  --custom_set_eval ~/cv10/test.csv \
  --num_train_epochs 15 \
  --tokenize_config . \
  --w2v2_bert_model facebook/w2v-bert-2.0 \
  --batch 4 \
  --num_proc 5 \
  --grad_accum 1 \
  --learning_rate 3e-5 \
  --logging_steps 20 \
  --eval_step 500 \
  --group_by_length \
  --attention_dropout 0.0 \
  --activation_dropout 0.05 \
  --feat_proj_dropout 0.05 \
  --feat_quantizer_dropout 0.0 \
  --hidden_dropout 0.05 \
  --layerdrop 0.0 \
  --final_dropout 0.0 \
  --mask_time_prob 0.0 \
  --mask_time_length 10 \
  --mask_feature_prob 0.0 \
  --mask_feature_length 10
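
As a rough aid for reproducing this setup on different hardware, the sketch below computes the effective batch size implied by the command. It assumes that `--batch` is the per-device batch size and that `torchrun` starts one process per GPU. The training script itself is not shown here, so treat both as assumptions.

```python
# Values copied from the training command above.
nproc_per_node = 2  # --nproc-per-node: assumed one process per GPU
batch = 4           # --batch: ASSUMED to be the per-device batch size
grad_accum = 1      # --grad_accum: gradient accumulation steps

# Samples contributing to each optimizer step under these assumptions.
effective_batch = nproc_per_node * batch * grad_accum
print(f"Effective batch size: {effective_batch}")
```

If those assumptions hold, a single-GPU run could keep the same effective batch size with, for example, `--batch 4 --grad_accum 2`.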