metadata
license: apache-2.0
tags:
- automatic-speech-recognition
- NbAiLab/NPSC
- generated_from_trainer
model-index:
- name: ''
  results: []
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the NbAiLab/NPSC dataset (16K_MP3 configuration). It achieves the following results on the evaluation set:
- Loss: 0.1957
- Wer: 0.1697
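For readers unfamiliar with the metric, Wer is the word error rate: the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference, divided by the number of reference words. A value of 0.1697 therefore means roughly 17 word errors per 100 reference words. The sketch below is a minimal pure-Python illustration of that definition. The function name `word_error_rate` and the example sentences are hypothetical, and the card does not state which implementation or text normalization produced the reported score.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev_row[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr_row = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr_row.append(
                min(
                    prev_row[j] + 1,                      # deletion
                    curr_row[j - 1] + 1,                  # insertion
                    prev_row[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev_row = curr_row

    return prev_row[-1] / len(ref_words)


# Hypothetical example: one substitution and one deletion over four words.
example_wer = word_error_rate("a b c d", "a x c")
```

Note that a corpus-level score is usually computed as total errors over total reference words across all utterances, not as the mean of per-utterance values.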
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 7.5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2000
- num_epochs: 20.0
- mixed_precision_training: Native AMP
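Two of the values above follow from the others, and a short sketch makes the arithmetic explicit. The total train batch size is the per-device batch size times the gradient accumulation steps (times the number of devices, assumed here to be 1, which is consistent with 16 × 4 = 64). The linear scheduler ramps the learning rate from 0 to 7.5e-05 over the 2000 warmup steps and then decays it linearly to 0. The total step count `TOTAL_STEPS` is not given in the card; the value below is an estimate extrapolated from the training log (step 17500 at epoch 19.77), and the function names are hypothetical.

```python
PER_DEVICE_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 4
NUM_DEVICES = 1          # assumption: not stated in the card

BASE_LEARNING_RATE = 7.5e-05
WARMUP_STEPS = 2000
TOTAL_STEPS = 17_700     # assumption: estimated from the log, not reported


def effective_batch_size(per_device: int, accumulation: int, devices: int = 1) -> int:
    """Number of samples contributing to each optimizer update."""
    return per_device * accumulation * devices


def linear_schedule_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    if step < WARMUP_STEPS:
        return BASE_LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LEARNING_RATE * remaining / (TOTAL_STEPS - WARMUP_STEPS)


total_train_batch_size = effective_batch_size(
    PER_DEVICE_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES
)
```

Under these assumptions the learning rate peaks at step 2000, which lines up with the point in the training log where validation loss and Wer begin to drop quickly.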
Training results
Training Loss | Epoch | Step | Validation Loss | Wer |
---|---|---|---|---|
4.4527 | 0.28 | 250 | 4.0144 | 1.0 |
3.1828 | 0.56 | 500 | 3.1369 | 1.0 |
2.9927 | 0.85 | 750 | 3.0183 | 1.0 |
2.9591 | 1.13 | 1000 | 2.9991 | 1.0 |
2.8989 | 1.41 | 1250 | 2.9000 | 1.0000 |
2.4286 | 1.69 | 1500 | 1.7688 | 0.9550 |
1.6765 | 1.98 | 1750 | 0.6842 | 0.4855 |
1.4521 | 2.26 | 2000 | 0.5096 | 0.3736 |
1.3589 | 2.54 | 2250 | 0.4479 | 0.3335 |
1.3136 | 2.82 | 2500 | 0.4056 | 0.3123 |
1.2856 | 3.11 | 2750 | 0.3870 | 0.2987 |
1.2283 | 3.39 | 3000 | 0.3646 | 0.2828 |
1.2053 | 3.67 | 3250 | 0.3499 | 0.2748 |
1.2087 | 3.95 | 3500 | 0.3345 | 0.2603 |
1.2002 | 4.24 | 3750 | 0.3320 | 0.2523 |
1.1383 | 4.52 | 4000 | 0.3117 | 0.2439 |
1.1364 | 4.8 | 4250 | 0.3198 | 0.2383 |
1.158 | 5.08 | 4500 | 0.3071 | 0.2342 |
1.108 | 5.37 | 4750 | 0.3011 | 0.2314 |
1.1025 | 5.65 | 5000 | 0.2875 | 0.2289 |
1.0697 | 5.93 | 5250 | 0.2926 | 0.2256 |
1.0904 | 6.21 | 5500 | 0.2695 | 0.2245 |
1.0802 | 6.5 | 5750 | 0.2602 | 0.2189 |
1.0882 | 6.78 | 6000 | 0.2603 | 0.2168 |
1.0881 | 7.06 | 6250 | 0.2540 | 0.2293 |
1.0378 | 7.34 | 6500 | 0.2614 | 0.2193 |
1.0397 | 7.63 | 6750 | 0.2707 | 0.2104 |
1.0296 | 7.91 | 7000 | 0.2483 | 0.2119 |
1.0249 | 8.19 | 7250 | 0.2483 | 0.2047 |
1.013 | 8.47 | 7500 | 0.2487 | 0.2042 |
1.0064 | 8.76 | 7750 | 0.2456 | 0.2016 |
1.0668 | 9.04 | 8000 | 0.2397 | 0.1995 |
1.0129 | 9.32 | 8250 | 0.2374 | 0.1994 |
1.0164 | 9.6 | 8500 | 0.2206 | 0.1992 |
0.975 | 9.89 | 8750 | 0.2247 | 0.1973 |
0.9849 | 10.17 | 9000 | 0.2325 | 0.1953 |
0.9826 | 10.45 | 9250 | 0.2301 | 0.1934 |
0.9835 | 10.73 | 9500 | 0.2192 | 0.1942 |
0.9676 | 11.02 | 9750 | 0.2266 | 0.1913 |
0.9627 | 11.3 | 10000 | 0.2193 | 0.1921 |
0.976 | 11.58 | 10250 | 0.2309 | 0.1882 |
0.969 | 11.86 | 10500 | 0.2268 | 0.1886 |
0.9611 | 12.15 | 10750 | 0.2322 | 0.1863 |
0.9397 | 12.43 | 11000 | 0.2197 | 0.1844 |
0.9601 | 12.71 | 11250 | 0.2211 | 0.1871 |
0.9718 | 12.99 | 11500 | 0.2079 | 0.1898 |
0.9347 | 13.28 | 11750 | 0.2054 | 0.1843 |
0.9377 | 13.56 | 12000 | 0.2031 | 0.1842 |
0.934 | 13.84 | 12250 | 0.2059 | 0.1806 |
0.9295 | 14.12 | 12500 | 0.2122 | 0.1861 |
0.935 | 14.41 | 12750 | 0.2072 | 0.1787 |
0.9021 | 14.69 | 13000 | 0.2105 | 0.1781 |
0.9193 | 14.97 | 13250 | 0.2035 | 0.1786 |
0.9214 | 15.25 | 13500 | 0.2035 | 0.1766 |
0.9048 | 15.54 | 13750 | 0.1964 | 0.1758 |
0.9006 | 15.82 | 14000 | 0.1984 | 0.1757 |
0.9027 | 16.1 | 14250 | 0.2022 | 0.1743 |
0.9083 | 16.38 | 14500 | 0.1969 | 0.1744 |
0.9761 | 16.67 | 14750 | 0.1963 | 0.1728 |
0.9311 | 16.95 | 15000 | 0.1960 | 0.1737 |
0.886 | 17.23 | 15250 | 0.1929 | 0.1726 |
0.8969 | 17.51 | 15500 | 0.1928 | 0.1734 |
0.9084 | 17.8 | 15750 | 0.1937 | 0.1713 |
0.8795 | 18.08 | 16000 | 0.1978 | 0.1709 |
0.8883 | 18.36 | 16250 | 0.1956 | 0.1703 |
0.8901 | 18.64 | 16500 | 0.1933 | 0.1705 |
0.8922 | 18.93 | 16750 | 0.1962 | 0.1711 |
0.8765 | 19.21 | 17000 | 0.1962 | 0.1711 |
0.8992 | 19.49 | 17250 | 0.1965 | 0.1703 |
0.8778 | 19.77 | 17500 | 0.1957 | 0.1699 |
Framework versions
- Transformers 4.17.0.dev0
- Pytorch 1.10.0+cu113
- Datasets 1.18.1
- Tokenizers 0.11.0