---
library_name: transformers
language:
  - zul
license: mit
base_model: facebook/w2v-bert-2.0
tags:
  - generated_from_trainer
datasets:
  - NCHLT/ZULU
metrics:
  - wer
model-index:
  - name: facebook/w2v-bert-2.0
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: NCHLT
          type: NCHLT/ZULU
        metrics:
          - name: Wer
            type: wer
            value: 0.5654182709135457
---

facebook/w2v-bert-2.0

This model is a fine-tuned version of facebook/w2v-bert-2.0 on the NCHLT isiZulu (NCHLT/ZULU) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5160
  • WER: 0.5654
  • CER: 0.1543
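
To try the checkpoint, the snippet below is a minimal inference sketch using the standard transformers CTC API. The repo id is a placeholder (this card does not state the published model path), and sample.wav stands in for any 16 kHz isiZulu recording.

```python
# Minimal inference sketch, assuming the fine-tuned checkpoint is on the Hub.
# "your-username/w2v-bert-2.0-nchlt-zulu" is a placeholder repo id, not the
# actual model path.
import torch
import librosa
from transformers import AutoModelForCTC, AutoProcessor

model_id = "your-username/w2v-bert-2.0-nchlt-zulu"  # placeholder
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCTC.from_pretrained(model_id)

# W2V-BERT 2.0 expects 16 kHz mono audio.
speech, _ = librosa.load("sample.wav", sr=16_000)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```

The same transcription can also be obtained in one call with pipeline("automatic-speech-recognition", model=model_id).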

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
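
For reference, these settings map onto a transformers TrainingArguments object roughly as follows. Only the listed values come from this card; output_dir and the evaluation/logging strategies are illustrative assumptions.

```python
# Hedged reconstruction of the training configuration listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="w2v-bert-2.0-nchlt-zulu",  # assumption, not stated in the card
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # 16 * 2 = 32 effective train batch size
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
    eval_strategy="epoch",     # assumption, consistent with the per-epoch table below
    logging_strategy="epoch",  # assumption
)
```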

Training results

| Training Loss | Epoch | Step  | Validation Loss | WER    | CER    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|
| 0.9749        | 1.0   | 569   | 0.2271          | 0.2541 | 0.0418 |
| 0.1275        | 2.0   | 1138  | 0.1601          | 0.1873 | 0.0314 |
| 0.0836        | 3.0   | 1707  | 0.1292          | 0.1541 | 0.0250 |
| 0.0618        | 4.0   | 2276  | 0.1122          | 0.1208 | 0.0213 |
| 0.0478        | 5.0   | 2845  | 0.1032          | 0.1068 | 0.0190 |
| 0.0384        | 6.0   | 3414  | 0.1039          | 0.1036 | 0.0187 |
| 0.0315        | 7.0   | 3983  | 0.0911          | 0.0882 | 0.0166 |
| 0.0259        | 8.0   | 4552  | 0.1015          | 0.1015 | 0.0187 |
| 0.0219        | 9.0   | 5121  | 0.0971          | 0.0874 | 0.0162 |
| 0.0188        | 10.0  | 5690  | 0.0918          | 0.0873 | 0.0160 |
| 0.0168        | 11.0  | 6259  | 0.0931          | 0.0826 | 0.0155 |
| 0.015         | 12.0  | 6828  | 0.0983          | 0.0839 | 0.0159 |
| 0.014         | 13.0  | 7397  | 0.1054          | 0.0878 | 0.0160 |
| 0.0117        | 14.0  | 7966  | 0.1033          | 0.0787 | 0.0150 |
| 0.0099        | 15.0  | 8535  | 0.1068          | 0.0791 | 0.0150 |
| 0.011         | 16.0  | 9104  | 0.1013          | 0.0786 | 0.0151 |
| 0.0093        | 17.0  | 9673  | 0.1083          | 0.0805 | 0.0158 |
| 0.0085        | 18.0  | 10242 | 0.1012          | 0.0747 | 0.0144 |
| 0.0071        | 19.0  | 10811 | 0.0971          | 0.0743 | 0.0145 |
| 0.0063        | 20.0  | 11380 | 0.0927          | 0.0726 | 0.0141 |
| 0.0063        | 21.0  | 11949 | 0.0992          | 0.0737 | 0.0139 |
| 0.0067        | 22.0  | 12518 | 0.0989          | 0.0788 | 0.0144 |
| 0.0069        | 23.0  | 13087 | 0.1005          | 0.0691 | 0.0133 |
| 0.0058        | 24.0  | 13656 | 0.1197          | 0.0724 | 0.0144 |
| 0.0055        | 25.0  | 14225 | 0.0939          | 0.0720 | 0.0135 |
| 0.0043        | 26.0  | 14794 | 0.0982          | 0.0655 | 0.0130 |
| 0.0053        | 27.0  | 15363 | 0.0941          | 0.0708 | 0.0139 |
| 0.0052        | 28.0  | 15932 | 0.0985          | 0.0685 | 0.0131 |
| 0.0043        | 29.0  | 16501 | 0.1055          | 0.0752 | 0.0138 |
| 0.005         | 30.0  | 17070 | 0.0948          | 0.0653 | 0.0133 |
| 0.0037        | 31.0  | 17639 | 0.0967          | 0.0658 | 0.0127 |
| 0.0045        | 32.0  | 18208 | 0.0936          | 0.0680 | 0.0133 |
| 0.003         | 33.0  | 18777 | 0.1062          | 0.0621 | 0.0126 |
| 0.0036        | 34.0  | 19346 | 0.1002          | 0.0737 | 0.0137 |
| 0.0035        | 35.0  | 19915 | 0.1091          | 0.0695 | 0.0137 |
| 0.0027        | 36.0  | 20484 | 0.1061          | 0.0684 | 0.0134 |
| 0.0038        | 37.0  | 21053 | 0.0839          | 0.0623 | 0.0125 |
| 0.0025        | 38.0  | 21622 | 0.1079          | 0.0669 | 0.0133 |
| 0.0029        | 39.0  | 22191 | 0.0898          | 0.0625 | 0.0126 |
| 0.0029        | 40.0  | 22760 | 0.0941          | 0.0630 | 0.0124 |
| 0.0023        | 41.0  | 23329 | 0.1058          | 0.0640 | 0.0124 |
| 0.0021        | 42.0  | 23898 | 0.0955          | 0.0589 | 0.0116 |
| 0.0022        | 43.0  | 24467 | 0.0965          | 0.0647 | 0.0126 |
| 0.002         | 44.0  | 25036 | 0.0939          | 0.0605 | 0.0120 |
| 0.0016        | 45.0  | 25605 | 0.0973          | 0.0599 | 0.0123 |
| 0.0015        | 46.0  | 26174 | 0.1069          | 0.0604 | 0.0123 |
| 0.0012        | 47.0  | 26743 | 0.0997          | 0.0564 | 0.0116 |
| 0.0011        | 48.0  | 27312 | 0.0882          | 0.0559 | 0.0111 |
| 0.0011        | 49.0  | 27881 | 0.1167          | 0.0574 | 0.0119 |
| 0.002         | 50.0  | 28450 | 0.0950          | 0.0538 | 0.0110 |
| 0.0015        | 51.0  | 29019 | 0.0916          | 0.0548 | 0.0112 |
| 0.001         | 52.0  | 29588 | 0.0996          | 0.0591 | 0.0119 |
| 0.0008        | 53.0  | 30157 | 0.0978          | 0.0575 | 0.0117 |
| 0.001         | 54.0  | 30726 | 0.0967          | 0.0551 | 0.0113 |
| 0.001         | 55.0  | 31295 | 0.0948          | 0.0577 | 0.0115 |
| 0.0013        | 56.0  | 31864 | 0.0963          | 0.0563 | 0.0115 |
| 0.0011        | 57.0  | 32433 | 0.1028          | 0.0593 | 0.0121 |
| 0.0008        | 58.0  | 33002 | 0.1064          | 0.0578 | 0.0118 |
| 0.0011        | 59.0  | 33571 | 0.1034          | 0.0573 | 0.0115 |
| 0.0007        | 60.0  | 34140 | 0.1102          | 0.0561 | 0.0115 |
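
WER and CER figures like those in the table can be computed for new predictions with the Hugging Face evaluate library, as sketched below; the example strings are illustrative placeholders, not NCHLT data.

```python
# Minimal WER/CER computation sketch using the `evaluate` library.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

predictions = ["ngiyabonga kakhulu"]  # model transcriptions (illustrative)
references = ["ngiyabonga kakhulu"]   # ground-truth transcripts (illustrative)

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```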

Framework versions

  • Transformers 4.46.3
  • PyTorch 2.1.0+cu118
  • Datasets 3.1.0
  • Tokenizers 0.20.3