---
library_name: transformers
language:
  - bem
license: cc-by-nc-4.0
base_model: facebook/mms-1b-all
tags:
  - generated_from_trainer
datasets:
  - BIG_C/Bemba
metrics:
  - wer
model-index:
  - name: facebook/mms-1b-all
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: BIG_C
          type: BIG_C/Bemba
        metrics:
          - name: Wer
            type: wer
            value: 0.4668925293764474
---

# facebook/mms-1b-all

This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on the BIG_C dataset. It achieves the following results on the evaluation set:

- Loss: 0.4083
- Model Preparation Time: 0.011
- Wer: 0.4669
- Cer: 0.0879
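
The checkpoint can be loaded like any CTC speech-recognition model in `transformers`. Below is a minimal usage sketch; the repo id is a placeholder for this checkpoint's actual Hugging Face id, and `audio.wav` is a hypothetical path to a Bemba recording:

```python
from transformers import pipeline

# Placeholder repo id: replace with the actual Hugging Face id of this checkpoint.
asr = pipeline("automatic-speech-recognition", model="your-username/mms-1b-all-bem")

# MMS/Wav2Vec2 models expect 16 kHz mono audio; for file inputs the pipeline
# decodes and resamples via ffmpeg.
result = asr("audio.wav")  # hypothetical input file
print(result["text"])
```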

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0003
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 100
- mixed_precision_training: Native AMP
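
Expressed as `transformers` `TrainingArguments`, these settings would look roughly as follows. This is a sketch rather than the exact training script; `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mms-1b-all-bem",    # placeholder output directory
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,  # 4 x 4 = total train batch size of 16
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=100,
    fp16=True,                      # "Native AMP" mixed-precision training
)
```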

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Model Preparation Time | Wer    | Cer    |
|:-------------:|:-----:|:-----:|:---------------:|:----------------------:|:------:|:------:|
| 2.6526        | 1.0   | 310   | 0.6127          | 0.011                  | 0.5519 | 0.1287 |
| 0.7346        | 2.0   | 620   | 0.5850          | 0.011                  | 0.5399 | 0.1242 |
| 0.7091        | 3.0   | 930   | 0.5726          | 0.011                  | 0.5136 | 0.1200 |
| 0.6967        | 4.0   | 1240  | 0.5618          | 0.011                  | 0.5028 | 0.1189 |
| 0.6769        | 5.0   | 1550  | 0.5520          | 0.011                  | 0.4967 | 0.1176 |
| 0.6626        | 6.0   | 1860  | 0.5432          | 0.011                  | 0.4935 | 0.1158 |
| 0.6428        | 7.0   | 2170  | 0.5231          | 0.011                  | 0.4930 | 0.1178 |
| 0.6229        | 8.0   | 2480  | 0.5320          | 0.011                  | 0.4798 | 0.1134 |
| 0.6081        | 9.0   | 2790  | 0.5168          | 0.011                  | 0.4842 | 0.1155 |
| 0.5939        | 10.0  | 3100  | 0.5067          | 0.011                  | 0.4835 | 0.1171 |
| 0.5807        | 11.0  | 3410  | 0.5217          | 0.011                  | 0.4682 | 0.1106 |
| 0.5705        | 12.0  | 3720  | 0.5030          | 0.011                  | 0.4797 | 0.1172 |
| 0.5584        | 13.0  | 4030  | 0.4976          | 0.011                  | 0.4689 | 0.1108 |
| 0.5512        | 14.0  | 4340  | 0.4981          | 0.011                  | 0.4766 | 0.1188 |
| 0.5444        | 15.0  | 4650  | 0.5096          | 0.011                  | 0.4594 | 0.1090 |
| 0.5333        | 16.0  | 4960  | 0.4995          | 0.011                  | 0.4641 | 0.1111 |
| 0.5204        | 17.0  | 5270  | 0.5116          | 0.011                  | 0.4555 | 0.1086 |
| 0.513         | 18.0  | 5580  | 0.4998          | 0.011                  | 0.4590 | 0.1121 |
| 0.5049        | 19.0  | 5890  | 0.4997          | 0.011                  | 0.4557 | 0.1109 |
| 0.5011        | 20.0  | 6200  | 0.4960          | 0.011                  | 0.4718 | 0.1198 |
| 0.4888        | 21.0  | 6510  | 0.5026          | 0.011                  | 0.4579 | 0.1126 |
| 0.491         | 22.0  | 6820  | 0.5145          | 0.011                  | 0.4474 | 0.1071 |
| 0.4804        | 23.0  | 7130  | 0.5026          | 0.011                  | 0.4510 | 0.1053 |
| 0.4727        | 24.0  | 7440  | 0.5218          | 0.011                  | 0.4416 | 0.1052 |
| 0.4666        | 25.0  | 7750  | 0.4990          | 0.011                  | 0.4593 | 0.1148 |
| 0.4614        | 26.0  | 8060  | 0.5103          | 0.011                  | 0.4446 | 0.1053 |
| 0.4546        | 27.0  | 8370  | 0.5019          | 0.011                  | 0.4479 | 0.1086 |
| 0.45          | 28.0  | 8680  | 0.4946          | 0.011                  | 0.4485 | 0.1086 |
| 0.4443        | 29.0  | 8990  | 0.4997          | 0.011                  | 0.4389 | 0.1051 |
| 0.4369        | 30.0  | 9300  | 0.5063          | 0.011                  | 0.4376 | 0.1045 |
| 0.4302        | 31.0  | 9610  | 0.5071          | 0.011                  | 0.4448 | 0.1062 |
| 0.4227        | 32.0  | 9920  | 0.5074          | 0.011                  | 0.4435 | 0.1096 |
| 0.4226        | 33.0  | 10230 | 0.5092          | 0.011                  | 0.4477 | 0.1110 |
| 0.4191        | 34.0  | 10540 | 0.5107          | 0.011                  | 0.4519 | 0.1109 |
| 0.4128        | 35.0  | 10850 | 0.5162          | 0.011                  | 0.4412 | 0.1068 |
| 0.408         | 36.0  | 11160 | 0.5201          | 0.011                  | 0.4388 | 0.1074 |
| 0.4022        | 37.0  | 11470 | 0.5138          | 0.011                  | 0.4436 | 0.1088 |
| 0.3979        | 38.0  | 11780 | 0.5331          | 0.011                  | 0.4386 | 0.1062 |
| 0.3937        | 39.0  | 12090 | 0.5225          | 0.011                  | 0.4446 | 0.1124 |
| 0.3905        | 40.0  | 12400 | 0.5200          | 0.011                  | 0.4355 | 0.1065 |
| 0.3846        | 41.0  | 12710 | 0.5115          | 0.011                  | 0.4394 | 0.1092 |
| 0.3827        | 42.0  | 13020 | 0.5169          | 0.011                  | 0.4458 | 0.1131 |
| 0.3797        | 43.0  | 13330 | 0.5237          | 0.011                  | 0.4387 | 0.1088 |
| 0.3729        | 44.0  | 13640 | 0.5431          | 0.011                  | 0.4318 | 0.1057 |
| 0.3694        | 45.0  | 13950 | 0.5375          | 0.011                  | 0.4318 | 0.1060 |
| 0.3656        | 46.0  | 14260 | 0.5301          | 0.011                  | 0.4409 | 0.1099 |
| 0.3618        | 47.0  | 14570 | 0.5422          | 0.011                  | 0.4460 | 0.1146 |
| 0.3572        | 48.0  | 14880 | 0.5404          | 0.011                  | 0.4395 | 0.1084 |
| 0.3523        | 49.0  | 15190 | 0.5442          | 0.011                  | 0.4421 | 0.1112 |
| 0.3514        | 50.0  | 15500 | 0.5561          | 0.011                  | 0.4345 | 0.1072 |
| 0.3473        | 51.0  | 15810 | 0.5549          | 0.011                  | 0.4393 | 0.1113 |
| 0.3443        | 52.0  | 16120 | 0.5469          | 0.011                  | 0.4424 | 0.1127 |
| 0.3412        | 53.0  | 16430 | 0.5624          | 0.011                  | 0.4529 | 0.1165 |
| 0.3343        | 54.0  | 16740 | 0.5548          | 0.011                  | 0.4491 | 0.1143 |
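
The Wer and Cer columns can be computed for any set of decoded predictions with the Hugging Face `evaluate` library. A minimal sketch follows; the example strings are hypothetical placeholders, and the metrics additionally require the `jiwer` package:

```python
import evaluate

wer_metric = evaluate.load("wer")  # word error rate
cer_metric = evaluate.load("cer")  # character error rate

predictions = ["example decoded transcript"]  # hypothetical model outputs
references = ["example reference transcript"] # hypothetical ground truth

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```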

### Framework versions

- Transformers 4.47.0.dev0
- Pytorch 2.1.0+cu118
- Datasets 3.1.0
- Tokenizers 0.20.1