w2v-bert-cv-grain-lg_cv_only
This model is a fine-tuned version of facebook/w2v-bert-2.0 on the common_voice_17_0 dataset.
It achieves the following results on the evaluation set (the values match the last logged epoch, epoch 13, in the training results):
- Loss: inf
- Wer: 0.5800
- Cer: 0.1379
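For readers unfamiliar with the two metrics, the following is a minimal sketch of the standard definitions: word error rate (WER) and character error rate (CER) are the edit distance between reference and hypothesis, divided by the reference length in words or characters. The card does not say which library or text normalization produced the numbers above, so the function names here (`edit_distance`, `wer`, `cer`) are illustrative and not the evaluation code used for this model.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `ref` into `hyp`.
    """
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate for a single utterance (whitespace tokenization)."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate for a single utterance (spaces count as characters)."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    print(f"WER: {wer(ref, hyp):.4f}")
    print(f"CER: {cer(ref, hyp):.4f}")
```

A corpus-level score is usually computed by summing the edit counts and the reference lengths over all utterances before dividing, which generally differs from averaging per-utterance scores.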
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 100
- mixed_precision_training: Native AMP
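The relationship between some of these values can be checked with a little arithmetic. The sketch below recomputes the total train batch size and evaluates a linear learning-rate decay at a given step. It assumes a single device and zero warmup steps, since neither is listed in this card, and it takes 2221 optimizer steps per epoch from the training results. The helper names are hypothetical.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 2
NUM_EPOCHS = 100

# Taken from the training results: step 2221 is logged at epoch 1.0.
STEPS_PER_EPOCH = 2221

# Assumptions (not stated in the card): one device, no warmup.
NUM_DEVICES = 1
WARMUP_STEPS = 0


def effective_batch_size(per_device, accumulation_steps, num_devices):
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE,
                               GRADIENT_ACCUMULATION_STEPS,
                               NUM_DEVICES))
    print("planned optimizer steps:", total_steps)
    # Learning rate at the last logged step under the assumptions above.
    print("lr at step 28873:",
          linear_lr(28873, total_steps, LEARNING_RATE, WARMUP_STEPS))
```

Under these assumptions the schedule has barely started to decay by the last logged step, because only 13 of the 100 configured epochs appear in the results.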
Training results
| Training Loss | Epoch | Step  | Validation Loss | Wer    | Cer    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|
| 0.5013        | 1.0   | 2221  | inf             | 0.2789 | 0.0724 |
| 0.299         | 2.0   | 4442  | inf             | 0.2501 | 0.0648 |
| 0.2554        | 3.0   | 6663  | inf             | 0.2435 | 0.0685 |
| 0.2411        | 4.0   | 8884  | inf             | 0.2447 | 0.0648 |
| 0.2886        | 5.0   | 11105 | inf             | 0.2506 | 0.0654 |
| 0.3923        | 6.0   | 13326 | inf             | 0.4237 | 0.1108 |
| 2.1779        | 7.0   | 15547 | inf             | 0.5612 | 0.1439 |
| 4.5629        | 8.0   | 17768 | inf             | 0.5152 | 0.1379 |
| 2.236         | 9.0   | 19989 | inf             | 0.5787 | 0.1384 |
| 2.2033        | 10.0  | 22210 | inf             | 0.5742 | 0.1375 |
| 2.2047        | 11.0  | 24431 | inf             | 0.5784 | 0.1382 |
| 2.2057        | 12.0  | 26652 | inf             | 0.5805 | 0.1390 |
| 2.2076        | 13.0  | 28873 | inf             | 0.5800 | 0.1379 |
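Because the validation loss is logged as inf at every epoch, it cannot be used to compare checkpoints; WER and CER are the only usable signals in these results. The sketch below, with values copied from the results above, finds the epoch with the lowest WER and the first epoch where WER jumps well above the best value seen so far. The 1.5x threshold and the helper names are arbitrary choices for illustration, and the card does not state which epoch the published weights correspond to.

```python
# (epoch, training_loss, wer, cer) copied from the training results.
# Validation loss is omitted because it is inf for every epoch.
ROWS = [
    (1, 0.5013, 0.2789, 0.0724),
    (2, 0.299, 0.2501, 0.0648),
    (3, 0.2554, 0.2435, 0.0685),
    (4, 0.2411, 0.2447, 0.0648),
    (5, 0.2886, 0.2506, 0.0654),
    (6, 0.3923, 0.4237, 0.1108),
    (7, 2.1779, 0.5612, 0.1439),
    (8, 4.5629, 0.5152, 0.1379),
    (9, 2.236, 0.5787, 0.1384),
    (10, 2.2033, 0.5742, 0.1375),
    (11, 2.2047, 0.5784, 0.1382),
    (12, 2.2057, 0.5805, 0.1390),
    (13, 2.2076, 0.5800, 0.1379),
]

WER_INDEX = 2  # position of the WER value inside each row


def best_epoch(rows, metric_index):
    """Epoch with the lowest value of the chosen metric (first one on ties)."""
    return min(rows, key=lambda row: row[metric_index])[0]


def first_regression(rows, factor=1.5):
    """First epoch whose WER exceeds `factor` times the best WER so far."""
    best = float("inf")
    for epoch, _loss, wer, _cer in rows:
        if wer > factor * best:
            return epoch
        best = min(best, wer)
    return None


if __name__ == "__main__":
    print("lowest WER at epoch:", best_epoch(ROWS, WER_INDEX))
    print("WER first regresses at epoch:", first_regression(ROWS))
```

On these numbers the lowest WER occurs early in training, and the headline metrics at the top of the card come from a later epoch after both training loss and WER had risen sharply.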
Framework versions
- Transformers 4.46.1
- Pytorch 2.1.0+cu118
- Datasets 3.1.0
- Tokenizers 0.20.1