---
library_name: transformers
license: mit
datasets:
- AlienKevin/mixed_cantonese_and_english_speech
- mozilla-foundation/common_voice_17_0
metrics:
- cer
base_model:
- openai/whisper-small
---

Character error rate (CER): 15.4%

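This card reports CER but does not say how it was computed. As a reminder of the standard definition only, here is a minimal sketch: total character-level edit distance divided by total reference characters. It is not the evaluation script behind the figure above. The function names are hypothetical, and it assumes no text normalization (spaces and punctuation count as characters).

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: summed edits over summed reference length."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


# One substituted character out of six reference characters.
score = cer(["我今日好開心"], ["我今日好開森"])
```

Because Cantonese text has no whitespace word boundaries, a character-level metric is a natural fit for this kind of mixed Cantonese and English speech.
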
transformers version: 4.46.3

Train Args: <br>
per_device_train_batch_size=16, <br>
gradient_accumulation_steps=1, <br>
learning_rate=1e-5, <br>
gradient_checkpointing=True, <br>
per_device_eval_batch_size=64, <br>
generation_max_length=225

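The training arguments listed in this card are all valid keyword arguments of `Seq2SeqTrainingArguments` in transformers. The sketch below shows one way they could be passed in. It is not the author's training script: `output_dir` and `predict_with_generate=True` are assumptions added for illustration, as is the idea that all four GPUs listed under Hardware ran data-parallel.

```python
# Values copied verbatim from the "Train Args" list in this card.
TRAIN_ARGS = {
    "per_device_train_batch_size": 16,
    "gradient_accumulation_steps": 1,
    "learning_rate": 1e-5,
    "gradient_checkpointing": True,
    "per_device_eval_batch_size": 64,
    "generation_max_length": 225,
}

# Assumption: all four V100s were used together for data-parallel training.
NUM_GPUS = 4


def effective_train_batch_size(args, num_gpus):
    """Samples consumed per optimizer step across all devices."""
    return (
        args["per_device_train_batch_size"]
        * args["gradient_accumulation_steps"]
        * num_gpus
    )


def build_training_arguments(output_dir):
    """Build Seq2SeqTrainingArguments from the values above.

    transformers is imported lazily so the rest of this snippet runs without it.
    """
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,       # hypothetical path, not given in this card
        predict_with_generate=True,  # assumption: needed for generation_max_length to apply at eval
        **TRAIN_ARGS,
    )


global_batch = effective_train_batch_size(TRAIN_ARGS, NUM_GPUS)
```

Under the four-GPU assumption, the effective training batch is 16 × 1 × 4 = 64 samples per optimizer step.
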
Hardware: <br>
4 × NVIDIA Tesla V100 (16 GB each)

FAQ:

1. If you run into tokenizer errors during inference, upgrade transformers to version 4.46.3 or later. The command below installs 4.46.3 exactly:

```bash
pip install --upgrade transformers==4.46.3
```
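
To check the installed version before upgrading, something like the following may help. It is a minimal sketch that uses only the standard library. The helper names are hypothetical, and the version parsing is deliberately simple: it compares the first three numeric components and ignores suffixes such as `.dev0`.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum version recommended in the FAQ above.
MIN_VERSION = (4, 46, 3)


def parse_release(version_string):
    """Turn '4.46.3' or '4.47.0.dev0' into a (major, minor, patch) tuple."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '4.46'
    return tuple(parts)


def transformers_is_recent_enough():
    """Return True if transformers is installed at MIN_VERSION or later."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return False
    return parse_release(installed) >= MIN_VERSION
```

If `transformers_is_recent_enough()` returns `False`, running the `pip` command above should resolve the version mismatch.
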