	---
	language:
	- sq
	license: apache-2.0
	tags:
	- generated_from_trainer
	base_model: openai/whisper-medium
	datasets:
	- Kushtrim/common_voice_18_sq
	metrics:
	- wer
	model-index:
	- name: Whisper Medium SQ
	  results:
	  - task:
	      type: automatic-speech-recognition
	      name: Automatic Speech Recognition
	    dataset:
	      name: Common Voice 18.0
	      type: Kushtrim/common_voice_18_sq
	      args: 'config: sq, split: test'
	    metrics:
	    - type: wer
	      value: 5.801801801801801
	      name: Wer
	---


	# Whisper Medium SQ

	[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/KushtrimVisoka/datasets/blob/main/whisper-medium-sq.ipynb)

	This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the [Mozilla Common Voice 18.0](https://huggingface.co/datasets/Kushtrim/common_voice_18_sq) dataset.

	It achieves the following results on the evaluation set:
	- Loss: 0.1396
	- Wer: 5.8018
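The card does not state how the WER figure was computed or whether any text normalization was applied. As a minimal sketch of the usual definition (word-level edit distance divided by the number of reference words), and assuming the reported value is a percentage, the metric can be reproduced in plain Python. The function name `word_error_rate` is illustrative, and the two strings are the closing words of the sample in the comparison table further down.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


reference = "shpallim Kosovën shtet të pavarur dhe demokratik"
hypothesis = "shpallim Kosovën shtet të pavarur dhe demokratike"

# One substituted word out of seven reference words.
print(f"WER: {100 * word_error_rate(reference, hypothesis):.2f}%")  # WER: 14.29%
```

Punctuation and casing count as differences in this sketch, so scores depend heavily on whatever normalization is applied before comparison.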

	## Model description

	You can read more about the model in the [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) model card.

	## Performance and Limitations

	Whisper Medium SQ performs slightly better than the pretrained openai/whisper-medium, reaching a WER of 5.8018 on the evaluation set, but the results are not yet optimal. The main obstacle is the shortage of high-quality Albanian speech data. This model is meant as an example of how much community participation and voice donations can improve performance: top-tier results require a larger and more diverse dataset. You can contribute by donating your voice on the [Mozilla Common Voice website](https://commonvoice.mozilla.org/sq/speak). The table below compares both models on one sample.

	| Original Text | OpenAI Whisper Medium Transcription | Whisper Medium SQ Transcription |
	|:-------------:|:----------------------------------:|:------------------------------:|
	| Duke nderuar të gjithë burrat dhe gratë që bënë sakrifica të mëdha për të ndërtuar një të ardhme më të mirë për Kosovën, 1. Ne, udhëheqësit e popullit tonë, të zgjedhur në mënyrë demokratike, nëpërmjet kësaj Deklarate shpallim Kosovën shtet të pavarur dhe demokratik. | Duken deruar te gjith burra dhe grat, qe ben sakrifisa te mdha per te ndërtua një te arël me me te mir per Kosoven. Ne, ne. Ne, u dheqsi te popoli ton te zjeldër në menjur demokratike, ne prëmjet kësaj deklerate, shpalim Kosoven shtet te pavarur dhe demokratik. | Duk e nderuar, të gjithë burët dhe gratë, që bënë sakrifica të mdha për të ndërtuar një të arëdhme më të mirë për Kosovën. Ne, ne, udhëheqësit e popullit tonë, të zgjedhën në mënyrë demokratike, nëpërmjet kësoj deklarate shpallim Kosovën shtet të pavarur dhe demokratike. |

	## Example

	```bash
	wget -N https://github.com/KushtrimVisoka/datasets/raw/2d88b7aca989001f1b870ad732295e689bee86d4/Deklarata-e-Pavare%CC%88sise%CC%88-se%CC%88-Kosove%CC%88s.mp3
	```

	```python
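	from transformers import pipeline

	# Load the fine-tuned checkpoint from the Hugging Face Hub.
	transcriber = pipeline("automatic-speech-recognition", model="Kushtrim/whisper-medium-sq")

	# Force Albanian transcription instead of relying on language detection.
	result = transcriber("Deklarata-e-Pavarësisë-së-Kosovës.mp3", generate_kwargs={'task': 'transcribe', 'language': 'sq'})
	print(result["text"])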
	```

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 1e-05
	- train_batch_size: 16
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 500
	- training_steps: 5000
	- mixed_precision_training: Native AMP
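To make the scheduler settings concrete, here is a rough sketch of the learning rate these values would produce at a given step. It assumes the `linear` scheduler ramps up from zero over the warmup steps and then decays linearly to zero at the final training step; the function name `linear_schedule_lr` is hypothetical and not part of any library.

```python
def linear_schedule_lr(
    step: int,
    base_lr: float = 1e-5,     # learning_rate from the list above
    warmup_steps: int = 500,   # lr_scheduler_warmup_steps
    total_steps: int = 5000,   # training_steps
) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


for step in (0, 250, 500, 2750, 5000):
    print(f"step {step:>4}: lr = {linear_schedule_lr(step):.2e}")
```

Under these assumptions the peak rate of 1e-05 is reached at step 500 and halves again by step 2750, the midpoint of the decay phase.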

	### Training results

	| Training Loss | Epoch   | Step | Validation Loss | Wer     |
	|:-------------:|:-------:|:----:|:---------------:|:-------:|
	| 0.0241        | 4.6729  | 1000 | 0.1425          | 9.4414  |
	| 0.0027        | 9.3458  | 2000 | 0.1281          | 13.2973 |
	| 0.0015        | 14.0187 | 3000 | 0.1322          | 6.1622  |
	| 0.0001        | 18.6916 | 4000 | 0.1383          | 5.7658  |
	| 0.0001        | 23.3645 | 5000 | 0.1396          | 5.8018  |
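The table can be read programmatically to compare checkpoints. The sketch below copies the rows above and picks the best step by WER and by validation loss. It also estimates the number of optimizer steps per epoch, and from that a rough training-set size, under the assumption that training ran on a single device without gradient accumulation (the card does not state either).

```python
# (step, epoch, validation_loss, wer) rows copied from the training results table.
rows = [
    (1000, 4.6729, 0.1425, 9.4414),
    (2000, 9.3458, 0.1281, 13.2973),
    (3000, 14.0187, 0.1322, 6.1622),
    (4000, 18.6916, 0.1383, 5.7658),
    (5000, 23.3645, 0.1396, 5.8018),
]

TRAIN_BATCH_SIZE = 16  # train_batch_size from the hyperparameter list

best_by_wer = min(rows, key=lambda r: r[3])
best_by_loss = min(rows, key=lambda r: r[2])

# Steps per epoch follow from the step/epoch ratio of any row.
steps_per_epoch = round(rows[0][0] / rows[0][1])

# Upper bound on training examples; the last batch of an epoch may be partial.
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

print(f"lowest WER: step {best_by_wer[0]} ({best_by_wer[3]})")
print(f"lowest validation loss: step {best_by_loss[0]} ({best_by_loss[2]})")
print(f"steps per epoch: {steps_per_epoch}, at most {approx_train_examples} examples")
```

The lowest WER and the lowest validation loss fall on different checkpoints here, so the choice of checkpoint depends on which metric matters for the use case.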