codewithdark/WhisperLiveSubs

Automatic Speech Recognition

codewithdark committed on Sep 4, 2024

Commit 046c1ca · verified · 1 parent: de2c3fb

Update README.md

Files changed (1)

README.md +1 -1

@@ -35,7 +35,7 @@ model = WhisperForConditionalGeneration.from_pretrained("codewithdark/WhisperLiv
 ```
 ### Training Data
-The model was fine-tuned on the Mozilla Common Voice dataset, specifically the Urdu subset. The dataset consists of approximately [number of hours] of transcribed Urdu speech.
 #### Preprocessing
 The audio was resampled to 16kHz, and text was tokenized using the Whisper tokenizer configured for Urdu.

 ```
 ### Training Data
+The model was fine-tuned on the Mozilla Common Voice dataset, specifically the Urdu subset. The dataset consists of approximately 141 hr of transcribed Urdu speech.
 #### Preprocessing
 The audio was resampled to 16kHz, and text was tokenized using the Whisper tokenizer configured for Urdu.
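
The preprocessing line in this hunk says the audio was resampled to 16 kHz but does not say how. The following is a minimal sketch of what resampling means, not the author's pipeline: `resample_linear` is a hypothetical helper, the 48 kHz source rate is an assumption, and plain linear interpolation stands in for a proper resampler, which would also low-pass filter the signal to avoid aliasing.

```python
import math

TARGET_RATE = 16_000  # rate stated in the README's preprocessing note


def resample_linear(samples, orig_rate, target_rate=TARGET_RATE):
    """Resample a mono signal by linear interpolation (illustrative only).

    Hypothetical helper: no anti-aliasing filter is applied, so it is only
    reasonable for signals well below the new Nyquist frequency.
    """
    if orig_rate == target_rate or not samples:
        return list(samples)
    n_out = int(round(len(samples) * target_rate / orig_rate))
    step = orig_rate / target_rate  # input samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)   # input sample at or before pos
        hi = min(lo + 1, last)     # next input sample, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


# One second of a 440 Hz tone at an assumed source rate of 48 kHz.
SOURCE_RATE = 48_000
tone = [math.sin(2 * math.pi * 440 * t / SOURCE_RATE) for t in range(SOURCE_RATE)]
resampled = resample_linear(tone, SOURCE_RATE)
```

With a 48 kHz source, the ratio is exactly 3, so each output sample coincides with every third input sample and one second of audio becomes 16,000 samples. For real speech data, a library resampler with filtering would be the safer choice.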