# Whisper Large v2 Uzbek Speech Recognition Model
This project contains a fine-tuned version of the Faster Whisper Large v2 model for Uzbek speech recognition. The model can be used to transcribe Uzbek audio files into text.
## Installation
1. Ensure you have Python 3.7 or higher installed.
2. Install the required libraries:

       pip install transformers datasets accelerate soundfile librosa torch
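
To confirm that the installation succeeded, a short check along the following lines may help. It is only a sketch: the helper name `missing_packages` is made up for illustration, and it assumes each package's import name matches its pip name, which holds for the six libraries in the install command.

```python
import importlib.util

# Packages from the install command; import names match the pip names here.
REQUIRED = ["transformers", "datasets", "accelerate", "soundfile", "librosa", "torch"]


def missing_packages(names):
    """Return the subset of `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

If anything is reported as missing, rerunning the `pip install` command for those packages should resolve it.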
## Usage
You can use the model with the following Python code:
```python
import torch
from transformers import pipeline, WhisperForConditionalGeneration, WhisperProcessor

# Use half precision on a GPU; fall back to full precision on CPU
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# Load the model and processor
model_name = "totetecdev/whisper-large-v2-uzbek-100steps"
model = WhisperForConditionalGeneration.from_pretrained(model_name, torch_dtype=torch_dtype)
model.to(device)
processor = WhisperProcessor.from_pretrained(model_name)

# Create the speech recognition pipeline
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    torch_dtype=torch_dtype,
    device=device,
)
# Transcribe an audio file
audio_file = "path/to/your/audio/file.wav" # Replace with the path to your audio file
result = pipe(audio_file)
print(result["text"])
```
## Example Usage
1. Prepare your audio file (it should be in WAV format).
2. Save the above code in a Python file (e.g., `transcribe.py`).
3. Update the `model_name` and `audio_file` variables in the code with your values.
4. Run the following command in your terminal or command prompt: `python transcribe.py`
5. The transcribed text will be displayed on the screen.
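
Before running the script, it can be useful to sanity-check the WAV file from step 1. The sketch below uses only Python's standard `wave` module, so it assumes an uncompressed PCM WAV file; the helper name `describe_wav` is hypothetical. Whisper models work on 16 kHz audio, and the `transformers` pipeline should resample file inputs on its own, so a different sample rate is not necessarily a problem.

```python
import wave


def describe_wav(path):
    """Return basic properties of an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": sample_rate,
            # Duration = number of audio frames divided by frames per second
            "duration_seconds": wav.getnframes() / sample_rate,
        }


# Example:
# info = describe_wav("path/to/your/audio/file.wav")
# print(info)
```

A file that `wave` cannot open is probably compressed or not a WAV file at all, which is worth knowing before starting a long transcription run.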
## Notes
- This model will perform best with Uzbek audio files.
- Longer audio files may require more processing time.
- GPU usage is recommended, but the model can also run on CPU.
- If you're using Google Colab, you can upload your audio file using:

      from google.colab import files
      uploaded = files.upload()
      audio_file = next(iter(uploaded))
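
For the note on longer recordings, one common approach with the `transformers` speech recognition pipeline is chunked inference, where the audio is split into windows that are transcribed in batches. The sketch below assumes `pipe` is the pipeline object created in the Usage section; the helper name `transcribe_long` and the default values are illustrative, not settings recommended by the model's author.

```python
def transcribe_long(pipe, audio_file, chunk_length_s=30, batch_size=8):
    """Transcribe a long recording by letting the pipeline process it in chunks.

    `pipe` is expected to be a transformers automatic-speech-recognition
    pipeline (or any callable with the same interface).
    """
    result = pipe(
        audio_file,
        chunk_length_s=chunk_length_s,  # window length in seconds
        batch_size=batch_size,          # number of chunks per forward pass
    )
    return result["text"].strip()


# Example (requires the `pipe` object from the Usage section):
# text = transcribe_long(pipe, "path/to/your/audio/file.wav")
# print(text)
```

On CPU or a small GPU, a lower `batch_size` should reduce memory use at the cost of speed.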
## Model Details
- Base Model: Faster Whisper Large v2
- Fine-tuned for: Uzbek Speech Recognition
## License
This project is licensed under [LICENSE]. See the LICENSE file for details.
## Contact
For questions or feedback, please contact KHABIB SALIMOV at totetec.dev@gmail.com.
## Acknowledgements
- OpenAI for the original Whisper model