Whisper Small DV Model
Model Description
The whisper-small-dv model is an Automatic Speech Recognition (ASR) model trained on the Mozilla Common Voice 13.0 dataset. It transcribes spoken language into written text, which makes it suitable for applications such as transcription services and voice assistants.
Training
The model was trained using the PyTorch framework and the Transformers library. Training metrics and visualizations can be viewed on TensorBoard.
Performance
The model's performance was evaluated on a held-out test set. The evaluation metrics and results can be found in the "Eval Results" section.
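The source does not state which metric was used. ASR systems are commonly scored with word error rate (WER), so the following is a minimal sketch of that metric, assuming whitespace-tokenized transcripts. The function name `word_error_rate` is hypothetical and is not part of this model's tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # reference word missing from hypothesis
                    curr[j - 1] + 1,     # extra word in hypothesis
                    prev[j - 1] + cost,  # substitution or exact match
                )
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative strings only, not results from this model.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Lower is better. A WER of 0.0 means the hypothesis matches the reference word for word, and values above 1.0 are possible when the hypothesis contains many extra words.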
Usage
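The model can be used for ASR tasks. Because it is a Whisper checkpoint, load it with the Whisper classes from the Transformers library (the Wav2Vec2 CTC classes do not apply to this architecture):
import librosa
from transformers import WhisperForConditionalGeneration, WhisperProcessor
# Load the model and its processor
model = WhisperForConditionalGeneration.from_pretrained("Ryukijano/whisper-small-dv")
processor = WhisperProcessor.from_pretrained("Ryukijano/whisper-small-dv")
# Whisper expects 16 kHz mono audio as a float array, not a file path
audio, sampling_rate = librosa.load("path_to_audio_file", sr=16000)
inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")
# Whisper is an encoder-decoder model, so generate token IDs instead of taking an argmax over logits
predicted_ids = model.generate(inputs.input_features)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]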
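As a shorter alternative, the high-level `pipeline` API in Transformers can handle audio loading, feature extraction, and decoding in one call. This is a sketch under assumptions: the helper name `transcribe` is hypothetical, and decoding a file path requires ffmpeg to be installed on the system.

```python
def transcribe(audio_path: str, model_id: str = "Ryukijano/whisper-small-dv") -> str:
    """Transcribe one audio file with the Transformers ASR pipeline."""
    # Imported inside the function so that defining this helper does not
    # require transformers to be installed.
    from transformers import pipeline

    # chunk_length_s=30 splits long recordings into 30-second windows,
    # the input length Whisper works with.
    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,
    )
    result = asr(audio_path)
    return result["text"]


if __name__ == "__main__":
    # "path_to_audio_file" is a placeholder; replace it with a real file.
    print(transcribe("path_to_audio_file"))
```

Building the pipeline loads the model, so for many files it is cheaper to create the pipeline once and reuse it than to call this helper repeatedly.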
License
This model is released under the MIT license.