---
license: apache-2.0
language:
- fi
tags:
- speech-recognition
- whisper
---

Example of how to use the model with WhisperX (https://github.com/m-bain/whisperX):

```python
import whisperx

device = "cuda"
audio_file = "oma_nauhoitus_16kHz.wav"
batch_size = 16  # reduce if low on GPU mem
compute_type = "float16"  # change to "int8" if low on GPU mem (may reduce accuracy)

# Transcribe with the Finnish Whisper model (batched)
model = whisperx.load_model("Finnish-NLP/whisper-large-finnish-v3-ct2", device, compute_type=compute_type)

audio = whisperx.load_audio(audio_file)
result = model.transcribe(audio, batch_size=batch_size)
print(result["segments"])  # list of segments with start/end times and text
```
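
The printed segments can be turned into subtitles. The following is a minimal sketch, assuming each entry of `result["segments"]` is a dict with `"start"`, `"end"` (seconds) and `"text"` keys; the helper names `format_timestamp` and `segments_to_srt` and the example segments are hypothetical and not part of WhisperX.

```python
def format_timestamp(seconds):
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Build SRT text from dicts that have "start", "end" and "text" keys."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{segment['text'].strip()}\n")
    return "\n".join(blocks)


# Hypothetical segments standing in for result["segments"]
example_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hyvää päivää."},
    {"start": 2.5, "end": 5.04, "text": " Tämä on testi."},
]
print(segments_to_srt(example_segments))
```

With a real transcription, pass `result["segments"]` instead of `example_segments` and write the returned string to a `.srt` file.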

How to use the model in Python with faster-whisper (https://github.com/SYSTRAN/faster-whisper):

```python
import faster_whisper

audio_path = "oma_nauhoitus_16kHz.wav"  # path to your own recording

model = faster_whisper.WhisperModel("Finnish-NLP/whisper-large-finnish-v3-ct2")
print("model loaded")

# transcribe() returns a lazy generator; decoding runs while iterating over segments
segments, info = model.transcribe(audio_path, word_timestamps=True, beam_size=5, language="fi")

for segment in segments:
    for word in segment.words:
        print("[%.2fs -> %.2fs] %s" % (word.start, word.end, word.word))
```
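
Because the segments come from a generator that can only be consumed once, it may be convenient to collect the word timestamps into plain dicts instead of printing them. This is a rough sketch: `words_to_dicts` is a hypothetical helper, and the `SimpleNamespace` objects only stand in for faster-whisper segments and words (assumed to expose `words`, `start`, `end` and `word`) so that the snippet runs without loading the model.

```python
from types import SimpleNamespace


def words_to_dicts(segments):
    """Collect word-level timestamps from segments into a list of dicts."""
    rows = []
    for segment in segments:
        # words may be None when word timestamps were not requested
        for word in segment.words or []:
            rows.append({
                "start": round(word.start, 2),
                "end": round(word.end, 2),
                "word": word.word.strip(),
            })
    return rows


# Hypothetical stand-ins for the segments yielded by model.transcribe()
fake_segments = [
    SimpleNamespace(words=[
        SimpleNamespace(start=0.0, end=0.42, word=" Hei"),
        SimpleNamespace(start=0.42, end=0.98, word=" maailma"),
    ]),
    SimpleNamespace(words=None),
]
word_rows = words_to_dicts(fake_segments)
print(word_rows)
```

With a real transcription, call `words_to_dicts(segments)` in place of the printing loop; the resulting list can be reused or saved, for example with `json.dump`.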