---
license: apache-2.0
language:
- fi
tags:
- speech-recognition
---

Example of how to use this model with WhisperX (https://github.com/m-bain/whisperX):

```python
import whisperx

device = "cuda"
audio_file = "oma_nauhoitus_16kHz.wav"
batch_size = 16  # reduce if low on GPU memory
compute_type = "float16"  # change to "int8" if low on GPU memory (may reduce accuracy)

# Transcribe with the Finnish fine-tuned Whisper model (batched)
model = whisperx.load_model("Finnish-NLP/whisper-large-finnish-v3-ct2", device, compute_type=compute_type)

audio = whisperx.load_audio(audio_file)
result = model.transcribe(audio, batch_size=batch_size)
print(result["segments"])  # raw segments, before any optional word-level alignment
```
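
The raw segment list printed above can be hard to read. The following is a minimal sketch of a formatter, assuming each entry in `result["segments"]` is a dict with `start`, `end`, and `text` keys. The helper name `format_segments` and the sample data are hypothetical and only stand in for real model output.

```python
def format_segments(segments):
    """Return one '[start -> end] text' line per segment dict."""
    lines = []
    for seg in segments:
        text = seg["text"].strip()  # segment text often starts with a space
        lines.append("[%.2fs -> %.2fs] %s" % (seg["start"], seg["end"], text))
    return lines


# Hypothetical sample data standing in for result["segments"]
sample_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hyvää päivää."},
    {"start": 2.5, "end": 5.0, "text": " Tervetuloa."},
]

for line in format_segments(sample_segments):
    print(line)
```

With real output, the call would be `format_segments(result["segments"])`.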


How to use in Python with faster-whisper (https://github.com/SYSTRAN/faster-whisper):

```python
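import faster_whisper

# Path to the audio file to transcribe (placeholder file name)
audio_path = "oma_nauhoitus_16kHz.wav"

model = faster_whisper.WhisperModel("Finnish-NLP/whisper-large-finnish-v3-ct2")
print("model loaded")

# segments is a generator: transcription runs lazily while it is iterated
segments, info = model.transcribe(audio_path, word_timestamps=True, beam_size=5, language="fi")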

for segment in segments:
    for word in segment.words:
        print("[%.2fs -> %.2fs] %s" % (word.start, word.end, word.word))
```
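
If subtitles are the goal, the segment timings can be written out in SubRip (SRT) form. This is a rough sketch rather than part of either library: the helpers `srt_timestamp` and `to_srt` are hypothetical names, and the input is assumed to be an iterable of `(start, end, text)` tuples with times in seconds.

```python
def srt_timestamp(seconds):
    """Convert seconds (float) to the SRT 'HH:MM:SS,mmm' format."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return "%02d:%02d:%02d,%03d" % (hours, minutes, secs, millis)


def to_srt(entries):
    """Build SRT text from an iterable of (start, end, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(entries, start=1):
        blocks.append(
            "%d\n%s --> %s\n%s\n"
            % (index, srt_timestamp(start), srt_timestamp(end), text.strip())
        )
    # Blank line between consecutive subtitle blocks
    return "\n".join(blocks)


# Hypothetical sample entries standing in for real transcription output
print(to_srt([(0.0, 2.5, " Hyvää päivää."), (2.5, 5.0, " Tervetuloa.")]))
```

With faster-whisper, the entries could be built as `((s.start, s.end, s.text) for s in segments)`. Because `segments` is a generator, it can only be iterated once, so either do this instead of the word loop or store `list(segments)` first.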