---
license: mit
datasets:
- mozilla-foundation/common_voice_17_0
language:
- ru
base_model:
- dvislobokov/whisper-large-v3-turbo-russian
pipeline_tag: automatic-speech-recognition
---
## Example of using this model with faster-whisper
```python
import io
import logging
import sys
import time
from datetime import datetime

from faster_whisper import WhisperModel
from pydub import AudioSegment

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('faster-whisper.log'),
        logging.StreamHandler(sys.stdout)
    ]
)
model = WhisperModel("/path/to/dvislobokov/faster-whisper-large-v3-turbo-russian", device="cpu")
audio = AudioSegment.from_wav("ezyZip.wav")
chunk_length = 30 * 1000 # in milliseconds
chunks = [audio[i:i + chunk_length] for i in range(0, len(audio), chunk_length)]
logging.info(f'Start transcribe at {datetime.now().strftime("%Y-%m-%d %H:%M:%S")}')
start = time.time()
text = []
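for i, chunk in enumerate(chunks):
    # Export each chunk to an in-memory WAV file so nothing is written to disk
    buffer = io.BytesIO()
    chunk.export(buffer, format="wav")
    # transcribe() returns a lazy generator of segments plus metadata about the audio
    segments, info = model.transcribe(buffer, language="ru")
    # Consuming the generator runs the actual decoding for this chunk
    text.append("".join(segment.text for segment in segments))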
end = time.time()
logging.info(f'Finish transcribe at {datetime.now().strftime("%Y-%m-%d %H:%M:%S")}')
logging.info(f'Total time: {end - start}')
logging.info(f'Text: {"".join(text)}')
```
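
The fixed 30-second split in the example above can cut a word in half at a chunk boundary. Below is a minimal sketch of the same chunking step with an optional overlap between neighbouring chunks. The helper name `chunk_bounds` and the `overlap_ms` parameter are hypothetical and are not part of faster-whisper or pydub. With `overlap_ms=0` it produces the same boundaries as the list comprehension above.

```python
def chunk_bounds(total_ms, chunk_ms=30_000, overlap_ms=0):
    """Return (start, end) millisecond offsets that cover [0, total_ms).

    Hypothetical helper: overlap_ms is an assumption added for illustration,
    the original example uses non-overlapping chunks.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    if not 0 <= overlap_ms < chunk_ms:
        raise ValueError("overlap_ms must be in [0, chunk_ms)")

    step = chunk_ms - overlap_ms  # distance between consecutive chunk starts
    bounds = []
    start = 0
    while start < total_ms:
        end = min(start + chunk_ms, total_ms)
        bounds.append((start, end))
        if end == total_ms:
            break  # the last chunk already reaches the end of the audio
        start += step
    return bounds
```

With pydub, each pair can be applied as a slice, for example `chunks = [audio[s:e] for s, e in chunk_bounds(len(audio), overlap_ms=1000)]`. Note that with a non-zero overlap the same words may appear at the end of one chunk and the start of the next, so the joined text may need deduplication.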