	---
	title: Speech To Chat
	emoji: 🐨
	colorFrom: gray
	colorTo: gray
	sdk: streamlit
	sdk_version: 1.27.2
	app_file: app.py
	pinned: false
	---

	Speaker diarization app with transcription and AI chat features.

	This application performs speaker diarization (separating an audio stream into segments according to speaker identity) and transcription (converting speech into written text). It uses pyannote.audio for diarization and the Whisper API for transcription, and can process audio either uploaded from a local file or fetched from a YouTube video URL.
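
To illustrate how diarization and transcription output can be combined, here is a minimal sketch that labels each transcript segment with the speaker whose turn overlaps it most. The tuple shapes and the names `overlap` and `label_segments` are hypothetical and chosen for illustration; they are not taken from `app.py`.

```python
from typing import List, Tuple

# Hypothetical shapes (assumptions, not the app's actual data structures):
# a speaker turn is (start_sec, end_sec, speaker),
# a transcript segment is (start_sec, end_sec, text).
Turn = Tuple[float, float, str]
Segment = Tuple[float, float, str]


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(turns: List[Turn], segments: List[Segment]) -> List[Tuple[str, str]]:
    """Attach to each transcript segment the speaker whose turn overlaps it most."""
    labeled = []
    for seg_start, seg_end, text in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for turn_start, turn_end, speaker in turns:
            shared = overlap(seg_start, seg_end, turn_start, turn_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        labeled.append((best_speaker, text))
    return labeled


# Made-up example data for demonstration only.
turns = [(0.0, 4.0, "SPEAKER_00"), (4.0, 9.0, "SPEAKER_01")]
segments = [(0.5, 3.5, "Hello there."), (3.8, 8.0, "Hi, how are you?")]
result = label_segments(turns, segments)
```

Segments that overlap no speaker turn are labeled `UNKNOWN` rather than guessed.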

	## TO DO
	- [ ] Asynchronous Whisper requests [plan](https://stackoverflow.com/a/63179518)
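
One possible shape for this item, sketched with `asyncio` from the standard library: audio chunks are transcribed concurrently, with a semaphore capping the number of requests in flight. `transcribe_chunk` is a hypothetical stand-in that only simulates latency; a real Whisper request would replace it. This is not necessarily the approach described in the linked plan.

```python
import asyncio
from typing import List


async def transcribe_chunk(chunk_id: int) -> str:
    """Hypothetical stand-in for one Whisper request; replace with a real call."""
    await asyncio.sleep(0.01)  # simulates network latency
    return f"transcript for chunk {chunk_id}"


async def transcribe_all(chunk_ids: List[int], max_concurrent: int = 4) -> List[str]:
    """Run requests concurrently, with at most max_concurrent in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(chunk_id: int) -> str:
        async with semaphore:
            return await transcribe_chunk(chunk_id)

    # gather returns results in the same order as its arguments,
    # so the transcripts stay aligned with the chunk order.
    return await asyncio.gather(*(bounded(c) for c in chunk_ids))


transcripts = asyncio.run(transcribe_all([0, 1, 2]))
```

Capping concurrency is a design choice that keeps the request rate predictable; the value of `max_concurrent` here is arbitrary.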

	## References
	- [pyannote.audio](https://github.com/pyannote/pyannote-audio)
	- [HuggingFace pyannote diarization](https://huggingface.co/pyannote/speaker-diarization-3.0)
	- [Whisper API](https://platform.openai.com/docs/guides/speech-to-text/quickstart)