script.md · fastrtc/echo-audio-gradio at main

Hi, I'm Freddy and I want to give a tour of FastRTC - the real-time communication library for Python.

Let's start with the basics - echoing audio.

In FastRTC, you can wrap any iterator with ReplyOnPause and pass it to the Stream class.

This will create a WebRTC-powered web server that handles voice detection and turn taking - you just worry about the logic for the generating the response.

Each stream comes with a built-in webRTC-powered Gradio UI that you can use for testing.

Simply call ui.launch(). Let's see it in action.

We can level up our application by having an LLM generate the response.

We'll import the SambaNova API as well as some FastRTC utils for doing speech-to-text and text-to-speech and then pipe them all together.

Importantly, you can use any LLM, speech-to-text, or text-to-speech model. Even an audio-to-audio model. Bring the tools you love and we'll just handle the real-time communication.

You can also call into the stream for FREE if you have a Hugging Face Token.

Finally, deployment is really easy too. You can stick with Gradio or mount the stream in a FastAPI app and build any application you want. By the way, video is supported too!

Thanks for watching! Please visit fastrtc.org to see the cookbook for all the demos shown here as well as the docs.