Clap
A tiny vision language model
Generates a sound effect that matches a video shot
Real-time object detection w/ 🤗 Transformers.js
Edit audio with text prompts
Generate music from text prompts 🎶
Get Music from Generated Spectrogram with Diffusion
Generate a video waveform from text-based audio descriptions
Generate music from text and melody descriptions