Audio - a diwank Collection

Audio

updated about 24 hours ago

espnet/yodas2

Updated Jun 10, 2024 • 18.4k • 26
Flux9665/BibleMMS

Viewer • Updated Jun 16, 2024 • 736k • 164 • 65
google/MusicCaps

Viewer • Updated Mar 8, 2023 • 5.52k • 328 • 130
ShoukanLabs/AniSpeech

Viewer • Updated Jan 29, 2024 • 23.7k • 370 • 39
muzaik/captioned-audio-1k

Viewer • Updated May 28, 2024 • 1.05k • 72 • 4
aoxo/text2asmr-uncensored

Preview • Updated Feb 19, 2024 • 93 • 5
google/fleurs

Updated Aug 25, 2024 • 21.8k • 261
phongdtd/youtube_casual_audio

Updated Sep 10, 2024 • 81 • 4
ProgramComputer/voxceleb

Updated Jul 27, 2024 • 2.04k • 63
jhu-clsp/seamless-align

Preview • Updated Jun 2, 2024 • 113 • 10
IVLLab/MultiDialog

Updated Aug 29, 2024 • 445 • 12
PetraAI/PetraAI

Updated Sep 14, 2023 • 331 • 20
ReDUB/SoundHarvest

Viewer • Updated Dec 14, 2023 • 2 • 55 • 2
jhu-clsp/seamless-align-expressive

Updated Feb 22, 2024 • 51 • 4
jg583/NSynth

Updated Apr 26, 2024 • 157 • 17
voice-is-cool/voxtube

Viewer • Updated Feb 13, 2024 • 4.46M • 409 • 11
google/speech_commands

Updated Jan 18, 2024 • 1.12k • 35
Fhrozen/FSD50k

Preview • Updated May 27, 2022 • 224 • 4
nvidia/parakeet-tdt-1.1b

Automatic Speech Recognition • Updated Apr 30, 2024 • 171k • 82
yl4579/StyleTTS2-LibriTTS

Updated Nov 21, 2023 • 44
coqui/XTTS-v2

Text-to-Speech • Updated Dec 11, 2023 • 1.72M • 2.14k
facebook/wav2vec2-large-robust

Updated Nov 5, 2021 • 3.75k • 33
laion/links_to_pocasts_lecture_and_shows_for_tts

Viewer • Updated May 29, 2024 • 331k • 8 • 8
laion/youtube-urls-for-emotional-tts

Viewer • Updated May 21, 2024 • 78.3k • 33 • 3
laion/chirp-v2-dataset

Viewer • Updated Mar 25, 2024 • 64 • 699 • 6
speechcolab/gigaspeech

Viewer • Updated Nov 23, 2023 • 364k • 16k • 97
fixie-ai/boolq-audio

Viewer • Updated Jun 12, 2024 • 12.7k • 185 • 7
fixie-ai/soda-audio

Viewer • Updated Jul 24, 2024 • 102k • 104 • 4
amphion/Emilia

Preview • Updated Sep 6, 2024 • 51 • 82
google/cvss

Updated Feb 10, 2024 • 97 • 13
PolyAI/minds14

Updated Sep 10, 2024 • 3.77k • 80
Qwen/Qwen2-Audio-7B-Instruct

Audio-Text-to-Text • Updated Nov 20, 2024 • 411k • 290
infgrad/dialogue_rewrite_llm

Viewer • Updated Feb 17, 2024 • 1.64M • 52 • 14
FBK-MT/Speech-MASSIVE

Viewer • Updated Aug 8, 2024 • 97.6k • 721 • 33
Qwen/Qwen2-Audio-7B

Audio-Text-to-Text • Updated Nov 20, 2024 • 8.71k • 86
Mozilla/whisperfile

Updated Oct 2, 2024 • 504 • 238
vucinatim/spectrogram-captions

Viewer • Updated Jan 3, 2023 • 1k • 37 • 2
rachit8562/mel_spectogram_bird_audio

Viewer • Updated Jan 7, 2023 • 72.2k • 44 • 2
novateur/WavTokenizer

Text-to-Speech • Updated Dec 2, 2024 • 46
gpt-omni/mini-omni

Text-to-Speech • Updated Sep 4, 2024 • 408
amphion/Emilia-Dataset

Viewer • Updated Sep 6, 2024 • 52.9M • 39.9k • 176
FLUX that Plays Music

Paper • 2409.00587 • Published Sep 1, 2024 • 31
feizhengcong/FluxMusic

Updated Nov 22, 2024 • 64
fishaudio/fish-speech-1.4

Text-to-Speech • Updated Nov 5, 2024 • 3.83k • 443
ICTNLP/Llama-3.1-8B-Omni

Updated Nov 14, 2024 • 4.29k • 385
HuggingFaceFV/finevideo

Viewer • Updated 19 days ago • 39.5k • 5.37k • 285
kyutai/moshiko-pytorch-bf16

Updated Sep 18, 2024 • 99.1k • 155
kyutai/moshika-pytorch-bf16

Updated Sep 18, 2024 • 207 • 46
Revai/reverb-asr

Automatic Speech Recognition • Updated 26 days ago • 53 • 75
FBK-MT/mosel

Viewer • Updated Oct 30, 2024 • 57.5M • 499 • 68
homebrewltd/llama3-s-instruct-v0.2

Updated Aug 23, 2024 • 26 • 44
SWivid/F5-TTS

Text-to-Speech • Updated Nov 8, 2024 • 546k • 825
mit-han-lab/hart-0.7b-1024px

Unconditional Image Generation • Updated Nov 17, 2024 • 9
THUDM/glm-4-voice-9b

Updated Oct 25, 2024 • 3.91k • 76
amphion/MaskGCT

Text-to-Speech • Updated 13 days ago • 33 • 259
nvidia/parakeet-tdt_ctc-110m

Automatic Speech Recognition • Updated Oct 22, 2024 • 26k • 15
nvidia/audio-flamingo

Updated Oct 2, 2024 • 21
fishaudio/fish-agent-v0.1-3b

Audio-to-Audio • Updated Nov 1, 2024 • 828 • 237
OuteAI/OuteTTS-0.1-350M

Text-to-Speech • Updated Nov 27, 2024 • 5.67k • 297
adamo1139/Meta_Spirit-LM-ungated

Text-to-Audio • Updated Oct 20, 2024 • 18
si-pbc/hertz-dev

Audio-to-Audio • Updated Nov 14, 2024 • 210
pyannote/speech-separation-ami-1.0

Updated Nov 11, 2024 • 27.9k • 50
nyuuzyou/suno

Preview • Updated Nov 20, 2024 • 126 • 53
gpt-omni/mini-omni2

Any-to-Any • Updated Oct 24, 2024 • 669 • 240
fixie-ai/ultravox-v0_4_1-llama-3_1-70b

Audio-Text-to-Text • Updated 22 days ago • 443 • 22
aiola/whisper-ner-tag-and-mask-v1

Automatic Speech Recognition • Updated Nov 21, 2024 • 30 • 5
nyrahealth/CrisperWhisper

Automatic Speech Recognition • Updated 16 days ago • 11.3k • 197
laion/laions_got_talent

Viewer • Updated 4 days ago • 461k • 264 • 12
nvidia/se_den_sb_16k_small

Updated Nov 28, 2024 • 2
nvidia/se_der_sb_16k_small

Updated Nov 28, 2024 • 2
nvidia/sr_ssl_flowmatching_16k_430m

Updated Nov 28, 2024 • 5
nvidia/low-frame-rate-speech-codec-22khz

Updated 23 days ago • 953 • 11
laion/laion-audio-preview

Viewer • Updated Dec 4, 2024 • 4.15M • 8.35k • 10
NexaAIDev/OmniAudio-2.6B

Audio-Text-to-Text • Updated 22 days ago • 8.95k • 212
laion/LAION-Audio-300M

Viewer • Updated 9 minutes ago • 130M • 19 • 6
hexgrad/Kokoro-82M

Text-to-Speech • Updated 2 days ago • 717 • 185
