diwank's Collections
Audio
Updated about 24 hours ago
espnet/yodas2
Updated
Jun 10, 2024
•
18.4k
•
26
Flux9665/BibleMMS
Viewer
•
Updated
Jun 16, 2024
•
736k
•
164
•
65
google/MusicCaps
Viewer
•
Updated
Mar 8, 2023
•
5.52k
•
328
•
130
ShoukanLabs/AniSpeech
Viewer
•
Updated
Jan 29, 2024
•
23.7k
•
370
•
39
muzaik/captioned-audio-1k
Viewer
•
Updated
May 28, 2024
•
1.05k
•
72
•
4
aoxo/text2asmr-uncensored
Preview
•
Updated
Feb 19, 2024
•
93
•
5
google/fleurs
Updated
Aug 25, 2024
•
21.8k
•
261
phongdtd/youtube_casual_audio
Updated
Sep 10, 2024
•
81
•
4
ProgramComputer/voxceleb
Updated
Jul 27, 2024
•
2.04k
•
63
jhu-clsp/seamless-align
Preview
•
Updated
Jun 2, 2024
•
113
•
10
IVLLab/MultiDialog
Updated
Aug 29, 2024
•
445
•
12
PetraAI/PetraAI
Updated
Sep 14, 2023
•
331
•
20
ReDUB/SoundHarvest
Viewer
•
Updated
Dec 14, 2023
•
2
•
55
•
2
jhu-clsp/seamless-align-expressive
Updated
Feb 22, 2024
•
51
•
4
jg583/NSynth
Updated
Apr 26, 2024
•
157
•
17
voice-is-cool/voxtube
Viewer
•
Updated
Feb 13, 2024
•
4.46M
•
409
•
11
google/speech_commands
Updated
Jan 18, 2024
•
1.12k
•
35
Fhrozen/FSD50k
Preview
•
Updated
May 27, 2022
•
224
•
4
nvidia/parakeet-tdt-1.1b
Automatic Speech Recognition
•
Updated
Apr 30, 2024
•
171k
•
82
yl4579/StyleTTS2-LibriTTS
Updated
Nov 21, 2023
•
44
coqui/XTTS-v2
Text-to-Speech
•
Updated
Dec 11, 2023
•
1.72M
•
2.14k
facebook/wav2vec2-large-robust
Updated
Nov 5, 2021
•
3.75k
•
33
laion/links_to_pocasts_lecture_and_shows_for_tts
Viewer
•
Updated
May 29, 2024
•
331k
•
8
•
8
laion/youtube-urls-for-emotional-tts
Viewer
•
Updated
May 21, 2024
•
78.3k
•
33
•
3
laion/chirp-v2-dataset
Viewer
•
Updated
Mar 25, 2024
•
64
•
699
•
6
speechcolab/gigaspeech
Viewer
•
Updated
Nov 23, 2023
•
364k
•
16k
•
97
fixie-ai/boolq-audio
Viewer
•
Updated
Jun 12, 2024
•
12.7k
•
185
•
7
fixie-ai/soda-audio
Viewer
•
Updated
Jul 24, 2024
•
102k
•
104
•
4
amphion/Emilia
Preview
•
Updated
Sep 6, 2024
•
51
•
82
google/cvss
Updated
Feb 10, 2024
•
97
•
13
PolyAI/minds14
Updated
Sep 10, 2024
•
3.77k
•
80
Qwen/Qwen2-Audio-7B-Instruct
Audio-Text-to-Text
•
Updated
Nov 20, 2024
•
411k
•
290
infgrad/dialogue_rewrite_llm
Viewer
•
Updated
Feb 17, 2024
•
1.64M
•
52
•
14
FBK-MT/Speech-MASSIVE
Viewer
•
Updated
Aug 8, 2024
•
97.6k
•
721
•
33
Qwen/Qwen2-Audio-7B
Audio-Text-to-Text
•
Updated
Nov 20, 2024
•
8.71k
•
86
Mozilla/whisperfile
Updated
Oct 2, 2024
•
504
•
238
vucinatim/spectrogram-captions
Viewer
•
Updated
Jan 3, 2023
•
1k
•
37
•
2
rachit8562/mel_spectogram_bird_audio
Viewer
•
Updated
Jan 7, 2023
•
72.2k
•
44
•
2
novateur/WavTokenizer
Text-to-Speech
•
Updated
Dec 2, 2024
•
46
gpt-omni/mini-omni
Text-to-Speech
•
Updated
Sep 4, 2024
•
408
amphion/Emilia-Dataset
Viewer
•
Updated
Sep 6, 2024
•
52.9M
•
39.9k
•
176
FLUX that Plays Music
Paper
•
2409.00587
•
Published
Sep 1, 2024
•
31
feizhengcong/FluxMusic
Updated
Nov 22, 2024
•
64
fishaudio/fish-speech-1.4
Text-to-Speech
•
Updated
Nov 5, 2024
•
3.83k
•
443
ICTNLP/Llama-3.1-8B-Omni
Updated
Nov 14, 2024
•
4.29k
•
385
HuggingFaceFV/finevideo
Viewer
•
Updated
19 days ago
•
39.5k
•
5.37k
•
285
kyutai/moshiko-pytorch-bf16
Updated
Sep 18, 2024
•
99.1k
•
155
kyutai/moshika-pytorch-bf16
Updated
Sep 18, 2024
•
207
•
46
Revai/reverb-asr
Automatic Speech Recognition
•
Updated
26 days ago
•
53
•
75
FBK-MT/mosel
Viewer
•
Updated
Oct 30, 2024
•
57.5M
•
499
•
68
homebrewltd/llama3-s-instruct-v0.2
Updated
Aug 23, 2024
•
26
•
44
SWivid/F5-TTS
Text-to-Speech
•
Updated
Nov 8, 2024
•
546k
•
825
mit-han-lab/hart-0.7b-1024px
Unconditional Image Generation
•
Updated
Nov 17, 2024
•
9
THUDM/glm-4-voice-9b
Updated
Oct 25, 2024
•
3.91k
•
76
amphion/MaskGCT
Text-to-Speech
•
Updated
13 days ago
•
33
•
259
nvidia/parakeet-tdt_ctc-110m
Automatic Speech Recognition
•
Updated
Oct 22, 2024
•
26k
•
15
nvidia/audio-flamingo
Updated
Oct 2, 2024
•
21
fishaudio/fish-agent-v0.1-3b
Audio-to-Audio
•
Updated
Nov 1, 2024
•
828
•
237
OuteAI/OuteTTS-0.1-350M
Text-to-Speech
•
Updated
Nov 27, 2024
•
5.67k
•
297
adamo1139/Meta_Spirit-LM-ungated
Text-to-Audio
•
Updated
Oct 20, 2024
•
18
si-pbc/hertz-dev
Audio-to-Audio
•
Updated
Nov 14, 2024
•
210
pyannote/speech-separation-ami-1.0
Updated
Nov 11, 2024
•
27.9k
•
50
nyuuzyou/suno
Preview
•
Updated
Nov 20, 2024
•
126
•
53
gpt-omni/mini-omni2
Any-to-Any
•
Updated
Oct 24, 2024
•
669
•
240
fixie-ai/ultravox-v0_4_1-llama-3_1-70b
Audio-Text-to-Text
•
Updated
22 days ago
•
443
•
22
aiola/whisper-ner-tag-and-mask-v1
Automatic Speech Recognition
•
Updated
Nov 21, 2024
•
30
•
5
nyrahealth/CrisperWhisper
Automatic Speech Recognition
•
Updated
16 days ago
•
11.3k
•
197
laion/laions_got_talent
Viewer
•
Updated
4 days ago
•
461k
•
264
•
12
nvidia/se_den_sb_16k_small
Updated
Nov 28, 2024
•
2
nvidia/se_der_sb_16k_small
Updated
Nov 28, 2024
•
2
nvidia/sr_ssl_flowmatching_16k_430m
Updated
Nov 28, 2024
•
5
nvidia/low-frame-rate-speech-codec-22khz
Updated
23 days ago
•
953
•
11
laion/laion-audio-preview
Viewer
•
Updated
Dec 4, 2024
•
4.15M
•
8.35k
•
10
NexaAIDev/OmniAudio-2.6B
Audio-Text-to-Text
•
Updated
22 days ago
•
8.95k
•
212
laion/LAION-Audio-300M
Viewer
•
Updated
9 minutes ago
•
130M
•
19
•
6
hexgrad/Kokoro-82M
Text-to-Speech
•
Updated
2 days ago
•
717
•
185