llam's Collections
DQR-TTS: Semi-supervised Text-to-speech Synthesis with Dynamic Quantized Representation
Paper • 2311.07965 • Published • 1
CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding
Paper • 2311.08673 • Published
CLN-VC: Text-Free Voice Conversion Based on Fine-Grained Style Control and Contrastive Learning with Negative Samples Augmentation
Paper • 2311.08670 • Published
Stock Volatility Prediction Based on Transformer Model Using Mixed-Frequency Data
Paper • 2309.16196 • Published
Sparks of Large Audio Models: A Survey and Outlook
Paper • 2308.12792 • Published
Research on the Impact of Executive Shareholding on New Investment in Enterprises Based on Multivariable Linear Regression Model
Paper • 2309.10986 • Published
A Hierarchy-based Analysis Approach for Blended Learning: A Case Study with Chinese Students
Paper • 2309.10218 • Published
An Empirical Study of Attention Networks for Semantic Segmentation
Paper • 2309.10217 • Published
Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval
Paper • 2309.08839 • Published
AOSR-Net: All-in-One Sandstorm Removal Network
Paper • 2309.08838 • Published
FastGraphTTS: An Ultrafast Syntax-Aware Speech Synthesis Framework
Paper • 2309.08837 • Published
DiffTalker: Co-driven audio-image diffusion for talking faces via intermediate landmarks
Paper • 2309.07509 • Published
Machine Unlearning Methodology base on Stochastic Teacher Network
Paper • 2308.14322 • Published
Voice Conversion with Denoising Diffusion Probabilistic GAN Models
Paper • 2308.14319 • Published
Symbolic & Acoustic: Multi-domain Music Emotion Modeling for Instrumental Music
Paper • 2308.14317 • Published • 2
Improving Music Genre Classification from Multi-Modal Properties of Music and Genre Correlations Perspective
Paper • 2303.07667 • Published
EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech Synthesis
Paper • 2306.00648 • Published • 1
SAR: Self-Supervised Anti-Distortion Representation for End-To-End Speech Model
Paper • 2304.11547 • Published
Dynamic Alignment Mask CTC: Improved Mask-CTC with Aligned Cross Entropy
Paper • 2303.07687 • Published
QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis
Paper • 2303.07682 • Published
Improving EEG-based Emotion Recognition by Fusing Time-frequency And Spatial Representations
Paper • 2303.11421 • Published • 1
Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition
Paper • 2210.14725 • Published
Improving Imbalanced Text Classification with Dynamic Curriculum Learning
Paper • 2210.14724 • Published
Semi-Supervised Learning Based on Reference Model for Low-resource TTS
Paper • 2210.14723 • Published
MetaSpeech: Speech Effects Switch Along with Environment for Metaverse
Paper • 2210.13811 • Published
Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach
Paper • 2210.13805 • Published
Adapitch: Adaption Multi-Speaker Text-to-Speech Conditioned on Pitch Disentangling with Untranscribed Data
Paper • 2210.13803 • Published
Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar
Paper • 2210.06877 • Published
Boosting Star-GANs for Voice Conversion with Contrastive Discriminator
Paper • 2209.10088 • Published
Tiny-Sepformer: A Tiny Time-Domain Transformer Network for Speech Separation
Paper • 2206.13689 • Published
SUSing: SU-net for Singing Voice Synthesis
Paper • 2205.11841 • Published
TDASS: Target Domain Adaptation Speech Synthesis Framework for Multi-speaker Low-Resource TTS
Paper • 2205.11824 • Published
MetaSID: Singer Identification with Domain Adaptation for Metaverse
Paper • 2205.11821 • Published
Singer Identification for Metaverse with Timbral and Middle-Level Perceptual Features
Paper • 2205.11817 • Published
MDCNN-SID: Multi-scale Dilated Convolution Network for Singer Identification
Paper • 2004.04371 • Published
Investigation of Singing Voice Separation for Singing Voice Detection in Polyphonic Music
Paper • 2004.04040 • Published
DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning
Paper • 2202.10976 • Published
nnSpeech: Speaker-Guided Conditional Variational Autoencoder for Zero-shot Multi-speaker Text-to-Speech
Paper • 2202.10712 • Published
AVQVC: One-shot Voice Conversion by Vector Quantization with applying contrastive learning
Paper • 2202.10020 • Published • 1
Singer Identification Using Deep Timbre Feature Learning with KNN-Net
Paper • 2102.10236 • Published
TGAVC: Improving Autoencoder Voice Conversion with Text-Guided and Adversarial Training
Paper • 2208.04035 • Published
PMVC: Data Augmentation-Based Prosody Modeling for Expressive Voice Conversion
Paper • 2308.11084 • Published
Medical Speech Symptoms Classification via Disentangled Representation
Paper • 2403.05000 • Published