SJTU Cross Media Language Intelligence Lab

university

https://x-lance.sjtu.edu.cn

X-LANCE


AI & ML interests

None defined yet.

X-LANCE's activity

yfyeung

authored 2 papers 10 days ago

SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training

Paper • 2412.15649 • Published 23 days ago

Interleaved Speech-Text Language Models are Simple Streaming Text to Speech Synthesizers

Paper • 2412.16102 • Published 23 days ago

zdy023

updated a dataset about 1 month ago

X-LANCE/WikiHow-taskset

Viewer • Updated Dec 9, 2024 • 306 • 173 • 4

yfyeung

authored a paper about 2 months ago

k2SSL: A Faster and Better Framework for Self-Supervised Speech Representation Learning

Paper • 2411.17100 • Published Nov 26, 2024

Situo

authored a paper 3 months ago

MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Paper • 2410.13757 • Published Oct 17, 2024 • 32

lankunyao

authored a paper 3 months ago

MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Paper • 2410.13757 • Published Oct 17, 2024 • 32

JamesZhutheThird

authored 6 papers 3 months ago

MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Paper • 2410.13757 • Published Oct 17, 2024 • 32

Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback

Paper • 2403.18349 • Published Mar 27, 2024

yfyeung

authored a paper 4 months ago

LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization

Paper • 2409.00819 • Published Sep 1, 2024

yfyeung

authored 7 papers 5 months ago

Zipformer: A faster and better encoder for automatic speech recognition

Paper • 2310.11230 • Published Oct 17, 2023

VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech

Paper • 2401.14321 • Published Jan 25, 2024

An Embarrassingly Simple Approach for LLM with Strong ASR Capacity

Paper • 2402.08846 • Published Feb 13, 2024 • 1

PromptASR for contextualized ASR with controllable style

Paper • 2309.07414 • Published Sep 14, 2023

Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context

Paper • 2309.08105 • Published Sep 15, 2023

GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement

Paper • 2406.11546 • Published Jun 17, 2024

Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS

Paper • 2309.07377 • Published Sep 14, 2023

Team members 13
