Qingkai Fang

poeroz

https://fangqingkai.github.io/

AI & ML interests

Large Language Models, Speech-Language Models, Speech Translation

poeroz's activity

upvoted a paper 9 days ago

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Paper • 2501.03895 • Published 10 days ago • 48

authored a paper 9 days ago

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Paper • 2501.03895 • Published 10 days ago • 48

liked a model 10 days ago

ICTNLP/llava-mini-llama-3.1-8b

Image-Text-to-Text • Updated 5 days ago • 3.75k • 35

liked 3 datasets about 1 month ago

liked a model about 2 months ago

fishaudio/fish-agent-v0.1-3b

Audio-to-Audio • Updated Nov 1, 2024 • 586 • 242

updated a model 2 months ago

ICTNLP/Llama-3.1-8B-Omni

Updated Nov 14, 2024 • 5.85k • 390

liked a dataset 3 months ago

amphion/Emilia-Dataset

Viewer • Updated Sep 6, 2024 • 52.9M • 37.9k • 194

upvoted a paper 3 months ago

Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data

Paper • 2410.18558 • Published Oct 24, 2024 • 19

liked a dataset 3 months ago

BAAI/Infinity-MM

Updated Dec 13, 2024 • 15.9k • 87

liked 3 models 3 months ago

THUDM/glm-4-voice-decoder

Updated Oct 25, 2024 • 81 • 15

THUDM/glm-4-voice-9b

Updated Oct 25, 2024 • 2.07k • 82

THUDM/glm-4-voice-tokenizer

Updated Oct 25, 2024 • 9.25k • 8

updated a collection 3 months ago

Paper list

Collection

21 items • Updated Oct 12, 2024