Qingkai Fang

poeroz

https://fangqingkai.github.io/

AI & ML interests

Large Language Models, Speech-Language Models, Speech Translation

Recent Activity

upvoted a paper 9 days ago

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

authored a paper 9 days ago

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

liked a model 10 days ago

ICTNLP/llava-mini-llama-3.1-8b

poeroz's activity

upvoted a paper 9 days ago

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Paper • 2501.03895 • Published 10 days ago • 48

upvoted a paper 3 months ago

Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data

Paper • 2410.18558 • Published Oct 24, 2024 • 19

upvoted a paper 4 months ago

LLaMA-Omni: Seamless Speech Interaction with Large Language Models

Paper • 2409.06666 • Published Sep 10, 2024 • 56

upvoted a paper 11 months ago

Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters

Paper • 2403.02677 • Published Mar 5, 2024 • 18