Xiaoqian Shen

shenxq

AI & ML interests

None yet

Recent Activity

liked a dataset 8 days ago

lmms-lab/LLaVA-Video-178K

liked a dataset 8 days ago

LongVideos/LongVideoDB-373K-IterCap

new activity 3 months ago

Vision-CAIR/LongVU_Qwen2_7B:Update README.md

Organizations

None yet

shenxq's activity

liked 2 datasets 8 days ago

lmms-lab/LLaVA-Video-178K

Viewer • Updated Oct 11, 2024 • 1.63M • 7.33k • 105

LongVideos/LongVideoDB-373K-IterCap

Viewer • Updated 29 days ago • 250k • 40 • 2

New activity in Vision-CAIR/LongVU_Qwen2_7B 3 months ago

Update README.md

#5 opened 3 months ago by shenxq

updated a model 3 months ago

Vision-CAIR/LongVU_Qwen2_7B

Video-Text-to-Text • Updated Oct 30, 2024 • 302 • 68

authored 7 papers 3 months ago

ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions

Paper • 2303.06594 • Published Mar 12, 2023

MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models

Paper • 2304.10592 • Published Apr 20, 2023

StoryGPT-V: Large Language Models as Consistent Story Visualizers

Paper • 2312.02252 • Published Dec 4, 2023 • 1

Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations

Paper • 2308.16349 • Published Aug 30, 2023

MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens

Paper • 2404.03413 • Published Apr 4, 2024 • 26

Goldfish: Vision-Language Understanding of Arbitrarily Long Videos

Paper • 2407.12679 • Published Jul 17, 2024 • 8

LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding

Paper • 2410.17434 • Published Oct 22, 2024 • 26

upvoted a paper 3 months ago

LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding

Paper • 2410.17434 • Published Oct 22, 2024 • 26

commented on a paper 3 months ago

LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding

Paper • 2410.17434 • Published Oct 22, 2024 • 26

updated 2 datasets 3 months ago

shenxq/OneVision

Updated Oct 23, 2024 • 53

shenxq/VideoChat2

Viewer • Updated Oct 23, 2024 • 661k • 89 • 2

authored a paper 6 months ago

Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling

Paper • 2408.03695 • Published Aug 7, 2024 • 13

updated 3 models 10 months ago

authored a paper over 1 year ago

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning

Paper • 2310.09478 • Published Oct 14, 2023 • 19