linxi

AI & ML interests

None yet

Recent Activity

liked a dataset about 1 month ago

Xkev/LLaVA-CoT-100k

upvoted a collection about 1 month ago

InternVL2.5

liked a model about 1 month ago

HuggingFaceTB/SmolVLM-Instruct

linxi's activity

liked a dataset about 1 month ago

Xkev/LLaVA-CoT-100k

Viewer • Updated Nov 27, 2024 • 98.6k • 1.69k • 65

upvoted a collection about 1 month ago

InternVL2.5

Collection

Better than InternVL 2.0 • 18 items • Updated 6 days ago • 78

liked 5 models about 1 month ago

New activity in BAAI/Infinity-MM about 1 month ago

stage2 data only has 21M

#2 opened 2 months ago by HensonLiu

upvoted 12 papers about 1 month ago

BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices

Paper • 2411.10640 • Published Nov 16, 2024 • 44

SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning

Paper • 2411.10161 • Published Nov 15, 2024 • 8

VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation

Paper • 2411.13281 • Published Nov 20, 2024 • 17

SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration

Paper • 2411.10958 • Published Nov 17, 2024 • 52

DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding

Paper • 2411.14347 • Published Nov 21, 2024 • 13

Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Paper • 2411.14432 • Published Nov 21, 2024 • 22

Hymba: A Hybrid-head Architecture for Small Language Models

Paper • 2411.13676 • Published Nov 20, 2024 • 40

Multimodal Autoregressive Pre-training of Large Vision Encoders

Paper • 2411.14402 • Published Nov 21, 2024 • 43

Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Paper • 2411.14405 • Published Nov 21, 2024 • 58

Efficient Long Video Tokenization via Coordinate-based Patch Reconstruction

Paper • 2411.14762 • Published Nov 22, 2024 • 11

VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection

Paper • 2411.14794 • Published Nov 22, 2024 • 13

TÜLU 3: Pushing Frontiers in Open Language Model Post-Training

Paper • 2411.15124 • Published Nov 22, 2024 • 58