Haoyu Li

trainfanlhy

lihytotoro

AI & ML interests

Multimodal Large Language Models, Natural Language Processing

trainfanlhy's activity

upvoted a paper about 1 month ago

VisionZip: Longer is Better but Not Necessary in Vision Language Models

Paper • 2412.04467 • Published Dec 5, 2024 • 105

updated 2 datasets about 1 month ago

trainfanlhy/new_tsvdataset_1204

Viewer • Updated Dec 5, 2024 • 5.01k • 23

trainfanlhy/mmmu_pro_processed

Preview • Updated Dec 4, 2024 • 2

liked a model about 2 months ago

Qwen/Qwen2-VL-7B-Instruct

Image-Text-to-Text • Updated Dec 6, 2024 • 1.53M • 1.03k

liked a model 2 months ago

microsoft/wavlm-large

Feature Extraction • Updated Feb 2, 2022 • 225k • 62

upvoted a paper 3 months ago

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents

Paper • 2410.10594 • Published Oct 14, 2024 • 24

liked a model 3 months ago

Lin-Chen/open-llava-next-vicuna-7b

Image-Text-to-Text • Updated May 27, 2024 • 33 • 3

liked a dataset 3 months ago

StarBottle/MIBench

Updated Oct 16, 2024 • 138 • 5

liked 6 models 3 months ago

liked a model 4 months ago

RhapsodyAI/minicpm-guidance

Visual Question Answering • Updated Jul 15, 2024 • 32 • 6

liked a Space 4 months ago

Llava Next

Running on Zero • 147

liked 3 models 4 months ago

llava-hf/llava-v1.6-mistral-7b-hf

Image-Text-to-Text • Updated about 9 hours ago • 442k • 246

pyannote/wespeaker-voxceleb-resnet34-LM

Updated May 10, 2024 • 10.1M • 48

openbmb/MiniCPM3-4B

Text Generation • Updated Nov 30, 2024 • 41k • 398

liked a model 5 months ago

openbmb/MiniCPM-V-2_6-gguf

Updated Aug 13, 2024 • 3.25k • 148