Rykov Elisei

lmeribal

AI & ML interests

NLP, Multimodality

Recent Activity

liked a dataset about 6 hours ago

fava-uw/fava-data

liked a dataset 18 days ago

microsoft/wiki_qa

liked a dataset 18 days ago

wikimedia/wikipedia

lmeribal's activity

upvoted a paper about 2 months ago

Inference Optimal VLMs Need Only One Visual Token but Larger Models

Paper • 2411.03312 • Published Nov 5, 2024 • 6

upvoted 3 papers 3 months ago

upvoted 3 papers 4 months ago

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Paper • 2409.12191 • Published Sep 18, 2024 • 76

InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning

Paper • 2409.12568 • Published Sep 19, 2024 • 48

MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark

Paper • 2409.02813 • Published Sep 4, 2024 • 29

upvoted a paper 5 months ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 124

upvoted a collection 5 months ago

Vision-Language Modeling

Collection

Our datasets and models for Visual-Language Modeling • 5 items • Updated Nov 25, 2024 • 6

upvoted 2 papers 6 months ago

Vision language models are blind

Paper • 2407.06581 • Published Jul 9, 2024 • 83

Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model

Paper • 2407.07053 • Published Jul 9, 2024 • 43

upvoted a paper 7 months ago

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Paper • 2406.08418 • Published Jun 12, 2024 • 29

upvoted a paper 8 months ago

Your Transformer is Secretly Linear

Paper • 2405.12250 • Published May 19, 2024 • 149

upvoted 2 papers 9 months ago

Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15, 2024 • 82

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

Paper • 2404.02905 • Published Apr 3, 2024 • 65

upvoted a paper 11 months ago

Linear Transformers with Learnable Kernel Functions are Better In-Context Models

Paper • 2402.10644 • Published Feb 16, 2024 • 79