Dotanoob7's picture

1 16 2

Dotanoob7

Dotanoob

·

AI & ML interests

None yet

Recent Activity

upvoted a paper 28 days ago

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

upvoted a paper 28 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

upvoted a paper 29 days ago

PaSa: An LLM Agent for Comprehensive Academic Paper Search

View all activity

Organizations

None yet

Dotanoob's activity

upvoted 2 papers 28 days ago

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published 29 days ago • 83

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 29 days ago • 325

upvoted a paper 29 days ago

PaSa: An LLM Agent for Comprehensive Academic Paper Search

Paper • 2501.10120 • Published Jan 17 • 43

upvoted 3 papers 2 months ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 140

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published Dec 12, 2024 • 94

Perception Tokens Enhance Visual Reasoning in Multimodal Language Models

Paper • 2412.03548 • Published Dec 4, 2024 • 17

upvoted 4 papers 3 months ago

Multimodal Autoregressive Pre-training of Large Vision Encoders

Paper • 2411.14402 • Published Nov 21, 2024 • 43

OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision

Paper • 2411.07199 • Published Nov 11, 2024 • 47

LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation

Paper • 2411.04997 • Published Nov 7, 2024 • 37

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7, 2024 • 115

upvoted 6 papers 4 months ago

Teach Multimodal LLMs to Comprehend Electrocardiographic Images

Paper • 2410.19008 • Published Oct 21, 2024 • 23

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Paper • 2409.12191 • Published Sep 18, 2024 • 76

NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples

Paper • 2410.14669 • Published Oct 18, 2024 • 37

HumanEval-V: Benchmarking High-Level Visual Reasoning with Complex Diagrams in Coding Tasks

Paper • 2410.12381 • Published Oct 16, 2024 • 44

Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models

Paper • 2410.07985 • Published Oct 10, 2024 • 31

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks

Paper • 2410.10563 • Published Oct 14, 2024 • 39