Shehan Munasinghe

shehan97

https://shehanmunasinghe.github.io/

AI & ML interests

Computer Vision, Multi-modal learning

shehan97's activity

upvoted a paper 4 days ago

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published 8 days ago • 55

upvoted 2 papers about 1 month ago

MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation

Paper • 2411.17636 • Published Nov 26, 2024 • 2

MALT: Improving Reasoning with Multi-Agent LLM Training

Paper • 2412.01928 • Published Dec 2, 2024 • 40

upvoted a paper 2 months ago

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

Paper • 2411.04923 • Published Nov 7, 2024 • 20

upvoted a paper 11 months ago

PALO: A Polyglot Large Multimodal Model for 5B People

Paper • 2402.14818 • Published Feb 22, 2024 • 23

upvoted 2 papers about 1 year ago

PG-Video-LLaVA: Pixel Grounding Large Video-Language Models

Paper • 2311.13435 • Published Nov 22, 2023 • 16

GLaMM: Pixel Grounding Large Multimodal Model

Paper • 2311.03356 • Published Nov 6, 2023 • 33

upvoted a paper over 1 year ago

TokenFlow: Consistent Diffusion Features for Consistent Video Editing

Paper • 2307.10373 • Published Jul 19, 2023 • 56