Orr Zohar PRO

orrzohar

https://orrzohar.github.io

AI & ML interests

Large Multi-Modal Models, Foundation Models, Video Understanding

Recent Activity

upvoted a collection about 9 hours ago

upvoted a paper about 13 hours ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

new activity about 13 hours ago

HuggingFaceTB/SmolVLM2-2.2B-Instruct: checkpoint you are trying to load has model type `smolvlm` but Transformers does not recognize this

Organizations

Articles 1

Article

124

SmolVLM2: Bringing Video Understanding to Every Device

Collections 1

Papers 7

arxiv:2501.09755

arxiv:2412.10360

arxiv:2407.06189

arxiv:2403.10517

Spaces 1

Video STaR Dataset

Browse and view video data with questions and labels

Models 1

orrzohar/encoder

Updated Sep 20, 2024

Datasets 1

orrzohar/Video-STaR

Updated Jul 9, 2024 • 68 • 3