KW

kevineen

AI & ML interests

None yet

Recent Activity

liked a model about 8 hours ago

Kijai/SkyReels-V1-Hunyuan_comfy

liked a model about 9 hours ago

ModelSpace/GemmaX2-28-9B-v0.1

liked a model 1 day ago

OpenGVLab/InternVideo2_5_Chat_8B

Organizations

kevineen's activity

upvoted a paper 1 day ago

FlexiViT: One Model for All Patch Sizes

Paper • 2212.08013 • Published Dec 15, 2022 • 1

upvoted a paper 2 days ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 3 days ago • 97

upvoted an article 2 days ago

Article

SmolVLM2: Bringing Video Understanding to Every Device

4 days ago

• 135

upvoted an article 4 days ago

Article

PaliGemma 2 Mix - New Instruction Vision Language Models by Google

5 days ago

• 53

upvoted a paper 4 days ago

Magma: A Foundation Model for Multimodal AI Agents

Paper • 2502.13130 • Published 5 days ago • 42

upvoted a paper 7 days ago

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published 10 days ago • 139

upvoted 2 articles 8 days ago

Article

We now support VLMs in smolagents!

about 1 month ago

• 84

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

Jan 23

• 142

upvoted an article 11 days ago

Article

Build awesome datasets for video generation

12 days ago

• 25

upvoted a paper 12 days ago

CustomVideoX: 3D Reference Attention Driven Dynamic Adaptation for Zero-Shot Customized Video Diffusion Transformers

Paper • 2502.06527 • Published 13 days ago • 9

upvoted an article 15 days ago

Article

The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about...

Jan 20

• 61

upvoted a paper 15 days ago

MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation

Paper • 2502.04299 • Published 17 days ago • 15

upvoted 2 collections 15 days ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 3 items • Updated 27 days ago • 359

Qwen2-VL

Vision-language model series based on Qwen2 • 16 items • Updated Dec 6, 2024 • 207

upvoted a paper 16 days ago

DextrAH-G: Pixels-to-Action Dexterous Arm-Hand Grasping with Geometric Fabrics

Paper • 2407.02274 • Published Jul 2, 2024 • 1

upvoted an article 21 days ago

Article

State of open video generation models in Diffusers

28 days ago

• 40

upvoted 2 articles 26 days ago

Article

Welcome to Inference Providers on the Hub 🔥

27 days ago

• 384

Article

FineWeb2-C: Help Build Better Language Models in Your Language

By

and 5 others •

Dec 23, 2024

• 18

upvoted 2 collections about 1 month ago

Eagle 2

Eagle 2 is a family of frontier vision-language models with vision-centric design. The model supports 4K HD input, long-context video, and grounding. • 9 items • Updated Jan 23 • 31

INTELLECT-MATH

6 items • Updated Jan 22 • 2