rotem israeli

irotem98

https://rotem154154.github.io

rotem154154

AI & ML interests

None yet

Organizations

None yet

irotem98's activity

liked a model about 3 hours ago

timm/mvitv2_tiny.fb_in1k

Image Classification • Updated Jan 21 • 2.07k • 1

upvoted 2 papers about 4 hours ago

CoMP: Continual Multimodal Pre-training for Vision Foundation Models

Paper • 2503.18931 • Published 1 day ago • 16

Long-Context Autoregressive Video Modeling with Next-Frame Prediction

Paper • 2503.19325 • Published 1 day ago • 47

upvoted a paper 1 day ago

Training-free Diffusion Acceleration with Bottleneck Sampling

Paper • 2503.18940 • Published 1 day ago • 12

upvoted a collection 6 days ago

Orpheus TTS

Collection

TTS Towards Human-Sounding Speech • 2 items • Updated 7 days ago • 50

upvoted a paper 6 days ago

TULIP: Towards Unified Language-Image Pretraining

Paper • 2503.15485 • Published 7 days ago • 43

updated a dataset 7 days ago

irotem98/imagenet_3gb

Updated 7 days ago • 15

published a dataset 7 days ago

irotem98/imagenet_3gb

Updated 7 days ago • 15

published a dataset 8 days ago

irotem98/sfhq_encoded

Updated 8 days ago • 13

upvoted 2 papers 9 days ago

PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity

Paper • 2503.07677 • Published 16 days ago • 80

ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Paper • 2503.11647 • Published 12 days ago • 119

liked a dataset 10 days ago

cloneofsimo/ye-pop-vae-t5-xl-metadata

Updated Jun 4, 2024 • 87 • 1

upvoted 2 papers 13 days ago

Reangle-A-Video: 4D Video Generation as Video-to-Video Translation

Paper • 2503.09151 • Published 14 days ago • 29

TPDiff: Temporal Pyramid Video Diffusion Model

Paper • 2503.09566 • Published 14 days ago • 42

upvoted a paper 14 days ago

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Paper • 2503.08638 • Published 15 days ago • 59

upvoted 2 papers 15 days ago

LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning

Paper • 2503.04812 • Published 22 days ago • 13

EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer

Paper • 2503.07027 • Published 16 days ago • 26

upvoted a paper 16 days ago

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published 20 days ago • 84

upvoted a paper 26 days ago

UniTok: A Unified Tokenizer for Visual Generation and Understanding

Paper • 2502.20321 • Published 27 days ago • 29

upvoted a paper 29 days ago

DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks

Paper • 2502.17157 • Published 30 days ago • 51