MUST FOLLOWS - a williamcstanford Collection

updated Jun 6, 2024

Explorative Inbetweening of Time and Space
Paper • 2403.14611 • Published Mar 21, 2024 • 11

MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies
Paper • 2403.01422 • Published Mar 3, 2024 • 26

DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation
Paper • 2402.11929 • Published Feb 19, 2024 • 10

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
Paper • 2403.14773 • Published Mar 21, 2024 • 10

SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions
Paper • 2403.16627 • Published Mar 25, 2024 • 20

CameraCtrl: Enabling Camera Control for Text-to-Video Generation
Paper • 2404.02101 • Published Apr 2, 2024 • 22

PointInfinity: Resolution-Invariant Point Diffusion Models
Paper • 2404.03566 • Published Apr 4, 2024 • 13

MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model
Paper • 2404.19759 • Published Apr 30, 2024 • 24

InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation
Paper • 2404.19427 • Published Apr 30, 2024 • 71

StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
Paper • 2405.01434 • Published May 2, 2024 • 53

LogoMotion: Visually Grounded Code Generation for Content-Aware Animation
Paper • 2405.07065 • Published May 11, 2024 • 16

Naturalistic Music Decoding from EEG Data via Latent Diffusion Models
Paper • 2405.09062 • Published May 15, 2024 • 9

Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation
Paper • 2405.14598 • Published May 23, 2024 • 11

Searching Priors Makes Text-to-Video Synthesis Better
Paper • 2406.03215 • Published Jun 5, 2024 • 11