Art Atk

ArtAtk

AI & ML interests

Multimodal Models

Organizations

None yet

ArtAtk's activity

upvoted a paper about 3 hours ago

DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation

Paper • 2503.10618 • Published about 17 hours ago • 8

upvoted 3 papers about 4 hours ago

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Paper • 2503.10437 • Published about 20 hours ago • 6

SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation

Paper • 2503.09641 • Published 2 days ago • 6

Transformers without Normalization

Paper • 2503.10622 • Published about 17 hours ago • 18

upvoted a paper 1 day ago

MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice

Paper • 2503.05978 • Published 6 days ago • 30

upvoted a paper 16 days ago

SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference

Paper • 2502.18137 • Published 17 days ago • 53

upvoted 2 papers 17 days ago

Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment

Paper • 2502.16894 • Published 18 days ago • 27

Audio-FLAN: A Preliminary Release

Paper • 2502.16584 • Published 19 days ago • 34

upvoted a paper 20 days ago

Dynamic Concepts Personalization from Single Videos

Paper • 2502.14844 • Published 22 days ago • 16

upvoted a paper 26 days ago

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published 29 days ago • 143

upvoted 7 papers about 1 month ago

FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation

Paper • 2502.05179 • Published Feb 7 • 24

Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis

Paper • 2502.04128 • Published Feb 6 • 25

VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models

Paper • 2502.02492 • Published Feb 4 • 62

OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models

Paper • 2502.01061 • Published Feb 3 • 189

upvoted 3 papers about 2 months ago

DiffuEraser: A Diffusion Model for Video Inpainting

Paper • 2501.10018 • Published Jan 17 • 14

Temporal Preference Optimization for Long-Form Video Understanding

Paper • 2501.13919 • Published Jan 23 • 22

Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

Paper • 2501.13926 • Published Jan 23 • 37