Real-Time Video Generation with Pyramid Attention Broadcast Paper • 2408.12588 • Published Aug 22
DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers Paper • 2403.10266 • Published Mar 15
HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices Paper • 2403.01164 • Published Mar 2