DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention Paper • 2309.14327 • Published Sep 25, 2023 • 21
Teaching Transformers Causal Reasoning through Axiomatic Training Paper • 2407.07612 • Published Jul 10 • 2