Upcycling Large Language Models into Mixture of Experts • Paper 2410.07524 • Published Oct 10
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding • Paper 2404.16710 • Published Apr 25