@mayank-mishra on Hugging Face: "Thrilled to unveil DS-MoE: a dense training and sparse inference scheme for…"

mayank-mishra

posted an update Apr 9

Thrilled to unveil DS-MoE: a dense training and sparse inference scheme for enhanced computational and memory efficiency in your MoE models! 🚀🚀🚀

Discover more in our blog: https://huggingface.co/blog/bpan/ds-moe and dive into the details with our paper: Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models (2404.05567)

victor

Apr 10

Great writing!
