
This is a Hugging Face transformers-style conversion of the original 15B-parameter sparse Mixture-of-Experts (SMoE) model, with BFLOAT16 weights, from the paper "Efficient Large Scale Language Modeling with Mixtures of Experts" by Artetxe et al. The original model card can be found at https://github.com/facebookresearch/fairseq/blob/main/examples/moe_lm/model_card.md.

A usage example and the modeling code can be found at https://github.com/pingzhili/light-fairseq.
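
As a rough illustration, the converted checkpoint should load through the standard transformers API. The sketch below is a minimal example, not taken from the linked repo: the Hub repo id is a hypothetical placeholder (substitute the actual one), and it assumes the custom SMoE modeling code is picked up via `trust_remote_code=True`.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- replace with the actual Hub id of this conversion.
model_id = "your-namespace/moe-15b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the checkpoint is stored in BFLOAT16
    trust_remote_code=True,      # custom SMoE modeling code, assumed to live in the repo
)

inputs = tokenizer("Mixtures of experts scale language models by", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```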
