This is a test of converting the GPT-J 6B architecture into a mixture-of-experts model. It is initialized with 4 experts (2 active per token) and otherwise the same configuration as GPT-J-6B, making this a 17B-parameter model in total.
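
The "4 experts, 2 active" layout described above can be sketched in plain Python. This is an illustrative top-2 routing toy, not the repo's actual modeling code — the gate scores and the toy expert functions here are made up for demonstration.

```python
import math

def top2_route(gate_scores):
    """Return the indices of the 2 highest-scoring experts."""
    return sorted(range(len(gate_scores)), key=lambda i: -gate_scores[i])[:2]

def moe_forward(x, gate_scores, experts):
    """Combine the outputs of the 2 active experts, weighted by their
    softmax-normalized gate scores (computed over the active pair only)."""
    active = top2_route(gate_scores)
    weights = [math.exp(gate_scores[i]) for i in active]
    total = sum(weights)
    return sum((w / total) * experts[i](x) for w, i in zip(weights, active))

# 4 toy "experts": each just scales its input by a constant.
experts = [lambda x, k=k: k * x for k in (1.0, 2.0, 3.0, 4.0)]
gate = [0.1, 2.0, 0.3, 1.5]   # experts 1 and 3 score highest

print(top2_route(gate))                 # [1, 3]
print(moe_forward(1.0, gate, experts))
```

Only 2 of the 4 expert MLPs run per token, which is why the active compute per token stays close to GPT-J-6B even though the total parameter count roughly triples.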

The model weights were initialized randomly, not loaded from the pretrained GPT-J, for testing purposes. This model is not usable for any downstream purpose unless you're trying to generate absolute gibberish, in which case it is perfect for your use case. You have been warned.

Be sure to pass `trust_remote_code=True` to `AutoModelForCausalLM.from_pretrained` if you still want to use this model for some god-forsaken reason.
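
A minimal loading sketch, assuming a placeholder repo id (substitute the actual repo path); it downloads ~17B parameters of random weights, so only run it if you really mean to:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id — replace with the actual repo path.
model_id = "user/gptj-moe-test"

# trust_remote_code=True is required because the MoE architecture
# is implemented in custom modeling code shipped inside the repo.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype="bfloat16",  # weights are stored in BF16
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```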

Model size: 17.3B params · Tensor type: BF16 · Format: Safetensors

Note that the serverless Inference API does not yet support model repos that contain custom code, so this model must be loaded locally.