
🤗 Language model initialized from mT5 and trained for an additional 100K steps on the Prefix LM objective using mC4 data.
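The Prefix LM objective adapts a span-corruption model for left-to-right generation: each unlabeled sequence is split into a fully visible prefix (the input) and a continuation the model must predict. A minimal sketch of this data construction, assuming token lists as input (the function name and random split policy are illustrative, not taken from the T5X codebase):

```python
import random

def make_prefix_lm_example(tokens, rng=None):
    """Split a token sequence into a (prefix, continuation) pair.

    The prefix is fed to the encoder with full visibility; the
    continuation is the decoder's generation target.
    """
    rng = rng or random.Random(0)
    # Choose a split point so both sides are non-empty.
    split = rng.randint(1, len(tokens) - 1)
    return tokens[:split], tokens[split:]

# Example: turn one unlabeled sequence into an (input, target) pair.
inputs, targets = make_prefix_lm_example(list("hello world"))
```

During LM adaptation, the pretraining corpus (here mC4) is streamed through this kind of split, so no labeled data is needed.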

Paper: Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation

Authors: Tu Vu, Aditya Barua, Brian Lester, Daniel Cer, Mohit Iyyer, Noah Constant

PyTorch port of the original Flax checkpoint from the Google T5X repository.

Model size: 3.74B params (F32 tensors, Safetensors format).