DistilLED Large CNN 16384

distil-led-large-cnn-16384 was initialized from sshleifer/distilbart-cnn-12-6, in the same way that allenai/led-large-16384 was initialized from bart-large.

To process 16K tokens, the position embedding matrix of sshleifer/distilbart-cnn-12-6 was copied 16 times (16 × 1,024 positions = 16,384 positions).
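The copying step can be illustrated with a small sketch. This is not the conversion script used for this checkpoint; it only shows the tiling idea on a toy matrix with NumPy. The function name `tile_position_embeddings` and the toy sizes are hypothetical, and the sketch ignores details such as any index offset in the original embedding table.

```python
import numpy as np


def tile_position_embeddings(pos_emb: np.ndarray, copies: int) -> np.ndarray:
    """Repeat a (num_positions, hidden_dim) matrix `copies` times along the position axis.

    Hypothetical helper for illustration only; not taken from the model's conversion code.
    """
    if pos_emb.ndim != 2:
        raise ValueError("expected a 2-D (num_positions, hidden_dim) matrix")
    # Stack `copies` identical blocks one after another, keeping hidden_dim unchanged.
    return np.tile(pos_emb, (copies, 1))


# Toy example: 4 positions with 3-dimensional embeddings, extended 16 times.
short = np.arange(4 * 3, dtype=np.float32).reshape(4, 3)
extended = tile_position_embeddings(short, copies=16)

print(short.shape, "->", extended.shape)
```

With a real 1,024-position matrix the same call would yield 16,384 rows, where position `i` reuses the embedding originally learned for position `i % 1024`.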

Load this checkpoint with LEDForConditionalGeneration.from_pretrained. See the LED documentation for more information.
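A minimal loading sketch follows, assuming the `transformers` and `torch` packages are installed and that the checkpoint is published under the repository ID `HHousen/distil-led-large-cnn-16384` with a tokenizer that `AutoTokenizer` can resolve. The helper name `summarize` and the generation settings (`num_beams`, `max_summary_tokens`) are illustrative choices, not values recommended by the model author.

```python
MODEL_ID = "HHousen/distil-led-large-cnn-16384"  # assumed Hub repository ID


def summarize(text: str, model_id: str = MODEL_ID, max_summary_tokens: int = 256) -> str:
    """Summarize `text` with an LED checkpoint (hypothetical helper, untuned settings)."""
    # Imported here so that defining the helper does not require the heavy dependencies.
    import torch
    from transformers import AutoTokenizer, LEDForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = LEDForConditionalGeneration.from_pretrained(model_id)

    # Truncate to the 16,384-token input limit of this checkpoint.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=16384)

    # LED uses local attention by default; give the first token global attention,
    # a common choice for summarization.
    global_attention_mask = torch.zeros_like(inputs["input_ids"])
    global_attention_mask[:, 0] = 1

    summary_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        global_attention_mask=global_attention_mask,
        num_beams=4,
        max_length=max_summary_tokens,
    )
    return tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0]
```

Calling `summarize(long_document)` downloads the weights on first use, so expect a large download and substantial memory use for long inputs.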

