---
language:
- en
license: mit
tags:
- english
---
This is a version of the [google/mt5-base](https://huggingface.co/google/mt5-base) model reduced to English only, with some of the original embeddings kept.
* The `sentencepiece` vocabulary was shrunk from 250K to 20K tokens (the top 20K English tokens). This reduced the number of model parameters to 244M and the model size from 2.2GB to 0.9GB, 39% of the original.
The approach is taken from the article [How to adapt a multilingual T5 model for a single language](https://cointegrated.medium.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90).
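
The core idea of vocabulary shrinking can be illustrated with a toy sketch: count token frequencies in a corpus, keep the most frequent token ids plus any reserved ones, and keep only the matching rows of the embedding matrix. This is not the code used to build this model. The function names `select_kept_ids` and `shrink_embeddings`, the toy corpus, and the 6-token vocabulary are all hypothetical, and plain Python lists stand in for the real model weights.

```python
from collections import Counter


def select_kept_ids(tokenized_corpus, keep_n, always_keep=()):
    """Return old token ids to keep: reserved ids first, then the most frequent ones."""
    counts = Counter(tok for sent in tokenized_corpus for tok in sent)
    kept = list(dict.fromkeys(always_keep))  # preserve order, drop duplicates
    for tok, _ in counts.most_common():
        if len(kept) >= keep_n:
            break
        if tok not in kept:
            kept.append(tok)
    return kept


def shrink_embeddings(embeddings, kept_ids):
    """Keep only the rows for kept_ids; return the new matrix and an old->new id map."""
    new_embeddings = [embeddings[i] for i in kept_ids]
    old_to_new = {old: new for new, old in enumerate(kept_ids)}
    return new_embeddings, old_to_new


# Toy data (assumption): a 6-token vocabulary with 2-dimensional embeddings.
embeddings = [[float(i), float(i) * 10] for i in range(6)]
corpus = [[3, 5, 3], [5, 3, 1]]  # sentences already tokenized into ids

# Keep 3 tokens in total; id 0 stands in for a reserved special token.
kept_ids = select_kept_ids(corpus, keep_n=3, always_keep=(0,))
small, old_to_new = shrink_embeddings(embeddings, kept_ids)
```

In a real model the same row selection would have to be applied to both the input embeddings and the output projection, and the tokenizer vocabulary would need to be rebuilt so that its ids match the `old_to_new` mapping.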