sn4kebyt3 committed on
Commit
9673a4d
1 Parent(s): 655f4bc

Update README.md

Files changed (1)
  1. README.md +5 -2
README.md CHANGED
@@ -4,8 +4,11 @@ language:
  - ru
  - en
  library_name: transformers
- pipeline_tag: text2text-generation
  tags:
  - mbart
  - mbart-50
- ---
+ ---
+
+ This is a smaller version of [facebook/mbart-large-50](facebook/mbart-large-50) with only the Russian and English embeddings retained.
+
+ The sentencepiece vocabulary was shrunk from 250k to 25k tokens (the 10k most common English tokens and the 15k most common Russian tokens). The creation of this model is heavily based on David Dale's method described [here](https://cointegrated.medium.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90), with some mBART-specific additions.
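
The vocabulary reduction described in the added README text can be illustrated with a minimal, library-free sketch. It is not the author's actual script: the function names `select_kept_ids` and `shrink_embeddings`, the toy counts, and the `special_ids` list are all hypothetical. The README does not say what the mBART-specific additions are; the sketch assumes one plausible requirement, namely that special tokens and language-code tokens must be kept regardless of frequency.

```python
from collections import Counter


def select_kept_ids(counts_by_lang, budget_by_lang, always_keep=()):
    """Pick the most frequent token ids per language, plus ids that must survive."""
    kept = set(always_keep)
    for lang, counts in counts_by_lang.items():
        # most_common(n) returns the n highest-count (token_id, count) pairs
        for token_id, _ in counts.most_common(budget_by_lang[lang]):
            kept.add(token_id)
    # Sorting preserves the relative order of tokens from the original vocabulary
    return sorted(kept)


def shrink_embeddings(embeddings, kept_ids):
    """Return only the embedding rows for kept_ids and a map from old id to new id."""
    new_rows = [embeddings[i] for i in kept_ids]
    old_to_new = {old: new for new, old in enumerate(kept_ids)}
    return new_rows, old_to_new


# Toy stand-ins (hypothetical): a 10-token vocabulary with 2-dimensional embeddings
embeddings = [[float(i), float(i) * 2] for i in range(10)]
counts_by_lang = {
    "en": Counter({3: 50, 4: 40, 5: 1}),
    "ru": Counter({6: 30, 7: 20, 4: 10, 8: 2}),
}
budget_by_lang = {"en": 2, "ru": 3}
# Assumption: ids of special tokens and language codes that are always kept
special_ids = [0, 1, 2, 9]

kept_ids = select_kept_ids(counts_by_lang, budget_by_lang, special_ids)
new_embeddings, old_to_new = shrink_embeddings(embeddings, kept_ids)
```

Token 4 is selected for both languages in this toy data, so the final vocabulary is smaller than the sum of the per-language budgets plus the special ids. In a real model the same row selection would be applied to the input embedding matrix and to any output projection tied to it, and the tokenizer's sentencepiece model would need to be rebuilt to match `old_to_new`.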