tomsherborne committed bfe20d3
Parent(s): c94ddeb
Add the "max_length" parameter to the Generation configuration.
The 12B model does not match the performance of the 1.2B model because generation defaults to a max_length of 20, producing shorter sequences than the model should generate. For example, on WMT14 DE-EN the 12B model scores 15.52 SacreBLEU while the 1.2B model scores 31.786. The default max_length is set correctly in the smaller models (see https://huggingface.co/facebook/m2m100_1.2B/blob/main/generation_config.json) and the 12B models should match this. I am submitting similar PRs for the other 12B models.
generation_config.json CHANGED (+1 -0)
@@ -2,6 +2,7 @@
   "_from_model_config": true,
   "bos_token_id": 0,
   "decoder_start_token_id": 2,
+  "max_length": 200,
   "eos_token_id": 2,
   "pad_token_id": 1,
   "transformers_version": "4.27.0.dev0"
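To illustrate why the missing key matters: a minimal pure-Python sketch of the fallback behavior described above, where a generation_config.json without "max_length" drops to the library default (20 in current transformers versions, stated here as an assumption about the library; the helper name `effective_max_length` is illustrative, not a transformers API).

```python
import json

# Assumed library fallback when generation_config.json omits "max_length".
DEFAULT_MAX_LENGTH = 20

def effective_max_length(config_json: str) -> int:
    """Return the max_length that generation would use for this config."""
    config = json.loads(config_json)
    return config.get("max_length", DEFAULT_MAX_LENGTH)

# Trimmed-down configs, before and after this commit's change.
patched = '{"bos_token_id": 0, "max_length": 200, "eos_token_id": 2}'
unpatched = '{"bos_token_id": 0, "eos_token_id": 2}'

print(effective_max_length(patched))    # 200: sequences can run to full length
print(effective_max_length(unpatched))  # 20: translations are cut off early
```

With the patched config, decoding can continue to 200 tokens; without it, every translation is truncated at 20 tokens, which explains the large SacreBLEU gap reported in the description.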