Marian Krotil committed
Commit: 6ff56c1
Parent(s): 03af426
Update README.md

README.md CHANGED
@@ -20,7 +20,7 @@ This model is a fine-tuned checkpoint of [facebook/mbart-large-cc25](https://hug
 The model deals with the task ``Headline + Text to Abstract`` (HT2A), which consists of generating a multi-sentence, abstract-style summary from a Czech news text.
 
 ## Dataset
-The model has been trained on the private CNC dataset provided by Czech News Center. The dataset includes 3/4M Czech news documents, each consisting of Headline, Abstract, and Full-text sections. Truncation and padding were set to 512 tokens.
+The model has been trained on the private CNC dataset provided by Czech News Center. The dataset includes 3/4M Czech news documents, each consisting of Headline, Abstract, and Full-text sections. Truncation and padding were set to 512 tokens for the encoder and 128 for the decoder.
 
 ## Training
 The model has been trained on 1x NVIDIA Tesla A100 40GB for 60 hours. During training, the model has seen 3712K documents, corresponding to roughly 5.5 epochs.
@@ -41,7 +41,7 @@ def summ_config():
         ("repetition_penalty", 1.2),
         ("no_repeat_ngram_size", None),
         ("early_stopping", True),
-        ("max_length",
+        ("max_length", 128),
         ("min_length", 10),
     ])),
     #texts to summarize
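
For readers of the first hunk: below is a minimal sketch of how the documented truncation and padding (512 tokens for the encoder, 128 for the decoder) could be applied with the `transformers` tokenizer. Only the base checkpoint name comes from the model card; the `src_lang`/`tgt_lang` choice and the way the Headline and Text fields are joined are assumptions, not code from this repository.

```python
# Sketch only: assumed preprocessing, not code from this repository.
from transformers import MBartTokenizer

# The base checkpoint is named in the model card; cs_CZ for the src/tgt
# language codes is an assumption (Czech is among mBART-cc25's 25 languages).
tokenizer = MBartTokenizer.from_pretrained(
    "facebook/mbart-large-cc25", src_lang="cs_CZ", tgt_lang="cs_CZ"
)

headline = "Titulek článku"         # hypothetical example headline
full_text = "Plný text zprávy ..."  # hypothetical example article body
abstract = "Vícevětá anotace ..."   # hypothetical example abstract

# Encoder input: Headline + Text, padded/truncated to 512 tokens.
enc = tokenizer(
    headline + " " + full_text,
    max_length=512, padding="max_length", truncation=True, return_tensors="pt",
)

# Decoder target: the Abstract, padded/truncated to the 128 tokens this commit documents.
labels = tokenizer(
    text_target=abstract,
    max_length=128, padding="max_length", truncation=True, return_tensors="pt",
).input_ids
```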
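
The second hunk's generation settings, read together with the new `max_length`, would correspond to a `generate()` call roughly like the one below. This is a sketch under stated assumptions: the checkpoint path is a placeholder, and `num_beams` is not shown in this diff (beam search is implied by `early_stopping`).

```python
# Sketch only: the generate() arguments mirror the hunk above; the checkpoint
# path is a placeholder and num_beams is an assumption (not shown in this diff).
from transformers import MBartForConditionalGeneration, MBartTokenizer

ckpt = "path/to/ht2a-checkpoint"  # hypothetical path to the fine-tuned model
tokenizer = MBartTokenizer.from_pretrained(ckpt, src_lang="cs_CZ")
model = MBartForConditionalGeneration.from_pretrained(ckpt)

text = "Titulek článku. Plný text zprávy ..."  # hypothetical Headline + Text input
batch = tokenizer(text, max_length=512, truncation=True, return_tensors="pt")

summary_ids = model.generate(
    **batch,
    num_beams=4,              # assumption; early_stopping implies beam search
    repetition_penalty=1.2,   # from the config above
    early_stopping=True,      # from the config above
    max_length=128,           # the value this commit sets
    min_length=10,            # from the config above
    # no_repeat_ngram_size is None in the config, i.e. the constraint is off
)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])
```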