Model Card for i-k-a/my_lenta_model_ru_mt5-base_4_epochs
This model is trained to summarize news articles. It was trained on data collected from the Russian news site Lenta.ru.
Model Details
Model Description
- Developed by: i-k-a
- Shared by [optional]: i-k-a
- Model type: Transformer Text2Text Generation
- Language(s) (NLP): Russian
- Finetuned from model [optional]: mT5-base
Model Sources [optional]
- Repository: link
How to Get Started with the Model
Use the code below to run inference with the model.
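from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MAX_INPUT_TOKENS = 1024  # inputs were cut to 1024 tokens during training
MAX_NEW_TOKENS = 400     # upper bound on the length of the generated summary
MODEL_DIR = 'i-k-a/my_lenta_model_ru_mt5-base_4_epochs'

text = input('Enter text: ')
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_DIR)
# Tokenize the article, truncating it to the input length used in training
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=MAX_INPUT_TOKENS).input_ids
# Greedy decoding (no sampling)
outputs = model.generate(inputs, max_new_tokens=MAX_NEW_TOKENS, do_sample=False)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(f'Summary from the model: "{result}"\n\nSource text: "{text}"')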
Training Details
The model was fine-tuned for 4 epochs. Input text is truncated to 1024 tokens, and output is limited to 400 tokens. Training used Google Colab resources.
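The length limits above could be applied in a preprocessing step along the following lines. This is a minimal sketch, not the author's training code: the function name `preprocess` and the column names `text` and `summary` are hypothetical, and it assumes the 400-token limit was also applied to the reference summaries. The `tokenizer` argument is expected to behave like a Hugging Face tokenizer (for example one loaded with `AutoTokenizer.from_pretrained`).

```python
MAX_INPUT_TOKENS = 1024   # input truncation length stated in the card
MAX_TARGET_TOKENS = 400   # output length stated in the card


def preprocess(batch, tokenizer, text_column="text", summary_column="summary"):
    """Tokenize a batch of articles and reference summaries with truncation.

    The column names are hypothetical; adjust them to the dataset at hand.
    """
    # Articles longer than MAX_INPUT_TOKENS are cut off at that length.
    model_inputs = tokenizer(
        batch[text_column], max_length=MAX_INPUT_TOKENS, truncation=True
    )
    # Reference summaries are tokenized as targets and truncated as well.
    labels = tokenizer(
        text_target=batch[summary_column],
        max_length=MAX_TARGET_TOKENS,
        truncation=True,
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```

A function of this shape is typically passed to a dataset's batched `map` call before training.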
Technical Specifications [optional]
Model Architecture and Objective
google/mt5-base
Compute Infrastructure
Google Colab
Hardware
Google Colab T4 GPU
Software
Python
Evaluation results
- ROUGE-1 on lenta.ru (self-reported): 0.130
- ROUGE-2 on lenta.ru (self-reported): 0.034
- ROUGE-L on lenta.ru (self-reported): 0.132
- ROUGE-LSUM on lenta.ru (self-reported): 0.131
- ValidationLoss on lenta.ru (self-reported): 0.619
- gen_len on lenta.ru (self-reported): 19.000
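For readers unfamiliar with the ROUGE-N scores listed above, the sketch below shows how an n-gram overlap F1 score can be computed. It is a simplified illustration with plain whitespace tokenization, not the evaluation script used for this model. Standard ROUGE implementations may tokenize and normalize text differently, so the numbers it produces are not directly comparable to the reported ones. The function names `ngrams` and `rouge_n` are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return a simplified ROUGE-N F1 score between two strings."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Calling `rouge_n(summary, reference, n=1)` gives a unigram score and `n=2` a bigram score. Identical strings score 1.0 and strings with no shared n-grams score 0.0.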