---
language: ["ru", "en"]
tags:
- russian
license: mit
widget:
- text: "translate ru to en: Интересный момент. Модель не видела русских диалогов, но может их понимать"
---

This is a pruned version of mt5-base ([google/mt5-base](https://huggingface.co/google/mt5-base)) with only a subset of the Russian and English embeddings kept.

The model has been fine-tuned for several tasks:

* translation (OPUS-100 dataset)
* dialog (DailyDialog dataset)

How to use:

```python
# !pip install transformers sentencepiece
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = 'artemnech/enrut5-base'
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

def generate(text, **kwargs):
    # Encode the prompt, run beam search (or greedy) decoding, and return plain text
    model.eval()
    inputs = tokenizer(text, return_tensors='pt')
    with torch.no_grad():
        hypotheses = model.generate(**inputs, **kwargs)
    return tokenizer.decode(hypotheses[0], skip_special_tokens=True)

# Translation: Russian -> English
print(generate('translate ru to en: Интересный момент. Модель не видела русских диалогов, но может их понимать', num_beams=4))
# The Model didn't see Russian dialogues, but can understand them.

# Translation: English -> Russian
print(generate("translate en to ru: The Model didn't see Russian dialogues, but can understand them.", num_beams=4))
# Модель не видела русских диалога, но может понимать их.

# Dialog
print(generate('dialog: user1>>: Hello', num_beams=2))
# Hi
print(generate('dialog: user1>>: Hello user2>>: Hi user1>>: Would you like to drink something?', num_beams=2))
# I'd like to drink a cup of coffee.

# An interesting point: the model has not seen Russian dialogues, but can understand them.
print(generate('dialog: user1>>: Привет'))
# Hi
print(generate('dialog: user1>>: Привет user2>>: Hi user1>>: Хочешь выпить что-нибудь?', num_beams=2))
# I'd like to have a cup of coffee.
```
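
The examples above rely only on plain-text prompt prefixes: `translate ru to en:` / `translate en to ru:` for translation, and `dialog:` with `user1>>:` / `user2>>:` turn markers for dialog. Below is a minimal sketch of helpers that assemble these prompts; the helper names are hypothetical (not part of this model), and the sketch assumes the `generate` function defined above.

```python
# Minimal sketch: build prompts in the formats shown in the usage examples.
# Helper names are illustrative only; they simply wrap the prompt strings
# the fine-tuned tasks appear to expect.

def translation_prompt(text, src='ru', tgt='en'):
    # e.g. "translate ru to en: <text>"
    return f'translate {src} to {tgt}: {text}'

def dialog_prompt(turns):
    # turns alternate between user1 and user2,
    # e.g. ["Hello", "Hi", "Would you like to drink something?"]
    speakers = ['user1', 'user2']
    parts = [f'{speakers[i % 2]}>>: {turn}' for i, turn in enumerate(turns)]
    return 'dialog: ' + ' '.join(parts)

print(generate(translation_prompt('Интересный момент.'), num_beams=4))
print(generate(dialog_prompt(['Hello', 'Hi', 'Would you like to drink something?']), num_beams=2))
```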