
Release Notes

  • This model is finetuned from mt5-small.

  • It uses about 1.5 GB of VRAM; loaded in fp16 it needs less than 1 GB (if the batch size is small), and CPU inference speed is acceptable.

  • Trained on a trimmed slice of the Pontoon dataset, keeping only the ja→zh translation pairs.

  • Also mixed in a scrambled batch of translations from mt5-translation-ja_zh-game-v0.1, which adds a large amount of junk to the training data.

  • Reasons for making this model:
    Testing the idea of using the Pontoon dataset
    Building a flexible translation evaluation standard, which needs a poor-performing model as a baseline
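
The notes above describe building training data from Pontoon ja→zh pairs plus scrambled machine translations. A minimal sketch of how such pairs might be formatted for mT5 finetuning, assuming the same `<-ja2zh->` task prefix used in the inference example below (the helper and field names are hypothetical, not the author's actual preprocessing):

```python
def make_training_pairs(pairs, prefix='<-ja2zh->'):
    """Format (ja, zh) pairs into source/target dicts for seq2seq finetuning.

    `pairs` is a list of (japanese, chinese) tuples; the task prefix matches
    the one expected at inference time. Hypothetical helper, not the model
    author's actual preprocessing code.
    """
    return [
        {'source': f'{prefix} {ja}', 'target': zh}
        for ja, zh in pairs
    ]

examples = make_training_pairs([('こんにちは', '你好')])
print(examples[0]['source'])  # -> '<-ja2zh-> こんにちは'
```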

Model Release Statement

  • This model is continued training from mt5-translation-ja_zh.
  • It uses more than 1.5 GB of VRAM; loading in fp16 takes less than 1 GB (raising the batch size pushes it above 1 GB), and CPU inference speed is acceptable.
  • Reason for making this model:
    Experimenting with finetuning an existing model; small models train extremely fast
  • Known limitations:
    It exists purely for testing; although its VRAM usage is very low, its translation quality is very poor
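
One stated goal is a flexible translation evaluation standard that uses this weak model as a baseline to compare against. A crude character n-gram overlap score (a rough chrF-style sketch, entirely hypothetical and not the author's actual metric) could rank candidate translations against a reference:

```python
def char_ngram_overlap(hypothesis, reference, n=2):
    """Fraction of reference character n-grams present in the hypothesis.

    A crude stand-in for metrics like chrF: 1.0 means every reference
    n-gram is covered, 0.0 means none are. Hypothetical helper.
    """
    def ngrams(s, n):
        return {s[i:i + n] for i in range(len(s) - n + 1)}

    ref = ngrams(reference, n)
    if not ref:
        return 0.0
    hyp = ngrams(hypothesis, n)
    return len(ref & hyp) / len(ref)

# A better translation should score higher against the reference.
print(char_ngram_overlap('你好世界', '你好世界'))  # -> 1.0
print(char_ngram_overlap('再见', '你好世界'))      # -> 0.0
```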

Simple Backend Application

Not yet stable or fully debugged; use with caution.

Usage Guide

A more detailed example of using the model:

from transformers import pipeline

model_name = "iryneko571/mt5-translation-ja_zh-game-small"
pipe = pipeline(
    "translation",
    model=model_name,
    repetition_penalty=1.4,
    batch_size=1,
    max_length=256,
)

def translate_batch(batch, language='<-ja2zh->'):  # batch is a list of strings
    # prepend the task prefix to every input line
    prefixed = [f'{language} {text}' for text in batch]
    translated = pipe(prefixed)
    return [item['translation_text'] for item in translated]

inputs = []  # fill this list with Japanese strings to translate

print(translate_batch(inputs))
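
The pipeline above is configured with batch_size=1 and max_length=256, so long game scripts have to be fed in pieces. A hypothetical helper (not part of the model's API) that groups script lines into batches under a rough character budget might look like this; character count is only a loose proxy for token count:

```python
def chunk_lines(lines, budget=200):
    """Group script lines into batches whose combined length stays under
    `budget` characters, so each batch fits comfortably within max_length.
    A line longer than the budget gets a batch of its own.
    Hypothetical helper; character count only approximates token count.
    """
    batches, current, size = [], [], 0
    for line in lines:
        if current and size + len(line) > budget:
            batches.append(current)
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        batches.append(current)
    return batches

script = ['こんにちは', 'お元気ですか', 'さようなら']
for batch in chunk_lines(script, budget=12):
    print(batch)  # each batch can then be passed to translate_batch(...)
```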

Roadmap

  • Scramble more translation results from gpt4o, gpt3.5, claude, mt5 and other sources to make an even messier input mix
  • Increase translation accuracy
  • Apply LoRA and int8 inference to further reduce hardware requirements
  • Create ONNX and NCNN models

How to Find Me

Discord server:
https://discord.gg/JmjPmJjA
Join if you need any help, want to try the latest version on the test server, or just want to chat (is it even allowed to post this here?).
