Export and use ONNX format?
Hi all,
I have just exported an mBART-50 model to ONNX format with the following script:
CONVERSION TO ONNX ------------------
import torch
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
model_path = "./mbart_large_50_model"
model = MBartForConditionalGeneration.from_pretrained(model_path)
tokenizer = MBart50TokenizerFast.from_pretrained(model_path)
input_text = "This is just simple text"
inputs = tokenizer(input_text, return_tensors="pt")
onnx_path = "./mbart_large_50.onnx"
# export to ONNX format
torch.onnx.export(
    model,                        # PyTorch model
    (inputs["input_ids"],),       # input data (only input_ids is passed)
    onnx_path,                    # path to the ONNX file
    input_names=["input_ids"],    # input tensor names
    output_names=["logits"],      # output tensor names
    dynamic_axes={"input_ids": {0: "batch", 1: "sequence"}},
    opset_version=14,
)
-----------------END OF CONVERSION-----------------
I got an output file, but I am not sure whether I missed something.
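One thing I noticed while checking the export: as far as I understand from the transformers source, when mBART gets only input_ids it builds decoder_input_ids itself by shifting input_ids one position to the right and wrapping the last non-pad token to the front. So a graph traced with input_ids alone would have no separate decoder input. Below is a pure-Python sketch of that shift, for illustration only. It is not the library code, and the token ids are made up.

```python
def shift_tokens_right(rows, pad_id):
    """Pure-Python illustration of mBART's decoder-input shift.

    Each row is a list of token ids, padded on the right with pad_id.
    The last non-pad token wraps around to position 0 and every other
    token moves one step to the right (the final position is dropped).
    """
    shifted = []
    for row in rows:
        # Index of the last non-pad token (assumes right padding).
        last = sum(1 for tok in row if tok != pad_id) - 1
        shifted.append([row[last]] + row[:-1])
    return shifted


# Hypothetical ids: 9 = source language code, 2 = </s>, 1 = <pad>.
encoder_ids = [[9, 10, 11, 2]]
decoder_ids = shift_tokens_right(encoder_ids, pad_id=1)
print(decoder_ids)  # [[2, 9, 10, 11]]
```

If that reading is right, the decoder in my exported file always sees a shifted copy of the source sentence, which is not what generate() feeds it.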
-Now, when working with the original PyTorch model (not the exported ONNX one), there is a way to specify the source and target languages for translation:
....
model = MBartForConditionalGeneration.from_pretrained(model_path)
tokenizer = MBart50TokenizerFast.from_pretrained(model_path)
# specify src language
tokenizer.src_lang = "en_XX"
...
generated_tokens = model.generate(**inputs, forced_bos_token_id=tokenizer.lang_code_to_id["hr_HR"])
-Pay attention to:
forced_bos_token_id=tokenizer.lang_code_to_id["hr_HR"]
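For context, my understanding is that forced_bos_token_id does not change the input at all. During generate(), at the first generation step (when the decoder holds only its start token), every score except the forced id is masked out, so the first generated token is always the target language code. Here is a minimal pure-Python sketch of that masking; the function name and the scores are hypothetical, and this is not the transformers implementation.

```python
import math


def force_first_token(scores, cur_len, forced_bos_id):
    """Mask out everything except forced_bos_id, but only at the first
    generation step (cur_len == 1: the decoder holds just its start token)."""
    if cur_len != 1:
        return list(scores)  # later steps are left untouched
    masked = [-math.inf] * len(scores)
    masked[forced_bos_id] = 0.0
    return masked


# Made-up scores over a 4-token vocabulary; id 3 plays the role of "hr_HR".
scores = [0.3, 2.5, -1.0, 0.7]
first = force_first_token(scores, cur_len=1, forced_bos_id=3)
later = force_first_token(scores, cur_len=2, forced_bos_id=3)
print(first.index(max(first)))  # 3
print(later.index(max(later)))  # 1
```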
-I am wondering whether there is a way (and how) to do the same with the exported ONNX model?
#TRIED SO FAR---
import numpy as np
from transformers import MBart50TokenizerFast

tokenizer = MBart50TokenizerFast.from_pretrained(tokenizer_path)
tokenizer.src_lang = "en_XX"
input_text = "Hello, my name is BART"
inputs = tokenizer(input_text, return_tensors="np")
target_lang_id = tokenizer.lang_code_to_id["hr_HR"]
input_ids = np.array(inputs["input_ids"], dtype=np.int64)
# prepend the target language id to the (encoder) input ids
start_token = np.array([[target_lang_id]], dtype=np.int64)
input_ids = np.concatenate((start_token, input_ids), axis=1)
-As you can see, I am trying to put target_lang_id as the first entry of the input.
But this way only the first word is translated, not the rest.
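My current guess at why this fails: model.generate() runs a loop that calls the model once per output token, while the ONNX graph is a single forward pass, and prepending target_lang_id to input_ids only changes what the encoder sees. Below is a rough pure-Python sketch of the greedy loop I think I would need. It assumes a graph exported with both input_ids and decoder_input_ids (which my script above does not do), hidden behind a hypothetical step_fn that returns next-token logits. toy_step is a stand-in for testing the loop, not a real model.

```python
def greedy_decode(step_fn, encoder_ids, start_id, forced_bos_id, eos_id,
                  max_new_tokens=20):
    """Greedy decoding with a forced first token.

    step_fn(encoder_ids, decoder_ids) is a hypothetical wrapper around one
    forward pass; it must return the logits for the next token as a list.
    """
    decoder_ids = [start_id]
    for _ in range(max_new_tokens):
        if len(decoder_ids) == 1:
            # First generated token: force the target language code.
            next_id = forced_bos_id
        else:
            logits = step_fn(encoder_ids, decoder_ids)
            next_id = max(range(len(logits)), key=logits.__getitem__)
        decoder_ids.append(next_id)
        if next_id == eos_id:
            break
    return decoder_ids


def toy_step(encoder_ids, decoder_ids):
    """Stand-in for the model: 'translates' by copying the source tokens."""
    vocab_size = 8
    pos = min(len(decoder_ids) - 1, len(encoder_ids) - 1)
    target = encoder_ids[pos]
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


# Made-up ids: 5 = source language code, 6 = target language code, 2 = </s>.
result = greedy_decode(toy_step, [5, 3, 4, 2],
                       start_id=2, forced_bos_id=6, eos_id=2)
print(result)  # [2, 6, 3, 4, 2]
```

With a real model, step_fn would presumably run the ONNX session on the full decoder_ids each time, which is slow without cached key/values, so this is only meant to show where the target language id goes.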
-Any help?