library_name: transformers
tags: []
Translation and Fusion Improves Zero-shot Cross-lingual Information Extraction
Summary
We propose TransFusion, a framework in which models are fine-tuned to use English translations of low-resource language data, enabling more precise predictions through annotation fusion. Based on TransFusion, we introduce GoLLIE-TF, a cross-lingual instruction-tuned LLM for IE tasks, designed to close the performance gap between high and low-resource languages.
- ๐ Paper: Translation and Fusion Improves Zero-shot Cross-lingual Information Extraction
- ๐ค Model: GoLLIE-7B-TF
- ๐ Example Jupyter Notebooks: GoLLIE-TF Notebooks
Important: This is based on GoLLIE README (Our flash attention implementation has small numerical differences compared to the attention implementation in Huggingface.
You must use the flag trust_remote_code=True
or you will get inferior results. Flash attention requires an available CUDA GPU. Running GOLLIE
pre-trained models on a CPU is not supported. We plan to address this in future releases. First, install flash attention 2:)
pip install flash-attn --no-build-isolation
pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary
Then you can load the model using
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("HiTZ/GoLLIE-7B")
model = AutoModelForCausalLM.from_pretrained("HiTZ/GoLLIE-7B", trust_remote_code=True, torch_dtype=torch.bfloat16)
model.to("cuda")