---
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
---

# BGE-M3 in HuggingFace Transformers

> **This is not an official implementation of BGE-M3. The official implementation can be found in the [Flag Embedding](https://github.com/FlagOpen/FlagEmbedding) project.**

## Introduction

For the full introduction, please see the GitHub repo: https://github.com/liuyanyi/transformers-bge-m3

## Use BGE-M3 in HuggingFace Transformers

```python
from transformers import AutoModel, AutoTokenizer

# `model_path` can be a local path or the Hub id of this model.
model_path = "path/to/bge-m3-hf"

# trust_remote_code=True is required to load the custom model code
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True)

input_str = "Hello, world!"

input_ids = tokenizer(input_str, return_tensors="pt", padding=True, truncation=True)

output = model(**input_ids, return_dict=True)

# Pooled dense embedding; to align with the Flag Embedding project,
# L2 normalization is required
dense_output = output.dense_output
# Per-token ColBERT vectors; to align with the Flag Embedding project,
# L2 normalization is required
colbert_output = output.colbert_output
# Sparse (lexical) token weights
sparse_output = output.sparse_output
```
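## Align Outputs with FlagEmbedding

As the comments above note, the raw `dense_output` and `colbert_output` are not normalized, so they will not match the scores produced by the official FlagEmbedding code until they are L2-normalized. Below is a minimal sketch of that step, assuming the outputs follow FlagEmbedding's convention of unit-length dense and ColBERT vectors; the `model_path` placeholder and the example sentences are illustrative, not part of this repo.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

# Placeholder: use a local path or the Hub id of this model.
model_path = "path/to/bge-m3-hf"

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True)

sentences = ["What is BGE-M3?", "BGE-M3 is a multi-functional embedding model."]
inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)

with torch.no_grad():
    output = model(**inputs, return_dict=True)

# L2-normalize the pooled dense embeddings to unit length.
dense = F.normalize(output.dense_output, p=2, dim=-1)

# L2-normalize each per-token ColBERT vector the same way.
colbert = F.normalize(output.colbert_output, p=2, dim=-1)

# With unit-length vectors, cosine similarity is a plain dot product.
score = dense[0] @ dense[1]
print(f"dense similarity: {score.item():.4f}")
```

Because the vectors are unit length after normalization, the dot product equals cosine similarity, which is how FlagEmbedding computes its dense scores.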
## References

- [Official BGE-M3 Weight](https://huggingface.co/BAAI/bge-m3)
- [Flag Embedding](https://github.com/FlagOpen/FlagEmbedding)
- [HuggingFace Transformers](https://github.com/huggingface/transformers)