Is there a way to integrate the retriever model with the LangChain framework for computing embeddings?
Hello,
Thanks for sharing your models.
When using LangChain, the standard way to compute embeddings with a Hugging Face model seems to rely on the HuggingFaceEmbeddings class, which in turn expects a Sentence Transformers model.
But your retriever model doesn't seem to provide this interface. Indeed, when we run the following code:
from langchain.embeddings import HuggingFaceEmbeddings
lc_embedding_model = HuggingFaceEmbeddings(model_name='cmarkea/bloomz-560m-retriever')
we get the following warning:
No sentence-transformers model found with name cmarkea/bloomz-560m-retriever. Creating a new one with MEAN pooling.
Note: this is actually the same warning that we get when we run the following code:
from sentence_transformers import SentenceTransformer
hf_embedding_model = SentenceTransformer('cmarkea/bloomz-560m-retriever')
Can we safely ignore this warning? Or what would be the recommended way to use your retriever model for computing embeddings within the LangChain framework?
Thanks
Hello JeromeL-DT,
We are using an inference server for the embedding part, so we simply make an API call within LangChain; hence, we are not using the HuggingFaceEmbeddings class. The warning looks problematic to me: by default, the pooler averages over all tokens, which doesn't make much sense for a causal model, since only the last token is influenced by all tokens in the sentence. You need to make sure the pooler uses the EOS (last) token.
Otherwise, one way to check whether the two approaches give identical results is to test both models:
import numpy as np
from transformers import pipeline
from sentence_transformers import SentenceTransformer

# Embedding via the Transformers pipeline: keep the last token's vector.
retriever = pipeline('feature-extraction', 'cmarkea/bloomz-560m-retriever')
test_transformer = np.array(retriever("Hello world!")[0][-1])

# Embedding via Sentence Transformers (default MEAN pooling).
retriever_st = SentenceTransformer('cmarkea/bloomz-560m-retriever')
test_st = retriever_st.encode("Hello world!")

# If this assertion fails, the two pooling strategies differ.
assert np.array_equal(test_transformer, test_st)
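To see why that assertion can fail, here is a toy illustration (made-up numbers, purely for intuition) of mean pooling versus last-token pooling on a per-token embedding matrix:

```python
import numpy as np

# Made-up per-token hidden states for a 3-token sentence (hidden size 2).
token_embeddings = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [2.0, 2.0],  # last (EOS-position) token: the only one attending to the full sentence
])

mean_pooled = token_embeddings.mean(axis=0)  # what the default MEAN pooler returns
last_token = token_embeddings[-1]            # what a causal retriever should use

print(mean_pooled)  # [1. 1.]
print(last_token)   # [2. 2.]
```

The two vectors differ whenever the last token's hidden state is not the average of all tokens, which is essentially always the case in practice.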
That said, from searching online it seems possible to create a custom class that handles inference however you like: apparently the only constraint is to implement the embed_documents method. This way, you can fully utilize the Transformers pipeline.
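As a sketch of that idea (the `langchain_core` import path, the class name, and the wiring are my assumptions; adjust to your LangChain version), a minimal custom embeddings class wrapping a feature-extraction callable could look like this:

```python
from typing import Callable, List

try:
    # Recent LangChain versions expose the base class here.
    from langchain_core.embeddings import Embeddings
except ImportError:  # fall back so the sketch also runs without LangChain installed
    Embeddings = object

class LastTokenEmbeddings(Embeddings):
    """Hypothetical wrapper around a Hugging Face feature-extraction pipeline
    that keeps only the last (EOS-position) token embedding."""

    def __init__(self, extract: Callable):
        # e.g. extract = pipeline('feature-extraction', 'cmarkea/bloomz-560m-retriever')
        self._extract = extract

    def embed_query(self, text: str) -> List[float]:
        token_embeddings = self._extract(text)[0]  # one vector per token
        return [float(x) for x in token_embeddings[-1]]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self.embed_query(t) for t in texts]
```

With this in place, something like `LastTokenEmbeddings(pipeline('feature-extraction', 'cmarkea/bloomz-560m-retriever'))` could be passed wherever LangChain expects an Embeddings instance.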
Hello Cyrile,
Thank you so much for your detailed answer and your research!
FYI, my initial motivation was to try out LangChain's (as of this writing, still experimental) Semantic Chunking, which relies on an Embeddings instance.
Taking a closer look, this Semantic Chunking feature is not yet available in a langchain_experimental package that I could easily install.