jinaai/jina-embeddings-v2-small-en

@@ -2651,6 +2651,44 @@ Jina Embeddings V2 [technical report](https://arxiv.org/abs/2310.19923)
 ## Usage
 You can use Jina Embedding models directly from the transformers package:
 ```python
 !pip install transformers
@@ -2678,8 +2716,9 @@ Alternatively, you can use Jina AI's [Embeddings platform](https://jina.ai/embed
 ## RAG Performance
-Jina Embeddings are very effective for retrieval augmented generation (RAG).
-Ravi Theja wrote a [blog post](https://blog.llamaindex.ai/boosting-rag-picking-the-best-embedding-reranker-models-42d079022e83) on using Jina Embeddings together with [LLama Index](https://github.com/run-llama/llama_index) for RAG:
 <img src="https://miro.medium.com/v2/resize:fit:4800/format:webp/1*ZP2RVejCZovF3FDCg-Bx3A.png" width="780px">
@@ -2706,28 +2745,4 @@ If you find Jina Embeddings useful in your research, please cite the following p
       archivePrefix={arXiv},
       primaryClass={cs.CL}
 }
-```
-<!---
-``` latex
-@misc{günther2023jina,
-      title={Beyond the 512-Token Barrier: Training General-Purpose Text
-Embeddings for Large Documents},
-      author={Michael Günther and Jackmin Ong and Isabelle Mohr and Alaeddine Abdessalem and Tanguy Abel and Mohammad Kalim Akram and Susana Guzman and Georgios Mastrapas and Saba Sturua and Bo Wang},
-      year={2023},
-      eprint={2307.11224},
-      archivePrefix={arXiv},
-      primaryClass={cs.CL}
-}
-@misc{günther2023jina,
-      title={Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models},
-      author={Michael Günther and Louis Milliken and Jonathan Geuter and Georgios Mastrapas and Bo Wang and Han Xiao},
-      year={2023},
-      eprint={2307.11224},
-      archivePrefix={arXiv},
-      primaryClass={cs.CL}
-}
-```
--->

 ## Usage
+**<details><summary>Please apply mean pooling when integrating the model.</summary>**
+<p>
+### Why mean pooling?
+Mean pooling takes all token embeddings from the model output and averages them at the sentence or paragraph level.
+It has proven to be the most effective way to produce high-quality sentence embeddings.
+We offer an `encode` function that handles this for you.
+However, if you would like to do it without the default `encode` function:
+```python
+import torch
+import torch.nn.functional as F
+from transformers import AutoTokenizer, AutoModel
+
+def mean_pooling(model_output, attention_mask):
+    # The first element of model_output holds the per-token embeddings.
+    token_embeddings = model_output[0]
+    # Expand the attention mask over the embedding dimension so padding tokens contribute zero.
+    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+    # Sum the unmasked embeddings and divide by the token count (clamped to avoid division by zero).
+    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
+
+sentences = ['How is the weather today?', 'What is the current weather like today?']
+
+tokenizer = AutoTokenizer.from_pretrained('jinaai/jina-embeddings-v2-small-en')
+model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-small-en', trust_remote_code=True)
+
+# Tokenize the batch, padding to the longest sentence.
+encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
+
+# Run the model without tracking gradients.
+with torch.no_grad():
+    model_output = model(**encoded_input)
+
+# Pool token embeddings into one vector per sentence, then L2-normalize each vector.
+embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
+embeddings = F.normalize(embeddings, p=2, dim=1)
+```
+</p>
+</details>
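To see why the attention mask matters in mean pooling, here is a minimal, dependency-free sketch on toy data. It is only an illustration, not part of the model's code: the helper name `masked_mean_pool` and the toy vectors are hypothetical, and plain Python lists stand in for tensors.

```python
def masked_mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of one sentence, skipping padded positions.

    Hypothetical helper for illustration; lists stand in for tensors.
    """
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            kept += 1
            for i, value in enumerate(vector):
                totals[i] += value
    # Guard against a row that is all padding (same idea as clamping the denominator).
    denom = max(kept, 1e-9)
    return [t / denom for t in totals]


# Three token slots; the last one is padding (mask 0) and holds junk values.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = masked_mean_pool(tokens, mask)
# A naive average over all slots lets the padding vector distort the result.
naive = [sum(column) / len(tokens) for column in zip(*tokens)]

print(pooled)  # [2.0, 3.0]
print(naive)
```

With the mask applied, only the two real tokens are averaged, so the padded slot has no influence on the sentence embedding.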
 You can use Jina Embedding models directly from the transformers package:
 ```python
 !pip install transformers
 ## RAG Performance
+According to a recent blog post from [LlamaIndex](https://blog.llamaindex.ai/boosting-rag-picking-the-best-embedding-reranker-models-42d079022e83):
+> In summary, to achieve the peak performance in both hit rate and MRR, the combination of OpenAI or JinaAI-Base embeddings with the CohereRerank/bge-reranker-large reranker stands out.
 <img src="https://miro.medium.com/v2/resize:fit:4800/format:webp/1*ZP2RVejCZovF3FDCg-Bx3A.png" width="780px">
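The quote above refers to hit rate and MRR (mean reciprocal rank). As a rough sketch of how these two retrieval metrics are commonly computed, assuming exactly one relevant document per query and a ranked top-k list of retrieved ids: the function names and toy ids below are hypothetical, and this does not reproduce the blog post's evaluation setup.

```python
def hit_rate(ranked_ids_per_query, relevant_id_per_query):
    """Fraction of queries whose relevant id appears anywhere in the retrieved list."""
    hits = sum(
        1
        for ranked, relevant in zip(ranked_ids_per_query, relevant_id_per_query)
        if relevant in ranked
    )
    return hits / len(ranked_ids_per_query)


def mean_reciprocal_rank(ranked_ids_per_query, relevant_id_per_query):
    """Average of 1 / rank of the relevant id; a miss contributes 0."""
    total = 0.0
    for ranked, relevant in zip(ranked_ids_per_query, relevant_id_per_query):
        if relevant in ranked:
            total += 1.0 / (ranked.index(relevant) + 1)
    return total / len(ranked_ids_per_query)


# Toy data: the relevant id is ranked 1st, 2nd, and missing, respectively.
retrieved = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
relevant = ["a", "e", "z"]

print(hit_rate(retrieved, relevant))
print(mean_reciprocal_rank(retrieved, relevant))  # 0.5
```

Hit rate only checks whether the relevant document was retrieved at all, while MRR also rewards placing it near the top, which is why a reranker can improve MRR even when hit rate stays the same.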
       archivePrefix={arXiv},
       primaryClass={cs.CL}
 }
+```