Files changed (1)
  1. README.md +42 -27
README.md CHANGED
@@ -2651,6 +2651,44 @@ Jina Embeddings V2 [technical report](https://arxiv.org/abs/2310.19923)
2651
 
2652
  ## Usage
2653
 
2654
  You can use Jina Embedding models directly from the transformers package:
2655
  ```python
2656
  !pip install transformers
@@ -2678,8 +2716,9 @@ Alternatively, you can use Jina AI's [Embeddings platform](https://jina.ai/embed
2678
 
2679
  ## RAG Performance
2680
 
2681
- Jina Embeddings are very effective for retrieval augmented generation (RAG).
2682
- Ravi Theja wrote a [blog post](https://blog.llamaindex.ai/boosting-rag-picking-the-best-embedding-reranker-models-42d079022e83) on using Jina Embeddings together with [LLama Index](https://github.com/run-llama/llama_index) for RAG:
 
2683
 
2684
 
2685
  <img src="https://miro.medium.com/v2/resize:fit:4800/format:webp/1*ZP2RVejCZovF3FDCg-Bx3A.png" width="780px">
@@ -2706,28 +2745,4 @@ If you find Jina Embeddings useful in your research, please cite the following p
2706
  archivePrefix={arXiv},
2707
  primaryClass={cs.CL}
2708
  }
2709
- ```
2710
-
2711
- <!---
2712
-
2713
- ``` latex
2714
- @misc{günther2023jina,
2715
- title={Beyond the 512-Token Barrier: Training General-Purpose Text
2716
- Embeddings for Large Documents},
2717
- author={Michael Günther and Jackmin Ong and Isabelle Mohr and Alaeddine Abdessalem and Tanguy Abel and Mohammad Kalim Akram and Susana Guzman and Georgios Mastrapas and Saba Sturua and Bo Wang},
2718
- year={2023},
2719
- eprint={2307.11224},
2720
- archivePrefix={arXiv},
2721
- primaryClass={cs.CL}
2722
- }
2723
-
2724
- @misc{günther2023jina,
2725
- title={Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models},
2726
- author={Michael Günther and Louis Milliken and Jonathan Geuter and Georgios Mastrapas and Bo Wang and Han Xiao},
2727
- year={2023},
2728
- eprint={2307.11224},
2729
- archivePrefix={arXiv},
2730
- primaryClass={cs.CL}
2731
- }
2732
- ```
2733
- -->
 
2651
 
2652
  ## Usage
2653
 
2654
+ **<details><summary>Please apply mean pooling when integrating the model.</summary>**
2655
+ <p>
2656
+
2657
+ ### Why mean pooling?
2658
+
2659
+ `mean pooling` takes all token embeddings from the model output and averages them at the sentence/paragraph level.
2660
+ It has proven to be the most effective way to produce high-quality sentence embeddings.
2661
+ We offer an `encode` function to deal with this.
2662
+
2663
+ However, if you would like to do it without using the default `encode` function:
2664
+
2665
+ ```python
2666
+ import torch
2667
+ import torch.nn.functional as F
2668
+ from transformers import AutoTokenizer, AutoModel
2669
+
2670
+ def mean_pooling(model_output, attention_mask):
2671
+     token_embeddings = model_output[0]
2672
+     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
2673
+     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
2674
+
2675
+ sentences = ['How is the weather today?', 'What is the current weather like today?']
2676
+
2677
+ tokenizer = AutoTokenizer.from_pretrained('jinaai/jina-embeddings-v2-small-en')
2678
+ model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-small-en', trust_remote_code=True)
2679
+
2680
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
2681
+
2682
+ with torch.no_grad():
2683
+     model_output = model(**encoded_input)
2684
+
2685
+ embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
2686
+ embeddings = F.normalize(embeddings, p=2, dim=1)
2687
+ ```
2688
+
2689
+ </p>
2690
+ </details>
2691
+
2692
  You can use Jina Embedding models directly from the transformers package:
2693
  ```python
2694
  !pip install transformers
 
2716
 
2717
  ## RAG Performance
2718
 
2719
+ According to the latest blog post from [LlamaIndex](https://blog.llamaindex.ai/boosting-rag-picking-the-best-embedding-reranker-models-42d079022e83),
2720
+
2721
+ > In summary, to achieve the peak performance in both hit rate and MRR, the combination of OpenAI or JinaAI-Base embeddings with the CohereRerank/bge-reranker-large reranker stands out.
2722
 
2723
 
2724
  <img src="https://miro.medium.com/v2/resize:fit:4800/format:webp/1*ZP2RVejCZovF3FDCg-Bx3A.png" width="780px">
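
The quoted summary compares models by hit rate and MRR (mean reciprocal rank). As a rough illustration of what these two metrics measure, here is a minimal sketch. It assumes exactly one relevant document per query, and the function names and toy data are hypothetical; the blog post's actual evaluation code may differ.

```python
from typing import Hashable, Sequence


def hit_rate(
    ranked_results: Sequence[Sequence[Hashable]],
    relevant: Sequence[Hashable],
    k: int = 5,
) -> float:
    """Fraction of queries whose relevant document appears in the top-k results."""
    hits = sum(
        1 for results, target in zip(ranked_results, relevant) if target in results[:k]
    )
    return hits / len(relevant)


def mean_reciprocal_rank(
    ranked_results: Sequence[Sequence[Hashable]],
    relevant: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the relevant document; a query scores 0 if it is not retrieved."""
    total = 0.0
    for results, target in zip(ranked_results, relevant):
        if target in results:
            total += 1.0 / (list(results).index(target) + 1)
    return total / len(relevant)


# Hypothetical toy data: three queries, each with a ranked list of retrieved document ids.
ranked = [["d1", "d2", "d3"], ["d4", "d5", "d6"], ["d7", "d8", "d9"]]
gold = ["d1", "d5", "d0"]  # "d0" is never retrieved for the third query

print(hit_rate(ranked, gold, k=2))         # two of three queries hit within the top 2
print(mean_reciprocal_rank(ranked, gold))  # (1/1 + 1/2 + 0) / 3 = 0.5
```

Hit rate only checks whether the relevant document is retrieved at all within the top k, while MRR also rewards ranking it higher, which is why a reranker can improve MRR even when the hit rate stays the same.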
 
2745
  archivePrefix={arXiv},
2746
  primaryClass={cs.CL}
2747
  }
2748
+ ```