Cyrile committed
Commit f93ddb0 · verified · 1 Parent(s): 37f957a

Update README.md

Files changed (1):
  1. README.md +44 -0

README.md CHANGED
@@ -84,3 +84,47 @@ Next, we evaluate the model in a cross-language context, with queries in French
  As observed, the cross-language context does not significantly impact the behavior of our models. If the model is used in a reranking context along with filtering of the
  Top-K results from a search, a threshold of 0.8 could be applied to filter the contexts returned by the retriever (a sketch of this step follows the usage example below),
  thereby reducing the noise present in the contexts for RAG-type applications.
+
+ How to Use Bloomz-3b-reranking
+ ------------------------------
+
+ The following example uses the pipeline API of the Transformers library.
+
+ ```python
+ import numpy as np
+ from transformers import pipeline
+ from scipy.spatial.distance import cdist
+
+ retriever = pipeline('feature-extraction', 'cmarkea/bloomz-3b-retriever')
+
+ # Important: take only the embedding of the last token!
+ infer = lambda x: [ii[0][-1] for ii in retriever(x)]
+
+ list_of_contexts = [...]
+ emb_contexts = np.stack(infer(list_of_contexts), axis=0)
+ list_of_queries = [...]
+ emb_queries = np.stack(infer(list_of_queries), axis=0)
+
+ # Important: use the L2 (Euclidean) distance!
+ dist = cdist(emb_queries, emb_contexts, 'euclidean')
+ top_k = lambda x: [
+     [list_of_contexts[qq] for qq in ii]
+     for ii in dist.argsort(axis=-1)[:, :x]
+ ]
+
+ # top 5 nearest contexts for each query
+ top_contexts = top_k(5)
+ ```
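+
+ The earlier paragraph suggests filtering the Top-K retrieved contexts with a 0.8 threshold on the reranker score. Below is a minimal sketch of that step, under
+ assumptions not stated in this README: the reranking checkpoint can be loaded with the text-classification pipeline, and the returned score can be read as a
+ relevance probability in [0, 1]. The query and contexts are placeholders.
+
+ ```python
+ from transformers import pipeline
+
+ # Assumption: the reranker loads as a text-classification pipeline.
+ reranker = pipeline('text-classification', 'cmarkea/bloomz-3b-reranking')
+
+ def rerank_and_filter(query, contexts, threshold=0.8):
+     # Score each (query, context) pair; dict(text=..., text_pair=...) is the
+     # standard pair input format of the text-classification pipeline.
+     outputs = reranker([dict(text=query, text_pair=ctx) for ctx in contexts])
+     # Depending on the checkpoint's label mapping, you may need to select the
+     # score of the positive label rather than that of the top predicted label.
+     return [
+         (ctx, out['score'])
+         for ctx, out in zip(contexts, outputs)
+         if out['score'] >= threshold
+     ]
+
+ # Placeholder query and Top-K contexts returned by the retriever above.
+ query = "..."
+ retrieved_contexts = ["...", "...", "..."]
+ filtered_contexts = rerank_and_filter(query, retrieved_contexts)
+ ```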
+
+ Citation
+ --------
+
+ ```bibtex
+ @online{DeBloomzReranking,
+   AUTHOR = {Cyrile Delestre},
+   ORGANIZATION = {Cr{\'e}dit Mutuel Ark{\'e}a},
+   URL = {https://huggingface.co/cmarkea/bloomz-3b-reranking},
+   YEAR = {2024},
+   KEYWORDS = {NLP ; Transformers ; LLM ; Bloomz},
+ }
+ ```