antoinelouis committed
Commit 5c4728f
1 Parent(s): 4a94c57
Update README.md
README.md CHANGED
@@ -88,7 +88,7 @@ The model is fine-tuned on the French version of the [mMARCO](https://huggingfac
 - a development set of ~101k queries;
 - a smaller dev set of 6,980 queries (which is actually used for evaluation in most published works).
 
-The triples are sampled from the ~39.8M triples
+The triples are sampled from the ~39.8M triples of [triples.train.small.tsv](https://microsoft.github.io/msmarco/Datasets.html#passage-ranking-dataset). In the future, better negatives could be selected by exploiting the [msmarco-hard-negatives](https://huggingface.co/datasets/sentence-transformers/msmarco-hard-negatives) dataset that contains 50 hard negatives mined from BM25 and 12 dense retrievers for each training query.
 
 ## Citation
 
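
The added README line says the training triples are sampled from a much larger triples file. The commit does not say how the sampling was done, so the following is only a minimal sketch of one possible approach: uniform reservoir sampling over a stream of tab-separated `query<TAB>positive<TAB>negative` lines. The function names `read_triples` and `sample_triples`, the seed, and the choice of reservoir sampling are all assumptions made for illustration, not the author's method.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Triple = Tuple[str, str, str]  # (query, positive passage, negative passage)


def read_triples(lines: Iterable[str]) -> Iterator[Triple]:
    """Parse tab-separated lines into triples, skipping malformed rows.

    Hypothetical helper: assumes one `query\tpositive\tnegative` per line.
    """
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) == 3:
            yield parts[0], parts[1], parts[2]


def sample_triples(triples: Iterable[Triple], k: int, seed: int = 42) -> List[Triple]:
    """Draw up to k triples uniformly at random in a single pass.

    Uses reservoir sampling (Algorithm R), so the full file never has to
    fit in memory. This is an illustrative choice, not the method used
    for the model in this repository.
    """
    rng = random.Random(seed)
    reservoir: List[Triple] = []
    for i, triple in enumerate(triples):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(triple)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = triple
    return reservoir
```

With a real file, the same functions could be applied to an open file handle, for example `sample_triples(read_triples(open(path, encoding="utf-8")), k)`, where `path` and `k` are supplied by the caller.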