antoinelouis committed 40f5e71 (parent: 89e576d): Update README.md

README.md CHANGED
---
tags:
- passage-reranking
library_name: sentence-transformers
base_model: nreimers/mMiniLMv2-L12-H384-distilled-from-XLMR-Large
model-index:
- name: crossencoder-mMiniLMv2-L12-mmarcoFR
  results:
  - task:
      type: text-classification
      name: Passage Reranking
    dataset:
      type: unicamp-dl/mmarco
      name: mMARCO-fr
      config: french
      split: validation
    metrics:
    - type: recall_at_500
      name: Recall@500
      value: 96.03
    - type: recall_at_100
      name: Recall@100
      value: 84.74
    - type: recall_at_10
      name: Recall@10
      value: 58.41
    - type: mrr_at_10
      name: MRR@10
      value: 32.96
---

# crossencoder-mMiniLMv2-L12-mmarcoFR
[...]

```python
...
print(scores)
```

## Evaluation

The model is evaluated on the smaller development set of [mMARCO-fr](https://ir-datasets.com/mmarco.html#mmarco/v2/fr/), which consists of 6,980 queries for which a set of 1,000 candidate passages, comprising the positive(s) and [ColBERTv2 hard negatives](https://huggingface.co/datasets/antoinelouis/msmarco-dev-small-negatives), has to be reranked. We report the mean reciprocal rank (MRR) and recall at various cut-offs (R@k). To see how the model compares to other neural retrievers in French, check out the [*DécouvrIR*](https://huggingface.co/spaces/antoinelouis/decouvrir) leaderboard.
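As a concrete illustration of how these cut-off metrics are computed, here is a minimal self-contained sketch; the rankings below are toy data, not the actual evaluation output.

```python
def mrr_at_k(rankings, k=10):
    """Mean reciprocal rank of the first relevant passage within the top k."""
    total = 0.0
    for labels in rankings:  # one list of 0/1 relevance labels per query, sorted by score
        for rank, rel in enumerate(labels[:k], start=1):
            if rel:
                total += 1.0 / rank
                break
    return total / len(rankings)

def recall_at_k(rankings, k=10):
    """Fraction of each query's relevant passages found in the top k, averaged over queries."""
    total = 0.0
    for labels in rankings:
        n_rel = sum(labels)
        if n_rel:
            total += sum(labels[:k]) / n_rel
    return total / len(rankings)

# Two toy queries: the first has its positive at rank 2, the second at rank 1.
rankings = [[0, 1, 0, 0], [1, 0, 0, 0]]
print(mrr_at_k(rankings))     # (1/2 + 1/1) / 2 = 0.75
print(recall_at_k(rankings))  # both positives appear in the top 10 -> 1.0
```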

***

#### Data

We use the French training samples from the [mMARCO](https://huggingface.co/datasets/unicamp-dl/mmarco) dataset, a multilingual machine-translated version of MS MARCO that contains 8.8M passages and 539K training queries. We do not use the BM25 negatives provided by the official dataset but instead sample harder negatives mined from 12 distinct dense retrievers, using the [msmarco-hard-negatives](https://huggingface.co/datasets/sentence-transformers/msmarco-hard-negatives#msmarco-hard-negativesjsonlgz) distillation dataset. In total, we sample 2.6M training triplets of the form (query, passage, relevance) with a positive-to-negative ratio of 1 (i.e., 50% of the pairs are relevant and 50% are irrelevant).
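The 1:1 positive-to-negative sampling described above can be sketched as follows; the function and variable names are illustrative, not taken from the actual training code.

```python
import random

def build_training_triplets(positives, hard_negatives, seed=42):
    """For each (query, positive) pair, draw one mined hard negative,
    yielding a 50/50 split of (query, passage, relevance) triplets."""
    rng = random.Random(seed)
    triplets = []
    for query, pos_passages in positives.items():
        for passage in pos_passages:
            triplets.append((query, passage, 1))
            triplets.append((query, rng.choice(hard_negatives[query]), 0))
    rng.shuffle(triplets)
    return triplets

triplets = build_training_triplets(
    {"q1": ["relevant passage"]},
    {"q1": ["hard negative A", "hard negative B"]},
)
```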
#### Implementation

The model is initialized from the [nreimers/mMiniLMv2-L12-H384-distilled-from-XLMR-Large](https://huggingface.co/nreimers/mMiniLMv2-L12-H384-distilled-from-XLMR-Large) checkpoint and optimized via the binary cross-entropy loss (as in [monoBERT](https://doi.org/10.48550/arXiv.1910.14424)). It is fine-tuned on one 80GB NVIDIA H100 GPU for 20k steps using the AdamW optimizer with a batch size of 128 and a constant learning rate of 2e-5. We set the maximum sequence length of the concatenated question-passage pairs to 256 tokens. At inference, we apply the sigmoid function to the model's output logit to obtain relevance scores between 0 and 1.
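The training objective and inference scoring can be illustrated with a dependency-free sketch; the real model produces the logit from the concatenated query-passage pair, while the logits here are hard-coded for demonstration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce_with_logits(logit, label):
    """Binary cross-entropy on a raw relevance logit, as used for fine-tuning."""
    p = sigmoid(logit)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

# A confident positive and a confident negative both yield a low loss...
loss = 0.5 * (bce_with_logits(3.0, 1) + bce_with_logits(-3.0, 0))
# ...and at inference the sigmoid maps logits to relevance scores in (0, 1).
score = sigmoid(3.0)
```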

***

## Citation

|
@online{louis2024decouvrir,
	author    = {Antoine Louis},
	title     = {DécouvrIR: A Benchmark for Evaluating the Robustness of Information Retrieval Models in French},
	publisher = {Hugging Face},
	month     = {mar},
	year      = {2024},
	url       = {https://huggingface.co/spaces/antoinelouis/decouvrir},
}
```