piotr-rybak committed on
Commit
f1f9016
1 Parent(s): 10982b1

update evaluation

Files changed (1)
  1. README.md +16 -30
README.md CHANGED
@@ -29,36 +29,22 @@ It was initialized from the [HerBERT-base](https://huggingface.co/allegro/herber
  ## Evaluation
 
 
- ### Accuracy@10
-
- | **Model** | [**PolQA**](https://huggingface.co/datasets/ipipan/polqa) | [**Allegro FAQ**](https://huggingface.co/datasets/piotr-rybak/allegro-faq) | [**Legal Questions**](https://huggingface.co/datasets/piotr-rybak/legal-questions) | **Average** |
- |:-----------------------|------------:|------------:|------------:|------------:|
- | BM25 | 61.35 | 66.89 | **96.38** | 74.87 |
- | BM25 (lemma) | 71.49 | 75.33 | 94.57 | 80.46 |
- | MiniLM-L12-v2 | 37.24 | 71.67 | 78.97 | 62.62 |
- | LaBSE | 46.23 | 67.11 | 81.34 | 64.89 |
- | mContriever-Base | 78.66 | 84.44 | 95.82 | 86.31 |
- | E5-Base | 86.61 | 91.89 | 96.24 | 91.58 |
- | ST-DistilRoBERTa | 48.43 | 84.89 | 88.02 | 73.78 |
- | ST-MPNet | 56.80 | 86.00 | 87.19 | 76.66 |
- | HerBERT-QA | 75.84 | 85.78 | 91.09 | 84.23 |
- | **SilverRetriever** | **87.24** | **94.56** | 95.54 | **92.45** |
-
- ### NDCG@10
-
- | **Model** | **PolQA** | **Allegro FAQ** | **Legal Questions** | **Average** |
- |:-----------------------|-------------:|-------------:|-------------:|-------------:|
- | BM25 | 24.51 | 48.71 | **82.21** | 51.81 |
- | BM25 (lemma) | 31.97 | 55.70 | 78.65 | 55.44 |
- | MiniLM-L12-v2 | 11.93 | 51.25 | 54.44 | 39.21 |
- | LaBSE | 15.53 | 46.71 | 56.16 | 39.47 |
- | mContriever-Base | 36.30 | 67.38 | 77.42 | 60.37 |
- | E5-Base | **46.08** | 75.90 | 77.69 | 66.56 |
- | ST-DistilRoBERTa | 16.73 | 64.39 | 63.76 | 48.29 |
- | ST-MPNet | 21.55 | 65.44 | 62.99 | 49.99 |
- | HerBERT-QA | 32.52 | 63.58 | 66.99 | 54.36 |
- | **SilverRetriever** | 43.40 | **79.66** | 77.10 | **66.72** |
-
+ | **Model** | **Average [Acc]** | **Average [NDCG]** | [**PolQA**](https://huggingface.co/datasets/ipipan/polqa) **[Acc]** | [**PolQA**](https://huggingface.co/datasets/ipipan/polqa) **[NDCG]** | [**Allegro FAQ**](https://huggingface.co/datasets/piotr-rybak/allegro-faq) **[Acc]** | [**Allegro FAQ**](https://huggingface.co/datasets/piotr-rybak/allegro-faq) **[NDCG]** | [**Legal Questions**](https://huggingface.co/datasets/piotr-rybak/legal-questions) **[Acc]** | [**Legal Questions**](https://huggingface.co/datasets/piotr-rybak/legal-questions) **[NDCG]** |
+ |--------------------:|------------:|-------------:|------------:|-------------:|------------:|-------------:|------------:|-------------:|
+ | BM25 | 74.87 | 51.81 | 61.35 | 24.51 | 66.89 | 48.71 | **96.38** | **82.21** |
+ | BM25 (lemma) | 80.46 | 55.44 | 71.49 | 31.97 | 75.33 | 55.70 | 94.57 | 78.65 |
+ | [MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) | 62.62 | 39.21 | 37.24 | 11.93 | 71.67 | 51.25 | 78.97 | 54.44 |
+ | [LaBSE](https://huggingface.co/sentence-transformers/LaBSE) | 64.89 | 39.47 | 46.23 | 15.53 | 67.11 | 46.71 | 81.34 | 56.16 |
+ | [mContriever-Base](https://huggingface.co/nthakur/mcontriever-base-msmarco) | 86.31 | 60.37 | 78.66 | 36.30 | 84.44 | 67.38 | 95.82 | 77.42 |
+ | [E5-Base](https://huggingface.co/intfloat/multilingual-e5-base) | 91.58 | 66.56 | 86.61 | **46.08** | 91.89 | 75.90 | 96.24 | 77.69 |
+ | [ST-DistilRoBERTa](https://huggingface.co/sdadas/st-polish-paraphrase-from-distilroberta) | 73.78 | 48.29 | 48.43 | 16.73 | 84.89 | 64.39 | 88.02 | 63.76 |
+ | [ST-MPNet](sdadas/st-polish-paraphrase-from-mpnet) | 76.66 | 49.99 | 56.80 | 21.55 | 86.00 | 65.44 | 87.19 | 62.99 |
+ | [HerBERT-QA](https://huggingface.co/ipipan/herbert-base-qa-v1) | 84.23 | 54.36 | 75.84 | 32.52 | 85.78 | 63.58 | 91.09 | 66.99 |
+ | [**SilverRetriever**](https://huggingface.co/ipipan/silver-retriever-base-v1) | **92.45** | **66.72** | **87.24** | 43.40 | **94.56** | **79.66** | 95.54 | 77.10 |
+
+ Legend:
+ - **Acc** is the Accuracy at 10
+ - **NDCG** is the Normalized Discounted Cumulative Gain at 10
 
 
  ## Usage
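
Note on the metrics named in the legend of this change: the README does not spell out how Accuracy at 10 and NDCG at 10 are computed, so the following is only a minimal sketch under common assumptions, not the authors' evaluation code. It assumes binary relevance (a passage is either relevant to a question or not), that Accuracy at 10 counts a question as a hit when at least one relevant passage appears in the top 10, and that per-question scores are later averaged and reported as percentages. The function names `accuracy_at_k` and `ndcg_at_k` and the passage ids are hypothetical.

```python
import math
from typing import Sequence, Set


def accuracy_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Return 1.0 if any of the top-k retrieved passages is relevant, else 0.0.

    Assumption: "accuracy" here means a per-question hit indicator.
    """
    return float(any(pid in relevant_ids for pid in ranked_ids[:k]))


def ndcg_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Binary-relevance NDCG@k for a single question.

    Assumption: gain is 1 for a relevant passage and 0 otherwise,
    discounted by log2(position + 1) with positions starting at 1.
    """
    # enumerate() is 0-based, so position = rank + 1 and the discount is log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, pid in enumerate(ranked_ids[:k])
        if pid in relevant_ids
    )
    # Ideal ranking: all relevant passages (up to k) placed at the top.
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: the only relevant passage is retrieved at position 2.
ranked = ["p3", "p1", "p7"]
relevant = {"p1"}
acc = accuracy_at_k(ranked, relevant)
ndcg = ndcg_at_k(ranked, relevant)
```

Under these assumptions a relevant passage at position 2 still counts as a full hit for accuracy, while NDCG penalizes it for not being ranked first. That difference is one way to read why a model can lead a column on one metric and not on the other.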