LiYuan committed on
Commit e7706e8
1 Parent(s): 37a4eb5
Files changed (1)
  1. README.md +5 -3
README.md CHANGED
@@ -2,8 +2,10 @@
 license: afl-3.0
 ---
 
- This model is very accurate for this task and is inspired by information-retrieval techniques. In 2019, Nils Reimers and Iryna Gurevych introduced a new transformer model, Sentence-BERT (Sentence Embeddings using Siamese BERT-Networks), in this paper: https://doi.org/10.48550/arxiv.1908.10084
 
- Sentence-BERT modifies BERT by adding a pooling operation to the output of the BERT model so that it produces a fixed-size sentence embedding, which can then be used for cosine similarity and similar computations. To obtain meaningful sentence embeddings in a vector space where similar sentences lie close together, the authors trained a triplet network that modifies BERT, with the architecture shown in the figure below.
 
- ![](1.png)
+ There are two types of Cross-Encoder models. One is the Cross-Encoder regression model that we fine-tuned and discussed in the previous section; the other is the Cross-Encoder classification model. Both models are introduced in the same paper: https://doi.org/10.48550/arxiv.1908.10084
 
+ Both models address the problem that the plain BERT model is too time- and resource-consuming to train on sentence pairs. The weights of both models are initialized from the pretrained BERT and RoBERTa networks, so we only need to fine-tune them, spending far less time while obtaining comparable or even better sentence embeddings. The figure below shows the architecture of the Cross-Encoder classification model.
 
+ ![](1.png)
+
+ We then evaluated the model on the held-out test set of 2,000 examples and obtained a test accuracy of **46.05%**, nearly identical to the best validation accuracy, which suggests the model generalizes well.
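
The removed paragraph above describes the core Sentence-BERT idea: pool BERT's token outputs into a single fixed-size vector per sentence, then compare sentences by cosine similarity. A minimal sketch with the `sentence-transformers` library; the checkpoint name is only an example, not the model this README describes:

```python
from sentence_transformers import SentenceTransformer, util

# Any SBERT checkpoint works here; "all-MiniLM-L6-v2" is just an example.
model = SentenceTransformer("all-MiniLM-L6-v2")

# encode() pools token embeddings into one fixed-size vector per sentence.
embeddings = model.encode([
    "How do I return a product?",
    "What is your refund policy?",
])

# Cosine similarity between the two fixed-size sentence embeddings.
print(util.cos_sim(embeddings[0], embeddings[1]))
```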
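The added text distinguishes the two Cross-Encoder variants. A hedged sketch of the difference using the `CrossEncoder` class from `sentence-transformers`; the public checkpoint names are illustrative, not the models fine-tuned here:

```python
from sentence_transformers import CrossEncoder

pairs = [("A man is eating food.", "A man is eating a meal.")]

# Regression-style Cross-Encoder: a single output neuron scores each pair.
regressor = CrossEncoder("cross-encoder/stsb-roberta-base")
print(regressor.predict(pairs))   # one similarity score per pair

# Classification-style Cross-Encoder: one output neuron per class label.
classifier = CrossEncoder("cross-encoder/nli-roberta-base")
print(classifier.predict(pairs))  # one logit vector per pair
```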
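For the reported **46.05%**, the paragraph implies a standard argmax-accuracy computation over the 2,000 held-out pairs. A sketch under that assumption, reusing `classifier` from the previous block; `test_pairs` and `test_labels` are hypothetical names:

```python
import numpy as np

# test_pairs: 2,000 (sentence_a, sentence_b) tuples -- hypothetical variable
# test_labels: 2,000 integer gold labels            -- hypothetical variable
logits = classifier.predict(test_pairs)   # shape (2000, num_labels)
predictions = logits.argmax(axis=1)
accuracy = (predictions == np.array(test_labels)).mean()
print(f"test accuracy: {accuracy:.2%}")
```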