update usage scripts
README.md
CHANGED
@@ -1083,3 +1083,29 @@ we provide scripts in "eval" folder for results reproducing.
 | [bge-large-zh-no-instruct]| 1.3 | 1024 | 512 | 63.4 | 68.58 | 50.01 | 76.77 | 64.9 | 70.54 | 53 |
 | [bge-base-zh]| 0.41 | 768 | 512 | 62.8 | 67.07 | 47.64 | 77.5 | 64.91 | 69.53 | 54.12 |

+## Usage
+The piccolo models can be called easily through the sentence-transformers package:
+```python
+# For s2s (short-to-short) datasets, piccolo can be used as below.
+# This is the general usage pattern.
+from sentence_transformers import SentenceTransformer
+sentences = ["数据1", "数据2"]  # placeholder inputs: "data 1", "data 2"
+model = SentenceTransformer('sensenova/piccolo-base-zh')
+embeddings_1 = model.encode(sentences, normalize_embeddings=True)
+embeddings_2 = model.encode(sentences, normalize_embeddings=True)
+similarity = embeddings_1 @ embeddings_2.T
+print(similarity)
+
+# For s2p (short-to-passage) datasets, we recommend adding an instruction prefix for passage retrieval.
+# The prefix helps the model perform retrieval better.
+from sentence_transformers import SentenceTransformer
+queries = ['query_1', 'query_2']
+passages = ["doc_1", "doc_2"]
+
+model = SentenceTransformer('sensenova/piccolo-base-zh')
+q_embeddings = model.encode(["查询:" + q for q in queries], normalize_embeddings=True)  # prefix means "query:"
+p_embeddings = model.encode(["结果:" + p for p in passages], normalize_embeddings=True)  # prefix means "result:"
+scores = q_embeddings @ p_embeddings.T
+
+
+```
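The s2p snippet stops at the `scores` matrix. As a minimal sketch of one way to use it: because the embeddings are L2-normalized, each entry is a cosine similarity, so passages can be ranked per query by sorting each row. The toy vectors below are made-up stand-ins for real model outputs, and `l2_normalize` and `rank_passages` are hypothetical helpers, not part of piccolo or sentence-transformers.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length, mirroring normalize_embeddings=True."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def rank_passages(scores: np.ndarray, top_k: int = 1) -> np.ndarray:
    """For each query (row), return the indices of the top_k highest-scoring passages."""
    order = np.argsort(-scores, axis=1)  # negate so the largest score comes first
    return order[:, :top_k]


# Toy 3-dimensional vectors standing in for real embeddings (assumption).
q_embeddings = l2_normalize(np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]))
p_embeddings = l2_normalize(np.array([[0.0, 2.0, 2.0], [3.0, 0.0, 0.0]]))

# Same scoring step as in the diff: rows are queries, columns are passages.
scores = q_embeddings @ p_embeddings.T
best = rank_passages(scores, top_k=1)
print(scores)
print(best)
```

With real data, `q_embeddings` and `p_embeddings` would come from `model.encode(..., normalize_embeddings=True)` as in the diff, and `best[i]` would index into `passages` for query `i`.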