yushi committed on
Commit
132fbea
1 Parent(s): a748263

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +3 -3
README.md CHANGED
@@ -360,9 +360,9 @@ tokenizer = AutoTokenizer.from_pretrained(model_name)
360
  model = AutoModel.from_pretrained(model_name, trust_remote_code=True, attn_implementation="flash_attention_2", torch_dtype=torch.float16).to("cuda")
361
  model.eval()
362
 
363
- # 事实上我们用的是weighted mean pooling,但为了部署方便,我们将一部分pooling步骤集成在model.forward中
364
- # In fact, we will use weighted mean pooling, but we will integrate some pooling steps into model.forward for deployment convenience
365
- def mean_pooling(hidden,attention_mask):
366
  s = torch.sum(hidden * attention_mask.unsqueeze(-1).float(), dim=1)
367
  d = attention_mask.sum(dim=1, keepdim=True).float()
368
  reps = s / d
 
360
  model = AutoModel.from_pretrained(model_name, trust_remote_code=True, attn_implementation="flash_attention_2", torch_dtype=torch.float16).to("cuda")
361
  model.eval()
362
 
363
+ # 由于在 `model.forward` 中缩放了最终隐层表示,此处的 mean pooling 实际上起到了 weighted mean pooling 的作用
364
+ # As we scale hidden states in `model.forward`, mean pooling here actually works as weighted mean pooling
365
+ def mean_pooling(hidden, attention_mask):
366
  s = torch.sum(hidden * attention_mask.unsqueeze(-1).float(), dim=1)
367
  d = attention_mask.sum(dim=1, keepdim=True).float()
368
  reps = s / d