vamsibanda committed on
Commit
5e0b4c7
1 Parent(s): 7efeb5a

Update README.md

Files changed (1)
  1. README.md +12 -55
README.md CHANGED
@@ -10,65 +10,22 @@ tags:
  - onnx
  ---

- #
-
- This is the ONNX model of sentence-transformers/all-MiniLM-L6-v2 [https://www.sbert.net]. Currently, Hugging Face does not support downloading an ONNX model and generating embeddings directly. I have created a workaround that uses sbert and optimum together to generate embeddings.
+ # ONNX convert all-MiniLM-L6-v2
+ ## Conversion of [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
+ This is a [sentence-transformers](https://www.SBERT.net) ONNX model: it maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering or semantic search. This custom model outputs both `last_hidden_state` and `pooler_output`, whereas a sentence-transformers model exported with the default ONNX config outputs only `last_hidden_state`.

+ ## Usage (HuggingFace Optimum)
+ Using this model becomes easy when you have [optimum](https://github.com/huggingface/optimum) installed:
  ```
- pip install onnx
- pip install onnxruntime==1.10.0
- pip install "transformers>4.6.1"
- pip install sentencepiece
- pip install sentence-transformers
- pip install optimum
- pip install torch==1.9.0
+ python -m pip install optimum
  ```
  Then you can use the model like this:
  ```python
- import os
- import shutil
- import torch
- import torch.nn.functional as F
- from sentence_transformers.util import snapshot_download
- from sentence_transformers.models import Pooling
- from transformers import AutoTokenizer
- from optimum.onnxruntime import ORTModelForFeatureExtraction
-
- model_name = 'vamsibanda/sbert-onnx-all-MiniLM-L6-v2'
- cache_folder = './'
- model_path = os.path.join(cache_folder, model_name.replace("/", "_"))
-
- def download_onnx_model(model_name, cache_folder, model_path, force_download=False):
-     # Wipe the local copy when a re-download is forced; skip if already cached.
-     if force_download and os.path.exists(model_path):
-         shutil.rmtree(model_path)
-     elif os.path.exists(model_path):
-         return
-     snapshot_download(model_name, cache_dir=cache_folder, library_name='sentence-transformers')
-
- def mean_pooling(model_output, attention_mask):
-     # First element of model_output contains all token embeddings.
-     token_embeddings = model_output[0]
-     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
-     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
-
- def generate_embedding(text):
-     token = tokenizer(text, return_tensors='pt')
-     embedding = model(input_ids=token['input_ids'], attention_mask=token['attention_mask'])
-     embedding = mean_pooling(embedding, token['attention_mask'])
-     embedding = F.normalize(embedding, p=2, dim=1)
-     return embedding.tolist()[0]
-
- download_onnx_model(model_name, cache_folder, model_path)
- tokenizer = AutoTokenizer.from_pretrained(model_path)
- model = ORTModelForFeatureExtraction.from_pretrained(model_path)
- pooling_layer = Pooling.load(f"{model_path}/1_Pooling")
-
- generate_embedding('That is a happy person')
+ from transformers import AutoTokenizer
+ from optimum.onnxruntime.modeling_ort import ORTModelForCustomTasks

+ model = ORTModelForCustomTasks.from_pretrained("vamsibanda/sbert-all-MiniLM-L6-with-pooler")
+ tokenizer = AutoTokenizer.from_pretrained("vamsibanda/sbert-all-MiniLM-L6-with-pooler")
+ inputs = tokenizer("I love burritos!", return_tensors="pt")
+ pred = model(**inputs)
+ embedding = pred['pooler_output']
  ```
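
If you want a sentence-transformers-style embedding rather than the raw `pooler_output`, the pooling that the previous version of this README performed still applies: mean-pool the token embeddings and L2-normalize. A minimal sketch, reusing `pred` and `inputs` from the usage snippet above and the mean-pooling logic from the old README; treat it as a cross-check rather than part of the official usage:

```python
import torch
import torch.nn.functional as F

# Mean-pool the token embeddings, ignoring padding positions
# (the same computation as mean_pooling() in the previous README).
token_embeddings = pred['last_hidden_state']
mask = inputs['attention_mask'].unsqueeze(-1).expand(token_embeddings.size()).float()
embedding = torch.sum(token_embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

# L2-normalize so cosine similarity reduces to a plain dot product.
embedding = F.normalize(embedding, p=2, dim=1)
print(embedding.shape)  # expected: torch.Size([1, 384])
```

Whether this mean-pooled vector matches `pooler_output` exactly depends on how the pooler was exported, so compare the two before relying on either.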