mpjan committed
Commit f2e93ea · 1 Parent(s): 1bc9d52

Update README.md

Files changed (1)
  1. README.md +8 -4
README.md CHANGED
@@ -1,5 +1,7 @@
 ---
 pipeline_tag: sentence-similarity
+language:
+- 'pt'
 tags:
 - sentence-transformers
 - feature-extraction
@@ -8,10 +10,12 @@ tags:
 
 ---
 
-# msmarco-distilbert-base-tas-b-mmarco-pt-100k
+# mpjan/msmarco-distilbert-base-tas-b-mmarco-pt-100k
 
 This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
 
+It is a fine-tuning of [sentence-transformers/msmarco-distilbert-base-tas-b](https://huggingface.co/sentence-transformers/msmarco-distilbert-base-tas-b) on the first 100k triplets of the Portuguese subset in [unicamp-dl/mmarco](https://huggingface.co/datasets/unicamp-dl/mmarco).
+
 <!--- Describe your model here -->
 
 ## Usage (Sentence-Transformers)
@@ -28,7 +32,7 @@ Then you can use the model like this:
 from sentence_transformers import SentenceTransformer
 sentences = ["This is an example sentence", "Each sentence is converted"]
 
-model = SentenceTransformer('{MODEL_NAME}')
+model = SentenceTransformer('mpjan/msmarco-distilbert-base-tas-b-mmarco-pt-100k')
 embeddings = model.encode(sentences)
 print(embeddings)
 ```
@@ -51,8 +55,8 @@ def cls_pooling(model_output, attention_mask):
 sentences = ['This is an example sentence', 'Each sentence is converted']
 
 # Load model from HuggingFace Hub
-tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
-model = AutoModel.from_pretrained('{MODEL_NAME}')
+tokenizer = AutoTokenizer.from_pretrained('mpjan/msmarco-distilbert-base-tas-b-mmarco-pt-100k')
+model = AutoModel.from_pretrained('mpjan/msmarco-distilbert-base-tas-b-mmarco-pt-100k')
 
 # Tokenize sentences
 encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
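To illustrate the clustering / semantic-search use case the updated card describes, here is a minimal sketch of ranking a small Portuguese corpus against a query with this model. The corpus and query strings are made-up examples, and scoring with `sentence_transformers.util.cos_sim` is one reasonable choice, not something the card prescribes.

```python
from sentence_transformers import SentenceTransformer, util

# Fine-tuned checkpoint named in the card
model = SentenceTransformer('mpjan/msmarco-distilbert-base-tas-b-mmarco-pt-100k')

# Toy Portuguese passages and a query (illustrative strings only)
corpus = [
    "Lisboa é a capital e a maior cidade de Portugal.",
    "O Brasil é o maior país da América do Sul.",
    "Cães e gatos são animais de estimação populares.",
]
query = "Qual é a capital de Portugal?"

# Encode query and passages into the 768-dimensional space
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Score passages by cosine similarity and print the best match
scores = util.cos_sim(query_embedding, corpus_embeddings)[0]
best = int(scores.argmax())
print(corpus[best], float(scores[best]))
```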
 
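The second usage section of the README (plain HuggingFace Transformers with CLS pooling) is only partially visible in the hunks above. For completeness, a sketch of that path follows; the body of `cls_pooling` is taken from the standard sentence-transformers model-card template (use the hidden state of the first token), which is an assumption since the diff only shows the function's signature line.

```python
from transformers import AutoTokenizer, AutoModel
import torch


def cls_pooling(model_output, attention_mask):
    # CLS pooling: take the hidden state of the first token of each sequence.
    # attention_mask is unused here; it is kept to match the template's signature.
    return model_output[0][:, 0]


# Sentences we want sentence embeddings for
sentences = ['This is an example sentence', 'Each sentence is converted']

# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('mpjan/msmarco-distilbert-base-tas-b-mmarco-pt-100k')
model = AutoModel.from_pretrained('mpjan/msmarco-distilbert-base-tas-b-mmarco-pt-100k')

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling: one 768-dimensional vector per input sentence
sentence_embeddings = cls_pooling(model_output, encoded_input['attention_mask'])
print(sentence_embeddings)
```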