dlicari committed on
Commit
e419761
1 Parent(s): 9a7187e

Update README.md

  1. README.md +13 -4
README.md CHANGED
@@ -7,7 +7,16 @@ tags:
  - transformers
  ---
 
- # {MODEL_NAME}
 
  This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
 
@@ -27,7 +36,7 @@ Then you can use the model like this:
  from sentence_transformers import SentenceTransformer
  sentences = ["This is an example sentence", "Each sentence is converted"]
 
- model = SentenceTransformer('{MODEL_NAME}')
  embeddings = model.encode(sentences)
  print(embeddings)
  ```
@@ -53,8 +62,8 @@ def mean_pooling(model_output, attention_mask):
  sentences = ['This is an example sentence', 'Each sentence is converted']
 
  # Load model from HuggingFace Hub
- tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
- model = AutoModel.from_pretrained('{MODEL_NAME}')
 
  # Tokenize sentences
  encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
 
  - transformers
  ---
 
+ <img src="https://huggingface.co/dlicari/distil-ita-legal-bert/resolve/main/ITALIAN_LEGAL_BERT-DI.jpg" width="600"/>
+
+ # Distil-ITA-Legal-BERT
+ We used knowledge distillation to create a fast, lightweight student model with only 4 Transformer layers,
+ capable of producing sentence embeddings similar to those produced by the more complex
+ [ITALIAN-LEGAL-BERT](dlicari/Italian-Legal-BERT) teacher model.
+
+ It was optimized on the ITALIAN-LEGAL-BERT training set (3.7 GB) using the Sentence-BERT library by minimizing the mean squared error (MSE) between its embeddings
+ and those produced by the teacher model.
+
 
  This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
 
 
  from sentence_transformers import SentenceTransformer
  sentences = ["This is an example sentence", "Each sentence is converted"]
 
+ model = SentenceTransformer('dlicari/distil-ita-legal-bert')
  embeddings = model.encode(sentences)
  print(embeddings)
  ```
 
  sentences = ['This is an example sentence', 'Each sentence is converted']
 
  # Load model from HuggingFace Hub
+ tokenizer = AutoTokenizer.from_pretrained('dlicari/distil-ita-legal-bert')
+ model = AutoModel.from_pretrained('dlicari/distil-ita-legal-bert')
 
  # Tokenize sentences
  encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
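
The added README text says the student was trained by minimizing the mean squared error (MSE) between its embeddings and the teacher's. The sketch below illustrates only that objective, in plain Python on toy data. The function name `mse_distillation_loss` and the 3-dimensional vectors are assumptions made for illustration (the real embeddings are 768-dimensional), and this is not the training code used for the model.

```python
from typing import Sequence


def mse_distillation_loss(
    student: Sequence[Sequence[float]],
    teacher: Sequence[Sequence[float]],
) -> float:
    """Mean squared error over all components of two embedding batches.

    Hypothetical helper: averages (student - teacher)^2 across every
    dimension of every sentence embedding in the batch.
    """
    if len(student) != len(teacher):
        raise ValueError("batches must contain the same number of embeddings")

    total = 0.0
    count = 0
    for s_vec, t_vec in zip(student, teacher):
        if len(s_vec) != len(t_vec):
            raise ValueError("embeddings must share the same dimensionality")
        for s, t in zip(s_vec, t_vec):
            total += (s - t) ** 2
            count += 1

    if count == 0:
        raise ValueError("cannot compute a loss over an empty batch")
    return total / count


# Toy 3-dimensional vectors standing in for real sentence embeddings.
teacher_embeddings = [[1.0, 0.0, 2.0], [0.5, 0.5, 0.5]]
student_embeddings = [[1.0, 1.0, 2.0], [0.5, 0.5, 1.5]]

loss = mse_distillation_loss(student_embeddings, teacher_embeddings)
print(loss)
```

A loss of zero would mean the student reproduces the teacher's embeddings exactly. In an actual training run this quantity would be computed on tensors and minimized with a gradient-based optimizer.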