hpprc committed on
Commit 614e151 · 1 Parent(s): 6e43f2d

Update README.md

Files changed (1)
  1. README.md +18 -21
README.md CHANGED
@@ -9,27 +9,24 @@ datasets:
  - wiki40b
  ---

- # {MODEL_NAME}
+ # unsup-simcse-ja-large

- This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 1024 dimensional dense vector space and can be used for tasks like clustering or semantic search.
-
- <!--- Describe your model here -->

  ## Usage (Sentence-Transformers)

  Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:

  ```
- pip install -U sentence-transformers
+ pip install -U fugashi[unidic-lite] sentence-transformers
  ```

  Then you can use the model like this:

  ```python
  from sentence_transformers import SentenceTransformer
- sentences = ["This is an example sentence", "Each sentence is converted"]
+ sentences = ["こんにちは、世界!", "文埋め込み最高!文埋め込み最高と叫びなさい", "極度乾燥しなさい"]

- model = SentenceTransformer('{MODEL_NAME}')
+ model = SentenceTransformer("unsup-simcse-ja-large")
  embeddings = model.encode(sentences)
  print(embeddings)
  ```
@@ -52,8 +49,8 @@ def cls_pooling(model_output, attention_mask):
  sentences = ['This is an example sentence', 'Each sentence is converted']

  # Load model from HuggingFace Hub
- tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
- model = AutoModel.from_pretrained('{MODEL_NAME}')
+ tokenizer = AutoTokenizer.from_pretrained("unsup-simcse-ja-large")
+ model = AutoModel.from_pretrained("unsup-simcse-ja-large")

  # Tokenize sentences
  encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
@@ -69,24 +66,24 @@ print("Sentence embeddings:")
  print(sentence_embeddings)
  ```

-
-
- ## Evaluation Results
-
- <!--- Describe how your model was evaluated -->
-
- For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name={MODEL_NAME})
-
-
-
  ## Full Model Architecture
  ```
  SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
- (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
  )
  ```

  ## Citing & Authors

- <!--- Describe where people can find more information -->
+ ```
+ @misc{
+ hayato-tsukagoshi-2023-simple-simcse-ja,
+ author = {Hayato Tsukagoshi},
+ title = {Japanese Simple-SimCSE},
+ year = {2023},
+ publisher = {GitHub},
+ journal = {GitHub repository},
+ howpublished = {\url{https://github.com/hppRC/simple-simcse-ja}}
+ }
+ ```
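
Note on the pooling configuration in the diff: the `Pooling` layer sets `pooling_mode_cls_token` to `True`, which means the sentence embedding is taken from the hidden state of the first (`[CLS]`) token rather than from an average over tokens. The sketch below illustrates that idea with plain nested lists standing in for tensors. The function name `cls_pool` and the toy values are illustrative assumptions, not code from the repository.

```python
from typing import List

Vector = List[float]


def cls_pool(token_embeddings: List[List[Vector]]) -> List[Vector]:
    """Return the first-token ([CLS]) vector of every sequence in the batch.

    `token_embeddings` is shaped [batch][sequence][hidden]; this hypothetical
    helper is a stand-in for tensor indexing, not the repository's function.
    """
    pooled = []
    for sequence in token_embeddings:
        if not sequence:
            raise ValueError("each sequence needs at least one token")
        # Copy the first token's vector so callers cannot mutate the input.
        pooled.append(list(sequence[0]))
    return pooled


# Toy batch: 2 sentences, 3 tokens each, hidden size 4
# (far smaller than the real model's hidden size).
toy_batch = [
    [[0.1, 0.2, 0.3, 0.4], [9.0, 9.0, 9.0, 9.0], [8.0, 8.0, 8.0, 8.0]],
    [[0.5, 0.6, 0.7, 0.8], [7.0, 7.0, 7.0, 7.0], [6.0, 6.0, 6.0, 6.0]],
]

sentence_embeddings = cls_pool(toy_batch)
print(sentence_embeddings)
```

With PyTorch tensors, the same selection would typically be written as indexing position 0 along the sequence axis of the last hidden state.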