LeoChiuu committed on
Commit
5d48402
1 Parent(s): 62425dc

Add new SentenceTransformer model.

README.md CHANGED
@@ -1,201 +1,387 @@
1
  ---
2
  base_model: colorfulscoop/sbert-base-ja
3
- language: ja
4
- license: cc-by-sa-4.0
5
- model_name: LeoChiuu/sbert-base-ja-arc
6
  ---
7
 
8
- # Model Card for LeoChiuu/sbert-base-ja-arc
9
-
10
- <!-- Provide a quick summary of what the model is/does. -->
11
-
12
 
 
13
 
14
  ## Model Details
15
 
16
  ### Model Description
17
 
18
- <!-- Provide a longer summary of what this model is. -->
19
-
20
- Generates similarity embeddings
21
-
22
- - **Developed by:** [More Information Needed]
23
- - **Funded by [optional]:** [More Information Needed]
24
- - **Shared by [optional]:** [More Information Needed]
25
- - **Model type:** [More Information Needed]
26
- - **Language(s) (NLP):** ja
27
- - **License:** cc-by-sa-4.0
28
- - **Finetuned from model [optional]:** colorfulscoop/sbert-base-ja
29
-
30
- ### Model Sources [optional]
31
 
32
- <!-- Provide the basic links for the model. -->
 
 
33
 
34
- - **Repository:** [More Information Needed]
35
- - **Paper [optional]:** [More Information Needed]
36
- - **Demo [optional]:** [More Information Needed]
37
 
38
- ## Uses
 
 
 
 
 
39
 
40
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
41
 
42
- ### Direct Use
43
 
44
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
45
 
46
- [More Information Needed]
 
 
47
 
48
- ### Downstream Use [optional]
49
-
50
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
51
-
52
- [More Information Needed]
53
-
54
- ### Out-of-Scope Use
55
 
56
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
 
 
 
 
 
 
 
 
 
 
57
 
58
- [More Information Needed]
 
 
 
 
59
 
60
- ## Bias, Risks, and Limitations
 
61
 
62
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
63
 
64
- [More Information Needed]
 
65
 
66
- ### Recommendations
67
-
68
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
69
-
70
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
71
-
72
- ## How to Get Started with the Model
73
-
74
- Use the code below to get started with the model.
75
-
76
- [More Information Needed]
77
-
78
- ## Training Details
79
 
80
- ### Training Data
81
 
82
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
83
 
84
- [More Information Needed]
 
85
 
86
- ### Training Procedure
87
-
88
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
89
-
90
- #### Preprocessing [optional]
91
-
92
- [More Information Needed]
93
-
94
-
95
- #### Training Hyperparameters
96
-
97
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
98
-
99
- #### Speeds, Sizes, Times [optional]
100
-
101
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
102
 
103
- [More Information Needed]
 
104
 
105
  ## Evaluation
106
 
107
- <!-- This section describes the evaluation protocols and provides the results. -->
108
-
109
- ### Testing Data, Factors & Metrics
110
-
111
- #### Testing Data
112
-
113
- <!-- This should link to a Dataset Card if possible. -->
114
-
115
- [More Information Needed]
116
-
117
- #### Factors
118
-
119
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
120
-
121
- [More Information Needed]
122
 
123
- #### Metrics
 
 
124
 
125
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
 
 
126
 
127
- [More Information Needed]
 
128
 
129
- ### Results
 
130
 
131
- [More Information Needed]
132
-
133
- #### Summary
134
-
135
-
136
-
137
- ## Model Examination [optional]
138
-
139
- <!-- Relevant interpretability work for the model goes here -->
140
-
141
- [More Information Needed]
142
-
143
- ## Environmental Impact
144
-
145
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
146
-
147
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
148
-
149
- - **Hardware Type:** [More Information Needed]
150
- - **Hours used:** [More Information Needed]
151
- - **Cloud Provider:** [More Information Needed]
152
- - **Compute Region:** [More Information Needed]
153
- - **Carbon Emitted:** [More Information Needed]
154
-
155
- ## Technical Specifications [optional]
156
-
157
- ### Model Architecture and Objective
158
-
159
- [More Information Needed]
160
-
161
- ### Compute Infrastructure
162
-
163
- [More Information Needed]
164
-
165
- #### Hardware
166
-
167
- [More Information Needed]
168
-
169
- #### Software
170
-
171
- [More Information Needed]
172
-
173
- ## Citation [optional]
174
-
175
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
176
-
177
- **BibTeX:**
178
-
179
- [More Information Needed]
180
-
181
- **APA:**
182
-
183
- [More Information Needed]
184
-
185
- ## Glossary [optional]
186
-
187
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
188
-
189
- [More Information Needed]
190
-
191
- ## More Information [optional]
192
-
193
- [More Information Needed]
194
 
195
- ## Model Card Authors [optional]
 
196
 
197
- [More Information Needed]
198
 
199
  ## Model Card Contact
200
 
201
- [More Information Needed]
 
 
1
  ---
2
  base_model: colorfulscoop/sbert-base-ja
3
+ library_name: sentence-transformers
4
+ metrics:
5
+ - accuracy
6
+ pipeline_tag: sentence-similarity
7
+ tags:
8
+ - sentence-transformers
9
+ - sentence-similarity
10
+ - feature-extraction
11
+ - generated_from_trainer
12
+ - dataset_size:5330
13
+ - loss:SoftmaxLoss
14
+ widget:
15
+ - source_sentence: 顔 に マスク を かぶった 男性 は 、 測定 中 に 残した すべて の 作業 を 一 時 停止 して い ます 。
16
+   sentences:
17
+   - 誰 か が マスク を して いる 。
18
+   - 緑 の 服 を 着た 女性 が 、 別の 男性 の 顔 に 何 か を 書き ます 。
19
+   - 男性 が 女性 に キス して い ます 。
20
+ - source_sentence: 女の子 は 、 バレーボール を スパイク に ジャンプ し ます 。
21
+   sentences:
22
+   - ベンチ で 昼寝 を して いる 男
23
+   - プロパガンダ は 反戦 です 。
24
+   - 女の子 が ジャンプ し ます 。
25
+ - source_sentence: ステージ で ドラム を 演奏 する 男 。
26
+   sentences:
27
+   - 男 が ステージ で リズム を 作り ます 。
28
+   - 女性 は 頭 を 覆って いる
29
+   - 2 人 の 女性 が リビング ルーム に 座って レシピ を 議論 して い ます 。
30
+ - source_sentence: 青い シャツ を 着て フィールド を 耕す 東洋 の 帽子 を 持つ 男 。
31
+   sentences:
32
+   - 男 が 水 を サーフィン して いる
33
+   - 子犬 と ビーチ を 訪れた 人々
34
+   - バイク に 乗る 男
35
+ - source_sentence: 水着 姿 の 少女 に バケツ の 水 を 注ぐ ウォーター パーク の 少年 。
36
+   sentences:
37
+   - 騎士 たち は 、 溶けた 鉛 の バケツ を 城壁 の 下 の 不幸な 農奴 に 注ぎ ます 。
38
+   - 誰 も 歩いて い ない
39
+   - 僧 ks は にぎやかな 通り を 渡り ます 。
40
+ model-index:
41
+ - name: SentenceTransformer based on colorfulscoop/sbert-base-ja
42
+   results:
43
+   - task:
44
+       type: label-accuracy
45
+       name: Label Accuracy
46
+     dataset:
47
+       name: val
48
+       type: val
49
+     metrics:
50
+     - type: accuracy
51
+       value: 0.7782363977485929
52
+       name: Accuracy
53
  ---
54
 
55
+ # SentenceTransformer based on colorfulscoop/sbert-base-ja
 
 
 
56
 
57
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [colorfulscoop/sbert-base-ja](https://huggingface.co/colorfulscoop/sbert-base-ja) on the csv dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
58
 
59
  ## Model Details
60
 
61
  ### Model Description
62
+ - **Model Type:** Sentence Transformer
63
+ - **Base model:** [colorfulscoop/sbert-base-ja](https://huggingface.co/colorfulscoop/sbert-base-ja) <!-- at revision ecb8a98cd5176719ff7ab0d770a27420118732cf -->
64
+ - **Maximum Sequence Length:** 512 tokens
65
+ - **Output Dimensionality:** 768 dimensions
66
+ - **Similarity Function:** Cosine Similarity
67
+ - **Training Dataset:**
68
+     - csv
69
+ <!-- - **Language:** Unknown -->
70
+ <!-- - **License:** Unknown -->
71
 
72
+ ### Model Sources
73
 
74
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
75
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
76
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
77
 
78
+ ### Full Model Architecture
 
 
79
 
80
+ ```
81
+ SentenceTransformer(
82
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
83
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
84
+ )
85
+ ```
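The `Pooling` module above has only `pooling_mode_mean_tokens` enabled, so each sentence embedding is the average of its token embeddings. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name `mean_pool` and the toy vectors are illustrative assumptions, and the real module works on batched tensors of width 768.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


# Toy input (hypothetical): three token vectors of width 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

Masking matters because padded positions would otherwise pull the average toward arbitrary values.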
86
 
87
+ ## Usage
88
 
89
+ ### Direct Usage (Sentence Transformers)
90
 
91
+ First install the Sentence Transformers library:
92
 
93
+ ```bash
94
+ pip install -U sentence-transformers
95
+ ```
96
 
97
+ Then you can load this model and run inference.
98
+ ```python
99
+ from sentence_transformers import SentenceTransformer
 
 
 
 
100
 
101
+ # Download from the 🤗 Hub
102
+ model = SentenceTransformer("sentence_transformers_model_id")
103
+ # Run inference
104
+ sentences = [
105
+     '水着 姿 の 少女 に バケツ の 水 を 注ぐ ウォーター パーク の 少年 。',
106
+     '騎士 たち は 、 溶けた 鉛 の バケツ を 城壁 の 下 の 不幸な 農奴 に 注ぎ ます 。',
107
+     '僧 ks は にぎやかな 通り を 渡り ます 。',
108
+ ]
109
+ embeddings = model.encode(sentences)
110
+ print(embeddings.shape)
111
+ # (3, 768)
112
 
113
+ # Get the similarity scores for the embeddings
114
+ similarities = model.similarity(embeddings, embeddings)
115
+ print(similarities.shape)
116
+ # torch.Size([3, 3])
117
+ ```
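The card lists cosine similarity as the similarity function, so `model.similarity` in the snippet above returns a matrix of pairwise cosine scores. As a rough illustration of what each entry means, here is a dependency-free sketch; the helper names and the 2-dimensional toy vectors are assumptions made for readability and stand in for the real 768-dimensional embeddings.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors: List[List[float]]) -> List[List[float]]:
    """Pairwise cosine similarities; entry [i][j] compares vectors i and j."""
    return [[cosine_similarity(a, b) for b in vectors] for a in vectors]


# Toy embeddings (hypothetical values, not model output).
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print([round(value, 4) for value in row])
```

Three input sentences give a 3 x 3 matrix, which matches the shape printed by the usage example.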
118
 
119
+ <!--
120
+ ### Direct Usage (Transformers)
121
 
122
+ <details><summary>Click to see the direct usage in Transformers</summary>
123
 
124
+ </details>
125
+ -->
126
 
127
+ <!--
128
+ ### Downstream Usage (Sentence Transformers)
129
 
130
+ You can finetune this model on your own dataset.
131
 
132
+ <details><summary>Click to expand</summary>
133
 
134
+ </details>
135
+ -->
136
 
137
+ <!--
138
+ ### Out-of-Scope Use
139
 
140
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
141
+ -->
142
 
143
  ## Evaluation
144
 
145
+ ### Metrics
146
 
147
+ #### Label Accuracy
148
+ * Dataset: `val`
149
+ * Evaluated with [<code>LabelAccuracyEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.LabelAccuracyEvaluator)
150
 
151
+ | Metric | Value |
152
+ |:-------------|:-----------|
153
+ | **accuracy** | **0.7782** |
154
 
155
+ <!--
156
+ ## Bias, Risks and Limitations
157
 
158
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
159
+ -->
160
 
161
+ <!--
162
+ ### Recommendations
163
 
164
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
165
+ -->
166
 
167
+ ## Training Details
168
 
169
+ ### Training Dataset
170
+
171
+ #### csv
172
+
173
+ * Dataset: csv
174
+ * Size: 5,330 training samples
175
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
176
+ * Approximate statistics based on the first 1000 samples:
177
+ | | sentence_0 | sentence_1 | label |
178
+ |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
179
+ | type | string | string | int |
180
+ | details | <ul><li>min: 7 tokens</li><li>mean: 35.79 tokens</li><li>max: 177 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 22.66 tokens</li><li>max: 77 tokens</li></ul> | <ul><li>0: ~32.80%</li><li>1: ~67.20%</li></ul> |
181
+ * Samples:
182
+ | sentence_0 | sentence_1 | label |
183
+ |:----------------------------------------------------------------------|:------------------------------------------|:---------------|
184
+ | <code>薬剤 師 が 処方 を 準備 して い ます 。</code> | <code>薬剤 師 が 自宅 の ソファ に 座って い ます 。</code> | <code>1</code> |
185
+ | <code>3 人 の 男性 が 小屋 を 背景 に 象 に 乗って おり 、 2 人 が 帽子 を かぶって い ます 。</code> | <code>一 人 の 男 は 帽子 を かぶって い ませ ん 。</code> | <code>0</code> |
186
+ | <code>茶色 の 犬 と の クロスカントリー スキー の 女性 。</code> | <code>草 は 緑 でした</code> | <code>1</code> |
187
+ * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)
188
+
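`SoftmaxLoss` follows the classification objective from the Sentence-BERT paper cited in this card: the two sentence embeddings `u` and `v` are combined into one feature vector, passed through a linear classifier, and trained with cross-entropy. The sketch below assumes the commonly used feature layout `(u, v, |u - v|)`; the zero weights, two labels, and 2-dimensional embeddings are hypothetical toy values, not anything taken from this training run.

```python
import math
from typing import List


def softmax_loss_features(u: List[float], v: List[float]) -> List[float]:
    """Concatenate u, v, and the element-wise absolute difference |u - v|."""
    return u + v + [abs(a - b) for a, b in zip(u, v)]


def classify(features: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """Linear layer: one logit per label."""
    return [sum(w * f for w, f in zip(row, features)) + b for row, b in zip(weights, bias)]


def cross_entropy(logits: List[float], label: int) -> float:
    """Negative log-probability of the gold label under a softmax over the logits."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[label]


# Toy sentence embeddings (hypothetical); the real ones have 768 dimensions.
u = [1.0, 0.0]
v = [0.0, 1.0]
features = softmax_loss_features(u, v)

# An untrained classifier with all-zero parameters assigns equal probability to both labels.
weights = [[0.0] * len(features) for _ in range(2)]
bias = [0.0, 0.0]
logits = classify(features, weights, bias)
loss = cross_entropy(logits, label=1)
print(features, loss)
```

With equal logits the loss equals ln 2, the value expected from a two-class classifier that has not learned anything yet.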
189
+ ### Evaluation Dataset
190
+
191
+ #### csv
192
+
193
+ * Dataset: csv
194
+ * Size: 5,330 evaluation samples
195
+ * Columns: <code>text1</code>, <code>text2</code>, and <code>label</code>
196
+ * Approximate statistics based on the first 1000 samples:
197
+ | | text1 | text2 | label |
198
+ |:--------|:------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
199
+ | type | string | string | int |
200
+ | details | <ul><li>min: 12 tokens</li><li>mean: 36.61 tokens</li><li>max: 108 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 22.81 tokens</li><li>max: 68 tokens</li></ul> | <ul><li>0: ~32.90%</li><li>1: ~67.10%</li></ul> |
201
+ * Samples:
202
+ | text1 | text2 | label |
203
+ |:----------------------------------------------------------------------------|:---------------------------------------------------------|:---------------|
204
+ | <code>青い ジャージ の 裏 に 10 番 の ソフトボール プレーヤー が ホーム プレート に 向かって 走って い ます 。</code> | <code>ソフトボール 選手 は 10 番 です</code> | <code>0</code> |
205
+ | <code>山 の 湖 の そば の 岩 だらけ の 道 で 自転車 に 乗る 男 。</code> | <code>自転車 の 男</code> | <code>0</code> |
206
+ | <code>テント の 前 の 芝生 の 椅子 に 座って いる 赤い ひげ を 生やした ひげ を 生やした 男性 。</code> | <code>顔 の 毛 の ない 男性 と 青い シャツ を 着た 女性 が 座って い ます 。</code> | <code>1</code> |
207
+ * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)
208
+
209
+ ### Training Hyperparameters
210
+ #### Non-Default Hyperparameters
211
+
212
+ - `eval_strategy`: steps
213
+ - `num_train_epochs`: 1
214
+ - `multi_dataset_batch_sampler`: round_robin
215
+
216
+ #### All Hyperparameters
217
+ <details><summary>Click to expand</summary>
218
+
219
+ - `overwrite_output_dir`: False
220
+ - `do_predict`: False
221
+ - `eval_strategy`: steps
222
+ - `prediction_loss_only`: True
223
+ - `per_device_train_batch_size`: 8
224
+ - `per_device_eval_batch_size`: 8
225
+ - `per_gpu_train_batch_size`: None
226
+ - `per_gpu_eval_batch_size`: None
227
+ - `gradient_accumulation_steps`: 1
228
+ - `eval_accumulation_steps`: None
229
+ - `torch_empty_cache_steps`: None
230
+ - `learning_rate`: 5e-05
231
+ - `weight_decay`: 0.0
232
+ - `adam_beta1`: 0.9
233
+ - `adam_beta2`: 0.999
234
+ - `adam_epsilon`: 1e-08
235
+ - `max_grad_norm`: 1
236
+ - `num_train_epochs`: 1
237
+ - `max_steps`: -1
238
+ - `lr_scheduler_type`: linear
239
+ - `lr_scheduler_kwargs`: {}
240
+ - `warmup_ratio`: 0.0
241
+ - `warmup_steps`: 0
242
+ - `log_level`: passive
243
+ - `log_level_replica`: warning
244
+ - `log_on_each_node`: True
245
+ - `logging_nan_inf_filter`: True
246
+ - `save_safetensors`: True
247
+ - `save_on_each_node`: False
248
+ - `save_only_model`: False
249
+ - `restore_callback_states_from_checkpoint`: False
250
+ - `no_cuda`: False
251
+ - `use_cpu`: False
252
+ - `use_mps_device`: False
253
+ - `seed`: 42
254
+ - `data_seed`: None
255
+ - `jit_mode_eval`: False
256
+ - `use_ipex`: False
257
+ - `bf16`: False
258
+ - `fp16`: False
259
+ - `fp16_opt_level`: O1
260
+ - `half_precision_backend`: auto
261
+ - `bf16_full_eval`: False
262
+ - `fp16_full_eval`: False
263
+ - `tf32`: None
264
+ - `local_rank`: 0
265
+ - `ddp_backend`: None
266
+ - `tpu_num_cores`: None
267
+ - `tpu_metrics_debug`: False
268
+ - `debug`: []
269
+ - `dataloader_drop_last`: False
270
+ - `dataloader_num_workers`: 0
271
+ - `dataloader_prefetch_factor`: None
272
+ - `past_index`: -1
273
+ - `disable_tqdm`: False
274
+ - `remove_unused_columns`: True
275
+ - `label_names`: None
276
+ - `load_best_model_at_end`: False
277
+ - `ignore_data_skip`: False
278
+ - `fsdp`: []
279
+ - `fsdp_min_num_params`: 0
280
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
281
+ - `fsdp_transformer_layer_cls_to_wrap`: None
282
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
283
+ - `deepspeed`: None
284
+ - `label_smoothing_factor`: 0.0
285
+ - `optim`: adamw_torch
286
+ - `optim_args`: None
287
+ - `adafactor`: False
288
+ - `group_by_length`: False
289
+ - `length_column_name`: length
290
+ - `ddp_find_unused_parameters`: None
291
+ - `ddp_bucket_cap_mb`: None
292
+ - `ddp_broadcast_buffers`: False
293
+ - `dataloader_pin_memory`: True
294
+ - `dataloader_persistent_workers`: False
295
+ - `skip_memory_metrics`: True
296
+ - `use_legacy_prediction_loop`: False
297
+ - `push_to_hub`: False
298
+ - `resume_from_checkpoint`: None
299
+ - `hub_model_id`: None
300
+ - `hub_strategy`: every_save
301
+ - `hub_private_repo`: False
302
+ - `hub_always_push`: False
303
+ - `gradient_checkpointing`: False
304
+ - `gradient_checkpointing_kwargs`: None
305
+ - `include_inputs_for_metrics`: False
306
+ - `eval_do_concat_batches`: True
307
+ - `fp16_backend`: auto
308
+ - `push_to_hub_model_id`: None
309
+ - `push_to_hub_organization`: None
310
+ - `mp_parameters`:
311
+ - `auto_find_batch_size`: False
312
+ - `full_determinism`: False
313
+ - `torchdynamo`: None
314
+ - `ray_scope`: last
315
+ - `ddp_timeout`: 1800
316
+ - `torch_compile`: False
317
+ - `torch_compile_backend`: None
318
+ - `torch_compile_mode`: None
319
+ - `dispatch_batches`: None
320
+ - `split_batches`: None
321
+ - `include_tokens_per_second`: False
322
+ - `include_num_input_tokens_seen`: False
323
+ - `neftune_noise_alpha`: None
324
+ - `optim_target_modules`: None
325
+ - `batch_eval_metrics`: False
326
+ - `eval_on_start`: False
327
+ - `eval_use_gather_object`: False
328
+ - `batch_sampler`: batch_sampler
329
+ - `multi_dataset_batch_sampler`: round_robin
330
+
331
+ </details>
332
+
333
+ ### Training Logs
334
+ | Epoch | Step | val_accuracy |
335
+ |:------:|:----:|:------------:|
336
+ | 0.1497 | 50 | 0.7265 |
337
+ | 0.2994 | 100 | 0.7696 |
338
+ | 0.4491 | 150 | 0.7715 |
339
+ | 0.5988 | 200 | 0.7659 |
340
+ | 0.7485 | 250 | 0.7790 |
341
+ | 0.8982 | 300 | 0.7771 |
342
+ | 1.0 | 334 | 0.7782 |
343
+
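The Epoch column in the log above is the step number divided by the total number of steps in the single epoch (334). The short check below reproduces the column and also relates the step count to the dataset size. The effective batch size of 16 is an assumption (for example, two devices at the listed per-device batch size of 8); the card does not say how many devices were used.

```python
import math

TOTAL_STEPS = 334  # final step in the training log, reached at epoch 1.0
logged_steps = [50, 100, 150, 200, 250, 300, 334]

# Fraction of the epoch completed at each logged step.
epochs = [round(step / TOTAL_STEPS, 4) for step in logged_steps]
print(epochs)


def steps_per_epoch(num_samples: int, effective_batch_size: int) -> int:
    """Optimizer steps per epoch when the last partial batch is kept (dataloader_drop_last is False)."""
    return math.ceil(num_samples / effective_batch_size)


steps_batch_8 = steps_per_epoch(5330, 8)    # per-device batch size from the card
steps_batch_16 = steps_per_epoch(5330, 16)  # hypothetical effective batch size
print(steps_batch_8, steps_batch_16)
```

Only the second figure matches the 334 steps in the log, so the run either used an effective batch of 16 or trained on fewer than the 5,330 listed samples.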
344
+
345
+ ### Framework Versions
346
+ - Python: 3.10.14
347
+ - Sentence Transformers: 3.1.0
348
+ - Transformers: 4.44.2
349
+ - PyTorch: 2.4.1+cu121
350
+ - Accelerate: 0.34.2
351
+ - Datasets: 2.20.0
352
+ - Tokenizers: 0.19.1
353
+
354
+ ## Citation
355
+
356
+ ### BibTeX
357
+
358
+ #### Sentence Transformers and SoftmaxLoss
359
+ ```bibtex
360
+ @inproceedings{reimers-2019-sentence-bert,
361
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
362
+ author = "Reimers, Nils and Gurevych, Iryna",
363
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
364
+ month = "11",
365
+ year = "2019",
366
+ publisher = "Association for Computational Linguistics",
367
+ url = "https://arxiv.org/abs/1908.10084",
368
+ }
369
+ ```
370
+
371
+ <!--
372
+ ## Glossary
373
+
374
+ *Clearly define terms in order to be accessible across audiences.*
375
+ -->
376
+
377
+ <!--
378
+ ## Model Card Authors
379
+
380
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
381
+ -->
382
+
383
+ <!--
384
  ## Model Card Contact
385
 
386
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
387
+ -->
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:767031afc79755e5cf8eeeab86900a2e8924bd3eefac36867eefdd14168dd4bd
3
  size 442491744
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b7d48d5817dcd227deb4ec7ee5a28d9c10ba06a33946d13683f3a9d83b286744
3
  size 442491744
runs/Sep16_22-40-15_default/events.out.tfevents.1726526417.default.5280.0 ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1460a3634c1d755164e174435ce188f5d7ab91e529b16e7a514279f9e17071cc
3
+ size 4489