Bingsu committed on
Commit dac17a8 · verified · 1 Parent(s): 3687868

Add new SentenceTransformer model

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ 0_StaticEmbedding/tokenizer.json filter=lfs diff=lfs merge=lfs -text
0_StaticEmbedding/model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:37a060e1ef30a7a61a95f785c817f8bf445fd94aeffa84342844ba73f2249c1b
+ size 1024008288
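The pointer size can be sanity-checked against the embedding shape that the README in this commit reports, `EmbeddingBag(250002, 1024, ...)`. The sketch below assumes the weights are stored as float32 (4 bytes each); that dtype is an assumption, not something the diff states.

```python
# Shape taken from the README in this commit: EmbeddingBag(250002, 1024).
VOCAB_SIZE = 250_002
EMBED_DIM = 1_024
BYTES_PER_FLOAT32 = 4  # assumption: weights are stored as float32

# Size recorded in the Git LFS pointer for model.safetensors.
POINTER_SIZE = 1_024_008_288

tensor_bytes = VOCAB_SIZE * EMBED_DIM * BYTES_PER_FLOAT32
overhead_bytes = POINTER_SIZE - tensor_bytes

print(tensor_bytes)    # 1024008192
print(overhead_bytes)  # 96
```

Under that assumption the raw tensor accounts for all but 96 bytes of the file, which would be consistent with a small safetensors header (a length prefix plus JSON metadata).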
0_StaticEmbedding/tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:249df0778f236f6ece390de0de746838ef25b9d6954b68c2ee71249e0a9d8fd4
+ size 17082799
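Both files above are stored as Git LFS pointers: three `key value` lines giving the spec version, a SHA-256 object id, and the size in bytes. As a minimal sketch, the pointer can be parsed and a downloaded file checked against it as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any Git LFS tooling.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if `data` has the size and SHA-256 digest the pointer records."""
    if pointer["hash_algo"] != "sha256":
        raise ValueError("this sketch only handles sha256 object ids")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Pointer contents for 0_StaticEmbedding/tokenizer.json, copied from the diff.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:249df0778f236f6ece390de0de746838ef25b9d6954b68c2ee71249e0a9d8fd4
size 17082799
"""
pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algo"], pointer["size"])
```

For a file as large as `model.safetensors`, hashing in chunks rather than reading all bytes into memory would be the more practical choice.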
README.md ADDED
@@ -0,0 +1,143 @@
+ ---
+ license: apache-2.0
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ base_model: Snowflake/snowflake-arctic-embed-l-v2.0
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+
+ # Static Embeddings from Snowflake/snowflake-arctic-embed-l-v2.0
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Snowflake/snowflake-arctic-embed-l-v2.0](https://huggingface.co/Snowflake/snowflake-arctic-embed-l-v2.0). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [Snowflake/snowflake-arctic-embed-l-v2.0](https://huggingface.co/Snowflake/snowflake-arctic-embed-l-v2.0) <!-- at revision 7f311bb640ad3babc0a4e3a8873240dcba44c9d2 -->
+ - **Maximum Sequence Length:** inf tokens
+ - **Output Dimensionality:** 1024 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ - **License:** apache-2.0
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): StaticEmbedding(
+     (embedding): EmbeddingBag(250002, 1024, mode='mean')
+   )
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("mykor/static-arctic-embed-l-v2.0")
+ # Run inference
+ sentences = [
+     'The weather is lovely today.',
+     "It's so sunny outside!",
+     'He drove to the stadium.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 1024)
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # torch.Size([3, 3])
+ ```
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Framework Versions
+ - Python: 3.11.11
+ - Sentence Transformers: 3.3.1
+ - Transformers: 4.47.1
+ - PyTorch: 2.5.1+cu121
+ - Accelerate: 1.2.1
+ - Datasets:
+ - Tokenizers: 0.21.0
+
+ ## Citation
+
+ ### BibTeX
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
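The architecture in the README is a single `EmbeddingBag(..., mode='mean')`, which amounts to looking up one vector per token id and averaging them. The pure-Python sketch below illustrates that idea with a made-up 3×2 table; `mean_pool` and `toy_table` are hypothetical names for illustration, not the library's implementation.

```python
def mean_pool(token_ids: list[int], table: list[list[float]]) -> list[float]:
    """Average the embedding rows selected by `token_ids` (mode='mean')."""
    if not token_ids:
        raise ValueError("need at least one token id")
    dim = len(table[0])
    sums = [0.0] * dim
    for token_id in token_ids:
        for j, value in enumerate(table[token_id]):
            sums[j] += value
    return [s / len(token_ids) for s in sums]


# Toy stand-in for the 250002 x 1024 embedding table.
toy_table = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]

# A "sentence" of four token ids; repeated ids simply count twice.
sentence_vec = mean_pool([0, 1, 2, 2], toy_table)
print(sentence_vec)  # [0.75, 0.75]
```

Because the average involves no positional information or attention window, it works for any number of tokens, which is consistent with the README listing the maximum sequence length as `inf`.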
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.3.1",
+     "transformers": "4.47.1",
+     "pytorch": "2.5.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
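The `similarity_fn_name` field selects how embeddings are compared. As a rough illustration only, the sketch below reads that field and dispatches to a plain-Python cosine similarity; the `SIMILARITY_FNS` lookup table is hypothetical and is not how Sentence Transformers is implemented internally.

```python
import json
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Relevant fields of config_sentence_transformers.json from the diff.
config = json.loads(
    '{"prompts": {}, "default_prompt_name": null, "similarity_fn_name": "cosine"}'
)

# Hypothetical lookup table mapping config names to functions.
SIMILARITY_FNS = {"cosine": cosine_similarity}
similarity = SIMILARITY_FNS[config["similarity_fn_name"]]

score = similarity([1.0, 0.0], [1.0, 1.0])
print(round(score, 4))  # 0.7071
```

An empty `prompts` mapping together with a `null` `default_prompt_name` means no prompt text is configured for inputs by default.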
modules.json ADDED
@@ -0,0 +1,8 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "0_StaticEmbedding",
+     "type": "sentence_transformers.models.StaticEmbedding"
+   }
+ ]
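`modules.json` lists the pipeline stages in order; here there is a single `StaticEmbedding` stage whose files live under `0_StaticEmbedding/`. The sketch below shows one way such a manifest could be read, splitting each dotted `type` into a module path and class name. It only parses strings and does not reproduce the library's actual loading code.

```python
import json

# Contents of modules.json as added in this commit.
modules_json = """
[
  {
    "idx": 0,
    "name": "0",
    "path": "0_StaticEmbedding",
    "type": "sentence_transformers.models.StaticEmbedding"
  }
]
"""

# Sort by idx so stages are processed in pipeline order.
modules = sorted(json.loads(modules_json), key=lambda m: m["idx"])

resolved = []
for module in modules:
    # Split "package.module.ClassName" at the last dot.
    module_path, _, class_name = module["type"].rpartition(".")
    resolved.append((module["idx"], module["path"], module_path, class_name))

print(resolved)
# [(0, '0_StaticEmbedding', 'sentence_transformers.models', 'StaticEmbedding')]
```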