leoozy committed on
Commit
865b157
1 Parent(s): b12f71d

Create README.md

Files changed (1)
  1. README.md +86 -0
README.md ADDED
@@ -0,0 +1,86 @@
+ ---
+ language:
+ - en
+ pipeline_tag: feature-extraction
+ ---
+
+
+ # Model Card for SynCSE-partial
+
+ # Model Details
+
+ ## Model Description
+
+ More information needed
+
+ - **Developed by:** SJTU-LIT
+ - **Shared by [Optional]:** SJTU-LIT
+
+ - **Model type:** Feature Extraction
+ - **Language(s) (NLP):** English
+ - **License:** More information needed
+ - **Parent Model:** RoBERTa-base
+ - **Resources for more information:**
+   - [GitHub Repo](https://github.com/SJTU-LIT/SynCSE/tree/main)
+   - [Associated Paper](https://arxiv.org/abs/2305.15077)
+
+
+ # Uses
+
+
+ ## Direct Use
+ This model can be used for feature extraction, that is, to compute sentence embeddings for English text.
+
+ ## Out-of-Scope Use
+
+ The model should not be used to intentionally create hostile or alienating environments for people.
+
+ # Bias, Risks, and Limitations
+
+ Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
+ ## Recommendations
+
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+
+
+ # Training Data
+
+ The model creators note in the [GitHub repository](https://github.com/SJTU-LIT/SynCSE/blob/main/README.md):
+ > We use 26.2k generated synthetic [data to] train SynCSE-partial-RoBERTa-base.
+
+ # Citation
+
+
+ **BibTeX:**
+
+ ```bibtex
+ @article{zhang2023contrastive,
+   title={Contrastive Learning of Sentence Embeddings from Scratch},
+   author={Zhang, Junlei and Lan, Zhenzhong and He, Junxian},
+   journal={arXiv preprint arXiv:2305.15077},
+   year={2023}
+ }
+ ```
+
+ # Model Card Contact
+
+ If you have any questions about the code or the paper, feel free to email Junlei (`zhangjunlei@westlake.edu.cn`). If you encounter any problems when using the code, or want to report a bug, you can open an issue. Please describe the problem in detail so we can help you more quickly and effectively.
+
+
+
+ # How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ <details>
+ <summary> Click to expand </summary>
+
+ ```python
+ from transformers import AutoTokenizer, AutoModel
+ tokenizer = AutoTokenizer.from_pretrained("sjtu-lit/SynCSE-partial-RoBERTa-base")
+ model = AutoModel.from_pretrained("sjtu-lit/SynCSE-partial-RoBERTa-base")
+
+ ```
+ </details>
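
The snippet in the README only loads the tokenizer and model. As a minimal sketch of how the loaded model might then be used for feature extraction, the example below turns sentences into embeddings and compares them. It rests on an assumption the model card does not confirm: the hidden state of the first token (`<s>` for RoBERTa) is taken as the sentence embedding. The helper names `embed_sentences`, `cosine_similarity_matrix`, and `main` are hypothetical and chosen for illustration only.

```python
import torch
import torch.nn.functional as F


def embed_sentences(sentences, tokenizer, model):
    """Return one L2-normalized embedding per sentence.

    Assumption: the first-token (<s>) hidden state serves as the sentence
    embedding. The model card does not state the pooling strategy.
    """
    inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    first_token = outputs.last_hidden_state[:, 0]  # shape: (batch, hidden)
    return F.normalize(first_token, dim=-1)


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarities; expects L2-normalized rows."""
    return embeddings @ embeddings.T


def main():
    # Imported here so the helpers above can be used without transformers.
    from transformers import AutoModel, AutoTokenizer

    name = "sjtu-lit/SynCSE-partial-RoBERTa-base"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    model.eval()  # disable dropout for deterministic embeddings

    sentences = [
        "A woman is reading a book.",
        "A lady is reading.",
        "The stock market fell sharply today.",
    ]
    embeddings = embed_sentences(sentences, tokenizer, model)
    print(cosine_similarity_matrix(embeddings))
```

Calling `main()` downloads the checkpoint from the Hugging Face Hub and prints a 3×3 similarity matrix. If the authors' repository documents a different pooling strategy, swap the first-token line for that strategy.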