Varine committed
Commit e9b6856
1 Parent(s): 581961c

Update README.md

Files changed (1)
  1. README.md +15 -75
README.md CHANGED
@@ -48,11 +48,10 @@ This model can be used for translation tasks between Chinese and English.
  As it's a conventional translation model, it can be used in many circumstances, including the translation of academic papers and news, and **even some literary works (given the model's strong performance on grammar and in multi-context cases)**.
 
  ## Bias, Risks, and Limitations
- **1.Remember this is a beta version of this translation model,thus we add the limitation on the scale of input tokens, so plz make sure the scale of your input text won't overflow the limit.**
+ **1. Remember that this is a beta version of the translation model, so we limit the number of input tokens; please make sure your input text does not exceed that limit.**
  **2. DO NOT APPLY THIS MODEL FOR ILLEGAL USES.**
  <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 
- [More Information Needed]
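Since the card describes plain zh-to-en translation with a cap on input tokens, a minimal usage sketch may help here. It assumes the checkpoint works with the standard `transformers` translation pipeline (as its MarianMT-style `opus-mt-zh-en` name suggests); the `max_length` value is an illustrative placeholder, not the model's documented limit.

```python
from transformers import pipeline

# Minimal sketch, assuming the checkpoint is MarianMT-compatible as its
# opus-mt-zh-en name suggests. The 512-token cap below is illustrative;
# respect the beta input-size limit mentioned in the card.
translator = pipeline("translation_zh_to_en", model="Varine/opus-mt-zh-en-model")

text = "今天天气很好。"  # "The weather is very nice today."
print(translator(text, max_length=512)[0]["translation_text"])
```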
 
  ### Recommendations
 
@@ -88,20 +87,11 @@ git clone "https://huggingface.co/Varine/opus-mt-zh-en-model"
  As the dataset we chose for training is tremendous in scale, after some analysis we decided to train on only 4% of the whole dataset, and we spread that 4% over 10 epochs so that the training loss and validation loss could be evaluated at every stage.
  Moreover, we should note that the data used in our training process takes the form of Chinese-English sentence pairs (so that they can be better embedded and compared in the higher-dimensional space of the Transformer architecture).
 
- #### Preprocessing [optional]
-
- [More Information Needed]
 
 
  #### Training Hyperparameters
 
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+ - **Training regime:** **fp32** <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]
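As a rough illustration of the 4% sample, the 10-epoch schedule, and the fp32 regime described above, here is a hedged sketch using the `datasets` and `transformers` libraries. The dataset ID matches the wmt/wmt19 pair named under Testing Data below; the seed and every other hyperparameter are illustrative assumptions, not the authors' recorded settings.

```python
from datasets import load_dataset
from transformers import Seq2SeqTrainingArguments

# Load the Chinese-English sentence pairs (the card's evaluation section
# names wmt/wmt19, so the same corpus is assumed here for training).
raw = load_dataset("wmt19", "zh-en", split="train")

# Keep roughly 4% of the corpus, as the card describes (seed is illustrative).
subset = raw.shuffle(seed=42).select(range(int(0.04 * len(raw))))

# fp32 regime: leave fp16/bf16 disabled (the transformers defaults).
# All other values are placeholders, not the authors' actual settings.
args = Seq2SeqTrainingArguments(
    output_dir="opus-mt-zh-en-model",
    num_train_epochs=10,            # the card reports a 10-epoch schedule
    evaluation_strategy="epoch",    # track training/validation loss per epoch
    fp16=False,
    bf16=False,
)
```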
 
  ## Evaluation
 
@@ -112,91 +102,41 @@ Moreover, we should note that the data used in our training process takes the form of Chinese-English sentence pairs.
  #### Testing Data
 
  <!-- This should link to a Dataset Card if possible. -->
+ - wmt/wmt19
 
- [More Information Needed]
-
- #### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]
-
- #### Metrics
-
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]
-
- ### Results
-
- [More Information Needed]
-
- #### Summary
-
-
-
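The card names wmt/wmt19 as testing data but leaves the metric blank; BLEU is the usual choice for this kind of model. Below is a minimal, assumed evaluation sketch (sacreBLEU via the `evaluate` library, scoring a small slice of the WMT19 zh-en validation set); it is not the authors' recorded procedure.

```python
import evaluate
from datasets import load_dataset
from transformers import pipeline

# Assumed evaluation loop: score the model on a small sample of the WMT19
# zh-en validation set with sacreBLEU. Sample size and metric are illustrative.
translator = pipeline("translation_zh_to_en", model="Varine/opus-mt-zh-en-model")
data = load_dataset("wmt19", "zh-en", split="validation[:100]")

preds = [
    translator(ex["translation"]["zh"], max_length=512)[0]["translation_text"]
    for ex in data
]
refs = [[ex["translation"]["en"]] for ex in data]

bleu = evaluate.load("sacrebleu")
print(bleu.compute(predictions=preds, references=refs)["score"])
```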
- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
- ## Environmental Impact
-
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
 
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+ ## Hardware used in the training
 
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]
+ - **Hardware Type:** **1x Nvidia A10 GPU with 30 vCPUs, 200 GiB RAM, and 1 TiB SSD storage**
+ - **Hours used:** **4.08 hours (rough estimate)**
+ - **Cloud Provider:** **Lambda Cloud**
+ - **Compute Region:** **California, USA**
+ - **Carbon Emitted:** **N/A**
 
- ## Technical Specifications [optional]
 
  ### Model Architecture and Objective
 
- [More Information Needed]
+ We use the Transformer architecture (the Hugging Face implementation) in this model; it is a universal architecture widely used for machine translation tasks.
 
  ### Compute Infrastructure
 
- [More Information Needed]
+ Because of the limited compute available on a personal PC and the scale of the dataset, we decided to train our model on a GPU cloud, which proved effective.
 
  #### Hardware
 
- [More Information Needed]
+ **Thanks to Lambda Cloud, we used an Nvidia A10 GPU to complete the project.**
 
  #### Software
 
- [More Information Needed]
+ **We used a Jupyter Notebook in the cloud to run our code.**
-
- ## Citation [optional]
 
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
 
- **BibTeX:**
 
- [More Information Needed]
 
- **APA:**
-
- [More Information Needed]
-
- ## Glossary [optional]
-
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
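To check the architecture note above, one can inspect the published config. This is a hedged sketch; it assumes the repo ships a standard `config.json`, as Hugging Face model repos normally do.

```python
from transformers import AutoConfig

# Hedged sketch: inspect the checkpoint's config to confirm it is a standard
# encoder-decoder Transformer (MarianMT-style), per the architecture note above.
config = AutoConfig.from_pretrained("Varine/opus-mt-zh-en-model")
print(config.model_type)                             # expected: "marian" for opus-mt checkpoints
print(config.encoder_layers, config.decoder_layers)  # encoder/decoder depth
```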
 
  ## Model Card Authors [optional]
 
- [More Information Needed]
+ **Varine Xie**
 
  ## Model Card Contact
-
- [More Information Needed]
+ **Please contact me by email at <varine7499@gmail.com>; I'm glad to receive feedback from you all!** 😊