Update README.md
README.md CHANGED
@@ -29,19 +29,20 @@ widget:
 
 - [Overview](#overview)
 - [Model Description](#model-description)
-- [How to Use](#how-to-use)
 - [Intended Uses and Limitations](#intended-uses-and-limitations)
+- [How to Use](#how-to-use)
+- [Limitations and bias](#limitations-and-bias)
 - [Training](#training)
 - [Training Data](#training-data)
 - [Training Procedure](#training-procedure)
 - [Additional Information](#additional-information)
-
-
-
-
-
-
-
+- [Contact Information](#contact-information)
+- [Copyright](#copyright)
+- [Licensing Information](#licensing-information)
+- [Funding](#funding)
+- [Citation Information](#citation-information)
+- [Contributions](#contributions)
+- [Disclaimer](#disclaimer)
 
 </details>
 
@@ -54,6 +55,11 @@ widget:
 ## Model Description
 **GPT2-large-bne** is a transformer-based model for the Spanish language. It is based on the [GPT-2](http://www.persagen.com/files/misc/radford2019language.pdf) model and has been pre-trained using the largest Spanish corpus known to date, with a total of 570GB of clean and deduplicated text processed for this work, compiled from the web crawls performed by the [National Library of Spain (Biblioteca Nacional de España)](http://www.bne.es/en/Inicio/index.html) from 2009 to 2019.
 
+
+## Intended Uses and Limitations
+
+You can use the raw model for text generation or fine-tune it to a downstream task.
+
 ## How to Use
 
 Here is how to use this model:
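The usage snippet that "Here is how to use this model:" introduces falls between the hunks and is not shown in this diff. As a reference point, a minimal text-generation sketch would look like the following; the Hugging Face model id `PlanTL-GOB-ES/gpt2-large-bne` is assumed from the card's naming rather than confirmed by the diff:

```python
# Minimal sketch; the model id PlanTL-GOB-ES/gpt2-large-bne is an assumption.
from transformers import pipeline, set_seed

generator = pipeline("text-generation", model="PlanTL-GOB-ES/gpt2-large-bne")
set_seed(42)

# Sample a short continuation of a Spanish prompt.
print(generator("El libro más famoso de Cervantes es", max_length=30))
```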
@@ -87,9 +93,7 @@ Here is how to use this model to get the features of a given text in PyTorch:
 torch.Size([1, 14, 1280])
 ```
 
-##
-
-You can use the raw model for text generation or fine-tune it to a downstream task.
+## Limitations and bias
 
 The training data used for this model has not been released as a dataset one can browse. We know it contains a lot of
 unfiltered content from the internet, which is far from neutral. Here's an example of how the model can have biased predictions:
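The PyTorch feature-extraction code referenced by this hunk's context line is also elided. A minimal sketch consistent with the `torch.Size([1, 14, 1280])` output kept above (1280 is the hidden size of GPT-2 large), under the same model-id assumption:

```python
# Minimal sketch; the model id PlanTL-GOB-ES/gpt2-large-bne is an assumption.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("PlanTL-GOB-ES/gpt2-large-bne")
model = AutoModel.from_pretrained("PlanTL-GOB-ES/gpt2-large-bne")

encoded_input = tokenizer("Un texto de ejemplo en español.", return_tensors="pt")
output = model(**encoded_input)

# Last-layer hidden states: (batch, sequence_length, hidden_size).
# A 14-token input yields torch.Size([1, 14, 1280]).
print(output.last_hidden_state.shape)
```

The biased-predictions example promised by the last context line likewise sits between hunks; such probes typically sample several continuations of one template prompt, along these lines:

```python
# Hypothetical bias probe: sample multiple continuations of the same prompt.
from transformers import pipeline, set_seed

generator = pipeline("text-generation", model="PlanTL-GOB-ES/gpt2-large-bne")
set_seed(42)
for sample in generator("La mujer trabaja como", max_length=10,
                        num_return_sequences=5, do_sample=True):
    print(sample["generated_text"])
```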
@@ -141,9 +145,22 @@ The training lasted a total of 10 days with 32 computing nodes each one with 4 N
 
 ## Additional Information
 
-###
+### Contact Information
+
+For further information, send an email to <plantl-gob-es@bsc.es>
+
+### Copyright
+
+Copyright by the Spanish State Secretariat for Digitalization and Artificial Intelligence (SEDIA) (2022)
+
+### Licensing Information
+
+This work is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+
+### Funding
+
+This work was funded by the Spanish State Secretariat for Digitalization and Artificial Intelligence (SEDIA) within the framework of the Plan-TL.
 
-The Text Mining Unit from Barcelona Supercomputing Center.
 
 ### Citation Information
 If you use this model, please cite our [paper](http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6405):
@@ -166,21 +183,10 @@ Intelligence (SEDIA) within the framework of the Plan-TL.},
 
 ```
 
-###
+### Contributions
 
-
-
-### Funding
+[N/A]
 
-This work was funded by the Spanish State Secretariat for Digitalization and Artificial Intelligence (SEDIA) within the framework of the Plan-TL.
-
-### Licensing Information
-
-This work is licensed under a [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
-
-### Copyright
-
-Copyright by the Spanish State Secretariat for Digitalization and Artificial Intelligence (SEDIA) (2022)
 
 ### Disclaimer
 