Davlan committed on
Commit c59daa4
1 Parent(s): 897d3a0

Update README.md

Files changed (1)
  1. README.md +182 -50
README.md CHANGED
@@ -1,58 +1,190 @@
  ---
- base_model: xlm-r-large-script_expand
  tags:
  - generated_from_trainer
- metrics:
- - accuracy
  model-index:
- - name: afro-xlmr_large_76L_script
  results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # afro-xlmr_large_76L_script
-
- This model is a fine-tuned version of [xlm-r-large-script_expand](https://huggingface.co/xlm-r-large-script_expand) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 1.0044
- - Accuracy: 0.7963
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 5e-05
- - train_batch_size: 40
- - eval_batch_size: 32
- - seed: 42
- - gradient_accumulation_steps: 8
- - total_train_batch_size: 320
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 3.0
-
- ### Training results
-
-
-
- ### Framework versions
-
- - Transformers 4.34.1
- - Pytorch 2.1.0+cu121
- - Datasets 2.14.5
- - Tokenizers 0.14.1
  ---
+ license: mit
  tags:
  - generated_from_trainer
  model-index:
+ - name: afro-xlmr-large-76L_script
  results: []
+ language:
+ - en
+ - am
+ - ar
+ - so
+ - sw
+ - pt
+ - af
+ - fr
+ - zu
+ - mg
+ - ha
+ - sn
+ - arz
+ - ny
+ - ig
+ - xh
+ - yo
+ - st
+ - rw
+ - tn
+ - ti
+ - ts
+ - om
+ - run
+ - nso
+ - ee
+ - ln
+ - tw
+ - pcm
+ - gaa
+ - loz
+ - lg
+ - guw
+ - bem
+ - efi
+ - lue
+ - lua
+ - toi
+ - ve
+ - tum
+ - tll
+ - iso
+ - kqn
+ - zne
+ - umb
+ - mos
+ - tiv
+ - lu
+ - ff
+ - kwy
+ - bci
+ - rnd
+ - luo
+ - wal
+ - ss
+ - lun
+ - wo
+ - nyk
+ - kj
+ - ki
+ - fon
+ - bm
+ - cjk
+ - din
+ - dyu
+ - kab
+ - kam
+ - kbp
+ - kr
+ - kmb
+ - kg
+ - nus
+ - sg
+ - taq
+ - tzm
+ - nqo
  ---

+ # afro-xlmr-large-76L_script
+
+ AfroXLMR-large was created by first augmenting the XLM-R-large model with its missing scripts (N'Ko and Tifinagh), followed by MLM adaptation of the expanded XLM-R-large model on 76 languages widely spoken in Africa,
+ including 4 high-resource languages.
+
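+ A minimal usage sketch with the Hugging Face `transformers` fill-mask pipeline; the repository id `Davlan/afro-xlmr-large-76L-script` used below is an assumption and should be replaced with this model's actual id:
+
+ ```python
+ from transformers import pipeline
+
+ # Hypothetical repository id; substitute the actual id of this checkpoint.
+ model_id = "Davlan/afro-xlmr-large-76L-script"
+
+ # XLM-R-based models use <mask> as the mask token.
+ unmasker = pipeline("fill-mask", model=model_id)
+
+ # Swahili example: "Dortmund is a <mask> of Germany."
+ print(unmasker("Dortmund ni <mask> wa Ujerumani."))
+ ```
+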
+ ### Pre-training corpus
+ A mix of mC4, Wikipedia and OPUS data.
+
+ ### Languages
+
+ There are 76 languages available:
+ - English (eng)
+ - Amharic (amh)
+ - Arabic (ara)
+ - Somali (som)
+ - Kiswahili (swa)
+ - Portuguese (por)
+ - Afrikaans (afr)
+ - French (fra)
+ - isiZulu (zul)
+ - Malagasy (mlg)
+ - Hausa (hau)
+ - chiShona (sna)
+ - Egyptian Arabic (arz)
+ - Chichewa (nya)
+ - Igbo (ibo)
+ - isiXhosa (xho)
+ - Yorùbá (yor)
+ - Sesotho (sot)
+ - Kinyarwanda (kin)
+ - Setswana (tsn)
+ - Tigrinya (tir)
+ - Tsonga (tso)
+ - Oromo (orm)
+ - Rundi (run)
+ - Northern Sotho (nso)
+ - Ewe (ewe)
+ - Lingala (lin)
+ - Twi (twi)
+ - Nigerian Pidgin (pcm)
+ - Ga (gaa)
+ - Lozi (loz)
+ - Luganda (lug)
+ - Gun (guw)
+ - Bemba (bem)
+ - Efik (efi)
+ - Luvale (lue)
+ - Luba-Lulua (lua)
+ - Tonga (toi)
+ - Tshivenḓa (ven)
+ - Tumbuka (tum)
+ - Tetela (tll)
+ - Isoko (iso)
+ - Kaonde (kqn)
+ - Zande (zne)
+ - Umbundu (umb)
+ - Mossi (mos)
+ - Tiv (tiv)
+ - Luba-Katanga (lub)
+ - Fula (fuv)
+ - San Salvador Kongo (kwy)
+ - Baoulé (bci)
+ - Ruund (rnd)
+ - Luo (luo)
+ - Wolaitta (wal)
+ - Swazi (ssw)
+ - Lunda (lun)
+ - Wolof (wol)
+ - Nyaneka (nyk)
+ - Kwanyama (kua)
+ - Kikuyu (kik)
+ - Fon (fon)
+ - Bambara (bam)
+ - Chokwe (cjk)
+ - Dinka (dik)
+ - Dyula (dyu)
+ - Kabyle (kab)
+ - Kamba (kam)
+ - Kabiyè (kbp)
+ - Kanuri (knc)
+ - Kimbundu (kmb)
+ - Kikongo (kon)
+ - Nuer (nus)
+ - Sango (sag)
+ - Tamasheq (taq)
+ - Tamazight (tzm)
+ - N'Ko (nqo)
+
+
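+ Since the vocabulary was expanded to cover the N'Ko and Tifinagh scripts, a quick sketch (same assumed repository id as above) to check that Tifinagh text is split into real subword pieces rather than unknown tokens:
+
+ ```python
+ from transformers import AutoTokenizer
+
+ # Hypothetical repository id; substitute the actual id of this checkpoint.
+ tokenizer = AutoTokenizer.from_pretrained("Davlan/afro-xlmr-large-76L-script")
+
+ # "Tamazight" written in the Tifinagh script.
+ pieces = tokenizer.tokenize("ⵜⴰⵎⴰⵣⵉⵖⵜ")
+ print(pieces)
+
+ # With the expanded vocabulary, no piece should be the unknown token.
+ print(tokenizer.unk_token not in pieces)
+ ```
+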
+ ### Acknowledgment
+ We would like to thank Google Cloud for providing us access to a TPU v3-8 through free cloud credits. The model was trained using Flax and then converted to PyTorch.
+
+
+ ### BibTeX entry and citation info
+ ```
+ @misc{adelani2023sib200,
+ title={SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects},
+ author={David Ifeoluwa Adelani and Hannah Liu and Xiaoyu Shen and Nikita Vassilyev and Jesujoba O. Alabi and Yanke Mao and Haonan Gao and Annie En-Shiun Lee},
+ year={2023},
+ eprint={2309.07445},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```