AkimfromParis committed on
Commit 412b71f · verified · 1 Parent(s): 53cd244

Update font display source license v1.2

Files changed (1)
  1. src/about.py +35 -92
src/about.py CHANGED
@@ -90,9 +90,7 @@ LLM_BENCHMARKS_TEXT = f"""
90
  ## How it works
91
  📈 We evaluate Japanese Large Language Models on 52 key benchmarks leveraging our evaluation tool [llm-jp-eval](https://github.com/llm-jp/llm-jp-eval), a unified framework to evaluate Japanese LLMs on various evaluation tasks.
92
 
93
- Benchmarks:
94
-
95
- - **NLI (Natural Language Inference)**
96
 
97
  * `Jamp`, a Japanese NLI benchmark focused on temporal inference [Source](https://github.com/tomo-ut/temporalNLI_dataset) (License CC BY-SA 4.0)
98
 
@@ -104,126 +102,71 @@ Benchmarks:
104
 
105
  * `JSICK`, Japanese Sentences Involving Compositional Knowledge [Source](https://github.com/verypluming/JSICK) (License CC BY-SA 4.0)
106
 
107
- - **NQA (Question Answering)**
108
-
109
- ###JEMHopQA
110
-
111
- [Source](https://github.com/aiishii/JEMHopQA)
112
- (License CC BY-SA 4.0)
113
-
114
- ###NIILC
115
-
116
- [Source](https://github.com/mynlp/niilc-qa)
117
- (License CC BY-SA 4.0)
118
-
119
- ###JAQKET (AIO)
120
-
121
- [Source](https://www.nlp.ecei.tohoku.ac.jp/projects/jaqket/)
122
- (License CC BY-SA 4.0)(Other licenses are required for corporate usage)
123
-
124
- - **RC (Reading Comprehension)**
125
-
126
- ###JSQuAD
127
-
128
- [Source](https://github.com/yahoojapan/JGLUE)
129
- (License CC BY-SA 4.0)
130
 
131
- - **MC (Multiple Choice question answering)**
132
 
133
- ###JCommonsenseMorality
134
 
135
- [Source](https://github.com/Language-Media-Lab/commonsense-moral-ja)
136
- License:MIT License
137
 
138
- ###JCommonsenseQA
139
 
140
- [Source](https://github.com/yahoojapan/JGLUE)
141
- (License CC BY-SA 4.0)
142
 
143
- ###Kyoto University Commonsense Inference dataset (KUCI)
144
 
145
- [Source](https://github.com/ku-nlp/KUCI
146
- (License CC BY-SA 4.0)
147
 
148
- - **EL (Entity Linking)**
149
 
150
- ###chABSA
151
 
152
- [Source](https://github.com/chakki-works/chABSA-dataset)
153
- (License CC BY-SA 4.0)
154
 
155
- - **FA (Fundamental Analysis)**
156
 
157
- ###Wikipedia Annotated Corpus
158
 
159
- [Source](https://github.com/ku-nlp/WikipediaAnnotatedCorpus)
160
- (License CC BY-SA 4.0)
161
  List of tasks:
162
 
163
- Reading Prediction
164
- Named-entity recognition (NER)
165
- Dependency Parsing
166
- Predicate-argument structure analysis (PAS)
167
- Coreference Resolution
168
 
169
- - **MR (Mathematical Reasoning)**
170
 
171
- ###MAWPS
172
 
173
- [Source](https://github.com/nlp-waseda/chain-of-thought-ja-dataset)
174
- License:Apache-2.0
175
 
176
- ###MGSM
177
 
178
- [Source](https://huggingface.co/datasets/juletxara/mgsm)
179
- License:MIT License
180
 
181
- - **MT (Machine Translation)**
182
-
183
- ###Asian Language Treebank (ALT) - Parallel Corpus
184
-
185
- [Source](https://www2.nict.go.jp/astrec-att/member/mutiyama/ALT/index.html)
186
- (License CC BY-SA 4.0)
187
-
188
- ###WikiCorpus (Japanese-English Bilingual Corpus of Wikipedia's articles about the city of Kyoto)
189
-
190
- [Source](https://alaginrc.nict.go.jp/WikiCorpus/)
191
- License:CC BY-SA 3.0 deed
192
-
193
- - **STS (Semantic Textual Similarity)**
194
 
195
  This task is supported by llm-jp-eval, but it is not included in the evaluation score average.
196
 
197
- ###JSTS
198
-
199
- [Source](https://github.com/yahoojapan/JGLUE)
200
- (License CC BY-SA 4.0)
201
-
202
- - **HE (Human Examination)**
203
-
204
- ###MMLU
205
-
206
- [Source](https://github.com/hendrycks/test)
207
- License:MIT License
208
-
209
- ###JMMLU
210
 
211
- [Source](https://github.com/nlp-waseda/JMMLU)
212
- License:CC BY-SA 4.0(3 tasks under the CC BY-NC-ND 4.0 license)
213
 
214
- - **CG (Code Generation)**
215
 
216
- ###MBPP
217
 
218
- [Source](https://huggingface.co/datasets/llm-jp/mbpp-ja)
219
- (License CC BY-SA 4.0)
220
 
221
- - **SUM (Summarization)**
222
 
223
- ###XL-Sum
224
 
225
- [Source](https://github.com/csebuetnlp/xl-sum)
226
- License:CC BY-NC-SA 4.0(Due to the non-commercial license, this dataset will not be used, unless you specifically agree to the license and terms of use)
227
 
228
 
229
  ## Reproducibility
 
90
  ## How it works
91
  📈 We evaluate Japanese Large Language Models on 52 key benchmarks leveraging our evaluation tool [llm-jp-eval](https://github.com/llm-jp/llm-jp-eval), a unified framework to evaluate Japanese LLMs on various evaluation tasks.
92
 
93
+ **NLI (Natural Language Inference)**
94
 
95
  * `Jamp`, a Japanese NLI benchmark focused on temporal inference [Source](https://github.com/tomo-ut/temporalNLI_dataset) (License CC BY-SA 4.0)
96
 
 
102
 
103
  * `JSICK`, Japanese Sentences Involving Compositional Knowledge [Source](https://github.com/verypluming/JSICK) (License CC BY-SA 4.0)
104
 
105
+ **NQA (Question Answering)**
106
 
107
+ * `JEMHopQA`, Japanese Explainable Multi-hop Question Answering [Source](https://github.com/aiishii/JEMHopQA) (License CC BY-SA 4.0)
108
 
109
+ * `NIILC`, NIILC Question Answering Dataset [Source](https://github.com/mynlp/niilc-qa) (License CC BY-SA 4.0)
110
 
111
+ * `JAQKET`, Japanese QA dataset on the subject of quizzes [Source](https://www.nlp.ecei.tohoku.ac.jp/projects/jaqket/) (License CC BY-SA 4.0; other licenses are required for corporate use)
 
112
 
113
+ **RC (Reading Comprehension)**
114
 
115
+ * `JSQuAD`, Japanese version of SQuAD (part of JGLUE) [Source](https://github.com/yahoojapan/JGLUE) (License CC BY-SA 4.0)
 
116
 
117
+ **MC (Multiple Choice question answering)**
118
 
119
+ * `JCommonsenseMorality`, Japanese dataset for evaluating commonsense morality understanding [Source](https://github.com/Language-Media-Lab/commonsense-moral-ja) (License MIT License)
 
120
 
121
+ * `JCommonsenseQA`, Japanese version of CommonsenseQA [Source](https://github.com/yahoojapan/JGLUE) (License CC BY-SA 4.0)
122
 
123
+ * `KUCI`, Kyoto University Commonsense Inference dataset [Source](https://github.com/ku-nlp/KUCI) (License CC BY-SA 4.0)
124
 
125
+ **EL (Entity Linking)**
 
126
 
127
+ * `chABSA`, Aspect-Based Sentiment Analysis dataset [Source](https://github.com/chakki-works/chABSA-dataset) (License CC BY-SA 4.0)
128
 
129
+ **FA (Fundamental Analysis)**
130
 
131
+ * `Wikipedia Annotated Corpus` [Source](https://github.com/ku-nlp/WikipediaAnnotatedCorpus) (License CC BY-SA 4.0)
 
132
  List of tasks:
133
+ - Reading Prediction
134
+ - Named-entity recognition (NER)
135
+ - Dependency Parsing
136
+ - Predicate-argument structure analysis (PAS)
137
+ - Coreference Resolution
138
 
139
+ **MR (Mathematical Reasoning)**
140
 
141
+ * `MAWPS`, Japanese version of MAWPS (A Math Word Problem Repository) [Source](https://github.com/nlp-waseda/chain-of-thought-ja-dataset) (License Apache-2.0)
142
 
143
+ * `MGSM`, Japanese part of MGSM (Multilingual Grade School Math Benchmark) [Source](https://huggingface.co/datasets/juletxara/mgsm) (License MIT License)
144
 
145
+ **MT (Machine Translation)**
 
146
 
147
+ * `ALT`, Asian Language Treebank (ALT) - Parallel Corpus [Source](https://www2.nict.go.jp/astrec-att/member/mutiyama/ALT/index.html) (License CC BY-SA 4.0)
148
 
149
+ * `WikiCorpus`, Japanese-English Bilingual Corpus of Wikipedia's articles about the city of Kyoto [Source](https://alaginrc.nict.go.jp/WikiCorpus/) (License CC BY-SA 3.0)
 
150
 
151
+ **STS (Semantic Textual Similarity)**
152
 
153
  This task is supported by llm-jp-eval, but it is not included in the evaluation score average.
154
 
155
+ * `JSTS`, Japanese version of STS (Semantic Textual Similarity), part of JGLUE [Source](https://github.com/yahoojapan/JGLUE) (License CC BY-SA 4.0)
156
 
157
+ **HE (Human Examination)**
 
158
 
159
+ * `MMLU`, Measuring Massive Multitask Language Understanding [Source](https://github.com/hendrycks/test) (License MIT License)
160
 
161
+ * `JMMLU`, Japanese Massive Multitask Language Understanding Benchmark [Source](https://github.com/nlp-waseda/JMMLU) (License CC BY-SA 4.0; 3 tasks under the CC BY-NC-ND 4.0 license)
162
 
163
+ **CG (Code Generation)**
 
164
 
165
+ * `MBPP`, Japanese version of Mostly Basic Python Problems (MBPP) [Source](https://huggingface.co/datasets/llm-jp/mbpp-ja) (License CC BY-SA 4.0)
166
 
167
+ **SUM (Summarization)**
168
 
169
+ * `XL-Sum`, Large-Scale Multilingual Abstractive Summarization for 44 Languages [Source](https://github.com/csebuetnlp/xl-sum) (License CC BY-NC-SA 4.0; because the license is non-commercial, this dataset is not used unless you specifically agree to the license and terms of use)
 
170
 
171
 
172
  ## Reproducibility