StarscreamDeceptions committed
Commit e417adb
1 Parent(s): 9dad8c2

Update src/about.py

Files changed (1)
  1. src/about.py +7 -7
src/about.py CHANGED
@@ -40,7 +40,11 @@ TITLE = """<img src="https://raw.githubusercontent.com/BobTsang1995/Multilingual
 
  # What does your leaderboard evaluate?
  INTRODUCTION_TEXT = """
- **Multilingual MMLU Benchmark Leaderboard:** This leaderboard is dedicated to evaluating the performance of both open-source and closed-source language models on the Multilingual MMLU benchmark. It assesses their memorization, reasoning, and linguistic capabilities across a wide range of languages. The leaderboard consolidates multiple MMLU datasets, originally created or manually translated into various languages, to provide a comprehensive evaluation of multilingual understanding in LLMs.
+ 🌍 **Multilingual MMLU Benchmark Leaderboard:** This leaderboard is dedicated to evaluating and comparing the multilingual capabilities of large language models across different languages and cultures.
+ 
+ 🔬 **MMMLU Dataset:** The dataset used for evaluation is the [OpenAI MMMLU Benchmark](https://huggingface.co/datasets/openai/MMMLU), which covers a broad range of topics across 57 categories, from elementary-level knowledge up to advanced professional subjects like law, physics, history, and computer science. MMMLU contains 14 languages: AR_XY (Arabic), BN_BD (Bengali), DE_DE (German), ES_LA (Spanish), FR_FR (French), HI_IN (Hindi), ID_ID (Indonesian), IT_IT (Italian), JA_JP (Japanese), KO_KR (Korean), PT_BR (Brazilian Portuguese), SW_KE (Swahili), YO_NG (Yoruba), and ZH_CN (Simplified Chinese).
+ 
+ 🎯 **Our Goal** is to raise awareness about the importance of improving the performance of LLMs across various languages, with a particular focus on cultural contexts. We strive to make LLMs more inclusive and effective for users worldwide.
  """
  INTRODUCTION_TEXT_ZH = """
  **多语言 MMLU 基准榜单:** 这是一个开放的评测榜单,旨在评估开源和闭源语言模型在多语言 MMLU 基准测试中的表现,涵盖记忆、推理和语言能力。该榜单整合了多个 MMLU 数据集,这些数据集最初为多种语言创建或手动翻译,旨在全面评估大规模语言模型在多语言理解上的能力。
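The new INTRODUCTION_TEXT above points readers at the openai/MMMLU dataset on Hugging Face. For anyone wiring the leaderboard up to it, below is a minimal sketch of pulling a single language subset with the `datasets` library; the config name `FR_FR` and the `test` split are assumptions based on the locale codes listed in the text, so check the dataset card before relying on them.

```python
# Hedged sketch: assumes openai/MMMLU exposes one config per locale code
# (e.g. "FR_FR") with a "test" split -- verify against the dataset card.
from datasets import load_dataset

french_mmlu = load_dataset("openai/MMMLU", "FR_FR", split="test")

print(french_mmlu.num_rows)      # how many French questions are available
print(french_mmlu.column_names)  # inspect the schema before building tasks
```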
@@ -140,11 +144,7 @@ This leaderboard was independently developed as a non-profit initiative with the
  The entities above are ordered chronologically by the date they joined the project. However, the logos in the footer are ordered by the number of datasets donated.
 
  Thank you in particular to:
- - Task implementation: Yi Zhou (Cardiff University), Yusuke Sakai (Nara Institute of Science and Technology), Yongxin Zhou (Université Grenoble Alpes), Haonan Li (MBZUAI), Jiahui Geng (MBZUAI), Qing Li (MBZUAI), Wenxi Li (Tsinghua University/Shanghai Jiaotong University), Yuanyu Lin (University of Macau), Andy Way (Dublin City University), Zhuang Li (RMIT University), Zhongwei Wan (The Ohio State University), Di Wu (University of Amsterdam), Wen Lai (Technical University of Munich) (TUM)
- - Leaderboard implementation: Yi Zhou (Cardiff University), Yusuke Sakai (Nara Institute of Science and Technology), Yongxin Zhou (Université Grenoble Alpes), Haonan Li (MBZUAI), Jiahui Geng (MBZUAI), Qing Li (MBZUAI), Wenxi Li (Tsinghua University/Shanghai Jiaotong University), Yuanyu Lin (University of Macau), Andy Way (Dublin City University), Zhuang Li (RMIT University), Zhongwei Wan (The Ohio State University), Di Wu (University of Amsterdam), Wen Lai (Technical University of Munich) (TUM)
- - Model evaluation: Yi Zhou (Cardiff University), Yusuke Sakai (Nara Institute of Science and Technology), Yongxin Zhou (Université Grenoble Alpes), Haonan Li (MBZUAI), Jiahui Geng (MBZUAI), Qing Li (MBZUAI), Wenxi Li (Tsinghua University/Shanghai Jiaotong University), Yuanyu Lin (University of Macau), Andy Way (Dublin City University), Zhuang Li (RMIT University), Zhongwei Wan (The Ohio State University), Di Wu (University of Amsterdam), Wen Lai (Technical University of Munich) (TUM)
- - Communication: Yi Zhou (Cardiff University), Yusuke Sakai (Nara Institute of Science and Technology), Yongxin Zhou (Université Grenoble Alpes), Haonan Li (MBZUAI), Jiahui Geng (MBZUAI), Qing Li (MBZUAI), Wenxi Li (Tsinghua University/Shanghai Jiaotong University), Yuanyu Lin (University of Macau), Andy Way (Dublin City University), Zhuang Li (RMIT University), Zhongwei Wan (The Ohio State University), Di Wu (University of Amsterdam), Wen Lai (Technical University of Munich) (TUM)
- - Organization & colab leads: Yi Zhou (Cardiff University), Yusuke Sakai (Nara Institute of Science and Technology), Yongxin Zhou (Université Grenoble Alpes), Haonan Li (MBZUAI), Jiahui Geng (MBZUAI), Qing Li (MBZUAI), Wenxi Li (Tsinghua University/Shanghai Jiaotong University), Yuanyu Lin (University of Macau), Andy Way (Dublin City University), Zhuang Li (RMIT University), Zhongwei Wan (The Ohio State University), Di Wu (University of Amsterdam), Wen Lai (Technical University of Munich) (TUM)
+ Yi Zhou (Cardiff University), Yusuke Sakai (Nara Institute of Science and Technology), Yongxin Zhou (Université Grenoble Alpes), Haonan Li (MBZUAI), Jiahui Geng (MBZUAI), Qing Li (MBZUAI), Wenxi Li (Tsinghua University/Shanghai Jiao Tong University), Yuanyu Lin (University of Macau), Andy Way (Dublin City University), Zhuang Li (RMIT University), Zhongwei Wan (The Ohio State University), Di Wu (University of Amsterdam), Wen Lai (Technical University of Munich, TUM)
 
  For information about the dataset authors please check the corresponding Dataset Cards (linked in the "Tasks" tab) and papers (included in the "Citation" section below). We would like to specially thank the teams that created or open-sourced their datasets specifically for the leaderboard (in chronological order):
  - [MMMLU](https://huggingface.co/datasets/openai/MMMLU): OpenAI
@@ -301,7 +301,7 @@ If everything is done, check you can launch the EleutherAIHarness on your model
 
  CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
  CITATION_BUTTON_TEXT = r"""@misc{Multilingual MMLU Benchmark Leaderboard2024,
- author = {Yi Zhou (Cardiff University) and Yusuke Sakai (Nara Institute of Science and Technology) and Yongxin Zhou (Université Grenoble Alpes) and Haonan Li (MBZUAI) and Jiahui Geng (MBZUAI) and Qing Li (MBZUAI) and Wenxi Li (Tsinghua University/Shanghai Jiaotong University) and Yuanyu Lin (University of Macau) and Andy Way (Dublin City University) and Zhuang Li (RMIT University) and Zhongwei Wan (The Ohio State University) and Di Wu (University of Amsterdam) and Wen Lai (Technical University of Munich) (TUM)},
+ author = {Yi Zhou and Yusuke Sakai and Yongxin Zhou and Haonan Li and Jiahui Geng and Qing Li and Wenxi Li and Yuanyu Lin and Andy Way and Zhuang Li and Zhongwei Wan and Di Wu and Wen Lai and Bo Zeng},
  title = {Multilingual MMLU Benchmark Leaderboard},
  year = {2024},
  publisher = {Hugging Face},
 
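The header of the last hunk references the file's submission instructions, which ask you to check that the EleutherAI Harness can be launched on your model before submitting. Purely as an illustration, here is a hedged sketch using the harness's `lm_eval.simple_evaluate` Python API; the model name is a placeholder, and the `mmlu` task merely stands in for whichever multilingual task names the leaderboard actually evaluates (see its "Tasks" tab).

```python
# Hedged sketch: smoke-test that the EleutherAI lm-evaluation-harness can load
# and score your model. The checkpoint name and task list are placeholders,
# not the leaderboard's exact configuration.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                   # Hugging Face backend
    model_args="pretrained=your-org/your-model",  # placeholder checkpoint
    tasks=["mmlu"],                               # stand-in for the multilingual MMLU tasks
    num_fewshot=5,
    batch_size=8,
)
print(results["results"])  # per-task scores if the run completed
```

If this runs end to end without errors, the model is at least loadable and scorable by the harness, which is the sanity check the submission instructions ask for.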