Evaluation datasets

community

Recent Activity

clefourrier authored a paper 24 days ago: Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation

hynky updated a dataset about 1 month ago: lighteval/QazUNTv2

hynky updated a dataset about 1 month ago: lighteval/HAWP

models

None public yet

datasets 72

Dataset • Updated • Rows • Downloads • Likes (where shown)

lighteval/QazUNTv2 • Nov 26 • 1.7k • 51
lighteval/HAWP • Nov 19 • 2.34k • 40
lighteval/MWP-TR • Nov 19 • 4.16k • 40
lighteval/MathQA-TR • Nov 19 • 19.6k • 33
lighteval/elkarhizketak • Oct 8 • 1.63k • 41
lighteval/hellaswag_thai • Sep 25 • 25.6k • 44
lighteval/ChineseSquad • Aug 3 • 76.4k • 152
lighteval/thaiqa_squad_fixed • Aug 1 • 4.07k • 34
lighteval/KenSwQuAD • Aug 1 • 7.5k • 55
lighteval/MATH-Hard • Jun 12 • 7.26k • 7.96k • 17