metadata
title: MJ Bench Leaderboard
emoji: 🥇
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: apache-2.0
Start the configuration
Most of the variables to change for a default leaderboard are in src/env.py (replace the path for your leaderboard) and src/about.py (for tasks).
Results files should be stored as JSON files in the following format:
{
    "config": {
        "model_dtype": "torch.float16", # or torch.bfloat16 or 8bit or 4bit
        "model_name": "path of the model on the hub: org/model",
        "model_sha": "revision on the hub"
    },
    "results": {
        "task_name": {
            "metric_name": score
        },
        "task_name2": {
            "metric_name": score
        }
    }
}
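As a minimal sketch of producing a file in this format, the snippet below builds the dictionary and writes it with Python's standard `json` module. The helper names `build_result` and `write_result`, and the output file name, are assumptions made for illustration; they are not part of this repository.

```python
import json
from pathlib import Path


def build_result(model_name, model_sha, model_dtype, scores):
    """Return a dict shaped like the results format described above.

    `scores` maps task name -> {metric name -> numeric score}.
    """
    return {
        "config": {
            "model_dtype": model_dtype,
            "model_name": model_name,
            "model_sha": model_sha,
        },
        "results": {task: dict(metrics) for task, metrics in scores.items()},
    }


def write_result(result, out_dir):
    """Write the result as a JSON file and return its path.

    The file naming scheme here is hypothetical, chosen only for the example.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    safe_name = result["config"]["model_name"].replace("/", "__")
    path = out_dir / f"results_{safe_name}.json"
    path.write_text(json.dumps(result, indent=2), encoding="utf-8")
    return path
```

Writing through `json.dumps` guarantees valid JSON on disk: no comments and no trailing commas.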
Request files are created automatically by this tool.
If you encounter a problem on the Space, don't hesitate to restart it. Restarting removes the eval-queue, eval-queue-bk, eval-results and eval-results-bk folders that it created.
Code logic for more complex edits
You'll find
- the main table's column names and properties in src/display/utils.py
- the logic to read all results and request files, then convert them into dataframe lines, in src/leaderboard/read_evals.py and src/populate.py
- the logic to allow or filter submissions in src/submission/submit.py and src/submission/check_validity.py
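To illustrate the second item, here is a rough sketch of reading results files and flattening each one into a single table row. It is not the code in src/leaderboard/read_evals.py; the function name `load_rows` and the `task/metric` column naming are assumptions made for this example, and it uses only the standard library.

```python
import json
from pathlib import Path


def load_rows(results_dir):
    """Read every *.json results file under results_dir into a flat dict per file."""
    rows = []
    for path in sorted(Path(results_dir).rglob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        config = data.get("config", {})
        row = {
            "model_name": config.get("model_name"),
            "model_sha": config.get("model_sha"),
            "model_dtype": config.get("model_dtype"),
        }
        # One column per (task, metric) pair; the "task/metric" key is hypothetical.
        for task, metrics in data.get("results", {}).items():
            for metric, score in metrics.items():
                row[f"{task}/{metric}"] = score
        rows.append(row)
    return rows
```

A list of flat dictionaries like this can be passed directly to a dataframe constructor if a tabular view is needed.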
Citation
@misc{chen2024mjbenchmultimodalrewardmodel,
title={MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?},
author={Zhaorun Chen and Yichao Du and Zichen Wen and Yiyang Zhou and Chenhang Cui and Zhenzhen Weng and Haoqin Tu and Chaoqi Wang and Zhengwei Tong and Qinglan Huang and Canyu Chen and Qinghao Ye and Zhihong Zhu and Yuqing Zhang and Jiawei Zhou and Zhuokai Zhao and Rafael Rafailov and Chelsea Finn and Huaxiu Yao},
year={2024},
eprint={2407.04842},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2407.04842},
}