tttoaster committed on
Commit 8b3d9a9
1 Parent(s): 98e0fb3

Update constants.py

Files changed (1)
  1. constants.py +5 -0
constants.py CHANGED
@@ -82,6 +82,11 @@ TABLE_INTRODUCTION = """In the table below, we summarize each task performance o
  We use accuracy (%) as the primary evaluation metric for each task.
  SEED-Bench-1 calculates the overall accuracy by dividing the total number of correct QA answers by the total number of QA questions.
  SEED-Bench-2 represents the overall accuracy using the average accuracy of each dimension.
+ For the PPL evaluation method, we compute the loss for each candidate and select the candidate with the lowest loss. For details, please refer to [InternLM_Xcomposer_VL_interface](https://github.com/AILab-CVC/SEED-Bench/blob/387a067b6ba99ae5e8231f39ae2d2e453765765c/SEED-Bench-2/model/InternLM_Xcomposer_VL_interface.py#L74).
+ For the PPL A/B/C/D evaluation method, please refer to [EVAL_SEED.md](https://github.com/QwenLM/Qwen-VL/blob/master/eval_mm/seed_bench/EVAL_SEED.md) for more information.
+ For the Generate evaluation method, please refer to [Evaluation.md](https://github.com/haotian-liu/LLaVA/blob/main/docs/Evaluation.md#seed-bench) for details.
+ For the NG evaluation method, we indicate that the evaluation method is Not Given.
+ If you have any questions, please feel free to contact us.
  """
 
  LEADERBORAD_INFO = """
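
The added PPL note describes a standard perplexity-style selection: score each answer choice by language-model loss and keep the one with the lowest loss. Below is a minimal sketch of that idea, assuming a generic Hugging Face causal LM; the model name, prompt format, and the `ppl_select` helper are illustrative assumptions, not code from SEED-Bench (see the linked InternLM_Xcomposer_VL_interface.py for the actual implementation).

```python
# Sketch of PPL-style answer selection (illustrative only, not SEED-Bench code):
# compute the LM loss of each candidate conditioned on the question, pick the lowest.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model; the benchmark scores with the evaluated MLLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def ppl_select(question: str, candidates: list[str]) -> str:
    """Return the candidate whose tokens receive the lowest LM loss given the question."""
    losses = []
    for cand in candidates:
        prompt_ids = tokenizer(question, return_tensors="pt").input_ids
        cand_ids = tokenizer(" " + cand, return_tensors="pt").input_ids
        input_ids = torch.cat([prompt_ids, cand_ids], dim=1)
        labels = input_ids.clone()
        labels[:, : prompt_ids.shape[1]] = -100  # score only the candidate tokens
        with torch.no_grad():
            loss = model(input_ids, labels=labels).loss
        losses.append(loss.item())
    return candidates[losses.index(min(losses))]

print(ppl_select("Q: What color is a ripe banana? A:", ["red", "blue", "yellow", "green"]))
```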