ubuntu committed on
Commit
f173fcb
·
1 Parent(s): 82855e1

update readme

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -71,7 +71,7 @@ For a higher resolution 448×672 image, we split it into 6 local image blocks us
 > <sup>1: Models marked with `*` are closed-source.</sup>
 
 For all the compared models above, we prioritize their officially published results. Where official results are unavailable, we use the results reported on the [OpenCompass leaderboard](https://rank.opencompass.org.cn/leaderboard-multimodal). If a dataset's results are still missing from the OpenCompass leaderboard,
- the data come from our own evaluation runs, and the evaluation framework used is the [OpenCompass evaluation framework](https://github.com/open-compass/OpenCompass/).
+ the data come from our own evaluation runs, and the evaluation framework used is [VLMEvalKit](https://github.com/open-compass/VLMEvalKit/).
 
 ### Traditional VQA Tasks
 Traditional VQA tasks are evaluation tasks frequently cited in academic papers on multimodal visual question answering, so they carry significant academic reference value, and we also report evaluation results on these datasets.
@@ -84,7 +84,7 @@ For a higher resolution 448×672 image, we split it into 6 local image blocks us
 | VizWiz | **81.9** | 54.6 | 75.6 | 64.0 | 50.1 | 44.0 | 41.4 | 70.8 |
 | TextVQA | **74.2** | 64.3 | 53.7 | 62.4 | 63.8 | 69.6 | 63.1 | 54.0 |
 
- Likewise, for all the compared models above, we prioritize their officially published results; where official results are unavailable, the data come from our own evaluation runs, and the evaluation framework used is the [OpenCompass evaluation framework](https://github.com/open-compass/OpenCompass/).
+ Likewise, for all the compared models above, we prioritize their officially published results; where official results are unavailable, the data come from our own evaluation runs, and the evaluation framework used is [VLMEvalKit](https://github.com/open-compass/VLMEvalKit/).
 
 ## Evaluation Reports
 
@@ -110,7 +110,7 @@ To comprehensively assess the model's performance, we conducted thorough testing
 
 For all the compared models mentioned above, we prioritize reporting their officially published results. In cases where official results are unavailable, we rely on the reported results from the [OpenCompass leaderboard](https://rank.opencompass.org.cn/leaderboard-multimodal).
 If the corresponding dataset evaluation results are still missing from the OpenCompass leaderboard, we include data obtained from our own evaluation process.
- The evaluation framework used is the [OpenCompass evaluation framework](https://github.com/open-compass/OpenCompass/).
+ The evaluation framework used is [VLMEvalKit](https://github.com/open-compass/VLMEvalKit/).
 
 ### Traditional VQA tasks
 The traditional Visual Question Answering (VQA) task, frequently referenced in academic literature on multimodal visual question answering, holds significant academic reference value.
@@ -127,7 +127,7 @@ Therefore, we will also report relevant evaluation results on datasets of this k
 
 
 Similarly, for all the compared models mentioned above, we prioritize reporting their officially published results. In the absence of official results, data is obtained from our own evaluation process.
- The evaluation framework used is the [OpenCompass evaluation framework](https://github.com/open-compass/OpenCompass/).
+ The evaluation framework used is [VLMEvalKit](https://github.com/open-compass/VLMEvalKit/).
 
 
 ## Examples
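
Since this commit points the README's stated evaluation framework at VLMEvalKit, a reproduction sketch may be useful. The snippet below is only an illustration of driving VLMEvalKit's `run.py` from Python: the `--data`, `--model`, and `--verbose` flags follow the interface documented in the VLMEvalKit repository at the time of writing, while the model key (`my_model`) and benchmark name (`TextVQA_VAL`) are placeholders rather than values taken from this README.

```python
# Minimal sketch: launching one VLMEvalKit evaluation run from Python.
# Assumptions to verify against https://github.com/open-compass/VLMEvalKit/:
#   - the repository has been cloned locally and installed (e.g. `pip install -e .`),
#   - run.py accepts --data, --model, and --verbose as documented there,
#   - "TextVQA_VAL" and "my_model" are illustrative placeholders, not names from this README.
import subprocess
import sys


def run_vlmevalkit(model_key: str, dataset: str, repo_dir: str = "VLMEvalKit") -> int:
    """Run a single benchmark through VLMEvalKit's run.py and return its exit code."""
    cmd = [
        sys.executable,
        "run.py",
        "--data", dataset,      # benchmark name as registered in VLMEvalKit
        "--model", model_key,   # model key as registered in VLMEvalKit's config
        "--verbose",
    ]
    completed = subprocess.run(cmd, cwd=repo_dir)
    return completed.returncode


if __name__ == "__main__":
    # Replace the placeholders with the keys registered for your checkpoint and benchmark.
    raise SystemExit(run_vlmevalkit(model_key="my_model", dataset="TextVQA_VAL"))
```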