ubuntu committed
Commit f173fcb · Parent: 82855e1
update readme

README.md CHANGED
@@ -71,7 +71,7 @@ For a higher resolution 448×672 image, we split it into 6 local image blocks us
 > <sup>1: Models marked with `*` are closed-source.</sup>
 
 For all the compared models above, we prioritize reporting their officially published results. Where official results are unavailable, we use the results reported on the [OpenCompass leaderboard](https://rank.opencompass.org.cn/leaderboard-multimodal). If the corresponding dataset results are also missing from the OpenCompass leaderboard,
-the data comes from our own evaluation runs. The evaluation framework used is the [
+the data comes from our own evaluation runs. The evaluation framework used is the [VLMEvalKit evaluation framework](https://github.com/open-compass/VLMEvalKit/).
 
 ### Traditional VQA Tasks
 Traditional VQA tasks are benchmarks frequently cited in academic papers on multimodal visual question answering and therefore carry significant academic reference value, so we also report evaluation results on datasets of this kind.
@@ -84,7 +84,7 @@ For a higher resolution 448×672 image, we split it into 6 local image blocks us
 | VizWiz | **81.9** | 54.6 | 75.6 | 64.0 | 50.1 | 44.0 | 41.4 | 70.8 |
 | TextVQA | **74.2** | 64.3 | 53.7 | 62.4 | 63.8 | 69.6 | 63.1 | 54.0 |
 
-Similarly, for all the compared models above, we prioritize reporting their officially published results. Where official results are unavailable, the data comes from our own evaluation runs. The evaluation framework used is the [
+Similarly, for all the compared models above, we prioritize reporting their officially published results. Where official results are unavailable, the data comes from our own evaluation runs. The evaluation framework used is the [VLMEvalKit evaluation framework](https://github.com/open-compass/VLMEvalKit/).
 
 ## Evaluation Reports
 
@@ -110,7 +110,7 @@ To comprehensively assess the model's performance, we conducted thorough testing
 
 For all the compared models mentioned above, we prioritize reporting their officially published results. In cases where official results are unavailable, we rely on the reported results from the [OpenCompass leaderboard](https://rank.opencompass.org.cn/leaderboard-multimodal).
 If the corresponding dataset evaluation results are still missing from the OpenCompass leaderboard, we include data obtained from our own evaluation process.
-The evaluation framework used adheres to the [
+The evaluation framework used adheres to the [VLMEvalKit evaluation framework](https://github.com/open-compass/VLMEvalKit/).
 
 ### Traditional VQA tasks
 The traditional Visual Question Answering (VQA) task, frequently referenced in academic literature in the field of multimodal visual question answering, holds significant academic reference value.
@@ -127,7 +127,7 @@ Therefore, we will also report relevant evaluation results on datasets of this k
 
 
 Similarly, for all the compared models mentioned above, we prioritize reporting their officially published results. In the absence of official results, data is obtained from our own evaluation process.
-The evaluation framework used adheres to the [
+The evaluation framework used adheres to the [VLMEvalKit evaluation framework](https://github.com/open-compass/VLMEvalKit/).
 
 
 ## Examples
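The change above links the evaluation methodology to VLMEvalKit, which is driven from the command line via its `run.py` entry point. As a rough, non-authoritative sketch of how such evaluation runs might be launched, the snippet below shells out to that entry point from Python; the dataset keys and model identifier are illustrative placeholders (not names taken from this repository) and should be replaced with identifiers actually registered in VLMEvalKit.

```python
# Minimal sketch of driving VLMEvalKit from Python, assuming its repository has
# been cloned and this script is run from the repository root.
# The dataset keys and model name below are placeholders; substitute the
# identifiers registered in VLMEvalKit's configuration.
import subprocess

DATASETS = ["TextVQA_VAL", "VizWiz"]  # placeholder benchmark keys
MODEL = "my_vlm"                      # placeholder model identifier

for dataset in DATASETS:
    # run.py is VLMEvalKit's command-line entry point; --data selects the
    # benchmark and --model selects the model configuration to evaluate.
    subprocess.run(
        ["python", "run.py", "--data", dataset, "--model", MODEL, "--verbose"],
        check=True,
    )
```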