Image Feature Extraction
Transformers
Safetensors
intern_vit_6b
feature-extraction
custom_code
zwgao committed on
Commit
62f64dc
1 Parent(s): 688b3d7

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -17,7 +17,7 @@ pipeline_tag: image-feature-extraction

  \[[Paper](https://arxiv.org/abs/2312.14238)\] \[[GitHub](https://github.com/OpenGVLab/InternVL)\] \[[Chat Demo](https://internvl.opengvlab.com/)\] \[[中文解读](https://zhuanlan.zhihu.com/p/675877376)]

- We develop InternViT-6B-448px-V1-5 based on the pre-training of the strong foundation of InternViT-6B-448px-V1.2. In this update, the resolution of training images is expanded from 448×448 to dynamic 448×448, where the basic tile size is 448×448 and the number of tiles ranges from 1 to 12.
+ We develop InternViT-6B-448px-V1-5 based on the pre-training of the strong foundation of [InternViT-6B-448px-V1.2](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2). In this update, the resolution of training images is expanded from 448×448 to dynamic 448×448, where the basic tile size is 448×448 and the number of tiles ranges from 1 to 12.
  Additionally, we enhance the data scale, quality, and diversity of the pre-training dataset, resulting in the powerful robustness, OCR capability, and high-resolution processing capability of our
  1.5 version model.
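
The README text in this hunk describes "dynamic 448×448" input: a basic tile of 448×448 and between 1 and 12 tiles per image. The sketch below illustrates one plausible way such a tile grid could be chosen. It is not taken from this commit or confirmed by the README: the function names `candidate_grids` and `choose_grid` are hypothetical, and the selection rule (closest aspect ratio, fewest tiles on a tie) is an assumption made for illustration.

```python
from typing import List, Tuple

TILE_SIZE = 448   # basic tile size stated in the README
MAX_TILES = 12    # upper bound on the tile count stated in the README


def candidate_grids(max_tiles: int = MAX_TILES) -> List[Tuple[int, int]]:
    """Return every (cols, rows) grid whose tile count is between 1 and max_tiles.

    Grids are ordered by tile count, then by column count, so that ties in
    choose_grid resolve to the smallest grid.
    """
    grids = [
        (cols, rows)
        for cols in range(1, max_tiles + 1)
        for rows in range(1, max_tiles + 1)
        if cols * rows <= max_tiles
    ]
    return sorted(grids, key=lambda g: (g[0] * g[1], g[0]))


def choose_grid(width: int, height: int, max_tiles: int = MAX_TILES) -> Tuple[int, int]:
    """Pick the grid whose aspect ratio is closest to the image's (assumed rule)."""
    aspect = width / height
    # min() keeps the first minimum, i.e. the grid with the fewest tiles on a tie.
    return min(candidate_grids(max_tiles), key=lambda g: abs(aspect - g[0] / g[1]))


def resized_shape(grid: Tuple[int, int]) -> Tuple[int, int]:
    """Pixel size (width, height) the image would be resized to before tiling."""
    cols, rows = grid
    return cols * TILE_SIZE, rows * TILE_SIZE


if __name__ == "__main__":
    for size in [(448, 448), (896, 448), (1000, 3000)]:
        grid = choose_grid(*size)
        print(size, "->", grid, resized_shape(grid))
```

Under these assumptions, a square image maps to a single tile and a 2:1 image maps to a 2×1 grid, so the tile count grows only when the aspect ratio calls for it.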