Update README.md
README.md (changed)

```diff
@@ -32,7 +32,7 @@ It is _**the largest open-source vision/vision-language foundation model (14B)**
  - **Training Strategy:**
    - Pretraining Stage
      - Learnable Component: InternViT-6B
-     - Data: 72M samples
+     - Data: Trained on 72M samples, including COYO, LAION, CC12M, CC3M, SBU, Wukong, GRIT, Objects365, OpenImages, and OCR data.
    - SFT Stage
      - Learnable Component: MLP + LLM
      - Data: A comprehensive collection of open-source SFT datasets, along with their Chinese translation versions, totaling approximately 10M.
```
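The two-stage schedule in the README (pretraining updates only the vision encoder; SFT updates the MLP projector and the LLM) can be sketched as stage-dependent freezing of components. This is a minimal, hypothetical sketch, not the actual InternVL training code; the class and attribute names (`VLM`, `vit`, `mlp`, `llm`, `set_stage`) are illustrative assumptions.

```python
# Hypothetical sketch of the stage-wise freezing described in the README.
# Pretraining: only InternViT-6B is learnable.
# SFT: the MLP projector and the LLM are learnable; the ViT is frozen.
from dataclasses import dataclass, field


@dataclass
class Module:
    """Stand-in for a model component with a trainable flag."""
    name: str
    trainable: bool = False


@dataclass
class VLM:
    """Toy vision-language model with the three components named above."""
    vit: Module = field(default_factory=lambda: Module("InternViT-6B"))
    mlp: Module = field(default_factory=lambda: Module("MLP projector"))
    llm: Module = field(default_factory=lambda: Module("LLM"))

    def set_stage(self, stage: str) -> list[str]:
        """Freeze/unfreeze components per stage; return the trainable names."""
        if stage == "pretrain":
            flags = {"vit": True, "mlp": False, "llm": False}
        elif stage == "sft":
            flags = {"vit": False, "mlp": True, "llm": True}
        else:
            raise ValueError(f"unknown stage: {stage!r}")
        for attr, on in flags.items():
            getattr(self, attr).trainable = on
        return [m.name for m in (self.vit, self.mlp, self.llm) if m.trainable]


model = VLM()
print(model.set_stage("pretrain"))  # ['InternViT-6B']
print(model.set_stage("sft"))       # ['MLP projector', 'LLM']
```

In a real PyTorch training loop the same effect is usually achieved by toggling `requires_grad` on each component's parameters between stages.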