doubility123
commited on
Commit
•
be93013
1
Parent(s):
9af3429
Update README.md
Browse files
README.md
CHANGED
@@ -21,7 +21,7 @@ Haoyu Lu*, Wen Liu*, Bo Zhang**, Bingxuan Wang, Kai Dong, Bo Liu, Jingxiang Sun,
|
|
21 |
|
22 |
DeepSeek-VL-7b-base uses the [SigLIP-L](https://huggingface.co/timm/ViT-L-16-SigLIP-384) and [SAM-B](https://huggingface.co/facebook/sam-vit-base) as the hybrid vision encoder supporting 1024 x 1024 image input
|
23 |
and is constructed based on the DeepSeek-LLM-7b-base which is trained on an approximate corpus of 2T text tokens. The whole DeepSeek-VL-7b-base model is finally trained around 400B vision-language tokens.
|
24 |
-
DeekSeel-VL-7b-chat is an instructed version based on [DeepSeek-VL-7b-
|
25 |
|
26 |
|
27 |
## 3. Quick Start
|
|
|
21 |
|
22 |
DeepSeek-VL-7b-base uses the [SigLIP-L](https://huggingface.co/timm/ViT-L-16-SigLIP-384) and [SAM-B](https://huggingface.co/facebook/sam-vit-base) as the hybrid vision encoder supporting 1024 x 1024 image input
|
23 |
and is constructed based on the DeepSeek-LLM-7b-base which is trained on an approximate corpus of 2T text tokens. The whole DeepSeek-VL-7b-base model is finally trained around 400B vision-language tokens.
|
24 |
+
DeekSeel-VL-7b-chat is an instructed version based on [DeepSeek-VL-7b-base](https://huggingface.co/deepseek-ai/deepseek-vl-7b-base).
|
25 |
|
26 |
|
27 |
## 3. Quick Start
|