Update README.md
README.md
CHANGED
@@ -160,7 +160,8 @@ We release the SmolVLM checkpoints under the Apache 2.0 license.
 
 ### Training Data
 
-The training data comes from [The Cauldron](https://huggingface.co/datasets/HuggingFaceM4/the_cauldron) and [Docmatix](https://huggingface.co/datasets/HuggingFaceM4/Docmatix) datasets, with emphasis on document understanding (25%) and image captioning (18%), while maintaining balanced coverage across other crucial capabilities like visual reasoning, chart comprehension, and general instruction following
+The training data comes from [The Cauldron](https://huggingface.co/datasets/HuggingFaceM4/the_cauldron) and [Docmatix](https://huggingface.co/datasets/HuggingFaceM4/Docmatix) datasets, with emphasis on document understanding (25%) and image captioning (18%), while maintaining balanced coverage across other crucial capabilities like visual reasoning, chart comprehension, and general instruction following.
+<img src="https://huggingface.co/HuggingFaceTB/SmolVLM-Instruct/resolve/main/mixture_the_cauldron.png" alt="Example Image" style="width:90%;" />
 
 
 