Update README.md
README.md CHANGED
@@ -8,12 +8,16 @@ tags:
 library_name: transformers
 ---
 
-# Qwen2-VL-2B
+# Qwen2-VL-2B
 
 ## Introduction
 
 We're excited to unveil **Qwen2-VL**, the latest iteration of our Qwen-VL model, representing nearly a year of innovation.
 
+> [!Important]
+> This is the base pretrained model of Qwen2-VL-2B without instruction tuning.
+
+
 ### What’s New in Qwen2-VL?
 
 #### Key Enhancements:
@@ -53,19 +57,6 @@ KeyError: 'qwen2_vl'
 ```
 
 
-## Limitations
-
-While Qwen2-VL are applicable to a wide range of visual tasks, it is equally important to understand its limitations. Here are some known restrictions:
-
-1. Lack of Audio Support: The current model does **not comprehend audio information** within videos.
-2. Data timeliness: Our image dataset is **updated until June 2023**, and information subsequent to this date may not be covered.
-3. Constraints in Individuals and Intellectual Property (IP): The model's capacity to recognize specific individuals or IPs is limited, potentially failing to comprehensively cover all well-known personalities or brands.
-4. Limited Capacity for Complex Instruction: When faced with intricate multi-step instructions, the model's understanding and execution capabilities require enhancement.
-5. Insufficient Counting Accuracy: Particularly in complex scenes, the accuracy of object counting is not high, necessitating further improvements.
-6. Weak Spatial Reasoning Skills: Especially in 3D spaces, the model's inference of object positional relationships is inadequate, making it difficult to precisely judge the relative positions of objects.
-
-These limitations serve as ongoing directions for model optimization and improvement, and we are committed to continually enhancing the model's performance and scope of application.
-
 ## Citation
 
 If you find our work helpful, feel free to give us a cite.