tuanio committed · verified
Commit 931f3a5 · 1 Parent(s): 7683c6a

Update README.md

Files changed (1)
  1. README.md +33 -9
README.md CHANGED
@@ -6,24 +6,48 @@ model-index:
  results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # ft-moe-llava-qwen1.5-1.8b-vista-1ep
-
- This model was trained from scratch on an unknown dataset.
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
+ <p align="center">
+ <div style="display: flex;text-align: center;">
+ <div>
+ <img src="https://firebasestorage.googleapis.com/v0/b/database-7ca5c.appspot.com/o/llm%2F68747470733a2f2f7331312e617831782e636f6d2f323032332f31322f32382f70697176444d562e706e67.png?alt=media&token=30a2470d-861e-4295-a7f4-da48231724cf" width="250" style="margin-bottom: 0.2;"/>
+ </div>
+ <div>
+ <img src="https://firebasestorage.googleapis.com/v0/b/database-7ca5c.appspot.com/o/llm%2Flogo_qwen.jpg?alt=media&token=fd2cd557-2f45-4f94-86d3-a5e7c9eef630" width="600" style="margin-bottom: 1rem;"/>
+ </div>
+ </div>
+ </p>
+ <h1 align="center">MoE-LLaVA-Qwen1.5-1.8B×4-Top2: When Vision Meets a Small-Scale Language Model and a Vietnamese Synthetic Dataset</h1>
+
+ # Introducing MoE-LLaVA-Qwen1.5-1.8B×4-Top2 for Vietnamese
+
+ We are excited to present MoE-LLaVA-Qwen1.5-1.8B×4-Top2, tailored for Vietnamese. This model is part of our ongoing effort to develop Vision Language Models (VLMs) for Vietnamese, an area that is still under-resourced and dominated by larger models (**~7B parameters**). Our model activates only about **2.2B** 🤗😎 parameters per forward pass, which significantly reduces the memory footprint, and it can be quantized for local execution.
+
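To see where a figure like **2.2B** can come from, here is a back-of-envelope sketch of active versus total parameters in a 4-expert, top-2 MoE. The shared and per-expert counts below are illustrative assumptions, not the model's exact configuration:

```python
# Back-of-envelope: active parameters in a 4-expert, top-2 MoE model.
# The parameter counts below are illustrative assumptions, not exact figures.
shared = 1.3e9        # embeddings, attention, norms: always active (assumed)
per_expert = 0.45e9   # FFN parameters per expert (assumed)
n_experts, top_k = 4, 2

total_params = shared + n_experts * per_expert   # what you store in memory
active_params = shared + top_k * per_expert      # what each token actually touches
print(f"total ≈ {total_params / 1e9:.1f}B, active per call ≈ {active_params / 1e9:.1f}B")
# total ≈ 3.1B, active per call ≈ 2.2B
```

Because only the routed experts run, compute and activation memory scale with the active count, while all four experts still have to fit in (possibly quantized) weights.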
+ ## Bias, Risks, and Limitations
+
+ The dataset may contain biases originating from its sources. Users should remain aware of these potential biases when utilizing the dataset.
+
+ ## More Information
+
+ This model represents the first stage of a two-stage development process for a larger model. Stay tuned for future developments by subscribing to our updates.

  ## Training and evaluation data

- More information needed

+ ### Training Dataset
+
+ Our model is trained on the comprehensive [Vi-VLM/Vista dataset](https://huggingface.co/datasets/Vi-VLM/Vista), which includes around 700,000 Vietnamese vision-language samples curated with Gemini Pro. The data was generated using several prompt-engineering techniques (a sketch follows this list), including:
+
+ - **Few-shot Learning**
+ - **Caption-based Prompting**
+ - **Image-based Prompting**
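To make the caption-based style concrete, here is a hypothetical sketch of how a generation prompt might be assembled: the generator (e.g. Gemini Pro) sees only a caption, not the image, and is asked to synthesize a QA pair. The template wording and the `build_caption_prompt` helper are invented for illustration and are not taken from the actual Vista pipeline:

```python
# Hypothetical caption-based prompting sketch (not the actual Vista prompt).
def build_caption_prompt(caption: str) -> str:
    return (
        "Bạn là trợ lý tạo dữ liệu hỏi-đáp tiếng Việt.\n"   # "You are a Vietnamese QA data assistant."
        f"Chú thích ảnh: {caption}\n"                        # "Image caption: ..."
        "Hãy tạo một cặp câu hỏi và câu trả lời tự nhiên về bức ảnh."  # "Write a natural QA pair about the image."
    )

print(build_caption_prompt("Một khu chợ nổi trên sông ở miền Tây Việt Nam."))
```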
+
+ ### Techniques Used
+
+ - **MoE-LLaVA**: trained with the [MoE-LLaVA](https://github.com/PKU-YuanGroup/MoE-LLaVA/tree/main) framework (see the routing sketch below)
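For readers unfamiliar with the ×4-Top2 part of the name, the sketch below shows top-2 expert routing in PyTorch. The dimensions and module layout are simplified assumptions, not the MoE-LLaVA repository's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoELayer(nn.Module):
    """Simplified sketch of a 4-expert, top-2 MoE FFN (illustrative only)."""

    def __init__(self, d_model: int = 2048, d_ff: int = 5504, n_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)  # learned gate
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Route each token to its 2 highest-scoring experts.
        gate = F.softmax(self.router(x), dim=-1)               # (tokens, n_experts)
        weights, idx = gate.topk(self.top_k, dim=-1)           # top-2 per token
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize the pair
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e          # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

tokens = torch.randn(8, 2048)
print(Top2MoELayer()(tokens).shape)  # torch.Size([8, 2048])
```

Only the two selected experts run for each token, which is why the activated parameter count stays near 2.2B even though all four experts are kept in memory.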
+
+ ## Evaluation
+
+ - Coming soon 🫡

  ## Training procedure