metadata
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-0.5B-Instruct
- facebook/dinov2-small
pipeline_tag: visual-question-answering
Pretrain stage only, 4630 epochs
Introduction
We use the powerful TinyLLaVA Factory to create a super small image-text-to-text model.
The goal is to make it possible to run LLaVA models on edge devices (with only a few gigabytes of memory).
For the LLM and vision tower, we choose Qwen/Qwen2.5-0.5B-Instruct and facebook/dinov2-small, respectively.
POPE:
Category | # Samples | TP | FP | TN | FN | Accuracy | Precision | Recall | F1 Score | Yes Ratio |
---|---|---|---|---|---|---|---|---|---|---|
Adversarial | 3000 | 1312 | 1250 | 250 | 188 | 0.521 | 0.512 | 0.875 | 0.646 | 0.854 |
Popular | 3000 | 1312 | 1236 | 264 | 188 | 0.525 | 0.515 | 0.875 | 0.648 | 0.849 |
Random | 2910 | 1312 | 1185 | 225 | 188 | 0.528 | 0.525 | 0.875 | 0.656 | 0.858 |
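The derived columns in the POPE table follow from the four confusion-matrix counts. The snippet below is a minimal sketch of that arithmetic, assuming the standard definitions of accuracy, precision, recall and F1, and taking "Yes Ratio" to mean the fraction of all answers that were "yes". The function name `pope_metrics` is illustrative and is not part of TinyLLaVA Factory.

```python
def pope_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive the POPE summary columns from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "samples": total,
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        # Assumption: "Yes Ratio" is the share of all answers that were "yes".
        "yes_ratio": (tp + fp) / total,
    }


# Counts copied from the table above: (TP, FP, TN, FN).
rows = {
    "Adversarial": (1312, 1250, 250, 188),
    "Popular": (1312, 1236, 264, 188),
    "Random": (1312, 1185, 225, 188),
}

for name, counts in rows.items():
    metrics = pope_metrics(*counts)
    rounded = {k: round(v, 3) for k, v in metrics.items()}
    print(name, rounded)
```

A yes ratio around 0.85 on splits where half of the questions have "yes" as the correct answer shows that the model answers "yes" far more often than the ground truth warrants, which is consistent with the high recall and near-0.5 precision.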
Samples 5000, Accuracy 0% (:-|)
Samples 4241, Correct: -, Accuracy: -%, IMG-Accuracy: -%
MMMU:
Category | # Samples | Accuracy |
---|---|---|
Overall | 900 | 0.280 |
Overall-Art and Design | 120 | 0.208 |
Art | 30 | 0.167 |
Art Theory | 30 | 0.200 |
Design | 30 | 0.367 |
Music | 30 | 0.100 |
Overall-Business | 150 | 0.213 |
Accounting | 30 | 0.100 |
Economics | 30 | 0.367 |
Finance | 30 | 0.200 |
Management | 30 | 0.233 |
Marketing | 30 | 0.167 |
Overall-Science | 150 | 0.300 |
Biology | 30 | 0.300 |
Chemistry | 30 | 0.133 |
Geography | 30 | 0.300 |
Math | 30 | 0.333 |
Physics | 30 | 0.433 |
Overall-Health and Medicine | 150 | 0.340 |
Basic Medical Science | 30 | 0.300 |
Clinical Medicine | 30 | 0.133 |
Diagnostics and Laboratory Med. | 30 | 0.333 |
Pharmacy | 30 | 0.400 |
Public Health | 30 | 0.533 |
Overall-Humanities and Soc. Sci. | 120 | 0.342 |
History | 30 | 0.300 |
Literature | 30 | 0.567 |
Sociology | 30 | 0.233 |
Psychology | 30 | 0.267 |
Overall-Tech and Engineering | 210 | 0.276 |
Agriculture | 30 | 0.300 |
Architecture and Engineering | 30 | 0.200 |
Computer Science | 30 | 0.367 |
Electronics | 30 | 0.200 |
Energy and Power | 30 | 0.367 |
Materials | 30 | 0.233 |
Mechanical Engineering | 30 | 0.267 |
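The "Overall" row can be cross-checked against the six discipline-level rows. The sketch below assumes the overall score is the sample-weighted mean of the discipline accuracies. The helper `weighted_accuracy` is a hypothetical name used only for illustration.

```python
# (number of samples, accuracy) for each "Overall-<discipline>" row above.
disciplines = {
    "Art and Design": (120, 0.208),
    "Business": (150, 0.213),
    "Science": (150, 0.300),
    "Health and Medicine": (150, 0.340),
    "Humanities and Soc. Sci.": (120, 0.342),
    "Tech and Engineering": (210, 0.276),
}


def weighted_accuracy(groups: dict) -> tuple:
    """Return (total samples, sample-weighted mean accuracy)."""
    total = sum(n for n, _ in groups.values())
    # n * acc approximates the number of correct answers per discipline.
    correct = sum(n * acc for n, acc in groups.values())
    return total, correct / total


total, overall = weighted_accuracy(disciplines)
print(total, round(overall, 3))
```

Because the per-discipline accuracies are already rounded to three decimals, the recomputed value can differ from the reported one in the last digit.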