Update README.md
README.md
@@ -10,6 +10,16 @@ pipeline_tag: visual-question-answering
Pretrain stage only, 4630 epochs

+# Introduction
+
+We use the powerful [TinyLLaVA Factory](https://github.com/TinyLLaVA/TinyLLaVA_Factory) to create a super small image-text-to-text model.
+
+The goal is to make it possible to run LLaVA models on edge devices (with a few gigabytes of memory).
+
+For the LLM and vision tower, we chose [OpenELM-270M-Instruct](apple/OpenELM-270M-Instruct) and [facebook/dinov2-small](facebook/dinov2-small), respectively.
+
+[POPE](https://tinyllava-factory.readthedocs.io/en/latest/Evaluation.html#pope):
+
| Category | # Samples | TP | FP | TN | FN | Accuracy | Precision | Recall | F1 Score | Yes Ratio |
|-----------------|---------------|--------|--------|--------|--------|--------------|---------------|------------|--------------|---------------|
| Adversarial | 3000 | 1312 | 1250 | 250 | 188 | 0.521 | 0.512 | 0.875 | 0.646 | 0.854 |
@@ -17,6 +27,14 @@ Pretrain stage only, 4630 epochs
| Random | 2910 | 1312 | 1185 | 225 | 188 | 0.528 | 0.525 | 0.875 | 0.656 | 0.858 |
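
The derived columns in the POPE table (Accuracy, Precision, Recall, F1 Score, Yes Ratio) appear to follow the standard binary-classification definitions applied to the TP/FP/TN/FN counts. The sketch below recomputes them for the two rows shown here. The `Confusion` class and `binary_metrics` helper are hypothetical names introduced for illustration; they are not part of TinyLLaVA Factory, and reading "Yes Ratio" as the share of "yes" predictions is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion counts for a yes/no benchmark (hypothetical helper)."""
    tp: int
    fp: int
    tn: int
    fn: int

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn


def binary_metrics(c: Confusion) -> dict[str, float]:
    """Standard binary metrics; assumes no denominator is zero."""
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)
    return {
        "accuracy": (c.tp + c.tn) / c.total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        # Assumption: "Yes Ratio" is the fraction of answers that were "yes".
        "yes_ratio": (c.tp + c.fp) / c.total,
    }


# Counts copied from the Adversarial and Random rows of the table.
adversarial = binary_metrics(Confusion(tp=1312, fp=1250, tn=250, fn=188))
random_split = binary_metrics(Confusion(tp=1312, fp=1185, tn=225, fn=188))

for name, metrics in (("Adversarial", adversarial), ("Random", random_split)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Rounded to three decimals, the results agree with the table. A yes ratio of about 0.85 combined with accuracy close to 0.52 is consistent with a model that answers "yes" to most questions, which is plausible for a checkpoint that has only been through the pretrain stage.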

+[TEXTVQA](https://tinyllava-factory.readthedocs.io/en/latest/Evaluation.html#textvqa)
+
+Samples 5000, Accuracy 0% (:-|)
+
+[SCIENCEQA](https://tinyllava-factory.readthedocs.io/en/latest/Evaluation.html#scienceqa)
+
+Samples 4241, Correct: -, Accuracy: -%, IMG-Accuracy: -%
+
[MMMU](https://tinyllava-factory.readthedocs.io/en/latest/Evaluation.html#mmmu)

| Category | # Samples | Accuracy |