Update README.md
llava-dinov2-internlm2-7b-v1 is a LLaVA model fine-tuned from [InternLM2-Chat-7B

I did not carefully tune the training hyperparameters, but the model still shows the capability to solve some tasks. This shows that a visual encoder can be integrated with an LLM even when the encoder has not been aligned with natural language through contrastive learning, as CLIP has.

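The integration above can be sketched as a minimal LLaVA-style forward pass: the vision encoder's patch features are mapped into the LLM's embedding space by a small projector and concatenated with the text embeddings. The dimensions, projector shape, and random inputs below are illustrative assumptions, not this model's actual configuration:

```python
import torch
import torch.nn as nn

# Assumed dimensions for illustration: DINOv2-style encoders emit ~1024-d
# patch features; a 7B-class LLM typically uses a 4096-d hidden size.
VISION_DIM, LLM_DIM = 1024, 4096

# LLaVA-style connector: an MLP that maps visual patch embeddings into the
# LLM's token-embedding space.
projector = nn.Sequential(
    nn.Linear(VISION_DIM, LLM_DIM),
    nn.GELU(),
    nn.Linear(LLM_DIM, LLM_DIM),
)

patch_features = torch.randn(1, 256, VISION_DIM)  # [batch, patches, dim] from the vision encoder
visual_tokens = projector(patch_features)         # [1, 256, 4096]
text_tokens = torch.randn(1, 32, LLM_DIM)         # embedded text prompt (stand-in)

# The LLM then consumes visual and text tokens as one interleaved sequence.
inputs_embeds = torch.cat([visual_tokens, text_tokens], dim=1)
print(inputs_embeds.shape)  # torch.Size([1, 288, 4096])
```

Because the projector is trained on image-text pairs, it — rather than contrastive pretraining — does the work of aligning the visual features with the language model.
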
## Example


Explain the photo in English:


You just need

```
pip install protobuf
```
4. To use tensorboard to visualize the training loss curve:

```
pip install future tensorboard
```
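Once tensorboard is installed, loss values can be written with `torch.utils.tensorboard` and then viewed in the browser. This is a minimal sketch; the log directory and loss values below are placeholders, not output from an actual training run:

```python
import tempfile

from torch.utils.tensorboard import SummaryWriter

# Placeholder log directory; a real run would point at its work directory.
logdir = tempfile.mkdtemp()
writer = SummaryWriter(logdir)

# Log a few dummy loss values; real training code would call add_scalar
# once per logging step with the current loss.
for step, loss in enumerate([2.31, 1.84, 1.52, 1.27]):
    writer.add_scalar("train/loss", loss, global_step=step)
writer.close()
```

Running `tensorboard --logdir <logdir>` then serves the loss curve at `http://localhost:6006`.
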

## Data preparation

1. File structure

```