Update README.md
README.md CHANGED
@@ -11,7 +11,6 @@ llava-dinov2-internlm2-7b-v1 is a LLaVA model fine-tuned from [InternLM2-Chat-7B
 I did not carefully tune the training hyperparameters, but the model still shows the capability to solve some tasks. This shows that a visual encoder can be integrated with an LLM even when the encoder is not aligned with natural language via contrastive learning, as CLIP is.
 
 ## Example
-
 ![5bb2f23dd595d389e6a9a0aadebd87c.png](https://cdn-uploads.huggingface.co/production/uploads/642a298ae5f33939cf3ee600/iOFZOwLGfEByCQ_2EkR7y.png)
 Explain the photo in English:
 ![eeb555092886be02e8e6215d0fdb229.png](https://cdn-uploads.huggingface.co/production/uploads/642a298ae5f33939cf3ee600/CASHz1oxgowVS3n5e4LUq.png)
@@ -77,9 +76,13 @@ You just need
 ```
 pip install protobuf
 ```
+4.
+To use TensorBoard to visualize the training loss curve:
+```
+pip install future tensorboard
+```
 
 ## Data preparation
-
 1. File structure
 
 ```