aarticerebras committed on
Commit ce72365 · verified · 1 Parent(s): b26a998

Update README.md

  1. README.md +15 -5
README.md CHANGED
@@ -8,17 +8,21 @@ The vision encoder checkpoints for this model can be found at [cerebras/Cerebras
8
 
9
  **Note**: _ShareGPT4V_ is added to the vision model name to ensure correct loading of checkpoints in [LLaVA source repo](https://github.com/haotian-liu/LLaVA/blob/main/llava/model/multimodal_encoder/builder.py#L8)
10
 
11
- For full details of this model and training details, please read our paper and release blog post **to be released shortly**.
12
 
13
- # Model Architecture
 
 
 
14
  Cerebras-LLaVA-7B is a transformer model with the following architecture details
15
  * Vision encoder: [CLIP-VisionModel-Large](cerebras/Cerebras-ViT-L-336-patch14-llava7b-ShareGPT4V). It handles images of size 336 x 336 with patch size of 14
16
  * Large Language Model: Pretrained from Vicuna-7B checkpoints and instruction finetuned on various datasets.
17
  * Projector: the projector module that connects the LLM and Vision encoder part consists of two linear layers with gelu activation (mlp2x-gelu)
18
 
19
- # Loading the model
20
 
21
  This model can be loaded directly using the [LLaVA source code repository](https://github.com/haotian-liu/LLaVA). For installation, please refer to the [instructions in the source code repository](https://github.com/haotian-liu/LLaVA?tab=readme-ov-file#install).
 
22
 
23
  ```
24
  from llava.model.builder import load_pretrained_model
@@ -34,7 +38,13 @@ tokenizer, model, image_processor, context_len = load_pretrained_model(
34
  )
35
  ```
36
 
37
- # Acknowledgements
38
- We are thankful to all Cerebras engineers, past and present, that made this work possible.
 
 
 
 
 
 
39
 
40
 
 
8
 
9
  **Note**: _ShareGPT4V_ is added to the vision model name to ensure correct loading of checkpoints in [LLaVA source repo](https://github.com/haotian-liu/LLaVA/blob/main/llava/model/multimodal_encoder/builder.py#L8)
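
The note above refers to a name-based check in the linked LLaVA builder. The following is a simplified paraphrase of that kind of check, written from memory as an assumption rather than copied from the repository; the helper name `is_recognized_clip_tower` is hypothetical, and the code at the link may differ.

```python
import os


def is_recognized_clip_tower(vision_tower: str) -> bool:
    """Hypothetical helper: paraphrases a name-based vision tower check.

    Assumption: the builder accepts a tower if it is an existing local
    path, or if its name starts with "openai" or "laion", or if the name
    contains "ShareGPT4V". Verify against the linked source before relying
    on this.
    """
    return (
        os.path.exists(vision_tower)
        or vision_tower.startswith("openai")
        or vision_tower.startswith("laion")
        or "ShareGPT4V" in vision_tower
    )


# The checkpoint name used in this README contains the marker substring.
tower_name = "cerebras/Cerebras-ViT-L-336-patch14-llava7b-ShareGPT4V"
recognized = is_recognized_clip_tower(tower_name)
```

Under this assumption, a Hub id without the `ShareGPT4V` substring (and without an `openai` or `laion` prefix) would not pass the check unless it also existed as a local path.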
10
 
11
+ For full model and training details, please read our upcoming blog post.
12
 
13
+ ## License
14
+
15
+
16
+ ## Model Architecture
17
  Cerebras-LLaVA-7B is a transformer model with the following architecture details
18
  * Vision encoder: [CLIP-VisionModel-Large](cerebras/Cerebras-ViT-L-336-patch14-llava7b-ShareGPT4V). It handles images of size 336 x 336 with patch size of 14
19
  * Large Language Model: Pretrained from Vicuna-7B checkpoints and instruction finetuned on various datasets.
20
  * Projector: the projector module that connects the LLM and Vision encoder part consists of two linear layers with gelu activation (mlp2x-gelu)
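
As a quick check of the vision encoder numbers listed above, the sketch below computes how many patches a 336 x 336 image yields at patch size 14. The function name `clip_patch_grid` is hypothetical, and the count excludes any extra class token the encoder may prepend.

```python
from typing import Tuple


def clip_patch_grid(image_size: int = 336, patch_size: int = 14) -> Tuple[int, int]:
    """Return (patches per side, total patches) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


per_side, num_patches = clip_patch_grid()
print(per_side, num_patches)  # prints: 24 576
```

Each image therefore maps to a 24 x 24 grid, or 576 patch positions, before the projector passes the features to the language model.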
21
 
22
+ ## Loading the model
23
 
24
  This model can be loaded directly using the [LLaVA source code repository](https://github.com/haotian-liu/LLaVA). For installation, please refer to the [instructions in the source code repository](https://github.com/haotian-liu/LLaVA?tab=readme-ov-file#install).
25
+ We perform all our evaluations using the LLaVA source code repository scripts.
26
 
27
  ```
28
  from llava.model.builder import load_pretrained_model
 
38
  )
39
  ```
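
The diff elides the middle of the snippet above, so the following is only a minimal sketch of how such a call might be wrapped, not the README's actual code. It assumes the LLaVA package is installed, that `load_pretrained_model` accepts `model_path`, `model_base`, and `model_name` keyword arguments, and that the model lives at a Hub id such as `cerebras/Cerebras-LLaVA-7B`; the helper names are hypothetical.

```python
def derive_model_name(model_path: str) -> str:
    """Hypothetical helper: last component of a Hub id or local path."""
    return model_path.rstrip("/").split("/")[-1]


def load_cerebras_llava(model_path: str):
    """Sketch of a loader wrapper; requires the LLaVA package at call time."""
    # Imported lazily so this module can be imported without LLaVA installed.
    from llava.model.builder import load_pretrained_model

    model_name = derive_model_name(model_path)
    # Returns (tokenizer, model, image_processor, context_len),
    # matching the tuple unpacked in the README snippet.
    return load_pretrained_model(
        model_path=model_path,
        model_base=None,  # assumption: full checkpoint, no separate base model
        model_name=model_name,
    )
```

With LLaVA installed, calling `load_cerebras_llava("cerebras/Cerebras-LLaVA-7B")` would attempt to download and load the checkpoint; the Hub id here is an assumption based on the model name used in this README.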
40
 
41
+ ## Intended Use
42
+ Primary intended uses: The primary use of LLaVA is research on large multimodal models and chatbots.
43
+
44
+ Primary intended users: The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.
45
+
46
+
47
+ ## Acknowledgements
48
+ We are thankful to all Cerebras engineers that made this work possible.
49
 
50