Tags: Text Generation, Transformers, Safetensors, GGUF, llava, remyx, Inference Endpoints
salma-remyx committed on
Commit a927b42
1 Parent(s): b2d6090

Add files including subdirectories

.gitattributes CHANGED
@@ -1,35 +1,3 @@
- *.7z filter=lfs diff=lfs merge=lfs -text
- *.arrow filter=lfs diff=lfs merge=lfs -text
- *.bin filter=lfs diff=lfs merge=lfs -text
- *.bz2 filter=lfs diff=lfs merge=lfs -text
- *.ckpt filter=lfs diff=lfs merge=lfs -text
- *.ftz filter=lfs diff=lfs merge=lfs -text
- *.gz filter=lfs diff=lfs merge=lfs -text
- *.h5 filter=lfs diff=lfs merge=lfs -text
- *.joblib filter=lfs diff=lfs merge=lfs -text
- *.lfs.* filter=lfs diff=lfs merge=lfs -text
- *.mlmodel filter=lfs diff=lfs merge=lfs -text
- *.model filter=lfs diff=lfs merge=lfs -text
- *.msgpack filter=lfs diff=lfs merge=lfs -text
- *.npy filter=lfs diff=lfs merge=lfs -text
- *.npz filter=lfs diff=lfs merge=lfs -text
- *.onnx filter=lfs diff=lfs merge=lfs -text
- *.ot filter=lfs diff=lfs merge=lfs -text
- *.parquet filter=lfs diff=lfs merge=lfs -text
- *.pb filter=lfs diff=lfs merge=lfs -text
- *.pickle filter=lfs diff=lfs merge=lfs -text
- *.pkl filter=lfs diff=lfs merge=lfs -text
- *.pt filter=lfs diff=lfs merge=lfs -text
- *.pth filter=lfs diff=lfs merge=lfs -text
- *.rar filter=lfs diff=lfs merge=lfs -text
- *.safetensors filter=lfs diff=lfs merge=lfs -text
- saved_model/**/* filter=lfs diff=lfs merge=lfs -text
- *.tar.* filter=lfs diff=lfs merge=lfs -text
- *.tar filter=lfs diff=lfs merge=lfs -text
- *.tflite filter=lfs diff=lfs merge=lfs -text
- *.tgz filter=lfs diff=lfs merge=lfs -text
- *.wasm filter=lfs diff=lfs merge=lfs -text
- *.xz filter=lfs diff=lfs merge=lfs -text
- *.zip filter=lfs diff=lfs merge=lfs -text
- *.zst filter=lfs diff=lfs merge=lfs -text
- *tfevents* filter=lfs diff=lfs merge=lfs -text
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e951524a8b55d3792e37bc5821527f995d41219fbefd55c39adbe569086a41db
+ size 2391
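
Every file touched by this commit, including the rewritten .gitattributes, now consists of a three-line Git LFS pointer (`version`, `oid`, `size`) rather than the file's real content. The following is a minimal sketch of how such a pointer could be read, assuming Python; `LfsPointer` and `parse_lfs_pointer` are hypothetical names used only for illustration and are not part of git-lfs or any Hugging Face library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LfsPointer:
    """Fields of a Git LFS pointer file (hypothetical helper type)."""
    version: str
    algo: str
    digest: str
    size: int


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return LfsPointer(fields["version"], algo, digest, int(fields["size"]))


# Pointer text copied from the .gitattributes hunk above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:e951524a8b55d3792e37bc5821527f995d41219fbefd55c39adbe569086a41db\n"
    "size 2391\n"
)
print(pointer.algo, pointer.size)
```

The sketch does no validation beyond what `int()` and the dictionary lookups enforce; a stricter reader would also check the `version` URL and the digest length.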
README.md CHANGED
@@ -1,50 +1,3 @@
- ---
- license: apache-2.0
- ---
-
-
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/647777304ae93470ffc28913/iVKgqK6vTzCpCLVnWxmjA.png)
-
- # Model Card for SpaceLLaVA
-
- **SpaceLLaVA** uses LoRA to fine-tune [LLaVA](https://github.com/haotian-liu/LLaVA/tree/main) on a dataset designed with [VQASynth](https://github.com/remyxai/VQASynth/tree/main) to enhance spatial reasoning as in [SpatialVLM](https://spatial-vlm.github.io/)
-
- ## Model Details
-
- ### Model Description
-
- This model uses data synthesis techniques and publically available models to reproduce the work described in SpatialVLM to enhance the spatial reasoning of multimodal models.
- With a pipeline of expert models, we can infer spatial relationships between objects in a scene to create VQA dataset for spatial reasoning.
-
-
- - **Developed by:** remyx.ai
- - **Model type:** MultiModal Model, Vision Language Model, LLaVA
- - **License:** Apache-2.0
- - **Finetuned from model:** LLaVA
-
- ### Model Sources
-
- - **Repository:** [VQASynth](https://github.com/remyxai/VQASynth/tree/main)
- - **Paper:** [SpatialVLM](https://arxiv.org/abs/2401.12168)
-
- ## Uses
-
- Use this model to query spatial relationships between objects in a scene.
-
- ## Citation
- ```
- @article{chen2024spatialvlm,
- title = {SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities},
- author = {Chen, Boyuan and Xu, Zhuo and Kirmani, Sean and Ichter, Brian and Driess, Danny and Florence, Pete and Sadigh, Dorsa and Guibas, Leonidas and Xia, Fei},
- journal = {arXiv preprint arXiv:2401.12168},
- year = {2024},
- url = {https://arxiv.org/abs/2401.12168},
- }
-
- @misc{liu2023llava,
- title={Visual Instruction Tuning},
- author={Liu, Haotian and Li, Chunyuan and Wu, Qingyang and Lee, Yong Jae},
- publisher={NeurIPS},
- year={2023},
- }
- ```
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f8ea4eabb1259d4e862a6cd3c00b5cdcf28239a0b6d08e092a87a40d61834fd7
+ size 1803
config.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:578885be8f66b03539a7f5e088c7379b2983e42962ddfc64526e33824e038def
+ size 1395
generation_config.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f08b1ec30ce67f8c85e45853bcc5486639df57ff573cff0dd6b0e14efa7bca80
+ size 154
ggml-model-f16.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4aa5e862c4a7c97fb617ed70a94ef6f5876dce2127fe0f1df12d6277d95d65e6
+ size 26033303520
ggml-model-q4_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8d34cbb74271253ea69f32662f29611c8b6f2e4e784d95e139a2272141f84aa3
+ size 7365834752
mmproj-model-f16.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1c8e0aa019d51f9529385ce87692f2fba40c679429b9af849ffbdce71b4b9366
+ size 645414080
model-00001-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bf813981eb324008586dc1f2643fe73e1c921d1305ef8af26037afcf37a8ae24
+ size 4978265728
model-00002-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:edf5d42738f2ddcc846e20484cee3e3776a3aefc34789c1702a09f96057bd650
+ size 4970422160
model-00003-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f05799392dd6d891aad87ac1778ce908e3c1bb21a1780cd97166106b84534c89
+ size 4970422184
model-00004-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8d6e5620ca84513a80337f909c5ad2d4f56173c8c91cc2bec0197c3d75b9bcd1
+ size 4933701432
model-00005-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f30ec13d596bcfb2b11ca34bf65ae011f3b0663dfd4bdaa18bb0edc2fe150bd4
+ size 4933722144
model-00006-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f5fae337a06c0c8481a848813586edfa919ac1ba495ad6433218318e4b4c0bb2
+ size 1915248256
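
The six safetensors shards above each record their byte size in the pointer's `size` field, so the combined download size can be worked out without fetching any of them. A minimal sketch, assuming Python; the `shard_sizes` dictionary is built by hand from the sizes listed in this commit and is not produced by any library call.

```python
# Byte sizes copied from the "size" line of each shard's LFS pointer above.
shard_sizes = {
    "model-00001-of-00006.safetensors": 4978265728,
    "model-00002-of-00006.safetensors": 4970422160,
    "model-00003-of-00006.safetensors": 4970422184,
    "model-00004-of-00006.safetensors": 4933701432,
    "model-00005-of-00006.safetensors": 4933722144,
    "model-00006-of-00006.safetensors": 1915248256,
}

total_bytes = sum(shard_sizes.values())
total_gib = total_bytes / 2**30  # 1 GiB = 2**30 bytes

print(f"{len(shard_sizes)} shards, {total_bytes} bytes ({total_gib:.2f} GiB)")
```

The total is 26,701,781,904 bytes, slightly more than the 26,033,303,520 bytes recorded for ggml-model-f16.gguf.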
model.safetensors.index.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:340636c562e772c515c3275553a697876135d82b2bb5c62e1d8c355e94c62eba
+ size 79096
special_tokens_map.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4859e5dbde90e059988a0a2136d8df3f2773d4d2fc4c4543690028f0b2166e7f
+ size 552
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+ size 499723
tokenizer_config.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b9dbbc4e94fa11210a21800939bd17da1e91b43f083833fb031b394111de6a9a
+ size 936
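
Because each pointer records a SHA-256 digest and a byte size, a downloaded file can be checked against the values shown in this commit. The following is a minimal sketch, assuming Python and only the standard library; `matches_pointer` is a hypothetical helper name, and the demo runs against a throwaway temporary file rather than one of the real model files.

```python
import hashlib
import os
import tempfile


def matches_pointer(path: str, sha256_hex: str, size: int,
                    chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the given size and SHA-256 digest."""
    # Cheap check first: a size mismatch avoids hashing a multi-gigabyte file.
    if os.path.getsize(path) != size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Stream in chunks so large shards never need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == sha256_hex


# Self-contained demo on a temporary file (not a file from this repository).
payload = b"example payload"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name
try:
    expected = hashlib.sha256(payload).hexdigest()
    ok = matches_pointer(tmp_path, expected, len(payload))
    bad = matches_pointer(tmp_path, expected, len(payload) + 1)
finally:
    os.remove(tmp_path)

print(ok, bad)  # True False
```

For a real file, the call would take the values from its pointer, for example `matches_pointer("tokenizer_config.json", "b9dbbc4e94fa11210a21800939bd17da1e91b43f083833fb031b394111de6a9a", 936)`, assuming the file has been downloaded to the working directory.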