vidore/colqwen2-v1.0

tonywu71 committed 8 days ago

Commit 00a95b8 · verified · 1 parent: 1281da3

Update README.md

Files changed (1)

README.md +2 -3


@@ -1,5 +1,5 @@
 ---
-license: mit
+license: apache-2.0
 library_name: colpali
 base_model: vidore/colqwen2-base
 language:
@@ -14,11 +14,10 @@ pipeline_tag: visual-document-retrieval
 ### This is the base version trained with batch_size 256 instead of 32 for 5 epoch and with the updated pad token
-ColQwen is a model based on a novel model architecture and training strategy based on Vision Language Models (VLMs) to efficiently index documents from their visual features.
+ColQwen2 is a model based on a novel model architecture and training strategy based on Vision Language Models (VLMs) to efficiently index documents from their visual features.
 It is a [Qwen2-VL-2B](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct) extension that generates [ColBERT](https://arxiv.org/abs/2004.12832)- style multi-vector representations of text and images.
 It was introduced in the paper [ColPali: Efficient Document Retrieval with Vision Language Models](https://arxiv.org/abs/2407.01449) and first released in [this repository](https://github.com/ManuelFay/colpali)
-This version is the untrained base version to guarantee deterministic projection layer initialization.
 <p align="center"><img width=800 src="https://github.com/illuin-tech/colpali/blob/main/assets/colpali_architecture.webp?raw=true"/></p>
 ## Version specificity