Update README.md
README.md CHANGED
@@ -13,9 +13,14 @@ ColPali is a model based on a novel model architecture and training strategy bas
 It is a [PaliGemma-3B](https://huggingface.co/google/paligemma-3b-mix-448) extension that generates [ColBERT](https://arxiv.org/abs/2004.12832)- style multi-vector representations of text and images.
 It was introduced in the paper [ColPali: Efficient Document Retrieval with Vision Language Models](https://arxiv.org/abs/2407.01449) and first released in [this repository](https://github.com/ManuelFay/colpali)
 
-
+
+## Version specificity
+
+This version is trained with `colpali-engine==0.2.0`.
+
+Compared to `colpali`, this version is trained with right padding for queries to fix unwanted tokens in the query encoding.
 It also stems from the fixed `vidore/colpaligemma-3b-pt-448-base` to guarantee deterministic projection layer initialization.
-It was trained for 5 epochs, with in-batch negatives and hard mined negatives and a warmup of 1000 steps to help reduce non-english language collapse.
+It was trained for 5 epochs, with in-batch negatives and hard mined negatives and a warmup of 1000 steps (10x longer) to help reduce non-english language collapse.
 
 Data is the same as the ColPali data described in the paper.
 
@@ -45,6 +50,11 @@ We train on an 8 GPU setup with data parallelism, a learning rate of 5e-5 with l
 
 ## Usage
 
+```bash
+pip install colpali-engine==0.2.0
+```
+
+
 ```python
 import torch
 import typer