---
library_name: transformers
license: apache-2.0
language:
- en
tags:
- reranker
- cross-encoder
---

<br><br>

<p align="center">
<img src="https://aeiljuispo.cloudimg.io/v7/https://cdn-uploads.huggingface.co/production/uploads/603763514de52ff951d89793/AFoybzd5lpBQXEBrQHuTt.png?w=200&h=200&f=face" alt="Finetuner logo: Finetuner helps you to create experiments in order to improve embeddings on search tasks. It accompanies you to deliver the last mile of performance-tuning for neural search applications." width="150px">
</p>

<p align="center">
<b>Trained by <a href="https://jina.ai/"><b>Jina AI</b></a>.</b>
</p>

# jina-reranker-v1-turbo-en

This model is designed for **blazing-fast** reranking while maintaining **competitive performance**. It builds on our [JinaBERT](https://arxiv.org/abs/2310.19923) model as its foundation; JinaBERT is a variant of the BERT architecture that supports the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409). This allows `jina-reranker-v1-turbo-en` to process significantly longer sequences of text than other reranking models, up to an impressive **8,192** tokens.
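
As a quick, illustrative sketch of the long-context claim (this is not part of the official usage example, it simply inspects the tokenized length), you can encode a query-document pair that far exceeds the usual 512-token limit:

```python
from transformers import AutoTokenizer

# Load the tokenizer that ships with the model
tokenizer = AutoTokenizer.from_pretrained(
    'jinaai/jina-reranker-v1-turbo-en', trust_remote_code=True
)

query = "Organic skincare products for sensitive skin"
# An artificially long document, far beyond what 512-token rerankers accept
long_document = "Our organic skincare range is formulated for sensitive skin. " * 2000

# Encode the (query, document) pair; the input is truncated at 8,192 tokens
# rather than the 512 tokens typical of conventional BERT-based rerankers.
encoded = tokenizer(query, long_document, truncation=True, max_length=8192, return_tensors='pt')
print(encoded['input_ids'].shape)
```
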
To achieve this speed, `jina-reranker-v1-turbo-en` employs a technique called knowledge distillation: a larger but slower model (our original [jina-reranker-v1-base-en](https://jina.ai/reranker/)) acts as a teacher, condensing its knowledge into a smaller, faster student model. The student retains most of the teacher's knowledge, allowing it to deliver similar accuracy in a fraction of the time.
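
For intuition only, score-based distillation can be sketched as training the student to reproduce the teacher's relevance scores on the same query-document pairs. The snippet below is a simplified, hypothetical illustration of such an objective (a plain MSE between teacher and student scores); it is not the actual training recipe used for this model:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_scores: torch.Tensor, teacher_scores: torch.Tensor) -> torch.Tensor:
    # Train the smaller student to match the larger teacher's relevance scores.
    return F.mse_loss(student_scores, teacher_scores)

# Hypothetical relevance scores for a batch of (query, document) pairs
teacher_scores = torch.tensor([0.92, 0.15, 0.78])  # produced by the slower teacher reranker
student_scores = torch.tensor([0.88, 0.20, 0.75])  # produced by the faster student reranker
print(distillation_loss(student_scores, teacher_scores))
```
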
Here's a breakdown of the reranker models we provide:

| Model Name | Layers | Hidden Size | Parameters (Millions) |
| ------------------------------------------------------------------------------------ | ------ | ----------- | --------------------- |
| [jina-reranker-v1-base-en](https://jina.ai/reranker/) | 12 | 768 | 137.0 |
| [jina-reranker-v1-turbo-en](https://huggingface.co/jinaai/jina-reranker-v1-turbo-en) | 6 | 384 | 37.8 |
| [jina-reranker-v1-tiny-en](https://huggingface.co/jinaai/jina-reranker-v1-tiny-en) | 4 | 384 | 33.0 |

# Usage

You can use Jina Reranker models directly with the `transformers` package:

```python
# Install the dependency first: pip install transformers
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    'jinaai/jina-reranker-v1-turbo-en', num_labels=1, trust_remote_code=True
)

# Example query and documents
query = "Organic skincare products for sensitive skin"
documents = [
    "Eco-friendly kitchenware for modern homes",
    "Biodegradable cleaning supplies for eco-conscious consumers",
    "Organic cotton baby clothes for sensitive skin",
    "Natural organic skincare range for sensitive skin",
    "Tech gadgets for smart homes: 2024 edition",
    "Sustainable gardening tools and compost solutions",
    "Sensitive skin-friendly facial cleansers and toners",
    "Organic food wraps and storage solutions",
    "All-natural pet food for dogs with allergies",
    "Yoga mats made from recycled materials"
]

# Construct (query, document) pairs and score them with the reranker
sentence_pairs = [[query, doc] for doc in documents]

scores = model.compute_score(sentence_pairs)
```
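
The returned scores are relevance estimates, one per pair, so ranking the documents is just a matter of sorting. Continuing the example above (and assuming `compute_score` returns a list of floats aligned with `sentence_pairs`), a minimal follow-up looks like this:

```python
# Sort documents from most to least relevant according to the reranker
ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
for doc, score in ranked:
    print(f"{score:.4f}  {doc}")
```
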
# Contact

Join our [Discord community](https://discord.jina.ai/) and chat with other community members about ideas.
|