metadata

pipeline_tag: text-classification
tags:
  - transformers
  - sentence-transformers
  - reranker
  - cross-encoder
language:
  - multilingual
license: cc-by-nc-4.0


Trained by Jina AI.

jina-reranker-v2-base-multilingual

Intended Usage & Model Info

The Jina Reranker v2 (jina-reranker-v2-base-multilingual) is a transformer-based model fine-tuned for text reranking, a crucial component in many information retrieval systems. It is a cross-encoder that takes a query-document pair as input and outputs a score indicating how relevant the document is to the query. The model is trained on a large dataset of query-document pairs and can rerank documents in multiple languages with high accuracy.

Compared with state-of-the-art reranker models, including the previously released jina-reranker-v1-base-en, the Jina Reranker v2 model has demonstrated competitive results across a series of benchmarks covering text retrieval, multilingual capability, function-calling-aware and text-to-SQL-aware reranking, and code retrieval.

The jina-reranker-v2-base-multilingual model supports a context length of up to 1024 tokens. For texts that exceed 1024 tokens, the model uses a sliding window approach that chunks the input into smaller pieces and reranks each chunk separately.

The model is also equipped with a flash attention mechanism, which significantly improves inference speed.

Usage

The easiest way to start using jina-reranker-v2-base-multilingual is through Jina AI's Reranker API.

curl https://api.jina.ai/v1/rerank \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
  "model": "jina-reranker-v2-base-multilingual",
  "query": "Organic skincare products for sensitive skin",
  "documents": [
    "Eco-friendly kitchenware for modern homes",
    "Biodegradable cleaning supplies for eco-conscious consumers",
    "Organic cotton baby clothes for sensitive skin",
    "Natural organic skincare range for sensitive skin",
    "Tech gadgets for smart homes: 2024 edition",
    "Sustainable gardening tools and compost solutions",
    "Sensitive skin-friendly facial cleansers and toners",
    "Organic food wraps and storage solutions",
    "All-natural pet food for dogs with allergies",
    "Yoga mats made from recycled materials"
  ],
  "top_n": 3
}'
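
For readers who prefer Python over curl, the same request can be assembled with the standard library alone. This is a minimal sketch that mirrors the curl call above: the endpoint, headers, and payload fields come from that example, while the helper names build_rerank_request and send are hypothetical and introduced only for illustration. The request is built but not sent, since sending requires a real API key.

```python
import json
import urllib.request

API_URL = "https://api.jina.ai/v1/rerank"  # endpoint taken from the curl example


def build_rerank_request(api_key, query, documents, top_n=3):
    """Build (but do not send) a POST request mirroring the curl call.

    Hypothetical helper for illustration; not part of any Jina AI SDK.
    """
    payload = {
        "model": "jina-reranker-v2-base-multilingual",
        "query": query,
        "documents": documents,
        "top_n": top_n,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON body (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_rerank_request(
    "YOUR_API_KEY",
    "Organic skincare products for sensitive skin",
    [
        "Organic cotton baby clothes for sensitive skin",
        "Natural organic skincare range for sensitive skin",
    ],
)
# result = send(request)  # uncomment once a real API key is in place
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.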

You can also use the transformers library to interact with the model programmatically.

# Install the dependency first, e.g. in a notebook cell: !pip install transformers
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    'jinaai/jina-reranker-v2-base-multilingual',
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
)

# Example query and documents
query = "Organic skincare products for sensitive skin"
documents = [
    "Eco-friendly kitchenware for modern homes",
    "Biodegradable cleaning supplies for eco-conscious consumers",
    "Organic cotton baby clothes for sensitive skin",
    "Natural organic skincare range for sensitive skin",
    "Tech gadgets for smart homes: 2024 edition",
    "Sustainable gardening tools and compost solutions",
    "Sensitive skin-friendly facial cleansers and toners",
    "Organic food wraps and storage solutions",
    "All-natural pet food for dogs with allergies",
    "Yoga mats made from recycled materials"
]

# construct sentence pairs
sentence_pairs = [[query, doc] for doc in documents]

scores = model.compute_score(sentence_pairs, max_length=1024)

That's it! You can now use the jina-reranker-v2-base-multilingual model in your projects.
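
To turn the scores into a ranking, each score can be paired with its document and sorted. The sketch below assumes that compute_score() returns one numeric score per input pair, in the same order as sentence_pairs; the helper top_documents and the example scores are hypothetical and only illustrate the idea.

```python
def top_documents(documents, scores, top_n=3):
    """Return the top_n (score, document) pairs, highest score first."""
    if len(documents) != len(scores):
        raise ValueError("documents and scores must have the same length")
    ranked = sorted(zip(scores, documents), key=lambda pair: pair[0], reverse=True)
    return ranked[:top_n]


# Hypothetical scores for illustration only; real values come from compute_score().
example_documents = ["doc a", "doc b", "doc c", "doc d"]
example_scores = [0.12, 0.87, 0.05, 0.64]

for score, doc in top_documents(example_documents, example_scores, top_n=2):
    print(f"{score:.2f}  {doc}")
```

With the real model, the call would be top_documents(documents, scores, top_n=3) using the variables from the example above.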

Note that by default, the jina-reranker-v2-base-multilingual model uses flash attention, which requires certain types of GPU hardware to run. If you encounter any issues, try calling AutoModelForSequenceClassification.from_pretrained() with use_flash_attn=False, which uses the standard attention mechanism instead of flash attention. You can also run the model on a CPU by setting device_map="cpu".
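
The two fallbacks in the note above can be folded into one small helper that picks the loading options. This is only a sketch: the helper name loading_options and its two boolean arguments are hypothetical, and combining use_flash_attn=False with device_map="cpu" is an assumption rather than documented behavior. The keyword arguments themselves are the ones shown in this model card.

```python
def loading_options(use_gpu, flash_attention_supported):
    """Build keyword arguments for from_pretrained() based on the hardware.

    Hypothetical helper: the caller decides whether a GPU is available and
    whether that GPU supports flash attention.
    """
    options = {
        "torch_dtype": "auto",
        "trust_remote_code": True,
        "device_map": "cuda" if use_gpu else "cpu",
    }
    # Fall back to standard attention when flash attention cannot be used
    # (assumed to also be the right choice on CPU).
    if not (use_gpu and flash_attention_supported):
        options["use_flash_attn"] = False
    return options


cpu_options = loading_options(use_gpu=False, flash_attention_supported=False)
# model = AutoModelForSequenceClassification.from_pretrained(
#     'jinaai/jina-reranker-v2-base-multilingual', **cpu_options
# )
```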

In addition to the compute_score() function, the jina-reranker-v2-base-multilingual model also provides a model.rerank() function that can be used to rerank documents based on a query. You can use it as follows:

result = model.rerank(
    query,
    documents,
    max_query_length=512,
    max_length=1024,
    top_n=3
)

Inside the result object, you will find the reranked documents along with their scores. You can use this information to further process the documents as needed.

What's more, the rerank() function will automatically chunk the input documents into smaller pieces if they exceed the model's maximum input length. This allows you to rerank long documents without running into memory issues. Specifically, the rerank() function will split the documents into chunks of size max_length and rerank each chunk separately. The scores from all the chunks are then combined to produce the final reranking results.

You can control the query length and document length in each chunk by setting the max_query_length and max_length parameters. The rerank() function also supports the overlap parameter (default is 80), which determines how much overlap there is between adjacent chunks. This can be useful when reranking long documents to ensure that the model has enough context to make accurate predictions.
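
The chunking behavior described above can be illustrated with a small, model-free sketch. It is not the library's implementation: the function names are hypothetical, the text is treated as an already tokenized list, and taking the maximum chunk score as the document score is an assumption, since the description only says the chunk scores are combined.

```python
def chunk_tokens(tokens, window, overlap=80):
    """Split a token list into windows of at most `window` tokens.

    Consecutive windows share `overlap` tokens, so text near a boundary
    appears with some context in both neighbouring chunks.
    """
    if overlap >= window:
        raise ValueError("overlap must be smaller than window")
    step = window - overlap
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def combine_scores(chunk_scores):
    """Combine per-chunk scores into one document score (assumed: maximum)."""
    return max(chunk_scores)


# Toy example: 10 "tokens", windows of 4 with 1 token of overlap.
toy_tokens = list(range(10))
toy_chunks = chunk_tokens(toy_tokens, window=4, overlap=1)
print(toy_chunks)

# Hypothetical per-chunk scores, one per chunk, for illustration only.
print(combine_scores([0.21, 0.74, 0.38]))
```

With a maximum, a long document ranks as high as its single most relevant passage; an average would instead favor documents that are relevant throughout.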