metadata

datasets:
  - oddadmix/arabic-triplets-large
  - akhooli/arabic-triplets-1m-curated-sims-len
language:
  - ar
base_model:
  - Omartificial-Intelligence-Space/Arabic-Triplet-Matryoshka-V2
tags:
  - reranking
  - arabic-nlp
  - nlp

Arabic Reranker V1 Model

This is an Arabic reranker model fine-tuned from Omartificial-Intelligence-Space/Arabic-Triplet-Matryoshka-V2, which is itself based on aubmindlab/bert-base-arabertv02. It scores candidate texts by their relevance to a given query so that they can be ordered, and it is optimized specifically for Arabic text.

The model was trained on a synthetic dataset of Arabic triplets generated with large language models (LLMs). It was then refined using a scoring technique, which makes it well suited to ranking tasks in Arabic natural language processing (NLP).
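
To make the triplet format concrete, the sketch below shows one common way that (query, positive, negative) triplets can be expanded into labeled pairs for a cross-encoder. The field names `anchor`, `positive`, and `negative`, the helper `triplets_to_pairs`, the 1.0/0.0 labels, and the sample sentences are all assumptions for illustration; this card does not state how the triplets were actually consumed during training.

```python
from typing import Dict, List, Tuple

# Hypothetical triplet record; the real datasets' column names may differ.
Triplet = Dict[str, str]
LabeledPair = Tuple[str, str, float]


def triplets_to_pairs(triplets: List[Triplet]) -> List[LabeledPair]:
    """Expand each triplet into one relevant and one irrelevant (query, passage) pair."""
    pairs: List[LabeledPair] = []
    for t in triplets:
        pairs.append((t["anchor"], t["positive"], 1.0))  # assumed label for a relevant passage
        pairs.append((t["anchor"], t["negative"], 0.0))  # assumed label for an irrelevant passage
    return pairs


# Made-up sample triplet (not taken from the training data).
example = [
    {
        "anchor": "ما هي عاصمة مصر؟",            # "What is the capital of Egypt?"
        "positive": "القاهرة هي عاصمة مصر.",      # "Cairo is the capital of Egypt."
        "negative": "الرياض هي عاصمة السعودية.",  # "Riyadh is the capital of Saudi Arabia."
    }
]
pairs = triplets_to_pairs(example)
```

Each triplet yields two training pairs that share the same query, so the model sees a relevant and an irrelevant passage for every anchor.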

Model Use

This model is well-suited for Arabic text reranking tasks, including:

Information retrieval and document ranking
Search engine results reranking
Question-answering tasks requiring ranked answer choices

Example Usage

The example below uses the sentence_transformers library to score candidate paragraphs by their relevance to a query.

Code Example

from sentence_transformers import CrossEncoder

# Load the model
model = CrossEncoder('oddadmix/arabic-reranker-v1', max_length=512)

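# Define the query and candidate paragraphs
# Query: "How can deep learning be used in medical image processing?"
query = 'كيف يمكن استخدام التعلم العميق في معالجة الصور الطبية؟'
# Paragraph 1: "Deep learning helps analyze medical images and diagnose diseases"
paragraph_1 = 'التعلم العميق يساعد في تحليل الصور الطبية وتشخيص الأمراض'
# Paragraph 2: "Artificial intelligence is used to improve productivity in industries"
paragraph_2 = 'الذكاء الاصطناعي يستخدم في تحسين الإنتاجية في الصناعات'

# Score each (query, paragraph) pair; a higher score means higher relevance
scores = model.predict([(query, paragraph_1), (query, paragraph_2)])

# Output scores
print("Score for Paragraph 1:", scores[0])
print("Score for Paragraph 2:", scores[1])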
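
The example above prints raw scores; reranking then amounts to sorting the candidates by those scores. The following is a minimal sketch of that step, not part of the model's own API: the helper `rerank`, the `PairScorer` type, and the `toy_overlap_scorer` stand-in are hypothetical names introduced here so the code runs without downloading the model.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A pairwise scorer takes (query, passage) pairs and returns one score per pair.
PairScorer = Callable[[List[Tuple[str, str]]], Sequence[float]]


def rerank(query: str, passages: List[str], score_pairs: PairScorer,
           top_k: Optional[int] = None) -> List[Tuple[str, float]]:
    """Return (passage, score) tuples sorted from most to least relevant."""
    if not passages:
        return []
    scores = score_pairs([(query, p) for p in passages])
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    ranked = [(p, float(s)) for p, s in ranked]
    return ranked if top_k is None else ranked[:top_k]


def toy_overlap_scorer(pairs: List[Tuple[str, str]]) -> List[float]:
    """Stand-in scorer (shared-word count) so the sketch runs without a model download."""
    return [float(len(set(q.split()) & set(p.split()))) for q, p in pairs]


# Toy English data, used only to exercise the sorting logic.
passages = ["blue sky today", "red apple pie", "green apple"]
top = rerank("apple pie recipe", passages, toy_overlap_scorer, top_k=2)
# With the real model, one would pass model.predict in place of toy_overlap_scorer.
```

Because `model.predict` accepts a list of (query, paragraph) pairs and returns one score per pair, it should fit the `score_pairs` argument directly. Recent releases of sentence_transformers also provide a `CrossEncoder.rank` method that performs a similar sort; check the documentation for the installed version before relying on it.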