# Model Card for FIRST
FIRST is a language model trained for listwise reranking that leverages the output logits of the first generated identifier to directly produce a ranked ordering of candidates. Built on the Zephyr-7B-β model, FIRST undergoes single-stage fine-tuning on a converted alphabetic version of the RankZephyr dataset (i.e., RankGPT-4 reorderings of OpenAI's Ada2 orderings for 5k queries). More details can be found in the paper.
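As a rough illustration of the single-token scheme, the sketch below scores every candidate passage by the logit its alphabetic identifier receives at the first decoding step. This is a hedged, minimal sketch, not the official inference code: the prompt template is hypothetical, the identifier tokenization is simplified, and the model id is the base checkpoint from this card; the exact versions of all three live in the repository linked under Model Sources.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: the base model id from this card; swap in the FIRST
# checkpoint you actually want to rerank with.
MODEL_ID = "HuggingFaceH4/zephyr-7b-beta"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

def rerank(query: str, passages: list[str]) -> list[int]:
    """Order passage indices by the logit their identifier receives at the
    first decoding step (a single forward pass, no autoregressive loop)."""
    # Label candidates A, B, C, ... mirroring the alphabetic RankZephyr data.
    idents = [chr(ord("A") + i) for i in range(len(passages))]
    listing = "\n".join(f"[{c}] {p}" for c, p in zip(idents, passages))
    # Hypothetical prompt; the template used in training is in the repo.
    prompt = (
        "Rank the following passages by relevance to the query.\n"
        f"Query: {query}\n{listing}\nRanking: ["
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        next_logits = model(**inputs).logits[0, -1]  # logits for the next token
    # Score each candidate by the logit of its identifier's first sub-token
    # (simplification: assumes each letter maps to a single token).
    scores = [
        next_logits[tokenizer(c, add_special_tokens=False).input_ids[0]].item()
        for c in idents
    ]
    return sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)

order = rerank(
    "what causes ocean tides",
    ["Tides are driven mainly by the Moon's gravity.",
     "A recipe for sourdough bread."],
)
print(order)  # e.g. [0, 1] if the first passage is judged more relevant
```

Sorting the full logit distribution over identifiers yields the complete ranking from one forward pass, which is what makes this approach faster than rerankers that decode the whole permutation token by token.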
## Model Description
- Model type: Zephyr-7B-β fine-tuned on listwise reranking data.
- Language(s) (NLP): English
- License: MIT
- Finetuned from model: HuggingFaceH4/zephyr-7b-beta
## Model Sources
- Repository: https://github.com/gangiswag/llm-reranker
- Paper: https://arxiv.org/abs/2406.15657
## Evaluations
At the time of release, FIRST outperformed other LLM rerankers across a variety of reranking datasets. The table below compares nDCG@10 against other LLM rerankers on the BEIR benchmark (more details can be found in the paper).
| Reranker | Training Data | Avg. | Climate-FEVER | DBPedia | FEVER | FiQA | HotpotQA | MS MARCO | NFCorpus | NQ | SciDocs | SciFact | TREC-COVID |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RankVicuna | GPT-3.5 | 50.7 | 28.2 | 50.0 | 81.0 | 35.9 | 73.5 | 36.7 | 33.1 | 58.6 | 18.4 | 70.5 | 71.3 |
| RankZephyr | GPT-3.5 + GPT-4 | 53.7 | 25.6 | 50.0 | 80.1 | 42.2 | 71.6 | 42.7 | 37.7 | 65.6 | 20.5 | 76.7 | 78.4 |
| FIRST | GPT-4 | 54.3 | 26.7 | 50.9 | 81.7 | 42.2 | 74.2 | 44.4 | 37.4 | 66.4 | 20.4 | 74.6 | 78.8 |
## Citation
If you find FIRST useful for your work, please consider citing our paper:
@article{reddy2024first,
  title={FIRST: Faster Improved Listwise Reranking with Single Token Decoding},
  author={Reddy, Revanth Gangi and Doo, JaeHyeok and Xu, Yifei and Sultan, Md Arafat and Swain, Deevya and Sil, Avirup and Ji, Heng},
  journal={arXiv preprint arXiv:2406.15657},
  year={2024}
}