# Model Card for FIRST
FIRST is a language model trained for listwise reranking that leverages the output logits of the first generated identifier to directly produce a ranked ordering of candidates. Built on the Zephyr-7B-β model, FIRST undergoes single-stage fine-tuning on a converted alphabetic version of the RankZephyr dataset (i.e., RankGPT-4 reorderings of OpenAI's Ada2 orderings for 5k queries). More details can be found in the paper.
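As a rough illustration of the single-token scheme, the sketch below scores every candidate passage by the logit its alphabetic identifier receives at the first decoding step. This is a hedged, minimal sketch, not the official inference code: the prompt template is hypothetical, the identifier tokenization is simplified, and the model id is the base checkpoint from this card; the exact versions of all three live in the repository linked under Model Sources.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: the base model id from this card; swap in the FIRST
# checkpoint you actually want to rerank with.
MODEL_ID = "HuggingFaceH4/zephyr-7b-beta"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

def rerank(query: str, passages: list[str]) -> list[int]:
    """Order passage indices by the logit their identifier receives at the
    first decoding step (a single forward pass, no autoregressive loop)."""
    # Label candidates A, B, C, ... mirroring the alphabetic RankZephyr data.
    idents = [chr(ord("A") + i) for i in range(len(passages))]
    listing = "\n".join(f"[{c}] {p}" for c, p in zip(idents, passages))
    # Hypothetical prompt; the template used in training is in the repo.
    prompt = (
        "Rank the following passages by relevance to the query.\n"
        f"Query: {query}\n{listing}\nRanking: ["
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        next_logits = model(**inputs).logits[0, -1]  # logits for the next token
    # Score each candidate by the logit of its identifier's first sub-token
    # (simplification: assumes each letter maps to a single token).
    scores = [
        next_logits[tokenizer(c, add_special_tokens=False).input_ids[0]].item()
        for c in idents
    ]
    return sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)

order = rerank(
    "what causes ocean tides",
    ["Tides are driven mainly by the Moon's gravity.",
     "A recipe for sourdough bread."],
)
print(order)  # e.g. [0, 1] if the first passage is judged more relevant
```

Sorting the full logit distribution over identifiers yields the complete ranking from one forward pass, which is what makes this approach faster than rerankers that decode the whole permutation token by token.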
## Model Description
- Model type: Zephyr-7B-β fine-tuned on listwise reranking data.
- Language(s) (NLP): English
- License: MIT
- Finetuned from model: HuggingFaceH4/zephyr-7b-beta
## Model Sources
- Repository: https://github.com/gangiswag/llm-reranker
- Paper: https://arxiv.org/abs/2406.15657
## Evaluations
At the time of release, FIRST outperformed other LLM rerankers across a variety of reranking datasets. The table below compares nDCG@10 against other LLM rerankers on the BEIR benchmark (more details can be found in the paper).
| Reranker | Training Data | Avg. | Climate-FEVER | DBPedia | FEVER | FiQA | HotpotQA | MS MARCO | NFCorpus | NQ | SciDocs | SciFact | TREC-COVID |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RankVicuna | GPT-3.5 | 50.7 | 28.2 | 50.0 | 81.0 | 35.9 | 73.5 | 36.7 | 33.1 | 58.6 | 18.4 | 70.5 | 71.3 |
| RankZephyr | GPT-3.5 + GPT-4 | 53.7 | 25.6 | 50.0 | 80.1 | 42.2 | 71.6 | 42.7 | 37.7 | 65.6 | 20.5 | 76.7 | 78.4 |
| FIRST | GPT-4 | 54.3 | 26.7 | 50.9 | 81.7 | 42.2 | 74.2 | 44.4 | 37.4 | 66.4 | 20.4 | 74.6 | 78.8 |
## Citation
If you find FIRST useful for your work, please consider citing our paper:
@article{reddy2024first,
  title={FIRST: Faster Improved Listwise Reranking with Single Token Decoding},
  author={Reddy, Revanth Gangi and Doo, JaeHyeok and Xu, Yifei and Sultan, Md Arafat and Swain, Deevya and Sil, Avirup and Ji, Heng},
  journal={arXiv preprint arXiv:2406.15657},
  year={2024}
}