|
--- |
|
license: apache-2.0 |
|
--- |
|
|
|
|
|
# Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language models |
|
|
|
> [Tian Yu](https://tianyu0313.github.io/), [Shaolei Zhang](https://zhangshaolei1998.github.io/), and [Yang Feng](https://people.ucas.edu.cn/~yangfeng?language=en)* |
|
|
|
|
|
## Model Details |
|
|
|
<!-- Provide a longer summary of what this model is. --> |
|
|
|
|
|
- **Discription:** This is Auto-RAG model trained with synthesized iterative retrieval instruction data. Details can be found in our paper. |
|
- **Developed by:** ICTNLP Group. Authors: Tian Yu, Shaolei Zhang and Yang Feng. |
|
- **Github Repository:** https://github.com/ictnlp/Auto-RAG |
|
- **Paper Link:** https://arxiv.org/abs/2411.19443 |
|
- **Finetuned from model:** Meta-Llama3-8B-Instruct |
|
|
|
|
|
## Uses |
|
|
|
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. --> |
|
|
|
You can directly deploy the model using vllm, such as: |
|
``` |
|
CUDA_VISIBLE_DEVICES=6,7 python -m vllm.entrypoints.openai.api_server \ |
|
--model PATH_TO_MODEL\ |
|
--gpu-memory-utilization 0.9 \ |
|
-tp 2 \ |
|
--max-model-len 8192\ |
|
--port 8000\ |
|
--host 0.0.0.0 |
|
``` |
|
|
|
## Citation |
|
|
|
``` |
|
@article{yu2024autorag, |
|
title={Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models}, |
|
author={Tian Yu and Shaolei Zhang and Yang Feng}, |
|
year={2024}, |
|
eprint={2411.19443}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.CL}, |
|
url={https://arxiv.org/abs/2411.19443}, |
|
} |
|
``` |
|
|
|
|