|
--- |
|
license: apache-2.0 |
|
--- |
|
|
|
|
|
# Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models
|
|
|
> [Tian Yu](https://tianyu0313.github.io/), [Shaolei Zhang](https://zhangshaolei1998.github.io/), and [Yang Feng](https://people.ucas.edu.cn/~yangfeng?language=en)* |
|
|
|
|
|
## Model Details |
|
|
|
|
|
|
|
|
- **Description:** These are the LoRA adapter weights obtained by fine-tuning on synthesized iterative retrieval instruction data. Details can be found in our paper.
|
- **Developed by:** ICTNLP Group. Authors: Tian Yu, Shaolei Zhang and Yang Feng. |
|
- **GitHub Repository:** https://github.com/ictnlp/Auto-RAG
|
- **Finetuned from model:** Meta-Llama3-8B-Instruct |
|
|
|
|
|
## Uses |
|
|
|
|
|
|
To use Auto-RAG, first merge the Meta-Llama3-8B-Instruct base weights with the adapter weights:
|
|
|
```python
from peft import PeftModel
from transformers import AutoTokenizer, LlamaForCausalLM

# Set these to your local paths:
#   PATH_TO_META_LLAMA3_8B_INSTRUCT - the base Meta-Llama3-8B-Instruct checkpoint
#   PATH_TO_ADAPTER                 - this LoRA adapter
#   SAVE_PATH                       - output directory for the merged model

# Load the base model on CPU.
model = LlamaForCausalLM.from_pretrained(
    PATH_TO_META_LLAMA3_8B_INSTRUCT,
    device_map="cpu",
)

# Attach the Auto-RAG LoRA adapter.
model = PeftModel.from_pretrained(model, PATH_TO_ADAPTER)

# Merge the adapter into the base weights and save the standalone model.
model = model.merge_and_unload()
model.save_pretrained(SAVE_PATH)

# Save the tokenizer alongside the merged weights.
tokenizer = AutoTokenizer.from_pretrained(PATH_TO_META_LLAMA3_8B_INSTRUCT)
tokenizer.save_pretrained(SAVE_PATH)
```
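
As an optional sanity check before deployment, the merged model can be loaded back with `transformers` and queried directly. This is only a minimal sketch: the question below is an arbitrary placeholder, and `SAVE_PATH` refers to the output directory from the merge step above.

```python
import torch
from transformers import AutoTokenizer, LlamaForCausalLM

# Load the merged model and tokenizer saved in SAVE_PATH above.
tokenizer = AutoTokenizer.from_pretrained(SAVE_PATH)
model = LlamaForCausalLM.from_pretrained(
    SAVE_PATH,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Placeholder question; real Auto-RAG usage follows the iterative retrieval prompts from the paper.
messages = [{"role": "user", "content": "Who proposed the theory of relativity?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```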
|
|
|
The merged model can then be deployed with inference frameworks such as vLLM.
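
For instance, a minimal offline-inference sketch with vLLM's Python API could look like the following. The prompt and sampling parameters are placeholders, and `SAVE_PATH` is the merged-model directory from above; in practice, inputs should be formatted with the Llama-3 chat template and the Auto-RAG iterative retrieval instructions described in the paper.

```python
from vllm import LLM, SamplingParams

# Load the merged model saved in SAVE_PATH above.
llm = LLM(model=SAVE_PATH)

# Greedy decoding as a simple default; tune for your own setup.
sampling_params = SamplingParams(temperature=0.0, max_tokens=256)

# Placeholder prompt for illustration only.
outputs = llm.generate(["Who proposed the theory of relativity?"], sampling_params)
print(outputs[0].outputs[0].text)
```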
|
|
|
## Citation |
|
|
|
``` |
|
@article{yu2024autorag, |
|
title={Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models}, |
|
author={Tian Yu and Shaolei Zhang and Yang Feng}, |
|
year={2024}, |
|
eprint={2411.19443}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.CL}, |
|
url={https://arxiv.org/abs/2411.19443}, |
|
} |
|
``` |
|
|
|
|