Summary:
Retrieval-Augmented Generation (RAG) is a technique for specializing a language model in a specific knowledge domain: relevant data is retrieved and supplied to the model at query time so that it can give better-grounded answers.
How does RAG work?
1. Prepare and preprocess your input data, i.e. tokenization and vectorization.
2. Index the stored data so that the parts matching the context of a query can be retrieved.
3. Feed the retrieved data, together with the query, to the language model.
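As a rough illustration of the preprocessing, indexing, and retrieval steps listed above, here is a minimal, dependency-free sketch. It is not how llama-index works internally: a simple bag-of-words count stands in for a real embedding model, no language model is called, and all function names (`vectorize`, `build_index`, `retrieve`, `build_prompt`) and the sample chunks are hypothetical, chosen only for this example.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words vector (token -> count).

    Assumption: a real RAG system would use a learned embedding model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(chunks):
    """Indexing step: store each chunk next to its vector."""
    return [(chunk, vectorize(chunk)) for chunk in chunks]


def retrieve(index, query, top_k=1):
    """Retrieval step: return the top_k chunks most similar to the query."""
    query_vec = vectorize(query)
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(query, context_chunks):
    """Build the text that would be sent to the language model."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical sample data, made up for this sketch.
chunks = [
    "The author grew up writing short stories and programming.",
    "Paris is the capital of France.",
    "Vector stores keep embeddings for similarity search.",
]
index = build_index(chunks)
query = "What did the author do growing up?"
top_chunks = retrieve(index, query, top_k=1)
prompt = build_prompt(query, top_chunks)
print(prompt)
```

The printed prompt is what a language model would receive: the best-matching chunk as context, followed by the question. Swapping `vectorize` for a real embedding model and sending `prompt` to an LLM would turn this sketch into a very small RAG pipeline.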
Implementing RAG with llama-index
1. Load relevant data and build an index
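# llama-index 0.10 and later; older releases use: from llama_index import ...
# By default llama-index uses OpenAI for embeddings and the LLM, so OPENAI_API_KEY must be set.
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
# Read every file in the local "data" directory
documents = SimpleDirectoryReader("data").load_data()
# Embed the documents and store them in an in-memory vector index
index = VectorStoreIndex.from_documents(documents)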
2. Query your data
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
My application of RAG with ChatGPT
See RAG.ipynb