# Cassandre-RAG

Cassandre-RAG is a fine-tuned Llama-3.1-8B model built for retrieval-augmented generation (RAG) on French administrative documents, with a focus on sources from school administration.

## Training Data

The model was fine-tuned on a specialized corpus consisting of:

1. Synthetic queries: Generated from chunks of text extracted from French administrative documents.
2. Retrieved documents: For each synthetic query, relevant documents were retrieved using the BM25 ranking algorithm.
3. Generated answers: Responses to the synthetic queries were created based on the retrieved documents.

```yaml
Training Hyperparameters:
  Max Steps: 3000
  Learning Rate: 3e-4
  Batch Size: 2 per device
  Gradient Accumulation Steps: 4
  Max Sequence Length: 8192
  Weight Decay: 0.001
  Warmup Ratio: 0.03
  LR Scheduler: Linear
  Optimizer: paged_adamw_32bit

LoRA Configuration:
  LoRA Alpha: 16
  LoRA Dropout: 0.1
  LoRA R: 64
  Target Modules:
    - gate_proj
    - down_proj
    - up_proj
    - q_proj
    - v_proj
    - k_proj
    - o_proj

Quantization:
  Quantization: 4-bit
  Quantization Type: nf4
  Compute Dtype: float16
```

## Usage

Cassandre-RAG uses a custom syntax for delimiting sources and generating sourced output. Each source should be preceded by an ID wrapped in double asterisks (e.g., \*\*SOURCE_ID\*\*).

### Example Usage

```python
from vllm import LLM, SamplingParams

# Load the model
model_name = "PleIAs/Cassandre-RAG"
llm = LLM(model_name, max_model_len=8128)

# Set sampling parameters
sampling_params = SamplingParams(
    temperature=0.7,
    top_p=0.95,
    max_tokens=3000,
    presence_penalty=1.2,
    stop=["#END#"]
)

# Format the query and sources with the model's expected prompt structure:
# each source is introduced by its **SOURCE_ID** marker
def prepare_prompt(query, sources):
    sources_text = "\n\n".join([f"**{src_id}**\n{content}" for src_id, content in sources])
    return f"### Query ###\n{query}\n\n### Source ###\n{sources_text}\n\n### Analysis ###\n"

# Example query and sources
# ("What are the procedures for enrolling a child in primary school?")
query = "Quelles sont les procédures pour inscrire un enfant à l'école primaire?"
sources = [
    ("SOURCE_001", "L'inscription à l'école primaire se fait généralement à la mairie..."),
    ("SOURCE_002", "Les documents nécessaires pour l'inscription scolaire incluent..."),
]

# Prepare the prompt
prompt = prepare_prompt(query, sources)

# Generate the response
outputs = llm.generate([prompt], sampling_params)
generated_text = outputs[0].outputs[0].text

print("Query:", query)
print("\nGenerated Response:")
print(generated_text)
```
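### Parsing Sourced Output

In the generated analysis, the model refers back to the supplied sources using the same double-asterisk ID convention. Below is a minimal sketch for recovering those references; the `extract_cited_sources` helper and the exact regex are illustrative assumptions, since the model card does not pin down the output format beyond the ID convention.

```python
import re

def extract_cited_sources(generated_text, known_ids):
    """Collect the source IDs cited in the output, in order of first appearance."""
    cited = []
    for match in re.finditer(r"\*\*([A-Za-z0-9_]+)\*\*", generated_text):
        source_id = match.group(1)
        if source_id in known_ids and source_id not in cited:
            cited.append(source_id)
    return cited

# Example: check which of the supplied sources the answer actually drew on
answer = "L'inscription se fait à la mairie (**SOURCE_001**), avec les pièces listées dans **SOURCE_002**."
print(extract_cited_sources(answer, {"SOURCE_001", "SOURCE_002"}))
# ['SOURCE_001', 'SOURCE_002']
```

Filtering against `known_ids` guards against the model citing IDs that were never provided, which is worth verifying in any RAG pipeline.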
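### Reproducing the BM25 Retrieval Step

The Training Data section names BM25 but not a specific implementation. Here is a minimal sketch using the `rank_bm25` package; the package choice and whitespace tokenization are assumptions, as the actual training pipeline tooling is unspecified.

```python
from rank_bm25 import BM25Okapi

# Toy corpus of document chunks; naive whitespace tokenization keeps the sketch simple
corpus = [
    "L'inscription à l'école primaire se fait généralement à la mairie.",
    "Les documents nécessaires incluent un justificatif de domicile.",
    "Le calendrier scolaire est fixé par le ministère de l'Éducation nationale.",
]
tokenized_corpus = [doc.lower().split() for doc in corpus]
bm25 = BM25Okapi(tokenized_corpus)

# Retrieve the chunks most relevant to a (synthetic) query
# ("how to enrol a child in primary school")
query = "comment inscrire un enfant à l'école primaire"
top_docs = bm25.get_top_n(query.lower().split(), corpus, n=2)
for doc in top_docs:
    print(doc)
```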
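### Reproducing the Fine-Tuning Configuration

The hyperparameter listing above maps naturally onto the Hugging Face `transformers`, `peft`, and `bitsandbytes` APIs. The sketch below shows that mapping; the training stack itself is an assumption, since the card does not name it, and the output directory is illustrative.

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 quantization with float16 compute, as listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapter configuration, as listed above
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=[
        "gate_proj", "down_proj", "up_proj",
        "q_proj", "v_proj", "k_proj", "o_proj",
    ],
    task_type="CAUSAL_LM",
)

# Optimizer and schedule, as listed above
training_args = TrainingArguments(
    output_dir="cassandre-rag-lora",  # illustrative path, not from the card
    max_steps=3000,
    learning_rate=3e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    weight_decay=0.001,
    warmup_ratio=0.03,
    lr_scheduler_type="linear",
    optim="paged_adamw_32bit",
)
# The max sequence length (8192) would be passed to the trainer itself,
# e.g. max_seq_length in TRL's SFTTrainer (an assumption about the stack).
```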