# Cassandre-RAG
Cassandre-RAG is a fine-tuned Llama 3.1 8B model built for retrieval-augmented generation (RAG) on French administrative documents, with a focus on sources from school administration.
## Training Data
The model was fine-tuned on a specialized corpus consisting of:
- Synthetic queries: Generated from chunks of text extracted from French administrative documents.
- Retrieved documents: For each synthetic query, relevant documents were retrieved using the BM25 ranking algorithm (see the retrieval sketch after this list).
- Generated answers: Responses to the synthetic queries were created based on the retrieved documents.
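For illustration, here is a minimal sketch of what the BM25 retrieval step could look like, using the `rank_bm25` library. The corpus, chunking, and tokenization below are assumptions for the example, not the actual training pipeline:

```python
from rank_bm25 import BM25Okapi

# Hypothetical pre-chunked administrative corpus (illustrative only)
chunks = [
    "L'inscription à l'école primaire se fait généralement à la mairie...",
    "Les documents nécessaires pour l'inscription scolaire incluent...",
    "Les horaires d'ouverture du service scolarité sont...",
]

# BM25 works on tokenized text; plain whitespace tokenization for the sketch
bm25 = BM25Okapi([chunk.lower().split() for chunk in chunks])

# Retrieve the top-k chunks most relevant to a synthetic query
synthetic_query = "Comment inscrire un enfant à l'école primaire ?"
top_chunks = bm25.get_top_n(synthetic_query.lower().split(), chunks, n=2)
```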
## Training Hyperparameters
- Max Steps: 3000
- Learning Rate: 3e-4
- Batch Size: 2 per device
- Gradient Accumulation Steps: 4
- Max Sequence Length: 8192
- Weight Decay: 0.001
- Warmup Ratio: 0.03
- LR Scheduler: Linear
- Optimizer: paged_adamw_32bit
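These values map directly onto the Hugging Face `TrainingArguments` API. A hedged sketch (the actual training script is not reproduced here, and `output_dir` is a placeholder):

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above
training_args = TrainingArguments(
    output_dir="./cassandre-rag",  # placeholder
    max_steps=3000,
    learning_rate=3e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    weight_decay=0.001,
    warmup_ratio=0.03,
    lr_scheduler_type="linear",
    optim="paged_adamw_32bit",
)
# The max sequence length (8192) is typically passed to the trainer itself
# (e.g. trl's SFTTrainer) rather than to TrainingArguments.
```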
## LoRA Configuration
- LoRA Alpha: 16
- LoRA Dropout: 0.1
- LoRA R: 64
- Target Modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
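In `peft` terms this corresponds to a `LoraConfig` along these lines; a sketch, where `bias` and `task_type` are assumed defaults for causal-LM LoRA fine-tuning rather than documented choices:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=[
        "gate_proj", "down_proj", "up_proj",
        "q_proj", "v_proj", "k_proj", "o_proj",
    ],
    bias="none",            # assumption: adapter biases not trained
    task_type="CAUSAL_LM",  # assumption: standard causal-LM setup
)
```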
## Quantization
- Quantization: 4-bit
- Quantization Type: nf4
- Compute Dtype: float16
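These settings match the `bitsandbytes` 4-bit loading path in `transformers`; a minimal sketch of the corresponding `BitsAndBytesConfig`:

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
```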
## Usage
Cassandre-RAG uses a custom syntax for parsing sources and generating sourced output.
Each source should be preceded by an ID wrapped in double asterisks (e.g., `**SOURCE_ID**`).
### Example Usage
```python
from vllm import LLM, SamplingParams

# Load the model
model_name = "PleIAs/Cassandre-RAG"
llm = LLM(model_name, max_model_len=8128)

# Set sampling parameters
sampling_params = SamplingParams(
    temperature=0.7,
    top_p=0.95,
    max_tokens=3000,
    presence_penalty=1.2,
    stop=["#END#"],
)

# Format the query and sources with the model's prompt syntax:
# each source is preceded by its ID wrapped in double asterisks
def prepare_prompt(query, sources):
    sources_text = "\n\n".join(f"**{src_id}**\n{content}" for src_id, content in sources)
    return f"### Query ###\n{query}\n\n### Source ###\n{sources_text}\n\n### Analysis ###\n"

# Example query and sources
query = "Quelles sont les procédures pour inscrire un enfant à l'école primaire?"
sources = [
    ("SOURCE_001", "L'inscription à l'école primaire se fait généralement à la mairie..."),
    ("SOURCE_002", "Les documents nécessaires pour l'inscription scolaire incluent..."),
]

# Build the prompt and generate the response
prompt = prepare_prompt(query, sources)
outputs = llm.generate([prompt], sampling_params)
generated_text = outputs[0].outputs[0].text

print("Query:", query)
print("\nGenerated Response:")
print(generated_text)
```
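Because the model is trained on the same double-asterisk source markers, cited sources can be pulled back out of the answer. A sketch, assuming the output cites sources as `**SOURCE_ID**` (the exact citation format in generated text may vary):

```python
import re

# Collect the source IDs the model cited in its answer
cited_ids = set(re.findall(r"\*\*(SOURCE_\d+)\*\*", generated_text))
print("Cited sources:", sorted(cited_ids))
```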