# Cassandre-RAG

Cassandre-RAG is a fine-tuned Llama-3.1-8B model built for retrieval-augmented generation (RAG) on French administrative documents, with a focus on sources from school administration.

## Training Data

The model was fine-tuned on a specialized corpus consisting of:

1. Synthetic queries: Generated from chunks of text extracted from French administrative documents.
2. Retrieved documents: For each synthetic query, relevant documents were retrieved using the BM25 ranking algorithm (see the sketch after this list).
3. Generated answers: Responses to the synthetic queries were created based on the retrieved documents.
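For illustration, the retrieval step in (2) can be reproduced with a BM25 implementation such as the `rank_bm25` package. The corpus below is a hypothetical stand-in, since the actual document collection and tokenization used for data generation are not specified here:

```python
from rank_bm25 import BM25Okapi

# Hypothetical chunks of French administrative text
corpus = [
    "L'inscription à l'école primaire se fait généralement à la mairie de la commune de résidence.",
    "Les documents nécessaires pour l'inscription scolaire incluent un justificatif de domicile.",
    "Le calendrier scolaire est fixé par le ministère de l'Éducation nationale.",
]

# BM25 operates on tokenized documents; whitespace tokenization keeps the sketch simple
tokenized_corpus = [chunk.lower().split() for chunk in corpus]
bm25 = BM25Okapi(tokenized_corpus)

# Retrieve the top-2 chunks for one synthetic query
query = "comment inscrire un enfant à l'école primaire"
top_chunks = bm25.get_top_n(query.lower().split(), corpus, n=2)
print(top_chunks)
```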
The fine-tuning used the following configuration:

```yaml
Training Hyperparameters:
  Max Steps: 3000
  Learning Rate: 3e-4
  Batch Size: 2 per device
  Gradient Accumulation Steps: 4
  Max Sequence Length: 8192
  Weight Decay: 0.001
  Warmup Ratio: 0.03
  LR Scheduler: Linear
  Optimizer: paged_adamw_32bit

LoRA Configuration:
  LoRA Alpha: 16
  LoRA Dropout: 0.1
  LoRA R: 64
  Target Modules:
    - gate_proj
    - down_proj
    - up_proj
    - q_proj
    - v_proj
    - k_proj
    - o_proj

Quantization:
  Quantization: 4-bit
  Quantization Type: nf4
  Compute Dtype: float16
```
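This configuration maps onto the usual Hugging Face `transformers`/`peft`/`bitsandbytes` objects roughly as follows. This is a minimal sketch rather than the original training script, and the output directory is a placeholder:

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 quantization with float16 compute, as listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapters on all attention and MLP projections
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["gate_proj", "down_proj", "up_proj",
                    "q_proj", "v_proj", "k_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Trainer arguments mirroring the hyperparameters above
training_args = TrainingArguments(
    output_dir="./cassandre-rag",  # placeholder path
    max_steps=3000,
    learning_rate=3e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    weight_decay=0.001,
    warmup_ratio=0.03,
    lr_scheduler_type="linear",
    optim="paged_adamw_32bit",
)
```

The 8192-token maximum sequence length would be enforced at the trainer or tokenizer level, e.g. via `max_seq_length` when using TRL's `SFTTrainer`/`SFTConfig`.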
## Usage

Cassandre-RAG uses a custom syntax for parsing sources and generating sourced output. Each source should be preceded by an identifier wrapped in double asterisks (e.g., `**SOURCE_ID**`).
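Concretely, a prompt interleaves the query and the tagged sources in the layout that `prepare_prompt` builds in the example below:

```
### Query ###
Quelles sont les procédures pour inscrire un enfant à l'école primaire?

### Source ###
**SOURCE_001**
L'inscription à l'école primaire se fait généralement à la mairie...

### Analysis ###
```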
### Example Usage

```python
from vllm import LLM, SamplingParams

# Load the model
model_name = "PleIAs/Cassandre-RAG"
llm = LLM(model_name, max_model_len=8192)  # context window aligned with the 8192-token training length

# Set sampling parameters
sampling_params = SamplingParams(
    temperature=0.7,
    top_p=0.95,
    max_tokens=3000,
    presence_penalty=1.2,
    stop=["#END#"]
)

# Prepare the input data
def prepare_prompt(query, sources):
    sources_text = "\n\n".join([f"**{src_id}**\n{content}" for src_id, content in sources])
    return f"### Query ###\n{query}\n\n### Source ###\n{sources_text}\n\n### Analysis ###\n"

# Example query and sources
query = "Quelles sont les procédures pour inscrire un enfant à l'école primaire?"
sources = [
    ("SOURCE_001", "L'inscription à l'école primaire se fait généralement à la mairie..."),
    ("SOURCE_002", "Les documents nécessaires pour l'inscription scolaire incluent..."),
]

# Prepare the prompt
prompt = prepare_prompt(query, sources)

# Generate the response
outputs = llm.generate([prompt], sampling_params)
generated_text = outputs[0].outputs[0].text

print("Query:", query)
print("\nGenerated Response:")
print(generated_text)
```
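
Because cited identifiers keep their double-asterisk markers in the analysis, a small post-processing step can list which sources the answer actually cites. Continuing the example above, and assuming the model preserves the `**SOURCE_ID**` convention in its output:

```python
import re

# Collect the source IDs cited in the generated analysis
cited_ids = sorted(set(re.findall(r"\*\*(SOURCE_\w+)\*\*", generated_text)))
print("Cited sources:", cited_ids)
```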