sunitha-ravi committed on
Commit 231c50c · verified · 1 Parent(s): f164ec3

Update README.md

Files changed (1)
  1. README.md +8 -10
README.md CHANGED
@@ -11,7 +11,7 @@ tags:
 
 # Model Card for Model ID
 
-Lynx is an open-source faithfulness evaluation model. Patronus-Lynx-8B-Instruct was trained on a mix of datasets such as CovidQA, PubmedQA, DROP, FinanceBench.
+Lynx is an open-source hallucination evaluation model. Patronus-Lynx-8B-Instruct was trained on a mix of datasets such as CovidQA, PubmedQA, DROP, FinanceBench.
 The datasets contain a mix of hand-annotated and synthetic data. The maximum sequence length is 8000 tokens.
 
 
@@ -20,14 +20,14 @@ The datasets contain a mix of hand-annotated and synthetic data. The maximum seq
 - **Model Type:** Patronus-Lynx-8B-Instruct is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct model.
 - **Language:** Primarily English
 - **Developed by:** Patronus AI
-- **License:** [More Information Needed]
+- **License:** [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)
 
 ### Model Sources [optional]
 
 <!-- Provide the basic links for the model. -->
 
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
+- **Repository:** [https://github.com/patronus-ai/Lynx-hallucination-detection](https://github.com/patronus-ai/Lynx-hallucination-detection)
+
 
 ## How to Get Started with the Model
 The model is fine-tuned to be used to detect faithfulness in a RAG setting. Provided a document, question and answer, the model can evaluate whether the answer is faithful to the document.
@@ -66,17 +66,15 @@ The model was finetuned for 3 epochs using H100s on dataset of size 2400. We use
 
 ### Training Data
 
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
-[More Information Needed]
+We train on 2400 samples consisting of CovidQA, PubmedQA, DROP and RAGTruth samples. For datasets that do not contain hallucinated samples, we generate perturbations to introduce hallucinations in the data. For more details about the data generation process, refer to the paper.
 
+The training data can be found here: [PatronusAI/drop-RAGTruth-covidqa-pubmed](https://huggingface.co/datasets/PatronusAI/drop-RAGTruth-covidqa-pubmed)
 
 ## Evaluation
 
-The model was evaluated on [PatronusAI/hallucination-evaluation-benchmark](https://huggingface.co/datasets/PatronusAI/hallucination-evaluation-benchmark)
-
-<!-- This section describes the evaluation protocols and provides the results. -->
+The model was evaluated on [PatronusAI/hallucination-evaluation-benchmark](https://huggingface.co/datasets/PatronusAI/hallucination-evaluation-benchmark).
 
+It outperforms GPT-3.5-Turbo, GPT-4-Turbo, GPT-4o and Claude Sonnet.
 
 ## Citation [optional]
 
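
The "How to Get Started with the Model" section kept in this revision describes passing a document, question and answer to the model and asking whether the answer is faithful to the document. Below is a minimal sketch of one way to do that with Hugging Face transformers; it is not the official usage snippet. The repository ID, the prompt wording, and the generation settings are assumptions for illustration, so check the model card for the exact prompt template Lynx expects.

```python
# Sketch only: querying a Lynx checkpoint for RAG faithfulness.
# The repo ID and prompt below are illustrative assumptions, not the official template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PatronusAI/Patronus-Lynx-8B-Instruct"  # assumed repo ID; verify on the Hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

document = "Paris is the capital of France."
question = "What is the capital of France?"
answer = "The capital of France is Berlin."

# Illustrative faithfulness prompt: supply the document, question and answer,
# and ask whether the answer is supported by the document.
prompt = (
    "Given the following QUESTION, DOCUMENT and ANSWER, determine whether the ANSWER is "
    "faithful to the DOCUMENT. Respond with PASS if it is faithful and FAIL otherwise.\n\n"
    f"QUESTION: {question}\n\nDOCUMENT: {document}\n\nANSWER: {answer}"
)

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```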
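
The training mix and the evaluation benchmark referenced in the diff are hosted as Hugging Face datasets. A short sketch of pulling them with the `datasets` library follows; the split and column layout are assumptions, so inspect each dataset card before relying on specific fields.

```python
# Sketch: loading the datasets named in the README diff.
# Configuration/split names are assumptions; check the dataset cards for the real schema.
from datasets import load_dataset

# Training mix (CovidQA, PubmedQA, DROP and RAGTruth, with synthetic hallucination perturbations).
train_data = load_dataset("PatronusAI/drop-RAGTruth-covidqa-pubmed")

# Benchmark used for evaluation.
eval_data = load_dataset("PatronusAI/hallucination-evaluation-benchmark")

print(train_data)
print(eval_data)
```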