sunitha-ravi committed on
Commit 231c50c · verified · 1 Parent(s): f164ec3

Update README.md

Files changed (1)
  1. README.md +8 -10
README.md CHANGED
@@ -11,7 +11,7 @@ tags:
 
 # Model Card for Model ID
 
-Lynx is an open-source faithfulness evaluation model. Patronus-Lynx-8B-Instruct was trained on a mix of datasets such as CovidQA, PubmedQA, DROP, FinanceBench.
+Lynx is an open-source hallucination evaluation model. Patronus-Lynx-8B-Instruct was trained on a mix of datasets such as CovidQA, PubmedQA, DROP, FinanceBench.
 The datasets contain a mix of hand-annotated and synthetic data. The maximum sequence length is 8000 tokens.
 
 
@@ -20,14 +20,14 @@ The datasets contain a mix of hand-annotated and synthetic data. The maximum seq
 - **Model Type:** Patronus-Lynx-8B-Instruct is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct model.
 - **Language:** Primarily English
 - **Developed by:** Patronus AI
-- **License:** [More Information Needed]
+- **License:** [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)
 
 ### Model Sources [optional]
 
 <!-- Provide the basic links for the model. -->
 
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
+- **Repository:** [https://github.com/patronus-ai/Lynx-hallucination-detection](https://github.com/patronus-ai/Lynx-hallucination-detection)
+
 
 ## How to Get Started with the Model
 The model is fine-tuned to be used to detect faithfulness in a RAG setting. Provided a document, question and answer, the model can evaluate whether the answer is faithful to the document.
@@ -66,17 +66,15 @@ The model was finetuned for 3 epochs using H100s on dataset of size 2400. We use
 
 ### Training Data
 
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
-[More Information Needed]
+We train on 2400 samples consisting of CovidQA, PubmedQA, DROP and RAGTruth samples. For datasets that do not contain hallucinated samples, we generate perturbations to introduce hallucinations in the data. For more details about the data generation process, refer to the paper.
 
+The training data can be found here: [PatronusAI/drop-RAGTruth-covidqa-pubmed](https://huggingface.co/datasets/PatronusAI/drop-RAGTruth-covidqa-pubmed)
 
 ## Evaluation
 
-The model was evaluated on [PatronusAI/hallucination-evaluation-benchmark](https://huggingface.co/datasets/PatronusAI/hallucination-evaluation-benchmark)
-
-<!-- This section describes the evaluation protocols and provides the results. -->
+The model was evaluated on [PatronusAI/hallucination-evaluation-benchmark](https://huggingface.co/datasets/PatronusAI/hallucination-evaluation-benchmark).
 
+It outperforms GPT-3.5-Turbo, GPT-4-Turbo, GPT-4o and Claude Sonnet.
 
 ## Citation [optional]
 
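
The "How to Get Started with the Model" section kept in this revision describes passing a document, question and answer to the model and asking whether the answer is faithful to the document. Below is a minimal sketch of one way to do that with Hugging Face transformers; it is not the official usage snippet. The repository ID, the prompt wording, and the generation settings are assumptions for illustration, so check the model card for the exact prompt template Lynx expects.

```python
# Sketch only: querying a Lynx checkpoint for RAG faithfulness.
# The repo ID and prompt below are illustrative assumptions, not the official template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PatronusAI/Patronus-Lynx-8B-Instruct"  # assumed repo ID; verify on the Hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

document = "Paris is the capital of France."
question = "What is the capital of France?"
answer = "The capital of France is Berlin."

# Illustrative faithfulness prompt: supply the document, question and answer,
# and ask whether the answer is supported by the document.
prompt = (
    "Given the following QUESTION, DOCUMENT and ANSWER, determine whether the ANSWER is "
    "faithful to the DOCUMENT. Respond with PASS if it is faithful and FAIL otherwise.\n\n"
    f"QUESTION: {question}\n\nDOCUMENT: {document}\n\nANSWER: {answer}"
)

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```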
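
The training mix and the evaluation benchmark referenced in the diff are hosted as Hugging Face datasets. A short sketch of pulling them with the `datasets` library follows; the split and column layout are assumptions, so inspect each dataset card before relying on specific fields.

```python
# Sketch: loading the datasets named in the README diff.
# Configuration/split names are assumptions; check the dataset cards for the real schema.
from datasets import load_dataset

# Training mix (CovidQA, PubmedQA, DROP and RAGTruth, with synthetic hallucination perturbations).
train_data = load_dataset("PatronusAI/drop-RAGTruth-covidqa-pubmed")

# Benchmark used for evaluation.
eval_data = load_dataset("PatronusAI/hallucination-evaluation-benchmark")

print(train_data)
print(eval_data)
```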