Update README.md
--- a/README.md
+++ b/README.md
@@ -17,6 +17,8 @@ tags:
 - mistral
 - salad-bench
 - evluation
+- judge
+pipeline_tag: text-generation
 ---
 # MD-Judge for Salad-Bench
 
@@ -25,16 +27,16 @@ tags:
 
 MD-Judge is a LLM-based safetyguard, fine-tund on top of [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1). MD-Judge serves as a classifier to evaluate the safety of QA pairs.
 
-MD-Judge was born to study the safety of different LLMs serving as an general evaluation tool, which is proposed under the
-
-- **
-- **
+MD-Judge was born to study the safety of different LLMs serving as an general evaluation tool, which is proposed under the 🥗SALAD-Bench. You can check the following source for more information:
+- [**Paper**](https://arxiv.org/abs/2402.02416)
+- [**Code**](https://github.com/OpenSafetyLab/SALAD-BENCH)
+- [**Data**](https://huggingface.co/datasets/OpenSafetyLab/Salad-Data)
+- [**Project Page**](https://adwardlee.github.io/salad_bench/)
 
 ## Model Sources
 
-- **Repository:** [SALAD-Bench Github](
-- **Paper:** [SALAD-BENCH](
-
+- **Repository:** [SALAD-Bench Github]()
+- **Paper:** [SALAD-BENCH]()
 ## Model Performance
 
 Compare our MD-Judge model with other methods on different public safety testsets using QA format. All the model-based methods are evaluated using the same safety proxy template.
@@ -122,5 +124,4 @@ Please refer to our [Github](https://github.com/OpenSafetyLab/SALAD-BENCH) for m
 archivePrefix={arXiv},
 primaryClass={cs.CL}
 }
-```
-
+```