Update README.md
README.md
CHANGED
@@ -25,7 +25,7 @@ tags:
 
 MD-Judge is a LLM-based safetyguard, fine-tund on top of [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1). MD-Judge serves as a classifier to evaluate the safety of QA pairs.
 
-MD-Judge was born to study the safety of different LLMs serving as an general evaluation tool, which is proposed under the [SALAD-Bench paper]()
+MD-Judge was born to study the safety of different LLMs serving as an general evaluation tool, which is proposed under the [SALAD-Bench paper](https://arxiv.org/abs/2402.02416)
 
 - **Developed by:** The SALAD-Bench Team
 - **Model type:** An auto-regressive language model based on the transformer architecture.
@@ -33,8 +33,7 @@ MD-Judge was born to study the safety of different LLMs serving as an general ev
 ## Model Sources
 
 - **Repository:** [SALAD-Bench Github](https://github.com/OpenSafetyLab/SALAD-BENCH)
-- **
-- **Paper:** Coming soon
+- **Paper:** [SALAD-BENCH](https://arxiv.org/abs/2402.02416)
 
 ## Uses
 ```python
@@ -96,5 +95,13 @@ Please refer to our [Github](https://github.com/OpenSafetyLab/SALAD-BENCH) for m
 ## Citation
 
 ```bibtex
+@misc{li2024saladbench,
+title={SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models},
+author={Lijun Li and Bowen Dong and Ruohui Wang and Xuhao Hu and Wangmeng Zuo and Dahua Lin and Yu Qiao and Jing Shao},
+year={2024},
+eprint={2402.05044},
+archivePrefix={arXiv},
+primaryClass={cs.CL}
+}
 ```
 