Update README.md
README.md CHANGED
@@ -43,7 +43,7 @@ Compare our MD-Judge model with other methods on different public safety testsets
 - GPT-4: https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo
 - LlamaGuard: https://huggingface.co/meta-llama/LlamaGuard-7b
 
-| **Methods** | **Base** | **Enhance** | **ToxicChat** | **
+| **Methods** | **Base** | **Enhance** | **ToxicChat** | **Beavertails** | **SafeRLHF** |
 |-------------|----------|-------------|--------|------------|--------------|
 | Keyword | 0.058 | 0.261 | 0.193 | 0.012 | 0.015 |
 | LlamaGuard | 0.585 | 0.085 | 0.220 | 0.653 | 0.693 |
@@ -51,7 +51,7 @@ Compare our MD-Judge model with other methods on different public safety testsets
 | GPT-4 | 0.785 | 0.827 | 0.470 | 0.842 | 0.835 |
 | MD-Judge | **0.818**| **0.873** | **0.644** | **0.866** | **0.864** |
 
-> Comparison of F1 scores between our model and other leading methods. Best results are **bolded** and second best are *underlined*. Base and Enhance indicate our SALAD-Base-Test and SALAD-Enhance-Test
+> Comparison of F1 scores between our model and other leading methods. Best results are **bolded** and second best are *underlined*. Base and Enhance indicate our SALAD-Base-Test and SALAD-Enhance-Test.
 
 
 ## Uses