danielz02 committed 8eb0e38
Parent(s): 9f3155e

Modify about info

Files changed:
- README.md +1 -1
- src/display/about.py +21 -5
README.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-title:
+title: Trustworthy LLM Leaderboard
 emoji: 🥇
 colorFrom: green
 colorTo: indigo
src/display/about.py
CHANGED
@@ -23,19 +23,35 @@ class Tasks(Enum):
 
 
 # Your leaderboard name
-TITLE = """<h1 align="center" id="space-title">
+TITLE = """<h1 align="center" id="space-title">Trustworthy LLM leaderboard</h1>"""
 
 # What does your leaderboard evaluate?
-INTRODUCTION_TEXT = """
-limitations, and potential risks associated with deploying these state-of-the-art Large
-our paper for details."""
+INTRODUCTION_TEXT = """Powered by DecodingTrust, this research endeavor is designed to help researchers better
+understand the capabilities, limitations, and potential risks associated with deploying these state-of-the-art Large
+Language Models (LLMs). See our paper for details."""
 
 # Which evaluations are you running? how can people reproduce what you have?
 LLM_BENCHMARKS_TEXT = f"""
 ## How it works
 
+DecodingTrust aims at providing a thorough assessment of trustworthiness in GPT models.
+
+This research endeavor is designed to help researchers and practitioners better understand the capabilities,
+limitations, and potential risks involved in deploying these state-of-the-art Large Language Models (LLMs).
+
+This project is organized around the following eight primary perspectives of trustworthiness, including:
+
+Toxicity
+Stereotype and bias
+Adversarial robustness
+Out-of-Distribution Robustness
+Privacy
+Robustness to Adversarial Demonstrations
+Machine Ethics
+Fairness
+
 ## Reproducibility
-To reproduce our results,
+To reproduce our results, checkout https://github.com/AI-secure/DecodingTrust
 
 """
 
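For context, the three constants edited here (TITLE, INTRODUCTION_TEXT, LLM_BENCHMARKS_TEXT) are display strings consumed by the Space's UI. Below is a minimal sketch of how such constants are typically rendered in a Gradio leaderboard app; the app file is not part of this commit, so the import path, the gr.HTML/gr.Markdown split, and the accordion label are assumptions, not the Space's actual code.

```python
# Hypothetical usage sketch (not part of this commit): rendering the
# constants defined in src/display/about.py in a Gradio Blocks app.
import gradio as gr

from src.display.about import TITLE, INTRODUCTION_TEXT, LLM_BENCHMARKS_TEXT

with gr.Blocks() as demo:
    # TITLE is raw HTML, so render it with gr.HTML rather than gr.Markdown.
    gr.HTML(TITLE)
    # INTRODUCTION_TEXT and LLM_BENCHMARKS_TEXT are Markdown strings.
    gr.Markdown(INTRODUCTION_TEXT)
    # Accordion label is an assumption; collapsed by default to keep the page short.
    with gr.Accordion("About the benchmark", open=False):
        gr.Markdown(LLM_BENCHMARKS_TEXT)

if __name__ == "__main__":
    demo.launch()
```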