Zekun Wu committed on
Commit 9b04c57 · 1 Parent(s): ef3367f
add
util/evaluator.py +11 -10
util/evaluator.py CHANGED
@@ -32,23 +32,23 @@ class evaluator:

 Factually Correct:
 Definition: The explanation must be accurate and relevant to the question and the subject matter.
-Score: (0-

 Useful:
 Definition: The explanation should enable the user to understand the answer better and should facilitate further reasoning or decision-making.
-Score: (0-

 Context Specific:
 Definition: The explanation should be relevant to the specific context or scenario implied by the question.
-Score: (0-

 User Specific:
 Definition: The explanation should cater to the knowledge level and interests of the user, assuming typical or specified user characteristics.
-Score: (0-

 Provides Pluralism:
 Definition: The explanation should offer or accommodate multiple viewpoints or interpretations, allowing the user to explore various perspectives.
-Score: (0-

 After evaluating the provided question and explanation based on the five principles, please format your scores and justifications in a JSON dictionary. Directly provide me with the JSON without any additional text.

@@ -56,23 +56,23 @@ class evaluator:
 {{
     "Factually Correct": {{
         "Justification": "The explanation is mostly accurate with only minor inaccuracies.",
-        "Score":
     }},
     "Useful": {{
         "Justification": "The explanation is very helpful in understanding the main concept.",
-        "Score":
     }},
     "Context Specific": {{
         "Justification": "The explanation is generally relevant to the specific context but lacks some detail.",
-        "Score":
     }},
     "User Specific": {{
         "Justification": "The explanation is appropriate for the typical user but may be too technical for some.",
-        "Score":
     }},
     "Provides Pluralism": {{
         "Justification": "The explanation considers multiple perspectives but could include more viewpoints.",
-        "Score":
     }}
 }}

@@ -164,6 +164,7 @@ def write_evaluation_commentary(scores):
     evaluation_details = []

     for principle, details in scores.items():
         score = details.get('Score', -1)
         justification = details.get('Justification', '')

@@ -32,23 +32,23 @@ class evaluator:

 Factually Correct:
 Definition: The explanation must be accurate and relevant to the question and the subject matter.
+Score: (0-10) How factually correct is the explanation? Consider the accuracy of the details provided and their relevance to the question.

 Useful:
 Definition: The explanation should enable the user to understand the answer better and should facilitate further reasoning or decision-making.
+Score: (0-10) How useful is the explanation in helping the user understand the answer and make informed decisions?

 Context Specific:
 Definition: The explanation should be relevant to the specific context or scenario implied by the question.
+Score: (0-10) How well does the explanation address the specific context or scenario of the question?

 User Specific:
 Definition: The explanation should cater to the knowledge level and interests of the user, assuming typical or specified user characteristics.
+Score: (0-10) How well does the explanation cater to the needs and knowledge level of the intended user?

 Provides Pluralism:
 Definition: The explanation should offer or accommodate multiple viewpoints or interpretations, allowing the user to explore various perspectives.
+Score: (0-10) How well does the explanation provide or support multiple perspectives?

 After evaluating the provided question and explanation based on the five principles, please format your scores and justifications in a JSON dictionary. Directly provide me with the JSON without any additional text.

@@ -56,23 +56,23 @@ class evaluator:
 {{
     "Factually Correct": {{
         "Justification": "The explanation is mostly accurate with only minor inaccuracies.",
+        "Score": 9
     }},
     "Useful": {{
         "Justification": "The explanation is very helpful in understanding the main concept.",
+        "Score": 8.5
     }},
     "Context Specific": {{
         "Justification": "The explanation is generally relevant to the specific context but lacks some detail.",
+        "Score": 8
     }},
     "User Specific": {{
         "Justification": "The explanation is appropriate for the typical user but may be too technical for some.",
+        "Score": 7.5
     }},
     "Provides Pluralism": {{
         "Justification": "The explanation considers multiple perspectives but could include more viewpoints.",
+        "Score": 7
     }}
 }}

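The doubled braces in the JSON example suggest the prompt is stored in a Python format string, where `{{` and `}}` render as literal braces. The following is a minimal sketch under that assumption. The names `EXAMPLE_TEMPLATE`, `parse_scores`, and the `{question}` placeholder are hypothetical and do not come from the commit. It shows the rendering step and one way a reply in this shape could be parsed and checked against the 0-10 scale:

```python
import json

# Hypothetical template (not from the commit): doubled braces come out of
# str.format as single literal braces, while {question} is substituted.
EXAMPLE_TEMPLATE = """Question: {question}
{{
    "Factually Correct": {{
        "Justification": "The explanation is mostly accurate with only minor inaccuracies.",
        "Score": 9
    }},
    "Useful": {{
        "Justification": "The explanation is very helpful in understanding the main concept.",
        "Score": 8.5
    }}
}}"""


def parse_scores(reply: str) -> dict:
    """Parse a JSON reply and check that every Score is a number in [0, 10]."""
    scores = json.loads(reply)
    for principle, details in scores.items():
        score = details.get("Score")
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(score, bool) or not isinstance(score, (int, float)):
            raise ValueError(f"{principle}: Score is not numeric")
        if not 0 <= score <= 10:
            raise ValueError(f"{principle}: Score {score} is outside 0-10")
    return scores


rendered = EXAMPLE_TEMPLATE.format(question="Why is the sky blue?")
# Everything after the first line of the rendered prompt is plain JSON.
json_part = rendered.split("\n", 1)[1]
parsed = parse_scores(json_part)
```

Fractional values such as 8.5 pass the check because the example in the prompt itself uses them.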
@@ -164,6 +164,7 @@ def write_evaluation_commentary(scores):
     evaluation_details = []

     for principle, details in scores.items():
+        print(details)
         score = details.get('Score', -1)
         justification = details.get('Justification', '')

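The loop in `write_evaluation_commentary` reads each principle's entry with `dict.get`, so a missing `Score` falls back to -1 and a missing `Justification` falls back to an empty string. Below is a minimal sketch of that pattern. The helper name `summarize_scores` and the output format are assumptions for illustration, since the diff does not show the rest of the function body:

```python
def summarize_scores(scores: dict) -> list:
    """Collect one summary line per principle, tolerating missing keys."""
    evaluation_details = []
    for principle, details in scores.items():
        score = details.get('Score', -1)  # -1 marks a missing score
        justification = details.get('Justification', '')
        evaluation_details.append(f"{principle}: {score} ({justification})")
    return evaluation_details


sample = {
    "Factually Correct": {"Justification": "Mostly accurate.", "Score": 9},
    "Useful": {},  # neither key present, so both defaults apply
}
lines = summarize_scores(sample)
```

Using -1 as the default keeps a missing score distinguishable from a legitimate 0 on the 0-10 scale.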