Spaces: CoreyMorris/MMLU-by-task-Leaderboard
Branch: main
4 contributors
History: 185 commits
Latest commit: CoreyMorris, "updated with new data" (e05c716), 3 months ago
| Name | Size | Last commit message | Updated |
| --- | --- | --- | --- |
| .github | | added a test and removed the code to only test a specific file because that code did not work | 10 months ago |
| .gitattributes | 1.52 kB | initial commit | 12 months ago |
| .gitignore | 68 Bytes | updated gitignore | 10 months ago |
| .gitmodules | 106 Bytes | added hugging face evaluation harness results submodule | 12 months ago |
| README.md | 202 Bytes | updated readme and requirements | 9 months ago |
| app.py | 15.6 kB | updated with new data | 3 months ago |
| contaminated_models.csv | 117 Bytes | Updated contaminated models | 11 months ago |
| contaminated_models.txt | 65 Bytes | Updated contaminated models | 11 months ago |
| details_data_processor.py | 4.04 kB | updated pipeline and init | 11 months ago |
| dev_requirements.txt | 252 Bytes | updated dev requirements | 10 months ago |
| generate_csv.ipynb | 25.8 kB | update | 7 months ago |
| moral_app.py | 11.1 kB | Extracted plotting functions from moral_app to plotting_utils to improve organization and testability | 10 months ago |
| moral_scenarios_questions.csv | 370 kB | Show a random question from the moral scenarios evaluation | 10 months ago |
| plotting_utils.py | 4.42 kB | Extracted plotting functions from moral_app to plotting_utils to improve organization and testability | 10 months ago |
| processed_data_2023-09-29.csv | 1.35 MB | Updated data and added notes about the site. | 3 months ago |
| processed_data_2023-10-05.csv | 1.35 MB | update | 9 months ago |
| processed_data_2023-10-06.csv | 1.62 MB | Added clickable links (#1) | 9 months ago |
| processed_data_2023-10-08.csv | 1.58 MB | added new result data | 9 months ago |
| processed_data_2023-11-18.csv | 1.18 MB | updated dashboard with new data | 8 months ago |
| processed_data_2023-11-21.csv | 1.25 MB | Updated with new results 11-21 | 7 months ago |
| processed_data_2024-04-16.csv | 5.74 MB | updated with new data | 3 months ago |
| requirements.txt | 160 Bytes | updated readme and requirements | 9 months ago |
| result_data.csv | 1.35 MB | updated | 9 months ago |
| result_data_processor.py | 8.29 kB | Added clickable links (#1) | 9 months ago |
| save_for_regression.py | 1.86 kB | changed to save and load in a directory | 11 months ago |
| split_question.py | 964 Bytes | added code to split moral scenario question from one question to two | 10 months ago |
| test_details_data_processing.py | 4.33 kB | added a test | 11 months ago |
| test_integration.py | 1.96 kB | fixed test_streamlit_app_runs | 11 months ago |
| test_paths.py | 780 Bytes | added a test and removed the code to only test a specific file because that code did not work | 10 months ago |
| test_regression.py | 1.26 kB | added todo for test | 11 months ago |
| test_result_data_processing.py | 1.66 kB | Added organization to dataframe | 11 months ago |