Spaces: CoreyMorris/MMLU-by-task-Leaderboard
Revision: 667f9a4
4 contributors
History: 157 commits
Latest commit: Corey Morris, "updated" (667f9a4), 9 months ago
File | Size | Last commit message | Updated
.github | - | added a test and removed the code to only test a specific file because that code did not work | 11 months ago
.gitattributes | 1.52 kB | initial commit | 12 months ago
.gitignore | 68 Bytes | updated gitignore | 10 months ago
.gitmodules | 106 Bytes | added hugging face evaluation harness results submodule | 12 months ago
README.md | 248 Bytes | initial commit | 12 months ago
app.py | 15.8 kB | WIP. Loading data from csv | 9 months ago
contaminated_models.csv | 117 Bytes | Updated contaminated models | 11 months ago
contaminated_models.txt | 65 Bytes | Updated contaminated models | 11 months ago
details_data_processor.py | 4.04 kB | updated pipeline and init | 11 months ago
dev_requirements.txt | 252 Bytes | updated dev requirements | 10 months ago
moral_app.py | 11.1 kB | Extracted plotting functions from moral_app to plotting_utils to improve organization and testability | 10 months ago
moral_scenarios_questions.csv | 370 kB | Show a random question from the moral scenarios evaluation | 11 months ago
plotting_utils.py | 4.42 kB | Extracted plotting functions from moral_app to plotting_utils to improve organization and testability | 10 months ago
requirements.txt | 156 Bytes | Updated dependencies | 11 months ago
result_data.csv | 1.35 MB | updated | 9 months ago
result_data_processor.py | 6.77 kB | WIP. Loading data from csv | 9 months ago
save_for_regression.py | 1.86 kB | changed to save and load in a directory | 11 months ago
split_question.py | 964 Bytes | added code to split moral scenario question from one question to two | 10 months ago
test_details_data_processing.py | 4.33 kB | added a test | 11 months ago
test_integration.py | 1.96 kB | fixed test_streamlit_app_runs | 11 months ago
test_paths.py | 780 Bytes | added a test and removed the code to only test a specific file because that code did not work | 11 months ago
test_regression.py | 1.26 kB | added todo for test | 11 months ago
test_result_data_processing.py | 1.66 kB | Added organization to dataframe | 11 months ago