Spaces: CoreyMorris/MMLU-by-task-Leaderboard

4 contributors

History: 151 commits

Corey Morris: updated gitignore (76c8220, over 1 year ago)

.github: added a test and removed the code to only test a specific file because that code did not work, over 1 year ago
.gitattributes (1.52 kB): initial commit, over 1 year ago
.gitignore (68 Bytes): updated gitignore, over 1 year ago
.gitmodules (106 Bytes): added hugging face evaluation harness results submodule, over 1 year ago
README.md (248 Bytes): initial commit, over 1 year ago
app.py (16 kB): updated date and model count, over 1 year ago
contaminated_models.csv (117 Bytes): Updated contaminated models, over 1 year ago
contaminated_models.txt (65 Bytes): Updated contaminated models, over 1 year ago
details_data_processor.py (4.04 kB): updated pipeline and init, over 1 year ago
dev_requirements.txt (252 Bytes): updated dev requirements, over 1 year ago
moral_app.py (11.1 kB): Extracted plotting functions from moral_app to plotting_utils to improve organization and testability, over 1 year ago
moral_scenarios_questions.csv (370 kB): Show a random question from the moral scenarios evaluation, over 1 year ago
plotting_utils.py (4.42 kB): Extracted plotting functions from moral_app to plotting_utils to improve organization and testability, over 1 year ago
requirements.txt (156 Bytes): Updated dependencies, over 1 year ago
result_data_processor.py (6.19 kB): Returning just a single file per model directory. Manually removing gpt-j-6b for now because there is something that is causing problems with processing the data, over 1 year ago
save_for_regression.py (1.86 kB): changed to save and load in a directory, over 1 year ago
test_details_data_processing.py (4.33 kB): added a test, over 1 year ago
test_integration.py (1.96 kB): fixed test_streamlit_app_runs, over 1 year ago
test_paths.py (780 Bytes): added a test and removed the code to only test a specific file because that code did not work, over 1 year ago
test_regression.py (1.26 kB): added todo for test, over 1 year ago
test_result_data_processing.py (1.66 kB): Added organization to dataframe, over 1 year ago