Added introduction and links; reduced the number of plots displayed a5fb364 Corey Morris committed on Aug 8, 2023
Modified the download CSV feature so that the index column is now titled "model name" 6a7ad7c Corey Morris committed on Aug 3, 2023
Add a dashed line at the scale of the largest and smallest values on the plot so that Plotly still zooms in to show them 7ed3839 Corey Morris committed on Jul 30, 2023
Refactoring. Moved the ResultDataProcessor class to a separate file to make it easier to experiment with in a Jupyter notebook 843a5ef Corey Morris committed on Jul 24, 2023
Added updated results from Hugging Face evaluation runs 51a128e Corey Morris committed on Jul 24, 2023
Improving clarity. Moved MMLU average column to a more appropriate spot 5129f48 Corey Morris committed on Jul 23, 2023
Hiding filters unless box is selected. Removed model name column because it is the index of the table 8488477 Corey Morris committed on Jul 23, 2023
Added a scatter plot with just the top 50 performing models on MMLU average ca8e784 Corey Morris committed on Jul 23, 2023
Added MMLU overall average column. Added a few charts comparing more moral reasoning and comparing MMLU overall to other data c671de9 Corey Morris committed on Jul 23, 2023
Added statsmodels to be able to use a trendline in Plotly ed019c6 Corey Morris committed on Jul 23, 2023
Updated data cleanup so that column names are cleaned up appropriately with regex=True c1a84da Corey Morris committed on Jul 23, 2023
Fixed reversed plot. Extracted chart creation into a method 337b761 Corey Morris committed on Jul 23, 2023
Update app.py and requirements.txt so that the app works with Hugging Face Streamlit and pandas 1.x ba99486 Corey Morris committed on Jul 23, 2023
Updated requirements.txt with the versions being used locally 7ae46ce Corey Morris committed on Jul 23, 2023
WIP commit. Troubleshoot chart display. Add filter behavior 43b4e29 Corey Morris committed on Jul 23, 2023
Added Hugging Face evaluation harness results submodule 4dcdfc8 Corey Morris committed on Jul 21, 2023