Added test for the specific method that is currently producing an error
5b83d0b
Corey Morris committed on
Added failing integration test. Currently fails because of the addition of the organization to the dataframe
de65005
Corey Morris committed on
Added organization to dataframe
52d3b03
Corey Morris committed on
Added failing test for the new behavior of the organization column. Updated the row-count test for the newly added rows
02b1702
Corey Morris committed on
Removed code that printed the number of outliers. It could potentially be added back later as logging
cd21f99
Corey Morris committed on
renamed file for clarity
229d6d1
Corey Morris committed on
Added .gitignore file
368802d
Corey Morris committed on
Added new results from Hugging Face
49d555f
Corey Morris committed on
The MC1 column had 8 rows with a value of 1, which didn't make sense given that the next highest value was 0.47. They were assumed to be data errors and removed
e03b231
Corey Morris committed on
truthfulqa data added to dataframe
abac22e
Corey Morris committed on
Refactor to make later code changes easier
6d41115
Corey Morris committed on
Added test for removal of undesired columns. fixed code error in column removal
9549fcc
Corey Morris committed on
Added initial tests. test_columns and test_rows will be updated or removed later, as they test for the exact number of columns and rows. The number of rows will change as more models are added
85667d0
Corey Morris committed on
Updated the last-updated date to 18 Aug
42ff7b9
Corey Morris committed on
Updated with new Hugging Face results
8ccc242
Corey Morris committed on
Updated description with more models
7f24726
Corey Morris committed on
updated results
80c79bd
Corey Morris committed on
fixed error
d7b89ce
Corey Morris committed on
Added google analytics snippet
9444cd2
Corey Morris committed on
Increased size of scatter plot
2b16774
Corey Morris committed on
Made the radar plot larger
f52387e
Corey Morris committed on
Moved radar plots to higher in the page
12a9766
Corey Morris committed on
Modified title and explanation to better reflect what the site is
18ec1ba
Corey Morris committed on
Moved radar chart to after analysis
fb25b1e
Corey Morris committed on
Added a default model to compare
7b77065
Corey Morris committed on
Improved clarity of explanation for Radar charts
a450af5
Corey Morris committed on
Partially fixed the duplicate model issue
618dcce
Corey Morris committed on
Table now displays the columns that have the top differences
dc21a69
Corey Morris committed on
removed charts with hardcoded tasks. removed hardcoding of model for other charts
a125eb8
Corey Morris committed on
Finding top differences between tasks from the target model
627e0f9
Corey Morris committed on
Added explanation for the plot and a dataframe of the models
2db58a0
Corey Morris committed on
Added radar chart. Compares a model to the 5 models that have the closest performance on MMLU_average
9695a47
Corey Morris committed on
Added new results from Hugging Face
b9b6115
Corey Morris committed on
Added header back for the table
2a7f691
Corey Morris committed on
Added citation for the site
ea8703d
Corey Morris committed on
Changed streamlit to wide layout to see more of the table
1e6b767
Corey Morris committed on
Updated the last-updated date
28d4d6a
Corey Morris committed on
Added filter for parameter count. Fixed model filter so that it only filters on the Model name (index of the table)
8474e43
Corey Morris committed on
Modified the selection of models and evaluations so that most do not show up by default, for a better user experience with 700+ models
0a33874
Corey Morris committed on
Added search for Model name and Task name
3abc48f
Corey Morris committed on
Added reasoning for having scatter plots
cb21769
Corey Morris committed on
Updated title now that there are over 700 open source models in the dataset
a9f9804
Corey Morris committed on
Updated with new results from Hugging Face
c173f6a
Corey Morris committed on
Added statement and hypothesis about moral scenarios
d97426f
Corey Morris committed on
Plots have a default title
f9a0f38
Corey Morris committed on
updated requirements.txt
bb08057
Corey Morris committed on
Refactor of create_plot
bdad6e6
Corey Morris committed on
Added finding from moral scenarios about threshold