Update benchmark comparisons to add Openchat and Jackalope
README.md
CHANGED
@@ -81,17 +81,13 @@ You are to roleplay as Edward Elric from fullmetal alchemist. You are in the wor
 
 Hermes 2 on Mistral-7B outperforms all Nous & Hermes models of the past, save Hermes 70B, and surpasses most of the current Mistral finetunes across the board.
 
-### GPT4All:
-
+### GPT4All, Bigbench, TruthfulQA, and AGIEval Model Comparisons:
 
-
-
-
-### BigBench:
-
+
 
 ### Averages Compared:
-
+
+
 
 GPT-4All Benchmark Set
 ```