giraffe176 committed on
Commit
c659af2
1 Parent(s): f4f39a2

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -20,11 +20,13 @@ Hoping that, through a merge of really good models, I'd be able to create soemth
 
 I came across the EQ-Bench Benchmark [(Paper)](https://arxiv.org/abs/2312.06281) as part of my earlier testing. It is a very light and quick benchmark that yields powerful insights into how well the model performs in emotional intelligence related prompts.
 As part of this process, I tried to figure out if there was a way to determine an optimal set of gradient weights that would lead to the most successful merge as measured against EQ-Bench. At first, my goal was to simply exceed WestLake-7B, but then I kept pushing to see what I could come up with.
-Way too late in the process, did I learn that [dare_ties](https://arxiv.org/abs/2311.03099) has a random element to it, but considered it valuable information for next time. After concluding that project, I began collecting more data, this time setting a specified seed in mergekit for reproducibility.
+Too late in the process, I learned that [dare_ties](https://arxiv.org/abs/2311.03099) has a random element to it. Valuable information for next time, I guess. After concluding that project, I began collecting more data, this time setting a specified seed in mergekit for reproducibility. As I was collecting data, I hit the goal I had set for myself.
 This model is *not* a result of the above work but is the genesis of how this model came to be.
 
 I present, **Starling_Monarch_Westlake_Garten-7B-v0.1**, the only 7B model to score > 80 on the EQ-Bench v2.1 benchmark found [here](https://github.com/EQ-bench/EQ-Bench), outscoring larger models like [abacusai/Smaug-72B-v0.1](https://huggingface.co/abacusai/Smaug-72B-v0.1) and [cognitivecomputations/dolphin-2.2-70b](https://huggingface.co/cognitivecomputations/dolphin-2.2-70b)
 
+It also earned 8.109 on MT-Bench [(paper)](https://arxiv.org/abs/2306.05685), outscoring Chat-GPT 3.5 and Claude v1.
+
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
 ## Merge Details
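
For context on the dare_ties workflow the diff describes: mergekit merges are driven by a YAML config that assigns each source model a gradient weight and a density. The sketch below is a hypothetical illustration only — the model names, weights, and densities are placeholders, not the actual recipe behind Starling_Monarch_Westlake_Garten-7B-v0.1.

```yaml
# Hypothetical mergekit dare_ties config (illustrative placeholders,
# NOT the recipe used for this model).
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
models:
  - model: berkeley-nest/Starling-LM-7B-alpha
    parameters:
      weight: 0.4    # gradient weight applied to this model's delta
      density: 0.6   # fraction of delta parameters kept; the rest are
                     # dropped at random, the stochastic element noted above
  - model: senseable/WestLake-7B-v2
    parameters:
      weight: 0.3
      density: 0.6
dtype: bfloat16
```

Because dare_ties randomly drops delta parameters before rescaling, two runs of the same config can produce different weights; pinning a seed, as the author mentions doing, is what makes a given merge reproducible.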