[Magic-Dolphin-7b](https://huggingface.co/InferenceIllusionist/Magic-Dolphin-7b) was an unexpected surprise, and I was profoundly satisfied with it as a first attempt. For this follow-up I wanted to target the MMLU benchmark specifically.

The challenge this time was placing more weight on Merlinite-7b, an unknown quantity that hasn't been in the spotlight despite its novel LAB tuning method.

<b>Excalibur-7b</b> builds on past success and is the culmination of several learnings:

* Measuring KL-divergences for new quantization types brought a deeper understanding of benchmarking and assessing model performance (a sketch of this measurement follows the list)
* Using MMLU as a baseline significantly sped up the testing process, narrowing over 10 candidate linear merges down to one: merliniteX-blockB1 (see the linear merge sketch below)
* Reaching the limitations of linear merging necessitated a pivot to reviewing the viability of SLERP, DARE-TIES, and Passthrough methods (see the SLERP sketch below)
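The KL measurement boils down to comparing the token distributions of a full-precision reference model against a quantized build over the same evaluation text. Below is a minimal sketch of that comparison, assuming logits have already been collected from both models; the `mean_kl_divergence` helper and the toy tensors are illustrative stand-ins, not the actual evaluation harness used here:

```python
import torch
import torch.nn.functional as F

def mean_kl_divergence(ref_logits: torch.Tensor, test_logits: torch.Tensor) -> float:
    """Mean per-token KL(P_ref || P_test) over a set of positions.

    Both tensors are [num_tokens, vocab_size] raw logits produced by the
    reference (e.g. fp16) and quantized models on the same text.
    """
    ref_logprobs = F.log_softmax(ref_logits.float(), dim=-1)
    test_logprobs = F.log_softmax(test_logits.float(), dim=-1)
    # KL(P || Q) = sum_i P_i * (log P_i - log Q_i), averaged over positions
    kl = (ref_logprobs.exp() * (ref_logprobs - test_logprobs)).sum(dim=-1)
    return kl.mean().item()

# Toy demonstration: random logits stand in for real model outputs
torch.manual_seed(0)
ref = torch.randn(512, 32000)               # full-precision reference
quant = ref + 0.05 * torch.randn_like(ref)  # quantized model drifting slightly
print(f"mean KL: {mean_kl_divergence(ref, quant):.6f}")
```

A lower mean KL means the quantized model's next-token predictions track the reference more closely, which makes it a finer-grained signal than perplexity alone.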
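The candidate merges themselves were linear merges, i.e. a per-tensor weighted average of parent checkpoints. A minimal sketch of that computation, assuming architecture-identical state dicts; the 0.6/0.4 split is purely illustrative and not the actual recipe behind merliniteX-blockB1:

```python
import torch

def linear_merge(state_dicts: list[dict], weights: list[float]) -> dict:
    """Per-tensor weighted average of architecture-identical checkpoints."""
    assert abs(sum(weights) - 1.0) < 1e-6, "weights should sum to 1.0"
    merged = {}
    for key in state_dicts[0]:
        # cast to float for the accumulation; a real pipeline would cast back
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Illustrative only: weight Merlinite-7b more heavily than a second parent
# merged_sd = linear_merge([merlinite_sd, dolphin_sd], weights=[0.6, 0.4])
```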
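Of the follow-up methods, SLERP is the simplest to sketch: rather than averaging two weight tensors along a straight line, it interpolates along the arc between them on a hypersphere, which better preserves each parent's geometry. A hypothetical, minimal implementation for a single tensor, not the exact merge tooling used here:

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two same-shaped weight tensors."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    # angle between the two tensors, treated as points on a hypersphere
    dot = torch.clamp(
        (a_flat / (a_flat.norm() + eps)) @ (b_flat / (b_flat.norm() + eps)),
        -1.0, 1.0,
    )
    omega = torch.arccos(dot)
    if omega.abs() < 1e-4:  # near-colinear tensors: plain lerp is numerically safer
        out = (1 - t) * a_flat + t * b_flat
    else:
        so = torch.sin(omega)
        out = (torch.sin((1 - t) * omega) / so) * a_flat \
            + (torch.sin(t * omega) / so) * b_flat
    return out.reshape(a.shape).to(a.dtype)

# interpolate one layer's weights halfway between two parent models
# merged_layer = slerp(0.5, model_a["layer.weight"], model_b["layer.weight"])
```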