InferenceIllusionist/Excalibur-7b

InferenceIllusionist committed on Mar 15, 2024

Commit 2c53778 · verified · 1 Parent(s): 6a31197

Update README.md

Files changed (1)

README.md +2 -2

@@ -20,7 +20,7 @@ The challenge this time was placing more weight on Merlinite-7b as an unknown qu
 <b>Excalibur-7b</b> builds on past success and is the culimation of several learnings:
 * Measuring KL-divergences for new quantization types brought a deeper understanding of benchmarking and assessing model performance
 * This signifcantly sped up the testing process by using MMLU as a base, narrowing down over 10 candidate linear merges to 1: merliniteX-blockB1
-* Reaching the limitations of linear merging necessitated a pivot to reviewing the viability of SLERP, dares-ties, and passthrough methods
+* Reaching the limitations of linear merging necessitated a pivot to reviewing the viability of SLERP, DARE-TIES, and Passthrough methods
 * Thus a competing candidate merge pool was tested between different merge alogrithms. Once more the list was narrowed from 10 candidates to 1: merliniteX-blockF2
 * merliniteX-blockF2 (SLERP of Magic-Dolphin-7B and jaskier-7b-dpo in unorthadox proportions) was originally planned for release earlier this week
 * Instead -blockB1 and -blockF2 were merged and the results were placed head to head in a final round of tests. Ultimately a more conventional execution of SLERP showed the best results for the final step.
@@ -33,7 +33,7 @@ The challenge this time was placing more weight on Merlinite-7b as an unknown qu
 # Bonus Question - Vision Capabilities
-<b>Requires additional [mistral-7b-mmproj-v1.5-Q4_1.gguf](https://huggingface.co/koboldcpp/mmproj/tree/main) file for vision functionality)</b>
+<b>Requires additional [mistral-7b-mmproj-v1.5-Q4_1.gguf](https://huggingface.co/koboldcpp/mmproj/tree/main) file for vision functionality</b>
 <img src="https://i.imgur.com/4wbUrjf.jpeg" width="550"/>