InferenceIllusionist
/

Excalibur-7b

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

InferenceIllusionist commited on Mar 15, 2024

Commit

ceb9fd0

·

verified ·

1 Parent(s): 2c53778

Update README.md

Files changed (1) hide show

README.md +1 -1

README.md CHANGED Viewed

@@ -21,7 +21,7 @@ The challenge this time was placing more weight on Merlinite-7b as an unknown qu
 * Measuring KL-divergences for new quantization types brought a deeper understanding of benchmarking and assessing model performance
 * This signifcantly sped up the testing process by using MMLU as a base, narrowing down over 10 candidate linear merges to 1: merliniteX-blockB1
 * Reaching the limitations of linear merging necessitated a pivot to reviewing the viability of SLERP, DARE-TIES, and Passthrough methods
-* Thus a competing candidate merge pool was tested between different merge alogrithms. Once more the list was narrowed from 10 candidates to 1: merliniteX-blockF2
 * merliniteX-blockF2 (SLERP of Magic-Dolphin-7B and jaskier-7b-dpo in unorthadox proportions) was originally planned for release earlier this week
 * Instead -blockB1 and -blockF2 were merged and the results were placed head to head in a final round of tests. Ultimately a more conventional execution of SLERP showed the best results for the final step.

 * Measuring KL-divergences for new quantization types brought a deeper understanding of benchmarking and assessing model performance
 * This signifcantly sped up the testing process by using MMLU as a base, narrowing down over 10 candidate linear merges to 1: merliniteX-blockB1
 * Reaching the limitations of linear merging necessitated a pivot to reviewing the viability of SLERP, DARE-TIES, and Passthrough methods
+* Thus a competing candidate merge pool was tested between different merge algorithms. Once more the list was narrowed from 10 candidates to 1: merliniteX-blockF2
 * merliniteX-blockF2 (SLERP of Magic-Dolphin-7B and jaskier-7b-dpo in unorthadox proportions) was originally planned for release earlier this week
 * Instead -blockB1 and -blockF2 were merged and the results were placed head to head in a final round of tests. Ultimately a more conventional execution of SLERP showed the best results for the final step.