This is a merge of pre-trained language models created using mergekit.
This model was merged using the SLERP merge method.
The following models were included in the merge:

- mlabonne/ChimeraLlama-3-8B-v3
- johnsnowlabs/JSL-MedLlama-3-8B-v2.0
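Conceptually, SLERP interpolates each pair of corresponding weight tensors along a great circle on the hypersphere rather than along a straight line, which preserves the geometry of the weights better than plain linear averaging. The snippet below is only an illustrative NumPy sketch of that per-tensor operation, not mergekit's actual implementation; the function name and arguments are placeholders.

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors."""
    # Normalized copies, used only to measure the angle between the two vectors.
    v0_u = v0 / (np.linalg.norm(v0) + eps)
    v1_u = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0_u, v1_u), -1.0, 1.0)
    omega = np.arccos(dot)              # angle between the two parameter vectors
    if omega < eps:                     # nearly colinear: fall back to plain lerp
        return (1.0 - t) * v0 + t * v1
    sin_omega = np.sin(omega)
    # Interpolate along the great circle connecting v0 and v1.
    return (np.sin((1.0 - t) * omega) / sin_omega) * v0 + (np.sin(t * omega) / sin_omega) * v1
```

In the configuration at the end of this card, the interpolation factor `t` is not a single constant: each `value` list is spread across layer depth, so (by mergekit's convention, where t = 0 keeps the base model) the self-attention weights stay close to ChimeraLlama in early layers and move toward JSL-MedLlama in deeper layers, the MLP weights follow the opposite schedule, and all remaining tensors use t = 0.5.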
Evaluation results:

| Tasks | Version | Filter | n-shot | Metric | Value |   | Stderr |
|---|---|---|---|---|---|---|---|
| medmcqa | Yaml | none | 0 | acc | 0.6087 | ± | 0.0075 |
|  |  | none | 0 | acc_norm | 0.6087 | ± | 0.0075 |
| medqa_4options | Yaml | none | 0 | acc | 0.6269 | ± | 0.0136 |
|  |  | none | 0 | acc_norm | 0.6269 | ± | 0.0136 |
| anatomy (mmlu) | 0 | none | 0 | acc | 0.6963 | ± | 0.0397 |
| clinical_knowledge (mmlu) | 0 | none | 0 | acc | 0.7585 | ± | 0.0263 |
| college_biology (mmlu) | 0 | none | 0 | acc | 0.7847 | ± | 0.0344 |
| college_medicine (mmlu) | 0 | none | 0 | acc | 0.6936 | ± | 0.0351 |
| medical_genetics (mmlu) | 0 | none | 0 | acc | 0.8200 | ± | 0.0386 |
| professional_medicine (mmlu) | 0 | none | 0 | acc | 0.7684 | ± | 0.0256 |
| stem | N/A | none | 0 | acc_norm | 0.6129 | ± | 0.0066 |
|  |  | none | 0 | acc | 0.6440 | ± | 0.0057 |
| pubmedqa | 1 | none | 0 | acc | 0.7480 | ± | 0.0194 |
| Groups | Version | Filter | n-shot | Metric | Value |   | Stderr |
|---|---|---|---|---|---|---|---|
| stem | N/A | none | 0 | acc_norm | 0.6129 | ± | 0.0066 |
|  |  | none | 0 | acc | 0.6440 | ± | 0.0057 |
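The column layout above matches the output of EleutherAI's lm-evaluation-harness. The sketch below shows how comparable zero-shot numbers could be gathered with that harness's Python API; the model path and the exact task names are assumptions and may need adjusting to the installed harness version.

```python
import lm_eval

# Zero-shot evaluation of the merged model on the medical benchmarks listed above.
# "./merged-model" is a placeholder path; task names follow lm-evaluation-harness
# conventions and are assumptions, not taken from this card.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=./merged-model,dtype=bfloat16",
    tasks=[
        "medmcqa",
        "medqa_4options",
        "mmlu_anatomy",
        "mmlu_clinical_knowledge",
        "mmlu_college_biology",
        "mmlu_college_medicine",
        "mmlu_medical_genetics",
        "mmlu_professional_medicine",
        "pubmedqa",
    ],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"])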
The following YAML configuration was used to produce this model:
```yaml
slices:
  - sources:
      - model: mlabonne/ChimeraLlama-3-8B-v3
        layer_range: [0, 32]
      - model: johnsnowlabs/JSL-MedLlama-3-8B-v2.0
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/ChimeraLlama-3-8B-v3
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
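A minimal sketch of running this merge, assuming mergekit's documented Python API (the CLI equivalent is `mergekit-yaml config.yaml ./merged-model`); the config and output paths here are placeholders.

```python
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the SLERP configuration shown above (path is a placeholder).
with open("config.yaml", "r", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

# Run the merge and write the resulting model to ./merged-model.
run_merge(
    merge_config,
    out_path="./merged-model",
    options=MergeOptions(
        cuda=False,            # set True to merge on GPU
        copy_tokenizer=True,   # copy the base model's tokenizer into the output
        lazy_unpickle=True,    # reduce peak memory while loading shards
    ),
)
```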