sometimesanotion committed e31ea1c (parent: b5ac082): Update README.md

README.md CHANGED
@@ -8,23 +8,46 @@ tags:
- merge
base_model:
- arcee-ai/Virtuoso-Small
- CultriX/SeQwence-14B-EvolMerge
- CultriX/Qwen2.5-14B-Wernicke
- sthenno-com/miscii-14b-1028
- underwoods/medius-erebus-magnum-14b
- sometimesanotion/lamarck-14b-prose-model_stock
- sometimesanotion/lamarck-14b-reason-model_stock
metrics:
- accuracy
pipeline_tag: text-generation
---
![Lamarck.webp](https://huggingface.co/sometimesanotion/Lamarck-14B-v0.4/resolve/main/Lamarck.webp)

---

Lamarck 14B v0.4 Qwenvergence is a big step up for Lamarck in output quality, reasoning, and prose quality. All the same ingredients from previous releases of Lamarck are involved; they are combined more effectively.

## Merge Details
### Merge Method

This model was initialized from model_stock and refined from there. No fine-tuning, no models apart from those listed or the contents of Qwenvergence, no wild parties, and no sacrifices to unnamed deities were involved.
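
As a rough illustration of that initialization step, here is a minimal mergekit-style model_stock sketch over ingredients listed in this card's front matter; the choice of Qwen/Qwen2.5-14B as the common base and the dtype are assumptions, not the actual recipe (see the Configuration section below for that).

```yaml
# Hypothetical sketch of a model_stock initialization step in mergekit syntax.
# The base_model and dtype are assumptions; the real recipe is in the
# Configuration section of this card.
merge_method: model_stock
base_model: Qwen/Qwen2.5-14B            # assumed common ancestor
models:
  - model: arcee-ai/Virtuoso-Small
  - model: CultriX/SeQwence-14B-EvolMerge
  - model: CultriX/Qwen2.5-14B-Wernicke
  - model: sthenno-com/miscii-14b-1028
  - model: underwoods/medius-erebus-magnum-14b
dtype: bfloat16
```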

Contrary to what default CO2-emissions checks assume, this is a merge of models that were already made, which helps upcycle and extend the life of the compute work behind them. It was merged on a single workstation running on nearly 50% renewable electricity.

It was finalized with the [TIES](https://arxiv.org/abs/2306.01708) merge method, using sometimesanotion/lamarck-14b-base as the base, per @rombodawg's continuous fine-tuning method.
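
A minimal sketch of what such a TIES finalization can look like in mergekit-style YAML is below; the weights, densities, and the choice of the two Qwenvergence model_stock intermediates are placeholders rather than the actual recipe from the Configuration section.

```yaml
# Hypothetical sketch of a TIES finalization step in mergekit syntax.
# Weights, densities, and the model list are placeholders; the real recipe
# is in the Configuration section of this card.
merge_method: ties
base_model: sometimesanotion/lamarck-14b-base
models:
  - model: sometimesanotion/lamarck-14b-reason-model_stock
    parameters:
      weight: 1.0
      density: 0.5
  - model: sometimesanotion/lamarck-14b-prose-model_stock
    parameters:
      weight: 1.0
      density: 0.5
parameters:
  int8_mask: true
dtype: bfloat16
```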

### Models Merged

**Top influences:** These ancestors are base models present in the Qwenvergence model_stock, and they are reinforced in later steps:

- **[arcee-ai/Virtuoso-Small](https://huggingface.co/arcee-ai/Virtuoso-Small)** - A brand-new model from Arcee, refined from the notable cross-architecture Llama-to-Qwen distillation [arcee-ai/SuperNova-Medius](https://huggingface.co/arcee-ai/SuperNova-Medius). The first two layers are nearly exclusively from Virtuoso (see the layer-range sketch after this list). It has proven to be a well-rounded performer and contributes a noticeable boost to the model's prose quality.

- **[CultriX/SeQwence-14B-EvolMerge](http://huggingface.co/CultriX/SeQwence-14B-EvolMerge)** - A top contender on reasoning benchmarks.

- **[VAGOsolutions/SauerkrautLM-v2-14b-DPO](https://huggingface.co/VAGOsolutions/SauerkrautLM-v2-14b-DPO)** - This model's influence is understated, but it aids BBH and coding capability.

**Also inspiring:** While these models were not merged, they informed the design:

- **[CultriX/Qwen2.5-14B-Wernicke](http://huggingface.co/CultriX/Qwen2.5-14B-Wernicke)** - A top performer on ARC and GPQA, Wernicke is re-emphasized in small but highly-ranked portions of the model.

- **[sometimesanotion/lamarck-14b-prose-model_stock](https://huggingface.co/sometimesanotion/lamarck-14b-prose-model_stock)** - This brings in a little influence from [EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2), [oxyapi/oxy-1-small](https://huggingface.co/oxyapi/oxy-1-small), and [allura-org/TQ2.5-14B-Sugarquill-v1](https://huggingface.co/allura-org/TQ2.5-14B-Sugarquill-v1).
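
As noted for Virtuoso-Small above, the earliest layers lean almost entirely on a single parent. The sketch below uses layer ranges (mergekit's slices syntax) only to illustrate that idea; the layer boundaries, parent choices, and passthrough method are hypothetical and are not Lamarck's actual recipe (see the Configuration section).

```yaml
# Illustration only: taking the first two layers from one parent via layer
# ranges. Layer counts, parents, and the passthrough method are hypothetical.
slices:
  - sources:
      - model: arcee-ai/Virtuoso-Small
        layer_range: [0, 2]         # earliest layers taken from Virtuoso
  - sources:
      - model: sometimesanotion/lamarck-14b-base
        layer_range: [2, 48]        # remaining layers from another parent
merge_method: passthrough
dtype: bfloat16
```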

### Configuration