sometimesanotion committed on
Commit: e31ea1c
Parent: b5ac082

Update README.md

Files changed (1)
  1. README.md +30 -7
README.md CHANGED
@@ -8,23 +8,46 @@ tags:
  - merge
  base_model:
  - arcee-ai/Virtuoso-Small
+ - CultriX/SeQwence-14B-EvolMerge
+ - CultriX/Qwen2.5-14B-Wernicke
+ - sthenno-com/miscii-14b-1028
+ - underwoods/medius-erebus-magnum-14b
+ - sometimesanotion/lamarck-14b-prose-model_stock
+ - sometimesanotion/lamarck-14b-reason-model_stock
+ metrics:
+ - accuracy
  pipeline_tag: text-generation
-
  ---
- # output
+ ![Lamarck.webp](https://huggingface.co/sometimesanotion/Lamarck-14B-v0.4/resolve/main/Lamarck.webp)
+ ---

- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+ Lamarck 14B v0.4 Qwenvergence: a big step up for Lamarck in output quality, reasoning, and prose. It uses the same ingredients as previous Lamarck releases, combined more effectively.

  ## Merge Details
  ### Merge Method

- This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using sometimesanotion/lamarck-14b-base as a base.
+ This model was initialized from a model_stock merge and refined from there. No fine-tuning, no models beyond those listed or included in Qwenvergence, no wild parties, and no sacrifices to unnamed deities were involved.
+
+ Contrary to what default CO2 emissions estimates assume, this is a merge of already-trained models, which upcycles and extends the life of their compute. It was merged on a single workstation running on nearly 50% renewable electricity.
+
+ It was finalized with the [TIES](https://arxiv.org/abs/2306.01708) merge method, using sometimesanotion/lamarck-14b-base as the base, per @rombodawg's continuous fine-tuning method.

  ### Models Merged

- The following models were included in the merge:
- * merges/Qwen2.5-14B-Qwenvergence-slerp
- * [arcee-ai/Virtuoso-Small](https://huggingface.co/arcee-ai/Virtuoso-Small)
+ **Top influences:** These ancestors are base models present in the Qwenvergence model_stock and reinforced in later steps:
+
+ - **[arcee-ai/Virtuoso-Small](https://huggingface.co/arcee-ai/Virtuoso-Small)** - A brand-new model from Arcee, refined from the notable cross-architecture Llama-to-Qwen distillation [arcee-ai/SuperNova-Medius](https://huggingface.co/arcee-ai/SuperNova-Medius). The first two layers come almost exclusively from Virtuoso. It has proven to be a well-rounded performer and contributes a noticeable boost to the model's prose quality.
+
+ - **[CultriX/SeQwence-14B-EvolMerge](http://huggingface.co/CultriX/SeQwence-14B-EvolMerge)** - A top contender on reasoning benchmarks.
+
+ - **[VAGOsolutions/SauerkrautLM-v2-14b-DPO](https://huggingface.co/VAGOsolutions/SauerkrautLM-v2-14b-DPO)** - This model's influence is understated, but it aids BBH and coding capability.
+
+ **Also inspiring:** While these models were not merged directly, they informed the design:
+
+ - **[CultriX/Qwen2.5-14B-Wernicke](http://huggingface.co/CultriX/Qwen2.5-14B-Wernicke)** - A top performer on ARC and GPQA, Wernicke is re-emphasized in small but highly-ranked portions of the model.
+
+ - **[sometimesanotion/lamarck-14b-prose-model_stock](https://huggingface.co/sometimesanotion/lamarck-14b-prose-model_stock)** - This brings in a little influence from [EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2), [oxyapi/oxy-1-small](https://huggingface.co/oxyapi/oxy-1-small), and [allura-org/TQ2.5-14B-Sugarquill-v1](https://huggingface.co/allura-org/TQ2.5-14B-Sugarquill-v1).
+

  ### Configuration

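The diff describes the method but not the configuration itself; the YAML under `### Configuration` falls outside this excerpt. As a rough, illustrative sketch of the two stages described above (a model_stock initialization, later finalized with TIES on sometimesanotion/lamarck-14b-base), a mergekit setup could look like the following. Every model choice, path, density, and weight here is an assumption for illustration, not the recipe actually used for Lamarck 14B v0.4.

```yaml
# Stage 1 (illustrative): model_stock initialization.
# The real input list and base model are not shown in this diff.
merge_method: model_stock
base_model: Qwen/Qwen2.5-14B                 # assumed base; not confirmed by the diff
models:
  - model: arcee-ai/Virtuoso-Small
  - model: CultriX/SeQwence-14B-EvolMerge
  - model: sthenno-com/miscii-14b-1028
  - model: underwoods/medius-erebus-magnum-14b
dtype: bfloat16
```

A second config would then finalize the refined stage-1 output with TIES on the named base:

```yaml
# Stage 2 (illustrative): TIES finalization on sometimesanotion/lamarck-14b-base.
merge_method: ties
base_model: sometimesanotion/lamarck-14b-base
models:
  - model: ./stage1-model-stock              # hypothetical local path to the stage-1 output
    parameters:
      density: 0.5                           # illustrative values only
      weight: 0.7
  - model: arcee-ai/Virtuoso-Small
    parameters:
      density: 0.5
      weight: 0.3
dtype: bfloat16
```

With mergekit installed, each stage would be run with its standard CLI, e.g. `mergekit-yaml stage1.yaml ./stage1-model-stock`. Again, this is only a sketch of the approach the README describes, not the published configuration.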