llmixer committed
Commit 36415d2 · 1 Parent(s): f8fa8ed

Create README.md

Files changed (1): README.md (+64, -0)
README.md ADDED

---
base_model:
- 152334H/miqu-1-70b-sf
license: unknown
language:
- en
pipeline_tag: text-generation
tags:
- merge
- frankenmerge
- 95b
---
# BigWeave v27 95b

<img src="https://cdn-uploads.huggingface.co/production/uploads/65a6db055c58475cf9e6def1/4CbbAN-X7ZWj702JrcCGH.png" width=600>

The BigWeave models are an experimental effort to identify merge settings that increase model performance. The version number merely tracks the various attempts and is not a quality indicator. Only results demonstrating good performance are retained and shared.

# Prompting Format
ChatML, Mistral, and Vicuna prompt formats all work.
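
For reference, the three templates look roughly as follows. This is only a sketch; exact whitespace and system-prompt handling depend on the inference frontend.

```
# ChatML
<|im_start|>system
{system prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant

# Mistral
[INST] {prompt} [/INST]

# Vicuna
{system prompt}
USER: {prompt}
ASSISTANT:
```
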
# Merge Process
This is a self-merge of 152334H/miqu-1-70b-sf. The 30 most important layers (as ranked by exl2 quantization measurements) are duplicated with 50% overlap.

Merge configuration:
```
slices:
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [0,40]
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [34,45] # dup 34-44
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [40,52]
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [51,53] # dup 51-52
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [52,55]
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [54,56] # dup 54-55
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [55,59]
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [58,60] # dup 58-59
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [59,72]
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [64,79] # dup 64-78
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [72,80]
merge_method: passthrough
dtype: float16
```
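
To reproduce the merge, the configuration can be run through mergekit's `mergekit-yaml` entry point; the passthrough method simply stacks the listed slices without averaging any weights. A minimal sketch, assuming mergekit is installed and the config above is saved under a placeholder filename:

```
mergekit-yaml bigweave-v27.yml ./BigWeave-v27-95b
```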