athirdpath committed on
Commit
b77fad0
1 Parent(s): 45bb3ca

Create README.md

Files changed (1)
  1. README.md +97 -0
README.md ADDED
@@ -0,0 +1,97 @@
+ ---
+ models:
+   - model: part3
+   - model: part1
+   - model: part2
+ merge_method: model_stock
+ base_model: part3
+ parameters:
+   normalize: true
+   int8_mask: true
+ dtype: float16
+ license: llama3
+ ---
+ This is a model_stock merge of 3 models:
+ - Part Wave
+ - Part Block
+ - Part Funnel
+
+ With Part Funnel (part3 in the config above) as the base.
+
+ ---
+
+ Part Wave:
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [0, 12]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [8, 18]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [13, 23]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [18, 32]
+
+ ---
+
+ Part Block:
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [0, 15]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [8, 23]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [16, 32]
+
+ ---
+
+ Part Funnel:
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [0, 15]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [14, 14]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [13, 13]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [12, 12]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [11, 11]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [10, 10]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [9, 9]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [8, 23]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [22, 22]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [21, 21]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [20, 20]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [19, 19]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [18, 18]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [17, 17]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [16, 32]
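
The three slice lists can be sanity-checked by counting how many layers each part ends up with. The sketch below is not part of the README. It assumes that `layer_range` follows the mergekit-style half-open convention `[start, end)`, so that `[0, 12]` covers layers 0 through 11 and a range such as `[14, 14]` contributes no layers. The names `PARTS`, `layer_count`, and `layer_counts` are hypothetical helpers introduced only for this illustration; the ranges themselves are copied from the lists above.

```python
# Layer-count check for the Part Wave / Part Block / Part Funnel slice lists.
# Assumption: layer_range is half-open, [start, end). The inclusive_end flag
# shows what the counts would be under the alternative (inclusive) reading.

PARTS = {
    "Part Wave": [(0, 12), (8, 18), (13, 23), (18, 32)],
    "Part Block": [(0, 15), (8, 23), (16, 32)],
    "Part Funnel": (
        [(0, 15)]
        + [(n, n) for n in range(14, 8, -1)]   # [14, 14] down to [9, 9]
        + [(8, 23)]
        + [(n, n) for n in range(22, 16, -1)]  # [22, 22] down to [17, 17]
        + [(16, 32)]
    ),
}


def layer_count(slices, inclusive_end=False):
    """Sum the number of layers covered by a list of (start, end) slices."""
    extra = 1 if inclusive_end else 0
    return sum(end - start + extra for start, end in slices)


def layer_counts(inclusive_end=False):
    """Return a {part name: total layers} mapping for all three parts."""
    return {name: layer_count(s, inclusive_end) for name, s in PARTS.items()}


if __name__ == "__main__":
    for name, total in layer_counts().items():
        print(f"{name}: {total} layers")
```

Under the half-open assumption all three parts come out at 46 layers, which is consistent with a merge that needs every input to have the same shape. Under an inclusive reading the totals differ (50, 49, and 61), so the half-open reading looks more plausible, though the README does not state which convention was intended.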