ryzen88 committed
Commit ac9bc15 (1 parent: eb0135e)

Update README.md

Files changed (1): README.md (+90, −85). This commit adds the tags llama 3, 70b, arimas, story, roleplay, and rp to the YAML front matter; the rest of the card is unchanged.

README.md (updated):
---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
- llama 3
- 70b
- arimas
- story
- roleplay
- rp
---
# Llama-3-70b-Arimas-story-RP-V1.6

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
I greatly expanded the number of models used in this merge and experimented a lot with different ideas.
This version feels a lot more convincing than V1.5. Hopefully the long context window will also remain strong after quantization.
Because of the many models involved in the merge, I switched back from bfloat16 to float16.
I also tried breadcrumbs without the ties component, but that went very poorly.

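Since the merge was produced in float16 and leans on the Gradient long-context base, a quick way to sanity-check it before making quants is to load it directly with Hugging Face transformers. The sketch below is not part of the original card: the repo id, prompt, and generation settings are assumptions.

```python
# Minimal usage sketch (assumed repo id and settings, not from the original card):
# load the merged model for inference with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ryzen88/Llama-3-70b-Arimas-story-RP-V1.6"  # assumed repo id; adjust to the actual location

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # the merge itself was produced in float16
    device_map="auto",          # shard the 70B weights across available GPUs
)

prompt = "Write the opening scene of a slow-burn fantasy roleplay."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
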
### Merge Method

This model was merged using the breadcrumbs_ties merge method, with I:\Llama-3-70B-Instruct-Gradient-262k as the base.

### Models Merged

The following models were included in the merge:
* \Smaug-Llama-3-70B-Instruct
* \Meta-LLama-3-Cat-Smaug-LLama-70b
* \Meta-LLama-3-Cat-A-LLama-70b
* \Llama-3-70B-Synthia-v3.5
* \Llama-3-70B-Instruct-Gradient-524k
* \Llama-3-70B-Instruct-Gradient-262k
* \Tess-2.0-Llama-3-70B-v0.2
* \Llama-3-Lumimaid-70B-v0.1-alt

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: \Llama-3-70B-Instruct-Gradient-262k
    parameters:
      weight: 0.25
      density: 0.90
      gamma: 0.01
  - model: \Meta-LLama-3-Cat-Smaug-LLama-70b
    parameters:
      weight: 0.28
      density: 0.90
      gamma: 0.01
  - model: \Llama-3-Lumimaid-70B-v0.1-alt
    parameters:
      weight: 0.15
      density: 0.90
      gamma: 0.01
  - model: \Tess-2.0-Llama-3-70B-v0.2
    parameters:
      weight: 0.06
      density: 0.90
      gamma: 0.01
  - model: \Smaug-Llama-3-70B-Instruct
    parameters:
      weight: 0.04
      density: 0.90
      gamma: 0.01
  - model: \Llama-3-70B-Synthia-v3.5
    parameters:
      weight: 0.05
      density: 0.90
      gamma: 0.01
  - model: \Llama-3-70B-Instruct-Gradient-524k
    parameters:
      weight: 0.03
      density: 0.90
      gamma: 0.01
  - model: \Meta-LLama-3-Cat-A-LLama-70b
    parameters:
      weight: 0.14
      density: 0.90
      gamma: 0.01
merge_method: breadcrumbs_ties
base_model: I:\Llama-3-70B-Instruct-Gradient-262k
dtype: float16
```
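
For anyone who wants to reproduce or tweak the merge, the configuration above can be fed straight to mergekit, either with the `mergekit-yaml` CLI (e.g. `mergekit-yaml arimas-v1.6.yml ./output --cuda`) or through its Python API. Per the mergekit documentation, in breadcrumbs_ties `density` is the fraction of each model's delta from the base that is kept and `gamma` is the fraction of the largest-magnitude deltas that is dropped, so the values above keep most of each delta while trimming the extreme outliers. The sketch below is only an assumption-laden example, not the author's exact run: the config filename, output path, and local model paths are placeholders.

```python
# Minimal reproduction sketch (assumed filenames and paths): run the
# breadcrumbs_ties merge defined in the YAML above via mergekit's Python API.
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = "arimas-v1.6.yml"                      # assumed name for the config above
OUTPUT_PATH = "./Llama-3-70b-Arimas-story-RP-V1.6"  # where the merged weights are written

with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path=OUTPUT_PATH,
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use the GPU for the merge if one is available
        copy_tokenizer=True,             # carry the tokenizer over from the base model
        lazy_unpickle=True,              # experimental lower-RAM loading of the 70B shards
        low_cpu_memory=True,
    ),
)
```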