Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ A creative writing `103b` parameter "self-merge" model with 32k context.
 
 Created using [Mergekit](https://github.com/arcee-ai/mergekit) from my [Dark-Miqu-70B](https://huggingface.co/jukofyork/Dark-Miqu-70B) model.
 
--
+- For self-merges specifically, the "standard" interleave pattern is identical to repeated blocks (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2081174251)).
 - To help maintain cohesion, the '`q_proj`', '`k_proj`' and '`down_proj`' tensors were all scaled to hypothesised upper-bound values (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2063716974)).
 
 # Prompting format
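
The bullet added in this change, that for a self-merge the "standard" interleave pattern is identical to repeated blocks, can be checked with a small sketch. This is only an illustration under stated assumptions: the layer count, block size, stride, and the names `model_a`, `model_b`, `make_slices` and `stack` are all hypothetical, and nothing here uses Mergekit's API or the actual layer ranges of this model.

```python
from typing import Dict, List, Tuple

# (source model name, first layer, one-past-last layer); hypothetical representation
Slice = Tuple[str, int, int]


def make_slices(models: List[str], num_layers: int, block: int, stride: int) -> List[Slice]:
    """Build overlapping layer slices, cycling through `models` for each slice."""
    slices = []
    start = 0
    index = 0
    while start + block <= num_layers:
        slices.append((models[index % len(models)], start, start + block))
        start += stride
        index += 1
    return slices


def stack(slices: List[Slice], checkpoints: Dict[str, List[str]]) -> List[str]:
    """Return the sequence of layer weights selected by the slices."""
    return [checkpoints[name][layer]
            for name, first, last in slices
            for layer in range(first, last)]


# Assumed toy sizes: 80 layers, blocks of 20 layers taken every 10 layers.
NUM_LAYERS, BLOCK, STRIDE = 80, 20, 10

# In a self-merge both "models" are the same checkpoint, so they share layers.
base_layers = [f"layer_{i}" for i in range(NUM_LAYERS)]
checkpoints = {"model_a": base_layers, "model_b": base_layers}

# "Standard" interleave: alternate between two sources.
interleaved = stack(make_slices(["model_a", "model_b"], NUM_LAYERS, BLOCK, STRIDE), checkpoints)

# Repeated blocks: the same slices, all taken from one source.
repeated = stack(make_slices(["model_a"], NUM_LAYERS, BLOCK, STRIDE), checkpoints)

# Which copy a slice comes from makes no difference when the copies are identical.
assert interleaved == repeated
```

With two genuinely different source models the two stacks would differ. The equality holds only because both names resolve to the same weights.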