Update README.md
README.md
@@ -8,7 +8,7 @@ A creative writing `103b` parameter "frankenmerge" model with 32k context.
 
 # Model background
 
-Created using [Mergekit](https://github.com/arcee-ai/mergekit) from my two miqu-based models: [Dark-Miqu-70B](https://huggingface.co/jukofyork/Dark-Miqu-70B) and [Dawn-Miqu-70B](https://huggingface.co/jukofyork/Dawn-Miqu-70B)
+Created using [Mergekit](https://github.com/arcee-ai/mergekit) from my two `70b` parameter miqu-based models: [Dark-Miqu-70B](https://huggingface.co/jukofyork/Dark-Miqu-70B) and [Dawn-Miqu-70B](https://huggingface.co/jukofyork/Dawn-Miqu-70B).
 
 - To fix problems with "backwards time skips" in the generated stories, the "standard" interleave pattern was replaced by repeated blocks (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2081174251)).
 - To help maintain cohesion, the '`q_proj`', '`k_proj`' and '`down_proj`' tensors were all scaled to hypothesised upper-bound values (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2063716974)).