Update README.md
README.md
CHANGED
@@ -4,11 +4,11 @@ license: other
 
 ![Deep-Miqu-103B.png](Deep-Miqu-103B.png)
 
-A
+A creative writing `103b` parameter "frankenmerge" model with 32k context.
 
 # Model background
 
-
+Created using [Mergekit](https://github.com/arcee-ai/mergekit)'s `'passthrough'` method from my two miqu-based models: [Dark-Miqu-70B](https://huggingface.co/jukofyork/Dark-Miqu-70B) and [Dawn-Miqu-70B](https://huggingface.co/jukofyork/Dawn-Miqu-70B):
 
 - To fix problems with "backwards time skips" in the generated stories, the "standard" interleave pattern was replaced by repeated blocks (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2081174251)).
 - To help maintain cohesion, the '`q_proj`', '`k_proj`' and '`down_proj`' tensors were all scaled to hypothesised upper-bound values (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2063716974)).
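The second bullet describes scaling selected tensors by name. As a rough illustration only, the sketch below multiplies every tensor whose name contains `q_proj`, `k_proj` or `down_proj` by a factor and leaves the rest unchanged. This is not Mergekit's implementation. The helper `scale_matching`, the placeholder factor `0.5`, and the example tensor names are all assumptions made for the illustration; the README does not state the actual scale values.

```python
from typing import Dict, List

Tensor = List[float]  # stand-in for a real weight tensor


def scale_matching(weights: Dict[str, Tensor],
                   factors: Dict[str, float]) -> Dict[str, Tensor]:
    """Return a copy of `weights` where each tensor whose name contains a
    key of `factors` is multiplied by that key's factor."""
    scaled = {}
    for name, values in weights.items():
        factor = 1.0  # tensors that match no pattern are left unchanged
        for pattern, value in factors.items():
            if pattern in name:
                factor = value
                break
        scaled[name] = [v * factor for v in values]
    return scaled


# Hypothetical placeholder factors: NOT the values used for this model.
FACTORS = {"q_proj": 0.5, "k_proj": 0.5, "down_proj": 0.5}

# Illustrative tensor names and toy values.
weights = {
    "model.layers.0.self_attn.q_proj.weight": [2.0, 4.0],
    "model.layers.0.self_attn.v_proj.weight": [2.0, 4.0],
    "model.layers.0.mlp.down_proj.weight": [1.0, 3.0],
}

scaled = scale_matching(weights, FACTORS)
```

Matching on a substring of the tensor name keeps the rule independent of layer index, so one entry covers the same projection in every layer.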