jukofyork commited on
Commit
36bf604
1 Parent(s): 739c5a9

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -8,7 +8,7 @@ A creative writing `103b` parameter "frankenmerge" model with 32k context.
8
 
9
  # Model background
10
 
11
- Created using [Mergekit](https://github.com/arcee-ai/mergekit) from my two miqu-based models: [Dark-Miqu-70B](https://huggingface.co/jukofyork/Dark-Miqu-70B) and [Dawn-Miqu-70B](https://huggingface.co/jukofyork/Dawn-Miqu-70B):
12
 
13
  - To fix problems with "backwards time skips" in the generated stories, the "standard" interleave pattern was replaced by repeated blocks (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2081174251)).
14
  - To help maintain cohesion, the '`q_proj`', '`k_proj`' and '`down_proj`' tensors were all scaled to hypothesised upper-bound values (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2063716974)).
 
8
 
9
  # Model background
10
 
11
+ Created using [Mergekit](https://github.com/arcee-ai/mergekit) from my two `70b` parameter miqu-based models: [Dark-Miqu-70B](https://huggingface.co/jukofyork/Dark-Miqu-70B) and [Dawn-Miqu-70B](https://huggingface.co/jukofyork/Dawn-Miqu-70B).
12
 
13
  - To fix problems with "backwards time skips" in the generated stories, the "standard" interleave pattern was replaced by repeated blocks (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2081174251)).
14
  - To help maintain cohesion, the '`q_proj`', '`k_proj`' and '`down_proj`' tensors were all scaled to hypothesised upper-bound values (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2063716974)).