Looks good so far!

#2 opened by BigHuggyD

8bpw EXL2 quant is up
https://huggingface.co/BigHuggyD/jukofyork_Deep-Miqu-103B-8.0bpw-h8-exl2
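
If you want to grab it programmatically, something like this should do it (the `local_dir` is just an example, point it wherever your loader expects models):

```python
# Minimal sketch: pull the 8bpw EXL2 quant down with huggingface_hub.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="BigHuggyD/jukofyork_Deep-Miqu-103B-8.0bpw-h8-exl2",
    local_dir="models/Deep-Miqu-103B-8.0bpw-h8-exl2",
)
```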

Quick dark test looked good! Testing a dark scenario now that is a slower burn, to make sure it doesn't turn into a redemption story once some context has been pumped through it. So far, I really like how it writes. One of my favorites to date. The prompt is basically "You are a writer", then I give it some general style guidance.

The 120b should have been uploaded by now, but the upload has crapped out with an error twice :/

I'm about 12k of context into my extended dark scenario, and the antagonist's arc seems to be turning into a redemption story. I'm going to continue to play it out and see what happens, but so far it seems to be drifting.

Since it is supposed to be a more 'neutral' model, I wonder if I need to be more explicit about the tone in my prompt.

Yeah, I think it might be hard to beat the stock Dark-Miqu-70b.

I tried @sophosympatheia's suggestion of merging with migtissera/Tess-70B-v1.6, but it just reverted the model so that all the characters want to "do good", have instant redemption arcs, etc.

I'm a couple of hours away from finishing uploading Deep-Miqu-120b, so it might be worth trying that instead: from experience, the 103b --> 120b --> 123b merge patterns make the models slightly more unhinged (and buggy), but more interesting too.
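
In case it helps, the 103b/120b/123b names just come from how many duplicated/interleaved layers end up in the stack (only the transformer layers get repeated; the embeddings and LM head stay single). A rough back-of-the-envelope, assuming the usual Llama-2-70B shapes that Miqu uses:

```python
# Approximate parameter count vs. layer count for a Miqu/Llama-2-70B-shaped frankenmerge.
# Assumes the standard 70B config: d_model=8192, GQA with 8 KV heads (kv_dim=1024),
# FFN dim 28672, vocab 32000 -- treat the exact figures as approximations.
d, ffn, vocab, kv_dim = 8192, 28672, 32000, 1024

attn = 2 * d * d + 2 * d * kv_dim   # Q/O projections + K/V (GQA) projections
mlp = 3 * d * ffn                   # gate, up, down projections
per_layer = attn + mlp              # ~0.86B parameters per transformer layer
embed = 2 * d * vocab               # input embeddings + LM head (not duplicated)

for layers in (80, 120, 140, 143):  # 80 = the 70b base; the rest ~ 103b / 120b / 123b stacks
    total = layers * per_layer + embed
    print(f"{layers} layers -> ~{total / 1e9:.0f}B parameters")
```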

I might try just self-merging some more, but from a few quick experiments before, it didn't really do much to noticeably improve the model... The Dawn-Miqu-70b merge did seem to make the writing more descriptive, but that's not much use if it slowly reverts to Goody-Two-Shoes all the time :/

Yes, my long-form dark test definitely drifted when left to its own devices. I injected a really subtle nudge into the context, and it immediately redirected it, but that was not something I had to do with Dark-Miqu.

Yeah, I tried extending a few of the test stories and noticed it started trying to do this too :/

I'm now uploading 103b- and 120b-parameter self-merges of Dark-Miqu-70B. I don't know why I didn't give these more of a try before, but they do seem to use more descriptive language and to have slightly less "weirdness" and/or fewer inconsistencies in the generated stories. It will be a couple of days before the 103b is uploaded.

(I won't bother uploading the 123b version of Deep-Miqu as it is the least coherent of them all, and I think the self-merges might be more interesting to try now).
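
For anyone who wants to play with the idea themselves, the recipe is just mergekit's passthrough method applied to overlapping slices of the same model. A rough sketch of the 103b-style config (the layer ranges below are only illustrative):

```python
# Sketch of a mergekit "passthrough" self-merge config: overlapping slices of the
# same 80-layer model stacked to 120 layers (~103B). Split points are illustrative.
import yaml

MODEL = "jukofyork/Dark-Miqu-70B"

slices = [
    {"sources": [{"model": MODEL, "layer_range": [start, start + 40]}]}
    for start in (0, 20, 40)  # [0,40], [20,60], [40,80] -> 3 x 40 = 120 layers
]

config = {"slices": slices, "merge_method": "passthrough", "dtype": "float16"}

with open("dark-miqu-103b.yml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)

# Then run something like: mergekit-yaml dark-miqu-103b.yml ./Dark-Miqu-103B
```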

Interesting! I look forward to trying the self-merges.
