Thanks for all the hard work! Any chance to see SuperHOT-65B?
Hi there, thanks a lot for the LoRAs and the new context options for all of us.
Was wondering, is a SuperHOT-8K LoRA planned for 65B? It would be amazing!
If not, no problem either!
I would certainly use it with 65B within 2 weeks for my upcoming release.
@ehartford
If you are training a new model, I recommend just applying a patch to your trainer's code, assuming you have 4K-or-longer data. I can help if necessary. After all, I see a lot of people merging SuperHOT, but it is SuperHOT-8K, not Blank-Slate-8K lol
I will make a 65B though; I can see that merging is also easier if you don't have long enough data.
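(For anyone wondering what the patch actually changes: the idea is to compress the rotary position indices by a scale factor so that an extended context, e.g. 8K, maps into the position range the base model was pretrained on. Below is a minimal illustrative sketch of that idea for a LLaMA-style rotary embedding; the class and argument names are made up for the example, and the file in the Files tab is the actual patch.)

```python
import torch


class ScaledRotaryEmbedding(torch.nn.Module):
    """Illustrative sketch of RoPE position interpolation (not the exact patch)."""

    def __init__(self, dim, max_position_embeddings=8192, base=10000, scale=0.25):
        super().__init__()
        # scale=0.25 squeezes positions 0..8191 into the 0..2047 range
        # the base model saw during pretraining.
        self.scale = scale
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        self.register_buffer("inv_freq", inv_freq)

        t = torch.arange(max_position_embeddings, dtype=self.inv_freq.dtype)
        t = t * self.scale  # the key line: interpolate position indices
        freqs = torch.einsum("i,j->ij", t, self.inv_freq)
        emb = torch.cat((freqs, freqs), dim=-1)
        self.register_buffer("cos_cached", emb.cos()[None, None, :, :])
        self.register_buffer("sin_cached", emb.sin()[None, None, :, :])

    def forward(self, x, seq_len=None):
        # Return the cached cos/sin tables truncated to the current sequence length.
        return (
            self.cos_cached[:, :, :seq_len, ...].to(x.dtype),
            self.sin_cached[:, :, :seq_len, ...].to(x.dtype),
        )
```

You would swap this module in for the model's rotary embedding before fine-tuning on 4K+ data, so the trainer learns with the interpolated positions rather than relying on a merge afterwards.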
Thanks so much! There is demand for a 65B 8K-context option, at least from what I can see and discuss on the TheBloke server, Reddit LLM subreddits, and such. Really appreciated.
A 7B version would be amazing too, for completeness' sake! This blew up all over, so I'm sure there is a ton of demand.
Hi, thanks for all your hard work. It would be great if you could also share the patch.
@auntieD Check the Files tab
Got it! Thanks so much! <3
Any chance you're willing to release the SuperHOT dataset (small though it is)? I'd like to build some 30B 16K and 65B LoRAs and have the VRAM for it. That way they will at least be consistent with your 8K LoRAs.