This model has some tokenization problems on its own (tokensurgery with a shotgun was applied), but was meant to be used in a merge. use at your own risk.
Uses ChatML Formatting, Text completion preset here
(Notes pulled from original card), [since the data is the same]: One off train most likely, this was done purely for internal testing purposes but seemed ok enough to release. I do not plan to offer any kind of extended support for using this model, so your mileage may vary depending on use and context size.
- (Nemo 12B instruct as base)
- 200k randomized subset of GU_instruct-Remastered-1.1, with a splash of 25k hathor/poppy sauce, slow cooked for 3 epochs on medium heat.
- Downloads last month
- 37