Electric Mist 7B

  • Developed by: maldv
  • License: cc-by-nc-4.0
  • Finetuned from model: alpindale/Mistral-7B-v0.2-hf
  • Methodology: Simple newline-delimited, rolling-window book data plus stripped conversation data (see the data-prep sketch below).
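
A minimal sketch of what that rolling-window prep might look like, assuming the base model's tokenizer; the window and overlap sizes here are illustrative assumptions, not the author's actual preprocessing script.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("alpindale/Mistral-7B-v0.2-hf")

def rolling_windows(text, max_tokens=8192, overlap_paragraphs=4):
    """Pack double-newline-delimited paragraphs into overlapping ~max_tokens samples."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    lengths = [len(tokenizer(p)["input_ids"]) for p in paragraphs]
    samples, start = [], 0
    while start < len(paragraphs):
        end, total = start, 0
        while end < len(paragraphs) and total + lengths[end] <= max_tokens:
            total += lengths[end]
            end += 1
        if end == start:          # a single paragraph longer than max_tokens
            end = start + 1
        samples.append("\n\n".join(paragraphs[start:end]))
        if end >= len(paragraphs):
            break
        start = max(start + 1, end - overlap_paragraphs)   # slide the window forward
    return samples
```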

Have you learned anything?

Yes, I learned that if you try to train models that aren't the base model, the results are trash. I have heard rumors that merging the LoRAs works, which is why the companion LoRA for this model is published as well.
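
For reference, a hedged sketch of merging such a companion LoRA into the base model with peft; the LoRA repo path below is a hypothetical placeholder, not a confirmed name.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "alpindale/Mistral-7B-v0.2-hf", torch_dtype=torch.bfloat16
)
# hypothetical path for the companion LoRA
lora = PeftModel.from_pretrained(base, "maldv/electric-mist-7b-lora")
merged = lora.merge_and_unload()          # bake the LoRA weights into the base model
merged.save_pretrained("electric-mist-7b-merged")
```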

Will It Write?

It's good. It goes page after page. It needs an author's note to stay on track, though.

Data

90% sci-fi text data (with a lot of the pulpiest removed) and 10% of the other datasets mixed together; around 6,000 samples at 8192 context, LoRA r 64, learning rate 5e-5, 2 epochs.
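
A rough peft/transformers sketch of those hyperparameters; only r=64, the 5e-5 learning rate, 2 epochs, and 8192-token samples come from the card, while alpha, target modules, and batch sizes are assumptions.

```python
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=64,                      # stated
    lora_alpha=128,            # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="electric-mist-7b",
    learning_rate=5e-5,                 # stated (.00005)
    num_train_epochs=2,                 # stated
    per_device_train_batch_size=1,      # assumed
    gradient_accumulation_steps=8,      # assumed
    bf16=True,
)
```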

Chat Template

It was trained to follow no prompt at all, just to start going, which means the best results come from you starting the story yourself. There is explicitly no chat in the training data; everything is simply double-newline delimited (even the orca, math, etc. data).
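
In practice that means plain text completion with no chat template. A minimal generation sketch, assuming standard transformers usage and an invented story opening:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "maldv/electric-mist-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# No system prompt, no chat turns: just open the story and let it continue.
opening = (
    "The dropship shuddered as it broke through the cloud layer, "
    "and Mara got her first look at the colony.\n\n"
)
inputs = tokenizer(opening, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```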

Issues

Punctuation isn't perfect and there are some spacing issues, but I have yet to see it collapse, even after generating 40,000 tokens through a rolling 8192-token context.
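
A sketch of that kind of rolling-context long generation: keep only the most recent tokens as input each round. The chunk size and sampling settings are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "maldv/electric-mist-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

context = tokenizer("The airlock hissed open onto a dead station.\n\n",
                    return_tensors="pt")["input_ids"]
for _ in range(20):                                      # ~10k new tokens, 512 at a time
    window = context[:, -(8192 - 512):].to(model.device)  # leave room for new tokens
    out = model.generate(window, max_new_tokens=512, do_sample=True, temperature=0.8)
    context = torch.cat([context, out[:, window.shape[1]:].cpu()], dim=1)

print(tokenizer.decode(context[0], skip_special_tokens=True))
```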
