Usable context length?
#1 opened by belisarius
What's the maximum context this can take before it starts producing nonsense?
The context window for a Mixtral-style 4x7B is technically unlimited, but it operates with a 4K sliding attention window. Mixtral advertises 32K, but each expert was trained on 8K.
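You can check what the model config itself declares with a quick sketch like the one below (the repo id is a placeholder, not this model's actual name; substitute whichever merge you're loading):

```python
from transformers import AutoConfig

# Placeholder repo id -- replace with the actual 4x7B merge you are using.
config = AutoConfig.from_pretrained("your-org/your-4x7b-moe")

# Mixtral-style configs expose both the claimed context and the attention window.
print("max_position_embeddings:", config.max_position_embeddings)  # e.g. 32768
print("sliding_window:", config.sliding_window)  # e.g. 4096, or None if disabled
```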
I run it in LM Studio set to 8K with rolling context enabled; with that, plus Novelcrafter with custom prompts and its codex system, I regularly get 16K or more, but YMMV.
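If you want the same 8K setup outside LM Studio, a rough equivalent with llama-cpp-python looks like this (the GGUF filename is just an assumption, point it at your own quantized copy):

```python
from llama_cpp import Llama

# Hypothetical local GGUF path -- adjust to wherever your quantized file lives.
llm = Llama(
    model_path="./neuralstar-4x7b.Q4_K_M.gguf",
    n_ctx=8192,  # same 8K window as the LM Studio setting mentioned above
)

out = llm("Write the opening paragraph of a mystery novel.", max_tokens=256)
print(out["choices"][0]["text"])
```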