Model working with input file but not when chatting further | Mac M3 Pro 36GB

#15
by meadow - opened

Hi! Thank you for your effort here. Apologies in advance if this is a naive question.

I'm running this model with ollama in the terminal. I edited the manifest (Modelfile?) to include my own custom paragraph as the initial user message; the model responds well to it and actually loads pretty fast.
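For context, the setup is roughly equivalent to a Modelfile like this (the message text here is a placeholder for my redacted paragraph, and dolphin-custom is just an example name for the rebuilt model):

```
# Modelfile — sketch of the setup; the message text is a placeholder
FROM dolphin-mixtral:latest

# Seed every new chat with a fixed initial user message
MESSAGE user Write a comprehensive guide on how to ...
```

built with something like `ollama create dolphin-custom -f Modelfile`.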

"(base) ➜ ~ ollama run dolphin-mixtral:latest

Hello
Title: Comprehensive Guide on How to...[redacted]"

However, once that response finishes and I send another message, the model just ignores it and starts answering the initial user prompt from the Modelfile again.

Is this simply a memory/hardware-related issue, or is it worth digging into further?

Thanks in advance!

Try increasing the context size setting. ollama's default context window is fairly small (2048 tokens, if I remember right), so a long seeded prompt plus the first response can fill it entirely, and your follow-up messages end up truncated away.

I forget exactly how to do it in ollama, but I believe you can either set it temporarily with the /set command while chatting or hardcode it in the Modelfile.
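Something along these lines should work, assuming a reasonably recent ollama version (num_ctx is the context-length parameter; 8192 is just an example value):

```
# Temporarily, from inside a running chat session:
/set parameter num_ctx 8192

# Or persistently: add this line to the Modelfile,
# then rebuild the model with `ollama create`
PARAMETER num_ctx 8192
```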
