
Original model link is 404

#2 opened by klotz

https://huggingface.co/cognitivecomputations/dolphin-2.6.1-mixtral-8x7b returns a 404, and nothing matching dolphin-2.6.1* seems to be available either.

Yeah, I was about to push a change. Eric saw that performance decreased with 2.6.1, so he pulled it. I'll leave mine up for anyone who wants it, but he's working on retraining.

The leading theory is that Axolotl's transformers build doesn't properly train the MoE router, so it's "naive in backpropagation". 2.7, or whatever he ends up calling it, will train the routing properly and should have much higher performance.
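If you want to sanity-check whether router weights get gradients at all, here's a minimal sketch (assuming transformers >= 4.36 with Mixtral support; the `block_sparse_moe.gate` parameter names follow HF's Mixtral implementation, and this checks a plain forward/backward, not Axolotl's actual training path). It builds a tiny random Mixtral from config so nothing needs downloading:

```python
import torch
from transformers import MixtralConfig, MixtralForCausalLM

# Tiny random Mixtral so the check runs on CPU in seconds.
config = MixtralConfig(
    vocab_size=128,
    hidden_size=64,
    intermediate_size=128,
    num_hidden_layers=2,
    num_attention_heads=4,
    num_key_value_heads=2,
    num_local_experts=4,
    num_experts_per_tok=2,
)
model = MixtralForCausalLM(config)

# One dummy training step: forward with labels yields a causal-LM loss.
input_ids = torch.randint(0, config.vocab_size, (1, 16))
loss = model(input_ids=input_ids, labels=input_ids).loss
loss.backward()

# The router in HF's Mixtral lives at layers.{i}.block_sparse_moe.gate;
# a None or zero grad here would mean the router isn't being trained.
for name, param in model.named_parameters():
    if ".block_sparse_moe.gate." in name:
        grad = param.grad
        print(name,
              "requires_grad:", param.requires_grad,
              "grad norm:", None if grad is None else grad.norm().item())
```

On a healthy build every gate weight should report requires_grad=True and a nonzero grad norm after backward.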

Thank you for the status and the great repo!
