Is this an "update" on TheBloke/Yi-23B-Llama-GGUF?

#1
by mclassHF2023 - opened

Yesterday I successfully used TheBloke/Yi-23B-Llama-GGUF and now I don't see it anymore.

No, this is not that model. This model is Yi 34B, converted to Llama format to make it more compatible.

Yi-23B was a special experimental model, removing layers from Yi 34B Llama to make it smaller. Unfortunately, it did not work.

I pulled Yi-23B, Yi-15B and Yi-8B because they are currently unusable, outputting only gibberish.

See here for more info: https://huggingface.co/ByteWave/Yi-8B-Llama/discussions/1#655b2867a296e5c74c6f9c0c

If/when the models are re-done or fixed, I will do them again.

Wait, you say you used it successfully? You got usable output from it? I couldn't get anything good from it, and others report the same.

What prompt(s) did you use successfully?

Unfortunately, I deleted the file on my RunPod instance... do you have a link where I could download it to try again? I admit I played around a LOT yesterday with different settings, so I definitely have to confirm it.

I know I basically used presets/StarChat.yaml for Parameters->Generation.
For the prompt, I think I used something like this (again, I can't test it right now):

user: '### Instruction:'
bot: '### Response:'
turn_template: <|user|>\n<|user-message|>\n\n<|bot|>\n<|bot-message|>\n\n
context: |+
  <|system-message|>

system_message: It is 2023-11-20 at around 12:33am EST. You are an artificial software developer assistant. Answer all questions with accurate and fully implemented source code. Explain your answers in detail

(Obviously ignore the time... ;) )
EDIT: I tried the "official" version and got a weird tokenizer error, so I must have confused this model with some other one that I got working.
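
For anyone who wants to reproduce the prompt outside the web UI, here is a minimal sketch of how the template above would expand into a single-turn prompt. It assumes plain placeholder substitution, which is only my reading of the template fields; `render_prompt` and the `TEMPLATE` dict are hypothetical names for illustration, not the web UI's actual code.

```python
# Sketch only: assumes the template fields are filled in by simple
# string replacement. Names here are hypothetical, not a real API.
TEMPLATE = {
    "user": "### Instruction:",
    "bot": "### Response:",
    "turn_template": "<|user|>\n<|user-message|>\n\n<|bot|>\n<|bot-message|>\n\n",
    # YAML "|+" keeps the trailing blank line, hence the two newlines.
    "context": "<|system-message|>\n\n",
}


def render_prompt(template, system_message, user_message):
    """Build a single-turn prompt that ends right after the bot label."""
    context = template["context"].replace("<|system-message|>", system_message)
    # Keep everything before the bot's reply so the model generates it.
    turn = template["turn_template"].split("<|bot-message|>")[0]
    turn = turn.replace("<|user|>", template["user"])
    turn = turn.replace("<|bot|>", template["bot"])
    # Substitute the user text last so it is never re-scanned for labels.
    turn = turn.replace("<|user-message|>", user_message)
    return context + turn


if __name__ == "__main__":
    print(render_prompt(
        TEMPLATE,
        "You are an artificial software developer assistant.",
        "Write a function that reverses a string.",
    ))
```

The resulting text is the system message, a blank line, the `### Instruction:` block with the question, and a trailing `### Response:` line for the model to complete.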
