Llama 3
What are the censorship levels of Llama 3 70B?
I usually use oobabooga to test models, and when I tried Llama 3 it gave pretty meh results and sometimes just repeated my questions back to me. It doesn't seem like llama-cpp-python has Llama 3 support yet, and I'm not sure whether adding it would fix the bad responses or just tell the model what the end token is. So unless there's a simple fix to get Llama 3 working correctly, I don't really have time to figure out what's wrong right now because I'm pretty busy :/
QuantFactory's quants have the correct EOS token and work with llama-cpp-python:
https://huggingface.co/QuantFactory/Meta-Llama-3-70B-Instruct-GGUF
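For what it's worth, here's a minimal sketch of loading one of those quants with llama-cpp-python, assuming a build recent enough to register the "llama-3" chat format (the file name, n_ctx, and n_gpu_layers below are placeholders for your setup):

```python
# Minimal sketch: run a Llama 3 Instruct GGUF via llama-cpp-python.
# Assumes a recent llama-cpp-python that knows the "llama-3" chat format.
from llama_cpp import Llama

llm = Llama(
    model_path="Meta-Llama-3-70B-Instruct.Q4_K_M.gguf",  # example filename
    n_ctx=8192,            # Llama 3's native context window
    n_gpu_layers=-1,       # offload all layers; adjust for your VRAM
    chat_format="llama-3", # builds the correct prompt template for you
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```

The chat format handler assembles the <|begin_of_text|>/<|start_header_id|> prompt itself and should stop on <|eot_id|>, so turns end cleanly even if a quant's EOS metadata is off.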
Even using today's 4/28 oobabooga update and trying every GGUF released, I still get the same crap responses. Apparently the tokenizer is still being fixed:
https://github.com/ggerganov/llama.cpp/pull/6920
though I don't know why I seem to be the only one getting bad responses.
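If you want to check whether a given quant carries the right token metadata, here's a rough sketch using the gguf package from the llama.cpp repo (pip install gguf; the file name is an example). Llama 3 Instruct should report an EOS id of 128009 (<|eot_id|>); the early broken conversions reported 128001 (<|end_of_text|>), which is one reason generations never stopped cleanly:

```python
# Rough sketch: inspect a GGUF's tokenizer metadata with the gguf package.
from gguf import GGUFReader

reader = GGUFReader("Meta-Llama-3-70B-Instruct.Q4_K_M.gguf")  # example path

for key in ("tokenizer.ggml.eos_token_id", "tokenizer.ggml.bos_token_id"):
    field = reader.fields.get(key)
    if field is None:
        print(f"{key}: not present")
        continue
    # scalar fields keep their value in the part indexed by data[-1]
    print(f"{key} = {field.parts[field.data[-1]][0]}")
```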
In my experience it's pretty heavily censored. I ran it with exllamav2 (EXL2 quant).
Apparently you can jailbreak it, but with so many great models that I don't have to jailbreak, it's not worth the tokens. Especially with only 8k context; anything less than 32k now feels like going backwards.