My dude! You forgot to put the "rope_theta" setting in the config.json. With it added, your model works past 900 tokens now. πŸ˜‹

Information: I ran diff on 4 files (special_tokens_map.json, tokenizer_config.json, tokenizer.json, and config.json), comparing Meta's LLaMa-3.1-8B-Instruct to yours. Your config.json was missing: "rope_theta": 500000.0,

Here's the diff output:

32d31
<   "rope_theta": 500000.0,
mlabonne (Owner)

Haha, thanks! I blame it on Meta; they kept changing this config! :)

mlabonne changed pull request status to merged

I like this model, bro! Thank you for abliterating it. I hope you make an updated version that is not only equal to Meta's LLaMa-3.1-8B-Instruct but better. πŸ˜‹
