What's up with these versions?
Are they a continuation of the previous model, fine-tuned on the same data but for longer (more epochs)?
Yes, the Llama-3-8B-Instruct series will be based on the previous DPO fine-tuned models in order to improve them. The Leaderboard failed to calculate GSM8K for some of them, so I am not sure where exactly they stand in terms of scores.
I don't trust the leaderboard anymore. And thanks for the quick response.
Me too. By now I have learned to run a series of vibe tests: a long-text input test, long-text output generation, and a couple of other questions that were problematic before. Some 7B and some 72B models with very high scores just don't work properly, so I developed my own vibe tests to run before going any further with any model.
Great work!
Hi @MaziyarPanahi, thanks for your great work. Is this also 32k context? And has the GGUF version already been fixed for the tokenizer bug?
Hi, you are very welcome. The model has the native 8K context; however, you can easily change the RoPE theta and extend it to 16k or 32k with minimal loss in accuracy.
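To illustrate why changing the RoPE theta helps with longer context, here is a rough sketch of the standard RoPE frequency formula. It only shows how a larger theta stretches the rotation wavelengths; it is not a recipe for this particular model. The head dimension of 128 and the base theta of 500000 are assumptions about the Llama-3-8B config, and the 2x and 4x multipliers are hypothetical values picked for illustration, not tested settings.

```python
import math


def rope_inv_freq(theta: float, head_dim: int) -> list[float]:
    """Standard RoPE inverse frequencies: theta^(-2i/d) for i = 0 .. d/2 - 1."""
    return [theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


def longest_wavelength(theta: float, head_dim: int) -> float:
    """Wavelength, in tokens, of the slowest-rotating RoPE dimension pair."""
    return 2 * math.pi / rope_inv_freq(theta, head_dim)[-1]


HEAD_DIM = 128          # assumption: Llama-3-8B head dimension
BASE_THETA = 500_000.0  # assumption: Llama-3 default rope_theta

if __name__ == "__main__":
    # Hypothetical multipliers, only to show the trend.
    for factor in (1, 2, 4):
        theta = BASE_THETA * factor
        wavelength = longest_wavelength(theta, HEAD_DIM)
        print(f"theta={theta:>11.0f}  longest wavelength ~ {wavelength:,.0f} tokens")
```

In a Hugging Face style `config.json`, the corresponding fields would be `rope_theta` and `max_position_embeddings`. Whatever values you pick, it is worth rerunning your own long-text tests afterwards, since the accuracy loss depends on the model and the target length.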
The GGUF models use the latest Llama.cpp; however, if you notice anything, please let me know. The model is small, so I can fix it quickly.