Updated inference settings
README.md CHANGED

@@ -113,6 +113,7 @@ model-index:
       https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=T145/ZEUS-8B-V2
     name: Open LLM Leaderboard
 ---
+
 # ZEUS 8B 🌩️ V2
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

@@ -177,14 +178,15 @@ Based on the listed rankings as of 4/12/24, is the top-rank 8B model.
 
 # Inference Settings
 
-Personal recommendations are to use
+Personal recommendations are to use a [i1-Q4_K_M](https://www.reddit.com/r/LocalLLaMA/comments/1ck76rk/weightedimatrix_vs_static_quants/) quant with these settings:
 
 ```
 num_ctx = 4096
 repeat_penalty = 1.2
 temperature = 0.85
+tfs_z = 1.4
 top_k = 0 # Change to 40+ if you're roleplaying
-top_p = 1
+top_p = 1 # Change to 0.9 if top_k > 0
 ```
 
 Other recommendations can be found on [this paper on mobile LLMs](https://openreview.net/pdf?id=ahVsd1hy2W), [this paper on balancing model parameters](https://arxiv.org/html/2408.13586v1), and [this Reddit post about tweaking Llama 3.1 parameters](https://www.reddit.com/r/LocalLLaMA/comments/1ej1zrl/try_these_settings_for_llama_31_for_longer_or/).
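
The parameter names in the diff (`num_ctx`, `repeat_penalty`, `tfs_z`, `top_k`, `top_p`) match Ollama's sampler options, so one way to apply the recommended settings is an Ollama Modelfile. A minimal sketch, assuming you have already downloaded an i1-Q4_K_M GGUF of the model (the filename below is hypothetical):

```
# Modelfile — build with: ollama create zeus-8b-v2 -f Modelfile
FROM ./ZEUS-8B-V2.i1-Q4_K_M.gguf  # hypothetical path to your downloaded quant

# Settings from the updated model card
PARAMETER num_ctx 4096
PARAMETER repeat_penalty 1.2
PARAMETER temperature 0.85
PARAMETER tfs_z 1.4
PARAMETER top_k 0
PARAMETER top_p 1
```

After `ollama create`, run it with `ollama run zeus-8b-v2`; for roleplay, the card suggests raising `top_k` to 40+ and dropping `top_p` to 0.9.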
|