T145 committed
Commit 0ba2b94
Parent: c39427d

Updated inference settings

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -113,6 +113,7 @@ model-index:
 https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=T145/ZEUS-8B-V2
 name: Open LLM Leaderboard
 ---
+
 # ZEUS 8B 🌩️ V2
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
@@ -177,14 +178,15 @@ Based on the listed rankings as of 4/12/24, is the top-rank 8B model.
 
 # Inference Settings
 
-Personal recommendations are to use an [i1-Q4_K_M](https://www.reddit.com/r/LocalLLaMA/comments/1ck76rk/weightedimatrix_vs_static_quants/) quant with these settings:
+Personal recommendations are to use a [i1-Q4_K_M](https://www.reddit.com/r/LocalLLaMA/comments/1ck76rk/weightedimatrix_vs_static_quants/) quant with these settings:
 
 ```
 num_ctx = 4096
 repeat_penalty = 1.2
 temperature = 0.85
+tfs_z = 1.4
 top_k = 0 # Change to 40+ if you're roleplaying
-top_p = 1
+top_p = 1 # Change to 0.9 if top_k > 0
 ```
 
 Other recommendations can be found on [this paper on mobile LLMs](https://openreview.net/pdf?id=ahVsd1hy2W), [this paper on balancing model parameters](https://arxiv.org/html/2408.13586v1), and [this Reddit post about tweaking Llama 3.1 parameters](https://www.reddit.com/r/LocalLLaMA/comments/1ej1zrl/try_these_settings_for_llama_31_for_longer_or/).
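
The parameter names in the committed block match Ollama's sampler options, so one plausible way to apply them is via a Modelfile. Below is a minimal sketch under that assumption; the GGUF filename and the model tag used later are placeholders, not files shipped with this repo.

```
# Minimal Ollama Modelfile sketch applying the settings from this commit.
# The GGUF path is a placeholder for whichever i1-Q4_K_M quant you downloaded.
FROM ./ZEUS-8B-V2.i1-Q4_K_M.gguf

PARAMETER num_ctx 4096
PARAMETER repeat_penalty 1.2
PARAMETER temperature 0.85
PARAMETER tfs_z 1.4
# top_k 0 disables top-k; per the README, raise it to 40+ (with top_p 0.9) for roleplay
PARAMETER top_k 0
PARAMETER top_p 1
```

Building it with `ollama create zeus-8b-v2 -f Modelfile` and then running `ollama run zeus-8b-v2` would make these settings the model's defaults.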