Updated inference settings
README.md CHANGED

@@ -113,6 +113,7 @@ model-index:
       https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=T145/ZEUS-8B-V2
     name: Open LLM Leaderboard
 ---
+
 # ZEUS 8B 🌩️ V2
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

@@ -177,14 +178,15 @@ Based on the listed rankings as of 4/12/24, is the top-rank 8B model.
 
 # Inference Settings
 
-Personal recommendations are to use
+Personal recommendations are to use a [i1-Q4_K_M](https://www.reddit.com/r/LocalLLaMA/comments/1ck76rk/weightedimatrix_vs_static_quants/) quant with these settings:
 
 ```
 num_ctx = 4096
 repeat_penalty = 1.2
 temperature = 0.85
+tfs_z = 1.4
 top_k = 0 # Change to 40+ if you're roleplaying
-top_p = 1
+top_p = 1 # Change to 0.9 if top_k > 0
 ```
 
 Other recommendations can be found on [this paper on mobile LLMs](https://openreview.net/pdf?id=ahVsd1hy2W), [this paper on balancing model parameters](https://arxiv.org/html/2408.13586v1), and [this Reddit post about tweaking Llama 3.1 parameters](https://www.reddit.com/r/LocalLLaMA/comments/1ej1zrl/try_these_settings_for_llama_31_for_longer_or/).
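
The parameter names in the diff (`num_ctx`, `repeat_penalty`, `tfs_z`, `top_k`, `top_p`) match Ollama's sampler options, so one way to apply the recommended settings is an Ollama Modelfile. A minimal sketch, assuming you have already downloaded an i1-Q4_K_M GGUF of the model (the filename below is hypothetical):

```
# Modelfile — build with: ollama create zeus-8b-v2 -f Modelfile
FROM ./ZEUS-8B-V2.i1-Q4_K_M.gguf  # hypothetical path to your downloaded quant

# Settings from the updated model card
PARAMETER num_ctx 4096
PARAMETER repeat_penalty 1.2
PARAMETER temperature 0.85
PARAMETER tfs_z 1.4
PARAMETER top_k 0
PARAMETER top_p 1
```

After `ollama create`, run it with `ollama run zeus-8b-v2`; for roleplay, the card suggests raising `top_k` to 40+ and dropping `top_p` to 0.9.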
|