flyingfishinwater committed on
Commit 5e59e60 · verified · 1 Parent(s): 7f861ca

Update README.md

Files changed (1)
  1. README.md +91 -20
README.md CHANGED
@@ -1,9 +1,54 @@
# Reader-LM 1.5B

Jina Reader-LM is a model that converts HTML content to Markdown content, which is useful for content conversion tasks. The model is trained on a curated collection of HTML content and its corresponding Markdown content.

**Model Intention:** Jina Reader-LM is used to convert HTML content to Markdown content, which is useful for content conversion tasks

- **Model URL:** [https://huggingface.co/flyingfishinwater/good_and_small_models/resolve/main/reader-lm-1.5b-Q4_K_M.gguf?download=true](https://huggingface.co/flyingfishinwater/good_and_small_models/resolve/main/reader-lm-1.5b-Q4_K_M.gguf?download=true)

**Model Info URL:** [https://huggingface.co/jinaai/reader-lm-1.5b](https://huggingface.co/jinaai/reader-lm-1.5b)

@@ -20,6 +65,7 @@ Jina Reader-LM is a model that convert HTML content to Markdown content, which i
**Context Length:** 8192 tokens

**Prompt Format:**
```
<|im_start|>system
{{system}}<|im_end|>
@@ -27,7 +73,7 @@ Jina Reader-LM is a model that convert HTML content to Markdown content, which i
{{prompt}}<|im_end|>
<|im_start|>assistant

- ```

**Template Name:** chatml
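
A minimal usage sketch for this template, assuming the llama-cpp-python package and the reader-lm-1.5b-Q4_K_M.gguf file from the Model URL above; any GGUF runtime that accepts raw prompts should work the same way:

```python
# Minimal sketch, assuming llama-cpp-python and a locally downloaded
# reader-lm-1.5b-Q4_K_M.gguf. The prompt string follows the chatml
# template shown above.
from llama_cpp import Llama

llm = Llama(model_path="reader-lm-1.5b-Q4_K_M.gguf", n_ctx=8192)

html = "<h1>Hello</h1><p>Convert <b>this</b> page to Markdown.</p>"
prompt = (
    "<|im_start|>system\n"
    "Convert the HTML to Markdown.<|im_end|>\n"  # fills {{system}}
    "<|im_start|>user\n"
    f"{html}<|im_end|>\n"                        # fills {{prompt}}
    "<|im_start|>assistant\n"
)

result = llm(prompt, max_tokens=1024, stop=["<|im_end|>"])
print(result["choices"][0]["text"])  # the Markdown rendition
```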
 
@@ -41,6 +87,7 @@ Jina Reader-LM is a model that convert HTML content to Markdown content, which i
---

# WhiteRabbitNeo V2 (Llama3.1)
It identifies cybersecurity risks such as open ports, outdated software, default credentials, misconfigurations, injection flaws, unencrypted services, known vulnerabilities, CSRF, insecure object references, broken authentication, sensitive data exposure, API vulnerabilities, DoS risks, and buffer overflows, enabling threat detection and mitigation.

**Model Intention:** It is an 8B model that can be used for defensive cybersecurity.
@@ -62,13 +109,14 @@ It identifies cybersecurity risks such as open ports, outdated software, default
**Context Length:** 8192 tokens

**Prompt Format:**
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{system}}<|eot_id|><|start_header_id|>user<|end_header_id|>


- ```

**Template Name:** chatml

@@ -82,6 +130,7 @@ It identifies cybersecurity risks such as open ports, outdated software, default
---

# Dolphin 2.9.4 Gemma2 2B
Dolphin-2.9.4 has a variety of instruction following, conversational, and coding skills. It also has agentic abilities and supports function calling. It is especially trained to obey the system prompt and follow instructions in many languages. Dolphin is uncensored. We have filtered the dataset to remove alignment and bias. This makes the model more compliant.

**Model Intention:** It has a variety of instruction following, conversational, and coding skills. It also has agentic abilities and supports function calling.
@@ -103,6 +152,7 @@ Dolphin-2.9.4 has a variety of instruction following, conversational, and coding
**Context Length:** 4096 tokens

**Prompt Format:**
```
<|im_start|>system
{{system}}<|im_end|>
@@ -110,7 +160,7 @@ Dolphin-2.9.4 has a variety of instruction following, conversational, and coding
{{prompt}}<|im_end|>
<|im_start|>assistant

- ```

**Template Name:** chatml

@@ -124,6 +174,7 @@ Dolphin-2.9.4 has a variety of instruction following, conversational, and coding
---

# Financial GPT
FinGPT is deeply committed to fostering an open-source ecosystem dedicated to Financial Large Language Models (FinLLMs). FinGPT envisions democratizing access to both financial data and FinLLMs. It stands as an emblem of untapped potential within open finance, aspiring to be a significant catalyst stimulating innovation and refinement within the financial domain. Note: Nothing herein is financial advice, and it is NOT a recommendation to trade real money.

**Model Intention:** It's a professional stock market analyst. It can provide an analysis and prediction for a company's stock price movement for the upcoming weeks.
@@ -143,12 +194,13 @@ FinGPT is deeply committed to fostering an open-source ecosystem dedicated to Fi
**Context Length:** 4096 tokens

**Prompt Format:**
```
[INST]<<SYS>>
{{system}}<</SYS>>

Let's first analyze the positive developments and potential concerns for {{prompt}}. Come up with 2-4 most important factors respectively and keep them concise. Most factors should be inferred from company related news. Then make your prediction of the {{prompt}} stock price movement for next week. Provide a summary analysis to support your prediction.[/INST]
- ```

**Template Name:** llama
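
As an illustration of the llama-style template above, a hedged sketch that fills in the placeholders and sends the prompt to a GGUF runtime; the system text, company name, and file name are assumptions:

```python
# Minimal sketch: fill {{system}} and {{prompt}} in the [INST]<<SYS>> template
# above, then run it with llama-cpp-python (the GGUF file name is hypothetical).
from llama_cpp import Llama

system = "You are a seasoned stock market analyst."  # stands in for {{system}}
company = "ACME Corp"                                 # stands in for {{prompt}}

prompt = (
    "[INST]<<SYS>>\n"
    f"{system}<</SYS>>\n"
    "\n"
    f"Let's first analyze the positive developments and potential concerns for {company}. "
    "Come up with 2-4 most important factors respectively and keep them concise. "
    "Most factors should be inferred from company related news. "
    f"Then make your prediction of the {company} stock price movement for next week. "
    "Provide a summary analysis to support your prediction.[/INST]"
)

llm = Llama(model_path="fingpt-forecaster.Q4_K_M.gguf", n_ctx=4096)
print(llm(prompt, max_tokens=512)["choices"][0]["text"])
```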
 
@@ -162,6 +214,7 @@ Let's first analyze the positive developments and potential concerns for {{promp
---

# Llama3.2 3B
The Meta Llama 3.1 models are pretrained and instruction-tuned generative models in an 8B size (text in/text out). They are optimized for multilingual dialogue use cases (English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai) and outperform closed chat models on common benchmarks.

**Model Intention:** The latest Llama 3.2 is optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks
@@ -183,6 +236,7 @@ The Meta Llama 3.1 is pretrained and instruction tuned generative models in 8B s
**Context Length:** 8192 tokens

**Prompt Format:**
```
<|begin_of_text|><|start_header_id|>user<|end_header_id|>

@@ -190,7 +244,7 @@ The Meta Llama 3.1 is pretrained and instruction tuned generative models in 8B s

assistant

- ```

**Template Name:** llama3.2

@@ -204,6 +258,7 @@ assistant
---

# Mistral 7B v0.3
The Mistral 7B v0.3 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. It has an extended vocabulary of 32768 tokens and supports function calling.

**Model Intention:** It's a 7B model for Q&A purposes, but it requires a high-end device to run.
@@ -223,9 +278,10 @@ The Mistral 7B v0.3 Large is a pretrained generative text model with 7 billion p
**Context Length:** 8192 tokens

**Prompt Format:**
```
<s>[INST]{{prompt}}[/INST]</s>
- ```

**Template Name:** Mistral

@@ -239,6 +295,7 @@ The Mistral 7B v0.3 Large is a pretrained generative text model with 7 billion p
---

# OpenChat 3.6 (0522)
OpenChat is an innovative library of open-source language models, fine-tuned with C-RLFT, a strategy inspired by offline reinforcement learning. Our models learn from mixed-quality data without preference labels, delivering exceptional performance on par with ChatGPT, even with a 7B model. Despite our simple approach, we are committed to developing a high-performance, commercially viable, open-source large language model, and we continue to make significant strides toward this vision.

**Model Intention:** The Llama-3-based version, OpenChat 3.6 (20240522), outperforms the official Llama 3 8B Instruct.
@@ -258,10 +315,11 @@ OpenChat is an innovative library of open-source language models, fine-tuned wit
**Context Length:** 8192 tokens

**Prompt Format:**
```
{{system}}
GPT4 Correct User: {{prompt}}<|end_of_turn|>GPT4 Correct Assistant:
- ```

**Template Name:** Mistral

@@ -275,6 +333,7 @@ GPT4 Correct User: {{prompt}}<|end_of_turn|>GPT4 Correct Assistant:
---

# Phi-3 Vision
The Phi-3 4K-Instruct is a 3.8B-parameter, lightweight, state-of-the-art open model. It is optimized for instruction following and safety. It is good at common sense, language understanding, math, code, long context, and logical reasoning; Phi-3 Mini-4K-Instruct showcased robust, state-of-the-art performance among models with fewer than 13 billion parameters.

**Model Intention:** It's a Microsoft Phi-3 model with visual support. It can understand images as well as text
@@ -294,11 +353,12 @@ The Phi-3 4K-Instruct is a 3.8B parameters, lightweight, state-of-the-art open m
**Context Length:** 4096 tokens

**Prompt Format:**
```
<|user|>
{{prompt}} <|end|>
<|assistant|>
- ```

**Template Name:** PHI3

@@ -312,6 +372,7 @@ The Phi-3 4K-Instruct is a 3.8B parameters, lightweight, state-of-the-art open m
---

# Yi 1.5 6B Chat
Yi-1.5 is an upgraded version that delivers stronger performance in coding, math, reasoning, and instruction-following capability, while still maintaining excellent capabilities in language understanding, commonsense reasoning, and reading comprehension. It is continuously pre-trained on Yi with a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples. The Yi series models are the next generation of open-source large language models trained from scratch by 01.AI. The Yi series models are among the strongest LLMs worldwide, showing promise in language understanding, commonsense reasoning, reading comprehension, and more.

**Model Intention:** It's a 6B model and can understand English and Chinese. It's good for coding, math, reasoning, and language understanding
@@ -335,13 +396,14 @@ Yi-1.5 is an upgraded version which delivers stronger performance in coding, mat
**Context Length:** 4096 tokens

**Prompt Format:**
```
<|im_start|>user
{{prompt}}<|im_end|>
<|im_start|>assistant

- ```

**Template Name:** yi

@@ -355,6 +417,7 @@ Yi-1.5 is an upgraded version which delivers stronger performance in coding, mat
---

# Google Gemma 2B
Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is named after the Latin gemma, meaning 'precious stone.' The Gemma model weights are supported by developer tools that promote innovation, collaboration, and the responsible use of artificial intelligence (AI).

**Model Intention:** It's a 2B model for Q&A purposes, but it requires a high-end device to run.
@@ -374,12 +437,13 @@ Gemma is a family of lightweight, state-of-the-art open models built from the sa
**Context Length:** 8192 tokens

**Prompt Format:**
```
<bos><start_of_turn>user
{{prompt}}<end_of_turn>
<start_of_turn>model

- ```

**Template Name:** gemma

@@ -393,6 +457,7 @@ Gemma is a family of lightweight, state-of-the-art open models built from the sa
---

# StarCoder2 3B
StarCoder2-3B is a 3B parameter model trained on 17 programming languages from The Stack v2, with opt-out requests excluded. The model uses Grouped Query Attention, a context window of 16,384 tokens with a sliding window attention of 4,096 tokens, and was trained using the Fill-in-the-Middle objective on 3+ trillion tokens.

**Model Intention:** The model is good at 17 programming languages. Just start writing your code, and the model will finish it.
@@ -412,10 +477,11 @@ StarCoder2-3B model is a 3B parameter model trained on 17 programming languages
**Context Length:** 16384 tokens

**Prompt Format:**
```
{{prompt}}

- ```

**Template Name:** starcoder
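
Because the template is just the raw code itself, completion is a single call; a minimal sketch, assuming llama-cpp-python and a local StarCoder2-3B GGUF (the file name is an assumption):

```python
# Minimal sketch: StarCoder2 simply continues the code it is given.
from llama_cpp import Llama

llm = Llama(model_path="starcoder2-3b-Q4_K_M.gguf", n_ctx=16384)

snippet = 'def fibonacci(n: int) -> int:\n    """Return the n-th Fibonacci number."""\n'
completion = llm(snippet, max_tokens=128, temperature=0.2)
print(snippet + completion["choices"][0]["text"])
```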
 
@@ -429,6 +495,7 @@ StarCoder2-3B model is a 3B parameter model trained on 17 programming languages
---

# Qwen2.5 7B Chat
Qwen is the large language model and large multimodal model series of the Qwen Team, Alibaba Group. It supports both Chinese and English. Tongyi Qianwen (通义千问) is a large language model developed by Alibaba, supporting both Chinese and English.

**Model Intention:** Qwen2.5 is the latest model series and is good at multilingual tasks, coding, mathematics, reasoning, etc.
@@ -448,6 +515,7 @@ Qwen is the large language model and large multimodal model series of the Qwen T
**Context Length:** 4096 tokens

**Prompt Format:**
```
<|im_start|>system
{{system}}<|im_end|>
@@ -455,7 +523,7 @@ Qwen is the large language model and large multimodal model series of the Qwen T
{{prompt}}<|im_end|>
<|im_start|>assistant

- ```

**Template Name:** chatml

@@ -469,6 +537,7 @@ Qwen is the large language model and large multimodal model series of the Qwen T
---

# Qwen2 1.5B Chat
Qwen is the large language model and large multimodal model series of the Qwen Team, Alibaba Group. It supports both Chinese and English. Tongyi Qianwen (通义千问) is a large language model developed by Alibaba, supporting both Chinese and English.

**Model Intention:** Qwen2.5 is the latest model series and is good at multilingual tasks, coding, mathematics, reasoning, etc.
@@ -488,6 +557,7 @@ Qwen is the large language model and large multimodal model series of the Qwen T
**Context Length:** 2048 tokens

**Prompt Format:**
```
<|im_start|>system
{{system}}<|im_end|>
@@ -495,7 +565,7 @@ Qwen is the large language model and large multimodal model series of the Qwen T
{{prompt}}<|im_end|>
<|im_start|>assistant

- ```

**Template Name:** chatml

@@ -509,6 +579,7 @@ Qwen is the large language model and large multimodal model series of the Qwen T
---

# Qwen2 3B Chat
Qwen is the large language model and large multimodal model series of the Qwen Team, Alibaba Group. It supports both Chinese and English. Tongyi Qianwen (通义千问) is a large language model developed by Alibaba, supporting both Chinese and English.

**Model Intention:** Qwen2.5 is the latest model series and is good at multilingual tasks, coding, mathematics, reasoning, etc.
@@ -528,6 +599,7 @@ Qwen is the large language model and large multimodal model series of the Qwen T
**Context Length:** 2048 tokens

**Prompt Format:**
```
<|im_start|>system
{{system}}<|im_end|>
@@ -535,7 +607,7 @@ Qwen is the large language model and large multimodal model series of the Qwen T
{{prompt}}<|im_end|>
<|im_start|>assistant

- ```

**Template Name:** chatml

@@ -549,6 +621,7 @@ Qwen is the large language model and large multimodal model series of the Qwen T
---

# Dolphin 2.9.2 Qwen2 7B
This model is based on Mistral-7b-v0.2 with a 16k context length. It's an uncensored model and supports a variety of instruction, conversational, and coding skills.

**Model Intention:** It's an uncensored and skilled English model, best for high-performance iPhone, iPad & Mac
@@ -568,6 +641,7 @@ This model is based on Mistral-7b-v0.2 with 16k context lengths. It's a uncensor
**Context Length:** 2048 tokens

**Prompt Format:**
```
<|im_start|>system
{{system}}<|im_end|>
@@ -575,7 +649,7 @@ This model is based on Mistral-7b-v0.2 with 16k context lengths. It's a uncensor
{{prompt}}<|im_end|>
<|im_start|>assistant

- ```

**Template Name:** chatml

@@ -583,7 +657,4 @@ This model is based on Mistral-7b-v0.2 with 16k context lengths. It's a uncensor

**Add EOS Token:** No

- **Parse Special Tokens:** Yes
-
-
- ---

+ # SmolLM2 1.7B
+
+ SmolLM2 was trained on 11 trillion tokens and demonstrates significant advances over other small models, particularly in instruction following, knowledge, reasoning, and mathematics.
+
+ **Model Intention:** SmolLM2 is capable of solving a wide range of tasks while being lightweight enough to run on-device
+
+ **Model URL:** [https://huggingface.co/flyingfishinwater/good_and_small_models/resolve/main/smollm2-1.7b-instruct-q4_k_m.gguf?download=true](https://huggingface.co/flyingfishinwater/good_and_small_models/resolve/main/smollm2-1.7b-instruct-q4_k_m.gguf?download=true)
+
+ **Model Info URL:** [https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct)
+
+ **Model License:** [License Info](https://choosealicense.com/licenses/apache-2.0/)
+
+ **Model Description:** SmolLM2 was trained on 11 trillion tokens and demonstrates significant advances over other small models, particularly in instruction following, knowledge, reasoning, and mathematics.
+
+ **Developer:** [https://huggingface.co/HuggingFaceTB](https://huggingface.co/HuggingFaceTB)
+
+ **Update Date:** 2024-11-02
+
+ **File Size:** 1060 MB
+
+ **Context Length:** 8192 tokens
+
+ **Prompt Format:**
+
+ ```
+ <|im_start|>system
+ {{system}}<|im_end|>
+ <|im_start|>user
+ {{prompt}}<|im_end|>
+ <|im_start|>assistant
+
+ ```
+
+ **Template Name:** chatml
+
+ **Add BOS Token:** Yes
+
+ **Add EOS Token:** No
+
+ **Parse Special Tokens:** Yes
+
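
A minimal usage sketch for this entry, assuming llama-cpp-python and the smollm2-1.7b-instruct-q4_k_m.gguf file from the Model URL above; create_chat_completion formats the messages with the model's built-in chat template, which matches the chatml format shown here:

```python
# Minimal sketch, assuming llama-cpp-python and a locally downloaded
# smollm2-1.7b-instruct-q4_k_m.gguf (see the Model URL above).
from llama_cpp import Llama

llm = Llama(model_path="smollm2-1.7b-instruct-q4_k_m.gguf", n_ctx=8192)

# create_chat_completion applies the GGUF's chat template to the messages.
resp = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what a GGUF file is in two sentences."},
    ],
    max_tokens=200,
)
print(resp["choices"][0]["message"]["content"])
```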
+ ---
+

# Reader-LM 1.5B
+
Jina Reader-LM is a model that converts HTML content to Markdown content, which is useful for content conversion tasks. The model is trained on a curated collection of HTML content and its corresponding Markdown content.

**Model Intention:** Jina Reader-LM is used to convert HTML content to Markdown content, which is useful for content conversion tasks

+ **Model URL:** [https://huggingface.co/flyingfishinwater/good_and_small_models/resolve/main/smollm2-1.7b-instruct-q4_k_m.gguf.gguf?download=true](https://huggingface.co/flyingfishinwater/good_and_small_models/resolve/main/smollm2-1.7b-instruct-q4_k_m.gguf.gguf?download=true)

**Model Info URL:** [https://huggingface.co/jinaai/reader-lm-1.5b](https://huggingface.co/jinaai/reader-lm-1.5b)

**Context Length:** 8192 tokens

**Prompt Format:**
+
```
<|im_start|>system
{{system}}<|im_end|>

{{prompt}}<|im_end|>
<|im_start|>assistant

+ ```

**Template Name:** chatml

---

# WhiteRabbitNeo V2 (Llama3.1)
+
It identifies cybersecurity risks such as open ports, outdated software, default credentials, misconfigurations, injection flaws, unencrypted services, known vulnerabilities, CSRF, insecure object references, broken authentication, sensitive data exposure, API vulnerabilities, DoS risks, and buffer overflows, enabling threat detection and mitigation.

**Model Intention:** It is an 8B model that can be used for defensive cybersecurity.

**Context Length:** 8192 tokens

**Prompt Format:**
+
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{system}}<|eot_id|><|start_header_id|>user<|end_header_id|>


+ ```

**Template Name:** chatml

---

# Dolphin 2.9.4 Gemma2 2B
+
Dolphin-2.9.4 has a variety of instruction following, conversational, and coding skills. It also has agentic abilities and supports function calling. It is especially trained to obey the system prompt and follow instructions in many languages. Dolphin is uncensored. We have filtered the dataset to remove alignment and bias. This makes the model more compliant.

**Model Intention:** It has a variety of instruction following, conversational, and coding skills. It also has agentic abilities and supports function calling.

**Context Length:** 4096 tokens

**Prompt Format:**
+
```
<|im_start|>system
{{system}}<|im_end|>

{{prompt}}<|im_end|>
<|im_start|>assistant

+ ```

**Template Name:** chatml

---

# Financial GPT
+
FinGPT is deeply committed to fostering an open-source ecosystem dedicated to Financial Large Language Models (FinLLMs). FinGPT envisions democratizing access to both financial data and FinLLMs. It stands as an emblem of untapped potential within open finance, aspiring to be a significant catalyst stimulating innovation and refinement within the financial domain. Note: Nothing herein is financial advice, and it is NOT a recommendation to trade real money.

**Model Intention:** It's a professional stock market analyst. It can provide an analysis and prediction for a company's stock price movement for the upcoming weeks.

**Context Length:** 4096 tokens

**Prompt Format:**
+
```
[INST]<<SYS>>
{{system}}<</SYS>>

Let's first analyze the positive developments and potential concerns for {{prompt}}. Come up with 2-4 most important factors respectively and keep them concise. Most factors should be inferred from company related news. Then make your prediction of the {{prompt}} stock price movement for next week. Provide a summary analysis to support your prediction.[/INST]
+ ```

**Template Name:** llama

---

# Llama3.2 3B
+
The Meta Llama 3.1 models are pretrained and instruction-tuned generative models in an 8B size (text in/text out). They are optimized for multilingual dialogue use cases (English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai) and outperform closed chat models on common benchmarks.

**Model Intention:** The latest Llama 3.2 is optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks

**Context Length:** 8192 tokens

**Prompt Format:**
+
```
<|begin_of_text|><|start_header_id|>user<|end_header_id|>



assistant

+ ```

**Template Name:** llama3.2

---

# Mistral 7B v0.3
+
The Mistral 7B v0.3 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. It has an extended vocabulary of 32768 tokens and supports function calling.

**Model Intention:** It's a 7B model for Q&A purposes, but it requires a high-end device to run.

**Context Length:** 8192 tokens

**Prompt Format:**
+
```
<s>[INST]{{prompt}}[/INST]</s>
+ ```

**Template Name:** Mistral

---

# OpenChat 3.6 (0522)
+
OpenChat is an innovative library of open-source language models, fine-tuned with C-RLFT, a strategy inspired by offline reinforcement learning. Our models learn from mixed-quality data without preference labels, delivering exceptional performance on par with ChatGPT, even with a 7B model. Despite our simple approach, we are committed to developing a high-performance, commercially viable, open-source large language model, and we continue to make significant strides toward this vision.

**Model Intention:** The Llama-3-based version, OpenChat 3.6 (20240522), outperforms the official Llama 3 8B Instruct.

**Context Length:** 8192 tokens

**Prompt Format:**
+
```
{{system}}
GPT4 Correct User: {{prompt}}<|end_of_turn|>GPT4 Correct Assistant:
+ ```

**Template Name:** Mistral

---

# Phi-3 Vision
+
The Phi-3 4K-Instruct is a 3.8B-parameter, lightweight, state-of-the-art open model. It is optimized for instruction following and safety. It is good at common sense, language understanding, math, code, long context, and logical reasoning; Phi-3 Mini-4K-Instruct showcased robust, state-of-the-art performance among models with fewer than 13 billion parameters.

**Model Intention:** It's a Microsoft Phi-3 model with visual support. It can understand images as well as text

**Context Length:** 4096 tokens

**Prompt Format:**
+
```
<|user|>
{{prompt}} <|end|>
<|assistant|>
+ ```

**Template Name:** PHI3

---

# Yi 1.5 6B Chat
+
Yi-1.5 is an upgraded version that delivers stronger performance in coding, math, reasoning, and instruction-following capability, while still maintaining excellent capabilities in language understanding, commonsense reasoning, and reading comprehension. It is continuously pre-trained on Yi with a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples. The Yi series models are the next generation of open-source large language models trained from scratch by 01.AI. The Yi series models are among the strongest LLMs worldwide, showing promise in language understanding, commonsense reasoning, reading comprehension, and more.

**Model Intention:** It's a 6B model and can understand English and Chinese. It's good for coding, math, reasoning, and language understanding

**Context Length:** 4096 tokens

**Prompt Format:**
+
```
<|im_start|>user
{{prompt}}<|im_end|>
<|im_start|>assistant

+ ```

**Template Name:** yi

---

# Google Gemma 2B
+
Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is named after the Latin gemma, meaning 'precious stone.' The Gemma model weights are supported by developer tools that promote innovation, collaboration, and the responsible use of artificial intelligence (AI).

**Model Intention:** It's a 2B model for Q&A purposes, but it requires a high-end device to run.

**Context Length:** 8192 tokens

**Prompt Format:**
+
```
<bos><start_of_turn>user
{{prompt}}<end_of_turn>
<start_of_turn>model

+ ```

**Template Name:** gemma

---

# StarCoder2 3B
+
StarCoder2-3B is a 3B parameter model trained on 17 programming languages from The Stack v2, with opt-out requests excluded. The model uses Grouped Query Attention, a context window of 16,384 tokens with a sliding window attention of 4,096 tokens, and was trained using the Fill-in-the-Middle objective on 3+ trillion tokens.

**Model Intention:** The model is good at 17 programming languages. Just start writing your code, and the model will finish it.

**Context Length:** 16384 tokens

**Prompt Format:**
+
```
{{prompt}}

+ ```

**Template Name:** starcoder

---

# Qwen2.5 7B Chat
+
Qwen is the large language model and large multimodal model series of the Qwen Team, Alibaba Group. It supports both Chinese and English. Tongyi Qianwen (通义千问) is a large language model developed by Alibaba, supporting both Chinese and English.

**Model Intention:** Qwen2.5 is the latest model series and is good at multilingual tasks, coding, mathematics, reasoning, etc.

**Context Length:** 4096 tokens

**Prompt Format:**
+
```
<|im_start|>system
{{system}}<|im_end|>

{{prompt}}<|im_end|>
<|im_start|>assistant

+ ```

**Template Name:** chatml

---

# Qwen2 1.5B Chat
+
Qwen is the large language model and large multimodal model series of the Qwen Team, Alibaba Group. It supports both Chinese and English. Tongyi Qianwen (通义千问) is a large language model developed by Alibaba, supporting both Chinese and English.

**Model Intention:** Qwen2.5 is the latest model series and is good at multilingual tasks, coding, mathematics, reasoning, etc.

**Context Length:** 2048 tokens

**Prompt Format:**
+
```
<|im_start|>system
{{system}}<|im_end|>

{{prompt}}<|im_end|>
<|im_start|>assistant

+ ```

**Template Name:** chatml

---

# Qwen2 3B Chat
+
Qwen is the large language model and large multimodal model series of the Qwen Team, Alibaba Group. It supports both Chinese and English. Tongyi Qianwen (通义千问) is a large language model developed by Alibaba, supporting both Chinese and English.

**Model Intention:** Qwen2.5 is the latest model series and is good at multilingual tasks, coding, mathematics, reasoning, etc.

**Context Length:** 2048 tokens

**Prompt Format:**
+
```
<|im_start|>system
{{system}}<|im_end|>

{{prompt}}<|im_end|>
<|im_start|>assistant

+ ```

**Template Name:** chatml

---

# Dolphin 2.9.2 Qwen2 7B
+
This model is based on Mistral-7b-v0.2 with a 16k context length. It's an uncensored model and supports a variety of instruction, conversational, and coding skills.

**Model Intention:** It's an uncensored and skilled English model, best for high-performance iPhone, iPad & Mac

**Context Length:** 2048 tokens

**Prompt Format:**
+
```
<|im_start|>system
{{system}}<|im_end|>

{{prompt}}<|im_end|>
<|im_start|>assistant

+ ```

**Template Name:** chatml


**Add EOS Token:** No

+ **Parse Special Tokens:** Yes