tinybiggames committed
Commit 14ad760
1 Parent(s): b30dc0e

Update README.md

Files changed (1)
  1. README.md +78 -40
README.md CHANGED
@@ -1,45 +1,83 @@
- ---
- license: other
- tags:
- - generated_from_trainer
- - axolotl
- - llama-cpp
- - gguf-my-repo
- - Dllama
- - Infero
- base_model: meta-llama/Meta-Llama-3-8B
- datasets:
- - cognitivecomputations/Dolphin-2.9
- - teknium/OpenHermes-2.5
- - m-a-p/CodeFeedback-Filtered-Instruction
- - cognitivecomputations/dolphin-coder
- - cognitivecomputations/samantha-data
- - microsoft/orca-math-word-problems-200k
- - Locutusque/function-calling-chatml
- - internlm/Agent-FLAN
- model-index:
- - name: out
-   results: []
- ---
+ ---
+ license: other
+ tags:
+ - generated_from_trainer
+ - axolotl
+ - llama-cpp
+ - gguf-my-repo
+ - Dllama
+ - Infero
+ base_model: meta-llama/Meta-Llama-3-8B
+ datasets:
+ - cognitivecomputations/Dolphin-2.9
+ - teknium/OpenHermes-2.5
+ - m-a-p/CodeFeedback-Filtered-Instruction
+ - cognitivecomputations/dolphin-coder
+ - cognitivecomputations/samantha-data
+ - microsoft/orca-math-word-problems-200k
+ - Locutusque/function-calling-chatml
+ - internlm/Agent-FLAN
+ model-index:
+ - name: out
+   results: []
+ ---

  # tinybiggames/dolphin-2.9.1-llama-3-8b-Q4_K_M-GGUF
  This model was converted to GGUF format from [`cognitivecomputations/dolphin-2.9.1-llama-3-8b`](https://huggingface.co/cognitivecomputations/dolphin-2.9.1-llama-3-8b) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/cognitivecomputations/dolphin-2.9.1-llama-3-8b) for more details on the model.
- ## Use with tinyBigGAMES's Local LLM Inference Libraries
-
- Add to **config.json**:
-
- ```Json
- {
-   "filename": "dolphin-2.9.1-llama-3-8b.Q4_K_M.gguf",
-   "name": "dolphin-llama3:8B:4_K_M",
-   "max_context": 8000,
-   "template": "<|im_start|>%s\\n%s<|im_end|>\\n",
-   "template_end": "<|im_start|>assistant",
-   "stop": [
-     "<|im_start|>",
-     "<|im_end|>",
-     "assistant"
-   ]
- }
+ ## Use with tinyBigGAMES's LMEngine Inference Library
+
+ How to configure LMEngine:
+
+ ```Delphi
+ Config_Init(
+   'C:/LLM/gguf', // path to model files
+   -1             // number of GPU layers; -1 uses all available layers
+ );
+ ```
+
+ How to define the model:
+
+ ```Delphi
+ Model_Define('dolphin-2.9.1-llama-3-8b.Q4_K_M.gguf',
+   'dolphin-llama3:8B:Q4KM', 8000,
+   '<|im_start|>{role}\n{content}<|im_end|>\n',
+   '<|im_start|>assistant');
+ ```
+
+ How to add a message:
+
+ ```Delphi
+ Message_Add(
+   ROLE_USER,    // role
+   'What is AI?' // content
+ );
+ ```
+
+ `{role}` will be substituted with the message "role".
+ `{content}` will be substituted with the message "content".
+
+ How to run inference:
+
+ ```Delphi
+ var
+   LTokenOutputSpeed: Single;
+   LInputTokens: Int32;
+   LOutputTokens: Int32;
+   LTotalTokens: Int32;
+
+ if Inference_Run('dolphin-llama3:8B:Q4KM', 1024) then
+ begin
+   Inference_GetUsage(nil, @LTokenOutputSpeed, @LInputTokens, @LOutputTokens,
+     @LTotalTokens);
+   Console_PrintLn('', FG_WHITE);
+   Console_PrintLn('Tokens :: Input: %d, Output: %d, Total: %d, Speed: %3.1f t/s',
+     FG_BRIGHTYELLOW, LInputTokens, LOutputTokens, LTotalTokens, LTokenOutputSpeed);
+ end
+ else
+ begin
+   Console_PrintLn('', FG_WHITE);
+   Console_PrintLn('Error: %s', FG_RED, Error_Get());
+ end;
  ```
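
The `{role}`/`{content}` substitution described in the new README can be sketched as follows. This is an illustrative Python reconstruction of how the template string passed to `Model_Define` is expanded per message, not LMEngine's actual implementation; the `build_prompt` helper is hypothetical.

```python
# Sketch of chat-template expansion (illustrative only; LMEngine performs
# this internally when assembling the prompt from added messages).

TEMPLATE = "<|im_start|>{role}\n{content}<|im_end|>\n"   # per-message template
TEMPLATE_END = "<|im_start|>assistant"                   # cue for the model to reply


def build_prompt(messages):
    """Render a list of {'role', 'content'} dicts into one prompt string."""
    prompt = "".join(
        TEMPLATE.format(role=m["role"], content=m["content"]) for m in messages
    )
    return prompt + TEMPLATE_END


prompt = build_prompt([{"role": "user", "content": "What is AI?"}])
print(prompt)
```

For the single `Message_Add(ROLE_USER, 'What is AI?')` call shown above, this yields `<|im_start|>user\nWhat is AI?<|im_end|>\n<|im_start|>assistant`, after which the model generates the assistant turn until a stop token such as `<|im_end|>` is produced.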