tinybiggames
/

dolphin-2.9.1-llama-3-8b-Q4_K_M-GGUF

Generated from Trainer

Model card Files Files and versions Community

dolphin-2.9.1-llama-3-8b-Q4_K_M-GGUF / README.md

tinybiggames's picture

Update README.md

65dfba0 verified about 1 month ago

|

raw history blame contribute delete

No virus

2.22 kB

	---
	license: other
	tags:
	- generated_from_trainer
	- axolotl
	- llama-cpp
	- gguf-my-repo
	- LMEngine
	base_model: meta-llama/Meta-Llama-3-8B
	datasets:
	- cognitivecomputations/Dolphin-2.9
	- teknium/OpenHermes-2.5
	- m-a-p/CodeFeedback-Filtered-Instruction
	- cognitivecomputations/dolphin-coder
	- cognitivecomputations/samantha-data
	- microsoft/orca-math-word-problems-200k
	- Locutusque/function-calling-chatml
	- internlm/Agent-FLAN
	model-index:
	- name: out
	results: []
	---

	# tinybiggames/dolphin-2.9.1-llama-3-8b-Q4_K_M-GGUF
	This model was converted to GGUF format from [`cognitivecomputations/dolphin-2.9.1-llama-3-8b`](https://huggingface.co/cognitivecomputations/dolphin-2.9.1-llama-3-8b) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
	Refer to the [original model card](https://huggingface.co/cognitivecomputations/dolphin-2.9.1-llama-3-8b) for more details on the model.
	## Use with tinyBigGAMES's [Inference](https://github.com/tinyBigGAMES) Libraries.


	How to configure LMEngine:

	```Delphi
	InitConfig(
	'C:/LLM/gguf', // path to model files
	-1 // number of GPU layer, -1 to use all available layers
	);
	```

	How to define model:

	```Delphi
	DefineModel('dolphin-2.9.1-llama-3-8b.Q4_K_M.gguf',
	'dolphin-2.9.1-llama-3-8b.Q4_K_M', 8000,
	'<\|im_start\|>{role}\n{content}<\|im_end\|>\n',
	'<\|im_start\|>assistant');
	```

	How to add a message:

	```Delphi
	AddMessage(
	ROLE_USER, // role
	'What is AI?' // content
	);
	```

	`{role}` - will be substituted with the message "role"
	`{content}` - will be substituted with the message "content"

	How to do inference:

	```Delphi
	var
	LTokenOutputSpeed: Single;
	LInputTokens: Int32;
	LOutputTokens: Int32;
	LTotalTokens: Int32;

	if RunInference('dolphin-2.9.1-llama-3-8b.Q4_K_M', 1024) then
	begin
	GetInferenceStats(nil, @LTokenOutputSpeed, @LInputTokens, @LOutputTokens,
	@LTotalTokens);
	PrintLn('', FG_WHITE);
	PrintLn('Tokens :: Input: %d, Output: %d, Total: %d, Speed: %3.1f t/s',
	FG_BRIGHTYELLOW, LInputTokens, LOutputTokens, LTotalTokens, LTokenOutputSpeed);
	end
	else
	begin
	PrintLn('', FG_WHITE);
	PrintLn('Error: %s', FG_RED, GetError());
	end;
	```