Sdff-Ltba/LightChatAssistant-2x7B-GGUF


	---
	language:
	- ja
	tags:
	- mistral
	- mixtral
	- not-for-all-audiences
	- nsfw
	pipeline_tag: text-generation
	---

	# chatntq_chatvector-MoE-Antler_chatvector-2x7B-GGUF

	This is a GGUF conversion of [Sdff-Ltba/chatntq_chatvector-MoE-Antler_chatvector-2x7B](https://huggingface.co/Sdff-Ltba/chatntq_chatvector-MoE-Antler_chatvector-2x7B).
	It was quantized with the help of an importance matrix (iMatrix).

	## Quantization procedure

	The following commands were run:
	```
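	# 1. Convert the Hugging Face model to an f16 GGUF file
	python ./llama.cpp/convert.py ./chatntq_chatvector-MoE-Antler_chatvector-2x7B --outtype f16 --outfile ./gguf-model_f16.gguf
	# 2. Compute the importance matrix from wiki.train.raw, limited to 32 chunks
	./llama.cpp/imatrix -m ./gguf-model_f16.gguf -f ./wiki.train.raw -o ./gguf-model_f16.imatrix --chunks 32
	# 3. Quantize the f16 file to iq3_xxs using the importance matrix
	./llama.cpp/quantize --imatrix ./gguf-model_f16.imatrix ./gguf-model_f16.gguf ./chatntq_chatvector-MoE-Antler_chatvector-2x7B_iq3xxs.gguf iq3_xxs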
	```
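
For anyone who wants to repeat the procedure for another quantization type, the three steps above can be sketched as a small Python helper that only builds the command lines and does not run them. The function name `build_quantize_commands` and its parameters are hypothetical and not part of llama.cpp; the tool paths, flags, and intermediate file names are copied from the commands above, and the output file name follows the same pattern (quantization type with the underscore removed).

```python
import shlex


def build_quantize_commands(model_dir, quant_type="iq3_xxs", chunks=32,
                            llama_cpp="./llama.cpp",
                            calibration_text="./wiki.train.raw"):
    """Return the convert, imatrix, and quantize commands as argument lists.

    Hypothetical helper: it mirrors the commands in this README and
    executes nothing.
    """
    model_name = model_dir.rstrip("/").split("/")[-1]
    f16_path = "./gguf-model_f16.gguf"
    imatrix_path = "./gguf-model_f16.imatrix"
    # Same naming pattern as above, e.g. "iq3_xxs" becomes "..._iq3xxs.gguf".
    out_path = f"./{model_name}_{quant_type.replace('_', '')}.gguf"
    return [
        # Step 1: Hugging Face checkpoint to f16 GGUF.
        ["python", f"{llama_cpp}/convert.py", model_dir,
         "--outtype", "f16", "--outfile", f16_path],
        # Step 2: importance matrix from the calibration text.
        [f"{llama_cpp}/imatrix", "-m", f16_path, "-f", calibration_text,
         "-o", imatrix_path, "--chunks", str(chunks)],
        # Step 3: quantize the f16 GGUF with the importance matrix.
        [f"{llama_cpp}/quantize", "--imatrix", imatrix_path,
         f16_path, out_path, quant_type],
    ]


if __name__ == "__main__":
    model = "./chatntq_chatvector-MoE-Antler_chatvector-2x7B"
    for cmd in build_quantize_commands(model):
        print(shlex.join(cmd))
```

With the default arguments the printed lines should match the three commands listed above. Each list can be passed to `subprocess.run(cmd, check=True)` once the paths have been verified against the local llama.cpp checkout.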

	## Environment

	- CPU: Ryzen 5 5600X
	- GPU: GeForce RTX 3060 12GB
	- RAM: DDR4-3200 96GB
	- OS: Windows 10
	- Software: Python 3.12.2, [KoboldCpp](https://github.com/LostRuins/koboldcpp) v1.61.2

	#### KoboldCpp settings

	(Only settings changed from the defaults are listed.)
	- `GPU Layers: 33` (33 or more loads the whole model onto the GPU)
	- `Context Size: 32768`
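
The same two settings can probably be passed on the command line instead of through the launcher GUI. The sketch below builds such an argument list in Python without running it. It assumes that `--model`, `--gpulayers`, and `--contextsize` are the matching KoboldCpp options (worth confirming with `--help` on v1.61.2) and that KoboldCpp is started from source as `koboldcpp.py`. The helper name `build_koboldcpp_args` is hypothetical, and the model path is only the output name from the quantize step, used here as a placeholder.

```python
import shlex


def build_koboldcpp_args(model_path, gpu_layers=33, context_size=32768):
    """Return a KoboldCpp launch command as an argument list.

    Hypothetical helper; the flag names are assumed to correspond to the
    "GPU Layers" and "Context Size" settings listed above.
    """
    return [
        "python", "koboldcpp.py",
        "--model", model_path,
        "--gpulayers", str(gpu_layers),      # 33 or more: full GPU load
        "--contextsize", str(context_size),  # Context Size: 32768
    ]


if __name__ == "__main__":
    # Placeholder path: the file name produced by the quantize step.
    path = "./chatntq_chatvector-MoE-Antler_chatvector-2x7B_iq3xxs.gguf"
    print(shlex.join(build_koboldcpp_args(path)))
```

On Windows with the prebuilt binary, the first two elements would be replaced by the path to the KoboldCpp executable; all other settings stay at their defaults, as in the list above.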