---
license: cc-by-nc-4.0
pipeline_tag: text-generation
base_model: CohereForAI/c4ai-command-r-plus
---
# Command R+ GGUF
## Description
This repository contains experimental GGUF weights that are currently compatible with [pull request #6491](https://github.com/ggerganov/llama.cpp/pull/6491) to `llama.cpp`. I will update them once Command R+ support is merged into the main `llama.cpp` repository.
## Getting started
1. Clone the `Carolinabanana/llama.cpp` repository:
```bash
git clone https://github.com/Carolinabanana/llama.cpp.git llama.cpp-fork
cd llama.cpp-fork
git reset --hard 8b6577bd631fec33eeadb4b9dfc5a07ed2118148
```
2. Build it using `make`:
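A plain CPU build is enough to get started; `-j$(nproc)` simply parallelizes compilation, and GPU backends need their usual extra build flags:
```bash
make -j$(nproc)
```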
3. Use it in the same way as regular `llama.cpp`. If you're unsure how to start, the following command is a reasonable starting point:
```bash
./main -p "<|START_OF_TURN_TOKEN|><|USER_TOKEN|>Who are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>" --color -m /path/to/command-r-plus-Q3_K_L-00001-of-00002.gguf
```
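The special tokens in the prompt follow the Command R+ chat template: each turn is wrapped in `<|START_OF_TURN_TOKEN|>`/`<|END_OF_TURN_TOKEN|>` and opens with a role token such as `<|USER_TOKEN|>` or `<|CHATBOT_TOKEN|>`. A minimal sketch of scripting a single-turn prompt (the `USER_MSG` variable is just an illustrative placeholder; point `-m` at your first split):
```bash
# Wrap a single user message in the Command R+ chat template and run it.
USER_MSG="Who are you?"
PROMPT="<|START_OF_TURN_TOKEN|><|USER_TOKEN|>${USER_MSG}<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>"
./main -p "$PROMPT" --color -m /path/to/command-r-plus-Q3_K_L-00001-of-00002.gguf
```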
## Merging Weights
As of commit `8a28d12`, the weights in this repository are split with `gguf-split`, so you no longer need to merge them: simply pass the first split, as in the example above, and `llama.cpp` will load the remaining splits automatically. If, for some reason, you still want a single file, you can merge the splits with the following command:
```bash
./gguf-split --merge /path/to/command-r-plus-f16-00001-of-00005.gguf /path/to/command-r-plus-f16-combined.gguf
```
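Once merged, point `llama.cpp` at the single combined file instead of the first split, for example:
```bash
./main -p "<|START_OF_TURN_TOKEN|><|USER_TOKEN|>Who are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>" --color -m /path/to/command-r-plus-f16-combined.gguf
```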