	---
	base_model:
	- macadeliccc/MBX-7B-v3-DPO
	- mlabonne/OmniBeagle-7B
	tags:
	- mergekit
	- merge
	license: cc
	---
	# OmniCorso-7B

	![image/webp](https://cdn-uploads.huggingface.co/production/uploads/6455cc8d679315e4ef16fbec/PaG7ByWy1qnh_tcSuh35U.webp)

	## Code Example

	```python
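	from transformers import AutoModelForCausalLM, AutoTokenizer

	tokenizer = AutoTokenizer.from_pretrained("macadeliccc/OmniCorso-7B")
	model = AutoModelForCausalLM.from_pretrained("macadeliccc/OmniCorso-7B")

	messages = [
	    {"role": "system", "content": "Respond to the user's request like a pirate"},
	    {"role": "user", "content": "Can you write me a quicksort algorithm?"},
	]
	# Build the prompt with the tokenizer's chat template and open the assistant turn.
	gen_input = tokenizer.apply_chat_template(
	    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
	)

	# Generate a reply and print only the newly generated tokens.
	# max_new_tokens=256 is an illustrative value, not a recommended setting.
	output = model.generate(**gen_input, max_new_tokens=256)
	print(tokenizer.decode(output[0][gen_input["input_ids"].shape[-1]:], skip_special_tokens=True))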
	```

	The following models were included in the merge:
	* [macadeliccc/MBX-7B-v3-DPO](https://huggingface.co/macadeliccc/MBX-7B-v3-DPO)
	* [mlabonne/OmniBeagle-7B](https://huggingface.co/mlabonne/OmniBeagle-7B)

	### Configuration

	The following YAML configuration was used to produce this model:

	```yaml
	slices:
	  - sources:
	      - model: mlabonne/OmniBeagle-7B
	        layer_range: [0, 32]
	      - model: macadeliccc/MBX-7B-v3-DPO
	        layer_range: [0, 32]
	merge_method: slerp
	base_model: macadeliccc/MBX-7B-v3-DPO
	parameters:
	  t:
	    - filter: self_attn
	      value: [0, 0.5, 0.3, 0.7, 1]
	    - filter: mlp
	      value: [1, 0.5, 0.7, 0.3, 0]
	    - value: 0.5
	dtype: bfloat16
	```
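
To make the `slerp` settings above more concrete, here is a minimal pure-Python sketch of spherical linear interpolation and of how a list such as `[0, 0.5, 0.3, 0.7, 1]` could be spread across the 32 layers. It is an illustration, not mergekit's actual code. The helper names `slerp` and `gradient_value` are hypothetical, and the sketch assumes that the anchor values are linearly interpolated over evenly spaced positions in the layer stack and that `t = 0` corresponds to the base model.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the vectors, clamped to a valid range.
    dot = sum(a * b for a, b in zip(v0, v1)) / max(norm0 * norm1, eps)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def gradient_value(anchors, layer, num_layers):
    """Linearly interpolate a list of anchor values across the layer stack.

    Assumption: anchors sit at evenly spaced positions from the first
    layer to the last one.
    """
    if len(anchors) == 1 or num_layers == 1:
        return float(anchors[0])
    pos = layer / (num_layers - 1) * (len(anchors) - 1)
    lo = min(int(pos), len(anchors) - 2)
    frac = pos - lo
    return (1 - frac) * anchors[lo] + frac * anchors[lo + 1]


# Interpolation factor for the self_attn filter at a few layers.
for layer in (0, 8, 16, 24, 31):
    print(layer, round(gradient_value([0, 0.5, 0.3, 0.7, 1], layer, 32), 3))
```

Under these assumptions, the `self_attn` weights start at the base model in the first layer and end at the other model in the last layer, the `mlp` weights follow the reverse schedule, and every other tensor uses a constant `t` of 0.5.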

	## Quantizations

	### GGUF

	+ [iMatrix](https://huggingface.co/macadeliccc/OmniCorso-7B-GGUF)

	### ExLlamaV2

	Quants are available thanks to user bartowski. Check them out [here](https://huggingface.co/bartowski/OmniCorso-7B-exl2).

	| Branch | Bits | lm_head bits | VRAM (4k) | VRAM (16k) | VRAM (32k) | Description |
	| ----- | ---- | ------- | ------ | ------ | ------ | ------------ |
	| [8_0](https://huggingface.co/bartowski/OmniCorso-7B-exl2/tree/8_0) | 8.0 | 8.0 | 8.4 GB | 9.8 GB | 11.8 GB | Maximum quality that ExLlamaV2 can produce, near unquantized performance. |
	| [6_5](https://huggingface.co/bartowski/OmniCorso-7B-exl2/tree/6_5) | 6.5 | 8.0 | 7.2 GB | 8.6 GB | 10.6 GB | Very similar to 8.0, good tradeoff of size vs performance, recommended. |
	| [5_0](https://huggingface.co/bartowski/OmniCorso-7B-exl2/tree/5_0) | 5.0 | 6.0 | 6.0 GB | 7.4 GB | 9.4 GB | Slightly lower quality vs 6.5, but usable on 8GB cards. |
	| [4_25](https://huggingface.co/bartowski/OmniCorso-7B-exl2/tree/4_25) | 4.25 | 6.0 | 5.3 GB | 6.7 GB | 8.7 GB | GPTQ equivalent bits per weight, slightly higher quality. |
	| [3_5](https://huggingface.co/bartowski/OmniCorso-7B-exl2/tree/3_5) | 3.5 | 6.0 | 4.7 GB | 6.1 GB | 8.1 GB | Lower quality, only use if you have to. |
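
As a rough aid for reading the table above, the following sketch picks the highest-bitrate branch whose listed VRAM figure fits a given budget. The helper `pick_branch` is hypothetical and the numbers are copied from the table, so it inherits whatever margin of error those estimates have and leaves no headroom for other processes on the GPU.

```python
# (branch, bits per weight, listed VRAM in GB keyed by context length in k tokens),
# copied from the table above and ordered from highest to lowest bits.
EXL2_BRANCHES = [
    ("8_0", 8.0, {4: 8.4, 16: 9.8, 32: 11.8}),
    ("6_5", 6.5, {4: 7.2, 16: 8.6, 32: 10.6}),
    ("5_0", 5.0, {4: 6.0, 16: 7.4, 32: 9.4}),
    ("4_25", 4.25, {4: 5.3, 16: 6.7, 32: 8.7}),
    ("3_5", 3.5, {4: 4.7, 16: 6.1, 32: 8.1}),
]


def pick_branch(vram_gb, context_k=4):
    """Return the highest-bitrate branch that fits the VRAM budget, or None."""
    for name, _bits, vram in EXL2_BRANCHES:
        if vram[context_k] <= vram_gb:
            return name
    return None


print(pick_branch(8, context_k=4))
print(pick_branch(12, context_k=32))
```

The returned name matches a branch of the `bartowski/OmniCorso-7B-exl2` repository, so it could be passed as the `revision` argument when downloading with `huggingface_hub.snapshot_download`.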


	## Evaluations

	<pre>----Benchmark Complete----
	2024-02-11 15:34:40
	Time taken: 178.3 mins
	Prompt Format: ChatML
	Model: macadeliccc/OmniCorso-7B
	Score (v2): 73.75
	Parseable: 167.0
	---------------
	Batch completed
	Time taken: 178.3 mins
	---------------
	</pre>