---
datasets:
- togethercomputer/RedPajama-Data-1T-Sample
library_name: transformers
pipeline_tag: text-generation
tags:
- text-generation-inference
---
This is [Llama2-22b](https://huggingface.co/chargoddard/llama2-22b) by [chargoddard](https://huggingface.co/chargoddard), converted to a couple of GGML formats. I have no idea what I'm doing, so if something doesn't work as it should, or not at all, that's likely on me, not the models themselves.<br>
A second model merge has been [released](https://huggingface.co/chargoddard/llama2-22b-blocktriangular), and the GGML conversions for it can be found [here](https://huggingface.co/IHaveNoClueAndIMustPost/llama2-22b-blocktriangular-GGML).
While I haven't had any issues so far, do note that the original repo states <i>"Not intended for use as-is - this model is meant to serve as a base for further tuning"</i>.
Approximate VRAM requirements at 4K context:
| Model | Size | VRAM |
|:------:|:------:|:------:|
| q5_1 | 16.4GB | 21.5GB |
| q4_K_M | 13.2GB | 18.3GB |
| q3_K_M | 10.6GB | 16.1GB |
| q2_K | 9.2GB | 14.5GB |
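
For reference, here is a minimal sketch of loading one of these files with [llama-cpp-python](https://github.com/abetlen/llama-cpp-python). Note the assumptions: GGML (as opposed to GGUF) files require an older release of that library (0.1.78 or earlier), the filename below is a placeholder rather than the exact name in this repo, and the layer offload count should be adjusted to your GPU.

```python
# Minimal sketch: running a GGML quant with llama-cpp-python.
# GGML files need a pre-GGUF release of the library:
#   pip install llama-cpp-python==0.1.78
from llama_cpp import Llama

llm = Llama(
    model_path="llama2-22b.ggmlv3.q4_K_M.bin",  # placeholder filename
    n_ctx=4096,      # 4K context, matching the VRAM table above
    n_gpu_layers=40, # layers to offload to the GPU; lower this if VRAM runs out
)

out = llm("Once upon a time", max_tokens=64)
print(out["choices"][0]["text"])
```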