---
datasets:
- togethercomputer/RedPajama-Data-1T-Sample
library_name: transformers
pipeline_tag: text-generation
tags:
- text-generation-inference
---
This is [Llama2-22b](https://huggingface.co/chargoddard/llama2-22b) by [chargoddard](https://huggingface.co/chargoddard) in a couple of GGML formats. I have no idea what I'm doing, so if something doesn't work as it should, or doesn't work at all, that's likely on me rather than on the models themselves.<br>
A second model merge has been [released](https://huggingface.co/chargoddard/llama2-22b-blocktriangular), and the GGML conversions for it can be found [here](https://huggingface.co/IHaveNoClueAndIMustPost/llama2-22b-blocktriangular-GGML).

While I haven't had any issues so far, do note that the original repo states <i>"Not intended for use as-is - this model is meant to serve as a base for further tuning"</i>.

Approximate VRAM requirements at 4K context:
<table style='border: 2px #000000 solid; width: 50%' align='left' border='2'>
    <tbody>
        <tr>
            <th style='text-align: center'>MODEL</th>
            <th style='text-align: center'>SIZE</th>
            <th style='text-align: center'>VRAM</th>
        </tr>
        <tr>
            <td style='text-align: center'>q5_1</td>
            <td style='text-align: center'>16.4GB</td>
            <td style='text-align: center'>21.5GB</td>
        </tr>
        <tr>
            <td style='text-align: center'>q4_K_M</td>
            <td style='text-align: center'>13.2GB</td>
            <td style='text-align: center'>18.3GB</td>
        </tr>
        <tr>
            <td style='text-align: center'>q3_K_M</td>
            <td style='text-align: center'>10.6GB</td>
            <td style='text-align: center'>16.1GB</td>
        </tr>
        <tr>
            <td style='text-align: center'>q2_K</td>
            <td style='text-align: center'>9.2GB</td>
            <td style='text-align: center'>14.5GB</td>
        </tr>
    </tbody>
</table>
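
Reading the table column-wise, the VRAM figures work out to the model file size plus a roughly constant overhead of about 5 GB for context and compute buffers at 4K context. As a rough sketch (the ~5.1 GB overhead is my own estimate inferred from the numbers above, not an official figure, and real usage will vary by backend and settings):

```python
# Rough VRAM estimator for fully offloading one of these GGML files.
# Assumption: VRAM ≈ file size + a fixed ~5.1 GB overhead at 4K context,
# a figure inferred from the table above rather than measured officially.
def estimate_vram_gb(file_size_gb: float, context_overhead_gb: float = 5.1) -> float:
    """Return an approximate VRAM requirement in GB."""
    return round(file_size_gb + context_overhead_gb, 1)

# Example: the q4_K_M file (13.2 GB) lands near the table's 18.3 GB figure.
print(estimate_vram_gb(13.2))
```

The estimate is within a few hundred MB of the table for every quant listed; treat it as a ballpark check before downloading, not a guarantee.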