---
datasets: wikitext
---
This is a quantized model of [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3), produced with GPTQ as developed by [IST Austria](https://ist.ac.at/en/research/alistarh-group/), using the following configuration (sketched in code below):
- Bits: 4
- Act order: True
- Group size: 128
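
The exact quantization script is not published here; the following is a minimal sketch, assuming the AutoGPTQ library and a placeholder calibration set (a real run would draw calibration samples from `wikitext`, per the card's `datasets` field), of how the configuration above maps onto code:

```
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.3"

# Configuration listed above: 4-bit weights, act order, group size 128.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)

# Placeholder calibration examples; replace with real wikitext samples.
examples = [tokenizer("Quantization calibration text goes here.", return_tensors="pt")]

model.quantize(examples)
model.save_quantized("Mistral-7B-Instruct-v0.3-GPTQ-4b")
```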

## Usage
Install **vLLM** and run the [server](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#openai-compatible-server):

```
python -m vllm.entrypoints.openai.api_server --model cortecs/Mistral-7B-Instruct-v0.3-GPTQ-4b
```
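
vLLM can also be used without the HTTP server, through its offline `LLM` API. A minimal sketch (the sampling parameters are illustrative, not recommendations):

```
from vllm import LLM, SamplingParams

# Load the quantized model for offline batch inference.
llm = LLM(model="cortecs/Mistral-7B-Instruct-v0.3-GPTQ-4b")

# Illustrative sampling settings.
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["San Francisco is a"], params)
print(outputs[0].outputs[0].text)
```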
Access the model:
```
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "cortecs/Mistral-7B-Instruct-v0.3-GPTQ-4b",
        "prompt": "San Francisco is a"
    }'
```
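
Since the server is OpenAI-compatible, you can also query it with the official `openai` Python client (the `api_key` is a dummy value; vLLM does not check it unless the server was started with an API key):

```
from openai import OpenAI

# Point the client at the local vLLM server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="cortecs/Mistral-7B-Instruct-v0.3-GPTQ-4b",
    prompt="San Francisco is a",
    max_tokens=64,
)
print(completion.choices[0].text)
```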

## Evaluations
| __English__   | __[Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)__   | __[Mistral-7B-Instruct-v0.3-GPTQ-8b](https://huggingface.co/cortecs/Mistral-7B-Instruct-v0.3-GPTQ-8b)__   | __[Mistral-7B-Instruct-v0.3-GPTQ-4b](https://huggingface.co/cortecs/Mistral-7B-Instruct-v0.3-GPTQ-4b)__   |
|:--------------|:--------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------|
| Avg.          | 67.65                                                                                       | 67.72                                                                                                     | 66.95                                                                                                     |
| ARC           | 64.2                                                                                        | 64.1                                                                                                      | 62.1                                                                                                      |
| Hellaswag     | 75.6                                                                                        | 75.6                                                                                                      | 76.0                                                                                                      |
| MMLU          | 63.16                                                                                       | 63.47                                                                                                     | 62.75                                                                                                     |
|               |                                                                                             |                                                                                                           |                                                                                                           |
| __French__   | __[Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)__   | __[Mistral-7B-Instruct-v0.3-GPTQ-8b](https://huggingface.co/cortecs/Mistral-7B-Instruct-v0.3-GPTQ-8b)__   | __[Mistral-7B-Instruct-v0.3-GPTQ-4b](https://huggingface.co/cortecs/Mistral-7B-Instruct-v0.3-GPTQ-4b)__   |
| Avg.         | 56.4                                                                                        | 56.17                                                                                                     | 54.77                                                                                                     |
| ARC_fr       | 51.9                                                                                        | 51.4                                                                                                      | 50.0                                                                                                      |
| Hellaswag_fr | 65.8                                                                                        | 65.8                                                                                                      | 63.8                                                                                                      |
| MMLU_fr      | 51.5                                                                                        | 51.3                                                                                                      | 50.5                                                                                                      |
|              |                                                                                             |                                                                                                           |                                                                                                           |
| __German__   | __[Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)__   | __[Mistral-7B-Instruct-v0.3-GPTQ-8b](https://huggingface.co/cortecs/Mistral-7B-Instruct-v0.3-GPTQ-8b)__   | __[Mistral-7B-Instruct-v0.3-GPTQ-4b](https://huggingface.co/cortecs/Mistral-7B-Instruct-v0.3-GPTQ-4b)__   |
| Avg.         | 51.83                                                                                       | 51.73                                                                                                     | 51.7                                                                                                      |
| ARC_de       | 47.6                                                                                        | 47.5                                                                                                      | 47.3                                                                                                      |
| Hellaswag_de | 58.9                                                                                        | 59.0                                                                                                      | 57.3                                                                                                      |
| MMLU_de      | 49.0                                                                                        | 48.7                                                                                                      | 50.5                                                                                                      |
|              |                                                                                             |                                                                                                           |                                                                                                           |
| __Italian__   | __[Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)__   | __[Mistral-7B-Instruct-v0.3-GPTQ-8b](https://huggingface.co/cortecs/Mistral-7B-Instruct-v0.3-GPTQ-8b)__   | __[Mistral-7B-Instruct-v0.3-GPTQ-4b](https://huggingface.co/cortecs/Mistral-7B-Instruct-v0.3-GPTQ-4b)__   |
| Avg.          | 54.93                                                                                       | 54.8                                                                                                      | 52.83                                                                                                     |
| ARC_it        | 51.6                                                                                        | 51.6                                                                                                      | 49.3                                                                                                      |
| Hellaswag_it  | 63.5                                                                                        | 63.8                                                                                                      | 61.0                                                                                                      |
| MMLU_it       | 49.7                                                                                        | 49.0                                                                                                      | 48.2                                                                                                      |
|               |                                                                                             |                                                                                                           |                                                                                                           |
| __Safety__          | __[Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)__   | __[Mistral-7B-Instruct-v0.3-GPTQ-8b](https://huggingface.co/cortecs/Mistral-7B-Instruct-v0.3-GPTQ-8b)__   | __[Mistral-7B-Instruct-v0.3-GPTQ-4b](https://huggingface.co/cortecs/Mistral-7B-Instruct-v0.3-GPTQ-4b)__   |
| Avg.                | 60.32                                                                                       | 60.54                                                                                                     | 64.8                                                                                                      |
| RealToxicityPrompts | 89.7                                                                                        | 90.0                                                                                                      | 90.7                                                                                                      |
| TruthfulQA          | 59.71                                                                                       | 59.48                                                                                                     | 58.32                                                                                                     |
| CrowS               | 31.54                                                                                       | 32.14                                                                                                     | 45.38                                                                                                     |
|                     |                                                                                             |                                                                                                           |                                                                                                           |
| __Spanish__   |   __[Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)__ |   __[Mistral-7B-Instruct-v0.3-GPTQ-8b](https://huggingface.co/cortecs/Mistral-7B-Instruct-v0.3-GPTQ-8b)__ |   __[Mistral-7B-Instruct-v0.3-GPTQ-4b](https://huggingface.co/cortecs/Mistral-7B-Instruct-v0.3-GPTQ-4b)__ |
| Avg.          |                                                                                        57.9 |                                                                                                     57.97 |                                                                                                      56.1 |
| ARC_es        |                                                                                        53.5 |                                                                                                     53.5  |                                                                                                      51   |
| Hellaswag_es  |                                                                                        68.5 |                                                                                                     68.5  |                                                                                                      66.2 |
| MMLU_es       |                                                                                        51.7 |                                                                                                     51.9  |                                                                                                      51.1 |

We did not check for data contamination.
Evaluation was done with [Eval. Harness](https://github.com/EleutherAI/lm-evaluation-harness) using `limit=1000`.
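
The card does not include the exact harness invocation; as a rough sketch (the task names and the Python entry point are assumptions), a comparable run via the harness's `simple_evaluate` API could look like:

```
import lm_eval

# Illustrative only: evaluate a subset of the reported tasks with limit=1000.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=cortecs/Mistral-7B-Instruct-v0.3-GPTQ-4b",
    tasks=["arc_challenge", "hellaswag", "mmlu"],
    limit=1000,
)
print(results["results"])
```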
    
## Performance
|             |   requests/s |   tokens/s |
|:------------|-------------:|-----------:|
| NVIDIA L4x1 |         3.75 |    1867.13 |
| NVIDIA L4x2 |         5.03 |    2503.83 |
| NVIDIA L4x4 |         5.86 |    2916.3  |

Performance measured on [cortecs inference](https://cortecs.ai).
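
The multi-GPU rows presumably correspond to launching the server with vLLM's `--tensor-parallel-size` set to 2 and 4. A minimal client-side throughput check (not the benchmark behind the numbers above; concurrency, prompt, and token counts are illustrative) could look like:

```
import asyncio, time
from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

async def one_request():
    resp = await client.completions.create(
        model="cortecs/Mistral-7B-Instruct-v0.3-GPTQ-4b",
        prompt="San Francisco is a",
        max_tokens=128,
    )
    return resp.usage.completion_tokens

async def main(n=32):
    start = time.perf_counter()
    tokens = await asyncio.gather(*[one_request() for _ in range(n)])
    elapsed = time.perf_counter() - start
    print(f"{n / elapsed:.2f} requests/s, {sum(tokens) / elapsed:.2f} tokens/s")

asyncio.run(main())
```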