This is the 4-bit quantized version of [inception-mbzuai/jais-13b-chat](https://huggingface.co/inception-mbzuai/jais-13b-chat), created using AutoTrain, but it doesn't work.

## Error
### GPU 

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62de65017e93762b858d3057/M0OoBfV1WC1QcLumyvy0L.png)

### CPU 

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62de65017e93762b858d3057/ezLq3jhasIg--M-jJAMSI.png)


## Quantization Process

```py
!pip install auto-gptq
!pip install git+https://github.com/huggingface/optimum.git
!pip install git+https://github.com/huggingface/transformers.git
!pip install --upgrade accelerate
```

```py
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

# Tokenizer is needed so GPTQ can tokenize the calibration dataset
tokenizer = AutoTokenizer.from_pretrained("inception-mbzuai/jais-13b-chat")

# 4-bit GPTQ quantization, calibrated on the C4 dataset
gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)

# trust_remote_code is required because jais uses a custom model implementation
model = AutoModelForCausalLM.from_pretrained(
    "inception-mbzuai/jais-13b-chat",
    quantization_config=gptq_config,
    trust_remote_code=True,
)
```
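For context on why 4-bit quantization is attempted here, a back-of-envelope estimate of the weight memory is sketched below. This is an approximation only: the 13B parameter count is rounded, and real memory use also includes activations, the KV cache, and GPTQ metadata (scales and zero points), which this ignores.

```python
# Rough weight-storage estimate for a ~13B-parameter model
# at different per-parameter bit widths (weights only).
PARAMS = 13e9  # approximate parameter count of jais-13b


def weight_gib(bits_per_param: float, params: float = PARAMS) -> float:
    """Approximate weight storage in GiB for a given bit width."""
    return params * bits_per_param / 8 / 1024**3


fp16_gib = weight_gib(16)  # ~24 GiB in fp16
int4_gib = weight_gib(4)   # ~6 GiB at 4 bits

print(f"fp16: {fp16_gib:.1f} GiB, 4-bit: {int4_gib:.1f} GiB")
```

Even with the roughly 4x reduction, running quantization itself still needs enough memory to hold and process the full-precision model, which may be why the process fails on both GPU and CPU above.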