This is a 4-bit quantized version of the Jais-13b-chat model.

To load this model, you need the bitsandbytes library, which provides the quantization method used here.

If you are using text-generation-webui, select the Transformers loader and use these settings:

  • Compute dtype: bfloat16
  • Quantization type: nf4
  • Load in 4-bit: True
  • Use double quantization: True
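For readers loading the model from Python rather than the web UI, the settings above correspond to keyword arguments of `BitsAndBytesConfig` in Transformers. The following is a minimal sketch, not the author's own configuration: the names `WEBUI_SETTINGS` and `build_bnb_config` are hypothetical helpers introduced for illustration, and the function assumes `torch`, `transformers`, and `bitsandbytes` are installed.

```python
# Web UI settings from the list above, written as BitsAndBytesConfig keyword names.
# WEBUI_SETTINGS and build_bnb_config are illustrative names, not part of any library.
WEBUI_SETTINGS = {
    "load_in_4bit": True,                  # Load in 4-bit: True
    "bnb_4bit_quant_type": "nf4",          # Quantization type: nf4
    "bnb_4bit_use_double_quant": True,     # Use double quantization: True
    "bnb_4bit_compute_dtype": "bfloat16",  # Compute dtype: bfloat16
}


def build_bnb_config(settings=WEBUI_SETTINGS):
    """Build a BitsAndBytesConfig from the settings dictionary.

    Imports are done lazily so the dictionary can be inspected without
    torch, transformers, or bitsandbytes being installed.
    """
    import torch
    from transformers import BitsAndBytesConfig

    kwargs = dict(settings)
    # Convert the dtype name ("bfloat16") into the torch dtype object.
    kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    return BitsAndBytesConfig(**kwargs)
```

Because this checkpoint is already quantized, passing such a config to `from_pretrained` may be unnecessary; the sketch is mainly useful for seeing how the web UI options map to the Python API.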
To load the model with Transformers in Python:

from datetime import datetime
import warnings

from transformers import AutoModelForCausalLM, AutoTokenizer

warnings.filterwarnings('ignore')

model_name = "jwnder/core42_jais-13b-chat-bnb-4bit"

# trust_remote_code=True is needed because the model relies on custom code
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Move the inputs to the same device as the model before generating
inputs = tokenizer("Testing LLM!", return_tensors="pt").to(model.device)
start = datetime.now()
outputs = model.generate(**inputs)
end = datetime.now()
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
print(f"Generation time: {end - start}")