
# Model Card for Mistral-7B-Instruct-v0.1-8bit

Mistral-7B-Instruct-v0.1-8bit is an 8-bit quantized version of [Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1): the original model is loaded with `torch_dtype=torch.float16` and `load_in_8bit=True`, then pushed to the Hub as-is.

For full details of the base model, please read Mistral's paper and release blog post.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "mistralai/Mistral-7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the original fp16 weights in 8-bit (requires the bitsandbytes
# and accelerate packages; flash attention requires flash-attn).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,
    use_flash_attention_2=True,
    torch_dtype=torch.float16,
)

# Push the quantized weights to the Hub.
model.push_to_hub("LsTam/Mistral-7B-Instruct-v0.1-8bit")
```
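
Note that recent versions of transformers deprecate the bare `load_in_8bit` and `use_flash_attention_2` arguments. A rough equivalent with the current API, as a sketch assuming a recent transformers release with bitsandbytes and flash-attn installed:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# The same 8-bit load, expressed through BitsAndBytesConfig.
quant_config = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    quantization_config=quant_config,
    attn_implementation="flash_attention_2",
    torch_dtype=torch.float16,
)
```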

To use the quantized model:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tok_name = "mistralai/Mistral-7B-Instruct-v0.1"
model_name = "LsTam/Mistral-7B-Instruct-v0.1-8bit"

# The tokenizer comes from the original repository; only the
# weights are re-uploaded here in 8-bit.
tokenizer = AutoTokenizer.from_pretrained(tok_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    use_flash_attention_2=True,
)
```
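
From there, generation works like the base instruct model. A minimal sketch using the instruct chat template (the prompt content is illustrative):

```python
messages = [
    {"role": "user", "content": "Explain 8-bit quantization in one sentence."},
]

# Format the conversation with Mistral's [INST] chat template.
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```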