
llama-2-7b-chat-MEDS-12

This is a llama-2-7b-chat-hf model fine-tuned with QLoRA (4-bit precision) on the s200862/medical_qa_meds dataset. That dataset is an adaptation of medalpaca/medical_meadow_wikidoc_patient_information, reformatted to match Llama 2's instruction format.
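
To make the reformatting step concrete, here is a minimal sketch of how a question/answer pair could be wrapped in the Llama 2 instruction format. The helper name `to_llama2_prompt` and its arguments are hypothetical, and the exact template used to build s200862/medical_qa_meds is an assumption. Only the `<s>[INST] ... [/INST]` wrapper is taken from the usage example in this card; the trailing answer and `</s>` follow the commonly used Llama 2 single-turn layout.

```python
from typing import Optional


def to_llama2_prompt(question: str, answer: Optional[str] = None) -> str:
    """Wrap a question (and optionally its answer) in Llama 2 instruction tags.

    Hypothetical helper: the template actually used for the dataset is assumed,
    not confirmed by the model card.
    """
    prompt = f"<s>[INST] {question.strip()} [/INST]"
    if answer is None:
        # Inference-time prompt: the model is expected to continue after [/INST].
        return prompt
    # Training-style sample: the answer follows the instruction, then end-of-sequence.
    return f"{prompt} {answer.strip()} </s>"


if __name__ == "__main__":
    print(to_llama2_prompt("What causes Allergy?"))
    print(to_llama2_prompt("What causes Allergy?", "An immune reaction to an allergen."))
```

Called without an answer, the helper produces the same prompt string that the usage example passes to the pipeline.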

🔧 Training

It was trained on-premises in a Jupyter notebook on an NVIDIA RTX A4000 GPU with 16 GB of VRAM and 16 GB of system RAM.
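
A back-of-the-envelope calculation suggests why 4-bit quantization matters on a 16 GB card. The sketch below estimates the memory taken by the base weights alone, using the 6.74B parameter count listed for this model. It is an approximation under stated assumptions: it ignores quantization constants, LoRA adapter weights, activations, and optimizer state, and it counts 1 GB as 10^9 bytes. The function name `weight_memory_gb` is made up for illustration.

```python
NUM_PARAMS = 6.74e9  # parameter count listed for this model


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    fp16_gb = weight_memory_gb(NUM_PARAMS, 16)  # half-precision weights
    int4_gb = weight_memory_gb(NUM_PARAMS, 4)   # 4-bit quantized weights
    print(f"FP16 weights:  {fp16_gb:.2f} GB")
    print(f"4-bit weights: {int4_gb:.2f} GB")
```

Under these assumptions the FP16 weights come to roughly 13.5 GB, which leaves little headroom for training on 16 GB of VRAM, while the 4-bit weights come to roughly 3.4 GB.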

💻 Usage

The model is intended to answer medical questions.

# pip install transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "s200862/llama-2-7b-chat-MEDS-12"
prompt = "What causes Allergy?"

# Load the tokenizer and build a text-generation pipeline in half precision.
# device_map="auto" lets accelerate place the weights on the available device(s).
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Wrap the question in the Llama 2 instruction format and sample one completion.
sequences = pipeline(
    f'<s>[INST] {prompt} [/INST]',
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,  # total length in tokens, prompt included
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
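
By default the text-generation pipeline returns the prompt together with the completion in `generated_text`, so the printed result begins with the `[INST] ... [/INST]` wrapper. If only the answer is wanted, a small post-processing step can strip the prompt. The helper below, `extract_answer`, is a hypothetical illustration and not part of the model or of transformers.

```python
def extract_answer(generated_text: str, marker: str = "[/INST]") -> str:
    """Return the text that follows the last instruction-closing marker.

    If the marker is absent, the input is returned unchanged apart from
    surrounding whitespace.
    """
    _, sep, answer = generated_text.rpartition(marker)
    return answer.strip() if sep else generated_text.strip()


if __name__ == "__main__":
    sample = "<s>[INST] What causes Allergy? [/INST] An immune reaction to an allergen."
    print(extract_answer(sample))
```

In the loop above, this would be applied as `extract_answer(seq['generated_text'])`.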

Model size: 6.74B params (Safetensors, FP16)