cxllin/StableMed-3b

StableMed is a 3-billion-parameter decoder-only language model, fine-tuned for 1 epoch on 18k rows of medical questions.

Usage

Generate text with StableMed using the following code snippet:

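from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and the fine-tuned weights from the same repository.
# (The original snippet loaded the base model "stabilityai/stablelm-3b-4e1t" here,
# which would bypass the StableMed fine-tune.)
tokenizer = AutoTokenizer.from_pretrained("cxllin/StableMed-3b")
model = AutoModelForCausalLM.from_pretrained(
  "cxllin/StableMed-3b",
  trust_remote_code=True,
  torch_dtype="auto",
)
model.cuda()  # requires a CUDA-capable GPU

# Tokenize the prompt and move the tensors to the same device as the model.
inputs = tokenizer("The weather is always wonderful", return_tensors="pt").to("cuda")

# Sample up to 64 new tokens with temperature and nucleus (top-p) sampling.
tokens = model.generate(
  **inputs,
  max_new_tokens=64,
  temperature=0.75,
  top_p=0.95,
  do_sample=True,
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))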
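
The temperature and top_p arguments in the snippet above shape how each next token is drawn. The following is a minimal pure-Python sketch of that idea, not the transformers implementation: the function name sample_top_p and its rng parameter are hypothetical, and the library's own filtering differs in details such as tie handling.

```python
import math
import random


def sample_top_p(logits, temperature=0.75, top_p=0.95, rng=None):
    """Sample one token index from raw logits with temperature and top-p filtering."""
    rng = rng or random.Random()
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest set of most-likely tokens whose mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Draw from the kept tokens; random.choices renormalizes the weights.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]
```

With one dominant logit, for example sample_top_p([10.0, 0.0, 0.0, 0.0]), only the first token survives the top-p cut, so index 0 is always returned. Flatter logits leave more candidates in the pool and give more varied output.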

Model Architecture

The model is a decoder-only transformer similar to the LLaMA (Touvron et al., 2023) architecture, with the dimensions and modifications listed below:

Parameters	Hidden Size	Layers	Heads	Sequence Length
2,795,443,200	2560	32	32	4096
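
As a rough sanity check, the parameter count in the table can be rebuilt from the listed dimensions. This card does not state the vocabulary size, the feed-forward width, or whether the output projection is tied, so the sketch below assumes values matching the base StableLM-3B configuration (vocab_size = 50304, intermediate_size = 6912, untied output head, bias-free attention and MLP projections). Treat these as assumptions, not documented facts about this checkpoint.

```python
# Values from the table above.
hidden_size = 2560
num_layers = 32
num_heads = 32

# Assumptions not stated in this card (taken to match the base model's config).
vocab_size = 50304          # assumed padded GPT-NeoX vocabulary size
intermediate_size = 6912    # assumed feed-forward width

head_dim = hidden_size // num_heads            # 80 dimensions per attention head

embeddings = vocab_size * hidden_size          # input token embeddings
lm_head = vocab_size * hidden_size             # assumed untied output projection
attention = 4 * hidden_size * hidden_size      # q, k, v, o projections, assumed bias-free
mlp = 3 * hidden_size * intermediate_size      # assumed gated MLP: gate, up, down
layer_norms = 2 * (2 * hidden_size)            # two LayerNorms per block, weight + bias
final_norm = 2 * hidden_size                   # final LayerNorm, weight + bias

total = embeddings + lm_head + num_layers * (attention + mlp + layer_norms) + final_norm
print(f"{total:,}")  # 2,795,443,200
```

Under these assumptions the total matches the table exactly, which suggests the fine-tune keeps the base model's shapes.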

Position Embeddings: Rotary Position Embeddings (Su et al., 2021) applied to the first 25% of head embedding dimensions for improved throughput following Black et al. (2022).
Normalization: LayerNorm (Ba et al., 2016) with learned bias terms as opposed to RMSNorm (Zhang & Sennrich, 2019).
Tokenizer: GPT-NeoX (Black et al., 2022).
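
The first two modifications above can be sketched in NumPy. This is an illustrative sketch, not the model's actual code: the function names, the rotary base of 10000, and the GPT-NeoX-style half-rotation layout are assumptions. With a hidden size of 2560 and 32 heads, each head has 80 dimensions, so 25% means only the first 20 are rotated.

```python
import numpy as np


def rotate_half(x):
    """Swap the two halves of the last axis, negating the second half."""
    half = x.shape[-1] // 2
    return np.concatenate([-x[..., half:], x[..., :half]], axis=-1)


def partial_rotary(x, rotary_pct=0.25, base=10000.0):
    """Apply rotary embeddings to the first rotary_pct of each head's dimensions.

    x has shape (seq_len, num_heads, head_dim). base is an assumed default.
    """
    seq_len, _, head_dim = x.shape
    rot_dim = int(head_dim * rotary_pct)               # 80 * 0.25 = 20 for this model
    x_rot, x_pass = x[..., :rot_dim], x[..., rot_dim:]

    # One frequency per pair of rotated dimensions.
    inv_freq = 1.0 / (base ** (np.arange(0, rot_dim, 2) / rot_dim))
    angles = np.outer(np.arange(seq_len), inv_freq)    # (seq_len, rot_dim / 2)
    angles = np.concatenate([angles, angles], axis=-1) # (seq_len, rot_dim)
    cos = np.cos(angles)[:, None, :]                   # broadcast over heads
    sin = np.sin(angles)[:, None, :]

    x_rot = x_rot * cos + rotate_half(x_rot) * sin
    # The remaining dimensions pass through without positional rotation.
    return np.concatenate([x_rot, x_pass], axis=-1)


def layer_norm(x, weight, bias, eps=1e-5):
    """LayerNorm: subtract the mean, divide by the standard deviation, scale and shift."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps) * weight + bias


def rms_norm(x, weight, eps=1e-5):
    """RMSNorm: divide by the root mean square only; no mean subtraction, no bias."""
    rms = np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)
    return x / rms * weight


# Usage: 4 positions, 32 heads, 80 dimensions per head.
q = np.random.default_rng(0).standard_normal((4, 32, 80))
q_rotated = partial_rotary(q)
```

Rotating only a slice of each head means less trigonometric work per token, which is the throughput argument cited above. The two normalization functions show the difference the card describes: layer_norm centers the activations and carries a learned bias, while rms_norm only rescales them.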
