---
license: mit
---

# ChiMed-GPT

ChiMed-GPT is a Chinese medical large language model (LLM) built by continually training [Ziya-v2](https://arxiv.org/abs/2311.03301) on Chinese medical data through a full training regime of pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF).
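
As a rough illustration of the SFT stage, the sketch below shows a generic instruction-tuning loop with the Hugging Face `Trainer`. The base checkpoint id, the data file `medical_sft.json`, and all hyperparameters are illustrative assumptions, not the authors' actual configuration.

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, LlamaForCausalLM, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

# Assumed base checkpoint id; placeholder, not the authors' exact setup.
base = 'IDEA-CCNL/Ziya2-13B-Base'
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = LlamaForCausalLM.from_pretrained(base)

def to_features(example):
    # Render each Q/A pair in the "[human]:...\n[bot]:..." prompt format
    # that the released model expects at inference time.
    text = f"[human]:{example['question']}\n[bot]:{example['answer']}"
    return tokenizer(text, truncation=True, max_length=1024)

# Hypothetical JSON file of {"question": ..., "answer": ...} records.
data = load_dataset('json', data_files='medical_sft.json')['train'].map(to_features)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir='chimed-sft',
                           per_device_train_batch_size=1,
                           num_train_epochs=1,
                           fp16=True),
    train_dataset=data,
    # Causal-LM collation: pads batches and copies input_ids to labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```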

More information about the model is coming soon.

## Citation

If you use or extend our work, please cite the following [paper](https://arxiv.org/abs/2311.06025):

```
@article{USTC-ChiMed-GPT,
  title={ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences},
  author={Yuanhe Tian and Ruyi Gan and Yan Song and Jiaxing Zhang and Yongdong Zhang},
  journal={arXiv preprint arXiv:2311.06025},
  year={2023}
}
```

## Usage

```python
import torch
from transformers import AutoTokenizer, LlamaForCausalLM

ckpt = 'SYNLP/ChiMed-GPT-1.0'

# Prompts follow the "[human]:<query>\n[bot]:" format.
# The query below asks: "How should a cold be treated?"
query = "[human]:感冒怎么处理?\n[bot]:"

model = LlamaForCausalLM.from_pretrained(ckpt, torch_dtype=torch.float16, device_map="auto").eval()
tokenizer = AutoTokenizer.from_pretrained(ckpt)

input_ids = tokenizer(query, return_tensors="pt").input_ids.to(model.device)
generate_ids = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    top_p=0.9,
)
output = tokenizer.batch_decode(generate_ids, skip_special_tokens=True)[0]
print(output)
```
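
With `do_sample=True` and `top_p=0.9`, `generate` uses nucleus sampling, so outputs vary across runs; pass `do_sample=False` instead for deterministic greedy decoding.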