---
library_name: transformers
tags:
- text-generation-inference
- sft
- trl
- 4-bit precision
- bitsandbytes
- LoRA
- Fine-Tuning with LoRA
- LLM
- GenAI
- NT GenAI
- ntgenai
- lahnmah
- NT Thai GPT
- ntthaigpt
datasets:
- Thaweewat/thai-med-pack
language:
- th
base_model:
- openthaigpt/openthaigpt1.5-7b-instruct
pipeline_tag: text-generation
license: apache-2.0
new_version: Aekanun/openthaigpt-MedChatModelv5.1
---
# Model Card for `openthaigpt1.5-7b-medical-tuned`
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/U0TIiWGdNaxl_9TH90gIx.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/mAZBm9Dk7-S-FQ4srj3aG.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/PgRsAWRPGw6T2tsF2aJ3W.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/lmreg4ibgBllTvzfhMeSU.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/cPJ3PWKcqwV2ynNWO1Qrs.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/mkM8VavlG9xHhgNlZ9E1X.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/MecCnAmLlYdpBjwJjMQFu.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/ijHMzw9Zrpm23o89vzsSc.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/hOIyuIA_zT7_s8SG-ZDWQ.png)
<!-- Provide a quick summary of what the model is/does. -->
This model is fine-tuned from `openthaigpt1.5-7b-instruct` using Supervised Fine-Tuning (SFT) on the `Thaweewat/thai-med-pack` dataset. It is intended for medical question answering in Thai and aims to give accurate, context-aware answers to medical questions.
## Model Details
### Model Description
This model was fine-tuned using Supervised Fine-Tuning (SFT) to optimize it for medical question answering in Thai. The base model is `openthaigpt1.5-7b-instruct`, and it has been enhanced with domain-specific knowledge using the `Thaweewat/thai-med-pack` dataset.
- **Developed by:** Amornpan Phornchaicharoen
- **Fine-tuned by:** Amornpan Phornchaicharoen
- **Model type:** Causal Language Model (AutoModelForCausalLM)
- **Language(s):** Thai
- **License:** apache-2.0
- **Fine-tuned from model:** `openthaigpt1.5-7b-instruct`
- **Dataset used for fine-tuning:** `Thaweewat/thai-med-pack`
### Model Sources
- **Repository:** https://huggingface.co/amornpan
- **Citing Repository:** https://huggingface.co/Aekanun
- **Base Model:** https://huggingface.co/openthaigpt/openthaigpt1.5-7b-instruct
- **Dataset:** https://huggingface.co/datasets/Thaweewat/thai-med-pack
## Uses
### Direct Use
The model can be directly used for generating medical responses in Thai. It has been optimized for:
- Medical question-answering
- Providing clinical information
- Health-related dialogue generation
### Downstream Use
This model can be used as a foundational model for medical assistance systems, chatbots, and applications related to healthcare, specifically in the Thai language.
### Out-of-Scope Use
- This model should not be used for real-time diagnosis or emergency medical scenarios.
- Avoid using it for critical clinical decisions without human oversight, as the model is not intended to replace professional medical advice.
## Bias, Risks, and Limitations
### Bias
- The model might reflect biases present in the dataset, particularly when addressing underrepresented medical conditions or topics.
### Risks
- Responses may contain inaccuracies due to the inherent limitations of the model and the dataset used for fine-tuning.
- This model should not be used as the sole source of medical advice.
### Limitations
- Limited to the medical domain.
- The model is sensitive to prompts and may generate off-topic responses for non-medical queries.
## How to Get Started with the Model
Here’s how to load and use the model for generating medical responses in Thai:
## 1. Install the Required Packages
First, ensure you have installed the required libraries by running:
```bash
pip install torch transformers bitsandbytes accelerate
```
## 2. Load the Model and Tokenizer
You can load the model and tokenizer directly from Hugging Face using the following code:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
# Define the model path
model_path = 'amornpan/openthaigpt-MedChatModelv11'
# Load the tokenizer and reuse the EOS token as the padding token
tokenizer = AutoTokenizer.from_pretrained(model_path)
tokenizer.pad_token = tokenizer.eos_token
```
## 3. Prepare Your Input (Custom Prompt)
Create a custom medical prompt that you want the model to respond to:
```python
# Thai for: "Please describe what early-stage oral cancer looks like in the mouth."
custom_prompt = "โปรดอธิบายลักษณะช่องปากที่เป็นมะเร็งในระยะเริ่มต้น"
PROMPT = f'[INST] <You are a question answering assistant. Answer the question as truthfully and helpfully as possible. คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด<>{custom_prompt}[/INST]'
# Tokenize the input prompt
inputs = tokenizer(PROMPT, return_tensors="pt", padding=True, truncation=True)
```
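When asking several questions, the template above can be wrapped in a small helper so that the system instruction and the `[INST]` markers stay identical across calls. This is a minimal sketch: the names `SYSTEM_PROMPT` and `build_prompt` are illustrative and not part of the model's API, while the template text itself is copied verbatim from the example above.

```python
# Illustrative helper (hypothetical names). The template text is copied
# verbatim from the example prompt in this model card.
SYSTEM_PROMPT = (
    "You are a question answering assistant. "
    "Answer the question as truthfully and helpfully as possible. "
    "คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด"
)


def build_prompt(question: str) -> str:
    """Wrap a user question in the [INST] ... [/INST] template."""
    return f"[INST] <{SYSTEM_PROMPT}<>{question.strip()}[/INST]"


if __name__ == "__main__":
    print(build_prompt("โปรดอธิบายลักษณะช่องปากที่เป็นมะเร็งในระยะเริ่มต้น"))
```

The returned string can be passed to the tokenizer in place of `PROMPT`.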
## 4. Configure the Model for Efficient Loading (4-bit Quantization)
The model uses 4-bit precision for efficient inference. Here’s how to set up the configuration:
```python
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16
)
```
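To get a rough sense of what 4-bit loading saves, the weight memory can be estimated from the parameter count and the bits stored per parameter. The sketch below assumes roughly 7 billion parameters (inferred from the "7b" in the base model name, not a measured count) and counts weights only: it ignores quantization constants, any layers kept in higher precision, activations, and the KV cache, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    n = 7e9  # assumed parameter count, taken from the "7b" in the model name
    print(f"fp16 : {weight_memory_gb(n, 16):.1f} GB")
    print(f"4-bit: {weight_memory_gb(n, 4):.1f} GB")
```

Under these assumptions the weights shrink to about a quarter of their 16-bit size.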
## 5. Load the Model with Quantization Support
Now, load the model with the 4-bit quantization settings:
```python
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=bnb_config,
    trust_remote_code=True
)
```
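## 6. Move the Inputs to the Model's Device
A model loaded with a bitsandbytes quantization config is already placed on its device during loading, and some `transformers` versions raise an error if `.to()` is called on it. Move only the input tensors to the device the model is on:
```python
# The quantized model is already on its device; do not call model.to(...)
device = model.device
inputs = {k: v.to(device) for k, v in inputs.items()}
```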
## 7. Generate a Response from the Model
Now, generate the medical response by running the model:
```python
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True)
```
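Because `do_sample=True` draws tokens at random, repeated runs give different answers. One possible way to keep the generation settings in one place and switch between sampling and greedy decoding is sketched below. The helper name `make_generation_kwargs` is hypothetical, and the `temperature` and `top_p` values are illustrative defaults rather than settings tuned for this model. All of the keys are standard `generate` arguments.

```python
from typing import Any, Dict


def make_generation_kwargs(pad_token_id: int, sample: bool = True) -> Dict[str, Any]:
    """Build keyword arguments for `model.generate` (hypothetical helper)."""
    kwargs: Dict[str, Any] = {
        "max_new_tokens": 200,         # same limit as the call above
        "do_sample": sample,           # False selects greedy decoding
        "pad_token_id": pad_token_id,  # set the padding token explicitly
    }
    if sample:
        # Illustrative values, not tuned for this model
        kwargs["temperature"] = 0.7
        kwargs["top_p"] = 0.9
    return kwargs


# Usage, assuming `model`, `tokenizer`, and `inputs` from the earlier steps:
# outputs = model.generate(
#     **inputs, **make_generation_kwargs(tokenizer.pad_token_id, sample=False)
# )
```

Greedy decoding (`sample=False`) returns the same answer for the same prompt, which makes outputs easier to review.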
## 8. Decode the Generated Text
Finally, decode and print the response from the model:
```python
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```
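The decoded text contains the prompt followed by the model's reply. A minimal sketch for keeping only the reply is shown below. It assumes the reply follows the last `[/INST]` marker and may begin with an `[ANS]` tag, as in the sample output in this card. The function name `extract_answer` is illustrative.

```python
def extract_answer(generated_text: str) -> str:
    """Return the text after the last [/INST] marker, without a leading [ANS] tag."""
    # Keep only what follows the prompt; if the marker is absent, keep everything.
    _, _, answer = generated_text.rpartition("[/INST]")
    answer = answer.strip()
    # Assumption: the reply may be prefixed with "[ANS]", as in the sample output.
    if answer.startswith("[ANS]"):
        answer = answer[len("[ANS]"):].strip()
    return answer


if __name__ == "__main__":
    print(extract_answer("[INST] <system<>question[/INST] [ANS] answer text"))
```

An alternative that avoids string matching is to decode only the newly generated tokens, for example by slicing `outputs[0]` from the length of the input IDs onward.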
## 9. Output
```text
[INST] <You are a question answering assistant. Answer the question as truthfully and helpfully as possible. คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด<>โปรดอธิบายลักษณะช่องปากที่เป็นมะเร็งในระยะเริ่มต้น[/INST] [ANS] ช่องปากมะเร็งในระยะเริ่มต้น อาจไม่มีอาการชัดเจน แต่ผู้คนบางกลุ่มอาจสังเกตเห็นอาการต่อไปนี้:
- มีการกัดหรือกระแทกบริเวณช่องปากโดยไม่มีสาเหตุ
- มีจุด ฝี เมล็ด หรือความไม่เท่าเทียมภายในช่องปากที่ไม่หายวื้อ
- ปวดหรือเจ็บบริเวณช่องปาก
- เปลี่ยนแปลงสีของเนื้อเยื่อในช่องปาก (อาจเป็นสีขาว หรือ黑马)
- มีตุ่มที่ไม่หาย ภายในช่องปาก
- มีความลำบากในการกิน มี
```
## More Information
For questions about this model, contact `amornpan@gmail.com`.