amornpan/openthaigpt-MedChatModelv11

amornpan committed on Oct 25, 2024

Commit 4f0b5c1 (verified) · 1 Parent(s): c3ad444

Update README.md

Files changed (1)

README.md +17 -12

@@ -103,17 +103,17 @@ Here’s how to load and use the model for generating medical responses in Thai:
 First, ensure you have installed the required libraries by running:
-python
 pip install torch transformers bitsandbytes
 ## 2. Load the Model and Tokenizer
 You can load the model and tokenizer directly from Hugging Face using the following code:
-python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
 # Define the model path
 model_path = 'amornpan/openthaigpt-MedChatModelv11'
@@ -125,64 +125,69 @@ tokenizer.pad_token = tokenizer.eos_token
 Create a custom medical prompt that you want the model to respond to:
-python
 custom_prompt = "โปรดอธิบายลักษณะช่องปากที่เป็นมะเร็งในระยะเริ่มต้น"
 PROMPT = f'[INST] <You are a question answering assistant. Answer the question as truthfully and helpfully as possible. คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด<>{custom_prompt}[/INST]'
 # Tokenize the input prompt
 inputs = tokenizer(PROMPT, return_tensors="pt", padding=True, truncation=True)
 ## 4. Configure the Model for Efficient Loading (4-bit Quantization)
 The model uses 4-bit precision for efficient inference. Here’s how to set up the configuration:
-python
 bnb_config = BitsAndBytesConfig(
     load_in_4bit=True,
     bnb_4bit_quant_type="nf4",
     bnb_4bit_compute_dtype=torch.float16
 )
 ## 5. Load the Model with Quantization Support
 Now, load the model with the 4-bit quantization settings:
-python
 model = AutoModelForCausalLM.from_pretrained(
     model_path,
     quantization_config=bnb_config,
     trust_remote_code=True
 )
 ## 6. Move the Model and Inputs to the GPU (prefer GPU)
 For faster inference, move the model and input tensors to a GPU, if available:
-python
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 model.to(device)
 inputs = {k: v.to(device) for k, v in inputs.items()}
 ## 7. Generate a Response from the Model
 Now, generate the medical response by running the model:
-python
 outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True)
 ## 8. Decode the Generated Text
 Finally, decode and print the response from the model:
-python
 generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
 print(generated_text)
 ## 9. Output
-python
 [INST] <You are a question answering assistant. Answer the question as truthfully and helpfully as possible. คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด<>โปรดอธิบายลักษณะช่องปากที่เป็นมะเร็งในระยะเริ่มต้น[/INST] มะเร็งช่องปากเป็นมะเร็งเพียงชนิดเดียวที่ได้รับผลกระทบจากนิโคติน มันคือผู้ชายกลุ่มอายุ 60 – 75 คน คุณจะแสดงอาการและเกิดขึ้นอย่างรวดเร็วหากเกิดมะเร็งช่องปาก คุณจะสังเกตเห็นปื้นแพร่กระจายของเนื้องอก ส่วนใหญ่ในช่องปาก เนื้องอกแสดงว่าเป็นเจ้าแห่ที่กำลังทำลายเยียวยา ค้นหาทั้งภายในและภายนอกลิ้นที่อยู่ติดกางเกงป่อง มะเร็งกระเพาะปัสสาวะหรือมะเร็งกล้ามเนื้อกระเพาะ
 ### Authors
 * Amornpan Phornchaicharoen (amornpan@gmail.com)

 First, ensure you have installed the required libraries by running:
+```bash
 pip install torch transformers bitsandbytes
+```
 ## 2. Load the Model and Tokenizer
 You can load the model and tokenizer directly from Hugging Face using the following code:
+```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+```
 # Define the model path
 model_path = 'amornpan/openthaigpt-MedChatModelv11'
 Create a custom medical prompt that you want the model to respond to:
+```python
 custom_prompt = "โปรดอธิบายลักษณะช่องปากที่เป็นมะเร็งในระยะเริ่มต้น"
 PROMPT = f'[INST] <You are a question answering assistant. Answer the question as truthfully and helpfully as possible. คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด<>{custom_prompt}[/INST]'
 # Tokenize the input prompt
 inputs = tokenizer(PROMPT, return_tensors="pt", padding=True, truncation=True)
+```
 ## 4. Configure the Model for Efficient Loading (4-bit Quantization)
 The model uses 4-bit precision for efficient inference. Here’s how to set up the configuration:
+```python
 bnb_config = BitsAndBytesConfig(
     load_in_4bit=True,
     bnb_4bit_quant_type="nf4",
     bnb_4bit_compute_dtype=torch.float16
 )
+```
 ## 5. Load the Model with Quantization Support
 Now, load the model with the 4-bit quantization settings:
+```python
 model = AutoModelForCausalLM.from_pretrained(
     model_path,
     quantization_config=bnb_config,
     trust_remote_code=True
 )
+```
 ## 6. Move the Model and Inputs to the GPU (prefer GPU)
 For faster inference, move the model and input tensors to a GPU, if available:
+```python
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 model.to(device)
 inputs = {k: v.to(device) for k, v in inputs.items()}
+```
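Depending on the installed versions of `transformers` and `bitsandbytes`, calling `model.to(device)` on a model loaded in 4-bit can raise an error, because the quantized weights are already placed on a device at load time. A minimal sketch of an alternative follows, assuming the `accelerate` package is installed so that `device_map="auto"` is available. The helper names `load_quantized_model` and `move_inputs_to_model` are hypothetical and not part of the README.

```python
def load_quantized_model(model_path: str):
    """Load a 4-bit model and let the loader choose device placement."""
    # Heavy dependencies are imported lazily so that defining this helper
    # does not itself require the GPU stack to be installed.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # Same quantization settings as step 4 of the README.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    # device_map="auto" places the quantized weights while loading, so no
    # later model.to(device) call is needed (assumption: accelerate is installed).
    return AutoModelForCausalLM.from_pretrained(
        model_path,
        quantization_config=bnb_config,
        device_map="auto",
        trust_remote_code=True,
    )


def move_inputs_to_model(inputs: dict, model) -> dict:
    """Move every tokenized tensor to the device the model already lives on."""
    return {name: tensor.to(model.device) for name, tensor in inputs.items()}
```

With these helpers, steps 5 and 6 would reduce to `model = load_quantized_model(model_path)` followed by `inputs = move_inputs_to_model(inputs, model)`.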
 ## 7. Generate a Response from the Model
 Now, generate the medical response by running the model:
+```python
 outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True)
+```
 ## 8. Decode the Generated Text
 Finally, decode and print the response from the model:
+```python
 generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
 print(generated_text)
+```
 ## 9. Output
+```python
 [INST] <You are a question answering assistant. Answer the question as truthfully and helpfully as possible. คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด<>โปรดอธิบายลักษณะช่องปากที่เป็นมะเร็งในระยะเริ่มต้น[/INST] มะเร็งช่องปากเป็นมะเร็งเพียงชนิดเดียวที่ได้รับผลกระทบจากนิโคติน มันคือผู้ชายกลุ่มอายุ 60 – 75 คน คุณจะแสดงอาการและเกิดขึ้นอย่างรวดเร็วหากเกิดมะเร็งช่องปาก คุณจะสังเกตเห็นปื้นแพร่กระจายของเนื้องอก ส่วนใหญ่ในช่องปาก เนื้องอกแสดงว่าเป็นเจ้าแห่ที่กำลังทำลายเยียวยา ค้นหาทั้งภายในและภายนอกลิ้นที่อยู่ติดกางเกงป่อง มะเร็งกระเพาะปัสสาวะหรือมะเร็งกล้ามเนื้อกระเพาะ
+```
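The sample output above repeats the full prompt before the answer, since `[INST]` and `[/INST]` are not removed by `skip_special_tokens=True`. A small sketch of one way to keep only the answer is shown below. The helper `extract_answer` is hypothetical and assumes the answer is everything after the last `[/INST]` marker.

```python
def extract_answer(generated_text: str, end_marker: str = "[/INST]") -> str:
    """Return only the text that follows the last instruction end marker."""
    # rpartition splits on the last occurrence of the marker; if the marker
    # is absent, the whole text is returned (with surrounding whitespace stripped).
    _, _, answer = generated_text.rpartition(end_marker)
    return answer.strip()
```

For example, `print(extract_answer(generated_text))` would print the response without the prompt. An alternative that avoids string matching is to slice off the prompt tokens before decoding, for instance `tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)`.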
 ### Authors
 * Amornpan Phornchaicharoen (amornpan@gmail.com)