---
base_model: unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit
library_name: peft
license: mit
datasets:
- FreedomIntelligence/medical-o1-reasoning-SFT
language:
- en
tags:
- medical
---

# Model Card for DeepSeek-R1-Medical-COT

## Model Details

### Model Description

DeepSeek-R1-Medical-COT is a fine-tuned version of DeepSeek-R1-Distill-Llama-8B, optimized for medical chain-of-thought (CoT) reasoning. It is designed to assist with medical tasks such as question answering, reasoning, and decision support, and is particularly useful for applications that require structured reasoning in the medical domain.

- **Developed by:** Mohamed Mahmoud
- **Funded by:** Independent project
- **Shared by:** Mohamed Mahmoud
- **Model type:** Transformer-based Large Language Model (LLM)
- **Language(s) (NLP):** English (en)
- **License:** MIT
- **Finetuned from model:** unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit

### Model Sources

- **Repository:** [Hugging Face Model Repo](https://huggingface.co/thesnak/DeepSeek-R1-Medical-COT)
- **LinkedIn:** [Mohamed Mahmoud](https://www.linkedin.com/in/mohamed-thesnak)

## Uses

### Direct Use

The model can be used directly for medical reasoning tasks, including:

- Answering medical questions
- Assisting in medical decision-making
- Supporting clinical research and literature review

### Downstream Use

- Fine-tuning for specialized medical applications
- Integration into chatbots and virtual assistants for medical advice
- Educational tools for medical students

### Out-of-Scope Use

- This model is not a replacement for professional medical advice.
- It should not be used for clinical decision-making without expert validation.
- It may not perform well in languages other than English.

## Bias, Risks, and Limitations

Although fine-tuned for medical reasoning, the model may still exhibit biases inherited from its training data. Users should exercise caution and validate critical outputs with medical professionals.

### Recommendations

Users should verify outputs, particularly in sensitive medical contexts. The model is best used as an assistive tool rather than a primary decision-making system.

## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "thesnak/DeepSeek-R1-Medical-COT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

input_text = "What are the symptoms of pneumonia?"
# Move inputs to whichever device device_map placed the model on,
# rather than hard-coding "cuda".
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
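
Because this repository was trained with PEFT (`library_name: peft` above), the published weights may be a LoRA adapter rather than a merged model. If loading the repo directly with `AutoModelForCausalLM` fails, a sketch of adapter-style loading, assuming the adapter sits on top of the base model listed above:

```python
def load_medical_cot(
    adapter_id="thesnak/DeepSeek-R1-Medical-COT",
    base_id="unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit",
):
    """Load the 4-bit base model, then attach the fine-tuned adapter on top."""
    # Imports are deferred so the function can be defined without the
    # heavyweight dependencies installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    model = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.float16, device_map="auto"
    )
    # Attach the LoRA adapter weights to the base model.
    model = PeftModel.from_pretrained(model, adapter_id)
    return tokenizer, model
```

Calling `load_medical_cot()` downloads the base model and adapter, so it requires a GPU machine with `transformers`, `peft`, and `bitsandbytes` installed.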

## Training Details

### Training Data

The model was fine-tuned on the **FreedomIntelligence/medical-o1-reasoning-SFT** dataset, which contains medical question-answer pairs with chain-of-thought traces designed to improve reasoning capabilities.
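
SFT datasets like this are typically rendered into a single training string per example. A minimal sketch of such a template; the exact template used for this model is not documented here, and the field names mirror the dataset's Question / Complex_CoT / Response columns:

```python
# Hypothetical training-prompt template -- an illustrative assumption,
# not the verbatim template used to train this model.
TEMPLATE = """### Question:
{question}

### Response:
<think>
{cot}
</think>
{answer}"""


def format_example(question: str, cot: str, answer: str) -> str:
    """Render one dataset row into a single SFT training string."""
    return TEMPLATE.format(question=question, cot=cot, answer=answer)


sample = format_example(
    "What are the symptoms of pneumonia?",
    "Pneumonia is an infection of the lung parenchyma, so ...",
    "Common symptoms include cough, fever, and shortness of breath.",
)
```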

### Training Procedure

#### Preprocessing

- Tokenization using the Llama tokenizer
- Text cleaning and normalization

#### Training Hyperparameters

- **Precision:** fp16 mixed precision (the P100 used for training does not support bf16)
- **Optimizer:** AdamW
- **Batch size:** 16
- **Learning rate:** 2e-5
- **Epochs:** 3
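
The hyperparameters above map roughly onto Hugging Face `TrainingArguments`-style fields. A sketch of that mapping, not the actual training script:

```python
# Illustrative mapping of the listed hyperparameters onto
# TrainingArguments-style field names; values are from the model card.
training_config = {
    "per_device_train_batch_size": 16,
    "learning_rate": 2e-5,
    "num_train_epochs": 3,
    "optim": "adamw_torch",
    # Mixed precision: fp16 works on pre-Ampere GPUs such as the P100;
    # bf16 requires Ampere (e.g. A100) or newer.
    "fp16": True,
}
```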

#### Speeds, Sizes, Times

- **Training time:** Approximately 12 hours on a P100 GPU (Kaggle)
- **Model size:** 8B parameters (4-bit quantized with bitsandbytes)

#### Training Loss

| Step | Training Loss | |
|
| ---- | ------------- | |
|
| 10 | 1.919000 | |
|
| 20 | 1.461800 | |
|
| 30 | 1.402500 | |
|
| 40 | 1.309000 | |
|
| 50 | 1.344400 | |
|
| 60 | 1.314100 | |

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

- The model was evaluated on held-out samples from **FreedomIntelligence/medical-o1-reasoning-SFT**.

#### Factors

- Performance was assessed on medical reasoning tasks.

#### Metrics

- **Perplexity:** Measured for general coherence.
- **Accuracy:** Evaluated against expert-verified responses.
- **BLEU Score:** Used to measure n-gram overlap with reference responses.
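
Perplexity can be computed from per-token log-probabilities. A minimal standard-library sketch of the metric, for illustration rather than the evaluation script used here:

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(-mean log-likelihood per token)."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# A model assigning probability 0.5 to every token has perplexity 2.
print(perplexity([math.log(0.5)] * 4))  # close to 2.0
```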

### Results

- **Perplexity:** not yet reported
- **Accuracy:** not yet reported
- **BLEU Score:** not yet reported

## Model Examination

Further interpretability analyses can be conducted with tools such as Captum and SHAP to examine how the model derives its medical reasoning responses.

## Environmental Impact

- **Hardware Type:** P100 GPU (Kaggle)
- **Hours used:** 2 hours
- **Cloud Provider:** Kaggle
- **Compute Region:** N/A
- **Carbon Emitted:** Estimated at 9.5 kg CO2eq
- **[Kaggle Notebook](https://www.kaggle.com/code/thesnak/fine-tune-deepseek)**

## Technical Specifications

### Compute Infrastructure

#### Hardware

- P100 GPU (16GB VRAM) on Kaggle

## Citation

**BibTeX:**

```bibtex
@misc{mahmoud2025deepseekmedcot,
  title={DeepSeek-R1-Medical-COT},
  author={Mohamed Mahmoud},
  year={2025},
  url={https://huggingface.co/thesnak/DeepSeek-R1-Medical-COT}
}
```

## Model Card Authors

- Mohamed Mahmoud

## Model Card Contact

- [LinkedIn](https://www.linkedin.com/in/mohamed-thesnak)