Model Card: Labrador Transformer Model

Model Overview

Labrador is a transformer-based model pre-trained with a masked language modeling (MLM) objective on clinical laboratory data, specifically routine morning lab values from the MIMIC-IV dataset. It is trained to predict masked laboratory values from the remaining values, with the goal of supporting clinical informatics applications.

Intended Use

Primary Application: Research and analysis in clinical informatics, with a focus on laboratory data interpretation and prediction.
Target Users: Researchers, data scientists, and healthcare professionals with expertise in machine learning and clinical data.

Model/Data Specifications

Input Data: Laboratory values including Bicarbonate (Bic), Creatinine (Crt), Potassium (Pot), Sodium (Sod), Urea (Ure), Hemoglobin (Hgb), Platelets (Plt), and White Blood Cell count (Wbc).
Model Outputs: Predicted laboratory values, provided in both categorical and continuous form.
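
As a rough illustration of the input described above, the eight lab values can be arranged in a fixed order before being passed to a model. This is a minimal sketch, not the released preprocessing code: the helper name to_feature_vector, the use of None for a missing lab, and the example numbers are all assumptions made for illustration. Only the lab abbreviations come from this card.

```python
# Lab abbreviations as listed in this model card.
LAB_FEATURES = ("Bic", "Crt", "Pot", "Sod", "Ure", "Hgb", "Plt", "Wbc")


def to_feature_vector(record):
    """Order a {lab: value} mapping by LAB_FEATURES (hypothetical helper).

    Labs absent from the record are returned as None. Lab codes that are
    not in LAB_FEATURES are rejected rather than silently dropped.
    """
    unknown = set(record) - set(LAB_FEATURES)
    if unknown:
        raise KeyError(f"unexpected lab codes: {sorted(unknown)}")
    return [record.get(name) for name in LAB_FEATURES]


# Placeholder numbers for illustration only; they are not taken from MIMIC-IV.
example = {"Sod": 140.0, "Pot": 4.1, "Crt": 0.9}
vector = to_feature_vector(example)
```

Fixing the feature order in one place keeps every record aligned to the same positions, which matters if position carries meaning in the model input. How the actual model encodes missing values is not stated in this card.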

Training Data

The model was trained on de-identified data from the MIMIC-IV dataset, specifically routine morning lab values from patients at Beth Israel Deaconess Medical Center.

Model Architecture & Parameters

Embedding Dimension: 756
Hidden Dimension: 756
Transformer Heads: 4
Number of Blocks: 10
Feedforward Dimension: 1024
Dropout Rate: 0.3
Activation: ReLU
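
The hyperparameters above can be collected in a small configuration object, which also makes it easy to check that they are mutually consistent. The sketch below is an assumption about how one might organize them: the class name LabradorConfig and its field names are hypothetical, and the per-head dimension assumes the standard multi-head attention convention of splitting the hidden dimension evenly across heads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabradorConfig:
    """Hyperparameters from this card (class and field names are hypothetical)."""

    embedding_dim: int = 756
    hidden_dim: int = 756
    num_heads: int = 4
    num_blocks: int = 10
    feedforward_dim: int = 1024
    dropout_rate: float = 0.3
    activation: str = "relu"

    @property
    def head_dim(self) -> int:
        # Assumes the usual convention: hidden_dim is split evenly across heads.
        if self.hidden_dim % self.num_heads != 0:
            raise ValueError("hidden_dim must be divisible by num_heads")
        return self.hidden_dim // self.num_heads


config = LabradorConfig()
per_head = config.head_dim  # 756 // 4
```

Under that convention, each of the 4 heads would attend over a 189-dimensional slice of the 756-dimensional hidden state.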

Training Details

Optimizer: Adam
Epochs: 12
Learning Rate: 8e-6
Batch Size: 512
Masking Ratio: 40%
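
To make the 40% masking ratio concrete, the sketch below hides a fixed fraction of positions in a sequence and records the hidden values as prediction targets. It is an illustration under stated assumptions, not the training code: the card does not say whether masking uses a fixed count per sequence or an independent draw per position, and the function name mask_values and the None sentinel are hypothetical.

```python
import random

MASK = None  # hypothetical sentinel standing in for a mask token


def mask_values(values, ratio=0.4, rng=None):
    """Hide round(ratio * len(values)) randomly chosen positions.

    Returns the masked sequence and a {position: original value} mapping
    that would serve as the prediction targets.
    """
    rng = rng or random.Random()
    n_mask = round(ratio * len(values))
    positions = set(rng.sample(range(len(values)), n_mask))
    masked = [MASK if i in positions else v for i, v in enumerate(values)]
    targets = {i: values[i] for i in sorted(positions)}
    return masked, targets


# Eight placeholder values, one per lab; a seeded RNG keeps the run repeatable.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
masked, targets = mask_values(values, ratio=0.4, rng=random.Random(0))
```

With eight inputs and a ratio of 0.4, this fixed-count variant masks 3 positions per sequence, since 3.2 rounds to 3.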

Limitations & Bias

Data Source Bias: The training data from a single healthcare institution may not be representative of broader populations.
Analytical Bias: The focus on specific lab values may not capture the full spectrum of patient health.
Generalization: The model's performance may vary across different healthcare settings and patient demographics.

Ethical Considerations

Data Privacy: Users must adhere to ethical standards and privacy laws when applying the model to sensitive health information.
Clinical Decision Making: The model's predictions should complement, not replace, clinical judgment and patient-specific considerations.

Acknowledgements

This work was supported by MIT Critical Data and uses the MIMIC-IV dataset. We thank all contributors to the MIMIC project and acknowledge the patients and healthcare providers who made this research possible.

Model Details

Name: Labrador
Version: 1.0
Release Date: January 28, 2024
Developer: David Restrepo
Affiliation: MIT Critical Data
Contact: davidres@mit.edu

License

This model is released under the MIT License.
