
A Generalizable Deep Learning System for Cardiac MRI

Rohan Shad, Cyril Zakka, Dhamanpreet Kaur, Robyn Fong, Ross Warren Filice, John Mongan, Kimberly Kalianos, Nishith Khandwala, David Eng, Matthew Leipzig, Walter Witschey, Alejandro de Feria, Victor Ferrari, Euan Ashley, Michael A. Acker, Curtis Langlotz, William Hiesinger

arXiv

Project overview:

Here we describe a transformer-based vision system that learns complex pathophysiological visual representations from a large multi-institutional dataset of 19,041 CMR scans, guided by natural language supervision from the text reports accompanying each CMR study. We use a large language model to help ‘teach’ a vision network to generate meaningful low-dimensional representations of CMR studies, by showing it examples of how radiologists describe what they see while drafting their reports. The model is trained with a contrastive learning objective (InfoNCE). The video encoder is an implementation of MViT (Multiscale Vision Transformers) initialized with Kinetics-400 pre-trained weights. The text encoder is an implementation of BERT (Bidirectional Encoder Representations from Transformers) pretrained on PubMed abstracts with a custom vocabulary. Please see our paper for more details. Link to GitHub Repo
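
To illustrate the kind of training objective described above, here is a minimal sketch of a symmetric InfoNCE contrastive loss between video and text embeddings, assuming PyTorch. The function name, temperature value, and embedding dimensions below are illustrative assumptions for exposition, not the exact configuration from the paper or the GitHub repository; in the real system the embeddings would come from the MViT video encoder and the BERT text encoder.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(video_emb: torch.Tensor, text_emb: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired video/text embeddings.

    video_emb, text_emb: (batch, dim) projections from the video and text
    encoders. The temperature value is an assumed default, not the paper's.
    """
    # L2-normalize so dot products are cosine similarities
    video_emb = F.normalize(video_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # Pairwise similarity matrix: logits[i, j] = sim(video_i, text_j) / T
    logits = video_emb @ text_emb.t() / temperature

    # Matched video/report pairs lie on the diagonal
    targets = torch.arange(video_emb.size(0), device=video_emb.device)

    # Contrast each video against all reports and each report against all
    # videos, then average the two cross-entropy terms
    loss_v2t = F.cross_entropy(logits, targets)
    loss_t2v = F.cross_entropy(logits.t(), targets)
    return (loss_v2t + loss_t2v) / 2

# Example usage with random tensors standing in for encoder outputs
video_emb = torch.randn(8, 512)
text_emb = torch.randn(8, 512)
loss = info_nce_loss(video_emb, text_emb)
```

The symmetric form (video-to-text plus text-to-report) is a common way to apply InfoNCE to paired modalities; the actual projection heads and batch construction used for the CMR studies are described in the paper and repository linked above.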
