---
license: mit
language:
- en
pipeline_tag: image-text-to-text
tags:
- medical
extra_gated_prompt: >-
  This model and associated code are released under the mit license and may only
  be used for non-commercial, academic research purposes with proper
  attribution. Any commercial use, sale, or other monetization of the SongCi
  model and its derivatives, which include models trained on outputs from the
  SongCi model or datasets created from the SongCi model, is prohibited and
  requires prior approval. Please note that the primary email used to sign up
  for your Hugging Face account must match your institutional email to receive
  approval. By downloading the model, you attest that all information
  (affiliation, research use) is correct and up-to-date. Downloading the model
  requires prior registration on Hugging Face and agreeing to the terms of use.
  By downloading this model, you agree not to distribute, publish or reproduce a
  copy of the model. If another user within your organization wishes to use the
  SongCi model, they must register as an individual user and agree to comply
  with the terms of use. Users may not attempt to re-identify the deidentified
  data used to develop the underlying model. If you are a commercial entity,
  please contact the corresponding author.
extra_gated_fields:
  Full name (first and last): text
  Current affiliation (no abbreviations): text
  Type of Affiliation:
    type: select
    options:
    - Academia
    - Industry
    - label: Other
      value: other
  Current and official institutional email (**this must match your primary email in your Hugging Face account, @gmail/@hotmail/@qq email domains will be denied**): text
  Please explain your intended research use: text
  I agree to all terms outlined above: checkbox
  I agree to use this model for non-commercial, academic purposes only: checkbox
  I agree not to distribute the model, if another user within your organization wishes to use the SongCi model, they must register as an individual user: checkbox
base_model:
- vinid/plip
---
|
|
|
# SongCi |
|
|
|
\[[GitHub Repo](https://github.com/shenxiaochenn/SongCi)\]
|
|
|
SongCi is a multi-modal deep learning model tailored for forensic pathological analyses.

Its architecture has three main parts: an imaging encoder that extracts features from whole-slide images (WSIs), a text encoder that embeds gross key findings and diagnostic queries, and a multi-modal fusion block that integrates the WSI and gross-key-finding embeddings and aligns them with the embeddings of the diagnostic queries.
|
|
|
![](https://huggingface.co/shenxiaochen/SongCi/resolve/main/model.png) |
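The alignment step described above can be illustrated with a small, dependency-free sketch. It assumes that the fused embedding is compared with each diagnostic-query embedding by cosine similarity, as is common in CLIP-style contrastive models; it does not reproduce the actual fusion block. The function names, query names, and vectors are hypothetical placeholders.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_queries(fused_embedding, query_embeddings):
    """Sort diagnostic queries by similarity to the fused embedding (best first)."""
    scores = {
        name: cosine_similarity(fused_embedding, emb)
        for name, emb in query_embeddings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy 3-dimensional embeddings (placeholders, not real model outputs).
fused = [0.9, 0.1, 0.0]
queries = {
    "query_a": [1.0, 0.0, 0.0],
    "query_b": [0.0, 1.0, 0.0],
    "query_c": [0.0, 0.0, 1.0],
}

ranking = rank_queries(fused, queries)
print(ranking[0][0])  # name of the best-matching query
```

In this toy setup the query whose embedding points in nearly the same direction as the fused embedding ranks first.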
|
|
|
|
|
# How to use SongCi
|
|
|
|
|
### Patch-level feature extraction
|
|
|
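```python
import torch

# ViT definitions; vision_former.py is expected to come from the SongCi GitHub repo.
import vision_former as vits

# Build the ViT-Small backbone (16x16 patch tokens, no classification head).
model = vits.__dict__['vit_small'](patch_size=16, num_classes=0)
model.load_state_dict(torch.load("./songci.pth", map_location="cpu"))

# Freeze the weights; the encoder is used for feature extraction only.
for p in model.parameters():
    p.requires_grad = False
model.eval()

# Dummy batch of 10 RGB patches at 224x224 resolution.
patches = torch.randn((10, 3, 224, 224))
with torch.no_grad():
    print(model(patches).shape)
```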
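The encoder above takes 224×224 patches, so a whole-slide image has to be cut into tiles first. The following is a minimal sketch of computing tile coordinates, assuming non-overlapping tiles and dropping partial tiles at the right and bottom edges. The tiling scheme actually used for SongCi is not described here, and `patch_grid` is a hypothetical helper.

```python
def patch_grid(width, height, patch_size=224):
    """Top-left (x, y) corners of non-overlapping patches that fit fully inside the region."""
    return [
        (x, y)
        for y in range(0, height - patch_size + 1, patch_size)
        for x in range(0, width - patch_size + 1, patch_size)
    ]


# A 1000x500 pixel region yields 4 columns x 2 rows of 224x224 patches.
coords = patch_grid(1000, 500)
print(len(coords))
print(coords[0], coords[-1])
```

Each coordinate can then be used to crop a patch from the slide with an image library of your choice before stacking the crops into a batch for the encoder.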
|
|
|
### Multi-modality fusion
|
|
|
```python
import torch
from transformers import CLIPModel

from model_fusion_plip import fusionblock2, fusionblock_wonum

# Fall back to the CPU when no GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"


def model_fusion(depth=2, noise_ratio=0.5, gate=True, num_em=True):
    # Import the prototype space.
    prototype_all = torch.load("songci_prototype.pt", map_location=device)
    # PLIP is passed as both the text model and the disease model.
    disease_model = CLIPModel.from_pretrained("vinid/plip")
    disease_model.eval()

    # num_em selects between fusionblock2 and fusionblock_wonum.
    if num_em:
        fusion = fusionblock2(prototype_all=prototype_all, text_model=disease_model,
                              disease_model=disease_model, depth=depth,
                              noise_ratio=noise_ratio, gated=gate)
    else:
        fusion = fusionblock_wonum(prototype_all=prototype_all, text_model=disease_model,
                                   disease_model=disease_model, depth=depth,
                                   noise_ratio=noise_ratio, gated=gate)
    return fusion


model = model_fusion()
model.load_state_dict(torch.load("fusion_checkpoint.pth", map_location="cpu"))
print("finish!!!!")
```
|
|
|
|
|
## License and Terms of Use |
|
This model and associated code are released under the MIT license and may only be used for non-commercial, academic research purposes with proper attribution. |
|
Any commercial use, sale, or other monetization of the SongCi model and its derivatives, which include models trained on outputs from the SongCi model or datasets created from the SongCi model, is prohibited and requires prior approval. |
|
Downloading the model requires prior registration on Hugging Face and agreeing to the terms of use. |
|
By downloading this model, you agree not to distribute, publish or reproduce a copy of the model. |
|
If another user within your organization wishes to use the SongCi model, they must register as an individual user and agree to comply with the terms of use. |
|
Users may not attempt to re-identify the deidentified data used to develop the underlying model. If you are a commercial entity, please contact the corresponding author. |
|
|
|
|
|
## Contact |
|
For any additional questions or comments, contact Chunfeng Lian (`chunfeng.lian@xjtu.edu.cn`) or Chen Shen (`shenxiaochen@stu.xjtu.edu.cn`).
|
|
|
## BibTeX |
|
|
|
```bibtex
@misc{shen2024largevocabularyforensicpathologicalanalyses,
      title={Large-vocabulary forensic pathological analyses via prototypical cross-modal contrastive learning},
      author={Chen Shen and Chunfeng Lian and Wanqing Zhang and Fan Wang and Jianhua Zhang and Shuanliang Fan and Xin Wei and Gongji Wang and Kehan Li and Hongshu Mu and Hao Wu and Xinggong Liang and Jianhua Ma and Zhenyuan Wang},
      year={2024},
      eprint={2407.14904},
      archivePrefix={arXiv},
      primaryClass={eess.IV},
      url={https://arxiv.org/abs/2407.14904},
}
```