---
license: apache-2.0
language: "en"
tags:
- longformer
- clinical
- biomedical
---

<span style="font-size:larger;">**KEPTlongformer**</span> is a medical-knowledge-enhanced version of Longformer that was further pre-trained using [contrastive learning](https://arxiv.org/pdf/2210.03304.pdf). 
The model achieved state-of-the-art (SOTA) performance on automatic ICD coding on MIMIC-III as of 11/12/2022.
A sister model with better performance is available [here](https://huggingface.co/whaleloops/KEPTlongformer-PMM3/).

### Pre-training
We initialized this model from [Clinical-Longformer](https://huggingface.co/yikuan8/Clinical-Longformer).

We then pre-trained it with Hierarchical Self-Alignment Pretrain (HSAP) using the UMLS knowledge graph.
HSAP covers three kinds of knowledge: (a) hierarchy, (b) synonyms, and (c) abbreviations. For more details, see Section 3.3 of the [paper](https://arxiv.org/pdf/2210.03304.pdf).
The learning rate was 5e-5, the weight decay was 0.01, and the Adam epsilon was 1e-5.

### Usage
See our [GitHub repository](https://github.com/whaleloops/KEPT/tree/rerank300) for how to use this model with prompts for automatic ICD coding.

That setup yields the following results:
| Metric | Score |
| ------------- | ------------- |
| rec_micro | 0.5729403619819988 |
| rec_macro | 0.11342156911120573 |
| rec_at_8 | 0.4094837705486378 |
| rec_at_75 | 0.8470734920535119 |
| rec_at_50 | 0.8005338782352 |
| rec_at_5 | 0.2891628170355805 |
| rec_at_15 | 0.5768778119750537 |
| prec_micro | 0.6411968713105065 |
| prec_macro | 0.12227610414493029 |
| prec_at_8 | 0.7760972716488731 |
| prec_at_75 | 0.197504942665085 |
| prec_at_50 | 0.2768090154211151 |
| prec_at_5 | 0.8483392645314354 |
| prec_at_15 | 0.6178529062870699 |
| f1_micro | 0.6051499904242899 |
| f1_macro | 0.11768251595637802 |
| f1_at_8 | 0.536107150495997 |
| f1_at_75 | 0.32032290907137506 |
| f1_at_50 | 0.411373195944102 |
| f1_at_5 | 0.43131028155283435 |
| f1_at_15 | 0.5966627077602488 |
| auc_micro | 0.9651754312635265 |
| auc_macro | 0.8566590059725866 |
| acc_micro | 0.43384592341105344 |
| acc_macro | 0.08639139221100567 |
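
The F1 rows in the table above appear consistent with the usual definition of F1 as the harmonic mean of the matching precision and recall rows. The sketch below recomputes a few of them as a sanity check. The helper `f1_score` and the `reported` dictionary are illustrative names and not part of the KEPT codebase, and the assumption that the evaluation script derives F1 this way is ours.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the results table.
reported = {
    "micro": (0.6411968713105065, 0.5729403619819988, 0.6051499904242899),
    "at_8": (0.7760972716488731, 0.4094837705486378, 0.536107150495997),
    "at_15": (0.6178529062870699, 0.5768778119750537, 0.5966627077602488),
}

for name, (prec, rec, f1) in reported.items():
    recomputed = f1_score(prec, rec)
    print(f"f1_{name}: recomputed={recomputed:.4f} reported={f1:.4f}")
```

The same check can be applied to the remaining `f1_*` rows by adding their precision and recall values to `reported`.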