# RadReportX
### Model description
RadReportX is a Llama3.1-8B-Instruct model fine-tuned on synthetic data. The model supports two tasks. The first is an open-ended task: detecting the phrases in a radiology report that correspond to ICD-10 codes, with no restriction on the underlying disease. The second is detecting diseases in a radiology report from a list of 13 candidates: [*Atelectasis, Cardiomegaly, Consolidation, Edema, Enlarged Cardiomediastinum, Fracture, Lung Lesion, Lung Opacity, Pleural Effusion, Pleural Other, Pneumonia, Pneumothorax, Support Devices*]. When none of the candidate diseases is present, the model outputs 'Normal'.
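
As a usage sketch (the repository id `your-org/RadReportX` and the prompt wording below are placeholders, not the exact prompt used during fine-tuning), the model can be loaded and queried for the 13-candidate task with the standard Hugging Face `transformers` chat-template API:

```python
# Minimal inference sketch. The repo id and prompt text are illustrative
# assumptions; adapt them to the published checkpoint and prompt format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/RadReportX"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

report = "FINDINGS: Small left pleural effusion. The heart size is enlarged. ..."
messages = [
    {
        "role": "user",
        "content": (
            "Which of the following diseases are present in this radiology report? "
            "Candidates: Atelectasis, Cardiomegaly, Consolidation, Edema, "
            "Enlarged Cardiomediastinum, Fracture, Lung Lesion, Lung Opacity, "
            "Pleural Effusion, Pleural Other, Pneumonia, Pneumothorax, Support Devices. "
            "If none are present, answer 'Normal'.\n\nReport:\n" + report
        ),
    }
]

# Build the chat prompt, generate, and decode only the newly generated tokens.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```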
### Training set and training process
The training data comes from two sources. The first set is generated by GPT-4o. The second comes from the MIMIC-CXR dataset (https://arxiv.org/pdf/1901.07042), with labels extracted by the NegBio algorithm. Training is conducted with the torchtune framework (https://github.com/pytorch/torchtune). For details, please refer to our paper listed below.
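
For illustration only, instruction-tuning frameworks such as torchtune typically consume prompt/response pairs. The field names and prompt wording below are assumptions and not the exact schema or templates used to train this model:

```python
# Illustrative sketch of instruction-style training examples for the two tasks.
# The keys ("instruction", "input", "output") and the prompt text are
# hypothetical, not the schema actually used for RadReportX.
import json

examples = [
    {
        "instruction": "List the phrases in this radiology report that correspond to ICD-10 codes.",
        "input": "FINDINGS: Moderate cardiomegaly. Small right pleural effusion. ...",
        "output": "cardiomegaly; right pleural effusion",
    },
    {
        "instruction": (
            "Which of the 13 candidate diseases are present in this report? "
            "If none are present, answer 'Normal'."
        ),
        "input": "FINDINGS: Lungs are clear. No effusion or pneumothorax. ...",
        "output": "Normal",
    },
]

# Write the examples to a JSON file that a fine-tuning recipe could load.
with open("train.json", "w") as f:
    json.dump(examples, f, indent=2)
```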