Image-Text-to-Text
PEFT
Safetensors
English
File size: 5,771 Bytes
7f849e1
 
 
 
 
 
 
 
95c3215
de5a646
7f849e1
 
 
 
 
 
5767cfc
 
b02598a
5767cfc
 
 
7f849e1
 
dfd7bce
7f849e1
9d9bb85
 
 
 
 
75b380f
5cbfb5c
75b380f
 
 
 
 
 
 
9d9bb85
 
7f849e1
 
 
 
 
5767cfc
7f849e1
 
 
5767cfc
 
7f849e1
5767cfc
7f849e1
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2892d82
7f849e1
 
 
5767cfc
7f849e1
 
 
50306c8
7f849e1
2b6a5b3
 
5767cfc
 
 
 
 
2b6a5b3
 
5767cfc
 
2b6a5b3
 
 
 
b2a436f
 
75504e3
b2a436f
 
 
 
 
 
 
 
 
 
 
 
7f849e1
 
95c3215
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
---
license: apache-2.0
datasets:
- eltorio/ROCO-radiology
language:
- en
base_model:
- HuggingFaceM4/Idefics3-8B-Llama3
pipeline_tag: image-text-to-text
library_name: peft
---

# IDEFICS3_ROCO

![Stage](https://img.shields.io/badge/stage-early%20development-yellow)![License](https://img.shields.io/badge/license-Apache%202.0-blue)![Contributors Welcome](https://img.shields.io/badge/contributors-welcome-brightgreen)[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/#fileId=https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb)

## Star the project

**If you appreciate my work, please consider giving it a like! 🤩**  
**I'm also looking for donations of free GPU time to complete the fine-tuning process.**  
**Please contact me if you can help! 🙏**  

## A Fine-tuned Radiology-focused Model based on Hugging Face's Idefics3 Model

This repository contains a fine-tuned version of the Hugging Face [Idefics3-8B-Llama3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3) model, built on top of the Meta Llama 3.1 8B architecture. Our model, `IDEFICS3_ROCO`, has been fine-tuned on the [Radiology Objects in Context (ROCO)](https://huggingface.co/datasets/eltorio/ROCO-radiology) dataset, a large-scale medical and multimodal imaging collection.

## TL;DR

For immediate use, you can load the model directly from Hugging Face:  

```python
from transformers import AutoProcessor, Idefics3ForConditionalGeneration, image_utils
import torch
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu') # on CPU it requires ≈ 3h/query 🙈
processor = AutoProcessor.from_pretrained(v)
model = Idefics3ForConditionalGeneration.from_pretrained(
        v, torch_dtype=torch.bfloat16
    ).to(device)

model.load_adapter("eltorio/IDEFICS3_ROCO")
```

### Model Information

* **Base Model:** Idefics3-8B-Llama3
* **Fine-tuning Dataset:** Radiology Objects in Context (ROCO)
* **License:** Apache-2.0
* **Current Status:** Fine-tuning process is finished. Contributions to complete the fine-tuning / vallidation / test processes are welcome!

### Training Progress Status

* Current checkpoint: 12267 (100% completed)
* Estimated remaining GPU time: 0 hours
* Hardware requirements: T4 GPU with >16GB VRAM
* Last update: november, 12th 2024

### Fine-tuning Code

The fine-tuning code is available as a Jupyter Notebook in the [ROCO-radiology dataset repository](https://huggingface.co/datasets/eltorio/ROCO-radiology) on Hugging Face:

* [ROCO-idefics3.ipynb](https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb)

The [Junyper Notebook](https://colab.research.google.com/#fileId=https%3A//huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/#fileId=https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) contains the code to fine-tune the Idefics3-8B-Llama3 model on the ROCO dataset. The fine-tuning process is currently halted at checkpoint 640 (out of 24,000) due to limitations with Colab Free T4 GPU unit. Contributions to complete the fine-tuning process are welcome!

### Contributions Welcome

If you have the resources to complete the fine-tuning process, we would appreciate your contribution. Please fork this repository, finish the fine-tuning process, and submit a pull request with your updates.

### Citation

If you use this model in your work, please cite the original Idefics3 model and our fine-tuned model:

* [Idefics3-8B-Llama3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3)
* [IDEFICS3_ROCO](https://huggingface.co/eltorio/IDEFICS3_ROCO)

### Contribution Guide

1. **Technical Requirements**
   * Access to powerful GPU (T4, V100, A100 or equivalent)
   * Python environment with PyTorch
   * Disk space: ~100GB

2. **Getting Started**
   * Fork the repository
   * Resume from checkpoint 12267
   * Follow instructions in [ROCO-idefics3.ipynb](https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/#fileId=https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/ROCO-idefics3.ipynb)

3. **Contact**
   * For questions: [link to issues/discussions](https://huggingface.co/eltorio/IDEFICS3_ROCO/discussions)

### Docker Image

A AI training docker image is available for this model. The image and includes all necessary dependencies to run the fine-tuning process.  
You need to set the `HF_TOKEN` environment variable to your Hugging Face API token.  
You also need to have NVidia Docker container runtime installed.
Finnaly, you need to run the container with GPU support with `--gpus all` option.
The image is available on Docker Hub:  

```bash
export HF_TOKEN=hf_some_token
docker run --gpus all --user=42420:42420 -e HF_TOKEN=$HF_TOKEN -it sctg/roco-idefics3:latest bash -i  /start.sh $HF_TOKEN
```

The Dockerfile is available in the [IDEFICS_ROCO repository](https://huggingface.co/eltorio/IDEFICS3_ROCO/blob/main/Dockerfile).

### Use this model

According to the Apache license you should cite this model with:  

```bibtex
@misc {ronan_l.m._2024,
	author       = { {Ronan L.M.} },
	title        = { IDEFICS3_ROCO (Revision b02598a) },
	year         = 2024,
	url          = { https://huggingface.co/eltorio/IDEFICS3_ROCO },
	doi          = { 10.57967/hf/3504 },
	publisher    = { Hugging Face }
}
```

### Acknowledgments

This work was made possible by the [Hugging Face Transformers](https://huggingface.co/) library and the [ROCO-radiology dataset](https://huggingface.co/datasets/eltorio/ROCO-radiology).