
Reproducing idefics-8b (instruct)

#61
by Iheb-Chaabane - opened

I’m trying to reproduce the instruct version starting from the base (pretrained) checkpoint.
Could you please provide more details on the proportions of the datasets in The Cauldron and on the training hyperparameters (learning rate, weight decay, number of epochs…)?
Thanks,

HuggingFaceM4 org

Most of this is detailed in the appendix of the paper.
