Prompt Tuning for Generative Multimodal Pretrained Models
Overview
This is the code for "Prompt Tuning for Generative Multimodal Pretrained Models". Check out our paper on arXiv. The paper explores prompt tuning for generative multimodal pretrained models, rather than for contrastive learning models. We focus specifically on the unified sequence-to-sequence learning framework and implement our method on the OFA models.
Requirements
- python 3.7.4
- pytorch 1.8.1
- torchvision 0.9.1
- JAVA 1.8 (for COCO evaluation)
Installation
pip install -r requirements.txt
Datasets and Checkpoints
See datasets.md and checkpoints.md.
Training
We provide a demo script (run_scripts/refcoco/train_refcoco_prefix.sh) that has all the required parts for training.
sh ./run_scripts/refcoco/train_refcoco_prefix.sh
A few options of note:
- --encoder-prompt :: whether to insert prompts into the encoder
- --decoder-prompt :: whether to insert prompts into the decoder
- --encoder-prompt-length :: encoder prompt length
- --decoder-prompt-length :: decoder prompt length
- --bitfit :: whether to use BitFit
- --adapter :: whether to use adapters
- --adapter-dim :: adapter projection dimension
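As a minimal sketch, the options above might be combined into a custom training invocation like the one below. The flag names come from this README; the prompt lengths, script name, and the use of train.py as the entry point are illustrative assumptions, so consult the demo script for the full set of required arguments.

```shell
# Hypothetical invocation sketch: enable prompt tuning on both encoder and
# decoder. Prompt lengths (100) are example values, not recommendations.
PROMPT_ARGS="--encoder-prompt --encoder-prompt-length 100 \
--decoder-prompt --decoder-prompt-length 100"

# The real demo script passes many more arguments (data paths, checkpoint,
# task, optimizer settings); this only shows where the prompt flags fit.
echo "python train.py ${PROMPT_ARGS}"
```

In practice you would edit run_scripts/refcoco/train_refcoco_prefix.sh and adjust these flags there rather than calling train.py directly.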
We recommend organizing your workspace directory as follows:
OFA/
├── checkpoints/
│   ├── ofa_base.pt
│   ├── ofa_large.pt
│   └── ...
├── criterions/
├── data/
├── dataset/
│   ├── caption_data/
│   ├── refcoco_data/
│   └── ...
├── fairseq/
├── models/
├── run_scripts/
├── tasks/
├── train.py
├── trainer.py
└── utils/