JustinLin610 committed
Commit 34dc2df (parent: 22ed627)

Update README.md

Files changed (1): README.md (+7/-3)
README.md CHANGED
 
---
license: apache-2.0
---
 
  # OFA-huge-vqa

## Introduction
This is the **huge** version of the OFA model, finetuned for **VQA**. OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, and language) and tasks (e.g., image generation, visual grounding, image captioning, image classification, and text generation) in a simple sequence-to-sequence learning framework.
 
The directory includes 4 files: `config.json`, which contains the model configuration; `vocab.json` and `merge.txt` for our OFA tokenizer; and `pytorch_model.bin`, which contains the model weights. There is no need to worry about a mismatch between Fairseq and transformers, as we have already addressed the issue.
 

## How to use
To use it in transformers, please refer to https://github.com/OFA-Sys/OFA/tree/feature/add_transformers. Install transformers from that branch and download the model as shown below.
```
git clone --single-branch --branch feature/add_transformers https://github.com/OFA-Sys/OFA.git
pip install OFA/transformers/
git clone https://huggingface.co/OFA-Sys/OFA-huge-vqa
```
Afterwards, assign the path of the downloaded OFA-huge-vqa checkpoint directory to `ckpt_dir`, and prepare an image for the testing example below. Also, ensure that pillow and torchvision are installed in your environment.
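For example, a minimal setup sketch (both paths are assumptions, not part of the original instructions):

```python
>>> ckpt_dir = "./OFA-huge-vqa"    # directory created by the git clone above (assumed location)
>>> path_to_image = "./test.jpg"   # any local image to ask a question about (assumed name)
```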

```python
>>> from PIL import Image
>>> from torchvision import transforms
>>> from transformers import OFATokenizer, OFAModel
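>>> from generate import sequence_generator  # helper module shipped in the OFA repo cloned above

# The diff view collapses the preprocessing and input construction here; the
# lines below are a minimal sketch, and the normalization constants,
# resolution, and prompt text are assumptions rather than the committed values.
>>> mean, std = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]
>>> resolution = 480
>>> patch_resize_transform = transforms.Compose([
        lambda image: image.convert("RGB"),
        transforms.Resize((resolution, resolution), interpolation=Image.BICUBIC),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std)
    ])

>>> tokenizer = OFATokenizer.from_pretrained(ckpt_dir)
>>> txt = " what does the image describe?"
>>> inputs = tokenizer([txt], return_tensors="pt").input_ids
>>> img = Image.open(path_to_image)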
 
  >>> patch_img = patch_resize_transform(img).unsqueeze(0)

# using the generator of the fairseq version
>>> model = OFAModel.from_pretrained(ckpt_dir, use_cache=True)
>>> generator = sequence_generator.SequenceGenerator(
        tokenizer=tokenizer,
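        # the remaining arguments are collapsed in the diff view; these values
        # are assumptions for a runnable sketch, not the committed settings
        beam_size=5,
        max_len_b=16,
        no_repeat_ngram_size=3,
    )

# `data` packs the tensors the way the fairseq-style generator expects;
# the exact field names here are an assumption based on the calls above
>>> import torch
>>> data = {}
>>> data["net_input"] = {"input_ids": inputs, "patch_images": patch_img, "patch_masks": torch.tensor([True])}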
 
  >>> gen_output = generator.generate([model], data)
  >>> gen = [gen_output[i][0]["tokens"] for i in range(len(gen_output))]

# using the generator of the huggingface version
>>> model = OFAModel.from_pretrained(ckpt_dir, use_cache=False)
>>> gen = model.generate(inputs, patch_images=patch_img, num_beams=5, no_repeat_ngram_size=3)
```
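
Either path leaves `gen` holding generated token ids. A short follow-up sketch for decoding them into answer text, assuming the `tokenizer` loaded earlier:

```python
>>> print(tokenizer.batch_decode(gen, skip_special_tokens=True))
```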