LooperXX committed
Commit a09c704
1 Parent(s): d5659dc

Update README.md

Files changed (1): README.md (+6 -10)
README.md CHANGED
@@ -52,7 +52,7 @@ for text in texts:
     scores[text] = outputs.logits[0,1].item()
 ```
 
-Here is how to use this model to perfom masked language modeling:
+Here is how to use this model to perform masked language modeling:
 
 ```python
 from transformers import BridgeTowerProcessor, BridgeTowerForMaskedLM
@@ -104,18 +104,14 @@ The model was pre-trained for 100k steps on 8 NVIDIA A100 GPUs with a batch size
 The optimizer used was AdamW with a learning rate of 1e-5. No data augmentation was used except for center-crop. The image resolution in pre-training is set to 288 x 288.
 
 ## Evaluation results
-Please refer to [Table 5](https://arxiv.org/pdf/2206.08657.pdf) for BridgeTower's performance on Image Retrieval and other down stream tasks.
+Please refer to [Table 5](https://arxiv.org/pdf/2206.08657.pdf) for BridgeTower's performance on Image Retrieval and other downstream tasks.
 
 ### BibTeX entry and citation info
 ```bibtex
 @article{xu2022bridge,
-title={BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning},
-author={Xu, Xiao and
-Wu, Chenfei and
-Rosenman, Shachar and
-Lal, Vasudev and
-Duan, Nan},
-journal={arXiv preprint arXiv:2206.08657},
-year={2022}
+title={BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning},
+author={Xu, Xiao and Wu, Chenfei and Rosenman, Shachar and Lal, Vasudev and Che, Wanxiang and Duan, Nan},
+journal={arXiv preprint arXiv:2206.08657},
+year={2022}
 }
 ```
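For context on the hunk above: the line `scores[text] = outputs.logits[0,1].item()` reads the image-text matching head's output, where (by assumption here) `outputs.logits` has shape `(1, 2)` and index 1 is the "match" class. A minimal sketch of that indexing pattern, using plain Python lists with dummy values in place of real model outputs:

```python
# Sketch of the scoring loop from the diff context. The fake_logits values are
# hypothetical stand-ins for what model(**encoding).logits would return for
# each caption; they are NOT real BridgeTower outputs.
texts = ["two cats on a couch", "a dog in the snow"]
fake_logits = {
    texts[0]: [[0.3, 2.7]],   # dummy (no-match, match) logits
    texts[1]: [[1.9, -0.4]],
}

scores = {}
for text in texts:
    logits = fake_logits[text]      # stands in for model(**encoding).logits
    scores[text] = logits[0][1]     # same indexing as outputs.logits[0,1].item()

# Pick the caption with the highest match logit for the image.
best = max(scores, key=scores.get)
```

With a real checkpoint, the caption whose match logit is largest is the one the model considers the best description of the image.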