txiong23 committed
Commit 41240ab · verified · 1 Parent(s): 2f1f5be

Update README.md

Files changed (1)
  1. README.md +11 -0
README.md CHANGED

@@ -116,6 +116,7 @@ print(text_outputs)
 - **Mid Stage:** A mixture of 4.7M high-quality synthetic data, 1 epoch, full model
 - **Final-Image Stage:** A mixture of 3.6M single-image data, 1 epoch, full model
 - **OneVision Stage:** A mixture of 1.6M single-image/multi-image/video data, 1 epoch, full model
+- **Critic / Preference Learning Stage:** 9.4k question-image inputs from [LLaVA-RLHF](https://llava-rlhf.github.io/) with self-generated responses, reward signal from [llava-critic-7b](https://huggingface.co/lmms-lab/llava-critic-7b), iterative DPO for 3 epochs, full model
 - **Precision:** bfloat16
 
 ## Hardware & Software
@@ -130,4 +131,14 @@ print(text_outputs)
 @article{li2024llavaonevision,
 title={LLaVA-OneVision},
 }
+
+@article{xiong2024llavacritic,
+title={LLaVA-Critic: Learning to Evaluate Multimodal Models},
+author={Xiong, Tianyi and Wang, Xiyao and Guo, Dong and Ye, Qinghao and Fan, Haoqi and Gu, Quanquan and Huang, Heng and Li, Chunyuan},
+year={2024},
+eprint={2410.02712},
+archivePrefix={arXiv},
+primaryClass={cs.CV},
+url={https://arxiv.org/abs/2410.02712},
+}
 ```
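The **Critic / Preference Learning Stage** added in this diff describes its procedure only at a high level: sample responses from the model itself, score them with the reward signal from llava-critic-7b, and run iterative DPO. Below is a minimal Python sketch of how one such round could turn critic scores into DPO preference pairs. Every name in it (`build_dpo_pairs`, `generate_responses`, `critic_score`) is a hypothetical stand-in for illustration, not the actual LLaVA-Critic training code.

```python
from typing import Callable, List, Tuple


def build_dpo_pairs(
    prompts: List[str],
    generate_responses: Callable[[str, int], List[str]],  # assumed policy sampler
    critic_score: Callable[[str, str], float],            # assumed critic reward
    num_samples: int = 4,
) -> List[Tuple[str, str, str]]:
    """Return (prompt, chosen, rejected) triples for one DPO round."""
    pairs: List[Tuple[str, str, str]] = []
    for prompt in prompts:
        # 1) Self-generate candidate responses from the current policy.
        candidates = generate_responses(prompt, num_samples)
        # 2) Rank candidates by the critic's scalar reward.
        ranked = sorted(candidates, key=lambda r: critic_score(prompt, r))
        # 3) Best-scored candidate is "chosen", worst is "rejected".
        if len(ranked) >= 2 and ranked[-1] != ranked[0]:
            pairs.append((prompt, ranked[-1], ranked[0]))
    return pairs


if __name__ == "__main__":
    # Toy stubs, only to show the data flow end to end.
    demo = build_dpo_pairs(
        prompts=["<image> Describe the scene."],
        generate_responses=lambda p, n: [f"candidate {i}" for i in range(n)],
        critic_score=lambda p, r: float(r.endswith("3")),  # placeholder reward
    )
    print(demo)  # [('<image> Describe the scene.', 'candidate 3', 'candidate 0')]
```

In the iterative setup the README describes, the pairs from each round would drive one DPO update, and the updated model would regenerate responses for the next round, over the three epochs the diff mentions.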