metadata
license: apache-2.0
datasets:
- openbmb/RLAIF-V-Dataset
language:
- en
Model Card for RLAIF-V
RLAIF-V-12B is a model that exhibits super GPT-4V trustworthiness. It is built on the SFT version of OmniLMM-12B, which is one of the first versions of the MiniCPM-V series.
We use a novel framework, RLAIF-V, which aligns MLLMs in a fully open-source paradigm. This alignment framework makes the most of open-source feedback from two key perspectives: high-quality feedback data and an online feedback learning algorithm.
Model Details
Evaluation
- 🏅 Super GPT-4V Trustworthiness via Open-source Feedback. By learning from open-source AI feedback, RLAIF-V-12B achieves super GPT-4V trustworthiness in both generative and discriminative tasks.
- 💪 Maintaining Good Performance on General Abilities. On benchmarks that test general abilities (e.g., LLaVABench, MMStar), RLAIF-V-12B also performs well.
Examples
Model Description
- Related model: OmniLMM-12B
- Trained on data: RLAIF-V-Dataset
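
As a rough illustration of how the feedback data listed above might be consumed, the sketch below maps one record to a prompt/chosen/rejected triple. It is a minimal sketch under stated assumptions, not the authors' training pipeline: the field names `question`, `chosen`, and `rejected`, the `train` split, and the helper names `to_preference_pair` and `stream_pairs` are all hypothetical and should be checked against the dataset itself.

```python
from typing import Any, Dict, Iterator


def to_preference_pair(record: Dict[str, Any]) -> Dict[str, str]:
    """Map one feedback record to a prompt/chosen/rejected triple.

    The field names "question", "chosen", and "rejected" are assumptions
    about the dataset schema; adjust them if the actual columns differ.
    """
    return {
        "prompt": str(record["question"]).strip(),
        "chosen": str(record["chosen"]).strip(),
        "rejected": str(record["rejected"]).strip(),
    }


def stream_pairs(repo_id: str = "openbmb/RLAIF-V-Dataset") -> Iterator[Dict[str, str]]:
    """Lazily yield preference pairs from the Hugging Face Hub.

    Requires the third-party `datasets` package and network access.
    The "train" split name is an assumption.
    """
    # Imported here so the helper above stays usable without `datasets` installed.
    from datasets import load_dataset

    for record in load_dataset(repo_id, split="train", streaming=True):
        yield to_preference_pair(record)


if __name__ == "__main__":
    # Hand-written sample record, not taken from the dataset.
    sample = {"question": " What is in the image? ", "chosen": "A dog.", "rejected": "A cat."}
    print(to_preference_pair(sample))
```

Keeping the record mapping separate from the download step means the schema assumptions live in one small function that can be checked on a hand-written record before any data is fetched.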