Planning to release bigger sizes, for better text generation? Can it be used vision-only, together with bigger LLMs?

by illtellyoulater - opened 10 days ago

According to your benchmarks, MiniCPM 2.6 outperforms GPT-4V and Claude 3.5 Sonnet in multi-image and video understanding.

However, at only 8B parameters, it certainly cannot compete with the above models in terms of text generation.
So I have a couple of questions:

Are you planning to release bigger versions of this model with increased text generation capabilities?
Is it possible to use the current version in vision-only mode, together with more capable, bigger LLMs?

Thank you.

yuzaa

OpenBMB org 9 days ago

We regret to inform you that we currently have no plans to release a larger version of the model. The version we have released is deliberately small, and we hope it can be deployed and tried out on end-side devices.
