Planning to release bigger sizes, for better text generation? Can it be used vision-only, together with bigger LLMs?
#7, opened by illtellyoulater
According to your benchmarks, MiniCPM 2.6 outperforms GPT-4V and Claude 3.5 Sonnet in multi-image and video understanding.
However, at only 8B parameters, it certainly cannot compete with the above models in terms of text generation.
So I have a couple of questions:
- Are you planning to release bigger versions of this model with increased text generation capabilities?
- Is it possible to use the current version in vision-only mode, together with more capable, bigger LLMs?
Thank you.
We regret to inform you that we currently have no plans to release a larger version of the model. The version we have released is a deployment-friendly size, and we hope it can be deployed and used on end-side devices.