Does anybody know how to actually load and run inference on this model?

#50
by SytanSD - opened

I have tried the following: Oobabooga text-generation-webui, llama.cpp, Ollama, KoboldCpp, Tabby, ExLlamaV2, and LM Studio, and not a single one of them supports this model. I am trying to use it as an open-source, locally run alternative to GPT-4, as I do not like or wish to support OpenAI in any way, but it seems this model is designed in a way that makes running it on any pre-existing GUI impossible.

Any additional info would be massively appreciated, as I am having to put my job on hold while I sort out a GPT-4 alternative.

Important to note: I have zero experience with Diffusers/Transformers, and very little experience with code as well. I am looking for a way to run this model so that I can point a front end at its port and have it serve requests from a tagging/captioning GUI.

Maybe you can use lmdeploy, which supports MiniCPM-Llama3-V-2_5. From the command line you can launch a Gradio demo, an api_server, or an interactive chat in the terminal.
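To expand on that: lmdeploy's `api_server` exposes an OpenAI-compatible HTTP API (by default on port 23333), so any front end that speaks that protocol can be pointed at it. Below is a minimal client sketch for sending an image-plus-prompt request; the endpoint URL, port, and model name are assumptions you would need to match against your own deployment (check `/v1/models` on the running server):

```python
import base64
import json
from urllib import request


def build_caption_request(image_path: str, prompt: str, model: str) -> dict:
    """Build an OpenAI-style chat payload with an inline base64 image.

    The message shape follows the OpenAI vision format; the model name
    should be whatever the server reports (assumption: adjust for your
    deployment).
    """
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }


def post_chat(payload: dict,
              url: str = "http://localhost:23333/v1/chat/completions") -> str:
    """POST the payload to a (hypothetical) local lmdeploy api_server
    and return the assistant's reply text."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

A tagging GUI that already supports OpenAI-style endpoints would just need its base URL changed to the local server; otherwise a small wrapper like the above can bridge it.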

https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5 - I believe you can run the llama.cpp server from this fork.
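If you go the llama.cpp route, multimodal builds of the server have accepted base64-encoded images via an `image_data` field on the `/completion` endpoint, with an `[img-ID]` marker in the prompt. As a sketch only: the payload shape and prompt template below have changed across llama.cpp versions and may differ in the fork, so check the server README for the build you compile:

```python
import base64


def build_llamacpp_request(image_path: str, prompt: str,
                           n_predict: int = 128) -> dict:
    """Payload for a multimodal llama.cpp server /completion request.

    The [img-10] marker in the prompt refers to the image with id 10 in
    image_data. Prompt template and field names are assumptions to verify
    against the server version actually built from the fork.
    """
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "prompt": f"USER: [img-10]\n{prompt}\nASSISTANT:",
        "n_predict": n_predict,
        "image_data": [{"data": b64, "id": 10}],
    }
```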
