Update readme, support llama.cpp
README.md
CHANGED
@@ -383,6 +383,11 @@ print(res)
 
 Please look at [GitHub](https://github.com/OpenBMB/MiniCPM-V) for more detail about usage.
 
+
+## Inference with llama.cpp<a id="llamacpp"></a>
+MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of [llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv) for more detail.
+
+
 ## Int4 quantized version
 Download the int4 quantized version for lower GPU memory (8GB) usage: [MiniCPM-Llama3-V-2_5-int4](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-int4).
 
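For orientation, a minimal sketch of what running the fork could look like follows. The `minicpmv-cli` binary name, the GGUF/mmproj file names, and the flags are assumptions patterned on llama.cpp's `llava-cli`; the fork's `examples/minicpmv` README is authoritative for the actual build target and model-conversion steps.

```bash
# Sketch only: the minicpmv-cli binary name, model file names, and flags are
# assumptions modeled on llama.cpp's llava-cli; see the fork's README.
git clone -b minicpm-v2.5 https://github.com/OpenBMB/llama.cpp
cd llama.cpp
make

# -m:       converted language-model weights in GGUF format (hypothetical name)
# --mmproj: vision projector weights (hypothetical name)
./minicpmv-cli -m ./models/ggml-model-Q4_K_M.gguf \
    --mmproj ./models/mmproj-model-f16.gguf \
    --image ./demo.jpg \
    -p "What is in this image?"
```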
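To fetch the int4 weights mentioned in the hunk above, one option is the `huggingface_hub` CLI; the target directory below is an arbitrary choice.

```bash
# Download the int4 quantized checkpoint (~8GB of GPU memory at inference).
# Requires: pip install -U "huggingface_hub[cli]"
huggingface-cli download openbmb/MiniCPM-Llama3-V-2_5-int4 \
    --local-dir ./MiniCPM-Llama3-V-2_5-int4
```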