My Experience with Mistral AI's Nemo Instruct Model
#1
by
JLouisBiz
- opened
Computer configuration:
- Manufacturer: Dell Inc.
- Product Name: 0VHWTR, Version: A02
- Memory: 16 GB DDR3 RAM
- GPU: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] with 4 GB VRAM
Running Mistral with success as:
$ ./Mistral-Nemo-Instruct-2407.Q4_K_M.llamafile -ngl 15
with the following status:
llm_load_tensors: offloading 15 repeating layers to GPU
llm_load_tensors: offloaded 15/41 layers to GPU
llm_load_tensors: CPU buffer size = 7123.30 MiB
llm_load_tensors: CUDA0 buffer size = 2368.36 MiB
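For context, a rough back-of-envelope estimate from the log numbers above (assuming roughly uniform layer sizes, which is only approximately true) shows why a full offload of all 41 layers does not fit in 4 GB of VRAM:

```python
# Estimate from the llm_load_tensors log: 15 offloaded layers use
# 2368.36 MiB of VRAM. Assuming layers are roughly uniform in size
# (an approximation), extrapolate to a full 41-layer offload.
CUDA_BUFFER_MIB = 2368.36   # CUDA0 buffer size for 15 layers
OFFLOADED = 15
TOTAL_LAYERS = 41           # "offloaded 15/41 layers" in the log

per_layer = CUDA_BUFFER_MIB / OFFLOADED      # ~158 MiB per layer
full_offload = per_layer * TOTAL_LAYERS      # ~6474 MiB

print(f"estimated full offload: ~{full_offload:.0f} MiB")
# Well above the ~4096 MiB available on a 4 GB GTX 1050 Ti,
# which is why only a partial offload with -ngl 15 works here.
```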
I am getting 2.44 tokens per second, which is not much, though functional. I am fine with it, as for the time being I do not have a better configuration. I started using it today because on this low-end configuration I need better-quality text scanning and replacement. I can't wait to switch to a 24 GB GPU and 128 GB of RAM.
I wonder whether function calling can be used through llamafile. Does anybody know?
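One thing worth trying: llamafile can run as an HTTP server exposing an OpenAI-compatible chat completions endpoint (started with `--server`, default `http://localhost:8080`). Below is a sketch of the OpenAI-style `tools` request body one could POST to `/v1/chat/completions`. Whether llamafile and the Nemo model actually honor the `tools` field is an assumption to verify; the `get_weather` tool is a hypothetical example, and the model may instead need the tool schema embedded directly in the prompt.

```python
import json

# Hypothetical OpenAI-style function-calling request body, to try
# against llamafile's server endpoint. Not confirmed to be supported
# by llamafile/Nemo; shown only to illustrate the payload shape.
payload = {
    "model": "Mistral-Nemo-Instruct-2407",
    "messages": [
        {"role": "user", "content": "What is the weather in Paris?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical example tool
                "description": "Return current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string"}
                    },
                    "required": ["city"],
                },
            },
        }
    ],
}

body = json.dumps(payload)
print(body)
```

The resulting JSON could then be sent with `curl -d @- http://localhost:8080/v1/chat/completions` to see whether the server returns a `tool_calls` response.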
Thanks to all contributors on the Mozilla and Mistral teams who made some of their models free software.