Hello, I have some issues with my CMD Ollama while installing this model

by NCGWRjason - opened Nov 21, 2024

Nov 21, 2024

Hello,

I followed the official instructions at https://huggingface.co/docs/hub/ollama to install your Llama-3.2-1B-Instruct-GGUF model from Hugging Face.

I ran the following command in CMD:
ollama run hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF

I noticed that the Llama-3.2-1B-Instruct-GGUF repository contains multiple GGUF files.

However, after downloading, I found only one model file, specifically:
hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:latest 807MB

Could you please clarify which specific GGUF file this corresponds to and why only one file was downloaded?

Thank you.

deleted

Nov 21, 2024

edited Nov 21, 2024

Not that it helps you directly, but if I don't create it myself, I normally just download the GGUF I want manually from HF to my machine. Then I import that into whatever tool I want to use, depending on its needs (Open WebUI, ooba's text-generation-webui, llamafile, my own code, whatever). Doing it that way always works for me, and in the process I end up with a backup of the file that I can store.

I don't normally use a tool's own utilities to do the download for me.
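
For anyone who wants to try that manual route with Ollama, it might look roughly like the sketch below. It rests on several assumptions: the filename `Llama-3.2-1B-Instruct-Q8_0.gguf`, the `models` directory, and the local model name `llama3.2-1b-q8` are placeholders that do not come from this thread, so check the repo's Files and versions tab for the real filename. It also assumes `huggingface-cli` and `ollama` are installed. The script is written for a POSIX shell (for example Git Bash or WSL on Windows), and by default it only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
#!/bin/sh
# Sketch: download one GGUF by hand, then import it into Ollama.
# All names below are placeholders (assumptions), not taken from the thread.
REPO="bartowski/Llama-3.2-1B-Instruct-GGUF"
GGUF_FILE="Llama-3.2-1B-Instruct-Q8_0.gguf"   # assumed filename; verify in the repo
MODEL_DIR="$(pwd)/models"                     # absolute path avoids relative-path surprises
MODEL_NAME="llama3.2-1b-q8"                   # hypothetical local model name
DRY_RUN="${DRY_RUN:-1}"                       # 1 = only print the commands

run() {
  # Print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

mkdir -p "$MODEL_DIR"

# 1. Download just the one file you want (this copy doubles as your backup).
run huggingface-cli download "$REPO" "$GGUF_FILE" --local-dir "$MODEL_DIR"

# 2. Write a minimal Modelfile that points at the local GGUF.
printf 'FROM %s/%s\n' "$MODEL_DIR" "$GGUF_FILE" > "$MODEL_DIR/Modelfile"

# 3. Register the file with Ollama under a name of your choosing.
run ollama create "$MODEL_NAME" -f "$MODEL_DIR/Modelfile"
```

After a real run, `ollama run llama3.2-1b-q8` would start the imported model, and the GGUF stays in `models` as your stored copy.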

huggingkot

Nov 22, 2024

Could you please clarify which specific GGUF file this corresponds to and why only one file was downloaded?

RTFM?
From https://huggingface.co/docs/hub/ollama that you linked:

Custom Quantization

By default, the Q4_K_M quantization scheme is used, when it’s present inside the model repo. If not, we default to picking one reasonable quant type present inside the repo.

To select a different scheme, simply:

From Files and versions tab on a model page, open GGUF viewer on a particular GGUF file.

Choose ollama from Use this model dropdown.

The snippet would be in format (quantization tag added):

ollama run hf.co/{username}/{repository}:{quantization}

This is clear; which part don't you understand?

However, after downloading, I found only one model file, specifically:
hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:latest 807MB

You downloaded Q4_K_M.
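
As a concrete instance of the `hf.co/{username}/{repository}:{quantization}` format from the docs, here is a small sketch that builds the reference with and without an explicit tag. `Q8_0` is only an assumed example, so pick a quant that actually exists in the repo. The helper name `hf_ollama_ref` is made up for illustration, and the script prints the command instead of running it.

```shell
#!/bin/sh
# Build an Ollama model reference for a Hugging Face GGUF repo.
# Usage: hf_ollama_ref USER REPO [QUANT]
# (hf_ollama_ref is a hypothetical helper, not part of Ollama.)
hf_ollama_ref() {
  user="$1"
  repo="$2"
  quant="${3:-}"
  if [ -n "$quant" ]; then
    # Explicit quantization tag appended after a colon.
    printf 'hf.co/%s/%s:%s\n' "$user" "$repo" "$quant"
  else
    # No tag: the default quant is chosen (Q4_K_M when present, per the docs).
    printf 'hf.co/%s/%s\n' "$user" "$repo"
  fi
}

# Request Q8_0 (assumed to exist in the repo) instead of the default.
ref="$(hf_ollama_ref bartowski Llama-3.2-1B-Instruct-GGUF Q8_0)"
echo "ollama run $ref"
# prints: ollama run hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:Q8_0
```

The printed `ollama run` line can be pasted into CMD unchanged.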

NCGWRjason

Nov 22, 2024

I mean that I ran this command in the CMD terminal:
ollama run hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF

The bartowski/Llama-3.2-1B-Instruct-GGUF repository contains multiple GGUF files (https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF/tree/main).

However, after downloading, I found only one model file in the ollama list, specifically:
hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:latest 807MB

So I ran bartowski/Llama-3.2-1B-Instruct-GGUF without specifying a file, and it downloaded only Q4_K_M.gguf.
Is this the default model that gets downloaded? Or is it because it was the most recently uploaded version, which would explain why it appears as Llama-3.2-1B-Instruct-GGUF:latest in the ollama list?

Thank you

bartowski

Owner Nov 22, 2024

Is there a reason you want to download all the sizes..?

Q4_K_M is just the default that Ollama uses.
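
If more than the default really is wanted, requesting one tag per quant is probably the simplest way. A rough sketch, assuming the listed quants exist in the repo (the list here is illustrative, not checked against the repo) and that `ollama pull` accepts the same `hf.co/...` references as `ollama run`. It only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: fetch several quantizations of one repo, one tag at a time.
REPO_REF="hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF"
QUANTS="Q4_K_M Q8_0"        # illustrative selection; adjust to what the repo offers
DRY_RUN="${DRY_RUN:-1}"     # 1 = only print the commands

for q in $QUANTS; do
  ref="$REPO_REF:$q"
  if [ "$DRY_RUN" = "0" ]; then
    # Assumption: pull accepts hf.co references just like run does.
    ollama pull "$ref"
  else
    echo "would run: ollama pull $ref"
  fi
done
```

Each tag should then show up as its own entry in the ollama list, next to the `:latest` one that the untagged command created.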
