hpcgroup/hpc-coder-v2-1.3b


daniellnichols committed on Aug 9

Commit 088e534 (1 parent: 963b22b)

Update README.md

Files changed (1)

README.md +2 -2


@@ -45,6 +45,6 @@ Below is an instruction that describes a task. Write a response that appropriate
 ## Quantized Models
 4 and 8 bit quantized weights are available in the GGUF format for use with [llama.cpp](https://github.com/ggerganov/llama.cpp).
-The 4 bit model requires ~3.8 GB memory and can be found [here](https://huggingface.co/hpcgroup/hpc-coder-v2-1.3b-Q4_K_S-GGUF).
-The 8 bit model requires ~7.1 GB memory and can be found [here](https://huggingface.co/hpcgroup/hpc-coder-v2-1.3b-Q8_0-GGUF).
+The 4 bit model requires ~0.8 GB memory and can be found [here](https://huggingface.co/hpcgroup/hpc-coder-v2-1.3b-Q4_K_S-GGUF).
+The 8 bit model requires ~1.4 GB memory and can be found [here](https://huggingface.co/hpcgroup/hpc-coder-v2-1.3b-Q8_0-GGUF).
 Further information on how to use them with llama.cpp can be found in [its documentation](https://github.com/ggerganov/llama.cpp).
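
As a rough sanity check on the revised memory figures, the weight footprint of a quantized model can be approximated as parameter count times bits per weight. The sketch below is illustrative only and is not part of the commit. It assumes a nominal 1.3e9 parameters (taken from the model name, not a measured count) and nominal block sizes of about 4.5 bits per weight for Q4_K and 8.5 bits per weight for Q8_0. The helper name `estimate_weight_memory_gb` is hypothetical.

```python
def estimate_weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Covers the quantized weights only; the KV cache, activations, and
    runtime overhead are not included.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: nominal parameter count inferred from the "1.3b" in the model name.
N_PARAMS = 1.3e9

# Assumption: nominal bits per weight for each format. Real GGUF files mix
# tensor types, so the effective value can be somewhat higher.
BITS_PER_WEIGHT = {"Q4_K_S": 4.5, "Q8_0": 8.5}

if __name__ == "__main__":
    for name, bpw in BITS_PER_WEIGHT.items():
        print(f"{name}: ~{estimate_weight_memory_gb(N_PARAMS, bpw):.2f} GB")
```

Under these assumptions the estimates come out below 1 GB for the 4 bit model and below 1.5 GB for the 8 bit model, which is in line with the corrected ~0.8 GB and ~1.4 GB values and well under the previous ~3.8 GB and ~7.1 GB.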