Updated: Lumimaid and TheSpice-v0.8.3
I have uploaded version 2 (v2) files for the Llama-3-Lumimaid-8B-v0.1-OAS GGUF Imatrix quants.
[model] Lewdiculous/Llama-3-Lumimaid-8B-v0.1-OAS-GGUF-IQ-Imatrix
You can recognize the new files by their v2 prefix.
Imatrix data was generated from the FP16 model, and the conversions were done directly from the BF16 weights.
Hopefully this avoids any losses in the model conversion, which has been a recurring topic of discussion around Llama-3 and GGUF lately.
This is more disk- and compute-intensive, so let's hope we get GPU inference support for BF16 models in llama.cpp.
If you are able to test them and notice any issues compared to the original quants, let me know in the corresponding discussions.
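For reference, the described pipeline roughly corresponds to the following llama.cpp commands. This is only a sketch: the model path, output file names, calibration file, and IQ4_XS quant type are illustrative, and the tool names have varied between llama.cpp releases (e.g. imatrix vs. llama-imatrix).

```shell
# Convert the HF model to GGUF, keeping the original BF16 weights
# (model directory and output names are illustrative)
python convert-hf-to-gguf.py ./Llama-3-Lumimaid-8B-v0.1-OAS \
    --outtype bf16 --outfile model-bf16.gguf

# Produce an FP16 conversion as well, since the imatrix
# is computed from the FP16 model
python convert-hf-to-gguf.py ./Llama-3-Lumimaid-8B-v0.1-OAS \
    --outtype f16 --outfile model-f16.gguf

# Generate importance-matrix data from the FP16 model
# (calibration.txt stands in for your calibration text)
./imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat

# Quantize directly from the BF16 conversion, guided by the imatrix
./quantize --imatrix imatrix.dat model-bf16.gguf model-IQ4_XS-v2.gguf IQ4_XS
```

Carrying both the BF16 and FP16 conversions is what makes this approach heavier on disk and compute than quantizing from a single FP16 file.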
---
Additionally, L3-TheSpice-8b-v0.8.3 GGUF Imatrix quants were also updated.
[model] Lewdiculous/L3-TheSpice-8b-v0.8.3-GGUF-IQ-Imatrix