Update README.md
# Llama.cpp imatrix quantizations of meta-llama/Meta-Llama-3.1-70B-Instruct

<!-- Better pic but I would like to talk about my quants on Linkedin so yeah <img src="https://cdn-uploads.huggingface.co/production/uploads/646410e04bf9122922289dc7/xlkSJli8IQ9KoTAuTKOF2.png" alt="llama" width="30%"/> -->

<img src="https://cdn-uploads.huggingface.co/production/uploads/646410e04bf9122922289dc7/LQUL7YII8okA8CG54mQSI.jpeg" alt="llama" width="60%"/>

Using llama.cpp commit [b5e9546](https://github.com/ggerganov/llama.cpp/commit/b5e95468b1676e1e5c9d80d1eeeb26f542a38f42) for quantization, which includes the Llama 3.1 RoPE scaling factors. This fixes the quality degradation previously seen at 8k-128k context lengths.

Original model: [https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct)
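For readers who want to reproduce a quant like these, the general llama.cpp imatrix workflow at the pinned commit can be sketched as follows. This is a minimal sketch, not the exact commands used for this repo: the local paths, the calibration file, and the `Q4_K_M` target are illustrative assumptions.

```shell
# Sketch: imatrix quantization with llama.cpp pinned to the commit above.
# Paths, the calibration text, and the quant type are illustrative assumptions.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git checkout b5e9546
make -j llama-quantize llama-imatrix

# Convert the HF checkpoint (assumed downloaded locally) to an f16 GGUF
python convert_hf_to_gguf.py /path/to/Meta-Llama-3.1-70B-Instruct \
  --outtype f16 --outfile llama-3.1-70b-f16.gguf

# Compute an importance matrix over a calibration text file
./llama-imatrix -m llama-3.1-70b-f16.gguf -f calibration.txt -o imatrix.dat

# Quantize using the imatrix (Q4_K_M shown as an example target)
./llama-quantize --imatrix imatrix.dat \
  llama-3.1-70b-f16.gguf llama-3.1-70b-Q4_K_M.gguf Q4_K_M
```

The imatrix step records which weights matter most on the calibration data, so the quantizer can spend its bit budget where it hurts quality least; the RoPE scaling fix in the pinned commit matters because the scaling factors are baked into the GGUF at conversion time.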