hierholzer
committed on
Update README.md
README.md CHANGED
@@ -8,6 +8,10 @@ tags:
 - llama-3.1
 - llama-3.1-instruct
 - gguf
+- ollama
+- Text-generation-webui
+- instruct
+- llama_3.1
 model_name: Llama-3.1-70B-Instruct-GGUF
 arxiv: 2407.21783
 base_model: meta-llama/Llama-3.1-70b-instruct.hf
@@ -62,11 +66,11 @@ Here are the quantized versions that I have available:
 - [ ] F16 ~ *NOT Recommended*
 - [ ] F32 ~ *NOT Recommended*
 
-Feel Free to reach out to me if you need a specific Quantization Type that I do not currently offer
+*Feel Free to reach out to me if you need a specific Quantization Type that I do not currently offer.*
 
 
 ### 📈All Quantization Types Possible
-Below is a table of all the
+Below is a table of all the Quantization Types that are possible as well as short descriptions.
 
 | **#** | **or** | **Q#** | **:** | _Description Of Quantization Types_ |
 |-------|:------:|:------:|:-----:|----------------------------------------------------------------|
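For context on picking one of these quantized versions: a single GGUF file can be fetched from this repository with the Hugging Face CLI instead of cloning everything. This is a minimal sketch; the exact filename follows the `Llama-3.1-70B-Instruct-Q4_K_M.gguf` naming pattern used elsewhere in this README and should be verified against the repository's file list.

```shell
# Install the Hugging Face CLI, then download one specific quantized file.
# The filename below is assumed from this README's naming pattern -
# check it against the repository's Files tab first.
pip install -U "huggingface_hub[cli]"
huggingface-cli download hierholzer/Llama-3.1-70B-Instruct-GGUF \
    Llama-3.1-70B-Instruct-Q4_K_M.gguf \
    --local-dir ./models
```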
@@ -128,10 +132,12 @@ git clone https://github.com/oobabooga/text-generation-webui.git
 | 4. | Once the download is finished, click the blue refresh icon within the Model tab that you are in. |
 | 5. | Select your newly downloaded GGUF file in the Model drop-down. once selected, change the settings to best match your system. |
 
+
 ### 2️⃣ Ollama
 Ollama runs as a local service.
 Although it technically works using a command-line interface, Ollama's best attribute is their REST API.
 Being able to utilize your locally ran LLMs through the use of this API can give you almost endless possibilities!
+
 *Feel free to reach out to me if you would like to know some examples that I use this API for*
 
 #### ☑️ How to install Ollama
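For context on the REST API praised above: once the Ollama service is running it listens on `localhost:11434` by default, and any pulled model can be queried over plain HTTP. A minimal sketch, assuming the `hf.co/...` model tag introduced in the next hunk has already been pulled:

```shell
# Query the local Ollama service via its REST API (default port 11434).
# The model tag assumes it was already pulled with `ollama run` or `ollama pull`.
curl http://localhost:11434/api/generate -d '{
  "model": "hf.co/hierholzer/Llama-3.1-70B-Instruct-GGUF:Q4_K_M",
  "prompt": "Explain GGUF quantization in one sentence.",
  "stream": false
}'
```

Setting `"stream": false` returns one complete JSON response; leaving streaming on (the default) yields token-by-token output instead.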
@@ -141,49 +147,29 @@ https://ollama.com/download
 ```
 Using Windows, or Mac you will then download a file and run it.
 If you are using linux it will just provide a single command that you need to run in your terminal window.
-*
+*That's about it for installing Ollama*
 #### ✅Using Llama-3.1-70B-Instruct-GGUF with Ollama
 Ollama does have a Model Library where you can download models:
 ```shell
 https://ollama.com/library
 ```
-This Model Library offers
-However
-
+This Model Library offers many different LLM versions that you can use.
+However at the time of writing this, there is no version of Llama-3.1-Instruct offered in the Ollama library.
+
+If you would like to use Llama-3.1-Instruct (70B), do the following:
+
 | # | Running the 70B quantized version of Llama 3.1-Instruct with Ollama |
 |----|----------------------------------------------------------------------------------------------|
-| 1. |
-| 2. |
+| 1. | Open up your terminal that you have Ollama Installed on. |
+| 2. | Paste the following command: |
 ```shell
-
-
-PARAMETER stop "<|im_start|>"
-PARAMETER stop "<|im_end|>"
-TEMPLATE """
-<|im_start|>system
-<|im_end|>
-<|im_start|>user
-<|im_end|>
-<|im_start|>assistant
-"""
-```
-*Replace ./Llama-3.1-70B-Instruct-Q4_K_M.gguf with the correct version and actual path to the GGUF file you downloaded.
-The TEMPLATE line defines the prompt format using system, user, and assistant roles.
-You can customize this based on your use case.*
-| # | Running the 70B quantized version of Llama 3.1-Instruct with Ollama - *continued* |
-|----|-----------------------------------------------------------------------------------|
-| 3. | Now, build the Ollama model using the ollama create command: |
-```shell
-ollama create "Llama-3.1-70B-Instruct-Q4_K_M" -f ./Llama-3.1-70B-Instruct-Q4_K_M.gguf
+ollama run hf.co/hierholzer/Llama-3.1-70B-Instruct-GGUF:Q4_K_M
+
 ```
-*
-model: ./Llama-3.1-70B-Instruct-Q4_K_M.gguf with the quantized model you are using.*
+*Replace Q4_K_M with whatever version you would like to use from this repository.*
 | # | Running the 70B quantized version of Llama 3.1-Instruct with Ollama - *continued* |
 |----|-----------------------------------------------------------------------------------|
-|
-```shell
-ollama run Llama-3.1-70B-Instruct-Q4_K_M
-```
+| 3. | This will download & run the model. It will also be saved for future use. |
 
 -------------------------------------------------
 
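For context on the removed steps: the old instructions wrapped a downloaded GGUF in an Ollama Modelfile, but passed the GGUF itself to `ollama create -f`, which actually expects a Modelfile. A working version of that older route would look roughly like the sketch below; the local path, the `Q4_K_M` filename, and the ChatML-style stop markers are carried over from the removed lines as assumptions, with Ollama's standard `{{ .System }}`/`{{ .Prompt }}` template variables filled in.

```shell
# The removed manual route, made runnable: wrap the local GGUF in a
# Modelfile, then build a named model from it.
# Path and filename are assumptions - match them to your download.
cat > Modelfile <<'EOF'
FROM ./Llama-3.1-70B-Instruct-Q4_K_M.gguf
TEMPLATE """
<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
EOF

# `ollama create -f` takes the Modelfile (not the GGUF itself); the named
# model can then be run locally.
ollama create Llama-3.1-70B-Instruct-Q4_K_M -f ./Modelfile
ollama run Llama-3.1-70B-Instruct-Q4_K_M
```

The new instructions replace all of this with the single `ollama run hf.co/...` command above, which fetches the GGUF and its template metadata in one step.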