hierholzer committed
Commit bd3dc5c · verified · 1 Parent(s): b139878

Update README.md

Files changed (1)
  1. README.md +20 -34
README.md CHANGED
@@ -8,6 +8,10 @@ tags:
 - llama-3.1
 - llama-3.1-instruct
 - gguf
+- ollama
+- Text-generation-webui
+- instruct
+- llama_3.1
 model_name: Llama-3.1-70B-Instruct-GGUF
 arxiv: 2407.21783
 base_model: meta-llama/Llama-3.1-70b-instruct.hf
@@ -62,11 +66,11 @@ Here are the quantized versions that I have available:
 - [ ] F16 ~ *NOT Recommended*
 - [ ] F32 ~ *NOT Recommended*
 
-Feel Free to reach out to me if you need a specific Quantization Type that I do not currently offer.
+*Feel free to reach out to me if you need a specific Quantization Type that I do not currently offer.*
 
 
 ### 📈All Quantization Types Possible
-Below is a table of all the Quantication Types that are possible as well as short descriptions.
+Below is a table of all the Quantization Types that are possible, as well as short descriptions.
 
 | **#** | **or** | **Q#** | **:** | _Description Of Quantization Types_ |
 |-------|:------:|:------:|:-----:|----------------------------------------------------------------|
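*(Editor's note on the hunk above: for background on where these quantization types come from, GGUF files like the ones in this table are typically produced with llama.cpp's quantize tool from a full-precision GGUF. Below is a minimal sketch, assuming a llama.cpp build whose binary is named `llama-quantize` and an existing F16 file; the file paths are illustrative, and this repository already provides the quantized files, so this is optional background only.)*

```shell
# Sketch: produce a Q4_K_M quant from an F16 GGUF using llama.cpp.
# Binary name and paths are assumptions; adjust to your build and files.
./llama-quantize ./Llama-3.1-70B-Instruct-F16.gguf \
                 ./Llama-3.1-70B-Instruct-Q4_K_M.gguf \
                 Q4_K_M
```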
@@ -128,10 +132,12 @@ git clone https://github.com/oobabooga/text-generation-webui.git
 | 4. | Once the download is finished, click the blue refresh icon within the Model tab that you are in. |
 | 5. | Select your newly downloaded GGUF file in the Model drop-down. Once selected, change the settings to best match your system. |
 
+
 ### 2️⃣ Ollama
 Ollama runs as a local service.
 Although it technically works using a command-line interface, Ollama's best attribute is its REST API.
 Being able to utilize your locally run LLMs through this API can give you almost endless possibilities!
+
 *Feel free to reach out to me if you would like to know some examples that I use this API for*
 
 #### ☑️ How to install Ollama
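*(Editor's note on the hunk above: as a quick illustration of the REST API it mentions, here is a minimal sketch of a completion request. It assumes the Ollama service is listening on its default port, 11434, and that the model tag shown, taken from the instructions later in this commit, has already been downloaded.)*

```shell
# Sketch: request a completion from a locally served model over
# Ollama's REST API. Assumes Ollama is running on the default port
# and the named model has already been pulled.
curl http://localhost:11434/api/generate -d '{
  "model": "hf.co/hierholzer/Llama-3.1-70B-Instruct-GGUF:Q4_K_M",
  "prompt": "Explain GGUF quantization in one sentence.",
  "stream": false
}'
```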
@@ -141,49 +147,29 @@ https://ollama.com/download
 ```
 Using Windows or Mac, you will then download a file and run it.
 If you are using Linux, it will just provide a single command that you need to run in your terminal window.
-*Thats about it for installing Ollama*
+*That's about it for installing Ollama*
 #### ✅Using Llama-3.1-70B-Instruct-GGUF with Ollama
 Ollama does have a Model Library where you can download models:
 ```shell
 https://ollama.com/library
 ```
-This Model Library offers all sizes of regular Lama 3.1, as well as the 8B version of Llama 3.1-Instruct.
-However, if you would like to use the 70B quantized version of Llama 3.1-Instruct
-then you will have to use the following instructions.
+This Model Library offers many different LLM versions that you can use.
+However, at the time of writing, there is no version of Llama-3.1-Instruct offered in the Ollama library.
+
+If you would like to use Llama-3.1-Instruct (70B), do the following:
+
 | # | Running the 70B quantized version of Llama 3.1-Instruct with Ollama |
 |----|----------------------------------------------------------------------------------------------|
-| 1. | Download your desired version of in the Files and Versions section of this Model Repository |
-| 2. | Next, create a Modelfile configuration that defines the model's behavior. For Example: |
+| 1. | Open up the terminal that you have Ollama installed on. |
+| 2. | Paste the following command: |
 ```shell
-# Modelfile
-FROM "./Llama-3.1-70B-Instruct-Q4_K_M.gguf"
-PARAMETER stop "<|im_start|>"
-PARAMETER stop "<|im_end|>"
-TEMPLATE """
-<|im_start|>system
-<|im_end|>
-<|im_start|>user
-<|im_end|>
-<|im_start|>assistant
-"""
-```
-*Replace ./Llama-3.1-70B-Instruct-Q4_K_M.gguf with the correct version and actual path to the GGUF file you downloaded.
-The TEMPLATE line defines the prompt format using system, user, and assistant roles.
-You can customize this based on your use case.*
-| # | Running the 70B quantized version of Llama 3.1-Instruct with Ollama - *continued* |
-|----|-----------------------------------------------------------------------------------|
-| 3. | Now, build the Ollama model using the ollama create command: |
-```shell
-ollama create "Llama-3.1-70B-Instruct-Q4_K_M" -f ./Llama-3.1-70B-Instruct-Q4_K_M.gguf
+ollama run hf.co/hierholzer/Llama-3.1-70B-Instruct-GGUF:Q4_K_M
+
 ```
-*Once again Replace the name: Llama-3.1-70B-Instruct-Q4_K_M and the
-model: ./Llama-3.1-70B-Instruct-Q4_K_M.gguf with the quantized model you are using.*
+*Replace Q4_K_M with whatever version you would like to use from this repository.*
 | # | Running the 70B quantized version of Llama 3.1-Instruct with Ollama - *continued* |
 |----|-----------------------------------------------------------------------------------|
-| 4. | You then can run your model using the ollama run command: |
-```shell
-ollama run Llama-3.1-70B-Instruct-Q4_K_M
-```
+| 3. | This will download & run the model. It will also be saved for future use. |
 
 -------------------------------------------------
 
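*(Editor's note on the hunk above: since step 3 notes the model is saved for future use, a few follow-up commands may help. These are standard Ollama CLI commands; the tag is the example from this repository, and any other quantization tag from the Files and Versions section should work the same way.)*

```shell
# Confirm the pulled model is cached locally:
ollama list

# Run it again later without re-downloading:
ollama run hf.co/hierholzer/Llama-3.1-70B-Instruct-GGUF:Q4_K_M

# Remove it if you need the disk space back:
ollama rm hf.co/hierholzer/Llama-3.1-70B-Instruct-GGUF:Q4_K_M
```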
 