add gguf

README.md +6 -3

@@ -12,7 +12,7 @@ tags:
 - functions
 - function calling
 - sharded
-- ggml
 - gptq
 ---
 # Function Calling Llama 2 (version 2)
@@ -24,7 +24,7 @@ tags:
 2. Function descriptions are moved outside of the system prompt. This avoids the behaviour of function calling being affected by how the system prompt had been trained to influence the model.
 Available models:
-- Llama-7B-chat with function calling ([Base Model](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-adapters-v2)) - Free
 - Llama-13B-chat with function calling ([Base Model](https://huggingface.co/Trelis/Llama-2-13b-chat-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Llama-2-13b-chat-hf-function-calling-adapters-v2)) - Paid, [purchase here: €19.99 per user/seat.](https://buy.stripe.com/9AQ7te3lHdmbdZ68wz)
 - CodeLlama-34B-Instruct with function calling ([Base Model](https://huggingface.co/Trelis/CodeLlama-34b-Instruct-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/CodeLlama-34b-Instruct-hf-function-calling-adapters-v2)) - Paid, [purchase here: €44.99 per user/seat.](https://buy.stripe.com/cN27teg8t2Hx5sA8wM)
@@ -142,7 +142,7 @@ It is recommended to handle cases where:
 ## Inference
-**Quick Start**
 Try out this notebook [fLlama_Inference notebook](https://colab.research.google.com/drive/1Ow5cQ0JNv-vXsT-apCceH6Na3b4L7JyW?usp=sharing)
 **Commercial Applications**
@@ -156,6 +156,9 @@ Below follows information on the original Llama 2 model...
 ~
 # **Llama 2**
 Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7B fine-tuned model, optimized for dialogue use cases and converted for the Hugging Face Transformers format. Links to other models can be found in the index at the bottom.

 - functions
 - function calling
 - sharded
+- gguf
 - gptq
 ---
 # Function Calling Llama 2 (version 2)
 2. Function descriptions are moved outside of the system prompt. This avoids the behaviour of function calling being affected by how the system prompt had been trained to influence the model.
 Available models:
+- Llama-7B-chat with function calling ([Base Model](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-adapters-v2)), (GGUF - see the 'gguf' branch) - Free
 - Llama-13B-chat with function calling ([Base Model](https://huggingface.co/Trelis/Llama-2-13b-chat-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Llama-2-13b-chat-hf-function-calling-adapters-v2)) - Paid, [purchase here: €19.99 per user/seat.](https://buy.stripe.com/9AQ7te3lHdmbdZ68wz)
 - CodeLlama-34B-Instruct with function calling ([Base Model](https://huggingface.co/Trelis/CodeLlama-34b-Instruct-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/CodeLlama-34b-Instruct-hf-function-calling-adapters-v2)) - Paid, [purchase here: €44.99 per user/seat.](https://buy.stripe.com/cN27teg8t2Hx5sA8wM)
 ## Inference
+**Quick Start in Google Colab**
 Try out this notebook [fLlama_Inference notebook](https://colab.research.google.com/drive/1Ow5cQ0JNv-vXsT-apCceH6Na3b4L7JyW?usp=sharing)
 **Commercial Applications**
 ~
+**Run on your laptop**
+Run on your laptop [video and Jupyter notebook](https://youtu.be/rjSWCMVbD_U)
 # **Llama 2**
 Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7B fine-tuned model, optimized for dialogue use cases and converted for the Hugging Face Transformers format. Links to other models can be found in the index at the bottom.