Spaces: galatolo/chat-with-cerbero-7b

Federico Galatolo committed on Oct 27, 2023

Commit fafd74a

1 Parent(s): 7b3a47b

Q4_K quantization

Files changed (2)

.gitignore ADDED

@@ -0,0 +1 @@
+/env

app.py CHANGED

@@ -9,7 +9,7 @@ from huggingface_hub import hf_hub_download
 llm = Llama(
     model_path=hf_hub_download(
         repo_id="galatolo/cerbero-7b-gguf",
-        filename="ggml-model-Q8_0.gguf",
+        filename="ggml-model-Q4_K.gguf",
     ),
     n_ctx=4086,
 )
@@ -51,7 +51,7 @@ def generate_text(message, history):
 demo = gr.ChatInterface(
     generate_text,
-    title="cerbero-7b running on CPU (quantized)",
+    title="cerbero-7b running on CPU (quantized Q4_K)",
     description="This is a quantized version of cerbero-7b running on CPU. It is less powerful than the original version, but it is much faster and it can even run on a Raspberry Pi 4.",
     examples=[
         "Dammi 3 idee di ricette che posso fare con i pistacchi",
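
The commit hard-codes the new quantization in two places (the GGUF filename and the UI title). As a minimal sketch, not part of this commit, both could be derived from a single setting. The `CERBERO_QUANT` environment variable and the helper names below are hypothetical. The filename pattern `ggml-model-<quant>.gguf` is inferred only from the two filenames visible in the diff, so no other quantization files are assumed to exist in the repository.

```python
import os

# Quantization names that appear in this commit's diff; nothing else is assumed.
KNOWN_QUANTS = ("Q8_0", "Q4_K")


def gguf_filename(quant: str = "Q4_K") -> str:
    """Build the GGUF filename following the pattern seen in the diff."""
    if quant not in KNOWN_QUANTS:
        raise ValueError(f"unknown quantization {quant!r}; expected one of {KNOWN_QUANTS}")
    return f"ggml-model-{quant}.gguf"


def selected_filename(env=os.environ) -> str:
    """Read the hypothetical CERBERO_QUANT variable, defaulting to Q4_K."""
    return gguf_filename(env.get("CERBERO_QUANT", "Q4_K"))


def ui_title(quant: str = "Q4_K") -> str:
    """Mirror the title format introduced by this commit."""
    return f"cerbero-7b running on CPU (quantized {quant})"
```

Under these assumptions, app.py could pass `filename=selected_filename()` to `hf_hub_download` and `title=ui_title(...)` to `gr.ChatInterface`, so a later change of quantization touches only one value.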