TheBloke committed on
Commit 55fa1b6
1 Parent(s): 38ea256

Update README.md

Files changed (1)
  1. README.md +11 -4
README.md CHANGED
@@ -21,9 +21,12 @@ license: other

These files are **experimental** GGML format model files for [Eric Hartford's WizardLM Uncensored Falcon 7B](https://huggingface.co/ehartford/WizardLM-Uncensored-Falcon-7b).

- These GGML files will **not** work in llama.cpp, and at the time of writing they will not work with any UI or library. They cannot be used from Python code.

- They can be used with a new fork of llama.cpp that adds Falcon GGML support: [cmp-nct/ggllm.cpp](https://github.com/cmp-nct/ggllm.cpp)

Note: It is not currently possible to use the new k-quant formats with Falcon 7B. This is being worked on.

@@ -36,7 +39,11 @@ Note: It is not currently possible to use the new k-quant formats with Falcon 7B
<!-- compatibility_ggml start -->
## Compatibility

- To build cmp-nct's fork of llama.cpp with Falcon 7B support plus preliminary CUDA acceleration, please try the following steps:

```
git clone https://github.com/cmp-nct/ggllm.cpp
@@ -48,7 +55,7 @@ Compiling on Windows: developer cmp-nct notes: 'I personally compile it using VS

Once compiled you can then use `bin/falcon_main` just like you would use llama.cpp. For example:
```
- bin/falcon_main -t 8 -ngl 100 -b 1 -m wizard-falcon7b.ggmlv3.q4_0.bin -p "What is a falcon?\n### Response:"
```

You can specify `-ngl 100` regardless of your VRAM, as it will automatically detect how much VRAM is available to be used.
 

These files are **experimental** GGML format model files for [Eric Hartford's WizardLM Uncensored Falcon 7B](https://huggingface.co/ehartford/WizardLM-Uncensored-Falcon-7b).

+ These GGML files will **not** work in llama.cpp, text-generation-webui or KoboldCpp.

+ They can be used with:
+ * [LoLLMS Web UI](https://github.com/ParisNeo/lollms-webui).
+ * The ctransformers Python library, which includes LangChain support: [ctransformers](https://github.com/marella/ctransformers).
+ * A new fork of llama.cpp that introduced this new Falcon GGML support: [cmp-nct/ggllm.cpp](https://github.com/cmp-nct/ggllm.cpp).
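
As a rough illustration of the ctransformers/LangChain route listed above, here is a minimal sketch. The filename and generation settings are assumptions for illustration, not files or values taken from this repo, and it assumes a langchain release that ships the `CTransformers` integration:

```
# Sketch only: drive one of these Falcon GGML files through LangChain's CTransformers wrapper.
# The filename is a placeholder - point it at whichever quantised .bin file you downloaded.
from langchain.llms import CTransformers

llm = CTransformers(
    model="wizardlm-uncensored-falcon-7b.ggmlv3.q4_0.bin",  # assumed local path
    model_type="falcon",
    config={"max_new_tokens": 128, "temperature": 0.7},
)

print(llm("What is a falcon?\n### Response:"))
```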

Note: It is not currently possible to use the new k-quant formats with Falcon 7B. This is being worked on.

 
<!-- compatibility_ggml start -->
## Compatibility

+ The recommended UI for these GGMLs is [LoLLMS Web UI](https://github.com/ParisNeo/lollms-webui). Preliminary CUDA GPU acceleration is provided.
+
+ For use from Python code, use [ctransformers](https://github.com/marella/ctransformers). Again, with preliminary CUDA GPU acceleration.
+
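
A minimal sketch of that Python route follows; the filename, `gpu_layers` value and token limit are assumptions for illustration, not values taken from this repo:

```
# Minimal sketch: load a Falcon GGML file directly with ctransformers (pip install ctransformers).
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "wizardlm-uncensored-falcon-7b.ggmlv3.q4_0.bin",  # assumed local path to a quantised file
    model_type="falcon",  # selects ctransformers' Falcon GGML loader
    gpu_layers=50,        # offload layers to the GPU if your ctransformers build has CUDA support
)

print(llm("What is a falcon?\n### Response:", max_new_tokens=128))
```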
+ Or to build cmp-nct's fork of llama.cpp with Falcon 7B support plus preliminary CUDA acceleration, please try the following steps:

```
git clone https://github.com/cmp-nct/ggllm.cpp
 

Once compiled you can then use `bin/falcon_main` just like you would use llama.cpp. For example:
```
+ bin/falcon_main -t 8 -ngl 100 -b 1 -m falcon7b-instruct.ggmlv3.q4_0.bin -p "What is a falcon?\n### Response:"
```

You can specify `-ngl 100` regardless of your VRAM, as it will automatically detect how much VRAM is available to be used.