GGUF
Inference Endpoints
maddes8cht committed on
Commit
dde4f54
1 Parent(s): 5b9c409

"Update README.md"

Files changed (1)
  1. README.md +96 -0
README.md ADDED
@@ -0,0 +1,96 @@
1
+ ---
2
+ license: apache-2.0
3
+ datasets:
4
+ - ehartford/wizard_vicuna_70k_unfiltered
5
+ ---
6
+ [![banner](https://maddes8cht.github.io/assets/buttons/Huggingface-banner.jpg)]()
7
+
8
+ I'm constantly enhancing these model descriptions to provide you with the most relevant and comprehensive information.
9
+
10
+ # open_llama_7b_qlora_uncensored - GGUF
11
+ - Model creator: [georgesung](https://huggingface.co/georgesung)
12
+ - Original model: [open_llama_7b_qlora_uncensored](https://huggingface.co/georgesung/open_llama_7b_qlora_uncensored)
13
+
14
+ OpenLLaMA is a free reimplementation of the original LLaMA model, released under the Apache 2.0 license.
15
+
16
+
17
+
18
+ # About GGUF format
19
+
20
+ `gguf` is the current file format used by the [`ggml`](https://github.com/ggerganov/ggml) library.
21
+ A growing list of software supports it and can therefore use this model.
22
+ The core project making use of the ggml library is the [llama.cpp](https://github.com/ggerganov/llama.cpp) project by Georgi Gerganov.
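
To check that a downloaded file really is GGUF, its fixed-size header can be inspected. The following is a minimal sketch, not part of ggml or llama.cpp: the function name `read_gguf_header` is made up for illustration, and it assumes a little-endian file of GGUF version 2 or later, where the 4-byte magic `GGUF` is followed by a 32-bit version and two 64-bit counts.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4-byte magic + uint32 version + two uint64 counts


def read_gguf_header(data: bytes) -> dict:
    """Parse the first 24 bytes of a GGUF file (assumes little-endian, version >= 2)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 24 bytes of header data")
    if data[:4] != GGUF_MAGIC:
        raise ValueError("missing GGUF magic, probably not a GGUF file")
    # "<IQQ" = little-endian uint32, uint64, uint64, read starting at byte offset 4
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    if version < 2:
        raise ValueError("version 1 files use 32-bit counts and are not handled here")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

Typical use would be `read_gguf_header(open(path, "rb").read(24))`, where `path` points at one of the `.gguf` files from this repository.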
23
+
24
+ # Quantization variants
25
+
26
+ Several quantized files are available to suit different needs. Here's how to choose the best option for you:
27
+
28
+ # Legacy quants
29
+
30
+ Q4_0, Q4_1, Q5_0, Q5_1 and Q8_0 are `legacy` quantization types.
31
+ Nevertheless, they are fully supported, because some models are not compatible with the modern K-quants.
32
+ ## Note:
33
+ K-quants can now also be used for previously 'incompatible' models, although this relies on a fallback that makes them not *real* K-quants. More details can be found in the affected model descriptions.
34
+ (This mainly refers to Falcon 7b and Starcoder models.)
35
+
36
+ # K-quants
37
+
38
+ K-quants are designed with the idea that different levels of quantization in specific parts of the model can optimize performance, file size, and memory load.
39
+ So, if possible, use K-quants.
40
+ With a Q6_K, you'll likely find it hard to discern a quality difference from the original model. Ask your model the same question twice and you may see bigger differences between the two answers.
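
When weighing file size against quality, a rough size estimate can help. The sketch below is only an approximation under stated assumptions: the bits-per-weight values follow from the ggml block layouts (for example, Q4_0 stores 32 weights in 18 bytes), the helper `estimate_size_gb` is hypothetical, and the 7 billion parameter count is a round illustrative figure rather than the exact size of this model. Real files differ somewhat because K-quant files mix several tensor types and carry metadata.

```python
# Bits per weight implied by each type's block layout
# (bytes per block * 8 / weights per block).
BITS_PER_WEIGHT = {
    "Q4_0": 4.5,     # 18 bytes per 32 weights
    "Q4_1": 5.0,     # 20 bytes per 32 weights
    "Q5_0": 5.5,     # 22 bytes per 32 weights
    "Q5_1": 6.0,     # 24 bytes per 32 weights
    "Q8_0": 8.5,     # 34 bytes per 32 weights
    "Q6_K": 6.5625,  # 210 bytes per 256 weights
    "F16": 16.0,
}


def estimate_size_gb(n_params: int, quant: str) -> float:
    """Approximate size in GB (10^9 bytes) if every weight used the given type."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


if __name__ == "__main__":
    n_params = 7_000_000_000  # assumption: round figure for a "7B" model
    for quant in BITS_PER_WEIGHT:
        print(f"{quant}: ~{estimate_size_gb(n_params, quant):.2f} GB")
```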
41
+
42
+
43
+
44
+
45
+ ---
46
+
47
+ # Original Model Card:
48
+ # Overview
49
+ Fine-tuned [OpenLLaMA-7B](https://huggingface.co/openlm-research/open_llama_7b) with an uncensored/unfiltered Wizard-Vicuna conversation dataset [ehartford/wizard_vicuna_70k_unfiltered](https://huggingface.co/datasets/ehartford/wizard_vicuna_70k_unfiltered).
50
+ Used QLoRA for fine-tuning. Training ran for one epoch on a 24GB GPU (NVIDIA A10G) instance and took ~18 hours.
51
+
52
+ # Prompt style
53
+ The model was trained with the following prompt style:
54
+ ```
55
+ ### HUMAN:
56
+ Hello
57
+
58
+ ### RESPONSE:
59
+ Hi, how are you?
60
+
61
+ ### HUMAN:
62
+ I'm fine.
63
+
64
+ ### RESPONSE:
65
+ How can I help you?
66
+ ...
67
+ ```
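
The prompt style above can be assembled programmatically. This is a minimal sketch, not code from the original training repository; the helper name `build_prompt` and its arguments are assumptions made for illustration. It renders earlier turns in the format shown and ends with an open `### RESPONSE:` header for the model to complete.

```python
def build_prompt(turns, next_user_message):
    """Build a prompt from (human, response) pairs plus a new human message.

    The result ends with an empty '### RESPONSE:' section, which the model
    is expected to continue.
    """
    parts = []
    for human, response in turns:
        parts.append(f"### HUMAN:\n{human}\n\n### RESPONSE:\n{response}\n")
    parts.append(f"### HUMAN:\n{next_user_message}\n\n### RESPONSE:\n")
    # Joining with a newline leaves one blank line between consecutive turns.
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt([("Hello", "Hi, how are you?")], "I'm fine."))
```

When generating, it may help to stop at the next `### HUMAN:` marker so the model does not write the user's turn as well.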
68
+
69
+ # Training code
70
+ Code used to train the model is available [here](https://github.com/georgesung/llm_qlora).
71
+
72
+ # Demo
73
+ For a Gradio chat application using this model, clone [this HuggingFace Space](https://huggingface.co/spaces/georgesung/open_llama_7b_qlora_uncensored_chat/tree/main) and run it on top of a GPU instance.
74
+ The basic T4 GPU instance will work.
75
+
76
+ # Blog post
77
+ Since this was my first time fine-tuning an LLM, I also wrote an accompanying blog post about how I performed the training :)
78
+
79
+ https://georgesung.github.io/ai/qlora-ift/
80
+
81
+ ***End of original Model File***
82
+ ---
83
+
84
+
85
+ ## Please consider supporting my work
86
+ **Coming Soon:** I'm in the process of launching a sponsorship/crowdfunding campaign for my work. I'm evaluating Kickstarter, Patreon, or the new GitHub Sponsors platform, and I am hoping for some support and contribution to the continued availability of these kinds of models. Your support will enable me to provide even more valuable resources and maintain the models you rely on. Your patience and ongoing support are greatly appreciated as I work to make this page an even more valuable resource for the community.
87
+
88
+ <center>
89
+
90
+ [![GitHub](https://maddes8cht.github.io/assets/buttons/github-io-button.png)](https://maddes8cht.github.io)
91
+ [![Stack Exchange](https://stackexchange.com/users/flair/26485911.png)](https://stackexchange.com/users/26485911)
92
+ [![GitHub](https://maddes8cht.github.io/assets/buttons/github-button.png)](https://github.com/maddes8cht)
93
+ [![HuggingFace](https://maddes8cht.github.io/assets/buttons/huggingface-button.png)](https://huggingface.co/maddes8cht)
94
+ [![Twitter](https://maddes8cht.github.io/assets/buttons/twitter-button.png)](https://twitter.com/maddes1966)
95
+
96
+ </center>