TheBloke committed on
Commit 9a8f70e • 1 Parent(s): b621e6b

Update README.md

Files changed (1)
  1. README.md +17 -15
README.md CHANGED
@@ -13,7 +13,7 @@ inference: false
 </div>
 <div style="display: flex; justify-content: space-between; width: 100%;">
 <div style="display: flex; flex-direction: column; align-items: flex-start;">
-<p><a href="https://discord.gg/UBgz4VXf">Chat & support: my new Discord server</a></p>
+<p><a href="https://discord.gg/Jq4vkcDakD">Chat & support: my new Discord server</a></p>
 </div>
 <div style="display: flex; flex-direction: column; align-items: flex-end;">
 <p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? TheBloke's Patreon page</a></p>
@@ -32,7 +32,7 @@ It is the result of quantising to 4bit using [AutoGPTQ](https://github.com/PanQi
 * [4-bit GPTQ model for GPU inference](https://huggingface.co/TheBloke/falcon-40b-instruct-GPTQ)
 * [3-bit GPTQ model for GPU inference](https://huggingface.co/TheBloke/falcon-40b-instruct-3bit-GPTQ)
 * [Unquantised bf16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/tiiuae/falcon-40b-instruct)
-
+
 ## EXPERIMENTAL
 
 Please note this is an experimental GPTQ model. Support for it is currently quite limited.
@@ -126,7 +126,7 @@ It was created without groupsize to reduce VRAM requirements, and with `desc_act
 
 * `gptq_model-4bit--1g.safetensors`
 * Works only with latest AutoGPTQ CUDA, compiled from source as of commit `3cb1bf5`
-* At this time it does not work with AutoGPTQ Triton, but support will hopefully be added in time.
+* At this time it does not work with AutoGPTQ Triton, but support will hopefully be added in time.
 * Works with text-generation-webui using `--autogptq --trust_remote_code`
 * At this time it does NOT work with one-click-installers
 * Does not work with any version of GPTQ-for-LLaMa
@@ -135,7 +135,9 @@ It was created without groupsize to reduce VRAM requirements, and with `desc_act
 <!-- footer start -->
 ## Discord
 
-For further support, and discussions on these models and AI in general, join us at: [TheBloke AI's Discord server](https://discord.gg/UBgz4VXf)
+For further support, and discussions on these models and AI in general, join us at:
+
+[TheBloke AI's Discord server](https://discord.gg/Jq4vkcDakD)
 
 ## Thanks, and how to contribute.
 
@@ -143,18 +145,18 @@ Thanks to the [chirper.ai](https://chirper.ai) team!
 
 I've had a lot of people ask if they can contribute. I enjoy providing models and helping people, and would love to be able to spend even more time doing it, as well as expanding into new projects like fine tuning/training.
 
-If you're able and willing to contribute, it'd be most gratefully received and will help me to keep providing models, and work on new AI projects.
+If you're able and willing to contribute it will be most gratefully received and will help me to keep providing more models, and to start work on new AI projects.
 
-Donaters will get priority support on any and all AI/LLM/model questions, plus other benefits.
+Donaters will get priority support on any and all AI/LLM/model questions and requests, access to a private Discord room, plus other benefits.
 
 * Patreon: https://patreon.com/TheBlokeAI
 * Ko-Fi: https://ko-fi.com/TheBlokeAI
 
-**Patreon special mentions**: Aemon Algiz; Johann-Peter Hartmann; Talal Aujan; Jonathan Leane; Illia Dulskyi; Khalefa Al-Ahmad; senxiiz; Sebastain Graf; Eugene Pentland; Nikolai Manek; Luke Pendergrass.
+**Patreon special mentions**: Aemon Algiz, Dmitriy Samsonov, Nathan LeClaire, Trenton Dambrowitz, Mano Prime, David Flickinger, vamX, Nikolai Manek, senxiiz, Khalefa Al-Ahmad, Illia Dulskyi, Jonathan Leane, Talal Aujan, V. Lukas, Joseph William Delisle, Pyrater, Oscar Rangel, Lone Striker, Luke Pendergrass, Eugene Pentland, Sebastain Graf, Johann-Peter Hartman.
 
-Thank you to all my generous patrons and donaters.
+Thank you to all my generous patrons and donaters!
 <!-- footer end -->
-
+
 # ✨ Original model card: Falcon-40B-Instruct
 
 # ✨ Falcon-40B-Instruct
@@ -167,9 +169,9 @@ Thank you to all my generous patrons and donaters.
 
 * **You are looking for a ready-to-use chat/instruct model based on [Falcon-40B](https://huggingface.co/tiiuae/falcon-40b).**
 * **Falcon-40B is the best open-source model available.** It outperforms [LLaMA](https://github.com/facebookresearch/llama), [StableLM](https://github.com/Stability-AI/StableLM), [RedPajama](https://huggingface.co/togethercomputer/RedPajama-INCITE-Base-7B-v0.1), [MPT](https://huggingface.co/mosaicml/mpt-7b), etc. See the [OpenLLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
-* **It features an architecture optimized for inference**, with FlashAttention ([Dao et al., 2022](https://arxiv.org/abs/2205.14135)) and multiquery ([Shazeer et al., 2019](https://arxiv.org/abs/1911.02150)).
+* **It features an architecture optimized for inference**, with FlashAttention ([Dao et al., 2022](https://arxiv.org/abs/2205.14135)) and multiquery ([Shazeer et al., 2019](https://arxiv.org/abs/1911.02150)).
 
-💬 **This is an instruct model, which may not be ideal for further finetuning.** If you are interested in building your own instruct/chat model, we recommend starting from [Falcon-40B](https://huggingface.co/tiiuae/falcon-40b).
+💬 **This is an instruct model, which may not be ideal for further finetuning.** If you are interested in building your own instruct/chat model, we recommend starting from [Falcon-40B](https://huggingface.co/tiiuae/falcon-40b).
 
 💸 **Looking for a smaller, less expensive model?** [Falcon-7B-Instruct](https://huggingface.co/tiiuae/falcon-7b-instruct) is Falcon-40B-Instruct's small brother!
 
@@ -228,7 +230,7 @@ Falcon-40B-Instruct has been finetuned on a chat dataset.
 
 ### Out-of-Scope Use
 
-Production use without adequate assessment of risks and mitigation; any use cases which may be considered irresponsible or harmful.
+Production use without adequate assessment of risks and mitigation; any use cases which may be considered irresponsible or harmful.
 
 ## Bias, Risks, and Limitations
 
@@ -274,7 +276,7 @@ for seq in sequences:
 
 ### Training Data
 
-Falcon-40B-Instruct was finetuned on a 150M tokens from [Bai ze](https://github.com/project-baize/baize-chatbot) mixed with 5% of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) data.
+Falcon-40B-Instruct was finetuned on a 150M tokens from [Bai ze](https://github.com/project-baize/baize-chatbot) mixed with 5% of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) data.
 
 
 The data was tokenized with the Falcon-[7B](https://huggingface.co/tiiuae/falcon-7b)/[40B](https://huggingface.co/tiiuae/falcon-40b) tokenizer.
@@ -287,7 +289,7 @@ The data was tokenized with the Falcon-[7B](https://huggingface.co/tiiuae/falcon
 See the [OpenLLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) for early results.
 
 
-## Technical Specifications
+## Technical Specifications
 
 For more information about pretraining, see [Falcon-40B](https://huggingface.co/tiiuae/falcon-40b).
 
@@ -315,7 +317,7 @@ For multiquery, we are using an internal variant which uses independent key and
 
 #### Hardware
 
-Falcon-40B-Instruct was trained on AWS SageMaker, on 64 A100 40GB GPUs in P4d instances.
+Falcon-40B-Instruct was trained on AWS SageMaker, on 64 A100 40GB GPUs in P4d instances.
 
 #### Software
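The README in this diff says the model was quantised to 4-bit without a groupsize to reduce VRAM requirements. For readers unfamiliar with what that trades off, the storage arithmetic can be sketched in plain Python. This is a minimal round-to-nearest sketch, not the actual GPTQ error-compensating algorithm, and all names in it are illustrative:

```python
import random

def quantize_rtn_4bit(w):
    # One scale/zero-point pair for the whole row (i.e. no groupsize),
    # mirroring the "without groupsize to reduce VRAM" choice in the README:
    # grouped quantisation would store one pair per group of weights instead.
    qmax = 15  # 4-bit codes: 0..15
    lo, hi = min(w), max(w)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = [min(qmax, max(0, round((x - lo) / scale))) for x in w]
    return q, scale, lo

def dequantize(q, scale, zero):
    # Reconstruct approximate fp weights from the 4-bit codes.
    return [c * scale + zero for c in q]

random.seed(0)
w = [random.gauss(0.0, 1.0) for _ in range(256)]
q, scale, zero = quantize_rtn_4bit(w)
w_hat = dequantize(q, scale, zero)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
assert max_err <= scale / 2 + 1e-9  # round-to-nearest error is at most half a step
```

Each weight then occupies 4 bits plus a shared scale/zero per row, versus 16 or 32 bits per weight in the unquantised model, which is where the VRAM saving comes from; omitting the groupsize shrinks the metadata further at some cost in accuracy.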