Commit 8100ea5 · nsarrazin (HF staff) · unverified · 1 parent: 98030ef

Privacy update & readme linting (#472)

* privacy update

* typo

* update date

Files changed (2)
  1. PRIVACY.md +10 -12
  2. README.md +17 -22
PRIVACY.md CHANGED
@@ -1,22 +1,25 @@
1
  ## Privacy
2
 
3
- > Last updated: July 23, 2023
4
 
5
  Users of HuggingChat are authenticated through their HF user account.
6
 
7
- By default, your conversations may be shared with the respective models' authors (e.g. if you're chatting with the Open Assistant model, to <a target="_blank" href="https://open-assistant.io/dashboard">Open Assistant</a>) to improve their training data and model over time. Model authors are the custodians of the data collected by their model, even if it's hosted on our platform.
8
 
9
  If you disable data sharing in your settings, your conversations will not be used for any downstream usage (including for research or model training purposes), and they will only be stored to let you access past conversations. You can click on the Delete icon to delete any past conversation at any moment.
10
 
11
- 🗓 Please also consult huggingface.co's main privacy policy at https://huggingface.co/privacy. To exercise any of your legal privacy rights, please send an email to privacy@huggingface.co.
12
 
13
  ## About available LLMs
14
 
15
- The goal of this app is to showcase that it is now (May 2023) possible to build an open source alternative to ChatGPT. 💪
16
 
17
- For now, it's running both OpenAssistant's [latest LLaMA based model](https://huggingface.co/OpenAssistant/oasst-sft-6-llama-30b-xor) (which is one of the current best open source chat models) as well as [Meta's newer Llama 2](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf), but the plan in the longer-term is to expose all good-quality chat models from the Hub.
18
 
19
- We are not affiliated with Open Assistant nor Meta AI, but if you want to contribute to the training data for the next generation of open models, please consider contributing to https://open-assistant.io/ or https://ai.meta.com/llama/ ❤️
 
 
 
20
 
21
  ## Technical details
22
 
@@ -28,11 +31,6 @@ The inference backend is running the optimized [text-generation-inference](https
28
 
29
  It is therefore possible to deploy a copy of this app to a Space and customize it (swap model, add some UI elements, or store user messages according to your own Terms and conditions). You can also 1-click deploy your own instance using the [Chat UI Spaces Docker template](https://huggingface.co/new-space?template=huggingchat/chat-ui-template).
30
 
31
- We welcome any feedback on this app: please participate in the public discussion at https://huggingface.co/spaces/huggingchat/chat-ui/discussions
32
 
33
  <a target="_blank" href="https://huggingface.co/spaces/huggingchat/chat-ui/discussions"><img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-a-discussion-xl.svg" title="open a discussion"></a>
34
-
35
- ## Coming soon
36
-
37
- - User setting to share conversations with model authors (done ✅)
38
- - LLM watermarking
 
1
  ## Privacy
2
 
3
+ > Last updated: October 4, 2023
4
 
5
  Users of HuggingChat are authenticated through their HF user account.
6
 
7
+ By default, your conversations may be shared with the respective models' authors to improve their training data and model over time. Model authors are the custodians of the data collected by their model, even if it's hosted on our platform.
8
 
9
  If you disable data sharing in your settings, your conversations will not be used for any downstream usage (including for research or model training purposes), and they will only be stored to let you access past conversations. You can click on the Delete icon to delete any past conversation at any moment.
10
 
11
+ 🗓 Please also consult huggingface.co's main privacy policy at <https://huggingface.co/privacy>. To exercise any of your legal privacy rights, please send an email to <privacy@huggingface.co>.
12
 
13
  ## About available LLMs
14
 
15
+ The goal of this app is to showcase that it is now possible to build an open source alternative to ChatGPT. 💪
16
 
17
+ For now (October 2023), it's running:
18
 
19
+ - [Llama 2 70B](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf)
20
+ - [CodeLlama 34B](https://about.fb.com/news/2023/08/code-llama-ai-for-coding/)
21
+ - [Falcon 180B](https://www.tii.ae/news/technology-innovation-institute-introduces-worlds-most-powerful-open-llm-falcon-180b)
22
+ - [Mistral 7B](https://mistral.ai/news/announcing-mistral-7b/)
23
 
24
  ## Technical details
25
 
 
31
 
32
  It is therefore possible to deploy a copy of this app to a Space and customize it (swap model, add some UI elements, or store user messages according to your own Terms and conditions). You can also 1-click deploy your own instance using the [Chat UI Spaces Docker template](https://huggingface.co/new-space?template=huggingchat/chat-ui-template).
33
 
34
+ We welcome any feedback on this app: please participate in the public discussion at <https://huggingface.co/spaces/huggingchat/chat-ui/discussions>
35
 
36
  <a target="_blank" href="https://huggingface.co/spaces/huggingchat/chat-ui/discussions"><img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-a-discussion-xl.svg" title="open a discussion"></a>
 
 
 
 
 
README.md CHANGED
@@ -39,7 +39,7 @@ The default config for Chat UI is stored in the `.env` file. You will need to ov
39
 
40
  Start by creating a `.env.local` file in the root of the repository. The bare minimum config you need to get Chat UI to run locally is the following:
41
 
42
- ```bash
43
  MONGODB_URL=<the URL to your mongoDB instance>
44
  HF_ACCESS_TOKEN=<your access token>
45
  ```
@@ -87,7 +87,7 @@ Chat UI features a powerful Web Search feature. It works by:
87
 
88
  The login feature is disabled by default and users are attributed a unique ID based on their browser. But if you want to use OpenID to authenticate your users, you can add the following to your `.env.local` file:
89
 
90
- ```bash
91
  OPENID_PROVIDER_URL=<your OIDC issuer>
92
  OPENID_CLIENT_ID=<your OIDC client ID>
93
  OPENID_CLIENT_SECRET=<your OIDC client secret>
@@ -99,7 +99,7 @@ These variables will enable the openID sign-in modal for users.
99
 
100
  You can use a few environment variables to customize the look and feel of chat-ui. These are by default:
101
 
102
- ```
103
  PUBLIC_APP_NAME=ChatUI
104
  PUBLIC_APP_ASSETS=chatui
105
  PUBLIC_APP_COLOR=blue
@@ -113,7 +113,7 @@ PUBLIC_APP_DISCLAIMER=
113
  - `PUBLIC_APP_DATA_SHARING` Can be set to 1 to add a toggle in the user settings that lets your users opt in to data sharing with model creators.
114
  - `PUBLIC_APP_DISCLAIMER` If set to 1, we show a disclaimer about generated outputs on login.
115
 
116
- ### Web Search
117
 
118
  You can enable the web search by adding either `SERPER_API_KEY` ([serper.dev](https://serper.dev/)) or `SERPAPI_KEY` ([serpapi.com](https://serpapi.com/)) to your `.env.local`.
119
 
@@ -121,8 +121,7 @@ You can enable the web search by adding either `SERPER_API_KEY` ([serper.dev](ht
121
 
122
  You can customize the parameters passed to the model or even use a new model by updating the `MODELS` variable in your `.env.local`. The default one can be found in `.env` and looks like this:
123
 
124
- ```
125
-
126
  MODELS=`[
127
  {
128
  "name": "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
@@ -162,15 +161,15 @@ MODELS=`[
162
 
163
  You can change things like the parameters, or customize the preprompt to better suit your needs. You can also add more models by adding more objects to the array, with different preprompts for example.
164
 
165
- #### Custom prompt templates:
166
 
167
  By default the prompt is constructed using `userMessageToken`, `assistantMessageToken`, `userMessageEndToken`, `assistantMessageEndToken`, `preprompt` parameters and a series of default templates.
168
 
169
- However, these templates can be modified by setting the `chatPromptTemplate` and `webSearchQueryPromptTemplate` parameters. Note that if WebSearch is not enabled, only `chatPromptTemplate` needs to be set. The template language is https://handlebarsjs.com. The templates have access to the model's prompt parameters (`preprompt`, etc.). However, if the templates are specified it is recommended to inline the prompt parameters, as using the references (`{{preprompt}}`) is deprecated.
170
 
171
  For example:
172
 
173
- ```
174
  <System>You are an AI, called ChatAI.</System>
175
  {{#each messages}}
176
  {{#ifUser}}<User>{{content}}</User>{{/ifUser}}
@@ -179,13 +178,13 @@ For example:
179
  <Assistant>
180
  ```
181
 
182
- **chatPromptTemplate**
183
 
184
  When querying the model for a chat response, the `chatPromptTemplate` template is used. `messages` is an array of chat messages in the format `[{ content: string }, ...]`. To identify whether a message is a user message or an assistant message, the `ifUser` and `ifAssistant` block helpers can be used.
185
 
186
  The following is the default `chatPromptTemplate`, although newlines and indentation have been added for readability.
187
 
188
- ```
189
  {{preprompt}}
190
  {{#each messages}}
191
  {{#ifUser}}{{@root.userMessageToken}}{{content}}{{@root.userMessageEndToken}}{{/ifUser}}
@@ -194,13 +193,13 @@ The following is the default `chatPromptTemplate`, although newlines and indenti
194
  {{assistantMessageToken}}
195
  ```
196
 
197
- **webSearchQueryPromptTemplate**
198
 
199
  When performing a web search, the search query is constructed using the `webSearchQueryPromptTemplate` template. It is recommended that the prompt instructs the chat model to only return a few keywords.
200
 
201
  The following is the default `webSearchQueryPromptTemplate`.
202
 
203
- ```
204
  {{userMessageToken}}
205
  My question is: {{message.content}}.
206
  Based on the conversation history (my previous questions are: {{previousMessages}}), give me an appropriate query to answer my question for google search. You should not say more than query. You should not say any words except the query. For the context, today is {{currentDate}}
@@ -216,13 +215,11 @@ A good option is to hit a [text-generation-inference](https://github.com/hugging
216
 
217
  To do this, you can add your own endpoints to the `MODELS` variable in `.env.local`, by adding an `"endpoints"` key for each model in `MODELS`.
218
 
219
- ```
220
-
221
  {
222
  // rest of the model config here
223
  "endpoints": [{"url": "https://HOST:PORT"}]
224
  }
225
-
226
  ```
227
 
228
  If `endpoints` is left unspecified, ChatUI will look for the model on the hosted Hugging Face inference API using the model name.
@@ -243,22 +240,20 @@ For `Bearer` you can use a token, which can be grabbed from [here](https://huggi
243
 
244
  You can then add the generated information and the `authorization` parameter to your `.env.local`.
245
 
246
- ```
247
-
248
  "endpoints": [
249
  {
250
  "url": "https://HOST:PORT",
251
  "authorization": "Basic VVNFUjpQQVNT",
252
  }
253
  ]
254
-
255
  ```
256
 
257
  ### Amazon SageMaker
258
 
259
  You can also specify your Amazon SageMaker instance as an endpoint for chat-ui. The config goes like this:
260
 
261
- ```
262
  "endpoints": [
263
  {
264
  "host" : "sagemaker",
@@ -268,6 +263,7 @@ You can also specify your Amazon SageMaker instance as an endpoint for chat-ui.
268
  "sessionToken": "", // optional
269
  "weight": 1
270
  }
 
271
  ```
272
 
273
  You can get the `accessKey` and `secretKey` from your AWS user, under programmatic access.
@@ -284,8 +280,7 @@ If you're using a self-signed certificate, e.g. for testing or development purpo
284
 
285
  If the model being hosted will be available on multiple servers/instances add the `weight` parameter to your `.env.local`. The `weight` will be used to determine the probability of requesting a particular endpoint.
286
 
287
- ```
288
-
289
  "endpoints": [
290
  {
291
  "url": "https://HOST:PORT",
 
39
 
40
  Start by creating a `.env.local` file in the root of the repository. The bare minimum config you need to get Chat UI to run locally is the following:
41
 
42
+ ```env
43
  MONGODB_URL=<the URL to your mongoDB instance>
44
  HF_ACCESS_TOKEN=<your access token>
45
  ```
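As a quick sanity check before starting the app, a small script can confirm that both keys from the minimal config are present. This is only a sketch: the helper name `missing_keys` and the sample values are hypothetical, and the parser handles simple single-line `KEY=VALUE` entries only (not multi-line values such as `MODELS`).

```python
REQUIRED_KEYS = ("MONGODB_URL", "HF_ACCESS_TOKEN")


def missing_keys(env_text, required=REQUIRED_KEYS):
    """Return the required keys that have no non-empty value in env_text."""
    values = {}
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and lines without KEY=VALUE
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return [key for key in required if not values.get(key)]


sample = "MONGODB_URL=mongodb://localhost:27017\n# HF_ACCESS_TOKEN is not set yet\n"
print(missing_keys(sample))  # ['HF_ACCESS_TOKEN']
```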
 
87
 
88
  The login feature is disabled by default and users are attributed a unique ID based on their browser. But if you want to use OpenID to authenticate your users, you can add the following to your `.env.local` file:
89
 
90
+ ```env
91
  OPENID_PROVIDER_URL=<your OIDC issuer>
92
  OPENID_CLIENT_ID=<your OIDC client ID>
93
  OPENID_CLIENT_SECRET=<your OIDC client secret>
 
99
 
100
  You can use a few environment variables to customize the look and feel of chat-ui. These are by default:
101
 
102
+ ```env
103
  PUBLIC_APP_NAME=ChatUI
104
  PUBLIC_APP_ASSETS=chatui
105
  PUBLIC_APP_COLOR=blue
 
113
  - `PUBLIC_APP_DATA_SHARING` Can be set to 1 to add a toggle in the user settings that lets your users opt in to data sharing with model creators.
114
  - `PUBLIC_APP_DISCLAIMER` If set to 1, we show a disclaimer about generated outputs on login.
115
 
116
+ ### Web Search config
117
 
118
  You can enable the web search by adding either `SERPER_API_KEY` ([serper.dev](https://serper.dev/)) or `SERPAPI_KEY` ([serpapi.com](https://serpapi.com/)) to your `.env.local`.
119
 
 
121
 
122
  You can customize the parameters passed to the model or even use a new model by updating the `MODELS` variable in your `.env.local`. The default one can be found in `.env` and looks like this:
123
 
124
+ ```env
 
125
  MODELS=`[
126
  {
127
  "name": "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
 
161
 
162
  You can change things like the parameters, or customize the preprompt to better suit your needs. You can also add more models by adding more objects to the array, with different preprompts for example.
163
 
164
+ #### Custom prompt templates
165
 
166
  By default the prompt is constructed using `userMessageToken`, `assistantMessageToken`, `userMessageEndToken`, `assistantMessageEndToken`, `preprompt` parameters and a series of default templates.
167
 
168
+ However, these templates can be modified by setting the `chatPromptTemplate` and `webSearchQueryPromptTemplate` parameters. Note that if WebSearch is not enabled, only `chatPromptTemplate` needs to be set. The template language is <https://handlebarsjs.com>. The templates have access to the model's prompt parameters (`preprompt`, etc.). However, if the templates are specified it is recommended to inline the prompt parameters, as using the references (`{{preprompt}}`) is deprecated.
169
 
170
  For example:
171
 
172
+ ```prompt
173
  <System>You are an AI, called ChatAI.</System>
174
  {{#each messages}}
175
  {{#ifUser}}<User>{{content}}</User>{{/ifUser}}
 
178
  <Assistant>
179
  ```
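To make the template semantics concrete, the following Python sketch builds by hand roughly what the example template would render for a short conversation. It does not use Handlebars, and whitespace handling is simplified. Several details are assumptions made for illustration only: the `from` field used to tell user and assistant messages apart, the `<Assistant>…</Assistant>` wrapping of earlier assistant turns (those template lines are elided by the hunk boundary), and the function name `render_example_prompt`.

```python
def render_example_prompt(messages):
    """Hand-rolled approximation of the example template (no Handlebars)."""
    parts = ["<System>You are an AI, called ChatAI.</System>"]
    for message in messages:  # mirrors {{#each messages}}
        if message["from"] == "user":  # mirrors {{#ifUser}}
            parts.append(f"<User>{message['content']}</User>")
        else:  # assumed counterpart for assistant turns ({{#ifAssistant}})
            parts.append(f"<Assistant>{message['content']}</Assistant>")
    # The template ends with an open <Assistant> tag for the model to complete.
    parts.append("<Assistant>")
    return "\n".join(parts)


conversation = [
    {"from": "user", "content": "Hi"},
    {"from": "assistant", "content": "Hello!"},
    {"from": "user", "content": "What is Chat UI?"},
]
print(render_example_prompt(conversation))
```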
180
 
181
+ ##### chatPromptTemplate
182
 
183
  When querying the model for a chat response, the `chatPromptTemplate` template is used. `messages` is an array of chat messages in the format `[{ content: string }, ...]`. To identify whether a message is a user message or an assistant message, the `ifUser` and `ifAssistant` block helpers can be used.
184
 
185
  The following is the default `chatPromptTemplate`, although newlines and indentation have been added for readability.
186
 
187
+ ```prompt
188
  {{preprompt}}
189
  {{#each messages}}
190
  {{#ifUser}}{{@root.userMessageToken}}{{content}}{{@root.userMessageEndToken}}{{/ifUser}}
 
193
  {{assistantMessageToken}}
194
  ```
195
 
196
+ ##### webSearchQueryPromptTemplate
197
 
198
  When performing a web search, the search query is constructed using the `webSearchQueryPromptTemplate` template. It is recommended that the prompt instructs the chat model to only return a few keywords.
199
 
200
  The following is the default `webSearchQueryPromptTemplate`.
201
 
202
+ ```prompt
203
  {{userMessageToken}}
204
  My question is: {{message.content}}.
205
  Based on the conversation history (my previous questions are: {{previousMessages}}), give me an appropriate query to answer my question for google search. You should not say more than query. You should not say any words except the query. For the context, today is {{currentDate}}
 
215
 
216
  To do this, you can add your own endpoints to the `MODELS` variable in `.env.local`, by adding an `"endpoints"` key for each model in `MODELS`.
217
 
218
+ ```env
 
219
  {
220
  // rest of the model config here
221
  "endpoints": [{"url": "https://HOST:PORT"}]
222
  }
 
223
  ```
224
 
225
  If `endpoints` is left unspecified, ChatUI will look for the model on the hosted Hugging Face inference API using the model name.
 
240
 
241
  You can then add the generated information and the `authorization` parameter to your `.env.local`.
242
 
243
+ ```env
 
244
  "endpoints": [
245
  {
246
  "url": "https://HOST:PORT",
247
  "authorization": "Basic VVNFUjpQQVNT",
248
  }
249
  ]
 
250
  ```
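The sample `authorization` value in this hunk follows the HTTP Basic scheme: `VVNFUjpQQVNT` is the base64 encoding of `USER:PASS`. A minimal Python sketch of how such a value could be generated for your own credentials (the helper name `basic_auth_header` is hypothetical and not part of chat-ui):

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build an HTTP Basic authorization value from a username and password."""
    # Basic auth is base64("user:password") prefixed with the scheme name.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


print(basic_auth_header("USER", "PASS"))  # Basic VVNFUjpQQVNT
```

The printed string can be pasted as the `authorization` value of an endpoint entry.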
251
 
252
  ### Amazon SageMaker
253
 
254
  You can also specify your Amazon SageMaker instance as an endpoint for chat-ui. The config goes like this:
255
 
256
+ ```env
257
  "endpoints": [
258
  {
259
  "host" : "sagemaker",
 
263
  "sessionToken": "", // optional
264
  "weight": 1
265
  }
266
+ ]
267
  ```
268
 
269
  You can get the `accessKey` and `secretKey` from your AWS user, under programmatic access.
 
280
 
281
  If the model being hosted will be available on multiple servers/instances add the `weight` parameter to your `.env.local`. The `weight` will be used to determine the probability of requesting a particular endpoint.
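One way to read "probability of requesting a particular endpoint" is weighted random selection: an endpoint with `weight` 2 would be picked about twice as often as one with `weight` 1. The Python sketch below illustrates that idea only; the function name `pick_endpoint` and the sample hosts are hypothetical, and chat-ui's actual selection logic may differ.

```python
import random


def pick_endpoint(endpoints, rng=random):
    """Pick one endpoint at random, proportionally to its "weight" (default 1)."""
    weights = [endpoint.get("weight", 1) for endpoint in endpoints]
    # choices() draws with replacement using the given relative weights.
    return rng.choices(endpoints, weights=weights, k=1)[0]


endpoints = [
    {"url": "https://host-a.example:8080", "weight": 1},
    {"url": "https://host-b.example:8080", "weight": 2},
]

rng = random.Random(0)  # seeded only to make the demo repeatable
counts = {endpoint["url"]: 0 for endpoint in endpoints}
for _ in range(3000):
    counts[pick_endpoint(endpoints, rng)["url"]] += 1
print(counts)
```

Over many requests, the second host should receive roughly two thirds of the traffic.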
282
 
283
+ ```env
 
284
  "endpoints": [
285
  {
286
  "url": "https://HOST:PORT",