hogepodge committed on
Commit
a93db3a
•
1 Parent(s): 3f5487c

Add support for Hugging Face Persistent Storage


Add documentation on how to enable Hugging Face Persistent Storage
for database and task storage.

Files changed (2)
  1. Dockerfile +42 -14
  2. README.md +29 -8
Dockerfile CHANGED
@@ -25,13 +25,44 @@ FROM heartexlabs/label-studio:hf-latest
25
 
26
  ################################################################################
27
  #
28
- # How to Enable Configuration Persistence
29
- # ---------------------------------------
 
30
  # By default this space stores all project configuration and data annotations
31
- # in local storage with Sqlite. If the space is reset, all configuration and
32
  # annotation data in the space will be lost. You can enable configuration
33
- # persistence by connecting an external Postgres database to your space,
34
- # guaranteeing that all project and annotation settings are preserved.
35
  #
36
  # Set the following secret variables to match your own hosted instance of
37
  # Postgres. We strongly recommend setting these as secrets to prevent leaking
@@ -46,16 +77,14 @@ FROM heartexlabs/label-studio:hf-latest
46
  # ENV POSTGRE_PORT=<db_port>
47
  # ENV POSTGRE_HOST=<db_host>
48
  #
49
- # Uncomment the following line to remove the warning about ephemeral storage
 
50
  #
51
  # ENV STORAGE_PERSISTENCE=1
52
  #
53
  # Note that you will need to connect cloud storage to host data items that you
54
  # want to annotate, as local storage will not be preserved across a space reset.
55
  #
56
- ################################################################################
57
-
58
- ################################################################################
59
  #
60
  # How to Enable Cloud Storage
61
  # ---------------------------
@@ -74,25 +103,24 @@ FROM heartexlabs/label-studio:hf-latest
74
  # STORAGE_AWS_BUCKET_NAME="<YOUR_BUCKET_NAME>"
75
  # STORAGE_AWS_REGION_NAME="<YOUR_BUCKET_REGION>"
76
  # STORAGE_AWS_FOLDER=""
77
- #
78
  # Google Cloud Storage
79
  # ====================
80
- #
81
  # STORAGE_TYPE=gcs
82
  # STORAGE_GCS_BUCKET_NAME="<YOUR_BUCKET_NAME>"
83
  # STORAGE_GCS_PROJECT_ID="<YOUR_PROJECT_ID>"
84
  # STORAGE_GCS_FOLDER=""
85
  # GOOGLE_APPLICATION_CREDENTIALS="/opt/heartex/secrets/key.json"
86
- #
87
  # Azure Blob Storage
88
  # ==================
89
- #
90
  # STORAGE_TYPE=azure
91
  # STORAGE_AZURE_ACCOUNT_NAME="<YOUR_STORAGE_ACCOUNT>"
92
  # STORAGE_AZURE_ACCOUNT_KEY="<YOUR_STORAGE_KEY>"
93
  # STORAGE_AZURE_CONTAINER_NAME="<YOUR_CONTAINER_NAME>"
94
  # STORAGE_AZURE_FOLDER=""
95
- #
96
  #
97
  ################################################################################
98
 
 
25
 
26
  ################################################################################
27
  #
28
+ # How to Enable Persistent Storage for Label Studio in Hugging Face Spaces
29
+ # ------------------------------------------------------------------------
30
+ #
31
  # By default this space stores all project configuration and data annotations
32
+ # in local storage with sqlite. If the space is reset, all configuration and
33
  # annotation data in the space will be lost. You can enable configuration
34
+ # persistence through one of two methods:
35
+ #
36
+ # 1) Enabling Hugging Face Persistent Storage for saving project and annotation
37
+ # settings, as well as local task storage.
38
+ # 2) Connecting an external Postgres database for saving project and annotation
39
+ # settings, and cloud by connecting cloud storage for tasks.
40
+ #
41
+ ################################################################################
42
+
43
+ ################################################################################
44
+ #
45
+ # How to Enable Hugging Face Persistent Storage for Label Studio
46
+ # --------------------------------------------------------------
47
+ #
48
+ # In the Hugging Face Label Studio Space settings, select the appropriate
49
+ # Persistent Storage tier. Note that Persistent Storage is a paid add-on.
50
+ # By default, persistent storage is mounted to /data. In your Space settings,
51
+ # set the following variables:
52
+ #
53
+ # LABEL_STUDIO_BASE_DATA_DIR=/data
54
+ # ENV STORAGE_PERSISTENCE=1
55
+ #
56
+ # Your space will restart. NOTE: if you have existing settings and data,
57
+ # they will be lost in this first restart. Data and setting will only be
58
+ # preserved on subsequent restarts of the space.
59
+ #
60
+ ################################################################################
61
+
62
+ ################################################################################
63
+ #
64
+ # How to Enable Configuration Persistence with Postgres
65
+ # -----------------------------------------------------
66
  #
67
  # Set the following secret variables to match your own hosted instance of
68
  # Postgres. We strongly recommend setting these as secrets to prevent leaking
 
77
  # ENV POSTGRE_PORT=<db_port>
78
  # ENV POSTGRE_HOST=<db_host>
79
  #
80
+ # Uncomment the following line or set the following Space variable to remove
81
+ # the warning about ephemeral storage
82
  #
83
  # ENV STORAGE_PERSISTENCE=1
84
  #
85
  # Note that you will need to connect cloud storage to host data items that you
86
  # want to annotate, as local storage will not be preserved across a space reset.
87
  #
 
 
 
88
  #
89
  # How to Enable Cloud Storage
90
  # ---------------------------
 
103
  # STORAGE_AWS_BUCKET_NAME="<YOUR_BUCKET_NAME>"
104
  # STORAGE_AWS_REGION_NAME="<YOUR_BUCKET_REGION>"
105
  # STORAGE_AWS_FOLDER=""
106
+ #
107
  # Google Cloud Storage
108
  # ====================
109
+ #
110
  # STORAGE_TYPE=gcs
111
  # STORAGE_GCS_BUCKET_NAME="<YOUR_BUCKET_NAME>"
112
  # STORAGE_GCS_PROJECT_ID="<YOUR_PROJECT_ID>"
113
  # STORAGE_GCS_FOLDER=""
114
  # GOOGLE_APPLICATION_CREDENTIALS="/opt/heartex/secrets/key.json"
115
+ #
116
  # Azure Blob Storage
117
  # ==================
118
+ #
119
  # STORAGE_TYPE=azure
120
  # STORAGE_AZURE_ACCOUNT_NAME="<YOUR_STORAGE_ACCOUNT>"
121
  # STORAGE_AZURE_ACCOUNT_KEY="<YOUR_STORAGE_KEY>"
122
  # STORAGE_AZURE_CONTAINER_NAME="<YOUR_CONTAINER_NAME>"
123
  # STORAGE_AZURE_FOLDER=""
 
124
  #
125
  ################################################################################
126
 
README.md CHANGED
@@ -37,7 +37,7 @@ credentials.
37
 
38
  **By default, these spaces permit anyone to create a new login
39
  account, allowing them to view and modify project configuration, data sets, and
40
- annotations. Without any modifications, treat this space like a demo environment.**
41
 
42
  ## Creating a Labeling Project
43
 
@@ -58,7 +58,7 @@ resources including tutorials and documentation.
58
  - 🤗 [Tutorial: Using Label Studio with Hugging Face Datasets Hub](https://danielvanstrien.xyz/huggingface/huggingface-datasets/annotation/full%20stack%20deep%20learning%20notes/2022/09/07/label-studio-annotations-hub.html)
59
  - 💡 [Label Studio Docs](https://hubs.ly/Q01CN9Yq0)
60
 
61
-
62
  ![Gif of Label Studio annotating different types of data](https://raw.githubusercontent.com/heartexlabs/label-studio/master/images/annotation_examples.gif)
63
 
64
  ### Making your Label Studio Hugging Face Space production-ready
@@ -68,8 +68,8 @@ will full access to all projects and data. This is great for trying out
68
  Label Studio and collaborating on projects, but you may want to restrict
69
  access to your space to only authorized users. Add the following environment
70
  variable to your spaces Dockerfile to disable public account creation for
71
- this space.
72
-
73
  ENV LABEL_STUDIO_DISABLE_SIGNUP_WITHOUT_LINK=true
74
 
75
  Set secrets in your space to create an inital user, and log in with your
@@ -80,13 +80,34 @@ globally visible on a public space.
80
  LABEL_STUDIO_PASSWORD
81
 
82
  You will need to provide new users with an invitation link to join the space,
83
- which can be found in the Organizations interface of Label Studio
84
 
85
  By default this space stores all project configuration and data annotations
86
  in local storage with Sqlite. If the space is reset, all configuration and
87
  annotation data in the space will be lost. You can enable configuration
88
- persistence by connecting an external Postgres database to your space,
89
- guaranteeing that all project and annotation settings are preserved.
90
 
91
  Set the following secret variables to match your own hosted instance of
92
  Postgres. We strongly recommend setting these as secrets to prevent leaking
@@ -142,6 +163,6 @@ Azure Blob Storage
142
  STORAGE_AZURE_FOLDER=""
143
 
144
 
145
- ## Questions? Concerns? Want to get involved?
146
 
147
  Email the community team at [community@labelstud.io](mailto:community@labelstud.io)
 
37
 
38
  **By default, these spaces permit anyone to create a new login
39
  account, allowing them to view and modify project configuration, data sets, and
40
+ annotations. Without any modifications, treat this space like a demo environment.**
41
 
42
  ## Creating a Labeling Project
43
 
 
58
  - 🤗 [Tutorial: Using Label Studio with Hugging Face Datasets Hub](https://danielvanstrien.xyz/huggingface/huggingface-datasets/annotation/full%20stack%20deep%20learning%20notes/2022/09/07/label-studio-annotations-hub.html)
59
  - 💡 [Label Studio Docs](https://hubs.ly/Q01CN9Yq0)
60
 
61
+
62
  ![Gif of Label Studio annotating different types of data](https://raw.githubusercontent.com/heartexlabs/label-studio/master/images/annotation_examples.gif)
63
 
64
  ### Making your Label Studio Hugging Face Space production-ready
 
68
  Label Studio and collaborating on projects, but you may want to restrict
69
  access to your space to only authorized users. Add the following environment
70
  variable to your spaces Dockerfile to disable public account creation for
71
+ this space.
72
+
73
  ENV LABEL_STUDIO_DISABLE_SIGNUP_WITHOUT_LINK=true
74
 
75
  Set secrets in your space to create an inital user, and log in with your
 
80
  LABEL_STUDIO_PASSWORD
81
 
82
  You will need to provide new users with an invitation link to join the space,
83
+ which can be found in the Organizations interface of Label Studio.
84
 
85
  By default this space stores all project configuration and data annotations
86
  in local storage with Sqlite. If the space is reset, all configuration and
87
  annotation data in the space will be lost. You can enable configuration
88
+ persistence in one of two ways:
89
+
90
+ 1. Enabling Persistent Storage in your Space settings and configuring Label
91
+ Studio to write its database and task storage there.
92
+
93
+ 2. Connecting an external Postgres database and cloud storage to your space,
94
+ guaranteeing that all project and annotation settings are preserved.
95
+
96
+ ### Enabling Hugging Face Persistent Storage
97
+
98
+ In the Hugging Face Label Studio Space settings, select the appropriate
99
+ Persistent Storage tier. Note that Persistent Storage is a paid add-on.
100
+ By default, persistent storage is mounted to /data. In your Space settings,
101
+ set the following variables:
102
+
103
+ LABEL_STUDIO_BASE_DATA_DIR=/data
104
+ ENV STORAGE_PERSISTENCE=1
105
+
106
+ Your space will restart. NOTE: if you have existing settings and data,
107
+ they will be lost in this first restart. Data and setting will only be
108
+ preserved on subsequent restarts of the space.
109
+
110
+ ### Enabling Postgres Database and Cloud Storage
111
 
112
  Set the following secret variables to match your own hosted instance of
113
  Postgres. We strongly recommend setting these as secrets to prevent leaking
 
163
  STORAGE_AZURE_FOLDER=""
164
 
165
 
166
+ ## Questions? Concerns? Want to get involved?
167
 
168
  Email the community team at [community@labelstud.io](mailto:community@labelstud.io)
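
Reviewer note: the Persistent Storage instructions added in this commit come down to two settings, `LABEL_STUDIO_BASE_DATA_DIR=/data` and `STORAGE_PERSISTENCE=1`. The sketch below is one possible sanity check for them and is not part of the commit or of Label Studio. The function name `check_persistence_config` is hypothetical, and the script only assumes that the base data directory should exist and be writable before the app starts.

```python
import os
from pathlib import Path


def check_persistence_config(env):
    """Return a list of problems with the persistence settings.

    `env` is any mapping of variable names to string values, such as
    os.environ. An empty list means both settings named in the commit
    look usable. This helper is hypothetical and not a Label Studio API.
    """
    problems = []

    base_dir = env.get("LABEL_STUDIO_BASE_DATA_DIR")
    if not base_dir:
        problems.append("LABEL_STUDIO_BASE_DATA_DIR is not set")
    else:
        path = Path(base_dir)
        if not path.is_dir():
            problems.append(f"{base_dir} does not exist or is not a directory")
        elif not os.access(path, os.W_OK):
            problems.append(f"{base_dir} is not writable")

    # The commit describes this flag as removing the ephemeral storage warning.
    if env.get("STORAGE_PERSISTENCE") != "1":
        problems.append("STORAGE_PERSISTENCE is not set to 1")

    return problems


if __name__ == "__main__":
    for problem in check_persistence_config(os.environ):
        print("WARNING:", problem)
```

Run inside the Space (for example from a startup hook), this prints one warning per missing or unusable setting and prints nothing when both are in place.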