hathawayj committed on
Commit
5d64005
1 Parent(s): cc336d6

new app with slides

Files changed (2)
  1. slides.html +304 -0
  2. streamlit.py +16 -1
slides.html ADDED
@@ -0,0 +1,304 @@
+ <!DOCTYPE html>
+ <html>
+ <head>
+   <title>Introduction to Data Science Programming in Python</title>
+   <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
+   <style type="text/css">
+     @import url(https://fonts.googleapis.com/css?family=Yanone+Kaffeesatz);
+     @import url(https://fonts.googleapis.com/css?family=Droid+Serif:400,700,400italic);
+     @import url(https://fonts.googleapis.com/css?family=Ubuntu+Mono:400,700,400italic);
+
+     body { font-family: 'Droid Serif'; }
+     h1 {
+       font-family: 'Yanone Kaffeesatz';
+       font-weight: normal;
+       color:darkslategrey;
+     }
+     h2, h3 {
+       font-family: 'Yanone Kaffeesatz';
+       font-weight: normal;
+     }
+     .font40 {
+       font-size: 40px;
+     }
+     .font30 {
+       font-size: 30px;
+     }
+     .font20 {
+       font-size: 20px;
+     }
+     .remark-code, .remark-inline-code {
+       font-family: 'Ubuntu Mono';
+       font-size: 20px;
+     }
+     /* Two-column layout */
+     .left-column {
+       color: #777;
+       width: 50%;
+       float: left;
+     }
+     .left-column h2:last-of-type, .left-column h3:last-child {
+       color: #000;
+     }
+     .right-column {
+       width: 50%;
+       float: right;
+       padding-top: 1em;
+     }
+     .right-column h2:last-of-type, .right-column h3:last-child {
+       color: #000;
+     }
+     .inverse {
+       background: #272822;
+       color: #e4e4e1;
+       text-shadow: 0 0 20px #333;
+     }
+     .inverse h1, .inverse h2, .inverse h3 {
+       color: #f3f3f3;
+       line-height: 0.8em;
+     }
+     .lightfont { color: rgb(129, 126, 126); }
+   </style>
+ </head>
+ <body>
+   <textarea id="source">
+
+ class: center, middle, font30
+
+ # Introduction to Streamlit Apps
+
+ J. Hathaway - Data Science Program Chair (BYU-I)
+
+ ---
+
+ class: font30
+
+ # Disclaimers
+
+ ## Dashboarding is easy to start with modern tools like Streamlit.
+
+ ### It is much harder to implement well, as [full-stack development](https://aws.amazon.com/what-is/full-stack-development) has its own schooling and employment. Enjoy using these tools. However, know their purpose and use them accordingly.
+
+
+ ---
+ class: font20
+ # Agenda
+
+ Exemplify the data science process - Extract, Transform, Load, Analyze
+
+ 1. Checking installations (1 minute)
+ 2. Creating an account and navigating Hugging Face (10 minutes)
+ 3. Docker for dashboard development using Streamlit (10 minutes)
+ 4. Polars for data munging (5 minutes). _Don't munge data in your app (unless you have to)!_
+ 5. What are dashboards? (5 minutes)
+ 6. Why Streamlit for dashboards? (10 minutes)
+ 7. Visualization in dashboards (5 minutes)
+ 8. Tables in dashboards (5 minutes)
+ 9. Key Performance Indicators [KPIs] in dashboards (5 minutes)
+ 10. Challenge yourself to some dashboard edits (20 minutes)
+
+
+ ---
+ class: font40
+ # Checking our installation
+
+ 1. [Python Installed](https://www.python.org/downloads/)
+ 2. [VS Code Installed](https://code.visualstudio.com/download)
+ 3. [Python VS Code Extension Installed](https://marketplace.visualstudio.com/items?itemName=ms-python.python)
+ 4. [Docker Installed](https://www.docker.com/)
+ 5. Python packages installed.
+ ```bash
+ pip install polars plotly streamlit
+ ```
+
+ ---
+ class: font20
+ # Hugging Face Accounts and Navigation
+
+ ## [Create your Hugging Face](https://huggingface.co/join) account.
+
+ > The platform where the machine learning community collaborates on models, datasets, and applications.
+
+ - [Hugging Face Docs](https://huggingface.co/docs)
+ - [Hugging Face Spaces](https://huggingface.co/docs/hub/spaces) ([YouTube Intro](https://www.youtube.com/watch?v=3bSVKNKb_PY))
+ - [Hugging Face Repositories](https://huggingface.co/docs/hub/repositories)
+ - [Hugging Face Organizations](https://huggingface.co/docs/hub/organizations)
+
+
+ ---
+ class: font20
+ # Docker for Dashboard Development
+
+ .left-column[
+ 1. Clone our Hugging Face repository
+ 2. Explore the `DockerFile` and `docker-compose.yml` files
+ 3. Run `docker compose up`
+ 4. Edit our app
+ 5. Push our changes
+ ]
+ .right-column[
+ ![:scale 65%](https://www.docker.com/wp-content/uploads/2023/08/logo-guide-logos-1.svg)
+ ]
+ ---
+ class: font20
+ # Polars for data munging
+
+ > Polars is a lightning fast DataFrame library/in-memory query engine. Its embarrassingly parallel execution, cache efficient algorithms and expressive API makes it perfect for efficient data wrangling, data pipelines, snappy APIs and so much more. Polars is about as fast as it gets, see the results in the [H2O.ai benchmark](https://h2oai.github.io/db-benchmark/).
+ > <br>
+ > [Polars Website](https://www.pola.rs/)
+
+ ![:scale 60%](https://raw.githubusercontent.com/pola-rs/polars-static/master/logos/polars_github_logo_rect_dark_name.svg)
+
+
+ ---
+
+ class: font20
+ # Introduction to Dashboarding (Structured design)
+
+ > A dashboard is a way of displaying various types of visual data in one place that lets the user focus on one general topic but explore questions within that topic.
+
+
+ ![:scale 85%](https://huggingface.co/spaces/ds460/docker_streamlit/resolve/main/img/dashboard_vmware_balance.png)
+
+ ---
+ class: font20
+ # Introduction to Dashboarding (Audience)
+
+ > A dashboard is a way of displaying various types of visual data in one place that lets the user focus on one general topic but explore questions within that topic.
+
+ > A poorly-designed dashboard doesn’t respect the reader’s time. The whole point of a dashboard is to create a product that will save the user’s time by including everything they need to know in one place. If they can’t go through the dashboard in a couple of minutes and get on with their job, the design needs to be changed.
+
+
+ ![:scale 40%](https://huggingface.co/spaces/ds460/docker_streamlit/resolve/main/img/dashboard_vmware_user.png)
+
+ [Reference 1](https://www.vmwareopsguide.com/dashboards/chapter-1-design-considerations/3.1.2-the-art-of-dashboard/) and [Reference 2](https://databox.com/bad-dashboard-examples)
+
+ ---
+
+ class: font20
+ # Why Streamlit for dashboards?
+
+
+ Streamlit turns data scripts into shareable web apps in minutes in pure Python. A faster way to build and share data apps with no front‑end experience required.
+
+
+ ![:scale 60%](https://huggingface.co/spaces/ds460/docker_streamlit/resolve/main/img/streamlit.jpg)
+
+ ---
+ class: font30
+ # Streamlit programming
+
+ Now let's practice using Streamlit with our installation of Python.
+
+ __Streamlit practice (streamlit_try.py)__
+
+ _After deleting your Docker Image and Container, edit your `DockerFile` to build from a new `streamlit_try.py` script that you create in the folder. Use the code below for the app._
+
+
+ ```python
+ import streamlit as st
+ import polars as pl
+
+ st.write("Here's our first attempt at using data to create a table:")
+ st.write(pl.DataFrame({
+     'first column': [1, 2, 3, 4],
+     'second column': [10, 20, 30, 40]
+ }))
+ ```
+
+ ---
+ class: font20
+ # Introduction to Data Visualization
+
+ Our eyes are drawn to [colors and patterns](https://www.tableau.com/learn/whitepapers/tableau-visual-guidebook). We can quickly identify red from blue, and squares from circles. Our culture is visual, including everything from art and advertisements to TV and movies. Data visualization is another form of visual art that grabs our interest and keeps our eyes on the message.
+
+ .left-column[
+ ### Advantages of data visualization:
+
+ - Easily share information.
+ - Interactively explore opportunities.
+ - Visualize patterns and relationships.
+ ]
+ .right-column[
+ ### Disadvantages:
+
+ - Biased or inaccurate information.
+ - Correlation doesn’t always mean causation.
+ - Core messages can get lost in translation.
+ ]
+
+ [Tableau Reference](https://www.tableau.com/learn/articles/data-visualization)
+
+ ---
+ class: font20
+ # Introduction to __Plotly__ for Data Visualization
+
+ The Plotly Python package leverages the plotly.js JavaScript library to enable Python users to create beautiful interactive web-based visualizations. Built on top of d3.js and stack.gl, plotly.js is a high-level, declarative charting library that ships with over 40 chart types, including 3D charts, statistical graphs, and SVG maps.
+
+ ![:scale 50%](https://raw.githubusercontent.com/hathawayj/ghana_datascience/master/img/plotly_charts.png)
+
+ ---
+ class: font20
+ # Tables in dashboards
+
+ > Complexity is the downfall of dashboards. Raw data is always complex.
+
+ - [How to Fit Big Tables on Small Screens](https://www.youtube.com/watch?v=s7nvF1PuAWY)
+ - [Examples of great tables](https://posit-dev.github.io/great-tables/examples/)
+
+ ![:scale 75%](https://huggingface.co/spaces/ds460/docker_streamlit/resolve/main/img/tables.jpg)
+
+ ---
+ class: font40
+ # Key Performance Indicators (KPIs) in dashboards
+
+ > Too much summarization and too much dashboard real estate.
+
+ _[The Dark Side of KPIs: Uncovering the Limitations and Pitfalls](https://shahmm.medium.com/the-dark-side-of-kpis-uncovering-the-limitations-and-pitfalls-4139950e70ef)_
+
+ ![:scale 80%](https://huggingface.co/spaces/ds460/docker_streamlit/resolve/main/img/kpis.jpg)
+
+ ---
+ class: font20
+ # Streamlit Challenge Activity
+
+ - Add the ability to filter the chart to a specified year range with [st.date_input()](https://docs.streamlit.io/develop/api-reference/widgets/st.date_input).
+ - Add [Dataframes - st.data_editor()](https://docs.streamlit.io/develop/concepts/design/dataframes) to allow the user to pick which variables are displayed in the dropdown.
+ - Add a few metrics to your dashboard using [st.metric()](https://docs.streamlit.io/develop/api-reference/data/st.metric).
+     - Report the year range of data available for the variable selected over all countries.
+     - Add the percent growth from 2000 to the latest available year.
+     - Add the country with the highest value in the latest year.
+ - Give the user of your app the ability to take a picture using [st.camera_input()](https://docs.streamlit.io/develop/api-reference/widgets/st.camera_input).
+     - Try to use a third-party extension to allow the user to draw on the camera picture taken using [streamlit-drawable-canvas](https://github.com/andfanilo/streamlit-drawable-canvas?tab=readme-ov-file).
+ - Now organize your application using
+     - [st.set_page_config()](https://docs.streamlit.io/develop/api-reference/configuration/st.set_page_config)
+     - [st.columns()](https://docs.streamlit.io/develop/api-reference/layout/st.columns)
+
+   </textarea>
+   <script src="https://remarkjs.com/downloads/remark-latest.min.js" type="text/javascript">
+   </script>
+   <script type="text/javascript">
+     remark.macros.upper = function () {
+       // `this` is the value in the parenthesis, or undefined if left out
+       return this.toUpperCase();
+     };
+
+     remark.macros.random = function () {
+       // params are passed as function arguments: ["one", "of", "these", "words"]
+       var i = Math.floor(Math.random() * arguments.length);
+       return arguments[i];
+     };
+
+     remark.macros.scale = function (percentage) {
+       var url = this;
+       return '<img src="' + url + '" style="width: ' + percentage + '" />';
+     };
+
+     var slideshow = remark.create({
+       ratio: "16:9",
+       highlightLanguage: 'javascript',
+       highlightStyle: 'monokai'
+     });
+   </script>
+ </body>
+ </html>
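
Reviewer note on the "Streamlit Challenge Activity" slide: the three metric ideas (year range, percent growth from 2000, top country in the latest year) can be prototyped in plain Python before they are wired into the app. The sketch below is not part of this commit. The `rows` data, its column names, and the helper names are hypothetical, and the app itself works with Polars data frames rather than lists of dicts.

```python
# Hypothetical long-format records: one row per country and year.
rows = [
    {"country": "Ghana", "year": 2000, "value": 50.0},
    {"country": "Ghana", "year": 2020, "value": 80.0},
    {"country": "Kenya", "year": 2000, "value": 40.0},
    {"country": "Kenya", "year": 2021, "value": 90.0},
]


def year_range(rows):
    """Return (first_year, last_year) across all countries."""
    years = [r["year"] for r in rows]
    return min(years), max(years)


def percent_growth(rows, country, base_year=2000):
    """Percent change from base_year to the country's latest available year."""
    series = {r["year"]: r["value"] for r in rows if r["country"] == country}
    latest = max(series)
    return (series[latest] - series[base_year]) / series[base_year] * 100


def top_country_latest_year(rows):
    """Return (country, year) for the highest value in the latest year overall."""
    latest = max(r["year"] for r in rows)
    best = max((r for r in rows if r["year"] == latest), key=lambda r: r["value"])
    return best["country"], latest
```

Each result could then be passed to `st.metric()` as its `value`, for example with a label such as "Growth since 2000".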
streamlit.py CHANGED
@@ -5,8 +5,11 @@ import polars as pl
  import plotly.express as px
  import plotly.io as pio
  pio.templates.default = "simple_white"
+
+ st.set_page_config(layout="wide")
  # %%
  # Data
+
  dat = pl.read_csv("dat_munged.csv")
  info = pl.read_csv("Metadata_Indicator_API_Download_DS2_en_csv_v2_5657328.csv").rename({"INDICATOR_CODE":"Indicator Code", "INDICATOR_NAME":"Indicator Name"})
  dat_vars = pl.read_csv("dat_vars.csv")
@@ -47,6 +50,8 @@ sp = px.line(use_dat.to_pandas(),
 
  st.markdown("## Country performance over time")
 
+ st.markdown("_You can read about streamlit [here](slides.html)_")
+
  st.markdown("__" + title_text + "__")
 
  st.markdown(subtitle_text)
@@ -79,4 +84,14 @@ def convert_df(df):
 
  csv = convert_df(display_dat)
 
- st.download_button("Download Data", data = csv, file_name = "data.csv", mime="text/csv")
+ st.download_button("Download Data", data = csv, file_name = "data.csv", mime="text/csv")
+
+ st.markdown("## My presentation")
+
+ # Read file and keep in variable
+ with open('slides.html','r') as f:
+     html_data = f.read()
+
+ ## Show in webpage
+ st.header("Show an external HTML")
+ st.components.v1.html(html_data,height=1500)
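
Reviewer note: the new block reads `slides.html` with the platform default encoding and would raise if the file were missing from the container. A slightly more defensive loader might look like the sketch below. It is not part of this commit, and the `load_slides` name and the fallback markup are hypothetical.

```python
from pathlib import Path


def load_slides(path="slides.html", fallback="<p>Slides are not available.</p>"):
    """Return the slide deck HTML, or a fallback snippet if the file is missing."""
    slide_file = Path(path)
    if not slide_file.is_file():
        return fallback
    # slides.html declares charset=UTF-8, so read it with that encoding explicitly.
    return slide_file.read_text(encoding="utf-8")
```

In the app, the result would be passed to the same call the commit already uses, `st.components.v1.html(load_slides(), height=1500)`, optionally with `scrolling=True` if the deck does not fit the frame.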