"""Streamlit entrypoint"""

import base64
import time

import numpy as np
import streamlit as st
import sympy

from helpers.thompson_sampling import ThompsonSampler

eta, a, p, D, profit, var_cost, fixed_cost = sympy.symbols("eta a p D Profit varcost fixedcost")
np.random.seed(42)

st.set_page_config(
    page_title="💸 Dynamic Pricing 💸",
    page_icon="💸",
    layout="centered",
    initial_sidebar_state="auto",
    menu_items={
        'Get help': None,
        'Report a bug': None,
        'About': "https://www.ml6.eu/",
    }
)

st.title("💸 Dynamic Pricing 💸")
st.subheader("Setting optimal prices with Bayesian stats 📈")

# (1) Intro
st.header("Let's start with the basics 🏁")

st.markdown("The beginning is usually a good place to start so we'll kick things off there.")
st.markdown("""The one crucial piece of information we need in order to find the optimal price is
**how demand behaves over different price points**.  \nIf we can make a decent guess of what we
can expect demand to be for a wide range of prices, we can figure out which price optimizes our
target (e.g., revenue, profit, ...).""")
st.markdown("""For the keen economists amongst you, this is beginning to sound a lot like a
**demand curve**.""")

st.markdown("""Estimating a demand curve sounds easy enough, right?  \nLet's assume we have
demand with constant price elasticity, so a certain percent change in price will cause a
constant percent change in demand, independent of the price level. In economics, this is often used
as a proxy for demand curves in the wild.""")
st.markdown("So our demand data looks something like this:")
st.image("assets/images/ideal_case_demand.png")
st.markdown("""Alright now we can get out our trusted regression toolbox and fit a nice curve 
through the data because we know that our constant-elasticity demand function has this form:""")
st.latex(sympy.latex(sympy.Eq(sympy.Function(D)(p), a*p**(-eta), evaluate=False)))
st.write("with shape parameter a and price elasticity η")
st.image("assets/images/ideal_case_demand_fitted.png")
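
# Illustrative sketch (not necessarily how the fitted curve in the image above was produced):
# D(p) = a * p**(-eta) is linear in log-log space (log D = log a - eta * log p), so an
# ordinary least-squares line through (log p, log D) recovers both parameters.
# `fit_constant_elasticity` is a hypothetical helper and assumes strictly positive demand.
import numpy as np


def fit_constant_elasticity(prices, demands):
    """Return (a_hat, eta_hat) for D(p) = a * p**(-eta) via log-log least squares."""
    log_p = np.log(np.asarray(prices, dtype=float))
    log_d = np.log(np.asarray(demands, dtype=float))
    # The slope of the fitted line is -eta and the intercept is log(a).
    slope, intercept = np.polyfit(log_p, log_d, deg=1)
    return float(np.exp(intercept)), float(-slope)

# Usage: a_hat, eta_hat = fit_constant_elasticity(observed_prices, observed_demands)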
st.markdown("""Now that we have a reasonable estimate of our demand function, we can derive our 
expected profit at different price points because we know the following holds:""")
st.latex(f"{profit} = {p}*{sympy.Function(D)(p)} - [{var_cost}*{sympy.Function(D)(p)} + {fixed_cost}]")
st.image("assets/images/ideal_case_profit_curve.png")
st.markdown("""Finally, we can dust off our good old high-school math book and find the
price that we expect will optimize profit, which was ultimately the goal of all this.""")
st.image("assets/images/ideal_case_optimal_profit.png")
st.markdown("""Voilà, there you have it: we should price this product at 4.24 and we can expect
a bottom-line profit of 7.34.""")
st.markdown("So can we kick back & relax now?  \nWell, there are a few issues with what we just did.")

# (2) Dynamic demand curves
st.header("The demands they are a-changin' 🎸")
st.markdown("""We arrive at our first bit of bad news: unfortunately, you can't just estimate a 
demand curve once and be done with it.  \nWhy? Because demand is influenced by many factors (e.g., 
market trends, competitor actions, human behavior, etc.) that tend to change a lot over time.""")
st.write("Below you can see an (exaggerated) example of what we're talking about:")

with open("assets/images/dynamic_demand.gif", "rb") as file_:
    contents = file_.read()
    data_url = base64.b64encode(contents).decode("utf-8")

st.markdown(
    f'<img src="data:image/gif;base64,{data_url}" alt="dynamic demand">',
    unsafe_allow_html=True,
)
st.markdown("""Now, you may think we can solve this issue by periodically re-estimating the demand 
curve.  \nAnd you would be very right! But also very wrong as this leads us nicely to the 
next issue.""")

# (3) Constrained data
st.header("Where are we getting this data anyways? 🤔")
st.markdown("""So far, we have assumed that we get (and keep getting) data on demand levels at
different price points.  \n
Not only is this assumption **unrealistic**, it is also very **undesirable**.""")
st.markdown("""Why? Because getting demand data on a wide spectrum of price points implies that
we are spending a significant amount of time setting prices that are either too high or too low!  \n
Which is ironically exactly the opposite of what we set out to achieve.""")
st.markdown("In practice, our demand observations will rather look something like this:")
st.image("assets/images/realistic_demand.png")
st.markdown("""As we can see, we have tried three price points in the past (€7.5, €10 and €11) and
collected demand data.""")
st.markdown("""On a side note: keep in mind that we still assume the same latent demand curve and
optimal price point of €4.24.  \n
So (for the sake of the example) we have been massively overpricing our product in the past.""")
st.image("assets/images/realistic_demand_latent_curve.png")
st.markdown("""This limited data brings along a major challenge in estimating the demand curve
though.  \n
Intuitively, it makes sense that we can make a reasonable estimate of expected demand at €8 or €9,
given the observed demand at €7.5 and €10.  \nBut can we extrapolate further to €2 or €20 with the
same reasonable confidence? Probably not.""")
st.markdown("""This is a nice example of a very well-known problem in statistics called the
**\"exploration-exploitation trade-off\"**.  \n
👉 **Exploration**: We want to explore the demand for a diverse enough range of price points
so that we can accurately estimate our demand curve.  \n
👉 **Exploitation**: We want to exploit all the knowledge we have gained through exploring and
actually do what we set out to do: set our price at an optimal level.""")

# (4) Thompson sampling explanation
st.header("Enter: Thompson Sampling 📊")
st.markdown("""As we mentioned, this is a well-known problem in statistics. So luckily for us, 
there is a pretty neat solution in the form of **Thompson sampling**!""")
st.markdown("""Basically, instead of estimating one demand function based on the data available to
us, we will estimate a probability distribution over demand functions. Simply put, for every
possible demand function that fits our functional form (i.e. constant elasticity),
we will estimate the probability that it is the correct one, given our data.""")
st.markdown("""Or mathematically speaking, we will place a prior distribution on the parameters
that define our demand function and update these priors to posterior distributions via Bayes' rule,
thus obtaining a posterior distribution for our demand function.""")
st.markdown("""Thompson sampling then entails just sampling a demand function out of this 
distribution, calculating the optimal price given this demand function, observing demand for this
new price point and using this information to refine our demand function estimates.""")
st.image("assets/images/flywheel_1.png")
st.markdown("""So:  \n
👉 When we are **less certain** of our estimates, we will sample more diverse demand functions,
which means that we will also explore more diverse price points. Thus, we will **explore**.  \n
👉 When we are **more certain** of our estimates, we will sample a demand function close to
the real one & set a price close to the optimal price more often. Thus, we will **exploit**.""")
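
# Illustrative sketch of a single Thompson sampling step. It is independent of
# helpers.thompson_sampling.ThompsonSampler, whose internals are not shown in this file.
# Assumptions made purely for illustration: the posterior is approximated on a parameter
# grid with flat priors, the noise standard deviation is known, elasticity lies above 1
# (so the profit curve has an interior maximum), fixed costs are ignored (they do not
# move the optimum), and candidate prices are bounded. All names and numbers below are
# hypothetical.
import numpy as np


def thompson_step(prices, demands, rng, var_cost=1.0, noise_sd=0.5):
    """Sample one demand curve from a grid posterior and return its best price."""
    # Grid of candidate (a, eta) pairs standing in for the prior support.
    a_grid = np.linspace(1.0, 30.0, 59)
    eta_grid = np.linspace(1.1, 4.0, 59)
    a_mesh, eta_mesh = np.meshgrid(a_grid, eta_grid, indexing="ij")

    # Gaussian log-likelihood of the observations for every (a, eta) pair.
    prices = np.asarray(prices, dtype=float)
    demands = np.asarray(demands, dtype=float)
    mean_demand = a_mesh[..., None] * prices ** (-eta_mesh[..., None])
    log_post = -0.5 * (((demands - mean_demand) / noise_sd) ** 2).sum(axis=-1)

    # Normalise to a discrete posterior (subtract the max for numerical stability).
    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    # Thompson sampling: draw ONE curve from the posterior instead of using its mean.
    idx = rng.choice(post.size, p=post.ravel())
    a_sample, eta_sample = a_mesh.ravel()[idx], eta_mesh.ravel()[idx]

    # Pick the profit-maximising price for the sampled curve within fixed bounds.
    candidates = np.linspace(1.0, 15.0, 281)
    profits = (candidates - var_cost) * a_sample * candidates ** (-eta_sample)
    return float(candidates[np.argmax(profits)])

# Usage: call thompson_step with all observations so far, charge the returned price,
# append the newly observed (price, demand) pair, and repeat.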

st.markdown("""With that said, we'll take another look at our constrained data and see whether
Thompson sampling gets us any closer to the optimal price of €4.24.""")
st.image("assets/images/realistic_demand_latent_curve.png")
st.markdown("""Let's start working our mathemagic:  \n
We'll start off by placing semi-informed priors on the parameters that make up our 
demand function.""")

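# Raw f-strings keep LaTeX backslashes (\sim, \mu, \sigma, \lambda) from being read as escapes.
st.latex(rf"{sympy.latex(a)} \sim N(\mu=0,\sigma=2)")
st.latex(rf"{sympy.latex(eta)} \sim N(\mu=0.5,\sigma=0.5)")
st.latex(r"sd \sim Exp(\lambda=1)")
st.latex(rf"{sympy.latex(D)}|P=p \sim N(\mu={sympy.latex(a*p**(-eta))},\sigma=sd)")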

st.markdown("""These priors are semi-informed because we have the prior knowledge that 
price elasticity is most likely between 0 and 1. As for the other parameters, we have little
knowledge about them so we can place a pretty uninformative prior.""")
st.markdown("If that made sense to you, great. If it didn't, don't worry about it.")

st.markdown("""Now that our priors are taken care of, we can update these beliefs by incorporating
the data at the €7.5, €10 and €11 price levels we have available to us.""")
st.markdown("The resulting demand & profit curve distributions look a little something like this:")
st.image(["assets/images/posterior_demand.png", "assets/images/posterior_profit.png"])

st.markdown("""It's time to sample one demand curve out of this posterior distribution.  \n
The lucky curve is:""")
st.image("assets/images/posterior_demand_sample.png")
st.markdown("This results in the following expected profit curve:")
st.image("assets/images/posterior_profit_sample.png")
st.markdown("""And eventually we arrive at a new price: €5.25! Which is indeed considerably closer
to the actual optimal price of €4.24.""")
st.markdown("""Now that we have our first updated price point, why stop there? Let's simulate 10
demand data points at this price point from our latent demand curve and check whether Thompson
sampling will edge us even closer to that optimal €4.24 point.""")
st.image("assets/images/updated_prices_demand.png")
st.markdown("""We know the drill by now.  \n
Let's recalculate our posteriors with this extra information.""")
st.image(["assets/images/posterior_demand_2.png", "assets/images/posterior_profit_2.png"])
st.markdown("""We immediately notice that the demand (and profit) posteriors are much less spread
apart this time around which implies that we are more confident in our predictions.""")
st.markdown("Now, we can sample just one curve from the distribution.")
st.image(["assets/images/posterior_demand_sample_2.png", "assets/images/posterior_profit_sample_2.png"])
st.markdown("""And finally we arrive at a price point of €4.44, which is eerily close to
the actual optimum of €4.24.""")

# (5) Thompson sampling demo
st.header("Demo time 🎮")
st.markdown("Now that we have covered the theory, you can go ahead and try it out for yourself!")

thompson_sampler = ThompsonSampler()
demo_button = st.checkbox(
    label='Ready for the Demo? 🤯',
    help="Starts interactive Thompson sampling demo"
)
elasticity = st.slider(
    "Adjust latent elasticity",
    key="latent_elasticity",
    min_value=0.05,
    max_value=0.95,
    value=0.25,
    step=0.05,
)
while demo_button:
    thompson_sampler.run()
    time.sleep(1)

# (6) Extra topics
st.header("Some final remarks")

st.markdown("""Because we have purposefully kept the example above quite simple, you may still be
wondering what happens when added complexities show up.  \n
Let's discuss some of those concerns FAQ-style:""")

st.subheader("👉 Isn't this constant-elasticity model a bit too simple to work in practice?")
st.markdown("Brief answer: usually yes it is.")
st.markdown("""Luckily, more flexible methods exist.  \n
We recommend using Gaussian processes. We won't go into how these work here but the main idea
is that they don't impose a restrictive functional form onto the demand function but rather let
the data speak for itself.""")

with open("assets/images/gaussian_process.gif", "rb") as file_:
    contents = file_.read()
    data_url = base64.b64encode(contents).decode("utf-8")

st.markdown(
    f'<img src="data:image/gif;base64,{data_url}" alt="gaussian process">',
    unsafe_allow_html=True,
)
st.markdown("""If you do want to learn more, we recommend these links: 
[1](https://distill.pub/2019/visual-exploration-gaussian-processes/),
[2](https://thegradient.pub/gaussian-process-not-quite-for-dummies/),
[3](https://sidravi1.github.io/blog/2018/05/15/latent-gp-and-binomial-likelihood)""")

st.subheader("""👉 Isn't price optimization much more complex than just optimizing a simple profit function?""")
st.markdown("""It sure is. In reality, there are many added complexities that come into play, such
as inventory/capacity constraints, complex cost structures, ...""")
st.markdown("""The nice thing about our setup is that it consists of three components that you can
change pretty much independently from each other.  \n
This means that you can make the price optimization pillar arbitrarily custom/complex, as long as
it takes in a demand function and spits out a price.""")
st.image("assets/images/flywheel_2.png")
st.markdown("You can tune the other two steps as much as you like too.")
st.image("assets/images/flywheel_3.png")

st.subheader("👉 Changing prices has a huge impact. How can I mitigate this during experimentation?")
st.markdown("There are a few things we can do to minimize risk:")
st.markdown("""👉 **A/B testing**: You can do a gradual roll-out of the new pricing system where a
small (but increasing) percentage of your transactions are based on this new system. This allows you
to start small & track/grow the impact over time.""")
st.markdown("""👉 **Limit products**: Similarly to A/B testing, you can also segment on the
product-level. For instance, you can start gradually rolling out dynamic pricing for one product
type and extend this over time.""")
st.markdown("""👉 **Bound price range**: Theoretically, Thompson sampling in its purest form can
lead to any arbitrary price point (albeit with an increasingly low probability). In order to limit
the risk here, you can simply place an upper/lower bound on the price range you are comfortable
experimenting in.""")
st.markdown("""On top of all this, Bayesian methods (by design) explicitly quantify uncertainty.
This allows you to have a very concrete view on the variance of your demand estimates.""")

st.subheader("👉 What if I have multiple products that can cannibalize each other?")
st.markdown("Here it really depends.")
st.markdown("""👉 **If you have a handful of products**, we can simply reformulate our objective while
keeping our methods analogous.  \n
Instead of tuning one price to optimize profit for the demand function of one product, we tune N
prices to optimize profit for the joint demand function of N products. This joint demand function
can then account for correlations in demand between products.""")
st.markdown("""👉 **If you have hundreds, thousands or more products**, we're sure you can imagine that
the procedure described above becomes increasingly infeasible.  \n
A practical alternative is to group substitutable products into "baskets" and define the "price of
the basket" as the average price of all products in the basket.  \n
If we assume that the products within a basket are substitutable but the products in different
baskets are not, we can optimize basket prices independently from one another.  \n
Finally, if we also assume that cannibalization remains constant if the ratio of prices remains
constant, we can calculate each individual product price as a fixed ratio of its basket price.  \n""")
st.markdown("""For example, if a "burger basket" consists of a hamburger (€1) and a cheeseburger
(€3), then the "burger price" is ((€1 + €3) / 2 =) €2. So a hamburger costs 50% of the burger price
and a cheeseburger costs 150% of the burger price.  \n
If we change the burger price to €3, a hamburger will cost (50% * €3 =) €1.5 and a cheeseburger
will cost (150% * €3 =) €4.5 because we assume that the cannibalization effect between hamburgers &
cheeseburgers is the same when hamburgers cost €1 & cheeseburgers cost €3 and when hamburgers cost
€1.5 & cheeseburgers cost €4.5.""")
st.image("assets/images/cannibalization.png")
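
# Illustrative sketch of the basket arithmetic from the burger example above.
# `reprice_basket` is a hypothetical helper, not part of this app: it keeps each product's
# price at a fixed ratio of the basket price (the plain average of the product prices).
def reprice_basket(product_prices, new_basket_price):
    """Scale every product price so the basket average equals new_basket_price."""
    basket_price = sum(product_prices.values()) / len(product_prices)
    return {
        name: price / basket_price * new_basket_price
        for name, price in product_prices.items()
    }

# Usage: reprice_basket({"hamburger": 1.0, "cheeseburger": 3.0}, 3.0)
# returns {"hamburger": 1.5, "cheeseburger": 4.5}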

st.subheader("👉 Is dynamic pricing even relevant for slow-selling products?")
st.markdown("""The boring answer is that it depends. It depends on how dynamic the market is, the
quality of the prior information, ...""")
st.markdown("""But obviously this isn't very helpful.  \nIn general, we notice that you can already 
get quite far with limited data, especially if you have an accurate prior belief on how the demand
likely behaves.""")
st.markdown("""For reference, in our simple example where we showed a Thompson sampling update, we 
were already able to gain a lot of confidence in our estimates with just 10 extra demand 
observations.""")