Files changed (1)
  1. app.py +4 -4
app.py CHANGED
@@ -6,7 +6,7 @@ from matplotlib.ticker import MultipleLocator
 INTRO = """# Harm's law
 
 The Chinchilla scaling laws focus on optimally scaling training compute but often we also care about inference cost.
-This tool follows [Harm de Vries' blog post](https://www.harmdevries.com/post/model-size-vs-compute-overhead/) and visualizes the tradeoff between training comput and inference cost (i.e. model size).
+This tool follows [Harm de Vries' blog post](https://www.harmdevries.com/post/model-size-vs-compute-overhead/) and visualizes the tradeoff between training compute and inference cost (i.e. model size).
 """
 
 ### CHINCHILLA PARAMS:
@@ -82,11 +82,11 @@ Your specificied setting corresponds to the following training compute budget.
 **Compute budget (TFLOPs): {C:.2E}**
 
 ## Chinchilla optimal:
-If you are optimizeing for model performance and ignore inference cost this is the optimal setting for training:
+If you are optimizing for model performance and ignore inference cost this is the optimal setting for training:
 
-**Optimal model size: {N_opt/Bn:.2f}B parametes**
+**Optimal model size: {N_opt/Bn:.2f}B parameters**
 
-**Optimal datset size: {D_opt/Bn:.2f}B tokens**
+**Optimal dataset size: {D_opt/Bn:.2f}B tokens**
 
 ## Your setting trade-off:
 Compared to the compute optimal model.
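For readers skimming the diff: the `{C}`, `{N_opt/Bn}`, and `{D_opt/Bn}` placeholders above are filled from the Chinchilla compute-optimal allocation. Below is a minimal sketch of that calculation, assuming the "Approach 3" parametric loss fit from Hoffmann et al. (2022) and the common `C ≈ 6·N·D` FLOPs approximation; the constants and the `chinchilla_optimal` helper are illustrative, not app.py's actual code.

```python
# Illustrative only: the fitted constants below are from the Chinchilla paper
# (Hoffmann et al., 2022, "Approach 3"), not read out of app.py.
A, B, E = 406.4, 410.7, 1.69   # loss fit: L(N, D) = E + A / N**alpha + B / D**beta
alpha, beta = 0.34, 0.28
Bn = 1e9                       # one billion, matching the {N_opt/Bn:.2f} strings above

def chinchilla_optimal(C: float) -> tuple[float, float]:
    """Compute-optimal (N_opt, D_opt) for a training budget of C FLOPs,
    assuming the standard approximation C ~ 6 * N * D. (The display string
    above labels the budget in TFLOPs; this sketch works in raw FLOPs.)"""
    G = (alpha * A / (beta * B)) ** (1 / (alpha + beta))
    a, b = beta / (alpha + beta), alpha / (alpha + beta)
    N_opt = G * (C / 6) ** a           # parameters
    D_opt = (1 / G) * (C / 6) ** b     # tokens
    return N_opt, D_opt

N_opt, D_opt = chinchilla_optimal(1e21)
print(f"Optimal model size: {N_opt / Bn:.2f}B parameters")  # ~1.82B
print(f"Optimal dataset size: {D_opt / Bn:.2f}B tokens")    # ~91B
```

With these constants, a 1e21 FLOPs budget lands near 1.8B parameters trained on roughly 91B tokens. De Vries' post then asks how much extra training compute it costs to train a smaller-than-optimal model to the same loss, which is the trade-off the "## Your setting trade-off" section reports.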