app.py
CHANGED
@@ -37,7 +37,7 @@ def compute_optimal_vocab(Nnv, flops):
 with gr.Blocks() as demo:
     with gr.Column():
         gr.Markdown(
-            """<
+            """<h1>The Optimal Vocabulary Size Predictor</h1>
             This tool is used to predict the optimal vocabulary size given the non-vocabulary parameters.
             We provide 3 ways for prediction:
 
@@ -46,9 +46,9 @@ with gr.Blocks() as demo:
             - Approach 3: Parametric Fit of Loss Formula: Design a loss formula that considers the effect of vocabulary size and utilizes the loss to make prediction.
 
             Approach 1 and 2 can only be used to compute the optimal vocabulary size when the compute is optimally allocated to non-vocabulary parameters, vocabulary parameters and data jointly.
-            Approach 3 will not only consider the case above, but also consider the case when the amount of data does not satisfy the optimal compute allocation, and can calculate the optimal vocabulary size with specified
+            Approach 3 will not only consider the case above, but also consider the case when the amount of data does not satisfy the optimal compute allocation, and can calculate the optimal vocabulary size with specified FLOPs.
 
-            Thanks for trying πππ!
+            **Thanks for trying** πππ!
             """)
 
         with gr.Row():
@@ -56,7 +56,7 @@ with gr.Blocks() as demo:
             flops = gr.Textbox(label="FLOPs", placeholder="Optional (e.g. 7.05e21)")
             output_text = gr.Textbox(label="Prediction")
         with gr.Row():
-            btn = gr.Button("
+            btn = gr.Button("Press it to compute the optimal vocabulary size")
 
         btn.click(
             compute_optimal_vocab,
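
The diff only shows the start of the `btn.click(` call and none of the body of `compute_optimal_vocab`, so the following is a minimal sketch of how such a button could be wired, not the Space's actual code. `parse_optional_flops`, `describe_request`, `build_demo` and the "Non-vocabulary Parameters" label are hypothetical names introduced for illustration; `describe_request` merely echoes the parsed inputs and stands in for the real prediction function. The one behaviour it illustrates is the optional FLOPs field: an empty textbox is treated as "not given".

```python
from typing import Optional


def parse_optional_flops(text: Optional[str]) -> Optional[float]:
    """Return the FLOPs budget as a float, or None when the field is left empty."""
    if text is None or not text.strip():
        return None
    value = float(text)  # float() accepts scientific notation such as "7.05e21"
    if value <= 0:
        raise ValueError("FLOPs must be positive")
    return value


def describe_request(nnv_text: str, flops_text: str) -> str:
    """Hypothetical stand-in for compute_optimal_vocab: it only echoes the parsed inputs."""
    nnv = float(nnv_text)
    flops = parse_optional_flops(flops_text)
    if flops is None:
        return f"Nnv={nnv:.3g}, FLOPs not given"
    return f"Nnv={nnv:.3g}, FLOPs={flops:.3g}"


def build_demo():
    """Assemble a small Blocks UI; the layout is an assumption modelled on the diff."""
    import gradio as gr  # imported lazily so the helpers above work without Gradio

    with gr.Blocks() as demo:
        with gr.Row():
            # The label below is assumed; the diff does not show this textbox.
            nnv = gr.Textbox(label="Non-vocabulary Parameters")
            flops = gr.Textbox(label="FLOPs", placeholder="Optional (e.g. 7.05e21)")
            output_text = gr.Textbox(label="Prediction")
        btn = gr.Button("Press it to compute the optimal vocabulary size")
        # On click, pass both textbox values to the function and show its return value.
        btn.click(describe_request, inputs=[nnv, flops], outputs=output_text)
    return demo
```

Calling `build_demo().launch()` would start the interface locally, assuming Gradio is installed. Keeping the parsing in a plain function means the empty-field case can be checked without starting the UI.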