Amitontheweb committed on
Commit
05464d2
1 Parent(s): cc1e863

Update app.py

  1. app.py +66 -14
app.py CHANGED
@@ -411,28 +411,80 @@ with gr.Blocks() as demo:
411
 
412
  gr.Markdown (
413
  """
414
-
415
-
416
  ## About Params Playground
417
- A place to tweak, test and learn generative model parameters for text output.
418
-
419
- - Random Sampling - with Top P, Top K
420
 
421
- - Simple Beam search - with Early Stopping and Temperature
422
 
423
- - Diversity Beam search - with Group Diversity Penalty
 
 
424
 
425
- - Contrastive search - with Penalty Alpha
426
-
427
 
428
- Other parameters:
429
-
430
 
431
- - Length penalty
432
 
433
- - Repetition penalty
434
 
435
- - No repeat n-gram size
436
 
437
  """
438
 
 
411
 
412
  gr.Markdown (
413
  """
414
+ ##
 
415
  ## About Params Playground
 
 
 
416
 
417
+ A space to tweak, test and learn generative model parameters for text output.
418
 
419
+ **Strategies**:
420
+ --------------
421
+ Given some text as input, a decoder-only model looks for the most probable continuation, whether or not that continuation makes sense, using one of several search (decoding) strategies.
422
 
423
+ Example:
 
424
 
425
+ *Input: Today is a rainy day*
426
+
427
+ Option 1: , [probability score: 0.20]
428
+ Option 2: . [probability score: 0.07]
429
+ Option 3: ! [probability score: 0.73]
430
+
431
+
432
+ **Greedy Search**: Follows the most well-trodden path. Always picks the next word/token carrying the highest probability score. Default for GPT2.
433
+
434
+ In this illustrative example, since "!" has the highest probability, a greedy strategy will output: Today is a rainy day!
435
+
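As a rough illustration of the greedy pick described above, here is a minimal Python sketch. The `next_token_scores` dictionary and its values are assumptions made up for illustration (chosen to sum to 1); they are not real model output.

```python
# Illustrative next-token scores (assumed values, not real model output).
next_token_scores = {",": 0.20, ".": 0.07, "!": 0.73}


def greedy_pick(scores):
    """Return the token with the highest probability score."""
    return max(scores, key=scores.get)


prompt = "Today is a rainy day"
print(prompt + greedy_pick(next_token_scores))
```

Because greedy search only ever looks at the single best token for the current step, it can miss a continuation whose overall score would have been higher.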
436
+
437
+ **Random Sampling**: Picks the next token at random, weighted by the probability scores, instead of always taking the top one. Use ```do_sample=True```
438
+
439
+ *Temperature*: Increasing the temperature allows words with lower probabilities to show up in the output. As the temperature approaches 0, sampling becomes effectively 'greedy', always picking the word with the highest probability.
440
+
441
+ *Top_K*: Restricts sampling to the K highest-probability tokens (or words). In the above example, if set to 2, only Options 1 and 3, the two top-ranking tokens by probability, will be available for random sampling.
442
+
443
+ *Top_P*: Keeps the smallest set of top-ranking tokens whose probability scores add up to at least the Top P value, which must lie between 0 and 1. In the above example, if set to 0.70, only Option 3 will be available. If set to 0.90, Options 1 and 3 will be available. A low Top P can help keep the output close to the most likely answer when the input expects a fact, as in: "The capital of XYZ is [next token]", although it does not guarantee factual correctness.
444
+
445
+ When used with temperature: reducing the temperature pushes sampling toward the top-ranking token, so the search behaves more greedily.
446
+
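The three sampling controls above can be sketched in plain Python. This is a simplified illustration, not the library's implementation: the `probs` values are assumed, and the top-p helper uses the rule given in the Transformers documentation (keep the smallest top-ranked set whose probabilities add up to at least `p`).

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale a probability dict; temperature must be greater than 0."""
    logits = {t: math.log(p) / temperature for t, p in probs.items()}
    total = sum(math.exp(v) for v in logits.values())
    return {t: math.exp(v) / total for t, v in logits.items()}


def top_k_filter(probs, k):
    """Keep only the k highest-probability tokens."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])


def top_p_filter(probs, p):
    """Keep the smallest top-ranked set whose probabilities add up to at least p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, running = {}, 0.0
    for token, prob in ranked:
        kept[token] = prob
        running += prob
        if running >= p:
            break
    return kept


def sample(probs, rng=random):
    """Draw one token at random, weighted by the remaining probabilities."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


probs = {",": 0.20, ".": 0.07, "!": 0.73}  # assumed illustrative values
print(top_k_filter(probs, 2))
print(top_p_filter(probs, 0.90))
print(sample(top_k_filter(apply_temperature(probs, 0.7), 2)))
```

A temperature below 1 sharpens the distribution toward the top token, while a temperature above 1 flattens it, which is why low temperatures make sampling look greedy.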
447
 
448
+ **Simple Beam search**: Follows several branches (beams) at once, like tracing a tree's branches towards the most heavily laden ones, and keeps the set with the highest overall score. Akin to greedy search, but it compares the total score of each whole route rather than one step at a time.
449
 
450
+ If num_beams = 2, the two highest-scoring partial sequences are kept at each step and extended with their candidate next tokens, and so on till the search ends.
451
 
452
+ *Early Stopping*: Makes the search stop as soon as num_beams complete candidate sequences have been found, instead of continuing to look for better ones.
453
+
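To make the beam idea concrete, here is a toy Python sketch. The `NEXT` table is a hypothetical stand-in for a model (its tokens and probabilities are assumptions), and the search keeps the `num_beams` best partial sequences by summed log-probability at each step.

```python
import math

# Hypothetical next-token table (assumed values): last token -> {next token: probability}.
NEXT = {
    "<s>": {"nice": 0.5, "dog": 0.4, "car": 0.1},
    "nice": {"woman": 0.4, "house": 0.3, "guy": 0.3},
    "dog": {"has": 0.9, "runs": 0.05, "and": 0.05},
    "car": {"is": 0.6, "drives": 0.4},
}


def beam_search(start, steps, num_beams):
    """Keep the num_beams best partial sequences (by summed log-probability) at each step."""
    beams = [([start], 0.0)]
    for _ in range(steps):
        candidates = []
        for tokens, score in beams:
            for tok, p in NEXT.get(tokens[-1], {}).items():
                candidates.append((tokens + [tok], score + math.log(p)))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    return beams


print(beam_search("<s>", steps=2, num_beams=1)[0][0])  # num_beams=1 behaves like greedy search
print(beam_search("<s>", steps=2, num_beams=2)[0][0])
```

With one beam the search commits to "nice" and never sees that "dog has" scores higher overall; with two beams it keeps "dog" alive long enough to find it.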
454
+
455
+ **Diversity Beam search**: Divides the beams into groups and applies a diversity penalty across the groups. This makes the output more diverse and interesting.
456
+
457
+ *Group Diversity Penalty*: Used to discourage each beam group from picking words/tokens already selected by previous groups at the same step. A higher penalty gives more diverse groups.
458
+
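A minimal sketch of how such a group penalty might be applied, assuming a Hamming-style rule in which a token's score is lowered by the penalty once for every earlier group that picked it at the current step. The function name and the scores are hypothetical.

```python
from collections import Counter


def apply_diversity_penalty(scores, tokens_from_previous_groups, diversity_penalty):
    """Lower each token's score by penalty x (number of earlier groups that picked it)."""
    counts = Counter(tokens_from_previous_groups)
    return {tok: s - diversity_penalty * counts[tok] for tok, s in scores.items()}


group_scores = {"!": -0.3, ",": -1.6, ".": -2.7}  # assumed log-probability-like scores
penalised = apply_diversity_penalty(group_scores, ["!"], diversity_penalty=2.0)
print(max(penalised, key=penalised.get))
```

With a penalty of 0 the second group would pick the same token as the first; a large enough penalty steers it to a different one.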
459
+
460
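+ **Contrastive search**: Picks each token by balancing the model's confidence in it against how similar it is to the context so far, which reduces repetition and gives more interesting outputs.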
461
+
462
+ *Penalty Alpha*: Sets how much weight the similarity penalty gets. When α=0, search becomes greedy.
463
+
464
+ Refer: https://huggingface.co/blog/introducing-csearch
465
+
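The scoring rule from the blog post linked above can be sketched as follows: each candidate gets `(1 - alpha) * probability - alpha * (maximum similarity to the context)`. The 2-D vectors here are hypothetical stand-ins for the model's hidden states, and all values are assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def contrastive_pick(candidates, context_vectors, alpha):
    """candidates maps token -> (probability, vector); return the best-scoring token."""
    def score(item):
        prob, vec = item[1]
        penalty = max(cosine(vec, c) for c in context_vectors)
        return (1 - alpha) * prob - alpha * penalty
    return max(candidates.items(), key=score)[0]


context = [[1.0, 0.0]]  # assumed representation of the text so far
candidates = {"rainy": (0.6, [1.0, 0.0]), "sunny": (0.4, [0.0, 1.0])}
print(contrastive_pick(candidates, context, alpha=0.0))  # confidence only, i.e. greedy
print(contrastive_pick(candidates, context, alpha=0.6))  # similarity to context is penalised
```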
466
+
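For orientation, the strategies above map roughly onto keyword arguments of the Transformers `generate()` method. The `PRESETS` dictionary and its values are hypothetical choices for illustration; the parameter names follow the text generation docs referenced in this commit, and their availability may vary between library versions.

```python
# Hypothetical presets: each maps a strategy to keyword arguments for `generate()`.
PRESETS = {
    "greedy": {"do_sample": False, "num_beams": 1},
    "random_sampling": {"do_sample": True, "temperature": 0.7, "top_k": 50, "top_p": 0.9},
    "beam_search": {"num_beams": 4, "early_stopping": True},
    # num_beams should be divisible by num_beam_groups.
    "diverse_beam_search": {"num_beams": 4, "num_beam_groups": 2, "diversity_penalty": 1.0},
    "contrastive_search": {"penalty_alpha": 0.6, "top_k": 4},
}

# Usage, assuming a loaded `model` and tokenized `inputs` (not shown here):
# output_ids = model.generate(**inputs, max_new_tokens=40, **PRESETS["beam_search"])
for name, kwargs in PRESETS.items():
    print(name, kwargs)
```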
467
+ **Other parameters**:
468
+ ---------------------
469
+
470
+ - Length penalty: Used with beam search to favour longer outputs (values above 0) or shorter outputs (values below 0).
471
+
472
+ - Repetition penalty: Used to discourage the model from repeating tokens it has already produced. A value of 1.0 means no penalty.
473
+
474
+ - No repeat n-gram size: Used to stop the model from repeating any sequence of n words. Avoid setting to 1, as this prevents any word from appearing twice.
475
+
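Two of the parameters in the list above are easy to sketch in Python. Both helpers are simplified illustrations with hypothetical names: the first scales down the raw scores (logits) of tokens that were already generated, and the second lists the tokens that would complete an n-gram already present in the output.

```python
def apply_repetition_penalty(logits, generated, penalty):
    """Make already-generated tokens less likely (penalty > 1.0; 1.0 means no change)."""
    out = dict(logits)
    for tok in set(generated):
        if tok in out:
            # Positive scores shrink and negative scores grow more negative.
            out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out


def banned_next_tokens(generated, n):
    """Tokens that would complete an n-gram already present in `generated`."""
    if n < 1 or len(generated) < n - 1:
        return set()
    prefix = tuple(generated[len(generated) - (n - 1):])
    banned = set()
    for i in range(len(generated) - n + 1):
        if tuple(generated[i:i + n - 1]) == prefix:
            banned.add(generated[i + n - 1])
    return banned


words = ["the", "cat", "sat", "on", "the"]
print(banned_next_tokens(words, 2))  # "the cat" already occurred, so "cat" is banned next
print(banned_next_tokens(words, 1))  # n=1 bans every word seen so far
print(apply_repetition_penalty({"a": 2.0, "b": -1.0, "c": 0.5}, ["a", "b"], 2.0))
```

The n=1 case shows why that setting is best avoided: every word already used becomes unavailable, including common ones such as "the".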
476
+
477
+ References:
478
+ ------------
479
+
480
+ 1. https://huggingface.co/blog/how-to-generate
481
+
482
+ 2. https://huggingface.co/docs/transformers/generation_strategies#decoding-strategies
483
+
484
+ 3. https://huggingface.co/docs/transformers/main/en/main_classes/text_generation
485
+
486
+
487
+
488
 
489
  """
490