d-matrix-user committed
Commit
74397ee
1 Parent(s): 4ed6bfb

updated README

Files changed (1): README.md (+6 -4)
README.md CHANGED
@@ -15,6 +15,7 @@ description: >-
   Perplexity metric implemented by d-Matrix.
   Perplexity (PPL) is one of the most common metrics for evaluating language models.
   It is defined as the exponentiated average negative log-likelihood of a sequence, calculated with exponent base `e`.
+  Note that this metric is intended for causal language models; the perplexity calculation is correct only if the model uses cross-entropy loss.
   For more information, see https://huggingface.co/docs/transformers/perplexity
   ---
 
@@ -26,13 +27,14 @@ description: >-
   Perplexity metric implemented by d-Matrix.
   Perplexity (PPL) is one of the most common metrics for evaluating language models.
   It is defined as the exponentiated average negative log-likelihood of a sequence, calculated with exponent base `e`.
+  Note that this metric is intended for causal language models; the perplexity calculation is correct only if the model uses cross-entropy loss.
   For more information, see https://huggingface.co/docs/transformers/perplexity
 
   ## How to Use
   At minimum, this metric requires the model and references as inputs.
   ```python
   >>> import evaluate
-  >>> perplexity = evaluate.load("dmx_perplexity", module_type="metric")
+  >>> perplexity = evaluate.load("d-matrix/dmx_perplexity", module_type="metric")
   >>> input_texts = ["lorem ipsum", "Happy Birthday!", "Bienvenue"]
   >>> results = perplexity.compute(model='distilgpt2',references=input_texts)
   >>> print(results)
@@ -59,15 +61,15 @@ This metric outputs a dictionary, containing the loss and perplexity score.
   ```python
   >>> import evaluate
   >>> from datasets import load_dataset
-  >>> perplexity = evaluate.load("dmx_perplexity", module_type="metric")
+  >>> perplexity = evaluate.load("d-matrix/dmx_perplexity", module_type="metric")
   >>> input_texts = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"][:10]
   >>> results = perplexity.compute(model='distilgpt2',references=input_texts)
   >>> print(list(results.keys()))
   ['loss', 'perplexity']
   >>> print(results['loss'])
-  3.8299286365509033
+  3.9706921577453613
   >>> print(results['perplexity'])
-  46.05925369262695
+  53.021217346191406
   ```
 
   ## Citation(s)
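For readers unfamiliar with the formula in the README text, here is a minimal, purely illustrative sketch of "exponentiated average negative log-likelihood" computed over made-up per-token probabilities. This is not the metric's actual implementation; the probability values are invented for the example.

```python
import math

# Hypothetical next-token probabilities p(token_i | context) for a
# four-token sequence; these numbers are made up for illustration.
token_probs = [0.25, 0.10, 0.50, 0.05]

# Negative log-likelihood of each token, natural log (base e).
nll = [-math.log(p) for p in token_probs]

# Average NLL is the cross-entropy loss; perplexity exponentiates it.
loss = sum(nll) / len(nll)
perplexity = math.exp(loss)
print(loss, perplexity)
```

Equivalently, perplexity here is the reciprocal of the geometric mean of the token probabilities, which is why a uniformly confident model scores lower (better).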
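Since the README defines perplexity as the exponentiated average negative log-likelihood with base `e`, the two values in the updated example output should satisfy `perplexity == exp(loss)`. A quick sanity check, with the numeric literals copied from the example output in the diff above:

```python
import math

# Values copied from the updated example output above.
loss = 3.9706921577453613
reported_perplexity = 53.021217346191406

# Per the definition, exponentiating the mean NLL should reproduce the
# reported perplexity (up to float rounding in the metric's output).
assert abs(math.exp(loss) - reported_perplexity) < 1e-4
print(math.exp(loss))
```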