Updated the README; minor modifications to the description section in semf1.py
README.md CHANGED

````diff
@@ -16,15 +16,15 @@ description: >-
 for more details.
 ---
 
-# Metric Card for
+# Metric Card for Sem-F1
 
 ## Metric Description
-
+The Sem-F1 metric leverages pre-trained contextual embeddings and evaluates the model-generated semantic overlap
 summary with the reference overlap summary. It evaluates the semantic overlap summary at the sentence level and
 computes precision, recall and F1 scores.
 
 ## How to Use
-
+Sem-F1 takes 2 mandatory arguments:
 `predictions`: (a list of system generated documents in the form of sentences i.e. List[List[str]]),
 `references`: (a list of ground-truth documents in the form of sentences i.e. List[List[str]])
 
@@ -42,32 +42,41 @@ metric = load("semf1")
 results = metric.compute(predictions=predictions, references=references)
 ```
 
-It also accepts
-TODO: List optional arguments
-
-
-
-
+It also accepts an optional argument:
+
+`model_type: Optional[str]`:
+The model to use for encoding the sentences.
+Options are:
+[`pv1`](https://huggingface.co/sentence-transformers/paraphrase-distilroberta-base-v1),
+[`stsb`](https://huggingface.co/sentence-transformers/stsb-roberta-large),
+[`use`](https://huggingface.co/sentence-transformers/use-cmlm-multilingual).
+The default value is `use`.
+
+[//]: # (### Inputs)
+
+[//]: # (*List all input arguments in the format below*)
+
+[//]: # (- **input_field** *(type): Definition of input, with explanation if necessary. State any default value(s).*)
 
 ### Output Values
 
-
-
-
-
-#### Values from Popular Papers
-*Give examples, preferrably with links to leaderboards or publications, to papers that have reported this metric, along with the values they have reported.*
-
-### Examples
-*Give code examples of the metric being used. Try to include examples that clear up any potential ambiguity left from the metric description above. If possible, provide a range of examples that show both typical and atypical results, as well as examples where a variety of input parameters are passed.*
-
-## Limitations and Bias
-*Note any known limitations or biases that the metric has, with links and references if possible.*
+`precision`: The [precision](https://huggingface.co/metrics/precision) for each sentence from the `predictions` + `references` lists, which ranges from 0.0 to 1.0.
+
+`recall`: The [recall](https://huggingface.co/metrics/recall) for each sentence from the `predictions` + `references` lists, which ranges from 0.0 to 1.0.
+
+`f1`: The [F1 score](https://huggingface.co/metrics/f1) for each sentence from the `predictions` + `references` lists, which ranges from 0.0 to 1.0.
+
+[//]: # (#### Values from Popular Papers)
+
+[//]: # (*Give examples, preferably with links to leaderboards or publications, to papers that have reported this metric, along with the values they have reported.*)
+
+[//]: # (### Examples)
+
+[//]: # (*Give code examples of the metric being used. Try to include examples that clear up any potential ambiguity left from the metric description above. If possible, provide a range of examples that show both typical and atypical results, as well as examples where a variety of input parameters are passed.*)
+
+[//]: # (## Limitations and Bias)
+
+[//]: # (*Note any known limitations or biases that the metric has, with links and references if possible.*)
 
 ## Citation
 ```bibtex
@@ -92,6 +101,6 @@ BERTScore outputs a dictionary with the following values:
 ```
 
 ## Further References
-TODO: Add links to the slides and video
 - [Paper](https://aclanthology.org/2022.emnlp-main.49/)
-- [Presentation Slides]()
+- [Presentation Slides](https://auburn.box.com/s/rs5p7sttaonbvljnq0i5tk7xxw0vonn3)
+- [Video](https://auburn.box.com/s/c1bmb8c0a2emc9xhnjfalvqo2100yxvf)
````
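To make the updated "How to Use" section concrete, here is a minimal usage sketch. It assumes the metric loads under the name used in the README's `load("semf1")` context line (a hosted Space may require a namespaced identifier instead), and that `model_type` accepts the three documented options; the example sentences are drawn from the doctest visible in semf1.py below.

```python
import evaluate

# Assumption: the metric is available under the name the README uses.
metric = evaluate.load("semf1")

# Documents are lists of sentences, i.e. List[List[str]].
predictions = [
    ["I go to School.", "You are stupid."],
    ["I love adventure sports."],
]
references = [
    ["I go to School.", "You are stupid."],
    ["I like adventure sports a lot."],
]

# model_type is optional; per the metric card, "use" is the default and
# "pv1" / "stsb" select the other documented sentence encoders.
results = metric.compute(
    predictions=predictions,
    references=references,
    model_type="stsb",
)

# Per the card, precision, recall, and f1 each range from 0.0 to 1.0.
print(results)
```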
semf1.py CHANGED

````diff
@@ -66,8 +66,10 @@ Args:
     stsb - stsb-roberta-large
     use - Universal Sentence Encoder
 Returns:
-
-
+    precision: Precision.
+    recall: Recall.
+    f1: F1 score.
+
 Examples:
 
     >>> import evaluate
@@ -85,7 +87,6 @@ Examples:
     [0.77, 0.56]
 """
 
-[["I go to School.", "You are stupid."]]
 
 class Encoder(metaclass=abc.ABCMeta):
     @abc.abstractmethod
````
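The context lines above show that semf1.py defines an abstract `Encoder`, and the new `Returns` section lists precision, recall, and f1. For orientation, here is a hedged sketch of how a sentence-level semantic F1 can be computed on top of such an encoder. The concrete class name, the choice of cosine similarity, and the max-then-mean aggregation are illustrative assumptions, not necessarily what semf1.py implements.

```python
import abc
from typing import List, Tuple

import numpy as np
from sentence_transformers import SentenceTransformer


class Encoder(metaclass=abc.ABCMeta):
    """Abstract sentence encoder, mirroring the base class in the diff."""

    @abc.abstractmethod
    def encode(self, sentences: List[str]) -> np.ndarray:
        """Return one embedding vector per input sentence."""


class STEncoder(Encoder):
    # Hypothetical concrete encoder; the docstring's Args section maps
    # "stsb" to stsb-roberta-large.
    def __init__(self, model_name: str = "sentence-transformers/stsb-roberta-large"):
        self._model = SentenceTransformer(model_name)

    def encode(self, sentences: List[str]) -> np.ndarray:
        return np.asarray(self._model.encode(sentences))


def sem_f1(pred_sents: List[str], ref_sents: List[str],
           encoder: Encoder) -> Tuple[float, float, float]:
    """Sentence-level semantic precision/recall/F1 (illustrative sketch)."""
    p = encoder.encode(pred_sents)
    r = encoder.encode(ref_sents)
    # L2-normalise so the dot product is cosine similarity.
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    r = r / np.linalg.norm(r, axis=1, keepdims=True)
    sim = p @ r.T
    precision = float(sim.max(axis=1).mean())  # best match per predicted sentence
    recall = float(sim.max(axis=0).mean())     # best match per reference sentence
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Scoring each predicted sentence by its best-matching reference sentence (and vice versa for recall) is what makes such a metric sentence-level rather than document-level.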