Update ISCO-08 Hierarchical Accuracy Measure documentation
- isco_ham.py +20 -20
- metric_template_1.py +20 -20
isco_ham.py
CHANGED
@@ -37,8 +37,8 @@ _CITATION = """
 _DESCRIPTION = """
 The ISCO-08 Hierarchical Accuracy Measure is an implementation
 of the measure described in [Functional Annotation of Genes Using Hierarchical Text Categorization](https://www.researchgate.net/publication/44046343_Functional_Annotation_of_Genes_Using_Hierarchical_Text_Categorization)
-(Kiritchenko, Svetlana and Famili, Fazel. 2005) with the ISCO-08 taxonomy by the International Labour Organization
-
+(Kiritchenko, Svetlana and Famili, Fazel. 2005) with the ISCO-08 taxonomy by the International Labour Organization.
+
 1. The measure gives credit to partially correct classification,
 e.g. misclassification into node $I$ (ISCO unit group "1120")
 when the correct category is $G$ (ISCO unit group "1111")
@@ -47,24 +47,24 @@ should be penalized less than misclassification into node $D$
 as $G$ and $D$ is not.
 2. The measure punishes distant errors more heavily:
 1. the measure gives higher evaluation for correctly classifying one level down compared to staying at the parent node, e.g. classification into node $E$ (ISCO minor group "111") is better than classification into its parent $C$ (ISCO sub-major group "11") since $E$ is closer to the correct category $G$;
-2. the measure gives lower evaluation for incorrectly classifying one level down comparing to staying at the parent node, e.g. classification into node $F$ (ISCO minor group "112") is worse than classification into its parent $C$ since $F$ is farther away from $G
-
-The features described are accomplished by pairing hierarchical variants of precision ($hP$) and recall ($hR$) to form a hierarchical F1 (hF_β) score where each sample belongs not only to its class (e.g., a unit group level code), but also to all ancestors of the class in a hierarchical graph (i.e., the minor, sub-major, and major group level codes)
-
-Hierarchical precision can be computed with
-$hP = \frac{| \v{C}_i ∩ \v{C}^′_i|} {|\v{C}^′_i |} = \frac{1}{2}
-
-Hierarchical recall can be computed with
-$hR = \frac{| \v{C}_i ∩ \v{C}^′_i|} {|\v{C}_i |} = \frac{1}{2}
-
-Combining the two values $hP$ and $hR$ into one hF-measure
-hF_β = \frac{(β^2 + 1) · hP · hR}{(β^2 · hP + hR)}, β ∈ [0, +∞)
-
-Note
-**TP**: True positive
-**TN**: True negative
-**FP**: False positive
-**FN**: False negative
+2. the measure gives lower evaluation for incorrectly classifying one level down comparing to staying at the parent node, e.g. classification into node $F$ (ISCO minor group "112") is worse than classification into its parent $C$ since $F$ is farther away from $G$.
+
+The features described are accomplished by pairing hierarchical variants of precision ($hP$) and recall ($hR$) to form a hierarchical F1 (hF_β) score where each sample belongs not only to its class (e.g., a unit group level code), but also to all ancestors of the class in a hierarchical graph (i.e., the minor, sub-major, and major group level codes).
+
+Hierarchical precision can be computed with:
+$hP = \frac{| \v{C}_i ∩ \v{C}^′_i|} {|\v{C}^′_i |} = \frac{1}{2}$
+
+Hierarchical recall can be computed with:
+$hR = \frac{| \v{C}_i ∩ \v{C}^′_i|} {|\v{C}_i |} = \frac{1}{2}$
+
+Combining the two values $hP$ and $hR$ into one hF-measure:
+hF_β = \frac{(β^2 + 1) · hP · hR}{(β^2 · hP + hR)}, β ∈ [0, +∞)
+
+Note:
+**TP**: True positive
+**TN**: True negative
+**FP**: False positive
+**FN**: False negative
 """
 
 _KWARGS_DESCRIPTION = """
metric_template_1.py
CHANGED
@@ -37,8 +37,8 @@ _CITATION = """
 _DESCRIPTION = """
 The ISCO-08 Hierarchical Accuracy Measure is an implementation
 of the measure described in [Functional Annotation of Genes Using Hierarchical Text Categorization](https://www.researchgate.net/publication/44046343_Functional_Annotation_of_Genes_Using_Hierarchical_Text_Categorization)
-(Kiritchenko, Svetlana and Famili, Fazel. 2005) with the ISCO-08 taxonomy by the International Labour Organization
-
+(Kiritchenko, Svetlana and Famili, Fazel. 2005) with the ISCO-08 taxonomy by the International Labour Organization.
+
 1. The measure gives credit to partially correct classification,
 e.g. misclassification into node $I$ (ISCO unit group "1120")
 when the correct category is $G$ (ISCO unit group "1111")
@@ -47,24 +47,24 @@ should be penalized less than misclassification into node $D$
 as $G$ and $D$ is not.
 2. The measure punishes distant errors more heavily:
 1. the measure gives higher evaluation for correctly classifying one level down compared to staying at the parent node, e.g. classification into node $E$ (ISCO minor group "111") is better than classification into its parent $C$ (ISCO sub-major group "11") since $E$ is closer to the correct category $G$;
-2. the measure gives lower evaluation for incorrectly classifying one level down comparing to staying at the parent node, e.g. classification into node $F$ (ISCO minor group "112") is worse than classification into its parent $C$ since $F$ is farther away from $G
-
-The features described are accomplished by pairing hierarchical variants of precision ($hP$) and recall ($hR$) to form a hierarchical F1 (hF_β) score where each sample belongs not only to its class (e.g., a unit group level code), but also to all ancestors of the class in a hierarchical graph (i.e., the minor, sub-major, and major group level codes)
-
-Hierarchical precision can be computed with
-$hP = \frac{| \v{C}_i ∩ \v{C}^′_i|} {|\v{C}^′_i |} = \frac{1}{2}
-
-Hierarchical recall can be computed with
-$hR = \frac{| \v{C}_i ∩ \v{C}^′_i|} {|\v{C}_i |} = \frac{1}{2}
-
-Combining the two values $hP$ and $hR$ into one hF-measure
-hF_β = \frac{(β^2 + 1) · hP · hR}{(β^2 · hP + hR)}, β ∈ [0, +∞)
-
-Note
-**TP**: True positive
-**TN**: True negative
-**FP**: False positive
-**FN**: False negative
+2. the measure gives lower evaluation for incorrectly classifying one level down comparing to staying at the parent node, e.g. classification into node $F$ (ISCO minor group "112") is worse than classification into its parent $C$ since $F$ is farther away from $G$.
+
+The features described are accomplished by pairing hierarchical variants of precision ($hP$) and recall ($hR$) to form a hierarchical F1 (hF_β) score where each sample belongs not only to its class (e.g., a unit group level code), but also to all ancestors of the class in a hierarchical graph (i.e., the minor, sub-major, and major group level codes).
+
+Hierarchical precision can be computed with:
+$hP = \frac{| \v{C}_i ∩ \v{C}^′_i|} {|\v{C}^′_i |} = \frac{1}{2}$
+
+Hierarchical recall can be computed with:
+$hR = \frac{| \v{C}_i ∩ \v{C}^′_i|} {|\v{C}_i |} = \frac{1}{2}$
+
+Combining the two values $hP$ and $hR$ into one hF-measure:
+hF_β = \frac{(β^2 + 1) · hP · hR}{(β^2 · hP + hR)}, β ∈ [0, +∞)
+
+Note:
+**TP**: True positive
+**TN**: True negative
+**FP**: False positive
+**FN**: False negative
 """
 
 _KWARGS_DESCRIPTION = """
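For readers of this change, here is a minimal sketch of how the measure described in the updated `_DESCRIPTION` could be computed. It is not the code from `isco_ham.py`. The function names `ancestors_and_self` and `hierarchical_scores` are hypothetical. The sketch assumes that the ancestors of an ISCO-08 code are its digit prefixes (major, sub-major, minor, and unit group), that the taxonomy root is not counted, and that overlaps are summed over all samples before dividing.

```python
from typing import Iterable, Set, Tuple


def ancestors_and_self(code: str) -> Set[str]:
    """Return the code plus its ancestors, assumed here to be its digit prefixes."""
    return {code[:i] for i in range(1, len(code) + 1)}


def hierarchical_scores(
    references: Iterable[str], predictions: Iterable[str], beta: float = 1.0
) -> Tuple[float, float, float]:
    """Return (hP, hR, hF_beta) over paired reference and predicted codes."""
    overlap = pred_total = ref_total = 0
    for ref, pred in zip(references, predictions):
        ref_set = ancestors_and_self(ref)     # extended set of the true class
        pred_set = ancestors_and_self(pred)   # extended set of the predicted class
        overlap += len(ref_set & pred_set)
        pred_total += len(pred_set)
        ref_total += len(ref_set)
    hp = overlap / pred_total if pred_total else 0.0
    hr = overlap / ref_total if ref_total else 0.0
    denom = beta**2 * hp + hr
    hf = (beta**2 + 1) * hp * hr / denom if denom else 0.0
    return hp, hr, hf


# Correct category G = "1111", predicted I = "1120": only "1" and "11" are shared.
print(hierarchical_scores(["1111"], ["1120"]))  # (0.5, 0.5, 0.5)
```

Under these assumptions the sketch reproduces the value of 1/2 that the docstring gives for both $hP$ and $hR$. It also orders predictions for the true code "1111" as the description requires: predicting "111" scores higher than "11", which in turn scores higher than "112".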