danieldux committed on
Commit
908df22
·
1 Parent(s): 9cbdfb3

Update ISCO-08 Hierarchical Accuracy Measure documentation

Files changed (2)
  1. isco_ham.py +20 -20
  2. metric_template_1.py +20 -20
isco_ham.py CHANGED
@@ -37,8 +37,8 @@ _CITATION = """
37
  _DESCRIPTION = """
38
  The ISCO-08 Hierarchical Accuracy Measure is an implementation
39
  of the measure described in [Functional Annotation of Genes Using Hierarchical Text Categorization](https://www.researchgate.net/publication/44046343_Functional_Annotation_of_Genes_Using_Hierarchical_Text_Categorization)
40
- (Kiritchenko, Svetlana and Famili, Fazel. 2005) with the ISCO-08 taxonomy by the International Labour Organization.\n
41
- \n
42
  1. The measure gives credit to partially correct classification,
43
  e.g. misclassification into node $I$ (ISCO unit group "1120")
44
  when the correct category is $G$ (ISCO unit group "1111")
@@ -47,24 +47,24 @@ should be penalized less than misclassification into node $D$
47
  as $G$ and $D$ is not.
48
  2. The measure punishes distant errors more heavily:
49
  1. the measure gives higher evaluation for correctly classifying one level down compared to staying at the parent node, e.g. classification into node $E$ (ISCO minor group "111") is better than classification into its parent $C$ (ISCO sub-major group "11") since $E$ is closer to the correct category $G$;
50
- 2. the measure gives lower evaluation for incorrectly classifying one level down comparing to staying at the parent node, e.g. classification into node $F$ (ISCO minor group "112") is worse than classification into its parent $C$ since $F$ is farther away from $G$.\n
51
- \n
52
- The features described are accomplished by pairing hierarchical variants of precision ($hP$) and recall ($hR$) to form a hierarchical F1 (hF_β) score where each sample belongs not only to its class (e.g., a unit group level code), but also to all ancestors of the class in a hierarchical graph (i.e., the minor, sub-major, and major group level codes).\n
53
- \n
54
- Hierarchical precision can be computed with:\n
55
- $hP = \frac{| \v{C}_i ∩ \v{C}^′_i|} {|\v{C}^′_i |} = \frac{1}{2}$\n
56
- \n
57
- Hierarchical recall can be computed with:\n
58
- $hR = \frac{| \v{C}_i ∩ \v{C}^′_i|} {|\v{C}_i |} = \frac{1}{2}$\n
59
- \n
60
- Combining the two values $hP$ and $hR$ into one hF-measure:\n
61
- hF_β = \frac{(β^2 + 1) · hP · hR}{(β^2 · hP + hR)}, β ∈ [0, +∞)\n
62
- \n
63
- Note:\n
64
- **TP**: True positive\n
65
- **TN**: True negative\n
66
- **FP**: False positive\n
67
- **FN**: False negative\n
68
  """
69
 
70
  _KWARGS_DESCRIPTION = """
 
37
  _DESCRIPTION = """
38
  The ISCO-08 Hierarchical Accuracy Measure is an implementation
39
  of the measure described in [Functional Annotation of Genes Using Hierarchical Text Categorization](https://www.researchgate.net/publication/44046343_Functional_Annotation_of_Genes_Using_Hierarchical_Text_Categorization)
40
+ (Kiritchenko, Svetlana and Famili, Fazel. 2005) with the ISCO-08 taxonomy by the International Labour Organization.
41
+
42
  1. The measure gives credit to partially correct classification,
43
  e.g. misclassification into node $I$ (ISCO unit group "1120")
44
  when the correct category is $G$ (ISCO unit group "1111")
 
47
  as $G$ and $D$ is not.
48
  2. The measure punishes distant errors more heavily:
49
  1. the measure gives higher evaluation for correctly classifying one level down compared to staying at the parent node, e.g. classification into node $E$ (ISCO minor group "111") is better than classification into its parent $C$ (ISCO sub-major group "11") since $E$ is closer to the correct category $G$;
50
+ 2. the measure gives lower evaluation for incorrectly classifying one level down compared to staying at the parent node, e.g. classification into node $F$ (ISCO minor group "112") is worse than classification into its parent $C$ since $F$ is farther away from $G$.
51
+
52
+ These properties are achieved by pairing hierarchical variants of precision ($hP$) and recall ($hR$) to form a hierarchical F-measure ($hF_β$), where each sample belongs not only to its class (e.g., a unit group level code) but also to all ancestors of that class in the hierarchy (i.e., the minor, sub-major, and major group level codes).
53
+
54
+ Hierarchical precision can be computed with:
55
+ $hP = \frac{|\hat{C}_i ∩ \hat{C}'_i|}{|\hat{C}'_i|} = \frac{1}{2}$
56
+
57
+ Hierarchical recall can be computed with:
58
+ $hR = \frac{|\hat{C}_i ∩ \hat{C}'_i|}{|\hat{C}_i|} = \frac{1}{2}$
59
+
60
+ Combining the two values $hP$ and $hR$ into one hF-measure:
61
+ $hF_β = \frac{(β^2 + 1) · hP · hR}{β^2 · hP + hR}, β ∈ [0, +∞)$
62
+
63
+ Note:
64
+ **TP**: True positive
65
+ **TN**: True negative
66
+ **FP**: False positive
67
+ **FN**: False negative
68
  """
69
 
70
  _KWARGS_DESCRIPTION = """
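
The formulas in the docstring above can be illustrated with a small sketch. It is not the code in `isco_ham.py`; the helper names `isco_ancestors` and `hierarchical_prf` are hypothetical, and the sketch assumes that the ancestors of an ISCO-08 code are its prefixes (so "1111" expands to "1", "11", "111", "1111"), that the root is excluded, and that scores are computed for a single sample rather than aggregated over a dataset.

```python
from typing import Set, Tuple


def isco_ancestors(code: str) -> Set[str]:
    """Return the code plus all of its ancestors, assuming prefix-based hierarchy.

    Example: "1111" -> {"1", "11", "111", "1111"}.
    """
    return {code[:i] for i in range(1, len(code) + 1)}


def hierarchical_prf(
    true_code: str, predicted_code: str, beta: float = 1.0
) -> Tuple[float, float, float]:
    """Compute (hP, hR, hF_beta) for one sample (hypothetical helper)."""
    true_set = isco_ancestors(true_code)
    pred_set = isco_ancestors(predicted_code)
    if not true_set or not pred_set:
        raise ValueError("ISCO codes must be non-empty strings")

    overlap = len(true_set & pred_set)
    hp = overlap / len(pred_set)  # hierarchical precision
    hr = overlap / len(true_set)  # hierarchical recall

    denominator = beta**2 * hp + hr
    hf = (beta**2 + 1) * hp * hr / denominator if denominator else 0.0
    return hp, hr, hf


# True class G ("1111") misclassified as I ("1120"):
# the shared ancestors are "1" and "11", so hP = hR = 2/4.
hp, hr, hf = hierarchical_prf("1111", "1120")
print(hp, hr, hf)  # 0.5 0.5 0.5
```

Under these assumptions the $G$ versus $I$ example reproduces the value of 1/2 shown in the docstring for both $hP$ and $hR$.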
metric_template_1.py CHANGED
@@ -37,8 +37,8 @@ _CITATION = """
37
  _DESCRIPTION = """
38
  The ISCO-08 Hierarchical Accuracy Measure is an implementation
39
  of the measure described in [Functional Annotation of Genes Using Hierarchical Text Categorization](https://www.researchgate.net/publication/44046343_Functional_Annotation_of_Genes_Using_Hierarchical_Text_Categorization)
40
- (Kiritchenko, Svetlana and Famili, Fazel. 2005) with the ISCO-08 taxonomy by the International Labour Organization.\n
41
- \n
42
  1. The measure gives credit to partially correct classification,
43
  e.g. misclassification into node $I$ (ISCO unit group "1120")
44
  when the correct category is $G$ (ISCO unit group "1111")
@@ -47,24 +47,24 @@ should be penalized less than misclassification into node $D$
47
  as $G$ and $D$ is not.
48
  2. The measure punishes distant errors more heavily:
49
  1. the measure gives higher evaluation for correctly classifying one level down compared to staying at the parent node, e.g. classification into node $E$ (ISCO minor group "111") is better than classification into its parent $C$ (ISCO sub-major group "11") since $E$ is closer to the correct category $G$;
50
- 2. the measure gives lower evaluation for incorrectly classifying one level down comparing to staying at the parent node, e.g. classification into node $F$ (ISCO minor group "112") is worse than classification into its parent $C$ since $F$ is farther away from $G$.\n
51
- \n
52
- The features described are accomplished by pairing hierarchical variants of precision ($hP$) and recall ($hR$) to form a hierarchical F1 (hF_β) score where each sample belongs not only to its class (e.g., a unit group level code), but also to all ancestors of the class in a hierarchical graph (i.e., the minor, sub-major, and major group level codes).\n
53
- \n
54
- Hierarchical precision can be computed with:\n
55
- $hP = \frac{| \v{C}_i ∩ \v{C}^′_i|} {|\v{C}^′_i |} = \frac{1}{2}$\n
56
- \n
57
- Hierarchical recall can be computed with:\n
58
- $hR = \frac{| \v{C}_i ∩ \v{C}^′_i|} {|\v{C}_i |} = \frac{1}{2}$\n
59
- \n
60
- Combining the two values $hP$ and $hR$ into one hF-measure:\n
61
- hF_β = \frac{(β^2 + 1) · hP · hR}{(β^2 · hP + hR)}, β ∈ [0, +∞)\n
62
- \n
63
- Note:\n
64
- **TP**: True positive\n
65
- **TN**: True negative\n
66
- **FP**: False positive\n
67
- **FN**: False negative\n
68
  """
69
 
70
  _KWARGS_DESCRIPTION = """
 
37
  _DESCRIPTION = """
38
  The ISCO-08 Hierarchical Accuracy Measure is an implementation
39
  of the measure described in [Functional Annotation of Genes Using Hierarchical Text Categorization](https://www.researchgate.net/publication/44046343_Functional_Annotation_of_Genes_Using_Hierarchical_Text_Categorization)
40
+ (Kiritchenko, Svetlana and Famili, Fazel. 2005) with the ISCO-08 taxonomy by the International Labour Organization.
41
+
42
  1. The measure gives credit to partially correct classification,
43
  e.g. misclassification into node $I$ (ISCO unit group "1120")
44
  when the correct category is $G$ (ISCO unit group "1111")
 
47
  as $G$ and $D$ is not.
48
  2. The measure punishes distant errors more heavily:
49
  1. the measure gives higher evaluation for correctly classifying one level down compared to staying at the parent node, e.g. classification into node $E$ (ISCO minor group "111") is better than classification into its parent $C$ (ISCO sub-major group "11") since $E$ is closer to the correct category $G$;
50
+ 2. the measure gives lower evaluation for incorrectly classifying one level down compared to staying at the parent node, e.g. classification into node $F$ (ISCO minor group "112") is worse than classification into its parent $C$ since $F$ is farther away from $G$.
51
+
52
+ These properties are achieved by pairing hierarchical variants of precision ($hP$) and recall ($hR$) to form a hierarchical F-measure ($hF_β$), where each sample belongs not only to its class (e.g., a unit group level code) but also to all ancestors of that class in the hierarchy (i.e., the minor, sub-major, and major group level codes).
53
+
54
+ Hierarchical precision can be computed with:
55
+ $hP = \frac{|\hat{C}_i ∩ \hat{C}'_i|}{|\hat{C}'_i|} = \frac{1}{2}$
56
+
57
+ Hierarchical recall can be computed with:
58
+ $hR = \frac{|\hat{C}_i ∩ \hat{C}'_i|}{|\hat{C}_i|} = \frac{1}{2}$
59
+
60
+ Combining the two values $hP$ and $hR$ into one hF-measure:
61
+ $hF_β = \frac{(β^2 + 1) · hP · hR}{β^2 · hP + hR}, β ∈ [0, +∞)$
62
+
63
+ Note:
64
+ **TP**: True positive
65
+ **TN**: True negative
66
+ **FP**: False positive
67
+ **FN**: False negative
68
  """
69
 
70
  _KWARGS_DESCRIPTION = """
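
The ordering claimed in item 2 of the description (predicting $E$ is better than predicting $C$, which is better than predicting $F$, when the true class is $G$) can be checked numerically. The sketch below is only illustrative: `ancestors` and `hf1` are hypothetical helpers, β is fixed to 1, and ancestors are again assumed to be the prefixes of the ISCO-08 code.

```python
def ancestors(code):
    """Prefix-based ancestor set, e.g. "112" -> {"1", "11", "112"} (assumption)."""
    return {code[:i] for i in range(1, len(code) + 1)}


def hf1(true_code, predicted_code):
    """Hierarchical F-measure with beta = 1 for a single sample."""
    true_set, pred_set = ancestors(true_code), ancestors(predicted_code)
    overlap = len(true_set & pred_set)
    if overlap == 0:
        return 0.0
    hp = overlap / len(pred_set)
    hr = overlap / len(true_set)
    return 2 * hp * hr / (hp + hr)


true_code = "1111"  # node G in the description
candidates = {"E": "111", "C": "11", "F": "112", "I": "1120"}
scores = {node: hf1(true_code, code) for node, code in candidates.items()}

for node, score in scores.items():
    print(node, round(score, 3))
# E 0.857
# C 0.667
# F 0.571
# I 0.5

# Correctly going one level down (E) beats staying at the parent (C),
# which in turn beats incorrectly going one level down (F).
assert scores["E"] > scores["C"] > scores["F"]
```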