5roop committed on
Commit
042c5ff
1 Parent(s): 9df745c

Update README.md

Files changed (1)
  1. README.md +9 -10
README.md CHANGED
@@ -7,7 +7,7 @@ language:
7
  ---
8
  # XLM-R-BERTić
9
 
10
- This model was produced by pre-training [XLM-Roberta-large](https://huggingface.co/xlm-roberta-large) 48k steps on South Slavic languages.
11
 
12
  # Benchmarking
13
  Three tasks were chosen for model evaluation:
@@ -24,7 +24,7 @@ Mean F1 scores were used to evaluate performance.
24
 
25
  | system | dataset | F1 score |
26
  |:-----------------------------------------------------------------------|:--------|---------:|
27
- | **XLM-R-BERTić** (this model) | hr500k | 0.927 |
28
  | [BERTić](https://huggingface.co/classla/bcms-bertic) | hr500k | 0.925 |
29
  | [XLM-R-SloBERTić](https://huggingface.co/classla/xlm-r-slobertic) | hr500k | 0.923 |
30
  | [XLM-Roberta-Large](https://huggingface.co/xlm-roberta-large) | hr500k | 0.919 |
@@ -34,7 +34,7 @@ Mean F1 scores were used to evaluate performance.
34
  | system | dataset | F1 score |
35
  |:-----------------------------------------------------------------------|:---------|---------:|
36
  | [XLM-R-SloBERTić](https://huggingface.co/classla/xlm-r-slobertic) | ReLDI-hr | 0.812 |
37
- | **XLM-R-BERTić** (this model) | ReLDI-hr | 0.809 |
38
  | [crosloengual-bert](https://huggingface.co/EMBEDDIA/crosloengual-bert) | ReLDI-hr | 0.794 |
39
  | [BERTić](https://huggingface.co/classla/bcms-bertic) | ReLDI-hr | 0.792 |
40
  | [XLM-Roberta-Large](https://huggingface.co/xlm-roberta-large) | ReLDI-hr | 0.791 |
@@ -43,7 +43,7 @@ Mean F1 scores were used to evaluate performance.
43
  | system | dataset | F1 score |
44
  |:-----------------------------------------------------------------------|:-----------|---------:|
45
  | [XLM-R-SloBERTić](https://huggingface.co/classla/xlm-r-slobertic) | SETimes.SR | 0.949 |
46
- | **XLM-R-BERTić** (this model) | SETimes.SR | 0.940 |
47
  | [BERTić](https://huggingface.co/classla/bcms-bertic) | SETimes.SR | 0.936 |
48
  | [XLM-Roberta-Large](https://huggingface.co/xlm-roberta-large) | SETimes.SR | 0.933 |
49
  | [crosloengual-bert](https://huggingface.co/EMBEDDIA/crosloengual-bert) | SETimes.SR | 0.922 |
@@ -51,7 +51,7 @@ Mean F1 scores were used to evaluate performance.
51
 
52
  | system | dataset | F1 score |
53
  |:-----------------------------------------------------------------------|:---------|---------:|
54
- | **XLM-R-BERTić** (this model) | ReLDI-sr | 0.841 |
55
  | [XLM-R-SloBERTić](https://huggingface.co/classla/xlm-r-slobertic) | ReLDI-sr | 0.824 |
56
  | [BERTić](https://huggingface.co/classla/bcms-bertic) | ReLDI-sr | 0.798 |
57
  | [XLM-Roberta-Large](https://huggingface.co/xlm-roberta-large) | ReLDI-sr | 0.774 |
@@ -69,7 +69,7 @@ The procedure is explained in greater detail in the dedicated [benchmarking repo
69
  | [BERTić](https://huggingface.co/classla/bcms-bertic) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.612 |
70
  | [XLM-R-SloBERTić](https://huggingface.co/classla/xlm-r-slobertic) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.607 |
71
  | [XLM-Roberta-Large](https://huggingface.co/xlm-roberta-large) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.605 |
72
- | **XLM-R-BERTić** (this model) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.601 |
73
  | [crosloengual-bert](https://huggingface.co/EMBEDDIA/crosloengual-bert) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.537 |
74
  | [XLM-Roberta-Base](https://huggingface.co/xlm-roberta-base) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.500 |
75
  | dummy (mean) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | -0.12 |
@@ -77,12 +77,13 @@ The procedure is explained in greater detail in the dedicated [benchmarking repo
77
 
78
  ## COPA
79
 
 
80
 
81
  | system | dataset | Accuracy score |
82
  |:-----------------------------------------------------------------------|:--------|---------------:|
83
  | [BERTić](https://huggingface.co/classla/bcms-bertic) | Copa-SR | 0.689 |
84
  | [XLM-R-SloBERTić](https://huggingface.co/classla/xlm-r-slobertic) | Copa-SR | 0.665 |
85
- | **XLM-R-BERTić** (this model) | Copa-SR | 0.637 |
86
  | [crosloengual-bert](https://huggingface.co/EMBEDDIA/crosloengual-bert) | Copa-SR | 0.607 |
87
  | [XLM-Roberta-Base](https://huggingface.co/xlm-roberta-base) | Copa-SR | 0.573 |
88
  | [XLM-Roberta-Large](https://huggingface.co/xlm-roberta-large) | Copa-SR | 0.570 |
@@ -92,13 +93,11 @@ The procedure is explained in greater detail in the dedicated [benchmarking repo
92
  |:-----------------------------------------------------------------------|:--------|---------------:|
93
  | [BERTić](https://huggingface.co/classla/bcms-bertic) | Copa-HR | 0.669 |
94
  | [XLM-R-SloBERTić](https://huggingface.co/classla/xlm-r-slobertic) | Copa-HR | 0.628 |
95
- | **XLM-R-BERTić** (this model) | Copa-HR | 0.635 |
96
  | [crosloengual-bert](https://huggingface.co/EMBEDDIA/crosloengual-bert) | Copa-HR | 0.669 |
97
  | [XLM-Roberta-Base](https://huggingface.co/xlm-roberta-base) | Copa-HR | 0.585 |
98
  | [XLM-Roberta-Large](https://huggingface.co/xlm-roberta-large) | Copa-HR | 0.571 |
99
 
100
-
101
-
102
  # Citation
103
  (to be added soon)
104
  # Authors
 
7
  ---
8
  # XLM-R-BERTić
9
 
10
+ This model was produced by pre-training [XLM-Roberta-large](https://huggingface.co/xlm-roberta-large) for 48k steps on South Slavic languages using the [XLM-R-BERTić dataset](https://huggingface.co/datasets/classla/xlm-r-bertic-data).
11
 
12
  # Benchmarking
13
  Three tasks were chosen for model evaluation:
 
24
 
25
  | system | dataset | F1 score |
26
  |:-----------------------------------------------------------------------|:--------|---------:|
27
+ | [XLM-R-BERTić](https://huggingface.co/classla/xlm-r-bertic) | hr500k | 0.927 |
28
  | [BERTić](https://huggingface.co/classla/bcms-bertic) | hr500k | 0.925 |
29
  | [XLM-R-SloBERTić](https://huggingface.co/classla/xlm-r-slobertic) | hr500k | 0.923 |
30
  | [XLM-Roberta-Large](https://huggingface.co/xlm-roberta-large) | hr500k | 0.919 |
 
34
  | system | dataset | F1 score |
35
  |:-----------------------------------------------------------------------|:---------|---------:|
36
  | [XLM-R-SloBERTić](https://huggingface.co/classla/xlm-r-slobertic) | ReLDI-hr | 0.812 |
37
+ | [XLM-R-BERTić](https://huggingface.co/classla/xlm-r-bertic) | ReLDI-hr | 0.809 |
38
  | [crosloengual-bert](https://huggingface.co/EMBEDDIA/crosloengual-bert) | ReLDI-hr | 0.794 |
39
  | [BERTić](https://huggingface.co/classla/bcms-bertic) | ReLDI-hr | 0.792 |
40
  | [XLM-Roberta-Large](https://huggingface.co/xlm-roberta-large) | ReLDI-hr | 0.791 |
 
43
  | system | dataset | F1 score |
44
  |:-----------------------------------------------------------------------|:-----------|---------:|
45
  | [XLM-R-SloBERTić](https://huggingface.co/classla/xlm-r-slobertic) | SETimes.SR | 0.949 |
46
+ | [XLM-R-BERTić](https://huggingface.co/classla/xlm-r-bertic) | SETimes.SR | 0.940 |
47
  | [BERTić](https://huggingface.co/classla/bcms-bertic) | SETimes.SR | 0.936 |
48
  | [XLM-Roberta-Large](https://huggingface.co/xlm-roberta-large) | SETimes.SR | 0.933 |
49
  | [crosloengual-bert](https://huggingface.co/EMBEDDIA/crosloengual-bert) | SETimes.SR | 0.922 |
 
51
 
52
  | system | dataset | F1 score |
53
  |:-----------------------------------------------------------------------|:---------|---------:|
54
+ | [XLM-R-BERTić](https://huggingface.co/classla/xlm-r-bertic) | ReLDI-sr | 0.841 |
55
  | [XLM-R-SloBERTić](https://huggingface.co/classla/xlm-r-slobertic) | ReLDI-sr | 0.824 |
56
  | [BERTić](https://huggingface.co/classla/bcms-bertic) | ReLDI-sr | 0.798 |
57
  | [XLM-Roberta-Large](https://huggingface.co/xlm-roberta-large) | ReLDI-sr | 0.774 |
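
The F1 tables report mean F1 scores, but the diff does not show how the mean is taken. The following is a minimal sketch under two assumptions that are not confirmed by the README: F1 is computed per label at token level and macro-averaged, and the reported figure is the mean over several fine-tuning runs. The function names are hypothetical, and the dedicated benchmarking repository remains the authoritative description.

```python
from statistics import mean


def f1_for_label(gold, pred, label):
    """F1 for a single label, given parallel lists of gold and predicted tags."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    return mean(f1_for_label(gold, pred, label) for label in labels)


def mean_f1_over_runs(runs):
    """Average macro F1 over several (gold, pred) pairs, one pair per run (assumed setup)."""
    return mean(macro_f1(gold, pred) for gold, pred in runs)
```

For example, `macro_f1(["O", "PER", "O", "LOC"], ["O", "PER", "LOC", "LOC"])` averages the per-label scores 2/3 (`LOC`), 2/3 (`O`) and 1 (`PER`).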
 
69
  | [BERTić](https://huggingface.co/classla/bcms-bertic) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.612 |
70
  | [XLM-R-SloBERTić](https://huggingface.co/classla/xlm-r-slobertic) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.607 |
71
  | [XLM-Roberta-Large](https://huggingface.co/xlm-roberta-large) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.605 |
72
+ | [XLM-R-BERTić](https://huggingface.co/classla/xlm-r-bertic) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.601 |
73
  | [crosloengual-bert](https://huggingface.co/EMBEDDIA/crosloengual-bert) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.537 |
74
  | [XLM-Roberta-Base](https://huggingface.co/xlm-roberta-base) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | 0.500 |
75
  | dummy (mean) | ParlaSent_BCS.jsonl | ParlaSent_BCS_test.jsonl | -0.12 |
 
77
 
78
  ## COPA
79
 
80
+ Two South Slavic COPA datasets were used: [COPA-HR](https://huggingface.co/datasets/classla/copa_hr) and [COPA-SR_lat](https://huggingface.co/datasets/classla/COPA-SR_lat).
81
 
82
  | system | dataset | Accuracy score |
83
  |:-----------------------------------------------------------------------|:--------|---------------:|
84
  | [BERTić](https://huggingface.co/classla/bcms-bertic) | Copa-SR | 0.689 |
85
  | [XLM-R-SloBERTić](https://huggingface.co/classla/xlm-r-slobertic) | Copa-SR | 0.665 |
86
+ | [XLM-R-BERTić](https://huggingface.co/classla/xlm-r-bertic) | Copa-SR | 0.637 |
87
  | [crosloengual-bert](https://huggingface.co/EMBEDDIA/crosloengual-bert) | Copa-SR | 0.607 |
88
  | [XLM-Roberta-Base](https://huggingface.co/xlm-roberta-base) | Copa-SR | 0.573 |
89
  | [XLM-Roberta-Large](https://huggingface.co/xlm-roberta-large) | Copa-SR | 0.570 |
 
93
  |:-----------------------------------------------------------------------|:--------|---------------:|
94
  | [BERTić](https://huggingface.co/classla/bcms-bertic) | Copa-HR | 0.669 |
95
  | [XLM-R-SloBERTić](https://huggingface.co/classla/xlm-r-slobertic) | Copa-HR | 0.628 |
96
+ | [XLM-R-BERTić](https://huggingface.co/classla/xlm-r-bertic) | Copa-HR | 0.635 |
97
  | [crosloengual-bert](https://huggingface.co/EMBEDDIA/crosloengual-bert) | Copa-HR | 0.669 |
98
  | [XLM-Roberta-Base](https://huggingface.co/xlm-roberta-base) | Copa-HR | 0.585 |
99
  | [XLM-Roberta-Large](https://huggingface.co/xlm-roberta-large) | Copa-HR | 0.571 |
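
The COPA tables report accuracy. As a reading aid, here is a minimal sketch of that metric, assuming each example has one gold choice index and one predicted choice index. The function name and the input format are illustrative and are not taken from the benchmarking code.

```python
def accuracy(gold_labels, predicted_labels):
    """Fraction of examples where the predicted choice equals the gold choice."""
    if len(gold_labels) != len(predicted_labels):
        raise ValueError("gold and predicted label lists must have the same length")
    if not gold_labels:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(1 for g, p in zip(gold_labels, predicted_labels) if g == p)
    return correct / len(gold_labels)
```

Since COPA offers two alternatives per premise, a uniformly random guesser would be expected to score about 0.5, which gives a reference point for the scores in the tables.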
100
 
 
 
101
  # Citation
102
  (to be added soon)
103
  # Authors