mcse-flickr-roberta-base / eval_results.log
2021-10-07 16:25:08,199 : ***** Transfer task : STS12 *****
2021-10-07 16:25:11,524 : MSRpar : pearson = 0.6385, spearman = 0.6277, align_loss = 0.1472, uniform_loss = -1.9531
2021-10-07 16:25:12,835 : MSRvid : pearson = 0.8927, spearman = 0.8893, align_loss = 0.1807, uniform_loss = -1.9198
2021-10-07 16:25:13,964 : SMTeuroparl : pearson = 0.5341, spearman = 0.5812, align_loss = 0.1966, uniform_loss = -1.3802
2021-10-07 16:25:15,951 : surprise.OnWN : pearson = 0.7567, spearman = 0.7188, align_loss = 0.2179, uniform_loss = -1.9324
2021-10-07 16:25:17,066 : surprise.SMTnews : pearson = 0.7105, spearman = 0.6142, align_loss = 0.1903, uniform_loss = -1.4507
2021-10-07 16:25:17,071 : ALL : Pearson = 0.8081, Spearman = 0.7174, align_loss = 0.1855, uniform_loss = -1.7752
2021-10-07 16:25:17,071 : ALL (weighted average) : Pearson = 0.7222, Spearman = 0.7042, align_loss = 0.1852, uniform_loss = -1.7910
2021-10-07 16:25:17,072 : ALL (average) : Pearson = 0.7065, Spearman = 0.6862, align_loss = 0.1865, uniform_loss = -1.7272
2021-10-07 16:25:17,077 : ***** Transfer task : STS13 (-SMT) *****
2021-10-07 16:25:18,125 : FNWN : pearson = 0.6377, spearman = 0.6645, align_loss = 0.2690, uniform_loss = -1.7120
2021-10-07 16:25:19,856 : headlines : pearson = 0.7919, spearman = 0.7916, align_loss = 0.1893, uniform_loss = -1.9193
2021-10-07 16:25:20,985 : OnWN : pearson = 0.8636, spearman = 0.8378, align_loss = 0.2356, uniform_loss = -1.8226
2021-10-07 16:25:20,988 : ALL : Pearson = 0.8223, Spearman = 0.8260, align_loss = 0.2194, uniform_loss = -1.8502
2021-10-07 16:25:20,988 : ALL (weighted average) : Pearson = 0.7993, Spearman = 0.7928, align_loss = 0.2167, uniform_loss = -1.8570
2021-10-07 16:25:20,988 : ALL (average) : Pearson = 0.7644, Spearman = 0.7646, align_loss = 0.2313, uniform_loss = -1.8180
2021-10-07 16:25:20,989 : ***** Transfer task : STS14 *****
2021-10-07 16:25:22,216 : deft-forum : pearson = 0.5599, spearman = 0.5425, align_loss = 0.2140, uniform_loss = -1.7958
2021-10-07 16:25:23,530 : deft-news : pearson = 0.7945, spearman = 0.7483, align_loss = 0.1556, uniform_loss = -1.8096
2021-10-07 16:25:25,363 : headlines : pearson = 0.7907, spearman = 0.7687, align_loss = 0.1759, uniform_loss = -1.9653
2021-10-07 16:25:26,984 : images : pearson = 0.8765, spearman = 0.8525, align_loss = 0.1784, uniform_loss = -2.0066
2021-10-07 16:25:28,642 : OnWN : pearson = 0.8735, spearman = 0.8602, align_loss = 0.2378, uniform_loss = -1.8534
2021-10-07 16:25:30,908 : tweet-news : pearson = 0.7726, spearman = 0.7054, align_loss = 0.2983, uniform_loss = -1.8457
2021-10-07 16:25:30,913 : ALL : Pearson = 0.7965, Spearman = 0.7567, align_loss = 0.2150, uniform_loss = -1.8916
2021-10-07 16:25:30,913 : ALL (weighted average) : Pearson = 0.7934, Spearman = 0.7623, align_loss = 0.2162, uniform_loss = -1.8945
2021-10-07 16:25:30,913 : ALL (average) : Pearson = 0.7779, Spearman = 0.7463, align_loss = 0.2100, uniform_loss = -1.8794
2021-10-07 16:25:30,918 : ***** Transfer task : STS15 *****
2021-10-07 16:25:32,577 : answers-forums : pearson = 0.7341, spearman = 0.7387, align_loss = 0.3866, uniform_loss = -1.9430
2021-10-07 16:25:34,257 : answers-students : pearson = 0.7654, spearman = 0.7759, align_loss = 0.2215, uniform_loss = -1.2486
2021-10-07 16:25:36,065 : belief : pearson = 0.8094, spearman = 0.8136, align_loss = 0.2693, uniform_loss = -1.7862
2021-10-07 16:25:38,331 : headlines : pearson = 0.8290, spearman = 0.8334, align_loss = 0.1846, uniform_loss = -1.9589
2021-10-07 16:25:40,201 : images : pearson = 0.9071, spearman = 0.9132, align_loss = 0.1945, uniform_loss = -2.0353
2021-10-07 16:25:40,206 : ALL : Pearson = 0.8362, Spearman = 0.8449, align_loss = 0.2321, uniform_loss = -1.7768
2021-10-07 16:25:40,206 : ALL (weighted average) : Pearson = 0.8183, Spearman = 0.8247, align_loss = 0.2321, uniform_loss = -1.7768
2021-10-07 16:25:40,206 : ALL (average) : Pearson = 0.8090, Spearman = 0.8150, align_loss = 0.2513, uniform_loss = -1.7944
2021-10-07 16:25:40,211 : ***** Transfer task : STS16 *****
2021-10-07 16:25:41,053 : answer-answer : pearson = 0.7412, spearman = 0.7434, align_loss = 0.2435, uniform_loss = -1.4824
2021-10-07 16:25:41,737 : headlines : pearson = 0.8196, spearman = 0.8381, align_loss = 0.1571, uniform_loss = -1.9802
2021-10-07 16:25:42,487 : plagiarism : pearson = 0.8495, spearman = 0.8620, align_loss = 0.1564, uniform_loss = -1.6272
2021-10-07 16:25:43,891 : postediting : pearson = 0.8548, spearman = 0.8739, align_loss = 0.1171, uniform_loss = -1.7985
2021-10-07 16:25:44,530 : question-question : pearson = 0.7249, spearman = 0.7206, align_loss = 0.1987, uniform_loss = -1.7836
2021-10-07 16:25:44,533 : ALL : Pearson = 0.7944, Spearman = 0.8074, align_loss = 0.1746, uniform_loss = -1.7344
2021-10-07 16:25:44,534 : ALL (weighted average) : Pearson = 0.7992, Spearman = 0.8091, align_loss = 0.1746, uniform_loss = -1.7331
2021-10-07 16:25:44,534 : ALL (average) : Pearson = 0.7980, Spearman = 0.8076, align_loss = 0.1746, uniform_loss = -1.7344
2021-10-07 16:25:44,536 :
***** Transfer task : STSBenchmark*****
2021-10-07 16:26:05,137 : train : pearson = 0.8198, spearman = 0.7975, align_loss = 0.1852, uniform_loss = -1.9620
2021-10-07 16:26:11,036 : dev : pearson = 0.8570, spearman = 0.8597, align_loss = 0.2060, uniform_loss = -1.9924
2021-10-07 16:26:16,108 : test : pearson = 0.8090, spearman = 0.8152, align_loss = 0.1776, uniform_loss = -1.9208
2021-10-07 16:26:16,137 : ALL : Pearson = 0.8258, Spearman = 0.8145, align_loss = 0.1876, uniform_loss = -1.9607
2021-10-07 16:26:16,137 : ALL (weighted average) : Pearson = 0.8246, Spearman = 0.8112, align_loss = 0.1876, uniform_loss = -1.9607
2021-10-07 16:26:16,137 : ALL (average) : Pearson = 0.8286, Spearman = 0.8241, align_loss = 0.1896, uniform_loss = -1.9584
2021-10-07 16:26:16,147 :
***** Transfer task : SICKRelatedness*****
2021-10-07 16:26:28,880 : train : pearson = 0.8053, spearman = 0.7307, align_loss = 0.1797, uniform_loss = -1.9860
2021-10-07 16:26:30,456 : dev : pearson = 0.8165, spearman = 0.7514, align_loss = 0.1858, uniform_loss = -2.1245
2021-10-07 16:26:44,329 : test : pearson = 0.7953, spearman = 0.7230, align_loss = 0.1798, uniform_loss = -1.9833
2021-10-07 16:26:44,336 : ALL : Pearson = 0.8009, Spearman = 0.7280, align_loss = 0.1800, uniform_loss = -1.9917
2021-10-07 16:26:44,336 : ALL (weighted average) : Pearson = 0.8009, Spearman = 0.7279, align_loss = 0.1800, uniform_loss = -1.9916
2021-10-07 16:26:44,336 : ALL (average) : Pearson = 0.8057, Spearman = 0.7350, align_loss = 0.1817, uniform_loss = -2.0313
2021-10-07 16:26:44,336 : ------ test ------
2021-10-07 16:26:44,337 : +--------+--------+--------+--------+--------+--------------+-----------------+--------+
| STS12  | STS13  | STS14  | STS15  | STS16  | STSBenchmark | SICKRelatedness |  Avg.  |
+--------+--------+--------+--------+--------+--------------+-----------------+--------+
| 71.74  | 82.60  | 75.67  | 84.49  | 80.74  |    81.52     |      72.30      | 78.44  |
| 0.185  | 0.219  | 0.215  | 0.232  | 0.175  |    0.178     |      0.180      | 0.198  |
| -1.775 | -1.850 | -1.892 | -1.777 | -1.734 |    -1.921    |      -1.983     | -1.847 |
+--------+--------+--------+--------+--------+--------------+-----------------+--------+
2021-10-07 16:26:44,338 : +------+------+------+------+------+------+------+------+
|  MR  |  CR  | SUBJ | MPQA | SST2 | TREC | MRPC | Avg. |
+------+------+------+------+------+------+------+------+
| 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
+------+------+------+------+------+------+------+------+
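
The first row of the STS summary table above appears to be derived from the per-task log lines: the Spearman value on the "ALL" line for STS12 through STS16, the Spearman value on the "test" line for STSBenchmark and SICKRelatedness, each scaled by 100, with "Avg." as their unweighted mean. This reading is an assumption inferred from the numbers in this log, not taken from the evaluation script itself. The sketch below recomputes the row under that assumption; the function name `summary_row` is hypothetical.

```python
# Spearman values copied from the log above: the "ALL" line for STS12-16,
# the "test" line for STSBenchmark and SICKRelatedness.
spearman = {
    "STS12": 0.7174,
    "STS13": 0.8260,
    "STS14": 0.7567,
    "STS15": 0.8449,
    "STS16": 0.8074,
    "STSBenchmark": 0.8152,
    "SICKRelatedness": 0.7230,
}


def summary_row(scores):
    """Scale each score to a percentage and append the unweighted mean.

    Hypothetical helper: it mirrors how the table row seems to be built,
    not the actual code that produced this log.
    """
    percents = [100 * s for s in scores.values()]
    percents.append(sum(percents) / len(percents))
    return ["%.2f" % p for p in percents]


row = summary_row(spearman)
print(" | ".join(row))
```

Under this assumption the recomputed values agree with the logged row, including the 78.44 average. The same procedure applied to the align_loss and uniform_loss columns should correspond to the second and third rows of that table.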