Zemulax/masked-lm-tpu

This model is a fine-tuned version of roberta-base; the training dataset is not recorded. After the final epoch it achieves the following results on the training and evaluation sets:

Train Loss: 7.7770
Train Accuracy: 0.0241
Validation Loss: 7.7589
Validation Accuracy: 0.0230
Epoch: 98

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0001, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 223250, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 11750, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
training_precision: float32
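
The optimizer config above describes a linear warmup followed by a linear (power 1.0) polynomial decay to zero. As a rough illustration, the schedule can be sketched in plain Python. The function name `learning_rate` and its parameter names are hypothetical; only the numeric defaults come from the config, and the formulas assume the usual behaviour of the `WarmUp` and `PolynomialDecay` schedules (warmup scales the peak rate by `step / warmup_steps`, and the decay is evaluated on the step count measured from the end of warmup).

```python
def learning_rate(step,
                  peak_lr=1e-4,          # 'initial_learning_rate' in the config
                  warmup_steps=11750,    # 'warmup_steps'
                  decay_steps=223250,    # 'decay_steps'
                  end_lr=0.0,            # 'end_learning_rate'
                  warmup_power=1.0,      # outer 'power' (WarmUp)
                  decay_power=1.0):      # inner 'power' (PolynomialDecay)
    """Approximate learning rate at a given optimizer step (a sketch)."""
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to the peak rate.
        return peak_lr * (step / warmup_steps) ** warmup_power
    # Decay phase: steps are counted from the end of warmup and
    # clamped so the rate stays at end_lr once decay has finished.
    decay_step = min(step - warmup_steps, decay_steps)
    remaining = 1.0 - decay_step / decay_steps
    return (peak_lr - end_lr) * remaining ** decay_power + end_lr


# Print the rate at a few points along the schedule.
for s in (0, 5875, 11750, 123375, 235000):
    print(s, learning_rate(s))
```

Under these assumptions the rate peaks at 0.0001 at step 11750 and reaches zero at step 235000 (11750 warmup steps plus 223250 decay steps).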

Training results

Train Loss	Train Accuracy	Validation Loss	Validation Accuracy	Epoch
10.2868	0.0	10.2891	0.0	0
10.2817	0.0000	10.2764	0.0	1
10.2772	0.0000	10.2667	0.0000	2
10.2604	0.0000	10.2521	0.0	3
10.2421	0.0000	10.2282	0.0000	4
10.2219	0.0	10.2010	0.0	5
10.1957	0.0	10.1669	0.0	6
10.1667	0.0000	10.1388	0.0000	7
10.1278	0.0000	10.0908	0.0000	8
10.0848	0.0000	10.0405	0.0001	9
10.0496	0.0002	9.9921	0.0007	10
9.9940	0.0010	9.9422	0.0039	11
9.9424	0.0035	9.8765	0.0110	12
9.8826	0.0092	9.8156	0.0182	13
9.8225	0.0155	9.7461	0.0209	14
9.7670	0.0201	9.6768	0.0222	15
9.7065	0.0219	9.6127	0.0222	16
9.6352	0.0227	9.5445	0.0220	17
9.5757	0.0226	9.4795	0.0219	18
9.4894	0.0232	9.3985	0.0222	19
9.4277	0.0234	9.3386	0.0222	20
9.3676	0.0229	9.2753	0.0220	21
9.2980	0.0229	9.2170	0.0219	22
9.2361	0.0233	9.1518	0.0219	23
9.1515	0.0236	9.0827	0.0223	24
9.1171	0.0228	9.0406	0.0218	25
9.0447	0.0234	8.9867	0.0218	26
9.0119	0.0229	8.9307	0.0221	27
8.9625	0.0229	8.8969	0.0221	28
8.9098	0.0230	8.8341	0.0223	29
8.8726	0.0227	8.8118	0.0220	30
8.8574	0.0223	8.7910	0.0219	31
8.7798	0.0231	8.7506	0.0221	32
8.7535	0.0231	8.7055	0.0222	33
8.7333	0.0228	8.6801	0.0223	34
8.6985	0.0231	8.6837	0.0220	35
8.6816	0.0229	8.6243	0.0223	36
8.6356	0.0228	8.6323	0.0217	37
8.6392	0.0225	8.5603	0.0225	38
8.5802	0.0233	8.5722	0.0219	39
8.5825	0.0228	8.5548	0.0220	40
8.5625	0.0228	8.5272	0.0220	41
8.5415	0.0228	8.5200	0.0222	42
8.5124	0.0230	8.4787	0.0222	43
8.4999	0.0229	8.4819	0.0218	44
8.4561	0.0235	8.4453	0.0221	45
8.4854	0.0223	8.4378	0.0220	46
8.4367	0.0229	8.4212	0.0222	47
8.4096	0.0232	8.4033	0.0221	48
8.4162	0.0228	8.3869	0.0221	49
8.4005	0.0229	8.3768	0.0218	50
8.3583	0.0235	8.3470	0.0224	51
8.3428	0.0235	8.3540	0.0221	52
8.3491	0.0231	8.3201	0.0225	53
8.3551	0.0231	8.3382	0.0221	54
8.3186	0.0231	8.3136	0.0219	55
8.3139	0.0226	8.2844	0.0222	56
8.3170	0.0229	8.2740	0.0221	57
8.2886	0.0231	8.2485	0.0223	58
8.2648	0.0233	8.2336	0.0223	59
8.2714	0.0225	8.2321	0.0221	60
8.2446	0.0233	8.2135	0.0223	61
8.2303	0.0230	8.1980	0.0223	62
8.2022	0.0237	8.1996	0.0222	63
8.2222	0.0227	8.1822	0.0222	64
8.1690	0.0236	8.2005	0.0220	65
8.1741	0.0233	8.1446	0.0226	66
8.1990	0.0224	8.1586	0.0219	67
8.1395	0.0236	8.1243	0.0225	68
8.1675	0.0229	8.1275	0.0222	69
8.1432	0.0229	8.1374	0.0217	70
8.1197	0.0234	8.1078	0.0221	71
8.1046	0.0232	8.0991	0.0221	72
8.1013	0.0231	8.0794	0.0222	73
8.0887	0.0228	8.0720	0.0221	74
8.0661	0.0233	8.0573	0.0222	75
8.0548	0.0231	8.0313	0.0226	76
8.0307	0.0235	8.0278	0.0222	77
8.0626	0.0226	8.0084	0.0224	78
8.0276	0.0229	8.0099	0.0221	79
8.0213	0.0231	7.9930	0.0222	80
7.9798	0.0237	7.9742	0.0224	81
8.0135	0.0226	7.9857	0.0218	82
7.9500	0.0235	7.9505	0.0223	83
7.9519	0.0234	7.9711	0.0217	84
7.9616	0.0228	7.9288	0.0223	85
7.9803	0.0225	7.8997	0.0226	86
7.9369	0.0227	7.9015	0.0225	87
7.9309	0.0229	7.9010	0.0224	88
7.9367	0.0226	7.8988	0.0220	89
7.8840	0.0230	7.8774	0.0216	90
7.8785	0.0233	7.8527	0.0225	91
7.8998	0.0226	7.8509	0.0219	92
7.8451	0.0232	7.8488	0.0221	93
7.8596	0.0231	7.8310	0.0222	94
7.8434	0.0231	7.8168	0.0229	95
7.7929	0.0238	7.7815	0.0233	96
7.8174	0.0236	7.7857	0.0232	97
7.7770	0.0241	7.7589	0.0230	98
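
To read the loss values in the table more intuitively, they can be converted to perplexity. This is a minimal sketch that assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state explicitly; the helper name `perplexity` is hypothetical.

```python
import math


def perplexity(cross_entropy_nats):
    """Convert a mean cross-entropy (in nats) to perplexity."""
    return math.exp(cross_entropy_nats)


# Final-epoch validation loss, taken from the table above.
final_val_loss = 7.7589
print(f"validation perplexity: {perplexity(final_val_loss):.0f}")
```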

Framework versions

Transformers 4.30.1
TensorFlow 2.12.0
Tokenizers 0.13.3
