
xlm-roberta-base-finetuned-Parallel-mlm-0.15-base-27OCT

This model is a fine-tuned version of xlm-roberta-base (278M parameters, FP32) on what the repository name indicates is an English–Arabic parallel corpus; the dataset itself is not documented in this card. It achieves the following results on the evaluation set:

  • Loss: 0.9339
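
Assuming this is the standard masked-language-modeling cross-entropy, the final loss corresponds to a pseudo-perplexity of exp(0.9339) ≈ 2.54.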

Model description

More information needed

Intended uses & limitations

More information needed
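
Pending a fuller description, the checkpoint can presumably be queried as a fill-mask model. A minimal usage sketch with the 🤗 Transformers pipeline follows; the repository id and the example sentence are assumptions, not taken from a documented usage section:

```python
from transformers import pipeline

# Hypothetical usage sketch; the repo id below is an assumption,
# not documented in this card.
fill_mask = pipeline(
    "fill-mask",
    model="malduwais/XLM-Roberta-base-Finetuned-EN-AR-Parallel",
)

# XLM-RoBERTa models use <mask> as their mask token.
print(fill_mask("The capital of France is <mask>."))
```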

Training and evaluation data

More information needed
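
The "-mlm-0.15" suffix in the repository name suggests masked language modeling with a 15% masking probability. A minimal sketch of the corresponding data collator, under that assumption:

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

# mlm_probability=0.15 mirrors the "-mlm-0.15" suffix in the model name;
# the training corpus itself is not documented in this card.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=True,
    mlm_probability=0.15,
)
```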

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
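
A minimal sketch of equivalent Hugging Face TrainingArguments for a standard Trainer-based MLM run; the output_dir is a placeholder, and the evaluate-every-100 / log-every-500 step cadence is inferred from the results table below:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="xlm-roberta-base-finetuned-parallel-mlm",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=32,  # 8 x 32 = 256 effective train batch size
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=10,
    eval_strategy="steps",  # validation loss reported every 100 steps
    eval_steps=100,
    logging_steps=500,      # training loss logged every 500 steps ("No log" before)
)
```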

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| No log        | 0.0998 | 100   | 1.4204          |
| No log        | 0.1997 | 200   | 1.3432          |
| No log        | 0.2995 | 300   | 1.3054          |
| No log        | 0.3993 | 400   | 1.2756          |
| 1.5915        | 0.4992 | 500   | 1.2552          |
| 1.5915        | 0.5990 | 600   | 1.2327          |
| 1.5915        | 0.6988 | 700   | 1.2152          |
| 1.5915        | 0.7987 | 800   | 1.2012          |
| 1.5915        | 0.8985 | 900   | 1.2002          |
| 1.3946        | 0.9983 | 1000  | 1.1854          |
| 1.3946        | 1.0982 | 1100  | 1.1824          |
| 1.3946        | 1.1980 | 1200  | 1.1723          |
| 1.3946        | 1.2979 | 1300  | 1.1589          |
| 1.3946        | 1.3977 | 1400  | 1.1490          |
| 1.321         | 1.4975 | 1500  | 1.1387          |
| 1.321         | 1.5974 | 1600  | 1.1356          |
| 1.321         | 1.6972 | 1700  | 1.1252          |
| 1.321         | 1.7970 | 1800  | 1.1259          |
| 1.321         | 1.8969 | 1900  | 1.1182          |
| 1.2735        | 1.9967 | 2000  | 1.1144          |
| 1.2735        | 2.0965 | 2100  | 1.0966          |
| 1.2735        | 2.1964 | 2200  | 1.1005          |
| 1.2735        | 2.2962 | 2300  | 1.0952          |
| 1.2735        | 2.3960 | 2400  | 1.0935          |
| 1.235         | 2.4959 | 2500  | 1.0840          |
| 1.235         | 2.5957 | 2600  | 1.0766          |
| 1.235         | 2.6955 | 2700  | 1.0719          |
| 1.235         | 2.7954 | 2800  | 1.0665          |
| 1.235         | 2.8952 | 2900  | 1.0644          |
| 1.1954        | 2.9950 | 3000  | 1.0656          |
| 1.1954        | 3.0949 | 3100  | 1.0574          |
| 1.1954        | 3.1947 | 3200  | 1.0495          |
| 1.1954        | 3.2945 | 3300  | 1.0475          |
| 1.1954        | 3.3944 | 3400  | 1.0452          |
| 1.1707        | 3.4942 | 3500  | 1.0399          |
| 1.1707        | 3.5940 | 3600  | 1.0363          |
| 1.1707        | 3.6939 | 3700  | 1.0291          |
| 1.1707        | 3.7937 | 3800  | 1.0338          |
| 1.1707        | 3.8936 | 3900  | 1.0348          |
| 1.1509        | 3.9934 | 4000  | 1.0319          |
| 1.1509        | 4.0932 | 4100  | 1.0219          |
| 1.1509        | 4.1931 | 4200  | 1.0214          |
| 1.1509        | 4.2929 | 4300  | 1.0161          |
| 1.1509        | 4.3927 | 4400  | 1.0158          |
| 1.1275        | 4.4926 | 4500  | 1.0153          |
| 1.1275        | 4.5924 | 4600  | 1.0067          |
| 1.1275        | 4.6922 | 4700  | 1.0058          |
| 1.1275        | 4.7921 | 4800  | 1.0097          |
| 1.1275        | 4.8919 | 4900  | 1.0037          |
| 1.1127        | 4.9917 | 5000  | 1.0048          |
| 1.1127        | 5.0916 | 5100  | 1.0022          |
| 1.1127        | 5.1914 | 5200  | 0.9947          |
| 1.1127        | 5.2912 | 5300  | 0.9947          |
| 1.1127        | 5.3911 | 5400  | 0.9907          |
| 1.0944        | 5.4909 | 5500  | 0.9909          |
| 1.0944        | 5.5907 | 5600  | 0.9861          |
| 1.0944        | 5.6906 | 5700  | 0.9858          |
| 1.0944        | 5.7904 | 5800  | 0.9861          |
| 1.0944        | 5.8902 | 5900  | 0.9791          |
| 1.0847        | 5.9901 | 6000  | 0.9787          |
| 1.0847        | 6.0899 | 6100  | 0.9744          |
| 1.0847        | 6.1897 | 6200  | 0.9752          |
| 1.0847        | 6.2896 | 6300  | 0.9712          |
| 1.0847        | 6.3894 | 6400  | 0.9723          |
| 1.0662        | 6.4893 | 6500  | 0.9706          |
| 1.0662        | 6.5891 | 6600  | 0.9688          |
| 1.0662        | 6.6889 | 6700  | 0.9692          |
| 1.0662        | 6.7888 | 6800  | 0.9655          |
| 1.0662        | 6.8886 | 6900  | 0.9637          |
| 1.0559        | 6.9884 | 7000  | 0.9629          |
| 1.0559        | 7.0883 | 7100  | 0.9618          |
| 1.0559        | 7.1881 | 7200  | 0.9622          |
| 1.0559        | 7.2879 | 7300  | 0.9605          |
| 1.0559        | 7.3878 | 7400  | 0.9560          |
| 1.0439        | 7.4876 | 7500  | 0.9562          |
| 1.0439        | 7.5874 | 7600  | 0.9566          |
| 1.0439        | 7.6873 | 7700  | 0.9515          |
| 1.0439        | 7.7871 | 7800  | 0.9514          |
| 1.0439        | 7.8869 | 7900  | 0.9542          |
| 1.0358        | 7.9868 | 8000  | 0.9504          |
| 1.0358        | 8.0866 | 8100  | 0.9502          |
| 1.0358        | 8.1864 | 8200  | 0.9494          |
| 1.0358        | 8.2863 | 8300  | 0.9451          |
| 1.0358        | 8.3861 | 8400  | 0.9461          |
| 1.0242        | 8.4859 | 8500  | 0.9447          |
| 1.0242        | 8.5858 | 8600  | 0.9455          |
| 1.0242        | 8.6856 | 8700  | 0.9441          |
| 1.0242        | 8.7854 | 8800  | 0.9399          |
| 1.0242        | 8.8853 | 8900  | 0.9410          |
| 1.0198        | 8.9851 | 9000  | 0.9391          |
| 1.0198        | 9.0850 | 9100  | 0.9390          |
| 1.0198        | 9.1848 | 9200  | 0.9379          |
| 1.0198        | 9.2846 | 9300  | 0.9382          |
| 1.0198        | 9.3845 | 9400  | 0.9377          |
| 1.0094        | 9.4843 | 9500  | 0.9363          |
| 1.0094        | 9.5841 | 9600  | 0.9354          |
| 1.0094        | 9.6840 | 9700  | 0.9353          |
| 1.0094        | 9.7838 | 9800  | 0.9351          |
| 1.0094        | 9.8836 | 9900  | 0.9342          |
| 1.011         | 9.9835 | 10000 | 0.9339          |

Framework versions

  • Transformers 4.43.4
  • Pytorch 2.1.1+cu121
  • Datasets 3.0.2
  • Tokenizers 0.19.1