|
---
license: apache-2.0
base_model: ibaucells/RoBERTa-ca-CaWikiTC
tags:
- generated_from_trainer
model-index:
- name: test5_balanced_60ep
  results: []
pipeline_tag: zero-shot-classification
---
|
|
|
|
|
|
# test5_balanced_60ep |
|
|
|
This model is a fine-tuned version of [ibaucells/RoBERTa-ca-CaWikiTC](https://huggingface.co/ibaucells/RoBERTa-ca-CaWikiTC) on an unspecified dataset.
|
It achieves the following results on the evaluation set: |
|
- Loss: 1.8209 |
|
|
|
## Model description |
|
|
|
test5_balanced_60ep is a fine-tuned checkpoint of [ibaucells/RoBERTa-ca-CaWikiTC](https://huggingface.co/ibaucells/RoBERTa-ca-CaWikiTC) (judging by the base model's name, a Catalan RoBERTa classifier trained on Catalan Wikipedia topic classification), exposed here through the zero-shot-classification pipeline. No further description has been provided.
|
|
|
## Intended uses & limitations |
|
|
|
No specific intended uses or limitations have been documented. The `pipeline_tag` in the metadata indicates the model is meant to be used with the zero-shot-classification pipeline.
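Assuming the checkpoint is published on the Hugging Face Hub, a minimal inference sketch with the 🤗 Transformers `pipeline` API might look like the following. The repo id, example sentence, and candidate labels are placeholders, not part of the original card; Catalan text is used since the base model appears to be Catalan:

```python
from transformers import pipeline

# Hypothetical repo id -- replace with the actual location of this checkpoint.
classifier = pipeline(
    "zero-shot-classification",
    model="your-username/test5_balanced_60ep",
)

result = classifier(
    "El Barça va guanyar el partit d'ahir.",  # placeholder input
    candidate_labels=["esports", "política", "economia", "cultura"],
)
print(result["labels"][0])  # highest-scoring label
```

The pipeline returns the candidate labels sorted by score, so `result["labels"][0]` is the model's top prediction.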
|
|
|
## Training and evaluation data |
|
|
|
The training and evaluation datasets have not been documented.
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training: |
|
- learning_rate: 5e-05 |
|
- train_batch_size: 10 |
|
- eval_batch_size: 10 |
|
- seed: 42 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- lr_scheduler_warmup_steps: 500 |
|
- num_epochs: 60 |
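The warmup-then-linear-decay schedule implied by these settings can be sketched as plain Python (a minimal sketch: the total step count of 3780 is taken from the training log below, i.e. 63 steps per epoch over 60 epochs):

```python
PEAK_LR = 5e-5       # learning_rate
WARMUP_STEPS = 500   # lr_scheduler_warmup_steps
TOTAL_STEPS = 3780   # 63 steps/epoch * 60 epochs, from the training log

def linear_schedule(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * max(0.0, (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS))

print(linear_schedule(250))   # halfway through warmup -> 2.5e-05
print(linear_schedule(500))   # peak -> 5e-05
print(linear_schedule(3780))  # end of training -> 0.0
```

Note that with 500 warmup steps out of 3780 total, roughly the first 8 epochs run below the peak learning rate.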
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.8299        | 1.0   | 63   | 2.8361          |
| 2.8512        | 2.0   | 126  | 2.8253          |
| 2.8118        | 3.0   | 189  | 2.8376          |
| 2.7238        | 4.0   | 252  | 2.7157          |
| 2.5914        | 5.0   | 315  | 2.6045          |
| 2.5527        | 6.0   | 378  | 2.5580          |
| 2.4197        | 7.0   | 441  | 2.6009          |
| 2.2561        | 8.0   | 504  | 2.3989          |
| 1.9213        | 9.0   | 567  | 2.2129          |
| 1.8147        | 10.0  | 630  | 2.1155          |
| 1.4336        | 11.0  | 693  | 2.0509          |
| 1.2948        | 12.0  | 756  | 1.8276          |
| 0.9833        | 13.0  | 819  | 1.8245          |
| 0.8577        | 14.0  | 882  | 1.7360          |
| 0.665         | 15.0  | 945  | 1.8072          |
| 0.5397        | 16.0  | 1008 | 1.8583          |
| 0.2861        | 17.0  | 1071 | 1.9073          |
| 0.3394        | 18.0  | 1134 | 2.0320          |
| 0.2941        | 19.0  | 1197 | 2.1608          |
| 0.1868        | 20.0  | 1260 | 2.1440          |
| 0.175         | 21.0  | 1323 | 2.2099          |
| 0.3357        | 22.0  | 1386 | 2.3903          |
| 0.1311        | 23.0  | 1449 | 2.4006          |
| 0.103         | 24.0  | 1512 | 2.4874          |
| 0.1323        | 25.0  | 1575 | 2.5191          |
| 0.0928        | 26.0  | 1638 | 2.3582          |
| 0.0872        | 27.0  | 1701 | 2.4494          |
| 0.1271        | 28.0  | 1764 | 2.5821          |
| 0.1796        | 29.0  | 1827 | 2.6190          |
| 0.0479        | 30.0  | 1890 | 2.5549          |
| 0.0197        | 31.0  | 1953 | 2.8739          |
| 0.0222        | 32.0  | 2016 | 2.7927          |
| 0.019         | 33.0  | 2079 | 2.8467          |
| 0.0946        | 34.0  | 2142 | 2.9086          |
| 0.0084        | 35.0  | 2205 | 2.7976          |
| 0.0539        | 36.0  | 2268 | 2.8300          |
| 0.0553        | 37.0  | 2331 | 2.9293          |
| 0.0124        | 38.0  | 2394 | 3.0047          |
| 0.0447        | 39.0  | 2457 | 2.9724          |
| 0.0363        | 40.0  | 2520 | 2.9775          |
| 0.0056        | 41.0  | 2583 | 2.9972          |
| 0.0072        | 42.0  | 2646 | 3.0162          |
| 0.0091        | 43.0  | 2709 | 3.0348          |
| 0.031         | 44.0  | 2772 | 3.0490          |
| 0.0091        | 45.0  | 2835 | 3.0648          |
| 0.029         | 46.0  | 2898 | 3.0730          |
| 0.0076        | 47.0  | 2961 | 3.0866          |
| 0.0115        | 48.0  | 3024 | 3.0979          |
| 0.0047        | 49.0  | 3087 | 3.1102          |
| 0.0079        | 50.0  | 3150 | 3.1242          |
| 0.0039        | 51.0  | 3213 | 3.1299          |
| 0.0043        | 52.0  | 3276 | 3.1386          |
| 0.0322        | 53.0  | 3339 | 3.1478          |
| 0.0041        | 54.0  | 3402 | 3.1478          |
| 0.0289        | 55.0  | 3465 | 3.1570          |
| 0.0038        | 56.0  | 3528 | 3.1661          |
| 0.004         | 57.0  | 3591 | 3.1546          |
| 0.0037        | 58.0  | 3654 | 3.1575          |
| 0.0069        | 59.0  | 3717 | 3.1587          |
| 0.0135        | 60.0  | 3780 | 3.1626          |

Validation loss bottoms out around epoch 14 (1.7360) and climbs steadily afterwards while training loss approaches zero, so later checkpoints are increasingly overfit.
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.38.2 |
|
- Pytorch 2.1.0+cu121 |
|
- Datasets 2.18.0 |
|
- Tokenizers 0.15.2 |