---
base_model: MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: mDeBERTa-v3-base-xnli-multilingual-nli-2mil7
  results: []
datasets:
- asadfgglie/nli-zh-tw-all
language:
- zh
pipeline_tag: zero-shot-classification
---

# mDeBERTa-v3-base-xnli-multilingual-zeroshot-v4.0-only-nli-downsample

This model uses the same data as [asadfgglie/mDeBERTa-v3-base-xnli-multilingual-zeroshot-v1.0](https://huggingface.co/asadfgglie/mDeBERTa-v3-base-xnli-multilingual-zeroshot-v1.0), but the NLI training set was downsampled to 80% of the size of the non-NLI dataset [asadfgglie/BanBan_2024-10-17-facial_expressions-nli](https://huggingface.co/datasets/asadfgglie/BanBan_2024-10-17-facial_expressions-nli).

This model is a fine-tuned version of [MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7) on the [asadfgglie/nli-zh-tw-all](https://huggingface.co/datasets/asadfgglie/nli-zh-tw-all) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4486
- F1 Macro: 0.8264
- F1 Micro: 0.8274
- Accuracy Balanced: 0.8270
- Accuracy: 0.8274
- Precision Macro: 0.8260
- Recall Macro: 0.8270
- Precision Micro: 0.8274
- Recall Micro: 0.8274

## Model description

More information needed

## Intended uses & limitations

More information needed. The card is tagged for zero-shot classification; a usage sketch follows.
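
The card itself does not include usage code. Below is a minimal sketch using the `transformers` zero-shot classification pipeline; the repo id is assumed from this card's title, and the Chinese hypothesis template and labels are illustrative, not from the original card.

```python
# Minimal usage sketch; assumptions are flagged in the comments below.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    # Repo id assumed from the card title; adjust if the model lives elsewhere.
    model="asadfgglie/mDeBERTa-v3-base-xnli-multilingual-zeroshot-v4.0-only-nli-downsample",
)

text = "今天的會議改到下午三點"  # "Today's meeting is moved to 3 p.m."
candidate_labels = ["行程安排", "體育", "美食"]  # schedule / sports / food
# The Chinese hypothesis template is an assumption; any template with a {} slot works.
print(classifier(text, candidate_labels, hypothesis_template="這個例子是關於{}。"))
```

The pipeline scores each candidate label by filling it into the hypothesis template and treating the result as an NLI hypothesis against the input text.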

## Training and evaluation data

Training data comes from [asadfgglie/nli-zh-tw-all](https://huggingface.co/datasets/asadfgglie/nli-zh-tw-all) (the same data as the v1.0 model, downsampled as described above). Evaluation covers the test splits of that dataset and of [asadfgglie/BanBan_2024-10-17-facial_expressions-nli](https://huggingface.co/datasets/asadfgglie/BanBan_2024-10-17-facial_expressions-nli); see the eval results table below. A sketch of the downsampling step follows.
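
The preprocessing script is not part of this card. This is a minimal sketch of one plausible reading of the downsampling described above, assuming both datasets expose a `train` split and that the target size is 80% of the BanBan dataset's size:

```python
# Hypothetical sketch of the downsampling step; the split names and the exact
# target-size rule are assumptions, not the author's published script.
from datasets import load_dataset

nli = load_dataset("asadfgglie/nli-zh-tw-all", split="train")
banban = load_dataset("asadfgglie/BanBan_2024-10-17-facial_expressions-nli", split="train")

target_size = int(0.8 * len(banban))  # "80% of the size of the non-NLI dataset"
nli_downsampled = nli.shuffle(seed=20241201).select(range(target_size))
```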

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 128
- seed: 20241201
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.06
- num_epochs: 3
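
The training script is not included in the card; the following is a minimal sketch of a `TrainingArguments` object matching the reported settings (the listed Adam betas and epsilon are the library defaults, and `output_dir` is a placeholder):

```python
# Sketch of a transformers.TrainingArguments matching the hyperparameter list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./out",              # placeholder, not from the card
    learning_rate=2e-5,
    per_device_train_batch_size=16,  # train_batch_size
    per_device_eval_batch_size=128,  # eval_batch_size
    seed=20241201,
    gradient_accumulation_steps=2,   # 16 * 2 = 32 total train batch size
    lr_scheduler_type="linear",
    warmup_ratio=0.06,
    num_train_epochs=3,
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are the defaults.
)
```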

### Training results

| Training Loss | Epoch | Step | Validation Loss | F1 Macro | F1 Micro | Accuracy Balanced | Accuracy | Precision Macro | Recall Macro | Precision Micro | Recall Micro |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:-----------------:|:--------:|:---------------:|:------------:|:---------------:|:------------:|
| 0.3242 | 1.69 | 200 | 0.4044 | 0.8308 | 0.8312 | 0.8322 | 0.8312 | 0.8306 | 0.8322 | 0.8312 | 0.8312 |

### Eval results

| Metric | asadfgglie/nli-zh-tw-all/test | asadfgglie/BanBan_2024-10-17-facial_expressions-nli/test | eval_dataset | test_dataset |
| :---: | :---: | :---: | :---: | :---: |
| eval_loss | 0.445 | 1.142 | 0.429 | 0.449 |
| eval_f1_macro | 0.827 | 0.505 | 0.83 | 0.826 |
| eval_f1_micro | 0.828 | 0.55 | 0.831 | 0.827 |
| eval_accuracy_balanced | 0.828 | 0.548 | 0.831 | 0.827 |
| eval_accuracy | 0.828 | 0.55 | 0.831 | 0.827 |
| eval_precision_macro | 0.827 | 0.575 | 0.83 | 0.826 |
| eval_recall_macro | 0.828 | 0.548 | 0.831 | 0.827 |
| eval_precision_micro | 0.828 | 0.55 | 0.831 | 0.827 |
| eval_recall_micro | 0.828 | 0.55 | 0.831 | 0.827 |
| eval_runtime (s) | 275.581 | 4.734 | 54.573 | 209.065 |
| eval_samples_per_second | 30.844 | 199.853 | 31.151 | 32.526 |
| eval_steps_per_second | 0.243 | 1.69 | 0.257 | 0.258 |
| epoch | 2.99 | 2.99 | 2.99 | 2.99 |
| Size of dataset | 8500 | 946 | 1700 | 6800 |
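
For readers unfamiliar with the row names: the card does not publish its metric code, but assuming the conventional scikit-learn definitions of macro/micro averaging and balanced accuracy, a toy illustration:

```python
# Toy illustration of the metric names above; the labels are made up, and the
# scikit-learn definitions are an assumption about how the card's metrics were computed.
from sklearn.metrics import balanced_accuracy_score, f1_score

y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]

print(f1_score(y_true, y_pred, average="macro"))  # f1_macro: unweighted mean over classes
print(f1_score(y_true, y_pred, average="micro"))  # f1_micro: global counts (== accuracy)
print(balanced_accuracy_score(y_true, y_pred))    # accuracy_balanced: mean per-class recall
```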

### Framework versions

- Transformers 4.33.3
- Pytorch 2.5.1+cu121
- Datasets 2.14.7
- Tokenizers 0.13.3