---
base_model: MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: mDeBERTa-v3-base-xnli-multilingual-zeroshot-v5.0-nli-downsample-and-non-nli
  results: []
datasets:
- asadfgglie/nli-zh-tw-all
- asadfgglie/BanBan_2024-10-17-facial_expressions-nli
language:
- zh
pipeline_tag: zero-shot-classification
---

# mDeBERTa-v3-base-xnli-multilingual-zeroshot-v5.0-nli-downsample-and-non-nli

This model combines the dataset strategies of the v3.0 and v4.0 versions.

This model is a fine-tuned version of [MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7) on the [asadfgglie/nli-zh-tw-all](https://huggingface.co/datasets/asadfgglie/nli-zh-tw-all) and [asadfgglie/BanBan_2024-10-17-facial_expressions-nli](https://huggingface.co/datasets/asadfgglie/BanBan_2024-10-17-facial_expressions-nli) datasets.
It achieves the following results on the evaluation set:
- Loss: 0.4531
- F1 Macro: 0.8330
- F1 Micro: 0.8337
- Accuracy Balanced: 0.8331
- Accuracy: 0.8337
- Precision Macro: 0.8330
- Recall Macro: 0.8331
- Precision Micro: 0.8337
- Recall Micro: 0.8337

## Model description

A multilingual mDeBERTa-v3-base NLI model fine-tuned for zero-shot classification of Chinese (primarily Traditional Chinese) text. As an NLI-based zero-shot classifier, it scores each candidate label as a hypothesis against the input text.

## Intended uses & limitations

The model is intended for zero-shot classification of Chinese text through the `zero-shot-classification` pipeline, as sketched below. As with any NLI-based zero-shot classifier, results are sensitive to how the candidate labels and the hypothesis template are phrased.
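
A minimal usage sketch with the Transformers pipeline. The repository id is assumed from the author and model name, and the example text, labels, and hypothesis template are illustrative:

```python
from transformers import pipeline

# Assumed repository id (author + model name); adjust if the model is hosted elsewhere.
classifier = pipeline(
    "zero-shot-classification",
    model="asadfgglie/mDeBERTa-v3-base-xnli-multilingual-zeroshot-v5.0-nli-downsample-and-non-nli",
)

text = "今天天氣很好，我們去公園散步吧!"
candidate_labels = ["開心", "生氣", "難過"]  # happy, angry, sad

# hypothesis_template controls how each label is turned into an NLI hypothesis.
result = classifier(text, candidate_labels, hypothesis_template="這個例子表達了{}的情緒。")
print(result["labels"][0], result["scores"][0])
```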

## Training and evaluation data

The model was trained and evaluated on [asadfgglie/nli-zh-tw-all](https://huggingface.co/datasets/asadfgglie/nli-zh-tw-all) and [asadfgglie/BanBan_2024-10-17-facial_expressions-nli](https://huggingface.co/datasets/asadfgglie/BanBan_2024-10-17-facial_expressions-nli); per-dataset results are reported under "Eval results" below.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 128
- seed: 20241201
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.06
- num_epochs: 3
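
For reference, a sketch of the equivalent `TrainingArguments` (the output directory is a placeholder; the Adam settings listed above are the library defaults):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="output/",              # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=16,    # effective batch size 32 with accumulation
    per_device_eval_batch_size=128,
    gradient_accumulation_steps=2,
    seed=20241201,
    lr_scheduler_type="linear",
    warmup_ratio=0.06,
    num_train_epochs=3,
)
```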

### Training results

| Training Loss | Epoch | Step | Validation Loss | F1 Macro | F1 Micro | Accuracy Balanced | Accuracy | Precision Macro | Recall Macro | Precision Micro | Recall Micro |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:-----------------:|:--------:|:---------------:|:------------:|:---------------:|:------------:|
| 0.3748        | 0.85  | 200  | 0.4218          | 0.7971   | 0.7999   | 0.7970            | 0.7999   | 0.7973          | 0.7970       | 0.7999          | 0.7999       |
| 0.2693        | 1.69  | 400  | 0.4523          | 0.8061   | 0.8078   | 0.8077            | 0.8078   | 0.8053          | 0.8077       | 0.8078          | 0.8078       |
| 0.1905        | 2.54  | 600  | 0.4720          | 0.8226   | 0.8242   | 0.8241            | 0.8242   | 0.8217          | 0.8241       | 0.8242          | 0.8242       |

### Eval results

| Metric | asadfgglie/nli-zh-tw-all/test | asadfgglie/BanBan_2024-10-17-facial_expressions-nli/test | eval_dataset | test_dataset |
| :---: | :---: | :---: | :---: | :---: |
| eval_loss | 0.48 | 0.269 | 0.484 | 0.453 |
| eval_f1_macro | 0.821 | 0.909 | 0.816 | 0.833 |
| eval_f1_micro | 0.822 | 0.909 | 0.818 | 0.834 |
| eval_accuracy_balanced | 0.821 | 0.909 | 0.816 | 0.833 |
| eval_accuracy | 0.822 | 0.909 | 0.818 | 0.834 |
| eval_precision_macro | 0.821 | 0.909 | 0.816 | 0.833 |
| eval_recall_macro | 0.821 | 0.909 | 0.816 | 0.833 |
| eval_precision_micro | 0.822 | 0.909 | 0.818 | 0.834 |
| eval_recall_micro | 0.822 | 0.909 | 0.818 | 0.834 |
| eval_runtime (s) | 239.87 | 4.066 | 58.954 | 236.797 |
| eval_samples_per_second | 35.436 | 232.633 | 32.042 | 31.913 |
| eval_steps_per_second | 0.279 | 1.967 | 0.254 | 0.253 |
| epoch | 2.99 | 2.99 | 2.99 | 2.99 |
| Size of dataset | 8500 | 946 | 1889 | 7557 |
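
Balanced accuracy is the macro average of per-class recall, which is why the `eval_accuracy_balanced` and `eval_recall_macro` rows match. To recompute these metrics from model predictions, a minimal sketch with scikit-learn (the label arrays are illustrative, not the actual evaluation outputs):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             f1_score, precision_score, recall_score)

# Illustrative labels; replace with the real references and predictions.
y_true = np.array([0, 1, 0, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0])

metrics = {
    "f1_macro": f1_score(y_true, y_pred, average="macro"),
    "f1_micro": f1_score(y_true, y_pred, average="micro"),
    "accuracy_balanced": balanced_accuracy_score(y_true, y_pred),  # equals macro recall
    "accuracy": accuracy_score(y_true, y_pred),
    "precision_macro": precision_score(y_true, y_pred, average="macro"),
    "recall_macro": recall_score(y_true, y_pred, average="macro"),
}
print(metrics)
```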

### Framework versions

- Transformers 4.33.3
- Pytorch 2.5.1+cu121
- Datasets 2.14.7
- Tokenizers 0.13.3