Commit c899018 · Parent(s): 97c31f1
Update README.md
README.md CHANGED
@@ -57,20 +57,18 @@ print(output)

Before:

 57   ### Details on data and training
 58   The code for preparing the data and training & evaluating the model is fully open-source here: https://github.com/MoritzLaurer/zeroshot-classifier/tree/main
 59
 60 - ## Limitations and bias
 61 - The model can only do text classification tasks.
 62
 63 - Please consult the original DeBERTa paper and the papers for the different datasets for potential biases.
 64
 65   ## Metrics
 66
 67 - Balanced accuracy
 68   `deberta-v3-large-zeroshot-v1.1-all-33` was trained on all datasets, with only maximum 500 texts per class to avoid overfitting.
 69 - The metrics on these datasets are therefore not strictly zeroshot, as the model has seen some data for each task.
 70   `deberta-v3-large-zeroshot-v1.1-heldout` indicates zeroshot performance on the respective dataset.
 71   To calculate these zeroshot metrics, the pipeline was run 28 times, each time with one dataset held out from training to simulate a zeroshot setup.
 72
 73 - ![figure_large_v1.1](https://
 74
 75
 76   | | deberta-v3-large-mnli-fever-anli-ling-wanli-binary | deberta-v3-large-zeroshot-v1.1-heldout | deberta-v3-large-zeroshot-v1.1-all-33 |

@@ -115,6 +113,12 @@ To calculate these zeroshot metrics, the pipeline was run 28 times, each time wi

115
116
117
118   ## License
119   The base model (DeBERTa-v3) is published under the MIT license.
120   The datasets the model was fine-tuned on are published under a diverse set of licenses.
After:

 57   ### Details on data and training
 58   The code for preparing the data and training & evaluating the model is fully open-source here: https://github.com/MoritzLaurer/zeroshot-classifier/tree/main
 59
 60 + Hyperparameters and other details are available in this Weights & Biases repo: https://wandb.ai/moritzlaurer/deberta-v3-large-zeroshot-v1-1-all-33/table?workspace=user-
 61
 62
 63   ## Metrics
 64
 65 + Balanced accuracy is reported for all datasets.
 66   `deberta-v3-large-zeroshot-v1.1-all-33` was trained on all datasets, with only maximum 500 texts per class to avoid overfitting.
 67 + The metrics on these datasets are therefore not strictly zeroshot, as the model has seen some data for each task during training.
 68   `deberta-v3-large-zeroshot-v1.1-heldout` indicates zeroshot performance on the respective dataset.
 69   To calculate these zeroshot metrics, the pipeline was run 28 times, each time with one dataset held out from training to simulate a zeroshot setup.
 70
 71 + ![figure_large_v1.1](https://raw.githubusercontent.com/MoritzLaurer/zeroshot-classifier/main/results/fig_large_v1.1.png)
 72
 73
 74   | | deberta-v3-large-mnli-fever-anli-ling-wanli-binary | deberta-v3-large-zeroshot-v1.1-heldout | deberta-v3-large-zeroshot-v1.1-all-33 |

113
114
115
116 + ## Limitations and bias
117 + The model can only do text classification tasks.
118 +
119 + Please consult the original DeBERTa paper and the papers for the different datasets for potential biases.
120 +
121 +
122   ## License
123   The base model (DeBERTa-v3) is published under the MIT license.
124   The datasets the model was fine-tuned on are published under a diverse set of licenses.
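For reference, the balanced accuracy reported in the Metrics section is the unweighted mean of per-class recall, which keeps majority classes from dominating the score on imbalanced datasets. A minimal pure-Python sketch (the toy labels below are illustrative, not from the card's benchmarks):

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Balanced accuracy: the unweighted mean of per-class recall."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)

# Imbalanced toy example: predicting "a" everywhere gets 4/5 = 0.8 plain
# accuracy, but balanced accuracy averages recall(a)=1.0 and recall(b)=0.0.
y_true = ["a", "a", "a", "a", "b"]
y_pred = ["a", "a", "a", "a", "a"]
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

This matches `sklearn.metrics.balanced_accuracy_score` on the same inputs; the averaging is what makes it a fairer summary across the card's 28 differently sized datasets.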
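The heldout metrics described in the changed section come from a leave-one-dataset-out loop: the pipeline is run 28 times, each time training on all datasets except one and evaluating zero-shot on the one held out. A schematic sketch with placeholder dataset names (the actual 28 dataset names are not listed in this section):

```python
def leave_one_dataset_out(dataset_names):
    """For each dataset, yield (train_sets, held_out): train on all the
    others and evaluate zero-shot on the one held out of training."""
    for held_out in dataset_names:
        train_sets = [d for d in dataset_names if d != held_out]
        yield train_sets, held_out

# Placeholder names; the card's pipeline uses 28 datasets, so it
# produces 28 (train, held-out) pairs, one evaluation run per pair.
names = ["dataset_a", "dataset_b", "dataset_c"]
splits = list(leave_one_dataset_out(names))
print(len(splits))  # 3
```

Because the held-out dataset never appears in `train_sets`, the resulting score is a genuine zeroshot estimate for that task, unlike the `all-33` numbers.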