sarahyurick committed on
Commit
0d4be1b
1 Parent(s): db91539

Update README.md

Files changed (1)
  1. README.md +75 -75
README.md CHANGED
@@ -37,11 +37,6 @@ This model is ready for commercial use.
37
  # License
38
  This model is released under the [NVIDIA Open Model License Agreement](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf).
39
 
40
- # References
41
- * [DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing](https://arxiv.org/abs/2111.09543)
42
- * [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://github.com/microsoft/DeBERTa)
43
- * [Training language models to follow instructions with human feedback](https://arxiv.org/pdf/2203.02155)
44
-
45
  # Model Architecture
46
  The model architecture uses a DeBERTa backbone and incorporates multiple classification heads, each dedicated to a task categorization or complexity dimension. This design trains a single unified network that predicts all dimensions simultaneously during inference. DeBERTa-v3-base can theoretically handle up to 12k tokens, but the default context length is set to 512 tokens.
47
 
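As a rough illustration of the paragraph above, the sketch below shows what a shared DeBERTa encoder feeding several classification heads can look like. The head names, output sizes, and mean-pooling choice are assumptions for illustration only, not the released model's implementation; the actual inference code appears later in this card.

```python
# Illustrative sketch only: one shared DeBERTa encoder, several classification
# heads, so every dimension is predicted in a single forward pass.
# Head names and output sizes are assumptions, not the released weights.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class MultiHeadPromptClassifier(nn.Module):
    def __init__(self, backbone_name="microsoft/deberta-v3-base"):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(backbone_name)
        hidden = self.backbone.config.hidden_size
        # One head per task / complexity dimension (names and sizes are illustrative).
        self.heads = nn.ModuleDict({
            "task_type": nn.Linear(hidden, 11),
            "creativity_scope": nn.Linear(hidden, 5),
            "reasoning": nn.Linear(hidden, 5),
        })

    def forward(self, input_ids, attention_mask):
        out = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
        # Mean-pool token embeddings into one vector per prompt.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (out.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
        # Every head predicts from the same shared representation in one pass.
        return {name: head(pooled) for name, head in self.heads.items()}

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
model = MultiHeadPromptClassifier()
enc = tokenizer(["Write a mystery set in a small town."],
                truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    logits = model(enc["input_ids"], enc["attention_mask"])
print({name: tuple(t.shape) for name, t in logits.items()})
```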
@@ -50,6 +45,77 @@ The model architecture uses a DeBERTa backbone and incorporates multiple classif
50
 
51
  The inference code for this model is available through the NeMo Curator GitHub repository. Check out this [example notebook](https://github.com/NVIDIA/NeMo-Curator/tree/main/tutorials/distributed_data_classification) to get started.
52
 
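For bulk classification, the snippet below is a minimal sketch of driving this classifier from NeMo Curator. The class name `PromptTaskComplexityClassifier`, the `cluster_type` argument, and the file paths are assumptions based on the linked tutorial, so check the example notebook for the exact, current API.

```python
# Hedged sketch of GPU-accelerated classification with NeMo Curator.
# The class name, cluster_type argument, and paths below are assumptions based
# on the linked distributed_data_classification tutorial; consult the notebook
# for the exact API.
from nemo_curator import get_client
from nemo_curator.classifiers import PromptTaskComplexityClassifier
from nemo_curator.datasets import DocumentDataset

client = get_client(cluster_type="gpu")          # start a Dask GPU cluster

dataset = DocumentDataset.read_json("prompts/*.jsonl", backend="cudf")
classifier = PromptTaskComplexityClassifier(batch_size=256)

result = classifier(dataset)                     # adds task/complexity columns
result.to_json("classified_prompts/")
```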
53
  # How to Use in Transformers
54
  To use the prompt task and complexity classifier with the Transformers library, run the following code:
55
 
@@ -263,76 +329,10 @@ print(result)
263
  # {'task_type_1': ['Code Generation'], 'task_type_2': ['Text Generation'], 'task_type_prob': [0.767], 'creativity_scope': [0.0826], 'reasoning': [0.0632], 'contextual_knowledge': [0.056], 'number_of_few_shots': [0], 'domain_knowledge': [0.9803], 'no_label_reason': [0.0], 'constraint_ct': [0.5578], 'prompt_complexity_score': [0.27822]}
264
  ```
265
 
266
- # Input & Output
267
- ## Input
268
- * Input Type: Text
269
- * Input Format: String
270
- * Input Parameters: 1D
271
- * Other Properties Related to Input: Token Limit of 512 tokens
272
-
273
- ## Output
274
- * Output Type: Text/Numeric Classifications
275
- * Output Format: String & Numeric
276
- * Output Parameters: 1D
277
- * Other Properties Related to Output: None
278
-
279
- ## Examples
280
-
281
- ```
282
- Prompt: Write a mystery set in a small town where an everyday object goes missing, causing a ripple of curiosity and suspicion. Follow the investigation and reveal the surprising truth behind the disappearance.
283
- ```
284
-
285
- | Task | Complexity | Creativity | Reasoning | Contextual Knowledge | Domain Knowledge | Constraints | # of Few Shots |
286
- |------------------|------------|------------|-----------|-----------------------|------------------|-------------|----------------|
287
- | Text Generation | 0.472 | 0.867 | 0.056 | 0.048 | 0.226 | 0.785 | 0 |
288
-
289
- ```
290
- Prompt: Antibiotics are a type of medication used to treat bacterial infections. They work by either killing the bacteria or preventing them from reproducing, allowing the body’s immune system to fight off the infection. Antibiotics are usually taken orally in the form of pills, capsules, or liquid solutions, or sometimes administered intravenously. They are not effective against viral infections, and using them inappropriately can lead to antibiotic resistance. Explain the above in one sentence.
291
- ```
292
-
293
- | Task | Complexity | Creativity | Reasoning | Contextual Knowledge | Domain Knowledge | Constraints | # of Few Shots |
294
- |-----------------|------------|------------|-----------|-----------------------|------------------|-------------|----------------|
295
- | Summarization | 0.133 | 0.003 | 0.014 | 0.003 | 0.644 | 0.211 | 0 |
296
-
297
- # Software Integration
298
- * Runtime Engine: Python 3.10 and NeMo Curator
299
- * Supported Hardware Microarchitecture Compatibility: NVIDIA GPU, Volta™ or higher (compute capability 7.0+), CUDA 12 (or above)
300
- * Preferred/Supported Operating System(s): Ubuntu 22.04/20.04
301
-
302
- # Model Version
303
- Prompt Task and Complexity Classifier v1.1
304
-
305
- # Training, Testing, and Evaluation Datasets
306
- ## Training Data
307
- * 4024 English prompts with task distribution outlined below
308
- * Prompts were annotated by humans according to task and complexity taxonomies
309
-
310
- Task distribution:
311
- | Task | Count |
312
- |------------------|-------|
313
- | Open QA | 1214 |
314
- | Closed QA | 786 |
315
- | Text Generation | 480 |
316
- | Chatbot | 448 |
317
- | Classification | 267 |
318
- | Summarization | 230 |
319
- | Code Generation | 185 |
320
- | Rewrite | 169 |
321
- | Other | 104 |
322
- | Brainstorming | 81 |
323
- | Extraction | 60 |
324
- | Total | 4024 |
325
-
326
- ## Evaluation
327
- For evaluation, the top-1 accuracy metric was used: the category with the highest predicted probability is matched against the expected answer. Additionally, n-fold cross-validation was used to produce n values of this metric and verify the consistency of the results. The table below reports the average of the top-1 accuracy values across the n folds, calculated separately for each classification dimension.
328
-
329
- | | Task Accuracy | Creative Accuracy | Reasoning Accuracy | Contextual Accuracy | FewShots Accuracy | Domain Accuracy | Constraint Accuracy |
330
- |-|------------------|-------------------|--------------------|---------------------|-------------------|-----------------|---------------------|
331
- | Average of 10 Folds | 0.981 | 0.996 | 0.997 | 0.981 | 0.979 | 0.937 | 0.991 |
332
-
333
- # Inference
334
- * Engine: PyTorch
335
- * Test Hardware: A10G
336
 
337
  # Ethical Considerations
338
  NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
 
37
  # License
38
  This model is released under the [NVIDIA Open Model License Agreement](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf).
39

40
  # Model Architecture
41
  The model architecture uses a DeBERTa backbone and incorporates multiple classification heads, each dedicated to a task categorization or complexity dimension. This design trains a single unified network that predicts all dimensions simultaneously during inference. DeBERTa-v3-base can theoretically handle up to 12k tokens, but the default context length is set to 512 tokens.
42
 
 
45
 
46
  The inference code for this model is available through the NeMo Curator GitHub repository. Check out this [example notebook](https://github.com/NVIDIA/NeMo-Curator/tree/main/tutorials/distributed_data_classification) to get started.
47
 
48
+ # Input & Output
49
+ ## Input
50
+ * Input Type: Text
51
+ * Input Format: String
52
+ * Input Parameters: 1D
53
+ * Other Properties Related to Input: Token Limit of 512 tokens
54
+
55
+ ## Output
56
+ * Output Type: Text/Numeric Classifications
57
+ * Output Format: String & Numeric
58
+ * Output Parameters: 1D
59
+ * Other Properties Related to Output: None
60
+
61
+ ## Examples
62
+
63
+ ```
64
+ Prompt: Write a mystery set in a small town where an everyday object goes missing, causing a ripple of curiosity and suspicion. Follow the investigation and reveal the surprising truth behind the disappearance.
65
+ ```
66
+
67
+ | Task | Complexity | Creativity | Reasoning | Contextual Knowledge | Domain Knowledge | Constraints | # of Few Shots |
68
+ |------------------|------------|------------|-----------|-----------------------|------------------|-------------|----------------|
69
+ | Text Generation | 0.472 | 0.867 | 0.056 | 0.048 | 0.226 | 0.785 | 0 |
70
+
71
+ ```
72
+ Prompt: Antibiotics are a type of medication used to treat bacterial infections. They work by either killing the bacteria or preventing them from reproducing, allowing the body’s immune system to fight off the infection. Antibiotics are usually taken orally in the form of pills, capsules, or liquid solutions, or sometimes administered intravenously. They are not effective against viral infections, and using them inappropriately can lead to antibiotic resistance. Explain the above in one sentence.
73
+ ```
74
+
75
+ | Task | Complexity | Creativity | Reasoning | Contextual Knowledge | Domain Knowledge | Constraints | # of Few Shots |
76
+ |-----------------|------------|------------|-----------|-----------------------|------------------|-------------|----------------|
77
+ | Summarization | 0.133 | 0.003 | 0.014 | 0.003 | 0.644 | 0.211 | 0 |
78
+
79
+ # Software Integration
80
+ * Runtime Engine: Python 3.10 and NeMo Curator
81
+ * Supported Hardware Microarchitecture Compatibility: NVIDIA GPU, Volta™ or higher (compute capability 7.0+), CUDA 12 (or above)
82
+ * Preferred/Supported Operating System(s): Ubuntu 22.04/20.04
83
+
84
+ # Model Version
85
+ Prompt Task and Complexity Classifier v1.1
86
+
87
+ # Training, Testing, and Evaluation Datasets
88
+ ## Training Data
89
+ * 4024 English prompts with task distribution outlined below
90
+ * Prompts were annotated by humans according to task and complexity taxonomies
91
+
92
+ Task distribution:
93
+ | Task | Count |
94
+ |------------------|-------|
95
+ | Open QA | 1214 |
96
+ | Closed QA | 786 |
97
+ | Text Generation | 480 |
98
+ | Chatbot | 448 |
99
+ | Classification | 267 |
100
+ | Summarization | 230 |
101
+ | Code Generation | 185 |
102
+ | Rewrite | 169 |
103
+ | Other | 104 |
104
+ | Brainstorming | 81 |
105
+ | Extraction | 60 |
106
+ | Total | 4024 |
107
+
108
+ ## Evaluation
109
+ For evaluation, the top-1 accuracy metric was used: the category with the highest predicted probability is matched against the expected answer. Additionally, n-fold cross-validation was used to produce n values of this metric and verify the consistency of the results. The table below reports the average of the top-1 accuracy values across the n folds, calculated separately for each classification dimension.
110
+
111
+ | | Task Accuracy | Creative Accuracy | Reasoning Accuracy | Contextual Accuracy | FewShots Accuracy | Domain Accuracy | Constraint Accuracy |
112
+ |-|------------------|-------------------|--------------------|---------------------|-------------------|-----------------|---------------------|
113
+ | Average of 10 Folds | 0.981 | 0.996 | 0.997 | 0.981 | 0.979 | 0.937 | 0.991 |
114
+
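To make the metric concrete, here is a small, self-contained sketch of top-1 accuracy and the fold averaging described above; the per-fold scores at the end are placeholder values, not the numbers from the table.

```python
# Self-contained illustration of the evaluation metric described above.
import numpy as np

def top1_accuracy(probs: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of examples whose highest-probability class matches the label."""
    return float((probs.argmax(axis=1) == labels).mean())

# Toy example: 3 prompts, 4 candidate categories.
probs = np.array([[0.1, 0.7, 0.1, 0.1],
                  [0.6, 0.2, 0.1, 0.1],
                  [0.2, 0.2, 0.5, 0.1]])
labels = np.array([1, 0, 3])
print(top1_accuracy(probs, labels))      # 2 of 3 correct -> ~0.667

# The reported numbers are means of this metric over the n cross-validation folds.
fold_scores = [0.97, 0.98, 0.99]         # placeholder per-fold values
print(float(np.mean(fold_scores)))
```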
115
+ # Inference
116
+ * Engine: PyTorch
117
+ * Test Hardware: A10G
118
+
119
  # How to Use in Transformers
120
  To use the prompt task and complexity classifier with the Transformers library, run the following code:
121
 
 
329
  # {'task_type_1': ['Code Generation'], 'task_type_2': ['Text Generation'], 'task_type_prob': [0.767], 'creativity_scope': [0.0826], 'reasoning': [0.0632], 'contextual_knowledge': [0.056], 'number_of_few_shots': [0], 'domain_knowledge': [0.9803], 'no_label_reason': [0.0], 'constraint_ct': [0.5578], 'prompt_complexity_score': [0.27822]}
330
  ```
331
 
332
+ # References
333
+ * [DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing](https://arxiv.org/abs/2111.09543)
334
+ * [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://github.com/microsoft/DeBERTa)
335
+ * [Training language models to follow instructions with human feedback](https://arxiv.org/pdf/2203.02155)
336
 
337
  # Ethical Considerations
338
  NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.