metadata

license: cc-by-nc-sa-4.0
tags:
  - generated_from_trainer
metrics:
  - precision
  - recall
  - f1
  - accuracy
model-index:
  - name: requirements_ambiguity_v2
    results: []
widget:
  - text: >-
      In de Zaaktypeconfiguratie kan per fase een andere behandelaar worden
      geconfigureerd waardoor bij de overgang naar de volgende status de
      behandelaar automatisch wordt gewijzigd. De behandelaar/groep behandelaren
      kan automatisch worden bepaald op basis van een kenmerk.
  - text: >-
      Er kan informatie aan het digitale formulier worden toegevoegd
      (gespreksverslagen en resultaatafspraken bijvoorbeeld) door medewerker
      en/of leidinggevende, dit kan tussentijds opgeslagen en/of afgesloten
      worden voordat het wordt vrijgegeven voor de andere partij.
  - text: >-
      De Oplossing ondersteunt parafering en het plaatsen van een
      gecertificeerde elektronische handtekening.
  - text: >-
      De Aangeboden oplossing biedt de functionaliteit om individuele en
      bulkmutaties te verwerken met ingangsdatum op elke willekeurige datum in
      de maand, zowel in het verleden als in de toekomst, binnen een lopend
      kalenderjaar.
language:
  - nl

requirements_ambiguity_v2

This model is a fine-tuned version of GroNLP/bert-base-dutch-cased for ambiguity detection in Dutch software requirements. It was trained on a private dataset of 2,523 labeled requirements.

Please contact me via LinkedIn if you have any questions about this model or the dataset used.

The dataset and this model were created as part of the final project assignment of the Natural Language Understanding course (XCS224U) from the Professional AI Program of the Stanford School of Engineering.

It achieves the following results on the evaluation set:

Loss: 0.7485
Accuracy: 0.8458
F1: 0.8442
Recall: 0.7474
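
For readers unfamiliar with these metrics, the sketch below shows how accuracy, precision, recall and F1 can be computed for a binary classifier. It is a minimal illustration on toy labels, not the evaluation code used for this model. The card does not state which averaging was used for F1 and recall, so the positive-class definitions here are an assumption and may not reproduce the reported values.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for the positive class (label 1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # ambiguous, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # unambiguous, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # ambiguous, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # unambiguous, passed

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only; these are not ReqAmbi data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```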

Intended uses & limitations

The model performs automated ambiguity detection through binary text classification. Its intended use is as a tool for requirements engineers to detect spurious and ambiguous formulations.
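
The following is a minimal sketch of how the model might be used to flag ambiguous requirements with the Transformers `pipeline` API. The repository id `your-username/requirements_ambiguity_v2` is a hypothetical placeholder, since this card does not state where the model is hosted. The label name `LABEL_1` for "ambiguous" is also an assumption, based on the default label naming when no custom mapping is configured. The helper names `load_classifier` and `flag_ambiguous` are illustrative.

```python
from typing import Callable, Dict, List

# Hypothetical repository id: replace with the actual location of the model.
MODEL_ID = "your-username/requirements_ambiguity_v2"


def load_classifier(model_id: str = MODEL_ID) -> Callable:
    """Build a text-classification pipeline (requires transformers and PyTorch)."""
    # Imported lazily so flag_ambiguous can be used with any compatible callable.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def flag_ambiguous(
    requirements: List[str],
    classifier: Callable,
    ambiguous_label: str = "LABEL_1",  # assumed name of the "ambiguous" class
) -> List[Dict]:
    """Return the requirements that the classifier labels as ambiguous."""
    predictions = classifier(requirements)
    return [
        {"text": text, "score": pred["score"]}
        for text, pred in zip(requirements, predictions)
        if pred["label"] == ambiguous_label
    ]
```

With the real model available, a call such as `flag_ambiguous(texts, load_classifier())` would return the flagged requirements together with their scores, which a requirements engineer could then review manually.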

Training and evaluation data

The model was trained on the ReqAmbi dataset. This dataset is private and contains 2,523 requirement formulations. Each requirement is manually labeled 0 (unambiguous) or 1 (ambiguous). The dataset is split 2,019/253/253 into train, validation and test. The reported metrics are from the evaluation on the test set. The validation set was used to monitor performance during training.
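
As an illustration of a fixed train/validation/test split like the one described above, the sketch below shuffles a list of labeled examples and cuts it into three parts. The card does not describe how the actual split was drawn, so the shuffling procedure, the helper name `split_dataset`, and the reuse of seed 42 are all assumptions.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(
    items: Sequence, n_val: int, n_test: int, seed: int = 42
) -> Tuple[List, List, List]:
    """Shuffle deterministically, then cut into train, validation and test."""
    if n_val + n_test > len(items):
        raise ValueError("validation and test sizes exceed the dataset size")

    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # local RNG, leaves global state alone

    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]  # everything left over
    return train, val, test


# Toy (text, label) pairs standing in for the private data.
examples = [(f"requirement {i}", i % 2) for i in range(20)]
train, val, test = split_dataset(examples, n_val=2, n_test=2)
print(len(train), len(val), len(test))
```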

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 64
eval_batch_size: 64
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 4
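
To make the linear scheduler concrete, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup, which this card does not specify, and takes the total of 144 steps from the training results table (4 epochs of 36 steps). The function name `linear_lr` is illustrative.

```python
def linear_lr(
    step: int,
    total_steps: int = 144,   # 4 epochs x 36 steps, per the training results
    base_lr: float = 1e-4,    # learning_rate listed above
    warmup_steps: int = 0,    # assumption: warmup is not stated in the card
) -> float:
    """Learning rate under a linear schedule with optional linear warmup."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 36, 72, 108, 144):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at the full 0.0001, is halved by the end of epoch 2, and reaches zero at the final step.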

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1	Recall
0.5268	1.0	36	0.5424	0.8063	0.8057	0.7263
0.318	2.0	72	0.4688	0.8182	0.8182	0.7579
0.1244	3.0	108	0.6019	0.8379	0.8366	0.7474
0.0308	4.0	144	0.7485	0.8458	0.8442	0.7474
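
The table can also be inspected programmatically, for example to compare which epoch is best under different selection criteria. The sketch below copies the rows above into a list and picks the epoch with the lowest validation loss and the one with the highest F1. It is an illustration of reading the table, not a statement of how the final checkpoint was chosen, which this card does not describe.

```python
# Rows copied from the training results table:
# (epoch, training loss, validation loss, accuracy, F1, recall)
RESULTS = [
    (1, 0.5268, 0.5424, 0.8063, 0.8057, 0.7263),
    (2, 0.3180, 0.4688, 0.8182, 0.8182, 0.7579),
    (3, 0.1244, 0.6019, 0.8379, 0.8366, 0.7474),
    (4, 0.0308, 0.7485, 0.8458, 0.8442, 0.7474),
]

# Select by validation loss (lower is better) and by F1 (higher is better).
best_by_loss = min(RESULTS, key=lambda row: row[2])
best_by_f1 = max(RESULTS, key=lambda row: row[4])

print(f"Lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[2]:.4f})")
print(f"Highest F1: epoch {best_by_f1[0]} ({best_by_f1[4]:.4f})")
```

The two criteria disagree here: validation loss is lowest after epoch 2, while accuracy and F1 keep improving through epoch 4. The headline results reported earlier in this card match the epoch 4 row.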

Framework versions

Transformers 4.24.0
PyTorch 2.0.0
Datasets 2.9.0
Tokenizers 0.11.0