---
license: cc-by-nc-sa-4.0
language:
- en
tags:
- disfluency identification
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->

This BERT model classifies a dialogue system's user utterance as fluent or disfluent.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->



- **Developed by:** 4i Intelligent Insights
- **Model type:** BERT base cased
- **Language(s) (NLP):** English
- **License:** cc-by-nc-sa-4.0

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** http://research.4i.ai/code/BERT_disfluency_cls
- **Paper:** https://aclanthology.org/2023.findings-acl.728/

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

The model is intended for classifying English utterances from users interacting with a dialogue system as fluent or disfluent. In our evaluation, the user utterances were speech transcriptions.
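As a rough illustration of this use, the sketch below loads a checkpoint with the Hugging Face `transformers` library and maps each utterance to a label. Several things here are assumptions not stated in this card: `MODEL_ID` is a hypothetical placeholder (the published model ID or local path must be substituted), the checkpoint is assumed to load with `AutoModelForSequenceClassification` with two output classes, and the label order in `LABELS` is a guess that should be checked against the checkpoint's `config.id2label`.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical placeholder: this card does not state the published model ID.
MODEL_ID = "path/to/disfluency-checkpoint"

# Assumed label order; verify against config.id2label of the real checkpoint.
LABELS = ("fluent", "disfluent")


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_from_logits(
    logits: Sequence[float], labels: Sequence[str] = LABELS
) -> Tuple[str, float]:
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def classify(
    utterances: Sequence[str], model_id: str = MODEL_ID
) -> List[Tuple[str, float]]:
    """Classify transcribed user utterances with the fine-tuned model."""
    # Imported here so the helpers above work without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    batch = tokenizer(
        list(utterances), padding=True, truncation=True, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**batch).logits
    return [label_from_logits(row.tolist()) for row in logits]
```

Calling `classify(["i want to uh i want to book a flight"])` would return one `(label, probability)` pair per utterance once `MODEL_ID` points at a real checkpoint.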


## Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

This model has not been evaluated on machine-generated text.


## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

This model may be less accurate on utterances from non-native English speakers.




## Training Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

The model has been fine-tuned on the Fisher English Corpus:
http://github.com/joshua-decoder/fisher-callhome-corpus