---
license: mit
datasets:
- tweet_eval
- bookcorpus
- wikipedia
- cc_news
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- medical
---
# Model Card for roberta-psychotherapy-eval
<!-- Provide a quick summary of what the model is/does. -->
A RoBERTa model for English text classification. It was fine-tuned on psychotherapy text transcripts, starting from [tweet_emotion_eval](https://huggingface.co/elozano/tweet_emotion_eval), which is itself [roberta-base](https://huggingface.co/roberta-base) fine-tuned on the emotion task of the [tweet_eval](https://huggingface.co/datasets/tweet_eval) dataset.
Given a sentence, the model classifies it as either symptomatic or non-symptomatic, where symptomatic means the sentence displays signs of anxiety and/or depression.
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
- **Developed by:** Margot Wagner, Jasleen Jagayat, Anchan Kumar, Amir Shirazi, Nazanin Alavi, Mohsen Omrani
- **Funded by:** Queen's University
- **Model type:** RoBERTa
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** [elozano/tweet_emotion_eval](https://huggingface.co/elozano/tweet_emotion_eval)
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
This model is intended to assess mental health status from sentence-level text data. Specifically, it looks for symptoms of anxiety and depression.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import pipeline
classifier = pipeline(task="text-classification", model="margotwagner/roberta-psychotherapy-eval")
sentences = ["I am not having a great day"]
model_outputs = classifier(sentences)
print(model_outputs[0])
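# model_outputs holds one dict per input sentence, with the predicted "label" and its "score";
# pass top_k=None to the classifier call to get a score for every label instead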
```
## Training Details
### Training Data
This model was fine-tuned in a supervised manner on English sentence-level data, with symptomatic labels provided by expert clinicians. Sentences were required to be independent in nature. Back-translation was used to increase the size of the training dataset.
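
The back-translation step can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the pivot language, the translation system, and the deduplication rule are not stated in this card, so the two translator callables, the toy dictionaries, and the convention that label `1` means symptomatic are all hypothetical stand-ins.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, int]          # (sentence, label); label encoding is an assumption
Translator = Callable[[str], str]  # any function mapping a sentence to its translation


def back_translate(
    examples: List[Example],
    to_pivot: Translator,
    from_pivot: Translator,
) -> List[Example]:
    """Return the original examples plus back-translated paraphrases.

    Each sentence is translated to a pivot language and back to English.
    The paraphrase keeps the label of its source sentence and is added
    only if it differs from every sentence already collected.
    """
    augmented = list(examples)
    seen = {text.strip().lower() for text, _ in examples}
    for text, label in examples:
        paraphrase = from_pivot(to_pivot(text)).strip()
        key = paraphrase.lower()
        if paraphrase and key not in seen:
            seen.add(key)
            augmented.append((paraphrase, label))
    return augmented


# Toy stand-ins for real translation models (hypothetical, for illustration only).
EN_TO_FR = {
    "I feel sad all the time": "Je me sens triste tout le temps",
    "I slept well": "J'ai bien dormi",
}
FR_TO_EN = {
    "Je me sens triste tout le temps": "I feel sad constantly",
    "J'ai bien dormi": "I slept well",
}

train = [("I feel sad all the time", 1), ("I slept well", 0)]
augmented_train = back_translate(train, EN_TO_FR.get, FR_TO_EN.get)
print(augmented_train)
```

In practice the two callables would wrap a machine translation model or service. Round trips that reproduce the original sentence add nothing, which is why they are skipped here.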
### Training Procedure
A weighted cross-entropy loss was used to address class imbalance. F1 score was used for model selection.
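
To illustrate how class weighting changes the loss, here is a small dependency-free sketch. The actual class weights and training framework are not given in this card, so the inverse-frequency weighting, the function names, and the toy batch below are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def inverse_frequency_weights(targets: Sequence[int], num_classes: int) -> List[float]:
    """Weight each class by n / (num_classes * count), so rare classes weigh more."""
    counts = [0] * num_classes
    for t in targets:
        counts[t] += 1
    n = len(targets)
    return [n / (num_classes * c) if c else 0.0 for c in counts]


def weighted_cross_entropy(
    logits: Sequence[Sequence[float]],
    targets: Sequence[int],
    class_weights: Sequence[float],
) -> float:
    """Cross-entropy where each example is scaled by the weight of its true class.

    The sum is normalized by the total weight of the batch.
    """
    total_loss = 0.0
    total_weight = 0.0
    for scores, target in zip(logits, targets):
        # Stable log-softmax: subtract the max score before exponentiating.
        m = max(scores)
        log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
        log_prob = scores[target] - log_norm
        w = class_weights[target]
        total_loss -= w * log_prob
        total_weight += w
    return total_loss / total_weight


# Toy imbalanced batch: three non-symptomatic (0) and one symptomatic (1) example.
targets = [0, 0, 0, 1]
# A model that always favors class 0, regardless of the input.
logits = [[2.0, 0.0]] * 4

weights = inverse_frequency_weights(targets, num_classes=2)
plain = weighted_cross_entropy(logits, targets, [1.0, 1.0])
weighted = weighted_cross_entropy(logits, targets, weights)
print(weights, plain, weighted)
```

With these weights, the single misclassified minority example contributes as much to the loss as the three majority examples combined, so always predicting the majority class is penalized more heavily than under the unweighted loss.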
### Metrics
F1 score was used as the model performance metric, as it balances precision and recall and gives particular weight to positive examples.
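
The short sketch below shows how F1 is derived from precision and recall, and why it is more informative than plain accuracy when positive examples are rare. The label encoding (`1` for symptomatic) and the toy predictions are assumptions for illustration, not results from this model.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy imbalanced data: one symptomatic sentence among ten.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10

print(accuracy(y_true, always_negative))             # high despite missing the positive
print(precision_recall_f1(y_true, always_negative))  # F1 is 0.0 for the positive class
```

A classifier that never predicts the positive class scores well on accuracy here but gets an F1 of zero, which is the behavior that makes F1 a reasonable choice for model selection on imbalanced data.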