---
license: mit
datasets:
- tweet_eval
- bookcorpus
- wikipedia
- cc_news
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- medical
---

# Model Card for margotwagner/roberta-psychotherapy-eval

A RoBERTa model for English text classification. It was fine-tuned on psychotherapy text transcripts, starting from [tweet_emotion_eval](https://huggingface.co/elozano/tweet_emotion_eval) ([roberta-base](https://huggingface.co/roberta-base) fine-tuned on the emotion task of the [tweet_eval](https://huggingface.co/datasets/tweet_eval) dataset). Given a sentence, the model classifies it as either symptomatic or non-symptomatic, where symptomatic means the sentence displays signs of anxiety and/or depression.

## Model Details

### Model Description

- **Developed by:** Margot Wagner, Jasleen Jagayat, Anchan Kumar, Amir Shirazi, Nazanin Alavi, Mohsen Omrani
- **Funded by:** Queen's University
- **Model type:** RoBERTa
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** [elozano/tweet_emotion_eval](https://huggingface.co/elozano/tweet_emotion_eval)

## Uses

This model is intended to assess mental health status from sentence-level text data. Specifically, it looks for symptoms related to anxiety and depression.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline(task="text-classification", model="margotwagner/roberta-psychotherapy-eval")

sentences = ["I am not having a great day"]

# One result per input sentence.
model_outputs = classifier(sentences)
print(model_outputs[0])
# a dict holding the top predicted label and its score for the first sentence
```

## Training Details

### Training Data

This model was fine-tuned in a supervised manner on English sentence-level data, with symptomatic labels provided by expert clinicians. Sentences were required to be independent in nature. Back-translation was used to increase the size of the training dataset.

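Back-translation augments a labeled dataset by translating each sentence into a pivot language and back, then keeping the resulting paraphrase with the original label. The sketch below only illustrates that structure. The card does not state which pivot language or translation system was used, so the translators are passed in as plain callables, and the names `back_translate` and `augment` are hypothetical.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, int]          # (sentence, label); the label encoding is an assumption
Translator = Callable[[str], str]  # any function mapping text to translated text


def back_translate(sentence: str, to_pivot: Translator, from_pivot: Translator) -> str:
    """Translate into the pivot language and back to obtain a paraphrase."""
    return from_pivot(to_pivot(sentence))


def augment(examples: List[Example], to_pivot: Translator, from_pivot: Translator) -> List[Example]:
    """Return the original examples plus any new paraphrases, keeping each label."""
    augmented = list(examples)
    seen = {text for text, _ in examples}
    for text, label in examples:
        paraphrase = back_translate(text, to_pivot, from_pivot)
        # Skip round trips that reproduce a sentence already in the dataset.
        if paraphrase not in seen:
            seen.add(paraphrase)
            augmented.append((paraphrase, label))
    return augmented


if __name__ == "__main__":
    # Toy stand-ins for real translation models (assumption: any str -> str function works).
    fake_to_pivot = str.upper
    fake_from_pivot = str.lower
    data = [("I feel Tired", 1), ("fine today", 0)]
    for text, label in augment(data, fake_to_pivot, fake_from_pivot):
        print(label, text)
```

In practice the two callables would wrap real machine-translation models. Filtering out unchanged round trips avoids adding exact duplicates to the training set.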
### Training Procedure

A weighted cross-entropy loss function was used to address class imbalance. F1 score was used for model selection.

### Testing Data & Metrics

#### Testing Data

The testing data was clinical data from a board-reviewed, ethically compliant online psychotherapy clinical trial conducted at Queen’s University between 2020 and 2021. The study was reviewed by the Queen’s University Health Sciences and Affiliated Teaching Hospitals Research Ethics Board to ensure adherence to ethical standards (File #: 6020045).

#### Metrics

F1 score was used as the evaluation metric because it balances precision and recall. Since it is computed without counting true negatives, it places particular importance on positive examples.
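
The class-weighted loss from the training procedure and the F1 metric can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' training code. The card does not give the class weights, so inverse-frequency weights are an assumption here, as are the label encoding (1 as the positive class) and the function names.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting scheme: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class).

    probs[i] maps each class to its predicted probability for example i.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    train_labels = [0, 0, 0, 1]  # toy imbalanced label set
    weights = inverse_frequency_weights(train_labels)
    probs = [{0: 0.9, 1: 0.1}, {0: 0.8, 1: 0.2}, {0: 0.7, 1: 0.3}, {0: 0.4, 1: 0.6}]
    print("class weights:", weights)
    print("weighted loss:", weighted_cross_entropy(probs, train_labels, weights))
    print("F1:", f1_score([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```

With inverse-frequency weights, errors on the rarer class contribute more to the loss, which counteracts the tendency to predict the majority class. Because F1 never counts true negatives, a model that labels everything as the majority negative class scores zero, however high its plain accuracy.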