|
--- |
|
license: apache-2.0 |
|
|
|
widget: |
|
- text: "One day I will be a real teacher and I will try to do the best I can for the children." |
|
example_title: "Classification (without context)" |
|
--- |
|
|
|
# Model Card for XLM-Roberta-large-reflective-conf4 |
|
|
|
This is a reflectivity classification model trained to distinguish different types of reflectivity in the reflective reports of student teachers.
|
|
|
It was evaluated in a cross-lingual setting and was found to also work well in languages other than English -- see the results in the referenced paper.
|
|
|
## Model Details |
|
|
|
- **Repository:** https://github.com/EduMUNI/reflection-classification |
|
- **Paper:** https://link.springer.com/article/10.1007/s10639-022-11254-7 |
|
|
|
- **Developed by:** Michal Stefanik & Jan Nehyba, Masaryk University |
|
- **Model type:** XLM-RoBERTa-large
|
- **Finetuned from model:** [XLM-R-large](https://huggingface.co/xlm-roberta-large) |
|
|
|
## Usage |
|
|
|
To match the training format, it is best to use the prepared wrapper below, which formats the classified sentence and its surrounding context in the way the model expects:
|
|
|
```python |
|
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["Other", "Belief", "Perspective", "Feeling", "Experience",
          "Reflection", "Difficulty", "Intention", "Learning"]


class NeuralClassifier:

    def __init__(self, model_path: str, uses_context: bool, device: str):
        # Load the fine-tuned model together with its config and tokenizer
        self.config = AutoConfig.from_pretrained(model_path)
        self.device = device
        self.model = AutoModelForSequenceClassification.from_pretrained(model_path, config=self.config).to(device)
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.uses_context = uses_context

    def predict_sentence(self, sentence: str, context: str = None):
        # Checkpoints trained with context expect the surrounding context as a second text segment
        if context is None and self.uses_context:
            raise ValueError("You need to pass in the context argument, which should also include the classified sentence")

        # Encode the sentence (and optionally its context) in the same format used during training
        features = self.tokenizer(sentence, text_pair=context,
                                  padding="max_length", truncation=True, return_tensors='pt')
        outputs = self.model(**features.to(self.device), return_dict=True)
        # Pick the highest-scoring class and map it back to its label
        argmax = outputs.logits.argmax(dim=-1).detach().cpu().tolist()[0]
        label = LABELS[argmax]

        return label
|
``` |
|
|
|
The wrapper can be used as follows: |
|
```python |
|
from tqdm import tqdm

classifier = NeuralClassifier(model_path="MU-NLPC/XLM-R-large-reflective-conf4",
                              uses_context=False,
                              device="cpu")

test_sentences = ["And one day I will be a real teacher and I will try to do the best I can for the children.",
                  "I felt really well!",
                  "gfagdhj gjfdjgh dg"]

y_pred = [classifier.predict_sentence(sentence) for sentence in tqdm(test_sentences)]

print(y_pred)

>>> ['Intention', 'Feeling', 'Other']
|
``` |
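This checkpoint is used without context (`uses_context=False`), as in the example above. For checkpoints trained with context (see the project repository), the wrapper's `predict_sentence` also accepts a `context` argument. The sketch below only illustrates that interface; the context string is made up, and whether a given checkpoint expects context is determined by how it was trained:

```python
# Interface illustration only: a context-trained checkpoint would be constructed
# with uses_context=True and called with the surrounding context, which should
# include the classified sentence itself.
sentence = "I felt really well!"
context = "The lesson went smoothly. I felt really well! The pupils were engaged."

context_classifier = NeuralClassifier(model_path="MU-NLPC/XLM-R-large-reflective-conf4",  # placeholder; use a context-trained checkpoint
                                      uses_context=True,
                                      device="cpu")
print(context_classifier.predict_sentence(sentence, context=context))
```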
|
|
|
### Training Data |
|
|
|
The model was trained on the [CEReD dataset](http://hdl.handle.net/11372/LRT-3573) and was optimized for the best possible performance in cross-lingual settings (i.e., on unseen languages).
|
|
|
See the reproducible training script in the project repository: https://github.com/EduMUNI/reflection-classification
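
If you want to sanity-check the model on your own annotated sentences, a minimal evaluation sketch could look as follows. It reuses the `classifier` instance from the Usage section; the sentences and gold labels below are illustrative placeholders, not taken from the CEReD dataset:

```python
from sklearn.metrics import classification_report

# Hypothetical evaluation data: your own sentences with gold reflectivity labels
sentences = ["I realized that my explanation confused the pupils.",
             "Next time, I will prepare more examples."]
gold_labels = ["Reflection", "Intention"]

# Predict with the wrapper defined in the Usage section and report per-class metrics
predictions = [classifier.predict_sentence(s) for s in sentences]
print(classification_report(gold_labels, predictions, zero_division=0))
```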
|
|
|
## Citation |
|
|
|
If you use the model in scientific work, please acknowledge our work as follows. |
|
|
|
```bibtex |
|
@Article{Nehyba2022applications, |
|
author={Nehyba, Jan and {\v{S}}tef{\'a}nik, Michal}, |
|
title={Applications of deep language models for reflective writings}, |
|
journal={Education and Information Technologies}, |
|
year={2022}, |
|
month={Sep}, |
|
day={05}, |
|
issn={1573-7608}, |
|
doi={10.1007/s10639-022-11254-7}, |
|
url={https://doi.org/10.1007/s10639-022-11254-7} |
|
} |
|
``` |