---
license: apache-2.0
widget:
- text: "One day I will be a real teacher and I will try to do the best I can for the children."
  example_title: "Classification (without context)"
---
# Model Card for XLM-Roberta-large-reflective-conf4
This is a reflectivity classification model trained to distinguish different types of reflectivity in the reports of teaching students.
It was evaluated in cross-lingual settings and was found to work well in languages other than English; see the results in the referenced paper.
## Model Details
- **Repository:** https://github.com/EduMUNI/reflection-classification
- **Paper:** https://link.springer.com/article/10.1007/s10639-022-11254-7
- **Developed by:** Michal Stefanik & Jan Nehyba, Masaryk University
- **Model type:** Roberta-large
- **Finetuned from model:** [XLM-R-large](https://huggingface.co/xlm-roberta-large)
## Usage
To match the training format, it is best to use the prepared wrapper, which formats the classified sentence and its surrounding context the way the model expects:
```python
from typing import Optional

from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

# The label order must match the output indices of the classification head.
LABELS = ["Other", "Belief", "Perspective", "Feeling", "Experience",
          "Reflection", "Difficulty", "Intention", "Learning"]


class NeuralClassifier:

    def __init__(self, model_path: str, uses_context: bool, device: str):
        self.config = AutoConfig.from_pretrained(model_path)
        self.device = device
        self.model = AutoModelForSequenceClassification.from_pretrained(model_path, config=self.config).to(device)
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.uses_context = uses_context

    def predict_sentence(self, sentence: str, context: Optional[str] = None) -> str:
        if context is None and self.uses_context:
            raise ValueError("You need to pass in context argument, including the sentence")

        # Encode the sentence (and the context, if given) as a single input pair.
        features = self.tokenizer(sentence, text_pair=context,
                                  padding="max_length", truncation=True, return_tensors='pt')
        outputs = self.model(**features.to(self.device), return_dict=True)
        # Pick the index of the highest logit and map it to its label.
        argmax = outputs.logits.argmax(dim=-1).detach().cpu().tolist()[0]
        return LABELS[argmax]
```
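The wrapper returns only the top label. When a confidence estimate is also useful, the raw logits can be turned into per-label probabilities with a softmax. The following is a minimal sketch in plain Python; `label_scores` and `top_label` are hypothetical helper names that are not part of the repository, and the sketch assumes the logits arrive in the same order as `LABELS`, as the wrapper's argmax lookup implies.

```python
import math
from typing import Dict, List

LABELS = ["Other", "Belief", "Perspective", "Feeling", "Experience",
          "Reflection", "Difficulty", "Intention", "Learning"]


def label_scores(logits: List[float]) -> Dict[str, float]:
    """Convert one row of raw logits into a label -> probability mapping."""
    if len(logits) != len(LABELS):
        raise ValueError(f"Expected {len(LABELS)} logits, got {len(logits)}")
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(LABELS, exps)}


def top_label(logits: List[float]) -> str:
    """Return the highest-scoring label, mirroring the argmax in the wrapper."""
    scores = label_scores(logits)
    return max(scores, key=scores.get)
```

Inside `predict_sentence`, one row of logits could be passed in as `outputs.logits[0].tolist()`.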
The wrapper can be used as follows:
```python
from tqdm import tqdm

classifier = NeuralClassifier(model_path="MU-NLPC/XLM-R-large-reflective-conf4",
                              uses_context=False,
                              device="cpu")

test_sentences = ["And one day I will be a real teacher and I will try to do the best I can for the children.",
                  "I felt really well!",
                  "gfagdhj gjfdjgh dg"]

y_pred = [classifier.predict_sentence(sentence) for sentence in tqdm(test_sentences)]

print(y_pred)
# ['Intention', 'Feeling', 'Other']
```
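The example above classifies sentences without context. For a checkpoint that expects context (`uses_context=True`), the wrapper's error message indicates that the `context` argument should contain the surrounding text, including the classified sentence itself. A rough sketch of how such a context string might be assembled from a report that is already split into sentences follows. The helper name `build_context`, the window size, and the whitespace join are all assumptions made for illustration and may differ from the format used in training; the repository is the authoritative reference.

```python
from typing import List


def build_context(sentences: List[str], index: int, window: int = 2) -> str:
    """Join the target sentence with up to `window` neighbours on each side.

    The window size and the space separator are illustrative assumptions.
    """
    if not 0 <= index < len(sentences):
        raise IndexError(f"index {index} is out of range for {len(sentences)} sentences")
    if window < 0:
        raise ValueError("window must be non-negative")
    # Clamp the window to the boundaries of the report.
    start = max(0, index - window)
    end = min(len(sentences), index + window + 1)
    return " ".join(sentences[start:end])
```

The result would then be passed along with the sentence, for example `classifier.predict_sentence(sentences[i], context=build_context(sentences, i))`.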
## Training Data
The model was trained on the [CEReD dataset](http://hdl.handle.net/11372/LRT-3573) and aims for the best possible performance in cross-lingual settings (on unseen languages).
See the reproducible training script in the project directory: https://github.com/EduMUNI/reflection-classification
## Citation
If you use the model in scientific work, please acknowledge our work as follows.
```bibtex
@Article{Nehyba2022applications,
author={Nehyba, Jan and {\v{S}}tef{\'a}nik, Michal},
title={Applications of deep language models for reflective writings},
journal={Education and Information Technologies},
year={2022},
month={Sep},
day={05},
issn={1573-7608},
doi={10.1007/s10639-022-11254-7},
url={https://doi.org/10.1007/s10639-022-11254-7}
}
```