---
license: apache-2.0
---
# Model Card for XLM-Roberta-large-reflective-conf4

This is a reflectivity classification model trained to distinguish different types of reflectivity in the written reports of student teachers.

It was evaluated in a cross-lingual setting and was found to work well in languages other than English as well -- see the results in the referenced paper.
## Model Details

- **Repository:** https://github.com/EduMUNI/reflection-classification
- **Paper:** https://link.springer.com/article/10.1007/s10639-022-11254-7
- **Developed by:** Jan Nehyba & Michal Stefanik, Masaryk University
- **Model type:** XLM-RoBERTa-large
- **Finetuned from model:** [XLM-R-large](https://huggingface.co/xlm-roberta-large)
## Usage

To match the training format, it is best to use the prepared wrapper, which formats the classified sentence and its surrounding context the way the model expects:

```python
from typing import Optional

from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

# The nine reflectivity categories, in the order of the model's output logits.
LABELS = ["Other", "Belief", "Perspective", "Feeling", "Experience",
          "Reflection", "Difficulty", "Intention", "Learning"]


class NeuralClassifier:

    def __init__(self, model_path: str, uses_context: bool, device: str):
        self.config = AutoConfig.from_pretrained(model_path)
        self.device = device
        self.model = AutoModelForSequenceClassification.from_pretrained(model_path, config=self.config).to(device)
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.uses_context = uses_context

    def predict_sentence(self, sentence: str, context: Optional[str] = None) -> str:
        if context is None and self.uses_context:
            raise ValueError("You need to pass the context argument, including the sentence itself")

        # The optional context is encoded as the second segment of the input pair.
        features = self.tokenizer(sentence, text_pair=context,
                                  padding="max_length", truncation=True, return_tensors='pt')
        outputs = self.model(**features.to(self.device), return_dict=True)
        # Pick the highest-scoring class and map it back to its label string.
        argmax = outputs.logits.argmax(dim=-1).detach().cpu().tolist()[0]
        return LABELS[argmax]
```
50
+
51
+ The wrapper can be used as follows:
52
+ ```python
53
+ classifier = NeuralClassifier(model_path="MU-NLPC/XLM-R-large-reflective-conf4",
54
+ uses_context=False,
55
+ device="cpu")
56
+
57
+ test_sentences = ["And one day I will be a real teacher and I will try to do the best I can for the children.",
58
+ "I felt really well!",
59
+ "gfagdhj gjfdjgh dg"]
60
+
61
+ y_pred = [classifier.predict_sentence(sentence) for sentence in tqdm(test_sentences)]
62
+
63
+ print(y_pred)
64
+
65
+ >>> ['Intention', 'Feeling', 'Other']
66
+ ```
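If you work with a checkpoint that was trained with context (`uses_context=True`), the wrapper passes the surrounding passage to the tokenizer as the second segment of the input pair. The following is a minimal sketch of that call path, assuming such a checkpoint -- the passage below is invented for illustration:

```python
# Hedged sketch: context-aware prediction. Assumes a checkpoint trained with
# context; the passage is an invented example, not data from the paper.
classifier_ctx = NeuralClassifier(model_path="MU-NLPC/XLM-R-large-reflective-conf4",
                                  uses_context=True,
                                  device="cpu")

sentence = "I felt really well!"
# Per the ValueError in the wrapper, the context must include the classified
# sentence itself.
context = ("Yesterday we taught our first full lesson. I felt really well! "
           "The pupils responded better than I had expected.")

print(classifier_ctx.predict_sentence(sentence, context=context))
```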
### Training Data

The model was trained on the [CEReD dataset](http://hdl.handle.net/11372/LRT-3573) with the goal of the best possible performance in cross-lingual settings (i.e., on languages unseen in training).

See the reproducible training script in the project repository: https://github.com/EduMUNI/reflection-classification
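For a quick check on your own labeled data, the snippet below computes a macro-averaged F1 score with scikit-learn. It is a minimal sketch, not code from the original repository: the CSV file name and its `sentence`/`label` columns are assumptions.

```python
# Minimal evaluation sketch (assumed file and column names, not from the repo).
import pandas as pd
from sklearn.metrics import f1_score

data = pd.read_csv("reflective_test_set.csv")  # hypothetical labeled test set

classifier = NeuralClassifier(model_path="MU-NLPC/XLM-R-large-reflective-conf4",
                              uses_context=False,
                              device="cpu")

y_true = data["label"].tolist()
y_pred = [classifier.predict_sentence(s) for s in data["sentence"]]

# Macro-averaging weights all nine reflectivity categories equally.
print(f1_score(y_true, y_pred, average="macro"))
```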
## Citation

If you use this model in scientific work, please cite it as follows:

```bibtex
@Article{Nehyba2022applications,
  author={Nehyba, Jan and {\v{S}}tef{\'a}nik, Michal},
  title={Applications of deep language models for reflective writings},
  journal={Education and Information Technologies},
  year={2022},
  month={Sep},
  day={05},
  issn={1573-7608},
  doi={10.1007/s10639-022-11254-7},
  url={https://doi.org/10.1007/s10639-022-11254-7}
}
```