Update README.md
README.md
CHANGED
@@ -1,3 +1,49 @@
---
license: mit
language:
- ru
metrics:
- f1
library_name: transformers
tags:
- russian
- conversation
- chats
- embeddings
- coherence
---

# Model Card

This model predicts whether two messages from a multi-participant group chat can be linked by a `reply_to` relationship.

# Training details

The model is based on [Conversational RuBERT](https://docs.deeppavlov.ai/en/master/features/models/bert.html) (cased, 12-layer, 768-hidden, 12-heads, 180M parameters), which was trained on several social media datasets. We fine-tuned it on data from several Telegram chats. Positive `reply_to` examples come from natural user annotation (replies made by the chat participants themselves). Negative examples were obtained by shuffling the messages.
Because the task maps directly onto Next Sentence Prediction (NSP), the model was fine-tuned with the NSP objective. See the [paper](https://www.dialog-21.ru/media/5871/buyanoviplusetal046.pdf) for more details.
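
The way positive and negative pairs are built can be sketched as follows. This is only an illustration under assumptions: the message layout (`id`, `text`, `reply_to` keys) and the helper name `build_nsp_pairs` are hypothetical, and the exact shuffling procedure used for the released model is described in the paper, not here.

```python
import random


def build_nsp_pairs(messages, seed=0):
    """Return (first_text, second_text, label) triples; 0 = reply, 1 = not a reply."""
    rng = random.Random(seed)
    by_id = {m["id"]: m for m in messages}
    pairs = []
    for msg in messages:
        parent_id = msg.get("reply_to")
        if parent_id is None or parent_id not in by_id:
            continue
        # Positive example: the message the user actually replied to.
        pairs.append((by_id[parent_id]["text"], msg["text"], 0))
        # Negative example: a random other message stands in for the parent.
        candidates = [m for m in messages if m["id"] not in (parent_id, msg["id"])]
        if candidates:
            wrong_parent = rng.choice(candidates)
            pairs.append((wrong_parent["text"], msg["text"], 1))
    return pairs


# Hypothetical toy chat (English stand-ins for real Telegram messages).
chat = [
    {"id": 1, "text": "Where can I get a SNILS?", "reply_to": None},
    {"id": 2, "text": "Where should I send this letter?", "reply_to": None},
    {"id": 3, "text": "You can get it at the MFC", "reply_to": 1},
]
pairs = build_nsp_pairs(chat)
for first, second, label in pairs:
    print(label, "|", first, "->", second)
```

Labels follow the NSP convention used by this model: `0` for a true reply pair, `1` for a shuffled one.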

# Usage

**Note:** if two messages have a `reply_to` relationship, **they get label `0`**, not `1`. This follows from the NSP formulation, in which label 0 marks the second sequence as a continuation of the first.
```python
import torch
from transformers import AutoTokenizer, BertForNextSentencePrediction

tokenizer = AutoTokenizer.from_pretrained("rubert_reply_recovery")
model = BertForNextSentencePrediction.from_pretrained("rubert_reply_recovery")
model.eval()

# Candidate parent messages and candidate replies, paired by position.
first = ["Где можно получить СНИЛС?", "Я тут уже много лет"]  # "Where can I get a SNILS?", "I've been here for many years"
second = ["Можете в МФЦ", "Куда отправить это письмо?"]  # "You can at the MFC", "Where should I send this letter?"

inputs = tokenizer(first, second, return_tensors="pt",
                   truncation=True, max_length=512, padding="max_length")
with torch.no_grad():  # inference only, no gradients needed
    output = model(**inputs)
# Label 0 = reply, label 1 = not a reply.
print(output.logits.argmax(dim=1))
# tensor([0, 1])
```
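
When several earlier messages are candidate parents for one reply, the hard `argmax` label can be replaced by a reply probability and the candidates ranked by it. The sketch below is an assumption-laden illustration, not part of the original card: it works on plain Python lists (for example the result of `output.logits.tolist()`), the helper names `reply_probability` and `most_likely_parent` are hypothetical, and the logit values are made up.

```python
import math


def reply_probability(logit_pair):
    """Softmax probability of label 0 (reply) from a [logit_0, logit_1] pair."""
    z0, z1 = logit_pair
    m = max(z0, z1)  # subtract the max for numerical stability
    e0, e1 = math.exp(z0 - m), math.exp(z1 - m)
    return e0 / (e0 + e1)


def most_likely_parent(logits):
    """Return (index, probability) of the candidate most likely to be the parent."""
    probs = [reply_probability(pair) for pair in logits]
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


# Made-up logits for three candidate parents paired with the same reply.
example_logits = [[2.0, -1.0], [-0.5, 1.5], [0.3, 0.1]]
best_index, best_prob = most_likely_parent(example_logits)
print(best_index, round(best_prob, 3))
```

Ranking by probability rather than thresholding each pair independently guarantees at most one parent per reply, which may or may not match how the model was evaluated in the paper.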

# Citation

```bibtex
@article{Buyanov2023WhoIA,
  title={Who is answering to whom? Modeling reply-to relationships in Russian asynchronous chats},
  author={Igor Buyanov and Darya Yaskova and Ilya Sochenkov},
  journal={Computational Linguistics and Intellectual Technologies},
  year={2023}
}
```