---
license: mit
---

# mPMR: A Multilingual Pre-trained Machine Reader at Scale

Multilingual Pre-trained Machine Reader (mPMR) is a multilingual extension of PMR.
mPMR is pre-trained on 18 million Machine Reading Comprehension (MRC) examples constructed from Wikipedia hyperlinks.
It was introduced in the paper *mPMR: A Multilingual Pre-trained Machine Reader at Scale* by
Weiwen Xu, Xin Li, Wai Lam, and Lidong Bing,
and first released in [this repository](https://github.com/DAMO-NLP-SG/PMR).

This model is initialized with xlm-roberta-large and then continually pre-trained with an MRC objective.

## Model description

The model is pre-trained on distantly labeled data using a learning objective called Wiki Anchor Extraction (WAE).
Specifically, we constructed a large volume of general-purpose and high-quality MRC-style training data based on Wikipedia anchors (i.e., hyperlinked texts).
For each Wikipedia anchor, we composed a pair of correlated articles.
One side of the pair is the Wikipedia article that contains detailed descriptions of the hyperlinked entity, which we defined as the definition article.
The other side of the pair is the article that mentions the specific anchor text, which we defined as the mention article.
We composed an MRC-style training instance in which the anchor is the answer,
the passage surrounding the anchor in the mention article is the context, and the definition of the anchor entity in the definition article is the query.
Based on the above data, we then introduced a novel WAE problem as the pre-training task of mPMR.
In this task, mPMR determines whether the context and the query are relevant.
If so, mPMR extracts the answer from the context that satisfies the query description.
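
For illustration only (the texts below are invented for this card, not taken from the actual pre-training corpus), a single WAE instance pairs a definition-article query with a mention-article context, and the anchor span is the answer:

```python
# Hypothetical WAE-style pre-training instance (illustrative values only).
wae_instance = {
    # Query: the definition of the anchor entity, taken from its definition article.
    "query": "Silicon is a chemical element with the symbol Si and atomic number 14.",
    # Context: a passage from a mention article that contains the anchor text.
    "context": "Most integrated circuits are fabricated on wafers made of silicon.",
    # Answer: the anchor (hyperlinked text) inside the context.
    "answer": "silicon",
    # WAE also pairs queries with contexts that are NOT linked, labeled as irrelevant.
    "relevant": True,
}
```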

During fine-tuning, we unified downstream NLU tasks under our MRC formulation. These tasks typically fall into four categories:
1. span extraction with pre-defined labels (e.g., NER), in which each task label is treated as a query used to search for the corresponding answers in the input text (context);
2. span extraction with natural questions (e.g., EQA), in which the question is treated as the query for answer extraction from the given passage (context);
3. sequence classification with pre-defined task labels (e.g., sentiment analysis), in which each task label is used as a query for the input text (context); and
4. sequence classification with natural questions over multiple choices (e.g., multi-choice QA, MCQA), in which the concatenation of the question and one choice is treated as the query for the given passage (context).

In the output space, we tackle span extraction problems by predicting the probability of each context span being the answer,
and sequence classification problems by conducting relevance classification on [CLS] (extracting [CLS] if it is relevant). A sketch of this unified formulation is given below.
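
The sketch below is illustrative only: the label names, questions, and texts are made up, and the exact input construction used by the official repo may differ. It simply shows how the four task types above share one query-context interface:

```python
# Illustrative only: casting different NLU tasks into the unified
# query/context MRC format described above. All texts and labels are invented.

def to_mrc_inputs(task, example):
    """Return a list of (query, context) pairs for one downstream example."""
    if task == "ner":
        # (1) One query per pre-defined label; answers are spans in the context.
        return [(f"Find all {label} entities.", example["text"])
                for label in example["labels"]]
    if task == "eqa":
        # (2) The natural question itself is the query.
        return [(example["question"], example["passage"])]
    if task == "sentiment":
        # (3) One query per class label; relevance is classified on [CLS].
        return [(f"The sentiment of the text is {label}.", example["text"])
                for label in example["labels"]]
    if task == "mcqa":
        # (4) The question concatenated with each choice forms the query.
        return [(example["question"] + " " + choice, example["passage"])
                for choice in example["choices"]]
    raise ValueError(f"unknown task: {task}")

pairs = to_mrc_inputs("ner", {"text": "Lionel Messi plays for Inter Miami.",
                              "labels": ["person", "organization"]})
```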

## Model variations

Two versions of the model have been released:

| Model | Backbone | #params |
|------------|-----------|----------|
| [mPMR-base](https://huggingface.co/DAMO-NLP-SG/mPMR-base) | [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) | 270M |
| [mPMR-large](https://huggingface.co/DAMO-NLP-SG/mPMR-large) | [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) | 550M |

## Intended uses & limitations

The models need to be fine-tuned on downstream task data. During fine-tuning, no task-specific layer is required.

### How to use

You can try the code from [this repo](https://github.com/DAMO-NLP-SG/mPMR).
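
The task-specific extraction head and fine-tuning scripts live in the repo above. As a minimal sketch (assuming only that the checkpoint loads as a standard XLM-R encoder via 🤗 Transformers), you can load the backbone and encode a query-context pair like this:

```python
# Minimal loading sketch. This only loads the Transformer backbone; the MRC
# extractor head and fine-tuning logic are provided in the official repo.
from transformers import AutoTokenizer, AutoModel

model_name = "DAMO-NLP-SG/mPMR-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)  # may warn about unused head weights

query = "Find all person entities."               # illustrative query
context = "Weiwen Xu and Xin Li proposed mPMR."   # illustrative context
inputs = tokenizer(query, context, return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, hidden_size)
```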

### BibTeX entry and citation info

```bibtex
@article{xu2022clozing,
  title={From Clozing to Comprehending: Retrofitting Pre-trained Language Model to Pre-trained Machine Reader},
  author={Xu, Weiwen and Li, Xin and Zhang, Wenxuan and Zhou, Meng and Bing, Lidong and Lam, Wai and Si, Luo},
  journal={arXiv preprint arXiv:2212.04755},
  year={2022}
}
@inproceedings{xu2022mpmr,
  title = "mPMR: A Multilingual Pre-trained Machine Reader at Scale",
  author = "Xu, Weiwen and
    Li, Xin and
    Lam, Wai and
    Bing, Lidong",
  booktitle = "The 61st Annual Meeting of the Association for Computational Linguistics",
  year = "2023"
}
```