add tips in README
README.md
CHANGED
@@ -28,9 +28,34 @@ This model is released under the [Creative Commons 4.0 International License](ht
 # download Manbyo-Dictionary
 
 mkdir -p /usr/local/lib/mecab/dic/userdic
-wget https://sociocom.jp/~data/2018-manbyo/data/MANBYO_201907_Dic-utf8.dic
+wget https://sociocom.jp/~data/2018-manbyo/data/MANBYO_201907_Dic-utf8.dic
+mv MANBYO_201907_Dic-utf8.dic /usr/local/lib/mecab/dic/userdic
 ```
 
+---
+
+**Note: If you don't have root privileges and find it difficult to download the Manbyo Dictionary to `/usr/local/lib/mecab/dic/userdic`, you can still load our model by overriding tokenizer settings as follows:**
+
+```bash
+# download Manbyo-Dictionary wherever you like
+
+wget https://sociocom.jp/~data/2018-manbyo/data/MANBYO_201907_Dic-utf8.dic
+mv MANBYO_201907_Dic-utf8.dic /anywhere/you/like
+```
+
+```python
+from transformers import AutoModelForMaskedLM, AutoTokenizer
+
+model = AutoModelForMaskedLM.from_pretrained("alabnii/jmedroberta-base-manbyo-wordpiece")
+tokenizer = AutoTokenizer.from_pretrained("alabnii/jmedroberta-base-manbyo-wordpiece", **{
+"mecab_kwargs": {
+"mecab_option": "-u /anywhere/you/like/MANBYO_201907_Dic-utf8.dic"
+}
+})
+```
+
+---
+
 **Input text must be converted to full-width characters(全角)in advance.**
 
 You can use this model for masked language modeling as follows:
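
The README's requirement that input be converted to full-width characters (全角) in advance can be handled with a small preprocessing step. The following is a minimal sketch using only the Python standard library. The helper name `to_full_width` is hypothetical and not part of the model repository, and the choice to widen the ASCII space to the ideographic space (U+3000) is an assumption, since the README does not say how spaces should be treated. Half-width katakana are not handled here.

```python
# Hypothetical helper (not from the model repository): widen half-width
# ASCII characters to their full-width (全角) counterparts.
# Printable ASCII U+0021..U+007E maps to U+FF01..U+FF5E (offset 0xFEE0).
_FULL_WIDTH_TABLE = {code: code + 0xFEE0 for code in range(0x21, 0x7F)}
# Assumption: the ASCII space is widened to the ideographic space U+3000.
_FULL_WIDTH_TABLE[0x20] = 0x3000


def to_full_width(text: str) -> str:
    """Return `text` with ASCII characters replaced by full-width forms."""
    return text.translate(_FULL_WIDTH_TABLE)


if __name__ == "__main__":
    # Characters that are already full-width pass through unchanged.
    print(to_full_width("CT scan 2019"))
```

If the text will contain a mask token, it is probably safer to widen the raw text first and insert the tokenizer's mask token afterwards, because this helper would also widen the brackets and letters of a literal `[MASK]`.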