OpenNLPLab committed
Commit 4b9f8cb
1 Parent(s): cc13b21

Create README.md

Files changed (1):
  1. README.md +201 -0

README.md ADDED:

---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- HGRN
- Recurrent Neural Network
---

<div align="center">
<h1>
HGRN - Hierarchically Gated Recurrent Neural Network for Sequence Modeling
</h1>
</div>

<p align="center">
💻 <a href="https://github.com/OpenNLPLab/HGRN" target="_blank">GitHub</a>
</p>
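
Since the card is tagged `pipeline_tag: text-generation`, a quick-start sketch may be helpful. It is not taken from the original card: it assumes the checkpoint ships remote modeling code that loads through the Transformers Auto classes, and the repo id below is a placeholder for this model.

```python
# Minimal generation sketch (assumption: this repo supports loading with
# trust_remote_code via the Transformers Auto classes; the repo id is a placeholder).
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "OpenNLPLab/hgrn"  # placeholder: replace with this model's actual repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

inputs = tokenizer("Hierarchically gated recurrent networks", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```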

- [Overall Architecture](#overall-architecture)
- [Experiments](#experiments)
  - [Environment Preparation](#environment-preparation)
    - [Env1](#env1)
    - [Env2](#env2)
  - [Autoregressive language model](#autoregressive-language-model)
    - [1) Preprocess the data](#1-preprocess-the-data)
    - [2) Train the autoregressive language model](#2-train-the-autoregressive-language-model)
  - [Image modeling](#image-modeling)
  - [LRA](#lra)
    - [1) Preparation](#1-preparation)
    - [2) Training](#2-training)
- [Standalone code](#standalone-code)


## Overall Architecture

The overall network architecture is as follows:

<div align="center"> <img src="./hgrn.png" width="100%" height="100%" alt="network" align=center /></div>
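
At the core of each block is a gated linear recurrence (the HGRU token mixer). The following is a rough, real-valued sketch written for this card, not the reference implementation: it ignores the output gate and the complex-valued components, and the forget-gate lower bound `gamma` (which in HGRN grows with layer depth, producing the hierarchical gating) is passed in as a plain tensor. See [hgru-pytorch](https://github.com/Doraemonzzz/hgru-pytorch) for the real code.

```python
import torch


def hgru_recurrence_sketch(x, w_f, w_i, gamma):
    """Simplified sketch of a lower-bounded gated linear recurrence.

    x: (seq_len, dim) inputs; w_f, w_i: (dim, dim) projections for the forget
    gate and the candidate input; gamma: (dim,) lower bound on the forget gate.
    """
    seq_len, dim = x.shape
    h = torch.zeros(dim)
    outputs = []
    for t in range(seq_len):
        # Forget gate constrained to [gamma, 1]; the input gate is its complement.
        f = gamma + (1 - gamma) * torch.sigmoid(x[t] @ w_f)
        c = torch.tanh(x[t] @ w_i)  # candidate state
        h = f * h + (1 - f) * c     # elementwise-gated linear recurrence
        outputs.append(h)
    return torch.stack(outputs)


# Toy usage with random projections.
x = torch.randn(8, 16)
y = hgru_recurrence_sketch(
    x, torch.randn(16, 16), torch.randn(16, 16), torch.full((16,), 0.5)
)
print(y.shape)  # torch.Size([8, 16])
```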

## Experiments

### Environment Preparation

Our experiments use two conda environments: autoregressive language modeling uses the environment described in the Env1 section, while LRA uses the environment described in the Env2 section.

#### Env1

First, build the conda environment from the yaml file:

```
conda env create --file env1.yaml
```

If you hit an error while installing torch, remove torch and torchvision from the yaml file, rerun the command above, and then run the following commands:

```
conda activate hgrn
wget https://download.pytorch.org/whl/cu111/torch-1.8.1%2Bcu111-cp36-cp36m-linux_x86_64.whl
pip install torch-1.8.1+cu111-cp36-cp36m-linux_x86_64.whl
pip install -r requirements_hgrn.txt
```
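
After installing torch, it can be worth confirming that the CUDA build is actually the one active in the environment before moving on. This check is not part of the original instructions, just a generic sanity check:

```python
# Sanity check: the wheel installed above should report 1.8.1+cu111
# and see the GPU on a CUDA machine.
import torch

print(torch.__version__)
print(torch.cuda.is_available())
```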

Then, install `hgru-pytorch`:

```
conda activate hgrn
cd hgru-pytorch
pip install .
```

Finally, install our version of fairseq:

```
cd fairseq
pip install --editable ./
```


#### Env2

Build the conda environment from the yaml file:

```
conda env create --file env2.yaml
```

If you encounter difficulties setting up the environment, you can create the conda environment first and then install the pip packages with the following commands:

```
pip install torch==1.10.0+cu111 torchvision==0.11.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements_lra.txt
```

Finally, install `hgru-pytorch`:

```
conda activate lra
cd hgru-pytorch
pip install .
```


### Autoregressive language model

#### 1) Preprocess the data

First download the [WikiText-103 dataset](https://www.salesforce.com/products/einstein/ai-research/the-wikitext-dependency-language-modeling-dataset/):

```
wget https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-103-raw-v1.zip
unzip wikitext-103-raw-v1.zip
```

Next, encode it with the GPT-2 BPE:

```
mkdir -p gpt2_bpe
wget -O gpt2_bpe/encoder.json https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/encoder.json
wget -O gpt2_bpe/vocab.bpe https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/vocab.bpe
for SPLIT in train valid test; do \
    python -m examples.roberta.multiprocessing_bpe_encoder \
        --encoder-json gpt2_bpe/encoder.json \
        --vocab-bpe gpt2_bpe/vocab.bpe \
        --inputs wikitext-103-raw/wiki.${SPLIT}.raw \
        --outputs wikitext-103-raw/wiki.${SPLIT}.bpe \
        --keep-empty \
        --workers 60; \
done
```

Finally, preprocess/binarize the data using the GPT-2 fairseq dictionary:

```
wget -O gpt2_bpe/dict.txt https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/dict.txt
fairseq-preprocess \
    --only-source \
    --srcdict gpt2_bpe/dict.txt \
    --trainpref wikitext-103-raw/wiki.train.bpe \
    --validpref wikitext-103-raw/wiki.valid.bpe \
    --testpref wikitext-103-raw/wiki.test.bpe \
    --destdir data-bin/wikitext-103 \
    --workers 60
```

This step comes from [fairseq](https://github.com/facebookresearch/fairseq/blob/main/examples/roberta/README.pretraining.md).


#### 2) Train the autoregressive language model

Use the following command to train the language model:

```
bash script_alm.sh
```

You should change `data_dir` in the script to point to the preprocessed data (e.g., `data-bin/wikitext-103`).


### Image modeling

Use the following command to run the image modeling experiments:

```
bash script_im.sh
```


### LRA

#### 1) Preparation

Download the codebase:

```
git clone https://github.com/OpenNLPLab/lra.git
```

Download the data:

```
wget https://storage.googleapis.com/long-range-arena/lra_release.gz
mv lra_release.gz lra_release.tar.gz
tar -xvf lra_release.tar.gz
```

#### 2) Training

Use the following script to run the experiments. Before running, change `PREFIX` to your lra path and set `tasks` to the specific task you want to run (a hypothetical illustration of this edit follows the command):

```
python script_lra.py
```
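
The exact contents of `script_lra.py` are defined in the lra codebase cloned above; purely as a hypothetical illustration of the edit described, the two variables might look like this (the values shown are placeholders, not the script's defaults):

```python
# Hypothetical illustration only: edit these variables inside script_lra.py.
PREFIX = "/path/to/lra_release"  # root directory of the extracted LRA data
tasks = ["cifar"]                # pick the specific LRA task(s) to run
```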


## Standalone code

See [hgru-pytorch](https://github.com/Doraemonzzz/hgru-pytorch).