Update README.md
README.md
CHANGED
@@ -22,6 +22,17 @@ App Demonstration Video:
Dataset: https://github.com/suzgunmirac/hupd

**Data Preprocessing**

I used the `load_dataset` function to load all patent applications filed with the USPTO in January 2016. The training set covers filing dates January 1-21, 2016, and the validation set covers January 22-31, 2016. This is a small subset of the full dataset.
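
As a rough illustration of the date-based split described above, the sketch below assigns a filing date to the training or validation range. The constants and the `assign_split` helper are hypothetical names used only for illustration; the notebook relies on `load_dataset` to apply these ranges rather than on code like this.

```python
from datetime import date

# Date ranges taken from the text above; the names are illustrative only.
TRAIN_RANGE = (date(2016, 1, 1), date(2016, 1, 21))
VAL_RANGE = (date(2016, 1, 22), date(2016, 1, 31))


def assign_split(filing_date):
    """Return 'train', 'validation', or None for a given filing date."""
    if TRAIN_RANGE[0] <= filing_date <= TRAIN_RANGE[1]:
        return "train"
    if VAL_RANGE[0] <= filing_date <= VAL_RANGE[1]:
        return "validation"
    return None  # outside January 2016, so not part of either split


print(assign_split(date(2016, 1, 15)))
print(assign_split(date(2016, 1, 25)))
```

Both ranges are inclusive, so an application filed on January 21 falls in the training set and one filed on January 22 falls in the validation set.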

There are two splits: train and validation. I applied the following steps to each:

- Create a label-to-index mapping for the decision status field
- Map the 'abstract' and 'claims' sections
- Format them
- Use DataLoader with batch_size = 16
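
The steps above can be sketched in plain Python. This is a minimal stand-in, not the notebook's code: the label names, the `decision` field name, and the helpers `encode_decision` and `iter_batches` are assumptions made for illustration, and the batching loop only imitates what `DataLoader` does with `batch_size = 16`.

```python
# Assumed label set and field names; adjust to the actual dataset schema.
DECISION_TO_INDEX = {
    "REJECTED": 0,
    "ACCEPTED": 1,
    "PENDING": 2,
}


def encode_decision(example):
    """Return a copy of the record with the decision replaced by its index."""
    encoded = dict(example)
    encoded["decision"] = DECISION_TO_INDEX[example["decision"]]
    return encoded


def iter_batches(records, batch_size=16):
    """Yield consecutive lists of at most batch_size records."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# Toy records standing in for patent applications.
records = [
    {
        "abstract": f"abstract {i}",
        "claims": f"claims {i}",
        "decision": "ACCEPTED" if i % 2 else "REJECTED",
    }
    for i in range(40)
]

encoded = [encode_decision(r) for r in records]
batches = list(iter_batches(encoded, batch_size=16))
print([len(b) for b in batches])  # 40 records -> batches of 16, 16, 8
```

`DataLoader` additionally handles shuffling and collating samples into tensors, which this sketch omits.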

**milestone3:**

milestone3 notebook: