ayethuzar committed on
Commit
c3dd1c6
1 Parent(s): 24a763d

Update README.md

  1. README.md +11 -0
README.md CHANGED
@@ -22,6 +22,17 @@ App Demonstration Video:
22
 
23
  Dataset: https://github.com/suzgunmirac/hupd
24
 
25
+ **Data Preprocessing**
26
+
27
+ I used the `load_dataset` function to load all patent applications filed with the USPTO in January 2016. The training set covers filings from January 1-21, 2016, and the validation set covers filings from January 22-31, 2016. This is a small subset of the full dataset.
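
As a rough illustration of the loading step described above, the sketch below collects the date ranges into keyword arguments for `load_dataset`. The dataset id `HUPD/hupd`, the `sample` configuration name, the `data_files` URL, and the `*_filing_*_date` parameter names are assumptions taken from the public HUPD dataset card, not from this README; the helper names `hupd_january_2016_kwargs` and `load_hupd_january_2016` are hypothetical.

```python
def hupd_january_2016_kwargs():
    """Keyword arguments for the January 2016 split (assumed HUPD loader parameters)."""
    return {
        "name": "sample",  # assumption: the small January 2016 configuration
        "data_files": (
            "https://huggingface.co/datasets/HUPD/hupd/blob/main/"
            "hupd_metadata_2022-02-22.feather"  # assumption: metadata file from the dataset card
        ),
        "icpr_label": None,
        "train_filing_start_date": "2016-01-01",
        "train_filing_end_date": "2016-01-21",
        "val_filing_start_date": "2016-01-22",
        "val_filing_end_date": "2016-01-31",
    }


def load_hupd_january_2016():
    """Download the data and return a dict with 'train' and 'validation' splits."""
    # Imported lazily so the kwargs helper can be used without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("HUPD/hupd", **hupd_january_2016_kwargs())
```

Calling `load_hupd_january_2016()` needs network access and the `datasets` package; depending on the installed version, loading a script-based dataset may require extra options.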
28
+
29
+ There are two datasets: train and validation. I applied the following steps to each:
30
+
31
+ - Create a label-to-index mapping for the decision status field
32
+ - Map the 'abstract' and 'claims' sections
33
+ - Format them
34
+ - Use DataLoader with batch_size = 16
35
+
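
The steps above could be sketched roughly as follows. This is a minimal sketch, not the notebook's actual code: the six decision labels and their indices are an assumption based on the HUPD example notebooks, "map" is assumed to mean applying a Hugging Face tokenizer via `Dataset.map`, "format" is assumed to mean `set_format(type="torch", ...)`, and the names `DECISION_TO_INDEX`, `map_decision_to_index`, and `build_loader` are hypothetical.

```python
# Assumed label set and ordering (following the HUPD example notebooks).
DECISION_TO_INDEX = {
    "REJECTED": 0,
    "ACCEPTED": 1,
    "PENDING": 2,
    "CONT-REJECTED": 3,
    "CONT-ACCEPTED": 4,
    "CONT-PENDING": 5,
}


def map_decision_to_index(example):
    """Replace the string label in 'decision' with its integer index."""
    return {"decision": DECISION_TO_INDEX[example["decision"]]}


def build_loader(split, tokenizer, section="abstract", batch_size=16):
    """Apply the preprocessing steps to one split and wrap it in a DataLoader.

    `split` is a datasets.Dataset (train or validation), `tokenizer` is a
    Hugging Face tokenizer, and `section` is 'abstract' or 'claims'.
    """
    # Imported lazily so the label mapping can be used without torch installed.
    from torch.utils.data import DataLoader

    # Step 1: label-to-index mapping for the decision status field.
    split = split.map(map_decision_to_index)
    # Step 2: tokenize the chosen section in batches.
    split = split.map(
        lambda batch: tokenizer(batch[section], truncation=True, padding="max_length"),
        batched=True,
    )
    # Step 3: expose only the model inputs and the label, as torch tensors.
    split.set_format(type="torch", columns=["input_ids", "attention_mask", "decision"])
    # Step 4: batch the examples.
    return DataLoader(split, batch_size=batch_size)
```

With this helper, the same call would be made once per split and once per section, for example `build_loader(train_set, tokenizer, section="claims")`, where `train_set` and `tokenizer` are whatever the notebook already defines.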
36
  **milestone3:**
37
 
38
  milestone3 notebook: