Commit 52df02d by thatdramebaazguy · 1 Parent(s): 669cc30

Update README.md

Files changed (1)
  1. README.md +52 -40
README.md CHANGED
@@ -1,4 +1,56 @@
  ---
  language:
  - English
  -
@@ -11,44 +63,4 @@ license:
  datasets:
  - wikimovies
  -
- metrics:
- -
- -
  ---
-
- # MyModelName
-
- ## Model description
-
- You can embed local or remote images using `![](...)`
-
- ## Intended uses & limitations
-
- #### How to use
-
- ```python
- # You can include sample code which will be formatted
- ```
-
- #### Limitations and bias
-
- Provide examples of latent issues and potential remediations.
-
- ## Training data
-
- Describe the data you used to train the model.
- If you initialized it with pre-trained weights, add a link to the pre-trained model card or repository with description of the pre-training data.
-
- ## Training procedure
-
- Preprocessing, hardware used, hyperparameters...
-
- ## Eval results
-
- ### BibTeX entry and citation info
-
- ```bibtex
- @inproceedings{...,
- year={2020}
- }
- ```
 
  ---
+ datasets:
+ - wikimovies
+ license: cc-by-4.0
+ ---
+ # roberta-base for MLM
+
+ ```
+ model_name = "thatdramebaazguy/roberta-base-wikimovies"
+ pipeline(model=model_name, tokenizer=model_name, revision="v1.0", task="Fill-Mask")
+ ```
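
The two-line snippet added above omits its import and passes the task as `"Fill-Mask"`; in the `transformers` pipeline registry the task identifier is the lowercase `"fill-mask"`. The following is a minimal runnable sketch, not the author's script. It assumes `transformers` and a backend such as PyTorch are installed, that the model repository is reachable, and that the `v1.0` revision from the snippet exists. The helper names `mask_word`, `build_fill_mask`, and `demo`, as well as the example sentence, are hypothetical and introduced only for illustration.

```python
def mask_word(sentence: str, word: str, mask_token: str = "<mask>") -> str:
    """Replace the first occurrence of `word` with the mask token.

    RoBERTa tokenizers use "<mask>"; pass `tokenizer.mask_token`
    instead if the model in use might differ.
    """
    if word not in sentence:
        raise ValueError(f"{word!r} does not occur in the sentence")
    return sentence.replace(word, mask_token, 1)


def build_fill_mask(model_name: str = "thatdramebaazguy/roberta-base-wikimovies",
                    revision: str = "v1.0"):
    """Build a fill-mask pipeline (downloads the model on first use)."""
    # Imported lazily so the pure helper above works without transformers.
    from transformers import pipeline

    return pipeline(
        task="fill-mask",  # lowercase task identifier
        model=model_name,
        tokenizer=model_name,
        revision=revision,  # assumption: this tag exists in the repository
    )


def demo() -> None:
    """Print the top predictions for one masked sentence (needs network)."""
    fill_mask = build_fill_mask()
    prompt = mask_word("The movie was directed by Christopher Nolan.", "directed")
    for candidate in fill_mask(prompt):
        print(candidate["token_str"], round(candidate["score"], 4))
```

Calling `demo()` downloads the weights and prints one candidate token per line with its score. The import is kept inside `build_fill_mask` so that `mask_word` can be used and tested without the library installed.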
+ ## Overview
+ **Language model:** roberta-base
+ **Language:** English
+ **Downstream-task:** Fill-Mask
+ **Training data:** wikimovies
+ **Eval data:** wikimovies
+ **Code:** See [example](https://github.com/adityaarunsinghal/Domain-Adaptation/blob/master/shell_scripts/train_movie_roberta.sh)
+ **Infrastructure**: 2x Tesla v100
+
+ ## Hyperparameters
+ ```
+ num_examples = 4346
+ batch_size = 16
+ n_epochs = 3
+ base_LM_model = "roberta-base"
+ learning_rate = 5e-05
+ max_query_length=64
+ Gradient Accumulation steps = 1
+ Total optimization steps = 816
+ evaluation_strategy=IntervalStrategy.NO
+ prediction_loss_only=False
+ per_device_train_batch_size=8
+ per_device_eval_batch_size=8
+ adam_beta1=0.9
+ adam_beta2=0.999
+ adam_epsilon=1e-08,
+ max_grad_norm=1.0
+ lr_scheduler_type=SchedulerType.LINEAR
+ warmup_ratio=0.0
+ seed=42
+ eval_steps=500
+ metric_for_best_model=None
+ greater_is_better=None
+ label_smoothing_factor=0.0
+ ```
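
The figures in the hyperparameter block above are mutually consistent, and a quick check makes that visible. With `per_device_train_batch_size=8` on 2 GPUs the effective batch size is 16. At that batch size, 4346 examples give 272 steps per epoch, hence 816 optimization steps over 3 epochs. This is a sketch under two assumptions: that the last partial batch of each epoch is kept (rounding up), and that both GPUs listed under Infrastructure were used. The function name `total_optimization_steps` is hypothetical.

```python
import math


def total_optimization_steps(num_examples: int,
                             per_device_batch_size: int,
                             num_devices: int,
                             grad_accum_steps: int,
                             n_epochs: int) -> int:
    """Optimizer updates for a run, assuming the last partial batch is kept."""
    effective_batch = per_device_batch_size * num_devices * grad_accum_steps
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * n_epochs


steps = total_optimization_steps(
    num_examples=4346,
    per_device_batch_size=8,
    num_devices=2,        # assumption: both V100s were used
    grad_accum_steps=1,
    n_epochs=3,
)
print(steps)  # 816
```

The result matches the reported `Total optimization steps = 816`, which also suggests that `batch_size = 16` is the total across devices rather than a per-device value.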
+ ## Performance
+
+ perplexity = 4.3808
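
For a language model, perplexity is commonly reported as the exponential of the mean cross-entropy loss on the evaluation set. The commit does not say how the figure above was computed, so the following is only a sketch of that usual relationship. Under that assumption, a perplexity of 4.3808 corresponds to a mean loss of roughly 1.477 nats per predicted token. The function names are hypothetical.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Perplexity is exp of the mean cross-entropy (natural log units)."""
    return math.exp(mean_cross_entropy)


def loss_from_perplexity(perplexity: float) -> float:
    """Inverse of the relationship above."""
    return math.log(perplexity)


implied_loss = loss_from_perplexity(4.3808)
print(round(implied_loss, 3))  # 1.477
```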
+
+ Some of my work:
+ - [Domain-Adaptation Project](https://github.com/adityaarunsinghal/Domain-Adaptation/)
+ ---
  language:
  - English
  -

  datasets:
  - wikimovies
  -
  ---