vasudevgupta committed
Commit 74df642
1 Parent(s): bb0f874

Create README.md

Files changed (1): README.md +57 -0
README.md ADDED
---
language: en
license: apache-2.0
datasets:
- trivia_qa
---

# BigBird base trivia-itc

This model is a fine-tuned checkpoint of `bigbird-roberta-base`, trained on `trivia_qa` with `BigBirdForQuestionAnsweringHead` on top.

## How to use

Here is how to use this model for question answering in PyTorch:

```python
from transformers import BigBirdTokenizer, BigBirdForQuestionAnswering

tokenizer = BigBirdTokenizer.from_pretrained("google/bigbird-base-trivia-itc")

# by default it's in `block_sparse` mode with num_random_blocks=3, block_size=64
model = BigBirdForQuestionAnswering.from_pretrained("google/bigbird-base-trivia-itc")

# you can change `attention_type` to full attention like this:
model = BigBirdForQuestionAnswering.from_pretrained("google/bigbird-base-trivia-itc", attention_type="original_full")

# you can change `block_size` & `num_random_blocks` like this:
model = BigBirdForQuestionAnswering.from_pretrained("google/bigbird-base-trivia-itc", block_size=16, num_random_blocks=2)

question = "Replace me by any text you'd like."
context = "Put some context for answering."
encoded_input = tokenizer(question, context, return_tensors='pt')
output = model(**encoded_input)
```
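
The model returns start and end logits over the input tokens. Below is a minimal sketch (not part of the original card) of turning those logits into an answer string; it assumes the `model`, `tokenizer`, and `encoded_input` defined in the snippet above and uses a simple argmax heuristic rather than the exact decoding used for the reported TriviaQA results.

```python
import torch

# Run the forward pass without tracking gradients.
with torch.no_grad():
    outputs = model(**encoded_input)

# Take the most likely start/end positions and decode that span.
start_idx = int(torch.argmax(outputs.start_logits))
end_idx = int(torch.argmax(outputs.end_logits))

answer_ids = encoded_input["input_ids"][0, start_idx : end_idx + 1]
answer = tokenizer.decode(answer_ids, skip_special_tokens=True)
print(answer)
```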

## Fine-tuning config & hyper-parameters

- No. of global tokens = 128
- Window length = 192
- No. of random tokens = 192
- Max. sequence length = 4096
- No. of heads = 12
- No. of hidden layers = 12
- Hidden layer size = 768
- Batch size = 32
- Loss = cross-entropy over noisy spans

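With `block_size = 64`, the 192-token window corresponds to 3 sliding blocks and the 192 random tokens to `num_random_blocks = 3`, i.e. the defaults noted in the usage snippet above. Below is a hedged sketch (not the actual training configuration) of how these values map onto a `BigBirdConfig`:

```python
from transformers import BigBirdConfig

# Approximate mapping of the hyper-parameters above onto BigBirdConfig.
# The 128 global tokens come from the ITC block-sparse attention pattern
# itself and have no separate constructor argument here.
config = BigBirdConfig(
    attention_type="block_sparse",
    block_size=64,               # 192-token window = 3 blocks of 64
    num_random_blocks=3,         # 192 random tokens = 3 blocks of 64
    max_position_embeddings=4096,
    num_attention_heads=12,
    num_hidden_layers=12,
    hidden_size=768,
)
```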
## BibTeX entry and citation info

```tex
@misc{zaheer2021big,
      title={Big Bird: Transformers for Longer Sequences},
      author={Manzil Zaheer and Guru Guruganesh and Avinava Dubey and Joshua Ainslie and Chris Alberti and Santiago Ontanon and Philip Pham and Anirudh Ravula and Qifan Wang and Li Yang and Amr Ahmed},
      year={2021},
      eprint={2007.14062},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```