tsurubee committed on
Commit
1ad26c6
1 Parent(s): 9bb7c0a

Update README.md

Files changed (1)
  1. README.md +33 -3
README.md CHANGED
@@ -1,3 +1,33 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+
+ ## VHHBERT
+
+ VHHBERT is a RoBERTa-based model pre-trained on two million VHH sequences in [VHHCorpus-2M](https://huggingface.co/datasets/COGNANO/VHHCorpus-2M).
+ VHHBERT has the same model architecture as RoBERTa<sub>BASE</sub>, except that it uses positional embeddings with a length of 185 to cover the maximum sequence length of 179 in VHHCorpus-2M.
+ Further details on VHHBERT are described in our paper "A SARS-CoV-2 Interaction Dataset and VHH Sequence Corpus for Antibody Language Models."
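
The relationship between the 185-entry positional embedding table and the 179-residue maximum can be checked with a little arithmetic. The sketch below is illustrative only and rests on assumptions that the README does not state: one token per residue, two special tokens added by the tokenizer, and the `transformers` RoBERTa convention of numbering non-padding tokens from `padding_idx + 1` with `padding_idx = 1`. The helper names are hypothetical.

```python
def roberta_position_ids(num_tokens, padding_idx):
    """Position ids given to a run of non-padding tokens.

    Assumption: the RoBERTa convention in `transformers`, where the first
    non-padding token receives position padding_idx + 1.
    """
    return [padding_idx + i for i in range(1, num_tokens + 1)]


def required_position_embeddings(max_residues, num_special_tokens=2, padding_idx=1):
    """Smallest embedding table that can index every position id."""
    # Assumption: one token per residue plus the special tokens.
    num_tokens = max_residues + num_special_tokens
    position_ids = roberta_position_ids(num_tokens, padding_idx)
    # A table of size N can index ids 0 .. N - 1.
    return max(position_ids) + 1


MAX_RESIDUES = 179  # maximum sequence length in VHHCorpus-2M (from the README)
TABLE_SIZE = 185    # positional embedding length (from the README)

needed = required_position_embeddings(MAX_RESIDUES)
print(needed, needed <= TABLE_SIZE)  # prints "183 True"
```

Under these assumptions the longest sequence in the corpus needs 183 positions, so a table of 185 covers it with two entries to spare. A different `padding_idx` or special-token count would shift the required size by the same amount.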
+
+ ## Usage
+
+ The model and tokenizer can be loaded using the `transformers` library.
+
+ ```python
+ from transformers import BertTokenizer, RobertaModel
+ tokenizer = BertTokenizer.from_pretrained("tsurubee/VHHBERT")
+ model = RobertaModel.from_pretrained("tsurubee/VHHBERT")
+ ```
+
+ ## Links
+
+ - Pre-training Corpus: https://huggingface.co/datasets/COGNANO/VHHCorpus-2M
+ - Code: https://github.com/cognano/AVIDa-SARS-CoV-2
+ - Paper: TBD
+
+ ## Citation
+
+ If you use VHHBERT in your research, please cite the following paper.
+
+ ```bibtex
+ TBD
+ ```