Shaltiel committed on
Commit f13834d · 1 Parent(s): 2506021

Update README.md

Files changed (1)
  1. README.md +69 -0
README.md CHANGED
@@ -1,3 +1,72 @@
  ---
  license: cc-by-4.0
+ language:
+ - he
  ---
+ # DictaBERT-Large: A State-of-the-Art BERT-Large Suite for Modern Hebrew
+
+ A state-of-the-art language model for Hebrew, released [here](https://arxiv.org/abs/2308.16687).
+
+ This is the BERT-large base model, pretrained with the masked-language-modeling (MLM) objective.
+
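Because this checkpoint is pretrained with masked-language modeling, a fill-mask pipeline is probably the most direct way to query it. The sketch below is an illustration under stated assumptions, not the authors' sample: the model id `dicta-il/dictabert-large`, the `[MASK]` token string, and the helper names `mask_word` and `predict_masked` are all assumptions.

```python
MASK_TOKEN = "[MASK]"  # assumption: the usual BERT mask token applies to this tokenizer


def mask_word(sentence: str, index: int, mask_token: str = MASK_TOKEN) -> str:
    """Replace the whitespace-separated word at `index` with the mask token."""
    words = sentence.split()
    if not 0 <= index < len(words):
        raise IndexError(f"word index {index} out of range for {len(words)} words")
    words[index] = mask_token
    return " ".join(words)


def predict_masked(sentence: str, index: int,
                   model_id: str = "dicta-il/dictabert-large", top_k: int = 5):
    """Return fill-mask candidates for the masked word.

    Assumption: `model_id` is where this checkpoint is published.
    The import is local so that `mask_word` works without transformers installed.
    """
    from transformers import pipeline

    fill = pipeline("fill-mask", model=model_id)
    return fill(mask_word(sentence, index), top_k=top_k)
```

Calling `predict_masked('ירושלים היא עיר הבירה של ישראל', 3)` would mask the fourth word and download the model on first use; each returned candidate is a dict that includes `token_str` and `score`.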
+ For the BERT-base models for other tasks, see [here](https://huggingface.co/collections/dicta-il/dictabert-6588e7cc08f83845fc42a18b).
+
+ For the BERT-large models for other tasks, see [to-be-added].
+
+ Sample usage:
+
+ ```python
+ from transformers import pipeline
+
+ oracle = pipeline('question-answering', model='dicta-il/dictabert-large-heq')
+
+ # Hebrew passage on cookie privacy; the question asks how the information obtainable via cookies was restricted.
+ context = 'בניית פרופילים של משתמשים נחשבת על ידי רבים כאיום פוטנציאלי על הפרטיות. מסיבה זו הגבילו חלק מהמדינות באמצעות חקיקה את המידע שניתן להשיג באמצעות עוגיות ואת אופן השימוש בעוגיות. ארצות הברית, למשל, קבעה חוקים נוקשים בכל הנוגע ליצירת עוגיות חדשות. חוקים אלו, אשר נקבעו בשנת 2000, נקבעו לאחר שנחשף כי המשרד ליישום המדיניות של הממשל האמריקאי נגד השימוש בסמים (ONDCP) בבית הלבן השתמש בעוגיות כדי לעקוב אחרי משתמשים שצפו בפרסומות נגד השימוש בסמים במטרה לבדוק האם משתמשים אלו נכנסו לאתרים התומכים בשימוש בסמים. דניאל בראנט, פעיל הדוגל בפרטיות המשתמשים באינטרנט, חשף כי ה-CIA שלח עוגיות קבועות למחשבי אזרחים במשך עשר שנים. ב-25 בדצמבר 2005 גילה בראנט כי הסוכנות לביטחון לאומי (ה-NSA) השאירה שתי עוגיות קבועות במחשבי מבקרים בגלל שדרוג תוכנה. לאחר שהנושא פורסם, הם ביטלו מיד את השימוש בהן.'
+ question = 'כיצד הוגבל המידע שניתן להשיג באמצעות העוגיות?'
+ # The result holds the answer text, its character span in `context`, and a confidence score.
+ print(oracle(question=question, context=context))
+ ```
+
+ Output:
+ ```json
+ {
+   "score": 0.998887836933136,
+   "start": 101,
+   "end": 114,
+   "answer": "באמצעות חקיקה"
+ }
+ ```
+
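The `start` and `end` fields in the output are character offsets into `context`, so slicing the context at those offsets should reproduce `answer`. A minimal sketch of that check, using only the opening sentences of the sample passage; the helper name `answer_span_matches` is hypothetical, and the `result` dict is copied from the output shown rather than produced by running the model.

```python
def answer_span_matches(context: str, result: dict) -> bool:
    """Check that context[start:end] equals the reported answer text."""
    return context[result["start"]:result["end"]] == result["answer"]


# First two sentences of the sample context (enough to cover the answer span).
context_prefix = (
    "בניית פרופילים של משתמשים נחשבת על ידי רבים כאיום פוטנציאלי על הפרטיות. "
    "מסיבה זו הגבילו חלק מהמדינות באמצעות חקיקה את המידע שניתן להשיג "
    "באמצעות עוגיות ואת אופן השימוש בעוגיות."
)

# Values copied from the sample output; not produced by running the model here.
result = {
    "score": 0.998887836933136,
    "start": 101,
    "end": 114,
    "answer": "באמצעות חקיקה",
}

print(answer_span_matches(context_prefix, result))
```

The extracted answer translates roughly to "by means of legislation". A check like this is a cheap guard when post-processing answers, for example before highlighting the span in the source text.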
+ ## Citation
+
+ If you use DictaBERT in your research, please cite `DictaBERT: A State-of-the-Art BERT Suite for Modern Hebrew`.
+
+ **BibTeX:**
+
+ ```bibtex
+ @misc{shmidman2023dictabert,
+       title={DictaBERT: A State-of-the-Art BERT Suite for Modern Hebrew},
+       author={Shaltiel Shmidman and Avi Shmidman and Moshe Koppel},
+       year={2023},
+       eprint={2308.16687},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL}
+ }
+ ```
+
+ ## License
+
+ Shield: [![CC BY 4.0][cc-by-shield]][cc-by]
+
+ This work is licensed under a
+ [Creative Commons Attribution 4.0 International License][cc-by].
+
+ [![CC BY 4.0][cc-by-image]][cc-by]
+
+ [cc-by]: http://creativecommons.org/licenses/by/4.0/
+ [cc-by-image]: https://i.creativecommons.org/l/by/4.0/88x31.png
+ [cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg
+
+
+
+