Shaltiel committed on
Commit 039894b • 1 Parent(s): c546453

Update README.md

Files changed (1): README.md (+59 −7)
README.md CHANGED
@@ -2,19 +2,71 @@
  license: cc-by-4.0
  language:
  - he
  ---

- # DictaLM - 7B parameters - Instruct Model

- When initializing you must specify `trust_remote_code=True`

- A few notes about the model:

- - This is an alpha version of the model, and there are many improvements to come.
- - The model works better for tasks of information retrieval (given a paragraph and a question, to answer based on the paragraph), and of general questions (although the world knowledge is relatively limited).

- We are actively working on improving the model, so stay tuned.

  This work is licensed under a
  [Creative Commons Attribution 4.0 International License][cc-by].
@@ -23,4 +75,4 @@ This work is licensed under a

  [cc-by]: http://creativecommons.org/licenses/by/4.0/
  [cc-by-image]: https://i.creativecommons.org/l/by/4.0/88x31.png
- [cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg
 
  license: cc-by-4.0
  language:
  - he
+ inference: false
  ---
+ # **DictaLM**: A Large Generative Language Model for Modern Hebrew

+ A large generative pretrained transformer (GPT) language model for Hebrew, released [link to be added].

+ This model was fine-tuned to follow instructions, for example:
+ - General questions:
+ ```
+ מה זה בית ספר?
+ ```
+ (What is a school?)
+ ```
+ קיבלתי חתך קל באצבע. מהי הדרך הנכונה לטפל בזה?
+ ```
+ (I got a light cut on my finger. What is the right way to treat it?)
+ - Simple tasks:
+ ```
+ תציע כמה רעיונות לפעילות עם ילדים בני 5:
+ ```
+ (Suggest a few ideas for an activity with 5-year-old children:)
+ - Information retrieval from a paragraph context:
+ ```
+ המסיק הידני הוא הדרך המסורתית והעתיקה לקטיף זיתים. שיטה זו דורשת כוח אדם רב באופן יחסי ועדיין מקובלת בישראל ובמקומות רבים בעולם. שיטות מסיק ידני מאפשרות חיסכון עלויות במקומות בהם כוח האדם זול ועלות השיטות הממוכנות גבוהה. לזיתים המיועדים למאכל (לכבישה, בניגוד לזיתים לשמן) מתאים יותר מסיק ידני כיוון שהפרי פחות נפגע במהלך המסיק בשיטה זו (פגיעות בקליפת הפרי בזיתים לשמן פחות משמעותיות). כמו כן מועדף מסיק ידני באזורים בהם הטופוגרפיה המקומית או צפיפות העצים לא מאפשרים גישה נוחה לכלים מכניים. השיטה הידנית מאפשרת גם למסוק עצים שונים במועדים שונים, בהתאם לקצב הבשלת הפרי הטבעי בכל עץ.
+ 
+ על בסיס הפסקה הזאת, מה הוא היתרון של מסיק ידני מבחינת קצב הבשלת הפרי?
+ ```
+ (A paragraph about traditional manual olive harvesting, followed by: "Based on this paragraph, what is the advantage of manual harvesting in terms of the fruit's ripening pace?")
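The information-retrieval examples above pair a context paragraph with a question about it. A minimal sketch of scripting that layout follows; the helper name `build_qa_prompt` and the exact joining convention (paragraph, blank line, question) are illustrative assumptions, since no specific prompt format is documented for the model:

```python
# Illustrative sketch: build a paragraph-QA prompt in the layout shown above.
# `build_qa_prompt` is a hypothetical helper, not part of the model's API.

def build_qa_prompt(paragraph: str, question: str) -> str:
    """Join a context paragraph and a question into a single prompt string:
    paragraph, blank line, question, trailing newline."""
    return f'{paragraph.strip()}\n\n{question.strip()}\n'

# Example with a short Hebrew context and a question about it:
prompt = build_qa_prompt(
    'המסיק הידני הוא הדרך המסורתית והעתיקה לקטיף זיתים.',
    'על בסיס הפסקה הזאת, מהי הדרך המסורתית לקטיף זיתים?',
)
```

The resulting string can be tokenized and passed to the model exactly like the prompt in the usage sample below it.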
 
+ ## Sample usage
+ 
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+ 
+ tokenizer = AutoTokenizer.from_pretrained('dicta-il/dictalm-7b-instruct')
+ model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm-7b-instruct', trust_remote_code=True).cuda()
+ 
+ model.eval()
+ 
+ with torch.inference_mode():
+     prompt = 'תציע כמה רעיונות לפעילות עם ילדים בני 5:\n'
+     kwargs = dict(
+         inputs=tokenizer(prompt, return_tensors='pt').input_ids.to(model.device),
+         do_sample=True,
+         top_k=50,
+         top_p=0.95,
+         temperature=0.75,
+         max_length=100,
+         min_new_tokens=5,
+     )
+     print(tokenizer.batch_decode(model.generate(**kwargs), skip_special_tokens=True))
+ ```
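When trying several prompts, it can help to keep the sampling settings from the snippet above in one place. A small sketch, assuming a hypothetical helper name `sampling_kwargs` (the individual parameters are standard `generate` arguments; the wrapper itself is only for illustration):

```python
def sampling_kwargs(input_ids, max_length=100):
    """Collect the generation settings used in the sample above into one
    dict, so several prompts can share one sampling configuration.
    `sampling_kwargs` is a hypothetical convenience, not a library API."""
    return dict(
        inputs=input_ids,
        do_sample=True,         # sample rather than greedy decoding
        top_k=50,               # restrict to the 50 most likely next tokens
        top_p=0.95,             # nucleus-sampling probability threshold
        temperature=0.75,       # soften the next-token distribution
        max_length=max_length,  # cap on total length, prompt included
        min_new_tokens=5,       # always emit at least a few new tokens
    )
```

Calling `model.generate(**sampling_kwargs(input_ids))` then reproduces the settings shown in the sample.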
+ 
+ ## Citation
+ 
+ If you use DictaLM in your research, please cite ```ADD CITATION HERE```
+ 
+ **BibTeX:**
+ 
+ ```ADD BIBTEXT HERE```
+ 
+ ## License
+ 
+ Shield: [![CC BY 4.0][cc-by-shield]][cc-by]
  This work is licensed under a
  [Creative Commons Attribution 4.0 International License][cc-by].

  [cc-by]: http://creativecommons.org/licenses/by/4.0/
  [cc-by-image]: https://i.creativecommons.org/l/by/4.0/88x31.png
+ [cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg