metadata
license: cc-by-4.0
language:
- he
inference: false
DictaLM: A Large Generative Language Model for Modern Hebrew
A large generative pretrained transformer (GPT) language model for Hebrew, released [link to be added].
This model was fine-tuned to follow instructions. Examples of the kinds of prompts it handles:
General questions:
מה זה בית ספר?
(English: What is a school?)
קיבלתי חתך קל באצבע. מהי הדרך הנכונה לטפל בו?
(English: I got a minor cut on my finger. What is the right way to treat it?)
Simple tasks:
תציע כמה רעיונות לפעילות עם ילדים בני 5:
(English: Suggest some ideas for activities with 5-year-old children:)
Information retrieval from a paragraph context:
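המסיק הידני הוא הדרך המסורתית והעתיקה לקטיף זיתים. שיטה זו דורשת כוח אדם רב באופן יחסי ועדיין מקובלת בישראל ובמקומות רבים בעולם. שיטות מסיק ידני מאפשרות חיסכון עלויות במקומות בהם כוח האדם זול ועלות השיטות הממוכנות גבוהה. לזיתים המיועדים למאכל (לכבישה, בניגוד לזיתים לשמן) מתאים יותר מסיק ידני כיוון שהפרי פחות נפגע במהלך המסיק בשיטה זו (פגיעות בקליפת הפרי בזיתים לשמן פחות משמעותיות). כמו כן מועדף מסיק ידני באזורים בהם הטופוגרפיה המקומית או צפיפות העצים לא מאפשרים גישה נוחה לכלים מכנים. השיטה הידנית מאפשרת גם למסוק עצים שונים במועדים שונים, בהתאם לקצב הבשלת הפרי הטבעי בכל עץ. על בסיס הפסקה הזאת, מה הוא היתרון של מסיק ידני מבחינת קצב הבשלת הפרי?
(English: Hand picking is the traditional and ancient way of harvesting olives. This method requires a relatively large workforce and is still common in Israel and in many places around the world. Hand-picking methods save costs in places where labor is cheap and mechanized methods are expensive. For table olives (for pickling, as opposed to olives for oil), hand picking is more suitable because the fruit is damaged less during harvest with this method (damage to the fruit's skin matters less for oil olives). Hand picking is also preferred in areas where the local topography or the density of the trees does not allow convenient access for mechanical equipment. The manual method also makes it possible to harvest different trees at different times, according to the natural ripening rate of the fruit on each tree. Based on this paragraph, what is the advantage of hand picking in terms of the fruit's ripening rate?)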
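A prompt of this last kind can be assembled from a context paragraph and a question. The helper below is only a sketch: the name `build_context_prompt` is hypothetical, and the layout (paragraph, a single space, the question, then a trailing newline as in the sample usage prompt) is an assumption inferred from the example above, not a documented prompt format.

```python
def build_context_prompt(paragraph: str, question: str) -> str:
    """Join a context paragraph and a question into one prompt string.

    Assumed layout (not an official format): paragraph, one space,
    question, trailing newline.
    """
    paragraph = paragraph.strip()
    question = question.strip()
    if not paragraph or not question:
        raise ValueError("both paragraph and question must be non-empty")
    return f"{paragraph} {question}\n"


if __name__ == "__main__":
    context = "המסיק הידני הוא הדרך המסורתית והעתיקה לקטיף זיתים."
    query = "על בסיס הפסקה הזאת, מהי הדרך המסורתית לקטיף זיתים?"
    print(build_context_prompt(context, query))
```

The resulting string can be passed to the tokenizer in place of the `prompt` variable used in the sample usage.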
Sample usage:
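from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# trust_remote_code=True lets transformers run the model code shipped in the repository.
tokenizer = AutoTokenizer.from_pretrained('dicta-il/dictalm-7b-instruct')
model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm-7b-instruct', trust_remote_code=True).cuda()
model.eval()

with torch.inference_mode():
    # Hebrew prompt: "Suggest some ideas for activities with 5-year-old children:"
    prompt = 'תציע כמה רעיונות לפעילות עם ילדים בני 5:\n'
    kwargs = dict(
        inputs=tokenizer(prompt, return_tensors='pt').input_ids.to(model.device),
        do_sample=True,      # sample instead of greedy decoding
        top_k=50,
        top_p=0.95,
        temperature=0.75,
        max_length=100,      # counts prompt tokens plus generated tokens
        min_new_tokens=5
    )
    # The decoded output contains the prompt followed by the generated continuation.
    print(tokenizer.batch_decode(model.generate(**kwargs), skip_special_tokens=True))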
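To make the sampling settings above more concrete, the following pure-Python sketch shows what `temperature`, `top_k`, and `top_p` do to a toy next-token distribution. It is a simplified illustration and not the `transformers` implementation; the function name `filter_distribution` and the toy logits are invented for this example.

```python
import math


def filter_distribution(logits, temperature=0.75, top_k=50, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature: divide logits before the softmax; values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # Top-k: keep only the k highest-scoring token indices, best first.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the survivors (subtract the maximum for numerical stability).
    peak = scaled[order[0]]
    weights = [math.exp(scaled[i] - peak) for i in order]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i, p in zip(order, probs):
        kept.append((i, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalise so the kept probabilities sum to 1.
    mass = sum(p for _, p in kept)
    return {i: p / mass for i, p in kept}


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0, -1.0]  # hypothetical scores for a 4-token vocabulary
    print(filter_distribution(toy_logits))
```

A token is then drawn at random from the returned distribution. Lowering `temperature`, `top_k`, or `top_p` narrows the set of candidate tokens and makes the output more predictable; raising them makes it more varied.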
Citation
If you use DictaLM in your research, please cite ADD CITATION HERE
BibTeX:
ADD BIBTEXT HERE
License
This work is licensed under a Creative Commons Attribution 4.0 International License.